GENERATING FUZZY RULES FOR PROTEIN CLASSIFICATION | ||
Iranian Journal of Fuzzy Systems | ||
Article 3, Volume 5, Issue 2, Summer 2008, Pages 21-33 | ||
Article type: Research Paper | ||
Digital Object Identifier (DOI): 10.22111/ijfs.2008.325 | ||
Authors | ||
EGHBAL G. MANSOORI | ||
1COMPUTER SCIENCE AND ENGINEERING DEPARTMENT, COLLEGE OF ENGINEERING, SHIRAZ UNIVERSITY, SHIRAZ, IRAN | ||
2COMPUTER SCIENCE AND ENGINEERING DEPARTMENT, COLLEGE OF ENGINEERING, SHIRAZ UNIVERSITY, SHIRAZ, IRAN | ||
3BIOLOGY DEPARTMENT, COLLEGE OF SCIENCE, SHIRAZ UNIVERSITY, SHIRAZ, IRAN | ||
Abstract | ||
This paper considers the generation of interpretable fuzzy rules for assigning an amino acid sequence to the appropriate protein superfamily. Since the main objective of this classifier is the interpretability of its rules, we use the distribution of amino acids in the protein sequences as features. These features are the occurrence probabilities of six exchange groups in the sequences. To generate the fuzzy rules, we use modified versions of a common approach. The generated rules are simple and understandable, especially for biologists. To evaluate our fuzzy classifiers, we use four protein superfamilies from the UniProt database. Experimental results show that the generated fuzzy rules are comprehensible and achieve comparable classification accuracy. | ||
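As a rough illustration of the feature extraction described in the abstract, the following sketch computes the occurrence probability of each exchange group in a sequence. The abstract does not list the groups, so the grouping below is an assumption: it is the six-group partition of the 20 standard amino acids commonly used in the protein classification literature. The names `EXCHANGE_GROUPS` and `exchange_group_features` are hypothetical and are not taken from the paper.

```python
from collections import Counter

# ASSUMPTION: the abstract does not name the six exchange groups. This is the
# partition commonly used in the literature; the paper's own may differ.
EXCHANGE_GROUPS = {
    "e1": "HRK",
    "e2": "DENQ",
    "e3": "C",
    "e4": "STPAG",
    "e5": "MILV",
    "e6": "FYW",
}

# Reverse lookup: one-letter amino acid code -> exchange group label.
AA_TO_GROUP = {
    aa: group for group, members in EXCHANGE_GROUPS.items() for aa in members
}


def exchange_group_features(sequence):
    """Return the occurrence probability of each exchange group.

    Characters outside the 20 standard amino acids (for example 'X')
    are ignored, so the probabilities sum to 1 over recognized residues.
    """
    counts = Counter(
        AA_TO_GROUP[aa] for aa in sequence.upper() if aa in AA_TO_GROUP
    )
    total = sum(counts.values())
    if total == 0:
        return {group: 0.0 for group in EXCHANGE_GROUPS}
    return {group: counts[group] / total for group in EXCHANGE_GROUPS}


if __name__ == "__main__":
    # Toy sequence, not real data from the paper.
    print(exchange_group_features("HHCCMF"))
```

Each sequence is thus reduced to a six-dimensional vector, which keeps the antecedents of any fuzzy rule built on it small enough to read.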
Keywords | ||
Amino acid sequence; Protein classification; Fuzzy rule-based classifier | ||