Experimental Study on Performance of Symbolic Classifier with Gene Selection Methods for Multiclass Microarray Gene Expression Data

Sheela T


Microarray technology is a useful technique for measuring the expression levels of thousands of genes simultaneously. Gene expression levels are known to hold keys to fundamental problems in disease prevention and cure, biological evolution mechanisms, and drug discovery. Previous research has demonstrated that this technology can be useful in the classification of cancers. However, most proposed cancer classification methods work well only on binary-class problems and do not extend to multiclass problems. This work attempts to classify high-dimensional, multiclass microarray gene expression data using a symbolic classifier.


Microarray Gene Expression; cancer classification; binary class data; high dimensional data; multiclass data; symbolic classifier.





DOI: https://doi.org/10.26483/ijarcs.v13i5.6914



Copyright (c) 2022 International Journal of Advanced Research in Computer Science