A STUDY AND ANALYSIS OF CLUSTERING ALGORITHMS ON HIV-1 INFECTION MICROARRAY DATASET FOR FINDING CLUSTER WISE COMMON GENES

M S UMA; DR. R. PORKODI

doi:10.26483/ijarcs.v8i5.4049


Published: Jun 25, 2017


Keywords:

Data mining, Microarray, Preprocessing, Clustering algorithm, Finding common genes cluster wise.

M S UMA

BHARATHIAR UNIVERSITY, COIMBATORE 641046.

DR. R. PORKODI

Abstract

Data mining refers to extracting knowledge from large amounts of data. It is used in various medical applications such as tumor clustering, protein structure prediction, gene selection, cancer classification based on microarray data, clustering of gene expression data, and statistical modeling of protein-protein interactions. This work analyzes four clustering algorithms, namely K-means, Fuzzy c-means, hierarchical clustering, and Partitioning Around Medoids (PAM), on a time-course microarray data set of the effect of HIV-1 infection on macrophages in vitro. The algorithms are validated using internal validation measures (Dunn index, Dunn index 2, Calinski-Harabasz index, and average silhouette width) in order to identify the best of the four. Finally, the work finds the common genes present in each cluster produced by the four clustering algorithms.
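As a rough illustration of two steps named in the abstract, the sketch below computes the Dunn index for one labeling and then intersects matching clusters across several clustering results. It is not the authors' implementation: the function names, the toy points, and the gene identifiers (`g1` to `g5`) are hypothetical, and matching clusters by largest overlap is an assumption, since the abstract does not say how clusters from different algorithms are aligned.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def dunn_index(points, labels):
    """Smallest between-cluster distance divided by the largest cluster diameter."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    # Largest distance between two members of the same cluster.
    max_diameter = max(
        dist(a, b)
        for members in clusters.values()
        for a, b in combinations(members, 2)
    )
    # Smallest distance between members of two different clusters.
    min_separation = min(
        dist(a, b)
        for c1, c2 in combinations(clusters.values(), 2)
        for a in c1
        for b in c2
    )
    return min_separation / max_diameter


def common_genes(reference, others):
    """For each reference cluster, keep the genes that stay together in every other result.

    Cluster labels are arbitrary, so each reference cluster is matched (by assumption)
    to the cluster with the largest overlap in every other clustering.
    """
    result = {}
    for label, genes in reference.items():
        shared = set(genes)
        for clustering in others:
            best = max(clustering.values(), key=lambda g: len(shared & set(g)))
            shared &= set(best)
        result[label] = shared
    return result


# Hypothetical toy data: four 2-D expression profiles in two clusters.
toy_points = [(0, 0), (0, 1), (5, 5), (5, 6)]
toy_labels = [0, 0, 1, 1]
toy_dunn = dunn_index(toy_points, toy_labels)

# Hypothetical cluster memberships from three algorithms.
kmeans_clusters = {0: ["g1", "g2", "g3"], 1: ["g4", "g5"]}
pam_clusters = {"a": ["g3", "g4", "g5"], "b": ["g1", "g2"]}
hier_clusters = {1: ["g1", "g2", "g3"], 2: ["g4", "g5"]}
toy_common = common_genes(kmeans_clusters, [pam_clusters, hier_clusters])
```

A higher Dunn index indicates compact, well-separated clusters, so the value can be compared across algorithms run on the same data. The sketch assumes at least two clusters and at least one cluster with two or more members; otherwise the ratio is undefined.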


Issue

Vol. 8 No. 5 (2017): May-June 2017

Section

Articles

COPYRIGHT

Submission of a manuscript implies that the work described has not been published before; that it is not under consideration for publication elsewhere; and that, if and when the manuscript is accepted for publication, the authors agree to automatic transfer of the copyright to the publisher.

Authors who publish with this journal agree to the following terms:

Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.
The journal allows the author(s) to retain publishing rights without restrictions.
The journal allows the author(s) to hold the copyright without restrictions.

