PROBABILISTIC LATENT FEATURE DISCOVERY MODEL AND MULTI-LABEL CONTENT CATEGORIZATION IN E-LEARNING USING R PACKAGES

Arulselvarani Pugazh

Abstract

In an e-learning environment, users have access to a huge collection of online learning materials, which makes finding suitable learning content increasingly difficult. In this paper, the author applies a machine learning approach implemented with R packages and proposes a probabilistic latent feature discovery method for multi-label content categorization in e-learning. The probabilistic latent feature discovery model is a generative model for multi-label e-learning content categorization that improves both accuracy and efficiency. Latent Dirichlet Allocation (LDA) is an effective probabilistic approach to topic modeling: it rests on a formal generative model of a document and provides a viable, efficient algorithm for modeling e-learning text. The author proposes a generative LDA-based model within an information retrieval framework, evaluates it in an e-learning environment, and trains it on the learning documents via Gibbs sampling. The predictive distribution of the fitted LDA model is used to predict new words. Experimental results on multi-label e-learning content categorization demonstrate the accuracy and effectiveness of the proposed approach.
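To make the workflow concrete, the sketch below shows how an LDA topic model can be fitted with collapsed Gibbs sampling in R using the tm and lda packages cited in the references. It is only an illustration under assumptions, not the author's actual code: the toy documents, the number of topics K, and the hyperparameters alpha and eta are placeholder choices.

library(tm)    # text preprocessing (Feinerer)
library(lda)   # collapsed Gibbs sampler and predictive distribution (Chang)

# Toy stand-ins for e-learning documents (placeholders, not the paper's corpus)
docs <- c("linear regression and model fitting in statistics",
          "object oriented programming with classes and inheritance",
          "probability distributions bayesian inference and gibbs sampling")

# Light cleaning with tm, then conversion to the index format lda expects
docs   <- removePunctuation(removeWords(tolower(docs), stopwords("en")))
corpus <- lexicalize(docs)

set.seed(1)
K   <- 3    # number of latent topics (assumed; tuned in practice)
fit <- lda.collapsed.gibbs.sampler(corpus$documents, K, corpus$vocab,
                                   num.iterations = 500,
                                   alpha = 0.1, eta = 0.1)

# Most probable words for each discovered latent topic
top.topic.words(fit$topics, num.words = 5, by.score = TRUE)

# Predictive word distribution for each document under the fitted model,
# the quantity used above to predict new words
pred <- predictive.distribution(fit$document_sums, fit$topics,
                                alpha = 0.1, eta = 0.1)

The per-document topic proportions recovered by the sampler (fit$document_sums, normalised) are the latent features on which a multi-label categorization step of the kind described above could operate.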

Author Biography

Arulselvarani Pugazh, Govt. Arts College, Trichy-22

Ph.D. in Computer Science from Mother Teresa Women's University, Kodaikanal, Tamil Nadu, India.

References

Chong Wang and David M. Blei, “Collaborative Topic Modeling for Recommending Scientific Articles,” Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '11), San Diego, CA, USA, August 21–24, 2011.

Jason D. M. Rennie, “Improving Multi-class Text Classification with Naive Bayes,” Massachusetts Institute of Technology, 2001. http://citeseer.ist.psu.edu/cs

Y. Yang, J. Zhang, and B. Kisiel, “A scalability analysis of classifiers in text categorization,” ACM SIGIR '03, 2003.

Canasai Kruengkrai and Chuleerat Jaruskulchai, “A Parallel Learning Algorithm for Text Classification,” The Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2002), Canada, July 2002.

H. Wallach, “Topic modeling: beyond bag-of-words,” Proceedings of the 23rd International Conference on Machine Learning, 2006.

S. Deerwester, S. Dumais, G. Furnas, T. Landauer, and R. Harshman, “Indexing by latent semantic analysis,” Journal of the American Society for Information Science, 41(6), 391–407, 1990.

T. Hofmann, “Probabilistic Latent Semantic Indexing,” Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 50–57, 1999.

D. Blei, A. Ng, and M. Jordan, “Latent Dirichlet Allocation,” Journal of Machine Learning Research, 3, 993–1022, 2003.

D. Cai, X. Wang, and X. He, “Probabilistic dyadic data analysis with local and global consistency,” Proceedings of the 26th Annual International Conference on Machine Learning, ACM, New York, NY, USA, 2009.

M. Girolami and A. Kaban, “On an equivalence between PLSI and LDA,” Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 433–434, 2003.

J. Ponte and W. Croft, “A Language Modeling Approach to Information Retrieval,” Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 275–281, 1998.

T. Griffiths and M. Steyvers, “Finding scientific topics,” Proceedings of the National Academy of Sciences, 101, 5228–5235, 2004.

J. Lasserre, C. Bishop, and T. Minka, “Principled hybrids of generative and discriminative models,” IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, 2006.

A. Bosch, A. Zisserman, and X. Munoz, “Scene classification using a hybrid generative/discriminative approach,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 30, 712, 2008.

M. Jordan, Z. Ghahramani, T. Jaakkola, and L. Saul, “Introduction to variational methods for graphical models,” Machine Learning, 37, 183–233, 1999.

M. Wainwright and M. Jordan, “Graphical models, exponential families, and variational inference,” Technical Report 649, Department of Statistics, U.C. Berkeley, 2003.

I. Feinerer, “An introduction to text mining in R,” R News, 8(2), 19–22, October 2008. http://CRAN.R-project.org/doc/Rnews/

I. Feinerer, K. Hornik, and D. Meyer, “Text mining infrastructure in R,” Journal of Statistical Software, 25(5), 1–54, March 2008. http://www.jstatsoft.org/v25/i05

I. Feinerer, “tm: Text Mining Package,” R package version 0.5-5, 2011. http://CRAN.R-project.org/package=tm

R Development Core Team, “R: A Language and Environment for Statistical Computing,” R Foundation for Statistical Computing, Vienna, Austria, 2011. ISBN 3-900051-07-0. http://www.R-project.org/

J. Chang, “lda: Collapsed Gibbs Sampling Methods for Topic Models,” R package version 1.2.3, 2010. http://CRAN.R-project.org/package=lda

D. Gruhl, R. Guha, D. Liben-Nowell, and A. Tomkins, “Information diffusion through blogspace,” Proceedings of the International Conference on World Wide Web, 491–501, 2004.

S. Morinaga and K. Yamanishi, “Tracking dynamics of topic trends using a finite mixture model,” Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Seattle, USA, 811–816, 2004.

Q. Mei, X. Ling, M. Wondra, H. Su, and C. X. Zhai, “Topic sentiment mixture: Modeling facets and opinions in weblogs,” Proceedings of the International Conference on World Wide Web, 2007.

J. P. Zeng, S. Y. Zhang, C. R. Wu, and X. W. Ji, “Modelling the topic propagation over the internet,” Mathematical and Computer Modelling of Dynamical Systems, 15(1), 83–93, 2009.

G. Salton and M. McGill, “Introduction to Modern Information Retrieval,” McGraw-Hill, New York, 1983.

K. Nigam, A. McCallum, S. Thrun, and T. Mitchell, “Text classification from labeled and unlabeled documents using EM,” Machine Learning, 2/3, 103–134, 2000.

M. Rosen-Zvi, T. Griffiths, M. Steyvers, and P. Smyth, “The author-topic model for authors and documents,” Proceedings of the Conference on Uncertainty in Artificial Intelligence, vol. 20, 487–494, 2004.

W. Li and A. McCallum, “Pachinko allocation: DAG-structured mixture models of topic correlations,” Proceedings of the International Conference on Machine Learning, vol. 23, 577–584, ACM, New York, NY, 2006.

C. Chemudugunta, P. Smyth, and M. Steyvers, “Modeling general and specific aspects of documents with a probabilistic topic model,” Advances in Neural Information Processing Systems 19, 241–248, MIT Press, Cambridge, MA, 2007.

Yanwei Wang, Xiaoqing Ding, and Changsong Liu, “Topic Model Adaptation for Recognition of Homologous Offline Handwritten Chinese Text Image,” IEEE Transactions, vol. 19, no. 12, 2014.

Jia Zeng, W. K. Cheung, and Jiming Liu, “Learning Topic Models by Belief Propagation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 5, 2013.