-
Notifications
You must be signed in to change notification settings - Fork 4
Clojure wrapper for LDA topic modeling in MALLET
License
davidandrzej/chisel
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
# chisel This code provides a Clojure wrapper to do Latent Dirichlet Allocation (LDA) [1] topic modeling via the MALLET [2] Java API. For example, we may wish to run LDA directly against text records in a database without touching the local filesystem, programmatically extract the learned topics, and write these to a different database. The goal of this wrapper code is to make these kind of tasks easier. References [1] Blei, David M., Ng, Andrew Y., and Jordan, Michael I. (2003). Latent Dirichlet Allocation. Journal of Machine Learning Research (JMLR) 3 (Mar. 2003), 993-1022. [2] McCallum, Andrew K. "MALLET: A Machine Learning for Language Toolkit." http://mallet.cs.umass.edu. 2002. ## Usage This wrapper requires you to have already built and installed the MALLET jar to your local maven repo (see the above link [2]). You should then be able to build the wrapper with leiningen and use the chisel namespaces in your own code. For example: (ns my-example (:use (chisel instances lda))) (let [docs {"doc1" "cat cat bat bat" "doc2" "cat cat bat bat dog dog" "doc3" "dog dog dog"} inst (chisel.instances/get-instance-list docs) tm (chisel.lda/run-lda inst :T 2 :numiter 50 :topwords 3 :alpha 0.5)] (chisel.lda/write-topics tm "example.topics" :topwords 3)) ## License Copyright (c) 2011, Lawrence Livermore National Security, LLC. Produced at the Lawrence Livermore National Laboratory. Written by David Andrzejewski, [email protected] OCEC-10-073 All rights reserved. This file is part of the C-Cat package and is covered under the terms and conditions therein. See https://github.com/fozziethebeat/C-Cat for details. This code is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License version 2 as published by the Free Software Foundation and distributed hereunder to you. THIS SOFTWARE IS PROVIDED "AS IS" AND NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED ARE MADE. BY WAY OF EXAMPLE, BUT NOT LIMITATION, WE MAKE NO REPRESENTATIONS OR WARRANTIES OF MERCHANT- ABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF THE LICENSED SOFTWARE OR DOCUMENTATION WILL NOT INFRINGE ANY THIRD PARTY PATENTS, COPYRIGHTS, TRADEMARKS OR OTHER RIGHTS. You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.
About
Clojure wrapper for LDA topic modeling in MALLET
Resources
License
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published