API Reference
Modules:
interfaces – Core gensim interfaces
utils – Various utility functions
matutils – Math utils
downloader – Downloader API for gensim
corpora.bleicorpus – Corpus in Blei’s LDA-C format
corpora.csvcorpus – Corpus in CSV format
corpora.dictionary – Construct word<->id mappings
corpora.hashdictionary – Construct word<->id mappings
corpora.indexedcorpus – Random access to corpus documents
corpora.lowcorpus – Corpus in GibbsLda++ format
corpora.malletcorpus – Corpus in Mallet format
corpora.mmcorpus – Corpus in Matrix Market format
corpora.opinosiscorpus – Topic related review sentences
corpora.sharded_corpus – Corpus stored in separate files
corpora.svmlightcorpus – Corpus in SVMlight format
corpora.textcorpus – Tools for building corpora with dictionaries
corpora.ucicorpus – Corpus in UCI format
corpora.wikicorpus – Corpus from a Wikipedia dump
models.ldamodel – Latent Dirichlet Allocation
models.ldamulticore – Parallelized Latent Dirichlet Allocation
models.ensemblelda – Ensemble Latent Dirichlet Allocation
models.nmf – Non-Negative Matrix Factorization
models.lsimodel – Latent Semantic Indexing
models.ldaseqmodel – Dynamic Topic Modeling in Python
models.tfidfmodel – TF-IDF model
models.rpmodel – Random Projections
models.hdpmodel – Hierarchical Dirichlet Process
models.logentropy_model – LogEntropy model
models.normmodel – Normalization model
models.translation_matrix – Translation Matrix model
models.lsi_dispatcher – Dispatcher for distributed LSI
models.lsi_worker – Worker for distributed LSI
models.lda_dispatcher – Dispatcher for distributed LDA
models.lda_worker – Worker for distributed LDA
models.atmodel – Author-topic models
models.word2vec – Word2vec embeddings
models.keyedvectors – Store and query word vectors
models.doc2vec – Doc2vec paragraph embeddings
models.fasttext – FastText model
models._fasttext_bin – Facebook’s fastText I/O
models.phrases – Phrase (collocation) detection
models.poincare – Train and use Poincare embeddings
models.coherencemodel – Topic coherence pipeline
models.basemodel – Core TM interface
models.callbacks – Callbacks for tracking and visualizing the LDA training process
models.word2vec_inner – Cython routines for training Word2Vec models
models.doc2vec_inner – Cython routines for training Doc2Vec models
models.fasttext_inner – Cython routines for training FastText models
similarities.docsim – Document similarity queries
similarities.termsim – Term similarity queries
similarities.annoy – Approximate Vector Search using Annoy
similarities.nmslib – Approximate Vector Search using NMSLIB
similarities.levenshtein – Fast soft-cosine semantic similarity search
similarities.fastss – Fast Levenshtein edit distance
test.utils – Internal testing functions
topic_coherence.aggregation – Aggregation module
topic_coherence.direct_confirmation_measure – Direct confirmation measure module
topic_coherence.indirect_confirmation_measure – Indirect confirmation measure module
topic_coherence.probability_estimation – Probability estimation module
topic_coherence.segmentation – Segmentation module
topic_coherence.text_analysis – Analyzing the texts of a corpus to accumulate statistical information about word occurrences
scripts.package_info – Information about gensim package
scripts.glove2word2vec – Convert GloVe format to word2vec
scripts.make_wikicorpus – Convert articles from a Wikipedia dump to vectors
scripts.word2vec_standalone – Train word2vec on text file CORPUS
scripts.make_wiki_online – Convert articles from a Wikipedia dump
scripts.make_wiki_online_nodebug – Convert articles from a Wikipedia dump
scripts.word2vec2tensor – Convert the word2vec format to TensorFlow 2D tensor
scripts.segment_wiki – Convert Wikipedia dump to JSON-lines format
parsing.porter – Porter Stemming Algorithm
parsing.preprocessing – Functions to preprocess raw text
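The module names listed above appear to be relative to the top-level `gensim` package. The sketch below, under that assumption, shows one way to turn a listed name into a full import path and check whether it can be imported. The helpers `full_module_name` and `try_import` are hypothetical names introduced for illustration; they are not part of gensim.

```python
import importlib


def full_module_name(short_name: str, package: str = "gensim") -> str:
    """Prefix a name from the module list with the package name (assumed to be 'gensim')."""
    return f"{package}.{short_name}"


def try_import(short_name: str):
    """Return the imported module, or None if it (or one of its dependencies) cannot be imported."""
    try:
        return importlib.import_module(full_module_name(short_name))
    except ImportError:
        return None


if __name__ == "__main__":
    # A few names taken from the list above.
    for name in ("corpora.dictionary", "models.word2vec", "similarities.docsim"):
        module = try_import(name)
        status = "available" if module is not None else "not available"
        print(f"{full_module_name(name)}: {status}")
```

Some listed modules, such as `similarities.annoy` and `similarities.nmslib`, probably rely on optional third-party packages, so a guarded import like this is likely more robust than importing them unconditionally.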
