Documentation
We welcome contributions to our documentation via GitHub pull requests, whether it’s fixing a typo or authoring an entirely new tutorial or guide. If you’re thinking about contributing documentation, please see How to Author Gensim Documentation.
Core Tutorials: New Users Start Here!
If you’re new to Gensim, we recommend going through all the core tutorials in order. Understanding the functionality they cover is vital for using Gensim effectively.
Tutorials: Learning Oriented Lessons
Each of these lessons introduces a particular Gensim feature, e.g. a model (Word2Vec, FastText) or a technique (similarity queries or text summarization).
Fast Similarity Queries with Annoy and Word2Vec
How-to Guides: Solve a Problem
These goal-oriented guides demonstrate how to solve a specific problem using Gensim.
How to download pre-trained models and corpora
How to Author Gensim Documentation
How to reproduce the doc2vec ‘Paragraph Vector’ paper
Other Resources
Blog posts, tutorial videos, hackathons, and other useful Gensim resources from around the internet.
Use FastText or Word2Vec? Comparison of embedding quality and performance. Jupyter notebook
Multiword phrases extracted from How I Met Your Mother. Blog post by Mark Needham
Using Gensim LDA for hierarchical document clustering. Jupyter notebook by Brandon Rose
Evolution of Voldemort topic through the 7 Harry Potter books. Blog post
Movie plots by genre: document classification using various techniques (TF-IDF, word2vec averaging, Deep IR, Word Mover's Distance, and doc2vec). GitHub repo
Word2vec: Faster than Google? Optimization lessons in Python, talk by Radim Řehůřek at PyData Berlin 2014. YouTube video
Word2vec & friends, talk by Radim Řehůřek at MLMU.cz 7.1.2015. YouTube video