
models.fasttext_inner – Cython routines for training FastText models

Optimized Cython functions for training the FastText model.

gensim.models.fasttext_inner.init()

Precompute the function sigmoid(x) = 1 / (1 + exp(-x)) for x values discretized into the table EXP_TABLE. Also compute log(sigmoid(x)) into LOG_TABLE.

Returns: Enumeration signifying the underlying data type returned by the BLAS dot product calculation. 0 signifies double precision (double), 1 signifies single precision (float), and 2 signifies that custom Cython loops were used instead of BLAS.
Return type: {0, 1, 2}
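The table precomputation can be sketched in plain Python as follows. The constants EXP_TABLE_SIZE = 1000 and MAX_EXP = 6 mirror the values used in the word2vec family of implementations; the real tables live inside the compiled module, so treat this as an illustration, not gensim's actual code.

```python
import numpy as np

# Assumed constants mirroring the word2vec/fastText convention;
# the real values are compiled into the Cython module.
EXP_TABLE_SIZE = 1000
MAX_EXP = 6

# Discretize x into EXP_TABLE_SIZE points over (-MAX_EXP, MAX_EXP)
# and precompute sigmoid(x) for each point.
x = (np.arange(EXP_TABLE_SIZE) / EXP_TABLE_SIZE * 2.0 - 1.0) * MAX_EXP
EXP_TABLE = 1.0 / (1.0 + np.exp(-x))
LOG_TABLE = np.log(EXP_TABLE)

def fast_sigmoid(value):
    """Table lookup replacing exp() calls in the hot training loop."""
    if value <= -MAX_EXP:
        return 0.0
    if value >= MAX_EXP:
        return 1.0
    idx = int((value + MAX_EXP) * (EXP_TABLE_SIZE / MAX_EXP / 2))
    return EXP_TABLE[idx]
```

The lookup trades a tiny amount of accuracy for speed: scores outside (-MAX_EXP, MAX_EXP) are clamped to 0 or 1, and everything in between is a single array index instead of an exp() call.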
gensim.models.fasttext_inner.train_batch_cbow(model, sentences, alpha, _work, _neu1)

Update the CBOW model by training on a sequence of sentences.

Each sentence is a list of string tokens, which are looked up in the model’s vocab dictionary. Called internally from gensim.models.fasttext.FastText.train().

Parameters:
  • model (FastText) – Model to be trained.
  • sentences (iterable of list of str) – Corpus streamed directly from disk/network.
  • alpha (float) – Learning rate.
  • _work (np.ndarray) – Private working memory for each worker.
  • _neu1 (np.ndarray) – Private working memory for each worker.
Returns: Effective number of words trained.
Return type: int
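One CBOW update of the kind this routine performs in batch can be sketched in plain numpy. Everything here is an illustrative assumption, not gensim's internals: the function name cbow_step, the array names syn0/syn1neg, and the restriction to the negative-sampling path. The neu1 and work arrays play the role of the _neu1 and _work buffers above.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cbow_step(syn0, syn1neg, context_indices, target, negatives, alpha):
    """One hypothetical CBOW update with negative sampling: average the
    context vectors, score them against the target and negative words,
    then propagate the accumulated gradient back to the context rows."""
    neu1 = syn0[context_indices].mean(axis=0)   # combined context (_neu1 buffer)
    work = np.zeros_like(neu1)                  # accumulated gradient (_work buffer)
    for word, label in [(target, 1.0)] + [(n, 0.0) for n in negatives]:
        f = sigmoid(np.dot(neu1, syn1neg[word]))
        g = (label - f) * alpha                 # gradient times learning rate
        work += g * syn1neg[word]
        syn1neg[word] += g * neu1
    syn0[context_indices] += work               # update every context word
```

In the real Cython routine this loop runs over every position of every sentence in the batch, with the GIL released, which is why the per-worker scratch buffers are passed in rather than allocated per call.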

gensim.models.fasttext_inner.train_batch_sg(model, sentences, alpha, _work, _l1)

Update the skip-gram model by training on a sequence of sentences.

Each sentence is a list of string tokens, which are looked up in the model’s vocab dictionary. Called internally from gensim.models.fasttext.FastText.train().

Parameters:
  • model (FastText) – Model to be trained.
  • sentences (iterable of list of str) – Corpus streamed directly from disk/network.
  • alpha (float) – Learning rate.
  • _work (np.ndarray) – Private working memory for each worker.
  • _l1 (np.ndarray) – Private working memory for each worker.
Returns: Effective number of words trained.
Return type: int
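A single skip-gram update of the kind batched by this routine can be sketched as follows. This is a hedged illustration, not gensim's code: the function name skipgram_step, the array names, and the negative-sampling-only path are assumptions. The FastText twist over plain word2vec is that the input vector (the _l1 buffer above) combines the center word's own vector with its subword n-gram vectors, and all of them receive the gradient.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def skipgram_step(syn0_vocab, syn0_ngrams, ngram_ids, center, target,
                  negatives, syn1neg, alpha):
    """One hypothetical FastText skip-gram update with negative sampling
    for a single (center, context) pair."""
    # Input vector: mean of the word vector and its subword n-gram vectors.
    vectors = [syn0_vocab[center]] + [syn0_ngrams[g] for g in ngram_ids]
    l1 = np.mean(vectors, axis=0)               # the _l1 buffer
    work = np.zeros_like(l1)                    # accumulated gradient (_work buffer)
    for word, label in [(target, 1.0)] + [(n, 0.0) for n in negatives]:
        f = sigmoid(np.dot(l1, syn1neg[word]))
        g = (label - f) * alpha
        work += g * syn1neg[word]
        syn1neg[word] += g * l1
    syn0_vocab[center] += work                  # gradient flows to the word...
    for gid in ngram_ids:
        syn0_ngrams[gid] += work                # ...and to each of its n-grams
    return l1
```

Sharing the gradient across the n-gram rows is what lets FastText produce vectors for out-of-vocabulary words at inference time: an unseen word is represented by its n-grams alone.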