models.callbacks – Callbacks to track and visualize LDA training

Callbacks can be used to observe the training process.

Since training on huge corpora can be time-consuming, we want to offer users some insight into the process, in real time. This way, convergence issues or other potential problems can be identified early, saving precious time and resources.

The metrics exposed through this module can be used to construct Callbacks, which are called at specific points during training, such as “epoch start” or “epoch end”. The metrics can then be used to assess the model’s convergence or correctness, save the model, visualize intermediate results, or do anything else you need.

Usage examples

To implement a callback for a word-embedding model, inherit from the CallbackAny2Vec base class and override one or more of its methods.

Create a callback to save the model after each training epoch:

>>> from gensim.test.utils import get_tmpfile
>>> from gensim.models.callbacks import CallbackAny2Vec
>>>
>>>
>>> class EpochSaver(CallbackAny2Vec):
...     '''Callback to save model after each epoch.'''
...
...     def __init__(self, path_prefix):
...         self.path_prefix = path_prefix
...         self.epoch = 0
...
...     def on_epoch_end(self, model):
...         output_path = get_tmpfile('{}_epoch{}.model'.format(self.path_prefix, self.epoch))
...         model.save(output_path)
...         self.epoch += 1
...

Create a callback to print progress information to the console:

>>> class EpochLogger(CallbackAny2Vec):
...     '''Callback to log information about training'''
...
...     def __init__(self):
...         self.epoch = 0
...
...     def on_epoch_begin(self, model):
...         print("Epoch #{} start".format(self.epoch))
...
...     def on_epoch_end(self, model):
...         print("Epoch #{} end".format(self.epoch))
...         self.epoch += 1
...
>>> from gensim.test.utils import common_texts
>>> from gensim.models import Word2Vec
>>>
>>> epoch_logger = EpochLogger()
>>> w2v_model = Word2Vec(common_texts, iter=5, size=10, min_count=0, seed=42, callbacks=[epoch_logger])
Epoch #0 start
Epoch #0 end
Epoch #1 start
Epoch #1 end
Epoch #2 start
Epoch #2 end
Epoch #3 start
Epoch #3 end
Epoch #4 start
Epoch #4 end

Create and bind a callback to a topic model. This callback will log the perplexity metric in real time:

>>> from gensim.models.callbacks import PerplexityMetric
>>> from gensim.models.ldamodel import LdaModel
>>> from gensim.test.utils import common_corpus, common_dictionary
>>>
>>> # Log the perplexity score at the end of each epoch.
>>> perplexity_logger = PerplexityMetric(corpus=common_corpus, logger='shell')
>>> lda = LdaModel(common_corpus, id2word=common_dictionary, num_topics=5, callbacks=[perplexity_logger])
class gensim.models.callbacks.Callback(metrics)

Bases: object

A class representing routines called reactively at specific phases during training.

These can be used to log or visualize the training progress using any of the metrics defined in this module. The metric values are stored at the end of each training epoch. The following metric classes are currently available:

  • CoherenceMetric

  • PerplexityMetric

  • DiffMetric

  • ConvergenceMetric

Parameters

metrics (list of Metric) – The list of metrics to be reported by the callback.
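
For illustration, here is a sketch of attaching several metrics to a topic model at once (LdaModel wraps the metric list in a Callback internally); the toy corpus from gensim.test.utils is assumed:

>>> from gensim.models.callbacks import PerplexityMetric, ConvergenceMetric
>>> from gensim.models.ldamodel import LdaModel
>>> from gensim.test.utils import common_corpus, common_dictionary
>>>
>>> # Both metrics are evaluated and logged at the end of every pass over the corpus.
>>> perplexity = PerplexityMetric(corpus=common_corpus, logger='shell')
>>> convergence = ConvergenceMetric(distance='jaccard', logger='shell')
>>>
>>> lda = LdaModel(
...     common_corpus, id2word=common_dictionary, num_topics=5, passes=2,
...     callbacks=[perplexity, convergence]
... )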

on_epoch_end(epoch, topics=None)

Report the current epoch’s metric value.

Called at the end of each training iteration.

Parameters
  • epoch (int) – The epoch that just ended.

  • topics (list of list of str, optional) – List of tokenized topics. This is required for the coherence metric.

Returns

Mapping from metric names to their values. The type of each value depends on the metric type, for example DiffMetric computes a matrix while ConvergenceMetric computes a float.

Return type

dict of (str, object)

set_model(model)

Save the model instance and initialize any required variables which would be updated throughout training.

Parameters

model (BaseTopicModel) – The model for which the training will be reported (logged or visualized) by the callback.

class gensim.models.callbacks.CallbackAny2Vec

Bases: object

Base class to build callbacks for BaseWordEmbeddingsModel.

Callbacks are used to apply custom functions over the model at specific points during training (epoch start, batch end etc.). This is a base class and its purpose is to be inherited by custom Callbacks that implement one or more of its methods (depending on the point during training where they want some action to be taken).

See examples at the module level docstring for how to define your own callbacks by inheriting from this class.
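
As a further illustrative sketch (not part of the original examples), a callback can also hook on_train_begin / on_train_end and read the running loss that Word2Vec accumulates when it is created with compute_loss=True:

>>> from gensim.models import Word2Vec
>>> from gensim.models.callbacks import CallbackAny2Vec
>>> from gensim.test.utils import common_texts
>>>
>>> class LossLogger(CallbackAny2Vec):
...     '''Callback to print the cumulative training loss after each epoch.'''
...
...     def __init__(self):
...         self.epoch = 0
...
...     def on_train_begin(self, model):
...         print("Training started")
...
...     def on_epoch_end(self, model):
...         # get_latest_training_loss() reports the loss accumulated so far;
...         # it is only meaningful when the model was built with compute_loss=True.
...         print("Epoch #{} loss: {}".format(self.epoch, model.get_latest_training_loss()))
...         self.epoch += 1
...
...     def on_train_end(self, model):
...         print("Training finished")
...
>>> model = Word2Vec(common_texts, iter=5, size=10, min_count=0, compute_loss=True, callbacks=[LossLogger()])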

on_batch_begin(model)

Method called at the start of each batch.

Parameters

model (BaseWordEmbeddingsModel) – Current model.

on_batch_end(model)

Method called at the end of each batch.

Parameters

model (BaseWordEmbeddingsModel) – Current model.

on_epoch_begin(model)

Method called at the start of each epoch.

Parameters

model (BaseWordEmbeddingsModel) – Current model.

on_epoch_end(model)

Method called at the end of each epoch.

Parameters

model (BaseWordEmbeddingsModel) – Current model.

on_train_begin(model)

Method called at the start of the training process.

Parameters

model (BaseWordEmbeddingsModel) – Current model.

on_train_end(model)

Method called at the end of the training process.

Parameters

model (BaseWordEmbeddingsModel) – Current model.

class gensim.models.callbacks.CoherenceMetric(corpus=None, texts=None, dictionary=None, coherence=None, window_size=None, topn=10, logger=None, viz_env=None, title=None)

Bases: gensim.models.callbacks.Metric

Metric class for coherence evaluation.

See also

CoherenceModel

Parameters
  • corpus ({iterable of list of (int, float), scipy.sparse.csc}, optional) – Stream of document vectors or sparse matrix of shape (num_terms, num_documents).

  • texts (list of list of str, optional) – Tokenized texts needed for coherence models that use sliding window based probability estimator.

  • dictionary (Dictionary, optional) – Gensim dictionary mapping from integer IDs to words, needed to create corpus. If model.id2word is present, this is not needed. If both are provided, dictionary will be used.

  • coherence ({'u_mass', 'c_v', 'c_uci', 'c_npmi'}, optional) – Coherence measure to be used. ‘c_uci’ is also known as ‘c_pmi’ in the literature. For ‘u_mass’, the corpus MUST be provided. If texts is provided, it will be converted to corpus using the dictionary. For ‘c_v’, ‘c_uci’ and ‘c_npmi’, texts MUST be provided. Corpus is not needed.

  • window_size (int, optional) –

    Size of the window to be used for coherence measures using boolean sliding window as their probability estimator. For ‘u_mass’ this doesn’t matter. If ‘None’, the default window sizes are used which are:

    • c_v - 110

    • c_uci - 10

    • c_npmi - 10

  • topn (int, optional) – Number of top words to be extracted from each topic.

  • logger ({'shell', 'visdom'}, optional) – Monitor training process using one of the available methods. ‘shell’ will print the coherence value in the active shell, while ‘visdom’ will visualize the coherence value with increasing epochs using the Visdom visualization framework.

  • viz_env (object, optional) – Visdom environment to use for plotting the graph. Unused.

  • title (str, optional) – Title of the graph plot in case logger == ‘visdom’. Unused.
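
A usage sketch analogous to the perplexity example in the module docstring, assuming the same toy corpus from gensim.test.utils; with ‘u_mass’ only the corpus is needed:

>>> from gensim.models.callbacks import CoherenceMetric
>>> from gensim.models.ldamodel import LdaModel
>>> from gensim.test.utils import common_corpus, common_dictionary
>>>
>>> # 'u_mass' only requires the corpus, so no tokenized texts are passed.
>>> coherence_logger = CoherenceMetric(corpus=common_corpus, coherence='u_mass', logger='shell')
>>> lda = LdaModel(common_corpus, id2word=common_dictionary, num_topics=5, callbacks=[coherence_logger])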

get_value(**kwargs)

Get the coherence score.

Parameters

**kwargs – Keyword arguments to override the object’s internal attributes. One of the following is expected: a pre-trained topic model passed via the model key, or a list of tokenized topics passed via the topics key.

Returns

The coherence score.

Return type

float

set_parameters(**parameters)

Set the metric parameters.

Parameters

**parameters – Keyword arguments to override the object’s internal attributes.

class gensim.models.callbacks.ConvergenceMetric(distance='jaccard', num_words=100, n_ann_terms=10, diagonal=True, annotation=False, normed=True, logger=None, viz_env=None, title=None)

Bases: gensim.models.callbacks.Metric

Metric class for convergence evaluation.

Parameters
  • distance ({'kullback_leibler', 'hellinger', 'jaccard'}, optional) – Measure used to calculate difference between any topic pair.

  • num_words (int, optional) – The number of most relevant words used if distance == ‘jaccard’. Also used for annotating topics.

  • n_ann_terms (int, optional) – Max number of words in intersection/symmetric difference between topics. Used for annotation.

  • diagonal (bool, optional) – Whether we need the difference between identical topics (the diagonal of the difference matrix).

  • annotation (bool, optional) – Whether the intersection or difference of words between two topics should be returned.

  • normed (bool, optional) – Whether the matrix should be normalized or not.

  • logger ({'shell', 'visdom'}, optional) – Monitor training process using one of the available methods. ‘shell’ will print the convergence value in the active shell, while ‘visdom’ will visualize the convergence value with increasing epochs using the Visdom visualization framework.

  • viz_env (object, optional) – Visdom environment to use for plotting the graph. Unused.

  • title (str, optional) – Title of the graph plot in case logger == ‘visdom’. Unused.

get_value(**kwargs)

Get the sum of each element in the difference matrix between each pair of topics in two topic models.

A small difference between the partially trained models produced by subsequent training iterations can indicate that the model has stopped significantly improving and has therefore converged to a local or global optimum.

Parameters

**kwargs – Keyword arguments to override the object’s internal attributes. Two models of type LdaModel or its wrappers are expected, passed via the keys model and other_model.

Returns

The sum of the difference matrix between two trained topic models (usually the same model after two subsequent training iterations).

Return type

float
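
A sketch of the direct call described above; the two model snapshots here are simply the same toy model before and after one extra pass over the corpus (an illustrative setup, not from the original docs):

>>> from gensim.models.callbacks import ConvergenceMetric
>>> from gensim.models.ldamodel import LdaModel
>>> from gensim.test.utils import common_corpus, common_dictionary
>>> import copy
>>>
>>> lda = LdaModel(common_corpus, id2word=common_dictionary, num_topics=5)
>>> previous = copy.deepcopy(lda)   # snapshot of the model before the extra pass
>>> lda.update(common_corpus)       # one more pass over the same corpus
>>>
>>> convergence = ConvergenceMetric(distance='jaccard')
>>> value = convergence.get_value(model=lda, other_model=previous)  # a small value suggests little change between passes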

set_parameters(**parameters)

Set the metric parameters.

Parameters

**parameters – Keyword arguments to override the object’s internal attributes.

class gensim.models.callbacks.DiffMetric(distance='jaccard', num_words=100, n_ann_terms=10, diagonal=True, annotation=False, normed=True, logger=None, viz_env=None, title=None)

Bases: gensim.models.callbacks.Metric

Metric class for topic difference evaluation.

Parameters
  • distance ({'kullback_leibler', 'hellinger', 'jaccard'}, optional) – Measure used to calculate difference between any topic pair.

  • num_words (int, optional) – The number of most relevant words used if distance == ‘jaccard’. Also used for annotating topics.

  • n_ann_terms (int, optional) – Max number of words in intersection/symmetric difference between topics. Used for annotation.

  • diagonal (bool, optional) – Whether we need the difference between identical topics (the diagonal of the difference matrix).

  • annotation (bool, optional) – Whether the intersection or difference of words between two topics should be returned.

  • normed (bool, optional) – Whether the matrix should be normalized or not.

  • logger ({'shell', 'visdom'}, optional) – Monitor training process using one of the available methods. ‘shell’ will print the topic difference in the active shell, while ‘visdom’ will visualize it with increasing epochs using the Visdom visualization framework.

  • viz_env (object, optional) – Visdom environment to use for plotting the graph. Unused.

  • title (str, optional) – Title of the graph plot in case logger == ‘visdom’. Unused.

get_value(**kwargs)

Get the difference between each pair of topics in two topic models.

Parameters

**kwargs – Keyword arguments to override the object’s internal attributes. Two models of type LdaModel or its wrappers are expected, passed via the keys model and other_model.

Returns

  • np.ndarray of shape (model.num_topics, other_model.num_topics) – Matrix of differences between each pair of topics.

  • np.ndarray of shape (model.num_topics, other_model.num_topics, 2), optional – Annotation matrix where for each pair we include the word from the intersection of the two topics, and the word from the symmetric difference of the two topics. Only included if annotation == True.
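
A direct-call sketch mirroring the convergence example above; the setup (a deep copy taken before one extra training pass) is illustrative only:

>>> from gensim.models.callbacks import DiffMetric
>>> from gensim.models.ldamodel import LdaModel
>>> from gensim.test.utils import common_corpus, common_dictionary
>>> import copy
>>>
>>> lda = LdaModel(common_corpus, id2word=common_dictionary, num_topics=5)
>>> previous = copy.deepcopy(lda)   # snapshot before the extra pass
>>> lda.update(common_corpus)       # one more pass over the same corpus
>>>
>>> diff = DiffMetric(distance='jaccard', diagonal=False, annotation=False)
>>> result = diff.get_value(model=lda, other_model=previous)  # difference between each pair of topics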

set_parameters(**parameters)

Set the metric parameters.

Parameters

**parameters – Keyword arguments to override the object’s internal attributes.

class gensim.models.callbacks.Metric

Bases: object

Base Metric class for topic model evaluation metrics.

Concrete implementations include:

  • CoherenceMetric

  • PerplexityMetric

  • DiffMetric

  • ConvergenceMetric

get_value()

Get the metric’s value at this point in time.

Warning

The user must provide a concrete implementation for this method for every subclass of this class.

Returns

The metric’s type depends on what exactly it measures. In the simplest case it might be a real number corresponding to an error estimate. It could however be anything else that is useful to report or visualize.

Return type

object

set_parameters(**parameters)

Set the metric parameters.

Parameters

**parameters – Keyword arguments to override the object’s internal attributes.
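
Per the warning on get_value(), a custom metric only needs to implement that method; set_parameters() copies any keyword arguments (such as the current model) onto the instance. A minimal, hypothetical sketch:

>>> from gensim.models.callbacks import Metric
>>>
>>> class TopicCountMetric(Metric):
...     '''Toy metric that reports the number of topics in the current model.'''
...
...     def __init__(self, logger=None, title=None):
...         # logger/title mirror the built-in metrics so the callback machinery
...         # knows where (and under which label) to report the value.
...         self.logger = logger
...         self.title = title
...
...     def get_value(self, **kwargs):
...         # The callback passes the current model via the 'model' keyword;
...         # set_parameters stores it as self.model.
...         super(TopicCountMetric, self).set_parameters(**kwargs)
...         return self.model.num_topics
...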

class gensim.models.callbacks.PerplexityMetric(corpus=None, logger=None, viz_env=None, title=None)

Bases: gensim.models.callbacks.Metric

Metric class for perplexity evaluation.

Parameters
  • corpus ({iterable of list of (int, float), scipy.sparse.csc}, optional) – Stream of document vectors or sparse matrix of shape (num_terms, num_documents).

  • logger ({'shell', 'visdom'}, optional) – Monitor training process using one of the available methods. ‘shell’ will print the perplexity value in the active shell, while ‘visdom’ will visualize the perplexity value with increasing epochs using the Visdom visualization framework.

  • viz_env (object, optional) – Visdom environment to use for plotting the graph. Unused.

  • title (str, optional) – Title of the graph plot in case logger == ‘visdom’. Unused.

get_value(**kwargs)

Get the perplexity score.

Parameters

**kwargs – Keyword arguments to override the object’s internal attributes. A trained topic model is expected using the ‘model’ key. This can be of type LdaModel, or one of its wrappers, such as LdaMallet or LdaVowpalWabbit.

Returns

The perplexity score.

Return type

float

set_parameters(**parameters)

Set the metric parameters.

Parameters

**parameters – Keyword arguments to override the object’s internal attributes.