`models.lsi_worker` – Worker for distributed LSI

Worker (“slave”) process used in computing distributed Latent Semantic Indexing (LSI, LsiModel) models.

Run this script on every node in your cluster. You may also run it several times on a single machine to make better use of multiple cores; just beware that the memory footprint increases linearly with the number of worker processes.

How to use distributed LSI

Install needed dependencies (Pyro4)
```
pip install gensim[distributed]
```

Set up serialization (on each machine)

```
export PYRO_SERIALIZERS_ACCEPTED=pickle
export PYRO_SERIALIZER=pickle
```

Run nameserver
```
python -m Pyro4.naming -n 0.0.0.0 &
```
Run workers (on each machine)
```
python -m gensim.models.lsi_worker &
```

Run dispatcher

```
python -m gensim.models.lsi_dispatcher &
```

Run LsiModel in distributed mode:

```
>>> from gensim.test.utils import common_corpus, common_dictionary
>>> from gensim.models import LsiModel
>>>
>>> model = LsiModel(common_corpus, id2word=common_dictionary, distributed=True)
```
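The worker step above can also be scripted when several workers should share one machine. The following is a minimal sketch, not part of gensim: `worker_commands` and `pyro_environment` are hypothetical helper names, and the sketch only builds the command lines and environment without starting any process.

```python
import os
import sys


def worker_commands(n_workers):
    """Build one command line per worker process (nothing is executed here)."""
    return [
        [sys.executable, "-m", "gensim.models.lsi_worker"]
        for _ in range(n_workers)
    ]


def pyro_environment(base_env=None):
    """Copy an environment and add the serializer settings from the steps above."""
    env = dict(os.environ if base_env is None else base_env)
    env["PYRO_SERIALIZERS_ACCEPTED"] = "pickle"
    env["PYRO_SERIALIZER"] = "pickle"
    return env


# Four workers on one machine; memory use grows with this number.
commands = worker_commands(4)
env = pyro_environment()

# To actually start the workers, each command could be passed to
# subprocess.Popen(cmd, env=env). That call is left out on purpose
# so that this sketch has no side effects.
```

Keeping the command construction separate from the launch makes it easy to check what would be run before any process is started.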

Command line arguments

...

options:
  -h, --help  show this help message and exit

class gensim.models.lsi_worker.Worker

Bases: object

Partly initialize the model.

A full initialization requires a call to initialize().

exit()

Terminate the worker.

getstate()

Log and get the LSI model’s current projection.

Returns: The current projection.
Return type: Projection

initialize(myid, dispatcher, **model_params)

Fully initialize the worker.

Parameters

myid (int) – An ID number used to identify this worker in the dispatcher object.
dispatcher (Dispatcher) – The dispatcher responsible for scheduling this worker.
**model_params – Keyword parameters to initialize the inner LSI model, see LsiModel.

processjob(job)

Incrementally process the job and potentially log progress.

Parameters: job (iterable of list of (int, float)) – Corpus in BoW format.
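To make the `job` parameter concrete, here is a small sketch of what a job in BoW format could look like. Each document is a list of `(token_id, weight)` pairs, and a job is an iterable of such documents. The `chunk_corpus` helper is hypothetical and only illustrates splitting a corpus into jobs; this page does not describe how the dispatcher actually forms them.

```python
from itertools import islice


def chunk_corpus(corpus, chunksize):
    """Yield successive lists of at most `chunksize` documents."""
    iterator = iter(corpus)
    while True:
        chunk = list(islice(iterator, chunksize))
        if not chunk:
            return
        yield chunk


# Three toy documents in BoW format: lists of (token_id, weight) pairs.
corpus = [
    [(0, 1.0), (1, 1.0), (2, 1.0)],
    [(0, 1.0), (3, 2.0)],
    [(2, 1.0), (4, 1.0)],
]

# Each element of `jobs` has the shape expected by the `job` parameter.
jobs = list(chunk_corpus(corpus, chunksize=2))
```

With `chunksize=2`, the first job holds two documents and the second holds the remaining one.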

requestjob()

Request jobs from the dispatcher, in a perpetual loop until getstate() is called.

Raises: RuntimeError – If self.model is None (i.e. worker not initialized).
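The lifecycle described above (partial construction, `initialize()`, a request loop that ends once `getstate()` is called, and a `RuntimeError` when the model is missing) can be illustrated with a toy, in-process sketch. This is an assumption-laden illustration rather than gensim's implementation: `ToyDispatcher` and `ToyWorker` are hypothetical classes, there is no Pyro4 networking, a plain list stands in for the LSI model, and the loop also stops when the queue is empty so that the example terminates.

```python
import queue


class ToyDispatcher:
    """Hypothetical stand-in for the dispatcher: hands out jobs from a queue."""

    def __init__(self, jobs):
        self._jobs = queue.Queue()
        for job in jobs:
            self._jobs.put(job)

    def getjob(self, worker_id):
        # Return None when no work is left (a simplification for this sketch).
        try:
            return self._jobs.get_nowait()
        except queue.Empty:
            return None


class ToyWorker:
    """Hypothetical worker mirroring the method names documented above."""

    def __init__(self):
        self.model = None  # only set by initialize()
        self.finished = False

    def initialize(self, myid, dispatcher):
        self.myid = myid
        self.dispatcher = dispatcher
        self.model = []  # a plain list stands in for the inner LSI model
        self.finished = False

    def requestjob(self):
        if self.model is None:
            raise RuntimeError("worker must be initialized before requesting jobs")
        while not self.finished:
            job = self.dispatcher.getjob(self.myid)
            if job is None:
                break  # toy-only exit so the example terminates
            self.processjob(job)

    def processjob(self, job):
        self.model.extend(job)

    def getstate(self):
        self.finished = True  # ends the request loop
        return list(self.model)


worker = ToyWorker()
try:
    worker.requestjob()  # not initialized yet
except RuntimeError as err:
    error_message = str(err)

worker.initialize(0, ToyDispatcher([[1, 2], [3]]))
worker.requestjob()
state = worker.getstate()
```

The `finished` flag is the design point here: the loop itself never decides when training is over, it only reacts to `getstate()` being called.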

reset()

Reset the worker by deleting its current projection.
