sklearn_api.rpmodel – Scikit learn wrapper for Random Projection model


Scikit-learn interface for RpModel.

Follows scikit-learn API conventions to facilitate using gensim along with scikit-learn.

Examples

>>> from gensim.sklearn_api.rpmodel import RpTransformer
>>> from gensim.test.utils import common_dictionary, common_corpus
>>>
>>> # Initialize and fit the model.
>>> model = RpTransformer(id2word=common_dictionary).fit(common_corpus)
>>>
>>> # Use the trained model to transform a document.
>>> result = model.transform(common_corpus[3])

class gensim.sklearn_api.rpmodel.RpTransformer(id2word=None, num_topics=300)

Bases: sklearn.base.TransformerMixin, sklearn.base.BaseEstimator

Base Random Projection module, wraps RpModel.

For more information, see Random projection.

Parameters

id2word (Dictionary, optional) – Mapping token_id -> token; determined from the corpus if id2word is None.
num_topics (int, optional) – Number of dimensions of the projected output vectors.

fit(X, y=None)

Fit the model according to the given training data.

Parameters: X (iterable of list of (int, number)) – Input corpus in BOW format.
Returns: The trained model.
Return type: RpTransformer
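
The BOW input format expected by fit can be illustrated with a small standard-library sketch. The `token2id` mapping and the `to_bow` helper are hypothetical names introduced here for illustration; in practice a gensim Dictionary would normally build these pairs.

```python
from collections import Counter

# Hypothetical token -> id mapping, standing in for a dictionary's vocabulary.
token2id = {"graph": 0, "minors": 1, "trees": 2}


def to_bow(tokens, token2id):
    """Return a sorted list of (token_id, count) pairs, skipping unknown tokens."""
    counts = Counter(token2id[t] for t in tokens if t in token2id)
    return sorted(counts.items())


# A corpus is an iterable of such documents: iterable of list of (int, number).
corpus = [
    to_bow(["graph", "trees", "graph"], token2id),
    to_bow(["minors", "survey"], token2id),  # "survey" is not in the mapping
]
print(corpus)
```

Each document is a list of (token_id, count) pairs, and tokens absent from the mapping are simply dropped in this sketch.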

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

X (numpy array of shape [n_samples, n_features]) – Training set.
y (numpy array of shape [n_samples]) – Target values.

Returns

X_new – Transformed array.

Return type

numpy array of shape [n_samples, n_features_new]

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: mapping of string to any

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Returns: The estimator instance.
Return type: self
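
The <component>__<parameter> naming convention can be sketched as follows. This only illustrates how such a name splits at the first double underscore; it is not scikit-learn's implementation, and the step name "rp" is a hypothetical pipeline component chosen for the example.

```python
def split_nested_param(name):
    """Split 'component__parameter' at the first double underscore.

    Returns (component, parameter); component is None for a plain name.
    """
    head, delim, tail = name.partition("__")
    if delim:
        return head, tail
    return None, head


# "rp" is a hypothetical pipeline step holding an RpTransformer.
print(split_nested_param("rp__num_topics"))
print(split_nested_param("num_topics"))
```

Under this convention, a pipeline would route "rp__num_topics" to the component named "rp", which then receives the plain parameter name "num_topics".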

transform(docs)

Find the Random Projection factors for docs.

Parameters: docs ({iterable of iterable of (int, int), list of (int, number)}) – Document or documents to be transformed in BOW format.
Returns: RP representation for each input document.
Return type: numpy.ndarray of shape [len(docs), num_topics]
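
To give a feel for what a random projection of BOW documents produces, here is a minimal NumPy sketch. It assumes a dense matrix of random +1/-1 entries and a 1/sqrt(num_topics) scaling; the `random_projection` function and its `seed` argument are hypothetical, and RpModel itself may differ in detail.

```python
import numpy as np


def random_projection(docs, num_terms, num_topics, seed=0):
    """Project BOW documents into num_topics dimensions (illustrative only)."""
    rng = np.random.default_rng(seed)
    # Assumed projection matrix: random +1/-1 entries, shape (num_topics, num_terms).
    projection = rng.choice([-1.0, 1.0], size=(num_topics, num_terms))
    # Densify the BOW documents into an array of shape (len(docs), num_terms).
    dense = np.zeros((len(docs), num_terms))
    for row, doc in enumerate(docs):
        for token_id, weight in doc:
            dense[row, token_id] = weight
    # Project each document and scale by 1 / sqrt(num_topics).
    return dense @ projection.T / np.sqrt(num_topics)


docs = [[(0, 2), (2, 1)], [(1, 1)]]
result = random_projection(docs, num_terms=3, num_topics=4)
print(result.shape)
```

The output has one row per input document and num_topics columns, matching the return shape documented for transform.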
