sklearn_api.rpmodel – Scikit learn wrapper for Random Projection model
Scikit-learn interface for RpModel.
Follows scikit-learn API conventions to facilitate using gensim along with scikit-learn.
Examples
>>> from gensim.sklearn_api.rpmodel import RpTransformer
>>> from gensim.test.utils import common_dictionary, common_corpus
>>>
>>> # Initialize and fit the model.
>>> model = RpTransformer(id2word=common_dictionary).fit(common_corpus)
>>>
>>> # Use the trained model to transform a document.
>>> result = model.transform(common_corpus[3])
gensim.sklearn_api.rpmodel.RpTransformer(id2word=None, num_topics=300)
Bases: sklearn.base.TransformerMixin, sklearn.base.BaseEstimator
Base Random Projection module, wraps RpModel.
For more information, see Random projection.
Parameters
id2word (Dictionary, optional) – Mapping token_id -> token; determined from the corpus if id2word is None.
num_topics (int, optional) – Number of dimensions of the projected space.
fit(X, y=None)
Fit the model according to the given training data.
Parameters
X (iterable of list of (int, number)) – Input corpus in BOW format.
Returns
The trained model.
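The BOW format that fit expects can be illustrated with a minimal, standard-library sketch. The helpers build_vocabulary and to_bow below are hypothetical names introduced only for illustration; in practice a gensim Dictionary and its doc2bow method would normally produce this structure.

```python
from collections import Counter


def build_vocabulary(tokenized_docs):
    """Assign an integer id to each distinct token, in order of first appearance."""
    token2id = {}
    for doc in tokenized_docs:
        for token in doc:
            token2id.setdefault(token, len(token2id))
    return token2id


def to_bow(doc, token2id):
    """Convert one tokenized document to a sorted list of (token_id, count) pairs."""
    counts = Counter(token2id[token] for token in doc if token in token2id)
    return sorted(counts.items())


# Illustrative toy data, not taken from gensim's test corpus.
tokenized_docs = [
    ["human", "computer", "interface"],
    ["graph", "minors", "graph", "trees"],
]
token2id = build_vocabulary(tokenized_docs)

# An "iterable of list of (int, number)": one list of pairs per document.
corpus = [to_bow(doc, token2id) for doc in tokenized_docs]
```

Each inner list describes one document, and each pair gives a token id and its weight in that document (here a raw count).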
fit_transform(X, y=None, **fit_params)
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
Parameters
X (numpy array of shape [n_samples, n_features]) – Training set.
y (numpy array of shape [n_samples]) – Target values.
Returns
X_new – Transformed array.
Return type
numpy array of shape [n_samples, n_features_new]
get_params(deep=True)
Get parameters for this estimator.
Parameters
deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns
params – Parameter names mapped to their values.
Return type
mapping of string to any
set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
Returns
self
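The <component>__<parameter> naming convention can be sketched in plain Python. This is only an illustration of how such names decompose, not scikit-learn's actual implementation; the function split_nested_params and the step names "rp" and "clf" are assumptions made for the example.

```python
def split_nested_params(params):
    """Group {'component__param': value} entries by component name.

    Names without a '__' delimiter are treated as parameters of the
    outer object and are stored under the key None.
    """
    grouped = {}
    for name, value in params.items():
        component, delimiter, sub_name = name.partition("__")
        if delimiter:
            grouped.setdefault(component, {})[sub_name] = value
        else:
            grouped.setdefault(None, {})[name] = value
    return grouped


# "rp" and "clf" stand for hypothetical pipeline step names.
grouped = split_nested_params({"rp__num_topics": 50, "clf__C": 0.1, "verbose": True})
```

Under this convention, a pipeline whose RpTransformer step is named "rp" would receive num_topics through the keyword rp__num_topics.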
transform(docs)
Find the Random Projection factors for docs.
Parameters
docs ({iterable of iterable of (int, int), list of (int, number)}) – Document or documents to be transformed in BOW format.
Returns
RP representation for each input document.
Return type
numpy.ndarray of shape [len(docs), num_topics]
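To make the output shape concrete, here is a rough pure-Python sketch of a random projection applied to BOW documents. It is not gensim's implementation: the ±1 matrix, the 1/sqrt(num_topics) scaling, and the helper names are assumptions chosen for illustration, and nested lists stand in for the numpy.ndarray that transform returns.

```python
import random


def random_projection_matrix(num_terms, num_topics, seed=0):
    """Build a num_topics x num_terms matrix with random +1/-1 entries."""
    rng = random.Random(seed)
    return [[rng.choice((-1.0, 1.0)) for _ in range(num_terms)]
            for _ in range(num_topics)]


def project(doc_bow, projection):
    """Project one sparse BOW document onto len(projection) dimensions."""
    scale = 1.0 / len(projection) ** 0.5
    return [scale * sum(row[term_id] * weight for term_id, weight in doc_bow)
            for row in projection]


def transform(docs, projection):
    """Accept a single BOW document or a list of them; return one row per document."""
    if docs and isinstance(docs[0], tuple):
        docs = [docs]  # a single document was passed
    return [project(doc, projection) for doc in docs]


# Toy corpus over 3 terms, projected to 4 dimensions.
docs = [[(0, 1), (2, 3)], [(1, 2)]]
projection = random_projection_matrix(num_terms=3, num_topics=4)
vectors = transform(docs, projection)   # 2 rows, 4 columns
single = transform(docs[0], projection)  # 1 row, 4 columns
```

As in the description above, the result has one row per input document and num_topics columns, whether one document or several are passed.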