corpora.opinosiscorpus – Topic related review sentences

Creates a corpus and dictionary from the Opinosis dataset.

References

[1] Ganesan, Kavita; Zhai, ChengXiang; Han, Jiawei. Opinosis: a graph-based approach to abstractive summarization of highly redundant opinions [online]. In: Proceedings of the 23rd International Conference on Computational Linguistics. 2010. p. 340-348. Available from: https://kavita-ganesan.com/opinosis/

class gensim.corpora.opinosiscorpus.OpinosisCorpus(path)

Bases: object

Creates a corpus and dictionary from the Opinosis dataset.

http://kavita-ganesan.com/opinosis-opinion-dataset/

The data is organized in folders, each containing a few short documents.

Data can be obtained quickly using the following commands in bash:

    mkdir opinosis && cd opinosis
    wget https://github.com/kavgan/opinosis/raw/master/OpinosisDataset1.0_0.zip
    unzip OpinosisDataset1.0_0.zip

The corpus and dictionary can be accessed through the .corpus and .id2word members.
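A minimal usage sketch of the two members follows. It assumes the dataset has already been extracted into a folder named opinosis, and that .corpus is a list of bag-of-words documents, that is, lists of (token_id, count) pairs, as is usual for gensim corpora. The helper names load_opinosis and most_frequent_tokens are hypothetical and not part of gensim.

```python
from collections import Counter


def load_opinosis(path="opinosis"):
    """Load the extracted dataset and return (corpus, id2word).

    Requires gensim and the downloaded files; the import is done lazily so
    that the helper below can be used on any bag-of-words corpus.
    """
    from gensim.corpora.opinosiscorpus import OpinosisCorpus

    data = OpinosisCorpus(path)
    return data.corpus, data.id2word


def most_frequent_tokens(corpus, id2word, n=5):
    """Sum the counts over all documents and return the n most frequent tokens.

    Assumes each document is a list of (token_id, count) pairs and that
    id2word maps a token id back to its string form.
    """
    totals = Counter()
    for bow in corpus:
        for token_id, count in bow:
            totals[token_id] += count
    return [(id2word[token_id], count) for token_id, count in totals.most_common(n)]
```

With the dataset in place, calling most_frequent_tokens(*load_opinosis("opinosis")) would list the most common tokens across all review documents.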

Load the downloaded corpus.

Parameters

path (string) – Path to the extracted contents of the zip file. If ‘summaries-gold’ is in a folder called ‘opinosis’, then the path parameter would be ‘opinosis’, either relative to your current working directory or absolute.
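Before constructing the corpus, it can help to confirm that the path points at the right folder. The sketch below only uses the standard library; check_opinosis_path is a hypothetical helper, not a gensim function, and it assumes nothing beyond the layout described above (a ‘summaries-gold’ folder directly inside the given path).

```python
import os


def check_opinosis_path(path):
    """Return the absolute form of path if it contains 'summaries-gold'.

    Raises FileNotFoundError otherwise, so a wrong working directory is
    reported early with a message naming the folder that was checked.
    """
    root = os.path.abspath(path)
    gold = os.path.join(root, "summaries-gold")
    if not os.path.isdir(gold):
        raise FileNotFoundError(
            f"expected a 'summaries-gold' folder inside {root!r}"
        )
    return root
```

The returned absolute path can then be passed to OpinosisCorpus, which accepts both relative and absolute paths.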