corpora.opinosiscorpus – Topic related review sentences
Creates a corpus and dictionary from the Opinosis dataset.
References
- 1
Ganesan, Kavita and Zhai, ChengXiang and Han, Jiawei. Opinosis: a graph-based approach to abstractive summarization of highly redundant opinions [online]. In : Proceedings of the 23rd International Conference on Computational Linguistics. 2010. p. 340-348. Available from: https://kavita-ganesan.com/opinosis/
- class gensim.corpora.opinosiscorpus.OpinosisCorpus(path)
Bases:
object
Creates a corpus and dictionary from the Opinosis dataset.
http://kavita-ganesan.com/opinosis-opinion-dataset/
This data is organized in folders, each folder containing a few short docs.
Data can be obtained quickly using the following commands in bash:
mkdir opinosis && cd opinosis
wget https://github.com/kavgan/opinosis/raw/master/OpinosisDataset1.0_0.zip
unzip OpinosisDataset1.0_0.zip
The corpus and dictionary can be accessed through the .corpus and .id2word members.
Load the downloaded corpus.
- Parameters
path (string) – Path to the extracted zip file. If ‘summaries-gold’ is in a folder called ‘opinosis’, then the path parameter would be ‘opinosis’, either relative to your current working directory or absolute.
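The path convention described above can be sketched as follows. This is a minimal illustration, not part of gensim: `resolve_opinosis_path` and `load_opinosis` are hypothetical helper names, and the sketch assumes the dataset was extracted into a folder named `opinosis` so that `opinosis/summaries-gold` exists. Only `OpinosisCorpus(path)` and its `.corpus` and `.id2word` members come from the documentation itself.

```python
import os


def resolve_opinosis_path(path):
    """Return `path` if it contains a 'summaries-gold' folder, else raise.

    Hypothetical helper for illustration; not part of gensim.
    """
    gold = os.path.join(path, "summaries-gold")
    if not os.path.isdir(gold):
        raise FileNotFoundError(
            "expected a 'summaries-gold' folder inside %r" % path
        )
    return path


def load_opinosis(path="opinosis"):
    """Load the extracted dataset and return (corpus, dictionary).

    Hypothetical wrapper; the default folder name is an assumption.
    """
    # Imported lazily so the path check above works without gensim installed.
    from gensim.corpora.opinosiscorpus import OpinosisCorpus

    opinosis = OpinosisCorpus(resolve_opinosis_path(path))
    # The corpus and the dictionary are exposed as plain members.
    return opinosis.corpus, opinosis.id2word


if __name__ == "__main__":
    corpus, id2word = load_opinosis("opinosis")
    print(len(corpus), "documents,", len(id2word), "unique tokens")
```

Checking for `summaries-gold` before constructing the corpus turns a wrong path into an explicit error rather than leaving the mistake to be noticed later.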