IAL for Opinion Analysis

This project aims to extend IAL to develop information extractors for opinion analysis on the TREC 06 Blog Corpus (BLOG06). These extractors will be used in the Ephyra question answering system as part of CMU's submission to the TREC 2008 QA track.

News

Meeting Minutes

Related Literature

TREC Blog Corpus

Overview of the TREC-2006 Blog Track, Ounis et al, TREC 2006.

Macdonald and Ounis, The TREC Blogs06 Collection: Creating and Analysing a Blog Test Collection

(to appear) Jonathan Elsas, Vitor R. Carvalho, Jaime G. Carbonell. Fast Learning of Document Ranking Functions with the Committee Perceptron, Proceedings of the First ACM International Conference on Web Search and Data Mining (WSDM 2008), 2008.

Jonathan Elsas, Jaime Arguello, Jamie Callan, Jaime G. Carbonell. Retrieval and Feedback Models for Blog Distillation, Proceedings of the 2007 Text REtrieval Conference (TREC 2007) (slides, alternate link). Best-performing group in the TREC 2007 Blog Distillation task.

Other Corpora and Resources

The MPQA opinion corpus can be downloaded from this web site. The UPitt OpinionFinder and a subjectivity lexicon can also be found there.

NRRC corpus. Downloaded version available on seit1.lti.cs.cmu.edu at /usr1/data/mpqa.

Opinion Annotation in GATE: annotation tool implemented within GATE, used for annotating opinions in the NRRC corpus (by UPitt).

Opinion and Feature Extraction

Extracting Product Features and Opinions from Reviews, Ana-Maria Popescu, Oren Etzioni, Proceedings of HLT-EMNLP, 2005. The OPINE System.

Joint Extraction of Entities and Relations for Opinion Recognition, Choi et al, EMNLP 2006.

Annotating Expressions of Opinions and Emotions in Language, Wiebe et al., Language Resources and Evaluation 2005.

Recognizing Strong and Weak Opinion Clauses, Wilson et al., Computational Intelligence 2006.

Feature Subsumption for Opinion Analysis, Riloff et al., EMNLP 2006.

Just how mad are you? Finding strong and weak opinion clauses, Wilson et al., AAAI 2004.

Learning Extraction Patterns for Subjective Expressions, Riloff et al., EMNLP 2003

Annotating Opinions in the World Press, Wilson et al., SIGdial-03.

OpinionFinder: A system for subjectivity analysis, Wilson et al., HLT/EMNLP 2005.

Knowledge Transfer and Opinion Detection in the TREC2006 Blog Track, Yang et al., Text REtrieval Conference 2006 (TREC2006).

Sentiment Analysis

Thumbs up or thumbs down? Semantic orientation..., Turney, ACL 2002.

Thumbs up? Sentiment Classification using Machine Learning ..., Pang et al, EMNLP 2002.

Biographies, Bollywood, Boom-boxes and Blenders: Domain Adaptation for Sentiment Classification, Blitzer et al, ACL 2007.

Large-Scale Sentiment Analysis for News and Blogs, Godbole et al, ICWSM 2007

Opinion Analysis for QA

QA with Attitude, Somasundaran et al, ICWSM 2007

Papers on research in Sentiment, Opinions and Subjectivity (A very useful resource)

Social Media page on Sentiment, Opinions and Subjectivity, by Swapna Somasundaran (UPitt)

Spam Blog detection

  • These use a dataset of blogs crawled from the web and manually annotated as splogs or legitimate blogs.

Detecting Spam Blogs: A Machine Learning Approach, Kolari et al, AAAI 2006

Blocking Blog Spam with Language Model Disagreement, Mishne et al, WWW 2005

Blog Track Open Task: Spam Blog Classification, Kolari et al, TREC 2006

SVMs for the Blogosphere: Blog Identification and Splog Detection, Kolari, AAAI 2006

  • These use the BLOG06 dataset, part of which was manually annotated as splogs or legitimate blogs.

Splog Detection Using Self-Similarity Analysis on Blog Temporal Dynamics, Lin et al, AIRWeb 2007

Splog Detection using Content, Time and Link Structures, Lin et al, ICME 2007
