CMU-ISRI-04-127
Institute for Software Research International
School of Computer Science, Carnegie Mellon University
Sentiment Extraction from Unstructured Text using
Xue Bai, Rema Padman, Edoardo Airoldi
July 2004
CMU-ISRI-04-127.ps
In this paper, we propose a two-stage Bayesian algorithm that captures the dependencies among words and, at the same time, finds a vocabulary that is efficient for extracting sentiments. Experimental results on the Movie Reviews data set show that our algorithm selects a parsimonious feature set, with substantially fewer predictor variables than the full data set, and yields better predictions of sentiment orientation than several state-of-the-art machine learning methods. Our findings suggest that sentiments are captured by conditional dependence relations among words, rather than by keywords or high-frequency words.
13 pages