Introduction
Sentiment analysis is increasingly viewed as a vital task from both an academic and a commercial standpoint. Most current approaches, however, attempt to detect the overall polarity of a sentence, paragraph, or text span, regardless of the entities mentioned (e.g., laptops, restaurants) and their aspects (e.g., battery, screen; food, service). By contrast, this task is concerned with aspect-based sentiment analysis (ABSA), where the goal is to identify the aspects of given target entities and the sentiment expressed towards each aspect.
Problem Statement
Given a set of restaurant reviews, identify the aspects of the target entities mentioned in each sentence and determine the sentiment expressed towards each aspect.
Challenges Faced
- Aspect term extraction
  - We used a rule-based method, and the rules did not extract some of the aspect terms correctly.
  - We assumed that all sentences are grammatically correct, which is not always the case in real-world reviews.
- Aspect term polarity
  - In sentences with many aspect terms, it was difficult to assign a polarity to each term.
Dataset
We used the SemEval 2016 dataset, available at: http://metashare.ilsp.gr:8080/repository/browse/semeval-2014-absa-restaurant-reviews-trial-data/1790ab94464211e388f5842b2b6a04d79bb0323cac4f49939bf5b99878dc38be/ The dataset consists of restaurant reviews presented in an XML file. It contains around 100 reviews, each with a varying number of aspect terms.
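As a rough illustration of how such an XML file could be read, the sketch below parses a small hand-written sample with Python's standard library. The element and attribute names (`sentence`, `text`, `aspectTerm`, `aspectCategory`, `term`, `category`, `polarity`) are an assumption based on the usual SemEval ABSA restaurant layout, and the sample sentence is invented, not taken from the dataset.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the ASSUMED SemEval ABSA layout (not real dataset content).
SAMPLE_XML = """<sentences>
  <sentence id="1">
    <text>The food was great but the service was slow.</text>
    <aspectTerms>
      <aspectTerm term="food" polarity="positive" from="4" to="8"/>
      <aspectTerm term="service" polarity="negative" from="27" to="34"/>
    </aspectTerms>
    <aspectCategories>
      <aspectCategory category="food" polarity="positive"/>
      <aspectCategory category="service" polarity="negative"/>
    </aspectCategories>
  </sentence>
</sentences>"""


def load_reviews(xml_text):
    """Return one dict per <sentence> with its text, aspect terms and categories."""
    root = ET.fromstring(xml_text)
    reviews = []
    for sentence in root.iter("sentence"):
        # Each aspect term / category is kept as a (name, polarity) pair.
        terms = [(t.get("term"), t.get("polarity")) for t in sentence.iter("aspectTerm")]
        cats = [(c.get("category"), c.get("polarity")) for c in sentence.iter("aspectCategory")]
        reviews.append({
            "id": sentence.get("id"),
            "text": sentence.findtext("text"),
            "aspect_terms": terms,
            "aspect_categories": cats,
        })
    return reviews


reviews = load_reviews(SAMPLE_XML)
for review in reviews:
    print(review["id"], review["text"], review["aspect_terms"])
```

For the real file, `ET.parse(path).getroot()` could replace `ET.fromstring`, provided the file follows the same layout.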
Tools Used
- Stanford CoreNLP Parser
- NLTK
Phases of the Project
Phase 1
Task 1
- Aspect Term Extraction: This task deals with extracting the aspect terms from the reviews. Given a set of sentences with pre-identified entities (e.g., restaurants), identify the aspect terms present in each sentence and return a list containing all the distinct aspect terms. An aspect term names a particular aspect of the target entity.
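To make the idea of rule-based extraction concrete, here is a minimal sketch of one possible rule: treat every maximal run of noun-tagged tokens as a candidate aspect term and return the distinct ones. This is a stand-in for illustration only, not the rule set used in the project. The input is assumed to be already POS-tagged with Penn Treebank tags (in practice the tags could come from a tagger such as `nltk.pos_tag` or the Stanford parser), and the function name `extract_aspect_terms` is hypothetical.

```python
# Penn Treebank noun tags; runs of these are treated as candidate aspect terms.
NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS"}


def extract_aspect_terms(tagged_tokens):
    """Return distinct noun runs (case-insensitive), in order of first appearance."""
    terms, current = [], []
    for token, tag in tagged_tokens:
        if tag in NOUN_TAGS:
            current.append(token)
        elif current:
            # A non-noun token closes the current noun run.
            terms.append(" ".join(current))
            current = []
    if current:
        terms.append(" ".join(current))

    seen, distinct = set(), []
    for term in terms:
        key = term.lower()
        if key not in seen:
            seen.add(key)
            distinct.append(term)
    return distinct


# Hand-tagged example sentence (the tags are written by hand, not tagger output).
tagged = [
    ("The", "DT"), ("pizza", "NN"), ("crust", "NN"), ("was", "VBD"),
    ("crisp", "JJ"), ("but", "CC"), ("the", "DT"), ("service", "NN"),
    ("was", "VBD"), ("slow", "JJ"), (".", "."),
]
print(extract_aspect_terms(tagged))
```

A rule this simple over-generates (any noun becomes an aspect term) and depends on correct tagging, which is consistent with the difficulties reported for ungrammatical sentences.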
Task 2
- Aspect Polarity Detection: For a given set of aspect terms within a sentence, determine whether the polarity of each aspect term is positive, negative, neutral or conflict (i.e., both positive and negative).
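One simple way to assign a polarity per aspect term is to look only at opinion words within a small window around the term. The sketch below illustrates that idea under loud assumptions: the two word lists are tiny hypothetical lexicons, the window size is arbitrary, and nothing here is claimed to be the method actually used in the project.

```python
# Tiny HYPOTHETICAL opinion lexicons, for illustration only.
POSITIVE = {"great", "delicious", "friendly", "fresh"}
NEGATIVE = {"slow", "rude", "bland", "stale"}


def aspect_polarity(tokens, aspect_index, window=2):
    """Label the aspect at tokens[aspect_index] using opinion words within `window` tokens."""
    lo = max(0, aspect_index - window)
    hi = min(len(tokens), aspect_index + window + 1)
    context = [t.lower() for t in tokens[lo:hi]]
    has_pos = any(t in POSITIVE for t in context)
    has_neg = any(t in NEGATIVE for t in context)
    if has_pos and has_neg:
        return "conflict"
    if has_pos:
        return "positive"
    if has_neg:
        return "negative"
    return "neutral"


tokens = "The food was great but the service was slow".split()
print(aspect_polarity(tokens, tokens.index("food")))
print(aspect_polarity(tokens, tokens.index("service")))
```

The window size matters: with `window=3` the context of "service" in this sentence would also reach "great", and the term would be labelled conflict. This is one way to see why sentences with many aspect terms are hard to handle.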
Phase 2
Task 3
- Given a predefined set of aspect categories (e.g., price, food), identify the aspect categories discussed in a given sentence. Aspect categories are typically coarser than the aspect terms of Subtask 1, and they do not necessarily occur as terms in the given sentence.
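Because a category need not appear literally in the sentence, one simple approach is to map trigger words to categories. The following is a minimal sketch of that idea; the keyword sets and the `service` category are hypothetical choices made for the example, not the project's actual resources.

```python
import re

# HYPOTHETICAL trigger words per category, for illustration only.
CATEGORY_KEYWORDS = {
    "food": {"food", "pizza", "pasta", "dish", "delicious", "tasty"},
    "price": {"price", "expensive", "cheap", "bill", "overpriced"},
    "service": {"service", "waiter", "staff", "rude"},
}


def detect_categories(sentence):
    """Return the categories whose trigger words occur in the sentence."""
    words = set(re.findall(r"[a-z']+", sentence.lower()))
    return [cat for cat, keywords in CATEGORY_KEYWORDS.items() if words & keywords]


print(detect_categories("The pasta was tasty but far too expensive."))
```

Here "price" is detected even though the word itself never occurs, because "expensive" is listed as one of its trigger words.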
Task 4
- Given a set of pre-identified aspect categories (e.g., {food, price}), determine the polarity (positive, negative, neutral or conflict) of each aspect category.
Results
The system was run on the SemEval 2016 dataset for restaurant reviews. The evaluation outputs are as follows.
For Aspect Terms
Aspects
- System Aspect Terms=747
- Gold Aspect Terms=737
- Pre: 0.2570281 (192/747)
- Rec: 0.2605156 (192/737)
- F: 0.2587601
Categories
- System Aspect Categories=570
- Gold Aspect Categories=752
- Pre: 0.36666667 (209/570)
- Rec: 0.27792552 (209/752)
- F: 0.3161876
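The precision, recall and F values above follow from the reported counts: precision is correct/system, recall is correct/gold, and F is their harmonic mean, 2PR/(P+R). The short sketch below recomputes them; the function name `precision_recall_f` is only illustrative.

```python
import math


def precision_recall_f(correct, system_total, gold_total):
    """Compute precision, recall and F from raw counts; undefined values become NaN."""
    precision = correct / system_total if system_total else math.nan
    recall = correct / gold_total if gold_total else math.nan
    # The comparison is False when either value is NaN or both are 0, so F is NaN then.
    if precision + recall > 0:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = math.nan
    return precision, recall, f_measure


# Counts taken from the results listed above.
aspect_scores = precision_recall_f(192, 747, 737)
category_scores = precision_recall_f(209, 570, 752)
print("aspect terms:      P=%.7f R=%.7f F=%.7f" % aspect_scores)
print("aspect categories: P=%.7f R=%.7f F=%.7f" % category_scores)
```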
For polarity
Aspects
- Accuracy: 0.44791666 (86/192)
label\measure | Precision | Recall | F-measure |
---|---|---|---|
conflict | NaN(0/0) | 0(0/6) | NaN |
negative | 0.3333(11/33) | 0.3235(11/34) | 0.3284 |
neutral | 0.1389(10/72) | 0.5(10/20) | 0.2174 |
positive | 0.7471(65/87) | 0.4924(65/132) | 0.5936 |
Categories
- Accuracy: 0.3110048 (65/209)
label\measure | Precision | Recall | F-measure |
---|---|---|---|
conflict | NaN(0/0) | 0(0/9) | NaN |
negative | 0.2143(6/28) | 0.1667(6/36) | 0.1875 |
neutral | 0.087(10/115) | 0.5(10/20) | 0.1481 |
positive | 0.7424(49/66) | 0.3403(49/144) | 0.4667 |
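The accuracy figure can be cross-checked against the per-label rows: it is the total number of correct predictions divided by the total number of gold labels. A minimal sketch using the counts from the category table above:

```python
# (correct, system, gold) counts per label, copied from the category polarity table.
category_counts = {
    "conflict": (0, 0, 9),
    "negative": (6, 28, 36),
    "neutral": (10, 115, 20),
    "positive": (49, 66, 144),
}

total_correct = sum(correct for correct, _, _ in category_counts.values())
total_system = sum(system for _, system, _ in category_counts.values())
total_gold = sum(gold for _, _, gold in category_counts.values())

accuracy = total_correct / total_gold
print("accuracy = %d/%d = %.7f" % (total_correct, total_gold, accuracy))
```

The system and gold totals agree (209 each) because every gold category receives exactly one predicted label.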
Tags
- IIIT Hyderabad
- Information Retrieval and Extraction
- Major Project
- Restaurant Reviews
- Reviews
- Sentiment Analysis
- Aspect Based Sentiment Analysis
- Rule based
- Stanford CoreNLP
The source code is available at:
Video explaining the project is at:
Slideshare ppt is at:
Dropbox link (has the video, ppt, and report of the project):