Introduction


Sentiment analysis is increasingly viewed as a vital task from both an academic and a commercial standpoint. Most current approaches, however, attempt to detect the overall polarity of a sentence, paragraph, or text span, regardless of the entities mentioned (e.g., laptops, restaurants) and their aspects (e.g., battery, screen; food, service). By contrast, this task is concerned with aspect-based sentiment analysis (ABSA), where the goal is to identify the aspects of given target entities and the sentiment expressed towards each aspect.

Problem Statement


This task is concerned with aspect-based sentiment analysis (ABSA), where the goal is to identify the aspects of given target entities and the sentiment expressed towards each aspect.

Challenges Faced


  • Aspect term extraction:
      • We used a rule-based method, and some of the aspect terms were not extracted correctly by the rules.
      • We assumed that all sentences are grammatically correct, which is not always the case in real-world reviews.
  • Aspect term polarity:
      • In sentences with many aspect terms, it was difficult to assign a polarity to each term.

Dataset


We used the SemEval 2016 dataset, available at: http://metashare.ilsp.gr:8080/repository/browse/semeval-2014-absa-restaurant-reviews-trial-data/1790ab94464211e388f5842b2b6a04d79bb0323cac4f49939bf5b99878dc38be/ This dataset consists of restaurant reviews presented in an XML file. It contains around 100 reviews, with a varying number of aspect terms in each.
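A minimal sketch of how such an XML file could be read with the Python standard library. The element and attribute names used here (sentence, text, aspectTerm, term, polarity, aspectCategory, category) are assumptions based on the SemEval-2014 ABSA layout that the URL above points to, and the two-sentence sample is made up for illustration; the real file may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the ASSUMED layout; not taken from the real dataset.
SAMPLE_XML = """<sentences>
  <sentence id="1">
    <text>The food was great but the service was slow.</text>
    <aspectTerms>
      <aspectTerm term="food" polarity="positive" from="4" to="8"/>
      <aspectTerm term="service" polarity="negative" from="27" to="34"/>
    </aspectTerms>
    <aspectCategories>
      <aspectCategory category="food" polarity="positive"/>
      <aspectCategory category="service" polarity="negative"/>
    </aspectCategories>
  </sentence>
  <sentence id="2">
    <text>We went there on Friday.</text>
  </sentence>
</sentences>"""


def load_reviews(xml_text):
    """Return one dict per <sentence> with its text, aspect terms and categories."""
    root = ET.fromstring(xml_text)
    reviews = []
    for sentence in root.iter("sentence"):
        # Sentences without annotations simply yield empty lists.
        terms = [(t.get("term"), t.get("polarity"))
                 for t in sentence.iter("aspectTerm")]
        categories = [(c.get("category"), c.get("polarity"))
                      for c in sentence.iter("aspectCategory")]
        reviews.append({
            "id": sentence.get("id"),
            "text": sentence.findtext("text", default=""),
            "terms": terms,
            "categories": categories,
        })
    return reviews


if __name__ == "__main__":
    for review in load_reviews(SAMPLE_XML):
        print(review["id"], review["terms"], review["categories"])
```

To read a file from disk instead of a string, ET.parse(path).getroot() could replace the ET.fromstring call.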

Tools Used


  • Stanford CoreNLP Parser
  • NLTK

Phases of the Project

Phase 1

Task 1

  • Aspect Term Extraction: This task deals with extracting the aspect terms from the reviews. Given a set of sentences with pre-identified entities (e.g., restaurants), identify the aspect terms present in each sentence and return a list containing all the distinct aspect terms. An aspect term names a particular aspect of the target entity.
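The exact rules used in the project are not listed in this report, so the following is only an illustration of one common rule of this kind: treat every maximal run of noun-tagged tokens as a candidate aspect term. It assumes the sentence has already been POS-tagged with Penn Treebank tags (for example by NLTK or Stanford CoreNLP); the function name and the example sentence are hypothetical.

```python
def extract_aspect_terms(tagged_tokens):
    """Collect maximal runs of noun-tagged tokens as candidate aspect terms.

    tagged_tokens: list of (word, Penn Treebank POS tag) pairs.
    Returns the distinct terms, lowercased, in order of first appearance.
    """
    terms, current = [], []
    # A sentinel non-noun token flushes a noun run that ends the sentence.
    for word, tag in list(tagged_tokens) + [("", "END")]:
        if tag.startswith("NN"):  # NN, NNS, NNP, NNPS
            current.append(word.lower())
            continue
        if current:
            term = " ".join(current)
            if term not in terms:
                terms.append(term)
            current = []
    return terms


if __name__ == "__main__":
    tagged = [("The", "DT"), ("pizza", "NN"), ("crust", "NN"), ("was", "VBD"),
              ("crispy", "JJ"), ("but", "CC"), ("the", "DT"),
              ("service", "NN"), ("was", "VBD"), ("slow", "JJ"), (".", ".")]
    print(extract_aspect_terms(tagged))
```

A rule this simple also picks up nouns that are not aspects of the restaurant (for example "Friday"), which is one way rule-based extraction can lose precision.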

Task 2

  • Aspect Polarity Detection: For a given set of aspect terms within a sentence, determine whether the polarity of each aspect term is positive, negative, neutral or conflict (i.e., both positive and negative).
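As an illustration of the four labels, here is a minimal sketch that looks for opinion words in a small window around the aspect term. This is not necessarily the method used in the project: the tiny word lists, the window size and the function name are all assumptions made for the example.

```python
# Hypothetical toy lexicons; a real system would use a much larger resource.
POSITIVE = {"great", "good", "delicious", "friendly", "excellent"}
NEGATIVE = {"slow", "bad", "rude", "bland", "terrible"}


def aspect_polarity(tokens, aspect_index, window=2):
    """Label the aspect at tokens[aspect_index] from nearby opinion words."""
    lo = max(0, aspect_index - window)
    hi = min(len(tokens), aspect_index + window + 1)
    nearby = [t.lower() for t in tokens[lo:hi]]
    has_pos = any(t in POSITIVE for t in nearby)
    has_neg = any(t in NEGATIVE for t in nearby)
    if has_pos and has_neg:
        return "conflict"
    if has_pos:
        return "positive"
    if has_neg:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    tokens = "The food was great but the service was slow".split()
    print(aspect_polarity(tokens, 1))            # aspect: "food"
    print(aspect_polarity(tokens, 6))            # aspect: "service"
    print(aspect_polarity(tokens, 6, window=3))  # wider window reaches "great"
```

With the wider window, the opinion word belonging to "food" leaks into the context of "service", which shows why sentences with several aspect terms are hard to label.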

Phase 2

Task 3

  • Given a predefined set of aspect categories (e.g., price, food), identify the aspect categories discussed in a given sentence. Aspect categories are typically coarser than the aspect terms of Task 1, and they do not necessarily occur as terms in the given sentence.
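One simple way to approach this is a keyword lookup per category, sketched below. The category names other than price and food, the keyword sets and the function name are hypothetical and chosen only for the example; the report does not state how categories were actually assigned.

```python
import re

# Hypothetical keyword sets; only "price" and "food" are named in the task text.
CATEGORY_KEYWORDS = {
    "food": {"food", "pizza", "pasta", "dish", "menu", "delicious"},
    "service": {"service", "waiter", "staff", "server"},
    "price": {"price", "prices", "cheap", "expensive", "bill"},
}


def detect_categories(sentence):
    """Return the sorted categories whose keywords appear in the sentence."""
    words = set(re.findall(r"[a-z']+", sentence.lower()))
    return sorted(cat for cat, keywords in CATEGORY_KEYWORDS.items()
                  if words & keywords)


if __name__ == "__main__":
    print(detect_categories("The pizza was expensive."))
    print(detect_categories("We went there on Friday."))
```

In the first example the word "price" never occurs, yet the price category is detected through "expensive", which matches the remark that categories need not appear as terms in the sentence.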

Task 4

  • Given a set of pre-identified aspect categories (e.g., {food, price}), determine the polarity (positive, negative, neutral or conflict) of each aspect category.

Results


The system was run on the SemEval 2016 dataset for restaurant reviews. The evaluation outputs are as follows.

For Aspect Terms

Aspects

  • System Aspect Terms = 747
  • Gold Aspect Terms = 737
  • Precision: 0.2570281 (192/747)
  • Recall: 0.2605156 (192/737)
  • F-measure: 0.2587601

Categories

  • System Aspect Categories = 570
  • Gold Aspect Categories = 752
  • Precision: 0.36666667 (209/570)
  • Recall: 0.27792552 (209/752)
  • F-measure: 0.3161876
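The precision, recall and F values above follow directly from the reported counts. The short sketch below recomputes them; the function name is ours, and it assumes all three counts are non-zero.

```python
def precision_recall_f1(correct, system_total, gold_total):
    """Precision, recall and their harmonic mean (F1) from raw counts."""
    precision = correct / system_total  # share of system outputs that are right
    recall = correct / gold_total       # share of gold items that were found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Counts taken from the lists above.
    print("aspect terms:      %.7f %.7f %.7f" % precision_recall_f1(192, 747, 737))
    print("aspect categories: %.7f %.7f %.7f" % precision_recall_f1(209, 570, 752))
```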

For polarity

Aspects

  • Accuracy: 0.44791666 (86/192)

label\measure   Precision         Recall            F-measure
conflict        NaN (0/0)         0 (0/6)           NaN
negative        0.3333 (11/33)    0.3235 (11/34)    0.3284
neutral         0.1389 (10/72)    0.5 (10/20)       0.2174
positive        0.7471 (65/87)    0.4924 (65/132)   0.5936

Categories

  • Accuracy: 0.3110048 (65/209)

label\measure   Precision         Recall            F-measure
conflict        NaN (0/0)         0 (0/9)           NaN
negative        0.2143 (6/28)     0.1667 (6/36)     0.1875
neutral         0.087 (10/115)    0.5 (10/20)       0.1481
positive        0.7424 (49/66)    0.3403 (49/144)   0.4667
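The accuracy figures and the NaN entries can be reproduced from the per-label counts in the two polarity tables. The sketch below does this; the function names are ours, and returning NaN for a 0/0 ratio is an assumption about how the evaluation script behaves, inferred from the tables.

```python
import math


def ratio(numerator, denominator):
    """Plain division, but NaN for x/0 (assumed to match the tables' 0/0 case)."""
    return numerator / denominator if denominator else math.nan


def label_scores(correct, system_count, gold_count):
    """Per-label precision, recall and F-measure from raw counts."""
    p = ratio(correct, system_count)
    r = ratio(correct, gold_count)
    f = ratio(2 * p * r, p + r)  # NaN propagates when p is NaN
    if math.isnan(p) or math.isnan(r):
        f = math.nan
    return p, r, f


def accuracy(per_label):
    """per_label maps label -> (correct, system_count, gold_count)."""
    correct = sum(c for c, _, _ in per_label.values())
    total = sum(g for _, _, g in per_label.values())
    return correct, total, correct / total


# Counts copied from the polarity tables above.
ASPECT_COUNTS = {"conflict": (0, 0, 6), "negative": (11, 33, 34),
                 "neutral": (10, 72, 20), "positive": (65, 87, 132)}
CATEGORY_COUNTS = {"conflict": (0, 0, 9), "negative": (6, 28, 36),
                   "neutral": (10, 115, 20), "positive": (49, 66, 144)}

if __name__ == "__main__":
    print(accuracy(ASPECT_COUNTS))
    print(accuracy(CATEGORY_COUNTS))
```

Summing the counts gives 86 correct out of 192 for aspect terms and 65 out of 209 for categories, which agrees with the reported accuracy values of 0.44791666 and 0.3110048.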

Tags


  • IIIT Hyderabad
  • Information Retrieval and Extraction
  • Major Project
  • Restaurant Reviews
  • Reviews
  • Sentiment Analysis
  • Aspect Based Sentiment Analysis
  • Rule based
  • Stanford CoreNLP

The source code is available at:

Video explaining the project is at:

Slideshare ppt is at:

Dropbox link (has the video, ppt, and report of the project):