ABSA (Aspect-Based Sentiment Analysis) for fine-grained sentiment insights


Aspect-Based Sentiment Analysis (ABSA) research dissects public opinion at a fine-grained level, down to the specific aspects mentioned in a review. Aiming for state-of-the-art (SOTA) results, this research combines the richness of Indonesia’s regional languages with the latest large language models (LLMs).

Have you ever read an online review that mixes praise and complaints?

“The hotel is very clean and the mattress is soft, but unfortunately the restaurant food is bland and the reception service is sluggish.”

As humans, we understand that this guest loves the room amenities but is disappointed with the food and service. Traditional AI (Artificial Intelligence), however, gets confused. Older models read the whole sentence and predict a single label: Positive or Negative. Because the review contains opposing opinions, such models often fall back to Neutral. As a result, hotel owners lose valuable information about which parts need to be improved.

To bridge this gap, the Indonesia AI Institute (IAII) is developing cutting-edge research focused on Aspect-Based Sentiment Analysis (ABSA) to produce fine-grained insights. The research sets an ambitious goal: to become the state-of-the-art (SOTA) method for ABSA tasks.

 

Research Focus: Dissecting Texts Through the ASTE Task

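This research does not just guess an overall sentiment; it focuses on a much more complex task called ASTE (Aspect Sentiment Triplet Extraction) and its extensions. In ASTE proper, the AI is trained to extract three elements (a triplet) from a single review sentence: the aspect term, the opinion term, and the sentiment polarity. The extended task adds the aspect category, giving four elements (a quadruplet):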

[Target/Aspect Object] ──► [Opinion Modifier/Descriptor] ──► [Category Type] ──► [Sentiment Value/Polarity]

  • Aspect Term: Finding the thing being commented on (Example: “restaurant food”).
  • Opinion Term: Finding the words the consumer uses to express an opinion (Example: “bland”).
  • Sentiment/Polarity: Determining the polarity of that opinion (Example: Negative 👎).

 

By mapping these four elements automatically, business owners can get a detailed analytics dashboard without having to read millions of reviews manually, one by one.
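
As a rough illustration, the extracted elements could be stored as small records and rolled up into dashboard-style counts. This is only a sketch: the class name `OpinionTuple`, the field names, and the category labels ("room", "food", "service") are assumptions made for this example, not IAII’s actual schema. The terms come from the hotel review quoted earlier.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class OpinionTuple:
    """One extracted opinion; names and categories here are hypothetical."""
    aspect: str     # what is being commented on
    opinion: str    # the word(s) expressing the opinion
    category: str   # assumed category label, for illustration only
    polarity: str   # "positive", "negative", or "neutral"


# Hand-written extraction for the mixed hotel review quoted earlier.
extracted = [
    OpinionTuple("hotel", "very clean", "room", "positive"),
    OpinionTuple("mattress", "soft", "room", "positive"),
    OpinionTuple("restaurant food", "bland", "food", "negative"),
    OpinionTuple("reception service", "sluggish", "service", "negative"),
]


def summarize(tuples):
    """Count polarities per category, the kind of roll-up a dashboard shows."""
    counts = Counter((t.category, t.polarity) for t in tuples)
    return dict(counts)


summary = summarize(extracted)
print(summary)
```

A sentence-level classifier would collapse this review into one label, while the per-category counts keep the praise for the room separate from the complaints about food and service.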

 

Research Scope: Caring for Regional Languages through Multilingual Datasets

One of the biggest weaknesses of foreign-made AI models is their limited ability to understand Indonesia’s local and regional languages. This research addresses that limitation by building new large-scale datasets.

  • Raw Material: This research builds on a review dataset from the hospitality sector.
  • Localization & Improvement: The existing Indonesian dataset has been restructured and cleaned of typos and ambiguous wording.
  • Regional Language Expansion: This high-quality dataset is then translated and culturally adapted into English and five major regional languages of Indonesia. The seven languages covered are Indonesian, English, Javanese, Sundanese, Minang, Bugis, and Madurese.

 

This step ensures that people from various corners of Indonesia who review local accommodations using their native language can still be understood with precision by AI.
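
One way a translated dataset like this could be sanity-checked is to confirm that every annotated term still appears verbatim in its review text. The sketch below is an assumption about how such a check might look, not a description of IAII’s pipeline: the record layout and the function name `find_problems` are hypothetical, and the language codes are the ISO 639 codes assumed for the seven languages. Only English records are shown, since the translated data itself is not reproduced here.

```python
# Assumed ISO 639 codes for the seven languages named in the text.
LANGUAGES = {"id", "en", "jv", "su", "min", "bug", "mad"}


def find_problems(record):
    """Return a list of human-readable problems found in one record."""
    problems = []
    if record["lang"] not in LANGUAGES:
        problems.append(f"unknown language code: {record['lang']}")
    text = record["text"].lower()
    for aspect, opinion, _polarity in record["triplets"]:
        for term in (aspect, opinion):
            if term.lower() not in text:
                problems.append(f"term not found in text: {term!r}")
    return problems


good = {
    "lang": "en",
    "text": "The restaurant food is bland.",
    "triplets": [("restaurant food", "bland", "negative")],
}
# Simulates an edit that changed the text but left the old labels in place.
stale = {
    "lang": "en",
    "text": "The restaurant meals are bland.",
    "triplets": [("restaurant food", "bland", "negative")],
}

print(find_problems(good))
print(find_problems(stale))
```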

 

Research Publication 1: Generative Approach (LLM) vs Agentic AI

The first experiment of this research is written up in Paper 1, which compares two modern AI approaches to solving multilingual ABSA tasks:

A. Supervised Fine-Tuning (SFT) Method

The IAII researchers trained small-to-medium language models on the seven-language dataset. The models used are Qwen 2.5 (0.5B) and Gemma 3 (270M). Despite their compact size and low computational cost, the models were intensively fine-tuned to become experts at recognizing the structure of ASTE.
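
For intuition, supervised fine-tuning of a generative model usually means turning each annotated review into a prompt/completion text pair. The sketch below shows one possible format. The prompt wording, the `aspect | opinion | polarity` line format, and the function names are assumptions for illustration; the source does not describe the template actually used.

```python
def format_target(triplets):
    """Serialize triplets as one line each: aspect | opinion | polarity."""
    return "\n".join(f"{a} | {o} | {p}" for a, o, p in triplets)


def parse_target(text):
    """Inverse of format_target; skips lines that do not have three fields."""
    triplets = []
    for line in text.splitlines():
        parts = [part.strip() for part in line.split("|")]
        if len(parts) == 3 and all(parts):
            triplets.append(tuple(parts))
    return triplets


def build_example(review, triplets, lang):
    """Build one prompt/completion pair for supervised fine-tuning."""
    prompt = (
        f"Language: {lang}\n"
        f"Review: {review}\n"
        "Extract every (aspect | opinion | polarity) triplet, one per line.\n"
    )
    return {"prompt": prompt, "completion": format_target(triplets)}


example = build_example(
    "The mattress is soft, but the restaurant food is bland.",
    [("mattress", "soft", "positive"), ("restaurant food", "bland", "negative")],
    lang="en",
)
print(example["prompt"] + example["completion"])
```

Keeping a parser next to the formatter means the model’s generated text can be converted back into structured triplets for scoring.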

B. Agentic AI Method

On the other side, the IAII researchers use giant models (Large Language Models) such as Gemini and large Qwen variants, configured as agents. Each agent is given the ability to reason, critique its own answers (self-reflection), and validate its extraction results before giving a final answer.
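
The extract, critique, and validate cycle described above can be sketched as a simple loop. This is a minimal illustration under stated assumptions: `fake_extract` is a stand-in for a real LLM call, and the critique step here is a rule-based check, whereas in an agentic setup the model itself would typically produce the critique. None of the names below come from the IAII system.

```python
ALLOWED_POLARITIES = {"positive", "negative", "neutral"}


def validate(review, triplets):
    """Return the reasons a candidate extraction should be rejected."""
    issues = []
    lowered = review.lower()
    for aspect, opinion, polarity in triplets:
        if aspect.lower() not in lowered:
            issues.append(f"aspect {aspect!r} is not in the review")
        if opinion.lower() not in lowered:
            issues.append(f"opinion {opinion!r} is not in the review")
        if polarity not in ALLOWED_POLARITIES:
            issues.append(f"unknown polarity {polarity!r}")
    return issues


def run_agent(review, extract, max_rounds=3):
    """Extract, validate, and retry with feedback until the check passes."""
    feedback = []
    triplets = []
    for _ in range(max_rounds):
        triplets = extract(review, feedback)
        feedback = validate(review, triplets)
        if not feedback:
            break
    return triplets, feedback


def fake_extract(review, feedback):
    """Stand-in for an LLM call: wrong on the first try, right after feedback."""
    if not feedback:
        return [("breakfast", "bland", "negative")]
    return [("restaurant food", "bland", "negative")]


result, remaining = run_agent("The restaurant food is bland.", fake_extract)
print(result, remaining)
```

The extra rounds are what make this approach more expensive at inference time than a single pass through a fine-tuned small model.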

This raises the question: “Can a small, specially trained model (SFT) match or even surpass a giant model (Agentic AI) that requires far more memory?” Paper 1 aims to answer this computational-efficiency dilemma for industry needs.
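
A comparison like this needs a shared score. The source does not say which metric Paper 1 uses; a common choice for ASTE is exact-match F1, where a predicted triplet counts only if its aspect, opinion, and polarity all match the gold annotation. A minimal sketch, with the function name `triplet_f1` chosen for this example:

```python
def triplet_f1(predicted, gold):
    """Exact-match precision, recall, and F1 over sets of triplets."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


gold = [
    ("mattress", "soft", "positive"),
    ("restaurant food", "bland", "negative"),
]
predicted = [
    ("mattress", "soft", "positive"),
    ("restaurant food", "bland", "neutral"),  # wrong polarity: counts as a miss
]

precision, recall, f1 = triplet_f1(predicted, gold)
print(precision, recall, f1)
```

Running the same scorer over the outputs of both the SFT model and the agent puts the two approaches on equal footing.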

 

Research Publication 2: Looking Inside the AI’s Head (Multilingual Steering & Mechanistic Interpretability)

LLMs have long been dubbed a “black box”: humans can see their inputs and outputs, but not how the computation unfolds inside their artificial neural networks. Paper 2 of this research sets out to unravel that mystery through a method called Mechanistic Interpretability.

The IAII researchers performed digital “brain surgery” on the LLM as the model read a variety of regional languages.

  • Finding Active Attention Heads: The researchers track which parts of the internal circuits (attention heads) activate when the AI reads words in Javanese, Sundanese, or Minang.
  • Steering Mechanism (Steering/Shift): Once they know which heads are responsible for a particular language, the researchers intervene on those heads and shift their activations.
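
The head-finding step can be pictured as a comparison of per-head activity between languages. The sketch below assumes an activation score has already been recorded for each (layer, head) pair; the numbers are made up, and ranking by simple difference is one plausible approach rather than the method confirmed for Paper 2.

```python
def rank_language_heads(target_scores, baseline_scores, top_k=2):
    """Rank heads by how much more active they are on the target language."""
    gaps = {
        head: target_scores[head] - baseline_scores[head]
        for head in target_scores
    }
    return sorted(gaps, key=gaps.get, reverse=True)[:top_k]


# Made-up mean activation per (layer, head); real values would be recorded
# from the model while it reads text in each language.
javanese = {(0, 0): 0.10, (0, 1): 0.90, (1, 0): 0.40, (1, 1): 0.75}
indonesian = {(0, 0): 0.12, (0, 1): 0.20, (1, 0): 0.38, (1, 1): 0.25}

top_heads = rank_language_heads(javanese, indonesian)
print(top_heads)
```

Heads with the largest gap are the candidates for a language-specific circuit and the natural targets for an intervention.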

 

Simply put, if the AI is reading Madurese and becomes confused, the researchers can “steer” the model by forcibly activating the right language circuit so that the aspect-based sentiment analysis results remain accurate. This steering technique is intended to keep the model accurate even when languages are suddenly mixed (code-switching) within a single review sentence.
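
One widely used way to implement this kind of steering is difference-of-means activation steering: compute the average activation for the target language, subtract the average for other text, and add a scaled copy of that direction to the model’s activations at inference time. The source does not specify which steering method is used, so the following is only a toy sketch of that general technique, with plain lists standing in for real model activations.

```python
def mean_vector(vectors):
    """Element-wise mean of equal-length vectors."""
    count = len(vectors)
    return [sum(values) / count for values in zip(*vectors)]


def steering_direction(target_acts, other_acts):
    """Difference of mean activations: points from 'other' toward 'target'."""
    target_mean = mean_vector(target_acts)
    other_mean = mean_vector(other_acts)
    return [t - o for t, o in zip(target_mean, other_mean)]


def steer(activation, direction, strength=1.0):
    """Shift one activation vector along the steering direction."""
    return [a + strength * d for a, d in zip(activation, direction)]


# Toy 3-dimensional activations; a real model has far more dimensions.
madurese_acts = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]]
other_acts = [[0.0, 1.0, 2.0], [0.0, 3.0, 2.0]]

direction = steering_direction(madurese_acts, other_acts)
steered = steer([0.0, 2.0, 2.0], direction, strength=0.5)
print(direction, steered)
```

The `strength` parameter controls how hard the model is pushed: too little has no effect, while too much can degrade the output.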

 

Impact and Future Direction

This research aims not only to set a new state-of-the-art (SOTA) standard on the international academic scene, but also to bring real social and economic impact:

  1. Tourism Sector: Local hotels across the regions can use this technology to map customer satisfaction objectively, even from reviews written in the local language.
  2. Digital Inclusion: Regional languages in Indonesia are no longer marginalized in the development of global artificial intelligence technology.

 

Through a combination of local multilingual datasets, generative model optimization (SFT vs Agent), and circuit dissection in LLMs (mechanistic interpretability), this IAII research aims to make Indonesia one of the world-class centers of fine-grained sentiment analysis development.

 

~This research is in progress.