IndoBERT Binary Sentiment Classifier

A context-conditioned binary sentiment classifier for Indonesian text, fine-tuned from IndoBERT Large P2 (335M parameters). This is the binary variant of apriandito/indobert-sentiment-classifier (3-class), designed for use cases that only need polarity detection (Negatif / Positif) without a Netral class.

Like its sibling models, this model evaluates sentiment in relation to a given topic context, making it more accurate for topic-specific analysis such as brand monitoring, public opinion polling, and crisis detection.

Model Details

| Property | Value |
|---|---|
| Base model | indobenchmark/indobert-large-p2 (335M params) |
| Task | Context-conditioned binary sentiment classification |
| Labels | NEGATIF (0), POSITIF (1) |
| Input format | [CLS] context [SEP] text [SEP] |
| Max length | 256 tokens |
| Training data | 14,045 context-text pairs across 188 topics |
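The pair input format can be illustrated with a minimal sketch. Note that `format_pair` below is a hypothetical helper for illustration only: in practice the tokenizer builds this layout automatically when called with a (context, text) pair, and additionally applies WordPiece subword splitting.

```python
# Hypothetical helper illustrating the [CLS] context [SEP] text [SEP] layout.
# The real tokenizer produces this structure itself (plus subword splitting)
# when given a (context, text) pair.
def format_pair(context: str, text: str) -> str:
    return f"[CLS] {context} [SEP] {text} [SEP]"

print(format_pair("harga sembako", "harga beras naik terus"))
# [CLS] harga sembako [SEP] harga beras naik terus [SEP]
```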

Performance

Evaluated on a held-out validation set of 2,107 samples.

| Metric | Value |
|---|---|
| Accuracy | 96.06% |
| F1 Macro | 0.949 |
| F1 Weighted | 0.961 |
| Precision Macro | 0.947 |
| Recall Macro | 0.952 |

Comparison with 3-class variant

| Model | Classes | Accuracy | F1 Macro |
|---|---|---|---|
| This model (binary) | 2 (Negatif, Positif) | 96.06% | 0.949 |
| indobert-sentiment-classifier | 3 (Negatif, Netral, Positif) | 88.1% | 0.856 |

The binary model achieves higher metrics because the task is simpler (no ambiguous Netral class). Choose binary when you only care about polarity; choose 3-class when you need to distinguish neutral text.

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("apriandito/indobert-binary-sentiment-classifier")
model = AutoModelForSequenceClassification.from_pretrained("apriandito/indobert-binary-sentiment-classifier")
model.eval()

LABELS = {0: "Negatif", 1: "Positif"}

context = "harga sembako"
text = "harga beras naik terus bikin rakyat susah"

encoding = tokenizer(context, text, truncation=True, max_length=256, return_tensors="pt")
with torch.no_grad():
    probs = torch.softmax(model(**encoding).logits, dim=-1)[0]
    pred = torch.argmax(probs).item()

print(f"{LABELS[pred]} ({probs[pred]:.4f})")
# Output: Negatif (0.9987)

Why Context Matters

Standard sentiment models classify text in isolation. This can lead to errors when sentiment depends on context:

| Context | Text | With Context | Without Context |
|---|---|---|---|
| harga sembako | harganya gila-gilaan | Negatif | ??? |
| produk luxury | harganya gila-gilaan | Positif | ??? |
| korupsi | KPK tangkap bupati korupsi dana bansos | Positif | Negatif |
| polusi udara | Jakarta peringkat 1 paling berpolusi | Negatif | Positif |
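The table above can be reproduced in code by classifying the same text under different contexts. The sketch below reuses the model from the Usage section; `predict` is a hypothetical wrapper written for this example, not part of the library.

```python
# Sketch: the same text classified under two different contexts.
# `predict` is a hypothetical convenience wrapper around the model.
import torch

LABELS = {0: "Negatif", 1: "Positif"}

def predict(model, tokenizer, context: str, text: str):
    """Classify `text` conditioned on `context`; returns (label, confidence)."""
    enc = tokenizer(context, text, truncation=True, max_length=256,
                    return_tensors="pt")
    with torch.no_grad():
        probs = torch.softmax(model(**enc).logits, dim=-1)[0]
    pred = int(torch.argmax(probs))
    return LABELS[pred], float(probs[pred])

if __name__ == "__main__":
    from transformers import AutoTokenizer, AutoModelForSequenceClassification
    name = "apriandito/indobert-binary-sentiment-classifier"
    tok = AutoTokenizer.from_pretrained(name)
    mdl = AutoModelForSequenceClassification.from_pretrained(name).eval()
    for context in ("harga sembako", "produk luxury"):
        print(context, "->", predict(mdl, tok, context, "harganya gila-gilaan"))
```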

Training Details

  • Data: Derived from the same 31,360 context-text pairs used in the 3-class model. Netral samples (17,315) were dropped, leaving 14,045 binary samples (10,357 Negatif / 3,688 Positif).
  • Epochs: 5 (best at epoch 4, early stopping patience 2)
  • Batch size: 16
  • Learning rate: 2e-5
  • Optimizer: AdamW (weight decay 0.01, warmup ratio 0.1)
  • Loss: CrossEntropyLoss with class weights (Negatif: 0.678, Positif: 1.904)
  • GPU: NVIDIA RTX 4090
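The reported class weights match the standard balanced weighting scheme, n_total / (n_classes × n_class), applied to the class counts above. Whether training used exactly this formula is an assumption; the sketch below just shows that the numbers line up.

```python
# Sketch: deriving the reported class weights from the class counts.
# Assumes the "balanced" scheme n_total / (n_classes * n_class).
def balanced_weights(counts: dict) -> dict:
    total = sum(counts.values())
    k = len(counts)
    return {label: total / (k * n) for label, n in counts.items()}

weights = balanced_weights({"Negatif": 10_357, "Positif": 3_688})
print({label: round(w, 3) for label, w in weights.items()})
# {'Negatif': 0.678, 'Positif': 1.904}
```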

When to Use Which Model

| Use Case | Recommended Model |
|---|---|
| Filter irrelevant text first | indobert-relevancy-classifier |
| Full sentiment breakdown (pos/neu/neg) | indobert-sentiment-classifier |
| Polarity only (pos/neg) | This model |

Suggested Pipeline

Raw Text → Relevancy Filter → Sentiment Analysis (3-class or binary)
  1. Use indobert-relevancy-classifier to filter relevant text
  2. Use this model (or the 3-class variant) to classify sentiment on relevant text only
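The two steps above can be sketched as a small pipeline. Here each stage is passed in as a callable so the relevancy filter and the sentiment model stay decoupled; `is_relevant` and `sentiment` are hypothetical wrappers around indobert-relevancy-classifier and this model, not published APIs.

```python
# Sketch: run sentiment only on texts the relevancy filter keeps.
# `is_relevant` and `sentiment` are hypothetical per-model wrappers.
from typing import Callable, List, Tuple

def analyze(texts: List[str], context: str,
            is_relevant: Callable[[str, str], bool],
            sentiment: Callable[[str, str], str]) -> List[Tuple[str, str]]:
    """Classify sentiment for each text that passes the relevancy filter."""
    return [(t, sentiment(context, t))
            for t in texts if is_relevant(context, t)]
```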

Related Models

| Model | Task | Labels | Accuracy |
|---|---|---|---|
| indobert-relevancy-classifier | Relevancy | Relevant / Not Relevant | 96.5% |
| indobert-sentiment-classifier | Sentiment (3-class) | Negatif / Netral / Positif | 88.1% |
| indobert-binary-sentiment-classifier | Sentiment (binary) | Negatif / Positif | 96.06% |

All three models share the same architecture (IndoBERT Large P2, 335M params) and the same context-conditioned input format ([CLS] context [SEP] text [SEP]).

Citation

@misc{saputra2026indobert-binary-sentiment,
  title={IndoBERT Binary Sentiment Classifier: Context-Conditioned Binary Sentiment Classification for Indonesian Text},
  author={Saputra, Muhammad Apriandito Arya},
  year={2026},
  publisher={Hugging Face},
  url={https://huggingface.co/apriandito/indobert-binary-sentiment-classifier}
}