# IndoBERT Binary Sentiment Classifier
A context-conditioned binary sentiment classifier for Indonesian text, fine-tuned from IndoBERT Large P2 (335M parameters). This is the binary variant of apriandito/indobert-sentiment-classifier (3-class), designed for use cases that only need polarity detection (Negatif / Positif) without a Netral class.
Like its sibling models, it evaluates sentiment relative to a given topic context, which makes it better suited to topic-specific analysis such as brand monitoring, public opinion polling, and crisis detection.
## Model Details
| Property | Value |
|---|---|
| Base model | indobenchmark/indobert-large-p2 (335M params) |
| Task | Context-conditioned binary sentiment classification |
| Labels | NEGATIF (0), POSITIF (1) |
| Input format | [CLS] context [SEP] text [SEP] |
| Max length | 256 tokens |
| Training data | 14,045 context-text pairs across 188 topics |
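The `[CLS] context [SEP] text [SEP]` input format above can be sketched in plain Python. This is an illustrative mock, not the actual IndoBERT WordPiece tokenizer: it splits on whitespace purely to show how the two segments and their `token_type_ids` are laid out.

```python
# Illustrative sketch of the paired-input layout (NOT the real tokenizer):
# [CLS] context-tokens [SEP] text-tokens [SEP], with token_type_ids marking
# segment 0 (context) vs segment 1 (text).

def encode_pair(context: str, text: str, max_length: int = 256):
    tokens = ["[CLS]"] + context.split() + ["[SEP]"] + text.split() + ["[SEP]"]
    # Segment 0 covers [CLS] + context + the first [SEP]; segment 1 covers the rest.
    boundary = len(context.split()) + 2
    token_type_ids = [0] * boundary + [1] * (len(tokens) - boundary)
    return tokens[:max_length], token_type_ids[:max_length]

tokens, segments = encode_pair("harga sembako", "harga beras naik terus")
print(tokens)    # ['[CLS]', 'harga', 'sembako', '[SEP]', 'harga', 'beras', 'naik', 'terus', '[SEP]']
print(segments)  # [0, 0, 0, 0, 1, 1, 1, 1, 1]
```

The real tokenizer produces subword IDs rather than whole words, but the segment structure it builds for a `(context, text)` pair is the same.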
## Performance
Evaluated on a held-out validation set of 2,107 samples.
| Metric | Value |
|---|---|
| Accuracy | 96.06% |
| F1 Macro | 0.949 |
| F1 Weighted | 0.961 |
| Precision Macro | 0.947 |
| Recall Macro | 0.952 |
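For readers unfamiliar with the metric names in the table, here is a toy illustration of how accuracy, macro-F1, and weighted-F1 relate. The confusion matrix below is invented, not the model's; it only shows the definitions.

```python
# Toy confusion matrix (made-up numbers, NOT this model's results).
# Rows = true class, columns = predicted class: [Negatif, Positif]
cm = [[90, 10],   # 100 true Negatif
      [5, 45]]    #  50 true Positif

def f1_per_class(cm, c):
    tp = cm[c][c]
    fp = sum(cm[r][c] for r in range(2)) - tp  # predicted c, but wrong
    fn = sum(cm[c]) - tp                       # true c, but missed
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

support = [sum(row) for row in cm]
f1 = [f1_per_class(cm, c) for c in range(2)]
accuracy = (cm[0][0] + cm[1][1]) / sum(support)
f1_macro = sum(f1) / 2                                          # unweighted mean
f1_weighted = sum(f * s for f, s in zip(f1, support)) / sum(support)  # support-weighted
print(round(accuracy, 3), round(f1_macro, 3), round(f1_weighted, 3))
# 0.9 0.89 0.901
```

Macro-F1 treats both classes equally, which is why it is the stricter number on an imbalanced dataset like this one.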
### Comparison with 3-class variant
| Model | Classes | Accuracy | F1 Macro |
|---|---|---|---|
| This model (binary) | 2 (Negatif, Positif) | 96.06% | 0.949 |
| indobert-sentiment-classifier | 3 (Negatif, Netral, Positif) | 88.1% | 0.856 |
The binary model achieves higher metrics because the task is simpler (no ambiguous Netral class). Choose binary when you only care about polarity; choose 3-class when you need to distinguish neutral text.
## Usage
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("apriandito/indobert-binary-sentiment-classifier")
model = AutoModelForSequenceClassification.from_pretrained("apriandito/indobert-binary-sentiment-classifier")
model.eval()

LABELS = {0: "Negatif", 1: "Positif"}

context = "harga sembako"
text = "harga beras naik terus bikin rakyat susah"

# Encode as a sentence pair: [CLS] context [SEP] text [SEP]
encoding = tokenizer(context, text, truncation=True, max_length=256, return_tensors="pt")

with torch.no_grad():
    probs = torch.softmax(model(**encoding).logits, dim=-1)[0]

pred = torch.argmax(probs).item()
print(f"{LABELS[pred]} ({probs[pred]:.4f})")
# Output: Negatif (0.9987)
```
## Why Context Matters
Standard sentiment models classify text in isolation. This can lead to errors when sentiment depends on context:
| Context | Text | With Context | Without Context |
|---|---|---|---|
| harga sembako | harganya gila-gilaan | Negatif | ??? |
| produk luxury | harganya gila-gilaan | Positif | ??? |
| korupsi | KPK tangkap bupati korupsi dana bansos | Positif | Negatif |
| polusi udara | Jakarta peringkat 1 paling berpolusi | Negatif | Positif |
## Training Details
- Data: Derived from the same 31,360 context-text pairs used in the 3-class model. Netral samples (17,315) were dropped, leaving 14,045 binary samples (10,357 Negatif / 3,688 Positif).
- Epochs: 5 (best at epoch 4, early stopping patience 2)
- Batch size: 16
- Learning rate: 2e-5
- Optimizer: AdamW (weight decay 0.01, warmup ratio 0.1)
- Loss: CrossEntropyLoss with class weights (Negatif: 0.678, Positif: 1.904)
- GPU: NVIDIA RTX 4090
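The class weights listed above are consistent with the common "balanced" weighting scheme, w_c = N / (num_classes × count_c), applied to the class counts from the data bullet. A quick check (the scheme itself is an inference from the numbers, not stated in the original training code):

```python
# Reproduce the reported class weights from the class counts above using the
# "balanced" scheme: w_c = N / (num_classes * count_c).
counts = {"Negatif": 10357, "Positif": 3688}
total = sum(counts.values())  # 14045
weights = {label: total / (len(counts) * n) for label, n in counts.items()}
print({k: round(v, 3) for k, v in weights.items()})
# {'Negatif': 0.678, 'Positif': 1.904}
```

This up-weights the minority Positif class so the loss is not dominated by the roughly 3:1 Negatif majority.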
## When to Use Which Model
| Use Case | Recommended Model |
|---|---|
| Filter irrelevant text first | indobert-relevancy-classifier |
| Full sentiment breakdown (pos/neu/neg) | indobert-sentiment-classifier |
| Polarity only (pos/neg) | This model |
### Suggested Pipeline
Raw Text → Relevancy Filter → Sentiment Analysis (3-class or binary)
- Use indobert-relevancy-classifier to filter relevant text
- Use this model (or the 3-class variant) to classify sentiment on relevant text only
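The two-stage pipeline above can be sketched as follows. The `predict_relevant` and `predict_sentiment` functions here are hypothetical placeholders (a toy heuristic and a constant) standing in for the two fine-tuned models; in practice each would wrap an `AutoModelForSequenceClassification` call as shown in the Usage section.

```python
# Sketch of the suggested pipeline: filter for relevancy, then classify
# sentiment only on the surviving texts. The predict_* bodies are toy
# placeholders for the real model calls.

def predict_relevant(context: str, text: str) -> bool:
    # Placeholder: would call indobert-relevancy-classifier.
    return context.split()[0] in text  # toy heuristic for illustration

def predict_sentiment(context: str, text: str) -> str:
    # Placeholder: would call the binary sentiment model.
    return "Negatif"  # toy constant for illustration

def analyze(context: str, texts: list[str]) -> list[tuple[str, str]]:
    # Stage 1: drop irrelevant text; stage 2: classify what remains.
    return [(t, predict_sentiment(context, t))
            for t in texts if predict_relevant(context, t)]

results = analyze("harga sembako", [
    "harga beras naik terus bikin rakyat susah",  # relevant
    "cuaca hari ini cerah sekali",                # irrelevant
])
print(results)
# [('harga beras naik terus bikin rakyat susah', 'Negatif')]
```

Filtering first keeps the sentiment model from producing confident but meaningless polarity labels on off-topic text.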
## Related Models
| Model | Task | Labels | Accuracy |
|---|---|---|---|
| indobert-relevancy-classifier | Relevancy | Relevant / Not Relevant | 96.5% |
| indobert-sentiment-classifier | Sentiment (3-class) | Negatif / Netral / Positif | 88.1% |
| indobert-binary-sentiment-classifier | Sentiment (binary) | Negatif / Positif | 96.06% |
All three models share the same architecture (IndoBERT Large P2, 335M params) and the same context-conditioned input format ([CLS] context [SEP] text [SEP]).
## Citation
```bibtex
@misc{saputra2026indobert-binary-sentiment,
  title={IndoBERT Binary Sentiment Classifier: Context-Conditioned Binary Sentiment Classification for Indonesian Text},
  author={Saputra, Muhammad Apriandito Arya},
  year={2026},
  publisher={Hugging Face},
  url={https://huggingface.co/apriandito/indobert-binary-sentiment-classifier}
}
```