# IndoBERT Binary Sentiment Classifier
A context-conditioned binary sentiment classifier for Indonesian text, fine-tuned from IndoBERT Large P2 (335M parameters). This is the binary variant of apriandito/indobert-sentiment-classifier (3-class), designed for use cases that only need polarity detection (Negatif / Positif) without a Netral class.
Like its sibling models, it evaluates sentiment relative to a given topic context, which makes it better suited to topic-specific analysis such as brand monitoring, public opinion polling, and crisis detection.
## Model Details
| Property | Value |
|---|---|
| Base model | indobenchmark/indobert-large-p2 (335M params) |
| Task | Context-conditioned binary sentiment classification |
| Labels | NEGATIF (0), POSITIF (1) |
| Input format | [CLS] context [SEP] text [SEP] |
| Max length | 256 tokens |
| Training data | 14,045 context-text pairs across 188 topics |
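The `[CLS] context [SEP] text [SEP]` input format above can be sketched in plain Python. This is an illustrative mock, not the actual IndoBERT WordPiece tokenizer: it splits on whitespace purely to show how the two segments and their `token_type_ids` are laid out.

```python
# Illustrative sketch of the paired-input layout (NOT the real tokenizer):
# [CLS] context-tokens [SEP] text-tokens [SEP], with token_type_ids marking
# segment 0 (context) vs segment 1 (text).

def encode_pair(context: str, text: str, max_length: int = 256):
    tokens = ["[CLS]"] + context.split() + ["[SEP]"] + text.split() + ["[SEP]"]
    # Segment 0 covers [CLS] + context + the first [SEP]; segment 1 covers the rest.
    boundary = len(context.split()) + 2
    token_type_ids = [0] * boundary + [1] * (len(tokens) - boundary)
    return tokens[:max_length], token_type_ids[:max_length]

tokens, segments = encode_pair("harga sembako", "harga beras naik terus")
print(tokens)    # ['[CLS]', 'harga', 'sembako', '[SEP]', 'harga', 'beras', 'naik', 'terus', '[SEP]']
print(segments)  # [0, 0, 0, 0, 1, 1, 1, 1, 1]
```

The real tokenizer produces subword IDs rather than whole words, but the segment structure it builds for a `(context, text)` pair is the same.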
## Performance
Evaluated on a held-out validation set of 2,107 samples.
| Metric | Value |
|---|---|
| Accuracy | 96.06% |
| F1 Macro | 0.949 |
| F1 Weighted | 0.961 |
| Precision Macro | 0.947 |
| Recall Macro | 0.952 |
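For readers unfamiliar with the metric names in the table, here is a toy illustration of how accuracy, macro-F1, and weighted-F1 relate. The confusion matrix below is invented, not the model's; it only shows the definitions.

```python
# Toy confusion matrix (made-up numbers, NOT this model's results).
# Rows = true class, columns = predicted class: [Negatif, Positif]
cm = [[90, 10],   # 100 true Negatif
      [5, 45]]    #  50 true Positif

def f1_per_class(cm, c):
    tp = cm[c][c]
    fp = sum(cm[r][c] for r in range(2)) - tp  # predicted c, but wrong
    fn = sum(cm[c]) - tp                       # true c, but missed
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

support = [sum(row) for row in cm]
f1 = [f1_per_class(cm, c) for c in range(2)]
accuracy = (cm[0][0] + cm[1][1]) / sum(support)
f1_macro = sum(f1) / 2                                          # unweighted mean
f1_weighted = sum(f * s for f, s in zip(f1, support)) / sum(support)  # support-weighted
print(round(accuracy, 3), round(f1_macro, 3), round(f1_weighted, 3))
# 0.9 0.89 0.901
```

Macro-F1 treats both classes equally, which is why it is the stricter number on an imbalanced dataset like this one.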
### Comparison with 3-class variant
| Model | Classes | Accuracy | F1 Macro |
|---|---|---|---|
| This model (binary) | 2 (Negatif, Positif) | 96.06% | 0.949 |
| indobert-sentiment-classifier | 3 (Negatif, Netral, Positif) | 88.1% | 0.856 |
The binary model achieves higher metrics because the task is simpler (no ambiguous Netral class). Choose binary when you only care about polarity; choose 3-class when you need to distinguish neutral text.
## Usage
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("apriandito/indobert-binary-sentiment-classifier")
model = AutoModelForSequenceClassification.from_pretrained("apriandito/indobert-binary-sentiment-classifier")
model.eval()

LABELS = {0: "Negatif", 1: "Positif"}

context = "harga sembako"
text = "harga beras naik terus bikin rakyat susah"

# Encode as a sentence pair: [CLS] context [SEP] text [SEP]
encoding = tokenizer(context, text, truncation=True, max_length=256, return_tensors="pt")

with torch.no_grad():
    probs = torch.softmax(model(**encoding).logits, dim=-1)[0]

pred = torch.argmax(probs).item()
print(f"{LABELS[pred]} ({probs[pred]:.4f})")
# Output: Negatif (0.9987)
```
## Why Context Matters
Standard sentiment models classify text in isolation. This can lead to errors when sentiment depends on context:
| Context | Text | With Context | Without Context |
|---|---|---|---|
| harga sembako | harganya gila-gilaan | Negatif | ??? |
| produk luxury | harganya gila-gilaan | Positif | ??? |
| korupsi | KPK tangkap bupati korupsi dana bansos | Positif | Negatif |
| polusi udara | Jakarta peringkat 1 paling berpolusi | Negatif | Positif |
## Training Details
- Data: Derived from the same 31,360 context-text pairs used in the 3-class model. Netral samples (17,315) were dropped, leaving 14,045 binary samples (10,357 Negatif / 3,688 Positif).
- Epochs: 5 (best at epoch 4, early stopping patience 2)
- Batch size: 16
- Learning rate: 2e-5
- Optimizer: AdamW (weight decay 0.01, warmup ratio 0.1)
- Loss: CrossEntropyLoss with class weights (Negatif: 0.678, Positif: 1.904)
- GPU: NVIDIA RTX 4090
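The class weights listed above are consistent with the common "balanced" weighting scheme, w_c = N / (num_classes × count_c), applied to the class counts from the data bullet. A quick check (the scheme itself is an inference from the numbers, not stated in the original training code):

```python
# Reproduce the reported class weights from the class counts above using the
# "balanced" scheme: w_c = N / (num_classes * count_c).
counts = {"Negatif": 10357, "Positif": 3688}
total = sum(counts.values())  # 14045
weights = {label: total / (len(counts) * n) for label, n in counts.items()}
print({k: round(v, 3) for k, v in weights.items()})
# {'Negatif': 0.678, 'Positif': 1.904}
```

This up-weights the minority Positif class so the loss is not dominated by the roughly 3:1 Negatif majority.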
## When to Use Which Model
| Use Case | Recommended Model |
|---|---|
| Filter irrelevant text first | indobert-relevancy-classifier |
| Full sentiment breakdown (pos/neu/neg) | indobert-sentiment-classifier |
| Polarity only (pos/neg) | This model |
### Suggested Pipeline
Raw Text → Relevancy Filter → Sentiment Analysis (3-class or binary)
- Use indobert-relevancy-classifier to filter relevant text
- Use this model (or the 3-class variant) to classify sentiment on relevant text only
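The two-stage pipeline above can be sketched as follows. The `predict_relevant` and `predict_sentiment` functions here are hypothetical placeholders (a toy heuristic and a constant) standing in for the two fine-tuned models; in practice each would wrap an `AutoModelForSequenceClassification` call as shown in the Usage section.

```python
# Sketch of the suggested pipeline: filter for relevancy, then classify
# sentiment only on the surviving texts. The predict_* bodies are toy
# placeholders for the real model calls.

def predict_relevant(context: str, text: str) -> bool:
    # Placeholder: would call indobert-relevancy-classifier.
    return context.split()[0] in text  # toy heuristic for illustration

def predict_sentiment(context: str, text: str) -> str:
    # Placeholder: would call the binary sentiment model.
    return "Negatif"  # toy constant for illustration

def analyze(context: str, texts: list[str]) -> list[tuple[str, str]]:
    # Stage 1: drop irrelevant text; stage 2: classify what remains.
    return [(t, predict_sentiment(context, t))
            for t in texts if predict_relevant(context, t)]

results = analyze("harga sembako", [
    "harga beras naik terus bikin rakyat susah",  # relevant
    "cuaca hari ini cerah sekali",                # irrelevant
])
print(results)
# [('harga beras naik terus bikin rakyat susah', 'Negatif')]
```

Filtering first keeps the sentiment model from producing confident but meaningless polarity labels on off-topic text.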
## Related Models
| Model | Task | Labels | Accuracy |
|---|---|---|---|
| indobert-relevancy-classifier | Relevancy | Relevant / Not Relevant | 96.5% |
| indobert-sentiment-classifier | Sentiment (3-class) | Negatif / Netral / Positif | 88.1% |
| indobert-binary-sentiment-classifier | Sentiment (binary) | Negatif / Positif | 96.06% |
All three models share the same architecture (IndoBERT Large P2, 335M params) and the same context-conditioned input format ([CLS] context [SEP] text [SEP]).
## Citation
```bibtex
@misc{saputra2026indobert-binary-sentiment,
  title={IndoBERT Binary Sentiment Classifier: Context-Conditioned Binary Sentiment Classification for Indonesian Text},
  author={Saputra, Muhammad Apriandito Arya},
  year={2026},
  publisher={Hugging Face},
  url={https://huggingface.co/apriandito/indobert-binary-sentiment-classifier}
}
```