---
language:
- id
license: mit
tags:
- sentiment-analysis
- indonesian
- indobert
- text-classification
- context-conditioned
datasets:
- custom
base_model: indobenchmark/indobert-large-p2
pipeline_tag: text-classification
model-index:
- name: indobert-binary-sentiment-classifier
  results:
  - task:
      type: text-classification
      name: Binary Sentiment Classification
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.9606
    - name: F1 Macro
      type: f1
      value: 0.9494
---

# IndoBERT Binary Sentiment Classifier

A context-conditioned **binary** sentiment classifier for Indonesian text, fine-tuned from [IndoBERT Large P2](https://huggingface.co/indobenchmark/indobert-large-p2) (335M parameters). This is the binary variant of [apriandito/indobert-sentiment-classifier](https://huggingface.co/apriandito/indobert-sentiment-classifier) (3-class), designed for use cases that only need polarity detection (Negatif / Positif) without a Netral class.

Like its sibling models, this model evaluates sentiment *in relation to a given topic context*. This suits topic-specific analysis such as brand monitoring, public opinion polling, and crisis detection, where the same wording can carry opposite polarity depending on the topic.

## Model Details

| Property | Value |
|----------|-------|
| **Base model** | [indobenchmark/indobert-large-p2](https://huggingface.co/indobenchmark/indobert-large-p2) (335M params) |
| **Task** | Context-conditioned binary sentiment classification |
| **Labels** | `NEGATIF` (0), `POSITIF` (1) |
| **Input format** | `[CLS] context [SEP] text [SEP]` |
| **Max length** | 256 tokens |
| **Training data** | 14,045 context-text pairs across 188 topics |

## Performance

Evaluated on a held-out validation set of 2,107 samples.

| Metric | Value |
|--------|-------|
| **Accuracy** | **96.06%** |
| **F1 Macro** | **0.949** |
| F1 Weighted | 0.961 |
| Precision Macro | 0.947 |
| Recall Macro | 0.952 |
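
F1 Macro is lower than F1 Weighted here because macro averaging gives both classes equal weight, while weighted averaging scales each class by its support, so the minority class counts for less. The sketch below illustrates that difference on hypothetical toy labels (not this model's predictions). The helper names are illustrative only.

```python
def f1_for_label(y_true, y_pred, label):
    """F1 score with `label` treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_and_weighted_f1(y_true, y_pred, labels=(0, 1)):
    """Return (macro F1, support-weighted F1) over the given labels."""
    scores = {label: f1_for_label(y_true, y_pred, label) for label in labels}
    support = {label: sum(1 for t in y_true if t == label) for label in labels}
    macro = sum(scores.values()) / len(labels)
    weighted = sum(scores[label] * support[label] for label in labels) / len(y_true)
    return macro, weighted


# Hypothetical imbalanced toy data: 8 Negatif (0) and 2 Positif (1),
# with one Positif sample misclassified as Negatif.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 8 + [1, 0]

macro, weighted = macro_and_weighted_f1(y_true, y_pred)
print(f"macro={macro:.3f} weighted={weighted:.3f}")
```

On imbalanced data, one error on the minority class pulls the macro score down more than the weighted score, which is why macro F1 is the stricter of the two figures to report.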

### Comparison with 3-class variant

| Model | Classes | Accuracy | F1 Macro |
|-------|---------|----------|----------|
| **This model (binary)** | 2 (Negatif, Positif) | **96.06%** | **0.949** |
| [indobert-sentiment-classifier](https://huggingface.co/apriandito/indobert-sentiment-classifier) | 3 (Negatif, Netral, Positif) | 88.1% | 0.856 |

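> The binary model scores higher largely because the task is simpler (no ambiguous Netral class). The two models are evaluated on different label sets, so the figures are not a like-for-like comparison. Choose binary when you only care about polarity; choose 3-class when you need to distinguish neutral text.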

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("apriandito/indobert-binary-sentiment-classifier")
model = AutoModelForSequenceClassification.from_pretrained("apriandito/indobert-binary-sentiment-classifier")
model.eval()

LABELS = {0: "Negatif", 1: "Positif"}

context = "harga sembako"
text = "harga beras naik terus bikin rakyat susah"

encoding = tokenizer(context, text, truncation=True, max_length=256, return_tensors="pt")
with torch.no_grad():
    probs = torch.softmax(model(**encoding).logits, dim=-1)[0]
    pred = torch.argmax(probs).item()

print(f"{LABELS[pred]} ({probs[pred]:.4f})")
# Output: Negatif (0.9987)
```
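
The confidence printed above is the softmax probability of the predicted class. As a worked illustration of that step, the sketch below applies softmax to a pair of hypothetical logits in plain Python; the logit values are made up for the example and are not taken from the model.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


LABELS = {0: "Negatif", 1: "Positif"}

# Hypothetical logits for one input: [Negatif, Positif]
logits = [3.2, -3.4]

probs = softmax(logits)
pred = max(range(len(probs)), key=probs.__getitem__)
print(f"{LABELS[pred]} ({probs[pred]:.4f})")
```

With two classes, the softmax probability depends only on the difference between the two logits, so a confidence close to 1.0 means the logits are far apart.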

## Why Context Matters

Standard sentiment models classify text in isolation. This can lead to errors when sentiment depends on context:

| Context | Text | With Context | Without Context |
|---------|------|--------------|-----------------|
| harga sembako | harganya gila-gilaan | **Negatif** | Ambiguous |
| produk luxury | harganya gila-gilaan | **Positif** | Ambiguous |
| korupsi | KPK tangkap bupati korupsi dana bansos | **Positif** | Negatif |
| polusi udara | Jakarta peringkat 1 paling berpolusi | **Negatif** | Positif |

## Training Details

- **Data**: Derived from the same 31,360 context-text pairs used in the 3-class model. Netral samples (17,315) were dropped, leaving 14,045 binary samples (10,357 Negatif / 3,688 Positif).
- **Epochs**: 5 (best at epoch 4, early stopping patience 2)
- **Batch size**: 16
- **Learning rate**: 2e-5
- **Optimizer**: AdamW (weight decay 0.01), with learning-rate warmup at a ratio of 0.1
- **Loss**: CrossEntropyLoss with class weights (Negatif: 0.678, Positif: 1.904)
- **GPU**: NVIDIA RTX 4090
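
The class weights listed above are consistent with the common inverse-frequency ("balanced") heuristic, `total / (n_classes * class_count)`, applied to the class counts in the data description. The card does not state how the weights were derived, so treat the formula as an assumption; the function name is illustrative.

```python
def balanced_class_weights(counts):
    """Inverse-frequency weights: total / (n_classes * class_count).

    Assumption: this heuristic is not confirmed by the model card; it is
    shown because it reproduces the reported weights.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


# Class counts taken from the training data description above.
counts = {"Negatif": 10_357, "Positif": 3_688}

weights = balanced_class_weights(counts)
for label, weight in weights.items():
    print(f"{label}: {weight:.3f}")
# Negatif: 0.678
# Positif: 1.904
```

In PyTorch, weights like these can be passed as a tensor to the `weight` argument of `torch.nn.CrossEntropyLoss`, ordered by label id (0 for `NEGATIF`, 1 for `POSITIF`).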

## When to Use Which Model

| Use Case | Recommended Model |
|----------|-------------------|
| Filter irrelevant text first | [indobert-relevancy-classifier](https://huggingface.co/apriandito/indobert-relevancy-classifier) |
| Full sentiment breakdown (pos/neu/neg) | [indobert-sentiment-classifier](https://huggingface.co/apriandito/indobert-sentiment-classifier) |
| Polarity only (pos/neg) | **This model** |

### Suggested Pipeline

```
Raw Text → Relevancy Filter → Sentiment Analysis (3-class or binary)
```

1. Use [indobert-relevancy-classifier](https://huggingface.co/apriandito/indobert-relevancy-classifier) to filter relevant text
2. Use this model (or the 3-class variant) to classify sentiment on relevant text only
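
The two steps above can be sketched as a small function that takes the two classifiers as callables. This is a minimal sketch, not the author's implementation: `run_pipeline`, `is_relevant`, and `classify_sentiment` are hypothetical names, and the keyword-based stand-ins below only exist so the example runs without downloading any model.

```python
from typing import Callable, Dict, Iterable, List


def run_pipeline(
    context: str,
    texts: Iterable[str],
    is_relevant: Callable[[str, str], bool],
    classify_sentiment: Callable[[str, str], str],
) -> List[Dict[str, str]]:
    """Keep only texts relevant to `context`, then label their sentiment."""
    results = []
    for text in texts:
        # Step 1: drop text that is not about the topic.
        if not is_relevant(context, text):
            continue
        # Step 2: classify sentiment on the remaining text only.
        results.append({"text": text, "sentiment": classify_sentiment(context, text)})
    return results


# Hypothetical stand-ins for the two models (simple keyword rules).
def fake_is_relevant(context: str, text: str) -> bool:
    return "harga" in text


def fake_classify_sentiment(context: str, text: str) -> str:
    return "Negatif" if "naik" in text else "Positif"


texts = ["harga beras naik terus", "selamat pagi semuanya", "harga cabai turun"]
results = run_pipeline("harga sembako", texts, fake_is_relevant, fake_classify_sentiment)
for row in results:
    print(row["sentiment"], "|", row["text"])
```

In practice, each stand-in would wrap a call to the corresponding model, passing the same `context` and `text` pair to both.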

## Related Models

| Model | Task | Labels | Accuracy |
|-------|------|--------|----------|
| [indobert-relevancy-classifier](https://huggingface.co/apriandito/indobert-relevancy-classifier) | Relevancy | Relevant / Not Relevant | 96.5% |
| [indobert-sentiment-classifier](https://huggingface.co/apriandito/indobert-sentiment-classifier) | Sentiment (3-class) | Negatif / Netral / Positif | 88.1% |
| **indobert-binary-sentiment-classifier** | **Sentiment (binary)** | **Negatif / Positif** | **96.06%** |

All three models share the same architecture (IndoBERT Large P2, 335M params) and the same context-conditioned input format (`[CLS] context [SEP] text [SEP]`).

## Citation

```bibtex
@misc{saputra2026indobert-binary-sentiment,
  title={IndoBERT Binary Sentiment Classifier: Context-Conditioned Binary Sentiment Classification for Indonesian Text},
  author={Saputra, Muhammad Apriandito Arya},
  year={2026},
  publisher={Hugging Face},
  url={https://huggingface.co/apriandito/indobert-binary-sentiment-classifier}
}
```