---

library_name: transformers
license: apache-2.0
base_model: indobenchmark/indobert-base-p1
language:
    - id
datasets:
    - Bangkah/atha-text-dataset
tags:
    - sentiment-analysis
    - text-classification
    - indonesian
metrics:
    - accuracy
    - f1
---


# atha-text-classifier

This model is indobenchmark/indobert-base-p1 fine-tuned for 3-class Indonesian sentiment classification.

Output labels:

- `negative`
- `neutral`
- `positive`

Training data: https://huggingface.co/datasets/Bangkah/atha-text-dataset

## Quick Use (Transformers)

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "Bangkah/atha-text-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Tokenize a single Indonesian review and run inference without gradients.
text = "produk ini bagus dan pengirimannya cepat"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
with torch.no_grad():
    logits = model(**inputs).logits

# Convert logits to probabilities and map the top class id to its label.
probs = torch.softmax(logits, dim=-1)[0]
label_id = int(torch.argmax(probs).item())
label = model.config.id2label[label_id]
score = float(probs[label_id].item())
print({"label": label, "confidence": round(score, 4)})
```
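The post-processing step in the snippet above (softmax over the logits, argmax, then an `id2label` lookup) can be illustrated without downloading the model. The logits below are hypothetical values for one input, and the label map mirrors the three classes listed earlier:

```python
import math

# Hypothetical logits for one input, ordered negative/neutral/positive.
logits = [-1.2, 0.3, 2.9]
id2label = {0: "negative", 1: "neutral", 2: "positive"}

# Numerically stable softmax: subtract the max before exponentiating.
m = max(logits)
exps = [math.exp(x - m) for x in logits]
total = sum(exps)
probs = [e / total for e in exps]

# Pick the highest-probability class and map its id to a label string.
label_id = max(range(len(probs)), key=probs.__getitem__)
print({"label": id2label[label_id], "confidence": round(probs[label_id], 4)})
```

This is exactly what `torch.softmax` + `torch.argmax` do in the snippet above; the sketch just makes the arithmetic explicit.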

## Limitations

- The training dataset is synthetic, so the high metrics below do not directly reflect production performance.
- For production use cases, fine-tune again on real data from your application domain.

## Validation Metrics

- Loss: 0.0004
- Accuracy: 1.0000
- Macro F1: 1.0000

## Confusion Matrix

| true\pred | negative | neutral | positive |
|---|---:|---:|---:|
| negative | 100 | 0 | 0 |
| neutral | 0 | 100 | 0 |
| positive | 0 | 0 | 100 |
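As a sanity check, the accuracy and macro F1 reported above can be recomputed from this confusion matrix (rows are true labels, columns are predictions):

```python
# Confusion matrix from the table above: rows = true, cols = predicted.
cm = [
    [100, 0, 0],  # negative
    [0, 100, 0],  # neutral
    [0, 0, 100],  # positive
]

n = len(cm)
total = sum(sum(row) for row in cm)
accuracy = sum(cm[i][i] for i in range(n)) / total

# Per-class F1 from true positives, false positives, false negatives.
f1s = []
for i in range(n):
    tp = cm[i][i]
    fp = sum(cm[r][i] for r in range(n)) - tp  # predicted class i, but wrong
    fn = sum(cm[i][c] for c in range(n)) - tp  # true class i, but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1s.append(2 * precision * recall / (precision + recall) if precision + recall else 0.0)

macro_f1 = sum(f1s) / n
print({"accuracy": accuracy, "macro_f1": macro_f1})  # both 1.0 for this matrix
```

With a perfectly diagonal matrix both numbers come out to 1.0, matching the validation metrics above.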

## Classification Report

```text
              precision    recall  f1-score   support

    negative     1.0000    1.0000    1.0000       100
     neutral     1.0000    1.0000    1.0000       100
    positive     1.0000    1.0000    1.0000       100

    accuracy                         1.0000       300
   macro avg     1.0000    1.0000    1.0000       300
weighted avg     1.0000    1.0000    1.0000       300
```