Fine-tuned IndoBERT for Indonesian Person and Address Extraction
Model summary
This model is a fine-tuned version of indobenchmark/indobert-base-p1 for token classification on short Indonesian transactional text.
The model predicts two target entity types:
- PER: person name
- ADDR: address or address detail
The token-label space is:
- O
- B-PER
- I-PER
- B-ADDR
- I-ADDR
Intended use
This model is intended for Indonesian short transactional utterances such as:
- transfer instructions
- account-owner replies
- short recipient-name replies
- electricity-payment requests
- short address replies
- noisy or informal chat-style transaction text
It is intended for research and experimental use in named entity recognition / sensitive entity extraction for:
- person names
- address-like spans
Training data
The model was fine-tuned on a synthetic token-classification corpus with:
- 10,678 unique records in total
- 8,522 training records
- 1,048 validation records
- 1,108 internal synthetic test records
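As a quick arithmetic check, the reported split sizes can be summed and expressed as shares of the corpus. The snippet below uses only the counts stated above; the dictionary keys are illustrative names, not identifiers from the project.

```python
# Split sizes as reported in this model card.
splits = {"train": 8522, "validation": 1048, "test": 1108}
total = sum(splits.values())

# The three splits should add up to the reported 10,678 unique records.
assert total == 10678

# Print each split with its share of the full corpus.
for name, count in splits.items():
    print(f"{name:<10} {count:>6}  {count / total:.1%}")
```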
The synthetic generator introduces variation at three levels:
- sentence-level variation
- entity-level variation
- noise-level variation
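The actual generator is not described in detail here, so the following is only a minimal sketch of how three levels of variation could be combined in a template-based generator. All templates, names, addresses, the abbreviation table, and the `generate` function are hypothetical and were written for this illustration; they are not taken from the training corpus.

```python
import random

# Hypothetical templates and value pools, for illustration only.
TEMPLATES = [
    ["transfer", "ke", "{PER}"],
    ["tolong", "bayar", "listrik", "di", "{ADDR}"],
    ["atas", "nama", "{PER}"],
]
PERSONS = [["Asep", "Nainggolan"], ["Siti", "Rahma"]]
ADDRESSES = [
    ["Jalan", "Cideng", "Timur", "No.", "28"],
    ["Jl.", "Melati", "Blok", "C", "No.", "5"],
]
NOISE = {"transfer": "tf", "tolong": "tlg", "ke": "k"}


def bio(tokens, tag):
    """Label a span with B-<tag> for the first token and I-<tag> for the rest."""
    return [("B-" if i == 0 else "I-") + tag for i in range(len(tokens))]


def generate(rng, noise_prob=0.3):
    """Return one (tokens, labels) pair with BIO labels."""
    tokens, labels = [], []
    for slot in rng.choice(TEMPLATES):      # sentence-level variation
        if slot == "{PER}":
            span = rng.choice(PERSONS)      # entity-level variation
            tokens += span
            labels += bio(span, "PER")
        elif slot == "{ADDR}":
            span = rng.choice(ADDRESSES)    # entity-level variation
            tokens += span
            labels += bio(span, "ADDR")
        else:
            # Noise-level variation: sometimes abbreviate a context word.
            if slot in NOISE and rng.random() < noise_prob:
                slot = NOISE[slot]
            tokens.append(slot)
            labels.append("O")
    return tokens, labels


rng = random.Random(0)  # fixed seed so repeated runs give the same samples
for _ in range(3):
    toks, labs = generate(rng)
    print(list(zip(toks, labs)))
```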
The corpus includes transfer, electricity-payment, ambiguity, and short-reply cases, including:
- formal instructions
- answer-style replies
- noisy or abbreviated chat forms
- person/bank ambiguity
- person/road-name ambiguity
- abbreviated and full-form addresses
- block/unit/apartment/ruko-style addresses
- control cases with no target entities
External benchmark
The model was externally evaluated on a separate frozen reviewed benchmark with:
- 320 cases
- 16 categories
This benchmark is distinct from the synthetic train/validation/test split.
Reported overall benchmark results on that reviewed benchmark:
| Metric | Value |
|---|---|
| Person Precision | 0.9296 |
| Person Recall | 0.9429 |
| Person F1 | 0.9362 |
| Address Precision | 0.8730 |
| Address Recall | 0.9167 |
| Address F1 | 0.8943 |
| Full exact match | 0.9406 |
These external benchmark values are the most realistic summary of model behavior on the final reviewed evaluation set, because the 320 benchmark cases are separate from the synthetic fine-tuning corpus.
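The reported F1 values can be cross-checked against the reported precision and recall, assuming F1 here is the usual harmonic mean of the two. The sketch below uses only the numbers from the table; the function name `f1` is illustrative.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the benchmark table.
reported = {
    "Person": (0.9296, 0.9429, 0.9362),
    "Address": (0.8730, 0.9167, 0.8943),
}

for name, (p, r, reported_f1) in reported.items():
    computed = f1(p, r)
    print(f"{name}: computed F1 = {computed:.4f}, reported F1 = {reported_f1:.4f}")
```

Rounded to four decimals, both computed values agree with the table.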
Internal synthetic metrics
The following values are internal synthetic metrics, computed on the synthetic validation and synthetic test splits derived from the same generated corpus family used for fine-tuning:
| Split | Overall F1 | PER F1 | ADDR F1 |
|---|---|---|---|
| Validation | 1.0000 | 1.0000 | 1.0000 |
| Test | 1.0000 | 1.0000 | 1.0000 |
These values should therefore be interpreted as internal synthetic-fit results, not as the main external generalization result. For external performance, refer to the 320-case benchmark reported above.
Data and benchmark availability
The fine-tuning dataset splits, the frozen benchmark dataset, and the benchmark comparison results are available in the project repository.
Readers looking for replication materials should refer in particular to:
- the fine-tuning dataset (training, validation, test)
- the reviewed benchmark dataset
- the benchmark comparison outputs across the published models
Label mapping
The model configuration stores the following mapping:
- 0 -> O
- 1 -> B-PER
- 2 -> I-PER
- 3 -> B-ADDR
- 4 -> I-ADDR
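To illustrate how this mapping turns per-token label ids into entity spans, here is a minimal BIO decoding sketch. The `ID2LABEL` dictionary mirrors the mapping above; `decode_spans` and the sample tokens and ids are hypothetical and written by hand, not produced by the model.

```python
ID2LABEL = {0: "O", 1: "B-PER", 2: "I-PER", 3: "B-ADDR", 4: "I-ADDR"}


def decode_spans(tokens, label_ids):
    """Group tokens into (entity_type, text) spans using the BIO scheme."""
    spans, current_type, current_tokens = [], None, []
    for token, label_id in zip(tokens, label_ids):
        prefix, _, entity_type = ID2LABEL[label_id].partition("-")
        # An I- tag of the same type extends the span that is currently open.
        if prefix == "I" and entity_type == current_type:
            current_tokens.append(token)
            continue
        # Anything else closes the open span, if there is one.
        if current_type is not None:
            spans.append((current_type, " ".join(current_tokens)))
        if prefix in ("B", "I"):
            # A stray I- tag with no matching open span starts a new span.
            current_type, current_tokens = entity_type, [token]
        else:
            current_type, current_tokens = None, []
    if current_type is not None:
        spans.append((current_type, " ".join(current_tokens)))
    return spans


tokens = ["Transfer", "ke", "Asep", "Nainggolan"]
label_ids = [0, 0, 1, 2]  # hand-written ids for illustration
print(decode_spans(tokens, label_ids))  # [('PER', 'Asep Nainggolan')]
```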
Example usage
    from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

    model_id = "ericodh/indobert-id-bankchat-name-address-ner"

    # Load the fine-tuned tokenizer and token-classification model.
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForTokenClassification.from_pretrained(model_id)

    # "simple" aggregation groups adjacent tokens of the same entity type into one span.
    ner = pipeline(
        "token-classification",
        model=model,
        tokenizer=tokenizer,
        aggregation_strategy="simple",
    )

    text = "Transfer ke Asep Nainggolan dan bayarkan listrik rumah yang di Jalan Cideng Timur No. 28."
    print(ner(text))
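With an aggregation strategy set, the pipeline returns a list of dictionaries with the keys `entity_group`, `score`, `word`, `start`, and `end`. The `word` field is rebuilt from tokens and may differ from the input in casing or spacing, so one option is to slice the original text by character offsets instead. The sketch below shows that post-processing step. The helper `group_entities` is hypothetical, and `mock_predictions` is a hand-written stand-in with placeholder scores, not real model output.

```python
from collections import defaultdict


def group_entities(text, predictions, min_score=0.0):
    """Collect aggregated predictions into a dict such as {"PER": [...], "ADDR": [...]}."""
    grouped = defaultdict(list)
    for pred in predictions:
        if pred["score"] >= min_score:
            # Slice the original text so casing and spacing are preserved.
            grouped[pred["entity_group"]].append(text[pred["start"]:pred["end"]])
    return dict(grouped)


text = "Transfer ke Asep Nainggolan dan bayarkan listrik rumah yang di Jalan Cideng Timur No. 28."

# Hand-written stand-in for pipeline output; the scores are placeholders.
mock_predictions = [
    {"entity_group": "PER", "score": 0.99, "word": "asep nainggolan", "start": 12, "end": 27},
    {"entity_group": "ADDR", "score": 0.98, "word": "jalan cideng timur no. 28", "start": 63, "end": 88},
]

print(group_entities(text, mock_predictions))
```

In real use, `mock_predictions` would be replaced by the result of `ner(text)`, and `min_score` could be raised to drop low-confidence spans.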
Repository contents
For Hugging Face upload, the core files are:
- config.json
- model.safetensors
- tokenizer.json
- tokenizer_config.json
- special_tokens_map.json
- vocab.txt
- README.md
Optional supporting files already present in the artifact directory:
- metrics.json
- trainer_state.json
- training_args.bin
Limitations
- The model is specialized for short Indonesian transactional text rather than general long-form NER.
- It focuses on PER and ADDR, not a broad general-purpose entity taxonomy.
- Synthetic training data reduces coverage gaps but does not remove all risk of template memorization.
- Performance is stronger on the intended domain than on unrelated language domains or document styles.