---
license: apache-2.0
base_model: bartpho
tags:
- vietnamese
- hate-speech
- span-detection
- token-classification
- nlp
datasets:
- visolex/ViHOS
model-index:
- name: bartpho-hsd-span
  results:
  - task:
      type: token-classification
      name: Hate Speech Span Detection
    dataset:
      name: visolex/ViHOS
      type: visolex/ViHOS
    metrics:
    - type: f1
      value: 0.3361
    - type: precision
      value: 0.5521
    - type: recall
      value: 0.5095
    - type: exact_match
      value: 0.0226
---

# bartpho-hsd-span: Hate Speech Span Detection (Vietnamese)

This model is a fine-tuned version of [bartpho](https://huggingface.co/bartpho) for Vietnamese **Hate Speech Span Detection**.

## Model Details

- Base Model: `bartpho`
- Description: Vietnamese Hate Speech Span Detection
- Framework: HuggingFace Transformers
- Task: Hate Speech Span Detection (token/char-level spans)

### Hyperparameters

- Max sequence length: `64`
- Learning rate: `5e-6`
- Batch size: `32`
- Epochs: `100`
- Early stopping patience: `5`

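With an epoch cap of `100` and a patience of `5`, training usually ends well before the cap. The sketch below illustrates how these two settings might interact. It assumes a validation score where higher is better; this card does not state which metric was monitored, and the function name `run_with_early_stopping` and its `val_scores` argument are hypothetical stand-ins for a real training loop.

```python
def run_with_early_stopping(val_scores, max_epochs=100, patience=5):
    """Return (epochs_run, best_epoch) for a sequence of per-epoch validation scores.

    val_scores is a hypothetical stand-in for evaluating the model after each epoch.
    Assumption: higher scores are better.
    """
    best_score = float("-inf")
    best_epoch = 0
    epochs_without_improvement = 0
    epochs_run = 0
    for epoch, score in enumerate(val_scores[:max_epochs], start=1):
        epochs_run = epoch
        if score > best_score:
            # New best: remember it and reset the patience counter
            best_score = score
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # no improvement for `patience` consecutive epochs
    return epochs_run, best_epoch
```

For example, if the score peaks at epoch 3 and never improves afterwards, this loop stops after epoch 8 and reports epoch 3 as the best checkpoint.
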
## Results

- F1: `0.3361`
- Precision: `0.5521`
- Recall: `0.5095`
- Exact Match: `0.0226`

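This card does not describe how the metrics above were computed. As an illustration only, the sketch below shows one common way to score span predictions for a single example: compare the sets of character positions covered by predicted and gold spans. The function name `span_scores` and the `(start, end)` span format with an exclusive end are assumptions, not the evaluation code used for this model.

```python
def span_scores(pred_spans, gold_spans):
    """Character-level precision, recall, F1 and exact match for one example.

    Spans are (start, end) character offsets with an exclusive end.
    This is an assumed scoring scheme, not the one used to produce the card's numbers.
    """
    pred_chars = {i for start, end in pred_spans for i in range(start, end)}
    gold_chars = {i for start, end in gold_spans for i in range(start, end)}

    # Convention chosen here: an example with no gold and no predicted spans is perfect
    if not pred_chars and not gold_chars:
        return {"precision": 1.0, "recall": 1.0, "f1": 1.0, "exact_match": 1.0}

    overlap = len(pred_chars & gold_chars)
    precision = overlap / len(pred_chars) if pred_chars else 0.0
    recall = overlap / len(gold_chars) if gold_chars else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    exact_match = float(pred_chars == gold_chars)
    return {"precision": precision, "recall": recall, "f1": f1, "exact_match": exact_match}
```

Dataset-level numbers would then come from averaging these per-example scores. Under that kind of averaging, the mean F1 generally differs from the harmonic mean of the mean precision and mean recall.
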
## Usage

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification
import torch

model_name = "visolex/bartpho-hsd-span"
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)

# Example Vietnamese sentence with hateful content
text = "Ví dụ câu tiếng Việt có nội dung thù ghét ..."
enc = tok(text, return_tensors="pt", truncation=True, max_length=256, is_split_into_words=False)

# Inference only, so skip gradient tracking
with torch.no_grad():
    logits = model(**enc).logits  # shape: (1, seq_len, num_labels)

# One predicted label id per token of the single input sentence
pred_ids = logits.argmax(-1)[0].tolist()
# TODO: convert pred_ids -> spans according to your label scheme (BIO/BILOU/char-offset)
```

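The conversion from per-token predictions to spans depends on the label scheme, which this card does not specify. The sketch below assumes a BIO-style scheme with hypothetical label names `B-T`, `I-T`, and `O`, and assumes you already have one `(start, end)` character offset per token. In practice the label strings could come from `model.config.id2label`, and the offsets from calling the tokenizer with `return_offsets_mapping=True`, which only works if the checkpoint ships a fast tokenizer. The helper name `labels_to_char_spans` is illustrative.

```python
def labels_to_char_spans(labels, offsets, outside_label="O"):
    """Merge consecutive non-outside tokens into (start, end) character spans.

    labels:  one label string per token (assumed BIO-style, e.g. "B-T", "I-T", "O")
    offsets: one (start, end) character offset per token, end exclusive
    """
    spans = []
    current = None  # [start, end] of the span currently being built
    for label, (start, end) in zip(labels, offsets):
        if start == end:
            # Zero-width offsets typically mark special tokens; treat them as outside
            label = outside_label
        if label == outside_label:
            if current is not None:
                spans.append(tuple(current))
                current = None
        elif label.startswith("B-") or current is None:
            # A "B-" label (or a stray "I-" with no open span) starts a new span
            if current is not None:
                spans.append(tuple(current))
            current = [start, end]
        else:
            # An "I-" label extends the open span
            current[1] = end
    if current is not None:
        spans.append(tuple(current))
    return spans


# Toy input with placeholder tokens; the offsets are written out by hand
text = "aaa bbb ccc ddd"
offsets = [(0, 0), (0, 3), (4, 7), (8, 11), (12, 15), (0, 0)]
labels = ["O", "O", "B-T", "I-T", "O", "O"]
spans = labels_to_char_spans(labels, offsets)
print(spans)                          # [(4, 11)]
print([text[s:e] for s, e in spans])  # ['bbb ccc']
```

Working in character offsets keeps the output independent of the tokenizer, so the spans can be mapped straight back onto the original text.
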
## License

Apache-2.0

## Acknowledgments

- Base model: [bartpho](https://huggingface.co/bartpho)