| | --- |
| | library_name: transformers |
| | language: |
| | - my |
| | - en |
| | --- |
| | |
| | # Burmese-Bert |
| |
|
| | Burmese-Bert is a bilingual masked language model based on `bert-large-uncased`. |
| |
|
| | The architecture is BERT (Bidirectional Encoder Representations from Transformers). |
| |
|
| | It supports English and Burmese. |
| |
|
| | ## Model Details |
| |
|
| | Coming Soon |
| |
|
| | ### Model Description |
| |
|
| | - **Developed by:** Min Si Thu |
| | - **Model type:** Bidirectional Encoder Representations from Transformers (BERT) |
| | - **Language(s) (NLP):** Burmese, English |
| | - **License:** [More Information Needed] |
| | - **Finetuned from model [optional]:** [More Information Needed] |
| |
|
| | ### Model Sources [optional] |
| |
|
| | - **Repository:** [More Information Needed] |
| | - **Paper [optional]:** [More Information Needed] |
| | - **Demo [optional]:** [More Information Needed] |
| |
|
| | ## Uses |
| |
|
| | - Masked language modeling (mask filling) |
| | - Burmese Natural Language Understanding |
| |
|
| | ### How to use |
| |
|
| | ```shell |
| | # install the dependencies |
| | pip install transformers |
| | ``` |
| |
|
| | ```python |
| | import torch |
| | from transformers import AutoModelForMaskedLM, AutoTokenizer |
| | |
| | model_checkpoint = "jojo-ai-mst/BurmeseBert" |
| | model = AutoModelForMaskedLM.from_pretrained(model_checkpoint) |
| | tokenizer = AutoTokenizer.from_pretrained(model_checkpoint) |
| | |
| | text = "This is a great [MASK]." |
| | |
| | inputs = tokenizer(text, return_tensors="pt") |
| | # Inference only, so skip gradient tracking |
| | with torch.no_grad(): |
| |     token_logits = model(**inputs).logits |
| | # Find the location of [MASK] and extract its logits |
| | mask_token_index = torch.where(inputs["input_ids"] == tokenizer.mask_token_id)[1] |
| | mask_token_logits = token_logits[0, mask_token_index, :] |
| | # Pick the [MASK] candidates with the highest logits |
| | top_5_tokens = torch.topk(mask_token_logits, 5, dim=1).indices[0].tolist() |
| | |
| | # Print the sentence once per candidate, with [MASK] filled in |
| | for token in top_5_tokens: |
| |     print(f"'>>> {text.replace(tokenizer.mask_token, tokenizer.decode([token]))}'") |
| | ``` |
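
As a shorter alternative to the manual steps above, the `fill-mask` pipeline in `transformers` wraps tokenization, the forward pass, and top-k selection in one call. The sketch below assumes the checkpoint loads through that pipeline the same way it loads with `AutoModelForMaskedLM`, and that the input contains exactly one `[MASK]`. `fill_mask` and `format_predictions` are hypothetical helper names used for illustration, not part of the model repository.

```python
def fill_mask(text, checkpoint="jojo-ai-mst/BurmeseBert", top_k=5):
    """Return (token, score) pairs for the single [MASK] in `text`.

    Assumes the checkpoint works with the fill-mask pipeline.
    """
    # Imported lazily so format_predictions can be used on its own.
    from transformers import pipeline

    unmasker = pipeline("fill-mask", model=checkpoint)
    # With one mask, the pipeline returns a list of dicts that include
    # the keys "token_str" and "score".
    return [(p["token_str"], p["score"]) for p in unmasker(text, top_k=top_k)]


def format_predictions(predictions):
    """Render (token, score) pairs as text lines, highest score first."""
    ranked = sorted(predictions, key=lambda pair: pair[1], reverse=True)
    return [f"{token:<12} {score:.3f}" for token, score in ranked]
```

Calling `format_predictions(fill_mask("This is a great [MASK]."))` downloads the checkpoint on first use and returns one line per candidate token.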
| |
|
| | ## Citation [optional] |
| |
|
| | Coming Soon |