Sentence Similarity
sentence-transformers
Safetensors
English
bert
feature-extraction
Generated from Trainer
dataset_size:6300
loss:MatryoshkaLoss
loss:MultipleNegativesRankingLoss
Eval Results (legacy)
text-embeddings-inference
Instructions to use riphunter7001x/bge-base-financial with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- sentence-transformers
How to use riphunter7001x/bge-base-financial with sentence-transformers:
from sentence_transformers import SentenceTransformer model = SentenceTransformer("riphunter7001x/bge-base-financial") sentences = [ "As of January 31, 2023, the Company's net operating loss and capital loss carryforwards totaled approximately $32.3 billion.", "What was the percentage change in general and administrative expenses in 2023 compared to 2022?", "What was the amount of the company's net operating loss and capital loss carryforwards as of January 31, 2023?", "What are common challenges in pharmaceutical research and development?" ] embeddings = model.encode(sentences) similarities = model.similarity(embeddings, embeddings) print(similarities.shape) # [4, 4] - Notebooks
- Google Colab
- Kaggle
| { | |
| "added_tokens_decoder": { | |
| "0": { | |
| "content": "[PAD]", | |
| "lstrip": false, | |
| "normalized": false, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": true | |
| }, | |
| "100": { | |
| "content": "[UNK]", | |
| "lstrip": false, | |
| "normalized": false, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": true | |
| }, | |
| "101": { | |
| "content": "[CLS]", | |
| "lstrip": false, | |
| "normalized": false, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": true | |
| }, | |
| "102": { | |
| "content": "[SEP]", | |
| "lstrip": false, | |
| "normalized": false, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": true | |
| }, | |
| "103": { | |
| "content": "[MASK]", | |
| "lstrip": false, | |
| "normalized": false, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": true | |
| } | |
| }, | |
| "clean_up_tokenization_spaces": true, | |
| "cls_token": "[CLS]", | |
| "do_basic_tokenize": true, | |
| "do_lower_case": true, | |
| "mask_token": "[MASK]", | |
| "model_max_length": 512, | |
| "never_split": null, | |
| "pad_token": "[PAD]", | |
| "sep_token": "[SEP]", | |
| "strip_accents": null, | |
| "tokenize_chinese_chars": true, | |
| "tokenizer_class": "BertTokenizer", | |
| "unk_token": "[UNK]" | |
| } | |