This is a sentence-transformers model finetuned from flax-sentence-embeddings/all_datasets_v4_MiniLM-L6 on the json dataset. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
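The Pooling and Normalize stages listed above can be illustrated with a minimal NumPy sketch. It assumes token embeddings of shape (batch, sequence, 384) and an attention mask marking real tokens; the function name `mean_pool_and_normalize` and the random inputs are hypothetical stand-ins, not the library's internals.

```python
import numpy as np

def mean_pool_and_normalize(token_embeddings, attention_mask):
    """Average token vectors over non-padding positions, then L2-normalize."""
    mask = attention_mask[..., None].astype(token_embeddings.dtype)  # (batch, seq, 1)
    summed = (token_embeddings * mask).sum(axis=1)                   # (batch, dim)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                   # (batch, 1)
    pooled = summed / counts
    norms = np.linalg.norm(pooled, axis=1, keepdims=True)
    return pooled / np.clip(norms, 1e-12, None)

# Toy stand-in for the transformer output (assumption: random values, not real model output).
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(2, 128, 384))
attention_mask = np.zeros((2, 128), dtype=np.int64)
attention_mask[0, :10] = 1   # first sentence has 10 real tokens
attention_mask[1, :25] = 1   # second sentence has 25 real tokens

sentence_embeddings = mean_pool_and_normalize(token_embeddings, attention_mask)
print(sentence_embeddings.shape)
```

Because the output vectors have unit length, cosine similarity between two of them equals their dot product.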
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("FareedKhan/flax-sentence-embeddings_all_datasets_v4_MiniLM-L6_FareedKhan_prime_synthetic_data_2k_10_32")
# Run inference
sentences = [
'\nAtypical hemolytic uremic syndrome (aHUS) with H factor anomaly is a disease characterized by an atypical form of hemolytic uremic syndrome, a severe thrombotic microangiopathy that leads to kidney failure, anemia, and thrombocytopenia. This specific subtype of aHUS is notable for its association with an anomaly in the H factor, potentially involving complement system dysregulation. As such, it falls under the broader category of hemolytic uremic syndrome, a condition marked by differential diagnosis complexity and distinct etiologies. Patients with aHUS often require a nuanced approach to diagnosis and management, emphasizing awareness of its distinct characteristics in comparison with other forms of hemolytic uremic syndrome, ensuring comprehensive and accurate differential diagnosis which might include conditions like thrombotic thrombocytopenic purpura (TTP) or disseminated intravascular coagulation (DIC). The identification and management of aHUS with H factor anomaly necessitates multidisciplinary collaboration and up-to-date knowledge alongside genetic and clinical features specific to this condition.',
'Could you list the diseases related to or subtypes of type 1 atypical hemolytic uremic syndrome for differential diagnosis purposes?',
'Which diseases are associated with anomalies in the CD4 gene or protein, alongside genetic mutations that impact muscle protein synthesis?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
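For semantic search, the similarity scores are typically used to rank candidate passages against a query. The following is a minimal sketch of that ranking step, using small toy vectors in place of `model.encode` output; the helper `rank_by_cosine` is a hypothetical name introduced here for illustration.

```python
import numpy as np

def rank_by_cosine(query_embedding, corpus_embeddings, top_k=3):
    """Return (index, score) pairs for the top_k corpus rows most similar to the query."""
    q = query_embedding / np.linalg.norm(query_embedding)
    c = corpus_embeddings / np.linalg.norm(corpus_embeddings, axis=1, keepdims=True)
    scores = c @ q                        # cosine similarity per corpus row
    order = np.argsort(-scores)[:top_k]   # highest score first
    return [(int(i), float(scores[i])) for i in order]

# Toy 3-dimensional vectors (assumption: stand-ins for 384-dimensional embeddings).
corpus = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.7, 0.7, 0.0]])
query = np.array([0.9, 0.1, 0.0])

hits = rank_by_cosine(query, corpus, top_k=2)
for index, score in hits:
    print(index, round(score, 4))
```

With real embeddings, `corpus` would hold the encoded passages and `query` the encoded question.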
Dataset: dim_384, evaluated with InformationRetrievalEvaluator

| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.3861 |
| cosine_accuracy@3 | 0.4604 |
| cosine_accuracy@5 | 0.4901 |
| cosine_accuracy@10 | 0.5149 |
| cosine_precision@1 | 0.3861 |
| cosine_precision@3 | 0.1535 |
| cosine_precision@5 | 0.098 |
| cosine_precision@10 | 0.0515 |
| cosine_recall@1 | 0.3861 |
| cosine_recall@3 | 0.4604 |
| cosine_recall@5 | 0.4901 |
| cosine_recall@10 | 0.5149 |
| cosine_ndcg@10 | 0.4514 |
| cosine_mrr@10 | 0.4312 |
| cosine_map@100 | 0.4383 |
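The values in the table are consistent with each query having a single relevant passage: recall@k equals accuracy@k, and precision@k equals accuracy@k divided by k (for example, 0.4604 / 3 ≈ 0.1535). Under that assumption, the metrics can be sketched as follows; `ir_metrics` and the example ranks are hypothetical and not taken from the evaluator's code.

```python
def ir_metrics(first_relevant_ranks, k):
    """Accuracy@k, precision@k, recall@k and MRR@k, assuming one relevant document per query.

    first_relevant_ranks holds the 1-based rank of the relevant document for each query,
    or None when it was not retrieved at all.
    """
    n = len(first_relevant_ranks)
    hits = [r is not None and r <= k for r in first_relevant_ranks]
    accuracy = sum(hits) / n
    precision = accuracy / k   # one relevant item out of k retrieved
    recall = accuracy          # the single relevant item is either found or not
    mrr = sum(1.0 / r for r, hit in zip(first_relevant_ranks, hits) if hit) / n
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "mrr": mrr}

# Made-up ranks for five queries (illustrative only).
ranks = [1, 3, None, 2, 7]
metrics_at_3 = ir_metrics(ranks, k=3)
print(metrics_at_3)
```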
Columns: positive and anchor

| | positive | anchor |
|---|---|---|
| type | string | string |
| details | | |

Samples:

| positive | anchor |
|---|---|
| | Search for medical conditions not treatable by any known medications that present with hoarseness as a symptom. |
| | What could be the condition causing frequent stomach discomfort, nausea, appetite loss, fatigue, and weakness in me, possibly linked to a family history of Cestode infection and associated with vitamin B12 deficiency and abnormal red blood cells resembling Biermer's anemia symptoms? |
| | Which cellular structures engage in interactions with genes or proteins that are affected by the administration of Mevastatin? |
MatryoshkaLoss with these parameters:
{
    "loss": "MultipleNegativesRankingLoss",
    "matryoshka_dims": [
        384
    ],
    "matryoshka_weights": [
        1
    ],
    "n_dims_per_step": -1
}
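To make the loss configuration concrete, here is a minimal NumPy sketch of an in-batch-negatives ranking loss wrapped in a Matryoshka-style weighted sum over truncated dimensions. It is an illustration under stated assumptions, not the library implementation: the function names are hypothetical, and the similarity scale of 20 is assumed rather than taken from this card.

```python
import numpy as np

def mnr_loss(anchors, positives, scale=20.0):
    """Cross-entropy over scaled cosine similarities; other pairs in the batch act as negatives."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = scale * (a @ p.T)                            # (batch, batch); diagonal = true pairs
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

def matryoshka_loss(anchors, positives, dims=(384,), weights=(1.0,)):
    """Weighted sum of the base loss on embeddings truncated to each listed dimension."""
    return sum(w * mnr_loss(anchors[:, :d], positives[:, :d]) for d, w in zip(dims, weights))

# Toy batch of 4 pairs (assumption: random vectors standing in for model embeddings).
rng = np.random.default_rng(0)
anchors = rng.normal(size=(4, 384))
positives = anchors + 0.1 * rng.normal(size=(4, 384))

print(matryoshka_loss(anchors, positives))
```

With a single entry of 384 in `matryoshka_dims` and a weight of 1, the sum has one term over the full embedding, so in this sketch the wrapped loss equals the base ranking loss.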
Non-default hyperparameters:

- eval_strategy: epoch
- per_device_train_batch_size: 32
- learning_rate: 1e-05
- num_train_epochs: 10
- warmup_ratio: 0.1
- bf16: True
- tf32: False
- load_best_model_at_end: True

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: epoch
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 1e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 10
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: False
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional

| Epoch | Step | Training Loss | dim_384_cosine_map@100 |
|---|---|---|---|
| 0 | 0 | - | 0.3748 |
| 0.1754 | 10 | 1.5606 | - |
| 0.3509 | 20 | 1.5914 | - |
| 0.5263 | 30 | 1.6623 | - |
| 0.7018 | 40 | 1.7258 | - |
| 0.8772 | 50 | 1.6031 | - |
| 1.0 | 57 | - | 0.4241 |
| 1.0526 | 60 | 1.4494 | - |
| 1.2281 | 70 | 1.4091 | - |
| 1.4035 | 80 | 1.3177 | - |
| 1.5789 | 90 | 1.3299 | - |
| 1.7544 | 100 | 1.459 | - |
| 1.9298 | 110 | 1.3534 | - |
| 2.0 | 114 | - | 0.4214 |
| 2.1053 | 120 | 1.3023 | - |
| 2.2807 | 130 | 1.2222 | - |
| 2.4561 | 140 | 1.2191 | - |
| 2.6316 | 150 | 1.0443 | - |
| 2.8070 | 160 | 1.1894 | - |
| 2.9825 | 170 | 1.0955 | - |
| 3.0 | 171 | - | 0.4156 |
| 3.1579 | 180 | 1.1698 | - |
| 3.3333 | 190 | 0.9699 | - |
| 3.5088 | 200 | 1.0524 | - |
| 3.6842 | 210 | 0.9902 | - |
| 3.8596 | 220 | 1.0943 | - |
| 4.0 | 228 | - | 0.4221 |
| 4.0351 | 230 | 0.9793 | - |
| 4.2105 | 240 | 0.9786 | - |
| 4.3860 | 250 | 1.0352 | - |
| 4.5614 | 260 | 0.9809 | - |
| 4.7368 | 270 | 0.8568 | - |
| 4.9123 | 280 | 0.9372 | - |
| 5.0 | 285 | - | 0.4264 |
| 5.0877 | 290 | 0.8529 | - |
| 5.2632 | 300 | 0.9472 | - |
| 5.4386 | 310 | 0.8436 | - |
| 5.6140 | 320 | 0.8166 | - |
| 5.7895 | 330 | 0.8731 | - |
| 5.9649 | 340 | 0.9489 | - |
| 6.0 | 342 | - | 0.4274 |
| 6.1404 | 350 | 0.9991 | - |
| 6.3158 | 360 | 0.7533 | - |
| 6.4912 | 370 | 0.9122 | - |
| 6.6667 | 380 | 0.8404 | - |
| 6.8421 | 390 | 0.7928 | - |
| 7.0 | 399 | - | 0.4302 |
| 7.0175 | 400 | 0.8332 | - |
| 7.1930 | 410 | 0.7534 | - |
| 7.3684 | 420 | 0.8424 | - |
| 7.5439 | 430 | 0.8465 | - |
| 7.7193 | 440 | 0.8461 | - |
| 7.8947 | 450 | 0.7203 | - |
| 8.0 | 456 | - | 0.4344 |
| 8.0702 | 460 | 0.8144 | - |
| 8.2456 | 470 | 0.7895 | - |
| 8.4211 | 480 | 0.7665 | - |
| 8.5965 | 490 | 0.883 | - |
| 8.7719 | 500 | 0.6908 | - |
| 8.9474 | 510 | 0.8481 | - |
| 9.0 | 513 | - | 0.4365 |
| 9.1228 | 520 | 0.7521 | - |
| 9.2982 | 530 | 0.6971 | - |
| 9.4737 | 540 | 0.7081 | - |
| 9.6491 | 550 | 0.8272 | - |
| 9.8246 | 560 | 0.7922 | - |
| 10.0 | 570 | 0.7998 | 0.4383 |
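The log shows 57 optimizer steps per epoch and 570 steps in total. Combined with learning_rate: 1e-05, warmup_ratio: 0.1, and lr_scheduler_type: linear, the learning rate at each logged step can be approximated with the sketch below. The function `linear_schedule_lr` is a hypothetical helper, and rounding the warmup length to 57 steps is an assumption about how the ratio is applied.

```python
TOTAL_STEPS = 570      # final step in the training log
STEPS_PER_EPOCH = 57   # step logged at epoch 1.0
PEAK_LR = 1e-05        # learning_rate
WARMUP_RATIO = 0.1     # warmup_ratio

def linear_schedule_lr(step, total_steps=TOTAL_STEPS, peak_lr=PEAK_LR, warmup_ratio=WARMUP_RATIO):
    """Linear warmup from 0 to peak_lr, then linear decay back to 0 at total_steps."""
    warmup_steps = round(total_steps * warmup_ratio)  # assumption: ratio is rounded to whole steps
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    return peak_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

for step in (0, 10, 57, 285, 570):
    epoch = step / STEPS_PER_EPOCH
    print(f"step {step:>3}  epoch {epoch:.4f}  lr {linear_schedule_lr(step):.2e}")
```

Under these assumptions the warmup spans roughly the first epoch, which may be one reason the training loss does not start falling until after step 57.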
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}