SentenceTransformer based on hyrinmansoor/text2frappe-s2-sbert

This is a sentence-transformers model finetuned from hyrinmansoor/text2frappe-s2-sbert. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: hyrinmansoor/text2frappe-s2-sbert
  • Maximum Sequence Length: 128 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity
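
Once loaded, these values can be read back from the model object. A quick check using the standard Sentence Transformers API:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("hyrinmansoor/text2frappe-s2-sbert")
print(model.max_seq_length)                       # 128
print(model.get_sentence_embedding_dimension())   # 384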

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
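
For reference, the same two-module stack (BERT encoder followed by mean pooling) can be assembled by hand through the modules API. This is an illustrative sketch only; the published checkpoint already ships with exactly this configuration:

from sentence_transformers import SentenceTransformer, models

# Transformer encoder truncating inputs at 128 tokens, followed by mean pooling
# over token embeddings to produce a single 384-dimensional sentence vector.
word_embedding_model = models.Transformer(
    "hyrinmansoor/text2frappe-s2-sbert", max_seq_length=128
)
pooling_model = models.Pooling(
    word_embedding_model.get_word_embedding_dimension(),  # 384
    pooling_mode="mean",
)
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])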

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("hyrinmansoor/text2frappe-s2-sbert")
# Run inference
sentences = [
    'Doctype: Supplier\nQuestion: Vendors tally per country?',
    'country: supplier country',
    'default_currency: currency used',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 1.0000,  0.4299, -0.3074],
#         [ 0.4299,  1.0000,  0.0622],
#         [-0.3074,  0.0622,  1.0000]])
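
Beyond pairwise similarity, the same embeddings can drive semantic search over candidate field descriptions, which matches the doctype-field retrieval framing of the training data. A minimal sketch with made-up candidate fields:

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("hyrinmansoor/text2frappe-s2-sbert")

# Hypothetical question and candidate field descriptions, for illustration only.
query = "Doctype: Supplier\nQuestion: Vendors tally per country?"
fields = [
    "country: supplier country",
    "default_currency: currency used",
    "supplier_name: name of the supplier",
]

query_emb = model.encode(query, convert_to_tensor=True)
field_embs = model.encode(fields, convert_to_tensor=True)

# Rank the candidate fields by cosine similarity to the question.
hits = util.semantic_search(query_emb, field_embs, top_k=3)[0]
for hit in hits:
    print(fields[hit["corpus_id"]], round(hit["score"], 4))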

Training Details

Training Dataset

Unnamed Dataset

  • Size: 92,692 training samples
  • Columns: sentence_0, sentence_1, and sentence_2
  • Approximate statistics based on the first 1000 samples:
    • sentence_0 (string): min: 9 tokens, mean: 18.14 tokens, max: 37 tokens
    • sentence_1 (string): min: 5 tokens, mean: 11.06 tokens, max: 27 tokens
    • sentence_2 (string): min: 3 tokens, mean: 10.69 tokens, max: 24 tokens
  • Samples:
    • sentence_0: Doctype: Employee | Question: List employees with designation "Senior Manager".
      sentence_1: designation: Designation of the employee.
      sentence_2: date_of_joining: Date when the employee joined.
    • sentence_0: Doctype: Company | Question: Give me the tax ID, company name, and establishment date for every business.
      sentence_1: company_name: The official name of the company.
      sentence_2: company_description: Description of the company.
    • sentence_0: Doctype: Item | Question: Which items have product variants and on what basis?
      sentence_1: variant_based_on: The basis for item variants.
      sentence_2: customer_items: Customer-specific item details.
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.COSINE",
        "triplet_margin": 0.3
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 15
  • multi_dataset_batch_sampler: round_robin
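
As a rough illustration, a comparable fine-tuning run can be wired up with the Sentence Transformers trainer API. The sketch below uses the TripletLoss parameters and the non-default hyperparameters listed on this card; the triplet rows are hypothetical placeholders in the same format as the samples above, and output_dir is an assumed path, not the actual training script:

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import TripletLoss, TripletDistanceMetric

model = SentenceTransformer("hyrinmansoor/text2frappe-s2-sbert")

# Hypothetical (anchor, positive, negative) triplets mirroring the sample format above.
train_dataset = Dataset.from_dict({
    "sentence_0": ['Doctype: Employee\nQuestion: List employees with designation "Senior Manager".'],
    "sentence_1": ["designation: Designation of the employee."],
    "sentence_2": ["date_of_joining: Date when the employee joined."],
})

# TripletLoss with cosine distance and a 0.3 margin, as configured above.
loss = TripletLoss(
    model,
    distance_metric=TripletDistanceMetric.COSINE,
    triplet_margin=0.3,
)

# Non-default hyperparameters from this card; everything else keeps its default.
args = SentenceTransformerTrainingArguments(
    output_dir="output/text2frappe-s2-sbert",  # assumed placeholder path
    num_train_epochs=15,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    multi_dataset_batch_sampler="round_robin",
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()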

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 15
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss
0.0863 500 0.0392
0.1726 1000 0.0294
0.2589 1500 0.0249
0.3452 2000 0.0158
0.4315 2500 0.0124
0.5178 3000 0.0102
0.6041 3500 0.0083
0.6904 4000 0.0064
0.7767 4500 0.0067
0.8630 5000 0.0057
0.9493 5500 0.0058
1.0356 6000 0.0049
1.1219 6500 0.0041
1.2081 7000 0.0036
1.2944 7500 0.0044
1.3807 8000 0.0038
1.4670 8500 0.0032
1.5533 9000 0.0035
1.6396 9500 0.0037
1.7259 10000 0.0034
1.8122 10500 0.003
1.8985 11000 0.0027
1.9848 11500 0.0028
2.0711 12000 0.0023
2.1574 12500 0.0021
2.2437 13000 0.0021
2.3300 13500 0.0021
2.4163 14000 0.0021
2.5026 14500 0.0022
2.5889 15000 0.002
2.6752 15500 0.0021
2.7615 16000 0.002
2.8478 16500 0.0019
2.9341 17000 0.0019
3.0204 17500 0.0016
3.1067 18000 0.0011
3.1930 18500 0.0012
3.2793 19000 0.0016
3.3656 19500 0.0015
3.4518 20000 0.0013
3.5381 20500 0.0013
3.6244 21000 0.0008
3.7107 21500 0.0013
3.7970 22000 0.0012
3.8833 22500 0.0017
3.9696 23000 0.0011
4.0559 23500 0.0006
4.1422 24000 0.0007
4.2285 24500 0.001
4.3148 25000 0.0009
4.4011 25500 0.001
4.4874 26000 0.0006
4.5737 26500 0.0009
4.6600 27000 0.0008
4.7463 27500 0.0008
4.8326 28000 0.001
4.9189 28500 0.0008
5.0052 29000 0.0008
5.0915 29500 0.0007
5.1778 30000 0.0007
5.2641 30500 0.0006
5.3504 31000 0.0005
5.4367 31500 0.0006
5.5230 32000 0.0007
5.6093 32500 0.0006
5.6955 33000 0.0005
5.7818 33500 0.0006
5.8681 34000 0.0007
5.9544 34500 0.0007
6.0407 35000 0.0006
6.1270 35500 0.0004
6.2133 36000 0.0005
6.2996 36500 0.0003
6.3859 37000 0.0004
6.4722 37500 0.0003
6.5585 38000 0.0005
6.6448 38500 0.0005
6.7311 39000 0.0003
6.8174 39500 0.0005
6.9037 40000 0.0004
6.9900 40500 0.0006
7.0763 41000 0.0004
7.1626 41500 0.0003
7.2489 42000 0.0004
7.3352 42500 0.0003
7.4215 43000 0.0005
7.5078 43500 0.0005
7.5941 44000 0.0002
7.6804 44500 0.0002
7.7667 45000 0.0004
7.8530 45500 0.0004
7.9392 46000 0.0003
8.0255 46500 0.0003
8.1118 47000 0.0003
8.1981 47500 0.0003
8.2844 48000 0.0002
8.3707 48500 0.0002
8.4570 49000 0.0004
8.5433 49500 0.0002
8.6296 50000 0.0002
8.7159 50500 0.0002
8.8022 51000 0.0002
8.8885 51500 0.0002
8.9748 52000 0.0002
9.0611 52500 0.0001
9.1474 53000 0.0001
9.2337 53500 0.0002
9.3200 54000 0.0002
9.4063 54500 0.0002
9.4926 55000 0.0001
9.5789 55500 0.0001
9.6652 56000 0.0002
9.7515 56500 0.0001
9.8378 57000 0.0003
9.9241 57500 0.0001
10.0104 58000 0.0001
10.0967 58500 0.0001
10.1829 59000 0.0001
10.2692 59500 0.0001
10.3555 60000 0.0001
10.4418 60500 0.0002
10.5281 61000 0.0001
10.6144 61500 0.0002
10.7007 62000 0.0002
10.7870 62500 0.0002
10.8733 63000 0.0001
10.9596 63500 0.0001
11.0459 64000 0.0002
11.1322 64500 0.0001
11.2185 65000 0.0001
11.3048 65500 0.0001
11.3911 66000 0.0001
11.4774 66500 0.0001
11.5637 67000 0.0001
11.6500 67500 0.0001
11.7363 68000 0.0001
11.8226 68500 0.0
11.9089 69000 0.0001
11.9952 69500 0.0
12.0815 70000 0.0
12.1678 70500 0.0
12.2541 71000 0.0001
12.3404 71500 0.0001
12.4266 72000 0.0001
12.5129 72500 0.0001
12.5992 73000 0.0
12.6855 73500 0.0001
12.7718 74000 0.0001
12.8581 74500 0.0001
12.9444 75000 0.0001
13.0307 75500 0.0
13.1170 76000 0.0001
13.2033 76500 0.0001
13.2896 77000 0.0
13.3759 77500 0.0
13.4622 78000 0.0
13.5485 78500 0.0001
13.6348 79000 0.0001
13.7211 79500 0.0
13.8074 80000 0.0
13.8937 80500 0.0001
13.9800 81000 0.0
14.0663 81500 0.0
14.1526 82000 0.0
14.2389 82500 0.0
14.3252 83000 0.0
14.4115 83500 0.0001
14.4978 84000 0.0
14.5841 84500 0.0
14.6703 85000 0.0001
14.7566 85500 0.0001
14.8429 86000 0.0
14.9292 86500 0.0

Framework Versions

  • Python: 3.12.11
  • Sentence Transformers: 5.1.0
  • Transformers: 4.56.0
  • PyTorch: 2.8.0+cu126
  • Accelerate: 1.10.1
  • Datasets: 4.0.0
  • Tokenizers: 0.22.0
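
To mirror this environment, the listed versions can be pinned at install time (plus a matching PyTorch 2.8.0 build), for example:

pip install sentence-transformers==5.1.0 transformers==4.56.0 accelerate==1.10.1 datasets==4.0.0 tokenizers==0.22.0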

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

TripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}