train_record_1745950254

This model is a PEFT adapter fine-tuned from mistralai/Mistral-7B-Instruct-v0.3 on the record dataset (a loading sketch follows the results below). It achieves the following results on the evaluation set:

  • Loss: 0.4941
  • Num Input Tokens Seen: 60870656
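Since the framework list below includes PEFT and the model tree names the adapter repo rbelanec/train_record_1745950254, loading it for inference should look roughly like the sketch below; the dtype and device settings are assumptions, not tested claims.

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE_ID = "mistralai/Mistral-7B-Instruct-v0.3"
ADAPTER_ID = "rbelanec/train_record_1745950254"  # adapter repo on the Hub

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
base = AutoModelForCausalLM.from_pretrained(
    BASE_ID, torch_dtype=torch.bfloat16, device_map="auto"  # assumed settings
)
# Attach the fine-tuned adapter weights on top of the frozen base model.
model = PeftModel.from_pretrained(base, ADAPTER_ID)
model.eval()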

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • training_steps: 40000
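For reference, a minimal sketch reproducing the listed values as Hugging Face TrainingArguments; the output_dir and the 200-step evaluation cadence are inferred from the results table below rather than stated in the original configuration.

from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_record_1745950254",  # assumed
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=123,
    gradient_accumulation_steps=2,   # effective train batch size: 4
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    max_steps=40_000,
    eval_strategy="steps",           # inferred from the eval-every-200-steps table
    eval_steps=200,
)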

Training results

Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen
0.8115 0.0064 200 1.5858 306656
1.7881 0.0128 400 1.1619 608000
0.9015 0.0192 600 0.9876 914496
0.932 0.0256 800 0.8973 1218512
0.9304 0.0320 1000 0.8336 1521872
0.8057 0.0384 1200 0.7928 1824176
0.7164 0.0448 1400 0.7639 2127280
0.7748 0.0512 1600 0.7435 2429264
0.5404 0.0576 1800 0.7271 2736704
0.6634 0.0640 2000 0.7104 3041072
1.0893 0.0704 2200 0.6960 3343472
0.5612 0.0768 2400 0.6853 3649488
0.5582 0.0832 2600 0.6775 3953824
0.5903 0.0896 2800 0.6691 4257904
0.6387 0.0960 3000 0.6626 4560128
0.5909 0.1024 3200 0.6562 4865424
0.6038 0.1088 3400 0.6508 5169024
0.4876 0.1152 3600 0.6441 5477808
0.7048 0.1216 3800 0.6394 5785408
0.5233 0.1280 4000 0.6348 6091152
0.7181 0.1344 4200 0.6323 6393712
0.6783 0.1408 4400 0.6271 6700592
0.5696 0.1472 4600 0.6254 7006048
0.5197 0.1536 4800 0.6217 7307424
0.6421 0.1600 5000 0.6180 7613952
0.5504 0.1664 5200 0.6143 7920784
0.6105 0.1728 5400 0.6104 8223472
0.5743 0.1792 5600 0.6074 8527408
0.5296 0.1856 5800 0.6054 8834416
0.4067 0.1920 6000 0.6036 9138176
0.7168 0.1985 6200 0.6011 9444880
0.461 0.2049 6400 0.5976 9746832
0.7191 0.2113 6600 0.5960 10049776
0.4289 0.2177 6800 0.5925 10352672
0.4285 0.2241 7000 0.5903 10655360
0.6494 0.2305 7200 0.5882 10961872
0.5142 0.2369 7400 0.5867 11265584
0.6324 0.2433 7600 0.5849 11571872
0.5091 0.2497 7800 0.5823 11879344
0.7383 0.2561 8000 0.5801 12182400
0.4138 0.2625 8200 0.5772 12484784
0.5738 0.2689 8400 0.5762 12786640
0.6092 0.2753 8600 0.5741 13088384
0.626 0.2817 8800 0.5725 13394624
0.5067 0.2881 9000 0.5700 13699248
0.5356 0.2945 9200 0.5686 14003792
0.7087 0.3009 9400 0.5670 14305968
0.5085 0.3073 9600 0.5657 14607328
0.5123 0.3137 9800 0.5639 14911184
0.5891 0.3201 10000 0.5626 15217456
0.624 0.3265 10200 0.5614 15521408
0.5232 0.3329 10400 0.5594 15825776
0.6099 0.3393 10600 0.5583 16132208
0.5191 0.3457 10800 0.5565 16434512
0.576 0.3521 11000 0.5550 16737888
0.5733 0.3585 11200 0.5528 17040512
0.7007 0.3649 11400 0.5517 17343664
0.7309 0.3713 11600 0.5503 17646512
0.44 0.3777 11800 0.5488 17949824
0.5858 0.3841 12000 0.5480 18252224
0.5043 0.3905 12200 0.5468 18558800
0.7766 0.3969 12400 0.5457 18863792
0.4641 0.4033 12600 0.5443 19165216
0.5321 0.4097 12800 0.5429 19467712
0.5 0.4161 13000 0.5418 19768464
0.3868 0.4225 13200 0.5408 20071264
0.5129 0.4289 13400 0.5402 20376832
0.5026 0.4353 13600 0.5395 20683904
0.6542 0.4417 13800 0.5381 20987920
0.4867 0.4481 14000 0.5365 21293728
0.5284 0.4545 14200 0.5356 21600160
0.4407 0.4609 14400 0.5345 21906528
0.4864 0.4673 14600 0.5339 22212960
0.465 0.4737 14800 0.5333 22518592
0.5099 0.4801 15000 0.5328 22821472
0.5401 0.4865 15200 0.5316 23124960
0.507 0.4929 15400 0.5299 23428832
0.4591 0.4993 15600 0.5292 23734320
0.6651 0.5057 15800 0.5285 24037968
0.5211 0.5121 16000 0.5284 24344064
0.4678 0.5185 16200 0.5274 24647968
0.4696 0.5249 16400 0.5261 24952928
0.4875 0.5313 16600 0.5255 25257664
0.441 0.5377 16800 0.5250 25561392
0.4989 0.5441 17000 0.5236 25862992
0.4765 0.5505 17200 0.5230 26169264
0.5298 0.5569 17400 0.5215 26471584
0.4642 0.5633 17600 0.5213 26773280
0.422 0.5697 17800 0.5204 27077600
0.6037 0.5761 18000 0.5198 27381664
0.5 0.5825 18200 0.5192 27688016
0.5292 0.5890 18400 0.5183 27992592
0.3928 0.5954 18600 0.5179 28297696
0.4373 0.6018 18800 0.5176 28603264
0.5141 0.6082 19000 0.5169 28910096
0.562 0.6146 19200 0.5163 29218624
0.4593 0.6210 19400 0.5156 29521824
0.5216 0.6274 19600 0.5154 29825872
0.4665 0.6338 19800 0.5146 30128800
0.3965 0.6402 20000 0.5142 30432080
0.463 0.6466 20200 0.5129 30737872
0.6503 0.6530 20400 0.5125 31041328
0.4899 0.6594 20600 0.5115 31344080
0.5516 0.6658 20800 0.5113 31646576
0.4349 0.6722 21000 0.5107 31951744
0.4682 0.6786 21200 0.5106 32257664
0.6893 0.6850 21400 0.5100 32561408
0.3469 0.6914 21600 0.5095 32868640
0.343 0.6978 21800 0.5089 33175008
0.532 0.7042 22000 0.5088 33481296
0.3321 0.7106 22200 0.5081 33782240
0.6552 0.7170 22400 0.5079 34088112
0.5395 0.7234 22600 0.5079 34390944
0.4636 0.7298 22800 0.5070 34696480
0.5396 0.7362 23000 0.5068 34998048
0.5457 0.7426 23200 0.5064 35302272
0.5577 0.7490 23400 0.5055 35611216
0.6506 0.7554 23600 0.5051 35918512
0.4636 0.7618 23800 0.5045 36223840
0.5077 0.7682 24000 0.5045 36528496
0.4869 0.7746 24200 0.5038 36832464
0.4854 0.7810 24400 0.5034 37139696
0.3727 0.7874 24600 0.5032 37440752
0.6113 0.7938 24800 0.5029 37744512
0.4791 0.8002 25000 0.5027 38050896
0.4708 0.8066 25200 0.5021 38353216
0.5611 0.8130 25400 0.5021 38659776
0.4683 0.8194 25600 0.5018 38963712
0.4308 0.8258 25800 0.5015 39269392
0.6569 0.8322 26000 0.5013 39571936
0.4806 0.8386 26200 0.5010 39875600
0.6366 0.8450 26400 0.5006 40181360
0.5174 0.8514 26600 0.5000 40483344
0.5153 0.8578 26800 0.4996 40786576
0.5619 0.8642 27000 0.4993 41094912
0.3401 0.8706 27200 0.4990 41395648
0.5198 0.8770 27400 0.4989 41695584
0.525 0.8834 27600 0.4985 42001344
0.3574 0.8898 27800 0.4983 42305104
0.4493 0.8962 28000 0.4981 42606944
0.4451 0.9026 28200 0.4979 42909232
0.4438 0.9090 28400 0.4979 43212768
0.419 0.9154 28600 0.4979 43517168
0.4622 0.9218 28800 0.4978 43820288
0.5337 0.9282 29000 0.4974 44124992
0.5332 0.9346 29200 0.4972 44428848
0.3471 0.9410 29400 0.4969 44735232
0.5661 0.9474 29600 0.4967 45040832
0.468 0.9538 29800 0.4968 45343072
0.3829 0.9602 30000 0.4965 45648256
0.5386 0.9666 30200 0.4967 45951600
0.4994 0.9730 30400 0.4964 46252880
0.5633 0.9795 30600 0.4963 46556736
0.6597 0.9859 30800 0.4963 46858080
0.4899 0.9923 31000 0.4961 47163856
0.4158 0.9987 31200 0.4961 47469712
0.379 1.0051 31400 0.4959 47773184
0.5601 1.0115 31600 0.4957 48080192
0.475 1.0179 31800 0.4956 48384816
0.5515 1.0243 32000 0.4956 48688416
0.4853 1.0307 32200 0.4952 48993296
0.4677 1.0371 32400 0.4952 49298128
0.2249 1.0435 32600 0.4952 49602368
0.4776 1.0499 32800 0.4949 49909872
0.6094 1.0563 33000 0.4949 50217632
0.394 1.0627 33200 0.4948 50518112
0.4968 1.0691 33400 0.4949 50821280
0.5216 1.0755 33600 0.4948 51127232
0.5109 1.0819 33800 0.4945 51435040
0.6351 1.0883 34000 0.4946 51738784
0.4113 1.0947 34200 0.4946 52042176
0.5206 1.1011 34400 0.4946 52348928
0.5279 1.1075 34600 0.4945 52651440
0.4664 1.1139 34800 0.4944 52960256
0.4789 1.1203 35000 0.4945 53265840
0.5124 1.1267 35200 0.4945 53570848
0.3906 1.1331 35400 0.4944 53873104
0.4565 1.1395 35600 0.4944 54178960
0.4616 1.1459 35800 0.4946 54486752
0.6993 1.1523 36000 0.4942 54787456
0.3966 1.1587 36200 0.4943 55090272
0.4756 1.1651 36400 0.4943 55393072
0.3546 1.1715 36600 0.4942 55696592
0.4819 1.1779 36800 0.4943 56001936
0.5585 1.1843 37000 0.4943 56306928
0.5079 1.1907 37200 0.4942 56613088
0.6697 1.1971 37400 0.4943 56917104
0.5126 1.2035 37600 0.4943 57225504
0.4852 1.2099 37800 0.4941 57529136
0.3548 1.2163 38000 0.4941 57830720
0.4162 1.2227 38200 0.4944 58135872
0.4424 1.2291 38400 0.4942 58439424
0.4559 1.2355 38600 0.4942 58743296
0.5134 1.2419 38800 0.4942 59046368
0.4145 1.2483 39000 0.4942 59351696
0.3358 1.2547 39200 0.4942 59657728
0.4092 1.2611 39400 0.4941 59960256
0.3274 1.2675 39600 0.4941 60265552
0.5532 1.2739 39800 0.4941 60567296
0.4226 1.2803 40000 0.4941 60870656
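The table is easiest to read as a curve: validation loss drops steeply over the first ~5,000 steps and plateaus near 0.494 by roughly step 35,000. A minimal sketch for plotting it from a Trainer checkpoint's trainer_state.json, which the HF Trainer writes alongside each checkpoint (the checkpoint path here is hypothetical):

import json
import matplotlib.pyplot as plt

with open("checkpoint-40000/trainer_state.json") as f:  # hypothetical path
    history = json.load(f)["log_history"]

# Keep only the evaluation entries; training-loss entries lack "eval_loss".
evals = [(e["step"], e["eval_loss"]) for e in history if "eval_loss" in e]
steps, losses = zip(*evals)

plt.plot(steps, losses)
plt.xlabel("Step")
plt.ylabel("Validation loss")
plt.title("train_record_1745950254 eval loss")
plt.show()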

Framework versions

  • PEFT 0.15.2.dev0
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1
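To reproduce these results it may help to confirm the local environment matches the pins above; a quick check, assuming all five packages are importable:

expected = {
    "peft": "0.15.2.dev0",
    "transformers": "4.51.3",
    "torch": "2.6.0+cu124",
    "datasets": "3.5.0",
    "tokenizers": "0.21.1",
}
for name, want in expected.items():
    got = __import__(name).__version__
    print(f"{name}: {got} (expected {want})")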