base_sami_22k_ft_pseudo_widv_ep60_tde

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set (a sketch of how these metrics can be computed follows the list):

  • Loss: 156.9329
  • WER: 0.1783
  • CER: 0.0520
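The evaluation pipeline is not documented; for reference, word error rate (WER) and character error rate (CER) as reported above can be computed with the Hugging Face `evaluate` library. A minimal sketch, with hypothetical stand-in strings rather than samples from the (unknown) evaluation set:

```python
# Minimal sketch of computing WER/CER with the `evaluate` library.
# The strings below are hypothetical stand-ins, not data from the eval set.
import evaluate

wer = evaluate.load("wer")
cer = evaluate.load("cer")

predictions = ["mun lean sapmelas"]  # hypothetical model transcription
references = ["mun lean sápmelaš"]   # hypothetical ground-truth text

print("WER:", wer.compute(predictions=predictions, references=references))
print("CER:", cer.compute(predictions=predictions, references=references))
```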

Model description

More information needed

Intended uses & limitations

More information needed
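The card does not state the model architecture or input format. A minimal usage sketch, assuming a CTC-style speech recognition model (consistent with the reported WER/CER metrics); the Auto classes, the repo id, and the 16 kHz sampling rate are all assumptions:

```python
# Minimal usage sketch. Architecture is not documented, so the CTC-style Auto
# classes, the hub id, and the sampling rate below are assumptions.
import numpy as np
import torch
from transformers import AutoModelForCTC, AutoProcessor

model_id = "base_sami_22k_ft_pseudo_widv_ep60_tde"  # hypothetical repo id
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForCTC.from_pretrained(model_id)

speech = np.zeros(16000, dtype=np.float32)  # stand-in: 1 s of silence at 16 kHz
inputs = processor(speech, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    logits = model(inputs.input_values).logits

pred_ids = torch.argmax(logits, dim=-1)   # greedy CTC decoding
print(processor.batch_decode(pred_ids))
```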

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0005
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.25
  • num_epochs: 60.0
  • mixed_precision_training: Native AMP
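These values map directly onto `transformers.TrainingArguments`. A minimal sketch; `output_dir` and the per-epoch evaluation strategy are assumptions (the latter consistent with the per-epoch table below), everything else is taken verbatim from the list:

```python
# Sketch reproducing the hyperparameters above with the Trainer API.
# `output_dir` and `eval_strategy` are assumptions; all other values come
# from the hyperparameter list.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="base_sami_22k_ft_pseudo_widv_ep60_tde",  # hypothetical
    learning_rate=5e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.25,
    num_train_epochs=60.0,
    fp16=True,              # "Native AMP" mixed-precision training
    eval_strategy="epoch",  # assumption, matching the per-epoch results table
)
```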

Training results

| Training Loss | Epoch | Step  | Validation Loss | WER    | CER    |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|
| 4165.6836     | 1.0   | 972   | 257.4717        | 0.5061 | 0.1236 |
| 711.8623      | 2.0   | 1944  | 178.0096        | 0.3002 | 0.0752 |
| 536.9188      | 3.0   | 2916  | 177.6306        | 0.3027 | 0.0734 |
| 499.5686      | 4.0   | 3888  | 160.8973        | 0.2520 | 0.0654 |
| 489.7358      | 5.0   | 4860  | 174.7495        | 0.2900 | 0.0775 |
| 508.3668      | 6.0   | 5832  | 193.4865        | 0.2944 | 0.0783 |
| 537.328       | 7.0   | 6804  | 230.1123        | 0.3104 | 0.0846 |
| 553.6925      | 8.0   | 7776  | 256.9771        | 0.3465 | 0.1000 |
| 581.7494      | 9.0   | 8748  | 254.6887        | 0.3680 | 0.1056 |
| 623.7695      | 10.0  | 9720  | 268.8913        | 0.3810 | 0.1115 |
| 652.2123      | 11.0  | 10692 | 315.2755        | 0.4066 | 0.1199 |
| 702.8481      | 12.0  | 11664 | 314.9687        | 0.4256 | 0.1261 |
| 743.4121      | 13.0  | 12636 | 311.7790        | 0.5521 | 0.1613 |
| 799.5601      | 14.0  | 13608 | 359.8279        | 0.4546 | 0.1385 |
| 816.097       | 15.0  | 14580 | 376.9053        | 0.5721 | 0.1836 |
| 815.892       | 16.0  | 15552 | 378.6718        | 0.5346 | 0.1670 |
| 803.2213      | 17.0  | 16524 | 358.1064        | 0.4638 | 0.1451 |
| 783.7681      | 18.0  | 17496 | 332.4148        | 0.4775 | 0.1448 |
| 730.573       | 19.0  | 18468 | 312.7887        | 0.4548 | 0.1391 |
| 722.3447      | 20.0  | 19440 | 316.9329        | 0.4269 | 0.1279 |
| 683.925       | 21.0  | 20412 | 290.0936        | 0.4145 | 0.1273 |
| 650.4165      | 22.0  | 21384 | 288.4570        | 0.3989 | 0.1194 |
| 631.0532      | 23.0  | 22356 | 271.8179        | 0.4043 | 0.1208 |
| 617.0345      | 24.0  | 23328 | 240.8433        | 0.3880 | 0.1139 |
| 579.2521      | 25.0  | 24300 | 261.6252        | 0.3954 | 0.1137 |
| 571.9061      | 26.0  | 25272 | 248.2078        | 0.3651 | 0.1068 |
| 535.8284      | 27.0  | 26244 | 210.3216        | 0.3549 | 0.1017 |
| 528.5526      | 28.0  | 27216 | 223.0247        | 0.3554 | 0.1005 |
| 495.6612      | 29.0  | 28188 | 214.1045        | 0.3421 | 0.1011 |
| 489.6183      | 30.0  | 29160 | 223.9064        | 0.3301 | 0.0963 |
| 465.3683      | 31.0  | 30132 | 204.1657        | 0.3408 | 0.0976 |
| 447.7585      | 32.0  | 31104 | 204.1836        | 0.3147 | 0.0902 |
| 430.5009      | 33.0  | 32076 | 199.8400        | 0.3149 | 0.0906 |
| 423.8555      | 34.0  | 33048 | 188.0466        | 0.2939 | 0.0840 |
| 409.8573      | 35.0  | 34020 | 204.4093        | 0.2827 | 0.0807 |
| 385.368       | 36.0  | 34992 | 187.5865        | 0.2911 | 0.0833 |
| 372.0869      | 37.0  | 35964 | 181.9698        | 0.2730 | 0.0770 |
| 360.2271      | 38.0  | 36936 | 181.9897        | 0.2687 | 0.0768 |
| 340.9098      | 39.0  | 37908 | 188.7458        | 0.2650 | 0.0775 |
| 329.051       | 40.0  | 38880 | 165.1118        | 0.2632 | 0.0753 |
| 311.5036      | 41.0  | 39852 | 184.7068        | 0.2631 | 0.0730 |
| 300.7056      | 42.0  | 40824 | 169.6482        | 0.2503 | 0.0715 |
| 293.3254      | 43.0  | 41796 | 156.1380        | 0.2447 | 0.0705 |
| 272.1289      | 44.0  | 42768 | 174.9299        | 0.2404 | 0.0683 |
| 266.0369      | 45.0  | 43740 | 156.8380        | 0.2368 | 0.0673 |
| 251.5002      | 46.0  | 44712 | 170.4171        | 0.2351 | 0.0665 |
| 235.648       | 47.0  | 45684 | 159.8278        | 0.2227 | 0.0626 |
| 229.8135      | 48.0  | 46656 | 163.2967        | 0.2198 | 0.0625 |
| 220.4877      | 49.0  | 47628 | 167.3510        | 0.2138 | 0.0612 |
| 210.467       | 50.0  | 48600 | 163.8741        | 0.2138 | 0.0615 |
| 208.0692      | 51.0  | 49572 | 159.9132        | 0.2048 | 0.0587 |
| 188.9099      | 52.0  | 50544 | 160.8002        | 0.2096 | 0.0596 |
| 183.2089      | 53.0  | 51516 | 154.5008        | 0.1979 | 0.0564 |
| 174.8595      | 54.0  | 52488 | 157.2374        | 0.1964 | 0.0560 |
| 167.0076      | 55.0  | 53460 | 157.2821        | 0.1918 | 0.0550 |
| 162.867       | 56.0  | 54432 | 158.1763        | 0.1851 | 0.0539 |
| 154.225       | 57.0  | 55404 | 157.7223        | 0.1829 | 0.0532 |
| 149.2202      | 58.0  | 56376 | 159.2531        | 0.1818 | 0.0528 |
| 144.8521      | 59.0  | 57348 | 158.1030        | 0.1796 | 0.0522 |
| 145.8063      | 60.0  | 58320 | 156.3732        | 0.1787 | 0.0519 |

Framework versions

  • Transformers 4.48.3
  • Pytorch 2.5.1
  • Datasets 3.2.0
  • Tokenizers 0.21.0