# base_sami_22k_ft_pseudo_widv_ep60_tde
This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 156.9329
- WER: 0.1783
- CER: 0.0520
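WER and CER are word and character error rates. The card does not say how they were computed; a minimal sketch using the Hugging Face `evaluate` library (a common choice for these metrics, assumed here, with hypothetical placeholder transcripts):

```python
# Sketch: computing WER / CER with the `evaluate` library.
# Assumption: this is how the card's metrics were measured (not confirmed).
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

# Hypothetical placeholder transcripts for illustration.
predictions = ["the transcribed hypothesis"]
references = ["the reference transcript"]

wer = wer_metric.compute(predictions=predictions, references=references)
cer = cer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}  CER: {cer:.4f}")
```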
## Model description
More information needed
## Intended uses & limitations
More information needed
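The card gives no usage details, but the WER/CER metrics suggest an automatic-speech-recognition checkpoint. A hedged sketch of inference via the `transformers` ASR pipeline, assuming the checkpoint is pipeline-compatible; the model path and audio file below are hypothetical:

```python
# Sketch only: assumes an ASR-compatible checkpoint (not confirmed by the card).
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="path/to/base_sami_22k_ft_pseudo_widv_ep60_tde",  # hypothetical local path
)
print(asr("sample.wav")["text"])  # hypothetical audio file
```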
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.25
- num_epochs: 60.0
- mixed_precision_training: Native AMP
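A minimal sketch of how these hyperparameters might map onto `transformers.TrainingArguments`, assuming the model was trained with the Hugging Face `Trainer` (the card does not include the training script):

```python
# Sketch: the hyperparameters above expressed as TrainingArguments.
# Assumption: training used the Hugging Face Trainer (not confirmed by the card).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="base_sami_22k_ft_pseudo_widv_ep60_tde",
    learning_rate=5e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",          # AdamW; default betas=(0.9, 0.999), eps=1e-08
    lr_scheduler_type="linear",
    warmup_ratio=0.25,
    num_train_epochs=60.0,
    fp16=True,                    # native AMP mixed-precision training
)
```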
### Training results
| Training Loss | Epoch | Step | Validation Loss | WER | CER |
|---|---|---|---|---|---|
| 4165.6836 | 1.0 | 972 | 257.4717 | 0.5061 | 0.1236 |
| 711.8623 | 2.0 | 1944 | 178.0096 | 0.3002 | 0.0752 |
| 536.9188 | 3.0 | 2916 | 177.6306 | 0.3027 | 0.0734 |
| 499.5686 | 4.0 | 3888 | 160.8973 | 0.2520 | 0.0654 |
| 489.7358 | 5.0 | 4860 | 174.7495 | 0.2900 | 0.0775 |
| 508.3668 | 6.0 | 5832 | 193.4865 | 0.2944 | 0.0783 |
| 537.328 | 7.0 | 6804 | 230.1123 | 0.3104 | 0.0846 |
| 553.6925 | 8.0 | 7776 | 256.9771 | 0.3465 | 0.1000 |
| 581.7494 | 9.0 | 8748 | 254.6887 | 0.3680 | 0.1056 |
| 623.7695 | 10.0 | 9720 | 268.8913 | 0.3810 | 0.1115 |
| 652.2123 | 11.0 | 10692 | 315.2755 | 0.4066 | 0.1199 |
| 702.8481 | 12.0 | 11664 | 314.9687 | 0.4256 | 0.1261 |
| 743.4121 | 13.0 | 12636 | 311.7790 | 0.5521 | 0.1613 |
| 799.5601 | 14.0 | 13608 | 359.8279 | 0.4546 | 0.1385 |
| 816.097 | 15.0 | 14580 | 376.9053 | 0.5721 | 0.1836 |
| 815.892 | 16.0 | 15552 | 378.6718 | 0.5346 | 0.1670 |
| 803.2213 | 17.0 | 16524 | 358.1064 | 0.4638 | 0.1451 |
| 783.7681 | 18.0 | 17496 | 332.4148 | 0.4775 | 0.1448 |
| 730.573 | 19.0 | 18468 | 312.7887 | 0.4548 | 0.1391 |
| 722.3447 | 20.0 | 19440 | 316.9329 | 0.4269 | 0.1279 |
| 683.925 | 21.0 | 20412 | 290.0936 | 0.4145 | 0.1273 |
| 650.4165 | 22.0 | 21384 | 288.4570 | 0.3989 | 0.1194 |
| 631.0532 | 23.0 | 22356 | 271.8179 | 0.4043 | 0.1208 |
| 617.0345 | 24.0 | 23328 | 240.8433 | 0.3880 | 0.1139 |
| 579.2521 | 25.0 | 24300 | 261.6252 | 0.3954 | 0.1137 |
| 571.9061 | 26.0 | 25272 | 248.2078 | 0.3651 | 0.1068 |
| 535.8284 | 27.0 | 26244 | 210.3216 | 0.3549 | 0.1017 |
| 528.5526 | 28.0 | 27216 | 223.0247 | 0.3554 | 0.1005 |
| 495.6612 | 29.0 | 28188 | 214.1045 | 0.3421 | 0.1011 |
| 489.6183 | 30.0 | 29160 | 223.9064 | 0.3301 | 0.0963 |
| 465.3683 | 31.0 | 30132 | 204.1657 | 0.3408 | 0.0976 |
| 447.7585 | 32.0 | 31104 | 204.1836 | 0.3147 | 0.0902 |
| 430.5009 | 33.0 | 32076 | 199.8400 | 0.3149 | 0.0906 |
| 423.8555 | 34.0 | 33048 | 188.0466 | 0.2939 | 0.0840 |
| 409.8573 | 35.0 | 34020 | 204.4093 | 0.2827 | 0.0807 |
| 385.368 | 36.0 | 34992 | 187.5865 | 0.2911 | 0.0833 |
| 372.0869 | 37.0 | 35964 | 181.9698 | 0.2730 | 0.0770 |
| 360.2271 | 38.0 | 36936 | 181.9897 | 0.2687 | 0.0768 |
| 340.9098 | 39.0 | 37908 | 188.7458 | 0.2650 | 0.0775 |
| 329.051 | 40.0 | 38880 | 165.1118 | 0.2632 | 0.0753 |
| 311.5036 | 41.0 | 39852 | 184.7068 | 0.2631 | 0.0730 |
| 300.7056 | 42.0 | 40824 | 169.6482 | 0.2503 | 0.0715 |
| 293.3254 | 43.0 | 41796 | 156.1380 | 0.2447 | 0.0705 |
| 272.1289 | 44.0 | 42768 | 174.9299 | 0.2404 | 0.0683 |
| 266.0369 | 45.0 | 43740 | 156.8380 | 0.2368 | 0.0673 |
| 251.5002 | 46.0 | 44712 | 170.4171 | 0.2351 | 0.0665 |
| 235.648 | 47.0 | 45684 | 159.8278 | 0.2227 | 0.0626 |
| 229.8135 | 48.0 | 46656 | 163.2967 | 0.2198 | 0.0625 |
| 220.4877 | 49.0 | 47628 | 167.3510 | 0.2138 | 0.0612 |
| 210.467 | 50.0 | 48600 | 163.8741 | 0.2138 | 0.0615 |
| 208.0692 | 51.0 | 49572 | 159.9132 | 0.2048 | 0.0587 |
| 188.9099 | 52.0 | 50544 | 160.8002 | 0.2096 | 0.0596 |
| 183.2089 | 53.0 | 51516 | 154.5008 | 0.1979 | 0.0564 |
| 174.8595 | 54.0 | 52488 | 157.2374 | 0.1964 | 0.0560 |
| 167.0076 | 55.0 | 53460 | 157.2821 | 0.1918 | 0.0550 |
| 162.867 | 56.0 | 54432 | 158.1763 | 0.1851 | 0.0539 |
| 154.225 | 57.0 | 55404 | 157.7223 | 0.1829 | 0.0532 |
| 149.2202 | 58.0 | 56376 | 159.2531 | 0.1818 | 0.0528 |
| 144.8521 | 59.0 | 57348 | 158.1030 | 0.1796 | 0.0522 |
| 145.8063 | 60.0 | 58320 | 156.3732 | 0.1787 | 0.0519 |
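Note that validation loss and WER peak around epochs 15–16, which coincides with the end of learning-rate warmup (0.25 × 58,320 ≈ 14,580 steps, i.e. epoch 15); metrics then improve steadily once the linear decay begins, plausibly because the peak learning rate of 5e-4 is aggressive for this model.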
### Framework versions
- Transformers 4.48.3
- Pytorch 2.5.1
- Datasets 3.2.0
- Tokenizers 0.21.0