Nougat_CH_20k

This model is a fine-tuned version of facebook/nougat-base on an unknown dataset. Final evaluation results are not reported here; the per-epoch validation loss is listed in the training results table below.

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 8
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 6
total_train_batch_size: 48
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: linear
num_epochs: 20
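The relationship between these values can be checked with a little arithmetic. The sketch below recomputes the total train batch size from the per-device batch size and the gradient accumulation steps, and shows how a linear schedule would decay the learning rate. It rests on assumptions not stated in this card: a single training device, zero warmup steps, and 334 optimizer steps per epoch (the step count at epoch 1.0 in the results table). The function names are illustrative only.

```python
# Hyperparameters copied from the list above.
LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 6
NUM_EPOCHS = 20

# Taken from the results table: step 334 is reached at epoch 1.0.
STEPS_PER_EPOCH = 334
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def effective_batch_size(per_device: int, accum_steps: int, num_devices: int = 1) -> int:
    """Samples contributing to one optimizer step.

    Assumption: one device, since the card lists no device count.
    """
    return per_device * accum_steps * num_devices


def linear_lr(step: int, total_steps: int = TOTAL_STEPS, base_lr: float = LEARNING_RATE) -> float:
    """Learning rate under a linear decay to zero.

    Assumption: no warmup, since the card lists no warmup steps.
    """
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / total_steps)


if __name__ == "__main__":
    print("total_train_batch_size:", effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    print("planned optimizer steps:", TOTAL_STEPS)
    for epoch in (0, 5, 10, 15, 20):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:2d} (step {step:4d}): lr = {linear_lr(step):.2e}")
```

Under these assumptions the product 8 × 6 matches the reported total_train_batch_size of 48, and the learning rate would fall from 0.0001 at step 0 to half that value midway through training.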

Training Loss	Epoch	Step	Validation Loss
10.4291	1.0	334	1.5553
4.8127	2.0	668	0.8458
2.9138	3.0	1002	0.4572
1.8249	4.0	1336	0.4366
1.1195	5.0	1670	0.2577
1.0455	6.0	2004	0.1916
0.9011	7.0	2338	0.2527
0.5274	8.0	2672	0.2161
0.4131	9.0	3006	0.1683
0.2793	10.0	3340	0.1742
0.2865	11.0	3674	0.1788
0.2515	12.0	4008	0.1852
0.1991	13.0	4342	0.1589
0.1869	14.0	4676	0.1569
0.1761	15.0	5010	0.2064
0.1546	16.0	5344	0.1736
0.1553	17.0	5678	0.1631
0.1710	18.0	6012	0.1642
0.1502	19.0	6346	0.1668
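Validation loss does not decrease monotonically in this table, so the last row is not the best one. A minimal sketch for picking the row with the lowest validation loss is shown below; the rows are copied from the table, and the helper name `best_checkpoint` is hypothetical. Whether a checkpoint from that step was saved or published is not stated in this card.

```python
# (training_loss, epoch, step, validation_loss) rows copied from the table above.
ROWS = [
    (10.4291, 1, 334, 1.5553),
    (4.8127, 2, 668, 0.8458),
    (2.9138, 3, 1002, 0.4572),
    (1.8249, 4, 1336, 0.4366),
    (1.1195, 5, 1670, 0.2577),
    (1.0455, 6, 2004, 0.1916),
    (0.9011, 7, 2338, 0.2527),
    (0.5274, 8, 2672, 0.2161),
    (0.4131, 9, 3006, 0.1683),
    (0.2793, 10, 3340, 0.1742),
    (0.2865, 11, 3674, 0.1788),
    (0.2515, 12, 4008, 0.1852),
    (0.1991, 13, 4342, 0.1589),
    (0.1869, 14, 4676, 0.1569),
    (0.1761, 15, 5010, 0.2064),
    (0.1546, 16, 5344, 0.1736),
    (0.1553, 17, 5678, 0.1631),
    (0.1710, 18, 6012, 0.1642),
    (0.1502, 19, 6346, 0.1668),
]


def best_checkpoint(rows):
    """Return the row with the lowest validation loss (last tuple element)."""
    return min(rows, key=lambda row: row[3])


if __name__ == "__main__":
    train_loss, epoch, step, val_loss = best_checkpoint(ROWS)
    print(f"lowest validation loss {val_loss} at epoch {epoch} (step {step})")
```

For the rows listed, the minimum is 0.1569 at epoch 14 (step 4676), while the training loss keeps falling afterwards.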

Safetensors

Model size: 0.3B params

Tensor type: I64, BF16
