A simple ViT model based on google/vit-base-patch16-224-in21k that classifies a given image into one of three categories: KF, NonKF, or Rejected. It has been fine-tuned on a private dataset of 58,781 images (11.4 GB).

Real-world accuracy of the model is decent. I wouldn't use it to decide whether images get deleted, but it is good enough to sort your saved Kemono Friends fanart images. It was not explicitly trained on cosplay, real-world photos, or other non-fanart KF images.
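As a rough illustration of the sorting use case, here is a minimal sketch that copies images into one folder per predicted label. It assumes the model loads through the standard `transformers` image-classification pipeline under the repo name `HAV0X1014/Kemono-Friends-Sorter`. The `pick_folder` and `sort_images` helpers, the `Unsure` folder, and the 0.8 threshold are hypothetical choices for this example, not part of the model or its training repo.

```python
from pathlib import Path
import shutil

# Repo name as shown on the model page; assumed to load with the standard pipeline.
MODEL_ID = "HAV0X1014/Kemono-Friends-Sorter"

# File extensions treated as images (an arbitrary choice for this sketch).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def pick_folder(predictions, min_score=0.8):
    """Return the top-scoring label, or 'Unsure' if its score is below min_score.

    `predictions` is a list of {"label": str, "score": float} dicts, which is
    the shape the image-classification pipeline returns for a single image.
    """
    best = max(predictions, key=lambda p: p["score"])
    return best["label"] if best["score"] >= min_score else "Unsure"


def sort_images(src_dir, dst_dir, min_score=0.8):
    """Copy every image in src_dir into dst_dir/<label>/ (hypothetical helper)."""
    # Imported here so pick_folder can be used without transformers installed.
    from transformers import pipeline

    classifier = pipeline("image-classification", model=MODEL_ID)
    for path in sorted(Path(src_dir).iterdir()):
        if path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        predictions = classifier(str(path))
        target = Path(dst_dir) / pick_folder(predictions, min_score)
        target.mkdir(parents=True, exist_ok=True)
        # Copy rather than move or delete, so a wrong prediction loses nothing.
        shutil.copy2(path, target / path.name)
```

Calling `sort_images("saved_images", "sorted_images")` (both folder names are placeholders) would leave the originals untouched and put low-confidence images in `Unsure` for manual review.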

The model was trained for 8 epochs; this checkpoint is from the 5th. From around the 6th epoch onward, it stopped learning. The current goal is to train it further on more NonKF images, since that category is so small.

This model was trained using the code in this repo. I trained it locally on my 7900 XTX with the provided code and configs. The repo also has some tools if you are interested in creating your own fine-tuned ViT model.

