# Train
## Training Dataset
Prepare GazeFollow and VideoAttentionTarget for training.

- Get GazeFollow.
  - If training with auxiliary regression, use `scripts/gen_gazefollow_head_masks.py` to generate head masks.
- Get VideoAttentionTarget.
Check `ViTGaze/configs/common/dataloader` to modify `DATA_ROOT`.
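The dataset-root setup might look roughly like this; the variable names below are illustrative assumptions, so check the actual config file for the real ones:

```python
import os.path as osp

# Illustrative only: the real variable names live in
# ViTGaze/configs/common/dataloader.
DATA_ROOT = "/path/to/datasets"  # edit this to your dataset location
gazefollow_root = osp.join(DATA_ROOT, "gazefollow")
videoattentiontarget_root = osp.join(DATA_ROOT, "videoattentiontarget")
```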
## Pretrained Model
Get DINOv2 pretrained ViT-S, or download the pretrained weights yourself:

```shell
cd ViTGaze
mkdir pretrained && cd pretrained
wget https://dl.fbaipublicfiles.com/dinov2/dinov2_vits14/dinov2_vits14_pretrain.pth
```

Then preprocess the model weights with `scripts/convert_pth.py` to fit the Detectron2 format.
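The exact conversion is defined in `scripts/convert_pth.py`; as a rough sketch (the key prefix and helper name below are assumptions, not the script's actual logic), Detectron2 checkpoints typically store weights under a top-level `"model"` key with keys renamed to match the wrapping model's module names:

```python
def to_detectron2_format(state_dict, prefix="backbone."):
    """Sketch: wrap a raw DINOv2 state dict for Detectron2-style loading.

    Detectron2 keeps weights under a top-level "model" key; the
    "backbone." prefix is an assumption about ViTGaze's module names.
    """
    return {"model": {prefix + k: v for k, v in state_dict.items()}}

# In practice you would load/save with torch:
#   sd = torch.load("dinov2_vits14_pretrain.pth", map_location="cpu")
#   torch.save(to_detectron2_format(sd), "dinov2_vits14_pretrain_d2.pth")
```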
## Train ViTGaze
You can modify the configs in `configs/gazefollow.py`, `configs/gazefollow_518.py`, and `configs/videoattentiontarget.py`.
Run:

```shell
bash train.sh
```

to train ViTGaze on the two datasets.
Training output will be saved in `ViTGaze/output/`.
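The script presumably launches one training run per config file. A hypothetical sketch of what it may contain (the entry-point path and flag names are assumptions, loosely following detectron2's lazy-config tooling, and may differ from the repo's actual script):

```shell
#!/usr/bin/env bash
# Hypothetical sketch of train.sh; the actual entry point and flags
# in the ViTGaze repo may differ.
python tools/train.py --config-file configs/gazefollow.py
python tools/train.py --config-file configs/gazefollow_518.py
python tools/train.py --config-file configs/videoattentiontarget.py
```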