# LLM auto-annotation for HICO-DET dataset (Pose from [Halpe](https://github.com/Fang-Haoshu/Halpe-FullBody), Part State from [HAKE](https://github.com/DirtyHarryLYL/HAKE)).
## Environment
The code was developed with Python 3.11.11 on Ubuntu 21.xx, torch==2.6.0+cu124, and
transformers==4.57.3 (for the Qwen3 series).
## Annotating HICO-Det
### A. Installation
1. Install the required packages and dependencies.
2. Clone this repo; we'll refer to the cloned directory as ${ROOT}.
3. Create the necessary directories:
```
mkdir outputs
mkdir model_weights
```
4. Download the LLM weights from Hugging Face into model_weights.
### B. Prepare Dataset
5. Install COCO API:
```
pip install pycocotools
```
6. Download [dataset](https://huggingface.co/datasets/ayh015/HICO-Det_Halpe_HAKE).
7. Organize the dataset; your dataset directory tree should look like this (there may be extra files):
```
{DATA_ROOT}
|-- Annotation
| |--hico-det-instance-level
| | |--hico-det-training-set-instance-level.json
| `--hico-fullbody-pose
| |--halpe_train_v1.json
| `--halpe_val_v1.json
|-- Configs
| |--hico_hoi_list.txt
| `--Part_State_76.txt
|-- Images
| |--images
| |--test2015
| | |--HICO_test2015_00000001.jpg
| | |--HICO_test2015_00000002.jpg
| | ...
| `--train2015
| |--HICO_train2015_00000001.jpg
| |--HICO_train2015_00000002.jpg
| ...
`-- Logic_Rules
|--gather_rule.pkl
`--read_rules.py
```
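Before launching annotation, it can save a failed multi-GPU run to sanity-check the layout first. Below is a minimal sketch that verifies the tree above exists under your data root; the `missing_paths` helper and the `EXPECTED` list are illustrative, not part of the repo.

```python
# Sketch: verify the {DATA_ROOT} layout described in the tree above.
from pathlib import Path

# Paths the annotation pipeline expects, relative to DATA_ROOT (from the tree).
EXPECTED = [
    "Annotation/hico-det-instance-level/hico-det-training-set-instance-level.json",
    "Annotation/hico-fullbody-pose/halpe_train_v1.json",
    "Annotation/hico-fullbody-pose/halpe_val_v1.json",
    "Configs/hico_hoi_list.txt",
    "Configs/Part_State_76.txt",
    "Images/images/train2015",
    "Images/images/test2015",
    "Logic_Rules/gather_rule.pkl",
    "Logic_Rules/read_rules.py",
]

def missing_paths(data_root):
    """Return the expected paths that are absent under data_root."""
    root = Path(data_root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]

if __name__ == "__main__":
    for rel in missing_paths("/path/to/DATA_ROOT"):  # set your real data root
        print("missing:", rel)
```

If the script prints nothing, the required files and image folders are in place.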
### C. Start annotation
#### Modify data_path, model_path, and output_dir in "{ROOT}/scripts/annotate_hico.sh" to match your configuration.
```
IDX={YOUR_GPU_IDS}
export PYTHONPATH=$PYTHONPATH:./
data_path={DATA_ROOT}
model_path={ROOT}/model_weights/{YOUR_MODEL_NAME}
output_dir={ROOT}/outputs
if [ -d ${output_dir} ];then
echo "dir already exists"
else
mkdir ${output_dir}
fi
CUDA_VISIBLE_DEVICES=$IDX OMP_NUM_THREADS=1 torchrun --nnodes=1 --nproc_per_node={NUM_YOUR_GPUs} --master_port=25005 \
tools/annotate_hico.py \
--model-path ${model_path} \
--data-path ${data_path} \
    --output-dir ${output_dir}
```
#### Start auto-annotation
```
bash scripts/annotate_hico.sh
```
### D. Annotation format
A list of dicts, each containing the following keys:
```
{
'file_name': 'HICO_train2015_00009511.jpg',
'image_id': 0,
 'keypoints': a 51-element list (17 keypoints, each with x, y, v),
 'vis': a 51-element list (17 keypoints, each with 3 visibility flags),
'instance_id':0,
'action_labels': [{'human_part': part_id, 'partstate': state_id}, ...],
'height': 640,
'width': 480,
'human_bbox': [126, 258, 150, 305],
'object_bbox': [128, 276, 144, 313],
'description': "The person is riding a bicycle, supported by visible evidence of their body interacting with the bike.\n\n- The right hand is holding the right handlebar.\n- The left hand is holding the left handlebar.\n- The right hip is positioned over the seat, indicating the person is sitting on the bicycle.\n- The right foot is on the right pedal.\n- The left foot is on the left pedal."
}
```
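The flat 51-element `keypoints` list packs 17 keypoints as consecutive (x, y, v) triples. A minimal sketch of unpacking it (the `keypoint_triples` helper is illustrative; the exact output file name written to `outputs/` depends on the script):

```python
import json

def keypoint_triples(record):
    """Regroup record['keypoints'] ([x1, y1, v1, x2, y2, v2, ...]) into (x, y, v) triples."""
    kpts = record["keypoints"]
    return [tuple(kpts[i:i + 3]) for i in range(0, len(kpts), 3)]

# Minimal stand-in record following the schema above; in practice,
# load the list the script wrote, e.g. json.load(open("outputs/<file>.json")).
record = {"file_name": "HICO_train2015_00009511.jpg",
          "keypoints": [0.0] * 51}
triples = keypoint_triples(record)
print(len(triples))  # 17
```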
## Annotate COCO
1. Download COCO dataset.
2. Organize the dataset; your dataset directory tree should look like this (the files inside Configs are copied from the HICO-Det setup above):
```
{DATA_ROOT}
|-- annotations
| |--person_keypoints_train2017.json
| `--person_keypoints_val2017.json
|-- Configs
| |--hico_hoi_list.txt
| `--Part_State_76.txt
|-- train2017
| |--000000000009.jpg
| |--000000000025.jpg
| ...
`-- val2017
|--000000000139.jpg
|--000000000285.jpg
...
```
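The COCO keypoint files above are plain JSON, so they can be inspected without pycocotools. A small sketch counting person instances that have at least one labeled keypoint; whether the annotation script filters out zero-keypoint people is an assumption, and `annotatable_people` is an illustrative helper, not part of the repo.

```python
import json

def annotatable_people(coco):
    """Return annotations with at least one labeled keypoint (num_keypoints > 0)."""
    return [a for a in coco["annotations"] if a.get("num_keypoints", 0) > 0]

# Usage, with the path from the tree above:
# coco = json.load(open("{DATA_ROOT}/annotations/person_keypoints_train2017.json"))
# print(len(annotatable_people(coco)))
```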
### Start annotation
#### Modify data_path, model_path, and output_dir in "{ROOT}/scripts/annotate_coco.sh" to match your configuration.
```
IDX={YOUR_GPU_IDS}
export PYTHONPATH=$PYTHONPATH:./
data_path={DATA_ROOT}
model_path={ROOT}/model_weights/{YOUR_MODEL_NAME}
output_dir={ROOT}/outputs
if [ -d ${output_dir} ];then
echo "dir already exists"
else
mkdir ${output_dir}
fi
CUDA_VISIBLE_DEVICES=$IDX OMP_NUM_THREADS=1 torchrun --nnodes=1 --nproc_per_node={NUM_YOUR_GPUs} --master_port=25005 \
tools/annotate_coco.py \
--model-path ${model_path} \
--data-path ${data_path} \
    --output-dir ${output_dir}
```
#### Start auto-annotation
```
bash scripts/annotate_coco.sh
```
By default, the annotation script only annotates the COCO train2017 set. To annotate val2017, find the following two lines (lines 167-168 of tools/annotate_coco.py) and replace 'train2017' with 'val2017'.
```
dataset = PoseCOCODataset(
data_path=os.path.join(args.data_path, 'annotations', 'person_keypoints_train2017.json'), # <- Line 167
multimodal_cfg=dict(image_folder=os.path.join(args.data_path, 'train2017'), # <- Line 168
data_augmentation=False,
image_size=336,),)
```
### Annotation format
A list of dicts, each containing the following keys:
```
{
'file_name': '000000000009.jpg',
'image_id': 9,
 'keypoints': a 51-element list (17 keypoints, each with x, y, v),
 'vis': a 51-element list (17 keypoints, each with 3 visibility flags),
'height': 640,
'width': 480,
'human_bbox': [126, 258, 150, 305],
'description': "The person is riding a bicycle, supported by visible evidence of their body interacting with the bike.\n\n- The right hand is holding the right handlebar.\n- The left hand is holding the left handlebar.\n- The right hip is positioned over the seat, indicating the person is sitting on the bicycle.\n- The right foot is on the right pedal.\n- The left foot is on the left pedal."
}
``` |