# LLM auto annotation for HICO-DET dataset (Pose from [Halpe](https://github.com/Fang-Haoshu/Halpe-FullBody), Part State from [HAKE](https://github.com/DirtyHarryLYL/HAKE)).
## Environment
The code was developed with Python 3.11.11 on Ubuntu 21.xx, using torch==2.6.0+cu124 and
transformers==4.57.3 (with the Qwen3 series).
## Annotating HICO-Det
### A. Installation
1. Install the required packages and dependencies.
2. Clone this repo. The cloned directory is referred to as {ROOT} below.
3. Create the necessary directories:
```
mkdir outputs
mkdir model_weights
```
4. Download the LLM weights from Hugging Face into model_weights.
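Step 4 can be scripted. The following is a minimal sketch, assuming the `huggingface_hub` package is installed and network access is available. The helper names `weights_dir` and `download_weights` are hypothetical and not part of this repo, and the repo id you pass in is your own choice of checkpoint.

```python
from pathlib import Path


def weights_dir(root, repo_id):
    """Return {ROOT}/model_weights/<model name> for a Hugging Face repo id."""
    # Keep only the part after the organisation, e.g. "org/name" -> "name".
    return Path(root) / "model_weights" / repo_id.split("/")[-1]


def download_weights(root, repo_id):
    """Download a model snapshot into {ROOT}/model_weights (needs network access)."""
    # Imported lazily so that weights_dir works without the package installed.
    from huggingface_hub import snapshot_download

    target = weights_dir(root, repo_id)
    target.mkdir(parents=True, exist_ok=True)
    snapshot_download(repo_id=repo_id, local_dir=str(target))
    return target


# Placeholder repo id for illustration; replace it with the checkpoint you use.
print(weights_dir(".", "some-org/some-model"))
```

Calling `download_weights(".", "<org>/<model>")` from {ROOT} would then place the files where `model_path` in the annotation script is expected to point.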
### B. Prepare Dataset
5. Install COCO API:
```
pip install pycocotools
```
6. Download the [dataset](https://huggingface.co/datasets/ayh015/HICO-Det_Halpe_HAKE).
7. Organize the dataset. Your dataset directory tree should look like this (there may be extra files):
```
{DATA_ROOT}
|-- Annotation
|   |--hico-det-instance-level
|   |   |--hico-det-training-set-instance-level.json
|   `--hico-fullbody-pose
|       |--halpe_train_v1.json
|       `--halpe_val_v1.json
|-- Configs
|   |--hico_hoi_list.txt
|   `--Part_State_76.txt
|-- Images
|   |--images
|       |--test2015
|       |   |--HICO_test2015_00000001.jpg
|       |   |--HICO_test2015_00000002.jpg
|       |   ...
|       `--train2015
|           |--HICO_train2015_00000001.jpg
|           |--HICO_train2015_00000002.jpg
|           ...
`-- Logic_Rules
    |--gather_rule.pkl
    `--read_rules.py
```
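Before launching a long annotation run, it can help to confirm that the annotation, config, and rule files from the tree above are in place. The sketch below is one way to do that; `missing_entries` is a hypothetical helper (not part of this repo), and the image folders are deliberately left out of the check.

```python
from pathlib import Path

# Relative paths taken from the directory tree above (image folders omitted).
EXPECTED = [
    "Annotation/hico-det-instance-level/hico-det-training-set-instance-level.json",
    "Annotation/hico-fullbody-pose/halpe_train_v1.json",
    "Annotation/hico-fullbody-pose/halpe_val_v1.json",
    "Configs/hico_hoi_list.txt",
    "Configs/Part_State_76.txt",
    "Logic_Rules/gather_rule.pkl",
    "Logic_Rules/read_rules.py",
]


def missing_entries(data_root, expected=EXPECTED):
    """Return the expected relative paths that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in expected if not (root / rel).exists()]


# Replace "." with your {DATA_ROOT}; an empty list means nothing is missing.
for rel in missing_entries("."):
    print("missing:", rel)
```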
### C. Start annotation
#### Set data_path, model_path, and output_dir (default 'outputs') to match your configuration in "{ROOT}/scripts/annotate_hico.sh".
```
IDX={YOUR_GPU_IDS}
export PYTHONPATH=$PYTHONPATH:./
data_path={DATA_ROOT}
model_path={ROOT}/model_weights/{YOUR_MODEL_NAME}
output_dir={ROOT}/outputs
if [ -d ${output_dir} ];then
echo "dir already exists"
else
mkdir ${output_dir}
fi
CUDA_VISIBLE_DEVICES=$IDX OMP_NUM_THREADS=1 torchrun --nnodes=1 --nproc_per_node={NUM_YOUR_GPUs} --master_port=25005 \
tools/annotate_hico.py \
--model-path ${model_path} \
--data-path ${data_path} \
--output-dir ${output_dir}
```
#### Start auto-annotation
```
bash scripts/annotate_hico.sh
```
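The launch script takes both a GPU id list and a process count. The sketch below shows how the two can be kept in sync by deriving `--nproc_per_node` from the id list. It assumes you want one process per visible GPU, which the script itself does not enforce, and `build_torchrun_cmd` is a hypothetical helper, not part of this repo.

```python
import shlex


def build_torchrun_cmd(gpu_ids, script, model_path, data_path, output_dir,
                       master_port=25005):
    """Return (env, argv) for a single-node torchrun launch, one process per GPU."""
    ids = [g.strip() for g in gpu_ids.split(",") if g.strip()]
    if not ids:
        raise ValueError("gpu_ids must list at least one GPU, e.g. '0,1'")
    # Same environment variables that the shell script sets on the command line.
    env = {"CUDA_VISIBLE_DEVICES": ",".join(ids), "OMP_NUM_THREADS": "1"}
    argv = [
        "torchrun", "--nnodes=1", f"--nproc_per_node={len(ids)}",
        f"--master_port={master_port}", script,
        "--model-path", model_path,
        "--data-path", data_path,
        "--output-dir", output_dir,
    ]
    return env, argv


# Illustrative paths only; substitute your own model and dataset locations.
env, argv = build_torchrun_cmd("0,1", "tools/annotate_hico.py",
                               "model_weights/MODEL", "/data/hico", "outputs")
print(" ".join(f"{k}={v}" for k, v in env.items()), shlex.join(argv))
```

The printed line can be compared against the command in the shell script; the same helper works for `tools/annotate_coco.py` by changing the `script` argument.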
### D. Annotation format
The output is a list of dicts, each containing the following keys:
```
{
'file_name': 'HICO_train2015_00009511.jpg',
'image_id': 0,
'keypoints': a 51-element list (17x3 keypoints with x, y, v),
'vis': a 51-element list (17 keypoints, each with 3 visibility flags),
'instance_id': 0,
'action_labels': [{'human_part': part_id, 'partstate': state_id}, ...],
'height': 640,
'width': 480,
'human_bbox': [126, 258, 150, 305],
'object_bbox': [128, 276, 144, 313],
'description': "The person is riding a bicycle, supported by visible evidence of their body interacting with the bike.\n\n- The right hand is holding the right handlebar.\n- The left hand is holding the left handlebar.\n- The right hip is positioned over the seat, indicating the person is sitting on the bicycle.\n- The right foot is on the right pedal.\n- The left foot is on the left pedal."
}
```
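As a rough illustration of how such a record might be consumed, the sketch below groups the flat keypoint list into (x, y, v) triples and collects the part/state id pairs. It assumes the 51 values are interleaved as x0, y0, v0, x1, y1, v1, and so on; the helper names and the sample record are made up for this example.

```python
def keypoint_triples(flat):
    """Split a flat [x0, y0, v0, x1, y1, v1, ...] list into (x, y, v) tuples."""
    if len(flat) % 3 != 0:
        raise ValueError("keypoint list length must be a multiple of 3")
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


def part_states(record):
    """Return (human_part, partstate) id pairs from a record's 'action_labels'."""
    return [(a["human_part"], a["partstate"]) for a in record["action_labels"]]


# Made-up record for illustration only; the values are not real annotations.
record = {
    "keypoints": [float(i) for i in range(51)],
    "action_labels": [{"human_part": 1, "partstate": 4}],
}
triples = keypoint_triples(record["keypoints"])
print(len(triples), triples[0], part_states(record))
```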
## Annotate COCO
1. Download the COCO dataset.
2. Organize the dataset. Your dataset directory tree should look like this (the files inside Configs are copied from HICO-Det):
```
{DATA_ROOT}
|-- annotations
|   |--person_keypoints_train2017.json
|   `--person_keypoints_val2017.json
|-- Configs
|   |--hico_hoi_list.txt
|   `--Part_State_76.txt
|-- train2017
|   |--000000000009.jpg
|   |--000000000025.jpg
|   ...
`-- val2017
    |--000000000139.jpg
    |--000000000285.jpg
    ...
```
### Start annotation
#### Set data_path, model_path, and output_dir (default 'outputs') to match your configuration in "{ROOT}/scripts/annotate_coco.sh".
```
IDX={YOUR_GPU_IDS}
export PYTHONPATH=$PYTHONPATH:./
data_path={DATA_ROOT}
model_path={ROOT}/model_weights/{YOUR_MODEL_NAME}
output_dir={ROOT}/outputs
if [ -d ${output_dir} ];then
echo "dir already exists"
else
mkdir ${output_dir}
fi
CUDA_VISIBLE_DEVICES=$IDX OMP_NUM_THREADS=1 torchrun --nnodes=1 --nproc_per_node={NUM_YOUR_GPUs} --master_port=25005 \
tools/annotate_coco.py \
--model-path ${model_path} \
--data-path ${data_path} \
--output-dir ${output_dir}
```
#### Start auto-annotation
```
bash scripts/annotate_coco.sh
```
By default, the annotation script only annotates the COCO train2017 set. To annotate val2017, find the following two lines (lines 167-168) in tools/annotate_coco.py and replace 'train2017' with 'val2017'.
```
dataset = PoseCOCODataset(
data_path=os.path.join(args.data_path, 'annotations', 'person_keypoints_train2017.json'), # <- Line 167
multimodal_cfg=dict(image_folder=os.path.join(args.data_path, 'train2017'), # <- Line 168
data_augmentation=False,
image_size=336,),)
```
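If you switch splits often, the two paths could be derived from a single split name instead of being edited by hand. This is only a sketch: `coco_split_paths` is a hypothetical helper, and the script as described here does not expose a split option.

```python
import os


def coco_split_paths(data_path, split="train2017"):
    """Return (annotation_file, image_folder) for a COCO split name."""
    if split not in ("train2017", "val2017"):
        raise ValueError(f"unexpected split: {split!r}")
    ann_file = os.path.join(data_path, "annotations",
                            f"person_keypoints_{split}.json")
    image_folder = os.path.join(data_path, split)
    return ann_file, image_folder


# Illustrative data root; substitute your own {DATA_ROOT}.
ann_file, image_folder = coco_split_paths("/data/coco", "val2017")
print(ann_file)
print(image_folder)
```

The two returned values correspond to the `data_path` and `image_folder` arguments in the snippet above.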
### Annotation format
The output is a list of dicts, each containing the following keys:
```
{
'file_name': '000000000009.jpg',
'image_id': 9,
'keypoints': a 51-element list (17x3 keypoints with x, y, v),
'vis': a 51-element list (17 keypoints, each with 3 visibility flags),
'height': 640,
'width': 480,
'human_bbox': [126, 258, 150, 305],
'description': "The person is riding a bicycle, supported by visible evidence of their body interacting with the bike.\n\n- The right hand is holding the right handlebar.\n- The left hand is holding the left handlebar.\n- The right hip is positioned over the seat, indicating the person is sitting on the bicycle.\n- The right foot is on the right pedal.\n- The left foot is on the left pedal."
}
```
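For downstream use, the flat keypoint list can be mapped to named joints. The sketch below assumes that 'keypoints' keeps the standard COCO person keypoint order and the COCO meaning of the flag v (0 = not labeled, 1 = labeled but not visible, 2 = labeled and visible). Neither is confirmed by the annotation script, and the sample values are made up.

```python
# Standard COCO person keypoint order.
COCO_KEYPOINT_NAMES = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]


def visible_keypoints(flat, min_v=2):
    """Map keypoint name -> (x, y) for keypoints whose flag v is >= min_v."""
    named = {}
    for idx, name in enumerate(COCO_KEYPOINT_NAMES):
        x, y, v = flat[3 * idx: 3 * idx + 3]
        if v >= min_v:
            named[name] = (x, y)
    return named


# Made-up values: only the nose and the left wrist are marked visible.
flat = [0] * 51
flat[0:3] = [120, 80, 2]
flat[27:30] = [150, 200, 2]
print(visible_keypoints(flat))
```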