📄 Paper | 🌐 Website | 🤗 Dataset |
Overview
CoVe-4B is a compact 4B interactive tool-use agent fine-tuned from Qwen3-4B-Instruct-2507 using the CoVe (Constraint-Verification) post-training framework. It is trained on CoVe-12K, a dataset of 12K high-quality multi-turn tool-use trajectories synthesized and verified by deterministic constraint checking.
Framework
The CoVe framework. Explicit constraints are fuzzified to guide a User Simulator LLM, and original constraints act as a deterministic checklist to verify the agent's tool invocations.
Performance
Main results on τ²-bench. CoVe-4B achieves top performance in the ≤8B group and rivals models up to 70B.
Deployment and Evaluation
CoVe-4B uses the Hermes tool-call format and can be deployed with vLLM.
Serve with vLLM
CUDA_VISIBLE_DEVICES=0,1,2,3 vllm serve [MODEL_HF_URL] \
--served-model-name CoVe \
--enable-auto-tool-choice \
--tool-call-parser hermes \
--tensor-parallel-size 1 \
--data-parallel-size 4 \
--host 0.0.0.0 \
--port ${PORT}
Evaluate with τ²-bench
Once the model is running, evaluate using the official τ²-bench code. Set the agent model to the vLLM-served CoVe endpoint.
Citation
@article{Chen2026CoVe,
title = {CoVe: Training Interactive Tool-Use Agents via Constraint-Guided Verification},
author = {Chen, Jinpeng and Gong, Cheng and Li, Hanbo and Liu, Ziru and Tian, Zichen and Fu, Xinyu and Wu, Shi and Zhang, Chenyang and Zhang, Wu and Zhang, Suiyun and Tu, Dandan and Liu, Rui},
journal = {arXiv preprint arXiv:2603.01940},
year = {2026}
}
- Downloads last month
- 42