The chat template might be incorrect: the `<|tool_call_start|>` part is missing from chat_template.jinja
We can see the `<|tool_call_start|>` ... `<|tool_call_end|>` tokens in the tool-usage example on the model card:
<|startoftext|><|im_start|>system
List of tools: <|tool_list_start|>[{"name": "get_candidate_status", "description": "Retrieves the current status of a candidate in the recruitment process", "parameters": {"type": "object", "properties": {"candidate_id": {"type": "string", "description": "Unique identifier for the candidate"}}, "required": ["candidate_id"]}}]<|tool_list_end|><|im_end|>
<|im_start|>user
What is the current status of candidate ID 12345?<|im_end|>
<|im_start|>assistant
<|tool_call_start|>[get_candidate_status(candidate_id="12345")]<|tool_call_end|>Checking the current status of candidate ID 12345.<|im_end|>
<|im_start|>tool
<|tool_response_start|>{"candidate_id": "12345", "status": "Interview Scheduled", "position": "Clinical Research Associate", "date": "2023-11-20"}<|tool_response_end|><|im_end|>
<|im_start|>assistant
The candidate with ID 12345 is currently in the "Interview Scheduled" stage for the position of Clinical Research Associate, with an interview date set for 2023-11-20.<|im_end|>
However, there is currently no `<|tool_call_start|>` part in the chat_template.jinja or tokenizer_config files.
Am I missing something here?
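For context, this is roughly how I checked (a minimal sketch assuming a recent transformers version; `MODEL_ID` is a placeholder for this repo):

```python
from transformers import AutoTokenizer

# "MODEL_ID" is a placeholder for this repo's checkpoint id.
tok = AutoTokenizer.from_pretrained("MODEL_ID")

# The tool-use tokens shown in the model card example.
for t in ["<|tool_list_start|>", "<|tool_list_end|>",
          "<|tool_call_start|>", "<|tool_call_end|>",
          "<|tool_response_start|>", "<|tool_response_end|>"]:
    print(t, "->", tok.convert_tokens_to_ids(t))

# Does the chat template itself ever emit <|tool_call_start|>?
print("<|tool_call_start|>" in (tok.chat_template or ""))
```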
Hey, the model has been trained to output these tokens itself, which is why they're not part of the chat template. Did you have an issue with the model's answers for function calling?
I have: it repeats tool calls forever. Is it just me, or does anyone else have the same issue?
I recommend using these generation parameters:
`temperature=0.3`, `min_p=0.15`, `repetition_penalty=1.05`
If you still have this issue, you can increase `repetition_penalty` to a value in the 1.1-1.5 range.
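For example, with a recent transformers release (`min_p` requires a fairly new version), the parameters can be wired up like this. This is only a sketch, and `MODEL_ID` is a placeholder for the checkpoint you're running:

```python
from transformers import pipeline

# Sketch only; "MODEL_ID" is a placeholder for the checkpoint you're running.
generator = pipeline("text-generation", model="MODEL_ID")

messages = [
    {"role": "user", "content": "What is the current status of candidate ID 12345?"},
]

out = generator(
    messages,
    max_new_tokens=256,
    do_sample=True,           # sampling must be enabled for temperature/min_p to apply
    temperature=0.3,
    min_p=0.15,
    repetition_penalty=1.05,  # bump towards 1.1-1.5 if calls still repeat
)
print(out[0]["generated_text"][-1])  # the newly generated assistant turn
```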
I'm talking about function calling SFT.
In that case, we need these special tokens in the chat template if the models were trained with them during post-training.
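Until then, one possible workaround is to render the tool turns manually when building the SFT text. This is just a sketch following the model-card format above; the helper names are made up for illustration:

```python
import json

# Workaround sketch: build tool-related turns by hand in the model-card format,
# since the current chat template does not emit the tool-call tokens.
def render_tool_call_turn(call_str: str, content: str) -> str:
    return (
        "<|im_start|>assistant\n"
        f"<|tool_call_start|>[{call_str}]<|tool_call_end|>{content}<|im_end|>\n"
    )

def render_tool_response_turn(response: dict) -> str:
    return (
        "<|im_start|>tool\n"
        f"<|tool_response_start|>{json.dumps(response)}<|tool_response_end|><|im_end|>\n"
    )

# Reproduces the assistant turn from the model card example.
print(render_tool_call_turn(
    'get_candidate_status(candidate_id="12345")',
    "Checking the current status of candidate ID 12345.",
))
```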
Hi @mlabonne, do you have any plans to update the chat template to include the tool-use tokens?
I think adding them would make it easier for Hugging Face users to fine-tune your models on tool-use datasets, since many of them expect chat templates to work seamlessly for SFT: https://huggingface.co/docs/transformers/en/chat_templating
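For context, the typical flow from those docs looks roughly like this (a sketch with a placeholder model id, using the standard HF `tool_calls` message schema). This is the step where the template would need to emit the `<|tool_call_start|>` ... `<|tool_call_end|>` wrapping the model was trained on; with the current template the tool-call turn may not be rendered that way:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("MODEL_ID")  # placeholder id

# Tool definition taken from the model card example.
tools = [{
    "name": "get_candidate_status",
    "description": "Retrieves the current status of a candidate in the recruitment process",
    "parameters": {
        "type": "object",
        "properties": {"candidate_id": {"type": "string",
                                        "description": "Unique identifier for the candidate"}},
        "required": ["candidate_id"],
    },
}]

messages = [
    {"role": "user", "content": "What is the current status of candidate ID 12345?"},
    {"role": "assistant",
     "content": "Checking the current status of candidate ID 12345.",
     "tool_calls": [{"type": "function",
                     "function": {"name": "get_candidate_status",
                                  "arguments": {"candidate_id": "12345"}}}]},
    {"role": "tool",
     "content": '{"candidate_id": "12345", "status": "Interview Scheduled"}'},
    {"role": "assistant",
     "content": "The candidate with ID 12345 is currently in the Interview Scheduled stage."},
]

# For SFT, users typically render whole conversations to text and train on that,
# so the tool-call turn needs to come out of the template already wrapped in the
# special tokens the model expects.
text = tok.apply_chat_template(messages, tools=tools, tokenize=False)
print(text)
```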