Any plans to support transformers~=4.57?
#31, opened by Neafs
Hi,
We've been working with Kimi-VL-A3B-Instruct, and it is a very nice model. However, we have had to pin our transformers library to 4.53.3, because later versions seem to introduce a breaking change:
AttributeError: 'DynamicCache' object has no attribute 'seen_tokens'
Now that Qwen3 is out and is only supported from transformers 4.57 onward, we will need to upgrade. Do you have any plans to release a version of the model code that is compatible with 4.57+?
Thank you.
With transformers 4.57.1, I am not able to use gradient checkpointing, but everything else works for me after adding the following to the imports:
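# transformers 4.57.1 provides GELUTanh; restore the old PytorchGELUTanh name for code that still references it.
import transformers.activations as activations
from transformers.activations import GELUTanh
activations.PytorchGELUTanh = GELUTanh  # patch the module so later imports of the old name resolve
PytorchGELUTanh = GELUTanh  # local alias for code in this file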
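In case it helps anyone pinning across several transformers versions, here is a slightly more defensive sketch of the same aliasing idea. It only adds the old name when it is actually missing, so it should be a no-op on versions that still ship `PytorchGELUTanh`. The helper name `alias_missing_attr` is hypothetical (not part of transformers), and the demo uses a stand-in module so the snippet runs without the library installed. It does not address the `seen_tokens` error or the gradient checkpointing problem.

```python
import types


def alias_missing_attr(module, old_name, new_name):
    """Expose module.<new_name> under <old_name> if the old name is missing.

    Hypothetical helper, not a transformers API.
    Returns True if an alias was added, False if nothing was changed.
    """
    if hasattr(module, old_name):
        return False  # old name still exists; leave it alone
    if not hasattr(module, new_name):
        return False  # nothing to alias to; caller has to handle this case
    setattr(module, old_name, getattr(module, new_name))
    return True


# Stand-in for transformers.activations so the sketch runs on its own.
fake_activations = types.ModuleType("fake_activations")
fake_activations.GELUTanh = type("GELUTanh", (), {})

# First call adds the alias; second call finds it already present.
patched = alias_missing_attr(fake_activations, "PytorchGELUTanh", "GELUTanh")
patched_again = alias_missing_attr(fake_activations, "PytorchGELUTanh", "GELUTanh")
```

With the real library, the equivalent call would be `alias_missing_attr(transformers.activations, "PytorchGELUTanh", "GELUTanh")`, run before the model code is loaded.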