Is there a way to quantize models to this "pig" architecture ourselves to benefit from faster loading?
Please refer to this; you could probably make it yourself.