amd/Mixtral-8x7B-Instruct-v0.1_FP8_MLPerf_V3

linzhao-amd committed on Jul 28

Commit 0eb1eaa (verified) · 1 Parent(s): 7b66cd6

Update README.md

Files changed (1)

README.md +3 -0

@@ -33,6 +33,9 @@ The following layers are ignored during quantization:
 - `*.o_proj`
 - `lm_head`
+## Algorithms
+AutoSmoothQuant algorithm is applied in weight-activation quantization for better performance.
 ## Quantization Scripts
 ```
 cd examples/torch/language_modeling/llm_ptq/