Add model card

README.md (added)

---
license: other
license_name: llama3
tags:
- llama-3
- conversational
---
# OxxoCodes/Meta-Llama-3-70B-Instruct-GPTQ
*Built with Meta Llama 3*

Meta Llama 3 is licensed under the Meta Llama 3 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.

# Model Description
This is a 4-bit GPTQ quantized version of [meta-llama/Meta-Llama-3-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct).

This model was quantized using the following quantization config:
```python
quantize_config = BaseQuantizeConfig(
    bits=4,            # 4-bit weight quantization
    group_size=128,    # scale/zero-point shared per group of 128 weights
    desc_act=False,    # no activation-order reordering (faster inference)
    damp_percent=0.1,  # dampening applied to the Hessian diagonal during GPTQ
)
```
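
For reference, a quantization run with AutoGPTQ generally looks something like the sketch below. This is not the exact script used for this repository; the calibration text and output directory are placeholders, and a real run would use a proper calibration set.

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

model_id = "meta-llama/Meta-Llama-3-70B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)

quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=False, damp_percent=0.1)

# AutoGPTQ expects a list of tokenized calibration examples (placeholder text here)
examples = [tokenizer("Large language models can be compressed with GPTQ quantization.")]

# Load the full-precision model, quantize it, and save the quantized weights
model = AutoGPTQForCausalLM.from_pretrained(model_id, quantize_config)
model.quantize(examples)
model.save_quantized("Meta-Llama-3-70B-Instruct-GPTQ")
```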

To use this model, you need to install AutoGPTQ.
For detailed installation instructions, please refer to the [AutoGPTQ GitHub repository](https://github.com/AutoGPTQ/AutoGPTQ).
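
In many setups, `pip install auto-gptq` alongside a recent `transformers` release is sufficient, but the appropriate wheel depends on your CUDA and PyTorch versions, so treat this as a starting point rather than the repository's canonical instructions.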

# Example Usage
```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

# Load the original Llama 3 tokenizer and the quantized weights from this repository
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-70B-Instruct")
model = AutoGPTQForCausalLM.from_quantized("OxxoCodes/Meta-Llama-3-70B-Instruct-GPTQ")

# Run a simple completion
inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
output = model.generate(**inputs)[0]
print(tokenizer.decode(output))
```
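
Since this is the Instruct variant, prompts are normally formatted with the tokenizer's chat template rather than passed as raw text. A minimal sketch, reusing the `model` and `tokenizer` from the example above (the message content and generation settings are illustrative):

```python
# Chat-style generation using the Llama 3 chat template
messages = [
    {"role": "user", "content": "What is the capital of France?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```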