---
license: llama2
---
# LongLLaMA-Code 7B Instruct

<div align="center">

<table>
<tr>
<th style="font-size: 120%"> >_ 🎓 <a href="https://huggingface.co/syzymon/long_llama_code_7b_instruct">LongLLaMA-Code 7B Instruct</a> 📑🗨 </th>
</tr>
<tr>
<td align="center">
<a href="https://colab.research.google.com/github/CStanKonrad/long_llama/blob/main/long_llama_code_instruct_colab.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg"></a>
</td>

</tr>
</table>

</div>

## TLDR
[LongLLaMA-Code 7B Instruct](https://huggingface.co/syzymon/long_llama_code_7b_instruct) is [LongLLaMA-Code 7B](https://huggingface.co/syzymon/long_llama_code_7b) tuned on the [TIGER-Lab/MathInstruct](https://huggingface.co/datasets/TIGER-Lab/MathInstruct), [OpenOrca](https://huggingface.co/datasets/Open-Orca/OpenOrca), and [ShareGPT-Processed](https://huggingface.co/datasets/zetavg/ShareGPT-Processed) datasets. It can answer basic questions about research papers and code, and it can also perform simple code refactoring. You can try a quantized version of the model on a free GPU in [Google Colab](https://colab.research.google.com/github/CStanKonrad/long_llama/blob/main/long_llama_code_instruct_colab.ipynb).
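
The snippet below is a minimal usage sketch with Hugging Face `transformers`, following the standard LongLLaMA loading pattern (`trust_remote_code=True` is needed because the model ships custom modeling code; the dtype and generation settings are illustrative):

```python
import torch
from transformers import LlamaTokenizer, AutoModelForCausalLM

tokenizer = LlamaTokenizer.from_pretrained("syzymon/long_llama_code_7b_instruct")
model = AutoModelForCausalLM.from_pretrained(
    "syzymon/long_llama_code_7b_instruct",
    torch_dtype=torch.float32,  # illustrative; pick a dtype that fits your hardware
    trust_remote_code=True,     # LongLLaMA uses custom modeling code
)

prompt = "How do suffix arrays work?\nAnswer:"  # see the prompt format described below
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```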
## Tuning

### Code
The model was tuned on a TPU v3-128 pod with a batch size of 128.
For tuning, we used the data preparation pipeline available in `instruction_fine_tuning`.
However, we replaced the Hugging Face Trainer with a modification of the FoT continued pretraining code. This modification boils down to propagating the memory cache throughout the model (essentially reproducing the PyTorch inference code functionality in JAX).
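
As a rough illustration of that idea, the toy Python sketch below (not the actual JAX training code) carries a memory cache forward across segments, so each forward pass can see entries cached from earlier parts of the input:

```python
import torch

def toy_forward(segment, memory_cache, max_cache=16):
    # Stand-in for a forward pass that consumes and returns a memory cache.
    cache = segment if memory_cache is None else torch.cat([memory_cache, segment])
    logits = torch.randn(segment.shape[0], 32)  # dummy next-token predictions
    return logits, cache[-max_cache:]           # keep the most recent entries

tokens = torch.arange(64)
memory_cache = None
for segment in tokens.split(16):                # walk the long input in segments
    logits, memory_cache = toy_forward(segment, memory_cache)
```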
### Training
Here we present basic information about how the model was tuned. For more details, see the [GitHub repo](https://github.com/CStanKonrad/long_llama/tree/main/instruction_fine_tuning/misc).

All inputs were truncated and randomly padded (left/right) to 3072 tokens.
The last context length was set to 1536.
The model was trained for 9k steps, starting with a learning rate of 1.2e-5, 700 steps of warmup, and finishing with a learning rate of 0.
The optimizer was AdamW.
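
Assuming linear warmup and linear decay (only the endpoints are stated above), the schedule can be sketched as:

```python
def learning_rate(step, peak=1.2e-5, warmup=700, total_steps=9000):
    # Linear warmup to the peak, then decay to 0 at the final step.
    # The linear decay shape is an assumption; only the endpoints are given.
    if step < warmup:
        return peak * step / warmup
    return peak * (total_steps - step) / (total_steps - warmup)

print(learning_rate(0), learning_rate(700), learning_rate(9000))
```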
The question prompt (`pre_question_text`) was:
```
You are an AI assistant. User will you give you a task. Your goal is to complete the task as faithfully as you can.\n\n
```

To trigger the model's answer, one can use:
```
\nAnswer:
```
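
Combining the two strings above, a hypothetical helper for building a complete question prompt could look like this (the training strings are reproduced verbatim, including their original wording; their exact concatenation is an assumption):

```python
PRE_QUESTION_TEXT = (
    "You are an AI assistant. User will you give you a task. "
    "Your goal is to complete the task as faithfully as you can.\n\n"
)

def make_question_prompt(task: str) -> str:
    # System text, then the task, then the answer trigger.
    return PRE_QUESTION_TEXT + task + "\nAnswer:"

print(make_question_prompt("Summarize the paper in one sentence."))
```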
The chat prompt was:
```
A chat between a user (denoted as USER:) and an artificial intelligence assistant (denoted as ASSISTANT:). The assistant gives helpful, detailed, and polite answers to the user's questions.\n\n
```

To denote the assistant, one can write:
```
\nASSISTANT:
```

To denote the user, one can write:
```
\nUSER:
```
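
Likewise, a hypothetical helper for the chat format (the exact spacing between turns is an assumption based on the markers above):

```python
CHAT_PROMPT = (
    "A chat between a user (denoted as USER:) and an artificial intelligence "
    "assistant (denoted as ASSISTANT:). The assistant gives helpful, detailed, "
    "and polite answers to the user's questions.\n\n"
)

def make_chat_prompt(turns) -> str:
    # turns: list of (role, text) pairs, with role in {"USER", "ASSISTANT"}.
    prompt = CHAT_PROMPT
    for role, text in turns:
        prompt += f"\n{role}: {text}"
    return prompt + "\nASSISTANT:"  # leave the assistant turn open for generation

print(make_chat_prompt([("USER", "Refactor this loop into a list comprehension.")]))
```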
### Datasets and sampling probabilities
* 0.71 - [TIGER-Lab/MathInstruct](https://huggingface.co/datasets/TIGER-Lab/MathInstruct)
* 0.16 - [Open-Orca/OpenOrca](https://huggingface.co/datasets/Open-Orca/OpenOrca), questions with fewer than 5k characters
* 0.08 - [Open-Orca/OpenOrca](https://huggingface.co/datasets/Open-Orca/OpenOrca), questions above 5k but below 12k characters
* 0.02 - [zetavg/ShareGPT-Processed](https://huggingface.co/datasets/zetavg/ShareGPT-Processed), conversations below 6k characters
* 0.01 - [zetavg/ShareGPT-Processed](https://huggingface.co/datasets/zetavg/ShareGPT-Processed), conversations above 6k but below 12k characters

To improve the quality of the data, the datasets were filtered using regular expressions.
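
A minimal sketch of sampling from this mixture (the source names are stand-ins; the listed probabilities sum to 0.98, and `random.choices` renormalizes the weights):

```python
import random

# Stand-in names for the five sources, weighted by the probabilities above.
MIXTURE = {
    "MathInstruct": 0.71,
    "OpenOrca (<5k chars)": 0.16,
    "OpenOrca (5k-12k chars)": 0.08,
    "ShareGPT (<6k chars)": 0.02,
    "ShareGPT (6k-12k chars)": 0.01,
}

source = random.choices(list(MIXTURE), weights=list(MIXTURE.values()), k=1)[0]
print(source)
```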
## License
The instruction/chat-tuned models are for research purposes only.
[LongLLaMA-Code 7B Instruct](https://huggingface.co/syzymon/long_llama_code_7b_instruct) is [LongLLaMA-Code 7B](https://huggingface.co/syzymon/long_llama_code_7b) tuned on the [TIGER-Lab/MathInstruct](https://huggingface.co/datasets/TIGER-Lab/MathInstruct), [OpenOrca](https://huggingface.co/datasets/Open-Orca/OpenOrca), and [ShareGPT-Processed](https://huggingface.co/datasets/zetavg/ShareGPT-Processed) datasets. Note that those datasets contain outputs from ChatGPT. See also the [codellama/CodeLlama-7b-hf](https://huggingface.co/codellama/CodeLlama-7b-hf) license.

## Acknowledgements
We gratefully acknowledge the TPU Research Cloud program, which was instrumental to our research by providing significant computational resources.