tl-hyungguk committed on
Commit 032115c · verified · 1 Parent(s): 0dcb825

Update README.md

Files changed (1)
  1. README.md +3 -6
README.md CHANGED
@@ -7,13 +7,10 @@ language:
  ---
  # Intermediate Checkpoints Release
 
- For the first time among Korean-targeted LLMs, we’re releasing **intermediate checkpoints** from the Tri family—**0.5B**, **1.9B**, and **7B**—to advance research on LLM training dynamics.
+ For the first time among Korean-targeted LLMs, we’re releasing **intermediate checkpoints** from the Tri family—**0.5B**, **1.9B**, and **7B**—to advance research on LLM training dynamics. We release checkpoints at regular step intervals— **≈20B tokens (0.5B), ≈40B (1.9B), and ≈160B (7B & 70B)** —enabling consistent analysis of training dynamics. Each step’s release is distinguished by its **branch name**.
 
- We release checkpoints at regular step intervals— **≈20B tokens (0.5B), ≈40B (1.9B), and ≈160B (7B & 70B)** —enabling consistent analysis of training dynamics.
-
- You can grab the **Tri-7B** model here: [https://huggingface.co/trillionlabs/Tri-7B](https://huggingface.co/trillionlabs/Tri-7B?utm_source=chatgpt.com).
-
- You can grab the **Tri-70B** preview model here: [https://huggingface.co/trillionlabs/Tri-70B-preview-SFT](https://huggingface.co/trillionlabs/Tri-70B-preview-SFT).
+ - You can grab the **Tri-7B** model here: [https://huggingface.co/trillionlabs/Tri-7B](https://huggingface.co/trillionlabs/Tri-7B?utm_source=chatgpt.com).
+ - You can grab the **Tri-70B** preview model here: [https://huggingface.co/trillionlabs/Tri-70B-preview-SFT](https://huggingface.co/trillionlabs/Tri-70B-preview-SFT).
 
  We’re also sharing the **0.5B** and **1.9B** runs—originally produced for system bring-up but now available as valuable artifacts for analyzing training behavior at smaller scales.
 
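The updated README says each step's release is distinguished by its branch name. A minimal sketch of how one might list those branches and fetch a single checkpoint, assuming the `huggingface_hub` package is installed and that every branch other than `main` holds an intermediate checkpoint (the README does not state the branch naming scheme, so that is a guess). The helper names `checkpoint_branches`, `list_checkpoint_branches`, and `download_checkpoint` are hypothetical and exist only for this illustration.

```python
from typing import Iterable, List

# Repo id taken from the README link; swap in another Tri repo as needed.
REPO_ID = "trillionlabs/Tri-7B"


def checkpoint_branches(branch_names: Iterable[str], default: str = "main") -> List[str]:
    """Drop the default branch and return the rest in alphabetical order.

    Assumption: all non-default branches are intermediate checkpoints.
    Alphabetical order may not match training order if step numbers
    in the names are not zero-padded.
    """
    return sorted(name for name in branch_names if name != default)


def list_checkpoint_branches(repo_id: str = REPO_ID) -> List[str]:
    """Ask the Hub for the repo's branches (needs network access)."""
    from huggingface_hub import list_repo_refs  # lazy import: only needed here

    refs = list_repo_refs(repo_id)
    return checkpoint_branches(branch.name for branch in refs.branches)


def download_checkpoint(branch: str, repo_id: str = REPO_ID) -> str:
    """Download the files stored on one branch and return the local path."""
    from huggingface_hub import snapshot_download  # lazy import: only needed here

    return snapshot_download(repo_id=repo_id, revision=branch)


# Example (not run here, since it hits the network and downloads weights):
#   branches = list_checkpoint_branches()
#   local_dir = download_checkpoint(branches[0])
```

The same `revision` argument is accepted by `from_pretrained` in `transformers`, so a branch name can be passed there directly once it is known.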