Fix Readme Links
README.md CHANGED

@@ -17,12 +17,12 @@ tags:
 # Model Overview
 
 ## Description:
-The NVIDIA gpt-oss-120b Eagle model is the Eagle head of the OpenAI’s gpt-oss-120b model, which is an auto-regressive language model that uses a mixture-of-experts (MoE) architecture with 5 billion activated parameters and 120 billion total parameters. For more information, please check [here](https://huggingface.co/openai/gpt-oss-120b). The NVIDIA gpt-oss-120b Eagle3 model incorporates Eagle speculative decoding with [
+The NVIDIA gpt-oss-120b Eagle model is the Eagle head of the OpenAI’s gpt-oss-120b model, which is an auto-regressive language model that uses a mixture-of-experts (MoE) architecture with 5 billion activated parameters and 120 billion total parameters. For more information, please check [here](https://huggingface.co/openai/gpt-oss-120b). The NVIDIA gpt-oss-120b Eagle3 model incorporates Eagle speculative decoding with [Model Optimizer](https://github.com/NVIDIA/Model-Optimizer).
 
 This model is ready for commercial/non-commercial use. <br>
 
 ### Note
-For use cases of less than 8k context length - please consider using [gpt-oss-120b-Eagle3-
+For use cases of less than 8k context length - please consider using [gpt-oss-120b-Eagle3-short-context](https://huggingface.co/nvidia/gpt-oss-120b-Eagle3-short-context)
 
 ### License/Terms of Use:
 [nvidia-open-model-license](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/)

@@ -35,7 +35,7 @@ Developers designing AI Agent systems, chatbots, RAG systems, and other AI-power
 <br>
 
 ### Release Date: <br>
-Huggingface: Aug 20th, 2025 via [https://huggingface.co/nvidia/gpt-oss-120b-Eagle3] <br>
+Huggingface: Aug 20th, 2025 via [https://huggingface.co/nvidia/gpt-oss-120b-Eagle3-long-context] <br>
 
 ## Model Architecture:
 **Architecture Type:** Transformers <br>

@@ -84,7 +84,7 @@ The integration of foundation and fine-tuned models into AI systems requires add
 
 ## Training Dataset:
 
-**Link:** [ultrachat_200k](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k) and [Magpie-Llama-3.1-Pro-300K-Filtered](https://huggingface.co/datasets/Magpie-Align/Magpie-Llama-3.1-Pro-300K-Filtered), only prompts from the datasets were used for data synthesis
+**Link:** [ultrachat_200k](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k) and [Magpie-Llama-3.1-Pro-300K-Filtered](https://huggingface.co/datasets/Magpie-Align/Magpie-Llama-3.1-Pro-300K-Filtered), only prompts from the datasets were used for data synthesis (the original responses from GPT were not used for data synthesis) which is then used to train the Eagle modules. Click the links above for more information regarding the dataset. <br>
 
 ** Data Modality
 [Text]
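For context, the README touched by this commit describes an Eagle3 draft head that pairs with the base gpt-oss-120b target model for speculative decoding. As a rough illustration only, the sketch below shows how such a draft/target pair might be wired up with vLLM's speculative decoding config; the engine choice, the config keys ("method", "model", "num_speculative_tokens"), the exact repository IDs, and the parallelism setting are all assumptions, not instructions taken from the commit or the model card.

```python
# Illustrative sketch only: pairing the gpt-oss-120b target model with an
# Eagle3 draft head for speculative decoding in vLLM. Config keys, model IDs,
# and version support are assumptions -- verify against the vLLM docs.
from vllm import LLM, SamplingParams

llm = LLM(
    model="openai/gpt-oss-120b",  # target model referenced in the README
    speculative_config={
        "method": "eagle3",  # Eagle3-style draft head
        # Draft-model repo ID assumed from the README's release link
        "model": "nvidia/gpt-oss-120b-Eagle3-long-context",
        "num_speculative_tokens": 3,  # draft tokens proposed per step
    },
    tensor_parallel_size=8,  # assumption: adjust to your GPU count
)

sampling = SamplingParams(temperature=0.0, max_tokens=256)
outputs = llm.generate(["Summarize EAGLE speculative decoding."], sampling)
print(outputs[0].outputs[0].text)
```

Any speedup depends on how often the draft tokens are accepted by the target model; per the README's note, prompts that stay under roughly 8k tokens may be better served by the short-context variant of the Eagle3 head.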