Multilingual for Translation Corpus
  Helsinki-NLP/opus_books • Viewer • Updated Mar 29, 2024 • 1.25M • 11.1k • 85
models
  rasa/LaBSE • Feature Extraction • Updated May 20, 2021 • 10.6k • 22
  nomic-ai/nomic-embed-text-v1.5 • Sentence Similarity • 0.1B • Updated Jul 21, 2025 • 3.93M • 765
  NovaSearch/stella_en_1.5B_v5 • Sentence Similarity • 2B • Updated Jul 28, 2025 • 24.8k • 260
  llmware/llama-3.2-1b-gguf • 1B • Updated Feb 8, 2025 • 14 • 1
Vietnamese
  ngtoanrob/vien-translation • Translation • Updated Feb 24, 2023 • 2 • 1
  ngtoanrob/envi-translation • Updated Apr 1, 2023 • 1
  gozu888/Envit5-tuned • Translation • 0.3B • Updated Jun 28, 2023 • 8 • 3
  IWSLT/mt_eng_vietnamese • Updated Jan 18, 2024 • 211 • 29
Wish list
  HuggingFaceH4/ultrachat_200k • Viewer • Updated Oct 16, 2024 • 515k • 29.8k • 643
  bookcorpus/bookcorpus • Updated May 3, 2024 • 5.71k • 346
  sentence-transformers/wikipedia-en-sentences • Viewer • Updated Apr 25, 2024 • 7.87M • 306 • 7
  sentence-transformers/paq • Viewer • Updated May 1, 2024 • 64.4M • 602 • 2
LLMs
  TheBloke/Llama-2-13B-chat-GGML • Text Generation • Updated Sep 27, 2023 • 119 • 696
  TheBloke/Llama-2-7B-32K-Instruct-GGML • Updated Sep 27, 2023 • 2 • 8
  openchat/openchat-3.6-8b-20240522 • Text Generation • 8B • Updated May 28, 2024 • 7.62k • 156
corpuses
  Skylion007/openwebtext • Viewer • Updated Dec 26, 2025 • 8.01M • 52.7k • 484
  humarin/chatgpt-paraphrases • Viewer • Updated Apr 5, 2023 • 419k • 76 • 59
  stanford-oval/ccnews • Viewer • Updated Aug 31, 2024 • 893M • 3.17k • 32
  stanford-oval/wikipedia • Viewer • Updated Apr 29, 2025 • 345M • 2.62k • 12