Article • LeMaterial: an open source initiative to accelerate materials discovery and research • Dec 10, 2024 • 54
Article • Finally, a Replacement for BERT: Introducing ModernBERT • Dec 19, 2024 • 711
BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature Paper • 2501.07171 • Published Jan 13 • 55
ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities Paper • 2407.14482 • Published Jul 19, 2024 • 26
swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA Text Generation • 8B • Updated Sep 1 • 2.06k • 29
Open Ita Llm Leaderboard • Running on CPU Upgrade • 81 • Track, rank and evaluate open LLMs in the Italian language!
Post 8742 • Working on a concept GPT-2 (small) that uses KANs instead of MLPs. The checkpoint and training code will be on the Hub soon. • 6 replies
Granite 2.0 Code Models Collection • A series of code models trained by IBM and released under the Apache 2.0 license. We release both the base pretrained and instruct models. • 23 items • Updated 24 days ago • 202
Rethinking Interpretability in the Era of Large Language Models Paper • 2402.01761 • Published Jan 30, 2024 • 23
RAG vs Fine-tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture Paper • 2401.08406 • Published Jan 16, 2024 • 37