Allen Institute for AI (AI2) Releases OLMo 2 32B: A Fully Open Model to Beat GPT-3.5 and GPT-4o mini on a Suite of Multi-Skill Benchmarks

The rapid evolution of artificial intelligence (AI) has ushered in a new era of large language models (LLMs) capable of understanding and generating human-like text. However, the proprietary nature of many of these models poses challenges for accessibility, collaboration, and transparency within the research community. Additionally, the substantial computational resources required to train such models often limit participation to well-funded organizations, thereby hindering broader innovation.

Addressing these concerns, the Allen Institute for AI (AI2) has introduced OLMo 2 32B, the latest and most advanced model in the OLMo 2 series. This model distinguishes itself as the first fully open model to surpass GPT-3.5 Turbo and GPT-4o mini across a suite of widely recognized, multi-skill academic benchmarks. By making all data, code, weights, and training details freely available, AI2 promotes a culture of openness and collaboration, enabling researchers worldwide to build upon this work.
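Because the weights are published openly, the model can be pulled with the standard Hugging Face transformers API. A minimal sketch, assuming the instruct variant lives at the repo id `allenai/OLMo-2-0325-32B-Instruct` (check the HF project page for the exact id) and that you have hardware capable of serving a 32B-parameter model:

```python
# Hedged sketch: loading OLMo 2 32B from the Hugging Face Hub.
# The repo id below is an assumption; confirm it on the HF project page.
MODEL_ID = "allenai/OLMo-2-0325-32B-Instruct"

def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Generate a completion with OLMo 2 32B via transformers.

    Imports are deferred so the module loads without transformers installed;
    actually running this downloads ~64 GB of bf16 weights.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # halve memory vs. fp32
        device_map="auto",           # shard across available GPUs (needs accelerate)
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate("The capital of France is"))
```

This is the same generic `AutoModelForCausalLM` path used for most open-weight checkpoints; nothing here is specific to AI2's own training or evaluation code.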

OLMo 2 32B’s architecture comprises 32 billion parameters, reflecting a significant scaling from its predecessors. The training process was meticulously structured in two primary phases: pretraining and mid-training. During pretraining, the model was exposed to approximately 3.9 trillion tokens from diverse sources, including DCLM, Dolma, Starcoder, and Proof Pile II, ensuring a comprehensive understanding of language patterns. The mid-training phase utilized the Dolmino dataset, which consists of 843 billion tokens curated for quality, encompassing educational, mathematical, and academic content. This phased approach ensured that OLMo 2 32B developed a robust and nuanced grasp of language.

A notable aspect of OLMo 2 32B is its training efficiency. The model achieved performance levels comparable to leading open-weight models while utilizing only a fraction of the computational resources. Specifically, it required approximately one-third of the training compute of models like Qwen 2.5 32B, highlighting AI2’s commitment to resource-efficient AI development.
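These figures can be sanity-checked with the common C ≈ 6·N·D approximation for dense-transformer training FLOPs (N = parameters, D = training tokens). The 18 trillion-token figure for Qwen 2.5 used below is an outside assumption drawn from Qwen's own reporting, not from this article, so treat the result as a rough sketch:

```python
# Back-of-envelope training-compute estimate with the common
# C ~= 6 * N * D rule (N = parameters, D = training tokens).

def train_flops(params: float, tokens: float) -> float:
    """Approximate total training FLOPs for a dense transformer."""
    return 6.0 * params * tokens

# OLMo 2 32B: 3.9T pretraining tokens + 843B mid-training (Dolmino) tokens.
olmo_flops = train_flops(32e9, 3.9e12 + 0.843e12)

# Qwen 2.5 32B: ~18T tokens (assumption, not stated in this article).
qwen_flops = train_flops(32e9, 18e12)

print(f"OLMo 2 32B: ~{olmo_flops:.2e} FLOPs")          # on the order of 9e23
print(f"Ratio vs. Qwen 2.5 32B: ~{olmo_flops / qwen_flops:.2f}")
```

By this token-count-only heuristic the ratio lands near 0.26, broadly consistent with the roughly one-third figure reported here, which reflects measured training compute rather than a parameter-token approximation.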

In benchmark evaluations, OLMo 2 32B demonstrated impressive results. It matched or exceeded the performance of models such as GPT-3.5 Turbo, GPT-4o mini, Qwen 2.5 32B, and Mistral 24B. Furthermore, it approached the performance levels of larger models like Qwen 2.5 72B and the Llama 3.1 and 3.3 70B models. These assessments spanned various tasks, including Massive Multitask Language Understanding (MMLU), mathematics problem-solving (MATH), and instruction-following evaluations (IFEval), underscoring the model’s versatility and competence across diverse linguistic challenges.

The release of OLMo 2 32B signifies a pivotal advancement in the pursuit of open and accessible AI. By providing a fully open model that not only competes with but also surpasses certain proprietary models, AI2 exemplifies how thoughtful scaling and efficient training methodologies can lead to significant breakthroughs. This openness fosters a more inclusive and collaborative environment, empowering researchers and developers globally to engage with and contribute to the evolving landscape of artificial intelligence.

Check out the Technical Details, HF Project and GitHub Page. All credit for this research goes to the researchers of this project. Also, feel free to follow us on Twitter and don’t forget to join our 80k+ ML SubReddit.

Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence Media Platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform draws over 2 million monthly views, illustrating its popularity among audiences.
