Latest research
August 2024 - Digital Socrates: Evaluating LLMs through explanation critiques
Digital Socrates is an evaluation tool that can characterize LLMs' explanation capabilities.
August 2024 - Open research is the key to unlocking safer AI
Ai2 presents its stance on openness and safety in AI.
July 2024 - Broadening the scope of noncompliance: When and how AI models should not comply with user requests
We outline a taxonomy of model noncompliance, then delve into how models can implement it.
June 2024 - PolygloToxicityPrompts: Multilingual evaluation of neural toxic degeneration in large language models
New research on prompt toxicity reveals insights into neural toxic degeneration across diverse languages.
May 2024 - Data-driven discovery with large generative models
We believe AI can assist researchers in finding relevant preexisting work to expedite discoveries. Here's how.
April 2024 - SatlasPretrain Models: Foundation models for satellite and aerial imagery
We’re excited to announce SatlasPretrain Models, a suite of open geospatial foundation models.
April 2024 - OLMo 1.7–7B: A 24-point improvement on MMLU
Introducing an updated version of our 7 billion parameter Open Language Model, OLMo 1.7–7B.
April 2024 - Making a switch — Dolma moves to ODC-BY
We’re moving the Dolma dataset to the ODC-BY license. Here’s why.
March 2024 - RewardBench: The first benchmark & leaderboard for reward models used in RLHF
Introducing RewardBench, a benchmark for evaluating preference reward models.