Ai2

Latest research

August 2024 - Digital Socrates: Evaluating LLMs through explanation critiques

Digital Socrates is an evaluation tool that can characterize LLMs' explanation capabilities.

August 2024 - Open research is the key to unlocking safer AI

Ai2 presents our stance on openness and safety in AI.

July 2024 - Broadening the scope of noncompliance: When and how AI models should not comply with user requests

We outline a taxonomy of model noncompliance and then delve deeper into how to implement it.

June 2024 - PolygloToxicityPrompts: Multilingual evaluation of neural toxic degeneration in large language models

New research on prompt toxicity, offering insights into neural toxic degeneration in LLMs across diverse languages.

May 2024 - Data-driven discovery with large generative models

We believe AI can assist researchers in finding relevant preexisting work to expedite discoveries. Here's how.

April 2024 - SatlasPretrain Models: Foundation models for satellite and aerial imagery

We’re excited to announce SatlasPretrain Models, a suite of open geospatial foundation models.

April 2024 - OLMo 1.7–7B: A 24 point improvement on MMLU

Introducing an updated version of our 7 billion parameter Open Language Model, OLMo 1.7–7B.

April 2024 - Making a switch — Dolma moves to ODC-BY

We’re moving the Dolma dataset to the ODC-BY license. Here’s why.

March 2024 - RewardBench: The first benchmark & leaderboard for reward models used in RLHF

Introducing RewardBench, a benchmark for evaluating preference reward models.