Latest research
August 2024 - Digital Socrates: Evaluating LLMs through explanation critiques
Digital Socrates is an evaluation tool that can characterize LLMs' explanation capabilities.
August 2024 - Open research is the key to unlocking safer AI
Ai2 presents its stance on openness and safety in AI.
July 2024 - Broadening the scope of noncompliance: When and how AI models should not comply with user requests
We outline a taxonomy of model noncompliance, then delve into how models can implement it.
June 2024 - PolygloToxicityPrompts: Multilingual evaluation of neural toxic degeneration in large language models
New research on prompt toxicity reveals insights into neural toxic degeneration across diverse languages.
May 2024 - Data-driven discovery with large generative models
We believe AI can assist researchers in finding relevant preexisting work to expedite discoveries. Here's how.
April 2024 - SatlasPretrain Models: Foundation models for satellite and aerial imagery
We’re excited to announce SatlasPretrain Models, a suite of open geospatial foundation models.
April 2024 - OLMo 1.7–7B: A 24-point improvement on MMLU
Introducing an updated version of our 7 billion parameter Open Language Model, OLMo 1.7–7B.
April 2024 - Making a switch — Dolma moves to ODC-BY
We’re moving the Dolma dataset to the ODC-BY license. Here’s why.
March 2024 - RewardBench: The first benchmark & leaderboard for reward models used in RLHF
Introducing RewardBench, a benchmark for evaluating preference reward models.