METR

Model Evaluation and Threat Research

Popular repositories

  1. eval-analysis-public Public

    Public repository containing METR's DVC pipeline for eval data analysis

    Python, 228 stars, 45 forks

  2. task-standard Public

    METR Task Standard

    TypeScript, 177 stars, 36 forks

  3. RE-Bench Public

    Python, 134 stars, 18 forks

  4. vivaria Public

    Vivaria is METR's tool for running evaluations and conducting agent elicitation research.

    TypeScript, 134 stars, 39 forks

  5. public-tasks Public

    HTML, 120 stars, 19 forks

  6. inspect-action Public

    Running UK AISI's Inspect in the Cloud

    Python, 20 stars, 9 forks

Repositories

Showing 10 of 56 repositories

People

This organization has no public members.