Patronus AI | Products

Platform

Our core evaluation platform gives teams a centralized solution for experiments, logging, comparisons, traces, and more.

LLM-as-a-Judge

Enables developers to score multimodal AI systems on image-to-text tasks

Glider

Powerful 3B evaluator LLM that can score any text input on user-defined criteria

Lynx

A state-of-the-art hallucination detection LLM capable of advanced reasoning

Percival

Percival is our evaluation copilot for agentic systems. It detects 20+ failure modes in agentic traces, suggests optimizations, and evaluates a suite of reasoning and planning errors.

Percival

Eval copilot that analyzes traces, identifies issues, and suggests optimizations

Percival Chat Assistant

Interactive AI assistant that lets you unlock the power of Percival

World Models for Digital Workflows

We are a team of AI researchers and engineers formerly of companies such as Meta AI, Amazon AGI, and Google. Our work has led to product contributions serving top Fortune 500 clients.

Generative Simulators

Adaptive environments that co-generate tasks, world dynamics, and reward functions

MemTrack

Benchmark to evaluate long-term memory and state tracking in multi-platform agent environments

First Digital Model World

The first large-scale model of digital work, enabling agents to learn and operate across realistic software, tools, and workflows.
