# Echo, UniPat, and UniScientist Deep Dive

Updated: July 1, 2026

## Bottom Line

UniPat AI is the company. Echo is its prediction/forecasting system, and EchoZ-1.0 is the forecasting model reported inside Echo. UniScientist is a separate UniPat AI model: a rubric-trained scientific research model, not the Echo forecasting model.

EchoZ-1.0 is visible through UniPat's public Echo leaderboard and API, but I did not find public Hugging Face weights for EchoZ-1.0. The public UniPat Hugging Face model is `UnipatAI/UniScientist-30B-A3B`.

Direct URLs:

- Echo home: https://echo.unipat.ai/
- Echo leaderboard: https://echo.unipat.ai/leaderboard
- Echo predictions/questions: https://echo.unipat.ai/questions
- Echo blog/methodology: https://unipat.ai/blog/Echo
- UniPat Hugging Face org: https://huggingface.co/UnipatAI
- UniScientist model: https://huggingface.co/UnipatAI/UniScientist-30B-A3B

## Terminology

| Name | What it is | Relationship |
| --- | --- | --- |
| UniPat AI | Company / research organization | Parent organization behind Echo and UniScientist. |
| Echo | Prediction-intelligence product, benchmark, and API | UniPat AI's event forecasting infrastructure. |
| EchoZ-1.0 | Forecasting model inside Echo | Reported as the top model on the Echo leaderboard; weights not found publicly. |
| General AI Prediction Leaderboard | Live Echo leaderboard | Compares model probability forecasts on aligned prediction points. |
| UniScientist-30B-A3B | Public Hugging Face model | UniPat AI model for scientific research tasks, not the Echo forecasting model. |

## Echo Public System

Sources:

- Echo blog: https://unipat.ai/blog/Echo
- Echo leaderboard: https://echo.unipat.ai/leaderboard
- QbitAI article: https://www.qbitai.com/2026/03/393353.html
- Public API snapshot stored locally: `manifests/echo_public_snapshot.json`

UniPat describes Echo as a full-stack prediction intelligence system with three parts:

1. General AI Prediction Leaderboard.
2. Train-on-Future post-training pipeline.
3. AI-native prediction API.

The blog describes the leaderboard as a dynamic benchmark that compares models only at aligned prediction points. It maps Brier-score differences into soft pairwise wins, then fits a Bradley-Terry/Elo-style model with horizon-weighted battles.
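As a rough illustration of the first half of that pipeline, the sketch below scores two forecasts with a multi-class Brier score and turns the difference into a soft win label. The logistic mapping and the `temperature` value are assumptions made for illustration; the blog's exact mapping function is not reproduced in these notes.

```python
import math


def brier_score(probs, outcome_index):
    """Multi-class Brier score: squared error against the one-hot realized outcome."""
    return sum(
        (p - (1.0 if i == outcome_index else 0.0)) ** 2
        for i, p in enumerate(probs)
    )


def soft_win(brier_a, brier_b, temperature=0.1):
    """Soft win label for model A against model B, in (0, 1).

    Assumed logistic mapping (not confirmed by the source). Lower Brier is
    better, so the label rises above 0.5 when A's score is lower than B's.
    """
    return 1.0 / (1.0 + math.exp(-(brier_b - brier_a) / temperature))


# Two models forecast the same binary question; option 0 is what happened.
score_a = brier_score([0.7, 0.3], outcome_index=0)
score_b = brier_score([0.5, 0.5], outcome_index=0)
label_for_a = soft_win(score_a, score_b)
```

A soft label keeps the size of the score gap in the data: a narrow Brier advantage counts as slightly more than a tie rather than as a full win.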

## What The Echo Leaderboard Does

Echo's leaderboard is not a static benchmark like MMLU. It continuously collects prediction questions whose answers are not known yet, asks participating models for probability distributions at scheduled points before resolution, waits for outcomes, and then scores the forecasts.

The core mechanics are:

1. Question acquisition: Echo uses prediction-market questions, synthesized questions from trends, and expert-annotated questions.
2. Prediction scheduling: each question is sampled at several points before the event resolves. Longer-horizon questions receive more prediction points, but the count follows a logarithmic schedule so that long questions do not dominate cost.
3. Point alignment: models are compared only when they forecast the same question at the same prediction point. This avoids giving one model an easier later forecast and another model a harder earlier forecast.
4. Proper scoring: each forecast is scored on the probability it assigned to the realized outcome. Rankings use Brier-style scoring, and question pages also display log scores.
5. Pairwise battles: for aligned model pairs, Brier-score differences are converted into soft win/loss labels.
6. Elo estimation: the soft battles are fit with a Bradley-Terry/Elo-style maximum-likelihood model, with more weight on longer-lead-time forecasts.
7. Public inspection: the web app exposes current rankings, active questions, resolved model cases, and per-question probability time series.
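Steps 3, 5, and 6 can be sketched together as follows. This is a minimal sketch under stated assumptions, not Echo's implementation: the record layout, the logistic soft-win mapping, the `log1p` horizon weight, and the plain gradient-ascent fit are all illustrative choices, and the ratings come out on a log-odds scale rather than Echo's published Elo scale.

```python
import math
from collections import defaultdict
from itertools import combinations


def aligned_battles(records, temperature=0.1):
    """Pair up models that forecast the same question at the same prediction point.

    records: iterable of (model, question_id, point_id, brier, horizon_days).
    Returns a list of (model_a, model_b, soft_win_for_a, weight) tuples.
    """
    by_point = defaultdict(list)
    for model, question_id, point_id, brier, horizon_days in records:
        by_point[(question_id, point_id)].append((model, brier, horizon_days))

    battles = []
    for entries in by_point.values():
        for (model_a, brier_a, horizon), (model_b, brier_b, _) in combinations(entries, 2):
            # Assumed logistic mapping; lower Brier is better.
            soft_win = 1.0 / (1.0 + math.exp(-(brier_b - brier_a) / temperature))
            # Assumed horizon weighting: longer lead time counts more.
            weight = math.log1p(horizon)
            battles.append((model_a, model_b, soft_win, weight))
    return battles


def fit_bradley_terry(battles, steps=2000, learning_rate=1.0):
    """Weighted maximum-likelihood Bradley-Terry fit by gradient ascent."""
    ratings = {}
    for model_a, model_b, _, _ in battles:
        ratings.setdefault(model_a, 0.0)
        ratings.setdefault(model_b, 0.0)
    if not ratings:
        return {}
    total_weight = sum(weight for _, _, _, weight in battles) or 1.0

    for _ in range(steps):
        gradient = dict.fromkeys(ratings, 0.0)
        for model_a, model_b, soft_win, weight in battles:
            predicted = 1.0 / (1.0 + math.exp(-(ratings[model_a] - ratings[model_b])))
            gradient[model_a] += weight * (soft_win - predicted)
            gradient[model_b] -= weight * (soft_win - predicted)
        for model in ratings:
            ratings[model] += learning_rate * gradient[model] / total_weight

    # Ratings are only identified up to a constant shift, so center them.
    mean = sum(ratings.values()) / len(ratings)
    return {model: rating - mean for model, rating in ratings.items()}


records = [
    ("A", "q1", "p1", 0.10, 30), ("B", "q1", "p1", 0.20, 30), ("C", "q1", "p1", 0.40, 30),
    ("A", "q2", "p1", 0.05, 7), ("B", "q2", "p1", 0.15, 7),
    ("A", "q2", "p2", 0.02, 1),  # only A forecast at this point, so no battle
]
battles = aligned_battles(records)
ratings = fit_bradley_terry(battles)
```

The last record shows the effect of alignment: a forecast that no other model made at the same point contributes no battle, so a model cannot gain rating from easy late forecasts that its competitors never made.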

Why this matters for us: the leaderboard is trying to evaluate forecasting behavior under comparable information states, rather than only counting final right/wrong answers. That is directly relevant to our problem because the hard part is separating real forecasting skill from temporal leakage, lucky outcomes, and uneven forecast timing.

The Train-on-Future mechanism described by UniPat has three pieces:

1. Dynamic question synthesis from current streams so labels are in the future.
2. Rubrics Search, where process rubrics are selected by how well rubric rankings match outcome-based Elo rankings on held-out questions.
3. Map-Reduce agent architecture, where multiple agents collect/analyze evidence and a reducer aggregates conflicts into a probability distribution.
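The Rubrics Search idea in item 2 can be approximated with a rank-correlation filter. The sketch below assumes Spearman correlation as the agreement measure and uses hypothetical rubric names, model names, and scores; UniPat's actual selection criterion and data format are not public in the sources checked.

```python
def ranks(values):
    """Rank positions with 1 for the largest value; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    positions = [0] * len(values)
    for position, index in enumerate(order, start=1):
        positions[index] = position
    return positions


def spearman(xs, ys):
    """Spearman rank correlation for tie-free data."""
    n = len(xs)
    squared_diffs = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1.0 - 6.0 * squared_diffs / (n * (n * n - 1))


def select_rubrics(rubric_scores, elo_by_model, top_k=1):
    """Keep the rubrics whose model ranking best matches the outcome-based Elo ranking.

    rubric_scores: {rubric_name: {model: mean rubric score on held-out questions}}
    elo_by_model:  {model: outcome-based Elo on the same held-out questions}
    """
    models = sorted(elo_by_model)
    elo = [elo_by_model[m] for m in models]
    agreement = {
        name: spearman([scores[m] for m in models], elo)
        for name, scores in rubric_scores.items()
    }
    selected = sorted(agreement, key=agreement.get, reverse=True)[:top_k]
    return selected, agreement


# Hypothetical held-out data for three models and two candidate rubrics.
elo_by_model = {"m1": 1020.0, "m2": 1000.0, "m3": 980.0}
rubric_scores = {
    "cites_primary_sources": {"m1": 0.9, "m2": 0.6, "m3": 0.3},
    "writes_long_answers": {"m1": 0.2, "m2": 0.5, "m3": 0.8},
}
selected, agreement = select_rubrics(rubric_scores, elo_by_model, top_k=1)
```

A rubric that rewards behavior unrelated to realized accuracy, like the second one here, ends up with low or negative agreement and is filtered out before it can shape training rewards.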

Company/context note: the Echo blog lists contributors affiliated with UniPat AI and Peking University. The QbitAI article describes Echo as UniPat AI's full prediction-intelligence infrastructure and highlights the same Train-on-Future, dynamic evaluation, and EchoZ-1.0 themes.

## Current Echo API Snapshot

The public Echo API response on July 1, 2026 shows:

- EchoZ-1.0 rank: 1.
- EchoZ-1.0 Elo: 1024.1.
- EchoZ-1.0 battles: 83,606.
- EchoZ-1.0 resolved count: 2,396.
- EchoZ-1.0 first prediction date: March 4, 2026.
- Polymarket Market baseline rank: 3.
- Market baseline Elo: 1011.0.
- Market baseline battles: 74,191.
- Active questions returned by the public question API: 151.

Important date distinction: the March blog and QbitAI article reported a March 2026 leaderboard snapshot in which EchoZ-1.0 had Elo 1034.2. The live public API on July 1, 2026 reports 1024.1, which is 10.1 points lower.
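To get a feel for the size of the gap between EchoZ-1.0 and the market baseline in the July snapshot, the Elo difference can be converted into an expected soft-win rate. The sketch assumes the conventional chess-style Elo curve with a 400-point scale; the sources checked do not confirm that Echo uses this scale, so treat the result as an order-of-magnitude reading only.

```python
def elo_expected_score(rating_a, rating_b, scale=400.0):
    """Expected score of A against B under the conventional Elo logistic curve.

    The 400-point scale is an assumption; Echo's own scale is not documented here.
    """
    return 1.0 / (1.0 + 10 ** (-(rating_a - rating_b) / scale))


echoz_elo = 1024.1   # EchoZ-1.0, July 1, 2026 snapshot
market_elo = 1011.0  # Polymarket Market baseline, same snapshot

gap = echoz_elo - market_elo
expected = elo_expected_score(echoz_elo, market_elo)
```

Under that assumption a 13.1-point gap corresponds to an expected score of about 0.52, which is a consistent but narrow edge over the market baseline.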

The public API exposes:

- `/api/v2/rankings?category=Overall`
- `/api/v2/ranking-history?category=Overall&batches=4`
- `/api/v2/questions?page=1&size=20`
- `/api/v2/question-detail?questionId=...`
- `/api/v2/model-detail?modelId=...`
- `/api/v2/model-cases?modelId=...`

Question detail exposes model probability time series for active and resolved questions. It does not expose full reasoning trajectories or training data in the records I checked.
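A small helper for building requests against the endpoints listed above might look like the sketch below. The base URL `https://echo.unipat.ai` is an assumption (the list gives paths only), and `fetch_json` assumes the endpoints return JSON, which matches the local snapshot's file type but was not verified field by field. The helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the API is served from the same host as the Echo web app.
BASE_URL = "https://echo.unipat.ai"


def build_url(path, **params):
    """Join an endpoint path with URL-encoded query parameters."""
    query = urlencode(params)
    return f"{BASE_URL}{path}?{query}" if query else f"{BASE_URL}{path}"


def fetch_json(path, timeout=30, **params):
    """Fetch one endpoint and parse the body as JSON (assumed response format)."""
    with urlopen(build_url(path, **params), timeout=timeout) as response:
        return json.load(response)


rankings_url = build_url("/api/v2/rankings", category="Overall")
history_url = build_url("/api/v2/ranking-history", category="Overall", batches=4)
questions_url = build_url("/api/v2/questions", page=1, size=20)
```

Keeping URL construction separate from fetching makes it easy to log exactly which requests produced a stored snapshot such as `manifests/echo_public_snapshot.json`.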

## Hugging Face and GitHub Findings

Sources:

- UniPat HF org: https://huggingface.co/UnipatAI
- UniScientist model: https://huggingface.co/UnipatAI/UniScientist-30B-A3B
- UniScientist GitHub: https://github.com/UniPat-AI/UniScientist
- UniScientist blog: https://unipat.ai/blog/UniScientist

Hugging Face current state:

- Public models under `UnipatAI`: one model, `UnipatAI/UniScientist-30B-A3B`.
- EchoZ search on Hugging Face did not return a public UniPat EchoZ model.
- UniPat datasets on Hugging Face are BabyVision, BabyVision-Gen, Monthly-SWEBench 2026-03/04/05, RoadmapBench, and EvoCodeBench. I did not find an Echo dataset there.

UniScientist metadata:

- Model size: 31B parameters.
- Architecture tag: `qwen3_moe`.
- Precision: BF16 safetensors.
- License: Apache 2.0.
- Last modified: March 4, 2026.
- Files: 25, total remote size about 61.08 GB.
- Downloads and likes at capture: 49 downloads, 15 likes.
- Public metadata/config/tokenizer files downloaded locally under `data/models/uniscientist_30b_a3b_metadata/`.
- Public GitHub code cloned under `data/models/uniscientist_repo/`.

UniScientist blog claims:

- 30B total parameters with 3B active per token.
- Qwen3-30B-A3B-Thinking-2507 fine-tuned on an H200 cluster.
- About 1,200 GPU-hours.
- 128k token context.
- Up to 100 tool-invocation steps.
- Tools: web search, Google Scholar, page fetching, code interpreter.
- Dataset: 4,700+ research-grade instances, 20+ rubrics per question, 50+ disciplines.
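The metadata lists 31B parameters while the blog says 30B total. A quick consistency check from the reported file size suggests the two figures are roundings of the same number. The sketch assumes that "GB" means 10^9 bytes and that nearly all of the 61.08 GB is BF16 weight data at 2 bytes per parameter.

```python
BYTES_PER_BF16_PARAM = 2      # BF16 stores each parameter in 16 bits
total_remote_bytes = 61.08e9  # assumption: decimal gigabytes, almost all weights

approx_params = total_remote_bytes / BYTES_PER_BF16_PARAM
approx_params_billions = approx_params / 1e9

# Share of parameters active per token, using the blog's 3B-of-30B figures.
active_fraction = 3 / 30
```

Roughly 30.5B parameters would round to 31B on the model page and to 30B in the model name, and the blog's figures imply that about 10 percent of the parameters are active for any given token.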

## What This Means For Our Project

Echo is the closest system-design reference, but it is not directly reproducible from public weights. The reproducible pieces are:

- Public leaderboard outputs.
- Public probability time series on specific questions.
- Public model/case summaries.
- Public methodology from the blog.
- Adjacent UniPat rubric/agentic code and UniScientist metadata.

For our system, we should reproduce the principles, not depend on Echo weights:

1. Build our own aligned prediction-point evaluator.
2. Score forecasts with proper scoring rules and compare them against a market baseline before any training.
3. Add process rubrics, but validate rubrics against held-out realized forecasting performance.
4. Store reasoning traces ourselves so process rewards can be audited for leakage.
5. Treat market prices as a baseline and possible feature, not as the label to imitate blindly.
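As a starting point for items 1 and 2, the sketch below compares a model with a market baseline only on aligned prediction points and reports a Brier skill score. The data layout, function names, and example numbers are hypothetical, and the sketch covers binary questions only.

```python
def binary_brier(p_yes, outcome):
    """Brier score for a binary question; outcome is 1 if it resolved yes, else 0."""
    return (p_yes - outcome) ** 2


def skill_vs_market(model_forecasts, market_prices, outcomes):
    """Compare model and market on points where both have a forecast and the question resolved.

    model_forecasts, market_prices: {(question_id, point_id): P(yes)}
    outcomes: {question_id: 0 or 1}
    """
    keys = [k for k in model_forecasts if k in market_prices and k[0] in outcomes]
    if not keys:
        raise ValueError("no aligned, resolved prediction points")
    model = sum(binary_brier(model_forecasts[k], outcomes[k[0]]) for k in keys) / len(keys)
    market = sum(binary_brier(market_prices[k], outcomes[k[0]]) for k in keys) / len(keys)
    # Skill > 0 means the model beat the market on the aligned points.
    skill = 1.0 - model / market if market > 0 else float("nan")
    return {"n": len(keys), "model_brier": model, "market_brier": market, "skill": skill}


model_forecasts = {("q1", "t1"): 0.8, ("q1", "t2"): 0.9, ("q2", "t1"): 0.2, ("q3", "t1"): 0.5}
market_prices = {("q1", "t1"): 0.6, ("q1", "t2"): 0.7, ("q2", "t1"): 0.4}
outcomes = {"q1": 1, "q2": 0, "q3": 1}

report = skill_vs_market(model_forecasts, market_prices, outcomes)
```

The `q3` forecast is dropped because the market has no price at that point, so neither side is scored on a question the other never faced. Reporting `n` alongside the skill score makes it visible when a comparison rests on very few aligned points.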
