
how it works

THE MECHANISM

a tennis-betting agent that learns, lives in phases, and can permanently die.

01 · the arena

An agent that lives on-chain

Built for the Arbitrum Open House London hackathon, Autopoiesis is an autonomous AI agent that lives on Robinhood Chain, an Arbitrum L2. It bets on Polymarket tennis markets, learns from realized outcomes, matures through life-phases, and can permanently die — permadeath that mints a Tombstone NFT. It is not a model in a notebook; it is an organism with a pulse, a bankroll, and a mortality.

02 · the data

Real markets, point-in-time

7,494 match cassettes
4,925 resolved universe
65.7% settled coverage

The price data comes from Polymarket tennis markets: 7,494 per-match cassettes, each a real CLOB intraday price ledger. Of those, 4,925 (65.7%) resolve to a settled outcome we can grade against; that resolved set is the universe. Layered on top is the Sackmann ATP/WTA dataset: Elo ratings, surface win-rates, head-to-head records, and rest/recency. Every signal is point-in-time correct: it is computed at entry time, with no lookahead.
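To make "point-in-time correct" concrete, here is a minimal sketch of a surface win-rate that only sees matches played strictly before the entry time. The `MatchResult` type, the `surface_win_rate` helper, and the neutral 0.5 prior are illustrative assumptions, not the project's actual code.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class MatchResult:
    # Hypothetical record shape; the real dataset has many more columns.
    played_on: date
    surface: str
    won: bool


def surface_win_rate(history, surface, as_of, prior=0.5):
    """Win-rate on one surface using only matches strictly before `as_of`."""
    past = [m for m in history if m.surface == surface and m.played_on < as_of]
    if not past:
        return prior  # assumed neutral fallback when there is no history yet
    return sum(m.won for m in past) / len(past)


history = [
    MatchResult(date(2024, 5, 1), "clay", True),
    MatchResult(date(2024, 5, 8), "clay", False),
    MatchResult(date(2024, 5, 20), "clay", True),  # after the entry below: ignored
]
rate_at_entry = surface_win_rate(history, "clay", as_of=date(2024, 5, 10))
```

The filter on `played_on < as_of` is the whole trick: a result that settles after the bet is placed can never leak into the feature that justified the bet.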

03 · the senses

Five signal engines

Each market is read by five independent signal engines, then fused into one decision.

  1. Market Momentum · CLOB price ledger

    Reads the intraday drift in the live order-book — where the money is moving as the match unfolds.

  2. ELO / Ranking · elo / ranking gap

    Pre-match favorite strength from Elo ratings: who the numbers say should win before a ball is struck.

  3. Surface Form · surface win-rate

    Surface-specific form — clay, grass, hard. A player's win-rate is not one number; it is one per court.

  4. Head-to-Head · h2h record

    Head-to-head history — some matchups defy the ratings because one player simply owns the other.

  5. Rest & Recency · rest / recency

    Rest and recency — a fresh player against one three sets deep into a long week is a different bet.

honest note

Under the hood the DecisionEngine still keys three of these slots by names from an earlier prediction-markets prototype — smart_money, sentiment_llm, crowd_volume — but there is no order-flow, social-sentiment, or betting-volume data behind them. Each one computes a real tennis feature instead: surface win-rate, head-to-head record, and rest. The labels above are what actually runs.
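A small sketch of how a dashboard could translate the legacy slot keys into honest labels. It assumes the three keys map to the three features in the order the note lists them; that pairing, and the helper names, are assumptions rather than confirmed details of the DecisionEngine.

```python
# Assumed pairing: legacy key -> feature, in the order given in the note above.
LEGACY_SLOT_LABELS = {
    "smart_money": "surface win-rate",
    "sentiment_llm": "head-to-head record",
    "crowd_volume": "rest / recency",
}


def display_label(slot_key: str) -> str:
    """Return the honest label for a slot, leaving unknown keys unchanged."""
    return LEGACY_SLOT_LABELS.get(slot_key, slot_key)


def relabel(signals: dict[str, float]) -> dict[str, float]:
    """Rename legacy keys in a signal dict without touching the values."""
    return {display_label(key): value for key, value in signals.items()}


shown = relabel({"smart_money": 0.6, "momentum": 0.1})
```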

04 · the brain

A 2-layer fusion engine

A 2-layer decision engine fuses the five signals via tunable weights — two head weights (w_r / w_s), three alpha weights, two beta weights, and a rho mixing parameter — then sizes the bet under four constraints: max-breath-risk, min-confidence, min-bet, and a liquidity cap. The best seed found over the 4,925-market universe is selective and sharp:

0.649 per-bet Sharpe
81.5% win rate
$853 summed PnL
65 selective bets
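One way the fusion and sizing described above could fit together, as a hedged sketch. The parameter names come from the text, but everything else is assumed: which signals feed which head, the sigmoid squash, rho mixing the model's view with the market price, and the units of the four sizing knobs. The numbers are illustrative, not the best seed.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Seed:
    # Fusion parameters named in the text; how they combine is assumed here.
    w_r: float
    w_s: float
    alphas: tuple[float, float, float]
    betas: tuple[float, float]
    rho: float
    # Sizing knobs named in the text; their units are assumed.
    max_breath_risk: float  # max fraction of BREATH staked on one bet
    min_confidence: float   # min |fused probability - market price|
    min_bet: float          # smallest stake worth placing
    liquidity_cap: float    # max fraction of available liquidity to take


def fuse(seed: Seed, signals: dict[str, float], market_price: float) -> float:
    """Fuse five signals (each scaled to [-1, 1]) into a win probability."""
    # Layer 1 (assumed grouping): three alpha-weighted and two beta-weighted signals.
    head_r = sum(a * signals[k] for a, k in zip(seed.alphas, ("elo", "surface", "h2h")))
    head_s = sum(b * signals[k] for b, k in zip(seed.betas, ("momentum", "rest")))
    # Layer 2: weight the two heads, then squash to a probability.
    model_p = 1.0 / (1.0 + math.exp(-(seed.w_r * head_r + seed.w_s * head_s)))
    # rho (assumed role): mix the model's view with the market-implied price.
    return seed.rho * model_p + (1.0 - seed.rho) * market_price


def size_bet(seed: Seed, p: float, market_price: float,
             breath: float, liquidity: float) -> float:
    """Return a stake, or 0.0 when any sizing constraint rejects the bet."""
    if abs(p - market_price) < seed.min_confidence:
        return 0.0
    stake = min(seed.max_breath_risk * breath, seed.liquidity_cap * liquidity)
    return stake if stake >= seed.min_bet else 0.0


seed = Seed(w_r=1.0, w_s=0.5, alphas=(0.6, 0.3, 0.1), betas=(0.7, 0.3), rho=0.5,
            max_breath_risk=0.05, min_confidence=0.03, min_bet=1.0, liquidity_cap=0.1)
signals = {"momentum": 0.2, "elo": 0.8, "surface": 0.4, "h2h": -0.1, "rest": 0.0}
p = fuse(seed, signals, market_price=0.55)
stake = size_bet(seed, p, market_price=0.55, breath=100.0, liquidity=20.0)
```

Under these assumptions, a high min-confidence is what makes a seed selective: most markets fail the first check and never reach the stake calculation.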
05 · the lifecycle

It matures like a life

The agent grows through four phases — infancy to elder — each a page of this dashboard.

  1. backtest · infancy

    Sweep the universe to find the best seed the agent could be born with.

    done
  2. L5 survival · apprentice

    Thrown into a survival season — it dies, respawns, and learns across deaths.

    active
  3. mock bet · adult

    Paper-trades live odds with no capital at risk — the real market, for free.

    next
  4. livebet · elder

    Real money, real permadeath. Coming soon.

    soon
06 · the stakes

BREATH & permadeath

The agent has BREATH — a life meter. Settlement losses drain it; wins refresh it. When BREATH ≤ 0, the agent dies: a Tombstone NFT is minted, and it respawns fresh but keeps its learned weights. So it does not merely survive a single run — it learns to survive across deaths.

honest note

In the survival simulation, BREATH is driven purely by settlement PnL — call it settlement-loss survival. No funding rates, no gas drain; the only thing that can kill the agent is losing bets.
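The BREATH rules can be sketched as a tiny state machine. This is an illustration under stated assumptions: wins add their PnL to BREATH, losses are scaled by a loss multiplier, and the Tombstone mint is stood in for by appending to a list. The `Agent` class and its field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    weights: dict[str, float]
    initial_breath: float = 10.0  # illustrative value
    loss_multiplier: float = 1.0  # illustrative; above 1 makes losses hurt more
    breath: float = field(init=False, default=0.0)
    tombstones: list[int] = field(default_factory=list)

    def __post_init__(self) -> None:
        self.breath = self.initial_breath

    def settle(self, pnl: float) -> bool:
        """Apply one settled bet's PnL; return True if the agent died."""
        self.breath += (pnl * self.loss_multiplier) if pnl < 0 else pnl
        if self.breath > 0:
            return False
        # Death: record a tombstone (stand-in for the on-chain NFT mint),
        # then respawn with fresh BREATH while keeping the learned weights.
        self.tombstones.append(len(self.tombstones) + 1)
        self.breath = self.initial_breath
        return True


agent = Agent(weights={"w_r": 0.6}, initial_breath=10.0, loss_multiplier=2.0)
outcomes = [agent.settle(pnl) for pnl in (3.0, -4.0, -3.0)]
```

Only settlement PnL touches `breath` here, which matches the simulation's rule that losing bets are the only cause of death.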

07 · the learning

L5 + L6: reflect, learn, optimize

L5 · settlement self-learning

After each bet settles, a WeightUpdater nudges the fusion weights — an EMA — toward what actually worked. The agent tunes itself, one realized outcome at a time.

L6 · reflection-driven optimization

A real LLM — Gemini 3.5 Flash, with MiniMax as fallback — writes natural-language reflections on recent performance. A StrategyAdvisor turns those into concrete weight-change proposals that flow through an approval queue and get applied.
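The proposal-to-approval flow might look like the sketch below. The `Proposal` shape, the `ApprovalQueue` class, and the approval rule (known weight, bounded step size) are all hypothetical; the text does not say who or what approves a proposal.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    weight: str     # name of the weight to change
    delta: float    # proposed change
    rationale: str  # the reflection text that motivated it


class ApprovalQueue:
    def __init__(self, max_abs_delta: float = 0.1) -> None:
        self.pending: deque[Proposal] = deque()
        self.max_abs_delta = max_abs_delta  # assumed guardrail on step size

    def submit(self, proposal: Proposal) -> None:
        self.pending.append(proposal)

    def apply_approved(self, weights: dict[str, float]) -> dict[str, float]:
        """Drain the queue; apply proposals that pass the assumed checks."""
        updated = dict(weights)
        while self.pending:
            proposal = self.pending.popleft()
            known = proposal.weight in updated
            if known and abs(proposal.delta) <= self.max_abs_delta:
                updated[proposal.weight] += proposal.delta
        return updated


queue = ApprovalQueue(max_abs_delta=0.1)
queue.submit(Proposal("rho", 0.05, "model beat the market on clay"))
queue.submit(Proposal("w_s", 0.5, "too large a jump, should be rejected"))
new_weights = queue.apply_approved({"rho": 0.4, "w_s": 0.5})
```

Keeping the LLM's output behind a queue means a strange reflection can at worst produce a rejected proposal, never a direct write to the weights.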

reflect → learn → optimize

08 · the experiments

The parameters we tuned

  • the seed config

    The eight fusion parameters (w_r/w_s, three alphas, two betas, rho) plus the four sizing knobs: the agent's birth policy.

  • the survival calibration

    A deliberately fragile seed, a loss multiplier, and a low initial breath — so deaths actually occur and learning has something to rescue.

  • the reflection / advisor cadences

    How often the LLM reflects and how often the advisor proposes — the rhythm of the L6 loop.

Different parameter sets shape the agent's personality — a cautious-survivor versus an aggressive-earner — a story the dashboard can show across runs.
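The cadence knobs can be pictured as a simple schedule. This sketch assumes cadences are counted in settled bets, which the text does not specify; `Cadence`, `due`, and `schedule` are hypothetical names and the numbers are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cadence:
    reflect_every: int  # settled bets between LLM reflections (assumed unit)
    propose_every: int  # settled bets between advisor proposals (assumed unit)


def due(settled_bets: int, every: int) -> bool:
    """True when the running bet count lands on a multiple of the cadence."""
    return settled_bets > 0 and settled_bets % every == 0


def schedule(cadence: Cadence, total_bets: int) -> list[tuple[int, str]]:
    """List which L6 steps fire after each settled bet."""
    events = []
    for n in range(1, total_bets + 1):
        if due(n, cadence.reflect_every):
            events.append((n, "reflect"))
        if due(n, cadence.propose_every):
            events.append((n, "propose"))
    return events


events = schedule(Cadence(reflect_every=5, propose_every=10), total_bets=10)
```

Reflecting more often than proposing gives the advisor several reflections to draw on before it suggests a change.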

09 · the stack

What it's built on

Robinhood Chain (Arbitrum L2) · Polymarket · Sackmann tennis data · Gemini 3.5 Flash + MiniMax · Next.js dashboard