how it works
THE MECHANISM
a tennis-betting agent that learns, lives in phases, and can permanently die.
An agent that lives on-chain
Built for the Arbitrum Open House London hackathon, Autopoiesis is an autonomous AI agent that lives on Robinhood Chain, an Arbitrum L2. It bets on Polymarket tennis markets, learns from realized outcomes, matures through life-phases, and can permanently die: permadeath mints a Tombstone NFT. It is not a model in a notebook; it is an organism with a pulse, a bankroll, and a mortality.
Real markets, point-in-time
The price data is Polymarket tennis: 7,494 per-match cassettes, each a real CLOB intraday price ledger. Of those, 4,925 (65.7%) resolve to a settled outcome we can grade against; that resolved set is the universe. Layered on top is the Sackmann ATP/WTA dataset: Elo ratings, surface win-rates, head-to-head records, and rest/recency. Every signal is point-in-time correct: computed at entry time from data available at that moment, with no lookahead.
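As a minimal sketch of what point-in-time correctness means here, the snippet below computes a surface win-rate from only those matches that finished before the entry time. The `PastMatch` record and the `surface_win_rate` helper are hypothetical names for illustration, not the project's actual code.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class PastMatch:
    """Hypothetical record of one completed match for a single player."""
    played_at: datetime
    surface: str
    won: bool


def surface_win_rate(history: List[PastMatch], surface: str,
                     entry_time: datetime) -> Optional[float]:
    """Win-rate on one surface using only matches played before entry_time."""
    visible = [m for m in history
               if m.played_at < entry_time and m.surface == surface]
    if not visible:
        return None  # nothing known yet; the caller must pick a prior
    return sum(m.won for m in visible) / len(visible)


history = [
    PastMatch(datetime(2024, 5, 1), "clay", True),
    PastMatch(datetime(2024, 5, 8), "clay", False),
    PastMatch(datetime(2024, 6, 1), "clay", True),  # after entry: ignored
]
rate = surface_win_rate(history, "clay", datetime(2024, 5, 20))
```

The strict `<` comparison is the whole point: a match played after the entry time never leaks into the feature, even though it sits in the same history list.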
Five signal engines
Each market is read by five independent signal engines, then fused into one decision.
- 1. Market Momentum (CLOB price ledger)
Reads the intraday drift in the live order-book — where the money is moving as the match unfolds.
- 2. Elo / Ranking (Elo / ranking gap)
Pre-match favorite strength from Elo ratings: who the numbers say should win before a ball is struck.
- 3. Surface Form (surface win-rate)
Surface-specific form — clay, grass, hard. A player's win-rate is not one number; it is one per court.
- 4. Head-to-Head (h2h record)
Head-to-head history — some matchups defy the ratings because one player simply owns the other.
- 5. Rest & Recency (rest / recency)
Rest and recency — a fresh player against one three sets deep into a long week is a different bet.
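Two of these engines can be sketched in a few lines. The example assumes the standard 400-point logistic Elo formula and treats momentum as the simple net price change over the ledger; both are illustrative assumptions, and the function names are hypothetical rather than taken from the project.

```python
from typing import Sequence


def elo_win_probability(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def momentum_drift(prices: Sequence[float]) -> float:
    """Net intraday drift: last ledger price minus the first (assumed definition)."""
    if len(prices) < 2:
        return 0.0  # a single print carries no drift information
    return prices[-1] - prices[0]


even_match = elo_win_probability(1600, 1600)
favorite = elo_win_probability(2000, 1600)
drift = momentum_drift([0.40, 0.45, 0.55])
```

A 400-point rating gap gives the favorite roughly a 10-to-1 edge in expected score, which is why the Elo gap works as a pre-match strength signal independent of the order book.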
honest note
Under the hood the DecisionEngine still keys three of these slots by names from an earlier prediction-markets prototype — smart_money, sentiment_llm, crowd_volume — but there is no order-flow, social-sentiment, or betting-volume data behind them. Each one computes a real tennis feature instead: surface win-rate, head-to-head record, and rest. The labels above are what actually runs.
A 2-layer fusion engine
A 2-layer decision engine fuses the five signals via eight tunable weights: two head weights (w_r / w_s), three alpha weights, two beta weights, and a rho mixing parameter. It then sizes the bet under four constraints: max-breath-risk, min-confidence, min-bet, and a liquidity cap. The best seed found over the 4,925-market universe is selective and sharp.
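One way such a fusion could be wired up is sketched below. The source names the weights but not how they combine, so everything structural here is an assumption: three signals feed an "r" head through the alphas, two feed an "s" head through the betas, w_r / w_s blend the heads, and rho mixes the model's probability with the market price. The `Seed` class and `decide` function are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Seed:
    """Hypothetical container for the eight fusion weights and sizing knobs."""
    w_r: float
    w_s: float
    alphas: Tuple[float, float, float]
    betas: Tuple[float, float]
    rho: float
    max_breath_risk: float  # assumed: fraction of current BREATH risked per bet
    min_confidence: float
    min_bet: float


def decide(signals: Sequence[float], seed: Seed, market_price: float,
           breath: float, liquidity_cap: float) -> float:
    """Return the stake for one market, or 0.0 when any constraint blocks the bet."""
    if len(signals) != 5:
        raise ValueError("expected five signals in [-1, 1]")
    # Layer 1 (assumed grouping): two heads over disjoint signal subsets.
    head_r = sum(a * s for a, s in zip(seed.alphas, signals[:3]))
    head_s = sum(b * s for b, s in zip(seed.betas, signals[3:]))
    # Layer 2: blend the heads, clamp, and map to a probability.
    score = max(-1.0, min(1.0, seed.w_r * head_r + seed.w_s * head_s))
    model_prob = 0.5 + 0.5 * score
    # rho (assumed role): mix the model's view with the market's price.
    fused_prob = seed.rho * model_prob + (1.0 - seed.rho) * market_price
    confidence = abs(fused_prob - market_price)
    if confidence < seed.min_confidence:
        return 0.0
    stake = min(seed.max_breath_risk * breath, liquidity_cap)
    return stake if stake >= seed.min_bet else 0.0


seed = Seed(w_r=0.6, w_s=0.4, alphas=(0.5, 0.3, 0.2), betas=(0.5, 0.5),
            rho=0.5, max_breath_risk=0.1, min_confidence=0.05, min_bet=1.0)
strong = decide([0.8, 0.6, 0.4, 0.2, 0.0], seed, 0.5, breath=100.0, liquidity_cap=50.0)
weak = decide([0.05, 0.0, 0.0, 0.0, 0.0], seed, 0.5, breath=100.0, liquidity_cap=50.0)
```

With these illustrative numbers the strong signal set clears the confidence floor and is sized at 10% of BREATH, while the weak set is skipped entirely. That is the sense in which a high min-confidence makes a seed selective.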
It matures like a life
The agent grows through four phases — infancy to elder — each a page of this dashboard.
- 1. Backtest (infancy phase, status: done)
Sweep the universe to find the best seed the agent could be born with.
- 2. L5 survival (apprentice phase, status: active)
Thrown into a survival season — it dies, respawns, and learns across deaths.
- 3. Mock bet (adult phase, status: next)
Paper-trades live odds with no capital at risk — the real market, for free.
- 4. Live bet (elder phase, status: soon)
Real money, real permadeath. Coming soon.
BREATH & permadeath
The agent has BREATH — a life meter. Settlement losses drain it; wins refresh it. When BREATH ≤ 0, the agent dies: a Tombstone NFT is minted, and it respawns fresh but keeps its learned weights. So it does not merely survive a single run — it learns to survive across deaths.
honest note
In the survival simulation, BREATH is driven purely by settlement PnL — call it settlement-loss survival. No funding rates, no gas drain; the only thing that can kill the agent is losing bets.
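The BREATH rules above can be sketched as a small state machine. This is a hedged illustration: the `Agent` class, its field names, and the tombstone list standing in for the NFT mint are hypothetical, and the loss multiplier is assumed to scale only losing settlements.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Agent:
    """Hypothetical survival-simulation agent driven purely by settlement PnL."""
    weights: Dict[str, float]
    initial_breath: float
    breath: float = field(init=False)
    loss_multiplier: float = 1.0  # assumed to amplify losses only
    deaths: int = 0
    tombstones: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        self.breath = self.initial_breath

    def settle(self, pnl: float) -> bool:
        """Apply one settled bet; return True if the agent died on it."""
        if pnl < 0:
            self.breath += pnl * self.loss_multiplier  # losses drain BREATH
        else:
            self.breath += pnl  # wins refresh it
        if self.breath <= 0:
            self.deaths += 1
            self.tombstones.append(f"tombstone-{self.deaths}")  # stand-in for the NFT mint
            self.breath = self.initial_breath  # respawn fresh, weights untouched
            return True
        return False


agent = Agent(weights={"w_r": 0.6}, initial_breath=10.0, loss_multiplier=2.0)
results = [agent.settle(pnl) for pnl in (3.0, -4.0, -3.0)]
```

In this run the third settlement pushes BREATH from 5 to -1, so the agent dies, records a tombstone, and respawns at its initial BREATH with the same `weights` dictionary.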
L5 + L6: reflect, learn, optimize
After each bet settles, a WeightUpdater nudges the fusion weights toward what actually worked, using an exponential moving average (EMA). The agent tunes itself, one realized outcome at a time.
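A minimal sketch of an EMA-style nudge follows. The source does not say how "what actually worked" is turned into a target, so the `targets` dictionary is an assumption, as are the `ema_update` name and the default `rate`.

```python
from typing import Dict


def ema_update(weights: Dict[str, float], targets: Dict[str, float],
               rate: float = 0.1) -> Dict[str, float]:
    """Move each weight a fraction `rate` of the way toward its target.

    Weights with no target in this round are left unchanged.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must lie in [0, 1]")
    return {name: (1.0 - rate) * w + rate * targets.get(name, w)
            for name, w in weights.items()}


before = {"alpha_1": 0.5, "alpha_2": 0.5, "rho": 0.3}
# Hypothetical targets after one settled bet: alpha_1 helped, alpha_2 hurt.
after = ema_update(before, {"alpha_1": 1.0, "alpha_2": 0.0}, rate=0.2)
```

A small `rate` means any single outcome moves the weights only slightly, so the policy drifts toward signals that help repeatedly rather than reacting to one lucky or unlucky settlement.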
A real LLM — Gemini 3.5 Flash, with MiniMax as fallback — writes natural-language reflections on recent performance. A StrategyAdvisor turns those into concrete weight-change proposals that flow through an approval queue and get applied.
reflect → learn → optimize
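The approval step in that loop might look like the sketch below: proposals are queued, reviewed in order, and only approved ones change a weight. The `Proposal` and `ApprovalQueue` names and the `approve` callback are hypothetical; the actual StrategyAdvisor interface is not described in the text.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Dict


@dataclass(frozen=True)
class Proposal:
    """Hypothetical weight-change proposal derived from an LLM reflection."""
    weight: str
    new_value: float
    reason: str


class ApprovalQueue:
    """FIFO queue of proposals awaiting review (illustrative only)."""

    def __init__(self) -> None:
        self._pending: Deque[Proposal] = deque()

    def submit(self, proposal: Proposal) -> None:
        self._pending.append(proposal)

    def review(self, weights: Dict[str, float],
               approve: Callable[[Proposal], bool]) -> Dict[str, float]:
        """Drain the queue and return a copy of weights with approved changes applied."""
        updated = dict(weights)
        while self._pending:
            proposal = self._pending.popleft()
            # Unknown weight names are dropped rather than silently created.
            if proposal.weight in updated and approve(proposal):
                updated[proposal.weight] = proposal.new_value
        return updated


queue = ApprovalQueue()
queue.submit(Proposal("rho", 0.4, "market price was more reliable lately"))
queue.submit(Proposal("w_r", 5.0, "overconfident jump"))
# Assumed policy: approve only values inside [0, 1].
new_weights = queue.review({"rho": 0.3, "w_r": 0.6},
                           approve=lambda p: 0.0 <= p.new_value <= 1.0)
```

Keeping the approval policy as a callback separates what the advisor suggests from what is allowed to reach the live weights.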
The parameters we tuned
- the seed config
The eight fusion weights (w_r/w_s, three alphas, two betas, rho) plus the four sizing knobs: the agent's birth policy.
- the survival calibration
A deliberately fragile seed, a loss multiplier, and a low initial breath — so deaths actually occur and learning has something to rescue.
- the reflection / advisor cadences
How often the LLM reflects and how often the advisor proposes — the rhythm of the L6 loop.
Different parameter sets shape the agent's personality — a cautious-survivor versus an aggressive-earner — a story the dashboard can show across runs.