Can a Claude Code AI Agent CRUSH the Prediction Market? Let's Find Out
I taught Claude Code to trade Polymarket. Specifically the 5-minute Bitcoin up/down market. The whole thing runs through my browser-based automation system with Opus running the strategy. I had a lot of fun with this one, so today I want to share how I set it up, run an hour of live trading on camera, then test what happens when I tell the agent to "be more creative" with its strategy.
Spoiler: the gambling part of "be more creative" goes about as well as you'd expect.
Watch the video:
The Strategy
I built this around a "frontload the next window" idea. Instead of betting inside the current 5-minute window (where prices are heavily influenced by late-window noise), the agent waits and places bets at the very start of the next window, before the crowd repositions.
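The timing half of this is simple clock math: find the next 5-minute boundary and fire the bet right after it opens. Here is a minimal sketch of that calculation (my own illustration, not the skill file's actual code):

```python
import datetime as dt

WINDOW_SECONDS = 5 * 60  # Polymarket's 5-minute BTC up/down windows

def seconds_until_next_window(now: dt.datetime) -> float:
    """Seconds remaining until the next 5-minute boundary (:00, :05, :10, ...)."""
    elapsed = (now.minute * 60 + now.second + now.microsecond / 1e6) % WINDOW_SECONDS
    return WINDOW_SECONDS - elapsed

# Example: at 12:03:30 the next window opens in 90 seconds
t = dt.datetime(2024, 1, 1, 12, 3, 30)
print(seconds_until_next_window(t))  # 90.0
```

The agent sleeps on that countdown, then places its bet in the first seconds of the fresh window, before the crowd repositions.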
The signal stack has seven inputs:
- Price vs. target
- Binance websocket vs. target (live BTC price)
- Sidebar shift direction
- Consensus
- Momentum
- Short trend
- Crowd positioning (the psychological one — if there have been seven downs in a row, people pile into "must go up next")
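Conceptually the stack is a weighted vote. This is a hypothetical sketch of how the seven signals could combine into a trade decision; the names, weights, and threshold are illustrative assumptions, not the actual skill logic:

```python
# Illustrative only: each signal votes +1 (up), -1 (down), or 0 (no read).
SIGNALS = [
    "price_vs_target",
    "binance_ws_vs_target",
    "sidebar_shift",
    "consensus",
    "momentum",
    "short_trend",
    "crowd_positioning",
]

def decide(votes: dict) -> str:
    """Trade only when a clear majority of signals agree; otherwise sit out."""
    score = sum(votes.get(name, 0) for name in SIGNALS)
    if score >= 3:
        return "up"
    if score <= -3:
        return "down"
    return "skip"

print(decide({"price_vs_target": 1, "momentum": 1, "consensus": 1, "short_trend": 1}))  # up
```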
Is this a proven strategy? Absolutely not. It is something I cooked up to see if Claude Code could execute it cleanly. The point of the exercise is not edge — it is automation.
Running the Setup
The agent loaded my polymarket skill (stored under polymarket/SKILL.md), opened the Bitcoin 5-minute market in the browser, and started navigating between windows. This is the same browser automation pattern I covered in the Surfagent post — recon-first navigation through the page DOM.
I told it to do one hour of trading, with --dangerously-skip-permissions on so I would not have to approve each trade.
The Insane First Trade
The first bet was the wildest. The agent put $1 on "up" at the very last second of the window. The market flipped. We made $9 on a $1 bet. ~900% return on a single trade.
I have no idea how. The agent's reasoning was that the price had crashed below target — "up is going to lose" — but it placed the bet anyway because the signal stack flipped at the very end. Pure variance, but a great way to start a recording.
Steady-State Trading
For the next several windows the agent placed $1-$3 bets per window, sometimes splitting into multiple smaller bets. Most of these lost. The strategy is genuinely random — the signal stack does not have edge, and 5-minute Bitcoin price movement is mostly noise. After 30 minutes I was net positive thanks to that one outlier, but the actual hit rate was about 30%.
The interesting part wasn't the wins or losses; it was watching Claude Code reliably execute the trading flow: time the windows correctly, place bets, track results, claim winnings, repeat. Zero technical errors over an hour of automated browser interaction.
The "Get Creative" Mistake
Around the 1-hour mark I prompted:
"Take the learnings, create a new trading strategy with more risk for the next 30 minutes as a test. Think hard, be creative, use the data you have gathered."
The agent came back with a "Fade the Swing" strategy: bet against the current window's trend, on the theory that "the crowd piles into the obvious direction, odds get expensive, the reversal is free money." Aggressive sizing — start at $3, escalate to $5, then $8 if losing — with a "dead cat bounce" exception.
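The sizing ladder is a textbook loss-chasing escalation. A minimal reconstruction of it (my own illustration of the $3/$5/$8 ladder the agent described, not its actual code):

```python
# Illustrative "Fade the Swing" stake ladder: escalate on consecutive losses.
LADDER = [3, 5, 8]

def next_stake(consecutive_losses: int) -> int:
    """Start at $3, escalate to $5, then $8, then cap at $8."""
    return LADDER[min(consecutive_losses, len(LADDER) - 1)]

print([next_stake(n) for n in range(5)])  # [3, 5, 8, 8, 8]
```

Note the classic failure mode: the ladder raises exposure exactly when the strategy is already losing.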
This is straight gambling logic dressed up as a strategy. I let it run anyway because I wanted to see what would happen.
What happened: $3 bet won (+$2), $3 bet lost, $5 bet lost, $8 bet... barely won. Net basically flat after a wild swing. The lesson is that "more creative" without a real edge is just larger variance — same expected value, scarier outcomes. This is the same lesson I covered in the Polymarket autoresearch post where I switched to a proper iterative strategy search instead of human-prompted ideas.
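You can see the "same expected value, scarier outcomes" point with one line of arithmetic. Assuming a fair coin-flip market with even payout (a simplification purely for illustration), expected profit is zero at any stake, while the standard deviation scales linearly with it:

```python
# Fair even-payout bet: EV stays 0 at any size; only the swing size grows.
def stats(stake: float, p: float = 0.5):
    """Expected profit and standard deviation of profit for one even-payout bet."""
    ev = p * stake - (1 - p) * stake                  # win +stake, lose -stake
    var = p * stake**2 + (1 - p) * stake**2 - ev**2   # E[X^2] - E[X]^2
    return ev, var ** 0.5

for s in (1, 3, 8):
    ev, sd = stats(s)
    print(f"${s} bet: EV = {ev:+.2f}, stdev = {sd:.2f}")
```

Tripling the stake triples the swings and changes the expected value not at all; that is all "more risk" bought here.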
Final Tally
$37 won across the 90-minute session, mostly from the freak first trade. Claimed it, balance updated. Money was just for testing — what matters is the operational reliability.
What worked:
- The agent navigated Polymarket cleanly for 90 straight minutes
- Window timing, bet placement, claim flow — all autonomous, all reliable
- Same browser automation primitives as my other long-running tasks
What did not work:
- The strategy itself — no edge, just exposure
- "Be more creative" without a hypothesis is just gambling
What's Next
The fix for "no strategy" is the autoresearch pattern — let the agent iterate strategies against historical data, score them, keep what works. That's exactly what I did in the follow-up Karpathy autoresearch post, which is way more interesting than this one.
The reason I include this one in the channel is that it shows the upper limit of what manual prompting can do. The agent can execute anything you tell it. The question is what you tell it. For trading, that question is "what is your edge?" — and if you don't have an answer, no agent will save you.
Resources
- AMD Ryzen AI Pro — sponsor of this video, good for running local LLMs (GPT-OSS 20B at ~50 tok/s) when you need privacy or are offline.
- My GitHub — repos and code samples.
If you want me to share the Polymarket skill itself, give the video a like and I'll consider adding it to the skillsmd.store with the rest of the agent setups.