DeepSeek Beats GPT-5 in US Stock Trading: HKU's AI-Trader Experiment Reveals True AI Investment Capabilities

October 27, 2025 • Solar

DeepSeek Crushes GPT in US Stock Trading: An Unscripted AI Battle on Wall Street

I recently came across a fascinating experiment — a team at the University of Hong Kong created a project called "AI-Trader" that pitted several top AI models against each other in real US stock market trading. The results were quite surprising: DeepSeek took first place with a 9.68% return, leaving GPT, Claude, Gemini, and other big names in the dust.

How Did This Experiment Work?

Honestly, the design of this experiment was pretty hardcore. The research team gave five AI models — DeepSeek, GPT-5, Claude, Gemini, and Qwen — $10,000 each and let them loose in the NASDAQ 100 for nearly a month.

The key was the brutal game rules: the "three no's principle" — no trading scripts, no human intervention, no cheating. In other words, these AIs had to rely entirely on their own judgment to decide what to buy, what to sell, and when to trade.

This is completely different from the typical "AI stock trading" systems we usually see. Many so-called AI trading systems are actually just pre-programmed rules written by engineers, or "perfect" strategies backtested on historical data. But in this experiment, the AI truly made autonomous decisions with no guidance or teaching.

Why Did DeepSeek Win?

Here's how it all shook out:

  • 🥇 DeepSeek: +9.68%
  • 🥈 Claude-3.7: +2.17%
  • 🥉 GPT-5: +1.60%
  • QQQ Baseline: +1.22%
  • Qwen3-max: -0.75%
  • Gemini-2.5: -2.73%

DeepSeek's 9.68% return is quite impressive, especially since it was earned in less than a month. Annualized, the figure would look even more dramatic, though extrapolating a full year from a single month is shaky at best. Its strategy seems solid too: it primarily bought tech giants like NVDA (NVIDIA), AAPL (Apple), and MSFT (Microsoft), while also diversifying and dynamically rebalancing its portfolio.

Simply put, DeepSeek played the "prudent investing" game: select quality targets, diversify risk, and adjust based on market changes. This is completely different from how many retail investors YOLO into a single stock.

Claude came in second. While its 2.17% return was far below DeepSeek's, at least it was positive and the strategy was relatively stable. GPT-5 barely beat the QQQ baseline (1.22%) with 1.60%: not much, but it passed.
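
For a rough sense of what "annualizing" means here, and of how each model compares with the QQQ baseline, here is a minimal Python sketch. The period returns are the ones listed above. The 20-trading-day period is my own assumption (the article only says "nearly a month"), and 252 trading days per year is just the usual convention, so treat the annualized numbers as illustrative only.

```python
# Period returns reported in the results list, as decimal fractions.
RETURNS = {
    "DeepSeek": 0.0968,
    "Claude-3.7": 0.0217,
    "GPT-5": 0.0160,
    "QQQ Baseline": 0.0122,
    "Qwen3-max": -0.0075,
    "Gemini-2.5": -0.0273,
}

TRADING_DAYS_PER_YEAR = 252  # common convention for US equities
ASSUMED_PERIOD_DAYS = 20     # ASSUMPTION: "nearly a month" taken as ~20 trading days


def annualize(period_return: float, period_days: int = ASSUMED_PERIOD_DAYS) -> float:
    """Compound a short-period return up to a one-year figure."""
    return (1.0 + period_return) ** (TRADING_DAYS_PER_YEAR / period_days) - 1.0


def excess_over_baseline(name: str) -> float:
    """Return of a model minus the QQQ baseline over the same period."""
    return RETURNS[name] - RETURNS["QQQ Baseline"]


if __name__ == "__main__":
    for name, r in RETURNS.items():
        print(
            f"{name:13s} period {r:+.2%}  "
            f"excess {excess_over_baseline(name):+.2%}  "
            f"annualized {annualize(r):+.1%}"
        )
```

Compounding a single good month twelve-plus times produces an eye-catching number, which is exactly why it should not be read as a forecast. The excess-over-baseline column is the more honest comparison for such a short window.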

What Happened to Gemini?

The worst performer was Google's Gemini, which lost 2.73%. Even more shocking, it made 73 trades in a short period — that frequency is way too high!

This actually illustrates an important point: frequent trading doesn't equal high returns. Gemini probably tried to capture market opportunities by trading often, but without a clear strategy it ended up chasing rallies and selling on dips. Every trade has costs (commissions, slippage), and more trades mean higher costs. Plus, frequent trading makes it easy to get whipsawed by short-term volatility and miss the real trends.

Buffett's saying is spot on: "The stock market is a device for transferring money from the impatient to the patient." Gemini was the classic "impatient" player.

Alibaba's Qwen only made 22 trades and also ended with negative returns. Perhaps it was too conservative and missed some opportunities.
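
To make the "more trades mean higher costs" point concrete, here is a small Python sketch of cost drag. The trade counts (73 and 22) and the $10,000 starting capital come from the article. The average trade size and the per-trade cost rate are hypothetical values I picked for illustration, and I don't know what cost model, if any, the experiment actually applied.

```python
STARTING_CAPITAL = 10_000.0  # from the article: $10,000 per model

# ASSUMPTIONS for illustration only, not taken from the experiment:
ASSUMED_AVG_TRADE_VALUE = 2_000.0  # dollars moved per trade
ASSUMED_COST_RATE = 0.001          # 0.1% per trade (commission plus slippage)

TRADE_COUNTS = {"Gemini-2.5": 73, "Qwen3-max": 22}  # from the article


def cost_drag(
    num_trades: int,
    avg_trade_value: float = ASSUMED_AVG_TRADE_VALUE,
    cost_rate: float = ASSUMED_COST_RATE,
    starting_capital: float = STARTING_CAPITAL,
) -> float:
    """Total trading cost expressed as a fraction of starting capital."""
    return num_trades * avg_trade_value * cost_rate / starting_capital


if __name__ == "__main__":
    for name, trades in TRADE_COUNTS.items():
        print(f"{name}: {trades} trades, cost drag about {cost_drag(trades):.2%}")
```

Under these made-up parameters the drag scales linearly with the number of trades, so a model trading three times as often needs roughly three times the gross edge just to break even on costs.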

The Project Is Now Open Source

Good news: the AI-Trader project is now fully open source on GitHub under the MIT license. Anyone can download the code and deploy their own version to play around with.

Looking at the project features, it's quite comprehensive:

Historical Backtesting

You can select any historical time period and let the AI "travel back" to trade at that point. The system automatically filters out future information, ensuring the AI can only use data available at that time. This design is sound: it avoids the "look-ahead bias" problem, where a backtest looks better than it should because the strategy peeked at data it could not have had.
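
The idea of filtering out future information can be sketched in a few lines of Python. This is not the project's code: the `PriceBar` type, the `visible_as_of` function, the "DEMO" symbol, and the prices are all placeholders I made up to show the principle.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class PriceBar:
    """One daily closing price (hypothetical structure for this sketch)."""
    symbol: str
    day: date
    close: float


def visible_as_of(bars: List[PriceBar], as_of: date) -> List[PriceBar]:
    """Keep only bars dated strictly before the simulated 'today'.

    The comparison is strict because a day's close is not known
    while decisions for that same day are still being made.
    """
    return [bar for bar in bars if bar.day < as_of]


# Placeholder data, not real market prices.
BARS = [
    PriceBar("DEMO", date(2025, 10, 1), 100.0),
    PriceBar("DEMO", date(2025, 10, 2), 101.0),
    PriceBar("DEMO", date(2025, 10, 3), 102.0),
]

if __name__ == "__main__":
    for bar in visible_as_of(BARS, date(2025, 10, 3)):
        print(bar.day, bar.close)
```

The same cutoff would have to apply to news and search results as well as prices, otherwise the agent could still learn about the future through a side door.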

MCP Toolchain

The project uses a standardized tool system including:

  • Trading Tools: Buy/sell stocks, manage positions
  • Price Tools: Query real-time and historical prices
  • Search Tools: Get market information and news
  • Math Tools: Perform financial calculations and analysis

The AI completes all operations by calling these tools.
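
To picture what "completing operations by calling tools" looks like, here is a deliberately simplified Python sketch. It does not use the project's code or any MCP library: `Portfolio`, `make_tools`, `dispatch`, and the shape of the tool-call dictionary are all hypothetical names of mine, and the price is a placeholder.

```python
from typing import Callable, Dict


class Portfolio:
    """Tracks cash and share counts for a simulated account."""

    def __init__(self, cash: float) -> None:
        self.cash = cash
        self.positions: Dict[str, int] = {}

    def buy(self, symbol: str, shares: int, price: float) -> str:
        cost = shares * price
        if cost > self.cash:
            return "rejected: insufficient cash"
        self.cash -= cost
        self.positions[symbol] = self.positions.get(symbol, 0) + shares
        return "ok"

    def sell(self, symbol: str, shares: int, price: float) -> str:
        if self.positions.get(symbol, 0) < shares:
            return "rejected: insufficient shares"
        self.positions[symbol] -= shares
        self.cash += shares * price
        return "ok"


def make_tools(portfolio: Portfolio) -> Dict[str, Callable[..., str]]:
    """Map tool names to the functions that carry them out."""
    return {"buy": portfolio.buy, "sell": portfolio.sell}


def dispatch(tools: Dict[str, Callable[..., str]], call: dict) -> str:
    """Run one tool call of the form {"tool": name, "args": {...}}."""
    handler = tools.get(call["tool"])
    if handler is None:
        return f"error: unknown tool {call['tool']!r}"
    return handler(**call["args"])


if __name__ == "__main__":
    account = Portfolio(cash=10_000.0)
    tools = make_tools(account)
    call = {"tool": "buy", "args": {"symbol": "DEMO", "shares": 10, "price": 100.0}}
    print(dispatch(tools, call), account.cash, account.positions)
```

The point of routing everything through named tools is that the model never touches the account directly: every action is validated by ordinary code, which is also what makes the "no cheating" rule enforceable.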

Multi-Market Extension

While currently focused on NASDAQ 100, the architecture supports extension to other markets. Theoretically, it could be adapted to A-shares, Hong Kong stocks, or even cryptocurrency markets.

Custom AI Agents

If you know how to code, you can implement your own trading strategy and pit it against these large models. Who knows, your strategy might win!
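
As a taste of what a hand-written strategy might look like, here is a tiny equal-weight rebalancer in Python. It is only a sketch: the function name, its inputs, and the symbols and prices are hypothetical, and the project's real agent interface will differ, so check the repository before building on this.

```python
from typing import Dict


def equal_weight_orders(
    cash: float,
    positions: Dict[str, int],
    prices: Dict[str, float],
) -> Dict[str, int]:
    """Return share changes that move the portfolio toward equal weights.

    Positive numbers mean buy, negative numbers mean sell. Only whole
    shares are traded, so a little cash is usually left over.
    """
    holdings_value = sum(positions.get(sym, 0) * px for sym, px in prices.items())
    total_value = cash + holdings_value
    target_value = total_value / len(prices)

    orders: Dict[str, int] = {}
    for symbol, price in prices.items():
        target_shares = int(target_value // price)
        delta = target_shares - positions.get(symbol, 0)
        if delta != 0:
            orders[symbol] = delta
    return orders


if __name__ == "__main__":
    # Placeholder symbols and prices, not real quotes.
    print(equal_weight_orders(10_000.0, {}, {"AAA": 100.0, "BBB": 50.0}))
```

A rule this simple makes a useful yardstick: if a large model can't beat "split the money evenly and rebalance", its trades are not adding much.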

Why Are Financial Markets a Good Place to Test AI?

The research team says financial markets are the "ultimate litmus test" for AI intelligence. This statement isn't an exaggeration.

First, markets are full of uncertainty. Stock prices are affected by countless factors like news, policies, sentiment, and international situations, and these factors are interconnected and influence each other. For AI to make money in this environment, it must have powerful information processing and prediction capabilities.

Second, markets provide real-time feedback. Every decision you make is immediately reflected in your account balance. Unlike writing code or articles where it might take a while to know the outcome, the market tells you right away: was this decision right or wrong?

Third, short-term trading is close to a zero-sum game. Measured against the market as a whole, whatever you earn above the benchmark is basically someone else's shortfall. This means AI not only needs to understand market patterns but also predict what other participants (including other AIs) will do, then find the optimal strategy in this game.

Finally, markets have "reflexivity". The AI's trading behavior itself affects market prices, which in turn changes future market conditions. This requires multi-layered reasoning from the AI: "If I do this, how will it affect the market? How will others react?"

Some Reflections

1. Chinese AI Is Already Ahead in Some Areas

DeepSeek, as a Chinese open-source model, beat the closed-source models from Silicon Valley giants like OpenAI, Google, and Anthropic in this experiment. This suggests that for specific tasks, Chinese AI technology is already on par with or even ahead of its American counterparts.

Of course, this is just one experiment and doesn't tell the whole story. But it at least suggests that technical approach and algorithm design sometimes matter more than just throwing money at the problem.

2. Benchmark Tests Don't Tell the Whole Story

Many AI models score high on academic leaderboards but fall flat when it comes to real-world applications. Financial trading scenarios with real money on the line are the true litmus test.

In the future, evaluating whether a model is good might not just be about its scores on tests like MMLU or HumanEval, but also about its performance in real-world scenarios.

3. AI Investing Can't Completely Replace Humans Yet

While DeepSeek performed well, we need to stay sober about one thing: a month is far too short an experiment. There's a saying in the stock market: "Even fools make money in a bull market." Whether AI can maintain stable profitability during bear markets, volatile markets, or black swan events still needs longer-term verification.

Moreover, AI currently lacks deep understanding of macroeconomics, policy changes, and sudden events. This "soft" information often requires human experience and judgment.

So I think the future is more likely to be a "human-AI collaboration" model: AI handles massive data processing, pattern discovery, and trade execution, while humans handle strategic decisions, risk control, and responding to black swans.

4. Can Ordinary People Use This?

Theoretically, AI-Trader is now open source, so you can download it and play with it yourself. But in practice, there are still barriers:

  • Need some programming knowledge
  • Need to apply for various API keys (OpenAI, Alpha Vantage, Jina AI, etc.)
  • Need some stock investing knowledge
  • Most importantly: even if the AI makes money, it doesn't guarantee you will

Investing involves risk, and caution is needed when entering the market. This will always be true.

Final Thoughts

The most interesting thing about this experiment isn't that DeepSeek won, but what it shows us: AI can now make decisions autonomously in complex financial markets and achieve positive returns. This was unimaginable just a few years ago.

But we also need to be cautious: if more and more AIs participate in market trading, will it lead to homogeneous market behavior? Will it trigger new systemic risks? These questions deserve careful thought.

For ordinary investors, AI tools can serve as assistance but shouldn't be completely relied upon. Ultimately, you still need to establish your own investment philosophy, control risks well, and be psychologically prepared.

After all, the market never lacks new technologies or new concepts, but those who make money are always those who stay rational and stick to their principles.


Project URL: https://github.com/HKUDS/AI-Trader

License: MIT License