Uncovering DeepSeek: Why does DeepSeek prefer young people without work experience?
Without a work history, how does DeepSeek select candidates? The answer: by looking at potential.
Author: Sam Gao, Author of ElizaOS
0. Introduction
Recently, with the emergence of DeepSeek V3 and R1, American AI researchers, entrepreneurs, and investors have been gripped by FOMO. The frenzy is as startling as the launch of ChatGPT at the end of 2022.
Thanks to DeepSeek R1's fully open-source release (the model can be downloaded from HuggingFace for local inference) and its extremely low price (1/100 of OpenAI's o1), DeepSeek shot to the top of the U.S. Apple App Store in just five days.
So, where does this mysterious new AI force, incubated by a Chinese quantitative company, come from?
1. The Origin of DeepSeek
I first heard about DeepSeek in 2021, when Luo Fuli, a prodigy on a neighboring team at Alibaba's DAMO Academy and a Peking University master's graduate who had published eight papers in a single year at ACL (the top conference in natural language processing), left to join High-Flyer Quant. Everyone wondered why such a profitable quantitative firm would recruit AI talent: did High-Flyer need to publish papers too?
At that time, as far as I knew, most of the AI researchers High-Flyer recruited were independently exploring frontier directions, with a core focus on large language models (LLMs) and text-to-image models (OpenAI's DALL-E at the time).
Fast forward to the end of 2022: High-Flyer gradually attracted more and more top AI talent (mostly Tsinghua and Peking University graduates). Spurred by ChatGPT, High-Flyer's CEO Liang Wenfeng resolved to enter the field of general artificial intelligence: "We have set up a new company, starting with large language models; vision models will follow."
Yes, that company is DeepSeek, which in early 2023 gradually stepped into the spotlight alongside the "six little dragons" represented by Zhipu, Moonshot AI, and Baichuan Intelligence. In the bustling districts of Zhongguancun and Wudaokou, DeepSeek's presence was largely overshadowed by those companies, flush with hot money and media attention.
Therefore, in 2023, as a pure research outfit without star founders (unlike Kai-Fu Lee's 01.AI, Yang Zhilin's Moonshot AI, or Wang Xiaochuan's Baichuan Intelligence), DeepSeek found it difficult to raise money on the open market: most of its researchers were recent PhDs rather than well-known senior names, and any capital exit lay far in the future. In that overheated 2023, no venture firm would fund it, so High-Flyer decided to spin DeepSeek off and bankroll its development entirely.
In this noisy and restless environment, DeepSeek began to write its own stories in AI exploration:
November 2023: DeepSeek launched DeepSeek LLM, with up to 67 billion parameters and performance approaching GPT-4.
May 2024: DeepSeek-V2 officially went live.
December 2024: DeepSeek-V3 was released, with benchmark tests showing its performance surpassing Llama 3.1 and Qwen 2.5, while being comparable to GPT-4o and Claude 3.5 Sonnet, igniting industry attention.
January 2025: DeepSeek-R1, its first-generation reasoning model, was released at less than 1/100 the price of OpenAI's o1 with outstanding performance, sending shockwaves through the global tech community: the world truly realized that Chinese power had arrived… Open source always wins!
2. Talent Strategy
I got to know some DeepSeek researchers early on, mainly those working on AIGC, such as the authors of Janus (released in November 2024) and DreamCraft3D, including @xingchaoliu, who helped me polish my latest paper.
From my observations, most of the researchers I know are very young, primarily doctoral students or those who graduated within the last three years.
Most of these individuals are graduate or doctoral students studying in the Beijing area, with strong academic backgrounds: many have published 3-5 papers at top conferences.
I asked a friend at DeepSeek why Liang Wenfeng only recruits young people.
They relayed Liang Wenfeng's remarks, along with the media coverage around them:
The mysterious veil over the DeepSeek team makes people curious: what is its secret weapon? Foreign media say this secret weapon is "young geniuses" capable of competing with the financially powerful American giants.
In the AI industry, hiring experienced veterans is the norm, and many local Chinese AI startups prefer to recruit senior researchers or those with overseas PhDs. However, DeepSeek goes against the grain, favoring young people without work experience.
A headhunter who has worked with DeepSeek revealed that DeepSeek does not hire senior technical personnel: "3-5 years of work experience is the maximum; those with over 8 years are basically passed over." Liang Wenfeng likewise said in a May 2023 interview with 36Kr that most of DeepSeek's developers are either fresh graduates or just starting out in artificial intelligence, emphasizing: "Most of our core technical positions are held by fresh graduates or people with one or two years of work experience."
Without work experience, how does DeepSeek select its candidates? The answer is, looking at potential.
Liang Wenfeng once said, "In doing something long-term, experience is not that important; compared to that, foundational abilities, creativity, and passion are more important." He believes that perhaps the top 50 AI talents in the world are not currently in China, "but we can cultivate such talents ourselves."
This strategy recalls OpenAI's early approach. When OpenAI was founded at the end of 2015, Sam Altman's core idea was to find young, ambitious researchers. Aside from President Greg Brockman and Chief Scientist Ilya Sutskever, the other four core founding technical members (Andrej Karpathy, Durk Kingma, John Schulman, Wojciech Zaremba) were all fresh PhDs from Stanford University, the University of Amsterdam, UC Berkeley, and New York University.
From left to right: Ilya Sutskever (former Chief Scientist), Greg Brockman (former President), Andrej Karpathy (former Technical Lead), Durk Kingma (former Researcher), John Schulman (former Reinforcement Learning Team Lead), and Wojciech Zaremba (current Technical Lead)
This "young wolf" strategy paid off handsomely for OpenAI, incubating talents such as Alec Radford, father of GPT and a graduate of an obscure private college; Aditya Ramesh, father of DALL-E and an NYU undergraduate; and Prafulla Dhariwal, the multimodal lead for GPT-4o and a three-time Olympiad gold medalist. It let OpenAI, which began with a vague save-the-world mission, carve out a path on its team's youthful vigor and grow from an unknown entity beside DeepMind into a giant.
Liang Wenfeng saw the success of Sam Altman's strategy and committed to the same path. But where OpenAI waited seven years for ChatGPT, Liang Wenfeng's bet paid off in just over two, at "China speed."
3. Speaking for DeepSeek
The DeepSeek R1 report shows astonishingly strong metrics, but it has also drawn skepticism on two points:
① Its mixture-of-experts (MoE) architecture is demanding to train and hungry for data, which raises legitimate questions about whether DeepSeek trained on OpenAI outputs.
② Its reinforcement learning (RL) pipeline is hardware-intensive, yet compared with Meta's and OpenAI's massive clusters, DeepSeek's training reportedly used only 2,048 H800 GPUs.
Given those compute limits and the complexity of MoE, pulling off DeepSeek R1 on a claimed ~$5 million budget strikes some as suspicious. But whether you admire its "low-cost miracle" or dismiss it as "flashy but impractical," its functional innovations are impossible to ignore.
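For readers unfamiliar with why MoE matters to the cost debate: an MoE layer activates only a few "expert" sub-networks per token, so a model can have an enormous total parameter count while spending modest compute on each token. Below is a toy top-k routing sketch in NumPy; it is a generic illustration with made-up dimensions, not DeepSeek's actual architecture, and every name in it is illustrative.

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Toy mixture-of-experts layer: route each token to its top-k experts.

    x       : (tokens, d) input activations
    gate_w  : (d, n_experts) router weights
    experts : list of (d, d) weight matrices, one per expert
    Only k of n_experts run per token -- the reason MoE models can carry
    huge total parameter counts at modest per-token compute.
    """
    logits = x @ gate_w                           # (tokens, n_experts)
    top = np.argsort(logits, axis=1)[:, -k:]      # indices of each token's top-k experts
    sel = np.take_along_axis(logits, top, axis=1) # their logits
    w = np.exp(sel - sel.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)             # softmax over selected experts only
    out = np.zeros_like(x)
    for t in range(x.shape[0]):                   # dispatch token t to its experts
        for j in range(k):
            e = top[t, j]
            out[t] += w[t, j] * (x[t] @ experts[e])
    return out

rng = np.random.default_rng(0)
d, n_experts, tokens = 8, 4, 3
x = rng.normal(size=(tokens, d))
gate_w = rng.normal(size=(d, n_experts))
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]
y = moe_forward(x, gate_w, experts, k=2)
print(y.shape)  # (3, 8)
```

With k=2 of 4 experts active, each token touches half the expert parameters; production MoE models push this ratio much further, which is also why training them well (load-balancing the router, feeding enough data) is the hard part the skeptics point to.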
BitMEX co-founder Arthur Hayes asked: Will the rise of DeepSeek lead global investors to question American exceptionalism? Are American assets severely overvalued?
Stanford professor Andrew Ng publicly stated at this year's Davos Forum: "I am impressed by DeepSeek's progress. I believe they can train models in a very economical way. Their latest released reasoning model is outstanding… 'Keep it up'!"
A16z founder, Marc Andreessen stated, "DeepSeek R1 is one of the most astonishing and impressive breakthroughs I have ever seen—and as open source, it is a profound gift to the world."
DeepSeek, which stood at the edge of the stage in 2023, finally reached the pinnacle of global AI just before the 2025 Lunar New Year.
4. Argo and DeepSeek
As a technical developer of Argo and an AIGC researcher, I have "DeepSeek-ified" Argo's key functions. Argo is a workflow system, and its rough first-pass workflow generation is now done with DeepSeek R1. Argo has also adopted DeepSeek R1 as its standard LLM and dropped the expensive closed-source OpenAI models. The reason: workflow systems burn through tokens and carry heavy context (averaging >=10k tokens per run), so execution costs balloon on pricey OpenAI or Claude 3.5 models. Front-loading that spend hurts the product before web3 users capture any real value.
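The cost argument is simple arithmetic. A minimal sketch, using illustrative per-million-token prices (placeholders I chose for the example, not quotes; check each provider's current pricing page):

```python
# Back-of-the-envelope cost of one workflow run consuming ~10k input tokens.
# All prices below are ILLUSTRATIVE placeholders in USD per 1M input tokens.
PRICES_PER_M_INPUT_TOKENS = {
    "deepseek-r1":       0.55,
    "o1":                15.00,
    "claude-3.5-sonnet": 3.00,
}

def run_cost(tokens: int, price_per_m: float) -> float:
    """Dollar cost of a single run at the given per-million-token price."""
    return tokens / 1_000_000 * price_per_m

for model, price in PRICES_PER_M_INPUT_TOKENS.items():
    print(f"{model:>20}: ${run_cost(10_000, price):.4f} per 10k-token run")
```

Even at these rough numbers, a workflow engine that fires thousands of 10k-token runs a day sees an order-of-magnitude gap between providers, which is the whole economics behind the switch.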
As DeepSeek continues to improve, Argo will collaborate more closely with the Chinese power represented by DeepSeek: including but not limited to the localization of Text2Image/Video interfaces and the Chinese adaptation of LLM.
In terms of collaboration, Argo will invite DeepSeek researchers to share their technical achievements in the future and provide grants for top AI researchers, helping web3 investors and users understand AI advancements.
Disclaimer: The content of this article solely reflects the author's opinion and does not represent the platform in any capacity. This article is not intended to serve as a reference for making investment decisions.