
Interview with DeepSeek founder Liang Wenfeng

Conducted in July 2024, shortly after the company rose to fame with its open-source V2 model, this rare interview with DeepSeek founder Liang Wenfeng shows how a Chinese startup dares to challenge the giants and redefine the rules of innovation.

How was the first shot fired in the price war?

Interviewer : After the release of the DeepSeek V2 model, a fierce price war quickly broke out in the large AI model industry. Some consider you market disruptors.

Liang Wenfeng (Founder of DeepSeek) : We never intended to be a disruptor – it just happened that way.

 

Interviewer : Did this result surprise you?

Liang Wenfeng : Yes, very much. We never thought that pricing would be such a sensitive issue. We simply calculated our costs and set a reasonable pricing strategy – without making losses, but also without pursuing excessive profits. Our current pricing only includes a small profit margin above costs.

 

Interviewer : Five days later, Zhipu AI also cut prices, and ByteDance, Alibaba, Baidu and Tencent then joined the price war.

Liang Wenfeng : Zhipu AI only lowered the prices of entry-level products, while their flagship models remain expensive. The first company to really match our flagship pricing was ByteDance – and that put pressure on the other companies. Since the costs of large models at the big companies are much higher than ours, we never expected that anyone would be willing to operate at a loss. But in the end the market returned to the subsidy-competition logic of the Internet era.

 

Interviewer : From the outside, price reduction seems to be a typical competitive strategy of the Internet era, with the aim of attracting users.

Liang Wenfeng : Attracting users was not our main goal. There were two reasons for our price reduction: First, our costs fell as a result of our research into new model architectures. Second, we believe that AI and API services should be affordable and available to everyone at any time.

Interviewer : Previously, most Chinese companies developed AI applications by simply copying LLaMA’s architecture. Why did you decide to focus on the model architecture itself?

Liang Wenfeng : If the goal is to develop applications, then it makes sense to use LLaMA as a foundation to bring products to market quickly. However, our goal is AGI (Artificial General Intelligence), and that requires exploring new model architectures to achieve stronger intelligence with limited resources. This is the foundation for scalable progress. Beyond architecture, we have also worked hard on data filtering and human-like reasoning – all of which are reflected in our model. Moreover, LLaMA’s training efficiency and inference cost are about two generations behind the world’s leading standards.

 

Interviewer : What exactly is this two-generation technological gap?

Liang Wenfeng : First, in training efficiency. We estimate that, because of differences in architecture and training strategies, the best models in China currently need about twice the computing power to reach the level of the world’s best models. Second, the data efficiency of Chinese models is only about half that of the global top models. This means that you need twice as much data and computing power to achieve the same results. These two factors multiply, so total resource consumption is about four times higher. Our goal is to continuously narrow this gap.
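
Editorial illustration: the arithmetic in this answer can be sketched in a few lines. The factor names and the function below are hypothetical labels introduced here for illustration, and the values are only the rough estimates quoted in the interview. The sketch assumes the two gaps are independent and therefore multiply.

```python
# Illustrative sketch only: names are assumptions, values are the rough
# estimates quoted in the interview, not measured figures.
TRAINING_EFFICIENCY_GAP = 2.0  # ~2x compute needed (architecture, training strategy)
DATA_EFFICIENCY_GAP = 2.0      # ~2x data (and compute to process it) needed


def total_resource_multiplier(training_gap: float, data_gap: float) -> float:
    """Combine two independent inefficiency factors multiplicatively."""
    return training_gap * data_gap


multiplier = total_resource_multiplier(TRAINING_EFFICIENCY_GAP, DATA_EFFICIENCY_GAP)
print(f"Estimated total resource multiplier: {multiplier:.0f}x")
```

Under these assumptions, halving either gap would halve the combined multiplier, which is why progress on architecture and on data efficiency compounds.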

 

Interviewer : Most Chinese companies focus on both model and application development. Why does DeepSeek focus exclusively on research?

Liang Wenfeng : Because we believe that the most important thing at present is to actively contribute to global technological innovation. Chinese companies have long relied on using foreign technological innovations and monetizing them at the application level – but this model is not sustainable. Our goal is not short-term profit, but to promote cutting-edge technological research to strengthen the entire ecosystem from the ground up.

 

Interviewer : In the Internet and mobile Internet era, the general perception was that the United States was the originator of innovations, while China was the leader in the application of those innovations.

Liang Wenfeng : With economic progress, China must evolve from a technology user to a technology contributor, rather than continuing to rely on foreign achievements. During the IT revolution of the past 30 years, we have hardly participated in groundbreaking innovations. We have become accustomed to Moore’s Law “falling from the sky” – that we simply have to wait 18 months to get better hardware and software. The same is true of the “scaling law” of large models. In reality, these advances are the result of decades of continuous efforts by Western scientists. Because we have not participated deeply enough in this process for a long time, we underestimate its true value.

 

The real gap lies in originality, not just in time

Interviewer : Why did DeepSeek V2 surprise so many in Silicon Valley?

Liang Wenfeng : In the US, innovations happen every day, so from that perspective, our breakthrough is not unusual. But what surprised them is that a Chinese company is not just a copycat, but is directly competing as an innovator. This is completely different from the pattern that most Chinese companies are used to.

 

Interviewer : However, in the Chinese reality, relying solely on innovation seems to be a luxury. Developing large AI models is extremely expensive, and not every company can afford to focus only on research before commercialization. 

Liang Wenfeng : Innovation is of course expensive, and in the past we often relied on existing technologies, mainly because of China’s level of development. But today China’s economy is of global scale, and companies like ByteDance or Tencent are highly profitable. Our biggest deficit is not capital, but confidence – and the ability to effectively organize high-caliber talent and drive innovation.

 

Interviewer : Why do even well-capitalized Chinese tech giants often place more emphasis on rapid commercialization than on innovation?

Liang Wenfeng : Over the past 30 years, we have become more focused on profit than innovation. But innovation is not only driven by commercial interests – it requires curiosity and the ambition to create something new. We are still bound by old ways of thinking, but this is just a phase.

 

Interviewer : DeepSeek is ultimately a company, not a non-profit research organization. If you innovate and publish groundbreaking results as open source, as you did in May with the MLA architecture, your competitors could simply copy it. So where is your competitive advantage?

Liang Wenfeng : In disruptive technology fields, closed protection mechanisms are not sustainable. Even OpenAI’s closed source model could not prevent other companies from catching up. Our real line of defense is growing our team – accumulating technological know-how and fostering a culture of innovation. Open source and publishing research results do not put us at a decisive disadvantage. For engineers, it is an achievement when their work is recognized by the industry. Open source is not just a business strategy, it is a culture. Supporting the community is an honor, and at the same time helps us attract more talent.

 

Interviewer: What do you think of market-oriented thinking, such as that of Zhu Xiaohu (who argues that AI companies should focus primarily on commercialization rather than doing basic research and thinks AGI is unrealistic)?

Liang Wenfeng: Zhu’s logic works for short-term profitable projects. But the most profitable companies in the US are often those that have built technology barriers through long-term research.

 

Interviewer : But in AI, mere technological advantage is not enough. What is DeepSeek banking on in the long term?

Liang Wenfeng : We believe that China’s AI cannot be a laggard forever. It is often said that China’s AI is one to two years behind America’s, but the real difference is “originality” versus “imitation”. If we don’t change that, China will always be lagging behind rather than charting its own course. Some technological exploration is inevitable. Nvidia’s success was not just the result of their own efforts, but of long-term collaboration within the Western technology ecosystem to plan the next generation of technologies. China needs a similar ecosystem. Many Chinese chip projects failed not because of a lack of funding, but because they lacked a supportive technological community. They relied on secondary information rather than being truly at the forefront.

 

More capital ≠ more innovation

Interviewer : DeepSeek currently seems like OpenAI in its early idealistic phase – and you stick to open source. Will you move to a closed source model in the future, like OpenAI or Mistral? 

Liang Wenfeng : No, we will not switch to closed source. We believe that building a strong technology ecosystem is more important than a closed business model.

Interviewer : Are there any funding plans? There have been reports that Huanfang is planning to spin off DeepSeek for an IPO. In Silicon Valley, AI startups often end up allying with large companies – will you follow this trend? 

Liang Wenfeng : There are currently no short-term financing plans. Our biggest challenge has never been capital, but the export bans on high-performance chips.

 

Interviewer : Many believe that the development of AGI requires high visibility and industry presence, unlike discreet businesses such as quantitative trading. Do you agree?

Liang Wenfeng : More investment does not automatically mean more innovation. If capital alone could bring about technological breakthroughs, large companies would have dominated everything long ago.

 

Interviewer : DeepSeek does not develop applications – is this due to a lack of operational competence? 

Liang Wenfeng : We believe that we are currently in a phase of technological innovation, not an application explosion phase. In the long term, we want to build an ecosystem where companies use our technology directly and develop B2B or B2C services based on it. We focus on basic research. If the ecosystem works well, we don’t need to develop applications ourselves. Of course, we could do it if necessary, but research and innovation remain our top priority.

Interviewer : Why should customers choose DeepSeek’s API rather than larger providers?

Liang Wenfeng : The future world will most likely be one that relies heavily on division of labor and collaboration. Continuous innovation of fundamental AI models is critical, but large companies have their own limitations and are not necessarily best suited to take on this role.

 

Interviewer : But is technology alone really enough to create a sufficiently large competitive advantage? You yourself said that there are no absolute “secrets”.

Liang Wenfeng : There are no secrets, but imitation takes time and costs. There is nothing mysterious about Nvidia’s GPUs, but to match them you would have to rebuild a team and catch up with the next generation of technologies – that’s the real barrier.

 

Interviewer : After you lowered your prices, ByteDance was the first company to follow suit – this shows that they felt the competitive pressure. How do you see the new competitive situation between start-ups and large companies?

Liang Wenfeng: To be honest, we don’t really care. The price reduction was a casual decision. Providing cloud services is not our core business – our goal is AGI (Artificial General Intelligence). At the moment, we don’t see any truly groundbreaking solutions. Large companies have users, but at the same time their “cash cow” businesses limit them and open up opportunities for start-ups to overtake them.

 

Interviewer: What do you think about the future of the six most important Chinese AI start-ups?

Liang Wenfeng : There will probably be two or three companies left in the end. All of them are currently burning money, but the ones that survive are certainly those with a clear strategy and strong execution. The others may change direction – their value will not disappear, but will continue in a different form.

DeepSeek V2: Completely developed by local talent

Interviewer : Jack Clark, former policy director at OpenAI and co-founder of Anthropic, said that DeepSeek attracted a group of “elusive geniuses” who developed DeepSeek V2. What sets these people apart?

Liang Wenfeng : There are actually no “elusive geniuses”. They are simply recent graduates from top universities, PhD students (some still interning in their fourth or fifth year), and some young people with a few years of work experience.

 

Interviewer : Many large AI companies recruit top talent from around the world. Some say it’s unlikely that the world’s top 50 AI scientists will work for a Chinese company. Where does your team come from?

Liang Wenfeng : DeepSeek V2 was developed entirely by local talent. The world’s top 50 AI researchers may not currently be working in China, but we hope to cultivate a team of that level ourselves.

 

Interviewer : How did the innovation of MLA architecture come about? I heard that it was originally the personal interest of a young researcher.

Liang Wenfeng : After summarizing the main evolutionary principles of mainstream attention architectures, he suddenly had an inspiration and developed an alternative solution. But it was a long road from the idea to the implementation. We assembled a team and spent several months testing the feasibility.

 

Interviewer : This kind of spontaneous innovation seems to be related to your flat organizational structure. At Huanfang, you avoid a top-down management structure. But AGI is a highly uncertain and complex research field – don’t you sometimes intervene to take control?

Liang Wenfeng : DeepSeek remains completely bottom-up. We don’t predefine roles – the division of labor happens organically. Everyone brings their experience and ideas to the table without anyone having to push. When challenges arise, people bring in other colleagues on their own. However, we invest resources from the management level as soon as an idea shows potential.

 

Interviewer : We hear that DeepSeek is extremely flexible in its use of computing resources and personnel.

Liang Wenfeng : We have no restrictions on the use of computing power or team members. If someone has an idea, they can use our training clusters at any time without asking for permission. In addition, we have no strict hierarchies or departmental barriers – if team members are interested in the same topic, they can easily collaborate.

 

Interviewer : This flexible way of working requires highly motivated talent. It is said that DeepSeek identifies exceptional people based on unconventional criteria.

Liang Wenfeng : Our hiring criteria are always based on passion and curiosity. Our team is very diverse and unique – the members are more interested in research than money.

Innovation: Start-ups vs. large AI labs

Interviewer : Transformer was developed in the Google AI Lab, ChatGPT comes from OpenAI. What do you see as the differences between the innovative power of large AI labs and start-ups?

Liang Wenfeng : Whether it’s Google Research, OpenAI or the AI labs of large Chinese tech companies – they have all made important contributions. OpenAI’s breakthrough was also a bit of a historical coincidence.

 

Interviewer : So do you think innovation is mostly a matter of luck? Your office design allows for chance encounters – similar to the creation of Transformer, when a researcher overheard a discussion and developed the idea further.

Liang Wenfeng : Innovation is first and foremost a belief. Why is Silicon Valley so innovative? Because they just try. When ChatGPT came out, trust in Chinese cutting-edge research was low – investors and big companies thought the gap was too big and focused on applications instead. But real innovation requires self-confidence, and young people often have more of that.

 

Interviewer : Unlike other AI companies that actively seek funding and media attention, DeepSeek remains more low-key. How do you ensure that DeepSeek remains the first choice for AI talent?

Liang Wenfeng : Because we solve the toughest problems. The most attractive thing for top talent is to take on the world’s toughest challenges. In fact, China’s top talent is often underestimated, because there is little real deep-tech innovation and these people seldom get recognition. We provide them with the stage they crave.

 

Interviewer : OpenAI did not present GPT-5 at its last event. Many believe that the industry’s technological growth curve is slowing down and some are starting to question the “scaling law”. What do you think about this?

Liang Wenfeng : We remain optimistic. The industry continues to develop as expected. OpenAI is not a god-like institution – they will not stay at the top forever.

 

Interviewer : How long do you think it will take to reach AGI? Before V2, you released code and math models and moved from a dense architecture to MoE (Mixture of Experts). What is your roadmap?

Liang Wenfeng : Maybe two, five or ten years – but it will definitely happen in our lifetime. As for our roadmap, there is no consensus even within the company. But we are focusing on three main directions:

  1. Mathematics and code – They are a natural testing ground for AGI. Similar to the game of Go, these are closed and testable systems in which self-directed learning can potentially produce highly intelligent systems.
  2. Multimodality – Artificial intelligence should interact directly with the real world and learn from it.
  3. Natural language – It is the foundation of human-like intelligence.

We keep ourselves open to all possibilities.

 

Interviewer : What will the final stage of large AI models look like?

Liang Wenfeng : In the future, there will be specialized companies that provide basic AI models and services. A long value chain with a high degree of specialization will emerge. Many companies will build on this to develop solutions for the diverse needs of society.

 

“All strategies are products of the previous generation”

Interviewer : There have been many changes in the landscape of Chinese AI startups in the past year. For example, Wang Huiwen (co-founder of Meituan), who initially joined with great enthusiasm, has now withdrawn, while new competitors are increasingly creating differentiation.

Liang Wenfeng : Wang Huiwen bore all the losses himself and allowed others to get out unscathed. He made the decision that was most disadvantageous for himself but best for everyone else. I admire his responsibility.

 

Interviewer : What are you most focused on at the moment?

Liang Wenfeng : On research into the next generations of large AI models, because there are still many unsolved problems.

 

Interviewer : Many AI startups rely on a combination of model development and application, since technological leadership is not a permanent advantage. Why does DeepSeek continue to focus so consistently on research? Is your model not yet strong enough?

Liang Wenfeng : All strategies are products of the previous generation – this does not mean that they will still be valid in the future. Discussing the future of AI monetization using the business logic of the Internet age is like comparing Tencent’s early development with that of General Electric or Coca-Cola – this is an outdated way of thinking, comparable to the old Chinese saying “marking the boat to find the sword” (刻舟求剑), which means sticking to outdated methods after circumstances have changed.

 

Interviewer : Huanfang (a company specializing in quantitative investment) has a strong technological and innovative DNA and has developed relatively smoothly. Does this give you more confidence in technology-driven innovation?

Liang Wenfeng : Huanfang has strengthened our confidence in technology-driven innovation to a certain extent, but its growth has by no means been effortless. Our own path has been a long journey. Many people only see the boom after 2015, but in fact we have been building foundations for 16 years.

 

Interviewer : On the question of original innovation – in an environment of economic slowdown and declining investment, will this hamper disruptive research?

Liang Wenfeng : Not necessarily. The restructuring of China’s industry will increasingly depend on deep-tech innovation. As quick wins become increasingly rare, more people will turn to real innovation.

 

Interviewer: So you are optimistic?

Liang Wenfeng: I grew up in a small town in Guangdong in the 1980s. My father was a primary school teacher. In the 1990s, there were many opportunities to make money in Guangdong, and many parents in my neighborhood debated whether education was still necessary. Some said that learning was useless. But looking back, this view has completely changed.

Earning money is no longer as easy as it used to be – even driving a taxi is no longer a secure source of income. In just one generation, the environment has changed dramatically.

In the future, there will be more and more genuine deep tech innovations. At the moment, many do not understand them sufficiently because our society is still in the middle of the learning process. Once society starts to recognize the successes of deep tech innovators, the collective consciousness will naturally change.

What we need are more real success stories – and time for this change to happen.
