Written by

Alexis Laporte
2x Cofounder - 1x Board Member - Tech Specialist
If we take DeepSeek's figures at face value, 2,000 GPUs and an excellent team are enough to produce a model that is competitive in both performance and cost. This is excellent news for corporate use, because the DeepSeek method makes it possible to build an in-house LLM at o1 level.
It is also excellent news for the independence of France and Europe in the face of foreign artificial intelligence champions, because France already has this GPU capacity, between Scaleway and the CNRS alone. What is missing are three key elements:
DeepSeek is a Chinese AI laboratory that designed R1, a model equivalent to ChatGPT o1, with only 2,000 H100 GPUs, roughly 100 times cheaper than OpenAI.
However, academic and economic circles suspect that the number of GPUs actually used is much higher.
DeepSeek V3 and R1 are cutting-edge models that prove that it is possible to compete with OpenAI in terms of performance and cost.
DeepSeek made its method and the model weights public, which allowed other laboratories (such as Berkeley) to reproduce some of its advances, in particular in Reinforcement Learning (RL).
DeepSeek illustrates the strategic and geopolitical importance of foundational models.
DeepSeek V3 is based on a Mixture of Experts (MoE) architecture, activating only certain specialized sub-models depending on the task. Mistral uses a similar approach with Mixtral.
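To make the routing idea concrete, here is a minimal sketch of top-k expert selection in PyTorch. It is not DeepSeek's implementation (V3 uses many fine-grained experts plus shared experts and load balancing); all names and sizes below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class TinyMoE(nn.Module):
    """Toy Mixture-of-Experts layer: a gate scores all experts per token,
    and only the top-k experts actually run on that token."""

    def __init__(self, d_model=64, n_experts=8, top_k=2):
        super().__init__()
        self.gate = nn.Linear(d_model, n_experts)  # per-token expert scores
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, 4 * d_model), nn.GELU(),
                nn.Linear(4 * d_model, d_model),
            )
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):                         # x: (tokens, d_model)
        scores = self.gate(x)                     # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = weights.softmax(dim=-1)         # normalize over selected experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):            # run only the selected experts
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e          # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

tokens = torch.randn(10, 64)
print(TinyMoE()(tokens).shape)  # torch.Size([10, 64])
```

The point of the architecture is visible in the inner loop: each token pays the compute cost of only `top_k` experts, not all of them, which is how an MoE model can have a very large parameter count at a much smaller inference cost.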
R1 is a reasoning model that innovated by starting its training directly with Reinforcement Learning (RL) rather than Supervised Fine-Tuning (SFT), an approach previously considered too expensive. It does not learn from labeled examples but receives a reward or a penalty depending on its answers.
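DeepSeek's R1 report describes simple rule-based rewards of this kind: a bonus when the final answer is correct and a bonus when the reasoning is wrapped in the expected tags. The sketch below illustrates the idea only; the tag names, weights, and answer-extraction rule are assumptions, not DeepSeek's code.

```python
import re

def reward(completion: str, reference_answer: str) -> float:
    """Rule-based reward: no labeled demonstrations, just checks on the output."""
    r = 0.0
    # Format reward: reasoning should appear inside <think>...</think> tags.
    if re.search(r"<think>.+?</think>", completion, flags=re.DOTALL):
        r += 0.2  # illustrative weight
    # Accuracy reward: compare the extracted final answer to the reference.
    match = re.search(r"answer\s*[:=]\s*(.+)$", completion, flags=re.IGNORECASE)
    if match and match.group(1).strip() == reference_answer.strip():
        r += 1.0
    return r

print(reward("<think>7 * 6 = 42</think> answer: 42", "42"))  # 1.2
```

A policy-gradient trainer (GRPO, in DeepSeek's paper) would sample several completions per prompt, score each with such a function, and push the model toward the higher-scoring samples, with no supervised examples required.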
DeepSeek V3 is said to have been trained on 2,048 H800 GPUs for two months, at an estimated cost of $5.6 million. However, some sources suggest that up to 50,000 H100 GPUs may have been mobilized, a level of compute comparable to that of Google or Amazon.
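Those reported figures can be sanity-checked with back-of-envelope arithmetic. The per-GPU-hour rate below is derived from the article's numbers, not stated by DeepSeek here:

```python
gpus = 2048            # H800s, as reported
days = 61              # "two months"
gpu_hours = gpus * days * 24
cost_usd = 5_600_000   # reported training cost
print(f"{gpu_hours:,} GPU-hours -> ${cost_usd / gpu_hours:.2f}/GPU-hour")
# 2,998,272 GPU-hours -> $1.87/GPU-hour
```

At roughly $1.90 per GPU-hour, the implied rate is consistent with cloud rental pricing, which is why the $5.6 million figure is plausible as a marginal training cost even if the lab's total hardware footprint is much larger.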
Researchers at Berkeley have taken up DeepSeek's method, which could influence the next evolutions of OpenAI, Gemini, and Claude.
Lucie is a research project aimed at producing a completely open-source artificial intelligence, although it is still in the experimental phase. It was trained on 512 H100 GPUs at the CNRS.
The launch of this project was completely premature: Lucie is a raw model, not yet meant for the general public, and it was presented too early, generating unrealistic expectations and disappointment in its performance.
Unfortunately, its launch coincided with DeepSeek's and highlighted a communication problem: Lucie's release came just as DeepSeek was causing a sensation, reinforcing the perception of failure.
Not only did Lucie's presentation not match the reality of the model, it also highlighted our lack of clarity about the stakes of artificial intelligence.
Lucie aims to comply with the Open Source Initiative (OSI) standard, guaranteeing free and transparent access to its code, method, and training data.
Although its launch was poorly managed, Lucie fulfilled its initial objective by laying the foundations for a transparent European LLM. However, this effort is not enough to make up for the lag accumulated relative to the major international players.
Meanwhile, as public initiatives struggle, companies like Kyutai and Mistral are making progress thanks to more effective strategies, built on more dynamic financing and a more pragmatic approach.
This gap highlights a deficit in infrastructure and technical resources for AI research in France, hampering its development and limiting its competitiveness against major powers such as the United States and China. The question remains: how can a collective effort be structured to reverse this trend?