The DeepSeek-R1 and Web3-AI effect

The world of artificial intelligence (AI) was shaken a few days ago by the release of DeepSeek-R1, an open-weights model that matches the performance of top foundation models while claiming to have been built on a remarkably low training budget and with novel post-training techniques. The release of DeepSeek-R1 not only challenged the conventional wisdom around scaling laws for foundation models – which traditionally favor massive training budgets – but did so in the field's most active research area: reasoning.

The open-weights nature of the release (as opposed to fully open source) made the model easily accessible to the AI community, leading to a wave of clones within hours. DeepSeek-R1 also left its mark on the ongoing AI race between China and the United States, reinforcing what has become increasingly obvious: Chinese models are of exceptionally high quality and fully capable of driving innovation with original ideas.

Unlike most generative AI breakthroughs, which seem to widen the gap between Web2 and Web3 in the realm of foundation models, the release of DeepSeek-R1 carries real implications and presents intriguing opportunities for Web3-AI. To assess them, we must first take a closer look at DeepSeek-R1's main innovations and differentiators.

Inside DeepSeek-R1

DeepSeek-R1 is the result of introducing incremental innovations into a well-established pre-training framework for foundation models. Broadly speaking, DeepSeek-R1 follows the same training methodology as most high-profile foundation models. This approach consists of three key steps:

  1. Pre-training: The model is initially pre-trained to predict the next word using massive amounts of unlabeled data.
  2. Supervised fine-tuning (SFT): This step optimizes the model in two critical areas: instruction following and question answering.
  3. Alignment with human preferences: A final fine-tuning phase aligns the model's responses with human preferences.
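The three steps above can be sketched as a minimal pipeline. Everything below is an illustrative mock of the workflow's shape – the function names and returned fields are hypothetical placeholders, not DeepSeek's actual code:

```python
# Illustrative sketch of the standard three-stage training pipeline.
# All functions and fields are hypothetical placeholders.

def pretrain(corpus):
    """Stage 1: next-token prediction over massive unlabeled data."""
    return {"stage": "pretrained", "tokens_seen": len(corpus)}

def supervised_finetune(model, instruction_pairs):
    """Stage 2: SFT on instruction-following and question-answering pairs."""
    return dict(model, stage="sft", sft_samples=len(instruction_pairs))

def align_to_preferences(model, preference_pairs):
    """Stage 3: align responses with human preference data."""
    return dict(model, stage="aligned", pref_samples=len(preference_pairs))

model = pretrain(["token"] * 1000)
model = supervised_finetune(model, [("instruction", "answer")] * 10)
model = align_to_preferences(model, [("preferred", "rejected")] * 5)
print(model["stage"])  # -> aligned
```

The point of the sketch is the ordering: each stage consumes the previous stage's model, which is why innovations late in the pipeline (as with R1) can reuse an existing base model.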

Most leading foundation models – including those developed by OpenAI, Google and Anthropic – adhere to this same general process. At a high level, DeepSeek-R1's training procedure does not look significantly different. But rather than pre-training a base model from scratch, R1 leveraged the base model of its predecessor, DeepSeek-V3-Base, which has an impressive 671 billion parameters.

Essentially, DeepSeek-R1 is the result of applying SFT to DeepSeek-V3-Base with a large-scale reasoning dataset. The real innovation lies in the construction of these reasoning datasets, which are notoriously difficult to build.

First step: DeepSeek-R1-Zero

One of the most important aspects of DeepSeek-R1 is that the process produced not one model but two. Perhaps the most important innovation of DeepSeek-R1 was the creation of an intermediate model called R1-Zero, which specializes in reasoning tasks. This model was trained almost entirely using reinforcement learning, with minimal reliance on labeled data.

Reinforcement learning is a technique in which a model is rewarded for generating correct answers, allowing it to generalize knowledge over time.
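As a toy illustration of that reward loop – a bandit-style sketch, not the actual algorithm DeepSeek used – here is a minimal "policy" that learns to prefer whichever answer earns reward:

```python
import random

# Toy reward-driven learner: the "policy" keeps a score per candidate answer
# and is rewarded (+1) whenever it samples the correct one. Over time the
# rewarded answer dominates. This is a bandit-style sketch, not GRPO/PPO.
random.seed(0)
candidates = ["4", "5", "22"]   # candidate answers to "what is 2 + 2?"
correct = "4"
scores = {c: 1.0 for c in candidates}

for _ in range(500):
    total = sum(scores.values())
    pick = random.choices(candidates,
                          weights=[scores[c] / total for c in candidates])[0]
    reward = 1.0 if pick == correct else 0.0
    scores[pick] += reward        # reinforce rewarded behavior

best = max(scores, key=scores.get)
print(best)  # -> 4
```

No labeled dataset is involved: only a reward signal on sampled outputs, which is the property that let R1-Zero train with minimal labeled data.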

R1-Zero is quite impressive: it was able to match GPT-o1 on reasoning tasks. However, the model struggled with more general tasks such as question answering and readability. That said, the goal of R1-Zero was never to create a generalist model, but rather to demonstrate that it is possible to achieve state-of-the-art reasoning capabilities using reinforcement learning alone – even if the model underperforms in other areas.

Second step: DeepSeek-R1

DeepSeek-R1 was designed to be a general-purpose model that excels at reasoning, which means it had to surpass R1-Zero. To achieve this, DeepSeek started again from its V3 model, but this time fine-tuned it on a small reasoning dataset.

As mentioned above, reasoning datasets are difficult to produce. This is where R1-Zero played a crucial role. The intermediate model was used to generate a synthetic reasoning dataset, which was then used to fine-tune DeepSeek V3. This process produced another intermediate reasoning model, which subsequently went through an extensive reinforcement learning phase using a dataset of 600,000 samples, also generated by R1-Zero. The end result of this process was DeepSeek-R1.
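The data flow just described can be sketched end to end. Every component below is an illustrative stand-in that assumes only the structure in the text (R1-Zero generates synthetic reasoning samples that feed SFT and a later RL phase); the sample counts are scaled down from the real ones:

```python
# Mock of the R1 data flow: R1-Zero generates synthetic reasoning samples,
# which are filtered and used to fine-tune V3; the result then goes through
# a reinforcement-learning phase. All components are hypothetical stand-ins.

def r1_zero_generate(prompt):
    """Stand-in for R1-Zero emitting a (trace, answer) reasoning sample."""
    return {"prompt": prompt, "trace": f"think about {prompt}", "answer": "42"}

def filter_samples(samples):
    """Keep well-formed samples (real pipelines use reward/format checks)."""
    return [s for s in samples if s["trace"] and s["answer"]]

def finetune(base_model, samples):
    return {"base": base_model, "sft_samples": len(samples)}

def reinforcement_learning(model, samples):
    return dict(model, rl_samples=len(samples), name="R1-like")

prompts = [f"problem-{i}" for i in range(600)]          # scaled down from 600k
sft_data = filter_samples([r1_zero_generate(p) for p in prompts[:100]])
intermediate = finetune("DeepSeek-V3-Base", sft_data)
r1 = reinforcement_learning(intermediate, prompts[100:])
print(r1["name"], r1["sft_samples"], r1["rl_samples"])  # -> R1-like 100 500
```

The notable design choice is that the expensive, hard-to-build artifact – the reasoning dataset – is manufactured by a model rather than by humans.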

Although I have omitted several technical details of the R1 training process, here are the two main takeaways:

  1. R1-Zero demonstrated that it is possible to develop sophisticated reasoning capabilities using pure reinforcement learning. Although R1-Zero is not a strong generalist model, it successfully generated the reasoning data needed for R1.
  2. R1 extended the traditional pre-training pipeline used by most foundation models by incorporating R1-Zero into the process. In addition, it leveraged a large amount of synthetic reasoning data generated by R1-Zero.

As a result, DeepSeek-R1 emerged as a model that matched the reasoning capabilities of GPT-o1 while being built with a simpler and probably much cheaper training process.

Everyone agrees that R1 marks an important milestone in the history of generative AI, one that is likely to reshape how foundation models are developed. As for Web3, it will be interesting to explore how R1 influences the evolving landscape of Web3-AI.

DeepSeek-R1 and Web3-AI

So far, Web3 has struggled to establish compelling use cases that clearly add value to the creation and use of foundation models. To some extent, the traditional workflow for training foundation models seems to be the antithesis of Web3 architectures. However, despite being in its early stages, the release of DeepSeek-R1 has highlighted several opportunities that could naturally align with Web3-AI architectures.

1) Reinforcement-learning fine-tuning networks

R1-Zero demonstrated that it is possible to develop reasoning models using pure reinforcement learning. From a computational standpoint, reinforcement learning is highly parallelizable, making it well suited to decentralized networks. Imagine a Web3 network where nodes are compensated for fine-tuning a model on reinforcement-learning tasks, each applying different strategies. This approach is far more feasible than other pre-training paradigms, which require complex GPU topologies and centralized infrastructure.
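To illustrate why RL fine-tuning parallelizes naturally – a hypothetical sketch, not an existing protocol – each node can run rollouts independently and report only scored samples back to a coordinator:

```python
import random

# Hypothetical sketch: independent "nodes" each run RL rollouts locally and
# return only (sample, reward) pairs; a coordinator aggregates them. No
# gradients or weights are exchanged mid-rollout, which is what makes the
# workload embarrassingly parallel.

def node_rollouts(node_id, n=20):
    rng = random.Random(node_id)           # each node explores differently
    rollouts = []
    for _ in range(n):
        answer = rng.choice(["4", "5"])    # toy task: answer "2 + 2"
        reward = 1.0 if answer == "4" else 0.0
        rollouts.append({"node": node_id, "answer": answer, "reward": reward})
    return rollouts

all_rollouts = []
for node_id in range(8):                   # 8 independent nodes
    all_rollouts.extend(node_rollouts(node_id))

winners = [r for r in all_rollouts if r["reward"] > 0]
payouts = {}                               # nodes credited per useful rollout
for r in winners:
    payouts[r["node"]] = payouts.get(r["node"], 0) + 1
print(len(all_rollouts))  # -> 160
```

Because each node's work is verifiable through the reward signal alone, the coordinator only needs the winning samples – a much lighter coordination burden than synchronizing gradients across a GPU cluster.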

2) Synthetic reasoning dataset generation

Another key contribution of DeepSeek-R1 was to showcase the importance of synthetically generated reasoning datasets for cognitive tasks. This process is also well suited to a decentralized network, where nodes perform dataset-generation jobs and are compensated as those datasets are used for pre-training or fine-tuning foundation models. Since this data is generated synthetically, the entire network can be fully automated without human intervention, making it an ideal fit for Web3 architectures.
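The compensation mechanic described here can be sketched as simple accounting. This is a hypothetical in-memory model – a real network would record credits on-chain, but the logic is the same:

```python
# Hypothetical sketch of compensating data-generation nodes on usage.
# A simple in-memory ledger credits a node each time a dataset it
# contributed is consumed for fine-tuning.

ledger = {}
datasets = []

def submit_dataset(node_id, samples):
    datasets.append({"node": node_id, "samples": samples})

def consume_for_training(dataset_index, credit_per_sample=1):
    ds = datasets[dataset_index]
    earned = credit_per_sample * len(ds["samples"])
    ledger[ds["node"]] = ledger.get(ds["node"], 0) + earned
    return ds["samples"]

submit_dataset("node-a", ["sample"] * 50)
submit_dataset("node-b", ["sample"] * 30)
consume_for_training(0)
consume_for_training(1)
consume_for_training(0)   # reused datasets earn again
print(ledger)  # -> {'node-a': 100, 'node-b': 30}
```

Paying on consumption rather than on submission aligns incentives with dataset quality: data that is never used for training earns nothing.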

3) Decentralized inference for small distilled reasoning models

DeepSeek-R1 is a massive model with 671 billion parameters. However, almost immediately after its release, a wave of distilled reasoning models emerged, ranging from 1.5 to 70 billion parameters. These smaller models are much more practical for inference in decentralized networks. For example, a 1.5B–2B distilled R1 model could be embedded in a DeFi protocol or deployed on the nodes of a decentralized infrastructure network. More simply, we are likely to see the rise of cost-effective reasoning inference endpoints powered by decentralized compute networks. Reasoning is a domain where the performance gap between small and large models is shrinking, creating a unique opportunity for Web3 to efficiently leverage these distilled models in decentralized inference settings.
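To make the sizing concrete, here is a hypothetical back-of-the-envelope helper that picks the largest distilled variant fitting a node's memory budget. The variant sizes mirror the distilled family mentioned above; the fp16 (~2 bytes per parameter) and 20% overhead assumptions are mine, not DeepSeek's:

```python
# Hypothetical sizing helper: pick the largest distilled variant that fits a
# node's memory budget. Assumes fp16 weights (~2 bytes/param) plus a 20%
# overhead for activations/KV cache -- rough illustrative assumptions only.

DISTILLED_VARIANTS_B = [1.5, 7, 14, 32, 70]   # billions of parameters

def fits(params_b, mem_gb, bytes_per_param=2, overhead=1.2):
    return params_b * bytes_per_param * overhead <= mem_gb

def pick_variant(mem_gb):
    eligible = [p for p in DISTILLED_VARIANTS_B if fits(p, mem_gb)]
    return max(eligible) if eligible else None

print(pick_variant(24))   # 24 GB consumer GPU -> 7
print(pick_variant(4))    # small edge node    -> 1.5
```

Under these assumptions, even a modest consumer GPU can host a mid-size distilled variant, which is what makes heterogeneous decentralized inference plausible.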

4) Reasoning data provenance

One of the defining characteristics of reasoning models is their ability to generate reasoning traces for a given task. DeepSeek-R1 makes these traces available as part of its inference output, reinforcing the importance of provenance and traceability for reasoning tasks. The internet today operates mainly on outputs, with little visibility into the intermediate steps that lead to those results. Web3 presents an opportunity to track and verify each step of reasoning, potentially creating a new layer of reasoning where transparency and verifiability become the standard.
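One simple way such verifiability could work – a toy sketch of my own, not an existing protocol – is to commit each reasoning step to a hash chain, so a verifier can detect any after-the-fact tampering with the trace:

```python
import hashlib

# Hypothetical sketch: commit each reasoning step to a hash chain so a
# verifier can check the trace was not altered after the fact. A toy
# illustration of verifiable reasoning provenance, not an existing protocol.

def commit_trace(steps):
    prev, commitments = "genesis", []
    for step in steps:
        digest = hashlib.sha256((prev + step).encode()).hexdigest()
        commitments.append(digest)
        prev = digest                       # chain each step to the last
    return commitments

def verify_trace(steps, commitments):
    return commit_trace(steps) == commitments

trace = ["restate the problem", "try x = 2", "check: 2 + 2 = 4", "answer: 4"]
proof = commit_trace(trace)
print(verify_trace(trace, proof))                    # -> True
tampered = trace[:2] + ["try x = 3"] + trace[3:]
print(verify_trace(tampered, proof))                 # -> False
```

Because each commitment depends on all prior steps, changing any single step invalidates every commitment after it – the same property that makes blockchains tamper-evident.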

Web3-AI has a chance in the post-R1 reasoning era

The release of DeepSeek-R1 marked a turning point in the evolution of generative AI. By combining clever innovations with established pre-training paradigms, it challenged traditional AI workflows and opened a new era in reasoning-focused AI. Unlike many previous foundation models, DeepSeek-R1 contains elements that bring generative AI and Web3 closer together.

Key aspects of R1 – synthetic reasoning datasets, more parallelizable training and the growing need for traceability – align naturally with Web3 principles. While Web3-AI has struggled to gain meaningful traction, this new post-R1 reasoning era could present the best opportunity yet for Web3 to play a more significant role in the future of AI.
