AI game-changer or overhyped? DeepSeek scrutinized over bold claims

After causing shockwaves with an AI model whose capabilities rival the creations of Google and OpenAI, China’s DeepSeek is facing questions about whether its bold claims stand up to scrutiny.
The Hangzhou-based startup’s announcement that it developed R1 at a fraction of the cost of Silicon Valley’s latest models immediately called into question assumptions about the United States’ dominance in AI and the sky-high market valuations of its top tech firms.
However, some skeptics have challenged DeepSeek’s account of working on a shoestring budget, suggesting that the firm likely had access to more advanced chips and more funding than it has acknowledged.
“It is very much an open question whether DeepSeek’s claims can be taken at face value. The AI community will be digging into them and we will find out,” Pedro Domingos, professor emeritus of computer science and engineering at the University of Washington, told Al Jazeera.
“It’s plausible to me that they can train a model with $6m,” Domingos added.
“But it is also quite possible that this is just the cost of fine-tuning and post-processing models that cost more, and that DeepSeek could not have done it without building on more expensive models made by others.”
In a research paper released last week, the DeepSeek development team said they had used 2,000 Nvidia H800 GPUs, a less advanced chip designed to comply with US export controls, and spent $5.6m to train R1’s foundational model, V3.
OpenAI CEO Sam Altman has stated that it cost more than $100m to train its chatbot GPT-4, while analysts have estimated that the model used more than 25,000 advanced H100 GPUs.
The announcement by DeepSeek, founded in late 2023 by serial entrepreneur Liang Wenfeng, upended the widely held belief that companies seeking to be at the forefront of AI need to invest billions of dollars in data centers and large quantities of costly high-end chips.
It also raised questions about the effectiveness of Washington’s efforts to constrain China’s AI sector by banning the export of the most advanced chips.
Shares of California-based Nvidia, which holds a near-monopoly on the supply of GPUs, plunged 17 percent on Monday, wiping nearly $593bn off the chip giant’s market value, a figure comparable with the gross domestic product (GDP) of Sweden.
While there is widespread agreement that DeepSeek’s release of R1 represents at least a significant achievement, some prominent observers have cautioned against taking its claims at face value.
Palmer Luckey, the founder of virtual reality company Oculus VR, on Wednesday labeled DeepSeek’s claimed budget as “bogus” and accused too many “useful idiots” of falling for “Chinese propaganda”.
“It is pushed by a Chinese hedge fund to slow investment in American AI startups, service their own shorts against American titans like Nvidia, and hide sanction evasion,” Luckey said in a post on X.
“America is a fertile bed for psyops like this because our media apparatus hates our technology companies and wants to see President Trump fail.”
In an interview with CNBC last week, Alexandr Wang, CEO of Scale AI, also cast doubt on DeepSeek’s account, saying it was his “understanding” that it had access to 50,000 more advanced H100 chips that it could not talk about because of US export controls.
Wang did not give evidence for his claim.
Tech billionaire Elon Musk, one of US President Donald Trump’s closest confidants, backed DeepSeek’s skeptics, writing “Obviously” on X under a post about Wang’s claim.
DeepSeek did not respond to requests for comment.
But Zihan Wang, a PhD candidate who worked on an earlier DeepSeek model, hit back at the startup’s critics, saying, “Talk is cheap.”
“It is easy to criticize,” Wang said on X in response to questions from Al Jazeera about the suggestion that DeepSeek’s claims should not be taken at face value.
“If they spend more time working on the code and reproduce the DeepSeek idea, it would be better than talking on paper,” Wang added, using an English translation of a Chinese idiom about people who engage in idle talk.
He did not directly answer a question about whether he believed DeepSeek had spent less than $6m and used less advanced chips to train R1’s foundational model.
In a 2023 interview with Chinese media outlet Waves, Liang said his company had stockpiled 10,000 of Nvidia’s A100 chips, which are older than the H800, before the administration of then-US President Joe Biden banned their export.
Users of R1 also point to limitations stemming from its origins in China, namely its censoring of topics considered sensitive by Beijing, including the 1989 massacre in Tiananmen Square and the status of Taiwan.
In a sign that the initial panic about DeepSeek’s potential impact on the US tech sector had begun to recede, Nvidia’s share price recovered on Tuesday.
The tech-heavy Nasdaq 100 rose 1.59 percent after falling more than 3 percent the previous day.
Tim Miller, a professor specializing in AI at the University of Queensland, said it was difficult to say how much stock should be put in DeepSeek’s claims.
“The model itself gives away a few details of how it works, but the costs of the main changes that they claim – that I understand – do not show up in the model itself so much,” Miller told Al Jazeera.
Miller said he had not seen any “alarm bells”, but that there are reasonable arguments both for and against trusting the research paper.
“The breakthrough is incredible – almost a ‘too good to be true’ style. The breakdown of costs is unclear,” Miller said.
On the other hand, he said, breakthroughs do sometimes happen in computer science.
“These large-scale models are a very recent phenomenon, so efficiencies are bound to be found,” Miller said.
“They knew that it would be reasonably straightforward for others to reproduce, so they would know that they would look stupid if they were ***************** everyone. There is already a team trying to reproduce the work.”
Falling cost
Lucas Hansen, co-founder of the nonprofit CivAI, said that while it was difficult to know whether DeepSeek had circumvented US export controls, the startup’s claimed training budget referred to V3, which is roughly equivalent to OpenAI’s GPT-4, not to R1 itself.
“GPT-4 finished training in late 2022. There have been a lot of algorithmic and hardware improvements since 2022, driving down the cost of training a GPT-4 class model. A similar situation happened for GPT-2. At the time it was a serious undertaking to train, but now you can train it for $20 in 90 minutes,” Hansen told Al Jazeera.
“DeepSeek made R1 by taking a base model – in this case, V3 – and applying some clever methods to teach that base model to think more carefully,” Hansen said.
“This teaching process is comparatively cheap when compared to the price of training the base model. Now that DeepSeek has published details about how to bootstrap a base model into a thinking model, we will see a large number of new thinking models.”