Chinese AI startup DeepSeek has recently made headlines after releasing DeepSeek Coder V2, an open-source mixture of experts code language model. The company previously made a splash in the AI world after the release of DeepSeek Chat, which is a rival to ChatGPT and was trained on 2 trillion Chinese and English tokens.
The company’s latest model has posted strong results on both math and coding tasks. According to DeepSeek’s published benchmarks, DeepSeek Coder V2 outperforms closed-source models such as Claude 3 Opus, Gemini 1.5 Pro, and GPT-4 Turbo on those tasks, and it supports 338 programming languages. The company claims it is the first open-source model to do so, and that it far outperforms Llama 3-70B and other models in the same category.
DeepSeek Coder V2 also appears to perform well in general language and reasoning tasks.
DeepSeek’s unique qualities
What appears to set DeepSeek’s recent development apart is that it is open source and relatively small. Samuel Hammond, senior economist for the Foundation for American Innovation, told Blaze News that DeepSeek Coder V2 “integrates state of the art ‘mixture of experts’ and sparsity methods that are also being integrated into newer U.S. models, which is why GPT-4o and 4-turbo run so much faster than the original GPT-4.”
Some of the most popular chatbots available, such as Gemini, Claude, and ChatGPT, are reported or widely believed to use MoE to help tackle a wide range of user prompts. To offer both broad and deep expertise on a given subject, these chatbots must draw on highly specialized knowledge that they can then share with the user.
The purpose of an MoE architecture is to combine several specialized sub-networks, known as “experts,” into one overarching system, with a routing component that decides which experts handle each input. This allows each “expert” to focus on a specific kind of task and produce deeper, more sophisticated output for it, while only a fraction of the model runs for any one request. This strategy is significantly different from a one-size-fits-all machine-learning system, which may be able to generate a significant amount of information but may come up short in specialization.
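For readers who want to see the routing idea in concrete terms, the following is a minimal sketch of top-k expert routing in plain Python. It is an illustration only, not DeepSeek’s implementation: the names `make_expert`, `moe_forward`, and `gate_weights` are hypothetical, the “experts” are toy functions that simply scale their input, and the gate is a plain dot product rather than a trained network.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical toy expert: multiplies every input value by `scale`."""
    def expert(x):
        return [scale * v for v in x]
    return expert


def moe_forward(x, gate_weights, experts, top_k=2):
    """Route input x to the top_k highest-scoring experts and blend their outputs."""
    # One gate score per expert: dot product of that expert's gate row with x.
    scores = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]
    # Keep only the top_k experts; the rest are never evaluated (sparsity).
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Normalize the chosen experts' scores into mixing weights.
    weights = softmax([scores[i] for i in chosen])
    output = [0.0] * len(x)
    for weight, index in zip(weights, chosen):
        expert_out = experts[index](x)
        output = [o + weight * y for o, y in zip(output, expert_out)]
    return output, chosen


# Assumed toy setup: three experts and a hand-written gate matrix.
experts = [make_expert(s) for s in (1.0, 2.0, 3.0)]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
output, chosen = moe_forward([2.0, 1.0], gate_weights, experts, top_k=2)
print("experts used:", chosen)
print("blended output:", output)
```

In this sketch only two of the three experts run for the given input, which is the property that lets sparse models stay fast even as the total number of experts grows.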
Another element that makes DeepSeek V2 such a powerhouse is that it is open source, meaning that the model is publicly available rather than kept proprietary. This allows people to use, modify, and distribute it, subject to its license terms. Open-source models lend themselves to outside creativity and innovation, which is not the case with closed-source models.
“If people are freaked out, it’s because DeepSeek V2 is now one of the best open-source MoE models now available. U.S. companies are sitting on comparable or even better models, including Meta with their Llama3 400b model, but it has yet to be released,” Hammond said.
“DeepSeek V2 illustrates the danger of U.S. AI companies being reluctant to open source their models due to public pressure or potential legal risk. Independent developers and researchers around the world will always want to use the best open-source model available, and we’d rather the best open model be American than Chinese.”
The possibility of artificial general intelligence
DeepSeek was founded in 2023 with a mission to “unravel the mystery of AGI [artificial general intelligence] with curiosity.” While this may be a noble goal, there is still debate among those in the world of AI over whether AGI can ever be achieved. AGI is generally defined as artificial intelligence that possesses human-like reasoning and problem-solving capabilities, including the ability to learn and adapt on its own.
OpenAI CEO Sam Altman appears to have an optimistic outlook about the future of AGI, claiming that it could arrive in the “reasonably close-ish future.” However, Altman added that AGI will “change the world much less than we all think and it will change jobs much less than we all think.”
He went on to say that “people are begging to be disappointed [by what AGI can really do] and they will be. We don’t have an actual [artificial general intelligence] and that’s sort of what’s expected of us.”
Additionally, Shane Legg, chief AGI scientist at Google DeepMind, said that there is a 50% chance that AGI will become a reality by 2028. But not everyone in the field holds this same amount of optimism.
Grady Booch, an IBM fellow and chief scientist for software engineering, said that AGI will never happen. “I, being a historian of computing, have a rather jaded and cynical view of the hyperbolic optimism of our field and as such am somewhat conditioned to be a contrarian when it comes to predictions such as this.”
One unique challenge for DeepSeek and OpenAI — two companies that claim to have the ultimate goal of attaining AGI — is that the issue quickly becomes one not of technology but of philosophy. To achieve AGI, one must establish what that might look like.
Sara Hooker, who leads Cohere for AI, a research facility that focuses on machine learning, said, “It really is a philosophical question. So, in some ways, it’s a very hard time to be in this field, because we’re a scientific field.” She added that much of the debate around AGI is more value-driven than technically driven, which can obscure any meaningful definition of AGI.
Hooker went on to say that it’s “very unlikely” that AGI will be defined or achieved by “a single event where we check it off and say, ‘AGI achieved.’” Before AGI can realistically be achieved, there must be a testable definition that everyone in the field can agree on.
Microsoft Research, with the help of OpenAI, released a paper in 2023 that suggested GPT-4 demonstrated a nascent example of AGI. Researchers on the project claimed that “GPT-4 is part of a new cohort of LLMs (along with ChatGPT and Google’s PaLM for example) that exhibit more general intelligence than previous AI models.”
Until experts and others in the field can productively and specifically define AGI, it is somewhat unclear what researchers mean when they say GPT-4 “exhibit[ed] more general intelligence than previous AI models.”
The battle for AI dominance
Earlier this month, data analytics firm Govini indicated that the U.S. has fallen behind China in the AI race. Consequently, the U.S. would have a hard time winning a war against the People’s Liberation Army if a serious conflict were to erupt between the two world superpowers.
Govini investigates the performance of the federal government, specifically focusing on the 15 most important national security technologies through the lens of acquisition, adversarial capital, procurement, supply chain, foreign influence, and science and technology.
Govini’s report suggested that the U.S. has continued to under-invest in valuable AI capabilities while also slowing down in the research and development stages. In nine of the 12 areas assessed in the report, more than 65% of the government’s funding was still in the research and development stage in 2023. As a result, many of these potentially valuable technologies are still not production-ready.
Govini CEO Tara Murphy Dougherty said: “Despite the fact that artificial intelligence is an incredibly, highly visible, arguably the most transformational technology that matters in the critical tech competition, not just for the United States, but around the world, the Department of Defense is still primarily attacking this as a research and development effort.”
“While there is more to do in R&D for artificial intelligence, it is well past time for DoD to stop treating AI like it is just a science project,” she continued.
Govini’s 2023 report indicated that the U.S. was at serious risk of “weakness and dependence” as it fell behind China in the technology race. In 2022, the data analytics firm found that the U.S. was not injecting enough money into AI and ML to win a potential technological race against its Eastern rival.
“If you add in an AI advantage that the United States doesn’t have, it potentially tips the war into unwinnable [for the U.S.],” Dougherty said.
Nathan Leamer, executive director of the Digital First Project, told Blaze News that “DeepSeek is one of a growing number of Chinese entities heavily investing in [the AI] industry.”
“They are bringing top talent, but also many Chinese entities have a history of stealing IP of their Western competitors. AI is an arms race, and considering the billions the Chinese are putting into this, it is not surprising they are developing state-of-the-art technology,” he added.
This appears to echo Dougherty’s point that AI can manifest in subtle ways during war. She said that China would not have to weaponize AI to have a dramatic impact during a conflict, but rather the PLA could use AI to penetrate the U.S. energy grid, which could have a catastrophic effect.
While the Department of Defense seems to be slow in moving AI and ML technologies out of the development stage, Dougherty said the opportunity is still there. “The AI aspects of our spending being so R&D focused, it tells me that we’re not actually moving this into the weapons systems and platforms that we’re deploying today, obviously, appropriately so given that it’s artificial intelligence.”
“But, I believe that the department has a great framework to govern that and make sure that AI is used appropriately in a military context. So let’s get going on it,” she added.
Additionally, it appears the U.S. has fallen behind China in obtaining patents in 13 of the 15 critical technology areas. China has managed to speed up its patent grants over the last few years after its “14th Five Year Plan for Informatization Development,” according to Dougherty.
“The way to think about patents is that it’s a leading indicator of technological dominance,” she added.
In addition to the Department of Defense conducting an in-depth analysis as to why it has fallen behind, Hammond mentioned that the U.S. must “focus on denying [China] access to AI hardware — the advanced AI chips needed for scaling and serving the largest models.”
Software engineer Mike Wacker said that the U.S. “should be worried about AI superiority, both in general and specifically with respect to military applications.” However, he added that it is “interesting” that “DeepSeek is open-source; if it were truly valuable to the CCP, they probably wouldn’t let those researchers open-source it in the first place.”
While the U.S. may be in a relatively good place in AI development, DeepSeek is an indicator that China is not far behind, and it does not appear that China is trimming investments in the AI race against the West.
The post Blaze News original: China’s DeepSeek Coder claims it is the first open-source model to surpass GPT-4 Turbo amid tense AI race appeared first on TheBlaze.