News
This gain is made possible by TNG's Assembly-of-Experts (AoE) method, a technique for building LLMs by selectively merging the weight tensors ...
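The snippet above describes weight-tensor merging only in outline. As a minimal sketch of the general idea, the toy below interpolates matching tensors from two parent state dicts, merging only those a selection rule picks out; the tensor names, interpolation weight, and selection rule are illustrative assumptions, not TNG's actual AoE recipe.

```python
# Hedged sketch of selective weight-tensor merging (the general idea
# behind expert-merging approaches such as TNG's AoE; details here are
# assumptions, not TNG's published method).
import numpy as np

def merge_state_dicts(parent_a, parent_b, alpha=0.5, select=None):
    """Merge two model state dicts tensor by tensor.

    select(name) -> bool decides whether a tensor is interpolated
    between the parents (True) or taken from parent_a unchanged (False).
    """
    merged = {}
    for name, tensor_a in parent_a.items():
        tensor_b = parent_b[name]
        if select is None or select(name):
            # Linear interpolation between the two parents' weights.
            merged[name] = (1 - alpha) * tensor_a + alpha * tensor_b
        else:
            # Keep this tensor from parent A untouched.
            merged[name] = tensor_a
    return merged

# Toy example: merge only "expert" tensors, keep the rest from parent A.
a = {"expert.w": np.ones(2), "router.w": np.zeros(2)}
b = {"expert.w": np.full(2, 3.0), "router.w": np.ones(2)}
out = merge_state_dicts(a, b, alpha=0.5,
                        select=lambda n: n.startswith("expert"))
```

Real merges of this kind operate on billions of parameters per checkpoint, but the per-tensor logic is the same shape as above.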
German firm TNG has released DeepSeek-TNG R1T2 Chimera, an open-source variant twice as fast as its parent model thanks to a ...
DeepSeek quietly updated R1 in late May, marking its first revision since its high-profile debut. The start-up released R1-0528 on the open-source AI developer community Hugging Face, calling it a ...
Say hello to DeepSeek-TNG R1T2 Chimera, a large language model built by German firm TNG Consulting, using three different ...
Chinese AI startup DeepSeek has released an update to its R1 reasoning model. The new version, named R1-0528, was published on developer platform Hugging Face on May 29, although the company has ...
Chinese AI upstart MiniMax released a new large language model, joining a slew of domestic peers inspired to surpass DeepSeek in the field of reasoning AI. The Shanghai-based company touted the ...
The new version, dubbed DeepSeek-R1-0528, is now being positioned as a direct challenger to OpenAI’s o3 and Google’s Gemini 2.5 Pro, with benchmark results and technical enhancements that show ...
Huawei, in collaboration with Chinese AI startup SiliconFlow, published a technical paper finding that Huawei's CloudMatrix 384 cluster can outperform Nvidia in running DeepSeek models. The ...