Cerebras Launches World’s Fastest DeepSeek R1 Inference

SUNNYVALE -- Cerebras Systems (https://cerebras.ai/), a pioneer in accelerating generative AI, announced record-breaking performance for DeepSeek-R1-Distill-Llama-70B inference, achieving more than 1,500 tokens per second – 57 times faster than GPU-based solutions. This unprecedented speed enables instant reasoning capabilities for one of the industry's most sophisticated open-weight models, running entirely on U.S.-based AI infrastructure with zero data retention.

"DeepSeek R1 represents a new frontier in AI reasoning capabilities, and today we're making it accessible at the industry's fastest speeds," said Hagay Lupesko, SVP of AI Cloud, Cerebras. "By achieving more than 1,500 tokens per second on our Cerebras Inference platform, we're transforming minutes-long reasoning processes into near-instantaneous responses, fundamentally changing how developers and enterprises can leverage advanced AI models."

Powered by the Cerebras Wafer Scale Engine, the platform demonstrates dramatic real-world performance improvements. A standard coding prompt that takes 22 seconds on competitive platforms completes in just 1.5 seconds on Cerebras – a 15x improvement in time to result. This breakthrough enables practical deployment of sophisticated reasoning models that traditionally require extensive computation time.

DeepSeek-R1-Distill-Llama-70B combines the advanced reasoning capabilities of DeepSeek's 671B-parameter Mixture of Experts (MoE) model with Meta's widely supported Llama architecture. Despite its efficient 70B-parameter size, the model demonstrates superior performance on complex mathematics and coding tasks compared to larger models.

"Security and privacy are paramount for enterprise AI deployment," continued Lupesko. "By processing all inference requests in U.S.-based data centers with zero data retention, we're ensuring that organizations can leverage cutting-edge AI capabilities while maintaining strict data governance standards. Data stays in the U.S. 100% of the time and belongs solely to the customer."
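For developers who want to try the model, the sketch below shows one way a request to the Cerebras Inference platform might look. It is an illustrative sketch only, not taken from this announcement: the OpenAI-compatible endpoint URL, the model identifier "deepseek-r1-distill-llama-70b", and the CEREBRAS_API_KEY environment variable are assumptions.

    # Illustrative sketch only -- the endpoint, model ID, and env var name are assumptions.
    import os
    from openai import OpenAI

    # Point the standard OpenAI client at the (assumed) Cerebras OpenAI-compatible endpoint.
    client = OpenAI(
        base_url="https://api.cerebras.ai/v1",
        api_key=os.environ["CEREBRAS_API_KEY"],
    )

    # Stream the response so reasoning tokens appear as they are generated.
    stream = client.chat.completions.create(
        model="deepseek-r1-distill-llama-70b",
        messages=[{"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."}],
        stream=True,
    )

    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)

At the announced rate of more than 1,500 tokens per second, even a long chain-of-thought response in a setup like this would stream back in seconds rather than minutes.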
