DeepSeek

New AI Age

V3 & R1

Tech Talk

Emerging as a formidable force in artificial intelligence, China's DeepSeek has stunned the tech world with cost-efficient, high-performance models such as DeepSeek-V3 and DeepSeek-R1, which rival OpenAI's GPT-4 and Anthropic's Claude 3.5 Sonnet on benchmarks while reportedly keeping training costs to roughly $5.58 million, a fraction of competitors' budgets. By pioneering architectural innovations such as Mixture-of-Experts (MoE) and Multi-Head Latent Attention, the company optimizes computational efficiency: DeepSeek-V3 activates only 37 billion of its parameters per token, whereas a dense model engages its full parameter count on every token.

Its open-source strategy, releasing model weights under the MIT license, has democratized access to cutting-edge AI, sparking global collaboration and challenging proprietary efforts from giants like Meta and Google. DeepSeek's impact extends beyond technical prowess: its aggressive pricing (as low as $0.014 per million input tokens) has triggered a price war in China's AI market, pressuring tech giants like Tencent and Alibaba to cut their own prices.

The startup's rise, led by founder Liang Wenfeng, a quant-trading pioneer, showcases a unique blend of financial acumen and engineering rigor, leveraging reinforcement learning (RL) to achieve human-expert reasoning without heavy reliance on costly supervised fine-tuning. Despite US chip export restrictions, DeepSeek thrives through software-driven optimizations and partnerships with AMD, proving that resourcefulness can trump raw computational power.

Silicon Valley's reaction has been seismic: Meta's engineers have scrambled to analyze its models, while luminaries like Yann LeCun hail its open-source breakthroughs as a paradigm shift. With its App Store dominance and applications spanning coding, education, and multilingual systems, DeepSeek isn't just challenging norms; it is redefining AI's future through affordability, transparency, and relentless innovation.
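To make the sparse-activation idea concrete, here is a minimal toy sketch of Mixture-of-Experts routing: a gate scores every expert for a token, but only the top-k experts actually run, so most parameters sit idle on any given token. Everything here (the expert count, the scoring function, the expert computation) is illustrative and not DeepSeek's actual architecture.

```python
import random

NUM_EXPERTS = 8   # total experts in the layer (illustrative)
TOP_K = 2         # experts activated per token (sparse activation)

def router_scores(token: str) -> list[float]:
    """Stand-in for a learned gating network: one deterministic
    pseudo-random score per (token, expert) pair."""
    return [random.Random(f"{token}:{e}").random() for e in range(NUM_EXPERTS)]

def expert(e: int, token: str) -> float:
    """Toy expert computation; a real expert is a feed-forward sub-network."""
    return (e + 1) * len(token)

def moe_forward(token: str) -> tuple[list[int], float]:
    """Route one token: pick the top-k experts, run only those,
    and combine their outputs weighted by normalized gate scores."""
    scores = router_scores(token)
    top = sorted(range(NUM_EXPERTS), key=lambda e: scores[e], reverse=True)[:TOP_K]
    total = sum(scores[e] for e in top)
    output = sum((scores[e] / total) * expert(e, token) for e in top)
    return top, output
```

With 2 of 8 experts firing per token, only a quarter of the layer's expert parameters do work on each token, which is the same lever (at toy scale) that lets a large MoE model keep per-token compute far below its total parameter count.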