Microsoft unveils powerful lightweight AI model for CPUs

Microsoft researchers have introduced the largest 1-bit AI model to date, called BitNet b1.58 2B4T, designed to run efficiently on standard CPUs instead of relying on GPUs. This ‘bitnet’ model, now openly available under the MIT license, can even operate on Apple’s M2 chips.

Bitnets use extreme weight quantisation, storing each weight as only -1, 0, or 1, which makes them far more memory- and compute-efficient than most conventional models.
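
For readers curious what ternary weights look like in practice, here is a minimal, hypothetical sketch in Python. The absmean-style scaling and the function name are illustrative assumptions for this article, not Microsoft’s actual implementation.

```python
import numpy as np

def ternarise(weights: np.ndarray, eps: float = 1e-6) -> tuple[np.ndarray, float]:
    """Quantise a weight matrix to the ternary set {-1, 0, 1}.

    Illustrative only: scale by the mean absolute weight, then round
    and clip to [-1, 1], broadly in the spirit of 1.58-bit 'bitnet' schemes.
    """
    scale = np.abs(weights).mean() + eps            # per-tensor scaling factor
    quantised = np.clip(np.round(weights / scale), -1, 1)
    return quantised.astype(np.int8), scale         # int8 holds the three values compactly

# Each quantised entry is -1, 0 or 1; q * scale coarsely reconstructs the originals.
w = np.random.randn(4, 4).astype(np.float32)
q, scale = ternarise(w)
print(q)
print(q * scale)
```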

Trained on 4 trillion tokens, roughly the equivalent of 33 million books, the 2-billion-parameter BitNet b1.58 2B4T outperforms several similarly sized models on key benchmarks.

Microsoft claims it beats Meta’s Llama 3.2 1B, Google’s Gemma 3 1B, and Alibaba’s Qwen 2.5 1.5B on tasks like grade-school maths and physical reasoning. It also runs up to twice as fast while using significantly less memory, offering a potential edge for lower-end or energy-constrained devices.

The main limitation lies in its dependence on Microsoft’s custom bitnet.cpp framework, which supports only select hardware and does not yet work with GPUs.

Rather than slotting into the infrastructure that existing AI systems already use, BitNet depends on a narrower software stack, a hurdle that may limit adoption despite its promise for lightweight AI deployment.

Google uses AI and human reviews to fight ad fraud

Google has revealed it suspended 39.2 million advertiser accounts in 2024, more than triple the number from the previous year, as part of its latest push to combat ad fraud.

The tech giant said it is now able to block most bad actors before they even run an advert, thanks to advanced large language models and detection signals such as fake business details and fraudulent payments.

Instead of relying solely on AI, a team of over 100 experts from across Google and DeepMind also reviews deepfake scams and develops targeted countermeasures.

The company rolled out more than 50 LLM-based safety updates last year and introduced over 30 changes to advertising and publishing policies. These efforts, alongside other technical reinforcements, led to a 90% drop in reports of deepfake ads.

The US accounted for the highest number of suspensions, while India followed with 2.9 million accounts taken down. In both countries, ads were removed for violations such as trademark abuse, misleading personalisation, and financial service scams.

Overall, Google blocked 5.1 billion ads globally and restricted a further 9.1 billion. Nearly half a billion of the removed ads were linked specifically to scam activity.

In a year when half the global population headed to the polls, Google also verified over 8,900 election advertisers and took down 10.7 million political ads.

While the scale of suspensions may raise concerns about fairness, Google said human reviews are included in the appeals process.

The company acknowledged past confusion over how enforcement decisions were communicated and is now updating its messaging so advertisers understand the reasons behind account actions more clearly.

Inephany raises $2.2M to make AI training more efficient

London-based AI startup Inephany has secured $2.2 million in pre-seed funding to develop technology aimed at making the training of neural networks—particularly large language models—more efficient and affordable.

The investment round was led by Amadeus Capital Partners, with participation from Sure Valley Ventures and AI pioneer Professor Steve Young, who joins as both chair and angel investor.

Founded in July 2024 by Dr John Torr, Hami Bahraynian, and Maurice von Sturm, Inephany is building an AI-driven platform that improves training efficiency in real time.

By increasing sample efficiency and reducing computing demands, the company hopes to dramatically cut the cost and time of training cutting-edge models.

The team claims their solution could make AI model development at least ten times more cost-effective than current methods.

The funding will support growth of Inephany’s engineering team and accelerate the launch of its first product later this year.

With the costs of training state-of-the-art models now reaching into the hundreds of millions, the startup’s platform aims to make high-performance AI development more sustainable and accessible across industries such as healthcare, weather forecasting, and drug discovery.

South Korea’s $23B chip industry boost in response to global trade war

South Korea has announced a $23 billion support package for its semiconductor industry, up from last year’s $19 billion, to protect giants like Samsung and SK Hynix from US tariff uncertainty and growing competition from China.

The plan allocates 20 trillion won in financial aid, up from 17 trillion, to drive innovation and production, addressing a 31.8% drop in chip exports to China due to US trade restrictions.

The package responds to US policies under President Trump, including export curbs on high-bandwidth chips to China, which have disrupted global demand. 

At the same time, Finance Minister Choi Sang-mok will negotiate with the US to mitigate potential national security probes on chip trade. 

South Korea’s strategy aims to safeguard a critical economic sector that powers everything from smartphones to AI, especially as its auto industry faces US tariff challenges. 

Analysts view this as a preemptive effort to shield the chip industry from escalating global trade tensions.

Why does it matter?

For South Koreans, the semiconductor sector is a national lifeline, tied to jobs and economic stability, with the government betting big to preserve its global tech dominance. As China’s tech ambitions grow and US policies remain unpredictable, Seoul’s $23 billion investment speaks to the cost of staying competitive in a tech-driven world.

Nvidia hit by new US export rules

Nvidia is facing fresh US export restrictions on its H20 AI chips, dealing a blow to the company’s operations in China.

In a filing on Tuesday, Nvidia revealed it will now need a licence to export these chips for the indefinite future, after the US government cited concerns they could be used in a Chinese supercomputer.

The company expects a $5.5 billion charge linked to the controls in its first fiscal quarter of 2026, which ends on 27 April. Shares dropped around 6% in after-hours trading.

The H20 is currently the most advanced AI chip Nvidia can sell to China under existing regulations.

Last week, reports suggested CEO Jensen Huang might have temporarily eased tensions during a dinner at Donald Trump’s Mar-a-Lago resort, by promising investments in US-based AI data centres instead of opposing the rules directly.

Just a day before the filing, Nvidia announced plans to manufacture some chips in the US over the next four years, though the specifics were left vague.

Calls for tighter controls had been building, especially after it emerged that China’s DeepSeek used the H20 to train its R1 model, a system that surprised the US AI sector earlier this year.

Government officials had pushed for action, saying the chip’s capabilities posed a strategic risk. Nvidia declined to comment on the new restrictions.

Quantum breakthrough could be just years away

Most quantum professionals believe that quantum utility — the point at which quantum computers outperform classical machines in solving real-world problems — could be reached within the next decade.

According to a new survey by Economist Impact, 83% of global experts expect quantum utility to arrive in ten years or less, with one-third predicting it will happen in as little as one to five years.

That optimism aligns with some industry roadmaps, such as that of Finnish startup IQM, which is targeting quantum utility as early as next year.

However, there’s still little consensus on the timeline. While Google’s CEO Sundar Pichai recently suggested practically useful quantum computers could be five to ten years away, Nvidia’s Jensen Huang believes it may take at least 15 years — a remark that briefly shook confidence in quantum stocks.

Industry confusion over terms like ‘quantum utility,’ ‘advantage,’ and ‘supremacy’ only adds to the uncertainty, highlighting the need for clearer communication and better public understanding.

Despite the buzz, major challenges remain. Over 80% of professionals cite technical barriers, especially error correction, as a major hurdle.

A further 75% point to a lack of skilled talent in the field. While misconceptions about quantum computing are seen as slowing progress, the real bottlenecks lie in engineering and workforce development.

If these can be overcome, quantum computing could revolutionise sectors from pharmaceuticals and materials science to finance and cybersecurity — with profound implications, both promising and perilous.

Samsung brings AI-powered service tool to India

Samsung, already the leading home appliance brand in India by volume, is now enhancing its after-sales service with an AI-powered support tool.

The South Korean tech company has introduced the Home Appliances Remote Management (HRM) tool, designed to improve service speed, accuracy, and the overall customer experience, moving beyond traditional support methods.

The HRM tool allows customer care teams to remotely diagnose and resolve issues in Samsung smart appliances connected via SmartThings. If a problem can be fixed remotely, staff will ask for the user’s consent before taking control of the device.

If the issue can be solved by the customer, step-by-step instructions are provided instead of sending a technician straight away.

When neither of these options applies, the issue is forwarded directly to service technicians with full diagnostics already completed, cutting down the time spent on-site.
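
As a rough illustration of that three-way triage, the sketch below encodes the decision flow in Python. It is a hypothetical rendering of the process described above, not Samsung’s actual HRM logic, and every name in it is invented for illustration.

```python
from enum import Enum, auto

class Resolution(Enum):
    REMOTE_FIX = auto()      # staff resolve the issue remotely, with the user's consent
    SELF_SERVICE = auto()    # customer follows step-by-step instructions
    TECHNICIAN = auto()      # forwarded to a technician with diagnostics attached

def triage(remotely_fixable: bool, user_consents: bool, customer_solvable: bool) -> Resolution:
    """Hypothetical triage mirroring the flow described above."""
    if remotely_fixable and user_consents:
        return Resolution.REMOTE_FIX
    if customer_solvable:
        return Resolution.SELF_SERVICE
    return Resolution.TECHNICIAN

print(triage(remotely_fixable=False, user_consents=False, customer_solvable=True))
# Resolution.SELF_SERVICE
```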

The new system reduces the need for in-home visits, shortens waiting times, and increases the uptime of appliances instead of leaving users waiting unnecessarily.

SmartThings also plays a proactive role by automatically detecting issues and offering solutions before customers even need to call.

Samsung India’s Vice President for Customer Satisfaction, Sunil Cutinha, noted that the tool significantly streamlines service, boosts maintenance efficiency, and helps ensure timely product support for users across the country.

Nvidia brings AI supercomputer production to the US

Nvidia is shifting its AI supercomputer manufacturing operations to the United States for the first time, instead of relying on a globally dispersed supply chain.

In partnership with industry giants such as TSMC, Foxconn, and Wistron, the company is establishing large-scale facilities to produce its advanced Blackwell chips in Arizona and complete supercomputers in Texas. Production is expected to reach full scale within 12 to 15 months.

Over a million square feet of manufacturing space has been commissioned, with key roles also played by packaging and testing firms Amkor and SPIL.

The move reflects Nvidia’s ambition to create up to half a trillion dollars in AI infrastructure within the next four years, while boosting supply chain resilience and growing its US-based operations instead of expanding solely abroad.

These AI supercomputers are designed to power new, highly specialised data centres known as ‘AI factories,’ capable of handling vast AI workloads.

Nvidia’s investment is expected to support the construction of dozens of such facilities, generating hundreds of thousands of jobs and securing long-term economic value.

To enhance efficiency, Nvidia will apply its own AI, robotics, and simulation tools across these projects, using Omniverse to model factory operations virtually and Isaac GR00T to develop robots that automate production.

According to CEO Jensen Huang, bringing manufacturing home strengthens supply chains and better positions the company to meet the surging global demand for AI computing power.

Zhipu AI launches free agent to rival DeepSeek

Chinese AI startup Zhipu AI has introduced a free AI agent, AutoGLM Rumination, aimed at assisting users with tasks such as web browsing, travel planning, and drafting research reports.

The product was unveiled by CEO Zhang Peng at an event in Beijing, where he highlighted the agent’s use of the company’s proprietary models—GLM-Z1-Air for reasoning and GLM-4-Air-0414 as the foundation.

According to Zhipu, the new GLM-Z1-Air model outperforms DeepSeek’s R1 in both speed and resource efficiency. The launch reflects growing momentum in China’s AI sector, where companies are increasingly focusing on cost-effective solutions to meet rising demand.

AutoGLM Rumination stands out in a competitive landscape by being freely accessible through Zhipu’s official website and mobile app, unlike rival offerings such as Manus’ subscription-only AI agent. The company positions this move as part of a broader strategy to expand access and adoption.

Founded in 2019 as a spinoff from Tsinghua University, Zhipu has developed the GLM model series and claims its GLM4 has surpassed OpenAI’s GPT-4 on several evaluation benchmarks.

In March, Zhipu secured major government-backed investment, including a 300 million yuan (US$41.5 million) contribution from Chengdu.

TheStage AI makes neural network optimisation easy

In a move set to ease one of the most stubborn hurdles in AI development, Delaware-based startup TheStage AI has secured $4.5 million to launch its Automatic NNs Analyzer (ANNA).

Instead of requiring months of manual fine-tuning, ANNA allows developers to optimise AI models in hours, cutting deployment costs by up to five times. The technology is designed to simplify a process that has remained inaccessible to all but the largest tech firms, often limited by expensive GPU infrastructure.

TheStage AI’s system automatically compresses and refines models using techniques like quantisation and pruning, adapting them to various hardware environments without locking users into proprietary platforms.
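
To make the two techniques named above concrete, the sketch below applies magnitude pruning and per-tensor int8 quantisation to a toy weight matrix. It is a generic, hypothetical example of the concepts, not TheStage AI’s ANNA pipeline, and the function names are invented for illustration.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float = 0.5) -> np.ndarray:
    """Zero out the smallest-magnitude weights (generic magnitude pruning)."""
    threshold = np.quantile(np.abs(weights), sparsity)
    return np.where(np.abs(weights) < threshold, 0.0, weights)

def int8_quantise(weights: np.ndarray) -> tuple[np.ndarray, float]:
    """Map float weights to int8 with a single per-tensor scale (generic)."""
    scale = np.abs(weights).max() / 127.0 + 1e-12
    return np.round(weights / scale).astype(np.int8), scale

# Toy layer: prune half the weights, then store the remainder in 8 bits.
w = np.random.randn(256, 256).astype(np.float32)
q, scale = int8_quantise(magnitude_prune(w, sparsity=0.5))
print(f"non-zero weights: {(q != 0).mean():.0%}; storage per weight: 8 bits instead of 32")
```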

Rather than focusing on cloud-based deployment, its ‘Elastic models’ can run anywhere from smartphones to on-premise GPUs. This gives startups and enterprises a cost-effective way to adjust quality and speed through a simple interface, akin to choosing video resolution on a streaming platform.

Backed by notable investors including Mehreen Malik and Atlantic Labs, and already used by companies like Recraft.ai, the startup addresses a growing need as demand shifts from AI training to real-time inference.

Unlike competitors acquired by larger corporations and tied to specific ecosystems, TheStage AI takes a dual-market approach, helping both app developers and AI researchers. Their strategy supports scale without complexity, effectively making AI optimisation available to teams of any size.

Founded by a group of PhD holders with experience at Huawei, the team combines deep academic roots with practical industry application.

By offering a tool that streamlines deployment instead of complicating it, TheStage AI hopes to enable broader use of generative AI technologies in sectors where performance and cost have long been limiting factors.
