The past 24 hours brought three notable developments in the AI industry: surging global demand for AI inference, an IBM and AMD partnership on quantum-centric supercomputing, and AI-driven revenue growth at MongoDB. Together they outline the latest trends shaping AI infrastructure, computing innovation, and database ecosystems.
Data shows that Google processed over 980 trillion tokens in July 2025, doubling from May. Microsoft handled over 500 trillion tokens via its Foundry API during fiscal year 2025 (ending June), a 7x year-over-year increase. ByteDance consumed an average of 16.4 trillion tokens per day by May 2025. By the end of June, China’s daily token consumption had reached 30 trillion, a 300x jump compared to early 2024.
Commentary:
The inference phase now consumes significantly more tokens than training, signaling a shift from AI model development to large-scale deployment. Inference APIs boast profit margins as high as 70%, generating major revenue for companies like Google and Microsoft. Google Cloud reported $13.6B in Q2 2025 revenue, while Microsoft Azure AI grew 5x.
NVIDIA’s next-generation Spectrum-XGS Ethernet and Jetson Thor technologies are expected to further improve inference efficiency.
Microsoft Azure currently serves OpenAI and numerous other AI companies, and Google reports more than 85,000 enterprises using Gemini, which highlights the scale of market demand for inference capacity. ByteDance’s total token consumption in May was close to 500 trillion (about 16.4 trillion per day over 31 days), underscoring its strength in AI applications and its competitiveness in global AI infrastructure.
However, the sheer scale of token processing also raises growing concerns around privacy and regulatory compliance.
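The token volumes quoted in this item can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the constants are the figures reported above, the function names are made up for this example, and it assumes "doubling" and "300x" are exact multiples, which the original reports may not intend.

```python
# Figures as quoted in the text, in trillions of tokens.
GOOGLE_JULY_2025 = 980.0          # monthly, reported as "over 980 trillion"
BYTEDANCE_DAILY_MAY_2025 = 16.4   # daily average
CHINA_DAILY_JUNE_2025 = 30.0      # daily
DAYS_IN_MAY = 31


def monthly_total(daily_average: float, days: int) -> float:
    """Monthly volume implied by a daily average."""
    return daily_average * days


def implied_baseline(current: float, multiple: float) -> float:
    """Earlier value implied by a stated growth multiple (assumed exact)."""
    return current / multiple


bytedance_may = monthly_total(BYTEDANCE_DAILY_MAY_2025, DAYS_IN_MAY)
google_may = implied_baseline(GOOGLE_JULY_2025, 2)
china_early_2024 = implied_baseline(CHINA_DAILY_JUNE_2025, 300)

print(f"ByteDance, May 2025 total: {bytedance_may:.1f} trillion")
print(f"Google, implied May 2025 volume: {google_may:.0f} trillion")
print(f"China, implied early-2024 daily volume: {china_early_2024:.2f} trillion")
```

Under these assumptions, the ByteDance daily average implies a May total of about 508 trillion tokens, which matches the "close to 500 trillion" figure, and the 300x multiple implies that China's daily consumption in early 2024 was on the order of 0.1 trillion (100 billion) tokens.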
IBM and AMD have signed an agreement to jointly develop a quantum-centric supercomputing architecture that integrates quantum computing with high-performance computing (HPC) and AI infrastructure. Under this architecture, quantum computers will work seamlessly with CPUs, GPUs, and other computing engines to support scalable AI workloads.
Commentary:
Quantum computing excels at simulating complex molecular behavior, while HPC and AI are optimized for large-scale data analysis. Combining them opens the door to problems that classical computing alone cannot handle.
This partnership boosts industry confidence, but quantum computing is still in its early stages, and there are growing concerns over potential investment bubbles in AI and quantum. If IBM and AMD successfully integrate quantum with AI, it could accelerate breakthroughs and reshape the future of intelligent computing.
MongoDB reported total Q2 FY2026 revenue of $591.4M, up 24% year-over-year. Subscription revenue reached $572.4M (+23%), while services revenue hit $19M (+33%).
Gross profit for the quarter was $420M with a margin of 71%, slightly below last year’s 73%. Non-GAAP gross profit came in at $436.4M with a 74% margin. Operating loss narrowed to $65.3M, and non-GAAP operating income rose to $86.8M (+65% YoY). As of July 31, 2025, MongoDB held $2.3B in cash and equivalents.
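As a quick consistency check, the reported margins can be recomputed from the revenue and gross profit figures above. This is a minimal sketch: the constant and function names are illustrative, and the inputs are the rounded values quoted in the text, so the results are only accurate to about one decimal place.

```python
# All figures in millions of USD, as reported in the text.
SUBSCRIPTION_REVENUE = 572.4
SERVICES_REVENUE = 19.0
TOTAL_REVENUE = 591.4
GAAP_GROSS_PROFIT = 420.0
NON_GAAP_GROSS_PROFIT = 436.4


def margin_pct(profit: float, revenue: float) -> float:
    """Profit as a percentage of revenue."""
    return 100.0 * profit / revenue


# Subscription plus services should add up to total revenue.
segments_sum = SUBSCRIPTION_REVENUE + SERVICES_REVENUE

gaap_margin = margin_pct(GAAP_GROSS_PROFIT, TOTAL_REVENUE)
non_gaap_margin = margin_pct(NON_GAAP_GROSS_PROFIT, TOTAL_REVENUE)

print(f"Segments sum: ${segments_sum:.1f}M vs reported total ${TOTAL_REVENUE:.1f}M")
print(f"GAAP gross margin: {gaap_margin:.1f}%")
print(f"Non-GAAP gross margin: {non_gaap_margin:.1f}%")
```

The two segments sum to the reported $591.4M, and the recomputed margins (roughly 71.0% and 73.8%) agree with the stated 71% and 74% once rounded.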
Commentary:
MongoDB’s AI strategy is powering rapid growth in its Atlas cloud database. Atlas revenue grew 29% year-over-year and accounted for 74% of Q2 revenue. The customer count reached 59,900, an increase of 2,800 in a single quarter.
MongoDB’s NoSQL document database stands out for its flexibility and performance, making it ideal for AI workloads such as vector search, generative AI data storage, and real-time inference.
The explosion of inference demand has accelerated MongoDB’s growth, but expanding cloud infrastructure and developing AI-specific capabilities have also increased costs. Facing competition from cloud giants like Google, AWS, and Microsoft, maintaining differentiation remains a key challenge for MongoDB.
For more cutting-edge AI insights, business analysis, and tech trends, visit:
https://iaiseek.com