In early May 2025, Google quietly dropped Gemini 2.5 Pro Preview to all users, weeks ahead of its planned debut at Google I/O 2025, upending expectations for how quickly AI advances can ship. But this wasn’t just another “faster, stronger, better” brag; under the hood, engineers say the model truly cuts hallucinations and expands what developers can build, from drag-and-drop Canvas workflows to more trustworthy code generation.
Meanwhile, rumors swirl of Apple slipping Google’s brain into Siri as a stopgap until Cupertino’s on-device models catch up, and OpenAI has navigated its own governance drama, reaffirming nonprofit control of its for-profit arm even as Elon Musk’s lawsuit simmers on. Throw in Microsoft’s renegotiated revenue split, OpenAI’s blockbuster Windsurf acquisition, and fresh breakthroughs from HeyGen and Lightricks, and you’ve got a week in AI that feels like half a year.
Google’s Early Drop of Gemini 2.5 Pro Preview
Google didn’t just surprise I/O attendees this year; it surprised itself. On May 6, 2025, the company rolled out the Gemini 2.5 Pro Preview (branded the “I/O edition”) to everyone with no fanfare beyond the dev blogs and API dashboards.
Development teams report that this build addresses one of Gemini’s biggest critiques: hallucinations. In internal tests, they observed far fewer phantom functions and botched tool calls, which means developers can lean on Gemini for front-end UI scaffolding with real confidence.
The model leapt to the top of the WebDev Arena leaderboard, an impressive feat considering it beat out Grok-3 and GPT-4.5 by over 40 points. And on the VideoMME benchmark, used to measure AI video understanding, it posted an 84.8% score, proof that Gemini isn’t just a lines-of-code factory but can genuinely “see” and parse visual content.
Context windows remain enormous: a full million tokens, about an hour of 4K video or eleven hours of audio, paving the way for applications that juggle massive documents or media files in one shot. Google is already teasing a push toward two million tokens.
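To make that scale concrete, here’s a minimal sketch of pushing a long transcript through the window in a single call. It assumes the google-generativeai Python SDK and the model ID the preview launched under; check Google’s current model list before copying either.

```python
# Minimal sketch: one long document, one call. Assumes
# `pip install google-generativeai` and the preview's launch-time model ID.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder; use your own key

model = genai.GenerativeModel("gemini-2.5-pro-preview-05-06")

# An hours-long transcript that would overflow most models' context windows.
with open("all_hands_transcript.txt") as f:
    transcript = f.read()

response = model.generate_content(
    f"{transcript}\n\nList every action item above and who owns it."
)
print(response.text)
```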
Drag-and-drop Canvas editing in the consumer app now converts your sketches into React code, living up to years of developer daydreams. Suddenly, low-code meets “AI-speaks-code” in a single workflow. No more copying and pasting; simply draw a box.
Apple’s Quiet Pact with Mountain View
Word is, iOS 19’s much-touted “Apple Intelligence” tier might lean on Google’s powerhouse LLM (at least initially) to shore up Siri’s brainpower.
Apple has poured billions into on-device ML, touting privacy and efficiency. Yet, in antitrust testimony earlier this spring, Sundar Pichai hinted at a deal with Cupertino to embed Gemini in Siri by year’s end.
Why? Because Apple’s models still struggle with niche queries or generating usable code snippets on the fly. Borrowing Gemini under the hood would let Siri handle everything from augmented-reality shopping lists to draft email suggestions without sending data back to Google’s servers. A stopgap, for sure. But once Apple’s own Neural Engines catch up (rumor has it the A19 chip will close the gap), Gemini might quietly be phased out. Until then, expect some Siri-powered demos at June’s WWDC that look strikingly familiar.
OpenAI’s Governance Roller Coaster
Last week, Sam Altman sent a terse memo: OpenAI’s nonprofit arm will remain the controlling shareholder of its public benefit corporation, killing off any plan for a standalone, fully for-profit subsidiary.
This re-embrace of nonprofit oversight traces back to the 2023 boardroom coup and the dramatic ouster-and-reinstatement of Altman himself. Now, the structure is formally cemented in Delaware and California filings, designed to reassure regulators that profit won’t trump mission.
Yet Elon Musk isn’t convinced. His lawsuit (filed in Northern California) claims OpenAI violated its charter by prioritizing commercial deals over public benefit. Musk’s legal team has even argued his near-$100 billion takeover bid was a genuine attempt to realign the company with its founding ethos.
Both sides are digging in. The wholesome optics of nonprofit oversight look good in press releases, but true accountability will hinge on how, and whether, the nonprofit board flexes its influence in boardrooms and product roadmaps.
Microsoft’s Slimmer Cut
Money talks. OpenAI has told investors that by 2030, it will slash the percentage of revenues it pays Microsoft, from the current 20% down to 10%, with room for further reductions if volume targets are hit.
This renegotiation reflects OpenAI’s ambition to raise up to $40 billion at a valuation north of $300 billion, where every point of margin retained translates to hundreds of millions in extra capital for R&D.
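For a sense of scale, a quick back-of-the-envelope helps; the revenue figure below is purely hypothetical, and only the 20%-to-10% change comes from the reporting:

```python
# Back-of-the-envelope on the renegotiated split. The revenue number is
# invented for illustration; only the 20% -> 10% shift is from the report.
hypothetical_2030_revenue = 20e9  # assume $20B in annual revenue

cut_at_20 = 0.20 * hypothetical_2030_revenue
cut_at_10 = 0.10 * hypothetical_2030_revenue
per_point = 0.01 * hypothetical_2030_revenue

print(f"Microsoft's cut at 20%:  ${cut_at_20 / 1e9:.1f}B")
print(f"Microsoft's cut at 10%:  ${cut_at_10 / 1e9:.1f}B")
print(f"Each point of the split: ${per_point / 1e6:.0f}M")  # 'hundreds of millions'
```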
For Microsoft, first dibs on new OpenAI tech remains strategic (Azure still hosts the bulk of OpenAI’s compute), but those revenue dollars will taper unless both parties hammer out fresh commercial terms beyond 2030.
Windsurf Buy: OpenAI’s Biggest Splash Yet
In a move that stunned open-source advocates, OpenAI agreed to acquire Windsurf (ex-Codeium) for roughly $3 billion.
Windsurf is known for its real-time code completions and collaborative canvas tools, features that plug neatly into ChatGPT’s Developer Mode and AI Studio environments.
OpenAI’s pitch: integrate Windsurf’s canvas directly into GPT-powered IDE plugins, giving professional developers a seamless, in-context AI pairing as they code. Think GitHub Copilot on steroids, with richer debugging and UI-generation workflows baked in.
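OpenAI hasn’t published what the post-acquisition integration will look like, but the basic shape of in-editor pairing is easy to sketch with the standard OpenAI Python SDK; the model name and prompt here are illustrative stand-ins, not Windsurf’s actual plumbing:

```python
# Illustrative only: an IDE plugin asking a model to complete the current
# editor buffer. Nothing here reflects Windsurf's real integration.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

buffer = '''def parse_config(path):
    """Load a YAML config and validate required keys."""
'''

resp = client.chat.completions.create(
    model="gpt-4.1",  # stand-in model name; use whatever your plugin targets
    messages=[
        {"role": "system", "content": "You are a pair programmer. Complete the code."},
        {"role": "user", "content": buffer},
    ],
)
print(resp.choices[0].message.content)
```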
But antitrust watchers will be paying attention. A single company swallowing up two leading AI-assisted coding rivals in under two years raises questions about competition in the developer-tools space.
HeyGen Avatar IV: Faces Meet AI
Avatar IV from HeyGen promises lifelike talking-head videos generated from a single selfie and brief voice clip—complete with micro-gestures and nuanced lip sync.
The secret? A diffusion-inspired audio-to-expression engine that analyzes cadence, tone, and pauses to choreograph realistic head movements and facial tics.
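HeyGen hasn’t published the architecture, so treat the following as a shape sketch only: extract per-frame prosody features (loudness, pitch, voicing) and let a trained model map them to facial parameters. Every name and the placeholder mapping below are hypothetical.

```python
# Illustrative pipeline shape; HeyGen's actual engine is proprietary.
import librosa
import numpy as np

FPS = 30  # target video frame rate

def prosody_features(wav_path: str) -> np.ndarray:
    """One row of audio features (loudness, pitch, voicing) per video frame."""
    y, sr = librosa.load(wav_path, sr=16000)
    hop = sr // FPS  # align feature frames with video frames
    rms = librosa.feature.rms(y=y, hop_length=hop)[0]       # cadence/loudness
    f0, voiced, _ = librosa.pyin(y, fmin=65, fmax=400,
                                 sr=sr, hop_length=hop)     # pitch contour
    f0 = np.nan_to_num(f0)  # unvoiced frames (pauses) become 0
    n = min(len(rms), len(f0))
    return np.stack([rms[:n], f0[:n], voiced[:n].astype(float)], axis=1)

features = prosody_features("narration.wav")
# A trained audio-to-expression model would map each row to blendshape
# weights and head-pose offsets; this random projection is a placeholder.
expressions = np.tanh(features @ np.random.randn(3, 52))  # 52 ARKit-style shapes
```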
No jitter. No cartoonish wobble. Producers are using Avatar IV for everything from virtual spokespeople to personalized training demos, applications that demand subtle human expressions.
Lightricks’ LTXV-13B: Hollywood on a GPU
Video AI just got portable. Lightricks’ new LTXV-13B is a 13 billion-parameter model that runs on consumer GPUs (3090, 4090, even laptop-grade cards) and can generate nine-second clips in under forty seconds—30× faster than previous open-source rivals.
Multiscale rendering is the magic sauce: the model first sketches a coarse motion grid, then progressively fills in high-resolution tiles, like painting by numbers, but smarter.
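Lightricks hasn’t published the exact algorithm beyond that description, but the coarse-to-fine control flow itself is easy to sketch. Everything below is a toy stand-in, not LTXV’s code: a cheap low-resolution pass fixes global motion, then tiles are refined at full resolution so peak memory stays within a consumer GPU.

```python
# Toy illustration of coarse-to-fine ("multiscale") rendering control flow.
import numpy as np

def generate_coarse(frames: int, h: int, w: int) -> np.ndarray:
    """Stand-in for the cheap low-res pass that sketches global motion."""
    return np.random.rand(frames, h, w, 3).astype(np.float32)

def refine_tile(tile: np.ndarray) -> np.ndarray:
    """Stand-in for the expensive high-res refinement of one tile."""
    return np.clip(tile + 0.01, 0.0, 1.0)

def multiscale_render(frames=24, coarse_hw=(32, 48), scale=8, tile=64):
    video = generate_coarse(frames, *coarse_hw)
    # Upsample the coarse motion sketch to the target resolution.
    video = video.repeat(scale, axis=1).repeat(scale, axis=2)
    _, h, w, _ = video.shape
    # Refine tile by tile so only one tile's activations are live at once.
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            video[:, y:y+tile, x:x+tile] = refine_tile(video[:, y:y+tile, x:x+tile])
    return video

print(multiscale_render().shape)  # (24, 256, 384, 3)
```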
Because the model is trained on footage licensed from Getty and Shutterstock, its output is commercially cleared, and Lightricks’ license makes it free to use for organizations under $10 million in revenue.
What Comes Next?
We’re two weeks from Google I/O (May 20–22) and a month from WWDC (June 9). Expect deeper dives into Gemini’s roadmap (think 2-million-token windows, Pro vs. Ultra tiers) and demos of Siri tapping Gemini behind the scenes. OpenAI will tease its next model iterations, roll out new AGI-safety initiatives, and possibly unveil further buyouts in developer tooling. In the wild, community hackathons will bend these platforms toward unexpected uses, from generative film pre-viz to AI choreographers.