Introducing GPT-5.6: Ben’s Bites Update

Hey everyone! After a refreshing long weekend in Greece, I’m excited to share some insights about a groundbreaking company I invested in a few years ago that is now emerging from stealth mode.

Etched.

While training has dominated discussions in recent years, the real competition in 2026 will center around delivering intelligence: reducing latency, cutting costs, minimizing power consumption, and increasing token capacity. The demand for enhanced inference systems is unprecedented, and the market for inference is unlike anything we’ve witnessed before. In fact, it’s enough to keep some people awake at night (check out the tweet below for a laugh 😂).

AI inference could redefine market dynamics and potentially become the largest sector in history. The primary challenge, however, is inference hardware. Etched has developed an outstanding product that could change how models are run.

Etched is constructing cutting-edge inference clusters with an unparalleled level of vertical integration, which includes chips, racks, software, manufacturing, and production—all designed collaboratively from start to finish. Remarkably, their first product is due in under three years, a significant improvement over the typical 7+ years for most hardware companies.

They have already made remarkable progress: securing $800 million in funding, with a backlog of over $1 billion in orders. Production is in motion with TSMC, and their initial chips operated flawlessly on the first attempt (A0) using TSMC’s 4nm technology. An A0 success is not easy, and achieving it so quickly is astonishing.

I invested in Gavin back in 2023 when he was a 21-year-old Harvard dropout. His first-principles technical conviction and his ability to assemble an extraordinary team, including people who built previous generations of computing infrastructure, convinced me. Etched now has over 400 team members from NVIDIA, Google TPU, Broadcom, SK Hynix, TSMC, and virtually every notable AI chip initiative.

They have the backing of major players like VentureTech Alliance (with a robust relationship with TSMC), Peter Thiel, Jane Street, Two Sigma, Jump, HRT, Stripes, Ribbit, and many others.

Success in AI won’t hinge solely on having the best models; it will also depend on who can serve them effectively.

Ben’s Bites is sponsored by Render

Chainable compute. Right on queue.

Define tasks with Render’s lightweight SDK and arrange them into robust, ongoing workflows. Launch your agents and batch jobs on demand. Render Workflows takes care of queuing, orchestration, and retries.

Try it: Use code RENDER-BENSBITES for $50 in credits.

  • OpenAI has launched GPT-5.6, but its release is facing delays from the U.S. government. Only a select group of partners currently has access to the new models: Sol, Terra, and Luna. Of these, Sol is the most advanced, surpassing Mythos in several benchmarks but falling slightly short in exploiting cybersecurity vulnerabilities. Sam Altman says GPT-5.6 will soon be available to the general public, although it may start in the U.S. only, even as he advocates for a global rollout.

  • OpenAI has also published an economics paper detailing the adoption of Codex both within and outside the company. The rate of non-technical adoption is quickly catching up with that of engineering teams. The lead on the Codex app discussed the design and its important features on Lenny’s podcast.

  • Cursor for iOS allows you to launch cloud agents directly from your smartphone and remotely control agents running on your computer. Composer 2.5 is currently available at 75% off until July 5.

  • X has introduced a hosted MCP that connects Grok, Cursor, and other MCP-compatible tools to the X API and developer documentation without the need for self-hosting.

  • How can we create AI that genuinely understands users? Working Smarter, a podcast from Dropbox focused on AI and modern work, has returned for its third season. From context engineering to multimodal search, listen to insights on building AI that works seamlessly in various environments. Catch the first episode now*

  • Replit has launched a desktop app compatible with both Mac and Windows platforms.

  • The U.S. National Design Studio has created Rampart, a lightweight in-browser ML model for redacting PII before it’s sent to a server.

  • Custom agents are available for regular users; however, power users seek skills that they can integrate into their own agents to enhance their experience.

  • Human-in-the-loop – how can we effectively balance interaction with agents?

  • Zaro – enables the development of live apps, agents, and workflows using Slack, email, documents, and calendars.

  • Unpeel – a native Mac terminal for agents, supporting persistent sessions and git worktrees. (demo)

  • Tau – an educational agent framework designed for creating TUIs, extensions, and harnesses.

  • Inference.net – allows users to test GLM 5.2 on mirror production traffic before switching models for live users.

  • Odessia Travel – an AI trip planner that intelligently searches and books flights, accommodations, and activities.

  • smolmachines – ideal for spinning up hardware-isolated Linux microVMs.

  • Animation vocabulary – a skill that enables users to request motion using terms like ‘morph’ and ‘rubber-band.’

  • Whenever you feel the need for a dashboard, ask your agent to create a temporary HTML page.

  • Teaching agents about product design at Vercel.

  • MCPs, APIs, and CLIs – these concepts are interconnected.

  • A new collection of shadcn components for crafting chat interfaces—streaming chat, scrolling, messages, and attachments.




* sponsors who make this newsletter possible 🙂
Interested in partnering with us for the upcoming quarter?
Contact us at shanice@bensbites.com or k@bensbites.com
