Exciting News from Ben’s Bites!

Hello, I’m Ben. I enjoy building projects with agents, even though I’m not a tech expert. Here’s a collection of what I’m exploring and learning. If you’re interested in building or enhancing your ‘vibe-coding’ abilities, join our community.

Hello everyone,

Thomas Dohmke, former CEO of GitHub, has launched Entire — an innovative company focused on creating the “next developer platform” for enabling collaboration between agents and humans. They’ve secured a remarkable $60M seed round, led by Felicis. Their vision is that traditional code in files and pull requests will soon be obsolete. The future lies in transforming intent into outcomes using natural language. Their first product is called Checkpoints, which retains all relevant agent context (including transcript, prompts, files accessed, token usage, and tool calls) alongside every git commit. This advancement has generated considerable buzz on Twitter, with many discussing whether Entire represents a paradigm shift akin to a new-age GitHub. Whether Entire will redefine the space or be just one of many attempts, it’s clear that the infrastructure for a world where agents code is urgently needed.

OpenAI has introduced new features in the Responses API that facilitate long-running agentic tasks: server-side compaction (allowing multi-hour agent executions without exceeding context limits), networking containers (enabling agents to install libraries and run scripts with internet access), and native Skills support with a pre-configured spreadsheet capability. Additionally, they’ve shared 10 tips for running multi-hour agent workflows effectively.

Claude Cowork has now launched for Windows. It includes full feature parity with macOS, offering file access, multi-step tasks, plugins, and MCP connectors.

Matt Shumer’s “Something Big is Happening” went viral. It’s an in-depth essay for non-technical readers on where AI stands right now. The gist: he describes in simple terms what he wants built, steps away for four hours, and returns to find a finished product. According to OpenAI’s documentation, GPT-5.3 Codex “helped build itself.” Whether you find the essay insightful or lacking (for balance, Will Manidis wrote an excellent counter-essay titled “Tool Shaped Objects,” likening AI hype to FarmVille; definitely worth a read), it struck a chord with a lot of people. John Coogan also shared a valuable perspective: AI isn’t a single event like Covid; it’s a series of S-curves, not one trajectory of exponential growth.

Universal-3 Pro is a new promptable speech model for production from AssemblyAI. It allows for upfront guidance during transcription rather than fixing it afterward, effectively managing industry terminology, speaker recognition, disfluencies, and formatting in one go. Try it free until February.

  • Lex Fridman interviewed Peter Steinberger (creator of OpenClaw) in an extensive 3+ hour conversation. It covers his origin story, the reasons behind OpenClaw’s viral success (over 180k GitHub stars), security concerns, comparisons of GPT-5.3 Codex with Opus 4.6, acquisition offers from OpenAI and Meta, and the implications of AI agents potentially replacing a significant portion of applications. It’s a worthwhile listen.

  • Andrej Karpathy discusses DeepWiki and the growing adaptability of software. With this new paradigm, you might not need to install massive libraries anymore; simply instruct your agent to extract exactly what you need. “Libraries are gone; LLMs are the new compilers.”

  • Nader Dabit wrote a tutorial titled “You Could’ve Invented OpenClaw”: an extensive guide that builds up OpenClaw’s architecture from scratch, covering sessions, SOUL.md, tools, permissions, gateway patterns, compaction, memory, cron jobs, and multi-agent setups. At roughly 400 lines of Python, it’s a fantastic resource for understanding how these agents operate. You can grab the markdown version and say “build it” to take it for a spin 😊 (I’ve built a few projects inspired by it).

  • Meng To has shared a 41-minute tutorial on how to ship products, designs, and articles using OpenClaw and Codex.

  • Mitchell Hashimoto remarks that AI-friendly code storage threatens GitHub’s future. Whoever builds the foundational infrastructure that coding agents need will hold a significant advantage.

  • Lenny Rachitsky gathered feedback on OpenClaw: its impact and people’s experiences with it. The contrasting responses are quite entertaining, with users sharing both love and criticism.

  • Boris documented his Claude Code workflow: plan in a separate document, annotate it, and iterate against a persistent artifact that doesn’t get compacted. He argues that planning mode falls short in all coding agents.

  • Agents training next-gen AI models — Hamza explored whether existing models can achieve this. The conclusion? It’s more intricate than the public narrative implies.

  • “I enhanced 15 LLMs’ coding capabilities in one afternoon. Only the harness had changed.” This piece is a real gem. The author developed a new editing tool dubbed Hashline for his open-source coding agent (oh-my-pi, a variant of Pi). Instead of requiring models to reproduce exact text to modify a file (which often fails), each line gets a short content hash, and the model simply references that hash to say which line it wants to edit. The outcome: Gemini’s performance improved by 8%, while Grok Code Fast soared from 6.7% to 68.3%, a tenfold improvement. No retraining, no new models, just an improved tool interface. “The model is the moat. The harness is the bridge. Damaging bridges just leads to fewer people attempting to cross.”

  • Track anything you wish, and gain fresh insights through Signals. Define a topic, entity, or trend, and Signals will compile continuously updating datasets regarding it. Return anytime to investigate emerging trends and insights. Your first Signal is complimentary – acquire more with a 25% discount using BENSBITES25.*

  • pgrok — an alternative to ngrok that’s free to use. Point a wildcard domain to a VPS, set it up on both ends, and you’re ready to go. It runs fully on your infrastructure, built using opentui.

  • Simon Willison has developed Showboat and Rodney — two innovative tools designed to help coding agents display their work beyond just executing automated tests.

  • Tambo 1.0 — an open-source generative UI toolkit specifically for React.

  • Happycapy is now available for everyone! This is an agent-native computer accessible in your browser and on your smartphone. Powered by Claude Code + MiniMax (including Opus 4.6 and MiniMax M2.5), it offers a secure cloud space, agent teams, and automation without installations; just run it. It serves as a comprehensive alternative to self-managing OpenClaw.

  • Repo Prompt 2.0 — features a newly integrated Agent mode using RP’s MCP tools, providing first-class Codex support, along with Claude Code and Gemini CLI capabilities. They’ve even added new onboarding processes!
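
For the curious, here is a rough Python sketch of the hash-per-line editing idea from the Hashline item above. It is my own illustration and not the author’s actual implementation: the `line_hash`, `annotate`, and `apply_edit` names, the 4-character truncated SHA-1, and the `lineno:hash|content` display format are all assumptions made up for this example.

```python
import hashlib


def line_hash(text: str, length: int = 4) -> str:
    """Short content hash for one line (assumed scheme: truncated SHA-1 hex)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()[:length]


def annotate(lines: list[str]) -> list[str]:
    """Render the file the way a model might see it: 'lineno:hash|content'."""
    return [f"{i}:{line_hash(line)}|{line}" for i, line in enumerate(lines, start=1)]


def apply_edit(lines: list[str], lineno: int, expected_hash: str, new_text: str) -> list[str]:
    """Replace one line, but only if its hash still matches what the model saw."""
    if not 1 <= lineno <= len(lines):
        raise IndexError(f"line {lineno} is out of range")
    if line_hash(lines[lineno - 1]) != expected_hash:
        # The file changed (or the model misread it): refuse instead of guessing.
        raise ValueError(f"line {lineno} does not match hash {expected_hash}; re-read the file")
    updated = list(lines)  # leave the caller's list untouched
    updated[lineno - 1] = new_text
    return updated


if __name__ == "__main__":
    source = ["def greet(name):", "    print('helo ' + name)"]
    for row in annotate(source):
        print(row)
    # The model only has to name the line and its hash, not retype the old text.
    fixed = apply_edit(source, 2, line_hash(source[1]), "    print('hello ' + name)")
    print("\n".join(fixed))
```

The point of the design, at least in this sketch: the model never has to reproduce the old text character for character, and a stale or wrong hash fails loudly instead of silently editing the wrong line.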

Enjoying this newsletter? Feel free to share it with a friend.

That wraps things up for today. Please feel free to share your thoughts in the comments. 👋

* A special thank you to our sponsors who made this newsletter possible 🙂
Interested in partnering with us for Q1?
