
Fable Returns: Ben’s Bites

Hello everyone,

During my recent trip to Greece, I found myself needing to arrange a taxi from the airport to my home. After sending a quick message to Codex, my request was processed in just over a minute, and I received a confirmation text indicating that my car was booked.

Taxi Booking with Codex

Codex also added the redacted bit to this image

It may not seem like a monumental task, but it shows a structured approach that carries over to many other tasks.

To begin, I start all my daily agent conversations in my ‘bites’ folder, which contains links to files covering my memories, ongoing projects, and personal information.

Next, Codex accesses my AGENTS.md file and initiates the relevant task.

The system connects to my Google Calendar and Gmail, first checking whether my flight details are logged (thanks to a previous Codex interaction). It retrieves the trip information and examines my emails to find relevant transfer correspondence, which includes details such as my home address and the taxi company I used previously (this information could also be saved in my folders under a ‘TRAVEL.md’ file linked from AGENTS.md).

With all the pertinent details gathered, Codex has the context it needs to complete the booking. It knows the next steps: navigating to the taxi company’s website, filling in the booking form, and processing payment, all carried out on my computer, which stayed on during my trip.
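As a rough illustration of the context-gathering step above, here is a minimal Python sketch that starts from an AGENTS.md file and follows its links to other local markdown files such as TRAVEL.md. This is not how Codex works internally. The function name `collect_context`, the link pattern, and the assumption that every linked file sits in the same folder are all hypothetical choices made for the example.

```python
import re
import tempfile
from pathlib import Path

# Matches markdown links such as [Travel](TRAVEL.md) and captures the target.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+\.md)\)")


def collect_context(folder: Path, entry: str = "AGENTS.md") -> dict[str, str]:
    """Return {file name: text} for the entry file and every local .md it links to.

    Assumption: link targets are resolved relative to `folder`, not to the
    file that contains the link.
    """
    context: dict[str, str] = {}
    queue = [entry]
    while queue:
        name = queue.pop()
        path = folder / name
        if name in context or not path.is_file():
            continue  # skip files already read and links that point nowhere
        text = path.read_text(encoding="utf-8")
        context[name] = text
        queue.extend(LINK_RE.findall(text))
    return context


if __name__ == "__main__":
    # Build a throwaway folder that mirrors the layout described above.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "AGENTS.md").write_text(
            "See [travel notes](TRAVEL.md).", encoding="utf-8"
        )
        (root / "TRAVEL.md").write_text(
            "Home address and preferred taxi company go here.", encoding="utf-8"
        )
        print(sorted(collect_context(root)))  # ['AGENTS.md', 'TRAVEL.md']
```

Anything the agent needs but cannot reach from the entry file never makes it into the context, which is why keeping those links current matters.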

Working with agents relies heavily on providing them with the right context and resources needed for task completion. When both elements are in place, agents can operate independently without direct oversight.

It’s essential to frame tasks with this mindset: what would an agent—or even a person—require if they lacked any background on you or your specific objectives? It’s your responsibility to ensure that all necessary information is accessible.

Let’s dive right in.

By the way, I’d like to give a shout-out to my friend’s entrepreneurial venture, Fixxa—a platform that enables voice quotes and invoicing for UK tradespeople, conveniently delivered via WhatsApp with integrated Stripe links.

This edition of Ben’s Bites is brought to you by Attio

Introducing Attio: your agentic CRM. With agents and automations that build pipelines, chase signals, and drive deals forward, Attio orchestrates your revenue efforts tirelessly. It’s favored by high-growth startups like Granola, Modal, and Wispr Flow. Start for free today.

  • Fable 5 is back for all paid users. Anthropic’s blog noted enhanced guardrails in this update, though I haven’t encountered them yet. This iteration will be available in subscription plans until July 7, with a limit of 50% usage on Fable. One benchmark suggests Fable can handle 16% of remote work projects, which is double that of Opus 4.8.

  • Before Fable’s return, we also received Claude Sonnet 5, which is reported to perform similarly to Opus 4.8 across various tasks. Although it’s priced lower per token, the overall cost is comparable to Opus for specific tasks. It is currently the default for Free and Pro users and is available in Claude Code and the API, with promotional pricing of $2/$10 per million tokens until August 31. My experience, along with that of others, suggests it’s both costly and slow, which makes it less appealing than other models.

  • Two new Gemini media models: Nano Banana 2 Lite and Gemini Omni Flash. Both models are accessible in the Gemini app and API. In the API, Nano Banana 2 Lite offers fast (under 4 seconds) and economical image generation (~30 images at 1K resolution for a dollar), while Omni Flash enables video creation and editing at $0.10/sec.

  • Bridgewater and Thinking Machines trained a specialized model that achieves 84.7% accuracy in financial triage at a cost 13.8 times lower than the best frontier model tested. Separately, for Droid Shield 2.0, Factory fine-tuned two detectors to catch secrets exposed in sessions while minimizing false alarms.

  • Modelence Mobile Builder – create native mobile apps through chat, utilizing the same Modelence authentication and backend. (portfolio company)

  • Browserbase Agents – execute browser automation with one prompt or API call in a hosted environment.

  • Safari MCP – allows agents to utilize Safari for opening tabs, debugging pages, and resolving Safari-specific issues.

  • Option AFK – a Wispr Flow alternative with local transcription (everything stays on-device). It can transcribe multi-hour recordings and includes a CLI for your agents.

  • Claude Science – a Mac/Linux workbench for scientists, offering code-supported figures, 60+ scientific skills/connectors, and more. Currently in beta for Pro, Max, Team, and Enterprise users.

  • Ramp PorTAL – facilitates the transfer of fine-tuned tasks between models at roughly half the standard cost.

  • xAI Voice Agent Builder – create no-code Grok Voice agents at a cost of $0.05/min.

  • plain-writing-skill – guidelines for ensuring agents write clearly, along with an HTML diff highlighting changes.

  • Bond – an AI Chief of Staff designed for founders and executives to monitor context, blockers, and follow-ups.

  • Interfere – keeps an eye on production, identifies issues, and resolves problems before they impact users.

  • AI-native PM leverage – how Product Managers can evolve from text-based assistance to prototypes, PRs, and evaluations. (course)

  • /wizard – a skill for crafting interactive CLIs to simplify cumbersome setup tasks.

  • Understanding agent-written code – a case for retaining comprehension of code’s functionality even when written by agents.
