Hello, I’m Ben. I’m passionate about creating tools with agents, despite not being a technical expert. Here’s a glimpse into my current readings and projects. If you’re interested in starting your own creations or enhancing your ‘vibe-coding’ abilities, I invite you to join our community.
Hello everyone,
Recently, I developed a simple tool that I found necessary. Amidst the excitement around clawdbot and openclaw, I realized it was challenging to manage files on my remote machines (like my Mac Mini or VPS). To address this, I created a combined file explorer application that allows users to upload, view, and edit files on any connected device, whether local or remote. Best of all, it’s free to clone and remix.
We also have two new coding models on the scene: Opus 4.6 from Anthropic and GPT-5.3-Codex from OpenAI. My social feeds are buzzing about GPT-5.3-Codex (check out Matt’s and Theo’s reviews). Personally, I find myself leaning towards it at times; when Opus encounters roadblocks or seems less intelligent about certain tasks, I turn to Codex. For direct requirements and speedy execution, Codex is the go-to, but for planning and brainstorming, Opus shines due to its resourcefulness (documents, links, etc.). Both OpenAI and Anthropic are pushing the limits with these models.
The latest Opus version comes with beta support for a massive 1M tokens in its context window and a fast mode that delivers outputs 2.5x quicker, albeit at a cost of 6x the original price. Anthropic has also rolled out several API enhancements, including Context Compaction, Adaptive Thinking, Effort, and Claude Code’s new feature: Agent Teams (check out the demo and installation guide). These Agent Teams function as multiple Claude sessions that can collaborate on shared tasks, with their own communications and centralized management. This feature is accessible to all users of Claude Code.
OpenAI is also making headlines with a new platform called OpenAI Frontier. Similar to the capabilities of Codex, it enables enterprises to create agents connected to their data and run commands on computers with feedback mechanisms for continuous improvement. Although Copilot and Google Cloud have been in this space for a while, limitations in model capabilities and computer access have hindered their potential. In my view, Frontier feels like a strategic bid to attract those users amidst the backdrop of Claude Code’s new Agent Teams.
Here are a few more noteworthy highlights:
Ever wondered why there’s always a meeting bot on your Zoom calls? The answer lies with Recall.ai, the backbone of numerous meeting AI applications, from Cluely to Hubspot to Clickup. Recall.ai excels in managing the complexities of data recording across various meeting platforms. Start your journey with $100 in credits.*
- Wiki Education’s collaboration with Pangram (an AI detection tool) produced a comprehensive report outlining its effectiveness and limitations. The community’s trust in Pangram’s detection abilities has surpassed initial skepticism, making this an essential read.
- Tailscale facilitates seamless access to a development environment on a remote machine (like a Mac Mini) from any device. I’ve been utilizing it for my projects (here’s a guide). The company’s ex-CTO (now with exe.dev) recently reflected on the past eight months of agent development.
- Stripe is harnessing ‘minions’: agents capable of executing features from beginning to end. Simon wrote about how StrongDM’s AI team manages to build robust software without directly examining the code. Additionally, explore: Agent-native engineering, focusing on organizing your team around agents as independent contributors rather than traditional engineers.
- Should we consider developing a new programming language tailored for the agent era? (I certainly think it’s worth exploring, and I’m eager to invest in it!)
- Ghostty founder’s transition from being skeptical of AI to recognizing its daily value.
- Let Claude refine its own abilities and enhance its marketing strategies.
- The emergence of the professional vibe coder continues to shape the landscape.
- Stop conversing with mere predictive text and start conducting genuine research with Superagent. Pose a question and watch it work: Subagents deeply investigate your topic, gather reputable sources, and compile everything into boardroom-ready reports, slides, documents, or websites.*
- 👩‍🚀 Agent Composer – AI agents designed for advanced industries, aimed at reducing routine engineering tasks from hours to minutes.*
- Claude in Excel and PowerPoint – Official extensions for office tools, brought to you by Anthropic.
- Sphinx – A browser-based data science environment featuring a robust agent.
- Agentation – Allow your agent to fix your UI by annotating elements in your application.
- Solo manages your entire development stack, detecting all processes and letting you start everything with just one click.
- Observational Memory by Mastra AI – A human-like memory system with a stable context window.
- Keep.md – Save links from anywhere, organize them as Markdown, and access them from your agent whenever needed. (I plan to replace my current link-saving method with this for this newsletter)
- An endeavor in prototyping components within your coding workspace.
OpenClaw’s skill store, Clawhub, now automatically scans all skills for malware using VirusTotal. There are numerous variants of OpenClaw launching daily. Here are a few that seem promising:
- Webclaw – A quick, local-first, open-source web client for OpenClaw.
- Aight.cool – OpenClaw as an iOS application.
- Klaus by Bits – A cloud-based, feature-rich solution.
There are plenty of articles available to help you set it up and maximize its potential. A few choice selections:
And try this entertaining tool → Lobster Anatomy. It visualizes your OpenClaw agent and aids in its enhancement.
- Latch – Security middleware to safeguard agents and their tools, preventing suspicious actions while allowing safe ones. (I’m an investor!)
- Run npx playbooks scan skills to check your locally installed skills for security vulnerabilities, or explore this security skill.
- BabyAGI 3 – A minimal autonomous assistant (learn more)
- md-browser – A mini-browser focused on Markdown that interprets the web like an AI.
- Agent-relay – Facilitating real-time messaging between AI agents with sub-5ms latency, regardless of CLI or programming language.
- Sage – A privacy-centric personal AI agent equipped with persistent memory, built in Rust. (watch the explainer video)
- agent-browser can now interact with local PDFs/HTML files, capturing all clickable divs on a webpage.
- pi-messenger – A chat room designed for multiple agents collaborating on the same project.
- Shannon – An AI security tool aimed at identifying vulnerabilities in your application.
- Napkin – A skill integrated with Claude Code, designed to give the agent memory of its previous errors.
- X API is now under a pay-per-use model. While I prefer using Bird CLI, this new pricing structure allows for easy development of projects like this Twitter research assistant using official APIs.
- Cloudflare’s Sandbox SDK now supports PTY (pseudo-terminal) passthrough, enabling terminal-like UIs in your browser.
- Vercel AI Accelerator – A six-week program offering access to the Vercel team, investors, and $6M in credits. Applications are open until February 16th.
- Vouch – A system for managing community trust regarding contributions to your open-source projects.
- Cursor has launched Composer 1.5 – built on the same foundation as Composer 1 but significantly improved. It costs more, and so far its gains have been measured only on Cursorbench (an internal benchmark without public criteria).
I hope you’ve enjoyed this newsletter. If you have a friend who might benefit, feel free to share it!
That wraps it up for today. I encourage you to share your thoughts and feedback. 👋
* A special thanks to our sponsors who make this newsletter possible 🙂
If you’re interested in partnering with us for Q1, let’s talk.

