
Exploring CLI Tools

Agents are large language models (LLMs) that can use tools. Instead of only answering questions, they carry out tasks on your behalf. But what does “tool-use” actually mean, and which tools are we talking about?

The main kind of tool available to these agents is the command-line interface (CLI). Agents communicate in text, and CLIs take text as input and produce text as output, so the two fit together naturally. A CLI is simply a way to control software by typing commands.

Consider this straightforward example: organizing files using the Bash tool.

“Rename all 400 product photos to align with our SKU format, resize them to 1200×1200 pixels, and categorize them into folders.”

  • The ‘mkdir’ command creates a directory (folder). In this case it creates five: output, output/shoes, output/bags, output/jackets, and output/hats.

  • Flags modify how a command behaves. Here, the ‘-p’ flag tells mkdir to create any missing parent folders too, so if ‘./output/’ does not already exist, it is created first.
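
The command those two bullets describe would look roughly like the sketch below. The exact command the agent ran is not reproduced here, so this is a reconstruction from the folder names listed above.

```shell
# Create the output folder and one subfolder per product category.
# -p creates any missing parent folder (./output) and does not
# complain if a folder already exists.
mkdir -p output/shoes output/bags output/jackets output/hats

# Show the folders that now exist under output/.
ls output
```

Because of ‘-p’, running the command a second time is harmless: folders that already exist are left alone.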

The agent finishes the whole job in seconds; done by hand, it could take hours.
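
To give a feel for the renaming step, here is a minimal sketch. The SKU pattern (category name plus a four-digit number) and the function name rename_to_sku are assumptions made up for illustration, since the real SKU format is not given. Resizing is left out because it needs a separate image tool.

```shell
# Sketch only: the SKU pattern <category>-<4-digit number>.jpg is an
# assumed example, not the format from the original task.
rename_to_sku() {
  src_dir=$1    # folder holding the original photos
  category=$2   # e.g. shoes, bags, jackets, hats
  out_dir=$3    # e.g. output

  mkdir -p "$out_dir/$category"
  n=1
  for photo in "$src_dir"/*.jpg; do
    [ -e "$photo" ] || continue   # nothing matched *.jpg
    new_name=$(printf '%s-%04d.jpg' "$category" "$n")
    # Copy rather than move, so the originals stay untouched.
    cp "$photo" "$out_dir/$category/$new_name"
    n=$((n + 1))
  done
}

# Example call (hypothetical folder names):
#   rename_to_sku raw/shoes shoes output
```

A loop like this treats 400 files the same as four, which is why the task takes seconds instead of hours.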

While Bash is a versatile CLI provided by your computer, other specialized CLIs exist for specific tasks:

  • Stripe CLI — Used to retrieve revenue data, manage subscriptions, and test payments.

  • Playwright — Allows control over a web browser, enabling navigation, clicking, form filling, and screenshot capture.

  • AWS CLI — Used for launching servers, managing databases, and scaling infrastructure.

  • Vercel CLI — Can deploy a website live with a single command.

Each of these is a distinct tool an agent can use. The file-organization example above used Bash. Give the agent the Stripe CLI and it can pull revenue data; give it Playwright and it can browse the web; give it the Vercel CLI and it can deploy what it builds.

That’s the essence of “tool-use.” The more CLIs you give an agent access to, the more it can do. Your job is to make sure the agent has the right tools for the task.
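
One quick way to see which tools an agent would actually find on a machine is to ask the shell. The sketch below uses a made-up helper name, check_tools, and assumes the CLIs are installed under the command names stripe, playwright, aws, and vercel. Your setup may differ: Playwright, for example, is often run through npx rather than as a standalone command.

```shell
# Report whether each named command can be found on the PATH.
check_tools() {
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool: available"
    else
      echo "$tool: missing"
    fi
  done
}

# Command names below are assumptions about how each CLI is installed.
check_tools stripe playwright aws vercel
```

Anything reported as missing is a capability the agent does not have until you install that CLI.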

This may sound technical, but you usually only see these commands if you work in a terminal or watch them run in a tool like Claude Code. The rest of the time they run beneath the surface.

When an agent such as Cowork is working on a task, you can click to reveal the commands it ran, for example a command that lists files to find recent fund updates.

Every agent runs commands like this behind the scenes. The user interface hides that complexity and gives you a streamlined experience.


Agents and tool use still have a lot of room to grow. As the technology advances, agents will be able to take on more of your workflow, and giving them the right tools is how you get the most out of them.
