AI & ML · 9 min read

Vercel Eve: An AI Agent Is Just a Folder of Files

Vercel's open-source Eve makes an AI agent a folder of files: instructions, tools, and skills that ship as a durable service. How it works and when to use it.

Rhythm Bhiwani · Jun 24, 2026

Half the commits landing on Vercel's own platform are now written by agents, not people. That was the number Guillermo Rauch led with at Ship in London: agent-triggered commits jumped from under 3% to more than 50% in six months. Which raises an awkward question. If agents write that much of the code, what does an agent actually look like as a project you can read, review, and deploy?

Vercel shipped its answer the same day. It's called Eve, and the whole bet is that an agent should be a folder of files.

What Eve is

Eve launched on June 17 at Vercel Ship 2026 in London. It's open-source under Apache-2.0, TypeScript all the way through, and in public preview now. The framing people keep reaching for is "Next.js for agents," and that's the right instinct. Next.js took a pile of React, a router, a bundler, and a server you used to wire together by hand, and gave it a convention. Eve is trying to do the same thing for the loop you write every time you build an agent.

If you've ever built an LLM agent from scratch, you know that loop: a model, a list of tools, and a while-loop that calls the model, runs whatever tool it asked for, feeds the result back, and repeats until the agent says it's done. The model part is easy now. The annoying part is everything around it. Where does state live? What happens when the process restarts mid-task? How do you let a human approve a risky step? How do you test the thing? Eve's pitch is that those should be conventions, not code you write again on every project.
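For reference, here is roughly what that hand-rolled loop looks like with the framework stripped away. This is a minimal sketch, not Eve code: the types, `runAgent`, and the stub model are all made up for illustration, and the model is a synchronous stub where a real one would be an async API call.

```typescript
// A framework-free sketch of the agent loop. Every name here is hypothetical.
type ToolCall = { tool: string; args: Record<string, unknown> };
type ModelReply =
  | { kind: "tool_call"; call: ToolCall }
  | { kind: "final"; text: string };
type Message = { role: "user" | "assistant" | "tool"; content: string };
type Model = (history: Message[]) => ModelReply;
type Tool = (args: Record<string, unknown>) => unknown;

function runAgent(
  model: Model,
  tools: Map<string, Tool>,
  userInput: string,
  maxSteps = 10,
): string {
  const history: Message[] = [{ role: "user", content: userInput }];

  for (let step = 0; step < maxSteps; step++) {
    const reply = model(history);

    // The model says it is done: return its answer.
    if (reply.kind === "final") return reply.text;

    // Otherwise run the requested tool and feed the result back in.
    const tool = tools.get(reply.call.tool);
    const result =
      tool !== undefined
        ? tool(reply.call.args)
        : { error: `unknown tool: ${reply.call.tool}` };
    history.push({ role: "assistant", content: JSON.stringify(reply.call) });
    history.push({ role: "tool", content: JSON.stringify(result) });
  }
  throw new Error(`agent did not finish within ${maxSteps} steps`);
}

// Stub model: asks for the weather once, then reports what the tool returned.
const stubModel: Model = (history) => {
  const last = history[history.length - 1];
  if (last?.role === "tool") {
    return { kind: "final", text: `Tool said: ${last.content}` };
  }
  return {
    kind: "tool_call",
    call: { tool: "get_weather", args: { city: "London" } },
  };
};

const tools = new Map<string, Tool>([
  ["get_weather", (args) => ({ city: args.city, tempC: 18 })],
]);

const answer = runAgent(stubModel, tools, "What's the weather in London?");
// answer → 'Tool said: {"city":"London","tempC":18}'
```

Notice what the sketch leaves out: `history` lives in memory, nothing survives a restart, and there is no place for a human to step in. Those gaps are the part Eve is trying to turn into convention.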

An agent is a directory of files

Here's the core idea, and it's almost too simple. You make a folder. The folder is the agent.

agent/
  agent.ts          # model + settings
  instructions.md   # the system prompt: who this agent is
  tools/            # TypeScript functions the agent can call
    get_weather.ts
  skills/           # Markdown domain knowledge, loaded on demand
    refund-policy.md
  subagents/        # agents this one can hand work off to
  channels/         # where it listens: Slack, Discord, GitHub...
  schedules/        # cron jobs for autonomous runs

Eve scans that directory, validates what's inside, and compiles it into a manifest that runs as a durable service on Vercel Functions. Nothing is registered in code. The filesystem is the config. Drop a file in tools/ and the agent can call it. Drop a Markdown file in skills/ and it becomes domain knowledge the agent pulls in when it's relevant.
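To make "the filesystem is the config" concrete, here is a toy version of the idea. It is a sketch under loud assumptions: `buildManifest`, the `AgentManifest` shape, and the validation rule are invented for illustration and are not Eve's compiler or manifest format. It also only looks at `instructions.md`, `tools/`, and `skills/`, and takes a list of paths instead of reading the disk.

```typescript
// Hypothetical sketch: derive an agent manifest purely from file paths.
interface AgentManifest {
  instructions: string | null; // path to the system prompt, if present
  tools: string[];             // tool names, taken from file names in tools/
  skills: string[];            // skill names, taken from file names in skills/
  errors: string[];            // validation problems found while scanning
}

// Returns the bare file name if `path` is `<dir>/<name><ext>`, else null.
function nameIn(dir: string, ext: string, path: string): string | null {
  const prefix = `${dir}/`;
  if (!path.startsWith(prefix) || !path.endsWith(ext)) return null;
  const name = path.slice(prefix.length, path.length - ext.length);
  return name.length > 0 && !name.includes("/") ? name : null;
}

function buildManifest(paths: string[]): AgentManifest {
  const manifest: AgentManifest = {
    instructions: null,
    tools: [],
    skills: [],
    errors: [],
  };

  for (const path of paths) {
    const tool = nameIn("tools", ".ts", path);
    const skill = nameIn("skills", ".md", path);

    if (path === "instructions.md") manifest.instructions = path;
    else if (tool !== null) manifest.tools.push(tool);
    else if (skill !== null) manifest.skills.push(skill);
    // Anything else (agent.ts, channels/, ...) is ignored by this sketch.
  }

  if (manifest.instructions === null) {
    manifest.errors.push("missing instructions.md");
  }
  return manifest;
}

const manifest = buildManifest([
  "agent.ts",
  "instructions.md",
  "tools/get_weather.ts",
  "skills/refund-policy.md",
]);
// manifest.tools  → ["get_weather"]
// manifest.skills → ["refund-policy"]
// manifest.errors → []
```

The point of the exercise: adding a capability means adding a path to that list, and nothing else changes.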

The two files you'll touch most are the instructions and a tool. One is plain prose, the other is plain TypeScript.

You are a support assistant for a weather app.
 
Be brief. When someone asks about conditions in a city,
call the get_weather tool and summarize the result in
one sentence. If the city is ambiguous, ask which one
they mean before calling anything.

That tool is just a function, and tool calling is the mechanism underneath: the model decides it needs weather, Eve runs run(), and the JSON goes back into the conversation. The exact export names are in the docs, but the shape is the point. A tool is a file, not a registration ceremony.
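As a rough illustration of that shape, a `get_weather.ts` might look something like the sketch below. Treat every name as an assumption: `description`, `run`, the input and output types, and the canned data are placeholders, and Eve's real export names and signatures are whatever its docs say.

```typescript
// tools/get_weather.ts (hypothetical shape, not Eve's documented API).
// In a real tool file these would be exports; the export keyword is left
// off so the sketch also runs as a standalone script.

interface WeatherInput {
  city: string;
}

interface WeatherReport {
  city: string;
  tempC: number;
  conditions: string;
}

// Canned data stands in for a real weather API so the sketch runs offline.
const CANNED = new Map<string, { tempC: number; conditions: string }>([
  ["london", { tempC: 18, conditions: "cloudy" }],
  ["paris", { tempC: 22, conditions: "sunny" }],
]);

// What the model reads when deciding whether to call this tool.
const description = "Look up current conditions for a city.";

// What runs when the model asks for the tool. The return value is plain
// data that can be serialized to JSON and fed back into the conversation.
function run(input: WeatherInput): WeatherReport | { error: string } {
  const city = input.city.trim();
  const report = CANNED.get(city.toLowerCase());
  if (report === undefined) {
    // Fail with data, not an exception, so the model can recover.
    return { error: `no data for "${city}"` };
  }
  return { city, ...report };
}

const london = run({ city: "London" });
// london → { city: "London", tempC: 18, conditions: "cloudy" }
```

Returning an error object instead of throwing is a design choice in this sketch: the model sees the failure as a normal tool result and can ask the user to clarify.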

Durable by default

This is the part worth stealing even if you never touch Eve.

Every conversation in Eve is a durable workflow. Each step gets checkpointed, so a session can pause, survive a crash or a redeploy, and pick up exactly where it left off. An agent that's waiting on a human approval or a slow API doesn't sit there burning a function invocation. It parks. State is saved, compute stops, and it wakes back up when the next event arrives.
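Parking is easier to see in code than in prose. The sketch below is not how Eve implements it; `requestApproval`, `onApprovalEvent`, and the in-memory `store` are hypothetical stand-ins. The thing to look at is that waiting is not a loop or a timer. The session is written down, the function returns, and a later event handler picks it back up.

```typescript
// Hypothetical sketch of "parking" a session while it waits for a human.
type Session =
  | { status: "parked"; pendingAction: string }
  | { status: "done"; outcome: "executed" | "rejected"; action: string };

// Stand-in for durable storage: session id → serialized session.
const store = new Map<string, string>();

// Called when the agent hits a risky step. It saves state and returns,
// so nothing is running while the human makes up their mind.
function requestApproval(sessionId: string, action: string): void {
  const parked: Session = { status: "parked", pendingAction: action };
  store.set(sessionId, JSON.stringify(parked));
}

// Called later by whatever delivers the human's decision (a button click,
// a webhook). It reloads the saved state and continues from there.
function onApprovalEvent(sessionId: string, approved: boolean): Session {
  const saved = store.get(sessionId);
  if (saved === undefined) throw new Error(`no session ${sessionId}`);

  const session = JSON.parse(saved) as Session;
  if (session.status !== "parked") {
    throw new Error(`session ${sessionId} is not waiting for approval`);
  }

  const done: Session = {
    status: "done",
    outcome: approved ? "executed" : "rejected",
    action: session.pendingAction,
  };
  store.set(sessionId, JSON.stringify(done));
  return done;
}

requestApproval("session-1", "issue refund");
// ...minutes or hours pass with no compute running...
const resumed = onApprovalEvent("session-1", true);
// resumed → { status: "done", outcome: "executed", action: "issue refund" }
```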

Why durable execution matters for agents

A normal serverless function has a time limit and dies when it returns. That's fine for a request that finishes in a second. It's useless for an agent that might wait three hours for a human to click approve, or run a chain of twenty tool calls where step 14 hits a flaky API. Checkpointing each step means the long-running, stop-and-start nature of agent work stops fighting the platform it runs on. The work resumes instead of restarting.
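Here is the resume-instead-of-restart idea as a toy. It is a sketch under stated assumptions: `runWorkflow`, the `Step` type, and the `Map` standing in for durable storage are illustrative, not Eve's workflow engine, and the steps are synchronous to keep it short.

```typescript
// Hypothetical sketch of per-step checkpointing.
type Step = { name: string; run: () => string };

// Stand-in for durable storage: step name → saved result.
type Checkpoints = Map<string, string>;

function runWorkflow(steps: Step[], checkpoints: Checkpoints): string[] {
  const results: string[] = [];
  for (const step of steps) {
    const saved = checkpoints.get(step.name);
    if (saved !== undefined) {
      // Finished in an earlier attempt: reuse the result, skip the work.
      results.push(saved);
      continue;
    }
    // May throw. Checkpoints written by earlier steps are untouched.
    const result = step.run();
    checkpoints.set(step.name, result);
    results.push(result);
  }
  return results;
}

// A three-step task whose middle step fails on its first call.
let fetchCalls = 0;
let flakyCalls = 0;
const steps: Step[] = [
  { name: "fetch", run: () => { fetchCalls++; return "data"; } },
  {
    name: "flaky-api",
    run: () => {
      flakyCalls++;
      if (flakyCalls === 1) throw new Error("timeout");
      return "ok";
    },
  },
  { name: "summarize", run: () => "summary" },
];

const checkpoints: Checkpoints = new Map();
let firstAttemptFailed = false;
try {
  runWorkflow(steps, checkpoints);
} catch {
  firstAttemptFailed = true; // "crashed" at flaky-api
}

// Second attempt picks up at flaky-api. "fetch" is not run again.
const results = runWorkflow(steps, checkpoints);
// results → ["data", "ok", "summary"], fetchCalls → 1
```

Without the checkpoint map, the second attempt would redo `fetch`, which is harmless here and expensive or dangerous when the step sends an email or charges a card.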

If you've built an agent on raw functions, you've felt this gap. You end up bolting on a queue, a state table, and a retry scheme to fake durability. Eve makes that the floor instead of a project.

What you get without writing it

The directory gives you six things by default. Durable execution is the one above. The rest:

Sandboxed compute for code the agent generates or runs, using Vercel Sandbox in production and a local sandbox while you develop.
Human-in-the-loop approvals, where the agent parks on a risky action and waits for a person, with no compute ticking the whole time.
Subagents so one agent can hand a sub-task to another instead of cramming every job into one giant prompt.

You also get OpenTelemetry tracing that exports to the usual places like Datadog or Honeycomb, and a built-in evals system so you can score an agent against test cases instead of eyeballing whether your last prompt tweak made it worse. Tracing and evals are the two things people skip when they roll their own, and they're exactly the two that tell you whether the thing actually works.
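If evals sound heavier than they are, this is the whole idea in a few lines. It is a minimal sketch, not Eve's built-in evals system: `runEvals`, `EvalCase`, the substring check, and the stub agent are all assumptions made for illustration. Real evals usually score with something richer than "output contains a string."

```typescript
// Hypothetical sketch: score an agent against fixed test cases.
type Agent = (input: string) => string;

interface EvalCase {
  input: string;
  mustInclude: string; // the output passes if it contains this text
}

interface EvalReport {
  passed: number;
  total: number;
  failures: string[]; // inputs whose output failed the check
}

function runEvals(agent: Agent, cases: EvalCase[]): EvalReport {
  const failures: string[] = [];
  for (const testCase of cases) {
    const output = agent(testCase.input);
    if (!output.includes(testCase.mustInclude)) {
      failures.push(testCase.input);
    }
  }
  return {
    passed: cases.length - failures.length,
    total: cases.length,
    failures,
  };
}

// Stub agent with made-up behavior: it only knows about London.
const stubAgent: Agent = (input) =>
  input.includes("London")
    ? "It is 18C and cloudy in London."
    : "Which city do you mean?";

const report = runEvals(stubAgent, [
  { input: "Weather in London?", mustInclude: "18C" },
  { input: "Weather in Paris?", mustInclude: "Paris" },
]);
// report.passed → 1, report.total → 2
// report.failures → ["Weather in Paris?"]
```

Run something like this before and after a prompt change and you have a number to compare instead of a feeling.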

Does it hold up? Vercel's own fleet

Vercel didn't ship a demo. It says it runs more than a hundred agents on Eve in production, and it named a few:

Agent	Job	What Vercel claims
d0	Data analyst in Slack	30,000+ questions answered a month
Lead Agent	Autonomous sales rep	~$5k/year cost against a ~$160k return
Athena	RevOps tool	Built by non-engineers in six weeks
Vertex	Support	Resolves 92% of tickets on its own

Two more numbers from the keynote put the scale in context. That jump to over half of commits being agent-written is one. The other is token volume through Vercel's AI Gateway, which the company says went from roughly 2 trillion to 20 trillion a month over the same six months. This is the same shift where coding agents quietly became the default way teams ship, now pointed at Vercel's own internal tools.

These are the vendor's own numbers

A 92% resolution rate and a roughly 32x return (about $160k back on about $5k a year) are Vercel describing Vercel, on a platform Vercel is selling. Read them as direction, not as a benchmark you'll hit. The useful signal isn't the exact percentage. It's that the company built real internal tools on this and is willing to put names and figures on them, which is more than most framework launches do.

The catch

Eve is genuinely nice DX, and it's also a vendor framework that's happiest on its maker's infrastructure. Durable execution, the sandbox, the approvals, the parking model all assume Vercel Functions and Vercel Sandbox underneath. It's open-source and the code is TypeScript you can read, but "open-source" and "easy to run anywhere else" are not the same promise. Plan for that before you build something load-bearing on it.

It's also a public preview, which means the API will move and the docs are ahead of the scars. And the folder convention, as clean as it is, doesn't make the hard parts disappear. Writing good instructions.md, designing tools that fail safely, and building evals that catch real regressions is still the actual work. Eve removes the plumbing. It does not write the agent for you.

What I'd actually do

If you want to understand agents, don't start with Eve. Build the loop yourself first so you know what the folder is hiding. Once you've felt the pain of managing state and retries by hand, Eve's conventions read as relief instead of magic.

If you have a real internal tool in mind, a Slack bot that answers questions over your own data is the obvious first project, and it's close to what d0 does. Try it on Eve, keep the agent logic in plain functions you could lift out later, and watch how much the durable-by-default model saves you. Even if you never adopt the framework, copy the idea: agents are long-running and stop-and-start by nature, so build them on something that checkpoints and resumes instead of something that times out and dies.
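"Plain functions you could lift out later" can be as simple as the split below. This is a sketch with invented names: `describeMetric`, `handleChatEvent`, the `ChatEvent` shape, and the sample numbers are placeholders, not a Slack or Eve API. The core function knows nothing about where the question came from, and the adapter is the only piece you would rewrite if you moved off the framework.

```typescript
// Core logic: a plain function with no framework or chat-platform types.
type MetricLookup = (metric: string) => number | undefined;

function describeMetric(metric: string, lookup: MetricLookup): string {
  const value = lookup(metric);
  return value === undefined
    ? `I have no data for "${metric}".`
    : `${metric}: ${value}`;
}

// Thin adapter: the only code that knows the (hypothetical) event shape.
interface ChatEvent {
  user: string;
  text: string;
}

function handleChatEvent(event: ChatEvent, lookup: MetricLookup): string {
  const metric = event.text.trim().toLowerCase();
  return `@${event.user} ${describeMetric(metric, lookup)}`;
}

// Made-up data standing in for your own warehouse or API.
const metrics = new Map<string, number>([["signups", 42]]);
const lookup: MetricLookup = (metric) => metrics.get(metric);

const reply = handleChatEvent({ user: "ana", text: " Signups " }, lookup);
// reply → "@ana signups: 42"
```

You can unit test `describeMetric` without a chat platform, a framework, or a model in the loop.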

The bet underneath all of this is that agents are about to be ordinary software, written and shipped like any other service. Turning one into a folder you can read in a code review is a real step toward making that true.

#ai #agents #developer-tools #typescript

Written by

Rhythm Bhiwani

Engineer and relentless builder, happiest reverse-engineering hard problems until they click.
