UX & Conversion

AI Agents as Users: Rethinking UX for Machines That Make Decisions

Your next user might not have eyes. And that changes everything about how you design interfaces.

7 min

Think about the last checkout flow you designed. Buttons positioned right, clear visual hierarchy, visual feedback at every step. Now imagine the person using it isn’t a person at all — it’s an autonomous AI agent executing a task on someone’s behalf.

That red button you picked to grab attention? Irrelevant. The loading animation? Invisible. The carefully written CTA microcopy? It might actually get in the way if it’s not semantically clear.

We’re entering a phase where a significant chunk of interactions with digital products will be handled by machines acting on behalf of humans. And this isn’t speculation — it’s already happening with Claude Computer Use, Manus, and dozens of other AI agents running in production.

The problem: interfaces were designed for human eyes

The entire UX discipline was built on an implicit assumption: the end user is a human with visual perception, limited attention, and specific cognitive patterns.

This shaped everything:

  • Visual hierarchy based on size, color, and position
  • Feedback through visual states (hover, loading, success)
  • Error prevention through deliberate friction
  • Affordances that depend on visual recognition

When the user is an AI agent, those assumptions collapse.

An agent can interact with your interface in three distinct ways:

  1. Via API: calls endpoints directly and bypasses the visual interface entirely
  2. Via the DOM or accessibility tree: reads the same structure a screen reader relies on
  3. Via computer vision: “looks” at screenshots and decides where to click

Each method has different design implications. And most products today are unprepared for any of them.

What changes in practice

Semantic clarity beats visual clarity

When a human looks at a form, they understand visually that “Name” is a text field and “Submit” is the final action. An agent needs that information in structure — not in pixels.

This means:

  • Labels explicitly connected programmatically to fields
  • Real semantic HTML (not divs with classes that “look like” buttons)
  • Correct ARIA attributes (not as an accessibility checklist — as a source of information)
  • States and errors communicated via text, not just color

Interface for humans

  • Field highlighted in red = error
  • Large green button = primary action
  • Loading spinner = wait
  • Hover tooltip = explanation

Interface for agents

  • aria-invalid='true' + text error message
  • Native <button> (or role='button') + descriptive accessible name
  • aria-busy='true' + text status
  • Inline description or aria-describedby

Notice: it’s not that the visual version is wrong. It’s that it’s incomplete when the interface consumer doesn’t process pixels.
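
To make the agent-side column concrete, here is a minimal TypeScript sketch of a field whose label and error text are linked in markup rather than implied by layout. The names FieldSpec, escapeHtml, and renderField are hypothetical and exist only for illustration; a real product would use its own templating layer.

```typescript
// Hypothetical field description; none of these names come from a real library.
interface FieldSpec {
  id: string;
  label: string;
  value: string;
  error?: string; // error text readable by humans and agents, if any
}

// Escape text before placing it inside HTML markup or attribute values.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render a text field whose label and error are linked in markup, not just visually.
function renderField(field: FieldSpec): string {
  const id = escapeHtml(field.id);
  const errorId = `${id}-error`;
  const invalid = field.error !== undefined;
  const lines = [
    `<label for="${id}">${escapeHtml(field.label)}</label>`,
    `<input type="text" id="${id}" name="${id}" value="${escapeHtml(field.value)}"` +
      ` aria-invalid="${invalid}"` +
      (invalid ? ` aria-describedby="${errorId}"` : "") +
      `>`,
  ];
  if (invalid) {
    // The error lives in a text element that aria-describedby points to.
    lines.push(`<p id="${errorId}">${escapeHtml(field.error ?? "")}</p>`);
  }
  return lines.join("\n");
}
```

An agent reading this markup can pair the label with the input through for/id and find the error through aria-describedby, without interpreting any color.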

Feedback needs to be programmatic, not just visual

A human sees the button change color and understands the action processed. An agent needs structured confirmation — a status code, a DOM state change, an element that appears or disappears in a detectable way.

In practice, this means:

  • State changes reflected in the DOM, not just CSS
  • Success or error messages as text elements, not just visual toasts
  • URLs that change when state changes (functional deep linking)
  • Consistent, predictable responses to the same actions
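
As a rough sketch of what "structured confirmation" could look like, the TypeScript below returns the outcome of an action as data (status code, message, next URL) and renders it as a text element. The ActionResult shape, the submitQuantity function, and the /checkout/review URL are assumptions made up for this example.

```typescript
// Hypothetical result shape for a checkout action; field names are illustrative.
type ActionResult =
  | { ok: true; status: 200; message: string; nextUrl: string }
  | { ok: false; status: 422; message: string; field: string };

// Decide the outcome from the input alone, so the same input always gives the same result.
function submitQuantity(quantity: number): ActionResult {
  if (!Number.isInteger(quantity) || quantity < 1) {
    return {
      ok: false,
      status: 422,
      message: "Quantity must be a whole number of at least 1.",
      field: "quantity",
    };
  }
  return {
    ok: true,
    status: 200,
    message: `Added ${quantity} item(s) to the order.`,
    nextUrl: `/checkout/review?quantity=${quantity}`, // state is reflected in the URL
  };
}

// Render the outcome as text in the DOM: role="status" for success, role="alert" for errors.
// The messages above contain no markup characters, so no escaping is done in this sketch.
function renderResult(result: ActionResult): string {
  const role = result.ok ? "status" : "alert";
  return `<p role="${role}" data-status="${result.status}">${result.message}</p>`;
}
```

The same result object can feed a visual toast for humans and a detectable text element for agents, so neither audience gets a second-class signal.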

Error prevention can’t depend on “are you sure?”

Confirmation dialogs are a common error prevention technique for humans. For agents, they can be traps — or simply ignored.

If the agent is executing an automated task, it’ll click “yes” on the confirmation dialog without thinking. The dialog prevents nothing.

Error prevention for agents needs to be structural:

  • Destructive actions protected by additional authentication
  • Rate limiting per agent or credential, so a runaway task is stopped early
  • Server-side validation that doesn’t depend on client confirmation
  • Action logs that allow rollback
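
The four points above can be sketched in one small TypeScript class. This is an in-memory illustration under stated assumptions, not a production design: GuardedStore, the step-up token set, and the per-agent call budget are all hypothetical, and a real system would persist the log and verify tokens against an identity provider.

```typescript
// Hypothetical log entry: enough information to restore the previous value.
interface LogEntry {
  action: "set" | "delete";
  key: string;
  previous: string | undefined;
}

class GuardedStore {
  private data = new Map<string, string>();
  private log: LogEntry[] = [];
  private calls = new Map<string, number>();
  private maxCallsPerAgent: number;
  private validTokens: Set<string>;

  constructor(maxCallsPerAgent: number, validTokens: Set<string>) {
    this.maxCallsPerAgent = maxCallsPerAgent;
    this.validTokens = validTokens;
  }

  // Count every call per agent and refuse once the budget is used up.
  private checkRate(agentId: string): void {
    const used = (this.calls.get(agentId) ?? 0) + 1;
    this.calls.set(agentId, used);
    if (used > this.maxCallsPerAgent) throw new Error("rate limit exceeded");
  }

  set(agentId: string, key: string, value: string): void {
    this.checkRate(agentId);
    // Validate on the server side regardless of what the client confirmed.
    if (value.trim() === "") throw new Error("value must not be empty");
    this.log.push({ action: "set", key, previous: this.data.get(key) });
    this.data.set(key, value);
  }

  // Destructive action: requires a step-up token, not a "yes" click.
  delete(agentId: string, key: string, stepUpToken: string): void {
    this.checkRate(agentId);
    if (!this.validTokens.has(stepUpToken)) throw new Error("additional authentication required");
    this.log.push({ action: "delete", key, previous: this.data.get(key) });
    this.data.delete(key);
  }

  // Undo the most recent logged action by restoring the previous value.
  rollback(): boolean {
    const last = this.log.pop();
    if (!last) return false;
    if (last.previous === undefined) this.data.delete(last.key);
    else this.data.set(last.key, last.previous);
    return true;
  }

  get(key: string): string | undefined {
    return this.data.get(key);
  }
}
```

Nothing here depends on the caller pausing to think: the guard holds whether the request comes from a careful human or an agent that clicks through every dialog.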

The three layers of access for agents

The Nielsen Norman Group proposes a useful framework for thinking about user-agnostic interfaces:

  • Well-documented and consistent APIs for direct access
  • Robust accessibility layer for agents that read screens
  • Visual interface that also works for computer vision interpretation

Most products today have (if you’re lucky) the third layer. Some have APIs, but disconnected from the core experience. Almost none think about the accessibility layer as an interface for agents.

Why this matters now

We’re not talking about a distant future. AI agents are already:

  • Scheduling meetings by accessing calendars
  • Making purchases on behalf of users
  • Filling out forms and submitting documents
  • Navigating complex interfaces to extract information

If your product isn’t accessible to agents, you’re building an adoption barrier. And if your competitor solves this first, they own that channel.

More concretely: an agent that can’t complete a checkout on your e-commerce site will complete it on a competitor’s site that has a more interpretable interface.

What to do about this

For product managers

Start including “agent personas” in your discovery process. Just as you consider different human user profiles, consider:

  • Agents accessing via API — what do they need?
  • Agents navigating via screen reader — is your accessibility real?
  • Agents using computer vision — are your elements interpretable?

For designers

  • Prioritize semantics over aesthetics in interactive elements
  • Ensure every piece of visual feedback has a programmatic equivalent
  • Test your interface with screen readers — not as a checklist, as actual use
  • Document the structure of the interface, not just the design

For developers

  • Semantic HTML is not optional — it’s the API of your interface
  • Component states should be reflected in attributes, not just CSS classes
  • Consider alternative endpoints for flows that depend on visual confirmation
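
One way to read "states reflected in attributes, not just CSS classes" is a single mapping from component state to DOM attributes. The TypeScript sketch below assumes a hypothetical ButtonState type and a data-state attribute name; neither comes from a specific framework, and the label is assumed to be trusted plain text.

```typescript
// Hypothetical component states; the names are illustrative, not from any framework.
type ButtonState = "idle" | "loading" | "disabled" | "done";

// Map each state to attributes an agent can read from the DOM.
function stateAttributes(state: ButtonState): Record<string, string> {
  switch (state) {
    case "idle":
      return { "data-state": "idle", "aria-busy": "false", "aria-disabled": "false" };
    case "loading":
      return { "data-state": "loading", "aria-busy": "true", "aria-disabled": "true" };
    case "disabled":
      return { "data-state": "disabled", "aria-busy": "false", "aria-disabled": "true" };
    case "done":
      return { "data-state": "done", "aria-busy": "false", "aria-disabled": "false" };
  }
}

// Serialize a native <button> with the state attributes and a visible text label.
function renderButton(label: string, state: ButtonState): string {
  const attrs = Object.entries(stateAttributes(state))
    .map(([name, value]) => `${name}="${value}"`)
    .join(" ");
  return `<button type="submit" ${attrs}>${label}</button>`;
}
```

CSS can still style the button from these attributes (for example with an attribute selector on data-state), so the visual and programmatic states come from one source and cannot drift apart.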

The paradox of the invisible interface

There’s an irony here: the better you design for agents, the less they need the visual interface. The ideal endpoint for an agent is a clean API. The visual interface becomes a fallback, or isn’t used at all.

This doesn’t mean visual interfaces disappear. Humans will still use your product directly. But it means the value of the visual interface shifts: it serves oversight, not execution.

The human watches the screen to verify what the agent did. Not to do it themselves.

This inverts the traditional hierarchy. The visual interface becomes a transparency layer over an operation happening in another layer.

What this demands from designers and PMs

Designing for agents requires a shift in mindset more than a shift in technique. It’s not about learning new tools — it’s about abandoning the assumption that the user sees what you designed.

  • You need to think about structure before you think about visuals
  • You need to consider programmatic interpretation as a legitimate channel
  • You need to accept that part of your work will be “invisible” in the traditional sense

The designer who understands this now will be ready for a market that’s just waking up to the problem. The one who doesn’t will keep optimizing button colors for a user who won’t even look at them.

Author

Raphael Pereira

Designer & strategist focused on performance-led digital experiences.
