Why Johnny Can't Use Agents

Industry Aspirations vs. User Realities with AI Agent Software

Abstract

There is growing imprecision about what “AI agents” are, what they can do, and how effectively they can be used by their intended users. We pose two key research questions: (i) How does the tech industry conceive of and market “AI agents”? (ii) What challenges do end-users face when attempting to use commercial AI agents for their advertised uses?

We first performed a systematic review of marketed use cases for 102 commercial AI agents, finding that they fall into three umbrella categories: orchestration, creation, and insight. Next, we conducted a usability assessment where N = 31 participants attempted representative tasks for each of these categories on two popular commercial AI agent tools.

Methodology

1. Systematic Review: Analyzed 102 commercial AI agents to build a taxonomy of marketed use cases.

2. Think-Aloud Sessions: 31 participants completed representative tasks using Operator and Manus.

3. Semi-Structured Interviews: Identified usability barriers and design implications.

Overview of research process (figure): RQ1 (how does industry conceive of AI agents?) is addressed through the systematic review of 102 agents, which yields the taxonomy and representative tasks; RQ2 (what challenges do users face?) is addressed through think-aloud sessions with 31 participants and semi-structured interviews, which yield the usability barriers and design implications.

Findings

AI Agent Taxonomy

1. Orchestration (31 agents): Agents that act on behalf of users to manipulate software interfaces. Examples: Salesforce Agentforce, Copilot.

2. Creation (33 agents): Agents that generate structured documents with well-defined formats. Examples: Lovable, Gamma, Beautiful.AI.

3. Insight (87 agents): Agents that distill knowledge into structured takeaways. Examples: Perplexity, Spotify AI DJ.

Key Findings

Generally Successful: Users completed tasks with "Good" to "Excellent" usability ratings.

Agent-Specific Strengths: Manus excelled at slide making (SUS 90.6) and Operator at holiday planning (SUS 88.8).

Critical Usability Barriers Found: Despite this overall success, users faced five significant usability challenges.

System Usability Scale Scores

Holiday Planning (Excellent range): Operator 88.8, Manus 78.0
Stipend Budgeting (Good range): Operator 84.2, Manus 71.2
Slide Making (Mixed results): Operator 69.8, Manus 90.6

Higher scores indicate better usability (0-100 scale)
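
The scores above are on the standard SUS 0-100 scale. For reference, standard SUS scoring (Brooke, 1996) takes ten 1-5 Likert responses, scores odd-numbered (positively worded) items as (response - 1) and even-numbered (negatively worded) items as (5 - response), then multiplies the sum by 2.5. The following is a minimal Python sketch of that computation; the example responses are hypothetical and not taken from the study.

def sus_score(responses):
    """Compute a System Usability Scale score from ten 1-5 Likert responses."""
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("Expected ten Likert responses in the range 1-5.")
    contributions = [
        (r - 1) if i % 2 == 0 else (5 - r)  # 0-based even index = odd-numbered (positive) item
        for i, r in enumerate(responses)
    ]
    return sum(contributions) * 2.5

# Hypothetical participant responses (illustrative only, not study data):
print(sus_score([5, 2, 4, 1, 5, 2, 4, 2, 5, 1]))  # -> 87.5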

Usability Barriers


Misaligned Mental Models

Agent capabilities don't match user expectations.

Example user quote: "You've got to sit there and make sure that your initial prompt is perfect."

Explanation: Users can't predict the agent's actual capabilities, turning delegation into frustrating "prompt gambling".

Presuming Trust

Agents presume trust without first demonstrating competence or security.

Example user quote: "I was waiting for it to ask me for some more information."

Explanation: Agents expect immediate delegation without first establishing credibility through active preference elicitation or demonstrating the competence necessary to handle sensitive tasks.

Inflexible Collaboration

Rigid interaction styles that don't adapt.

Example user quote: "It's kind of like I'm giving you a job, and you're throwing the job back at me."

Explanation: Agents act as "lone wolf" execution tools that fail to adapt to a user's need for hands-on guidance or mid-task oversight.

Communication Overhead

Overwhelming users with excessive output.

Example user quote: "Oh, my God! It threw out so much stuff...it's almost an overwhelming amount of information."

Explanation: This barrier arises from agents generating excessive, poorly formatted output and forcing users to articulate complex, subjective preferences in a cognitively demanding way.

Metacognitive Gaps

Agents lack self-awareness of their limitations.

Example user quote: "It just was kind of circling...it's seeking to provide an answer rather than to say 'I don't know.'"

Explanation: Agents lack the self-awareness to recognize their own errors or limitations, leading them to get stuck in time-wasting "try-fail cycles" that require manual human debugging.

Recommendations

Design implications for building next-generation AI agents

1. Know Your User: Collect preferences, skills, and collaboration styles.

2. Know Yourself: Develop metacognitive abilities to recognize limitations.

3. Be Adaptable: Adapt the interface based on task type and user preferences.

4. Measure Twice, Cut Once: Support user control during planning and execution phases.

5. Show, Don't Tell: Support multiple input modalities beyond text prompts.

6. Next Time's the Charm: Enable precise iteration on outputs with contextual controls.

Citation

@article{shome2025johnny,
  title={Why Johnny Can't Use Agents: Industry Aspirations vs. User Realities with AI Agent Software},
  author={Shome, Pradyumna and Krishnan, Sashreek and Das, Sauvik},
  journal={arXiv preprint arXiv:2509.14528},
  year={2025}
}