ACM Conference on AI and Agentic Systems (CAIS), May 2026

Why Johnny can’t use agents

Industry aspirations versus user realities with AI agent software.

102 commercial agents reviewed (systematic corpus) · 31 participants observed (think-aloud study) · 5 usability barriers identified (thematic analysis)
§ 0 Abstract

An investigation into delegated work.

There is growing imprecision about what “AI agents” are, what they can do, and how effectively they can be used by their intended users. We pose two research questions: (i) how does the tech industry conceive of and market “AI agents,” and (ii) what challenges do end-users face when attempting to use commercial AI agents for their advertised uses?

We performed a systematic review of marketed use cases for 102 commercial AI agents, finding that they fall into three umbrella categories — orchestration, creation, and insight. We then observed N = 31 participants attempting representative tasks for each category on two popular commercial AI agent tools, surfacing five usability barriers and six design recommendations.

§ 1 Methodology

1. A mixed-methods approach.

Our study proceeds in two parts. First, a systematic review of commercial agents yields a taxonomy of marketed use cases. Second, a think-aloud and interview study observes real users attempting representative tasks drawn from that taxonomy.

RQ1 · How does industry conceive of AI agents? → Systematic review (102 agents) → Taxonomy of use cases → Representative tasks
RQ2 · What challenges do users face? → Think-aloud sessions (N = 31) → Semi-structured interviews → Usability barriers → Design implications
Fig. 1 Research process, from research questions to findings. Each question is answered through a complementary chain of methods.
§ 2 Findings

2. What the industry sells, and how well users can use it.

2.1 Taxonomy of commercial AI agents

Across 102 reviewed tools, marketed use cases cluster into three categories. Counts exceed 102 because many agents span more than one category.

2.1.1 Orchestration (31 agents)
Agents that act on behalf of users to manipulate software interfaces across applications and websites.
e.g. Salesforce Agentforce, Microsoft Copilot

2.1.2 Creation (33 agents)
Agents that generate structured documents with well-defined formats (slides, websites, designs).
e.g. Lovable, Gamma, Beautiful.AI

2.1.3 Insight (87 agents)
Agents that distill large amounts of information into structured, user-facing takeaways.
e.g. Perplexity, Spotify AI DJ

Tab. 1 Taxonomy of 102 commercial AI agents by primary marketed use case.

2.2 Usability ratings by task

Participants rated the usability of two tools — OpenAI Operator and Manus — on three representative tasks using the System Usability Scale (SUS, 0–100, higher is better).
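For readers unfamiliar with the SUS, each participant answers ten Likert items (1–5, alternating positive and negative wording), which are combined into a 0–100 score. A minimal sketch of the standard scoring formula; the example responses are illustrative, not data from this study:

```python
def sus_score(responses):
    """Compute a System Usability Scale score (0-100) from ten Likert
    responses (1-5). Odd-numbered items are positively worded and
    contribute (response - 1); even-numbered items are negatively
    worded and contribute (5 - response). The sum is scaled by 2.5."""
    if len(responses) != 10 or any(not 1 <= r <= 5 for r in responses):
        raise ValueError("SUS requires ten responses in the range 1-5")
    total = 0
    for i, r in enumerate(responses, start=1):
        total += (r - 1) if i % 2 == 1 else (5 - r)
    return total * 2.5

# A maximally positive respondent: 5 on positive items, 1 on negative ones.
print(sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1]))  # → 100.0
```

Per-task scores like those in Tab. 2 are then obtained by averaging `sus_score` across participants.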

Task               Operator  Manus  Interpretation
Holiday planning   88.8      78.0   Excellent / Good
Stipend budgeting  84.2      71.2   Good / Good
Slide making       69.8      90.6   Fair / Excellent

Tab. 2 SUS scores (0–100, higher is better): Operator vs. Manus across three task types.
§ 3 Usability barriers

3. Five barriers to agent delegation.

Despite generally positive usability scores, participants reported recurring frustrations. Five themes emerged from the qualitative data; each is illustrated with a representative participant quote.

Barrier 01 · Misaligned mental models

Users cannot predict what an agent is actually able to do.

"You've got to sit there and make sure that your initial prompt is perfect."

Because users cannot anticipate the agent's actual capabilities, delegation turns into frustrating "prompt gambling."
Barrier 02 · Presuming trust

Agents expect delegation before earning user confidence.

"I was waiting for it to ask me for some more information."

Agents expect immediate delegation without first establishing credibility, such as by eliciting preferences or demonstrating competence on sensitive tasks.
Barrier 03 · Inflexible collaboration

Agents do not adapt their interaction style to the user or task.

"It's kind of like I'm giving you a job, and you're throwing the job back at me."

Agents act as "lone wolf" executors that fail to adapt to a user's need for hands-on guidance or mid-task oversight.
Barrier 04 · Communication overhead

Agents overwhelm users with output and demand input that is hard to articulate.

"Oh, my God! It threw out so much stuff... it's almost an overwhelming amount of information."

Agents generate excessive, poorly formatted output and force users to articulate complex, subjective preferences, which is cognitively demanding.
Barrier 05 · Metacognitive gaps

Agents lack self-awareness of their errors and limitations.

"It just was kind of circling... it's seeking to provide an answer rather than to say 'I don't know.'"

Agents lack the self-awareness to recognise their own errors, leading to time-wasting "try-fail cycles" that require manual human debugging.
§ 4 Design recommendations

4. Six recommendations for next-generation agents.

4.1 Know your user
Actively collect preferences, skills, and collaboration styles before executing a task.

4.2 Know yourself
Develop metacognitive abilities to recognise capability limits and stop rather than guess.

4.3 Be adaptable
Adapt the interface and interaction style based on task type and user signals.

4.4 Measure twice, cut once
Surface planning and checkpoints so users can steer before and during execution.

4.5 Show, don't tell
Support multiple input modalities beyond free-form text prompts.

4.6 Next time's the charm
Enable precise iteration on outputs via contextual, direct-manipulation controls.
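To make recommendations 4.2 and 4.4 concrete, here is a hypothetical sketch of an agent loop that surfaces its plan for approval before acting and halts when its confidence drops too low. The `plan`, `confidence`, `execute_step`, and `confirm` callables and the `CONFIDENCE_FLOOR` threshold are illustrative stand-ins, not the API of any tool in the study:

```python
CONFIDENCE_FLOOR = 0.6  # assumed threshold below which the agent defers to the user

def run_task(task, plan, confidence, execute_step, confirm):
    """Plan-first agent loop: checkpoint before execution ("measure twice,
    cut once") and a metacognitive stop when confidence is low ("know
    yourself"), instead of guessing through try-fail cycles."""
    steps = plan(task)                       # draft the full plan up front
    if not confirm("Proposed plan", steps):  # checkpoint: let the user steer first
        return "cancelled by user"
    results = []
    for step in steps:
        c = confidence(step)
        if c < CONFIDENCE_FLOOR:             # say "I don't know" rather than guess
            return f"paused at {step!r}: confidence {c:.2f} too low, asking user"
        results.append(execute_step(step))
    return results

# Illustrative usage: the agent is unsure about the payment step and pauses.
out = run_task(
    "book a flight",
    plan=lambda t: ["search flights", "pick cheapest", "pay"],
    confidence=lambda s: 0.9 if s != "pay" else 0.3,
    execute_step=lambda s: f"done: {s}",
    confirm=lambda title, steps: True,
)
print(out)  # → paused at 'pay': confidence 0.30 too low, asking user
```

The design choice here mirrors Barrier 02 and Barrier 05: trust is earned through a visible plan, and low-confidence steps surface a question instead of a confident wrong answer.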

Citation.

BibTeX
@inproceedings{shome2026johnny,
  title={Why Johnny Can't Use Agents: Industry Aspirations vs. User Realities with AI Agent Software},
  author={Shome, Pradyumna and Krishnan, Sashreek and Das, Sauvik},
  booktitle={Proceedings of the ACM Conference on AI and Agentic Systems (CAIS '26)},
  year={2026}
}