
Computer-Use Agents as Judges for Generative User Interface

1University of Oxford 2Show Lab, National University of Singapore 3Microsoft
Equal Contribution

📖 TL;DR

What does an agent-friendly UI look like? Check out the demo below:
The left UI is designed for 🧑🏻‍💻humans—prioritizing aesthetics.
The right UI is redesigned for 🤖agents—focused on clarity and functionality.
Can Computer-Use Agents offer feedback that helps Coders generate UIs?
AUI Framework
👨‍💻Human Collaboration vs. 🤖️Coder-CUA Collaboration.
Left: Most GUIs are designed by humans and optimized for user experience (e.g. aesthetics), forcing trained agents to adapt to human-oriented behaviors. Right: Our Coder-CUA Collaboration framework leverages Coder as Designer and CUA as Judge together, enabling more reliable task execution and improved usability for agents.
This is AUI, Agent-friendly UI.

🔧 How does it work?

AUI Framework
Overview of the Coder-CUA in Collaboration framework.
The process begins with the Coder as Designer, which initializes and iteratively revises the UI based on queries and feedback. In parallel, the CUA as Judge executes task-driven navigation within the testing environment, generating trajectories and error logs to evaluate task solvability. A verifier ensures functional correctness, while feedback from CUA navigation informs subsequent UI revisions. This collaboration yields a finalized agent-centric UI optimized for both functionality and execution success.
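The revise-and-judge loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `refine_ui`, `NavResult`, `coder`, `verifier`, `cua`, and `max_rounds` are hypothetical, and the toy stand-ins at the bottom replace what would be model-backed components in practice.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set


@dataclass
class NavResult:
    """Hypothetical outcome of one CUA navigation attempt."""
    solved: bool
    error_log: List[str] = field(default_factory=list)


def refine_ui(
    query: str,
    tasks: List[str],
    coder: Callable[[str, Optional[Set[str]], List[str]], Set[str]],
    verifier: Callable[[Set[str], str], bool],
    cua: Callable[[Set[str], str], NavResult],
    max_rounds: int = 3,  # assumed budget; the source does not state one
) -> Set[str]:
    # Coder as Designer: initialize the UI from the query alone.
    ui = coder(query, None, [])
    for _ in range(max_rounds):
        feedback: List[str] = []
        for task in tasks:
            # Verifier: is the task functionally supported by this UI?
            if not verifier(ui, task):
                feedback.append(f"{task}: not functionally supported")
                continue
            # CUA as Judge: try to execute the task and collect error logs.
            result = cua(ui, task)
            if not result.solved:
                feedback.extend(f"{task}: {err}" for err in result.error_log)
        if not feedback:
            break  # every task is supported and solved
        # Feed the collected feedback into the next revision.
        ui = coder(query, ui, feedback)
    return ui


# Toy stand-ins so the sketch runs end to end (a "UI" is just a set of controls).
def toy_coder(query, ui, feedback):
    if ui is None:
        return {"title"}
    # Add every control named at the start of a feedback line.
    return ui | {line.split(":")[0] for line in feedback}


def toy_verifier(ui, task):
    return task in ui


def toy_cua(ui, task):
    return NavResult(solved=True)


final_ui = refine_ui("todo app", ["add_item", "delete_item"],
                     toy_coder, toy_verifier, toy_cua)
print(sorted(final_ui))
```

In this toy run the first round fails verification for both tasks, the coder adds the missing controls, and the second round passes, so the loop stops early.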

📊 New UI Improves Success Rate

We evaluate our framework using Function Completeness Rate (FC) and CUA Success Rate (SR). Our Coder-CUA collaboration improves both metrics over the baseline for all four coders tested. The size of the gain varies by model and metric: GPT-5 gains 13.6 points in FC (67.9 to 81.5), while Gemini-3-Pro gains 11.2 points in SR (35.8 to 47.0).

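Overall Performance

| Coder | Method | Func. Completeness (%) | CUA Success Rate (%) |
|---|---|---|---|
| GPT-5 | Baseline | 67.9 | 24.5 |
| GPT-5 | + Ours | 81.5 | 26.0 |
| Qwen3-Coder-30B | Baseline | 42.1 | 7.3 |
| Qwen3-Coder-30B | + Ours | 60.1 | 19.0 |
| GPT-4o | Baseline | 36.3 | 8.8 |
| GPT-4o | + Ours | 43.1 | 16.1 |
| Gemini-3-Pro | Baseline | 71.7 | 35.8 |
| Gemini-3-Pro | + Ours | 72.5 | 47.0 |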
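To make the per-model gains easier to read, the short script below computes the absolute improvement (in percentage points) of "+ Ours" over "Baseline" for each coder. The numbers are copied from the table above; the names `RESULTS` and `gains` are illustrative and not part of the project.

```python
# coder -> ((baseline FC, baseline SR), (ours FC, ours SR)), in percent,
# copied from the results table above.
RESULTS = {
    "GPT-5": ((67.9, 24.5), (81.5, 26.0)),
    "Qwen3-Coder-30B": ((42.1, 7.3), (60.1, 19.0)),
    "GPT-4o": ((36.3, 8.8), (43.1, 16.1)),
    "Gemini-3-Pro": ((71.7, 35.8), (72.5, 47.0)),
}


def gains(results):
    """Return coder -> (FC gain, SR gain) in percentage points."""
    return {
        coder: (round(ours[0] - base[0], 1), round(ours[1] - base[1], 1))
        for coder, (base, ours) in results.items()
    }


for coder, (fc_gain, sr_gain) in gains(RESULTS).items():
    print(f"{coder}: FC +{fc_gain}, SR +{sr_gain}")
```

Both metrics improve for every coder, though by different amounts: the FC gain ranges from 0.8 to 18.0 points and the SR gain from 1.5 to 11.7 points.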

🎓 Citation

Please kindly cite our paper if you find this project helpful.
@misc{lin2025aui,
title={Computer-Use Agents as Judges for Generative User Interface},
author={Kevin Qinghong Lin and Siyuan Hu and Linjie Li and Zhengyuan Yang and Lijuan Wang and Philip Torr and Mike Zheng Shou},
year={2025},
eprint={2511.15567},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2511.15567},
}