may 2026 · field guide · 11 sections

Generative UI, applied.

For four decades, software was a static artifact — wireframed, anticipated, frozen at runtime. In 2026 that assumption is breaking. Interfaces are starting to be summoned, not built.

based on Google Research on PAGEN · A2UI v0.9 spec · ICSE SEIP vibe-coding review · arXiv on accessibility · NN/g on outcome-oriented design



01 paradigm

From static artifacts to malleable software.

The maturation of frontier LLMs precipitated a structural shift. The interface itself is now a runtime artifact.

The evolution of human-computer interaction has historically been defined by the constraints of predefined systems. From the command-line interfaces of early computing to the polished GUIs of the mobile era, software has operated as a static artifact: architected before deployment, frozen at runtime.

Generative User Interface (GenUI) is an architectural pattern in which part or all of a UI is dynamically generated, selected, assembled, or controlled by an AI agent at runtime, rather than being fully predefined by human developers before deployment.

Human beings should not be forced to adapt to the constraints of technology. Technology must adapt fluidly to the cognitive, physical, and contextual realities of the human user.

Instead of forcing users to navigate an existing catalog of monolithic applications or dig through nested menus, GenUI functions as malleable software that reshapes the digital environment at the exact moment of use. The model doesn't merely output text or markdown — it interprets intent and instantly constructs bespoke interactive tools, visualizations, and contextual simulations.

This represents a shift from designing for features to designing for intent, a fundamentally different epistemology.

02 architecture

Protocols, state, and the plumbing nobody sees.

The transition from demos to scalable software demands a clean separation of concerns. Two protocols have emerged as the foundation.

AG-UI (Agent-User Interaction) is the bi-directional event and state protocol that sits beneath the visual layer. It handles the transition of a tool's state from started to streaming to finished or failed. Crucially, it supports human-in-the-loop workflows through its interrupt model — an agent can pause, request human approval, accept modifications, and safely resume. Vital for financial operations, security actions, compliance reviews.
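The lifecycle above can be sketched as a small state machine. This is an illustration only: the state names started, streaming, finished, and failed come from the text, while the `interrupted` state, the `ToolCall` class, and its method names are hypothetical and are not taken from the AG-UI event schema.

```python
from dataclasses import dataclass, field
from typing import Optional

# Allowed transitions. "interrupted" is a hypothetical state standing in for
# the interrupt model: the agent pauses until a human approves or rejects.
TRANSITIONS = {
    "started": {"streaming", "failed"},
    "streaming": {"interrupted", "finished", "failed"},
    "interrupted": {"streaming", "failed"},
    "finished": set(),
    "failed": set(),
}


@dataclass
class ToolCall:
    name: str
    state: str = "started"
    args: dict = field(default_factory=dict)
    history: list = field(default_factory=list)

    def move(self, new_state: str) -> None:
        """Apply a transition, rejecting anything the table does not allow."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.history.append((self.state, new_state))
        self.state = new_state

    def resume(self, approved: bool, modifications: Optional[dict] = None) -> None:
        """Resume after a human decision; a rejected call ends as failed."""
        if self.state != "interrupted":
            raise ValueError("resume is only valid while interrupted")
        if approved:
            self.args.update(modifications or {})
            self.move("streaming")
        else:
            self.move("failed")


call = ToolCall("wire_transfer", args={"amount": 5000})
call.move("streaming")
call.move("interrupted")  # agent pauses and asks for approval
call.resume(approved=True, modifications={"amount": 4500})
call.move("finished")
print(call.state, call.args)
```

Keeping the transition table explicit means an out-of-order event (for example, a stream chunk arriving after a failure) is rejected rather than silently applied.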

Google's A2UI v0.9 standardizes how UI intent is declared. It uses a secure, declarative JSON format rather than executable code — so local or remote agents communicate with any client across web, mobile, desktop using a common language. Security-first: agents can only request rendering for components that exist within a pre-approved enterprise catalog. No injection of malicious code or off-brand visuals.
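To make the catalog rule concrete, here is a minimal sketch of a client-side check that refuses anything outside an approved catalog. The JSON shape, the `CATALOG` contents, and the `validate` function are assumptions made for illustration; they are not the A2UI v0.9 schema.

```python
import json

# Hypothetical pre-approved catalog: component type -> allowed property names.
CATALOG = {
    "Card": {"title", "children"},
    "Text": {"value"},
    "Button": {"label", "action"},
}


def validate(node, path="root"):
    """Return a list of violations; an empty list means the tree may render."""
    if not isinstance(node, dict):
        return [f"{path}: expected an object"]
    kind = node.get("type")
    if kind not in CATALOG:
        return [f"{path}: component {kind!r} is not in the catalog"]
    errors = []
    extra = set(node) - CATALOG[kind] - {"type"}
    if extra:
        errors.append(f"{path}: unknown properties {sorted(extra)}")
    children = node.get("children", [])
    if not isinstance(children, list):
        return errors + [f"{path}: children must be a list"]
    for i, child in enumerate(children):
        errors.extend(validate(child, f"{path}.children[{i}]"))
    return errors


payload = json.loads("""
{"type": "Card", "title": "Refund request", "children": [
  {"type": "Text", "value": "Order 1182"},
  {"type": "Script", "src": "https://example.invalid/x.js"}
]}
""")
errors = validate(payload)
print(errors)
```

Because the payload is data rather than code, the worst a misbehaving agent can do in this sketch is ask for a component that does not exist, and that request is dropped before rendering.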

The state-management problem is harder than it looks. Traditional tools like Redux were designed for deterministic, linear transitions orchestrated by human interactions. GenUI introduces high-entropy, asynchronous variables. A simple isLoading boolean is entirely insufficient when an AI's response is streaming word by word or component by component. Developers must implement context supply chains — comprehensive, auditable trails of every autonomous state modification, so anomalies can be debugged. Sandboxing limits the blast radius of hallucinated components.
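One way to picture such a trail is an append-only log that records who changed what, and why, so that any earlier state can be rebuilt when debugging. The sketch below is an assumption-laden illustration: `AuditedState`, `replay`, and the entry fields are hypothetical names, and a real system would persist the trail rather than hold it in memory.

```python
import copy
import time


class AuditedState:
    """UI state whose every change is appended to an in-memory trail."""

    def __init__(self, initial: dict):
        self._state = copy.deepcopy(initial)
        self.trail = []

    def apply(self, actor: str, patch: dict, reason: str) -> None:
        """Merge a patch into the state and record the before/after values."""
        before = {key: copy.deepcopy(self._state.get(key)) for key in patch}
        self._state.update(copy.deepcopy(patch))
        self.trail.append({
            "seq": len(self.trail),
            "ts": time.time(),
            "actor": actor,
            "reason": reason,
            "before": before,
            "after": copy.deepcopy(patch),
        })

    def snapshot(self) -> dict:
        return copy.deepcopy(self._state)


def replay(initial: dict, trail: list, upto: int) -> dict:
    """Rebuild the state after the first `upto` trail entries."""
    state = copy.deepcopy(initial)
    for entry in trail[:upto]:
        state.update(copy.deepcopy(entry["after"]))
    return state


initial = {"status": "idle", "components": []}
ui = AuditedState(initial)
ui.apply("agent", {"status": "streaming"}, "tool call started")
ui.apply("agent", {"components": ["Chart"]}, "first component arrived")
ui.apply("user", {"status": "cancelled"}, "user pressed stop")

for entry in ui.trail:
    print(entry["seq"], entry["actor"], entry["reason"])
print(replay(initial, ui.trail, upto=2))
```

Note that `status` is an enumerated string rather than a boolean, which leaves room for the intermediate streaming and cancelled states that `isLoading` cannot express.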

03 patterns

Static, declarative, fully generated.

The implementation of GenUI is not monolithic. Three patterns sit on a spectrum of flexibility and control.

01 · static

Pre-built reactions

AI determines logical state; frontend reacts by rendering hardcoded components tied to that state.

strength: Maximum reliability, accessibility, brand safety. Checkout, healthcare.

limitation: Every component must be anticipated and built in advance.

02 · declarative

Mix & match

AI assembles the interface by selecting from a finite registry of pre-approved building blocks.

strength: Optimal balance of flexibility and safety. Predictable, composable outputs anchored to a design system.

limitation: Requires upfront investment in a component registry.

03 · fully generated

Raw code at runtime

AI outputs raw HTML, CSS, JSX — rendering an entirely new interface from scratch.

strength: Maximum creative freedom. Prototyping, ephemeral exploratory tools.

limitation: XSS, broken markup, inconsistent branding, accessibility violations.

verdict: In production, declarative wins. Free-form code generation is bounded by model unpredictability: interfaces drift, become hard to maintain, and require human intervention before deployment. Component-based assembly lets the AI orchestrate reliable, pre-tested enterprise assets.
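A toy version of the declarative pattern shows why it is easy to reason about: the blocks are built ahead of time, and only their combination is decided at runtime. In this sketch the keyword rules in `RULES` stand in for the model, and `REGISTRY`, `plan`, and `render` are hypothetical names, not any library's API.

```python
# Hypothetical registry: block name -> render function returning plain text.
REGISTRY = {
    "comparison_table": lambda items: "TABLE " + " | ".join(items),
    "line_chart": lambda items: "CHART of " + ", ".join(items),
    "checkout_form": lambda items: "CHECKOUT for " + ", ".join(items),
}

# Stand-in for the model: keyword rules that pick block names from an intent.
RULES = [
    ("compare", "comparison_table"),
    ("trend", "line_chart"),
    ("buy", "checkout_form"),
]


def plan(intent: str) -> list:
    """Choose block names for an intent; only registry names can be returned."""
    text = intent.lower()
    return [block for keyword, block in RULES if keyword in text]


def render(intent: str, items: list) -> list:
    """Render the planned blocks, in rule order, using pre-built components."""
    return [REGISTRY[name](items) for name in plan(intent)]


print(render("Compare these and show the price trend", ["A", "B"]))
```

Swapping the keyword rules for a model call changes how `plan` decides, but not what it is allowed to return, which is the property that keeps the output inside the design system.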

04 pagen

What Google's PAGEN benchmark actually showed.

Human raters preferred AI-generated UIs over markdown 82.8% of the time. Over raw text, 97%. The catch: only at scale.

Google researchers introduced PAGEN, a benchmark dataset of highly polished websites and interfaces crafted by human experts, and evaluated GenUI outputs against it using human raters and Elo scoring. The model under test was Gemini 3 Pro, augmented with three components: server endpoints (real-time web search, image generation), meticulously crafted system instructions, and lightweight post-processors that catch structural errors.

Output modality            Elo (LMArena)   Elo (info-seeking)
Human expert (PAGEN)       1756.0          Generally preferred
Gemini 3 GenUI             1710.7          1739.31
Generative markdown        1459.6          Significantly lower
Top Google search result   1355.1          Significantly lower
Raw generative text        1218.6          Significantly lower
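To read the rating gaps in the table, it helps to convert them into expected head-to-head preference rates. The sketch below uses the standard Elo expectation formula with a 400-point scale; whether the benchmark computed its ratings with exactly this formula is an assumption, so treat the results as a rough reading aid rather than a reproduction of the study.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expectation that A is preferred over B (400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# LMArena-column ratings copied from the table above.
ratings = {
    "Human expert (PAGEN)": 1756.0,
    "Gemini 3 GenUI": 1710.7,
    "Generative markdown": 1459.6,
    "Raw generative text": 1218.6,
}

genui = ratings["Gemini 3 GenUI"]
for name, rating in ratings.items():
    if name != "Gemini 3 GenUI":
        p = expected_score(genui, rating)
        print(f"GenUI vs {name}: {p:.1%}")
```

Under this assumption, the roughly 45-point gap to the human-expert row implies only a modest edge for the expert, while the gaps to markdown and raw text imply a large edge for GenUI, which is directionally consistent with the pairwise preference figures quoted in this section.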

82.8% GenUI preferred over markdown

97.0% GenUI preferred over raw text

44% of cases judged comparable to human expert

60% error rate when Gemini 2.0 Flash-Lite attempted UI code

The 60% number is the load-bearing one. Robust GenUI is an emergent capability tied to model scale and reasoning power. Smaller, cheaper models break the contract. Anyone trying to ship GenUI on distilled or open-weight inference will hit a wall — unless they constrain the output space severely (which collapses you back into the declarative pattern).

A parallel research theme: Gradual Generation. Rather than generating the whole interface from a single prompt, the system presents customizations step-by-step as the UI is being built. Solves the discoverability problem of conversational AI — users facing a blank prompt box rarely know what's possible.

05 vibe coding

Joyful creation, flawed software.

The democratization of UI generation has exposed severe fault lines in QA, reliability, and trust.

Vibe coding means relying on AI tools (Gemini Canvas, Claude Artifacts, Copilot) to build software through natural language, intuition, and trial and error, without deep understanding of the underlying code. The allure is dramatically accelerated prototyping: the slow, one-directional handoff from designer to engineer becomes a rapid AI-mediated co-creation loop. Google's Vibe Coding XR translates prompts into fully interactive, physics-aware WebXR applications in under 60 seconds.

the speed-quality paradox: A systematic grey-literature review at ICSE 2026 analyzed hundreds of firsthand accounts. Practitioners report high motivation from rapid iteration and the psychological experience of instant success and flow, while simultaneously acknowledging that the resulting software is profoundly flawed. Many skip testing entirely, deploy model outputs unmodified, or delegate security and functional checks back to the very AI that generated the code.

Research from ETH Zürich confirms vibe coding is not a panacea for non-technical users. It requires both exceptional written communication and foundational CS understanding to effectively steer the model, evaluate generated logic, and course-correct when the AI strays. Marketing hype suggests anyone can ship a product; reality is more like: the floor rose, but the ceiling moved with it.

06 tools

The commercial stack, by job.

Tool choice depends entirely on the architectural needs of the production environment. Free-form for prototyping; component-based for shipping.

design hub

Figma + AI Assist

Suggests layouts, generates contextual copy, audits inconsistencies across component libraries. Enterprise UX teams needing strict consistency.

prototyping

Framer AI · Claude Design

Translates prompts into interactive prototypes and fluid animations. Product managers validating complex flows.

handoff

TeleportHQ · Sketch2React

Ingests designs and prompts to generate production-ready React components mapped to existing frameworks.

sme tools

Uizard · UiMagic

Hand-drawn sketches or text descriptions into functional UI prototypes for dashboards and business applications.

campaigns

Canva Magic · Midjourney v7

Hyper-realistic image generation, smart auto-layouts, instant background manipulation for marketing workflows.

runtime

CopilotKit · A2UI clients

The plumbing layer — orchestrates declarative GenUI in production apps via AG-UI and A2UI protocols.

07 who needs it

E-commerce eats the first plate.

$262B of online holiday spend in 2025 was influenced by AI agents. Conversion lift, time-on-site, retention — all measurable, all positive.

$262B global online holiday spend influenced by AI agents in 2025

+31% higher conversion for AI-referred traffic

+45% more time on site for AI-referred shoppers

59% faster YoY sales growth for retailers running branded agents (6.2% vs 3.9%)

The pure-play generative-AI-in-e-commerce market is valued at $1.24B in 2026, projected toward $3.94B by the 2030s at an 18.5%+ CAGR. GenUI solves the heavy-catalog problem: instead of forcing a user to navigate taxonomy trees, an agent interprets intent, instantly generates a comparison matrix, dynamically adjusts pricing, and produces targeted descriptions on the fly.
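The market figures above leave the target year open, so a quick compound-growth check shows what horizon they imply. This is plain arithmetic on the quoted numbers, with one assumption: the growth rate is treated as exactly 18.5%, although the text says "18.5%+". The same snippet reproduces the 59% figure from the 6.2% and 3.9% growth rates.

```python
import math

# Figures quoted in the text; the horizon year is not stated, so solve for it.
start_value_b = 1.24   # USD billions, 2026
target_value_b = 3.94  # USD billions
cagr = 0.185           # assumed to be exactly 18.5%

# Compound growth: target = start * (1 + cagr) ** years
years = math.log(target_value_b / start_value_b) / math.log(1 + cagr)
print(f"years to reach target at 18.5%: {years:.1f}")

# Relative growth advantage for retailers running branded agents.
relative_lift = 6.2 / 3.9 - 1
print(f"relative lift: {relative_lift:.0%}")
```

A higher growth rate than 18.5% would shorten the horizon, so the computed value is best read as an upper bound on the number of years.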

Enterprise SaaS is the second wave. Historically, platforms required exporting raw data into external BI tools. In 2026, AI-driven embedded analytics removes that export step. Users query datasets conversationally; the agent identifies patterns and generates a self-service dashboard with line charts, trend models, and alerts inside the workflow.

Enterprise workflows face the structured-data dilemma. Chat is flexible but handles complex business data poorly; GUIs handle data well but are rigid. GenUI bridges the two. Mercedes-Benz uses generative interfaces as smart sales assistants. DocuSign summarizes legal agreements with UI panels highlighting clauses alongside contract text.

Hospitality uses GenUI for ambient personalization — a hotel app showing family-friendly attractions for one guest, fast checkout and expense reporting for another, off the same prompt.

08 roles

The death of the persona.

Wireframes anticipated the statistical average. GenUI generates for the individual, in real-time. The job changes.

Static personas — built on demographics and high-level goals, used to justify rigid wireframes — were always a compression artifact. GenUI removes the need to compress. The designer's job pivots from spatial layouts to policies and constraints that govern the AI.

The UX designer is no longer pushing pixels. They are designing the policies, constraints, and contextual signals that govern the AI.

The new role: Experience Architect. Maps four intent types and designs how the AI infers them from implicit behavioral signals:

informational: Users want deep knowledge; the system generates educational layouts, timelines, explanatory simulations.

navigational: Users want a specific destination; the system generates streamlined, frictionless routing.

commercial: Users want to consider options; the system generates comparison matrices and feature highlights.

transactional: Users want to purchase; the system strips distractions and generates a simplified, secure checkout.

The experience architect must define how the system fails gracefully when intent is misread — ensuring users are never left in an unrecoverable digital state. For frontend developers, boilerplate decreases but systems-architecture burden increases: securing endpoints, managing state synchronization, ensuring component libraries are robust and orchestrable dynamically.
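A policy of this kind can be sketched as a lookup with an explicit fallback. The four intent labels come from the list above; the layout names, the `FALLBACK` layout, the `MIN_CONFIDENCE` threshold, and the idea that an upstream classifier supplies per-intent scores are all assumptions made for illustration.

```python
# Layout policies for the four intent types (labels from the text,
# layout names hypothetical).
POLICIES = {
    "informational": ["timeline", "explainer"],
    "navigational": ["direct_link"],
    "commercial": ["comparison_matrix", "feature_highlights"],
    "transactional": ["checkout"],
}

# Hypothetical safe default when intent is unclear: the user can always
# search or answer a clarifying question, so the state stays recoverable.
FALLBACK = ["search_box", "clarifying_question"]
MIN_CONFIDENCE = 0.6  # assumed threshold, to be tuned per product


def choose_layout(scores: dict) -> list:
    """Pick the layout for the top-scoring intent, or fall back when unsure."""
    if not scores:
        return FALLBACK
    intent, confidence = max(scores.items(), key=lambda kv: kv[1])
    if intent not in POLICIES or confidence < MIN_CONFIDENCE:
        return FALLBACK
    return POLICIES[intent]


print(choose_layout({"commercial": 0.82, "transactional": 0.11}))
print(choose_layout({"commercial": 0.41, "informational": 0.39}))
```

The second call is the interesting one: two intents score almost equally, so the sketch declines to guess and asks instead.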

09 bottlenecks

Latency. Hallucination. The physical limit.

Full-page HTML regeneration can consume 220,000 tokens per session, requiring one to five minutes. Sub-second UX is non-negotiable. Something has to give.

Empirical studies through 2025 show full-page HTML regeneration consuming upwards of 220,000 tokens per session, requiring one to five minutes to render. In an era of sub-second expectations, minute-scale latency is a catastrophic regression. Edge devices feel it worst: interactive implementations (e.g., real-time object detection every 100 ms to generate audio-framing interfaces on a smartphone) deplete batteries and overwhelm thermal headroom.

Enterprise architects mitigate with aggressive caching, smaller domain-specific models (which raise error rates, as the Flash-Lite finding showed), or restricting full generation to initial onboarding only.
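As a rough illustration of the caching option, the sketch below stores generated UI specs under a key derived from the normalized intent and the user context, so that repeated requests skip generation entirely. `SpecCache`, `slow_generate`, and the spec shape are hypothetical; a production cache would also need expiry and invalidation, which are omitted here.

```python
import hashlib
import json


class SpecCache:
    """Cache generated UI specs by normalized intent plus context."""

    def __init__(self, generate):
        self._generate = generate  # expensive call, e.g. a model request
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def key(intent: str, context: dict) -> str:
        """Collapse case and whitespace so trivially different prompts match."""
        normalized = " ".join(intent.lower().split())
        raw = json.dumps({"intent": normalized, "context": context}, sort_keys=True)
        return hashlib.sha256(raw.encode("utf-8")).hexdigest()

    def get(self, intent: str, context: dict) -> dict:
        k = self.key(intent, context)
        if k in self._store:
            self.hits += 1
        else:
            self.misses += 1
            self._store[k] = self._generate(intent, context)
        return self._store[k]


def slow_generate(intent, context):
    # Placeholder for the real generator; returns a trivial spec.
    return {"layout": "dashboard", "query": " ".join(intent.lower().split())}


cache = SpecCache(slow_generate)
cache.get("Show  Q3 revenue", {"role": "analyst"})
cache.get("show q3 revenue", {"role": "analyst"})  # same key after normalizing
cache.get("show q3 revenue", {"role": "viewer"})   # different context, new entry
print(cache.hits, cache.misses)
```

Including the context in the key matters: two users with different roles should not be served each other's generated interface.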

The second wall is hallucination. In a chatbot, a hallucination is a wrong paragraph. In GenUI, it's a corrupted React component, an unhandled exception that crashes the app, or sensitive data exposed in the rendered output. LLMs are statistically optimized to produce the most likely token sequence — not to evaluate confidence — so they default to guessing rather than acknowledging uncertainty.

01 · Context injection & RAG

Restrict the AI to approved enterprise data sources and design tokens. This reduces its propensity to invent unsupported structures.

02 · Prompt engineering constraints

One-shot and few-shot examples steer the model toward a fixed output format. Mandate valid JSON and constrain the shape of the response.

03 · Human-in-the-loop validation

Closed feedback loops where domain experts evaluate generated responses against predefined rubrics. Tagging informs continuous fine-tuning.

04 · Confidence calibration

Automated uncertainty estimation and reasoning-consistency checks intercept and block low-confidence code before it renders.
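The format and confidence checks in this list can be combined into a single gate that runs before anything reaches the renderer. The sketch below is a minimal illustration: the required keys, the component names, the threshold, and the presence of a `confidence` field in the model output are all assumptions. In practice the confidence value would more plausibly come from a separate estimator than from the model's own report.

```python
import json

REQUIRED_KEYS = {"component", "props", "confidence"}
ALLOWED_COMPONENTS = {"Table", "Chart", "Form"}  # hypothetical registry names
MIN_CONFIDENCE = 0.7                              # assumed threshold


def gate(raw_output: str):
    """Return (spec, None) if the output may render, else (None, reason)."""
    try:
        spec = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        return None, f"invalid JSON: {exc.msg}"
    if not isinstance(spec, dict) or not REQUIRED_KEYS <= set(spec):
        return None, "missing required keys"
    name = spec["component"]
    if not isinstance(name, str) or name not in ALLOWED_COMPONENTS:
        return None, f"unknown component {name!r}"
    confidence = spec["confidence"]
    if not isinstance(confidence, (int, float)) or confidence < MIN_CONFIDENCE:
        return None, "confidence below threshold"
    return spec, None


samples = [
    '{"component": "Chart", "props": {"series": "revenue"}, "confidence": 0.91}',
    '{"component": "Chart", "props": {}, "confidence": 0.35}',
    '{"component": "Iframe", "props": {}, "confidence": 0.99}',
    '{"component": "Chart", "props": ',
]
for sample in samples:
    spec, reason = gate(sample)
    print("render" if spec else f"blocked: {reason}")
```

Returning a reason alongside the rejection gives the human reviewers in step 03 something concrete to tag.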

10 accessibility

Revolutionary solution. New barrier.

GenUI can dynamically heal accessibility failures in user-generated content. It can also destroy the spatial consistency two decades of web users rely on.

Traditional accessibility relies on static standards (WCAG) and developer compliance. That compliance model breaks down on platforms dominated by unpredictable user-generated content, such as C2C marketplaces where product photos arrive blurry, descriptions lack structural formatting, and page hierarchy shifts erratically from listing to listing.

GenUI bridges these gaps through Natively Adaptive Interfaces (NAI). Adaptability is baked into the runtime, not bolted on as a compliance afterthought. For a blind user, the system restructures the page into a clean semantic format optimized for screen readers. For an older adult selling an item, it replaces a complex web form with a conversational guidance chatbot. For a visually impaired seller, it generates an audio-guided photo-framing interface using real-time object detection.

A constantly mutating, dynamically generated UI destroys the spatial consistency two decades of web users have internalized as mental models.

But radical adaptability introduces severe usability risks. Predictability is a core UX tenet; users build mental models on spatial consistency — search top right, logo top left. A constantly mutating UI destroys this and heavily increases cognitive load. When models are given freedom to generate raw markup, they frequently omit ARIA labels, alt text, and logical heading structures — actively violating the very standards they're meant to enhance.

The implication: designers shift from drawing static layouts to writing rigid fail-safe policies that legally and functionally mandate accessible metadata in every dynamically generated element.
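A fail-safe policy of this kind can start as a simple pre-render check. The sketch below uses Python's standard `html.parser` to flag three of the omissions named above: images with no `alt` attribute, skipped heading levels, and inputs with nothing to attach a label to. The `A11yLint` class and its rules are illustrative assumptions and cover only a small slice of what WCAG requires.

```python
from html.parser import HTMLParser


class A11yLint(HTMLParser):
    """Flag a few common omissions in generated markup. Not a WCAG audit."""

    def __init__(self):
        super().__init__()
        self.problems = []
        self._last_heading = 0

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        # An empty alt="" is allowed here: it marks a decorative image.
        if tag == "img" and "alt" not in attr:
            self.problems.append("img without alt attribute")
        if tag == "input" and not (
            attr.get("aria-label") or attr.get("aria-labelledby") or attr.get("id")
        ):
            self.problems.append("input with no label hook")
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            level = int(tag[1])
            if self._last_heading and level > self._last_heading + 1:
                self.problems.append(
                    f"heading level jumps from h{self._last_heading} to h{level}"
                )
            self._last_heading = level


def lint(markup: str) -> list:
    parser = A11yLint()
    parser.feed(markup)
    parser.close()
    return parser.problems


generated = """
<h1>Listing</h1>
<h3>Photos</h3>
<img src="bike.jpg">
<img src="divider.png" alt="">
<input type="text" name="price">
"""
problems = lint(generated)
print(problems)
```

A renderer could refuse to display any fragment for which `lint` returns a non-empty list, which turns the policy from a guideline into a gate.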

11 outlook

Structured creativity, not free-form code.

The future doesn't belong to unleashed models writing HTML at runtime. It belongs to disciplined orchestration of secure, pre-tested component registries.

The ascendance of GenUI represents a fundamental reimagining of human-computer interaction. By shifting from forcing users to navigate static features to deploying AI models that dynamically architect bespoke interfaces based on contextual intent, GenUI transforms software from a rigid tool into a collaborative, malleable environment.

The PAGEN data points in one direction: raters preferred rich, generated interfaces over markdown and raw text. The commercial implications are large. Early adopters in e-commerce, SaaS, and hospitality report conversion lifts, faster workflows, and strong ROI.

But rapid democratization through vibe coding has exposed fault lines in QA, reliability, and state management. Compute overhead introduces latency that can cripple UX; hallucinations demand multi-tiered mitigation; GenUI's accessibility power threatens the spatial consistency on which traditional usability relies.

The long-term deployment of GenUI will not rely on unleashing unbounded models to write free-form code. It demands a disciplined architectural approach based on declarative protocols and Structured Creativity: AI orchestrating interfaces exclusively from secure, rigorously tested, accessibility-compliant enterprise component registries. The goal is the fluidity of generative systems combined with the unyielding reliability required of modern digital infrastructure.