Architecture
How the DisplayXR stack works, from OpenXR API layer through native compositors to vendor display processors.
Stack Overview
DisplayXR sits between the OpenXR API and vendor-specific display hardware. Applications write to the standard OpenXR interface. DisplayXR handles session management, compositor orchestration, and extension dispatch. Vendor-specific processing — weaving, interlacing, calibration — happens below, in the display processor layer.
Each graphics API gets its own native compositor. No cross-API interop or Vulkan intermediary is required.
Native Compositor Model
Most XR runtimes use a single graphics API internally and translate submitted textures as needed. DisplayXR takes a different approach: each supported graphics API has its own dedicated compositor implementation.
Shipping Components
A DisplayXR install delivers four cooperating pieces. Most applications only interact with the first; the others come into play when apps are sandboxed, when the shell is running, or when the web is the target surface.
Workspace Controllers
The runtime is useful on its own. A bare install gives you a standards-compliant OpenXR + WebXR surface for a 3D display — full-screen apps, simulation backend, native compositors, the WebXR bridge. No spatial desktop, no windowing, no launcher.
A workspace controller is an optional process that adds spatial-desktop features on top: multi-app composition, window chrome, layout presets, an app launcher. The runtime exposes two extensions for this: XR_EXT_spatial_workspace for window pose / hit-test / capture, and XR_EXT_app_launcher for tile registration and lifecycle. Anything that speaks them is a first-class controller.
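As a rough illustration, a controller might confirm at startup that the runtime advertises both extensions before it tries to use them. The two extension names come from the text above; the helper names are hypothetical, and in a real program the list of available names would come from OpenXR's xrEnumerateInstanceExtensionProperties rather than a hard-coded list.

```python
# Extension names are taken from the text; everything else is illustrative.
REQUIRED_EXTENSIONS = ("XR_EXT_spatial_workspace", "XR_EXT_app_launcher")


def missing_workspace_extensions(available):
    """Return the required workspace extensions that are not in `available`."""
    available = set(available)
    return [name for name in REQUIRED_EXTENSIONS if name not in available]


def can_host_workspace_controller(available):
    """True only when both workspace extensions are advertised."""
    return not missing_workspace_extensions(available)
```

A controller that finds a missing extension can report which one is absent instead of failing later in session setup.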
Reference: DisplayXR Shell
Distributed separately as a polished, opinionated spatial desktop — 3D window manager, 2D capture, focus-adaptive 2D/3D mode, layout presets, launcher, MCP control. Optional.
Build your own
OEM-branded workspace, vertical cockpit (CAD, medical, automotive HMI), kiosk, or AI-agent driver. Implement the two extensions, register your binary, and the runtime treats it the same as the reference shell.
Activation is gated by orchestrator-PID match: the runtime trusts the binary it was configured to spawn, not a brand string. OEMs point a single service.json field at whichever controller the SKU should run.
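The idea of trusting a spawned process rather than a name can be sketched as follows. This is a minimal model under stated assumptions, not the runtime's actual check: the SpawnedController record, the is_trusted_controller function, and the example path are all hypothetical, and the sketch assumes the match is made against the PID recorded when the configured binary was spawned.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SpawnedController:
    binary_path: str  # path taken from configuration (hypothetical field)
    pid: int          # PID recorded when the binary was spawned


def is_trusted_controller(spawned: Optional[SpawnedController],
                          client_pid: int,
                          claimed_name: str = "") -> bool:
    """Grant controller privileges only to the process that was spawned.

    `claimed_name` is deliberately ignored: a brand string proves nothing.
    """
    return spawned is not None and client_pid == spawned.pid


# Hypothetical example: the configured controller was spawned as PID 4242.
spawned = SpawnedController("/opt/oem/workspace", 4242)
trusted = is_trusted_controller(spawned, 4242, "any name")
impostor = is_trusted_controller(spawned, 4243, "DisplayXR Shell")
```

Because the check never consults the claimed name, an OEM controller and the reference shell go through exactly the same gate.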
Spec details: workspace-controller-registration.md.
AI Control Surface
DisplayXR exposes live spatial state and control to AI agents over the Model Context Protocol. The framework is a separate, embeddable library at displayxr-mcp — JSON-RPC 2.0 over a unix-socket / Windows-named-pipe transport, with a stdio bridge for any MCP client (Claude Code, voice CLI, custom agent).
Runtime tools (Phase A)
Per-PID server inside each app's runtime DLL. Introspection: list_sessions, get_display_info, get_kooima_params, capture_frame, tail_log.
Workspace tools (Phase B)
Per-workspace-controller server. Window control: list_windows, get/set_window_pose, set_focus, save/load_workspace. Lives in the controller, not the runtime — third-party controllers ship their own.
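For orientation, a tool invocation can be sketched as a plain JSON-RPC 2.0 message. The envelope fields (jsonrpc, id, method, params) are standard JSON-RPC 2.0, and tools/call with name and arguments is the method MCP defines for invoking a tool. The helper names are hypothetical, and how messages are framed on the socket or named pipe is not covered here, so the sketch stops at building and parsing the JSON text.

```python
import json


def make_tool_call(request_id, tool_name, arguments=None):
    """Build a JSON-RPC 2.0 request that invokes an MCP tool by name."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments or {}},
    }


def encode(message):
    """Serialize a message to compact JSON text."""
    return json.dumps(message, separators=(",", ":"))


def result_or_raise(response_text, request_id):
    """Return the `result` of a matching response, or raise on an error reply."""
    response = json.loads(response_text)
    if response.get("jsonrpc") != "2.0" or response.get("id") != request_id:
        raise ValueError("not a JSON-RPC 2.0 response to this request")
    if "error" in response:
        raise RuntimeError(response["error"].get("message", "unknown error"))
    return response["result"]


request = make_tool_call(1, "list_sessions")
wire_text = encode(request)
```

The shape of each tool's result is defined by the server, so the sketch returns it unmodified.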
The library has no runtime or shell coupling — any C project can consume it via CMake FetchContent and register its own tools. Every tool call is audit-logged and gated by a per-client allowlist.
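The combination of a per-client allowlist and an audit log can be modeled in a few lines. This is a sketch of the pattern only, not the library's API: the ToolGate class, its fields, and the client id are hypothetical. It records every attempt, allowed or not, before deciding whether to run the tool.

```python
import time


class ToolGate:
    """Hypothetical gate: check the caller's allowlist, log, then dispatch."""

    def __init__(self, allowlist):
        # Maps a client id to the set of tool names that client may call.
        self._allowlist = {client: set(tools) for client, tools in allowlist.items()}
        self.audit_log = []

    def call(self, client_id, tool_name, handler, **arguments):
        allowed = tool_name in self._allowlist.get(client_id, set())
        # Denied attempts are logged too, so the log shows what was tried.
        self.audit_log.append({
            "time": time.time(),
            "client": client_id,
            "tool": tool_name,
            "allowed": allowed,
        })
        if not allowed:
            raise PermissionError(f"{client_id} may not call {tool_name}")
        return handler(**arguments)


gate = ToolGate({"agent": ["list_sessions"]})
sessions = gate.call("agent", "list_sessions", lambda: ["session-1"])
```

Logging before the permission check means a rejected call still leaves a trace.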
End users opt in by installing DisplayXR MCP Tools (releases), an optional third installer alongside the runtime and the shell. It writes HKLM\Software\DisplayXR\Capabilities\MCP\Enabled, a registry capability flag that the runtime and the shell read at startup to decide whether to spawn their MCP server thread. The DISPLAYXR_MCP=1 environment variable still works as a process-local override (CI / dev / quick-disable).
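One plausible reading of how the two switches combine is sketched below. The function name is hypothetical, and the precedence is an assumption: the sketch treats the environment variable as winning whenever it is set, with "1" enabling and any other value disabling. The text documents only DISPLAYXR_MCP=1 and mentions quick-disable without giving the disabling value. Reading the registry flag itself is left out; the caller passes it in as a boolean.

```python
def mcp_enabled(env, registry_flag):
    """Decide whether to start the MCP server thread.

    env: mapping of environment variables (for example os.environ).
    registry_flag: value of the installer-written capability flag, already read.
    Assumption: a set DISPLAYXR_MCP overrides the registry in both directions.
    """
    override = env.get("DISPLAYXR_MCP")
    if override is not None:
        return override == "1"
    return bool(registry_flag)
```

Passing the environment and the flag as arguments keeps the decision easy to test on machines with no registry.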
Two Compositor Paths
The runtime chooses at session creation whether to composite inside the application process or delegate to the service. This decision is transparent to the application — it just uses OpenXR as normal.
In-process (native)
The app, compositor, and display processor all live in one process, on the app's own GPU device. Zero IPC overhead. Used by most native applications running outside a sandbox.
IPC (service)
The app connects to the service over a named pipe. Swapchain textures are shared cross-process via OS primitives. The service composites all connected clients into a single output. Used by sandboxed browsers (Chrome WebXR) and apps launched by the shell.
The runtime picks IPC automatically when it detects a sandboxed process (Chrome AppContainer, UWP) or a shell-managed session; otherwise it composites in-process.
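The selection rule stated above reduces to a small decision function. The enum and function names are hypothetical, and how the runtime actually detects a sandbox or a shell-managed session is outside the sketch; both conditions arrive as already-computed booleans.

```python
from enum import Enum


class CompositorPath(Enum):
    IN_PROCESS = "in-process"
    IPC = "ipc"


def choose_compositor_path(is_sandboxed, is_shell_managed):
    """IPC for sandboxed or shell-managed sessions, in-process otherwise."""
    if is_sandboxed or is_shell_managed:
        return CompositorPath.IPC
    return CompositorPath.IN_PROCESS
```

A sandboxed browser tab and a shell-launched native app both land on the IPC path; a standalone native app stays in-process.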
Per-Graphics-API Design
Giving each graphics API its own compositor means no texture copies between APIs, no translation overhead, and no dependency on a single "blessed" graphics backend. The compositor that runs is the one that matches the application's chosen API.
The runtime selects the correct compositor at session creation time based on the graphics binding the application provides. This is transparent to the application — it simply uses OpenXR as normal.
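In OpenXR, an application passes its graphics binding by chaining a structure onto XrSessionCreateInfo, and the structure's type identifies the API. Dispatch on that type can be sketched as a lookup table. The structure type names below are real OpenXR enumerants, but which APIs DisplayXR actually supports is not stated here, so the table contents and the compositor labels are assumptions for illustration.

```python
# Keys are real OpenXR structure type names; the values are illustrative labels.
COMPOSITOR_FOR_BINDING = {
    "XR_TYPE_GRAPHICS_BINDING_D3D11_KHR": "d3d11",
    "XR_TYPE_GRAPHICS_BINDING_D3D12_KHR": "d3d12",
    "XR_TYPE_GRAPHICS_BINDING_VULKAN_KHR": "vulkan",
    "XR_TYPE_GRAPHICS_BINDING_OPENGL_WIN32_KHR": "opengl",
}


def select_compositor(binding_type):
    """Pick the compositor matching the app's binding; never translate APIs."""
    try:
        return COMPOSITOR_FOR_BINDING[binding_type]
    except KeyError:
        raise ValueError(f"no native compositor for {binding_type}") from None
```

An unsupported binding is rejected outright rather than routed through a different API's compositor, which is what keeps the design free of cross-API copies.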
Separation of Concerns
DisplayXR draws a clean boundary between two responsibilities:
App-Facing Portability
Standard OpenXR API, session management, swapchain handling, extension dispatch. Applications write once against this interface.
Vendor-Specific Processing
Weaving, interlacing, calibration, eye tracking integration. This lives in the display processor layer and is owned by the hardware vendor.
Simulation Driver
DisplayXR includes a simulation display processor (sim_display) that allows development and testing without physical 3D display hardware. It provides the same interface as a hardware-backed display processor but renders to a standard window.
This means developers can build, test, and iterate on spatial display applications using any standard monitor. The simulation path supports all graphics APIs and all application classes.
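The value of a shared interface can be shown with a toy model. Everything here is hypothetical: the DisplayProcessor interface, the side-by-side layout used for the simulation output, and the column interleave standing in for vendor weaving. Real weaving depends on per-display calibration, and the text does not say how sim_display lays out its window. Views are modeled as lists of pixel rows.

```python
from abc import ABC, abstractmethod


class DisplayProcessor(ABC):
    """Hypothetical common interface: two eye views in, one output image out."""

    @abstractmethod
    def process(self, left, right):
        ...


class SimDisplay(DisplayProcessor):
    """Stand-in for a simulation path: views placed side by side."""

    def process(self, left, right):
        return [l_row + r_row for l_row, r_row in zip(left, right)]


class ColumnInterleaveDisplay(DisplayProcessor):
    """Toy stand-in for vendor weaving: even columns left, odd columns right."""

    def process(self, left, right):
        return [
            [l if x % 2 == 0 else r for x, (l, r) in enumerate(zip(l_row, r_row))]
            for l_row, r_row in zip(left, right)
        ]


def present(processor, left, right):
    """Compositor-side code is the same whichever processor is plugged in."""
    return processor.process(left, right)


left_view = [["L0", "L1"]]
right_view = [["R0", "R1"]]
sim_output = present(SimDisplay(), left_view, right_view)
woven_output = present(ColumnInterleaveDisplay(), left_view, right_view)
```

Because present never inspects the processor type, an application tested against the simulation path exercises the same code that later drives hardware.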
For a deeper look at the in-process vs service compositor split, see in-process-vs-service.md in the runtime repo.
Explore the full runtime source code on GitHub.