Architecture

How the DisplayXR stack works, from the OpenXR API layer through the native compositors to the vendor display processors.

Stack Overview

DisplayXR sits between the OpenXR API and vendor-specific display hardware. Applications write to the standard OpenXR interface. DisplayXR handles session management, compositor orchestration, and extension dispatch. Vendor-specific processing — weaving, interlacing, calibration — happens below, in the display processor layer. The same runtime ships on Windows, macOS, and Android.

The DisplayXR stack: an app on any engine or graphics API talks to the DisplayXR runtime (OpenXR state tracker with native D3D11, D3D12, Vulkan, Metal, and OpenGL compositors), which crosses the neutral xrt_plugin ABI into the vendor display processor and out to the 3D display.

Each graphics API gets its own native compositor. No cross-API interop or Vulkan intermediary is required.

Native Compositor Model

Most XR runtimes use a single graphics API internally and translate submitted textures as needed. DisplayXR takes a different approach: each supported graphics API has its own dedicated compositor implementation.

D3D11: Full compositor with window binding. On Leia hardware the weaver runs in the Leia SR plug-in, not the compositor.
D3D12: Native compositor with window binding. Command queue managed per session.
Vulkan: Native compositor with swapchain management. MoltenVK path available on macOS; the Android compositor is Vulkan-native.
Metal: Native compositor with sim_display weaver and window binding. macOS primary path.
OpenGL: Native compositor supporting both Windows and macOS contexts.

Platforms

DisplayXR runs on three platforms from one codebase. On Windows it drives LeiaSR displays through the D3D11/D3D12/Vulkan/OpenGL compositors; on macOS it ships the Metal, OpenGL, and MoltenVK paths against the simulation backend.

On Android, the same OpenXR runtime drives integrated 3D tablets and handhelds — such as ZTE's Nubia Pad 2 and Red Magic Explorer 3D — through the native Vulkan compositor. The vendor display processor runs out-of-process as a service, so the runtime stays vendor-neutral and apps connect over IPC with a zero-copy buffer handoff. Rendering is orientation-aware — portrait and landscape share one worst-case atlas, with no stall on live rotation — and the same display-zone and see-through transparency model used on the desktop lets a weaved 3D object sit beside a flat 2D HUD or float over the live screen.
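The "one worst-case atlas" idea can be illustrated with a little arithmetic. This is a minimal sketch, not the runtime's actual allocation code: it assumes views are tiled in a fixed grid and that the atlas is sized once to fit both orientations, so a rotation never forces a reallocation. The names Size, atlas_size, and worst_case_atlas, and the 2560x1600 panel, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Size:
    width: int
    height: int


def atlas_size(panel: Size, columns: int, rows: int) -> Size:
    """Atlas needed to tile columns x rows full-size views for one orientation."""
    return Size(panel.width * columns, panel.height * rows)


def worst_case_atlas(panel: Size, columns: int, rows: int) -> Size:
    """Smallest atlas that fits both the given orientation and its rotation."""
    rotated = Size(panel.height, panel.width)
    a = atlas_size(panel, columns, rows)
    b = atlas_size(rotated, columns, rows)
    return Size(max(a.width, b.width), max(a.height, b.height))


# Hypothetical tablet panel with two views tiled side by side.
landscape = Size(2560, 1600)
atlas = worst_case_atlas(landscape, columns=2, rows=1)
print(atlas)
```

Because the result is the same whichever orientation the device starts in, the atlas can be allocated once at session start under these assumptions.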

One Model for 2D and 3D Regions

A spatial-display window is rarely all-3D. A weaved 3D object often needs to sit beside a flat 2D HUD, a toolbar, or live screen content. DisplayXR expresses every such layout through a single mechanism — XR_EXT_display_zones: an app declares any number of 3D zones (each a rectangle with its own view rig and swapchain) alongside any number of 2D zones, plus a per-pixel wish mask that tells the panel which areas should be physically 3D and which should stay flat. There is now exactly one way to say "this region is 3D, that region is 2D."

This replaces an earlier, narrower mechanism. The original 2D-surround / output-rect path could only express a single 3D rectangle surrounded by one monolithic 2D fill — a strict special case of the zone model, where the output rect is just one 3D zone and the surround one 2D zone. It was retired in runtime v1.25.0, folding both 2D and 3D region expression onto the same compositing path and removing the redundant per-API fill code.

Crucially, how an app owns its output surface is orthogonal to how it expresses regions. Whether the app draws into its own window, shares a texture with the runtime, or lets the runtime host a window for it, zones are how all of them carve up 2D and 3D. A plain full-window app is simply the degenerate one-zone case — it needs no explicit zones at all.
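The zone model described above can be sketched as a small data structure. This is an illustration only: the real XR_EXT_display_zones structures are not shown on this page, so Zone, wish_mask, and the rule that later zones overwrite earlier ones are assumptions, as is deriving the wish mask directly from the zone rectangles.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Zone:
    kind: str  # "3d" or "2d" (hypothetical encoding)
    x: int
    y: int
    width: int
    height: int


def wish_mask(zones: List[Zone], win_w: int, win_h: int) -> List[List[int]]:
    """Per-pixel mask: 1 where the panel should be 3D, 0 where it stays flat."""
    if not zones:
        # Assumed default: a plain full-window app is one implicit 3D zone.
        zones = [Zone("3d", 0, 0, win_w, win_h)]
    mask = [[0] * win_w for _ in range(win_h)]
    for z in zones:  # later zones overwrite earlier ones (assumption)
        value = 1 if z.kind == "3d" else 0
        for row in range(max(z.y, 0), min(z.y + z.height, win_h)):
            for col in range(max(z.x, 0), min(z.x + z.width, win_w)):
                mask[row][col] = value
    return mask


# The retired output-rect path as a special case: one 2D surround, one 3D rect.
legacy = wish_mask([Zone("2d", 0, 0, 8, 4), Zone("3d", 2, 1, 4, 2)], 8, 4)

# A plain full-window app declares no zones at all.
full_window = wish_mask([], 8, 4)
```

In this sketch the legacy layout and the full-window case go through the same function as any multi-zone layout, which is the point of having a single mechanism.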

Shipping Components

A DisplayXR install delivers four cooperating pieces. Most applications only interact with the first; the others come into play when apps are sandboxed, when the shell is running, or when the web is the target surface.

Runtime: OpenXR API implementation. Loaded in-process by every OpenXR application.
Service: IPC server and multi-compositor. Hosts the display for sandboxed apps and multi-app shell sessions. Starts at login and sits in the system tray.
Shell: Reference workspace controller. Arranges 3D and 2D apps in a shared 3D scene with window chrome, layout presets, and an app launcher. Distributed separately from the runtime; entirely optional. See Workspace Controllers below.
WebXR Bridge: Chrome extension plus a local bridge that gives WebXR pages the full DisplayXR surface (display geometry, rendering mode switching, tracked eye poses, HUD overlay, and input forwarding) on top of Chrome's native WebXR session. Ships with a three.js reference sample and a standalone minimal starter, and falls back gracefully to standard WebXR when the extension isn't installed.

How the Pieces Ship

DisplayXR is deliberately split across repositories so each piece evolves on its own cadence and the runtime stays vendor-neutral — the runtime binary carries no vendor SDK and no shell code. You can install everything at once with the meta-installer, or add components individually.

Runtime: The OpenXR runtime, service, and native compositors. Repository: displayxr-runtime. Ships as a Windows installer and a macOS .pkg.
Vendor plug-ins: Display-processor DLLs the runtime discovers at startup; each implements the weaving / interlacing for a specific panel. The Leia SR plug-in is the reference. See the Vendors page.
Workspace controllers: Optional spatial-desktop processes (windowing, launcher). The reference DisplayXR Shell ships from its own repo; see Workspace Controllers below.
Meta-installer: Bundles the above with pinned, compatible versions into one download. Repository: displayxr-installer.

Workspace Controllers

The runtime is useful on its own. A bare install gives you a standards-compliant OpenXR + WebXR surface for a 3D display — full-screen apps, simulation backend, native compositors, the WebXR bridge. No spatial desktop, no windowing, no launcher.

A workspace controller is an optional process that adds spatial-desktop features on top: multi-app composition, window chrome, layout presets, an app launcher. The runtime exposes two extensions for this: XR_EXT_spatial_workspace for window pose / hit-test / capture, and XR_EXT_app_launcher for tile registration and lifecycle. Anything that speaks them is a first-class controller.

Reference: DisplayXR Shell

The reference shell is distributed separately as a polished, opinionated spatial desktop: 3D window manager, 2D capture, focus-adaptive 2D/3D mode, layout presets, launcher, and MCP control. It is optional.

Build your own

Typical targets are an OEM-branded workspace, a vertical cockpit (CAD, medical, automotive HMI), a kiosk, or an AI-agent driver. Implement the two extensions, register your binary, and the runtime treats it the same as the reference shell.

Activation is gated by orchestrator-PID match: the runtime trusts the binary it was configured to spawn, not a brand string. OEMs point a single service.json field at whichever controller the SKU should run.
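The PID-match gate can be sketched as follows. This is a simplified model under stated assumptions, not the runtime's code: ControllerGate and its methods are hypothetical names, the service.json field name is not given here so the configured path is just passed in, and no process is actually spawned.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ControllerGate:
    # Path taken from service.json; the field name is not shown in this document.
    configured_binary: str
    spawned_pid: Optional[int] = None

    def record_spawn(self, pid: int) -> None:
        """Remember the PID of the controller process the runtime itself launched."""
        self.spawned_pid = pid

    def may_activate(self, caller_pid: int) -> bool:
        """Trust only the process that was spawned, regardless of what it calls itself."""
        return self.spawned_pid is not None and caller_pid == self.spawned_pid


gate = ControllerGate(configured_binary="oem_workspace.exe")  # hypothetical binary name
before_spawn = gate.may_activate(4242)
gate.record_spawn(4242)  # PID returned by the (simulated) spawn
```

After the spawn is recorded, gate.may_activate(4242) succeeds while any other PID is refused, which is why a process cannot gain controller rights merely by presenting a familiar brand string.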

Spec details: workspace-controller-registration.md.

AI Control Surface

DisplayXR exposes live spatial state and control to AI agents over the Model Context Protocol. The framework is a separate, embeddable library at displayxr-mcp — JSON-RPC 2.0 over a unix-socket / Windows-named-pipe transport, with a stdio bridge for any MCP client (Claude Code, voice CLI, custom agent).

Runtime tools (Phase A)

Per-PID server inside each app's runtime DLL. Introspection: list_sessions, get_display_info, get_kooima_params, capture_frame, tail_log.

Workspace tools (Phase B)

Per-workspace-controller server. Window control: list_windows, get/set_window_pose, set_focus, save/load_workspace. Lives in the controller, not the runtime — third-party controllers ship their own.

The library has no runtime or shell coupling — any C project can consume it via CMake FetchContent and register its own tools. Every tool call is audit-logged and gated by a per-client allowlist.
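To make the allowlist and audit-log behaviour concrete, here is a minimal sketch of a tool server. It assumes the standard MCP tools/call method carried in a JSON-RPC 2.0 envelope; ToolServer, make_tool_call, the client names, the returned display fields, and the use of error code -32601 for a refused tool are illustrative assumptions, not the displayxr-mcp API. The result payload is also simplified compared with a full MCP response.

```python
import json
from typing import Any, Callable, Dict, List, Set


def make_tool_call(request_id: int, tool: str, arguments: Dict[str, Any]) -> str:
    """Encode an MCP-style tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


class ToolServer:
    def __init__(self, allowlist: Dict[str, Set[str]]) -> None:
        self.allowlist = allowlist  # client name -> tools it may call
        self.tools: Dict[str, Callable[..., Any]] = {}
        self.audit_log: List[str] = []

    def register(self, name: str, handler: Callable[..., Any]) -> None:
        self.tools[name] = handler

    def handle(self, client: str, raw: str) -> Dict[str, Any]:
        request = json.loads(raw)
        name = request["params"]["name"]
        allowed = name in self.tools and name in self.allowlist.get(client, set())
        # Every call is logged, whether or not it is permitted.
        self.audit_log.append(f"{client} {name} {'allowed' if allowed else 'denied'}")
        if not allowed:
            error = {"code": -32601, "message": "tool not available"}
            return {"jsonrpc": "2.0", "id": request["id"], "error": error}
        result = self.tools[name](**request["params"]["arguments"])
        return {"jsonrpc": "2.0", "id": request["id"], "result": result}


server = ToolServer(allowlist={"agent-a": {"get_display_info"}})
# Hypothetical handler; the real get_display_info fields are not documented here.
server.register("get_display_info", lambda: {"width_px": 2560, "height_px": 1600})
ok = server.handle("agent-a", make_tool_call(1, "get_display_info", {}))
denied = server.handle("agent-b", make_tool_call(2, "get_display_info", {}))
```

The transport (unix socket, named pipe, or stdio bridge) is deliberately left out; only the message shape and the gating order are shown.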

End users opt in by installing DisplayXR MCP Tools (releases), an optional third installer alongside the runtime and the shell. It writes HKLM\Software\DisplayXR\Capabilities\MCP\Enabled, a registry capability flag that the runtime and the shell read at startup to decide whether to spawn their MCP server thread. The DISPLAYXR_MCP=1 environment variable still works as a process-local override (CI / dev / quick-disable).
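The precedence between the capability flag and the environment override might look like the sketch below. It is an assumption-laden illustration: mcp_enabled is a hypothetical helper, the registry value is passed in as a boolean rather than read from the registry, and treating any DISPLAYXR_MCP value other than "1" as "disabled" is a guess at how the quick-disable case works.

```python
from typing import Mapping


def mcp_enabled(capability_flag: bool, environ: Mapping[str, str]) -> bool:
    """Decide whether to start the MCP server thread for this process."""
    override = environ.get("DISPLAYXR_MCP")
    if override is not None:
        # Process-local override wins over the installed capability flag.
        return override == "1"
    return capability_flag


installed_no_override = mcp_enabled(True, {})
dev_override = mcp_enabled(False, {"DISPLAYXR_MCP": "1"})
quick_disable = mcp_enabled(True, {"DISPLAYXR_MCP": "0"})
```

Passing the environment in as a mapping keeps the decision easy to exercise in CI without touching the machine's registry.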

Two Compositor Paths

The runtime chooses at session creation whether to composite inside the application process or delegate to the service. This decision is transparent to the application — it just uses OpenXR as normal.

In-process (native)

The app, compositor, and display processor all live in one process, on the app's own GPU device. Zero IPC overhead. Used by most native applications running outside a sandbox.

IPC (service)

The app connects to the service over a named pipe. Swapchain textures are shared cross-process via OS primitives. The service composites all connected clients into a single output. Used by sandboxed browsers (Chrome WebXR) and apps launched by the shell.

The runtime picks IPC automatically when it detects a sandboxed process (Chrome AppContainer, UWP) or a shell-managed session; otherwise it composites in-process.
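The selection rule above reduces to a single condition. A minimal sketch, assuming the two detection results are already available as booleans; CompositorPath and choose_path are hypothetical names, and how the runtime actually detects a sandbox or a shell-managed session is not covered here.

```python
from enum import Enum


class CompositorPath(Enum):
    IN_PROCESS = "in-process"
    IPC = "ipc"


def choose_path(sandboxed: bool, shell_managed: bool) -> CompositorPath:
    """Pick IPC for sandboxed or shell-managed sessions, in-process otherwise."""
    if sandboxed or shell_managed:
        return CompositorPath.IPC
    return CompositorPath.IN_PROCESS


native_app = choose_path(sandboxed=False, shell_managed=False)
chrome_webxr = choose_path(sandboxed=True, shell_managed=False)
shell_launched = choose_path(sandboxed=False, shell_managed=True)
```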

Per-Graphics-API Design

This per-API design means no texture copies between APIs, no translation overhead, and no dependency on a single "blessed" graphics backend. The compositor that runs is the one that matches the application's chosen API.

The runtime selects the correct compositor at session creation time based on the graphics binding the application provides. This is transparent to the application — it simply uses OpenXR as normal.
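Selecting a compositor from the graphics binding is essentially a table lookup. The sketch below keys on the standard OpenXR graphics-binding structure names; the compositor labels, the select_compositor helper, and the choice to raise ValueError for an unknown binding are assumptions for illustration. Only the Win32 OpenGL binding is listed, since this page does not say which binding the macOS OpenGL path uses.

```python
# Standard OpenXR binding struct name -> compositor label (labels are hypothetical).
COMPOSITOR_FOR_BINDING = {
    "XrGraphicsBindingD3D11KHR": "d3d11",
    "XrGraphicsBindingD3D12KHR": "d3d12",
    "XrGraphicsBindingVulkanKHR": "vulkan",
    "XrGraphicsBindingMetalKHR": "metal",
    "XrGraphicsBindingOpenGLWin32KHR": "opengl",
}


def select_compositor(binding_type: str) -> str:
    """Return the compositor matching the app's binding; no cross-API fallback."""
    try:
        return COMPOSITOR_FOR_BINDING[binding_type]
    except KeyError:
        raise ValueError(f"unsupported graphics binding: {binding_type}") from None


chosen = select_compositor("XrGraphicsBindingD3D12KHR")
```

There is deliberately no fallback entry: under the per-API design, an app that binds D3D12 is composited by the D3D12 compositor and nothing else.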

Separation of Concerns

DisplayXR draws a clean boundary between two responsibilities:

App-Facing Portability

Standard OpenXR API, session management, swapchain handling, extension dispatch. Applications write once against this interface.

Vendor-Specific Processing

Weaving, interlacing, calibration, eye tracking integration. This lives in the display processor layer and is owned by the hardware vendor.

Simulation Driver

DisplayXR includes a simulation display processor (sim_display) that allows development and testing without physical 3D display hardware. It provides the same interface as a hardware-backed display processor but renders to a standard window.

This means developers can build, test, and iterate on spatial display applications using any standard monitor. The simulation path supports all graphics APIs and all application classes.
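The idea that the simulation path "provides the same interface" as a hardware-backed processor can be shown with a toy example. Everything here is hypothetical: the real display-processor interface is defined by the plug-in ABI and is not reproduced on this page, column interlacing merely stands in for vendor weaving, and the side-by-side output is an assumed simulation view.

```python
from typing import List, Protocol

Image = List[List[int]]  # rows of pixel values


class DisplayProcessor(Protocol):
    def process(self, left: Image, right: Image) -> Image: ...


class ColumnInterlacer:
    """Toy stand-in for a hardware weaver: alternate columns from each eye."""

    def process(self, left: Image, right: Image) -> Image:
        return [
            [l if col % 2 == 0 else r for col, (l, r) in enumerate(zip(lrow, rrow))]
            for lrow, rrow in zip(left, right)
        ]


class SimDisplay:
    """Toy stand-in for sim_display: both views side by side in a normal window."""

    def process(self, left: Image, right: Image) -> Image:
        return [lrow + rrow for lrow, rrow in zip(left, right)]


def present(processor: DisplayProcessor, left: Image, right: Image) -> Image:
    # The caller is identical whichever processor is plugged in.
    return processor.process(left, right)


left_view = [[1, 1], [1, 1]]
right_view = [[2, 2], [2, 2]]
woven = present(ColumnInterlacer(), left_view, right_view)
simulated = present(SimDisplay(), left_view, right_view)
```

The present function never inspects which processor it was given, which is the property that lets the same application run unchanged against hardware or simulation.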

For a deeper look at the in-process vs service compositor split, see in-process-vs-service.md in the runtime repo.

Explore the full runtime source code on GitHub.
