Architecture

How the DisplayXR stack works, from OpenXR API layer through native compositors to vendor display processors.

Stack Overview

DisplayXR sits between the OpenXR API and vendor-specific display hardware. Applications write to the standard OpenXR interface. DisplayXR handles session management, compositor orchestration, and extension dispatch. Vendor-specific processing — weaving, interlacing, calibration — happens below, in the display processor layer.

App (any graphics API)
→ OpenXR API Layer
→ DisplayXR Runtime
→ Native compositor: D3D11 | D3D12 | Vulkan | Metal | OpenGL
→ Display Processor (vendor-specific)
→ 3D Display

Each graphics API gets its own native compositor. No cross-API interop or Vulkan intermediary is required.

Native Compositor Model

Most XR runtimes use a single graphics API internally and translate submitted textures as needed. DisplayXR takes a different approach: each supported graphics API has its own dedicated compositor implementation.

D3D11

Full compositor with LeiaSR weaver integration and window binding support.

D3D12

Native compositor with window binding. Command queue managed per session.

Vulkan

Native compositor with swapchain management. MoltenVK path available on macOS.

Metal

Native compositor with sim_display weaver and window binding. macOS primary path.

OpenGL

Native compositor supporting both Windows and macOS contexts.

Shipping Components

DisplayXR ships as four cooperating components. Most applications interact only with the first; the others come into play when apps are sandboxed, when the shell is running, or when the target is a web page.

Runtime

OpenXR API implementation. Loaded in-process by every OpenXR application.

Service

IPC server and multi-compositor. Hosts the display for sandboxed apps and multi-app shell sessions. Starts at login and sits in the system tray.

Shell

Reference workspace controller. Arranges 3D and 2D apps in a shared 3D scene with window chrome, layout presets, and an app launcher. Distributed separately from the runtime; entirely optional. See Workspace Controllers below.

WebXR Bridge

Chrome extension plus a local bridge that gives WebXR pages the full DisplayXR surface (display geometry, rendering mode switching, tracked eye poses, HUD overlay, and input forwarding) on top of Chrome's native WebXR session. Ships with a three.js reference sample and a standalone minimal starter, and falls back gracefully to standard WebXR when the extension isn't installed.

Workspace Controllers

The runtime is useful on its own. A bare install gives you a standards-compliant OpenXR + WebXR surface for a 3D display — full-screen apps, simulation backend, native compositors, the WebXR bridge. No spatial desktop, no windowing, no launcher.

A workspace controller is an optional process that adds spatial-desktop features on top: multi-app composition, window chrome, layout presets, an app launcher. The runtime exposes two extensions for this: XR_EXT_spatial_workspace for window pose / hit-test / capture, and XR_EXT_app_launcher for tile registration and lifecycle. Anything that speaks them is a first-class controller.

Reference: DisplayXR Shell

Distributed separately as a polished, opinionated spatial desktop — 3D window manager, 2D capture, focus-adaptive 2D/3D mode, layout presets, launcher, MCP control. Optional.

Build your own

Examples include an OEM-branded workspace, a vertical cockpit (CAD, medical, automotive HMI), a kiosk, or an AI-agent driver. Implement the two extensions, register your binary, and the runtime treats it the same as the reference shell.

Activation is gated by orchestrator-PID match: the runtime trusts the binary it was configured to spawn, not a brand string. OEMs point a single service.json field at whichever controller the SKU should run.
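The PID-match rule can be pictured with a small sketch. Everything here is hypothetical: the class and method names are invented for illustration, and the real check lives in the runtime and is described in workspace-controller-registration.md. The point it shows is that the claimed brand string plays no part in the decision.

```python
from typing import Optional


class ControllerGate:
    """Hypothetical sketch of PID-based trust; not the runtime's actual code."""

    def __init__(self) -> None:
        self._spawned_pid: Optional[int] = None

    def record_spawn(self, pid: int) -> None:
        # Called once the configured controller binary has been launched.
        self._spawned_pid = pid

    def may_activate(self, caller_pid: int, claimed_name: str) -> bool:
        # claimed_name is deliberately ignored: only the PID of the
        # process that was actually spawned is trusted.
        return self._spawned_pid is not None and caller_pid == self._spawned_pid


gate = ControllerGate()
gate.record_spawn(4242)
print(gate.may_activate(4242, "Any Brand"))        # True
print(gate.may_activate(9999, "DisplayXR Shell"))  # False
```

A process that merely calls itself "DisplayXR Shell" is rejected, while the spawned controller is accepted whatever it is named.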

Spec details: workspace-controller-registration.md.

AI Control Surface

DisplayXR exposes live spatial state and control to AI agents over the Model Context Protocol. The framework is a separate, embeddable library at displayxr-mcp — JSON-RPC 2.0 over a unix-socket / Windows-named-pipe transport, with a stdio bridge for any MCP client (Claude Code, voice CLI, custom agent).

Runtime tools (Phase A)

Per-PID server inside each app's runtime DLL. Introspection: list_sessions, get_display_info, get_kooima_params, capture_frame, tail_log.

Workspace tools (Phase B)

Per-workspace-controller server. Window control: list_windows, get/set_window_pose, set_focus, save/load_workspace. Lives in the controller, not the runtime — third-party controllers ship their own.
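As a rough illustration of what a client sends, the sketch below builds a JSON-RPC 2.0 request for one of the runtime tools listed above. It assumes the server follows the standard MCP tools/call method; the helper name make_tool_call is made up for this example, and the socket or named-pipe framing is not shown.

```python
import json
from itertools import count
from typing import Any, Dict, Optional

_request_ids = count(1)  # JSON-RPC requests carry a client-chosen id


def make_tool_call(tool: str, arguments: Optional[Dict[str, Any]] = None) -> str:
    """Serialize an MCP-style tool invocation as a JSON-RPC 2.0 request."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",  # standard MCP method name (assumed here)
        "params": {"name": tool, "arguments": arguments or {}},
    }
    return json.dumps(request)


print(make_tool_call("get_display_info"))
```

The same helper would serve workspace tools such as list_windows, since only the tool name and arguments change.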

The library has no runtime or shell coupling — any C project can consume it via CMake FetchContent and register its own tools. Every tool call is audit-logged and gated by a per-client allowlist.
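A minimal sketch of the gating idea follows, under the assumption that the allowlist maps a client identifier to the tool names it may call. ToolGate and its fields are hypothetical names, not the displayxr-mcp API. Every call is logged before the permission check, so denied attempts are recorded too.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Set


@dataclass
class ToolGate:
    """Hypothetical allowlist-plus-audit wrapper around tool handlers."""

    allowlist: Dict[str, Set[str]]  # client id -> permitted tool names
    audit_log: List[Dict[str, Any]] = field(default_factory=list)

    def call(self, client: str, tool: str, handler: Callable[..., Any], **arguments: Any) -> Any:
        allowed = tool in self.allowlist.get(client, set())
        # Log first, so rejected calls leave a trace as well.
        self.audit_log.append({"client": client, "tool": tool, "allowed": allowed})
        if not allowed:
            raise PermissionError(f"{client!r} may not call {tool!r}")
        return handler(**arguments)


gate = ToolGate(allowlist={"agent": {"list_sessions"}})
print(gate.call("agent", "list_sessions", lambda: ["session-1"]))
```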

End users opt in by installing DisplayXR MCP Tools (see releases), an optional third installer alongside the runtime and the shell. It writes HKLM\Software\DisplayXR\Capabilities\MCP\Enabled, a registry capability flag that the runtime and the shell read at startup to decide whether to spawn their MCP server thread. The DISPLAYXR_MCP=1 environment variable still works as a process-local override (CI / dev / quick-disable).
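The startup decision might look roughly like the sketch below. Two assumptions go beyond the text: the environment variable, when set, takes precedence over the registry flag, and any value other than "1" disables the server. The registry value is passed in as a plain boolean so the example runs on any platform.

```python
from typing import Mapping


def mcp_enabled(env: Mapping[str, str], registry_flag: bool) -> bool:
    """Decide whether to spawn the MCP server thread (illustrative only)."""
    override = env.get("DISPLAYXR_MCP")
    if override is not None:
        # Process-local override wins; treating non-"1" values as
        # "disabled" is an assumption of this sketch.
        return override == "1"
    return registry_flag


print(mcp_enabled({}, registry_flag=True))                       # True
print(mcp_enabled({"DISPLAYXR_MCP": "1"}, registry_flag=False))  # True
print(mcp_enabled({"DISPLAYXR_MCP": "0"}, registry_flag=True))   # False
```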

Two Compositor Paths

The runtime chooses at session creation whether to composite inside the application process or delegate to the service. This decision is transparent to the application — it just uses OpenXR as normal.

In-process (native)

The app, compositor, and display processor all live in one process, on the app's own GPU device. Zero IPC overhead. Used by most native applications running outside a sandbox.

IPC (service)

The app connects to the service over a named pipe. Swapchain textures are shared cross-process via OS primitives. The service composites all connected clients into a single output. Used by sandboxed browsers (Chrome WebXR) and apps launched by the shell.

The runtime picks IPC automatically when it detects a sandboxed process (Chrome AppContainer, UWP) or a shell-managed session; otherwise it composites in-process.
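The selection rule reduces to a two-input decision. The sketch below restates it; the enum and function names are illustrative, and how the runtime actually detects a sandbox or a shell-managed session is not shown.

```python
from enum import Enum


class CompositorPath(Enum):
    IN_PROCESS = "in-process"
    IPC = "ipc"


def choose_path(sandboxed: bool, shell_managed: bool) -> CompositorPath:
    """IPC for sandboxed or shell-managed sessions, in-process otherwise."""
    if sandboxed or shell_managed:
        return CompositorPath.IPC
    return CompositorPath.IN_PROCESS


print(choose_path(sandboxed=False, shell_managed=False).value)  # in-process
print(choose_path(sandboxed=True, shell_managed=False).value)   # ipc
```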

Per-Graphics-API Design

Giving each graphics API its own compositor means no texture copies between APIs, no translation overhead, and no dependency on a single "blessed" graphics backend. The compositor that runs is the one that matches the application's chosen API.

The runtime selects the correct compositor at session creation time based on the graphics binding the application provides. This is transparent to the application — it simply uses OpenXR as normal.
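One way to picture the dispatch is a lookup keyed on the graphics-binding structure type passed at session creation. The keys below are the OpenXR structure type names for each binding; the Compositor enum and select_compositor function are hypothetical. Only the Win32 OpenGL binding is listed, so the macOS OpenGL path is left out of this sketch.

```python
from enum import Enum


class Compositor(Enum):
    D3D11 = "d3d11"
    D3D12 = "d3d12"
    VULKAN = "vulkan"
    METAL = "metal"
    OPENGL = "opengl"


# Binding structure type (as named in the OpenXR headers) -> native compositor.
BINDING_TO_COMPOSITOR = {
    "XR_TYPE_GRAPHICS_BINDING_D3D11_KHR": Compositor.D3D11,
    "XR_TYPE_GRAPHICS_BINDING_D3D12_KHR": Compositor.D3D12,
    "XR_TYPE_GRAPHICS_BINDING_VULKAN_KHR": Compositor.VULKAN,
    "XR_TYPE_GRAPHICS_BINDING_METAL_KHR": Compositor.METAL,
    "XR_TYPE_GRAPHICS_BINDING_OPENGL_WIN32_KHR": Compositor.OPENGL,
}


def select_compositor(binding_type: str) -> Compositor:
    """Pick the compositor that matches the application's graphics binding."""
    try:
        return BINDING_TO_COMPOSITOR[binding_type]
    except KeyError:
        raise ValueError(f"unsupported graphics binding: {binding_type}") from None


print(select_compositor("XR_TYPE_GRAPHICS_BINDING_D3D12_KHR").value)  # d3d12
```

Because the lookup is one-to-one, no texture ever needs to cross from one API to another.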

Separation of Concerns

DisplayXR draws a clean boundary between two responsibilities:

App-Facing Portability

Standard OpenXR API, session management, swapchain handling, extension dispatch. Applications write once against this interface.

Vendor-Specific Processing

Weaving, interlacing, calibration, eye-tracking integration. This lives in the display processor layer and is owned by the hardware vendor.

Simulation Driver

DisplayXR includes a simulation display processor (sim_display) that allows development and testing without physical 3D display hardware. It provides the same interface as a hardware-backed display processor but renders to a standard window.
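The "same interface" idea can be sketched with two interchangeable processors. This is a toy model, not the real display processor API: the process method, the list-of-rows image type, the column-alternating stand-in for vendor weaving, and the side-by-side simulation output are all assumptions made for illustration.

```python
from typing import List, Protocol

Image = List[List[int]]  # rows of pixel values; toy stand-in for a GPU texture


class DisplayProcessor(Protocol):
    def process(self, left: Image, right: Image) -> Image: ...


class ToyInterlacer:
    """Stand-in for a vendor processor: alternates columns from each eye."""

    def process(self, left: Image, right: Image) -> Image:
        return [
            [l if x % 2 == 0 else r for x, (l, r) in enumerate(zip(lrow, rrow))]
            for lrow, rrow in zip(left, right)
        ]


class SimDisplay:
    """Stand-in for sim_display: places both views side by side."""

    def process(self, left: Image, right: Image) -> Image:
        return [lrow + rrow for lrow, rrow in zip(left, right)]


def present(processor: DisplayProcessor, left: Image, right: Image) -> Image:
    # The caller does not know or care which processor it was given.
    return processor.process(left, right)


left, right = [[1, 1], [1, 1]], [[2, 2], [2, 2]]
print(present(SimDisplay(), left, right))     # [[1, 1, 2, 2], [1, 1, 2, 2]]
print(present(ToyInterlacer(), left, right))  # [[1, 2], [1, 2]]
```

Swapping one processor for the other changes only the final image, which is why the code above it in the stack can stay identical with or without 3D hardware.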

This means developers can build, test, and iterate on spatial display applications using any standard monitor. The simulation path supports all graphics APIs and all application classes.

For a deeper look at the in-process vs service compositor split, see in-process-vs-service.md in the runtime repo.

Explore the full runtime source code on GitHub.