Radiance: Architecture of a Professional HDR Processing Suite for Generative AI Pipelines

FXTD Studios Pipeline Team · Radiance v2.1.0 · February 2026
Abstract. Radiance is a 76-node ComfyUI extension implementing production-grade HDR image processing, color science, film emulation, and real-time interactive viewing for generative AI pipelines. This document describes the system architecture: the 32-bit floating-point tensor pipeline, the multi-layer color science stack built on OpenColorIO and colour-science, the WebGL-accelerated interactive viewer with LRU frame caching and ASC CDL export, the CIE L*a*b* grade matching algorithm, the 33³ .cube LUT bake/apply engine, and the motion-aware temporal smoothing system for AI-generated video flicker reduction. Extension points, performance characteristics, and design rationale are discussed.

§1 System Overview

Radiance extends ComfyUI with a complete VFX-grade post-production layer. The overall dataflow through its modules is:

ComfyUI Graph
  │
  ▼
nodes_sampler.py            ← Radiance Sampler Pro (Flux latent → image)
  │  image: Tensor[B, H, W, 3] fp32
  ▼
nodes_grade.py              ← Lift/Gamma/Gain, LAB match, preset, grade_info JSON
  │
  ├──→ nodes_lut.py         ← LUTBake → .cube file (Resolve/Nuke)
  │                           LUTApply ← trilinear interp
  ├──→ nodes_temporal.py    ← EMA flicker reduction (video batches)
  ├──→ nodes_scopes.py      ← FalseColor, Waveform, Vectorscope
  ├──→ nodes_overlay.py     ← BlendComposite, MetadataOverlay
  └──→ nodes_radiance_viewer.py
         │  .rhdr  fp32 raw data
         │  .rpick zlib fp32 pick buffer
         ▼
       radiance_webgl.js    ← WebGL renderer (GLSL, fp16 textures, GPU histogram)
         │
         └──→ radiance_viewer.js ← UI, CDL export, LRU cache, Display-P3

§2 Module Structure

The package is organized into a flat set of nodes_*.py modules (auto-discovered) and sub-packages for shared logic:

◎ nodes_grade.py

  • RadianceGrade (v2.0)
  • RadianceGradeMatch
  • RadianceApplyGradeInfo
  • _apply_grade(), _match_grade_params(), _rgb_to_lab()

◎ nodes_lut.py

  • RadianceLUTBake
  • RadianceLUTApply
  • _generate_cube_lut(), _write_cube_file()
  • _apply_cube_lut_to_image() — trilinear

◎ nodes_temporal.py

  • RadianceTemporalSmooth
  • RadianceFlickerAnalyze
  • EMA loop, motion mask, JSON stats

◎ nodes_scopes.py

  • RadianceWaveform
  • RadianceVectorscope
  • RadianceFalseColor (v1.0)
  • _FC_ZONES — 7-zone palette

◎ nodes_overlay.py

  • RadianceMetadataOverlay
  • RadianceBlendComposite
  • 8 blend modes, MASK support

◎ nodes_radiance_viewer.py

  • RadianceProViewer
  • _save_pick_buffer() — zlib fp32
  • build_cdl_xml() — ASC CDL v1.2
  • RPICK_MAGIC header

◎ color/ sub-package

  • color_utils.py — shared transforms
  • Log curve encode/decode
  • Color space matrices (sRGB, P3, AWG4)

◎ film/ sub-package

  • camera_profiles.py — 30+ sensors
  • Grain algorithms, halation
  • Film stock transfer curves

◎ hdr/ sub-package

  • Tone mapping operators
  • Exposure blend (Mertens)
  • Highlight synthesis

◎ js/ (frontend)

  • radiance_webgl.js — GPU renderer
  • radiance_viewer.js — viewer UI
  • radiance_layout.js — node layouts

§3 Data Pipeline

All inter-node communication uses PyTorch fp32 CPU tensors of shape [B, H, W, C] where B = batch size, H = height, W = width, C = channels (typically 3 for RGB). This matches ComfyUI's standard IMAGE convention.

fp32 Guarantee

Every node casts inputs via .float() immediately and never calls .clamp(0,1) on intermediate data. The VFX audit test suite (89 tests) verifies this with dedicated tests:

test_pipeline_no_clamp_hdr   # values > 1.0 must survive full pipeline
test_pipeline_preserves_float32  # dtype must remain fp32 at output
test_pure_black_no_lift      # 0.0 → 0.0 through all grade nodes
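
These invariants can be expressed as a small reusable harness. The sketch below is illustrative and NumPy-based for self-containment; the shipped suite runs equivalent checks against the real nodes on torch tensors:

```python
import numpy as np

def assert_hdr_safe(fn):
    """Check that a pipeline op preserves fp32 dtype, HDR values > 1.0,
    and pure black (the three guarantees listed above)."""
    img = np.array([[[0.0, 0.5, 4.0]]], dtype=np.float32)  # 4.0 = HDR highlight
    out = fn(img)
    assert out.dtype == np.float32, "dtype must remain fp32"
    assert out.max() > 1.0, "values > 1.0 must survive (no clamp)"
    assert fn(np.zeros_like(img)).min() == 0.0, "pure black must stay black"

# A gain-only grade satisfies all three guarantees:
assert_hdr_safe(lambda img: img * np.float32(1.2))
```

A node that clamps, e.g. `lambda img: np.clip(img, 0, 1)`, fails the harness at the HDR check.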

Pick Buffer Sidecar (.rpick)

When the viewer processes a frame, _save_pick_buffer() downsamples the raw fp32 tensor to ≤256px and saves it as a zlib-compressed binary sidecar alongside the display PNG:

fp32 tensor [B,H,W,3]
resize to ≤256px
RPICK_MAGIC + zlib(numpy.tobytes)
frame_N.rpick

The JavaScript viewer fetches .rpick on hover to read true scene-linear HDR values at the cursor — bypassing the tonemapped 8-bit display PNG and providing accurate EV readout.
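
A minimal sketch of the sidecar round-trip, assuming an illustrative 4-byte magic plus a uint32 [H, W, C] header (the real `RPICK_MAGIC` constant and exact layout live in nodes_radiance_viewer.py):

```python
import zlib
import numpy as np

RPICK_MAGIC = b"RPK1"  # illustrative; not necessarily the real magic bytes

def save_pick_buffer(arr: np.ndarray, path: str) -> None:
    """Write a downsampled fp32 [H, W, 3] array as a zlib-compressed sidecar."""
    h, w, c = arr.shape
    header = RPICK_MAGIC + np.array([h, w, c], dtype=np.uint32).tobytes()
    with open(path, "wb") as f:
        f.write(header + zlib.compress(arr.astype(np.float32).tobytes()))

def load_pick_buffer(path: str) -> np.ndarray:
    """Inverse of save_pick_buffer(): validate magic, decompress, reshape."""
    with open(path, "rb") as f:
        blob = f.read()
    assert blob[:4] == RPICK_MAGIC, "not a pick buffer"
    h, w, c = np.frombuffer(blob[4:16], dtype=np.uint32)
    data = zlib.decompress(blob[16:])
    return np.frombuffer(data, dtype=np.float32).reshape(h, w, c)
```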

§4 Color Science Stack

Radiance implements a layered color pipeline that mirrors broadcast and digital cinema workflows:

Layer 1 — Input Transform (IDT)

  • sRGB linearize
  • ARRI LogC3 decode
  • ARRI LogC4 decode
  • RED Log3G10 decode
  • Panasonic V-Log decode
  • Canon Log3 decode
  • Sony S-Log3 decode

Layer 2 — Working Space

  • ACEScg (AP1)
  • ACES AP0
  • sRGB Linear
  • Rec.2020 Linear
  • DaVinci Wide Gamut
  • ARRI Wide Gamut 4
  • XYZ D65

Layer 3 — Grade (CDL)

  • Lift (per-channel)
  • Gamma (sign-preserving power)
  • Gain (per-channel)
  • Offset (global)
  • Contrast (pivot)
  • Saturation (luma-preserving)
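
The sign-preserving power in the gamma stage can be sketched as follows. This is a minimal NumPy illustration of the Layer-3 idea in ASC CDL terms (slope, offset, power), not the actual `_apply_grade()` implementation, and parameters are simplified to scalars:

```python
import numpy as np

def apply_cdl(img, gain=1.0, lift=0.0, gamma=1.0):
    """Slope/offset/power with a sign-preserving power, so negative
    scene-linear values (common after wide-gamut conversions) keep
    their sign instead of producing NaNs."""
    x = img * gain + lift                            # slope, then offset
    return np.sign(x) * np.abs(x) ** (1.0 / gamma)   # sign-preserving power
```

With identity parameters the function is a no-op, and negative inputs stay negative at any gamma.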

Layer 4 — Look (LUT)

  • 33³ .cube (trilinear)
  • OCIO
  • CDL .cc/.ccc
  • LAB match offset

Layer 5 — Output Transform (ODT)

  • ACES 2.0 RRT+ODT
  • AgX
  • Filmic (Hable)
  • Reinhard
  • sRGB OETF
  • Rec.709 EOTF
  • Display-P3

§5 Viewer Architecture

The Radiance Pro Viewer is a custom ComfyUI widget implemented as a full-screen canvas overlay. It communicates with the Python backend over ComfyUI's WebSocket API message bus:

Python backend — nodes_radiance_viewer.py
  execute() → saves frame_N.png (display), frame_N.rhdr (raw fp32), frame_N.rpick (pick)
            → api.send_sync("radiance_result", { images: [...] })
  │
  │ WebSocket
  ▼
radiance_viewer.js — RadianceViewer class
  ├─ loadCurrentFrame()    ← fetches .rhdr, uploads via loadFloat16TextureCached()
  ├─ _fetchPickBuffer()    ← fetches .rpick for fp32 hover values
  ├─ _renderGPUHistogram() ← delegates to renderer.renderHistogram()
  ├─ exportCDL()           ← encodes grade_info → ASC CDL v1.2 XML, downloads
  └─ importCDL()           ← parses .cdl XML, applies to active grade
  │
  ▼
radiance_webgl.js — RadianceWebGLRenderer
  ├─ loadFloat16TextureCached(frameId, data) ← LRU Map(8)
  ├─ renderHistogram(canvas, logScale)       ← 256-bin GPU pass
  ├─ setLinearFalseColor(v)                  ← pre-OETF false color
  └─ static initDisplayP3(canvas)            ← CSS matchMedia P3 detection

§6 WebGL Renderer Pipeline

The renderer uses WebGL 2.0, in which fp16 (RGBA16F) texture storage is core functionality (a WebGL 1.0 context would need the OES_texture_half_float extension). The GLSL pipeline processes scene-linear data and applies the OETF (display transform) on the GPU:

// GLSL ES 3.00 fragment shader (simplified)
in vec2 v_uv;
out vec4 fragColor;
uniform sampler2D u_hdrTexture;   // fp16 scene-linear
uniform float u_exposure;
uniform int u_oetfMode;           // sRGB / Rec.709 / P3
uniform bool u_linearFalseColor;  // v2.1: evaluate before OETF

vec3 hdr = texture(u_hdrTexture, v_uv).rgb;
hdr *= pow(2.0, u_exposure);

// False color evaluated in LINEAR space
if (u_linearFalseColor) {
    fragColor = vec4(falseColorLookup(hdr), 1.0);
    return;
}

// OETF (configurable: sRGB / Rec.709 / P3)
vec3 display = applyOETF(hdr, u_oetfMode);
fragColor = vec4(display, 1.0);

LRU Frame Cache

The viewer maintains an LRU Map of up to 8 WebGL texture objects keyed by frameId. This eliminates re-uploads during sequence scrubbing — a common bottleneck when working with large EXR sequences at 4K.

// js/radiance_webgl.js
loadFloat16TextureCached(frameId, data, width, height) {
    if (this._lruCache.has(frameId)) {
        // Move to end (most recently used)
        const tex = this._lruCache.get(frameId);
        this._lruCache.delete(frameId);
        this._lruCache.set(frameId, tex);
        return tex;
    }
    // Evict oldest if full
    if (this._lruCache.size >= 8) {
        const oldest = this._lruCache.keys().next().value;
        this.gl.deleteTexture(this._lruCache.get(oldest));
        this._lruCache.delete(oldest);
    }
    const tex = this._uploadHalfFloat(data, width, height);
    this._lruCache.set(frameId, tex);
    return tex;
}

§7 Temporal Processing

Exponential Moving Average (EMA)

RadianceTemporalSmooth applies per-pixel EMA across a temporal batch to reduce high-frequency flicker in AI-generated video. The update rule is:

ema_t = α · frame_t + (1 - α) · ema_{t-1}

where α ∈ (0, 1] controls the blend weight: lower α gives more smoothing, and α = 1 is a passthrough.

Motion-Aware Masking

To preserve sharp moving objects while smoothing static background grain, the motion-aware mode computes a per-pixel motion magnitude and adapts α locally:

motion_mag = |frame_t - ema_{t-1}|.mean(dim=-1)   # H×W scalar
motion_mask = (motion_mag > threshold).float()       # 0 or 1
eff_alpha = α · (1 - motion_mask) + 1.0 · motion_mask
# → α on static pixels, 1.0 on moving pixels (no blend = sharp)
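
Putting the two rules together, a NumPy sketch of the motion-aware EMA loop over a [T, H, W, 3] batch (the node itself operates on torch tensors; the `alpha` and `threshold` defaults here are illustrative):

```python
import numpy as np

def temporal_smooth(frames, alpha=0.3, threshold=0.1):
    """Motion-aware EMA: blend static pixels toward the running average,
    pass moving pixels through unchanged to keep them sharp."""
    ema = frames[0].copy()
    out = [ema.copy()]
    for frame in frames[1:]:
        motion = np.abs(frame - ema).mean(axis=-1, keepdims=True)  # H×W×1 magnitude
        mask = (motion > threshold).astype(frame.dtype)            # 1 where moving
        eff_alpha = alpha * (1.0 - mask) + mask                    # α static, 1.0 moving
        ema = eff_alpha * frame + (1.0 - eff_alpha) * ema
        out.append(ema.copy())
    return np.stack(out)
```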

Flicker Index Metric

RadianceFlickerAnalyze computes the flicker index as the coefficient of variation of per-frame luma means:

flicker_index = std(frame_means) / mean(frame_means)

Values below 0.01 are imperceptible; above 0.05 are visible to the human eye in rapid playback. This metric matches the ITU-R BT.1203 temporal uniformity definition.
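
A sketch of the metric, assuming Rec.709 luma weights for the per-frame means (the node's exact luma coefficients are an assumption here):

```python
import numpy as np

def flicker_index(frames):
    """Coefficient of variation of per-frame luma means.
    frames: [T, H, W, 3] scene-referred batch."""
    weights = np.array([0.2126, 0.7152, 0.0722], dtype=frames.dtype)  # Rec.709 luma
    luma = frames @ weights                 # [T, H, W]
    means = luma.mean(axis=(1, 2))          # one mean per frame
    return float(means.std() / means.mean())
```

A perfectly steady clip scores 0.0; a clip alternating between noticeably different brightness levels scores well above the 0.05 visibility threshold.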

§8 LUT Engine

Baking (RadianceLUTBake)

LUT baking samples the grade function on a 33³ identity lattice and writes the .cube format:

# Build identity grid — .cube ordering: R fastest, B slowest
lin = linspace(0, 1, 33)
r_grid = lin.repeat(33 * 33)
g_grid = lin.repeat_interleave(33).repeat(33)
b_grid = lin.repeat_interleave(33 * 33)
grid = stack([r_grid, g_grid, b_grid], dim=-1)   # (33³, 3)
out = _apply_grade(grid, ...)                      # apply all grade ops
cube = clamp(out, 0, 1)                            # clamp for SDR LUT
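
A minimal writer for the sampled lattice, consistent with the .cube conventions above (R fastest, B slowest). This is a sketch; the actual `_write_cube_file()` may emit additional metadata lines:

```python
import numpy as np

def write_cube_file(path, cube, title="Radiance grade"):
    """Write a flattened (n³, 3) lattice as a Resolve/Nuke-compatible
    .cube text file. `cube` must already be in R-fastest order."""
    n = round(len(cube) ** (1 / 3))
    with open(path, "w") as f:
        f.write(f'TITLE "{title}"\n')
        f.write(f"LUT_3D_SIZE {n}\n")
        f.write("DOMAIN_MIN 0.0 0.0 0.0\n")
        f.write("DOMAIN_MAX 1.0 1.0 1.0\n")
        for r, g, b in cube:
            f.write(f"{r:.6f} {g:.6f} {b:.6f}\n")
```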

Application (RadianceLUTApply)

LUT application uses 8-corner trilinear interpolation:

# .cube index: B*n² + G*n + R
r0, r1 = floor(R*(n-1)), ceil(R*(n-1))
# ... similarly for G, B
out = c000*(1-rf)*(1-gf)*(1-bf) + c100*rf*(1-gf)*(1-bf) +
      c010*(1-rf)*gf*(1-bf)   + c110*rf*gf*(1-bf) +
      c001*(1-rf)*(1-gf)*bf   + c101*rf*(1-gf)*bf +
      c011*(1-rf)*gf*bf       + c111*rf*gf*bf

This is implemented entirely in PyTorch tensor ops, making it GPU-acceleratable without any custom CUDA kernels.
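
A self-contained NumPy version of the same 8-corner gather, assuming the .cube index convention above (index = B·n² + G·n + R); the shipped node performs the equivalent gather in PyTorch:

```python
import numpy as np

def apply_cube_lut(img, lut, n):
    """Trilinear 3D LUT lookup. img: [..., 3] in [0, 1]; lut: (n³, 3)
    flattened with R fastest, B slowest."""
    scaled = np.clip(img, 0.0, 1.0) * (n - 1)
    lo = np.floor(scaled).astype(np.int64)
    hi = np.minimum(lo + 1, n - 1)
    f = scaled - lo                                   # fractional part [..., 3]

    def fetch(r, g, b):                               # gather one corner
        return lut[b * n * n + g * n + r]

    rf, gf, bf = f[..., 0:1], f[..., 1:2], f[..., 2:3]
    r0, g0, b0 = lo[..., 0], lo[..., 1], lo[..., 2]
    r1, g1, b1 = hi[..., 0], hi[..., 1], hi[..., 2]
    return (fetch(r0, g0, b0) * (1 - rf) * (1 - gf) * (1 - bf)
          + fetch(r1, g0, b0) * rf * (1 - gf) * (1 - bf)
          + fetch(r0, g1, b0) * (1 - rf) * gf * (1 - bf)
          + fetch(r1, g1, b0) * rf * gf * (1 - bf)
          + fetch(r0, g0, b1) * (1 - rf) * (1 - gf) * bf
          + fetch(r1, g0, b1) * rf * (1 - gf) * bf
          + fetch(r0, g1, b1) * (1 - rf) * gf * bf
          + fetch(r1, g1, b1) * rf * gf * bf)
```

Applying an identity lattice returns the input image unchanged, which makes a convenient correctness check.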

§9 Grade Matching Algorithm

RadianceGradeMatch transfers the color statistics of a reference image to a source image using CIE L*a*b* color space. This algorithm is equivalent to the Reinhard et al. [1] color transfer method.

Algorithm

1. Convert both images to CIE L*a*b* (D65 white point)
2. Compute per-channel mean μ and std σ for source and target
3. Scale ratio: s = σ_target / σ_source
4. Shift: t = (μ_target - μ_source * s) / 100   (normalized)
5. Map L* channel → uniform gain (luminance) + offset
   Map a*, b* channels → color cast offset in R,G,B space
6. Blend computed params at match_strength ∈ [0,1]

The result is a CDL-compatible gain/offset set stored as JSON in grade_info, enabling the match to be applied to arbitrary other images via ApplyGradeInfo.
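
The mean/std transfer at the core of steps 2–4 can be sketched per-channel as follows. This NumPy sketch shows the statistics step in generic channel space; the real node computes it in CIE L*a*b* and then maps the parameters back to CDL gain/offset as described in step 5:

```python
import numpy as np

def match_stats(source, reference):
    """Reinhard-style per-channel statistics transfer: return (gain, offset)
    such that source * gain + offset has the reference's mean and std."""
    mu_s = source.mean(axis=(0, 1))
    sd_s = source.std(axis=(0, 1)) + 1e-6        # guard against flat channels
    mu_r = reference.mean(axis=(0, 1))
    sd_r = reference.std(axis=(0, 1))
    gain = sd_r / sd_s                           # step 3: scale ratio
    offset = mu_r - mu_s * gain                  # step 4: shift
    return gain, offset
```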

§10 Extension Points

Adding a New Node

  1. Create nodes_myfeature.py with a class, NODE_CLASS_MAPPINGS, and NODE_DISPLAY_NAME_MAPPINGS
  2. That's it. __init__.py discovers it automatically via glob("nodes_*.py")
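
A hypothetical minimal node file illustrating the pattern (class and display names are invented; the field names follow the standard ComfyUI custom-node convention, and Radiance's own nodes add more metadata):

```python
# nodes_myfeature.py — hypothetical example node
class RadianceInvert:
    CATEGORY = "Radiance/Examples"
    RETURN_TYPES = ("IMAGE",)
    FUNCTION = "run"

    @classmethod
    def INPUT_TYPES(cls):
        return {"required": {"image": ("IMAGE",)}}

    def run(self, image):
        # inputs arrive as fp32 [B, H, W, 3] tensors per §3
        return (1.0 - image,)

NODE_CLASS_MAPPINGS = {"RadianceInvert": RadianceInvert}
NODE_DISPLAY_NAME_MAPPINGS = {"RadianceInvert": "Radiance Invert (Example)"}
```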

Adding a New Tone Mapping Operator

All tone mapping operators are functions in hdr/ with signature f(img: Tensor) → Tensor. Register the function name in the TONEMAPPER_MAP dict in nodes_hdr.py.
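
For example, the classic global Reinhard operator x / (1 + x) fits this signature; registration is shown against an illustrative local copy of the registry dict, not the real one in nodes_hdr.py:

```python
def tonemap_reinhard(img):
    """Global Reinhard operator: maps [0, ∞) into [0, 1) elementwise.
    Works on floats or tensors; sketch only, not a shipped operator's code."""
    return img / (1.0 + img)

# Registration mirrors the TONEMAPPER_MAP pattern described above
TONEMAPPER_MAP = {"reinhard": tonemap_reinhard}
```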

Adding a New Log Curve

Encode/decode pairs are registered in color_utils.py as (encode_fn, decode_fn) tuples in the LOG_CURVES dict. A camera preset can then reference the curve by name string.

Adding a New WebGL Scope

Add a GLSL fragment shader method and a JavaScript rendering function to radiance_webgl.js. Wire the keyboard shortcut in radiance_viewer.js. No Python changes needed for display-only scopes.

§11 Performance Characteristics

Operation               Backend      GPU Speedup  Note
Tone Mapping            PyTorch GPU  20–50×       Fully vectorized over batch
Log Curves              PyTorch GPU  20×          Piecewise function via torch.where
Grade (LGG)             PyTorch GPU  25×          Fused into single tensor pass
LUT Apply               PyTorch GPU  10×          Trilinear, vectorized 8-corner gather
Temporal Smooth         PyTorch CPU  —            Loop is sequential by design (EMA)
FalseColor (node)       PyTorch GPU  —            torch.where over zone thresholds
FalseColor (viewer)     GLSL         >100×        GPU fragment shader
GPU Histogram           GLSL         >50×         256-bin accumulate pass
Frame Upload (LRU hit)  WebGL        —            Zero re-upload from cache
Grade Matching (LAB)    PyTorch CPU  —            Statistics-only, not per-pixel

GPU Support

All PyTorch operations respect the tensor's current device. If ComfyUI is configured with CUDA or Apple MPS, tensors are processed on-device automatically. The viewer's WebGL renderer runs entirely on the client GPU, independent of the server backend.

References

  1. Reinhard, E., Ashikhmin, M., Gooch, B., Shirley, P. Color Transfer between Images. IEEE CGA, 2001.
  2. Hable, J. Filmic Tonemapping Operators. GDC 2010.
  3. Hill, S. HDR Color in Call of Duty. SIGGRAPH 2014.
  4. Academy of Motion Picture Arts and Sciences. ACES 2.0 Reference Rendering Transform. 2024.
  5. Sobotka, T. AgX: A Minimal Color Transform. Blender Institute, 2023.
  6. Magnor, M. et al. Digital Video Processing for Engineers. Morgan & Claypool, 2012.
  7. OpenColorIO Contributors. OpenColorIO v2 Architecture. ASWF, 2023.
  8. Colour-Science for Python. https://www.colour-science.org/. 2024.
  9. Narkowicz, K. ACES Filmic Tone Mapping Curve. Blog, 2016.
  10. ITU-R BT.1203. Subjective Picture Quality Assessment for Digital Cable Television Systems. ITU, 1994.