Windows video playback optimisations #1454
Conversation
Walkthrough
Added SIMD-parallel YUV→RGBA conversion with progress/cancellation and rayon-based parallelism; refactored GPU texture allocation to be dynamic and GPU-limit aware with zero-copy fallback on Windows; added hardware-decoder capability detection and software-threading configuration for FFmpeg/Media Foundation.
Sequence Diagram(s)
sequenceDiagram
actor User
participant Conv as YuvToRgbaConverter
participant Tex as TextureAllocation
participant GPU as GPU/WGPU
participant D3D as D3D11 (Windows)
User->>Conv: convert_nv12_with_fallback(...)
Conv->>Conv: validate_dimensions(gpu_max_texture_size)
alt Resize needed
Conv->>Tex: ensure_texture_size(new_w,new_h)
Tex->>GPU: create/reallocate textures
GPU-->>Tex: textures ready
end
Note over Conv,D3D: Zero-copy attempt (Windows)
Conv->>D3D: acquire shared handle
D3D-->>Conv: handle or failure
alt Zero-copy succeeds
Conv-->>User: return texture view
else Zero-copy fails
Conv->>Conv: set zero_copy_failed
Conv->>GPU: staged copy of NV12 planes
GPU-->>Conv: staged texture ready
Conv-->>User: return texture view
end
sequenceDiagram
participant Scheduler as rayon Thread Pool
participant Progress as ConversionProgress
participant SIMD as SIMD Dispatch (AVX2/SSE2/Scalar)
participant Output as Output Buffer
Scheduler->>Scheduler: split rows, par_iter
par Threads
Scheduler->>SIMD: detect SimdLevel (cached)
SIMD-->>Scheduler: chosen variant
Scheduler->>SIMD: convert row(s)
SIMD-->>Output: write RGBA pixels
Scheduler->>Progress: fetch_add rows_completed
Scheduler->>Progress: check cancelled -> break if set
end
Scheduler-->>Output: final buffer
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
Pre-merge checks: ✅ 3 passed
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/rendering/src/yuv_converter.rs (1)
1136-1173: Potential mismatch when downscaling is triggered.
The validate_dimensions call returns effective_width/effective_height and a _downscaled flag, but the actual conversion still uses the original width and height for:
- RGBA buffer allocation (line 1141)
- SIMD conversion (lines 1143-1151)
- Texture write (lines 1153-1171)
If the video exceeds GPU limits and downscaling is indicated, the textures will be sized for the downscaled resolution while the data is at original resolution, which could cause issues. This also applies to convert_yuv420p_cpu below.
Consider either:
- Using effective_width/effective_height consistently when downscaling
- Or rejecting the conversion with an error if downscaling would be required for CPU paths
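A minimal sketch of the second option for the CPU path — allocate and convert at the effective size, and bail out when downscaling would be required. Function names follow the review; the exact signatures and error types are assumptions:

fn convert_nv12_cpu_at_effective_size(
    y_plane: &[u8],
    uv_plane: &[u8],
    width: u32,
    height: u32,
    gpu_max_texture_size: u32,
) -> Result<Vec<u8>, String> {
    // Reuse the validated dimensions instead of the raw width/height.
    let (effective_width, effective_height, downscaled) =
        validate_dimensions(width, height, gpu_max_texture_size).map_err(|e| format!("{e:?}"))?;

    // CPU path: refuse rather than silently mismatching buffer and texture sizes.
    if downscaled {
        return Err(format!(
            "CPU path cannot downscale {width}x{height} to {effective_width}x{effective_height}"
        ));
    }

    let mut rgba = vec![0u8; effective_width as usize * effective_height as usize * 4];
    nv12_to_rgba(
        y_plane,
        uv_plane,
        effective_width,
        effective_height,
        effective_width,
        effective_width,
        &mut rgba,
    );
    Ok(rgba)
}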
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (7)
- crates/rendering/Cargo.toml (1 hunks)
- crates/rendering/src/cpu_yuv.rs (9 hunks)
- crates/rendering/src/yuv_converter.rs (17 hunks)
- crates/video-decode/Cargo.toml (1 hunks)
- crates/video-decode/src/ffmpeg.rs (2 hunks)
- crates/video-decode/src/lib.rs (1 hunks)
- crates/video-decode/src/media_foundation.rs (8 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.rs: Never use dbg!() macro; use proper logging (tracing::debug!, etc.) instead
Never write let _ = async_fn() which silently drops futures; await or explicitly handle them
Use duration.saturating_sub(other) instead of duration - other to avoid panics on underflow
Merge nested if statements: write 'if a && b { }' instead of 'if a { if b { } }'
Don't call .clone() on Copy types (integers, bools, etc.); copy them directly
Use function references directly: iter.map(foo) instead of iter.map(|x| foo(x))
Accept &[T] or &str instead of &Vec or &String in function parameters for flexibility
Use .is_empty() instead of .len() == 0 or .len() > 0 / .len() != 0
Don't assign () to a variable: write foo(); instead of let _ = foo(); or let x = foo(); when return is unit
Use .unwrap_or(val) instead of .unwrap_or_else(|| val) when the default is a simple/cheap value
Use 'for item in &collection' or 'for (i, item) in collection.iter().enumerate()' instead of 'for i in 0..collection.len()'
Use value.clamp(min, max) instead of manual if chains or .min(max).max(min) patterns
Always handle Result/Option or types marked #[must_use]; never ignore them
**/*.rs: Use rustfmt and workspace clippy lints for Rust code formatting and linting
Use snake_case for Rust module names and kebab-case for crate names
Never use dbg!() macro in Rust code; use proper logging instead (Clippy: dbg_macro = deny)
Always handle Result/Option or types marked #[must_use]; never ignore them (Rust compiler lint: unused_must_use = deny)
Never write let _ = async_fn() which silently drops futures; await or explicitly handle them (Clippy: let_underscore_future = deny)
Use saturating_sub instead of - for Duration to avoid panics (Clippy: unchecked_duration_subtraction = deny)
Merge nested if statements: use if a && b { } instead of if a { if b { } } (Clippy: collapsible_if = deny)
Don't call .clone() on Copy types; just copy them directly (Clippy: clone_on_copy = deny)
U...
Files:
crates/video-decode/src/lib.rs, crates/video-decode/src/ffmpeg.rs, crates/video-decode/src/media_foundation.rs, crates/rendering/src/cpu_yuv.rs, crates/rendering/src/yuv_converter.rs
**/*.{ts,tsx,js,jsx,rs}
📄 CodeRabbit inference engine (AGENTS.md)
Never add comments to code (//, /* */, ///, //!, #, etc.); code must be self-explanatory through naming, types, and structure
Files:
crates/video-decode/src/lib.rs, crates/video-decode/src/ffmpeg.rs, crates/video-decode/src/media_foundation.rs, crates/rendering/src/cpu_yuv.rs, crates/rendering/src/yuv_converter.rs
🧠 Learnings (2)
📚 Learning: 2025-10-28T08:39:42.230Z
Learnt from: Brendonovich
Repo: CapSoftware/Cap PR: 1305
File: crates/recording/src/output_pipeline/macos.rs:80-90
Timestamp: 2025-10-28T08:39:42.230Z
Learning: In `crates/recording/src/output_pipeline/macos.rs`, the `AVFoundationMp4Muxer` intentionally holds the `Mutex<MP4Encoder>` lock during retry attempts in `send_video_frame()` and `send_audio_frame()`. This blocking behavior is correct because frame processing must happen sequentially to prevent audio and video frames from being interleaved incorrectly in the encoder.
Applied to files:
crates/video-decode/src/lib.rs, crates/video-decode/src/ffmpeg.rs
📚 Learning: 2025-12-07T14:29:40.743Z
Learnt from: CR
Repo: CapSoftware/Cap PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-07T14:29:40.743Z
Learning: Applies to **/*.rs : Use `rustfmt` and workspace clippy lints for Rust code formatting and linting
Applied to files:
crates/rendering/Cargo.toml
🧬 Code graph analysis (1)
crates/video-decode/src/lib.rs (1)
crates/video-decode/src/media_foundation.rs (1)
get_mf_decoder_capabilities(138-140)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Clippy (aarch64-apple-darwin, macos-latest)
- GitHub Check: Clippy (x86_64-pc-windows-msvc, windows-latest)
- GitHub Check: Analyze (javascript-typescript)
- GitHub Check: Analyze (rust)
🔇 Additional comments (20)
crates/video-decode/Cargo.toml (1)
12-12: Appropriate dependency choice for CPU-aware threading configuration.
The num_cpus crate version 1.16.0 is a stable choice for dynamically detecting available CPUs to configure software decoder threading, which aligns well with the PR's objective to optimize threading based on available hardware. A newer version 1.17.0 is available, but the caret versioning (^1.16) allows automatic updates within the 1.x series as needed.
crates/rendering/Cargo.toml (1)
21-21: Excellent choice for parallelization. Rayon is a data-parallelism library for Rust that is extremely lightweight and guarantees data-race freedom, making it the standard for this use case. Version 1.10 is current and appropriate.
crates/video-decode/src/lib.rs (1)
10-14: LGTM! Clean extension of the public API to expose hardware decoder capability querying. The re-export structure follows the existing pattern.
crates/video-decode/src/ffmpeg.rs (4)
13-28: LGTM! Well-structured capability struct with sensible defaults for fallback scenarios.
32-134: LGTM! Solid D3D11 capability detection with proper fallback handling. The tiered resolution testing (8K → 4K) correctly identifies hardware limits.
145-172: LGTM! Good resolution-based threading configuration. The thresholds align well with common video resolutions (1080p, 4K).
213-246: LGTM! Good integration of dynamic hardware capability checking. The fallback paths are well-structured, ensuring software threading is configured appropriately in all scenarios.
crates/video-decode/src/media_foundation.rs (4)
34-140: LGTM! Well-structured capability detection with descending resolution tests. The Option return from get_mf_decoder_capabilities appropriately signals when initialization hasn't occurred yet.
160-294: LGTM! Efficient texture pooling implementation that correctly handles dimension changes and uses appropriate formats for NV12 planes.
350-388: LGTM! Clean initialization with one-time capability detection. The warning for oversized videos is appropriate without being blocking.
555-624: LGTM! The texture pool integration and plane extraction logic are correct. The D3D11_BOX regions properly extract Y (full res) and UV (half res) planes from the NV12 texture.
Note: y_handle and uv_handle are always None in this path. If shared handle support is needed for zero-copy in consumers, this would need to be extended.
crates/rendering/src/cpu_yuv.rs (4)
1-33: LGTM! Clean progress tracking with appropriate use of relaxed atomics for UI feedback purposes.
431-475: AVX2/SSE2 functions are placeholders calling scalar code.
The nv12_convert_row_avx2 and nv12_convert_row_sse2 functions both just delegate to the scalar implementation. The SIMD detection logic is in place, but actual SIMD intrinsics aren't implemented yet. This means the parallelization via rayon provides the optimization, but not SIMD vectorization within each row.
Is this intentional as a first step, with SIMD intrinsics to be added later? If so, consider adding a TODO or tracking issue.
282-335: LGTM! Good band-based parallelization with rayon. The MIN_ROWS_PER_THREAD threshold prevents excessive overhead for small images, and per-row progress updates provide responsive UI feedback.
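For context, a minimal, self-contained sketch of band-based row parallelism with rayon; the constant and the per-row work are illustrative stand-ins, not the PR's code:

use rayon::prelude::*;

const MIN_ROWS_PER_THREAD: usize = 16;

fn convert_rows_parallel(width: usize, height: usize, output: &mut [u8]) {
    let row_bytes = width * 4;
    // Band size balances scheduling overhead against parallelism.
    let band_rows = MIN_ROWS_PER_THREAD.max(height / rayon::current_num_threads().max(1));

    output
        .par_chunks_mut(row_bytes * band_rows)
        .enumerate()
        .for_each(|(band_idx, band)| {
            for (row_in_band, row) in band.chunks_mut(row_bytes).enumerate() {
                let global_row = band_idx * band_rows + row_in_band;
                if global_row >= height {
                    break;
                }
                // Stand-in for the per-row YUV→RGBA conversion: set alpha only.
                for px in row.chunks_mut(4) {
                    px[3] = 255;
                }
            }
        });
}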
1038-1224: LGTM! Comprehensive test coverage for the new features: SIMD detection, progress tracking, parallel conversion, and cancellation behavior.
crates/rendering/src/yuv_converter.rs (5)
85-134: LGTM! Solid dimension validation with GPU-awareness. The even dimension enforcement (& !1) is essential for YUV formats, and the aspect-ratio-preserving downscale is well-implemented.
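Roughly, that kind of validation looks like the following sketch; the return type and exact scaling rules here are assumptions, not the PR's code:

fn validate_dimensions_sketch(width: u32, height: u32, gpu_max: u32) -> Option<(u32, u32, bool)> {
    if width == 0 || height == 0 {
        return None;
    }
    if width <= gpu_max && height <= gpu_max {
        // Force even dimensions so chroma planes stay aligned.
        return Some((width & !1, height & !1, false));
    }
    // Preserve aspect ratio by scaling both axes with the same factor.
    let scale = (gpu_max as f64 / width as f64).min(gpu_max as f64 / height as f64);
    if scale < 0.1 {
        // Reject extreme downscaling rather than producing an unusable frame.
        return None;
    }
    let new_w = ((width as f64 * scale) as u32) & !1;
    let new_h = ((height as f64 * scale) as u32) & !1;
    Some((new_w, new_h, true))
}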
347-470: LGTM! Well-factored texture creation helpers with correct formats for each YUV plane type. Double-buffered output is good for pipeline efficiency.
472-510: LGTM! Good grow-only texture reallocation strategy that avoids thrashing. The aligned dimensions ensure GPU-friendly allocation sizes.
943-1049: LGTM! Clean implementation of the zero-copy D3D11 shared handle path with proper error handling for both texture imports.
1051-1122: LGTM! Smart fallback strategy with sticky failure flag to avoid repeated zero-copy attempts. The reset_zero_copy_state method allows recovery when conditions change.
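The sticky-fallback pattern described here, sketched independently of the PR's types (names and error handling are illustrative):

struct ZeroCopyState {
    zero_copy_failed: bool,
}

impl ZeroCopyState {
    fn convert(
        &mut self,
        try_zero_copy: impl FnOnce() -> Result<(), String>,
        staged_copy: impl FnOnce(),
    ) {
        if !self.zero_copy_failed {
            match try_zero_copy() {
                Ok(()) => return,
                Err(err) => {
                    // Remember the failure so later frames skip the doomed attempt.
                    tracing::warn!(%err, "zero-copy import failed, falling back to staged copy");
                    self.zero_copy_failed = true;
                }
            }
        }
        staged_copy();
    }

    // Allows retrying zero-copy if the hardware configuration changes.
    fn reset_zero_copy_state(&mut self) {
        self.zero_copy_failed = false;
    }
}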
Actionable comments posted: 5
🧹 Nitpick comments (5)
crates/rendering/src/decoder/mod.rs (2)
213-216: Refactor to avoid suppressing the redundant_closure lint.
The closure can be eliminated by using .as_deref() followed by a function reference, which aligns with the coding guidelines that specify this lint is set to deny. As per coding guidelines, use function references directly instead of closures.
Apply this diff:
-#[allow(clippy::redundant_closure)]
 pub fn iosurface_backing(&self) -> Option<&cv::ImageBuf> {
-    self.iosurface_backing.as_ref().map(|b| b.inner())
+    self.iosurface_backing.as_deref().map(SendableImageBuf::inner)
 }
278-281: Refactor to avoid suppressing the redundant_closure lint.
The closure can be eliminated by using .as_deref() followed by a function reference, which aligns with the coding guidelines that specify this lint is set to deny. As per coding guidelines, use function references directly instead of closures.
Apply this diff:
-#[allow(clippy::redundant_closure)]
 pub fn d3d11_texture_backing(&self) -> Option<&ID3D11Texture2D> {
-    self.d3d11_texture_backing.as_ref().map(|b| b.inner())
+    self.d3d11_texture_backing.as_deref().map(SendableD3D11Texture::inner)
 }
crates/video-decode/src/media_foundation.rs (2)
57-134: Consider documenting that max_width/max_height represent the maximum resolution supported by any codec.
The current implementation sets max_width and max_height to the highest resolution supported by either H264 or HEVC decoders. This means if H264 supports 8K but HEVC only supports 4K, max_width will be 8K. Callers checking dimensions against these limits should be aware that the limits are codec-agnostic and may not reflect per-codec capabilities.
220-291: Validate that dimensions are even for NV12 texture creation.
Lines 256-257 create UV textures with half dimensions (width / 2, height / 2). If the input dimensions are odd, this will truncate, potentially causing misalignment issues with the NV12 format, which expects even dimensions.
Consider adding a validation check:
 fn get_or_create_yuv_textures(
     &mut self,
     device: &ID3D11Device,
     width: u32,
     height: u32,
 ) -> Result<(&ID3D11Texture2D, &ID3D11Texture2D), String> {
+    if width % 2 != 0 || height % 2 != 0 {
+        return Err(format!("NV12 requires even dimensions, got {}x{}", width, height));
+    }
     use windows::Win32::Graphics::Dxgi::Common::{
crates/rendering/src/yuv_converter.rs (1)
622-638: Consider more idiomatic Option handling.
The early return pattern works, but could be more concise:
-        if self.iosurface_cache.is_none() {
-            return Err(IOSurfaceTextureError::NoMetalDevice.into());
-        }
-
         let io_surface = image_buf
             .io_surf()
             .ok_or(IOSurfaceTextureError::NoIOSurface)?;
         let width = image_buf.width() as u32;
         let height = image_buf.height() as u32;
         let (effective_width, effective_height, _downscaled) =
             validate_dimensions(width, height, self.gpu_max_texture_size)?;
         self.ensure_texture_size(device, effective_width, effective_height);
         self.swap_output_buffer();
-        let cache = self.iosurface_cache.as_ref().unwrap();
+        let cache = self.iosurface_cache.as_ref()
+            .ok_or(IOSurfaceTextureError::NoMetalDevice)?;
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- crates/frame-converter/src/d3d11.rs (11 hunks)
- crates/recording/src/output_pipeline/win.rs (2 hunks)
- crates/rendering/src/cpu_yuv.rs (9 hunks)
- crates/rendering/src/decoder/mod.rs (2 hunks)
- crates/rendering/src/yuv_converter.rs (18 hunks)
- crates/video-decode/src/ffmpeg.rs (2 hunks)
- crates/video-decode/src/media_foundation.rs (8 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.rs: Never use dbg!() macro; use proper logging (tracing::debug!, etc.) instead
Never write let _ = async_fn() which silently drops futures; await or explicitly handle them
Use duration.saturating_sub(other) instead of duration - other to avoid panics on underflow
Merge nested if statements: write 'if a && b { }' instead of 'if a { if b { } }'
Don't call .clone() on Copy types (integers, bools, etc.); copy them directly
Use function references directly: iter.map(foo) instead of iter.map(|x| foo(x))
Accept &[T] or &str instead of &Vec or &String in function parameters for flexibility
Use .is_empty() instead of .len() == 0 or .len() > 0 / .len() != 0
Don't assign () to a variable: write foo(); instead of let _ = foo(); or let x = foo(); when return is unit
Use .unwrap_or(val) instead of .unwrap_or_else(|| val) when the default is a simple/cheap value
Use 'for item in &collection' or 'for (i, item) in collection.iter().enumerate()' instead of 'for i in 0..collection.len()'
Use value.clamp(min, max) instead of manual if chains or .min(max).max(min) patterns
Always handle Result/Option or types marked #[must_use]; never ignore them
**/*.rs: Use rustfmt and workspace clippy lints for Rust code formatting and linting
Use snake_case for Rust module names and kebab-case for crate names
Never use dbg!() macro in Rust code; use proper logging instead (Clippy: dbg_macro = deny)
Always handle Result/Option or types marked #[must_use]; never ignore them (Rust compiler lint: unused_must_use = deny)
Never write let _ = async_fn() which silently drops futures; await or explicitly handle them (Clippy: let_underscore_future = deny)
Use saturating_sub instead of - for Duration to avoid panics (Clippy: unchecked_duration_subtraction = deny)
Merge nested if statements: use if a && b { } instead of if a { if b { } } (Clippy: collapsible_if = deny)
Don't call .clone() on Copy types; just copy them directly (Clippy: clone_on_copy = deny)
U...
Files:
crates/recording/src/output_pipeline/win.rs, crates/video-decode/src/ffmpeg.rs, crates/rendering/src/decoder/mod.rs, crates/video-decode/src/media_foundation.rs, crates/rendering/src/yuv_converter.rs, crates/frame-converter/src/d3d11.rs, crates/rendering/src/cpu_yuv.rs
**/*.{ts,tsx,js,jsx,rs}
📄 CodeRabbit inference engine (AGENTS.md)
Never add comments to code (//, /* */, ///, //!, #, etc.); code must be self-explanatory through naming, types, and structure
Files:
crates/recording/src/output_pipeline/win.rs, crates/video-decode/src/ffmpeg.rs, crates/rendering/src/decoder/mod.rs, crates/video-decode/src/media_foundation.rs, crates/rendering/src/yuv_converter.rs, crates/frame-converter/src/d3d11.rs, crates/rendering/src/cpu_yuv.rs
🧠 Learnings (2)
📚 Learning: 2025-10-28T08:39:42.230Z
Learnt from: Brendonovich
Repo: CapSoftware/Cap PR: 1305
File: crates/recording/src/output_pipeline/macos.rs:80-90
Timestamp: 2025-10-28T08:39:42.230Z
Learning: In `crates/recording/src/output_pipeline/macos.rs`, the `AVFoundationMp4Muxer` intentionally holds the `Mutex<MP4Encoder>` lock during retry attempts in `send_video_frame()` and `send_audio_frame()`. This blocking behavior is correct because frame processing must happen sequentially to prevent audio and video frames from being interleaved incorrectly in the encoder.
Applied to files:
crates/recording/src/output_pipeline/win.rs, crates/video-decode/src/ffmpeg.rs
📚 Learning: 2025-12-07T14:29:40.743Z
Learnt from: CR
Repo: CapSoftware/Cap PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-07T14:29:40.743Z
Learning: Applies to **/*.rs : Use function references directly instead of redundant closures: `iter.map(foo)` instead of `iter.map(|x| foo(x))` (Clippy: `redundant_closure` = deny)
Applied to files:
crates/rendering/src/decoder/mod.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Clippy (x86_64-pc-windows-msvc, windows-latest)
- GitHub Check: Clippy (aarch64-apple-darwin, macos-latest)
- GitHub Check: Analyze (javascript-typescript)
- GitHub Check: Analyze (rust)
🔇 Additional comments (27)
crates/recording/src/output_pipeline/win.rs (1)
215-215: LGTM! Improved error message clarity.
The formatting change from positional to named parameters ({poisoned}) makes the error messages more explicit and easier to understand during debugging.
Also applies to: 543-543
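For reference, the two format-argument styles produce identical output; the change is purely about readability (a hypothetical example, not the PR's code):

fn describe(poisoned: &str) -> (String, String) {
    // Positional placeholder: the argument is passed separately.
    let positional = format!("mutex poisoned: {:?}", poisoned);
    // Inline capture: the variable name appears directly in the format string.
    let named = format!("mutex poisoned: {poisoned:?}");
    (positional, named)
}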
crates/frame-converter/src/d3d11.rs (1)
116-116: LGTM! Consistent error formatting improvements.
The switch from positional placeholders ({:?}) to named parameters ({e:?}) across all error sites improves code clarity and follows Rust formatting conventions.
Also applies to: 120-120, 124-124, 168-168, 195-195, 200-200, 226-228, 236-236, 349-349, 381-381, 407-407, 431-431, 455-455, 527-527
crates/video-decode/src/ffmpeg.rs (5)
10-30: LGTM! The HwDecoderCapabilities struct and caching mechanism are well-designed. Using OnceLock ensures thread-safe lazy initialization of hardware capabilities, and the optimistic default values (8192x8192) are safe since the decoder logic validates dimensions against actual capabilities before attempting hardware decode.
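A minimal sketch of that OnceLock caching pattern; field names and the 8192x8192 defaults follow the review text, and the probe function is a placeholder:

use std::sync::OnceLock;

#[derive(Clone, Copy, Debug)]
struct HwDecoderCapabilities {
    max_width: u32,
    max_height: u32,
}

static HW_CAPS: OnceLock<HwDecoderCapabilities> = OnceLock::new();

fn hw_decoder_capabilities() -> HwDecoderCapabilities {
    // The expensive probe runs at most once; later callers read the cached value.
    *HW_CAPS.get_or_init(|| {
        probe_hw_decoder().unwrap_or(HwDecoderCapabilities {
            max_width: 8192,
            max_height: 8192,
        })
    })
}

fn probe_hw_decoder() -> Option<HwDecoderCapabilities> {
    // Placeholder: the real implementation queries the D3D11 video device.
    None
}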
32-133: LGTM - Hardware capability detection is robust.
The Windows D3D11 video decoder capability query is well-structured with proper error handling. The logic correctly:
- Tests for 8K (8192x8192) support first and updates max dimensions to 8192 if successful
- Falls back to testing 4K (4096x4096) support if 8K fails
- Retains the baseline 4096x4096 if neither check succeeds
- Returns safe defaults on any query failure
The use of unsafe Windows APIs is necessary for D3D11 device creation and has appropriate error handling.
135-142: LGTM! The non-Windows stub and public accessor are appropriate. The get_or_init pattern ensures the capability query runs once and the result is cached for subsequent calls.
144-171: LGTM - Software threading configuration is well-designed.
The function appropriately:
- Uses u64 for pixel count calculation to prevent overflow
- Applies resolution-based heuristics: auto-threading for 4K+, half CPU cores (min 2) for Full HD, and 2 threads for smaller resolutions
- Uses frame-level threading (FF_THREAD_FRAME), which is appropriate for video decoding
- Has a null pointer check before accessing the FFmpeg codec context
- Logs the configuration for observability
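Illustrating the thresholds listed above, a hedged sketch of picking a software-decoder thread count with num_cpus; the PR's exact cutoffs and how the value is applied to the codec context may differ:

fn software_decode_threads(width: u32, height: u32) -> i32 {
    // u64 arithmetic avoids overflow for very large frame sizes.
    let pixels = width as u64 * height as u64;
    let cores = num_cpus::get() as i32;

    if pixels >= 3840 * 2160 {
        // 4K and above: 0 lets FFmpeg choose automatically.
        0
    } else if pixels >= 1920 * 1080 {
        // Full HD: roughly half the cores, but never fewer than two.
        (cores / 2).max(2)
    } else {
        // Smaller resolutions gain little from more threads.
        2
    }
}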
212-245: LGTM - Hardware/software decode logic is correct.
The integration properly handles all decode paths:
- Hardware requested, within limits → Uses hardware acceleration (line 228-232)
- Hardware requested, exceeds limits → Falls back to software with threading (line 223)
- Hardware requested, initialization fails → Falls back to software with threading (line 236)
- No hardware requested → Configures software threading (line 244)
The logic correctly avoids duplicate software threading configuration by checking both hw_device.is_none() and hw_device_type.is_none() on line 243. While the pure software decode path (when hw_device_type is None) is not exercised by the current codebase, the implementation handles it correctly.
crates/rendering/src/yuv_converter.rs (11)
85-92: LGTM! Sensible texture dimension constants.
The increased GPU texture limits (8K) and initial allocation at 1080p with 64-byte padding alignment are appropriate for modern hardware.
97-134: LGTM! Robust dimension validation with GPU-aware downscaling.
The validation logic correctly:
- Respects both hardware and configured limits
- Scales down proportionally when needed
- Rejects extreme downscaling (< 0.1)
- Ensures even dimensions for YUV compatibility
- Provides clear logging
152-154: LGTM! Appropriate state tracking fields.
The new fields correctly track:
- Current texture allocation dimensions
- GPU capability limits
- Zero-copy failure state (Windows)
Also applies to: 163-164
169-174: LGTM! Proper GPU capability detection.
The initialization correctly queries GPU limits and logs the configuration.
305-345: LGTM! Sensible initial texture allocation.
Starting at 1080p avoids over-allocation while being sufficient for common use cases. The texture will grow dynamically as needed.
347-470: LGTM! Well-structured texture creation helpers.
The helper functions correctly:
- Create appropriately sized textures for each plane
- Handle Y, UV, U, V planes with correct dimensions (half-size for chroma)
- Create double-buffered output textures
- Return both texture and view for convenience
512-514: LGTM! Clean accessor method.
540-542: LGTM! Consistent dimension validation pattern.
All conversion entry points correctly validate dimensions against GPU limits before processing. The _downscaled flag is currently unused but may be useful for future metrics or warnings.
961-1049: LGTM! Well-implemented zero-copy path with proper error handling.
The implementation correctly:
- Validates dimensions before processing
- Attempts zero-copy import for both Y and UV planes
- Provides detailed logging for success and failure cases
- Returns appropriate errors with context
1051-1112: LGTM! Smart fallback strategy with persistent failure tracking.
The implementation intelligently:
- Attempts zero-copy when available and not previously failed
- Permanently disables zero-copy after first failure to avoid repeated overhead
- Provides clear trace logging for path selection
- Falls back gracefully to staging copy
The persistent failure tracking (zero_copy_failed) is a good optimization to avoid repeated failed attempts on incompatible hardware.
1114-1122: LGTM! Useful zero-copy state management API.
The accessors allow callers to:
- Query current zero-copy status
- Reset state if hardware configuration changes
crates/rendering/src/cpu_yuv.rs (9)
1-33: LGTM! Well-designed progress tracking with atomic state.
The implementation correctly:
- Uses atomic operations for thread-safe progress tracking
- Handles the edge case (zero total_rows) in progress_fraction
- Uses Relaxed ordering appropriately for non-critical progress updates
- Provides cancellation support
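The general shape of such a progress type, as a standalone sketch rather than the PR's exact struct:

use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};

pub struct ProgressSketch {
    total_rows: usize,
    rows_completed: AtomicUsize,
    cancelled: AtomicBool,
}

impl ProgressSketch {
    pub fn new(total_rows: usize) -> Self {
        Self {
            total_rows,
            rows_completed: AtomicUsize::new(0),
            cancelled: AtomicBool::new(false),
        }
    }

    // Relaxed ordering is sufficient: the counter only feeds UI feedback.
    pub fn row_done(&self) {
        self.rows_completed.fetch_add(1, Ordering::Relaxed);
    }

    pub fn progress_fraction(&self) -> f32 {
        if self.total_rows == 0 {
            return 1.0;
        }
        self.rows_completed.load(Ordering::Relaxed) as f32 / self.total_rows as f32
    }

    pub fn cancel(&self) {
        self.cancelled.store(true, Ordering::Relaxed);
    }

    pub fn is_cancelled(&self) -> bool {
        self.cancelled.load(Ordering::Relaxed)
    }
}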
129-160: LGTM! Clean SIMD capability detection.
The implementation correctly:
- Detects AVX2/SSE2 support on x86 architectures
- Falls back to scalar on non-x86
- Provides useful metadata via pixels_per_iteration
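A minimal sketch of that detection logic using std's runtime feature detection; the enum and function names here are illustrative:

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum SimdLevelSketch {
    Avx2,
    Sse2,
    Scalar,
}

fn detect_simd_level() -> SimdLevelSketch {
    #[cfg(any(target_arch = "x86_64", target_arch = "x86"))]
    {
        // Runtime checks, so a binary built for generic x86_64 still uses AVX2 when present.
        if is_x86_feature_detected!("avx2") {
            return SimdLevelSketch::Avx2;
        }
        if is_x86_feature_detected!("sse2") {
            return SimdLevelSketch::Sse2;
        }
    }
    // Non-x86 targets (and very old x86 CPUs) fall back to scalar code.
    SimdLevelSketch::Scalar
}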
162-165: LGTM! Sensible parallelism thresholds.
Using 1080p as the cutoff for parallel processing and 16 rows per thread are appropriate values that balance overhead with parallelism benefits.
177-255: LGTM! Robust SIMD conversion with proper validation.
The implementation correctly:
- Validates buffer sizes with overflow protection (saturating arithmetic)
- Detects optimal SIMD level
- Chooses parallel/sequential based on frame size
- Falls back to scalar conversion if validation fails
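As an illustration of the overflow-safe buffer validation mentioned above, a standalone sketch (not the PR's exact check):

fn nv12_buffers_are_valid(
    y_len: usize,
    uv_len: usize,
    out_len: usize,
    width: usize,
    height: usize,
    y_stride: usize,
    uv_stride: usize,
) -> bool {
    // Saturating arithmetic: absurd inputs saturate instead of wrapping,
    // so the comparisons below simply fail rather than panic or pass spuriously.
    let y_needed = y_stride.saturating_mul(height);
    let uv_needed = uv_stride.saturating_mul(height / 2);
    let out_needed = width.saturating_mul(height).saturating_mul(4);
    y_len >= y_needed && uv_len >= uv_needed && out_len >= out_needed
}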
257-285: LGTM! Sequential conversion with cancellation support.
Properly checks cancellation before each row and updates progress atomically after processing.
287-342: LGTM! Efficient parallel conversion using rayon.
The implementation correctly:
- Divides work into bands based on thread count
- Uses par_chunks_mut for parallel processing
- Checks cancellation within each band
- Guards against out-of-bounds with a global_row >= height check
- Updates progress atomically
344-413: LGTM! Clean SIMD dispatch pattern.
The dispatch correctly routes to AVX2/SSE2/Scalar implementations based on detected capabilities. Unsafe blocks are properly contained.
537-621: LGTM! Consistent pattern with NV12 conversion.
The YUV420P implementation follows the same robust pattern as NV12 with proper validation and parallel/sequential selection.
Note: Same SIMD placeholder issue applies (see comment on lines 415-495).
623-711: LGTM! Consistent parallel and sequential conversion implementations.
The YUV420P conversion correctly mirrors the NV12 pattern with proper cancellation and progress tracking.
#[cfg(any(target_arch = "x86_64", target_arch = "x86"))]
#[target_feature(enable = "avx2")]
unsafe fn nv12_convert_row_avx2(
    y_data: &[u8],
    uv_data: &[u8],
    width: usize,
    y_row_start: usize,
    uv_row_start: usize,
    out_row_start: usize,
    output: &mut [u8],
) {
    unsafe {
        nv12_convert_row_sse2(
            y_data,
            uv_data,
            width,
            y_row_start,
            uv_row_start,
            out_row_start,
            output,
        );
    }
}

#[cfg(any(target_arch = "x86_64", target_arch = "x86"))]
#[target_feature(enable = "sse2")]
unsafe fn nv12_convert_row_sse2(
    y_data: &[u8],
    uv_data: &[u8],
    width: usize,
    y_row_start: usize,
    uv_row_start: usize,
    out_row_start: usize,
    output: &mut [u8],
) {
    nv12_convert_row_scalar(
        y_data,
        uv_data,
        width,
        y_row_start,
        uv_row_start,
        out_row_start,
        output,
    );
}

#[cfg(any(target_arch = "x86_64", target_arch = "x86"))]
fn nv12_convert_row_scalar(
    y_data: &[u8],
    uv_data: &[u8],
    width: usize,
    y_row_start: usize,
    uv_row_start: usize,
    out_row_start: usize,
    output: &mut [u8],
) {
    for col in 0..width {
        let y_idx = y_row_start + col;
        let uv_idx = uv_row_start + (col / 2) * 2;

        let y = y_data.get(y_idx).copied().unwrap_or(0) as i32;
        let u = uv_data.get(uv_idx).copied().unwrap_or(128) as i32;
        let v = uv_data.get(uv_idx + 1).copied().unwrap_or(128) as i32;

        let c = y - 16;
        let d = u - 128;
        let e = v - 128;

        let r = clamp_u8((298 * c + 409 * e + 128) >> 8);
        let g = clamp_u8((298 * c - 100 * d - 208 * e + 128) >> 8);
        let b = clamp_u8((298 * c + 516 * d + 128) >> 8);

        let out_idx = out_row_start + col * 4;
        if out_idx + 3 < output.len() {
            output[out_idx] = r;
            output[out_idx + 1] = g;
            output[out_idx + 2] = b;
            output[out_idx + 3] = 255;
        }
    }
}
SIMD functions are placeholder implementations calling scalar code.
The AVX2 and SSE2 functions currently just call the scalar implementation, meaning no actual SIMD acceleration is occurring. The dispatch infrastructure is in place, but the vectorized implementations are missing.
This should be documented as a known limitation, or actual SIMD implementations should be provided. If this is intentional (a staged rollout), consider:
- Adding a comment explaining that SIMD implementations are placeholders
- Updating the PR description to clarify the current state
- Or implementing the actual SIMD intrinsics for performance
Based on the PR objectives mentioning "SIMD-parallel YUV→RGBA conversion", users may expect actual SIMD acceleration in this release.
#[cfg(not(any(target_arch = "x86_64", target_arch = "x86")))]
#[allow(clippy::too_many_arguments)]
pub fn nv12_to_rgba_simd_with_progress(
    y_data: &[u8],
    uv_data: &[u8],
    width: u32,
    height: u32,
    y_stride: u32,
    uv_stride: u32,
    output: &mut [u8],
    _progress: Option<Arc<ConversionProgress>>,
) {
    nv12_to_rgba(y_data, uv_data, width, height, y_stride, uv_stride, output);
}
Progress tracking not supported on non-x86 architectures.
The non-x86 fallback ignores the progress parameter, meaning cancellation and progress monitoring won't work on ARM or other architectures. Consider documenting this limitation if it's intentional.
If progress support is desired on all platforms, implement it for the scalar path:
#[cfg(not(any(target_arch = "x86_64", target_arch = "x86")))]
#[allow(clippy::too_many_arguments)]
pub fn nv12_to_rgba_simd_with_progress(
y_data: &[u8],
uv_data: &[u8],
width: u32,
height: u32,
y_stride: u32,
uv_stride: u32,
output: &mut [u8],
- _progress: Option<Arc<ConversionProgress>>,
+ progress: Option<Arc<ConversionProgress>>,
) {
- nv12_to_rgba(y_data, uv_data, width, height, y_stride, uv_stride, output);
+ let height_usize = height as usize;
+ for row in 0..height_usize {
+ if let Some(ref p) = progress && p.is_cancelled() {
+ return;
+ }
+ // Process row (inline the conversion logic or refactor nv12_to_rgba)
+ if let Some(ref p) = progress {
+ p.rows_completed.fetch_add(1, Ordering::Relaxed);
+ }
+ }
}Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
crates/rendering/src/cpu_yuv.rs lines 510-523: the non-x86 fallback
nv12_to_rgba_simd_with_progress currently ignores the progress parameter, so
progress reporting/cancellation doesn't work on ARM; either document this
limitation or implement progress support in the scalar path by removing the
unused underscore from the parameter, propagate the
Option<Arc<ConversionProgress>> into the scalar conversion routine, and update
the scalar nv12_to_rgba implementation to periodically report progress and check
for cancellation (e.g., compute rows/blocks processed, call progress.report(...)
and abort/return early on cancellation). Ensure the function signature stays
compatible and add a short doc comment if you choose to leave it unsupported.
#[test]
fn test_cancellation() {
    let progress = Arc::new(ConversionProgress::new(1080));

    let width = 1920u32;
    let height = 1080u32;
    let y_stride = 1920u32;
    let uv_stride = 1920u32;

    let y_data: Vec<u8> = vec![128; (y_stride * height) as usize];
    let uv_data: Vec<u8> = vec![128; (uv_stride * height / 2) as usize];

    let mut output = vec![0u8; (width * height * 4) as usize];

    progress.cancel();

    nv12_to_rgba_simd_with_progress(
        &y_data,
        &uv_data,
        width,
        height,
        y_stride,
        uv_stride,
        &mut output,
        Some(progress.clone()),
    );

    let rows_done = progress.rows_completed.load(Ordering::Relaxed);
    assert!(rows_done < height as usize);
}
Cancellation test may be non-deterministic.
The test cancels before starting conversion (line 1172) and expects rows_done < height (line 1186). While this should work since cancellation is checked at the start of each row, there's a theoretical race condition if the check happens after row processing begins.
Consider making the test more robust:
#[test]
fn test_cancellation() {
let progress = Arc::new(ConversionProgress::new(1080));
let width = 1920u32;
let height = 1080u32;
let y_stride = 1920u32;
let uv_stride = 1920u32;
let y_data: Vec<u8> = vec![128; (y_stride * height) as usize];
let uv_data: Vec<u8> = vec![128; (uv_stride * height / 2) as usize];
let mut output = vec![0u8; (width * height * 4) as usize];
progress.cancel();
nv12_to_rgba_simd_with_progress(
&y_data,
&uv_data,
width,
height,
y_stride,
uv_stride,
&mut output,
Some(progress.clone()),
);
let rows_done = progress.rows_completed.load(Ordering::Relaxed);
- assert!(rows_done < height as usize);
+ assert!(
+ rows_done < height as usize,
+ "Expected early termination due to cancellation, but {} rows completed",
+ rows_done
+ );
}Or test cancellation mid-stream by canceling from another thread after a delay.
fn align_dimension(dim: u32) -> u32 {
    dim.div_ceil(TEXTURE_SIZE_PADDING) * TEXTURE_SIZE_PADDING
}
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Read the file to understand align_dimension and its usage
cat -n crates/rendering/src/yuv_converter.rs | head -120
Repository: CapSoftware/Cap
Length of output: 4337
🏁 Script executed:
#!/bin/bash
# Search for GPU-related limit constants and validations
rg -i "gpu.*max|max.*texture|texture.*size.*limit" --type=rs -C2
Repository: CapSoftware/Cap
Length of output: 85
🏁 Script executed:
#!/bin/bash
# Continue reading the file to see more of validate_dimensions and how align_dimension is used
cat -n crates/rendering/src/yuv_converter.rs | sed -n '97,150p'
Repository: CapSoftware/Cap
Length of output: 2262
🏁 Script executed:
#!/bin/bash
# Search for calls to align_dimension to see where it's used
rg "align_dimension" crates/rendering/src/yuv_converter.rs -C3
Repository: CapSoftware/Cap
Length of output: 524
🏁 Script executed:
#!/bin/bash
# Search for where ensure_texture_size is called to confirm validation happens
rg "ensure_texture_size|validate_dimensions" crates/rendering/src/yuv_converter.rs -C3
Repository: CapSoftware/Cap
Length of output: 3252
🏁 Script executed:
#!/bin/bash
# Read more of the file to see the full context of ensure_texture_size and how it's used
cat -n crates/rendering/src/yuv_converter.rs | sed -n '150,250p'
Repository: CapSoftware/Cap
Length of output: 5229
🏁 Script executed:
#!/bin/bash
# Check the full ensure_texture_size implementation to see if it re-validates after alignment
cat -n crates/rendering/src/yuv_converter.rs | sed -n '260,310p'
Repository: CapSoftware/Cap
Length of output: 2701
🏁 Script executed:
#!/bin/bash
# Find where allocated_width and allocated_height are set to understand the bounds
rg "allocated_width\s*=" crates/rendering/src/yuv_converter.rs -C2
Repository: CapSoftware/Cap
Length of output: 240
🏁 Script executed:
#!/bin/bash
# Find the complete ensure_texture_size function
rg -A30 "fn ensure_texture_size" crates/rendering/src/yuv_converter.rs
Repository: CapSoftware/Cap
Length of output: 1438
🏁 Script executed:
#!/bin/bash
# Check what the limits are for texture creation and if alignment could violate them
rg "MAX_TEXTURE" crates/rendering/src/yuv_converter.rs -C1
Repository: CapSoftware/Cap
Length of output: 343
🏁 Script executed:
#!/bin/bash
# Check if texture creation functions handle dimension checks or clamping
rg -A5 "create_y_texture|create_uv_texture|create_u_texture|create_v_texture|create_output_textures" crates/rendering/src/yuv_converter.rs | head -40
Repository: CapSoftware/Cap
Length of output: 1557
🏁 Script executed:
#!/bin/bash
# Verify the math: what is 4320 aligned?
python3 << 'EOF'
max_height = 4320
padding = 64
aligned = ((max_height + padding - 1) // padding) * padding
print(f"MAX_TEXTURE_HEIGHT: {max_height}")
print(f"TEXTURE_SIZE_PADDING: {padding}")
print(f"Aligned height: {aligned}")
print(f"Exceeds original? {aligned > max_height}")
print()
print("Calculation breakdown:")
print(f" {max_height} / {padding} = {max_height / padding}")
print(f" ceil({max_height / padding}) = {-(-max_height // padding)}")
print(f" {-(-max_height // padding)} * {padding} = {-(-max_height // padding) * padding}")
EOF
Repository: CapSoftware/Cap
Length of output: 145
🏁 Script executed:
#!/bin/bash
# Get the full create_y_texture to see if there's any validation
rg -A15 "fn create_y_texture" crates/rendering/src/yuv_converter.rs
Repository: CapSoftware/Cap
Length of output: 617
🏁 Script executed:
#!/bin/bash
# Check if create_texture validates the dimensions against GPU limits
rg -B5 -A10 "device.create_texture" crates/rendering/src/yuv_converter.rs | head -50
Repository: CapSoftware/Cap
Length of output: 1790
Alignment causes validated dimensions to exceed GPU limits.
The validation ensures dimensions don't exceed effective_max_height (4320 in this case), but align_dimension(4320) returns 4352: 4320 is not a multiple of 64, so div_ceil rounds it up to 68 × 64 = 4352, which exceeds the limit. Since alignment happens after validation in ensure_texture_size, dimensions that pass validation can become invalid. Re-validate or clamp dimensions to GPU limits after alignment before creating textures.
🤖 Prompt for AI Agents
In crates/rendering/src/yuv_converter.rs around lines 93 to 95, align_dimension
currently rounds up (e.g., 4320 -> 4352) after validation which can push
dimensions past GPU limits; change ensure_texture_size to perform alignment
first and then re-validate or clamp the aligned dimensions against
effective_max_width/effective_max_height (or return an error) so the final
dimensions used to create textures never exceed GPU limits; apply this for both
width and height and ensure any error messages mention the aligned size when
rejecting or clamping.
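One possible shape of that fix — align first, then clamp to the GPU limit — as a hedged sketch; the constant and field names here are assumptions based on the review, not the PR's code:

fn aligned_and_clamped(dim: u32, padding: u32, gpu_max: u32) -> u32 {
    // Round up to the padding boundary, then never exceed the GPU's hard limit,
    // accepting a slightly under-aligned size near the limit.
    let aligned = dim.div_ceil(padding) * padding;
    aligned.min(gpu_max)
}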
fn ensure_texture_size(&mut self, device: &wgpu::Device, width: u32, height: u32) {
    let required_width = align_dimension(width);
    let required_height = align_dimension(height);

    if required_width <= self.allocated_width && required_height <= self.allocated_height {
        return;
    }

    let new_width = required_width.max(self.allocated_width);
    let new_height = required_height.max(self.allocated_height);

    tracing::info!(
        old_width = self.allocated_width,
        old_height = self.allocated_height,
        new_width = new_width,
        new_height = new_height,
        "Reallocating YUV converter textures for larger video"
    );

    let (y_texture, y_view) = Self::create_y_texture(device, new_width, new_height);
    let (uv_texture, uv_view) = Self::create_uv_texture(device, new_width, new_height);
    let (u_texture, u_view) = Self::create_u_texture(device, new_width, new_height);
    let (v_texture, v_view) = Self::create_v_texture(device, new_width, new_height);
    let (output_textures, output_views) =
        Self::create_output_textures(device, new_width, new_height);

    self.y_texture = y_texture;
    self.y_view = y_view;
    self.uv_texture = uv_texture;
    self.uv_view = uv_view;
    self.u_texture = u_texture;
    self.u_view = u_view;
    self.v_texture = v_texture;
    self.v_view = v_view;
    self.output_textures = output_textures;
    self.output_views = output_views;
    self.allocated_width = new_width;
    self.allocated_height = new_height;
}
Verify aligned dimensions don't exceed GPU limits.
The grow-only strategy is efficient, but consider this edge case:
- Caller validates dimensions → gets (effective_width, effective_height) within GPU limits
- ensure_texture_size aligns these with align_dimension → potentially exceeds limits
For example, if effective_width = gpu_max - 10 and TEXTURE_SIZE_PADDING = 64, the aligned dimension would exceed the GPU limit.
Apply this approach to validate post-alignment:
fn ensure_texture_size(&mut self, device: &wgpu::Device, width: u32, height: u32) {
let required_width = align_dimension(width);
let required_height = align_dimension(height);
+
+ let effective_max = MAX_TEXTURE_WIDTH.min(self.gpu_max_texture_size);
+ let required_width = required_width.min(effective_max);
+ let required_height = required_height.min(effective_max);
     if required_width <= self.allocated_width && required_height <= self.allocated_height {
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In crates/rendering/src/yuv_converter.rs around lines 472 to 510, after
computing required_width/required_height with align_dimension, validate those
post-alignment dimensions against the device GPU limits
(device.limits().max_texture_dimension_2d); if either aligned dimension exceeds
the limit, do not attempt to allocate the larger textures — log an error/tracing
with the attempted and max values and return early (or clamp to the max and log
if you prefer a best-effort allocation), otherwise proceed with the existing
reallocation logic; ensure allocated_width/allocated_height are only updated
when allocation actually succeeds.
Significant improvements to hardware and software video decoding, particularly by dynamically detecting hardware decoder capabilities on Windows and optimizing software decoding threading based on video resolution.