
Add SAR backprojection transform#1108

Merged
cliffburdick merged 17 commits into main from
add-sar-backprojection-transform
Jan 8, 2026

Conversation

@tbensonatl
Collaborator

Add initial version of a synthetic aperture radar (SAR) backprojection operator. The operator is currently in the matx::experimental namespace as its API is subject to change. A large focus of this implementation is offering reasonable performance on platforms with reduced fp64 throughput. A ComputeType parameter indicates the overall computational mode. The ComputeType drives part of the accuracy-performance trade-off, with the types of the input/output tensors driving the remainder. The options are as follows:

  • Double: uses predominantly fp64 operations.
  • Mixed: uses fp32 for the less sensitive intermediate calculations and fp64 for the more sensitive calculations. This is an improvement even on systems with full-throughput fp64.
  • Float: uses predominantly fp32 calculations.
  • FloatFloat: uses a float-float representation to achieve close-to-fp64 precision by representing values as a pair of single-precision floats. Manipulation of float-float values uses only fp32 operations, other than fp64 conversion instructions when converting between double and FloatFloat. This mode offers improved performance on systems with lower fp64 throughput but is significantly slower on those with full-throughput fp64 due to the use of many fp32 operations to manipulate float-float values (e.g., adding two float-float values uses 20 fp32 instructions).

This initial SAR backprojector supports single and double-precision backprojection,
but does not yet implement further optimizations or mixed precision. Validation is
still underway.

This implementation has the following assumptions/limitations:

1. It largely assumes that the data has been motion-compensated to some mocomp point
(potentially per pulse). The operator accepts an operator with the range to the mocomp
point and uses differential range calculations in the backprojector.
2. It accepts an operator with per-voxel coordinates. For regular grids, it would be
more efficient to accept, e.g., min_x, min_y, dx, dy, etc. and construct the
regular grid in the kernel. The more general implementation supports reconstructions
onto polar or other irregular grids.
3. It does not yet include any optimizations.

Signed-off-by: Thomas Benson <tbenson@nvidia.com>
@tbensonatl tbensonatl self-assigned this Jan 5, 2026
@copy-pr-bot

copy-pr-bot bot commented Jan 5, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@cliffburdick
Collaborator

/build

@greptile-apps
Contributor

greptile-apps bot commented Jan 5, 2026

Greptile Summary

Added initial experimental SAR backprojection operator with multi-precision compute modes (Double, Mixed, Float, FloatFloat) designed to optimize performance on platforms with reduced fp64 throughput.

Key changes:

  • Implemented matx::experimental::sar_bp operator following MatX operator patterns with proper PreRun/PostRun lifecycle
  • Created float-float arithmetic library (fltflt.h) implementing extended precision using pairs of single-precision floats, based on algorithms from Thall, Knuth, and Dekker
  • CUDA kernel supports four compute modes with optional phase LUT optimization for improved performance
  • FloatFloat mode achieves near-fp64 precision using only fp32 operations (except for conversion instructions), beneficial for GPUs with reduced fp64 throughput
  • Comprehensive test suite validates all compute modes including point target scenario
  • Properly integrated into build system and documentation structure

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • Well-architected implementation following MatX conventions with comprehensive validation. Clean separation of concerns across transform/operator/kernel layers. Thorough test coverage validates correctness across all compute modes. Previous review comments addressed.
  • No files require special attention

Important Files Changed

| Filename | Overview |
|----------|----------|
| include/matx/kernels/fltflt.h | Implements a float-float arithmetic library for extended precision using pairs of floats. Clean implementation following established algorithms from Thall, Knuth, and Dekker. |
| include/matx/kernels/sar_bp.cuh | Implements the SAR backprojection CUDA kernel with multiple compute precision modes (Double, Mixed, Float, FloatFloat). Well-structured with proper synchronization. |
| include/matx/operators/sar_bp.h | Defines the SAR BP operator interface with params, compute types, and feature flags. Proper operator pattern implementation following MatX conventions. |
| include/matx/transforms/sar_bp.h | Transform implementation layer dispatching to kernels based on compute type and features. Proper validation and workspace allocation. |
| test/00_transform/SarBp.cu | Comprehensive test suite covering all compute modes with non-mixed types, mixed precision, and point target validation scenarios. |

Sequence Diagram

sequenceDiagram
    participant User
    participant SarBpOp as sar_bp Operator
    participant Transform as Transform Layer
    participant Kernel as CUDA Kernel
    participant FltFlt as Float-Float Lib

    User->>SarBpOp: Call sar_bp(initial_image, range_profiles, platform_positions, voxel_locations, range_to_mcp, params)
    SarBpOp->>SarBpOp: Validate inputs (rank, types)
    SarBpOp->>Transform: Exec() with parameters
    Transform->>Transform: Check FloatFloat requires PhaseLUT
    alt PhaseLUTOptimization enabled
        Transform->>Transform: Allocate workspace for phase LUT
        Transform->>Kernel: Launch SarBpFillPhaseLUT kernel
        Kernel-->>Transform: Phase LUT populated
    end
    alt ComputeType == FloatFloat
        Transform->>Kernel: Launch SarBp kernel (FloatFloat mode)
        loop Each pixel
            loop Each pulse block
                Kernel->>FltFlt: Convert platform positions to fltflt
                Kernel->>FltFlt: ComputeRangeToPixelFloatFloat()
                FltFlt-->>Kernel: High-precision range
                Kernel->>Kernel: Interpolate range profiles
                Kernel->>Kernel: Apply phase correction from LUT
                Kernel->>Kernel: Accumulate contribution
            end
        end
    else ComputeType == Double/Mixed/Float
        Transform->>Kernel: Launch SarBp kernel
        loop Each pixel
            loop Each pulse
                Kernel->>Kernel: ComputeRangeToPixel()
                Kernel->>Kernel: Interpolate range profiles
                alt PhaseLUT enabled
                    Kernel->>Kernel: Get phase from LUT + incremental
                else
                    Kernel->>Kernel: Compute phase with sincos
                end
                Kernel->>Kernel: Accumulate contribution
            end
        end
    end
    Kernel-->>Transform: Output image populated
    Transform-->>SarBpOp: Execution complete
    SarBpOp-->>User: Return result operator

Contributor

@greptile-apps greptile-apps bot left a comment


8 files reviewed, 2 comments


Comment thread include/matx/operators/sar_bp.h Outdated
Comment thread include/matx/kernels/sar_bp.cuh
Comment thread include/matx/kernels/fltflt.h Outdated
Comment thread include/matx/kernels/fltflt.h Outdated
Comment thread include/matx/kernels/fltflt.h Outdated
Comment thread include/matx/kernels/fltflt.h Outdated
Comment thread include/matx/kernels/fltflt.h Outdated
Comment thread include/matx/kernels/fltflt.h Outdated
Comment thread include/matx/kernels/fltflt.h Outdated
Comment thread include/matx/kernels/sar_bp.cuh Outdated
Comment thread include/matx/transforms/sar_bp.h Outdated
Comment thread include/matx/transforms/sar_bp.h
Comment thread test/00_transform/SarBp.cu
@greptile-apps
Contributor

greptile-apps bot commented Jan 5, 2026

Greptile's behavior is changing!

From now on, if a review finishes with no comments, we will not post an additional "statistics" comment to confirm that our review found nothing to comment on. However, you can confirm that we reviewed your changes in the status check section.

This feature can be toggled off in your Code Review Settings by deselecting "Create a status check for each PR".

@cliffburdick
Collaborator

/build

3 similar comments

@cliffburdick cliffburdick merged commit 8bd9edf into main Jan 8, 2026
2 checks passed
@cliffburdick cliffburdick deleted the add-sar-backprojection-transform branch January 8, 2026 19:30

2 participants