Releases: NVIDIA/MatX

v1.0.0

04 Mar 19:48
312ff69

v1.0.0

Release 1.0.0 marks a major update for MatX. 1.0.0 is the first version to require C++20 support for both the CUDA and host compilers. As a result, CUDA versions lower than 12.2.1 are not supported.

Among the major release highlights are:

  • JIT Support
    CUDA JIT support via a new CUDAJitExecutor. When used, this executor makes a second compilation pass and caches the resulting kernel for future use. JIT allows MatX to convert many runtime parameters into compile-time parameters, reducing the computation needed in the kernel. It also optionally enables kernel fusion via the NVIDIA MathDx libraries: when enabled, MatX can potentially fuse FFT and GEMM operations into other arithmetic expressions if certain criteria are met. Only FFT and BLAS fusion are supported for now; other MathDx libraries will be added in the future. For more information, see the docs.
  • Logging
    Full logging support to stdout or to a file. Logging is useful for seeing which code path MatX is taking and for dumping verbose information about each function. Note that logging depends on a C++20 header that is not yet available in all standard library implementations.
  • Documentation
    Added a tag to each operator's documentation showing the MatX version in which it was introduced
  • Compile-time Properties
    Specify compile-time properties on an operator for finer-grained control of its operation, for example changing the accumulation type of an operator.

Full Changelog: v0.9.4...v1.0.0

v0.9.4

27 Oct 17:38
ad55c6b

Note: MatX is approaching a 1.0 release with several major updates. 1.0 will contain CUDA JIT capabilities that allow better kernel fusion and overall improvements in kernel runtimes. Along with the JIT capabilities, most files contain changes that enable kernel efficiency improvements. MatX 1.0 will require C++20 support in both the CUDA and host compilers, and CUDA 11.8 will no longer be supported.

Notable Changes:

  • apply() and apply_idx() operators for writing lambda-based custom operators

Full Changelog: v0.9.3...v0.9.4

v0.9.3

26 Sep 23:30
86d0b82

New operators: find_peaks, zipvec
Key Updates:

  • C2R FFT transforms
  • Indexing speedup for accessing tensors

Full Changelog: v0.9.2...v0.9.3

v0.9.2

29 Jul 19:13
fa9e872

New operator: interp

Other Additions:

  • Improvements to sparse support, including a new batched tri-diagonal solver
  • Automatic vectorization and ILP support
  • DLPack updated to 1.1
  • Many bug fixes

Full Changelog: v0.9.1...v0.9.2

v0.9.1

14 May 15:43
4475c22

Sparse support + bugfixes

  • New operators: argminmax, dense2sparse, sparse2dense, interp1, normalize, argsort
  • Removed requirement for --relaxed-constexpr
  • Added MatX NVTX domain
  • Significantly improved speed of svd and inv
  • Python integration sample
  • Experimental sparse tensor support (SpMM and solver routines supported)
  • Significantly reduced FFT memory usage

v0.9.0

15 Oct 18:12
af55b57

Version 0.9.0 adds comprehensive support for more host CPU transforms such as BLAS and LAPACK, including multi-threaded versions.

Beyond the CPU support, there are many more minor improvements:

  • Added several new operators, including vector_norm, matrix_norm, frexp, diag, and more
  • Many compiler fixes to support a wider range of older and newer compilers
  • Performance improvements to avoid overhead of permutation operators when unnecessary
  • Much more!


v0.8.0

04 Apr 17:27
7719779

Release highlights:

  • Features
    • Updated cuTENSOR and cuTensorNet versions
    • Added configurable print formatting
    • ARM FFT support via NVPL
    • New operators: abs2(), outer(), isnan(), isinf()
    • Many more unit tests for CPU code paths
  • Bug fixes for matmul on Hopper, 2D FFTs, and more

Full Changelog: v0.7.0...v0.8.0

v0.7.0

04 Jan 21:06

Full Changelog: v0.6.0...v0.7.0

v0.6.0

02 Oct 16:50

Full Changelog: v0.5.0...v0.6.0

v0.5.0

03 Jul 21:38

Notable Updates

  • Documentation rewritten to include working examples for every function based on unit tests
  • Polyphase resampler based on SciPy/cuSignal's resample_poly

Full Changelog: v0.4.1...v0.5.0