Jdb/grant review #24

Open
jeremiedb wants to merge 20 commits into jdb/neuro-v3 from jdb/grant-review

Conversation

@jeremiedb
Member

No description provided.

AdityaPandeyCN and others added 20 commits January 30, 2026 19:16
- Add Enzyme.jl dependency
- Implement compute_grads() using Enzyme.autodiff with runtime activity mode
- Add train/test mode switching to handle BatchNorm mutation issues
- Refactor mlogloss to use direct indexing instead of onehotbatch
- Configure Enzyme strictAliasing in module __init__

Signed-off-by: AdityaPandeyCN <adityapand3y666@gmail.com>
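
As a rough Julia sketch of what a compute_grads() along the lines described above could look like (the loss signature, activity annotations, and helper name are assumptions for illustration, not the PR's actual code):

import Enzyme

# Hedged sketch only: reverse-mode gradients of a scalar loss w.r.t. the model,
# with Enzyme's runtime activity mode enabled as mentioned in the commit above.
function compute_grads_sketch(loss, model, x, y)
    dmodel = Enzyme.make_zero(model)                  # shadow structure accumulating the gradients
    Enzyme.autodiff(
        Enzyme.set_runtime_activity(Enzyme.Reverse),  # runtime activity mode
        loss,                                         # loss(model, x, y) -> scalar
        Enzyme.Active,                                # the scalar return is active
        Enzyme.Duplicated(model, dmodel),             # differentiate w.r.t. the model
        Enzyme.Const(x),
        Enzyme.Const(y),
    )
    return dmodel
end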
…m Reactant cache

Signed-off-by: AdityaPandeyCN <adityapand3y666@gmail.com>
…rnal onehotbatch calls.

Signed-off-by: AdityaPandeyCN <adityapand3y666@gmail.com>
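
For reference, the "direct indexing instead of onehotbatch" refactor mentioned in the commits could look roughly like this (the function name and the K × N probability layout are assumptions):

# Hedged sketch: pick each observation's predicted probability for its true class
# by indexing directly, instead of materializing a one-hot matrix.
function mlogloss_direct(p::AbstractMatrix, y::AbstractVector{<:Integer})
    # p is K × N (class probabilities per column), y holds class indices in 1:K
    idx = CartesianIndex.(y, eachindex(y))
    return -sum(log.(p[idx])) / length(y)
end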
@jeremiedb jeremiedb changed the base branch from main to jdb/neuro-v3 February 13, 2026 06:52
Member Author

@jeremiedb jeremiedb left a comment


@AdityaPandeyCN I made some comments from a fork of your PR #23

import Optimisers
import Optimisers: OptimiserChain, WeightDecay, Adam, NAdam, Nesterov, Descent, Momentum, AdaDelta
import Flux: trainmode!, gradient, cpu, gpu
import Optimisers: OptimiserChain, WeightDecay, Momentum, Nesterov, Descent, Adam, NAdam
Member Author


I had to add back Adam, as init uses Adam and only NAdam was imported.

outsize ÷= 2
chain = Chain(
BatchNorm(nfeats),
BatchNorm(nfeats, track_stats=false),
Member Author


Was this track_stats=false intended as a temporary fix? I'd expect it to result in different behavior, and the default option (true) should also be compatible with Reactant.


Yes, this was a fix. I was facing some problems with BN, particularly the running stats (μ and σ²) not changing. I will look into the docs more to see how to handle it.
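
For what it's worth, a minimal Flux sketch of the default behavior in question; with track_stats=true the running μ/σ² only get mutated while the layer is in train mode, which is presumably the mutation the traced/compiled path struggles with:

using Flux

bn = BatchNorm(4)        # track_stats=true is the default
x  = randn(Float32, 4, 32)

Flux.trainmode!(bn)
bn(x)                    # forward pass updates the running stats bn.μ and bn.σ²

Flux.testmode!(bn)
bn(x)                    # uses the accumulated bn.μ / bn.σ², no mutation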

offset_name=nothing,
)

device = config.device
Member Author


I think that device should still be used here, since it's a parameter defined in the learner, and as such should be defined with the "model constructor" (per MLJ terminology).

using NeuroTabModels

using Reactant
Reactant.set_default_backend("cpu")
Member Author


device should continue to be defined in the model constructor (i.e. NeuroTabRegressor at line 60 below).
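
To illustrate the suggestion (the exact keyword list is an assumption, not the full NeuroTabModels API), device would sit with the other hyper-parameters of the learner:

using NeuroTabModels

# Hedged illustration: device defined in the model constructor, so the fitting
# code reads config.device rather than relying on a global backend setting.
config = NeuroTabRegressor(
    nrounds = 200,
    device  = :gpu,
)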


# desktop: 0.771839 seconds (369.20 k allocations: 1.522 GiB, 5.94% gc time)
@time p_train = m(dtrain; device=:gpu);
@time p_train = m(dtrain);
Member Author


It should still be possible to request inference execution on a specific device (cpu or gpu) via the kwarg.
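
A rough sketch of honouring that kwarg at inference time (the helper name and the use of Flux's cpu/gpu movers are assumptions for illustration):

import Flux

# Hedged sketch: move the model and data to the requested device per call,
# independently of the globally configured backend.
function infer(m, x; device::Symbol = :cpu)
    to_dev = device == :gpu ? Flux.gpu : Flux.cpu
    return to_dev(m)(to_dev(x))
end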

using Random: seed!

using Reactant
Reactant.set_default_backend("gpu")
Member Author


Testing with gpu as the backend resulted in quite poor performance, ~15 sec, which is even a little slower than the cpu timing at ~14 sec.
Have you observed such poor performance? This would need to be investigated, as performance should be at least within the parity zone of the current jdb/neuro-v3 / Zygote implementation.

It shows the following warnings:

[ Info: Init training
I0000 00:00:1770966229.664446 2648952 dot_merger.cc:481] Merging Dots in computation: main.6
I0000 00:00:1770966230.970942 2648952 dot_merger.cc:481] Merging Dots in computation: main.6
I0000 00:00:1770966232.270221 2648952 dot_merger.cc:481] Merging Dots in computation: main.6
I0000 00:00:1770966233.563376 2648952 dot_merger.cc:481] Merging Dots in computation: main.6
I0000 00:00:1770966234.858976 2648952 dot_merger.cc:481] Merging Dots in computation: main.6
I0000 00:00:1770966236.155050 2648952 dot_merger.cc:481] Merging Dots in computation: main.6
I0000 00:00:1770966237.464304 2648952 dot_merger.cc:481] Merging Dots in computation: main.6
I0000 00:00:1770966238.780420 2648952 dot_merger.cc:481] Merging Dots in computation: main.6
I0000 00:00:1770966240.100689 2648952 dot_merger.cc:481] Merging Dots in computation: main.6
I0000 00:00:1770966241.403202 2648952 dot_merger.cc:481] Merging Dots in computation: main.6
 14.681320 seconds (4.47 M allocations: 343.948 MiB, 0.30% gc time, 0.00% compilation time)
  1.300183 seconds (4.06 k allocations: 1.673 GiB, 7.01% gc time)

For reference, with jdb/neuro-v3 branch:

  • cpu: ~ 21 sec (so this PR is an improvement)
  • gpu: ~ 3.2 sec, so the Reactant version is almost 5X slower. Reactant needs to achieve performance around this 3.2 sec to be a usable substitute.


Let me have another look at the gpu performance. I will report back to you.

if is_full
opts_ra, m_ra = cache[:compiled_step](cache[:loss], m_ra, opts_ra, args_ra...)
else
opts_ra, m_ra = Reactant.@jit _train_step!(cache[:loss], m_ra, opts_ra, args_ra...)


This is going to be inefficient; you should pad the last batch with zeros if needed.
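
A sketch of the padding idea, so the last partial batch keeps the compiled shape and the jitted step can be reused (names are placeholders):

# Hedged sketch: pad the last partial batch with zeros up to the fixed batch size.
function pad_to_batchsize(x::AbstractMatrix{T}, batchsize::Int) where {T}
    n = size(x, 2)
    n == batchsize && return x
    padded = zeros(T, size(x, 1), batchsize)
    padded[:, 1:n] .= x
    return padded
end

The padded observations would also need zero weights so they don't contribute to the loss.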

@@ -61,24 +62,23 @@ function update(
while fitresult.info[:nrounds] < model.nrounds


you can even compile the whole while loop
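
A sketch of that idea, reusing the names from the snippet above; whether the full loop traces cleanly through Reactant.@compile would need to be verified:

# Hedged sketch: compile one outer function that runs all the training steps,
# instead of jitting each step separately.
function _train_loop!(loss, m_ra, opts_ra, nrounds, args_ra...)
    for _ in 1:nrounds
        opts_ra, m_ra = _train_step!(loss, m_ra, opts_ra, args_ra...)
    end
    return opts_ra, m_ra
end

compiled_loop = Reactant.@compile _train_loop!(cache[:loss], m_ra, opts_ra, nrounds, args_ra...)
opts_ra, m_ra = compiled_loop(cache[:loss], m_ra, opts_ra, nrounds, args_ra...)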

@jeremiedb
Member Author

@AdityaPandeyCN I've made some explorations around migrating to Lux and it looks encouraging. GPU perf seems to improve meaningfully with Reactant. See branch lux-new: https://github.com/Evovest/NeuroTabModels.jl/tree/lux-new
Notably, the logic and device handling get much simplified and quite aligned with basic Lux tutorials.
Note that inference (i.e. m(x)) and eval metrics tracking no longer work, as adaptation is needed.
But I think these should be quite straightforward.
If it makes sense to you, I'd suggest starting from this branch and completing the adaptation, notably the eval callback / inference support, and adding proper functionality to allow training on either cpu (cpu_device) or GPU / Reactant device, etc.
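
For context, the kind of training step Lux's TrainState gives you looks roughly like the standard tutorial pattern below (the model, sizes, and AD backend are placeholders, not the branch's actual code; presumably on lux-new this would run through Reactant rather than plain Zygote):

using Lux, Optimisers, Random, ADTypes, Zygote

rng    = Random.default_rng()
model  = Chain(Dense(10 => 32, relu), Dense(32 => 1))
ps, st = Lux.setup(rng, model)
tstate = Lux.Training.TrainState(model, ps, st, Optimisers.Adam(1f-3))

x = randn(rng, Float32, 10, 64)
y = randn(rng, Float32, 1, 64)

# single_train_step! returns (grads, loss, stats, updated train state)
_, loss, _, tstate = Lux.Training.single_train_step!(
    AutoZygote(), MSELoss(), (x, y), tstate)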

@AdityaPandeyCN

Thanks @jeremiedb, this looks really great, I will complete the adaptation. I locally addressed the comments on this branch, but some of the things got quite complex, which I assume can be handled really well by Lux's TrainState.

@jeremiedb
Member Author

Note that something I'm a little concerned about regarding Lux is its loss function interface: https://lux.csail.mit.edu/stable/api/Lux/utilities#Loss-Functions
It seems to assume only an x, y pair, while in NeuroTabModels there's a need to support losses where a weights and/or offset vector are also present. It would be worth taking a look at this to validate that such 3- and 4-input data to the loss function can be adapted to respect the Lux interface.

@AdityaPandeyCN

So something like a custom objective function instead of using MSELoss() to handle them correctly?

@jeremiedb
Member Author

Yeah, there already are custom loss functions for NeuroTabModels defined in https://github.com/Evovest/NeuroTabModels.jl/blob/lux-new/src/losses.jl. These would need to be adapted for compatibility with Lux.

Looking under the hood at how Lux deals with the x, y pairs, it appears that the gradients get updated by treating them as just a single data object, hence a tuple. So I'd expect a loss function acting on (x, y, w) or (x, y, w, offset) to work just fine.
So I don't think there's a need to worry about this on your end for now; instead, focus on cleaning up the implementation so that inference and eval metrics tracking work (e.g. with apply(m, x, ps, st)), as well as the StackTree, which was simplified into a single NeuroTree (a custom wrapper layer like https://lux.csail.mit.edu/stable/introduction/#Defining-Custom-Layers shall likely work).
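
To make that concrete, a weighted loss written in the (model, ps, st, data) -> (loss, st, stats) objective form that Lux's training API expects could look roughly like this (the actual NeuroTabModels losses live in src/losses.jl; this is only the interface shape):

using Lux

# Hedged sketch: data is treated as a single tuple, so it can carry (x, y, w)
# (or (x, y, w, offset)) and the objective unpacks it itself.
function weighted_mse_objective(model, ps, st, data)
    x, y, w = data
    ŷ, st = Lux.apply(model, x, ps, st)
    loss = sum(w .* abs2.(ŷ .- y)) / sum(w)
    return loss, st, NamedTuple()
end

Such an objective would then be passed as-is to Training.single_train_step!(backend, weighted_mse_objective, (x, y, w), tstate).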
