Conversation
Hi @ookaay, thanks for the PR. In general this looks like a good addition, but I'm not sure that it actually works (did you test this on a machine with a CUDA GPU?). Your code (and your tests) assumes a format for the multiprocessor line that deviceQuery does not actually produce: these lines start with the number of multiprocessors and end with "CUDA cores" (the product of the number of multiprocessors and the cores per multiprocessor), which is what the output on my laptop shows. Also, the parsing code assumes that the compute capability is <10. Another question: what is the source for the translation of compute capability into number of cores/SM? Finally, is using deviceQuery the right approach here at all? It seems the same information could be obtained more reliably via nvidia-smi.
Thanks @mstimberg, this makes sense. You're right: the parsing I added is too tied to one deviceQuery output format, and the assumptions I made about the multiprocessor line and the compute capability parsing are too narrow. I also take your point about deviceQuery itself. If the same information can be obtained more reliably from nvidia-smi, that's probably the better direction. I'll rework the implementation with that in mind before pushing this further.
My question was also: did you run this on a machine where it worked? If I am not mistaken, the multiprocessor line has always looked like it does in my example. |
No, I didn’t run this on a real machine. I only made sure the implementation compiled correctly, but I didn’t validate the parsing against actual hardware output. |
I see, but does that mean that the assumed parsing output is LLM-generated? And same for the mapping from compute capability to cores/SM? As we've seen for the deviceQuery output, these are not necessarily correct, and it would be helpful to know for the review where I can verify these things. |
Yes, I did some brainstorming with AI while working on this, and I was wrong not to check the deviceQuery output format against the actual source first. For the compute capability to cores/SM mapping, that’s the mapping used in _ConvertSMVer2Cores in NVIDIA_CUDA-11.2_Samples/common/inc/helper_cuda.h. I also should have tested it on a real CUDA GPU. I only checked that the code path compiled. Using deviceQuery was the wrong direction. I should have stayed with nvidia-smi. I’ll rework it around that, verify it on a GPU runtime in Colab, and open a new PR after implementing it. |
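For review purposes, the `_ConvertSMVer2Cores` table can be mirrored in Python roughly as below. This is a sketch written from memory of the CUDA samples table, not a copy of the PR's code; the entries (and any capabilities newer than Ampere) should be verified against `helper_cuda.h` itself.

```python
# Cores per streaming multiprocessor, keyed by (major, minor) compute
# capability. Intended to mirror _ConvertSMVer2Cores from the CUDA 11.2
# samples' helper_cuda.h; verify each entry against that header.
CORES_PER_SM = {
    (3, 0): 192, (3, 5): 192, (3, 7): 192,   # Kepler
    (5, 0): 128, (5, 2): 128, (5, 3): 128,   # Maxwell
    (6, 0): 64,  (6, 1): 128, (6, 2): 128,   # Pascal
    (7, 0): 64,  (7, 2): 64,  (7, 5): 64,    # Volta / Turing
    (8, 0): 64,  (8, 6): 128,                # Ampere
}


def cores_per_sm(major, minor, default=64):
    """Return CUDA cores per SM for a compute capability.

    Unknown capabilities fall back to `default`, analogous to the
    warn-and-use-default behaviour of _ConvertSMVer2Cores.
    """
    return CORES_PER_SM.get((major, minor), default)
```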
This PR addresses parts of #255 and #295.
For #255, automatic GPU selection previously relied on the legacy compute-capability-based choice. I changed it so automatic selection now uses a CUDA-style performance estimate:
```
performance = multiprocessors * cuda_cores_per_multiprocessor * clock_rate
```
If performance detection is unavailable, it falls back to the old compute-capability-based logic.
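The selection-with-fallback logic described above could look roughly like this. This is a sketch, not the PR's actual code: the device dictionary keys and the way failure is detected are hypothetical names chosen for illustration.

```python
def select_gpu(devices):
    """Pick the device id with the highest estimated performance.

    `devices` is a list of dicts with keys 'id', 'multiprocessors',
    'cores_per_sm', 'clock_rate_khz', and 'compute_capability'
    (hypothetical field names, for illustration only).
    """
    def performance(dev):
        # CUDA-style estimate: SMs * cores/SM * clock rate.
        return (dev['multiprocessors']
                * dev['cores_per_sm']
                * dev['clock_rate_khz'])

    try:
        return max(devices, key=performance)['id']
    except (KeyError, TypeError):
        # Performance detection unavailable: fall back to the legacy
        # compute-capability-based choice.
        return max(devices, key=lambda d: d['compute_capability'])['id']
```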
For #295, I updated brian2cuda/tools/test_suite/run_test_suite.py so it no longer runs the Brian test flow directly itself. Instead, it now delegates to the shared brian2cuda.tests.run entrypoint.
I verified the new GPU-selection path, the fallback to the legacy selection logic, and that run_test_suite.py now routes through the shared Brian2CUDA test runner.
I also fixed the build configuration by changing the setuptools_scm fallback version from an invalid value to a valid PEP 440 version, so the package can build correctly in CI.
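For reference, the relevant setting looks roughly like this (illustrative; the actual version string used in the PR may differ):

```toml
# pyproject.toml
[tool.setuptools_scm]
# Used when no SCM metadata is available; must be a valid PEP 440 version.
fallback_version = "0.0.0.dev0"
```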
@mstimberg @denisalevi