System Info
Working on a Kubernetes deployment with Debian + PyTorch 2.4.0 + ROCm 6.1.
The deployment uses the multi-backend alpha release available in the parent bitsandbytes repo.
Reproduction
Trying to load a model with bitsandbytes fails because there is no access to rocminfo.
def get_rocm_gpu_arch() -> str:
    logger = logging.getLogger(__name__)
    try:
        if torch.version.hip:
            result = subprocess.run(["rocminfo"], capture_output=True, text=True)
            match = re.search(r"Name:\s+gfx([a-zA-Z\d]+)", result.stdout)
ERROR:bitsandbytes.cuda_specs:Could not detect ROCm GPU architecture: [Errno 2] No such file or directory: 'rocminfo'
WARNING:bitsandbytes.cuda_specs:
ROCm GPU architecture detection failed despite ROCm being available.
https://github.com/ROCm/bitsandbytes/blob/4aad810bc1d93c38a5316ec54c822cd12b1f1cd2/bitsandbytes/cuda_specs.py#L54
Expected behavior
I would prefer if I could set the architecture via an environment variable and rocminfo would be the fallback option if the env var is not set.
The related code snippet is linked above under Reproduction.
Happy to work on this if other people feel it is a good workaround.
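To make the proposal concrete, here is a rough sketch of the lookup order I have in mind: check an environment variable first, and only call rocminfo if it is unset. The variable name BNB_ROCM_GPU_ARCH and both function names are placeholders I made up for this sketch, not existing bitsandbytes settings or APIs. Returning "unknown" on failure is also an assumption, and the torch.version.hip check from the current code is left out to keep the example self-contained.

```python
import logging
import os
import re
import subprocess

# Hypothetical variable name used only in this sketch; not an existing bitsandbytes setting.
ARCH_ENV_VAR = "BNB_ROCM_GPU_ARCH"

logger = logging.getLogger(__name__)


def parse_rocminfo(output: str) -> str:
    """Extract the first gfx architecture from rocminfo output.

    Uses the same regex as the current get_rocm_gpu_arch(); the "unknown"
    fallback value is an assumption of this sketch.
    """
    match = re.search(r"Name:\s+gfx([a-zA-Z\d]+)", output)
    return "gfx" + match.group(1) if match else "unknown"


def get_rocm_gpu_arch_with_override() -> str:
    # 1. Prefer an explicit override from the environment, if one is set.
    override = os.environ.get(ARCH_ENV_VAR, "").strip()
    if override:
        return override

    # 2. Otherwise fall back to asking rocminfo, as the code does today.
    try:
        result = subprocess.run(["rocminfo"], capture_output=True, text=True)
    except OSError as e:  # e.g. FileNotFoundError when rocminfo is not installed
        logger.error("Could not detect ROCm GPU architecture: %s", e)
        return "unknown"
    return parse_rocminfo(result.stdout)
```

With something like this, a container image without rocminfo could set the variable in the pod spec (for example to gfx90a) and skip the subprocess call entirely, while existing setups without the variable would behave as before.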