CUDA initialization error: torch/cuda/__init__.py:118: UserWarning: CUDA initialization: CUDA unknown error - this may
1. Problem description
When PyTorch calls CUDA, the following warning is reported:
torch/cuda/__init__.py:118: UserWarning: CUDA initialization: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero. (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:108.)
return torch._C._cuda_getDeviceCount() > 0
At this point,
nvidia-smi
nvcc -V
torch.cuda.device_count()
all work normally.
However, as soon as
torch.cuda.is_available()
is executed, the same UserWarning shown above is emitted.
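To record the warning text programmatically instead of reading it off the console, a minimal sketch along these lines may help. The helper name `probe_cuda` is hypothetical and not part of PyTorch. It only assumes that the message reaches Python's standard `warnings` machinery, which the `UserWarning` prefix above suggests.

```python
import warnings


def probe_cuda(is_available):
    """Call is_available() and collect any warnings it emits.

    Returns a tuple (result, list of warning message strings).
    `probe_cuda` is an illustrative helper, not a PyTorch API.
    """
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, even ones already shown earlier in the session.
        warnings.simplefilter("always")
        result = is_available()
    return result, [str(w.message) for w in caught]


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed; nothing to probe.")
    else:
        ok, messages = probe_cuda(torch.cuda.is_available)
        print("torch.cuda.is_available():", ok)
        for message in messages:
            print("warning:", message)
```

Passing the check in as a callable keeps the helper independent of PyTorch, so it can also be tried with a stand-in function on a machine without a GPU.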
2. Solution
Run:
sudo rmmod nvidia_uvm
sudo modprobe nvidia_uvm
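Searching online, I found that some users speculate the NVIDIA kernel modules are fragile and can become corrupted at random, hence the workaround of removing and re-inserting the nvidia_uvm module.
After running these commands: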
>>> import torch
>>> torch.cuda.is_available()
True
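Before reloading the module, it can be useful to check whether nvidia_uvm is loaded at all and whether anything is still using it, since rmmod generally refuses to unload a module with a non-zero use count. The sketch below reads /proc/modules on Linux. The function names `parse_proc_modules` and `module_use_count` are hypothetical helpers introduced only for illustration.

```python
from pathlib import Path


def parse_proc_modules(text):
    """Map module name -> use count from /proc/modules-style text.

    Each line starts with: name, size in bytes, use count.
    A use count of -1 means the kernel did not report a number.
    """
    modules = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            count = int(fields[2]) if fields[2].isdigit() else -1
            modules[fields[0]] = count
    return modules


def module_use_count(name, path="/proc/modules"):
    """Return the use count of a loaded module, or None if it is not listed."""
    proc_file = Path(path)
    if not proc_file.exists():  # not Linux, or /proc is not mounted
        return None
    return parse_proc_modules(proc_file.read_text()).get(name)


if __name__ == "__main__":
    count = module_use_count("nvidia_uvm")
    if count is None:
        print("nvidia_uvm is not loaded (or /proc/modules is unavailable)")
    else:
        print(f"nvidia_uvm is loaded, use count = {count}")
```

If the reported use count is greater than zero, stopping the processes that hold the GPU (for example, running Python sessions) before running rmmod is usually necessary.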