A Guide to Mixing ROS2 and Conda (CUDA/Torch) on WSL2
Environment and goals
Environment
- WSL2 kernel: Linux-6.6.87.2-microsoft-standard-WSL2
- ROS2: Humble (Ubuntu 22.04)
- Conda env: audio2exp (torch nightly, CUDA available)
- GPU: NVIDIA GeForce RTX 5080 Laptop GPU, compute capability 12.0
- onnxruntime providers: TensorrtExecutionProvider, CUDAExecutionProvider, CPUExecutionProvider
Target topology
- Node A: publishes audio on /audio/pcm
- Node B: subscribes to the audio, runs inference, and publishes /face/arkit52
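Core concepts
ROS_DOMAIN_ID
The DDS "logical isolation number". On the same network:
- Different domains: nodes cannot see each other at all
- Same domain: discovery and communication become possible
RMW_IMPLEMENTATION
Selects the implementation behind the ROS2 middleware abstraction layer (RMW), which determines the underlying DDS implementation:
- Humble default: rmw_fastrtps_cpp (FastDDS)
- Common alternative: rmw_cyclonedds_cpp (CycloneDDS, more stable under WSL2 in this setup)
Why can't ros2 run find Torch inside Conda?
ros2 run invokes the entry-point script in the workspace install directory. The shebang on the first line of that script is fixed at build time to the Python used for the build (usually /usr/bin/python3). So even with the Conda environment activated, ros2 run still launches the node with the system Python, which cannot import the Torch installed in Conda.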
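To confirm which interpreter an installed entry point is pinned to, it can help to read its shebang directly. The following is a minimal sketch; the helper name read_shebang, the executable name audio_sub, and the install/<package>/lib/<package>/ layout are assumptions for illustration and may differ in your workspace.

```python
from pathlib import Path


def read_shebang(script_path):
    """Return the interpreter named on the script's first line, or None."""
    with open(script_path, "rb") as f:
        first_line = f.readline().decode("utf-8", errors="replace").strip()
    return first_line[2:].strip() if first_line.startswith("#!") else None


if __name__ == "__main__":
    # Hypothetical path: adjust package and executable names to your workspace.
    entry = (Path.home() / "ws_audio2exp" / "install" / "audio_demo_nodes"
             / "lib" / "audio_demo_nodes" / "audio_sub")
    if entry.exists():
        print("shebang interpreter:", read_shebang(entry))
    else:
        print("entry script not found:", entry)
```

If the printed interpreter is /usr/bin/python3, activating Conda will not change what ros2 run executes.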
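Pitfalls and fixes
Problem 1: on WSL2, ROS2 "publishes but nothing is received / nodes are not discovered"
Diagnosis: first test whether DDS multicast works. Run the receiver in one terminal and the sender in another:
ros2 multicast receive
ros2 multicast send
Steps:
- If receive prints Hello World!, the network layer is fine;
- Make sure both sides use the same ROS_DOMAIN_ID;
- Try switching the RMW to rmw_cyclonedds_cpp (WSL2 + FastDDS occasionally shows discovery compatibility problems).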
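The second and third steps come down to comparing two environment variables across terminals. A small sketch of that comparison is below; the function names dds_settings and mismatches are hypothetical, and the defaults it fills in (domain 0 when ROS_DOMAIN_ID is unset, rmw_fastrtps_cpp when RMW_IMPLEMENTATION is unset) assume Humble as described above.

```python
import os


def dds_settings(env=None):
    """Extract the discovery-related settings from an environment mapping."""
    env = os.environ if env is None else env
    return {
        # ROS2 falls back to domain 0 when the variable is unset.
        "ROS_DOMAIN_ID": env.get("ROS_DOMAIN_ID", "0"),
        # Unset means the distro default (rmw_fastrtps_cpp on Humble).
        "RMW_IMPLEMENTATION": env.get("RMW_IMPLEMENTATION", "rmw_fastrtps_cpp"),
    }


def mismatches(a, b):
    """Return the setting names whose values differ between two terminals."""
    return sorted(key for key in a if a[key] != b.get(key))


# Run this in each terminal and compare the printed values by eye.
print(dds_settings())
```

A differing ROS_DOMAIN_ID always prevents discovery; a differing RMW is worth aligning while debugging even where the two implementations might interoperate.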
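Problem 2: the terminal crashes after setting RMW_IMPLEMENTATION=rmw_cyclonedds_cpp
Typical causes:
- The CycloneDDS RMW package is not installed (on Humble, the apt package is ros-humble-rmw-cyclonedds-cpp);
- Conda has taken over PYTHONPATH/LD_LIBRARY_PATH, so the ROS2 Python package paths are wrong;
- The script uses set -e: any command returning non-zero makes the whole script exit immediately, and because the script is sourced, that exit closes the terminal itself.
Fixes:
- Script order: ROS2 first, then Conda, and finally re-add the ROS2 PYTHONPATH;
- Do not force the RMW at first; enable it only after confirming it is installed and usable;
- While debugging, temporarily remove set -e, or append || true to commands that are allowed to fail.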
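Before exporting RMW_IMPLEMENTATION, one rough way to check that the implementation is installed is to look for its shared library. This sketch assumes the library is named lib<rmw_name>.so and lives under /opt/ros/humble/lib, which matches a standard apt install but is an assumption here; rmw_library_present is a hypothetical helper.

```python
from pathlib import Path


def rmw_library_present(rmw_name, ros_prefix="/opt/ros/humble"):
    """Return True if lib<rmw_name>.so exists under <ros_prefix>/lib."""
    return (Path(ros_prefix) / "lib" / f"lib{rmw_name}.so").is_file()


# Prints False on machines where the package (or ROS2 itself) is missing.
print("rmw_cyclonedds_cpp installed:", rmw_library_present("rmw_cyclonedds_cpp"))
```

Only set RMW_IMPLEMENTATION=rmw_cyclonedds_cpp once this reports True.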
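Problem 3: colcon build of a custom msg package fails with missing catkin_pkg / missing em
Recommended fix (the most robust): build with the system Python, with Conda deactivated, stale build output removed, and any Python/CMake overrides cleared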
conda deactivate
cd ~/ws_audio2exp
rm -rf build/ install/ log/
unset PYTHON_EXECUTABLE
unset Python3_EXECUTABLE
unset CMAKE_PREFIX_PATH
hash -r
source /opt/ros/humble/setup.bash
colcon build
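If the build still fails, it is worth checking whether the interpreter that colcon picks up can actually locate the two modules named in the error. The sketch below uses only the standard library; missing_modules is a hypothetical helper, and the assumption is that you run it with the same python3 that is first on PATH when you call colcon build.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the module names that the current interpreter cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("interpreter:", sys.executable)
print("missing build modules:", missing_modules(["catkin_pkg", "em"]))
```

An interpreter path inside a Conda environment together with a non-empty list suggests the build is still using Conda's Python rather than the system one.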
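Problem 4: the message package fails because package.xml needs member_of_group
Fix: in audio_msgs/package.xml, declare the package as a member of rosidl_interface_packages. member_of_group is a direct child of <package>, placed outside <export>:
<member_of_group>rosidl_interface_packages</member_of_group>
<export>
  <build_type>ament_cmake</build_type>
</export>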
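A quick way to verify the manifest is to parse it and look for the tag among the top-level children of <package>. This is a minimal sketch using the standard library; is_interface_package is a hypothetical helper, and it assumes the tag only counts when it is a direct child of the root element.

```python
import xml.etree.ElementTree as ET


def is_interface_package(package_xml_text):
    """Return True if <package> directly declares rosidl_interface_packages."""
    root = ET.fromstring(package_xml_text)
    # findall() with a bare tag name matches direct children only.
    groups = [el.text.strip() for el in root.findall("member_of_group") if el.text]
    return "rosidl_interface_packages" in groups


sample = """<package format="3">
  <name>audio_msgs</name>
  <member_of_group>rosidl_interface_packages</member_of_group>
  <export><build_type>ament_cmake</build_type></export>
</package>"""
print(is_interface_package(sample))
```

To check a real manifest, pass the contents of audio_msgs/package.xml instead of the inline sample.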
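Problem 5: RCLError on Ctrl+C exit (double shutdown)
Cause: on Ctrl+C the context may already have been shut down by the signal handler, so an unconditional rclpy.shutdown() in the cleanup path shuts it down a second time and raises.
Fix: use try_shutdown(), which only shuts down a context that is still active
import rclpy

def main():
    rclpy.init()
    node = AudioSub()  # the subscriber node class, defined elsewhere in this module
    try:
        rclpy.spin(node)
    except KeyboardInterrupt:
        pass  # Ctrl+C is the normal way to stop the node
    finally:
        node.destroy_node()
        rclpy.try_shutdown()  # does nothing if the context is already shut down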
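Problem 6: when running a node inside the Conda environment, Torch/ONNX Runtime still fail with "No module named"
Root cause: the shebang pins the interpreter (the system Python used at build time), so ros2 run always uses the system Python.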
Final stable approach
Approach S: build with the system Python; run the inference node with the Conda interpreter via python -m
Build (one-time)
cd ~/ws_audio2exp
conda deactivate || true
source /opt/ros/humble/setup.bash
colcon build
source install/setup.bash
Run (in every new terminal)
Terminal A: subscriber / inference node
source /opt/ros/humble/setup.bash
cd ~/ws_audio2exp
source install/setup.bash
conda activate audio2exp
python -m audio_demo_nodes.audio_sub
Terminal B: publisher
source /opt/ros/humble/setup.bash
cd ~/ws_audio2exp
source install/setup.bash
conda activate audio2exp
python -m audio_demo_nodes.audio_pub_1hz
Key benefits
- The ROS2 build chain is not polluted by Conda;
- Torch/CUDA/ONNX Runtime are fully available;
- Avoids the pitfall of ros2 run pinning the system Python.
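Since the whole approach depends on the node being started by the Conda interpreter, a node could check this at startup and warn early. The following is a sketch under that assumption; runs_under_conda_env is a hypothetical helper and relies on CONDA_PREFIX being set by conda activate.

```python
import os
import sys
from pathlib import Path


def runs_under_conda_env(executable, conda_prefix):
    """Return True if the interpreter path lies inside the given Conda prefix."""
    if not conda_prefix:
        return False
    exe = Path(executable).resolve()
    prefix = Path(conda_prefix).resolve()
    return prefix in exe.parents


if not runs_under_conda_env(sys.executable, os.getenv("CONDA_PREFIX")):
    print("warning: not running under the active Conda env:", sys.executable)
```

Placed near the top of the inference module, this makes an accidental ros2 run launch visible before the Torch import fails.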
Script
#!/usr/bin/env bash
# Usage: source ~/ros2_conda_audio2exp_env.sh
# =========================
# 1) Basic environment
# =========================
ROS_DISTRO="humble" # humble / jazzy
CONDA_ENV_NAME="audio2exp" # Conda env name
WS_DIR="$HOME/ws_audio2exp" # ROS2 workspace
ROS_PY_VER="3.10" # Python version of the ROS2 distro (3.10 on Humble)
# =========================
# 2) DDS / ROS parameters
# =========================
# 0 = do not pin the RMW (system default), 1 = pin CycloneDDS
USE_CYCLONEDDS=1
# Domain ID (must match across terminals on the same machine/network)
ROS_DOMAIN_ID_VALUE=0
# =========================
# 3) Self-check switches
# =========================
# Print the self-check output after setup?
ENABLE_SELF_CHECK=1
# Check torch/cuda during the self-check?
ENABLE_TORCH_CHECK=1
# Check onnxruntime during the self-check?
ENABLE_ONNXRUNTIME_CHECK=1
# =========================
# Internal helpers
# =========================
_is_sourced() {
  [[ "${BASH_SOURCE[0]}" != "${0}" ]]
}
_fail() {
  local code="${1:-1}"
  if _is_sourced; then
    return "$code"
  else
    exit "$code"
  fi
}
_log() {
  echo "[ROS2+Conda] $*"
}
_warn() {
  echo "[ROS2+Conda][WARN] $*" >&2
}
# =========================
# 4) source ROS2
# =========================
# Note: when this file is sourced, _fail only returns from the function itself,
# so each call is followed by "|| return 1" to stop the script as well.
ROS_SETUP="/opt/ros/${ROS_DISTRO}/setup.bash"
if [[ ! -f "$ROS_SETUP" ]]; then
  _warn "ROS2 setup script not found: $ROS_SETUP"
  _fail 1 || return 1
fi
# shellcheck disable=SC1090
source "$ROS_SETUP" || _fail 1 || return 1
# =========================
# 5) source conda + activate env
# =========================
CONDA_SH="$HOME/miniconda3/etc/profile.d/conda.sh"
if [[ ! -f "$CONDA_SH" ]]; then
  _warn "conda.sh not found: $CONDA_SH"
  _fail 1 || return 1
fi
# shellcheck disable=SC1090
source "$CONDA_SH" || _fail 1 || return 1
conda activate "$CONDA_ENV_NAME" || _fail 1 || return 1
# =========================
# 6) source workspace (if already built)
# =========================
WS_SETUP="${WS_DIR}/install/setup.bash"
if [[ -f "$WS_SETUP" ]]; then
  # shellcheck disable=SC1090
  source "$WS_SETUP" || _fail 1 || return 1
else
  _warn "Workspace setup not found: $WS_SETUP (safe to ignore if you have not run colcon build yet)"
fi
# =========================
# 7) PYTHONPATH fallback (keeps Conda from shadowing rclpy)
# =========================
ROS_PY_PATH="/opt/ros/${ROS_DISTRO}/lib/python${ROS_PY_VER}/site-packages"
if [[ -d "$ROS_PY_PATH" ]]; then
  # ${PYTHONPATH:+...} avoids a trailing ":" (an empty entry) when PYTHONPATH is unset
  export PYTHONPATH="${ROS_PY_PATH}${PYTHONPATH:+:${PYTHONPATH}}"
else
  _warn "ROS Python path does not exist: $ROS_PY_PATH (check ROS_PY_VER)"
fi
# =========================
# 8) Set ROS_DOMAIN_ID / RMW
# =========================
export ROS_DOMAIN_ID="${ROS_DOMAIN_ID_VALUE}"
if [[ "$USE_CYCLONEDDS" == "1" ]]; then
  export RMW_IMPLEMENTATION="rmw_cyclonedds_cpp"
else
  # RMW not pinned: clear the variable so the system default applies
  unset RMW_IMPLEMENTATION
fi
# =========================
# 9) Self-check (optional)
# =========================
if [[ "$ENABLE_SELF_CHECK" == "1" ]]; then
  export ENABLE_TORCH_CHECK
  export ENABLE_ONNXRUNTIME_CHECK
  _log "Environment ready"
  # The heredoc body and its PY terminator must start at column 0.
  python - << 'PY'
import os, sys, platform
print("==== Runtime Self Check ====")
print("sys.executable:", sys.executable)
print("python:", sys.version.replace("\n", " "))
print("platform:", platform.platform())
print("cwd:", os.getcwd())
print("CONDA_PREFIX:", os.getenv("CONDA_PREFIX"))
print("ROS_DOMAIN_ID:", os.getenv("ROS_DOMAIN_ID"))
print("RMW_IMPLEMENTATION:", os.getenv("RMW_IMPLEMENTATION"))
print("PYTHONPATH:", os.getenv("PYTHONPATH", ""))
try:
    import rclpy
    print("rclpy: OK")
except Exception as e:
    print("rclpy: FAIL", repr(e))
if os.getenv("ENABLE_TORCH_CHECK", "1") == "1":
    try:
        import torch
        print("torch:", torch.__version__)
        print("torch.cuda.is_available:", torch.cuda.is_available())
        if torch.cuda.is_available():
            print("torch.version.cuda:", torch.version.cuda)
            print("torch.backends.cudnn.enabled:", torch.backends.cudnn.enabled)
            try:
                print("torch.backends.cudnn.version:", torch.backends.cudnn.version())
            except Exception:
                pass
            try:
                print("cuda device[0]:", torch.cuda.get_device_name(0))
                print("capability:", torch.cuda.get_device_capability(0))
                print("total_mem(MB):", round(torch.cuda.get_device_properties(0).total_memory / 1024 / 1024, 1))
                # Small matmul to confirm that kernels actually run on the GPU
                a = torch.randn((256, 256), device="cuda")
                b = torch.randn((256, 256), device="cuda")
                c = a @ b
                _ = c.mean().item()
                print("torch CUDA matmul: OK")
            except Exception as e:
                print("torch CUDA runtime test: FAIL", repr(e))
    except Exception as e:
        print("torch: FAIL", repr(e))
else:
    print("torch check: SKIPPED")
if os.getenv("ENABLE_ONNXRUNTIME_CHECK", "1") == "1":
    try:
        import onnxruntime as ort
        print("onnxruntime:", ort.__version__)
        print("onnxruntime providers:", ort.get_available_providers())
    except Exception as e:
        print("onnxruntime: FAIL", repr(e))
else:
    print("onnxruntime check: SKIPPED")
print("==== Self Check Done ====")
PY
fi
# 1) Make sure the conda command is available first
_conda_sh="$HOME/miniconda3/etc/profile.d/conda.sh"
if [[ -f "$_conda_sh" ]]; then
  # shellcheck disable=SC1090
  source "$_conda_sh" || return 1
else
  echo "[a2eenv][ERR] missing: $_conda_sh" >&2
  return 1
fi
# 2) Activate the Conda env (a positional argument takes precedence)
_env="${1:-$CONDA_ENV_NAME}"
conda activate "$_env" || return 1
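Optional approach
Route 1: let the system Python "see" Conda's site-packages (not recommended, but workable)
Add the Conda site-packages directory to PYTHONPATH in the environment script. This can cause conflicts between the system Python and the Conda environment.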
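For reference, the directory in question can be derived from the environment prefix. This is only a sketch: conda_site_packages is a hypothetical helper, the lib/python<version>/site-packages layout is the usual one for Conda on Linux, and the route is only plausible when the Conda env uses the same Python minor version as the system interpreter (3.10 on Humble), since compiled packages such as Torch are version-specific.

```python
import os
from pathlib import Path


def conda_site_packages(conda_prefix, py_version="3.10"):
    """Build the usual site-packages path inside a Conda env on Linux."""
    return Path(conda_prefix) / "lib" / f"python{py_version}" / "site-packages"


prefix = os.getenv("CONDA_PREFIX")
if prefix:
    # Append (rather than prepend) so system and ROS2 packages keep priority.
    print("candidate PYTHONPATH entry:", conda_site_packages(prefix))
else:
    print("CONDA_PREFIX is not set; activate the env first")
```

Appending the path keeps ROS2's own packages ahead of Conda's, which limits but does not remove the risk of conflicts noted above.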