Accepted by ICRA 2026.
NavGSim is a Gaussian Splatting-based simulator for large-scale robot navigation.
Simulating realistic robot environments remains challenging, especially when both high-quality rendering and physically meaningful interaction are required. The difficulty is more pronounced in long-horizon navigation tasks that span multiple rooms or floors.
NavGSim is built on a hierarchical 3D Gaussian Splatting framework to support photorealistic rendering in large scenes (hundreds of square meters). For collision-aware navigation, NavGSim uses a Gaussian Splatting-based slicing method to extract navigable areas from reconstructed Gaussians. The system includes APIs/tooling for scene setup, robot setup, data collection, training integration, and evaluation.
To validate effectiveness, we collect trajectories in NavGSim, train a Vision-Language-Action (VLA) policy, and evaluate in both simulation and real-world environments.
- Build the slice map (slice_map.png, slice_map_meta.json) with python_slice_extractor (CLI or UI).
- Manually annotate task goals in location.json.
- Collect trajectories with run.py --run-type collect_data (random start + planner-guided path to the selected goal; no model inference required).
- Train a policy/model using UniNaVid: https://github.com/jzhzhang/NaVid-VLN-CE
- Evaluate with run.py --run-type eval_smooth or run.py --run-type dagger.
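The collection step is described as a random start followed by a planner-guided path to the selected goal. The planner NavGSim actually uses is not specified here; as a minimal sketch of the idea, assuming a boolean occupancy grid and a plain breadth-first search (the helper names `sample_free_cell` and `bfs_path` are hypothetical), it could look like this:

```python
from collections import deque
import random

import numpy as np


def sample_free_cell(free, rng):
    """Pick a random navigable (row, col) cell from a boolean grid."""
    rows, cols = np.nonzero(free)
    i = rng.randrange(len(rows))
    return int(rows[i]), int(cols[i])


def bfs_path(free, start, goal):
    """Shortest 4-connected path from start to goal, or None if unreachable."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the parent links back to the start, then reverse.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nxt[0] < free.shape[0] and 0 <= nxt[1] < free.shape[1]
            if inside and free[nxt] and nxt not in parent:
                parent[nxt] = cell
                queue.append(nxt)
    return None


# Toy 5x5 map (assumption): True = navigable, with a wall that leaves a gap at the bottom.
free = np.ones((5, 5), dtype=bool)
free[0:4, 2] = False
goal = (0, 4)
start = sample_free_cell(free, random.Random(0))
path = bfs_path(free, start, goal)
```

Because the path comes from the map alone, a collection loop of this kind needs no model inference, which matches the description of collect_data above.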
- run.py: unified entry point
- eval/nonlearning_agents_smooth.py: rollout/evaluation logic
- python_slice_extractor/: slicing tools and UI
- config/: default config files (slice_map.png, slice_map_meta.json, location.json), overridable via CLI
In eval/nonlearning_agents_smooth.py, model loading/inference is intentionally left open:
- load_model(self, model_path)
- predict_inference(self, prompt, rgb_list)
Implement these interfaces for your own model when running model-driven rollout.
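As a rough illustration of what filling in these two hooks might look like, here is a self-contained stub. Only the method names and arguments come from the text above; the class names `MyAgent` and `ConstantActionModel`, and the assumption that `predict_inference` returns an action as text, are hypothetical and should be adapted to what the rollout code in eval/nonlearning_agents_smooth.py actually expects.

```python
class ConstantActionModel:
    """Hypothetical stand-in model that always returns the same action text."""

    def __init__(self, action="forward"):
        self.action = action

    def generate(self, prompt, images):
        # A real model would condition on the prompt and the RGB frames.
        return self.action


class MyAgent:
    """Illustrative agent exposing the two hooks named above."""

    def __init__(self):
        self.model = None

    def load_model(self, model_path):
        # Assumption: a real implementation would load weights from model_path here.
        self.model = ConstantActionModel()

    def predict_inference(self, prompt, rgb_list):
        # Assumption: the return value is an action expressed as text.
        if self.model is None:
            raise RuntimeError("call load_model() before predict_inference()")
        return self.model.generate(prompt, rgb_list)


agent = MyAgent()
agent.load_model("/path/to/trained_model")
action = agent.predict_inference("go to the toilet", rgb_list=[])
```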
Use the hierarchical-3d-gaussians baseline setup, with conda env name NavGSim.
conda create -n NavGSim python=3.12 -y
conda activate NavGSim
# Replace cu121 with cu118 if using CUDA 11.x
pip install torch==2.3.0 torchvision==0.18.0 torchaudio==2.3.0 --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt
pip install -e .
mkdir -p hierarchical_3d_gaussians/submodules/Depth-Anything-V2/checkpoints
wget -O hierarchical_3d_gaussians/submodules/Depth-Anything-V2/checkpoints/depth_anything_v2_vitl.pth \
"https://huggingface.co/depth-anything/Depth-Anything-V2-Large/resolve/main/depth_anything_v2_vitl.pth?download=true"
Alternative:
- Download dpt_large-midas-2f21e586.pt from https://github.com/intel-isl/DPT/releases/download/1_0/dpt_large-midas-2f21e586.pt
- Place it under hierarchical_3d_gaussians/submodules/DPT/weights/
cd hierarchical_3d_gaussians/submodules/gaussianhierarchy
cmake . -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build -j --config Release
cd ../../..
If the runtime reports missing extensions (gaussian_hierarchy, simple_knn, diff_gaussian_rasterization):
pip install -e hierarchical_3d_gaussians/submodules/simple-knn
pip install -e hierarchical_3d_gaussians/submodules/hierarchy-rasterizer
pip install -e hierarchical_3d_gaussians/submodules/gaussianhierarchy
First read: python_slice_extractor/README.md
Generate:
- slice_map.png: occupancy map
- slice_map_meta.json: world/pixel conversion metadata
CLI example:
python python_slice_extractor/slice_volume.py \
--ply /path/to/point_cloud.ply \
--z-min 0.5 \
--z-max 1.5 \
--num-slices 5 \
--backend auto \
--pixel-size 0.05 \
--out config/slice_map.png \
--meta-json config/slice_map_meta.json
Local UI:
python python_slice_extractor/slice_ui.py
Web UI:
python python_slice_extractor/slice_web_ui.py --host 0.0.0.0 --port 7860
Open: http://127.0.0.1:7860 (or forwarded host).
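To give a feel for what a height band (--z-min, --z-max) and a pixel size mean for the resulting occupancy map, here is a simplified NumPy sketch. It is not the python_slice_extractor implementation: it treats the scene as bare 3D points, ignores --num-slices and Gaussian extents, and the function name `occupancy_from_points` and the metadata keys `origin` and `pixel_size` are assumptions made for illustration.

```python
import numpy as np


def occupancy_from_points(points, z_min, z_max, pixel_size):
    """Rasterize points whose height lies in [z_min, z_max] into a 2D grid."""
    # Keep only points inside the height band.
    band = points[(points[:, 2] >= z_min) & (points[:, 2] <= z_max)]
    origin = points[:, :2].min(axis=0)  # world (x, y) of pixel (0, 0)
    extent = points[:, :2].max(axis=0) - origin
    width, height = np.floor(extent / pixel_size).astype(int) + 1
    occupied = np.zeros((height, width), dtype=bool)
    # Map each remaining point to its (col, row) cell and mark it occupied.
    cols, rows = np.floor((band[:, :2] - origin) / pixel_size).astype(int).T
    occupied[rows, cols] = True
    meta = {"origin": origin.tolist(), "pixel_size": pixel_size}  # hypothetical keys
    return occupied, meta


# Four toy points; only the two with z in [0.5, 1.5] should mark cells.
points = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 2.0],
    [1.0, 1.0, 0.9],
])
occupied, meta = occupancy_from_points(points, z_min=0.5, z_max=1.5, pixel_size=0.5)
```

Points below z_min (floor) or above z_max (ceiling) do not block the robot in this simplification, which is why the band should roughly cover the robot's body height.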
Create and manually annotate config/location.json.
Example:
[
{
"name": "toilet",
"pos": [17.9607, -36.9790, -1.5708]
}
]
Use the dedicated model-free collection branch:
CUDA_VISIBLE_DEVICES=0 python run.py \
--run-type collect_data \
--split-id 0 \
--render-source-path /path/to/3dgs_scene \
--map-path config/slice_map.png \
--slice-path config/slice_map_meta.json \
--location-path config/location.json \
--num-episodes 300 \
--output-dir outputs/data_origin
Train your model with the collected trajectories in UniNaVid: https://github.com/jzhzhang/NaVid-VLN-CE
eval_smooth:
CUDA_VISIBLE_DEVICES=0 python run.py \
--run-type eval_smooth \
--split-id 0 \
--model-path /path/to/trained_model \
--render-source-path /path/to/3dgs_scene \
--map-path config/slice_map.png \
--slice-path config/slice_map_meta.json \
--location-path config/location.json \
--num-episodes 18 \
--output-dir outputs/eval_smooth
dagger:
CUDA_VISIBLE_DEVICES=0 python run.py \
--run-type dagger \
--split-id 0 \
--model-path /path/to/trained_model \
--render-source-path /path/to/3dgs_scene \
--map-path config/slice_map.png \
--slice-path config/slice_map_meta.json \
--location-path config/location.json \
--num-episodes 300 \
--output-dir outputs/dagger
- --run-type: collect_data, eval_smooth, dagger
- --split-id: task split id
- --model-path: required for eval_smooth/dagger
- --render-source-path: required
- --map-path: path to slice_map.png
- --slice-path: path to slice_map_meta.json
- --location-path: path to location.json
- --output-dir: output directory
- --num-episodes: episode count for all run types
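The rule that --model-path is only needed for the model-driven run types can be pictured with a small argparse sketch. This is not the parser in run.py: the flag names and the config/ defaults follow the descriptions above, while the argument types, the defaults for --split-id, --output-dir and --num-episodes, and the helper names are assumptions.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Sketch of the documented flags")
    parser.add_argument("--run-type", required=True,
                        choices=["collect_data", "eval_smooth", "dagger"])
    parser.add_argument("--split-id", type=int, default=0)  # assumed type/default
    parser.add_argument("--model-path")
    parser.add_argument("--render-source-path", required=True)
    parser.add_argument("--map-path", default="config/slice_map.png")
    parser.add_argument("--slice-path", default="config/slice_map_meta.json")
    parser.add_argument("--location-path", default="config/location.json")
    parser.add_argument("--output-dir", default="outputs")  # assumed default
    parser.add_argument("--num-episodes", type=int, default=1)  # assumed default
    return parser


def parse_args(argv):
    parser = build_parser()
    args = parser.parse_args(argv)
    # collect_data is model-free; the other run types need a trained model.
    if args.run_type in ("eval_smooth", "dagger") and not args.model_path:
        parser.error("--model-path is required for eval_smooth and dagger")
    return args


args = parse_args(["--run-type", "collect_data",
                   "--render-source-path", "/path/to/3dgs_scene"])
```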
Full CLI:
python run.py -h
- Coordinate conversion functions in eval/nonlearning_agents_smooth.py (location_2_map, map_2_location) are scene-dependent and should match your coordinate system.
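For orientation, a world/pixel conversion of this kind often reduces to an offset and a scale. The sketch below assumes an axis-aligned map with no axis flip or rotation, a known world position for pixel (0, 0), and a uniform pixel size. The function names `world_to_pixel` and `pixel_to_world` are hypothetical stand-ins for the roles of location_2_map and map_2_location, whose real signatures may differ.

```python
def world_to_pixel(x, y, origin, pixel_size):
    """World (x, y) in meters -> integer (col, row) pixel on the slice map."""
    col = int((x - origin[0]) // pixel_size)
    row = int((y - origin[1]) // pixel_size)
    return col, row


def pixel_to_world(col, row, origin, pixel_size):
    """Pixel (col, row) -> world (x, y) at the center of that pixel."""
    x = origin[0] + (col + 0.5) * pixel_size
    y = origin[1] + (row + 0.5) * pixel_size
    return x, y


# Hypothetical map origin and resolution, chosen only for this example.
origin = (10.0, -40.0)
pixel_size = 0.5
col, row = world_to_pixel(17.9, -37.0, origin, pixel_size)
x, y = pixel_to_world(col, row, origin, pixel_size)
```

If your scene's map is flipped or rotated relative to the world frame, the sign and axis order in both functions need to change together so that the round trip stays consistent.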
- Root project code: Apache-2.0 (LICENSE)
- hierarchical_3d_gaussians/: adapted upstream code with its own license (hierarchical_3d_gaussians/LICENSE.md)

