VistaDream: sampling multiview consistent images for single-view scene reconstruction
VistaDream is a framework for reconstructing 3D scenes from a single image using Flux-based diffusion models. This implementation combines image outpainting, depth estimation, and 3D Gaussian splatting to generate the scene, with integrated visualization using Rerun.
Background
VistaDream reconstructs a 3D scene from a single image with a two-stage pipeline:
- Coarse 3D Scaffold Construction: Creates a global scene structure by outpainting image boundaries and estimating depth maps.
- Multi-view Consistency Sampling: Uses iterative diffusion-based RGB-D inpainting with multi-view consistency constraints to generate high-quality novel views.
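To make the first stage more concrete, the sketch below shows one common way an estimated depth map can be lifted into 3D points with a pinhole camera model. It is an illustration only, not VistaDream's actual code: the function name `backproject_depth`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the toy depth values are all assumptions made for this example. A real pipeline would typically operate on NumPy or PyTorch arrays rather than nested lists.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def backproject_depth(
    depth: List[List[float]], fx: float, fy: float, cx: float, cy: float
) -> List[Point]:
    """Lift each pixel (u, v) with depth z to a camera-space point (x, y, z).

    Uses the pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    Pixels with non-positive depth are treated as invalid and skipped.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0.0:
                continue  # no valid depth estimate for this pixel
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Hypothetical 2x2 depth map (one invalid pixel) and made-up intrinsics.
depth = [[1.0, 2.0], [0.0, 4.0]]
points = backproject_depth(depth, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
print(points)
```

Points obtained this way could serve as initial positions for a Gaussian splatting scene, although how VistaDream initializes its Gaussians is not described here.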
The framework utilizes:
- Flux diffusion models for high-quality image outpainting and inpainting.
- 3D Gaussian Splatting for efficient 3D scene representation.
- Rerun for real-time 3D visualization and debugging.
Run the code
This is an external example. Check the repository for more information.
Requires: Linux with an NVIDIA GPU (tested with CUDA 12.9)
Make sure the Pixi package manager is installed, then run:
git clone https://github.com/rerun-io/vistadream.git
cd vistadream
pixi run example
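If the commands above fail, a quick preflight check of the stated requirements can help narrow down the cause. The following is a minimal sketch, not part of the repository: the helper name `check_prerequisites` is hypothetical, and it assumes that the presence of the `nvidia-smi` executable on `PATH` is a reasonable proxy for an installed NVIDIA driver. It does not verify the CUDA version.

```python
import platform
import shutil


def check_prerequisites() -> dict:
    """Report whether the basic requirements appear to be met on this machine."""
    return {
        "linux": platform.system() == "Linux",
        # shutil.which returns None when the executable is not on PATH.
        "pixi": shutil.which("pixi") is not None,
        # Assumption: nvidia-smi on PATH indicates an NVIDIA driver is installed.
        "nvidia-smi": shutil.which("nvidia-smi") is not None,
    }


result = check_prerequisites()
for name, ok in result.items():
    print(f"{name}: {'ok' if ok else 'missing'}")
```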