Have you ever wanted to transform a single image into a 3D mesh that can be viewed from any angle? Imagine the possibilities of interacting with virtual objects or exploring realistic 3D scenes using just a single 2D image as input.

In this blog post, we’ll dive into One-2-3-45, a method that generates a full 360-degree textured 3D mesh from a single image in just 45 seconds. We’ll explore the challenges of single-image 3D reconstruction and see how this approach addresses them.

Figure 1: One-2-3-45 reconstructs a full 360° mesh of any object in 45 seconds given a single image of it.

The Challenge of Single Image 3D Reconstruction

Single-image 3D reconstruction means recovering the 3D shape, texture, and pose of an object or scene from one 2D image. The problem is inherently ill-posed: the image carries no depth information, and many different 3D shapes project to exactly the same 2D image. Occlusions, lighting conditions, and reflections complicate the reconstruction further.
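The projection ambiguity can be made concrete with a small sketch. It assumes an idealized pinhole camera; the function name `project` and the two example points are made up for illustration and do not come from the One-2-3-45 code.

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a camera-space point (x, y, z), z > 0, onto the image plane."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (focal_length * x / z, focal_length * y / z)


# Two different 3D points on the same viewing ray (illustrative values):
# the second is four times farther away and four times larger in x and y.
near_point = (0.5, 0.25, 1.0)
far_point = (2.0, 1.0, 4.0)

print(project(near_point))
print(project(far_point))
```

Both calls print the same image coordinates, so a single image alone cannot tell a small nearby object from a large distant one. A reconstruction method has to bring in extra information, such as learned priors about object shape.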

Figure 2: NeRF-based method [48] and SDF-based method [74] fail to reconstruct high-quality meshes given multi-view images predicted by Zero123. See Figure 1 for our reconstruction results.

Introducing the One-2-3-45 Method

The One-2-3-45 method tackles single-image 3D reconstruction directly: it generates a complete 360-degree textured 3D mesh from a single image in just 45 seconds.

How One-2-3-45 Works

The key components of the One-2-3-45 method are Zero123 and One2345.


Zero123 is a view-conditioned 2D diffusion model. It encodes the single input image and, conditioned on that encoding and on the desired change in camera pose, generates an image of the same object from the new viewpoint. Repeating this for several target poses yields a set of multi-view images that captures how the object’s appearance changes with viewpoint.
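To make "desired camera pose" concrete, here is a minimal sketch of how target viewpoints around an object might be described in spherical coordinates relative to the input view. The helper names, the ring of 8 views, the 30-degree elevation, and the radius of 1.5 are all assumptions chosen for illustration; this is not the Zero123 API or the pose schedule used by One-2-3-45.

```python
import math


def camera_position(elevation_deg, azimuth_deg, radius):
    """Convert spherical camera coordinates to a Cartesian position (z axis up)."""
    elev = math.radians(elevation_deg)
    azim = math.radians(azimuth_deg)
    x = radius * math.cos(elev) * math.cos(azim)
    y = radius * math.cos(elev) * math.sin(azim)
    z = radius * math.sin(elev)
    return (x, y, z)


def relative_pose(source, target):
    """Change in (elevation, azimuth, radius) from source to target view.

    The azimuth difference is wrapped into [-180, 180) degrees.
    """
    d_elev = target[0] - source[0]
    d_azim = (target[1] - source[1] + 180.0) % 360.0 - 180.0
    d_radius = target[2] - source[2]
    return (d_elev, d_azim, d_radius)


def ring_of_views(elevation_deg, radius, num_views):
    """Evenly spaced viewpoints on a circle around the object."""
    step = 360.0 / num_views
    return [(elevation_deg, i * step, radius) for i in range(num_views)]


# Assumed pose of the input image: (elevation, azimuth, radius).
input_view = (30.0, 0.0, 1.5)
target_views = ring_of_views(elevation_deg=30.0, radius=1.5, num_views=8)

for view in target_views:
    print(relative_pose(input_view, view), camera_position(*view))
```

Each printed relative pose is the kind of conditioning signal a view-conditioned model would receive alongside the input image, with one generated image per target view.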


One2345 is a neural surface reconstruction module. It lifts the multi-view images generated by Zero123 into 3D space using a signed distance function (SDF), which gives the distance from any point in space to the nearest surface, with the sign indicating whether the point lies inside or outside the object. The multi-view images correspond to camera poses distributed on a sphere around the object. Using differentiable rendering, One2345 reconstructs the 3D shape by minimizing the difference between these images and the images rendered from the SDF.
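As a rough illustration of what a signed distance function encodes, the sketch below uses an analytic sphere as a stand-in for the learned SDF and locates its surface along a camera ray with sphere tracing. The sphere, the ray, and the function names are assumptions for this example only. One-2-3-45 learns its SDF with a neural network and renders it differentiably, which this sketch does not attempt.

```python
import math


def sphere_sdf(point, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance to a sphere: negative inside, zero on the surface, positive outside."""
    return math.dist(point, center) - radius


def sphere_trace(origin, direction, sdf, max_steps=64, eps=1e-6, max_dist=100.0):
    """March along a unit-length ray, stepping by the SDF value each time.

    Returns the distance along the ray to the surface, or None if nothing is hit.
    """
    t = 0.0
    for _ in range(max_steps):
        point = tuple(o + t * d for o, d in zip(origin, direction))
        distance = sdf(point)
        if distance < eps:
            return t  # close enough to the zero level set: surface found
        t += distance  # safe step: no surface is closer than `distance`
        if t > max_dist:
            break
    return None


# A ray aimed at the sphere, and a parallel ray that passes above it.
hit = sphere_trace(origin=(0.0, 0.0, -3.0), direction=(0.0, 0.0, 1.0), sdf=sphere_sdf)
miss = sphere_trace(origin=(0.0, 2.0, -3.0), direction=(0.0, 0.0, 1.0), sdf=sphere_sdf)

print(hit)
print(miss)
```

The surface is the set of points where the SDF equals zero. A renderer that finds this surface along every pixel ray can produce an image, and comparing that image against the target views is what drives the reconstruction.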

Figure 3: Comparison of One-2-3-45 with Point-E [52], Shap-E [29], Zero123 (Stable Dreamfusion version) [36], 3DFuse [68], and RealFusion [43]. In each example, we present both the textured and textureless meshes.

Benefits and Advantages of the One-2-3-45 Method

The One-2-3-45 method offers several advantages over existing approaches:

  1. Faster Reconstruction: One-2-3-45 can generate a complete 3D textured mesh in just 45 seconds, considerably faster than methods that rely on lengthy per-shape optimization.
  2. Improved Geometry and Consistency: The method produces more accurate geometry and exhibits greater 3D consistency, resulting in meshes that closely adhere to the input image.
  3. Integration with Text-to-Image Models: One-2-3-45 seamlessly integrates with off-the-shelf text-to-image diffusion models, enabling the generation of realistic 3D shapes from text descriptions.

One-2-3-45: Applications and Future Implications

The ability to create 3D meshes from single images quickly and accurately has many practical applications in virtual and augmented reality, computer graphics, gaming, and content creation. Researchers and developers can use the technique to enhance user experiences, build interactive virtual environments, and streamline 3D content generation. It may also prove useful in industries such as architecture, fashion, medical imaging, and entertainment.


In conclusion, the One-2-3-45 method is a notable advance in single-image 3D reconstruction. By combining a view-conditioned 2D diffusion model with a neural surface reconstruction module, it generates a complete 360-degree textured 3D mesh from a single image in just 45 seconds. That speed opens up new ways to interact with virtual objects and explore realistic 3D scenes.