Google Scanned Objects: A High-Quality Dataset of 3D Scanned Household Items
Interactive 3D simulations have enabled breakthroughs in robotics and computer vision, but simulating the broad diversity of environments needed for deep learning requires large corpora of photo-realistic 3D object models. To address this need, we present Google Scanned Objects, an open-source collection of over one thousand 3D-scanned household items released under a Creative Commons license; these models are preprocessed for use in Ignition Gazebo and the Bullet simulation platforms, but are easily adaptable to other simulators. We describe our object scanning and curation pipeline, then provide statistics about the contents of the dataset and its usage. We hope that the diversity, quality, and flexibility of Google Scanned Objects will lead to advances in interactive simulation, synthetic perception, and robotic learning.
Forward citations
Cited by 8 Pith papers
- DepthMaster: Unified Monocular Depth Estimation for Perspective and Panoramic Images
  DepthMaster unifies metric monocular depth estimation for perspective and panoramic images by patching panoramas into perspective views, adding a consistency loss and virtual cameras, and training mostly on perspectiv...
- Synthetic Dataset Generation for Partially Observed Indoor Objects
  A Unity virtual scanning system with ray-based simulation and procedural indoor scene generation produces the V-Scan dataset of partial scans paired with complete 3D models for training scene reconstruction and object...
- PixGS: Pixel-Space Diffusion for Direct 3D Gaussian Splat Generation
  A single-stage pixel-space diffusion model for direct 3D Gaussian Splat generation that bypasses latent compression and adds geometric supervisions to outperform prior multi-stage methods.
- A Cross-Model VLM-Judge Protocol for Single-Image 3D Mesh Quality (and Why Cheap Proxies Fall Short)
  A reproducible VLM-judge protocol with position-bias correction is validated as superior to CLIP similarity and geometry-validity proxies for assessing single-image 3D mesh quality.
- Unified Motion-Action Modeling for Heterogeneous Robot Learning
  UMA treats object motion and robot actions as co-evolving variables under a masked generative objective with hindsight relabeling and contrastive disentanglement to support multi-task pretraining and deployment across...
- Restore3D: Breathing Life into Broken Objects with Shape and Texture Restoration
  Restore3D restores shape and texture of broken 3D objects via multi-view image refinement with a Mask Self-Perceiver and coarse-to-fine mesh reconstruction, outperforming baselines on synthetic and real benchmarks.
- Qwen-Image Technical Report
  Qwen-Image is a foundation model that reaches state-of-the-art results in image generation and editing by combining a large-scale text-focused data pipeline with curriculum learning and dual semantic-reconstructive en...
- GeneralVLA-2: Geometry-Aware Reconstruction and Governed Memory for Robot Planning
  GeneralVLA-2 introduces GeoFuse-MV3D for improved multi-view 3D reconstruction and a governed memory system, demonstrating modest gains on 3D object and task benchmarks.