[paper review] SINGAPO

1 minute read

SINGAPO: Single Image Controlled Generation of Articulated Parts In Objects

| Project Page | arXiv | Code |

  • Read ✓✓
  • Interests ★★☆☆☆
  • Keywords
    • 3d generation
    • articulated object
    • retrieval

Introduction

SINGAPO is a method to generate articulated objects from a single image. It designs a pipeline that produces articulated objects from high-level structure down to geometric details in a coarse-to-fine manner, using a part connectivity graph and part abstractions as proxies.
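As a rough mental model of these proxies, here is a minimal sketch of what a part abstraction and a connectivity graph could look like. `PartAbstraction`, `ConnectivityGraph`, and every field below are my own hypothetical illustration, not the paper's actual representation.

```python
from dataclasses import dataclass, field

# Hypothetical sketch: each part is abstracted as a bounding box plus joint
# parameters, and parts are linked into a connectivity graph rooted at the
# static base.
@dataclass
class PartAbstraction:
    label: str            # semantic label, e.g. "door", "drawer"
    center: tuple         # bounding-box center (x, y, z)
    size: tuple           # bounding-box extents (w, h, d)
    joint_type: str       # "fixed", "revolute", or "prismatic"
    joint_axis: tuple     # articulation axis direction
    joint_origin: tuple   # point the axis passes through
    parent: int = -1      # index of the parent part (-1 = base)

@dataclass
class ConnectivityGraph:
    parts: list = field(default_factory=list)

    def add_part(self, part: PartAbstraction) -> int:
        self.parts.append(part)
        return len(self.parts) - 1

    def children_of(self, idx: int):
        return [i for i, p in enumerate(self.parts) if p.parent == idx]

# Example: a cabinet base with one hinged door
graph = ConnectivityGraph()
base = graph.add_part(PartAbstraction("base", (0, 0, 0), (1, 1, 0.5),
                                      "fixed", (0, 0, 1), (0, 0, 0)))
graph.add_part(PartAbstraction("door", (0.45, 0, 0.3), (0.1, 1, 0.6),
                               "revolute", (0, 0, 1), (0.5, -0.5, 0),
                               parent=base))
print(graph.children_of(base))  # -> [1]
```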

Motivation and contributions

  • Recent lines of work either
    • reconstruct articulated objects from images or videos that capture the object from multiple views or in multiple articulation states, or
    • generate articulated objects unconditionally or guided only by high-level constraints.
  • This work instead addresses
    • using a single image from an arbitrary view as input
    • a modular pipeline that allows more control

Methods

  • Input: a single image of the object in its resting state
  • Output: retrieved and assembled articulated parts

Goal

Generate a set of articulated parts that assemble into a 3D object that is geometrically consistent with the input image while being kinematically plausible.

  • Use the input image to guide the abstract-level generation of each part (via a diffusion model); this paper focuses on this first step.
  • Fit part meshes to the generated part arrangement and assemble them into a 3D articulated object (= retrieval); a rough sketch of how the two stages compose follows this list.
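A minimal sketch of the two-stage flow under my own assumptions: `generate_part_layout`, `retrieve_mesh`, the aspect-ratio similarity, and the toy part library are all placeholders, not the paper's actual architecture or retrieval metric.

```python
import numpy as np

# Stage 1 (the paper's focus): a diffusion model, conditioned on image
# features, generates an abstract layout of parts (boxes + joints).
# Stage 2: for each generated box, retrieve the best-fitting part mesh
# from a library and assemble the articulated object.

def generate_part_layout(image_features: np.ndarray, num_parts: int = 4):
    """Stand-in for diffusion sampling: returns per-part box extents (w, h, d)."""
    rng = np.random.default_rng(0)
    return rng.uniform(0.1, 1.0, size=(num_parts, 3))  # placeholder boxes

def retrieve_mesh(box: np.ndarray, library: dict) -> str:
    """Pick the library part whose aspect ratio best matches the generated box."""
    ratios = box / box.sum()
    best = min(library.items(),
               key=lambda kv: np.linalg.norm(ratios - kv[1] / kv[1].sum()))
    return best[0]

# Tiny stand-in part library: name -> canonical extents
library = {
    "door_panel":   np.array([0.05, 1.0, 0.6]),
    "drawer_box":   np.array([0.4, 0.15, 0.5]),
    "cabinet_base": np.array([1.0, 1.0, 0.5]),
}

image_features = np.zeros(256)  # placeholder for an image encoder output
layout = generate_part_layout(image_features)
assembly = [retrieve_mesh(box, library) for box in layout]
print(assembly)
```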

Datasets

  • Used PartNet-Mobility (7 categories); rendered 20 images per object from random viewpoints, plus augmentation, yielding 3,063 objects each paired with 20 images for training and 77 objects each paired with 2 random views for testing.
  • Used the ACD dataset to test generalization capability.

Limitations

  • Tailored to specific object categories (mostly cuboid shapes).
  • Mesh retrieval falls short of capturing the finer geometric details of parts seen in the input image.