PAWS: Perception of Articulation in the Wild at Scale from Egocentric Videos

Yihao Wang*, Yang Miao*, Wenshuai Zhao, Wenyan Yang, Zihan Wang, Joni Pajarinen, Luc Van Gool, Danda Pani Paudel, Juho Kannala, Xi Wang†, Arno Solin

*Equal contribution    †Co-advisor

Aalto University  |  INSAIT, Sofia University  |  ETH Zurich  |  TU Munich  |  MCML  |  ELLIS Institute Finland  |  University of Oulu

Project Page | arXiv | Code | Data


Abstract

PAWS perceives object articulations from in-the-wild egocentric video via hand interaction and geometric cues, enabling downstream applications including articulation model fine-tuning and robot manipulation.

We propose PAWS, an unsupervised articulation detection pipeline that extracts object articulations directly from large-scale, in-the-wild egocentric videos, using only hand–object interaction and sparse 3D cues and requiring no annotated data. It produces scalable articulation labels covering a wide range of objects and environments. We evaluate the method on the HD-EPIC and Arti4D datasets, where it improves significantly over baselines, and further demonstrate downstream applications in 3D articulation prediction and real-world robot manipulation.

See the project website at https://aaltoml.github.io/PAWS/.
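The released code is the authoritative reference for the pipeline. As a toy illustration of the kind of geometric cue involved (and not the PAWS method itself), the sketch below recovers a revolute joint axis from a 3D hand-contact trajectory: a contact point dragged along a hinged door traces a circular arc, so fitting the arc's plane (via PCA) and the circle within it (via a Kåsa least-squares fit) yields the axis direction and a point on the axis. The function name and the NumPy-only setup are illustrative assumptions.

```python
import numpy as np

def fit_revolute_axis(points: np.ndarray):
    """Estimate a revolute joint axis from an (N, 3) hand-contact trajectory.

    Returns (axis_dir, axis_point): a unit vector along the rotation axis
    and a point on that axis (the centre of the fitted circle).

    NOTE: hypothetical helper for illustration only, not part of PAWS.
    """
    # A contact point on a revolute joint traces a circular arc, so the
    # arc's plane normal (smallest-variance PCA direction) gives the axis.
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid)
    u, v, normal = vt  # principal in-plane directions and plane normal

    # Express the trajectory in 2D plane coordinates.
    d = points - centroid
    xy = np.stack([d @ u, d @ v], axis=1)

    # Kåsa algebraic circle fit: x^2 + y^2 = 2*a*x + 2*b*y + c is linear
    # in (a, b, c), so the circle centre (a, b) comes from least squares.
    A = np.column_stack([2.0 * xy[:, 0], 2.0 * xy[:, 1], np.ones(len(xy))])
    rhs = (xy ** 2).sum(axis=1)
    (a, b, _), *_ = np.linalg.lstsq(A, rhs, rcond=None)

    axis_point = centroid + a * u + b * v  # lift the 2D centre back to 3D
    return normal / np.linalg.norm(normal), axis_point


if __name__ == "__main__":
    # Synthetic check: a noisy quarter-circle about the z-axis through (1, 2, 0).
    t = np.linspace(0.0, np.pi / 2, 50)
    pts = np.stack([1 + 0.4 * np.cos(t), 2 + 0.4 * np.sin(t), np.zeros_like(t)], axis=1)
    pts += 0.003 * np.random.default_rng(0).normal(size=pts.shape)
    axis_dir, axis_point = fit_revolute_axis(pts)
    print(axis_dir, axis_point)  # expect roughly (0, 0, ±1) and (1, 2, 0)
```

The actual pipeline operates on noisy, partially observed hand reconstructions rather than clean trajectories; this sketch only shows why hand motion plus sparse 3D structure is, in principle, enough to constrain an articulation model.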


Related Work

  • HaWoR — World-Space Hand Motion Reconstruction from Egocentric Videos (CVPR 2025)
  • VidBot — Learning Generalizable 3D Actions from In-the-Wild 2D Human Videos for Zero-Shot Robotic Manipulation (CVPR 2025)
  • Articulate3D — Zero-Shot Text-Driven 3D Object Posing

Citation

Coming soon.
