Department of Computing and Numerical Analysis, University of Cordoba
UCO

Deep Depth Pose (DDP) model: 3D Pose Estimation from Depth Maps using a Deep combination of Poses

Manuel J. Marín-Jiménez, Francisco J. Romero, Rafael Muñoz-Salinas, Rafael Medina-Carnicer

Overview

This work addresses 3D human pose estimation from depth maps using a deep learning approach. We propose a model, named Deep Depth Pose (DDP), which receives a depth map containing a person and a set of predefined 3D prototype poses, and returns the 3D positions of the person's body joints. In particular, DDP is defined as a ConvNet that computes the specific weights needed to linearly combine the prototypes for the given input. We have thoroughly evaluated DDP on the challenging 'ITOP' and 'UBC3V' datasets, which depict realistic and synthetic samples respectively, setting a new state of the art on both.
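The combination step described above can be illustrated with a minimal sketch. It only covers the final linear combination of prototype poses. The ConvNet that predicts the weights is not shown, and the function name `combine_prototypes`, the prototype values, and the weights are hypothetical choices made for illustration; they are not taken from the released demo code. No constraint on the weights (such as summing to one) is assumed.

```python
from typing import List, Sequence

Pose = List[List[float]]  # J joints, each an [x, y, z] position


def combine_prototypes(weights: Sequence[float], prototypes: Sequence[Pose]) -> Pose:
    """Return the weighted sum of K prototype poses (one weight per prototype)."""
    if len(weights) != len(prototypes):
        raise ValueError("need exactly one weight per prototype pose")
    num_joints = len(prototypes[0])
    pose = [[0.0, 0.0, 0.0] for _ in range(num_joints)]
    for w, proto in zip(weights, prototypes):
        for j, joint in enumerate(proto):
            for axis in range(3):
                pose[j][axis] += w * joint[axis]
    return pose


# Two made-up prototypes with two joints each (illustrative values only).
prototypes = [
    [[0.0, 1.0, 2.0], [0.0, 0.0, 2.0]],
    [[1.0, 1.0, 2.0], [1.0, 0.0, 2.0]],
]
# In DDP the weights would be computed by the ConvNet from the depth map;
# here they are hard-coded stand-ins.
weights = [0.25, 0.75]
estimated_pose = combine_prototypes(weights, prototypes)
print(estimated_pose)
```

Under this formulation the network outputs only one weight per prototype, and the joint coordinates follow from the fixed prototype set.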

The following figure summarizes the main steps of our approach:

[Figure: DDP pipeline]


Results

Quantitative results

Results on ITOP (frontal and top views):
[Figure: Results on ITOP]

Results on UBC3V Hard-Pose (vs Shafaei'2016):
[Figure: Results on UBC3V]

Qualitative results

The following video shows actual results on the test partition of the ITOP dataset. Each frame has been processed independently.




Poses estimated on ITOP test samples: Download.



Downloads

Filename              Description                                  Size
Demo code at GitHub   Demo code for ITOP (contains sample data)    -- MB
itopresults.mp4       Video with estimated 3D poses                78 MB

Related Publications

[1] M. J. Marín-Jiménez, F. J. Romero, R. Muñoz-Salinas, R. Medina-Carnicer
3D Pose Estimation from Depth Maps using a Deep combination of Poses
Journal of Visual Communication and Image Representation (in press), 2018

@Article{Marin18ijvcr,
  author     = "Marin-Jimenez, M.J. and Romero, F.J. and Mu\~noz-Salinas, R. and Medina-Carnicer, R.",
  title      = "3D Pose Estimation from Depth Maps using a Deep combination of Poses",
  journal    = "Journal of Visual Communication and Image Representation",
  year       = "2018",
  doi  = "10.1016/j.jvcir.2018.07.010",
  note = "In press"
}

Acknowledgements

This project has been funded under projects TIN2016-75279-P and IFI16/00033 (ISCIII) of the Spanish Ministry of Economy, Industry and Competitiveness, and FEDER. Thanks to NVIDIA for donating the Titan Xp GPU used for the experiments presented in this work.