Fine-Tuning Human Pose Estimation in Videos
Digvijay Singh, Vineeth Balasubramanian, C. V. Jawahar
Overview
We propose a semi-supervised self-training method for fine-tuning human pose estimates in videos that yields accurate estimates even for complex sequences. We surpass the state of the art on most of the datasets used and also show a gain over the baseline on our new dataset of unrestricted sports videos. The self-training model has two components: a static Pictorial Structure based model and a dynamic ensemble of exemplars. We present a pose quality criterion that is used primarily for batch selection and automatic parameter selection; the same criterion also serves as a low-level pose evaluator in post-processing. We set a new challenge by introducing CVIT-SPORTS, a complex dataset of sports videos with full human body-part annotations. We demonstrate the strength of our method by adapting it to videos of complex activities such as cricket bowling, cricket batting, and football, as well as to available standard datasets.
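To make the idea of self-training with quality-based batch selection concrete, here is a minimal Python sketch. It is not the released MATLAB implementation and does not reproduce the paper's models. The names `self_train`, `quality`, `refit`, and `make_nearest_exemplar` are hypothetical, and the toy data uses scalar "frames" with binary labels in place of video frames and poses. The nearest-exemplar lookup is only a stand-in for a model refitted on accepted estimates; it is not an Exemplar-SVM ensemble.

```python
def self_train(frames, predict, quality, refit, batch_size=2, rounds=10):
    """Generic self-training loop (illustrative sketch, not the authors' code).

    predict(frame) -> estimate
    quality(frame, estimate) -> float, higher means more trustworthy
    refit(list of (frame, estimate) pairs) -> new predict function
    """
    pending = list(range(len(frames)))
    pseudo_labels = {}  # frame index -> accepted estimate
    for _ in range(rounds):
        if not pending:
            break
        # Estimate every frame that has not been accepted yet.
        estimates = {i: predict(frames[i]) for i in pending}
        # Batch selection: accept only the highest-quality estimates.
        ranked = sorted(pending,
                        key=lambda i: quality(frames[i], estimates[i]),
                        reverse=True)
        for i in ranked[:batch_size]:
            pseudo_labels[i] = estimates[i]
        pending = [i for i in pending if i not in pseudo_labels]
        # Refit the model on everything accepted so far.
        predict = refit([(frames[i], pseudo_labels[i])
                         for i in sorted(pseudo_labels)])
    return pseudo_labels, predict


def make_nearest_exemplar(pairs):
    """Toy refit step: answer with the estimate of the closest accepted frame."""
    def predict(x):
        nearest = min(pairs, key=lambda pair: abs(pair[0] - x))
        return nearest[1]
    return predict


# Toy data: scalar "frames"; the initial model is a fixed threshold.
frames = [0.05, 0.95, 0.40, 0.60]
initial_model = lambda x: 1 if x >= 0.5 else 0
# Assumed quality score: distance from the threshold (a confidence proxy).
toy_quality = lambda x, estimate: abs(x - 0.5)

labels, model = self_train(frames, initial_model, toy_quality,
                           make_nearest_exemplar, batch_size=2)
```

In this toy run the two frames far from the threshold are accepted first, and the refitted model then labels the two harder frames. A real system would replace each callable with a pose estimator, a pose quality measure, and a model update step.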
Here we release our MATLAB implementation of [1]. To read more about the method, see the paper PDF.
Downloads
Filename | Description | Size |
---|---|---|
fine_tuning_pose.tar.gz | MATLAB code for fine-tuning human pose estimation in videos. | 94 MB |
README | Instructions for running the code and other information. | 4.0 KB |
cvit_sports_videos.tar.gz | CVIT-SPORTS-Videos dataset of 11 video sequences from the cricket domain. | 66 MB |
References
[2] Y. Yang, D. Ramanan. Articulated Pose Estimation with Flexible Mixtures-of-Parts. CVPR 2011.
[3] A. Cherian, J. Mairal, K. Alahari, C. Schmid. Mixing Body-Part Sequences for Human Pose Estimation. CVPR 2014.
[4] B. Sapp, D. Weiss, B. Taskar. Parsing Human Motion with Stretchable Models. CVPR 2011.
[5] T. Malisiewicz, A. Gupta, A. Efros. Ensemble of Exemplar-SVMs for Object Detection and Beyond. ICCV 2011.