Motion in Multiple Views
Mathematically, an image is the projection of the 3D world onto the 2D plane of the camera. This projection loses the information present in the third dimension, commonly referred to as the depth or the z dimension. Multiple projections of the same scene can compensate for this loss, and this has led to the study of the geometry that underlies multiple views of the same scene.
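The loss of depth can be illustrated with a minimal sketch, assuming an idealized pinhole camera with unit focal length looking along the z axis. The function name `project_pinhole` and the sample coordinates are hypothetical and chosen only for illustration. Two world points on the same viewing ray land on the same image point in one view, while a second, translated view separates them.

```python
def project_pinhole(point, focal_length=1.0, camera_x=0.0):
    """Perspective projection of a 3D point onto the image plane.

    The camera sits at (camera_x, 0, 0) and looks along +z (an assumed,
    idealized setup; no lens distortion or pixel scaling).
    """
    X, Y, Z = point
    return (focal_length * (X - camera_x) / Z, focal_length * Y / Z)


# Two points on the same ray through the first camera centre.
near = (1.0, 2.0, 4.0)
far = tuple(2.5 * c for c in near)  # same direction, 2.5 times farther away

# View 1: both points project to the same image location, so depth is lost.
image_near = project_pinhole(near)
image_far = project_pinhole(far)

# View 2: the camera is shifted by one unit along x; the projections differ.
shifted_near = project_pinhole(near, camera_x=1.0)
shifted_far = project_pinhole(far, camera_x=1.0)

print("view 1:", image_near, image_far)
print("view 2:", shifted_near, shifted_far)
```

In the first view the two projections coincide, and only the second view tells the points apart, which is the sense in which additional projections compensate for the lost dimension.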
Multiview analysis of scenes is an active area in Computer Vision today. The structure of points and lines as seen in two views attracted the attention of computer vision researchers such as Longuet-Higgins and Olivier Faugeras in the eighties and early nineties. Similar studies on the underlying constraints in three or more views followed. The mathematical structure underlying multiple views has been analysed with respect to projective, affine, and Euclidean frameworks of the world. These multiview relations have been used for visual recognition of 3D objects under changing viewpoints, object tracking by means of image stabilization, view synthesis of 3D objects from their 2D views without recovering 3D structure, and many other applications.
Multiple view situations in Computer Vision have been analyzed with two objectives: to derive scene-independent constraints relating multiple views and to derive view-independent constraints relating multiple scene points. While the first approach seeks to model the configuration of cameras, the second attempts to characterize the configuration of points in the 3D world. The focus of our work is related to the second approach -- to derive constraints on the configuration of the points being imaged in a manner that does not depend on the viewpoint or imaging parameters. Non-rigid motion is difficult to analyze in this scheme. The case of multiple objects moving with different velocities or accelerations can be considered very close to the case of non-rigid motion. We have developed constraints on the projections of such point configurations which can be categorized into two classes: constraints that are time-dependent -- which are functions of time, and constraints that are time-independent -- which hold at every time instant.
Bennett and Hoffman showed that polynomials that characterize a configuration of stationary points in a view-independent manner can be constructed from 2 views of 4 points under orthographic projection. This was extended by Carlsson to the case of scaled orthographic projection using 2 views of 5 points. Shashua and Levin generalized this to the affine projection model, deriving a time-dependent, view-independent constraint on the projections of 5 points moving with different but constant velocities.
We show that the projection of a point moving with constant velocity in the world moves with constant velocity in the image when the camera is affine. This result is used to formulate a view- and time-independent constraint on the velocities of the projections of 4 points, whose computation needs 2 views. This is a significant theoretical contribution as it needs fewer points and accommodates a more general imaging model. In the same manner, points moving with constant acceleration in the world move with constant acceleration in the image as well. We derive time-dependent and time-independent constraints on the velocities and accelerations, respectively, of the projections of 4 points. These view-independent constraints can be used to recognize a configuration of 4 moving points and also to align frames of synchronized videos.
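The preservation of uniform motion under an affine camera follows from linearity: if the camera maps X to x = M X + b, a world point X(t) = X0 + V t + (1/2) A t^2 projects to x(t) = (M X0 + b) + (M V) t + (1/2)(M A) t^2, so the image velocity is M V and the image acceleration is M A. The sketch below checks this numerically; the camera matrix `M`, offset `b`, and the motion parameters are hypothetical values, not taken from the thesis.

```python
def affine_project(M, b, X):
    """Affine camera: x = M X + b, with M a 2x3 matrix and b a 2-vector."""
    return tuple(sum(M[r][c] * X[c] for c in range(3)) + b[r] for r in range(2))


def world_position(X0, V, A, t):
    """World position under constant velocity V and constant acceleration A."""
    return tuple(X0[i] + V[i] * t + 0.5 * A[i] * t * t for i in range(3))


def differences(points):
    """Differences between consecutive 2D samples."""
    return [tuple(q[i] - p[i] for i in range(2)) for p, q in zip(points, points[1:])]


# Hypothetical affine camera and motion parameters.
M = [[1.0, 0.5, -0.25],
     [0.0, 2.0, 0.5]]
b = (3.0, -1.0)
X0 = (1.0, 2.0, 3.0)
V = (0.5, -1.0, 2.0)
A = (2.0, 0.0, -4.0)
ZERO = (0.0, 0.0, 0.0)
times = [0, 1, 2, 3, 4]  # unit time steps

# Constant world velocity: first differences in the image are all equal to M V.
uniform_track = [affine_project(M, b, world_position(X0, V, ZERO, t)) for t in times]
image_velocities = differences(uniform_track)

# Constant world acceleration: second differences in the image are all equal to M A.
accelerated_track = [affine_project(M, b, world_position(X0, V, A, t)) for t in times]
image_accelerations = differences(differences(accelerated_track))

print("image velocities:", image_velocities)
print("image accelerations:", image_accelerations)
```

With unit time steps, every first difference of the uniform track equals M V and every second difference of the accelerated track equals M A. The same check fails for a perspective camera, where the division by depth makes image motion non-uniform.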
Though points moving with independent uniform velocities or accelerations model many non-rigid motion conditions, it is desirable to have a technique that accommodates general non-linear motion. We observe that as a point moves in the world it traces out a contour, and the trajectory traced out by its projection in an image is the projection of that world contour. Thus, the contours traced out in different views correspond. So the problem of analysing the non-rigid motion of a point can be transformed into the problem of analysing the contour traced out by its projections in various views. When the non-rigid motion in the world is restricted to a plane, that is, when the motion is planar, the contours traced out in the views can be regarded as projections of a planar shape, the shape being the trajectory in the world. We discuss how planar shape recognition techniques can be used to recognize and analyse the contours.
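The correspondence between contours can be sketched for the planar case under the simplifying assumption that both cameras are affine (an assumption made here to keep the example short; it is not stated as the general setting of the work). A point on the world plane with plane coordinates (u, v) is O + u E1 + v E2, and an affine camera maps it to c + H (u, v), where c is the image of O and the columns of H are the images of E1 and E2. Two views of the same planar trajectory are therefore related by a single 2D affine map. All names and numbers below are hypothetical.

```python
import math


def affine_project(M, b, X):
    """Affine camera: x = M X + b, with M a 2x3 matrix and b a 2-vector."""
    return tuple(sum(M[r][c] * X[c] for c in range(3)) + b[r] for r in range(2))


def plane_to_image(M, b, O, E1, E2):
    """Return (H, c) with image point = H (u, v) + c for plane coordinates (u, v)."""
    c = affine_project(M, b, O)
    h1 = affine_project(M, (0.0, 0.0), E1)  # image of direction E1
    h2 = affine_project(M, (0.0, 0.0), E2)  # image of direction E2
    return [[h1[0], h2[0]], [h1[1], h2[1]]], c


def inverse_2x2(H):
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    return [[H[1][1] / det, -H[0][1] / det], [-H[1][0] / det, H[0][0] / det]]


def matmul_2x2(P, Q):
    return [[sum(P[r][k] * Q[k][c] for k in range(2)) for c in range(2)] for r in range(2)]


def apply_2x2(P, p):
    return (P[0][0] * p[0] + P[0][1] * p[1], P[1][0] * p[0] + P[1][1] * p[1])


# Hypothetical world plane (origin O, spanning directions E1, E2) and two affine cameras.
O, E1, E2 = (1.0, 1.0, 2.0), (1.0, 0.0, 1.0), (0.0, 1.0, -1.0)
M1, b1 = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.25]], (0.0, 0.0)
M2, b2 = [[0.8, 0.1, -0.3], [-0.2, 0.9, 0.4]], (5.0, -2.0)

# A non-linear planar trajectory: a figure-eight in plane coordinates.
samples = [2.0 * math.pi * k / 12 for k in range(12)]
trajectory = [tuple(O[i] + math.cos(t) * E1[i] + 0.5 * math.sin(2 * t) * E2[i]
                    for i in range(3)) for t in samples]

contour1 = [affine_project(M1, b1, P) for P in trajectory]
contour2 = [affine_project(M2, b2, P) for P in trajectory]

# The 2D affine map from view 1 to view 2: x2 = A12 (x1 - c1) + c2.
H1, c1 = plane_to_image(M1, b1, O, E1, E2)
H2, c2 = plane_to_image(M2, b2, O, E1, E2)
A12 = matmul_2x2(H2, inverse_2x2(H1))

predicted2 = []
for x1 in contour1:
    d = apply_2x2(A12, (x1[0] - c1[0], x1[1] - c1[1]))
    predicted2.append((d[0] + c2[0], d[1] + c2[1]))

max_error = max(math.hypot(p[0] - q[0], p[1] - q[1])
                for p, q in zip(predicted2, contour2))
print("largest transfer error:", max_error)
```

The transfer error is zero up to floating-point rounding, which is why shape descriptors that are invariant to 2D affine maps are a natural fit for matching such contours across affine views.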
We then combined the motion constraints with properties of a contour to devise a mechanism for recognizing a deforming contour whose boundary points move with independent uniform linear velocity or acceleration. Given two views of the deforming contour in a reference view, we can now recognize the same contour in any view at any time instant. We have also derived novel view-dependent parameterizations of the motion of the projections of points moving with uniform linear motion in the world, which enable us to arrive at view-independent constraints on the projections of points in motion that are simpler than the ones reported in the literature.
Sujit Kuthirummal, C. V. Jawahar and P. J. Narayanan - Constraints on Coplanar Moving Points, Proceedings of the European Conference on Computer Vision (ECCV), LNCS 3024, May 2004, Prague, Czech Republic, pp. 168--179.
Sujit Kuthirummal, C. V. Jawahar and P. J. Narayanan - Fourier Domain Representation of Planar Curves for Recognition in Multiple Views, Pattern Recognition, Vol. 37, No. 4, April 2004, pp. 739--754.
Sujit Kuthirummal, C. V. Jawahar and P. J. Narayanan - Algebraic Constraints on Moving Points in Multiple Views, Proceedings of the Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP), Dec. 2002, Ahmedabad, India, pp. 311--316.
Sujit Kuthirummal, C. V. Jawahar and P. J. Narayanan - Multiview Constraints for Recognition of Planar Curves in Fourier Domain, Proceedings of the Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP), Dec. 2002, Ahmedabad, India, pp. 323--328.
Sujit Kuthirummal, C. V. Jawahar and P. J. Narayanan - Video Frame Alignment in Multiple Views, Proceedings of the International Conference on Image Processing (ICIP), Sep. 2002, Rochester, NY, Vol. 3, pp. 357--360.
Sujit Kuthirummal, C. V. Jawahar and P. J. Narayanan - Planar Shape Recognition across Multiple Views, Proceedings of the International Conference on Pattern Recognition (ICPR), Aug. 2002, Quebec City, Canada, pp. 482--488.
Sujit Kuthirummal, C. V. Jawahar and P. J. Narayanan - Frame Alignment using Multi View Constraints, Proceedings of the National Conference on Communications (NCC), Jan. 2002, Mumbai, India, pp. 523--527.