Kiran Varanasi (homepage)
When geometric primitives such as curves and patterns appear in repetition over a per- spective image, they offer key information for recovering the real metric structure of the scene. This happens because multiplicity is equivalent to motion - a single image of such a scene is equivalent to multiple images taken from varying camera viewpoints. Multiplicity can manifest in the image through several forms - tiling (translational symmetry), reflection (bilateral symmetry), or rotation (point symmetry). In all these cases, it pays dividends to group these patterns into geometrically meaningful sets, i.e, into sets of patterns which pro- duce a uniform geometric constraint. For example, a unique vanishing point. Each of these patterns is defined in terms of some interest points and contour segments. It is conceivable that these points and contours are ill-identified, and this makes the task of geometric group- ing all the more challenging. However, if only the patterns are grouped robustly, the later tasks of 3D reconstruction and the estimation of pose can be handled with high accuracy.
Symmetrical patterns are commonplace in natural and man-made environments. Espe- cially in architectural scenes, these patterns abound in all types of variety. Several attempts have been being made by the research community to exploit this information. Success has been reported in several areas, particularly in the area of image based modeling and ren- dering (IBMR). In this thesis, we study the problem of geometric grouping of patterns, from the context of an interactive IBMR application. We will demonstrate how geometric grouping with minimal user interaction is useful towards improving the robustness of struc- ture recovery. We handle the problem of geometric grouping of planar patterns in all its generality - we do not presume a known period of repetition or a known template for the patterns. The only properties which we use for grouping the patterns are the geometric constraints (such as the vanishing line and the circular points) - properties that would be es- timated by the patterns themselves. Our algorithms facilitate the aggregation of geometric information from multiple sources in the image, and thus make robust rectification possible.
The principal contributions of our work are on three fronts - (1) A method for identify- ing interest points on a set of poorly identified and badly fragmented image contours (2) A greedy optimization approach for computing point correspondences through preservingthe coherence of spatial information (3) Ways to incorporate the information provided by user-input into the optimization process. Below, we discuss each of these issues in brief. In order to have a generic mechanism for describing patterns, we need to have a method for detecting interest points on the patterns reliably. The color / intensity information may not hold important cues for corner detection if the saliency of the required points is primarily geometric. A pattern in our case is represented using a collection of image con- tours, which could be badly fragmented and erroneous. We note that local shape properties such as derivatives at a point would be damaged badly by the operation of perspective projection. However, global shape properties such as the relative distance of points from the center of mass of the contour would not be badly affected, though not guaranteed to be preserved. We shall use these global properties and guardedly compute a set of interest points. We apply a neighborhood of a given size and suppress interest points which are not maximal in their neighborhoods. In the thesis, we demonstrate through experiments that this method identifies interest points reliably on a perspectively distorted image of the pattern. We compare our results with those provided by the method of Shi-Tomasi.
After the detection of interest points, we study the problem of pairing two sets of points uniquely with each other. We provide an effective solution for this problem by an opti- mization approach which tries to maximize the spatial coherence subject to a consensus on geometric constraints. Our method stands in contrast to classical methods which try to match points through the use of geometric invariants. Instead, we try to satisfy spatial co- herence between the point matches - which means the conservation of Euclidean properties such as angle and ratio of lengths. Though it is true that these Euclidean properties are no longer valid after a projective transform, we show that they can be preserved in an approxi- mate sense. Unlike invariant-based methods, our method has the added advantage of being robust to noise and outliers. We frame the optimization problem in a greedy setting and thus obtain a locally maximal solution. This produces a matching between the two point sets. When the replication (multiplicity) of the patterns is more than two, they are handled through generalizing our solution to the entire set of points. The model is solved using Levenberg Marquardt optimization. This is similar to the bundle-adjustment algorithm in stereo.
We study the problem of geometric grouping in the context of an interactive application. One of the major concerns for such an application would be to minimize the level of user- interaction and also to be able to tolerate errors at the micro-level. Previous methods of IBMR through plane-based rectification have required the user to provide information at the pixel level, in the form of parallel and perpendicular line segments. This method is not only error-prone but also extremely taxing for the user to provide. In our application, we provide new ways for the user to interact with the system and new means of incorporating his user-input into the optimization process. We demonstrate that the user input is indeed useful in reducing the combinatorial complexity of the algorithm by a large factor.
The method of geometric grouping of patterns has several applications. We discuss the direct application of an interactive image-based modeler. We provide references to the other applications - tracking, recognition, stereo etc in the future work section.
|Year of completion:||2006|