
Automatic Writer Identification and Verification using Online Handwriting


 Sachin Gupta (homepage)

Automatic person identification is one of the major concerns in this era of automation. It is not a new problem, however: society has long used several ways to authenticate a person's identity, such as signatures and possession of a document. With the advent of electronic communication media (the Internet), interactions are becoming increasingly automatic, and the problem of identity theft has become even more severe. Even the traditional modes of person authentication, possessions and knowledge, cannot solve this problem. Possessions include physical objects such as keys, passports, and smart cards. Knowledge is a piece of memorized information, such as a password, that is supposed to be kept secret. Knowledge- and possession-based methods focus on "what you know" or "what you possess" rather than "who you are". Because these methods cannot address the security concerns raised by the increasing automation of every field, biometrics research has gained significant momentum in the last decade. Biometrics refers to the authentication of a person using a physiological or behavioral trait that distinguishes the individual from others. Biometric authentication has several advantages over knowledge- and possession-based identification methods, including ease of use and non-repudiation. In this thesis, we address the problem of handwriting biometrics. Handwriting is a behavioral biometric, as it is generated as the consequence of an action performed by a person. Handwriting identification also has a long history: signatures (a specific instance of handwriting) have been used to authenticate legal documents for a long time.

This thesis addresses various problems related to automatic handwriting identification. To date, most writer identification is done manually, because much context-dependent information, such as the source of the documents and the nature of the handwriting, is difficult to model mathematically but can be easily analyzed by human experts. Still, an automatic handwriting analysis system is useful, as it can remove subjectivity from the process of handwriting identification and can provide expert advice in court cases. The final aim of this research is to design efficient algorithms for automatic feature extraction and recognition of the writer from a given handwritten document with as little human intervention as possible.

Specifically, we propose efficient solutions to three different applications of handwriting identification. First, we look at the problem of determining the authorship of an arbitrary piece of online handwritten text. We then analyze the discriminative information in online handwriting to propose an efficient and accurate approach to text-dependent writer verification for practical, low-security applications. Finally, we look at the problem of repudiation in handwritten documents for forensic document examination: after introducing the problem, we propose an algorithm for detecting repudiation in handwritten documents. Handwriting identification is quite different from handwriting recognition, the other popular sub-field of automatic handwriting analysis. Handwriting recognition tries to identify the content of a handwritten text and to minimize variations due to writing style. In handwriting identification, on the other hand, variations due to style are exactly what is sought.

 

Year of completion:  February 2008
 Advisor : Anoop M. Namboodiri

Related Publications

  • Sachin Gupta and Anoop M. Namboodiri - Text-Dependent Writer Verification Using Boosting, Proceedings of International Conference on Frontiers in Handwriting Recognition, Montreal, Canada, 2008. [PDF]

  • Sachin Gupta and Anoop M. Namboodiri - Repudiation Detection in Handwritten Documents, Proc. of the 2nd International Conference on Biometrics (ICB'07), pp. 356-365, Seoul, Korea, 27-29 August, 2007. [PDF]

  • Anoop M. Namboodiri and Sachin Gupta - Text Independent Writer Identification from Online Handwriting, International Workshop on Frontiers in Handwriting Recognition (IWFHR'06), October 23-26, 2006, La Baule, Centre de Congrès Atlantia, France. [PDF]


Downloads

thesis

 ppt

 

Word Hashing for Efficient Search In Document Image Collections


Anand Kumar

A large number of document image collections are now being scanned and made available over the Internet or in digital libraries. Effective access to such information sources is limited by the lack of efficient retrieval schemes. Text search methods require efficient and robust optical character recognizers (OCR), which are presently unavailable for Indian languages. Word spotting (word image matching) may instead be used to retrieve word images in response to a word image query. The approaches used for word spotting so far, dynamic time warping and/or nearest neighbor search, tend to be slow for large collections of books. Direct matching of images is inefficient because of the complexity of matching, and is thus impractical for large databases. In general, indexing and retrieval methods for document images cluster similar words and build indexes from the representatives of the clusters. Building such a clustering-based index takes a very long time, because the clustering step requires complex image matching procedures. This thesis solves the problem by directly hashing word image representations.

An efficient mechanism for indexing and retrieval in large document image collections is presented in this thesis. First, document images are segmented into words. Features are then computed at the word level and indexed. Word retrieval is done very efficiently with content-sensitive hashing (CSH), which uses an approximate nearest neighbor search technique called locality sensitive hashing (LSH). The word images are hashed into hash tables using the features computed at the word level. Content-sensitive hash functions are used to hash words such that the probability of grouping similar words in the same index of the hash table is high. The sub-linear time CSH scheme makes the search very fast without degrading accuracy. Experiments on a collection of books in Telugu by Kalidasa, the classical Indian poet of antiquity, demonstrate that word images may be searched in a few milliseconds. Hashing the words thus reduces the search time significantly and makes searching document image collections practical.
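The idea of hashing similar word features into the same bucket can be illustrated with a minimal sketch. It is not the thesis's content-sensitive hash function: it assumes each word image has already been reduced to a fixed-length feature vector, and it uses random-hyperplane hashing, one common LSH family, as a stand-in. The class name `HashIndex` and its parameters are hypothetical.

```python
import random
from collections import defaultdict


class HashIndex:
    """Toy LSH index over fixed-length feature vectors (illustrative only)."""

    def __init__(self, dim, num_tables=8, bits_per_key=12, seed=0):
        rng = random.Random(seed)
        # Assumption: random-hyperplane hashing; each table gets its own planes.
        self.planes = [
            [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits_per_key)]
            for _ in range(num_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(num_tables)]
        self.vectors = {}

    def _key(self, table_id, vec):
        # One bit per hyperplane: the side of the plane the vector falls on.
        return tuple(
            sum(p * v for p, v in zip(plane, vec)) >= 0.0
            for plane in self.planes[table_id]
        )

    def add(self, word_id, vec):
        """Store a word's feature vector in one bucket of every table."""
        self.vectors[word_id] = vec
        for t, table in enumerate(self.tables):
            table[self._key(t, vec)].append(word_id)

    def query(self, vec, top_k=5):
        """Collect bucket-mates from all tables, then rank them by distance."""
        candidates = set()
        for t, table in enumerate(self.tables):
            candidates.update(table.get(self._key(t, vec), ()))

        def squared_distance(word_id):
            return sum((a - b) ** 2 for a, b in zip(self.vectors[word_id], vec))

        return sorted(candidates, key=squared_distance)[:top_k]
```

Only the words that share a bucket with the query are compared exactly, which is where the speed-up over a linear scan comes from. Using several tables raises the chance that a truly similar word shares at least one bucket with the query.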

 

Year of completion:  2008
 Advisor : C. V. Jawahar & R. Manmatha

Related Publications

  • Anand Kumar, C.V. Jawahar & R. Manmatha - Efficient Search in Document Image Collections, Proceedings of 8th Asian Conference on Computer Vision (ACCV'07), Part I, LNCS 4843, pp. 586-595, Tokyo, Japan, 18-22 November, 2007. [PDF]

  • C.V. Jawahar and Anand Kumar - Content-level Annotation of Large Collection of Printed Document Images, Proc. of 9th International Conference on Document Analysis and Recognition, Brazil, 23-26 September, 2007. [PDF]

  • Anand Kumar, A. Balasubramanian, Anoop M. Namboodiri and C.V. Jawahar - Model-Based Annotation of Online Handwritten Datasets, International Workshop on Frontiers in Handwriting Recognition (IWFHR'06), October 23-26, 2006, La Baule, Centre de Congrès Atlantia, France. [PDF]

 


Downloads

thesis

 ppt

Proxy Based Compression of Depth Movies


Pooja Verlani (homepage)

Sensors for 3D data are common today. These include multicamera systems, laser range scanners, etc. Some of them are suitable for the real-time capture of the shape and appearance of dynamic events. The 2-1/2D model of an aligned depth map and image, called a Depth Image, has been popular for Image Based Modeling and Rendering (IBMR). Capturing the 2-1/2D geometric structure and photometric appearance of dynamic scenes is possible today. Time-varying depth and image sequences, called Depth Movies, can extend IBMR to dynamic events. The captured event contains aligned sequences of depth maps and textures, which are often streamed to a distant location for immersive viewing. Applications of such systems include virtual-space tele-conferencing, remote 3D immersion, 3D entertainment, etc. We study a client-server model for tele-immersion in which captured or stored depth movies are sent from a server to multiple remote clients on demand. Depth movies consist of dynamic depth maps and texture maps. Multiview image compression and video compression have been studied earlier, but there has been no study of dynamic depth map compression. This thesis contributes towards dynamic depth map compression for efficient transmission in a server-client 3D tele-immersive environment. Dynamic depth map data is heavy and needs efficient compression schemes. Immersive applications require time-varying sequences of depth images from multiple cameras to be encoded and transmitted. At the remote site of the system, the 3D scene is regenerated by rendering the whole scene. Depth movies of a generic 3D scene from multiple cameras thus become too heavy to send over the network, given the available bandwidth.


This thesis presents a scheme to compress depth movies of human actors using a parametric proxy model for the underlying action. We use a generic articulated human model as the proxy to represent the human in action, and the various joint angles of the model to parametrize the proxy at each time instant. The proxy represents a common prediction of the scene structure. The difference between the captured depth and the depth of the proxy is called the residue and is used to represent the scene, exploiting spatial coherence. A few variations of this algorithm are presented in this thesis. We experimented with bit-wise compression of the residues and analyzed the quality of the generated 3D scene. Differences in residues across time are used to exploit temporal coherence. Intra-frame coded frames and difference-coded frames provide random access and high compression. We show results on several synthetic and real actions to demonstrate the compression ratio and the resulting quality, using a depth-based rendering of the decoded scene. The performance achieved is quite impressive. We present the articulation fitting tool, the compression module with different algorithms, and the server-client system with several variants for the user. The thesis first explains the concepts of 3D reconstruction by image based rendering and modeling, compression of such 3D representations, and teleconferencing. It then proceeds to the concept of depth images and movies, followed by the main algorithms, examples, experiments and results.
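The residue idea described above can be sketched in a few lines. This is an illustration under stated assumptions, not the thesis's codec: depth maps are plain nested lists of integers, the proxy depth map is taken as given (fitting the articulated model is out of scope), and "bit-wise compression" is approximated by dropping low-order bits of the residue. All function names are hypothetical.

```python
def encode_frame(depth, proxy_depth, drop_bits=2):
    """Quantized residue between a captured depth map and the proxy's depth."""
    step = 1 << drop_bits  # assumption: discard the low-order bits of the residue
    return [
        [int(round((d - p) / step)) for d, p in zip(depth_row, proxy_row)]
        for depth_row, proxy_row in zip(depth, proxy_depth)
    ]


def decode_frame(residue, proxy_depth, drop_bits=2):
    """Rebuild an approximate depth map from the proxy depth and the residue."""
    step = 1 << drop_bits
    return [
        [p + r * step for r, p in zip(residue_row, proxy_row)]
        for residue_row, proxy_row in zip(residue, proxy_depth)
    ]


def temporal_delta(prev_residue, cur_residue):
    """Difference-coded frame: change in residue since the previous frame."""
    return [
        [c - q for c, q in zip(cur_row, prev_row)]
        for cur_row, prev_row in zip(cur_residue, prev_residue)
    ]


def apply_delta(prev_residue, delta):
    """Recover the current residue from the previous residue and the delta."""
    return [
        [q + d for q, d in zip(prev_row, delta_row)]
        for prev_row, delta_row in zip(prev_residue, delta)
    ]
```

When the proxy predicts the scene well, the residues are small numbers that an entropy coder can store compactly, and the per-frame deltas are smaller still when the motion is smooth. Frames coded with `encode_frame` alone play the role of intra-coded frames that allow random access.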

 

Year of completion:  2008
 Advisor : P. J. Narayanan

Related Publications

  • Pooja Verlani, P. J. Narayanan - Proxy-Based Compression of 2-1/2D Structure of Dynamic Events for Tele-immersive Systems, Proceedings of 3D Data Processing, Visualization and Transmission, June 18-20, 2008, Georgia Institute of Technology, Atlanta, GA, USA. [PDF]

  • Pooja Verlani, P. J. Narayanan - Parametric Proxy-Based Compression of Multiple Depth Movies of Humans, Proceedings of Data Compression Conference 2008, March 25 to March 27, 2008, Salt Lake City, Utah. [PDF]

  • Pooja Verlani, Aditi Goswami, P.J. Narayanan, Shekhar Dwivedi and Sashi Kumar Penta - Depth Image: representations and Real-time Rendering, Third International Symposium on 3D Data Processing, Visualization and Transmission, Chapel Hill, North Carolina, June 14-16, 2006. [PDF]

 


Downloads

thesis

 ppt

 

Projected Texture for 3D Object Recognition


Avinash Sharma (homepage)

Three dimensional objects are characterized by their shape, which can be thought of as the variation in depth over the object from a particular view point. These variations can be deterministic, as in the case of rigid objects, or stochastic, for surfaces containing a 3D texture. The depth variations are lost during the process of imaging, and what remains is the intensity variations induced by the shape and lighting, as well as focus variations. Algorithms that use 3D shape for classification try to recover the lost 3D information from the intensity or focus variations, or from additional cues such as multiple images or structured lighting. This process is computationally intensive and error prone. Once the depth information is estimated, the object must be characterized using shape descriptors for the purpose of classification. Image-based classification algorithms instead try to characterize the intensity variations of the image for recognition. As noted, the intensity variations are affected by the illumination and pose of the object, so such algorithms attempt to derive descriptors that are invariant to changes in lighting and pose. Although image-based classification algorithms are more efficient and robust, their classification power is limited because the 3D information is lost during the imaging process. Our problem is to find an image-based recognition method that uses the shape of the object without explicitly recovering it. This avoids the high computational cost of shape recovery while achieving high accuracy. The method should be robust to view variation and occlusion, and should also be invariant to the scale and position of the object. It should also handle partially specular and texture-less object surfaces. We propose the use of structured lighting patterns, which we refer to as projected texture, for the purpose of object recognition.
The depth variations of the object induce deformations in the projected texture, and these deformations encode the shape information. The primary idea is to view the deformation pattern as a characteristic property of the object and use it directly for classification, instead of trying to recover the shape explicitly. To achieve this, we need to use an appropriate projection pattern and derive features that sufficiently characterize the deformations. The patterns required can be quite different depending on the nature of the object shape and its variation across objects. Specifically, we look at three different recognition problems and propose appropriate projection patterns, deformation characterizations, and recognition algorithms for each. The first category consists of objects of fixed shape and pose, where minor differences in shape are used to discriminate between classes; 3D hand geometry recognition is taken as the example of this class. The second problem is category recognition of rigid objects from arbitrary view points, for which we propose a classification algorithm based on the popular bag-of-words paradigm for object recognition. The third problem is 3D texture classification, where the depth variation of the surface is stochastic in nature. We propose a set of simple texture features that can capture the deformations of projected lines on 3D textured surfaces. These approaches have been implemented, tested, and compared on various datasets, both collected by us and available on the Internet. The analysis and comparative results demonstrate significant improvement over existing approaches in terms of accuracy and robustness.

 

Year of completion:  2008
 Advisor : Anoop M. Namboodiri

Related Publications

  • Avinash Sharma and Anoop M. Namboodiri - Object Category Recognition with Projected Texture, IEEE Sixth Indian Conference on Computer Vision, Graphics & Image Processing (ICVGIP 2008), pp. 374-381, 16-19 Dec, 2008, Bhubaneswar, India. [PDF]

  • Visesh Chari, Avinash Sharma, Anoop M Namboodiri and C.V. Jawahar - Frequency Domain Visual Servoing using Planar Contours, IEEE Sixth Indian Conference on Computer Vision, Graphics & Image Processing (ICVGIP 2008), pp. 87-94, 16-19 Dec, 2008, Bhubaneswar, India. [PDF]

  • Avinash Sharma and Anoop M. Namboodiri - Projected Texture for Object Classification, Proceedings of the 10th European Conference on Computer Vision (ECCV 2008), 12-18 Oct, 2008, France. [PDF]

  • Avinash Sharma, Nishant Shobhit and Anoop M. Namboodiri - Projected Texture for Hand Geometry based Authentication, Proceedings of CVPR Workshop on Biometrics, 28 June, Anchorage, Alaska, USA. IEEE Computer Society 2008. [PDF]

 


Downloads

thesis

 ppt

 

 

Real Time Rendering of Implicit Surfaces on the GPU


Jag Mohan Singh

Generating visually realistic models is one of the core problems of computer graphics. Rasterization, or scan conversion of primitives such as triangles, is one method of rendering them. It suffers from an inexact representation, as the triangles are themselves an approximation of the underlying geometry. Ray tracing the primitives is another method of rendering. It delivers an exact representation of the underlying geometry and looks visually realistic. We therefore ray-trace implicit surfaces rather than polygonizing them. Programmable graphics processing units (GPUs) have high computational capability but relatively limited bandwidth for data access. A compact representation of geometry using a suitable procedural or mathematical model, combined with a ray-tracing mode of rendering, consequently fits GPUs well. An implicit surface can be represented as S(x,y,z) = 0, and the ray-dependent equation is F_f(t) = 0. Ray tracing S(x,y,z) = 0 amounts to computing the roots of F_f(t) = 0 for all the pixels on the screen. Analytical methods can be used for surfaces up to order 4. We compute the interval extension of a function exactly by evaluating the function at its points of maxima and minima and at the end points. Since we can compute the roots of functions up to order 4, we can compute the points of maxima and minima of functions up to order 5. We use interval arithmetic for surfaces up to order 5 using Mitchell's algorithm. Interval methods provide a robust way of isolating roots. The marching points algorithm marches along the ray in equal step sizes until a root is found, which is detected by a sign change in the function. Marching points wastes computation by evaluating the function at many points. The adaptive marching points algorithm marches adaptively to find the root.
Though only fourth or lower order surfaces can be rendered using analytical roots, our adaptive marching points algorithm can ray-trace arbitrary implicit surfaces exactly by sampling the ray at selected points until a root is found. Adapting the sampling step size based on a proximity measure and a horizon measure delivers high speed. The horizon measure helps in silhouette adaptation and provides good quality silhouettes. We also provide a Taylor test, which has flavours of interval arithmetic and helps in robust rendering of surfaces with the adaptive marching points algorithm. When evaluating the function, we never form the ray-dependent F_f(t) from the coefficients of t. We save a lot of computational overhead by evaluating S(x,y,z) directly instead, as there are O(d^3) coefficients for t, where d is the degree of the surface. Our method does not need the coefficients of t, which are expensive to compute; it needs only the value of S(x,y,z). The derivative F'_f(t) can also be calculated efficiently using the gradient of S() as grad(S(x, y, z)) dot D_f. The Barth decic can be evaluated using about 30 terms as S(x,y,z), but 1373 terms must be evaluated to compute all 11 coefficients of the tenth order polynomial F_f(t). We also render dynamic implicit surfaces, which vary with time. Overall, a simple algorithm that fits the SIMD architecture of the GPU results in high performance. We ray-trace algebraic surfaces up to order 18 and non-algebraic surfaces, including a Blinn's blobby with 30 spheres, at better than interactive frame rates. Our adaptive marching points algorithm is an ideal match for the SIMD model of the GPU because of the low computational cost required per operation. We use analytical methods for ray tracing surfaces up to order 4, achieving 3750 fps on a cubic surface and 1400 fps on a quartic surface. We use the robust Mitchell method on surfaces up to order 5 and achieve up to 400 fps on a torus quartic and 85 fps on a quintic surface.
Our adaptive marching points method renders high order implicit surfaces at interactive frame rates. We render a surface of order 18 at 158 fps. These experiments used an NVIDIA 8800 GTX at a resolution of 512x512. Our GPU Objects method renders the Bunny with 35,947 spheres at 57 fps, 99,130 spheres at 30 fps, and a hyperboloid with reflection and refraction at 300 fps. An NVIDIA 6600 GTX was used in the experiments related to GPU Objects, with a viewport of size 512x512.
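The basic marching points idea from the abstract (equal steps along the ray, root detected by a sign change, S evaluated directly rather than through polynomial coefficients in t) can be sketched on the CPU as follows. This is the plain fixed-step variant, not the adaptive algorithm: the proximity and horizon measures and the Taylor test are not reproduced. The bisection refinement, the unit-sphere example surface, and all function names are assumptions made for illustration.

```python
def sphere(x, y, z):
    """Example implicit surface S(x, y, z) = x^2 + y^2 + z^2 - 1 (unit sphere)."""
    return x * x + y * y + z * z - 1.0


def sphere_grad(x, y, z):
    """Gradient of the example surface."""
    return (2.0 * x, 2.0 * y, 2.0 * z)


def point_on_ray(origin, direction, t):
    """Point O + t * D on the ray."""
    return tuple(o + t * d for o, d in zip(origin, direction))


def ray_derivative(grad, origin, direction, t):
    """F'(t) = grad(S) . D, evaluated at the point O + t * D."""
    g = grad(*point_on_ray(origin, direction, t))
    return sum(gi * di for gi, di in zip(g, direction))


def march_ray(surface, origin, direction, t_near, t_far, steps=256, refine=40):
    """First t in [t_near, t_far] where S changes sign along the ray, or None."""

    def f(t):
        # Evaluate S at O + t * D directly; no coefficients of t are formed.
        return surface(*point_on_ray(origin, direction, t))

    dt = (t_far - t_near) / steps
    t_prev, f_prev = t_near, f(t_near)
    if f_prev == 0.0:
        return t_prev
    for i in range(1, steps + 1):
        t_cur = t_near + i * dt
        f_cur = f(t_cur)
        if f_cur == 0.0:
            return t_cur
        if f_prev * f_cur < 0.0:
            # Sign change: refine the bracket by bisection (an assumption here).
            lo, hi, f_lo = t_prev, t_cur, f_prev
            for _ in range(refine):
                mid = 0.5 * (lo + hi)
                f_mid = f(mid)
                if f_lo * f_mid <= 0.0:
                    hi = mid
                else:
                    lo, f_lo = mid, f_mid
            return 0.5 * (lo + hi)
        t_prev, f_prev = t_cur, f_cur
    return None
```

A fixed step can miss a pair of roots that fall between two samples, for example near silhouettes where the ray grazes the surface. That is the kind of case the adaptive step size and horizon measure in the abstract are meant to address. On a GPU, one such loop would run per pixel.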


 

Year of completion:  December 2008
 Advisor : P. J. Narayanan

Related Publications

  • Jag Mohan Singh and P. J. Narayanan - Real-Time Ray Tracing of Implicit Surfaces on the GPU, IEEE Transactions on Visualization and Computer Graphics, Vol. 16(2), pp. 261-272 (2010). [PDF]

  • Visesh Chari, Jag Mohan Singh and P. J. Narayanan - Augmented Reality using Over-Segmentation, Proceedings of National Conference on Computer Vision Pattern Recognition Image Processing and Graphics (NCVPRIPG'08), Jan 11-13, 2008, DA-IICT, Gandhinagar, India. [PDF]

  • Kedarnath Thangudu, Lakshmi Gade, Jag Mohan Singh and P. J. Narayanan - Point Based Representations for Hierarchical Environments, in International Conference on Computing: Theory and Applications (ICCTA), Kolkata, 2007. [PDF]

  • Jag Mohan Singh and P.J. Narayanan - Progressive Decomposition of Point Clouds Without Local Planes, 5th Indian Conference on Computer Vision, Graphics and Image Processing, Madurai, India, LNCS 4338, pp. 364-375, 2006. [PDF]

  • Sunil Mohan Ranta, Jag Mohan Singh and P.J. Narayanan - GPU Objects, 5th Indian Conference on Computer Vision, Graphics and Image Processing, Madurai, India, LNCS 4338, pp. 352-363, 2006. [PDF]


Downloads

thesis

 ppt

 

 
