Biological Vision


The perceptual mechanisms used by different organisms to negotiate the visual world are fascinatingly diverse. Even if we consider only the sensory organs of vertebrates, such as the eye, there is much variety. Several disciplines have approached the problem of investigating how sensory, motor and central visual systems function and are organised. The area of biological vision aims to build a computational understanding of various brain mechanisms. Synergy between biological and computer vision research can be found in low-level vision. Substantial insights about the processes for extracting colour, edge, motion and spatial frequency information from images have come from combining computational and neuro-physiological constraints. Understanding human perception/vision is regarded as an early step towards identifying objects and understanding scenes.

Work Undertaken


:: Towards Understanding Texture Processing :: 

A fundamental goal of texture research is to develop automated computational methods for retrieving visual information and understanding image content based on textural properties in images. A synergy between biological and computer vision research in low-level vision can give substantial insights about the processes for extracting color, edge, motion, and spatial frequency information from images. In this thesis, we seek to understand the texture processing that takes place in low-level human vision in order to develop new and effective methods for texture analysis in computer vision. The different representations formed by the early stages of the HVS, and the visual computations they carry out to handle various texture patterns, are of interest. Such information is needed to identify the mechanisms that can be used in texture analysis tasks. (more detail...)

:: Biologically Inspired Interest Point Operator ::
Interest point operators (IPOs) are used extensively for reducing computational time and improving the accuracy of several complex vision tasks such as object recognition or scene analysis. SURF, SIFT and Harris corner points are popular examples. Though a large number of IPOs exist in the vision literature, most of them rely on low-level features such as color and edge orientation, making them sensitive to degradation in the images.
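To illustrate how such a low-level operator works, here is a minimal Harris corner detector in plain numpy. This is a sketch of the classic formulation, not any specific library's implementation; the window radius and the constant k are conventional illustrative choices.

```python
import numpy as np

def box_filter(a, r=1):
    # (2r+1) x (2r+1) box blur, edge-padded, used to smooth the structure tensor
    p = np.pad(a, r, mode='edge')
    out = np.zeros_like(a, dtype=float)
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            out += p[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out / (2 * r + 1) ** 2

def harris_response(img, k=0.05):
    # structure tensor from image gradients; high response marks corners
    Iy, Ix = np.gradient(img.astype(float))
    Sxx = box_filter(Ix * Ix)
    Syy = box_filter(Iy * Iy)
    Sxy = box_filter(Ix * Iy)
    det = Sxx * Syy - Sxy * Sxy
    trace = Sxx + Syy
    return det - k * trace ** 2
```

Peaks of the response mark corner-like points; edges give a negative response because only one eigenvalue of the structure tensor is large there.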

Human vision systems (HVS) perform these tasks with seemingly little effort and are robust to such degradation, employing spatial attention mechanisms to reduce the computational burden. Extensive studies of these spatial attention mechanisms have led to several computational models (e.g. Itti, Koch). However, very few models have found successful applications in computer vision related tasks, partly owing to their prohibitive computational cost.

Computational attention systems have used either top-down or bottom-up information. Using both types of information is an attractive choice, since top-down knowledge is quite helpful particularly when images are degraded [Antonio Torralba]. Our work is focused on developing a robust biologically-inspired IPO capable of utilizing top-down knowledge. The operator will be tested as a feature detector/descriptor for a monocular visual SLAM system.

Antonio Torralba, Contextual Priming for Object Detection, IJCV, Vol. 53, No. 2, 2003, pp. 169--191.
Laurent Itti, Christof Koch, A saliency-based search mechanism for overt and covert shifts of visual attention, Vision Research, Vol. 40, 2000, pp. 1489--1506.

Current Ongoing Projects

  • Medical Image Reconstruction on Hexagonal Grid
  • Computational Understanding of Medical Image Interpretation by Experts

Related Publications

  • N.V. Kartheek Medathati, Jayanthi Sivaswamy - Local Descriptor based on Texture of Projections, Proceedings of the Seventh Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP'10), 12-15 Dec. 2010, Chennai, India. [PDF]

  • Gopal Datt Joshi, Saurabh Garg and Jayanthi Sivaswamy - Script Identification from Indian Documents, Proceedings of the IAPR Workshop on Document Analysis Systems (DAS 2006), Nelson, pp. 255-267. [PDF]

  • Gopal Datt Joshi, Saurabh Garg and Jayanthi Sivaswamy - A Generalised Framework for Script Identification, International Journal on Document Analysis and Recognition (IJDAR), 10(2), pp. 55-68, 2007. [PDF]

  • Gopal Datt Joshi and Jayanthi Sivaswamy - A Computational Model for Boundary Detection, 5th Indian Conference on Computer Vision, Graphics and Image Processing, Madurai, India, LNCS 4338 pp.172-183, 2006. [PDF]

  • Gopal Datt Joshi, and Jayanthi Sivaswamy - A Simple Scheme for Contour Detection, Proceedings of International Conference on Computer Vision and Applications (VISAP 2006), Setubal. [PDF]

  • L.Middleton and J. Sivaswamy, Hexagonal Image Processing, Springer Verlag, London, 2005, ISBN: 1-85233-914-4. [PDF]

  • Gopal Datt Joshi and Jayanthi Sivaswamy - A Multiscale Approach to Contour Detection, Proceedings of the International Conference on Cognition and Recognition, pp. 183-193, Mysore, 2005. [PDF]

  • L. Middleton and J. Sivaswamy - A Framework for Practical Hexagonal-Image Processing, Journal of Electronic Imaging, Vol. 11, No. 1, January 2002, pp. 104--114. [PDF]


Associated People

Depth-Image Representations


Depth images are viable representations that can be computed from the real world using cameras and/or other scanning devices. The depth map provides a 2.5D structure of the scene. The depth map gives a visibility-limited model of the scene and can be rendered easily using graphics techniques. A set of depth images can provide hole-free rendering of the scene; multiple views need to be blended to provide smooth, hole-free rendering. Such a representation of the scene is bulky and needs good algorithms for real-time rendering and efficient representation. A GPU-based algorithm can render large models represented using DIs in real time.
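Rendering a depth map from a new viewpoint reduces to a forward reprojection. The sketch below is a minimal numpy illustration that ignores occlusion, splatting and hole filling; `K`, `R`, `t` are the usual camera intrinsics and relative pose.

```python
import numpy as np

def reproject(depth, K, R, t):
    # back-project every pixel of a depth map to a 3D point, then project
    # the points into a new camera given by rotation R and translation t
    h, w = depth.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=-1).reshape(-1, 3).T
    pts = np.linalg.inv(K) @ pix * depth.reshape(-1)  # 3D points (camera frame)
    proj = K @ (R @ pts + t[:, None])                 # into the new camera
    return (proj[:2] / proj[2]).T.reshape(h, w, 2)    # new (x, y) per pixel
```

With an identity pose each pixel maps to itself; a non-trivial `R`, `t` shifts pixels and exposes the holes that a second depth image must fill.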

The image representation of the depth map may not lend itself nicely to standard image compression techniques, which are psychovisually motivated. The scene representation using multiple depth images contains redundant descriptions of common parts and can be compressed together. Compressing these depth maps using standard techniques such as LZW and JPEG, and comparing the quality of rendered novel views while varying the JPEG quality factor, gives a good trade-off analysis between quality and compression ratio. Multiview compression of texture images can be performed by exploiting the constraints between views such as disparity, the epipolar constraint, multilinear tensors, etc.

GPU Rendering & DI Compression


We aim at rendering big, complex scenes efficiently from their depth maps. Some of the features of the system are ::

  • The novel viewpoint is not restricted and can be anywhere in the scene, unlike view morphing.
  • The visibility-limited aspect of the representation provides several locality properties. A new view will be affected only by depths and textures in its vicinity.
  • Use of multiple depth images to fill in the hole regions created due to the lack of complete information in a single depth map.
  • Only valid views according to the thresholding angle are processed for rendering, thereby reducing the computation time.
  • The GPU algorithm achieves several times higher FPS than the CPU algorithm.
  • Frame buffer objects and Vertex Buffer objects improve on the performance and memory management of the rendering.
  • Resolution can be changed by subsampling the grid, thus reducing the number of primitives to be drawn.

The scene representation using multiple depth images contains redundant descriptions of common parts. Our compression methods aim at exploiting this redundancy for a compact representation. The various kinds of compression algorithms tried are ::

  • LZW Compression (lossless technique) is applied on depth maps using gzip
  • JPEG Compression :: Depth maps are compressed with various quality factors.
  • Quad Tree Based Compression :: If a block of an image/depth map has one particular value, it is stored as a single node in the tree.
  • MPEG compression :: All the frames are used to generate a movie sequence to get the encoded image.
  • Geometry Proxy Model :: An approximate description of the scene used to model the common, position-independent scene structure.
  • Progressive Compression :: Differences are added bit by bit progressively; this allows for smoother levels of detail.
  • Quality Levels :: Levels of Details (LODs) are varied to control the rendering time through the number of primitives or size of the model and texture.
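The quad-tree scheme above can be sketched in a few lines. This is an illustrative toy assuming a square power-of-two depth map stored as nested lists; a real coder would add quantization and entropy coding on top.

```python
def quadtree(depth, y0=0, x0=0, size=None, tol=0.0):
    # recursively split a square block of the depth map until its values
    # are (near-)uniform; uniform blocks collapse to a single leaf value
    if size is None:
        size = len(depth)
    vals = [depth[y][x] for y in range(y0, y0 + size)
                        for x in range(x0, x0 + size)]
    if size == 1 or max(vals) - min(vals) <= tol:
        return ('leaf', vals[0])
    h = size // 2
    return ('node',
            quadtree(depth, y0,     x0,     h, tol),
            quadtree(depth, y0,     x0 + h, h, tol),
            quadtree(depth, y0 + h, x0,     h, tol),
            quadtree(depth, y0 + h, x0 + h, h, tol))
```

Depth maps compress well this way because large planar regions collapse to single leaves, and `tol` trades reconstruction error for tree size.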

Related Publication

  • Pooja Verlani, Aditi Goswami, P.J. Narayanan, Shekhar Dwivedi and Sashi Kumar Penta - Depth Images: Representations and Real-Time Rendering, Third International Symposium on 3D Data Processing, Visualization and Transmission, Chapel Hill, North Carolina, June 14-16, 2006. [PDF]

  • Sashi Kumar Penta and P. J. Narayanan, - Compression of Multiple Depth-Maps for IBR, The Visual Computer, International Journal of Computer Graphics, Vol. 21, No.8-10, September 2005, pp. 611--618. [PDF]

  • P. J. Narayanan, Sashi Kumar P and Sireesh Reddy K, Depth+Texture Representation for Image Based Rendering, Proceedings of the Indian Conference on Vision, Graphics and Image Processing(ICVGIP), Dec. 2004, Calcutta, India, pp. 113--118. [PDF]

Associated People

  • Pooja Verlani
  • Aditi Goswami
  • Naveen Kumar
  • Saurabh Aggrawal
  • Shekhar Dwivedi
  • Sireesh Reddy K
  • Sashi Kumar Penta
  • Prof. P. J. Narayanan

Handwriting Analysis


The work in handwriting analysis at CVIT concentrates on Recognition, Synthesis, Annotation, Search, and Classification of handwritten data. We primarily concentrate on online handwriting, where the temporal information of the writing process is available in the handwritten data, although many of the approaches we use are extensible to offline handwriting as well. Specifically, recognition of online handwriting in Indian languages has special significance, as it can form an effective mechanism of data input, as opposed to keyboards that need multiple keystrokes and control sequences to input many characters.

Handwriting Synthesis

Handwriting synthesis is the problem of generating data close to how a human would write the text. The characteristics of the generated data could be those of a specific writer or those from a generic model. Synthesis of handwriting poses a challenge, as writer-specific features need to be captured and preserved, yet at the same time, variability in handwriting should also be taken into account. Even with a given model, synthesis should not be deterministic, since the variations found in human handwriting are stochastic.

Applications of handwriting synthesis include automatic creation of personalized handwritten documents, generation of large amounts of annotated handwritten data for training recognition engines, and writer-independent matching and retrieval of handwritten documents.

For synthesis of Indic scripts, we model the handwriting at two levels. A stroke level model is used to capture the writing style and hand movements of the individual strokes. A space-time layout model is then used to arrange the synthesized strokes to form the words. Both the stroke model and the layout model can be learned from examples, and the method can learn from a single example, as well as a large collection to capture the variations. The model also allows us to synthesize the words in multiple Indic scripts through transliteration.
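The non-deterministic stroke-level generation can be sketched as sampling around a learned mean trajectory. In this toy, Gaussian jitter stands in for the learned stroke-variation model, and `jitter` is a hypothetical parameter, not one from the actual system.

```python
import random

def synthesize_stroke(mean_stroke, jitter=0.05, seed=None):
    # sample a new stroke around a mean (x, y) trajectory; repeated calls
    # give the stochastic variation expected of human handwriting
    rng = random.Random(seed)
    return [(x + rng.gauss(0, jitter), y + rng.gauss(0, jitter))
            for x, y in mean_stroke]
```

A layout model would then place such synthesized strokes in space and time to form words.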

Annotation and Search of Handwritten Data

Annotation of handwriting is the process of labeling input data for training in a variety of handwriting analysis problems such as handwriting recognition and writer identification. However, manual annotation of large datasets is a tedious, expensive, and error-prone process, especially at the character and stroke level. The lack of proper linguistic resources in the form of annotated data sets is a major hurdle in building recognizers for many languages.

In many practical situations, a plain transcript of the handwritten data is available, which can be used to make the process of annotation easier. Data collection can be carried out in different settings: unrestricted data, designed text, dictation, and data generation (using handwriting synthesis). A parallel text is available in all the above cases except that of unrestricted data. For annotation, we use the model-based handwriting synthesis unit described above to map the text corpora to handwriting space, and annotation is propagated to the word and character levels using elastic matching of handwriting. Stroke-level annotation for online handwriting recognition is currently being done using semi-automatic tools.
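Elastic matching of the kind used to propagate labels is commonly implemented with dynamic time warping; here is a minimal version for 1-D feature sequences. This is illustrative only: the actual features and matcher used in the annotation pipeline may differ.

```python
import math

def dtw_distance(a, b):
    # dynamic time warping: align two sequences elastically and return
    # the minimal cumulative matching cost over all monotone alignments
    n, m = len(a), len(b)
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # insertion
                                 D[i][j - 1],      # deletion
                                 D[i - 1][j - 1])  # match
    return D[n][m]
```

Because the alignment is elastic, a handwritten word and its synthesized counterpart can be matched even when written at different speeds.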

Online Handwriting Recognition


We aim at building robust and accurate recognition engines for Indian languages, specifically for Hindi, Telugu and Malayalam. There are some features of Indian languages and their writing which necessitate a different approach to recognition as compared to English:

  • The primary unit of words is an akshara, which is a combination of multiple consonants ending in a vowel.
  • Each language has a very large set of aksharas - usually multiple thousands
  • Each akshara is composed of a single or multiple strokes, and no partial strokes.

A robust and accurate recognition system poses a variety of research challenges, especially for Indian languages. We concentrate on a variety of problems, such as building large-class hierarchical classifiers (specifically for handwriting recognition and OCR), discriminative classifiers for differentiating similar-looking time-series data (strokes), compact representation of class models, efficient spell checkers for languages with a large number of word-form variations, etc.
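The large-class problem motivates a two-stage design: a coarse classifier first picks a group, and a finer classifier then discriminates only within that group. The schematic below uses placeholder classifier functions, not the actual models.

```python
def hierarchical_classify(x, group_clf, subclass_clfs):
    # stage 1: coarse group decision; stage 2: discriminate only among
    # that group's classes, avoiding a flat N-way comparison
    g = group_clf(x)
    return subclass_clfs[g](x)

# toy instantiation: group by sign, then refine by magnitude
group_clf = lambda x: 'pos' if x >= 0 else 'neg'
subclass_clfs = {
    'pos': lambda x: 'pos-large' if x > 10 else 'pos-small',
    'neg': lambda x: 'neg-large' if x < -10 else 'neg-small',
}
```

With thousands of aksharas, each stage only has to separate a small, confusable subset, which is the point of the hierarchy.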

Writer Identification

Writer identification is the process of identifying the authorship of handwritten documents. The relevance of a document in civil and criminal litigation depends primarily on our ability to assign authorship to that document.

Related Publications

  • Anoop M. Namboodiri and Sachin Gupta - Text Independent Writer Identification from Online Handwriting, International Workshop on Frontiers in Handwriting Recognition (IWFHR'06), October 23-26, 2006, La Baule, Centre de Congrès Atlantia, France. [PDF]

  • Anand Kumar, A. Balasubramanian, Anoop M. Namboodiri and C.V. Jawahar - Model-Based Annotation of Online Handwritten Datasets, International Workshop on Frontiers in Handwriting Recognition (IWFHR'06), October 23-26, 2006, La Baule, Centre de Congrès Atlantia, France. [PDF]

  • Karteek Alahari, Satya Lahari Putrevu and C.V. Jawahar - Learning Mixtures of Offline and Online Features for Handwritten Stroke Recognition, Proc. 18th IEEE International Conference on Pattern Recognition(ICPR'06), Hong Kong, Aug 2006, Vol. III, pp.379-382. [PDF]

  • C. V. Jawahar and A. Balasubramanian - Synthesis of Online Handwriting in Indian Languages, International Workshop on Frontiers in Handwriting Recognition (IWFHR'06), October 23-26, 2006, La Baule, Centre de Congrès Atlantia, France. [PDF]

  • Karteek Alahari, Satya Lahari P and C. V. Jawahar - Discriminant Substrokes for Online Handwriting Recognition, Proceedings of Eighth International Conference on Document Analysis and Recognition(ICDAR), Seoul, Korea 2005, Vol 1, pp 499-503. [PDF]

  • A. Bhaskarbhatla, S. Madhavanath, M. Pavan Kumar, A. Balasubramanian, and C. V. Jawahar - Representation and Annotation of Online Handwritten Data, Proceedings of the International Workshop on Frontiers in Handwriting Recognition(IWFHR), Oct. 2004, Tokyo, Japan, pp. 136--141. [PDF]

  • Pranav Reddy and C. V. Jawahar, The Role of Online and Offline Features in the Development of a Handwritten Signature Verification System, Proceedings of the National Conference on Document Analysis and Recognition(NCDAR), Jul. 2001, Mandya, India, pp. 85--94. [PDF]


 Associated People

  • A. Balasubramanian
  • Naveen Chandra Tewari
  • Anurag Mangal
  • Anil Gavini
  • Kartheek Alahari
  • Sachin Gupta
  • Geetika Katragadda
  • Anubhaw Srivastava
  • Anand Kumar
  • Amit Sangroya
  • Haritha Bellam
  • Rama Praveen

The Garuda: A Scalable, Geometry Managed Display Wall


Cluster-based tiled display walls simultaneously provide high resolution and a large display area (focus + context) and are suitable for many applications. They are also cost-effective and scalable, with low incremental costs. Garuda is a client-server display wall solution designed to use off-the-shelf graphics hardware and a standard Ethernet network. Garuda uses an object-based scene structure represented using a scene graph. The server determines the objects visible to each display tile using a novel adaptive algorithm that culls an object hierarchy to a frustum hierarchy. Required parts of the scene graph are transmitted to the clients, which cache them to exploit inter-frame redundancy in the scene. A multicast-based protocol is used to transmit the geometry, exploiting the spatial redundancy present especially on large tiled displays. A geometry-push philosophy from the server helps keep the clients in sync with one another. No node, including the server, needs to render the entire environment, making the system suitable for interactive rendering of massive models. Garuda is built on the OpenSceneGraph (OSG) system and can transparently render any OSG-based application to a tiled display wall without any modification by the user.
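The hierarchy-against-hierarchy culling can be sketched in one dimension: a whole subtree is rejected for a whole set of tiles as soon as its bounds miss all of them. This toy uses interval "frusta"; the actual algorithm works with 3D bounding volumes and a tree of tile frusta.

```python
def overlaps(a, b):
    # open-interval overlap test for 1-D extents (lo, hi)
    return a[0] < b[1] and b[0] < a[1]

def cull(node, tiles, out=None):
    # node: ('obj', (lo, hi), obj_id) or ('group', (lo, hi), [children])
    # tiles: [(tile_id, (lo, hi)), ...] acting as per-tile frusta
    if out is None:
        out = {tid: [] for tid, _ in tiles}
    kind, bounds = node[0], node[1]
    # prune: keep only the tiles this subtree can possibly intersect
    live = [(tid, ext) for tid, ext in tiles if overlaps(bounds, ext)]
    if not live:
        return out  # subtree invisible in every remaining tile
    if kind == 'obj':
        for tid, _ in live:
            out[tid].append(node[2])
    else:
        for child in node[2]:
            cull(child, live, out)
    return out
```

Because the tile list shrinks as the recursion descends, most object/tile pairs are never tested individually, which is what gives the sub-linear scaling.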


The Garuda System provides ::

  • Cluster based large display solution: Low in cost, easier to maintain and scale than Monolithic solutions.
  • Focus and Context: Renders large scenes with high detail; the user can see the entirety of the scene along with fine details.
  • Driven from graphics, not images: supports interactive applications, and scales not only in display size but also in resolution.
  • Parallel Rendering: Distributed rendering capabilities of a cluster used for parallel rendering. No individual system has to render the entire scene.
  • Transparent Rendering for OSG: More applicability, any Open Scene Graph application can be rendered to the system without any modification.
  • Capability to handle dynamic scenes: The system can handle dynamic OSG environment rendering at interactive frame rates.
  • Low Network Load: The system has very low network requirements owing to a Server push philosophy and caching at clients.

Features of The Garuda System ::

  • Scalable to large tile configurations, up to 7x7 tile sizes on single server. A hierarchy of servers can be used to support even larger tile sizes.
  • Caching at the clients and use of multicast keep the network requirements low; the system can handle huge tile configurations on a single 100 Mbps network.
  • Using a novel culling algorithm, the system scales sub-linearly to arbitrarily large tile configurations. Please see: Adaptive Culling Algorithm for details.
  • No recompilation or relinking of OSG code necessary for rendering to a display wall.
  • Using distributed rendering the system can render massive models which could not be rendered at interactive frame rates using a single machine.



Related Publications


  • Nirnimesh, Pawan Harish and P. J. Narayanan - Garuda: A Scalable, Tiled Display Wall Using Commodity PCs, IEEE Transactions on Visualization and Computer Graphics (TVCG), Vol. 13, No. 5, pp. 864-877, 2007. [PDF]

  • Nirnimesh, Pawan Harish and P. J. Narayanan - Culling an Object Hierarchy to a Frustum Hierarchy, 5th Indian Conference on Computer Vision, Graphics and Image Processing, Madurai, India, LNCS 4338, pp. 252-263, 2006. [PDF]

Associated People

Biometric Authentication


Biometrics deals with recognizing people based on their physiological or behavioral characteristics. Our work primarily concentrates on three different aspects in biometrics:

  • Enhancing Weak Biometrics for Authentication: Weak biometrics (hand-geometry, face, voice, keystrokes) are traits that possess low discriminating content and change over time for each individual. However, weak biometrics have several properties, such as social acceptability, ease of sensing, and lack of privacy concerns, that make them ideally suited for civilian applications. The methods we developed can effectively handle the problems of low discriminative power and low feature stability of weak biometrics, as well as time-varying population in civilian applications.
  • Writer Identification from Handwritten Documents: Handwriting is a behavioural biometric that contains distinctive traits acquired by a person over time. Traditional approaches to writer identification try to compute feature vectors that capture traits of handwriting known to experts as discriminative. In contrast, we concentrate on automatic extraction of features suited to specific applications, such as writer identification in the civilian domain and problems such as forgery and repudiation in forensics.
  • Use of Camera as a Biometric Sensor: Cameras have been used for capturing face images for authentication in the past. However, for biometric traits such as fingerprints and iris, a specialized sensor is often preferred due to the high quality of data it provides. Recent advances in image sensors have made digital cameras both inexpensive and technically capable of achieving high-quality images. However, many problems such as variations in pose, illumination and scale restrict the use of cameras as sensors for many biometric traits. We are working on the use of models of the imaging process to overcome these problems and to capture high-quality data for authentication.

Enhancing Weak Biometric based Authentication


Weak biometrics (hand-geometry, face, voice, keystrokes) are traits which possess low discriminating content and change over time for each individual. Thus they yield lower system accuracy compared to strong biometrics (e.g. fingerprints, iris, retina, etc.). However, due to exponentially decreasing costs of hardware and computation, biometrics has found immense use in civilian applications (time and attendance monitoring, physical access to buildings, human-computer interfaces, etc.) beyond forensics (e.g. criminal and terrorist identification). Various factors need to be considered while selecting a biometric trait for a civilian application, the most important of which are related to user psychology, acceptability, affordability, etc. For these reasons, weak biometric traits are often better suited for civilian applications than strong biometric traits. In this project, we address issues such as the low and unstable discriminating information present in weak biometrics, and variations in the user population in civilian applications.

Due to the low discriminating content of weak biometric traits, they show poor performance during verification. We have developed a novel feature selection technique called Single Class Hierarchical Discriminant Analysis (SCHDA), specifically for authentication in biometric systems. SCHDA builds an optimal user-specific discriminant space for each individual, in which the samples of the claimed identity are well separated from the samples of all other users.
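SCHDA itself is more involved, but the core idea of a user-specific discriminant space can be illustrated with a Fisher-style one-vs-rest direction. This is a simplified stand-in for illustration, not the SCHDA algorithm.

```python
import numpy as np

def user_discriminant(claimed, others, eps=1e-6):
    # direction that best separates one user's samples (the claimed
    # identity) from everyone else's; eps regularizes the scatter matrix
    mu_c, mu_o = claimed.mean(axis=0), others.mean(axis=0)
    Sw = np.cov(claimed.T) + np.cov(others.T) + eps * np.eye(claimed.shape[1])
    w = np.linalg.solve(Sw, mu_c - mu_o)
    return w / np.linalg.norm(w)
```

Projecting a test sample onto `w` gives a 1-D score on which a per-user acceptance threshold can be set.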

The second problem leading to low authentication accuracy is the poor stability or permanence of weak biometric traits, due to various reasons (e.g. ageing, the person gaining or losing weight, etc.). Civilian applications usually operate in a cooperative or monitored mode, wherein users can give feedback to the system on the occurrence of any errors. An intelligent adaptive framework is used, which uses this feedback to incrementally update the parameters of the feature selection and verification framework for each individual.

The third factor explored to improve the performance of an authentication system for civilian applications is the pattern of participation of each enrolled user. As new users are enrolled into the system, a degradation in performance is observed due to the increasing number of users. An interesting observation is that although the number of users enrolled into the system is very high, the number of users who regularly participate in the authentication process is comparatively low. We model the variation in the participating population using Markov models. The prior probability of participation of each individual is computed and incorporated into the feature selection framework, giving more relevance to the parameters of regularly participating users. Both structured and unstructured modes of variation in participation are explored.
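For instance, if each user is modeled as a two-state (active/inactive) Markov chain, the stationary probability of the active state gives a participation prior. This is a minimal illustration; the models actually used in the work may differ.

```python
def participation_prior(p_stay_active, p_return):
    # stationary P(active) of a 2-state Markov chain with
    # P(active -> active) = p_stay_active and
    # P(inactive -> active) = p_return; follows from balancing the
    # flow between states: pi_a * (1 - p_stay_active) = pi_i * p_return
    p_leave = 1.0 - p_stay_active
    return p_return / (p_return + p_leave)
```

A user who rarely leaves the active state, or returns quickly, gets a prior near 1 and hence more weight in the feature selection.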

Text Independent Writer Identification from Online Handwriting

Handwriting individuality is a quantitative measure of writer-specific information that can be used to identify the authorship of documents, through the study and comparison of writing habits and the evaluation of the significance of their similarities and differences. It is a discriminative process, like fingerprint identification, firearms identification and DNA analysis. Individuality in handwriting lies in the habits that are developed and become consistent to some degree in the process of writing.

Discriminating elements of handwriting lie in various factors such as 1) arrangement, connections, constructions, design, dimensions, slant or slope, spacings, and class and choice of allographs; 2) language styles such as abbreviations, commencements and terminations, diacritics and punctuation, line continuity, and line quality or fluency; 3) physical traits such as pen control, pen hold, pen position, pen pressure and writing movement; 4) consistency or natural variations and persistence; and 5) lateral expansion and word proportions.

The framework that we utilize tries to capture the consistent information at various levels and automatically extract discriminative features from them.

Features of our Approach:

  • Text-independent algorithm: The writer can be identified from any text in the underlying script; feature comparison is not restricted to similar characters.
  • Script-dependent framework: Applicability is verified on different scripts such as Devanagari, Arabic, Roman, Chinese and Hebrew.
  • Use of online information: Online data is used for verification. Offline information is also applicable within a similar framework, with appropriate changes in feature extraction.
  • Authentication with a small amount of data: With around 12 words in Devanagari, we achieve an accuracy of 87%.



Underlying process of identification:

(Figure: velocity profile used to represent a stroke)
  • Primitive Definition:

    Primitives are the discriminative features of handwritten documents. The first step is to identify primitives. Primitives can be individuality features such as the size, shape, and distribution of curves in a handwritten document. We choose sub-character level curves as basic primitives.

  • Extraction and Representation of primitive:

    Primitives are extracted using the velocity profile of the stroke, as shown in the figure. Minimum-velocity points are the critical points of a primitive. Primitives are represented using size and shape features, as shown in the diagram.

  • Identification of Consistent Primitives:

    Repeating curves are consistent primitives. To extract consistent curves, an unsupervised clustering algorithm is used to cluster them into different groups.

  • Classification:

    Variations in the distribution, size and shape of curves in each cluster are used to discriminate a writer from other writers.
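The primitive-extraction step above can be sketched as segmenting a stroke at local minima of its speed profile. This is a toy on uniformly timed samples; real strokes would need resampling and smoothing first.

```python
import math

def velocity_minima(points):
    # points: uniformly timed (x, y) samples of one pen stroke
    # speed between consecutive samples
    v = [math.hypot(x1 - x0, y1 - y0)
         for (x0, y0), (x1, y1) in zip(points, points[1:])]
    # local speed minima mark candidate primitive boundaries
    return [i for i in range(1, len(v) - 1)
            if v[i] < v[i - 1] and v[i] <= v[i + 1]]
```

Cutting the stroke at the returned indices yields the sub-character curves that are then clustered and compared across writers.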

Related Publications

  • Vandana Roy and C. V. Jawahar - Modeling Time-Varying Population for Biometric Authentication, International Conference on Computing: Theory and Applications (ICCTA), Kolkata, 2007. [PDF]

  • Anoop M. Namboodiri and Sachin Gupta - Text Independent Writer Identification from Online Handwriting, International Workshop on Frontiers in Handwriting Recognition (IWFHR'06), October 23-26, 2006, La Baule, Centre de Congrès Atlantia, France. [PDF]

  • Vandana Roy and C. V. Jawahar - Hand-Geometry Based Person Authentication Using Incremental Biased Discriminant Analysis, Proceedings of the National Conference on Communication (NCC 2006), Delhi, January 2006, pp. 261-265. [PDF]

  • Vandana Roy and C. V. Jawahar, - Feature Selection for Hand-Geometry based Person Authentication, Proceedings of the Thirteenth International Conference on Advanced Computing and Communications, Coimbatore, December 2005. [PDF]


Associated People