
Multiple View Geometry Applications to Robotic and Visual Servoing


Visesh Chari (homepage)

Computer Vision may be described as the process of scene understanding through the analysis of images captured by a camera. Scene understanding has several aspects, and this makes computer vision a vibrant and active field of research. For example, one section of computer vision research concentrates on understanding the inherent characteristics of an object (making identifications like "this is a face" or "this is a car"). Another branch focuses on answering questions like "find the given face in this image" or "find where cars occur in this image". Yet another, more primitive branch concerns itself with estimating the geometry of the scene: it answers questions like "what is the shape of this face" or "how would this car look from that viewpoint". This branch and its various derivatives come under the name Multiple View Geometry.

Technically, Multiple View Geometry (MVG) concerns itself with the geometric interaction between the 3D world and the images captured by a camera, and with the interpretation and manipulation of this information for various tasks. Research in MVG spans roughly two decades and borrows heavily from the related field of photogrammetry. Over these years, many algorithms have been proposed for estimating geometric quantities such as the transformation between cameras viewing a scene, or the 3D structure of an object viewed by multiple cameras. The field has matured recently, with focus shifting towards producing globally optimal estimates of geometric quantities like transformations and structure, and towards analyzing cases where the problem of geometric inference or manipulation is NP-Complete. Even before this maturity, many MVG algorithms found applications: the simple mosaicing feature available in today's digital cameras owes its origin to one such algorithm.
Applications have also been spawned in areas like animation for films, robot motion in automated surgery and industrial environments, and security systems that employ hundreds of cameras.
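The mosaicing application mentioned above rests on estimating a planar homography between two views from point correspondences. As an illustrative sketch (not code from the thesis; the function name and the four-point example are ours), the standard Direct Linear Transform can be written as:

```python
import numpy as np

def estimate_homography(src, dst):
    """Direct Linear Transform: estimate the 3x3 homography H
    mapping src points to dst points (both lists of (x, y), N >= 4)."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # H is the null vector of A: the right singular vector with the
    # smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]  # normalize so that H[2, 2] == 1

# Four corners of a unit square mapped by a known translation (2, 3).
src = [(0, 0), (1, 0), (1, 1), (0, 1)]
dst = [(2, 3), (3, 3), (3, 4), (2, 4)]
H = estimate_homography(src, dst)
```

Warping one image by H and compositing it over the other is the essence of two-image mosaicing.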


This thesis focuses on the application front of Multiple View Geometry, which has started gaining popularity. To this end, we leverage concepts from MVG to develop new frameworks and algorithms for a variety of problems.

For this reason, we choose to explore the use of MVG in various robotics and computer vision tasks in this thesis. We first propose a tracking framework that utilizes cues like texture and edges to track 2D and 3D objects across views of a scene. Tracking refers to the task of estimating the location and orientation of an object with respect to a pre-defined world coordinate system. Traditionally, filters like the Kalman filter and its variants have been used for tracking. Problems like illumination change and occlusion have affected many of these algorithms, which make assumptions such as uniform intensity of objects across views. We show that by embedding MVG into tracking algorithms, we can achieve efficient tracking of objects that is robust to large changes in perspective, illumination and occlusion. A by-product is the estimation of the pose of the camera, which is itself useful for tasks like localization in a mobile environment.
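As background for the filtering baseline mentioned above, a minimal constant-velocity Kalman filter in one dimension can be sketched as follows (an illustration of the classical filter, not the tracker proposed in the thesis; the noise values are ours):

```python
import numpy as np

# State = [position, velocity]; we observe noisy positions only.
F = np.array([[1.0, 1.0], [0.0, 1.0]])   # state transition (dt = 1)
H = np.array([[1.0, 0.0]])               # we measure position only
Q = 1e-4 * np.eye(2)                     # process noise covariance
R = np.array([[0.25]])                   # measurement noise covariance

x = np.array([0.0, 0.0])                 # initial state estimate
P = np.eye(2)                            # initial state covariance

for z in [1.0, 2.1, 2.9, 4.2, 5.0]:     # noisy positions of a moving object
    # Predict
    x = F @ x
    P = F @ P @ F.T + Q
    # Update
    y = z - H @ x                        # innovation
    S = H @ P @ H.T + R                  # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
    x = x + K @ y
    P = (np.eye(2) - K @ H) @ P
```

After the five updates the state estimate is close to position 5 with velocity near 1, matching the underlying motion. Intensity-based trackers built on such filters are exactly the ones that suffer under illumination change and occlusion.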


Then we show an application of frequency-domain MVG to the task of robot positioning. Positioning (or visual servoing) is the task of making a robot assume a desired pose with respect to an object of interest, with the help of a camera. This object might be a heart, as in surgery, or an automobile part, as in industrial settings. We show that by using frequency-domain techniques in MVG, we can devise algorithms that require only rough correspondence between images, unlike earlier algorithms that needed specific point-to-point correspondences. This is further developed into a general servoing framework capable of straight Cartesian paths and path following, which are problems of recent interest in servoing.
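The appeal of frequency-domain methods is that a global transformation shows up in the Fourier phase, so no point matches are needed. As a minimal illustration (phase correlation for a pure integer translation; the thesis handles richer planar motions), the Fourier shift theorem gives:

```python
import numpy as np

def phase_correlation(a, b):
    """Recover the integer translation taking b to a using the Fourier
    shift theorem; no point-to-point correspondences are required."""
    A, B = np.fft.fft2(a), np.fft.fft2(b)
    cross = A * np.conj(B)
    cross /= np.abs(cross) + 1e-12        # keep only the phase
    corr = np.fft.ifft2(cross).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    return dy, dx

rng = np.random.default_rng(0)
img = rng.random((64, 64))
shifted = np.roll(img, shift=(5, 9), axis=(0, 1))
shift = phase_correlation(shifted, img)   # recovers (5, 9)
```

The correlation peak lands at the translation, however textured the images are, which is why only "rough" alignment between views is needed to bootstrap such methods.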


Within computer vision, we explore the use of MVG for various image and video editing tasks. Tasks like removing a scene object from a video in a consistent manner fall in this category (predicting how the video would look without the object). In this area, we propose an algorithm for video inpainting, in which specific objects are removed from a video and the resulting space-time holes are filled in a consistent manner. The algorithm is fully automatic, unlike traditional image and video inpainting algorithms, and takes two functions as input: one defines the object to be removed, and the other defines the background model used for hole-filling.
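A toy version of this mask-and-fill formulation makes the two inputs concrete (this sketch uses a temporal-median background model of our own choosing, purely to illustrate the interface; the thesis' algorithm is more sophisticated):

```python
import numpy as np

rng = np.random.default_rng(1)
video = np.full((10, 8, 8), 100.0)           # static background, 10 frames
video += rng.normal(0, 1, video.shape)       # sensor noise
masks = np.zeros_like(video, dtype=bool)     # the "object function"
for t in range(10):                          # a 2x2 "object" moving right
    video[t, 3:5, t % 6:t % 6 + 2] = 255.0
    masks[t, 3:5, t % 6:t % 6 + 2] = True

# The "background model function": per-pixel temporal median, robust to
# the object because it occupies each pixel in only a few frames.
background = np.median(video, axis=0)

# Fill every space-time hole from the background model.
inpainted = np.where(masks, background, video)
```

Every masked pixel is replaced consistently across frames, so the object vanishes without flicker in this toy setting.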



We then extend this algorithm to obtain image extrapolation, which is concerned with predicting content of a scene beyond what is available about it. This differs from inference in that no data is available to confirm our predictions, so several alternatives remain equally viable. In this direction, we propose an inpainting-based framework for Image Based Rendering (IBR). IBR concerns itself with algorithms for an image-based representation of the 3D information of a scene; novel views of the scene can then be rendered from this information. We extend IBR to cases where 3D information about a particular scene is incomplete, by incorporating information about the type of scene being viewed (e.g., the face of a person). We then devise algorithms to transfer specific semantic characteristics to the current scene from similar scenes available to us.


Year of completion:  December 2008
 Advisors: C. V. Jawahar & P. J. Narayanan

Related Publications

  • Visesh Chari, Avinash Sharma, Anoop M. Namboodiri and C. V. Jawahar - Frequency Domain Visual Servoing using Planar Contours, IEEE Sixth Indian Conference on Computer Vision, Graphics & Image Processing (ICVGIP 2008), pp. 87-94, 16-19 Dec 2008, Bhubaneswar, India. [PDF]

  • Visesh Chari, C. V. Jawahar and P. J. Narayanan - Video Completion as Noise Removal, Proceedings of the National Conference on Communications (NCC'08), Feb 1-3, 2008, IIT Mumbai, India. [PDF]

  • Visesh Chari, Jag Mohan Singh and P. J. Narayanan - Augmented Reality using Over-Segmentation, Proceedings of the National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG'08), Jan 11-13, 2008, DA-IICT, Gandhinagar, India. [PDF]

  • A. H. Abdul Hafez, Visesh Chari and C. V. Jawahar - Combining Texture and Edge Planar Tracker based on a Local Quality Metric, Proc. of the IEEE International Conference on Robotics and Automation (ICRA'07), Roma, Italy, 2007. [PDF]

 


Downloads

thesis

 ppt

Scalable Primitives for Data Mapping and Movement on the GPU


Suryakant Patidar (homepage)

 

GPUs have been used increasingly for a wide range of problems involving heavy computation in graphics, computer vision, scientific processing, etc. One key reason for their wide acceptance is their high performance-to-cost ratio. In less than a decade, GPUs have grown from non-programmable graphics co-processors to general-purpose units with a high-level language interface that deliver 1 TFLOPS for $400. The GPU's architecture, including the core layout, memory, scheduling, etc., is largely hidden, and it changes more frequently than single-core and multi-core CPU architectures. This makes it difficult for non-expert users to extract good performance, and suboptimal implementations can pay severe performance penalties on the GPU. This is likely to persist as many-core architectures and massively multithreaded programming models gain popularity in the future.

One way to exploit the GPU's computing power effectively is through high-level primitives upon which other computations can be built. All architecture-specific optimizations can be incorporated into the primitives by designing and implementing them carefully. Data-parallel primitives play the role of building blocks for many other algorithms on the fundamentally SIMD architecture of the GPU. Operations like sorting, searching, etc., have been implemented this way for large data sets.

We present efficient implementations of a few primitives for data mapping and data distribution on the massively multithreaded architecture of the GPU. The split primitive distributes the elements of a list according to their category. Split is an important operation for data mapping and is used to build data structures, distribute work load, perform database join queries, etc. Simultaneous operations on a common memory are the main problem for parallel split and other applications on the GPU. He et al. overcame the problem of simultaneous writes by using personal memory space for each parallel thread on the GPU, but the small shared memory available limits the maximum number of categories they can handle to 64. We use hardware atomic operations for split and propose ordered atomic operations for stable split, which maintains the original order of elements belonging to the same category. Due to the limited shared memory, such a split can be performed to a maximum of 2048 bins in a single pass. For larger numbers of bins, we propose an iterative split approach that can handle billions of bins using multiple passes of the stable split operation, built on an efficient basic split that distributes the data to a fixed number of categories. We also present a variant of split that partitions the indexes of records. This facilitates the use of the GPU as a co-processor for split or sort, with the actual data movement handled separately. We can compute the split indexes for a list of 32 million records in 180 milliseconds for a 32-bit key and in 800 ms for a 96-bit key.
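The logic of a stable split is easiest to see sequentially (a sketch of the three phases only; on the GPU each phase runs in parallel, with atomic operations standing in for the counters):

```python
def stable_split(keys, num_bins, category):
    """Distribute keys to bins, preserving the input order within a bin."""
    counts = [0] * num_bins
    for k in keys:                       # phase 1: histogram per category
        counts[category(k)] += 1
    offsets, total = [0] * num_bins, 0
    for b in range(num_bins):            # phase 2: exclusive prefix sum
        offsets[b], total = total, total + counts[b]
    out = [None] * len(keys)
    for k in keys:                       # phase 3: stable scatter
        b = category(k)
        out[offsets[b]] = k
        offsets[b] += 1
    return out

data = [23, 5, 17, 42, 8, 30, 12]
split = stable_split(data, 2, lambda k: k % 2)   # bins: even, then odd
```

Here the even keys come out first and the odd keys second, each group in its original order, which is exactly the stability property the ordered atomic operations provide on the GPU.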

The gather and scatter primitives perform fast, distributed data movement. Efficient data movement is critical to high performance on GPUs, as suboptimal memory accesses pay heavy penalties. In spite of the high bandwidth (130 GBps) offered by current GPUs, naive implementations of these operations hamper performance and can utilize only a part of the bandwidth. The instantaneous locality of memory reference plays a critical role in data movement on current GPU memory architectures. For scatter and gather involving large records, we use collective data movement in which multiple threads cooperate on individual records to improve the instantaneous locality: multiple threads move the bytes of a record, maximizing coalesced memory access. Our implementation of the gather and scatter operations efficiently moves multi-element records on the GPU. These data movement primitives can be used in conjunction with split for splitting large records on the GPU.
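The semantics of the two primitives are symmetric index-driven moves (a sketch of the semantics only; the record contents and permutation are ours, and the GPU implementation's thread cooperation is what makes these moves coalesced):

```python
import numpy as np

records = np.array([b'rec0', b'rec1', b'rec2', b'rec3'])
order = np.array([2, 0, 3, 1])           # e.g. indexes produced by split

gathered = records[order]                # gather:  out[i]      = in[order[i]]
scattered = np.empty_like(records)
scattered[order] = records               # scatter: out[order[i]] = in[i]
```

Sorting only the indexes and then performing one gather is precisely how split or sort of large records is decoupled from the data movement.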

We extend the split primitive to derive a SplitSort algorithm that can sort 32-bit, 64-bit and 128-bit integers on the GPU. SplitSort is faster for 32-bit integers than the best GPU sort available today, by Satish et al. To our knowledge, we are the first to present results on sorting 64-bit and longer integers on the GPU. With the split and gather operations we can sort large data records by first sorting their indexes and then moving the original data efficiently. We show sorting of 16 million 128-byte records in 379 milliseconds with 4-byte keys and in 556 ms with 8-byte keys.
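SplitSort is, in essence, a radix sort built from stable splits on successive digits. A miniature version (sequential, with the digit width chosen for illustration) shows why the stability of each pass makes the multi-pass sort correct:

```python
def split_sort(keys, key_bits=32, radix_bits=8):
    """Sort non-negative integers by stable-splitting on successive
    radix_bits-wide digits, least significant digit first."""
    mask = (1 << radix_bits) - 1
    for shift in range(0, key_bits, radix_bits):
        buckets = [[] for _ in range(1 << radix_bits)]
        for k in keys:                   # one stable split pass per digit
            buckets[(k >> shift) & mask].append(k)
        keys = [k for b in buckets for k in b]
    return keys

nums = [0xDEADBEEF, 42, 7, 0xCAFE, 1 << 20, 42]
sorted_nums = split_sort(nums)
```

Because each pass preserves the order established by the previous passes within equal digits, the final list is fully sorted; longer keys simply take more passes, which is how 64-bit and 128-bit keys are handled.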

Using our fast split primitive, we explore the problem of real-time ray casting of large deformable models (over a million triangles) onto large displays (a million pixels) on an off-the-shelf GPU. We build a GPU-efficient three-dimensional data structure for this purpose and a corresponding algorithm that uses it for fast ray casting. The data structure provides blocks of pixels and their corresponding geometry in a list of cells; each block of pixels can thus work in parallel on the GPU, contributing a region of output to the final image. Our algorithm builds the data structure rapidly using the split operation (5 milliseconds for 1 million triangles) and can thus support deforming geometry by rebuilding the data structure every frame. Early work on ray tracing by Purcell et al. built the data structures once on the CPU and used the GPU for repeated ray tracing using that structure. Recent work on ray tracing of deformable objects by Lauterbach et al. handles models with up to 200K triangles at 7-10 fps. We achieve real-time rates (25 fps) for ray casting a million-triangle model onto a million pixels on current NVIDIA GPUs.

The primitives we propose are widely applicable, and our results show that their performance scales logarithmically with the number of categories, linearly with the list length, and linearly with the number of cores on the GPU. This makes them useful for applications that deal with large data sets. The ideas presented in the thesis are likely to extend to later models and architectures of the GPU as well as to other multi-core architectures. Our implementations of the data primitives, viz. split, sort, gather, scatter and their combinations, are expected to be widely used by future GPU programmers. A recent algorithm by Vineet et al. that computes the minimum spanning tree of large graphs used the split primitive to improve its performance.

 

Year of completion:  June 2009
 Advisor : P. J. Narayanan

Related Publications

  • Vibhav Vineet, Pawan Harish, Suryakant Patidar and P. J. Narayanan - Fast Minimum Spanning Tree for Large Graphs on the GPU, Proceedings of High-Performance Graphics (HPG 09), 1-3 August 2009, Louisiana, USA. [PDF]

  • Shiben Bhattacharjee, Suryakant Patidar and P. J. Narayanan - Real-time Rendering and Manipulation of Large Terrains, IEEE Sixth Indian Conference on Computer Vision, Graphics & Image Processing (ICVGIP 2008), pp. 551-559, 16-19 Dec 2008, Bhubaneswar, India. [PDF]

  • Suryakant Patidar and P. J. Narayanan - Ray Casting Deformable Model on the GPU, IEEE Sixth Indian Conference on Computer Vision, Graphics & Image Processing (ICVGIP 2008), pp. 481-488, 16-19 Dec 2008, Bhubaneswar, India. [PDF]

  • Soumyajit Deb, Shiben Bhattacharjee, Suryakant Patidar and P. J. Narayanan - Real-time Streaming and Rendering of Terrains, 5th Indian Conference on Computer Vision, Graphics and Image Processing, Madurai, India, LNCS 4338, pp. 276-288, 2006. [PDF]

  • Suryakant Patidar and P. J. Narayanan - Scalable Split and Sort Primitives using Ordered Atomic Operations on the GPU, High Performance Graphics (Poster), April 2009. [PDF]

  • Kishore K., Rishabh M., Suhail Rehman, P. J. Narayanan, Kannan S. and Suryakant Patidar - A Performance Prediction Model for the CUDA GPGPU Platform, International Conference on High Performance Computing, April 2009. [PDF]

  • Suryakant Patidar, Shiben Bhattacharjee, Jag Mohan Singh and P. J. Narayanan - Exploiting the Shader Model 4.0 Architecture, IIIT/TR/2007/145, March 2007. [PDF]

Downloads

thesis

 ppt


Learning in Large Scale Image Retrieval Systems


Pradhee Tandon

The recent explosive growth in images and videos accessible to any individual on the Internet has made automatic management the prime choice. Contemporary systems used tags, but over time these were found to be inadequate and unreliable. This has brought content-based retrieval and management of such data to the forefront of research in the information retrieval community.

Content-based retrieval methods generally represent the visual information in images or videos in terms of machine-centric numeric features. These allow efficient processing and management of a large volume of data. The features are primitive, however, and thus incapable of capturing the way humans perceive visual content. This leads to a semantic gap between the user and the system, resulting in poor user satisfaction. User-centric techniques are needed to help reduce this gap efficiently. Given the ever-expanding volume of images and videos, techniques should also be able to retrieve in real time from millions of samples. In summary, a practical image retrieval approach is expected to perform well on most of the following parameters: (1) acceptable accuracy, (2) efficiency, (3) minimal and non-cumbersome user input, (4) scalability to large collections (millions) of objects, (5) support for interactive retrieval, and (6) meaningful presentation of results. Most of the noted efforts in the CBIR literature have focused primarily on subsets of the above, whereas a real-world system requires practical solutions to nearly all of them. In this thesis we propose our solutions for an image retrieval application, keeping the above expectations in mind.


In this thesis we present our system for interactive image retrieval from huge databases. The system has a web-based interface that emphasizes ease of use and encourages interaction, and is modeled on the query-by-example paradigm. It uses an efficient B+-tree based dimensional indexing scheme for retrieving similar images from among millions in less than a second. Perception of visual similarity is subjective; to serve these varying interpretations, the index has to be adaptive. Our system supports user interaction through feedback, and our index is designed to support changing similarity metrics using this feedback while still responding in sub-second retrieval times. We have also optimized the basic B+-tree based indexing scheme to achieve better performance when learning is available.


Content-based access to images requires the visual information in them to be abstracted into numeric features. These features generally represent low-level visual characteristics of the data like colors, textures and shapes. They are inherently weak and cannot represent human perception of visual content, which has developed over years of evolution. Relevance feedback from the user has been widely accepted as a means to bridge this semantic gap. In this thesis, we propose an inexpensive, feedback-driven, feature relevance learning scheme. We estimate iteratively improving relevance weights for the low-level numeric features. These weights capture the relevant visual content and are used to tune the similarity metric and iteratively improve retrieval for the active user. We propose to incrementally memorize this learning across users, for the set of relevant images in each query. This helps the system incrementally converge to the popular content in the images in the database. We also use this learning in the similarity metric to tune retrieval further. Our learning scheme integrates seamlessly with our index, making interactive, accurate retrieval possible.
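One common way to turn feedback into feature relevance weights (a widely used heuristic offered here as an illustration, not necessarily the exact update rule of the thesis) is to weight each feature inversely to its spread over the images the user marked relevant:

```python
import numpy as np

def update_weights(relevant_feats, eps=1e-6):
    """Features that are consistent across the relevant images get
    higher weight; noisy features get lower weight."""
    spread = np.std(relevant_feats, axis=0)
    w = 1.0 / (spread + eps)
    return w / w.sum()                   # normalize to sum to 1

def weighted_distance(a, b, w):
    """Similarity metric tuned by the learned weights."""
    return np.sqrt(np.sum(w * (a - b) ** 2))

relevant = np.array([[0.9, 0.2, 0.5],    # three images the user liked:
                     [0.8, 0.9, 0.5],    # feature 2 is perfectly stable,
                     [0.9, 0.1, 0.5]])   # feature 1 is noisy
w = update_weights(relevant)
```

The stable feature dominates the tuned metric, so subsequent nearest-neighbor queries rank images by what this user actually cares about; accumulating such weights across sessions is the "memorized" long-term learning.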

Feature-based learning improves accuracy; however, it is critically dependent on the low-level features used, whereas human perception is independent of them. Based on the underlying assumption that user opinion on the same image remains similar over time and across users, a content-free approach has recently become popular. The idea relies on collaborative filtering of user interaction logs to predict the next set of results for the active user. This method performs better by virtue of being independent of primitive features, but suffers from the critical cold-start problem. In this thesis we propose a Bayesian inference approach for integrating similarity information from these two complementary sources, which also overcomes the critical shortcomings of the two paradigms. We pose the problem as posterior estimation: the logs provide a priori information in terms of co-occurrence of images, while visual similarity provides the evidence of matching. We efficiently archive and update the co-occurrence relationships, facilitating sub-second retrieval.
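The posterior estimation can be sketched in a few lines (the numbers below are illustrative, not from the thesis): the log-derived co-occurrence acts as the prior, the visual match score as the likelihood, and their normalized product ranks the candidates.

```python
import numpy as np

# P(image | query) ∝ P(image co-occurs with query in logs) * P(visual match)
cooccur_prior = np.array([0.50, 0.30, 0.15, 0.05])    # from interaction logs
visual_evidence = np.array([0.10, 0.20, 0.60, 0.10])  # from feature matching

posterior = cooccur_prior * visual_evidence
posterior /= posterior.sum()              # normalize over the candidates
ranking = np.argsort(-posterior)          # best candidates first
```

Note how image 2, middling in the logs but visually strong, overtakes image 0; when the logs are empty (the cold start), the prior is uniform and ranking falls back to pure visual similarity.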


Studies have shown that users refine their queries based on the results they are shown, and that the quality of retrieval improves with the effectiveness of the learning acquired by the system. Presenting the right images to the user for feedback can thus yield the best retrieval. A set of results that are similar to the query in different ways can help the user narrow down to the intended query as early as possible. This diversity cannot be achieved by similarity retrieval methods alone. In this thesis, we propose to use skyline queries to efficiently remove conceptual redundancy from the retrieval set; such a diversely similar set of results is then presented to the user for feedback. We use our indexing scheme to extract the skyline efficiently, a computationally prohibitive process otherwise. The user's perception changes with the results, and so should the nature of the diverse set. We propose the idea of preferential skylines to handle this: using the user's feedback-based preference, we reduce the diversity and include more similarity along the preferred attributes, adapting the retrieval to the user's intent. Thus, we are gradually able to tune the retrieval to match the user's exact intent. This improves accuracy in far fewer iterations.
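The skyline idea can be shown in miniature (a naive O(n²) sketch over made-up similarity scores; the thesis extracts skylines efficiently through its index):

```python
def skyline(points):
    """Keep the points not dominated by any other, where a dominates b
    if a is at least as good on every dimension and strictly better on
    at least one. The survivors are the mutually incomparable results."""
    def dominates(a, b):
        return (all(x >= y for x, y in zip(a, b))
                and any(x > y for x, y in zip(a, b)))
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]

# Per-image similarity to the query along two feature dimensions.
sims = [(0.9, 0.1), (0.5, 0.5), (0.1, 0.9), (0.4, 0.4), (0.2, 0.2)]
diverse = skyline(sims)
```

The dominated points, similar to the query in the same redundant way as better results, are dropped; the survivors are each "best" along a different trade-off, which is exactly the diversely similar set shown to the user.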

We validate all the ideas proposed in the thesis with experiments on real natural images, and employ synthetic datasets for other computational experiments. In spite of the considerable improvements in accuracy with our learning approaches, the effectiveness of our solutions is still limited by the features used for encoding human visual perception; our methods of learning and other optimizations are only a means to reduce this gap. We present extensive experimental results, with discussions validating different aspects of our expectations from the proposed ideas, throughout this thesis.

 

Year of completion:  June 2009
 Advisors: C. V. Jawahar & Vikram Pudi

Related Publications

  • Pradhee Tandon and C. V. Jawahar - Long Term Learning for Content Extraction in Image Retrieval, Proceedings of the 15th National Conference on Communications (NCC 2009), Jan 16-18, 2009, Guwahati, India. [PDF]

  • Pradhee Tandon, Piyush Nigam, Vikram Pudi and C. V. Jawahar - FISH: A Practical System for Fast Interactive Image Search in Huge Databases, Proceedings of the 7th ACM International Conference on Image and Video Retrieval (CIVR '08), July 7-9, 2008, Niagara Falls, Canada. [PDF]


Downloads

thesis

ppt

Document Enhancement Using Text Specific Prior


Jyotirmoy Banerjee

Document images are often obtained by digitizing paper documents like books or manuscripts. They can be poor in appearance due to degradation of paper quality, spreading and flaking of ink or toner, imaging artifacts, etc. All of these phenomena lead to different types of noise at the word level, including boundary erosion, dilation, cuts/breaks and merges of characters. Further, with the advent of modern electronic gadgets like PDAs, cellular phones, and digital cameras, the scope of document imaging has widened, and document image analysis systems are becoming increasingly visible in everyday life. For instance, one may be interested in systems that process, store and understand document images obtained by cellular phones. The processing challenges in this class of documents are considerably different from those of conventional scanned document images, and many of this new class of documents are characterized by low resolution and poor quality. Super-resolution provides an algorithmic solution to the resolution enhancement problem by exploiting image-specific a priori information. In this thesis we study and propose new methods for the restoration and resolution enhancement of document images.

We present a single-image super-resolution algorithm for gray-level document images that does not use any training set. Super-resolution of document images is characterized by bimodality, smoothness along the edges, and subsampling consistency. These characteristics are enforced in a Markov Random Field (MRF) framework by defining an appropriate energy function. In our case, subsampling the super-resolution image returns the original low-resolution one, confirming the consistency of the method. The restored image is generated by iteratively reducing the energy function of the MRF, which is a nonlinear optimization problem. This is a single-frame approach and is useful when multiple low-resolution images are unavailable.

Document images have a repetitive structural nature, as characters and words occur more than once in a page or book. The extraction of a single high-quality text image from a set of degraded images benefits from this a priori information. A character segmentation is performed to extract the characters, and a total-variation based prior model is used in a Maximum A Posteriori (MAP) estimate to smoothen the edges and preserve the corners so characteristic of text images. Dependence on character segmentation, however, remains a bottleneck: character segmentation is not a completely solved problem, and its accuracy depends on the amount of noise in the text image. In our next approach, we overcome this dependency by looking for a restoration approach that does not perform an explicit character segmentation, but still uses the repetitive component nature of document images. In document images, degradation varies from place to place, and context plays an important role in textual image understanding. We propose an MRF framework that exploits the contextual relation between image patches. Using the topological/spatial constraints between the image patches, impossible combinations are eliminated from the initial set of matchings, resulting in an unambiguous textual output. Local consistency is reconciled with global consistency using the belief propagation algorithm. As we work with patches and not characters, we avoid performing an explicit segmentation. The ability to work with larger patch sizes allows us to deal with severe degradations, including cuts, blobs, merges and vandalized documents. This approach can also integrate document restoration and super-resolution into a single framework, directly generating high-quality images from degraded documents.

To conclude, the thesis presents an approach for reconstructing document images. Unlike conventional reconstruction methods, the unknown pixel values are not estimated from their local surrounding neighbourhood alone, but from the whole image: we exploit the multiple occurrences of characters in the scanned document. A great advantage of our proposed approach over conventional ones is that we have more information at our disposal, which leads to a better enhancement of the document image. Experimental results show significant improvement in image quality on document images collected from various sources, including magazines and books, and comprehensively demonstrate the robustness and adaptability of the approach.
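A toy version of the bimodality-plus-smoothness energy makes the MRF formulation concrete (the modes, the weight, and the crude one-step "descent" below are ours, purely for illustration; the thesis minimizes its energy iteratively):

```python
import numpy as np

def energy(img, lam=0.5):
    """MRF-style energy: pixels should be near ink (0) or paper (1),
    and neighbouring pixels should agree."""
    bimodal = np.minimum(img ** 2, (img - 1.0) ** 2).sum()
    smooth = (np.abs(np.diff(img, axis=0)).sum()
              + np.abs(np.diff(img, axis=1)).sum())
    return bimodal + lam * smooth

rng = np.random.default_rng(3)
clean = np.zeros((8, 8))
clean[2:6, 2:6] = 1.0                        # an 8x8 "glyph"
noisy = np.clip(clean + rng.normal(0, 0.1, clean.shape), 0, 1)

# One crude descent step: snap each pixel to its nearer mode.
restored = np.where(noisy > 0.5, 1.0, 0.0)
```

Snapping to the modes drives the bimodal term to zero and removes the noise-induced neighbour differences, so the energy of the restored image is strictly lower than that of the noisy input; the real algorithm reduces the same kind of energy gradually rather than in one thresholding step.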

 

Year of completion:  2009
 Advisor : C. V. Jawahar

Related Publications

  • Jyotirmoy Banerjee, Anoop M. Namboodiri and C. V. Jawahar - Contextual Restoration of Severely Degraded Document Images, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 09), 20-25 June 2009, Miami Beach, Florida, USA. [PDF]

  • Jyotirmoy Banerjee and C. V. Jawahar - Super-resolution of Text Images Using Edge-Directed Tangent Field, Eighth IAPR Workshop on Document Analysis Systems (DAS), Sep 17-19, 2008, Nara, Japan. [PDF] [Received Honorable Mention Award]

  • Jyotirmoy Banerjee and C. V. Jawahar - Restoration of Document Images Using Bayesian Inference, National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG), Jan 11-13, 2008, Gandhinagar, Gujarat, India. [PDF]


Downloads

thesis

ppt

Efficient Image Retrieval Methods For Large Scale Dynamic Image Databases


Suman Karthik

The commoditization of imaging hardware has led to an exponential growth in image and video data, making it difficult to access relevant data when it is required. This has led to a great amount of research into multimedia retrieval, and Content Based Image Retrieval (CBIR) in particular. Yet CBIR has not found widespread acceptance in real-world systems, one of the primary reasons being the inability of traditional CBIR systems to scale effectively to large image databases. The introduction of the Bag of Words model for image retrieval has changed some of these issues for the better, yet bottlenecks remain, and its utility is limited when it comes to highly dynamic image databases (image databases where the set of images is constantly changing). In this thesis, we focus on developing methods that address the scalability issues of traditional CBIR systems and the adaptability issues of Bag of Words based image retrieval systems.

Traditional CBIR systems find relevant images by finding nearest neighbors in a high-dimensional feature space. This is computationally expensive and does not scale as the number of images in the database grows. We address this problem by posing image retrieval as a text retrieval task: we transform the images into text documents called Virtual Textual Descriptions (VTD). Once this transformation is done, we further enhance the performance of the system by incorporating a novel relevance feedback algorithm called discriminative relevance feedback. We then use the virtual textual descriptions of images to index and retrieve images efficiently, using a data structure called the Elastic Bucket Trie (EBT).

Contemporary bag-of-visual-words approaches to image retrieval perform a one-time offline vector quantization to create the visual vocabulary. However, these methods do not adapt well to dynamic image databases, whose nature constantly changes as new data is added. In this thesis, we design, present and experimentally examine a novel method for incremental vector quantization (IVQ) for use in image and video retrieval systems with dynamic databases.
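The contrast with offline quantization can be sketched as online k-means (an illustration of incremental quantization in general, not necessarily the IVQ algorithm of the thesis): each new descriptor nudges its nearest vocabulary centroid, so the vocabulary adapts as images are added, with no offline re-clustering.

```python
import numpy as np

def ivq_update(centroids, counts, x):
    """Assign descriptor x to its nearest centroid and move that
    centroid toward x (running-mean update); returns the word index."""
    j = np.argmin(np.linalg.norm(centroids - x, axis=1))
    counts[j] += 1
    centroids[j] += (x - centroids[j]) / counts[j]
    return j

centroids = np.array([[0.0, 0.0], [10.0, 10.0]])  # tiny 2-word vocabulary
counts = np.array([1, 1])
for x in [[0.5, 0.5], [9.0, 11.0], [1.0, 0.0], [10.5, 9.5]]:
    ivq_update(centroids, counts, np.array(x))
```

Each update is O(vocabulary size), so the quantizer keeps pace with a constantly growing database instead of requiring a periodic rebuild of the whole vocabulary.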

Semantic indexing has been invaluable in improving the performance of bag-of-words based image retrieval systems. However, contemporary approaches to semantic indexing for bag-of-words image retrieval do not adapt well to dynamic image databases. We introduce and experimentally examine a bipartite graph model (BGM), a scalable data structure that aids on-line semantic indexing, and a cash flow algorithm that works on the BGM to retrieve semantically relevant images from the database. We also demonstrate how traditional text search engines can be used to build scalable image retrieval systems.

 

Year of completion:  July 2009
 Advisor : C. V. Jawahar

Related Publications

  • Suman Karthik, Chandrika Pulla and C. V. Jawahar - Incremental Online Semantic Indexing for Image Retrieval in Dynamic Databases, Proceedings of the International Workshop on Semantic Learning Applications in Multimedia (SLAM: CVPR 2009), 20-25 June 2009, Miami, Florida, USA. [PDF]

  • Suman Karthik and C. V. Jawahar - Analysis of Relevance Feedback in Content Based Image Retrieval, Proceedings of the 9th International Conference on Control, Automation, Robotics and Vision (ICARCV), 2006, Singapore. [PDF]

  • Suman Karthik and C. V. Jawahar - Virtual Textual Representation for Efficient Image Retrieval, Proceedings of the 3rd International Conference on Visual Information Engineering (VIE), 26-28 September 2006, Bangalore, India. [PDF]

  • Suman Karthik and C. V. Jawahar - Efficient Region Based Indexing and Retrieval for Images with Elastic Bucket Tries, Proceedings of the International Conference on Pattern Recognition (ICPR), 2006. [PDF]

Downloads

thesis

ppt
