Program

Time | Jan 23rd | Jan 24th | Jan 25th | Jan 26th | Jan 27th | Jan 28th
09.00 AM to 09.30 AM | Registration | Invited Talk: Prof. Gernot Fink | Invited Talk: Prof. Koichi Kise | Indian Republic Day: off for the school. Potential options for tourism: Route 1, trip to Agra by train; Route 2, local tour in Jaipur | Invited Talk: Prof. Marcus Liwicki | Invited Talk: Dr. Utkarsh Porwal
09.30 AM to 10.00 AM | Inauguration | - | - | - | - | -
10.00 AM to 10.30 AM | Participant Introduction | - | - | - | - | -
10.30 AM to 11.00 AM | Tea Break | Break | Break | - | Break | Break
11.00 AM to 12.30 PM | Invited Talk: Prof. Pushpak Bhattacharyya | Invited Talk: Prof. B. B. Chaudhuri | Invited Talk: Prof. Bhabatosh Chanda | - | Invited Talk: Dr. Lipika Dey | Invited Talk: Prof. Manik Varma
12.30 PM to 02.00 PM | Lunch | Lunch | Lunch | - | Lunch | Lunch
02.00 PM to 03.30 PM | Invited Talk: Dr. Ashok Popat | Lab/Demo: Gernot Fink | Lab/Demo: Koichi Kise | - | Lab/Demo: Marcus Liwicki | Lab: Dr. Utkarsh Porwal
03.30 PM to 04.00 PM | Break | - | - | - | - | -
04.00 PM to 04.30 PM | Demo of Indian Language OCRs and OHWRs | - | - | - | Break | Break
04.30 PM to 05.00 PM | - | - | High Tea + Poster Session-2 | - | Texcape: TCS demo | Valedictory Session
05.00 PM to 05.30 PM | - | - | Social Event @ Chokhi Dhani (bus leaves at 17.00) | - | - | -
05.30 PM to 06.00 PM | - | National Digital Library Presentation | - | - | - | -
06.00 PM to 06.30 PM | Poster Session-1 | - | - | - | - | -

Talk Details

Cognitive Natural Language Processing

Monday, January 23

Speaker: Prof. Pushpak Bhattacharyya

Abstract: We present in this talk the use of eye tracking for Natural Language Processing, which we call Cognitive Natural Language Processing. NLP today depends heavily on machine learning, and cues from eye tracking provide valuable features for ML-based NLP. We study problems such as Machine Translation, Sentiment Analysis, Readability and Sarcasm detection to show that cognition-based features improve the efficacy of ML-based NLP manifold. An additional attraction of cognitive NLP is the possibility of rationalizing compensation for text-annotation effort. The presentation draws on multiple publications in ACL, EMNLP, NAACL, etc., based on work done by PhD and Masters students.
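
As a concrete (and entirely invented) illustration of the feature-augmentation idea, gaze-derived features can simply be concatenated with textual features before training a classifier. The sketch below assumes toy data; the gaze columns (total fixation duration, number of regressive saccades) and all values are made up for illustration and are not from the talk.

    import numpy as np
    from scipy.sparse import csr_matrix, hstack
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    # Toy sentences with sentiment labels; one gaze-feature vector per
    # sentence: [total fixation duration (s), regressive saccade count].
    texts = ["what a great movie", "utterly disappointing plot",
             "yeah right, best film ever", "a warm and moving story"]
    labels = [1, 0, 0, 1]
    gaze = np.array([[1.2, 0.0], [1.5, 1.0], [3.8, 4.0], [1.3, 0.0]])

    X_text = TfidfVectorizer().fit_transform(texts)
    X = hstack([X_text, csr_matrix(gaze)])  # textual + cognitive features
    clf = LogisticRegression().fit(X, labels)

Note how the sarcastic example carries long fixations and many regressions: this is exactly the kind of signal that plain textual features miss.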

Bio: Prof. Pushpak Bhattacharyya is the current President of the ACL (2016-17). He is the Director of IIT Patna and the Vijay and Sita Vashee Chair Professor in the Department of Computer Science and Engineering, IIT Bombay. He was educated at IIT Kharagpur (B.Tech), IIT Kanpur (M.Tech) and IIT Bombay (PhD). He has been a visiting scholar and faculty member at MIT, Stanford, UT Houston and Université Joseph Fourier (France). Prof. Bhattacharyya's research areas are Natural Language Processing, Machine Learning and AI. He has guided more than 250 students (PhD, Masters and Bachelors), published more than 250 research papers, and led government and industry projects of national and international importance. A significant contribution of his is multilingual lexical knowledge bases and projection. Author of the textbook "Machine Translation", Prof. Bhattacharyya is loved by his students for his inspiring teaching and mentorship. He is a Fellow of the National Academy of Engineering and a recipient of the Patwardhan Award of IIT Bombay and the VNMM Award of IIT Roorkee, both for technology development, as well as faculty grants from IBM, Microsoft, Yahoo and the United Nations.


Developing Multilingual OCR and Handwriting Recognition at Google

Monday, January 23

Speaker: Dr. Ashok Popat

Lecture Slides

Abstract: In this talk I will reflect on our team's experience developing multilingual OCR and handwriting recognition systems at Google: enabling factors, effective practices, and challenges. I'll tell you what I think I've learned along the way, drawing on some experiences with other projects inside and outside Google.

Bio: Dr. Ashok C. Popat received the SB and SM degrees from the Massachusetts Institute of Technology in Electrical Engineering in 1986 and 1990, and the PhD from the MIT Media Lab in 1997. He is a Staff Research Scientist and manager at Google in Mountain View, California. Prior to joining Google in 2005 he worked at Xerox PARC for 8 years, as a researcher and later as a research area manager. Between 2002 and 2005 he was also a consulting assistant professor of Electrical Engineering at Stanford, where he taught a course "Electronic documents: paper to digital." He has also worked at Motorola, Hewlett Packard, PictureTel, and EPFL in Switzerland. His areas of interest include signal processing, data compression, machine translation, and pattern recognition. Personal interests: skiing, sailing, hiking, traveling, and learning languages.


Word Spotting: From Bag-of-Features to Deep Learning

Tuesday, January 24

Speaker: Prof. Gernot Fink

Abstract: Research in building automatic reading systems has made considerable progress since its inception in the 1960s. Today, quite mature techniques are available for the automatic recognition of machine-printed text. However, the automatic reading of handwriting is a considerably more challenging task, especially when it comes to historical manuscripts. When current methods for handwriting recognition reach their limits, approaches for so-called word spotting come into play. These can be considered specialized versions of image retrieval techniques. The most successful methods rely on machine learning in order to derive powerful models for representing queries for handwriting retrieval.

This lecture will first give a brief introduction to the problem of word spotting and the methodological developments in the field. In the first part of the lecture, classical approaches for learning word spotting models will be described that build on Bag-of-Features (BoF) representations. These were developed in the field of computer vision for learning characteristic representations of image content in an unsupervised manner. It will be shown how word spotting models can be built by applying the BoF principle. It will also be described how basic BoF models can be extended by learning common sub-space representations between different modalities.
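
The BoF principle can be sketched roughly as follows (an illustrative toy, not the lecture's implementation): local descriptors are quantized against a k-means vocabulary learned without supervision, and each word image becomes a histogram of visual words that can be compared for retrieval. The patch descriptor here is deliberately crude; real systems use SIFT-like descriptors.

    import numpy as np
    from sklearn.cluster import KMeans

    def local_descriptors(img, patch=8, step=4):
        """Crude dense descriptors: flattened grayscale patches."""
        h, w = img.shape
        return np.array([img[y:y + patch, x:x + patch].ravel()
                         for y in range(0, h - patch, step)
                         for x in range(0, w - patch, step)])

    def bof_histogram(img, vocab):
        """Quantize descriptors against the vocabulary; return a histogram."""
        words = vocab.predict(local_descriptors(img).astype(np.float64))
        hist = np.bincount(words, minlength=vocab.n_clusters).astype(float)
        return hist / (hist.sum() + 1e-9)

    # Unsupervised vocabulary, pooled over all word images:
    #   vocab = KMeans(n_clusters=64).fit(
    #       np.vstack([local_descriptors(w) for w in word_images]))
    # Retrieval: rank word images by the distance between their histograms
    # and the query's histogram.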

In the second part of the lecture, advanced models for word spotting will be presented that apply techniques of deep learning and currently define the state of the art in the field. After a discussion of the pros and cons of the classical approaches, the foundations of neural networks in general, and of deep architectures in particular, will be laid. By combining the idea of common sub-space representations with a unified framework that can be learned in an end-to-end fashion, unprecedented performance on a number of challenging word spotting tasks can be achieved, as has been demonstrated by PHOCNet.
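
The attribute representation underlying PHOCNet, the PHOC (pyramidal histogram of characters), can be sketched in a simplified form that assigns each character to a pyramid region by its centre position (real implementations use region-overlap rules, and the level set varies by paper):

    import numpy as np

    ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789"

    def phoc(word, levels=(1, 2, 3, 4, 5)):
        """Binary pyramidal histogram of characters for a word string."""
        vec = []
        for l in levels:
            regions = np.zeros((l, len(ALPHABET)))
            for k, ch in enumerate(word.lower()):
                if ch not in ALPHABET:
                    continue  # characters outside the alphabet are skipped
                centre = (k + 0.5) / len(word)  # normalized character position
                regions[min(int(centre * l), l - 1), ALPHABET.index(ch)] = 1
            vec.extend(regions.ravel())
        return np.array(vec)

    # phoc("spotting") has (1 + 2 + 3 + 4 + 5) * 36 = 540 binary dimensions.

Word images and query strings mapped into this common attribute space can then be compared directly, which is the common sub-space idea mentioned above.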

Bio: Prof. Gernot A. Fink received his diploma in computer science from the University of Erlangen-Nuremberg, Germany, in 1991. From 1991 to 2005, he was with the Applied Computer Science Group at Bielefeld University, Germany, where he received his Ph.D. degree (Dr.-Ing.) in 1995 and his venia legendi (Habilitation) in 2002. Since 2005, he has been a professor at the Technical University of Dortmund, Germany, where he heads the Pattern Recognition in Embedded Systems Group. His research interests are machine perception, statistical pattern recognition, and document analysis. He has published more than 150 papers and a textbook on Markov models for pattern recognition.

Lab-Session: In the accompanying lab-session, participants of the summer school will be able to experiment themselves with different word spotting models and thus obtain hands-on experience with the techniques presented in the lecture.

Lab related material: http://patrec.cs.tu-dortmund.de/cms/en/home/Resources/index.html

Lecture Slides: http://patrec.cs.tu-dortmund.de/pubs/papers/SSDA17-Tutorial-Fink.pdf


Detection and cleaning of strike-out texts in offline handwritten documents

Tuesday, January 24

Speaker: Prof. B. B. Chaudhuri

Lecture Slides

Abstract: The talk starts with a brief study of OCR for offline unconstrained handwritten text, including our BLSTM-based work on Bangla script. It is noted that published papers on the topic consider ideal inputs, i.e., documents containing no writing errors. However, a free-form creative handwritten page may contain a misspelled or inappropriate word that is struck out by the writer, with the corrected word written next to it. The strike-out may also be longer, e.g., spanning several consecutive words or even several lines, after which the writer pens his/her revised statement in the next free space. If a document image with such errors is fed to a handwriting OCR, unpredictable erroneous strings will be generated for the struck-out texts. The present talk mainly deals with this strike-out problem in English and Bangla script. A pattern classifier followed by a graph-based method is employed to detect struck-out text and locate the strike-out strokes. For detection, we feed hand-crafted as well as Recurrent Neural Network-generated features into an SVM classifier. Then, to locate the strike-out stroke, the skeleton of the text component is computed. The skeleton is treated as a graph, and a shortest-path algorithm that satisfies certain properties of the strike-out stroke is employed. To locate zig-zag, wavy, slanted or crossed strike-outs, appropriate modifications are made to the path-detection algorithm. Multi-word/multi-line strike-outs are also tackled in a suitable manner.
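
The skeleton-as-graph idea can be sketched as follows (illustrative only, with invented edge weights; not the authors' algorithm): treat skeleton pixels as graph nodes, connect 8-neighbours, and bias the edge weights so that a near-horizontal shortest path across the word emerges as the candidate straight strike-out stroke.

    import numpy as np
    import networkx as nx
    from skimage.morphology import skeletonize

    def strike_out_path(binary_word):
        """binary_word: 2-D array, nonzero = ink. Returns a pixel path."""
        skel = skeletonize(binary_word > 0)
        ys, xs = np.nonzero(skel)
        pts = set(zip(ys.tolist(), xs.tolist()))
        g = nx.Graph()
        for y, x in pts:
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if (dy or dx) and (y + dy, x + dx) in pts:
                        # Penalize vertical moves so near-horizontal paths
                        # win, mimicking a straight strike-out stroke.
                        g.add_edge((y, x), (y + dy, x + dx),
                                   weight=1 + 3 * abs(dy))
        left = min(pts, key=lambda p: p[1])   # leftmost skeleton pixel
        right = max(pts, key=lambda p: p[1])  # rightmost skeleton pixel
        return nx.shortest_path(g, left, right, weight="weight")

Wavy or slanted strike-outs would need a different path cost, which is exactly the kind of modification the abstract alludes to.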

Sometimes the user may be interested in deleting the detected strike-out stroke. When this is done, the cleaned text is better suited for manual analysis, or can be fed to an OCR system for transcript generation of a manuscript (of, say, a famous person). We have employed an inpainting method for such cleaning. Tested on 250 English and 250 Bangla document pages, fairly good results have been obtained on the above tasks.

Bio: Prof. Bidyut B. Chaudhuri received his Ph.D. degree from the Indian Institute of Technology, Kanpur, in 1980 and worked as a Leverhulme postdoctoral fellow at Queen's University, UK, in 1981-1982. He joined the Indian Statistical Institute in 1978, where he is currently INAE Distinguished Professor and J. C. Bose Fellow at the Computer Vision and Pattern Recognition Unit. His research interests include pattern recognition, image processing, computer vision, NLP, information retrieval, digital document processing and OCR. He pioneered the first Indian-language Bharati Braille system for the blind, a successful Bangla speech synthesis system, as well as the first workable OCR for Bangla, Devanagari, Assamese and Oriya scripts. In NLP, a robust Indian-language spell-checker, morphological processor, multi-word expression detector and statistical analyser were pioneered by him.

Some of his technologies have been transferred to industry for commercialization. He has published about 400 research papers in reputed international journals, conference proceedings, and edited books. He has authored/co-authored eight technical books and holds four international patents. He is a Fellow of Indian national academies including INSA, NASc and INAE. Among international academies, he is a Fellow of IAPR and TWAS, and a Life Fellow of IEEE. He serves as an Associate Editor of IJPRAI, IJDAR and JIETE, and has served as guest editor of special issues of several journals.


Reading behavior analysis for reading-life log and its fundamental technologies

Wednesday, January 25

Speaker: Prof. Koichi Kise

Lecture Slides

Abstract: In our daily life, we spend hours reading documents, because “reading” is our primary means of acquiring information. “Reading-life log” is a field of research that extracts useful information for enriching our life through mutual analysis of reading activity and the documents read. We can estimate many things from the results of such analysis, e.g., how much we read (wordometer, reading detection) and how well we understand it (level of understanding and proficiency), both by analyzing eye gaze obtained with eye-trackers. The fundamental technologies that support reading-life log are sensing of human reading behavior and retrieval of documents input as images. In my talk, I introduce these fundamental technologies and their application to the implementation of various types of reading-life log.
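
As a toy illustration of the wordometer idea (with invented thresholds; actual systems learn such parameters from data), the amount read can be roughly estimated by counting forward saccades in a fixation sequence and treating large right-to-left jumps as line breaks:

    def estimate_reading(fixations, fwd_px=15, sweep_px=150):
        """fixations: list of (x, y) gaze points in reading order.
        Returns rough (word_count, line_count) estimates."""
        words, lines = 0, 0
        for (x0, y0), (x1, y1) in zip(fixations, fixations[1:]):
            if x1 - x0 > fwd_px:        # forward saccade ~ next word(s)
                words += 1
            elif x0 - x1 > sweep_px:    # return sweep ~ new line
                lines += 1
        return words, lines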

Bio: Prof. Koichi Kise received B.E., M.E. and Ph.D. degrees in communication engineering from Osaka University, Osaka, Japan in 1986, 1988 and 1991, respectively. From 2000 to 2001, he was a visiting professor at the German Research Center for Artificial Intelligence (DFKI), Germany. He is now a Professor in the Department of Computer Science and Intelligent Systems, and the director of the Institute of Document Analysis and Knowledge Science (IDAKS), Osaka Prefecture University, Japan. He has received awards including the best paper award of IEICE in 2008, the IAPR/ICDAR best paper awards in 2007 and 2013, the IAPR Nakano award in 2010, the ICFHR best paper award in 2010 and the ACPR best paper award in 2011. He serves as the chair of IAPR Technical Committee 11 (reading systems), a member of the IAPR conferences and meetings committee, and editor-in-chief of the International Journal on Document Analysis and Recognition. His major research activities are in analysis, recognition and retrieval of documents, images and activities. He is a member of IEEE, ACM, IPSJ, IEEJ, ANLP and HIS.

Demo: I will demonstrate fundamental technologies and implementations of reading-life log using several sensors. A document image retrieval method called LLAH (Locally Likely Arrangement Hashing) is one fundamental technology to be demonstrated. I will also show several sensing technologies such as eye-tracking and EOG (electrooculography).

Students will be able to try out the sensors to learn more about how they work. In addition, students will have an opportunity to implement simple activity recognition using an eye-tracker.


Document page layout analysis

Wednesday, January 25

Speaker: Prof. Bhabatosh Chanda

Lecture Slides

Abstract: ‘Document page layout analysis’ usually refers to the decomposition of a page image into textual and various non-textual components in order to understand its geometrical and logical structure, and thereafter to link these components together for efficient presentation and abstraction. With the growing need for automatic transformation of complex paper documents into electronic versions, geometrical and logical structure analysis has remained an active research area for decades. Such analysis helps OCR produce its best possible result. It also helps in extracting various logical components such as images and line drawings. In this presentation our objective is to make a quick journey, starting from elementary approaches suitable for strictly structured layouts and moving to more sophisticated methods that can handle complicated designer layouts. We also discuss evaluation methodology for layout analysis algorithms and mention various benchmark datasets available for performance evaluation.
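
The elementary, strictly-structured end of this spectrum is typified by the classic recursive XY-cut, which alternately splits the page at wide empty gaps in the horizontal and vertical projection profiles. A compact sketch (with invented thresholds, for illustration only):

    import numpy as np

    def xy_cut(img, min_gap=10):
        """img: binary 2-D array, nonzero = ink.
        Returns (y0, y1, x0, x1) blocks from recursive XY cuts."""
        def split(y0, y1, x0, x1, horizontal, failed=False):
            region = img[y0:y1, x0:x1]
            profile = region.sum(axis=1) if horizontal else region.sum(axis=0)
            # Midpoints of empty runs at least min_gap wide.
            gaps, start = [], None
            for i, v in enumerate(profile):
                if v == 0 and start is None:
                    start = i
                elif v > 0 and start is not None:
                    if i - start >= min_gap:
                        gaps.append((start + i) // 2)
                    start = None
            if not gaps:
                if failed:  # no gap in either direction: emit the block
                    return [(y0, y1, x0, x1)]
                return split(y0, y1, x0, x1, not horizontal, failed=True)
            blocks, prev = [], 0
            for cut in gaps + [len(profile)]:
                if horizontal:
                    blocks += split(y0 + prev, y0 + cut, x0, x1, False)
                else:
                    blocks += split(y0, y1, x0 + prev, x0 + cut, True)
                prev = cut
            return blocks
        return split(0, img.shape[0], 0, img.shape[1], True)

Designer layouts with overlapping or non-rectangular regions break this projection assumption, which is what motivates the more sophisticated methods covered later in the talk.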

Bio: Prof. Bhabatosh Chanda received B.E. in Electronics and Telecommunication Engineering and PhD in Electrical Engineering from the University of Calcutta in 1979 and 1988 respectively. His research interests include image and video processing, pattern recognition, computer vision and mathematical morphology. He has published more than 100 technical articles in refereed journals and conferences, authored one book and edited five books. He received the ‘Young Scientist Medal’ of the Indian National Science Academy in 1989, the ‘Computer Engineering Division Medal’ of the Institution of Engineers (India) in 1998, the ‘Vikram Sarabhai Research Award’ in 2002, and the IETE-Ram Lal Wadhwa Gold Medal in 2007. He is also a recipient of a UN fellowship, a UNESCO-INRIA fellowship and the Diamond Jubilee fellowship of the National Academy of Sciences, India. He is a fellow of the Institute of Electronics and Telecommunication Engineers (FIETE), the National Academy of Sciences, India (FNASc), the Indian National Academy of Engineering (FNAE) and the International Association for Pattern Recognition (FIAPR). He is a Professor at the Indian Statistical Institute, Kolkata, India.


Historical Document Analysis

Friday, January 27

Speaker: Prof. Marcus Liwicki

Lecture Slides

Abstract: I will give an overview of the challenges of historical documents and of current research highlights for various document image analysis (DIA) problems. Historical documents pose very tough challenges to automatic DIA algorithms: typically, exotic scripts and layouts were used, and the documents have degraded over time. I will give an overview of typical processing algorithms and also report on recent trends towards interoperability.

In the first part of the presentation, I will describe methods for line segmentation, binarization, and layout analysis. Recent deep learning trends in particular have led to remarkable improvements in processing systems compared to conventional methods. On top of that, if enough data is available, those methods are also much easier to apply, since they perform end-to-end recognition and make several processing steps obsolete. Using examples, I will show that separating the analysis into several independent steps can even lead to problems and worse performance in the later stages. The reasons for this are twofold: first, it is not clear how to define the ground truth (i.e., the expected perfect outcome) of some individual steps; second, early recognition errors can make processing much more difficult for the later stages. The only remaining problem for deep learning is the need for large amounts of training data. I will demonstrate methods to automatically extend existing ground-truthed datasets to generate more training data.
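
One simple family of such dataset-extension methods is label-preserving distortion of existing line images. The sketch below applies small random affine jitter (the parameter ranges are invented and this is generic augmentation, not the specific methods of the talk):

    import numpy as np
    import cv2

    def augment(line_img, rng=np.random):
        """Return a slightly distorted copy of a text-line image; the
        transcription label stays valid because the distortion is small."""
        h, w = line_img.shape[:2]
        angle = rng.uniform(-2, 2)                  # slight rotation (deg)
        scale = rng.uniform(0.95, 1.05)             # slight scaling
        m = cv2.getRotationMatrix2D((w / 2, h / 2), angle, scale)
        m[0, 2] += rng.uniform(-0.02, 0.02) * w     # small translation
        return cv2.warpAffine(line_img, m, (w, h), borderValue=255)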

In the second part, I will sketch recent approaches of the Document, Image, and Voice Analysis (DIVA) group towards enabling libraries and researchers in the humanities to make easier use of state-of-the-art DIA methods. Common structures, adaptable methods, public datasets, and open services (e.g., DIVAServices, which will be presented in more depth by Marcel Würsch in the next presentation) lead to easier re-use, access, and integration into tools used at libraries and archives or in research environments.

Lab: The lab will involve hands-on practice with DIVAServices, web services for Document Image Analysis. Participants will be able to try out state-of-the-art document image processing methods and learn how to easily integrate their own methods into DIVAServices.

Bio: Marcus Liwicki received his M.S. degree in Computer Science from the Free University of Berlin, Germany, in 2004, his PhD degree from the University of Bern, Switzerland, in 2007, and his habilitation degree from the Technical University of Kaiserslautern, Germany, in 2011. Currently he is an apl. professor at the University of Kaiserslautern and a senior assistant at the University of Fribourg. His research interests include machine learning, pattern recognition, artificial intelligence, human-computer interaction, digital humanities, knowledge management, ubiquitous intuitive input devices, document analysis, and graph matching. From October 2009 to March 2010 he visited Kyushu University (Fukuoka, Japan) as a research fellow (visiting professor), supported by the Japanese Society for the Promotion of Science. In 2015, at the age of 32, he received the ICDAR Young Investigator Award, a biennial award acknowledging outstanding achievements in pattern recognition by researchers up to the age of 40. Marcus Liwicki has given a number of invited talks at international workshops, universities, and companies, as well as several tutorials at IAPR conferences. He is a co-author of the book "Recognition of Whiteboard Notes – Online, Offline, and Combination", published by World Scientific in October 2008. He has more than 150 publications, including more than 20 journal papers, not counting more than 20 publications currently under review or soon to be published.


Analyzing text documents - separating the wheat from the chaff

Friday, January 27

Speaker: Dr. Lipika Dey

Lecture Slides

Abstract: The rapid rise of digital text document collections is exciting for decision makers across different sectors, be it academia or industry. While academia is interested in gathering insights about scientific and technical progress in different areas of research, industry is interested in knowing more about its potential consumers and competitors. All this and much more is available today, almost free of cost, on the open web. However, text data can be extremely noisy and deceptive. Noise creeps in from various sources, some intended and some unintended. While some of this noise can be treated at the pre-processing level, some must be dealt with during the analysis process itself. In this talk we shall take a look at the various pitfalls that need to be carefully avoided or taken care of in order to come up with meaningful insights from text documents.
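
Some of the noise that can be treated at the pre-processing level yields to simple rules. The sketch below shows common examples of such rules (generic illustrations, not the speaker's pipeline):

    import re

    def clean(text):
        """Strip typical web-text noise before analysis."""
        text = re.sub(r"<[^>]+>", " ", text)          # leftover HTML tags
        text = re.sub(r"https?://\S+", " ", text)     # bare URLs
        text = re.sub(r"(.)\1{3,}", r"\1\1", text)    # "sooooo" -> "soo"
        text = re.sub(r"\s+", " ", text).strip()      # collapse whitespace
        return text

Intentional noise, such as deceptive or spam content, is not rule-shaped like this and has to be handled during analysis, which is the harder case the talk focuses on.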

Demo: Texcape. Given the volume and velocity at which research publications are growing, keeping up with advances in various fields is a challenging task. However, decision makers, including academics, program managers, venture capital investors, industry leaders and funding agencies, not only need to be abreast of the latest developments but must also be able to assess the future impact of research on industry, academia or society. Automated extraction of key information and insights from these text documents is necessary to help in this endeavor. Texcape is a technology-landscaping tool built on top of scientific publications and patents that attempts to help in this task. This demo will show how Texcape performs automated topical analysis of large volumes of text and analyzes evolutions, commercializations and trends to help in collaborative decision making.

Bio: Dr. Lipika Dey is a Senior Consultant and Principal Scientist at Tata Consultancy Services, India, with over 20 years of experience in academic and industrial R&D. She heads the Web Intelligence and Text Mining research group at Innovation Labs. Lipika's research interests are in the areas of content analytics from social media and news, social network analytics, predictive modeling, sentiment analysis and opinion mining, and semantic search of enterprise content. Her focus is on seamless integration of social intelligence and business intelligence. She is keenly interested in developing analytical frameworks for integrated analysis of unstructured and structured data. Lipika publishes her work in various international conferences and journals. She has also presented her earlier work at the Sentiment Analysis Symposium and the Text Mining Summit. Lipika was awarded the Distinguished Scientist award by TCS in 2012. Prior to joining the industry in 2007, Lipika was a faculty member in the Department of Mathematics at the Indian Institute of Technology, Delhi, from 1995 to 2006. She has several publications in international journals and refereed conference proceedings. Lipika has a Ph.D. in Computer Science and Engineering, an M.Tech in Computer Science and Data Processing, and a 5-year Integrated M.Sc. in Mathematics from IIT Kharagpur.


Language Model: Theory and Applications

Saturday, January 28

Speaker: Dr. Utkarsh Porwal

Lab related material

Abstract: A language model helps us compute the probability of a sequence of terms, such as words, given a corpus. It is widely used in applications like spell correction, POS tagging, information retrieval, speech recognition and handwriting recognition. In this talk, we will cover the theory of language models, from n-gram based models to recent RNN based models, including parameter estimation and evaluation. We will also cover a wide range of applications where language modeling is used.
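
To make the central objects concrete, here is a toy add-one-smoothed bigram model with perplexity evaluation (a minimal sketch; the talk covers far more capable n-gram and RNN models):

    import math
    from collections import Counter

    def train_bigram(sentences):
        """Count unigrams and bigrams over whitespace-tokenized sentences."""
        uni, bi = Counter(), Counter()
        for s in sentences:
            toks = ["<s>"] + s.split() + ["</s>"]
            uni.update(toks)
            bi.update(zip(toks, toks[1:]))
        return uni, bi

    def perplexity(sentence, uni, bi):
        """Per-token perplexity under add-one (Laplace) smoothing."""
        toks = ["<s>"] + sentence.split() + ["</s>"]
        v = len(uni)  # vocabulary size for smoothing
        logp = sum(math.log((bi[(a, b)] + 1) / (uni[a] + v))
                   for a, b in zip(toks, toks[1:]))
        return math.exp(-logp / (len(toks) - 1))

    uni, bi = train_bigram(["the cat sat", "the dog sat", "a cat ran"])
    print(perplexity("the cat ran", uni, bi))

Lower perplexity means the model finds the sequence less surprising, which is also how candidate corrections are ranked in applications like spell correction and handwriting recognition.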

Lab: In this lab session, participants will learn to train and evaluate different types of language models, such as n-gram based and RNN based models, and will be able to compare them in terms of performance, data efficiency, storage, etc.

Bio: Dr. Utkarsh Porwal is an applied researcher at eBay. He works on automatic query rewrites, entity recognition and structured data. Before joining search science, he was part of the trust science group, where he worked on detecting abusive buyers and on feature selection. His research interests lie broadly in the areas of information retrieval, pattern recognition and applied machine learning. He received his Ph.D. from the State University of New York at Buffalo in 2014.


Extreme Classification for Tagging on Wikipedia, Query Ranking on Bing and Product Recommendation on Amazon

Saturday, January 28

Speaker: Prof. Manik Varma

Abstract: The objective in extreme classification is to develop classifiers that can automatically annotate each data point with the most relevant subset of labels from an extremely large label set. In this talk, we will develop a new paradigm for tagging, ranking and recommendation based on extreme classification. In particular, we design extreme multi-label loss functions that are tailored for tagging, ranking and recommendation, and show that these loss functions are more suitable for performance evaluation than traditional metrics. Furthermore, we develop novel algorithms for optimizing the proposed loss functions and demonstrate that these can lead to significant improvements over the state of the art on various real-world applications, ranging from tagging on Wikipedia to sponsored search advertising on Bing to product recommendation on Amazon. More details, including publications, videos, datasets and source code, can be found at http://www.manikvarma.org/.
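
A core evaluation metric in this area is precision at k, which checks how many of the k highest-scoring labels are actually relevant. A minimal sketch of the unweighted version (the propensity-scored variants developed in this line of work are omitted):

    import numpy as np

    def precision_at_k(scores, true_labels, k=5):
        """scores: (n_samples, n_labels) predicted relevance;
        true_labels: binary ndarray of the same shape.
        Returns mean precision@k over all samples."""
        topk = np.argsort(-scores, axis=1)[:, :k]          # top-k label ids
        hits = np.take_along_axis(true_labels, topk, axis=1)
        return hits.mean(axis=1).mean()

With millions of labels, only a handful can be shown to a user, which is why metrics over the top k predictions, rather than over all labels, drive both evaluation and the design of the loss functions mentioned above.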

Brief Bio: Prof. Manik Varma is a researcher at Microsoft Research India and an adjunct professor of computer science at IIT Delhi. His research interests span machine learning, computational advertising and computer vision. He has served as an area chair for CVPR, ICCV, ICML, ICVGIP, IJCAI and NIPS. Classifiers that he has developed are running live on millions of devices around the world, protecting them from viruses and malware. Manik has been awarded the Microsoft Gold Star award and the Microsoft Achievement award, won the PASCAL VOC Object Detection Challenge, and stood first in chicken chess tournaments and Pepsi drinking competitions. He is a failed physicist (BSc St. Stephen's College, David Raja Ram Prize), theoretician (BA Oxford, Rhodes Scholar), engineer (DPhil Oxford, University Scholar) and mathematician (MSRI Berkeley, Post-doctoral Fellow).


System Demo:

Demo of Indian Language OCRs

Mr. Tushar Patnayak, CDAC Noida

Abstract: Indian Language OCR: e-Aksharayan, an Indian language OCR, facilitates converting hardcopy printed documents into electronic form using a new approach, leading, for the first time, to a technology for recognizing characters and words in scanned images of documents in a large set of Indian scripts/languages. Optical Character Recognition (OCR) for Indian scripts opens up the possibility of delivering traditional Indian language content, which today is confined to printed books, to readers across the world through electronic means. OCR makes the content searchable as well as readable via a variety of devices like mobile phones, tablets and e-readers. Further, the same content can now be transformed electronically to meet the needs of the visually challenged through generation of Braille and/or audio books, among other possibilities. Use of OCR on printed Indian language circulars and notifications can make embedded information widely accessible, facilitating effective e-governance. The circulars can now be very easily edited, if required, for adaptation to different needs. The OCR process involves first converting printed matter into an electronic image using a scanner or a digital camera, followed by electronic image processing to generate Unicode text. This text can be opened in any word-processing application for editing. e-Aksharayan has a user-friendly design and allows intuitive editing of the scanned image and the generated text.

Features of e-Aksharayan are:

  • It enables users to harness the power of computers to access printed documents in Indian languages/scripts.
  • A number of pre-processing routines are available, such as skew detection and correction, noise removal, and thresholding, to convert an input gray-scale document image into a clean binary image for successful recognition (a minimal sketch of two of these steps follows this list). Other pre-processing steps include color image processing; dithering, color highlight, color stamp, underline, annotation and marginal noise removal; and text/non-text separation.
  • The present version of e-Aksharayan supports major Indian languages/scripts: Assamese, Bangla, Gurmukhi, Hindi, Kannada, Malayalam, Tamil, Telugu, Urdu, Gujarati, Oriya, Manipuri and Marathi.
  • It converts printed document images to editable text with up to 90-95% recognition accuracy at the character level and 85-90% at the word level.
  • The current version of e-Aksharayan takes 45 to 60 seconds to process an A4-size page on a standard desktop.
  • The digitized text can be converted to Braille for the visually impaired.
  • Other applications that can be built around the OCR technology include text-to-speech conversion for the visually impaired, proof-reading for authors, search engines and content analysis, a multilingual tool for mitigating code-of-conduct cases at the Election Commission, interactive learning games/toys for children to understand letter/word formation, and an Android app for an OCR-based dictionary and translator that recognizes multilingual scene text captured from sign-boards, hoardings, etc.
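
As referenced in the pre-processing bullet above, the following is a minimal deskew-and-binarize sketch using OpenCV. It is generic code, since e-Aksharayan's own routines are not public, and note that the angle convention of cv2.minAreaRect varies across OpenCV versions:

    import cv2
    import numpy as np

    def deskew_and_binarize(gray):
        """gray: 8-bit grayscale page image. Returns a deskewed binary image."""
        # Otsu thresholding: ink becomes white (255) on a black background.
        _, binary = cv2.threshold(gray, 0, 255,
                                  cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
        # Estimate the dominant text angle from the ink pixels.
        coords = np.column_stack(np.where(binary > 0))[:, ::-1]  # (x, y)
        angle = cv2.minAreaRect(coords.astype(np.float32))[-1]
        if angle > 45:  # normalize; the convention differs by OpenCV version
            angle -= 90
        h, w = gray.shape
        m = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
        return cv2.warpAffine(binary, m, (w, h))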

Demo of Indian Language OHWRs

Swapnil Belhe, CDAC Pune

Abstract: With the recent advancement in Indian language Optical Character Recognition (OCR) and Online Handwritten Character Recognition (OHWR) engines, a wide variety of applications has been developed around these engines to cater to various needs. The engines make use of the latest developments in document and handwriting analysis, making them robust to font and writing-style variations. Most of the OCR and OHWR engines are trained on huge collections of data, which adds to their robustness.

The demonstrations will focus on desktop- and mobile-based OCRs for Indian languages and their complexities. At the same time, the demonstrations of OHWRs will show the effectiveness of handwriting recognition on handheld devices. An effective way of multi-modal input for form processing in Indian languages using handwriting recognition will be showcased. Various learning games developed using the OCRs and OHWRs will be demonstrated. These demos will also provide a glimpse of future challenges.