The school will give researchers in-depth and objective exposure to the emerging area of understanding large-scale document collections and will highlight open problems in this area. With the deluge of documents (in a variety of forms: images, web pages, etc.) on the web, individual and/or collaborative information foraging from document collections has become a challenging task. Researchers are developing tools not only to understand the structure of documents but also to facilitate comprehensive interpretation of content that may be embedded in more than one document in the form of text, tables and/or graphics. In this summer school we shall address different aspects of this problem of discovering actionable insight in large document collections through tutorial-level talks and research overview presentations. The school will focus on the following core areas:

  • Features, representation and indexing for large document collections
  • Content representation and manipulation
  • Information and document retrieval
  • Machine learning and analytics for large document repositories

The school will be conducted by world-renowned and experienced researchers. It will have (i) Tutorial talks; (ii) Focused research seminars; (iii) Poster session presenting research of the participants; (iv) Hands-on Exercises; (v) Industrial Presentations.


  • Document Image Analysis, OCR, ICR, Large Scale Systems, Information Extraction and Retrieval, Multilingual Systems, Advanced Machine Learning techniques
  • Partial travel scholarships for international students
  • Field experts will provide tutorial-style lectures and “hands-on” laboratory classes

