
Understanding Deep Image Representations by Inverting Them


Aravindh Mahendran

D.Phil at University of Oxford

Date : 19/12/2015



Image representations, from SIFT and Bag of Visual Words to Convolutional Neural Networks (CNNs), are a crucial component of almost any image understanding system. Nevertheless, our understanding of them remains limited. In this talk I'll discuss our experiments on the visual information contained in representations by asking the following question: given an encoding of an image, to what extent is it possible to reconstruct the image itself? To answer this question, we contribute a general framework to invert representations. We show that this method can invert representations such as HOG and SIFT more accurately than recent alternatives while being applicable to CNNs too. We then use this technique to study the inverse of recent state-of-the-art CNN image representations for the first time. Among our findings, we show that several layers in CNNs retain photographically accurate information about the image, with different degrees of geometric and photometric invariance.
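The question posed above (how much of an image can be recovered from its encoding) can be illustrated with a toy example. The sketch below is not the framework from the talk: it assumes a hypothetical one-dimensional "image" and a hypothetical average-pooling encoder (`encode`), and recovers a signal by plain gradient descent on the squared difference between codes (`invert`). All names and parameters are illustrative assumptions.

```python
def encode(signal, pool=2):
    """Toy representation (assumption): mean over non-overlapping windows."""
    return [sum(signal[i:i + pool]) / pool for i in range(0, len(signal), pool)]


def invert(code, length, pool=2, lr=0.5, steps=200):
    """Find x whose code matches `code` by gradient descent on
    0.5 * ||encode(x) - code||^2, starting from an all-zero signal.
    `length` is assumed to be a multiple of `pool`."""
    x = [0.0] * length
    for _ in range(steps):
        residual = [c_hat - c for c_hat, c in zip(encode(x, pool), code)]
        # Each x[i] contributes 1/pool to the mean of its window, so the
        # gradient of the loss w.r.t. x[i] is residual[i // pool] / pool.
        x = [xi - lr * residual[i // pool] / pool for i, xi in enumerate(x)]
    return x


original = [1.0, 3.0, 2.0, 2.0]
code = encode(original)
reconstruction = invert(code, len(original))
```

Here the reconstruction reproduces the code of the original signal but not the signal itself: the variation inside each pooling window is discarded by this toy encoder, which is a small-scale analogue of the invariances mentioned in the abstract.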

Brief Bio:

Aravindh completed his undergraduate degree in CSE at IIIT Hyderabad from 2008 to 2012. He worked in the Cognitive Science lab and Robotics lab as part of his undergraduate research (B.Tech honors). He completed an MSc in Robotics at Carnegie Mellon University and is currently reading for a D.Phil in Engineering Science at the University of Oxford, in the Visual Geometry Group with Prof. Andrea Vedaldi.


Understanding Reality for Generating Credible Augmentations


Pushmeet Kohli

Microsoft Research

Date : 01/12/2015


Brief Bio:

Pushmeet Kohli is a principal research scientist at Microsoft Research. In 2015, he was appointed the technical advisor to Rick Rashid, the Chief Research Officer of Microsoft. Pushmeet's research revolves around Intelligent Systems and Computational Sciences, and he publishes in the fields of Machine Learning, Computer Vision, Information Retrieval, and Game Theory. His current research interests include 3D Reconstruction and Rendering, Probabilistic Programming, and Interpretable and Verifiable Knowledge Representations from Deep Models. He is also interested in conversational agents for task completion, machine learning systems for healthcare, and 3D rendering and interaction for augmented and virtual reality. His papers have won awards at ICVGIP 2006, 2010, ECCV 2010, ISMAR 2011, TVX 2014, CHI 2014, WWW 2014 and CVPR 2015. His research has also been the subject of a number of articles in popular media outlets such as Forbes, Wired, BBC, New Scientist and MIT Technology Review. Pushmeet is a part of the Association for Computing Machinery's (ACM) Distinguished Speaker Program.


Learning to Super-Resolve Images Using Self-Similarities


Dr. Abhishek Singh

Research Scientist, Amazon Lab126

Date : 17/11/2015



The single image super-resolution problem involves estimating a high-resolution image from a single, low-resolution observation. Due to its highly ill-posed nature, the choice of appropriate priors has been an active research area of late. Data-driven or learning-based priors have been successful in addressing this problem. In this talk, I will review some recent learning-based approaches to the super-resolution problem, and present some novel algorithms which can better super-resolve high-frequency details in the scene. In particular, I will talk about novel self-similarity driven algorithms that do not require any external database of training images, but instead learn the mapping from low-resolution to high-resolution using patch recurrence across scales, within the same image. Furthermore, I will also present a novel framework for jointly addressing the super-resolution and denoising problems, in order to obtain a clean, high-resolution image from a single, noise-corrupted, low-resolution observation.

Brief Bio:

Abhishek Singh is a Research Scientist at Amazon Lab126 in Sunnyvale, California. He obtained a Ph.D. in Electrical and Computer Engineering from the University of Illinois at Urbana-Champaign in Feb 2015, where he worked with Prof. Narendra Ahuja on learning based super-resolution algorithms, among other problems. He was the recipient of the Joan and Lalit Bahl Fellowship, and the Computational Science and Engineering Fellowship at the University of Illinois. He has also been affiliated with Mitsubishi Electric Research Labs, Siemens Corporate Research, and UtopiaCompression Corporation. His current research interests include learning based approaches for low level vision and image processing problems. For more information, please visit


Multi-view Learning using Statistical Dependence


Dr. Abhishek Tripathi

Research Scientist, Xerox Research Centre India

Date : 02/11/2015



Multi-view learning is the task of learning from multiple sources with co-occurring samples. Here, I will talk about multi-view learning techniques which find shared information between multiple sources in an unsupervised setting. We use statistical dependence as a measure to find shared information. Multi-view learning becomes more challenging and interesting (i) without co-occurring samples in multiple views and (ii) with an arbitrary collection of matrices. I will present our work around these two problems with the help of some practical applications.

Brief Bio:

Dr. Abhishek Tripathi has been working as a Research Scientist at Xerox Research Centre India (XRCI), Bangalore since January 2012. He is part of the Machine Learning group, where the focus domains include Transportation, Healthcare and Human Resources. Prior to XRCI, Abhishek spent one year at Xerox Research Centre Europe, France. He received his PhD in Computer Science from the University of Helsinki, Finland. His research interests include unsupervised multi-view learning, matrix factorization, recommender systems, data fusion and dimensionality reduction.


Parallel Inverse Kinematics for Multi-Threaded Architectures


Dr. Pawan Harish

Post Doctoral Researcher at EPFL

Date : 02/11/2015



In this talk I will present a parallel prioritized Jacobian-based inverse kinematics algorithm for multi-threaded architectures. The approach solves damped least squares inverse kinematics using a parallel line search by identifying and sampling critical input parameters. Parallel competing execution paths are spawned for each parameter in order to select the optimum which minimizes the error criteria. The algorithm is highly scalable and can handle complex articulated bodies at interactive frame rates. The results are shown on complex skeletons consisting of more than 600 degrees of freedom while being controlled using multiple end effectors. We implement our algorithm on both multi-core and GPU architectures and demonstrate how the GPU can further exploit fine-grain parallelism not directly available on a multi-core processor. The implementations are 10 to 150 times faster than state-of-the-art serial implementations while providing higher accuracy. We also demonstrate the scalability of the algorithm over multiple scenarios and explore the GPU implementation in detail.
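As a rough illustration of the damped least squares step described above, the following sketch solves inverse kinematics for a hypothetical planar two-link arm. It is not the speaker's algorithm: the arm, the link lengths, and the choice of the damping factor as the sampled parameter are assumptions made for illustration, and the competing candidates are evaluated serially here rather than on parallel execution paths.

```python
import math

L1, L2 = 1.0, 1.0  # hypothetical link lengths


def forward(theta):
    """End-effector position of a planar two-link arm."""
    t1, t2 = theta
    return (L1 * math.cos(t1) + L2 * math.cos(t1 + t2),
            L1 * math.sin(t1) + L2 * math.sin(t1 + t2))


def jacobian(theta):
    """2x2 Jacobian of the end-effector position w.r.t. the joint angles."""
    t1, t2 = theta
    s1, c1 = math.sin(t1), math.cos(t1)
    s12, c12 = math.sin(t1 + t2), math.cos(t1 + t2)
    return [[-L1 * s1 - L2 * s12, -L2 * s12],
            [L1 * c1 + L2 * c12, L2 * c12]]


def error(theta, target):
    x, y = forward(theta)
    return math.hypot(target[0] - x, target[1] - y)


def dls_step(theta, target, damping):
    """One damped least squares update: dtheta = J^T (J J^T + damping^2 I)^-1 e."""
    x, y = forward(theta)
    ex, ey = target[0] - x, target[1] - y
    J = jacobian(theta)
    # A = J J^T + damping^2 I is symmetric positive definite for damping > 0.
    a = J[0][0] ** 2 + J[0][1] ** 2 + damping ** 2
    b = J[0][0] * J[1][0] + J[0][1] * J[1][1]
    d = J[1][0] ** 2 + J[1][1] ** 2 + damping ** 2
    det = a * d - b * b
    vx = (d * ex - b * ey) / det  # v = A^-1 e
    vy = (a * ey - b * ex) / det
    return (theta[0] + J[0][0] * vx + J[1][0] * vy,
            theta[1] + J[0][1] * vx + J[1][1] * vy)


def solve_ik(theta, target, dampings=(0.01, 0.1, 0.5, 1.0, 2.0),
             iters=200, tol=1e-6):
    """Try one candidate step per damping value and keep the one with the
    lowest error; the current pose is also a candidate, so the error
    never increases."""
    for _ in range(iters):
        if error(theta, target) < tol:
            break
        candidates = [dls_step(theta, target, lam) for lam in dampings]
        theta = min(candidates + [theta], key=lambda th: error(th, target))
    return theta


solution = solve_ik((0.3, 0.3), (1.0, 1.0))
```

Each candidate step is independent of the others, which is what makes this kind of search a natural fit for competing threads: in a parallel version, every damping value could be evaluated on its own execution path before the best result is selected.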

Brief Bio:

Pawan Harish joined the PhD program at IIIT Hyderabad, India in 2005, where he focused on Computational Displays and on parallelizing graph algorithms on the GPU under the supervision of Prof. P. J. Narayanan. He completed his PhD in 2013 and joined the University of California, Irvine as a visiting scholar. He worked at Samsung Research India as a technical lead before joining IIG EPFL as a post-doctoral researcher in June 2014. His current research, in association with Moka Studios, is on designing parallel inverse kinematics algorithms for the GPU. His interests include parallel algorithms, novel displays, CHI and computer graphics.


Artificial Intelligence Research @ Facebook


Dr. Manohar Paluri

Ph.D. at Georgia Institute of Technology

Date : 30/10/2015



Facebook has to deal with billions of data points (likes, comments, shares, posts, photos, videos and many more). The only way to provide access to this plethora of information in a structured way is to understand the data and learn mathematical models for various problems (Search, NewsFeed Ranking, Instagram trending, Ads targeting, etc.). The Facebook Artificial Intelligence Research (FAIR) group aims to do this with some of the brightest minds in the field, directed by Dr. Yann LeCun. Our goal is to be at the forefront of Artificial Intelligence and bring that technology to the billions of users using Facebook and beyond. In this talk I will touch on a few aspects of our focus areas and concentrate on Computer Vision related projects. This will be a very high-level overview talk with a few slides of technical details, and the goal is to motivate everyone about the challenges we face at FAIR.

Brief Bio:

Manohar Paluri is currently managing the Applied Computer Vision group at Facebook. His group focuses on building the world's largest image and video understanding platform. Manohar got his Bachelor's degree from IIIT Hyderabad with Honors in Computer Vision and his Master's from Georgia Tech. While pursuing his Ph.D. at Georgia Tech and prior to joining Facebook, Manohar worked with Dr. Steve Seitz's 3D Maps team at Google Research, the video surveillance group at IBM Watson labs and the applied computer vision group at Sarnoff (now part of SRI International).


My Research on Human-centered computing


Dr. Ramanathan Subramanian

National University of Singapore

Date : 26/03/2015



Human-centered computing focuses on all aspects of integrating the human/user within the computational loop of Artificial Intelligence (AI) systems. I am currently interested in developing applications that can analyze humans to make informed decisions (Human Behavior Understanding), and interactively employ implicit human signals (eye movements, EEG/MEG responses) for feedback.

In my talk, I will first present my research on analyzing human behavior from free-standing conversational groups, and the associated problems of head pose estimation and F-formation detection from low-resolution images. Then, I will also describe my studies on emotional perception using eye movements and brain signals.

Brief Bio:

Ramanathan Subramanian is a Research Scientist at the Advanced Digital Sciences Center (ADSC). Previously, he served as a post-doctoral researcher at the Dept. of Information Engineering and Computer Science, University of Trento, Italy and the School of Computing, NUS. His research interests span Human-computer Interaction, Human-behavior understanding, Computer Vision and Computational Social sciences. He is especially interested in studying and modelling various aspects of human visual and emotional perception. His research has contributed to a number of behavioral databases including the prominent NUSEF (eye-fixation), DPOSE (dynamic head pose) and DECAF (Decoding affective multimedia from MEG brain responses) datasets.


Towards automatic video production of staged performances

Dr. Vineet Gandhi

University of Grenoble, France

Date : 26/02/2015



Professional quality videos of live staged performances are created by recording them from different appropriate viewpoints. These are then edited together to portray an eloquent story replete with the ability to draw out the intended emotion from the viewers. Creating such competent videos typically requires a team of skilled camera operators to capture the scene from multiple viewpoints. In this talk, I will introduce an alternative approach where we automatically compute camera movements in post-production using specially designed computer vision methods.

First, I will explain our novel approach for tracking objects and actors in long video sequences. Second, I will describe how the actor tracks can be used for automatically generating multiple clips suitable for video editing by simulating pan-tilt-zoom camera movements within the frame of a single high resolution static camera. I will conclude my talk by presenting test and validation results on a challenging corpus of theatre recordings and demonstrating how the proposed methods open the way to novel applications for cost effective video production of live performances including, but not restricted to, theatre, music and opera.

Brief Bio:

Vineet Gandhi obtained his B.Tech degree in Electronics and Communication Engineering from the Indian Institute of Information Technology, Design and Manufacturing Jabalpur and a master's degree under the prestigious Erasmus Mundus Scholarship program. During his master's he studied in three different countries for a semester each, specializing in the areas of optics, image and vision. He did his master's thesis in the Perception team at INRIA France in collaboration with Samsung Research. He later joined the Imagine team at INRIA as a doctoral researcher and obtained his PhD degree in computer science and applied mathematics.

His current research interests are in the areas of visual learning/detection/recognition, computational photography/videography and sensor fusion for 3D reconstruction. He also enjoys working in the field of optics, colorimetry and general mathematics of signal and image processing.

Facebook AI Research

Date : 18/02/2015



Recent years have seen rapid growth of the field of Deep Learning. Research in Deep Learning has put forth many new ideas which have led to many path breaking developments in Machine Learning. The large set of tools and techniques that have been churned out of this research are fast being explored and applied to different problems in Computer Vision and NLP. This talk shall focus on some of the current trends and practices in Deep Learning research and the challenges that lie ahead.



Extreme Classification: A New Paradigm for Ranking & Recommendation

Manik Varma

Microsoft Research India

Date : 10/02/2015



The objective in extreme multi-label classification is to learn a classifier that can automatically tag a data point with the most relevant subset of labels from a large label set. Extreme multi-label classification is an important research problem since not only does it enable the tackling of applications with many labels but it also allows the reformulation of ranking and recommendation problems with certain advantages over existing formulations.

Our objective, in this talk, is to develop an extreme multi-label classifier that is faster to train and more accurate at prediction than the state-of-the-art Multi-label Random Forest (MLRF) algorithm [Agrawal et al. WWW 13] and the Label Partitioning for Sub-linear Ranking (LPSR) algorithm [Weston et al. ICML 13]. MLRF and LPSR learn a hierarchy to deal with the large number of labels but optimize task independent measures, such as the Gini index or clustering error, in order to learn the hierarchy. Our proposed FastXML algorithm achieves significantly higher accuracies by directly optimizing an nDCG based ranking loss function. We also develop an alternating minimization algorithm for efficiently optimizing the proposed formulation. Experiments reveal that FastXML can be trained on problems with more than a million labels on a standard desktop in eight hours using a single core and in an hour using multiple cores.
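For readers unfamiliar with the ranking measure mentioned above, the sketch below computes nDCG@k for a single data point using the standard definition with binary relevance. It only illustrates the metric, not FastXML or its training objective; the function names and the dictionary-based input format are assumptions made for this example.

```python
import math


def dcg_at_k(gains, k):
    """Discounted cumulative gain of the first k entries of a ranked list."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains[:k]))


def ndcg_at_k(scores, relevant, k):
    """nDCG@k for one data point.

    scores:   dict mapping each label to its predicted score
    relevant: set of ground-truth labels (binary relevance)
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    gains = [1.0 if label in relevant else 0.0 for label in ranked]
    # The ideal ranking places all relevant labels first.
    ideal_dcg = dcg_at_k([1.0] * min(k, len(relevant)), k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0


scores = {"a": 0.9, "b": 0.8, "c": 0.1}
perfect = ndcg_at_k(scores, {"a", "b"}, 2)  # both relevant labels ranked on top
partial = ndcg_at_k(scores, {"a", "c"}, 2)  # one relevant label falls outside the top 2
```

Because the logarithmic discount rewards placing relevant labels near the top of the list, nDCG is sensitive to the ordering of predictions. The task-independent measures named in the abstract, such as the Gini index or clustering error, do not take that ordering into account.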

Brief Bio:

Manik Varma is a researcher at Microsoft Research India where he helps champion the Machine Learning and Optimization area. Manik received a bachelor's degree in Physics from St. Stephen's College, University of Delhi in 1997 and another one in Computation from the University of Oxford in 2000 on a Rhodes Scholarship. He then stayed on at Oxford on a University Scholarship and obtained a DPhil in Engineering in 2004. Before joining Microsoft Research, he was a Post-Doctoral Fellow at the Mathematical Sciences Research Institute Berkeley. He has been an Adjunct Professor at the Indian Institute of Technology (IIT) Delhi in the Computer Science and Engineering Department since 2009 and jointly in the School of Information Technology since 2011. His research interests lie in the areas of machine learning, computational advertising and computer vision. He has served as an Area Chair for machine learning and computer vision conferences such as ACCV, CVPR, ICCV, ICML and NIPS. He has been awarded the Microsoft Gold Star award and has won the PASCAL VOC Object Detection Challenge.