Understanding Deep Image Representations by Inverting Them
Aravindh Mahendran
D.Phil at University of Oxford
Date : 19/12/2015
Abstract:
Image representations, from SIFT and Bag of Visual Words to Convolutional Neural Networks (CNNs), are a crucial component of almost any image understanding system. Nevertheless, our understanding of them remains limited. In this talk I'll discuss our experiments on the visual information contained in representations by asking the following question: given an encoding of an image, to what extent is it possible to reconstruct the image itself? To answer this question we contribute a general framework to invert representations. We show that this method can invert representations such as HOG and SIFT more accurately than recent alternatives while being applicable to CNNs too. We then use this technique to study the inverse of recent state-of-the-art CNN image representations for the first time. Among our findings, we show that several layers in CNNs retain photographically accurate information about the image, with different degrees of geometric and photometric invariance.
Brief Bio:
Understanding Reality for Generating Credible Augmentations
Pushmeet Kohli
Microsoft Research
Date : 01/12/2015
Brief Bio:
Pushmeet Kohli is a principal research scientist at Microsoft Research. In 2015, he was appointed the technical advisor to Rick Rashid, the Chief Research Officer of Microsoft. Pushmeet's research revolves around Intelligent Systems and Computational Sciences, and he publishes in the fields of Machine Learning, Computer Vision, Information Retrieval, and Game Theory. His current research interests include 3D Reconstruction and Rendering, Probabilistic Programming, and Interpretable and Verifiable Knowledge Representations from Deep Models. He is also interested in conversational agents for task completion, machine learning systems for healthcare, and 3D rendering and interaction for augmented and virtual reality. His papers have won awards at ICVGIP 2006, 2010, ECCV 2010, ISMAR 2011, TVX 2014, CHI 2014, WWW 2014 and CVPR 2015. His research has also been the subject of a number of articles in popular media outlets such as Forbes, Wired, BBC, New Scientist and MIT Technology Review. Pushmeet is a part of the Association for Computing Machinery's (ACM) Distinguished Speaker Program.
Learning to Super-Resolve Images Using Self-Similarities
Dr. Abhishek Singh
Research Scientist, Amazon Lab126
Date : 17/11/2015
Abstract:
The single image super-resolution problem involves estimating a high-resolution image from a single, low-resolution observation. Due to its highly ill-posed nature, the choice of appropriate priors has been an active research area of late. Data-driven or learning-based priors have been successful in addressing this problem. In this talk, I will review some recent learning-based approaches to the super-resolution problem, and present some novel algorithms which can better super-resolve high-frequency details in the scene. In particular, I will talk about novel self-similarity driven algorithms that do not require any external database of training images, but instead learn the mapping from low-resolution to high-resolution using patch recurrence across scales, within the same image. Furthermore, I will also present a novel framework for jointly addressing the super-resolution and denoising problems, in order to obtain a clean, high-resolution image from a single, noise-corrupted, low-resolution observation.
Brief Bio:
Multi-view Learning using Statistical Dependence
Dr. Abhishek Tripathi
Research Scientist, Xerox Research Centre India
Date : 02/11/2015
Abstract:
Multi-view learning is the task of learning from multiple sources with co-occurring samples. Here, I will talk about multi-view learning techniques which find shared information between multiple sources in an unsupervised setting. We use statistical dependence as a measure to find shared information. Multi-view learning becomes more challenging and interesting (i) without co-occurring samples in multiple views and (ii) with an arbitrary collection of matrices. I will present our work around these two problems with the help of some practical applications.
Brief Bio:
Parallel Inverse Kinematics for Multi-Threaded Architectures
Dr. Pawan Harish
Post Doctoral Researcher at EPFL
Date : 02/11/2015
Abstract:
In this talk I will present a parallel prioritized Jacobian-based inverse kinematics algorithm for multi-threaded architectures. The approach solves damped least squares inverse kinematics using a parallel line search by identifying and sampling critical input parameters. Parallel competing execution paths are spawned for each parameter in order to select the optimum which minimizes the error criterion. The algorithm is highly scalable and can handle complex articulated bodies at interactive frame rates. The results are shown on complex skeletons consisting of more than 600 degrees of freedom while being controlled using multiple end effectors. We implement our algorithm on both multi-core and GPU architectures and demonstrate how the GPU can further exploit fine-grain parallelism not directly available on a multi-core processor. The implementations are 10 to 150 times faster than state-of-the-art serial implementations while providing higher accuracy. We also demonstrate the scalability of the algorithm over multiple scenarios and explore the GPU implementation in detail.
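As a rough illustration of the damped least squares update mentioned in the abstract, the sketch below applies the standard step dtheta = J^T (J J^T + lambda^2 I)^-1 e to a two-link planar arm. It is a plain serial version and does not reproduce the parallel line search or prioritization presented in the talk; the arm geometry, the function names (`forward_kinematics`, `dls_step`, `solve_ik`) and the damping value are assumptions chosen for the example.

```python
import math

def forward_kinematics(theta, lengths=(1.0, 1.0)):
    """End-effector position of a hypothetical two-link planar arm."""
    t1, t2 = theta
    l1, l2 = lengths
    x = l1 * math.cos(t1) + l2 * math.cos(t1 + t2)
    y = l1 * math.sin(t1) + l2 * math.sin(t1 + t2)
    return x, y

def jacobian(theta, lengths=(1.0, 1.0)):
    """2x2 Jacobian of the end-effector position w.r.t. the joint angles."""
    t1, t2 = theta
    l1, l2 = lengths
    s1, c1 = math.sin(t1), math.cos(t1)
    s12, c12 = math.sin(t1 + t2), math.cos(t1 + t2)
    return ((-l1 * s1 - l2 * s12, -l2 * s12),
            (l1 * c1 + l2 * c12, l2 * c12))

def dls_step(theta, target, damping=0.1, lengths=(1.0, 1.0)):
    """One damped least squares update of the joint angles."""
    x, y = forward_kinematics(theta, lengths)
    ex, ey = target[0] - x, target[1] - y
    (j11, j12), (j21, j22) = jacobian(theta, lengths)
    # A = J J^T + damping^2 I (symmetric positive definite for damping > 0)
    a11 = j11 * j11 + j12 * j12 + damping ** 2
    a12 = j11 * j21 + j12 * j22
    a22 = j21 * j21 + j22 * j22 + damping ** 2
    det = a11 * a22 - a12 * a12
    # Solve A w = e with the closed-form 2x2 inverse
    w1 = (a22 * ex - a12 * ey) / det
    w2 = (a11 * ey - a12 * ex) / det
    # dtheta = J^T w
    return (theta[0] + j11 * w1 + j21 * w2,
            theta[1] + j12 * w1 + j22 * w2)

def solve_ik(target, initial=(0.2, 1.2), iterations=50, damping=0.1):
    """Repeat the damped step a fixed number of times from an initial guess."""
    theta = initial
    for _ in range(iterations):
        theta = dls_step(theta, target, damping)
    return theta
```

The damping term keeps the linear system well conditioned near singular poses, at the cost of smaller steps; choosing it well is the kind of parameter search the abstract describes parallelizing.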
Brief Bio:
Artificial Intelligence Research @ Facebook
Dr. Manohar Paluri
Ph.D. at Georgia Institute of Technology
Date : 30/10/2015
Abstract:
Facebook has to deal with billions of data points (likes, comments, shares, posts, photos, videos and many more). The only way to provide access to this plethora of information in a structured way is to understand the data and learn mathematical models for various problems (Search, NewsFeed Ranking, Instagram trending, Ads targeting, etc.). The Facebook Artificial Intelligence Research (FAIR) group aims to do this with some of the brightest minds in the field, directed by Dr. Yann LeCun. Our goal is to be at the forefront of Artificial Intelligence and bring that technology to the billions of users using Facebook and beyond. In this talk I will touch on a few aspects of our focus areas, with an emphasis on Computer Vision related projects. This will be a very high-level overview talk with a few slides of technical details, and the goal is to motivate everyone about the challenges we face at FAIR.
Brief Bio:
My Research on Human-centered computing
Dr. Ramanathan Subramanian
National University of Singapore
Date : 26/03/2015
Abstract:
Human-centered computing focuses on all aspects of integrating the human/user within the computational loop of Artificial Intelligence (AI) systems. I am currently interested in developing applications that can analyze humans to make informed decisions (Human Behavior Understanding), and interactively employ implicit human signals (eye movements, EEG/MEG responses) for feedback.
In my talk, I will first present my research on analyzing human behavior from free-standing conversational groups, and the associated problems of head pose estimation and F-formation detection from low-resolution images. Then, I will also describe my studies on emotional perception using eye movements and brain signals.
Brief Bio:
Ramanathan Subramanian is a Research Scientist at the Advanced Digital Sciences Center (ADSC). Previously, he has served as a post-doctoral researcher at the Dept. of Information Engineering and Computer Science, University of Trento, Italy and the School of Computing, NUS. His research interests span Human-computer Interaction, Human-behavior understanding, Computer Vision and Computational Social sciences. He is especially interested in studying and modelling various aspects of human visual and emotional perception. His research has contributed to a number of behavioral databases including the prominent NUSEF (eye-fixation), DPOSE (dynamic head pose) and DECAF (Decoding affective multimedia from MEG brain responses) datasets.
Towards automatic video production of staged performances
Dr. Vineet Gandhi
University of Grenoble, France
Date : 26/02/2015
Abstract:
Professional quality videos of live staged performances are created by recording them from different appropriate viewpoints. These are then edited together to portray an eloquent story replete with the ability to draw out the intended emotion from the viewers. Creating such competent videos typically requires a team of skilled camera operators to capture the scene from multiple viewpoints. In this talk, I will introduce an alternative approach where we automatically compute camera movements in post-production using specially designed computer vision methods.
First, I will explain our novel approach for tracking objects and actors in long video sequences. Second, I will describe how the actor tracks can be used for automatically generating multiple clips suitable for video editing by simulating pan-tilt-zoom camera movements within the frame of a single high resolution static camera. I will conclude my talk by presenting test and validation results on a challenging corpus of theatre recordings and demonstrating how the proposed methods open the way to novel applications for cost effective video production of live performances including, but not restricted to, theatre, music and opera.
Brief Bio:
Vineet Gandhi obtained his B.Tech degree in Electronics and Communication Engineering from the Indian Institute of Information Technology, Design and Manufacturing Jabalpur and a master's degree under the prestigious Erasmus Mundus Scholarship program. During his master's he studied in three different countries for a semester each, specializing in the areas of optics, image and vision. He did his master's thesis in the Perception team at INRIA France in collaboration with Samsung Research. He later joined the Imagine team at INRIA as a doctoral researcher and obtained his PhD degree in computer science and applied mathematics.
His current research interests are in the areas of visual learning/detection/recognition, computational photography/videography and sensor fusion for 3D reconstruction. He also enjoys working in the field of optics, colorimetry and general mathematics of signal and image processing.
The current landscape of deep learning. Trends and Challenges
Facebook AI Research
Date : 18/02/2015
Abstract:
Recent years have seen rapid growth in the field of Deep Learning. Research in Deep Learning has put forth many new ideas which have led to many path-breaking developments in Machine Learning. The large set of tools and techniques that have come out of this research are fast being explored and applied to different problems in Computer Vision and NLP. This talk shall focus on some of the current trends and practices in Deep Learning research and the challenges that lie ahead.
Microsoft Research India
Date : 10/02/2015
Abstract:
The objective in extreme multi-label classification is to learn a classifier that can automatically tag a data point with the most relevant subset of labels from a large label set. Extreme multi-label classification is an important research problem since not only does it enable the tackling of applications with many labels but it also allows the reformulation of ranking and recommendation problems with certain advantages over existing formulations.
Our objective, in this talk, is to develop an extreme multi-label classifier that is faster to train and more accurate at prediction than the state-of-the-art Multi-label Random Forest (MLRF) algorithm [Agrawal et al. WWW 13] and the Label Partitioning for Sub-linear Ranking (LPSR) algorithm [Weston et al. ICML 13]. MLRF and LPSR learn a hierarchy to deal with the large number of labels but optimize task independent measures, such as the Gini index or clustering error, in order to learn the hierarchy. Our proposed FastXML algorithm achieves significantly higher accuracies by directly optimizing an nDCG based ranking loss function. We also develop an alternating minimization algorithm for efficiently optimizing the proposed formulation. Experiments reveal that FastXML can be trained on problems with more than a million labels on a standard desktop in eight hours using a single core and in an hour using multiple cores.
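For readers unfamiliar with the nDCG measure mentioned above, the following is a minimal sketch of how nDCG@k is commonly computed for a ranked list of labels. It only illustrates the evaluation measure; it is not the FastXML training objective or its alternating minimization, and the function names and the use of binary relevance are assumptions made for the example.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain of the first k items of a ranked list.

    relevances[i] is the relevance of the label placed at rank i + 1
    (1 for a relevant label and 0 otherwise in the binary case).
    """
    return sum(rel / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(ranked_relevances, k):
    """DCG@k normalized by the DCG@k of the best possible ordering."""
    ideal = dcg_at_k(sorted(ranked_relevances, reverse=True), k)
    if ideal == 0:
        return 0.0  # no relevant labels at all for this data point
    return dcg_at_k(ranked_relevances, k) / ideal
```

Because the logarithmic discount weights the top ranks most heavily, a measure of this kind rewards placing the few relevant labels first, which is the property that matters when only a handful of labels out of a very large set can be shown.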
Brief Bio:
Manik Varma is a researcher at Microsoft Research India where he helps champion the Machine Learning and Optimization area. Manik received a bachelor's degree in Physics from St. Stephen's College, University of Delhi in 1997 and another one in Computation from the University of Oxford in 2000 on a Rhodes Scholarship. He then stayed on at Oxford on a University Scholarship and obtained a DPhil in Engineering in 2004. Before joining Microsoft Research, he was a Post-Doctoral Fellow at the Mathematical Sciences Research Institute Berkeley. He has been an Adjunct Professor at the Indian Institute of Technology (IIT) Delhi in the Computer Science and Engineering Department since 2009 and jointly in the School of Information Technology since 2011. His research interests lie in the areas of machine learning, computational advertising and computer vision. He has served as an Area Chair for machine learning and computer vision conferences such as ACCV, CVPR, ICCV, ICML and NIPS. He has been awarded the Microsoft Gold Star award and has won the PASCAL VOC Object Detection Challenge.