Talks and Visits 2015


Understanding Deep Image Representations by Inverting Them


Aravindh Mahendran

D.Phil. student, University of Oxford

Date : 19/12/2015

 

Abstract:

Image representations, from SIFT and Bag of Visual Words to Convolutional Neural Networks (CNNs), are a crucial component of almost any image understanding system. Nevertheless, our understanding of them remains limited. In this talk I'll discuss our experiments probing the visual information contained in representations by asking the following question: given an encoding of an image, to what extent is it possible to reconstruct the image itself? To answer this question we contribute a general framework to invert representations. We show that this method can invert representations such as HOG and SIFT more accurately than recent alternatives while also being applicable to CNNs. We then use this technique to study the inverses of recent state-of-the-art CNN image representations for the first time. Among our findings, we show that several layers in CNNs retain photographically accurate information about the image, with different degrees of geometric and photometric invariance.
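The framework casts inversion as optimization over the image itself: find an image whose code matches the target code, under a natural-image prior. Below is a minimal sketch in PyTorch, assuming a differentiable representation `phi` (e.g. a CNN truncated at some layer); the total-variation regularizer and all hyper-parameters are illustrative stand-ins, not the talk's exact choices.

```python
import torch

def invert(phi, target_code, image_shape, steps=200, lam=1e-6, lr=0.05):
    """Reconstruct an image x whose encoding phi(x) matches target_code,
    i.e. minimize ||phi(x) - phi(x0)||^2 + lam * R(x) by gradient descent.
    phi: any differentiable representation (e.g. a CNN cut at a layer).
    R: a total-variation term standing in for a natural-image prior."""
    x = torch.zeros(image_shape, requires_grad=True)  # start from a blank image
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        fit = (phi(x) - target_code).pow(2).sum() / target_code.pow(2).sum()
        tv = (x[..., 1:, :] - x[..., :-1, :]).abs().sum() \
           + (x[..., :, 1:] - x[..., :, :-1]).abs().sum()
        (fit + lam * tv).backward()
        opt.step()
    return x.detach()
```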

Brief Bio:

Aravindh did his undergraduate in CSE at IIIT Hyderabad from 2008 to 2012. He worked in the Cognitive Science lab and Robotics lab as part of his undergraduate research (B.Tech honors). He completed an MSc in Robotics at Carnegie Mellon University and is currently reading for a D.Phil in Engineering Science at the University of Oxford, Visual Geometry Group with Prof. Andrea Vedaldi.

 

Understanding Reality for Generating Credible Augmentations


Pushmeet Kohli

Microsoft Research

Date : 01/12/2015

 

Brief Bio:

Pushmeet Kohli is a principal research scientist at Microsoft Research. In 2015, he was appointed technical advisor to Rick Rashid, the Chief Research Officer of Microsoft. Pushmeet's research revolves around Intelligent Systems and Computational Sciences, and he publishes in the fields of Machine Learning, Computer Vision, Information Retrieval, and Game Theory. His current research interests include 3D Reconstruction and Rendering, Probabilistic Programming, and Interpretable and Verifiable Knowledge Representations from Deep Models. He is also interested in conversational agents for task completion, machine learning systems for healthcare, and 3D rendering and interaction for augmented and virtual reality. His papers have won awards at ICVGIP 2006 and 2010, ECCV 2010, ISMAR 2011, TVX 2014, CHI 2014, WWW 2014 and CVPR 2015. His research has also been the subject of a number of articles in popular media outlets such as Forbes, Wired, BBC, New Scientist and MIT Technology Review. Pushmeet is a part of the Association for Computing Machinery's (ACM) Distinguished Speaker Program.


 

Learning to Super-Resolve Images Using Self-Similarities


Dr. Abhishek Singh

Research Scientist, Amazon Lab126

Date : 17/11/2015

 

Abstract:

The single image super-resolution problem involves estimating a high-resolution image from a single, low-resolution observation. Due to its highly ill-posed nature, the choice of appropriate priors has been an active research area of late. Data-driven or learning-based priors have been successful in addressing this problem. In this talk, I will review some recent learning-based approaches to the super-resolution problem, and present some novel algorithms which can better super-resolve high-frequency details in the scene. In particular, I will talk about novel self-similarity driven algorithms that do not require any external database of training images, but instead learn the mapping from low resolution to high resolution using patch recurrence across scales, within the same image. Furthermore, I will also present a novel framework for jointly/simultaneously addressing the super-resolution and denoising problems, in order to obtain a clean, high-resolution image from a single, noise-corrupted, low-resolution observation.
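The patch-recurrence idea can be sketched in a few lines: harvest low-/high-resolution patch pairs from the input image and a downscaled copy of itself, then use them to add back the high-frequency detail that smoothing removes. The toy implementation below illustrates the principle only; the speaker's algorithms are substantially more sophisticated.

```python
import numpy as np
from scipy.ndimage import zoom
from scipy.spatial import cKDTree

def self_sim_sr(img, s=2, p=5):
    """Toy single-image super-resolution via patch recurrence across scales.
    For each patch of a bicubic upscaling, look up the most similar smoothed
    patch inside the input image itself and paste back the detail it lost."""
    up = zoom(img, s, order=3)                       # bicubic guess at the HR image
    smooth = zoom(zoom(img, 1.0 / s, order=3), float(s), order=3)
    h = min(img.shape[0], smooth.shape[0]); w = min(img.shape[1], smooth.shape[1])
    smooth = smooth[:h, :w]
    detail = img[:h, :w] - smooth                    # HR detail paired with each smooth patch

    coords = [(i, j) for i in range(h - p) for j in range(w - p)]
    keys = np.array([smooth[i:i+p, j:j+p].ravel() for i, j in coords])
    tree = cKDTree(keys)                             # nearest-neighbour patch index

    out = up.copy()
    for i in range(0, up.shape[0] - p, p):
        for j in range(0, up.shape[1] - p, p):
            _, k = tree.query(up[i:i+p, j:j+p].ravel())
            bi, bj = coords[k]
            out[i:i+p, j:j+p] += detail[bi:bi+p, bj:bj+p]  # recalled high frequencies
    return out
```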

Brief Bio:

Abhishek Singh is a Research Scientist at Amazon Lab126 in Sunnyvale, California. He obtained a Ph.D. in Electrical and Computer Engineering from the University of Illinois at Urbana-Champaign in Feb 2015, where he worked with Prof. Narendra Ahuja on learning based super-resolution algorithms, among other problems. He was the recipient of the Joan and Lalit Bahl Fellowship, and the Computational Science and Engineering Fellowship at the University of Illinois. He has also been affiliated with Mitsubishi Electric Research Labs, Siemens Corporate Research, and UtopiaCompression Corporation. His current research interests include learning based approaches for low level vision and image processing problems. For more information, please visit http://www.abhishek486.com

 

Multi-view Learning using Statistical Dependence


Dr. Abhishek Tripathi

Research Scientist, Xerox Research Centre India

Date : 02/11/2015

 

Abstract:

Multi-view learning is the task of learning from multiple sources with co-occurring samples. Here, I will talk about multi-view learning techniques which find shared information between multiple sources in an unsupervised setting. We use statistical dependence as a measure to find shared information. Multi-view learning becomes more challenging and interesting (i) without co-occurring samples in multiple views and (ii) with arbitrary collections of matrices. I will present our work on these two problems with the help of some practical applications.
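For co-occurring samples, the canonical dependence-seeking method is classical CCA, sketched below as a baseline for intuition. The talk's techniques go beyond it, handling views without co-occurring samples and arbitrary matrix collections, which plain CCA cannot.

```python
import numpy as np
from scipy.linalg import sqrtm, inv

def cca(X, Y, k=2, reg=1e-6):
    """Classical CCA: find projections wx, wy maximizing corr(X wx, Y wy).
    X, Y: n x dx and n x dy views with co-occurring rows."""
    X = X - X.mean(0); Y = Y - Y.mean(0)
    n = X.shape[0]
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n
    Wx, Wy = np.real(inv(sqrtm(Cxx))), np.real(inv(sqrtm(Cyy)))  # whitening
    U, corrs, Vt = np.linalg.svd(Wx @ Cxy @ Wy)  # SVD of whitened cross-covariance
    return Wx @ U[:, :k], Wy @ Vt[:k].T, corrs[:k]
```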

Brief Bio:

Dr. Abhishek Tripathi has been working as a Research Scientist at Xerox Research Centre India (XRCI), Bangalore, since January 2012. He is part of the Machine Learning group, whose focus domains include Transportation, Healthcare and Human Resources. Prior to XRCI, Abhishek spent one year at Xerox Research Centre Europe, France. He received his PhD in Computer Science from the University of Helsinki, Finland. His research interests include unsupervised multi-view learning, matrix factorization, recommender systems, data fusion and dimensionality reduction.

 

Parallel Inverse Kinematics for Multi-Threaded Architectures


Dr. Pawan Harish

Post Doctoral Researcher at EPFL

Date : 02/11/2015

 

Abstract:

In this talk I will present a parallel prioritized-Jacobian-based inverse kinematics algorithm for multi-threaded architectures. The approach solves damped least squares inverse kinematics using a parallel line search, by identifying and sampling critical input parameters. Parallel competing execution paths are spawned for each parameter in order to select the optimum which minimizes the error criterion. The algorithm is highly scalable and can handle complex articulated bodies at interactive frame rates. Results are shown on complex skeletons consisting of more than 600 degrees of freedom, controlled using multiple end effectors. We implement our algorithm on both multi-core and GPU architectures and demonstrate how the GPU can further exploit fine-grain parallelism not directly available on a multi-core processor. The implementations are 10-150 times faster than state-of-the-art serial implementations while providing higher accuracy. We also demonstrate the scalability of the algorithm over multiple scenarios and explore the GPU implementation in detail.
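The serial core being parallelized is a damped-least-squares update, with the parallel line search selecting among competing damping values. A schematic sketch, with `jacobian` and `fk` (forward kinematics) as assumed user-supplied callables:

```python
import numpy as np

def dls_step(theta, jacobian, fk, target, lam=0.1):
    """One damped-least-squares IK update: solve
    (J^T J + lam^2 I) dtheta = J^T e for the joint increment."""
    J = jacobian(theta)                       # m x n end-effector Jacobian
    e = target - fk(theta)                    # task-space error
    dtheta = np.linalg.solve(J.T @ J + lam**2 * np.eye(J.shape[1]), J.T @ e)
    return theta + dtheta

def competing_paths_step(theta, jacobian, fk, target, lams=(0.01, 0.1, 1.0)):
    # Parallel line search in miniature: spawn one candidate per sampled
    # damping value (run concurrently on CPU threads or GPU blocks in the
    # talk) and keep the candidate that minimizes the error criterion.
    candidates = [dls_step(theta, jacobian, fk, target, l) for l in lams]
    return min(candidates, key=lambda t: np.linalg.norm(target - fk(t)))
```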

Brief Bio:

Pawan Harish joined the PhD program at IIIT Hyderabad, India in 2005, where he focused on computational displays and on parallelizing graph algorithms on the GPU under the supervision of Prof. P. J. Narayanan. He completed his PhD in 2013 and joined the University of California, Irvine as a visiting scholar. He worked at Samsung Research India as a technical lead before joining IIG, EPFL as a post-doctoral researcher in June 2014. His current research, in association with Moka Studios, is on designing parallel inverse kinematics algorithms on the GPU. His interests include parallel algorithms, novel displays, HCI and computer graphics.

 

Artificial Intelligence Research @ Facebook


Dr. Manohar Paluri

Ph.D. at Georgia Institute of Technology

Date : 30/10/2015

 

Abstract:

Facebook has to deal with billions of data points (likes, comments, shares, posts, photos, videos and many more). The only way to provide access to this plethora of information in a structured way is to understand the data and learn mathematical models for various problems (Search, News Feed ranking, Instagram trending, ads targeting, etc.). The Facebook Artificial Intelligence Research (FAIR) group aims to do this with some of the brightest minds in the field, directed by Dr. Yann LeCun. Our goal is to be at the forefront of Artificial Intelligence and bring that technology to the billions of users of Facebook and beyond. In this talk I will touch upon a few aspects of our focus areas, concentrating on Computer Vision related projects. This will be a very high-level overview talk with a few slides of technical detail; the goal is to motivate everyone about the challenges we face at FAIR.

Brief Bio:

Manohar Paluri currently manages the Applied Computer Vision group at Facebook. His group focuses on building the world's largest image and video understanding platform. Manohar received his Bachelor's degree from IIIT Hyderabad, with Honors in Computer Vision, and his Master's from Georgia Tech. While pursuing his Ph.D. at Georgia Tech, and prior to joining Facebook, Manohar worked with Dr. Steve Seitz's 3D Maps team at Google Research, the video surveillance group at IBM Watson Labs, and the applied computer vision group at Sarnoff (now part of SRI International).

 

My Research on Human-Centered Computing


Dr. Ramanathan Subramanian

Advanced Digital Sciences Center (ADSC), Singapore

Date : 26/03/2015

 

Abstract:

Human-centered computing focuses on all aspects of integrating the human/user within the computational loop of Artificial Intelligence (AI) systems. I am currently interested in developing applications that can analyze humans to make informed decisions (Human Behavior Understanding), and interactively employ implicit human signals (eye movements, EEG/MEG responses) for feedback.

In my talk, I will first present my research on analyzing human behavior from free-standing conversational groups, and the associated problems of head pose estimation and F-formation detection from low-resolution images. Then, I will also describe my studies on emotional perception using eye movements and brain signals.

Brief Bio:

Ramanathan Subramanian is a Research Scientist at the Advanced Digital Sciences Center (ADSC), Singapore. Previously, he served as a post-doctoral researcher at the Dept. of Information Engineering and Computer Science, University of Trento, Italy, and at the School of Computing, NUS. His research interests span Human-Computer Interaction, Human Behavior Understanding, Computer Vision and Computational Social Sciences. He is especially interested in studying and modelling various aspects of human visual and emotional perception. His research has contributed to a number of behavioral databases, including the prominent NUSEF (eye fixations), DPOSE (dynamic head pose) and DECAF (decoding affective multimedia from MEG brain responses) datasets.


 

Towards automatic video production of staged performances

Dr. Vineet Gandhi

University of Grenoble, France

Date : 26/02/2015

 

Abstract:

Professional quality videos of live staged performances are created by recording them from different appropriate viewpoints. These are then edited together to portray an eloquent story replete with the ability to draw out the intended emotion from the viewers. Creating such competent videos typically requires a team of skilled camera operators to capture the scene from multiple viewpoints. In this talk, I will introduce an alternative approach where we automatically compute camera movements in post-production using specially designed computer vision methods.

First, I will explain our novel approach for tracking objects and actors in long video sequences. Second, I will describe how the actor tracks can be used for automatically generating multiple clips suitable for video editing by simulating pan-tilt-zoom camera movements within the frame of a single high resolution static camera. I will conclude my talk by presenting test and validation results on a challenging corpus of theatre recordings and demonstrating how the proposed methods open the way to novel applications for cost effective video production of live performances including, but not restricted to, theatre, music and opera.
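The essence of simulating a pan-tilt-zoom camera is computing a smooth, well-framed crop trajectory from the actor tracks. The toy sketch below uses simple low-pass filtering for smoothness; the actual method computes the crop path by optimization rather than per-frame filtering.

```python
import numpy as np

def virtual_ptz(actor_boxes, frame_w, frame_h, aspect=16 / 9,
                margin=1.5, inertia=0.9):
    """Simulate a PTZ camera by cropping a static high-resolution frame
    around tracked actors. actor_boxes: per-frame list of (x, y, w, h)."""
    state, crops = None, []
    for boxes in actor_boxes:
        x0 = min(b[0] for b in boxes); x1 = max(b[0] + b[2] for b in boxes)
        y0 = min(b[1] for b in boxes); y1 = max(b[1] + b[3] for b in boxes)
        # crop wide enough to frame all actors with some margin
        w = max((x1 - x0) * margin, (y1 - y0) * margin * aspect)
        target = np.array([(x0 + x1) / 2, (y0 + y1) / 2, w, w / aspect])
        # first-order smoothing stands in for the talk's optimized camera path
        state = target if state is None else inertia * state + (1 - inertia) * target
        cx, cy, cw, ch = state
        crops.append((float(np.clip(cx - cw / 2, 0, frame_w - cw)),
                      float(np.clip(cy - ch / 2, 0, frame_h - ch)),
                      float(cw), float(ch)))
    return crops
```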

Brief Bio:

Vineet Gandhi obtained his B.Tech. degree in Electronics and Communication Engineering from the Indian Institute of Information Technology, Design and Manufacturing Jabalpur, and a master's degree under the prestigious Erasmus Mundus scholarship program. During his master's he studied in three different countries for a semester each, specializing in the areas of optics, image and vision. He did his master's thesis in the Perception team at INRIA, France, in collaboration with Samsung Research. He later joined the Imagine team at INRIA as a doctoral researcher and obtained his PhD degree in computer science and applied mathematics.

His current research interests are in the areas of visual learning/detection/recognition, computational photography/videography and sensor fusion for 3D reconstruction. He also enjoys working in the field of optics, colorimetry and general mathematics of signal and image processing.

Facebook AI Research

Date : 18/02/2015

 

Abstract:

Recent years have seen rapid growth of the field of Deep Learning. Research in Deep Learning has put forth many new ideas which have led to path-breaking developments in Machine Learning. The large set of tools and techniques that has been churned out of this research is fast being explored and applied to different problems in Computer Vision and NLP. This talk shall focus on some of the current trends and practices in Deep Learning research and the challenges that lie ahead.

 

 

Extreme Classification: A New Paradigm for Ranking & Recommendation

Dr. Manik Varma

Microsoft Research India

Date : 10/02/2015

 

Abstract:

The objective in extreme multi-label classification is to learn a classifier that can automatically tag a data point with the most relevant subset of labels from a large label set. Extreme multi-label classification is an important research problem since not only does it enable the tackling of applications with many labels but it also allows the reformulation of ranking and recommendation problems with certain advantages over existing formulations.

Our objective, in this talk, is to develop an extreme multi-label classifier that is faster to train and more accurate at prediction than the state-of-the-art Multi-label Random Forest (MLRF) algorithm [Agrawal et al. WWW 13] and the Label Partitioning for Sub-linear Ranking (LPSR) algorithm [Weston et al. ICML 13]. MLRF and LPSR learn a hierarchy to deal with the large number of labels, but optimize task-independent measures, such as the Gini index or clustering error, in order to learn it. Our proposed FastXML algorithm achieves significantly higher accuracies by directly optimizing an nDCG-based ranking loss function. We also develop an alternating minimization algorithm for efficiently optimizing the proposed formulation. Experiments reveal that FastXML can be trained on problems with more than a million labels on a standard desktop in eight hours using a single core and in an hour using multiple cores.
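For reference, the measure FastXML optimizes directly when learning each node's split is nDCG. A small sketch of the metric itself (binary relevance; names are mine, and this is not a sketch of FastXML):

```python
import numpy as np

def ndcg_at_k(scores, relevant, k=5):
    """nDCG@k: discounted gain of the predicted top-k labels, normalized
    by the best achievable value. scores: predicted score per label;
    relevant: set of true label ids."""
    top = np.argsort(-scores)[:k]                       # predicted top-k labels
    gains = np.array([1.0 if l in relevant else 0.0 for l in top])
    discounts = 1.0 / np.log2(np.arange(2, k + 2))      # 1 / log2(rank + 1)
    dcg = float((gains * discounts).sum())
    idcg = float(discounts[:min(len(relevant), k)].sum())  # ideal ordering
    return dcg / idcg if idcg > 0 else 0.0
```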

Brief Bio:

Manik Varma is a researcher at Microsoft Research India where he helps champion the Machine Learning and Optimization area. Manik received a bachelor's degree in Physics from St. Stephen's College, University of Delhi in 1997 and another one in Computation from the University of Oxford in 2000 on a Rhodes Scholarship. He then stayed on at Oxford on a University Scholarship and obtained a DPhil in Engineering in 2004. Before joining Microsoft Research, he was a Post-Doctoral Fellow at the Mathematical Sciences Research Institute Berkeley. He has been an Adjunct Professor at the Indian Institute of Technology (IIT) Delhi in the Computer Science and Engineering Department since 2009 and jointly in the School of Information Technology since 2011. His research interests lie in the areas of machine learning, computational advertising and computer vision. He has served as an Area Chair for machine learning and computer vision conferences such as ACCV, CVPR, ICCV, ICML and NIPS. He has been awarded the Microsoft Gold Star award and has won the PASCAL VOC Object Detection Challenge.

Talks and Visits 2014


What Motion Reveals about Shape with Unknown Material Behavior

Dr. Manmohan Chandraker

NEC Labs America
Date : 11/12/2014

 

Abstract:

Image formation is an outcome of a complex interaction between object geometry, lighting and camera, as governed by the reflectance of the underlying material. Psychophysical studies show that motion of the object, light source or camera are important cues for shape perception from image sequences. However, due to the complex and often unknown nature of the bidirectional reflectance distribution function (BRDF) that determines material behavior, computer vision algorithms have traditionally relied on simplifying assumptions such as brightness constancy or Lambertian reflectance. We take a step towards overcoming those limitations by answering a fundamental question: what does motion reveal about unknown shape and material? In each case of light source, object or camera motion, we show that physical properties of BRDFs yield PDE invariants that precisely characterize the extent of shape recovery under a given imaging condition. Conventional optical flow, multiview stereo and photometric stereo follow as special cases. This leads to the surprising result that motion can decipher shape even with complex, unknown material behavior and unknown lighting. Further, we show that contrary to intuition, joint recovery of shape, material and lighting using motion cues is often well-posed and tractable, requiring the solution of only sparse linear systems.
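For orientation, the two classical special cases the abstract mentions can be written out explicitly; these are the standard textbook constraints, which the talk's BRDF-invariant PDEs generalize.

```latex
% Optical flow: for a Lambertian surface under brightness constancy,
% image motion (u, v) obeys the classical linearized constraint
\[
  I_x\,u + I_y\,v + I_t = 0 .
\]
% Photometric stereo: for a static Lambertian scene under K distant
% lights s_k, intensities are linear in the scaled normal rho * n,
\[
  I_k = \rho\,\mathbf{n}^{\top}\mathbf{s}_k , \qquad k = 1,\dots,K ,
\]
% so three or more lights recover the normal by solving a linear system.
```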

At the beginning of the talk, I will also describe our recent work on 3D scene understanding for autonomous driving. Using a single camera, we demonstrate real-time structure from motion performance on par with stereo, on the challenging KITTI benchmark. Combined with top-performing methods in object detection and tracking, we demonstrate 3D object localization with high accuracy comparable to LIDAR. We demonstrate high-level applications such as scene recognition that form the basis for collaborations on collision avoidance and danger prediction with automobile manufacturers.

Brief Bio:

Manmohan Chandraker received a B.Tech. in Electrical Engineering at the Indian Institute of Technology, Bombay and a PhD in Computer Science at the University of California, San Diego. Following a postdoctoral scholarship at the University of California, Berkeley, he joined NEC Labs America in Cupertino, where he conducts research in computer vision. His principal research interests are modern optimization methods for geometric 3D reconstruction, 3D scene understanding and recognition for autonomous driving and shape recovery in the presence of complex illumination and material behavior. His work has received the Marr Prize Honorable Mention for Best Paper at ICCV 2007, the 2009 CSE Dissertation Award for Best Thesis at UC San Diego, a nomination for the 2010 ACM Dissertation Award and the Best Paper Award at CVPR 2014, besides appearing in Best Paper Special Issues of IJCV 2009, IEEE PAMI 2011 and 2014.


 

Scalable Scientific Image Informatics

Prof. B. S. Manjunath

University of California, Santa Barbara
Date : 25/09/2014

 

Abstract:

Recent advances in microscopy imaging, image processing and computing technologies enable large scale scientific experiments that generate not only large collections of images and video, but also pose new computing and information processing challenges. These include providing ubiquitous access to images, videos and metadata resources; creating easily accessible image and video analysis, visualizations and workflows; and publishing both data and analysis resources. Further, contextual metadata, such as experimental conditions in biology, are critical for quantitative analysis. Streamlining collaborative efforts across distributed research teams with online virtual environments will improve scientific productivity, enhance understanding of complex phenomena and allow a growing number of researchers to quantify conditions based on image evidence that so far have remained subjective. This talk will focus on recent work in my group on image segmentation and quantification, followed by a detailed description of the BisQue platform. BisQue (Bio-Image Semantic Query and Environment) is an open-source platform for integrating image collections, metadata, analysis, visualization and database methods for querying and search. We have developed new techniques for managing user-defined datamodels for biological datasets, including experimental protocols, images, and analysis. BisQue is currently used in many laboratories around the world and is integrated into the iPlant cyber-infrastructure (see http://www.iplantcollaborative.org) which serves the plant biology community. For more information on BisQue see http://www.bioimage.ucsb.edu.

Brief Bio:

B. S. Manjunath received the B.E. degree (with distinction) in electronics from Bangalore University, Bangalore, India, in 1985, the M.E. degree (with distinction) in systems science and automation from the Indian Institute of Science, Bangalore, in 1987, and the Ph.D. degree in electrical engineering from the University of Southern California, Los Angeles, in 1991. He is a Professor of electrical and computer engineering, Director of the Center for Bio-Image Informatics, and Director of the newly established Center on Multimodal Big Data Science and Healthcare at the University of California, Santa Barbara. His current research interests include image processing, distributed processing in camera networks, data hiding, multimedia databases, and bio-image informatics. He has published over 250 peer-reviewed articles on these topics and is a co-editor of the book Introduction to MPEG-7 (Wiley, 2002). He has been an associate editor of the IEEE Transactions on Image Processing, Pattern Analysis and Machine Intelligence, Multimedia, and Information Forensics and Security, as well as IEEE Signal Processing Letters, and is currently an AE for the BMC Bioinformatics journal. He is a co-author of the paper that won the 2013 Transactions on Multimedia best paper award and is a fellow of the IEEE.


 

Rounding-based Moves for Metric Labeling

Dr. M. Pawan Kumar

Ecole Centrale Paris
Date : 27/08/2014

 

Abstract:

Metric labeling is an important special case of energy minimization in pairwise graphical models. The dominant methods for metric labeling in the computer vision community belong to the move-making family, due to their computational efficiency. The dominant methods in the computer science community belong to the convex relaxations family, due to their strong theoretical guarantees. In this talk, I will present algorithms that combine the best of both worlds: efficient move-making algorithms that provide the same guarantees as the standard linear programming relaxation.
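For reference, the objective being minimized is the standard metric labeling energy below (generic notation, not the speaker's):

```latex
% Metric labeling: choose a label f(p) for every node of a graph G = (V, E),
% minimizing unary costs plus metric-weighted pairwise costs
\[
  E(f) \;=\; \sum_{p \in V} \theta_p\bigl(f(p)\bigr)
  \;+\; \sum_{(p,q) \in E} w_{pq}\, d\bigl(f(p), f(q)\bigr),
  \qquad w_{pq} \ge 0 ,
\]
% where d is a metric on the label set. Move-making methods descend on E by
% repeatedly solving a binary subproblem exactly with a min-cut, e.g.
% "every node either switches to a proposal label or keeps its own".
```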

Brief Bio:

M. Pawan Kumar is an Assistant Professor at Ecole Centrale Paris, and a member of INRIA Saclay. Prior to that, he was a PhD student at Oxford Brookes University (Brookes Vision Group; 2003-2007), a postdoc at Oxford University (Visual Geometry Group; 2008), and a postdoc at Stanford University (Daphne's Approximate Group of Students; 2009-2011).


 

Signal Processing Meets Optics: Theory, Design and Inference for Computational Imaging

Dr. Kaushik Mitra

Rice University
Date : 24/04/2014

 

Abstract:

For centuries, imaging devices have been based on the principle of 'pinhole projection', which directly captures the desired image. However, it has recently been shown that by co-designing the imaging optics and processing algorithms, we can obtain Computational Imaging (CI) systems that far exceed the performance of traditional cameras. Despite the advances made in designing new computational cameras, there are still many open issues such as: 1) lack of a proper theoretical framework for analysis and design of CI cameras, 2) lack of camera designs for capturing the various dimensions of light with high fidelity and 3) lack of proper use of data-driven methods that have shown tremendous success in other domains.

In this talk, I will address the above-mentioned issues. First, I will present a comprehensive framework for the analysis of computational imaging systems and provide explicit performance guarantees for many CI systems, such as light field and extended-depth-of-field cameras. Second, I will show how camera arrays can be exploited to capture the various dimensions of light, such as spectrum and angle. Capturing these dimensions leads to novel imaging capabilities such as post-capture refocusing, hyper-spectral imaging and natural image retouching. Finally, I will talk about how various machine learning techniques, such as robust regression and matrix factorization, can be used for solving many imaging problems.
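Post-capture refocusing from a camera array reduces, in its textbook form, to shift-and-add: shift each view in proportion to its baseline and average. A minimal sketch of that principle (illustrative of the capability, not the speaker's system):

```python
import numpy as np
from scipy.ndimage import shift

def refocus(views, offsets, alpha):
    """Synthetic refocusing over a camera array. views[i]: image from the
    camera at 2D baseline offsets[i]; alpha selects the focal plane."""
    acc = np.zeros_like(views[0], dtype=float)
    for img, (dx, dy) in zip(views, offsets):
        # shift each view in proportion to its baseline, then average:
        # points on the chosen plane align and stay sharp, others blur out
        acc += shift(img, (alpha * dy, alpha * dx), order=1)
    return acc / len(views)
```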

Brief Bio:

Kaushik Mitra is currently a postdoctoral research associate in the Electrical and Computer Engineering department of Rice University. His research interests are in computational imaging, computer vision and statistical signal processing. He earned his Ph.D. in Electrical and Computer Engineering from the University of Maryland, College Park, where his research focused on the development of statistical models and optimization algorithms for computer vision problems.

 

Cross-modal Retrieval: Retrieval Across Different Content Modalities

Dr. Nikhil Rasiwasia

Yahoo Labs, Bangalore
Date : 01/04/2014

 

Abstract:

In this talk, the problem of cross-modal retrieval from multimedia repositories is considered. This problem addresses the design of retrieval systems that support queries across content modalities, e.g., using an image to search for texts. A mathematical formulation is proposed, equating the design of cross-modal retrieval systems to that of isomorphic feature spaces for different content modalities. Two hypotheses are then investigated, regarding the fundamental attributes of these spaces. The first is that low-level cross-modal correlations should be accounted for. The second is that the space should enable semantic abstraction. Three new solutions to the cross-modal retrieval problem are then derived from these hypotheses: correlation matching (CM), an unsupervised method which models cross-modal correlations, semantic matching (SM), a supervised technique that relies on semantic representation, and semantic correlation matching (SCM), which combines both. An extensive evaluation of retrieval performance is conducted to test the validity of the hypotheses. All approaches are shown successful for text retrieval in response to image queries and vice versa. It is concluded that both hypotheses hold, in a complementary form, although the evidence in favor of the abstraction hypothesis is stronger than that for correlation.
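The semantic matching (SM) idea admits a compact sketch: map each modality to class-posterior vectors so images and texts become comparable points on the same probability simplex, then rank by similarity there. The classifier choice and helper names below are assumptions for illustration, not the paper's implementation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def to_semantic_space(X_train, y_train, X):
    """Map features of one modality to class-posterior vectors: one such
    classifier is trained per modality, yielding isomorphic spaces."""
    clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    return clf.predict_proba(X)

def retrieve(query_sem, db_sem, k=5):
    # rank database items by cosine similarity in the shared semantic space
    q = query_sem / np.linalg.norm(query_sem)
    D = db_sem / np.linalg.norm(db_sem, axis=1, keepdims=True)
    return np.argsort(-(D @ q))[:k]
```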

Brief Bio:

Nikhil Rasiwasia received the B.Tech. degree in electrical engineering from the Indian Institute of Technology Kanpur (India) in 2005. He received the MS and PhD degrees from the University of California, San Diego in 2007 and 2011 respectively, where he was a graduate student researcher at the Statistical Visual Computing Laboratory in the ECE department. Currently, he is working as a scientist at Yahoo! Labs, Bangalore, India. In 2008, he was recognized as an 'Emerging Leader in Multimedia' by IBM T. J. Watson Research. He also received the best student paper award at the ACM Multimedia conference in 2010. His research interests are in the areas of computer vision and machine learning, in particular applying machine learning solutions to computer vision problems.


 

Lifting 3D Manhattan Lines from a Single Image

Dr. Srikumar Ramalingam

Mitsubishi Electric Research Laboratories (MERL)
Date : 10/01/2014

 

Abstract:

In the first part of the talk, I will present a novel and efficient method for reconstructing the 3D arrangement of lines extracted from a single image, using vanishing points, orthogonal structure, and an optimization procedure that considers all plausible connectivity constraints between lines. Line detection identifies a large number of salient lines that intersect or nearly intersect in an image, but relatively few of these apparent junctions correspond to real intersections in the 3D scene. We use linear programming (LP) to identify a minimal set of least-violated connectivity constraints that are sufficient to unambiguously reconstruct the 3D lines. In contrast to prior solutions that primarily focused on well-behaved synthetic line drawings with severely restrictive assumptions, we develop an algorithm that can work on real images. The algorithm produces line reconstructions by identifying 95% correct connectivity constraints in the York Urban database, with a total computation time of 1 second per image.
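The per-line geometry underlying the method is simple once a line's Manhattan direction and one anchoring depth are fixed, which is exactly the ambiguity the LP's connectivity constraints resolve. A hypothetical helper illustrating that lifting step:

```python
import numpy as np

def lift_line(K, x1, x2, d, depth1):
    """Lift one image segment to 3D given its Manhattan direction d and the
    depth of its first endpoint. x1, x2: pixel coords; K: 3x3 intrinsics."""
    Kinv = np.linalg.inv(K)
    r1 = Kinv @ np.array([x1[0], x1[1], 1.0])     # back-projected rays
    r2 = Kinv @ np.array([x2[0], x2[1], 1.0])
    P1 = depth1 * r1                               # endpoint 1 fixed by its depth
    # endpoint 2 lies on ray r2 and on the 3D line P1 + t*d:
    # solve s*r2 - t*d = P1 in least squares
    A = np.stack([r2, -np.asarray(d, float)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, P1, rcond=None)
    return P1, s * r2                              # the two 3D endpoints
```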

In the second part of the talk, I will briefly describe my other work on graphical models, robotics, geo-localization, generic camera modeling and 3D reconstruction.

Brief Bio:

Srikumar Ramalingam is a Principal Research Scientist at Mitsubishi Electric Research Laboratories (MERL). He received his B.E. from Anna University (Guindy) in India and his M.S. from the University of California, Santa Cruz, in the USA. He received a Marie Curie Fellowship from the European Union to pursue his studies at INRIA Rhone-Alpes (France), and he obtained his PhD in 2007. His thesis on generic imaging models received the INPG best thesis prize and the AFRIF thesis prize (honorable mention) from the French Association for Pattern Recognition. After his PhD, he spent two years in Oxford as a research associate at Oxford Brookes University, while also an associate member of the Visual Geometry Group at Oxford University. He has published more than 30 papers in flagship conferences such as CVPR, ICCV, SIGGRAPH Asia and ECCV. He has co-edited journals, coauthored books, given tutorials and organized workshops on topics such as multi-view geometry and discrete optimization. His research interests are in computer vision, machine learning and robotics problems.

Talks and Visits 2013


Higher-Order Grouping on the Grassmann Manifold

Dr. Venu Madhav Govindu

Indian Institute of Science, Bengaluru
Date : 15/11/2013

 

Abstract:

The higher-order clustering problem arises when data are drawn from multiple subspaces or when observations fit a higher-order parametric model. In my talk I will present a tensor decomposition solution to this problem and its refinement based on estimation on a Grassmann manifold. The method exploits recent advances in online estimation on the Grassmann manifold and is consequently efficient and scalable, with a low memory requirement. I will present results of this method applied to a variety of segmentation problems, including planar segmentation of Kinect depth maps and motion segmentation of the Hopkins 155 dataset, for which we achieve performance comparable to the state-of-the-art.
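The basic geometric tool here is a distance between subspaces, computed from principal angles. A minimal sketch of that Grassmann geometry (not of the talk's clustering algorithm itself):

```python
import numpy as np

def grassmann_distance(A, B):
    """Geodesic distance between subspaces span(A) and span(B) on the
    Grassmann manifold. A, B: orthonormal bases of shape (n, k)."""
    cos_angles = np.linalg.svd(A.T @ B, compute_uv=False)
    theta = np.arccos(np.clip(cos_angles, -1.0, 1.0))   # principal angles
    return float(np.linalg.norm(theta))

# e.g. group motion trajectories by thresholding pairwise subspace distances
```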

Brief Bio:

Venu Madhav Govindu is with the Department of Electrical Engineering, Indian Institute of Science, Bengaluru. His research interests pertain to geometric and statistical estimation in computer vision.
