Facial Analysis and Synthesis for Photo Editing and Virtual Reality

Dr. Vivek Kwatra

Georgia Institute of Technology

Date : 25/10/2017



In this talk, the speaker will discuss some techniques built around facial analysis and synthesis, focusing broadly on two application areas:

Photo editing: This work is motivated by the observation that group photographs are seldom perfect. Subjects may have inadvertently closed their eyes, may be looking away, or may not be smiling at that moment. The speaker will describe how multiple photos of people can be combined into one shot, optimizing desirable facial attributes such as smiles and open eyes. The approach leverages advances in facial analysis as well as image stitching to generate high-quality composites.

Headset removal for Virtual Reality: Experiencing VR requires users to wear a headset, but the headset occludes the face and blocks eye gaze. The speaker will demonstrate a technique that virtually “removes” the headset and reveals the face underneath it, creating a realistic see-through effect. Using a combination of 3D vision, machine learning and graphics techniques, the method synthesizes a realistic, personalized 3D model of the user's face that reproduces their appearance, eye gaze and blinks, which are otherwise hidden by the VR headset.



Vivek Kwatra is a Research Scientist in Machine Perception at Google. He received his B.Tech. in Computer Science & Engineering from IIT Delhi and his Ph.D. in Computer Science from Georgia Tech. He spent two years at UNC Chapel Hill as a Postdoctoral Researcher. His interests lie in computer vision, computational video and photography, facial analysis and synthesis, and virtual reality. He has authored many papers and patents on these topics, and his work has found its way into several Google products.


Perceptual Quality Assessment of Real-World Images and Videos

Dr. Deepti Ghadiyaram

University of Texas at Austin

Date : 22/09/2017



The development of several online social-media venues and rapid advances in technology by camera and mobile device manufacturers have led to the creation and consumption of a seemingly limitless supply of visual content. However, a vast majority of these digital images and videos are afflicted with annoying artifacts introduced during acquisition, subsequent storage, and transmission over the network. Existing automatic quality predictors are designed on unrepresentative databases that only model single, synthetic distortions (such as blur or compression). As a result, these models lose their prediction capability when evaluated on pictures and videos captured with typical real-world mobile camera devices, which contain complex mixtures of multiple distortions. In over-the-top video streaming, existing quality of experience (QoE) prediction models likewise fail to model the behavioral responses of the human visual system to stimuli containing stalls and playback interruptions.

In her talk, she will focus on two broad topics: 1) construction of distortion-representative image and video quality assessment databases and 2) design of novel quality predictors for real-world images and videos. The image and video quality predictors that she presents in this talk rely on models based on the natural scene statistics of images and videos, model the complex non-linearities and linearities in the human visual system, and effectively capture a viewer’s behavioral responses to unpleasant stimuli (distortions).



Deepti Ghadiyaram received her Ph.D. in Computer Science from the University of Texas at Austin in August 2017. Her research interests broadly include image and video processing, computer vision, and machine learning. Her Ph.D. work focused on perceptual image and video quality assessment, particularly on building accurate quality prediction models for pictures and videos captured in the wild and on understanding a viewer’s time-varying quality of experience while streaming videos. She was a recipient of UT Austin’s Microelectronics and Computer Development (MCD) Fellowship from 2013 to 2014 and of a Graduate Student Fellowship from the Department of Computer Science for the academic years 2013-2016.


Machine Learning Seminar Series


Dr. Soumith Chintala

Facebook AI Research

Date : 14/02/2017



In this talk, we will first go over the PyTorch deep learning framework, its architecture, code and the new research paradigms that it enables. After that, we shall go over the latest developments in Generative Adversarial Networks and introduce Wasserstein GANs, a reformulation of GANs that is more stable and fixes optimization problems that traditional GANs face.



Dr. Soumith Chintala is a Researcher at Facebook AI Research, where he works on deep learning, reinforcement learning, generative image models, agents for video games and large-scale high-performance deep learning. Prior to joining Facebook in August 2014, he worked at MuseAmi, where he built deep learning models for music and vision targeted at mobile devices. He holds a Master's in CS from NYU, and spent time in Yann LeCun's NYU lab building deep learning models for pedestrian detection, natural image OCR, and depth images, among others.