Arun Mallya

Currently a Research Scientist at Meta (Gen AI → MSL) working on video generation, video editing models, and model auto-evaluation.

Previously a Senior Research Scientist in the Deep Imagination Research (DIR) group at NVIDIA (now Cosmos Lab). I was part of the lab from its inception, when it had just 3 members.

PhD from the University of Illinois at Urbana-Champaign (UIUC), advised by Prof. Svetlana Lazebnik. M.S. in CS from UIUC; B.Tech in CSE from IIT Kharagpur.

My research currently focuses on generative content creation with neural networks.

Selected Research

See all on Google Scholar →

Image/Video Generation

Movie Gen: A Cast of Media Foundation Models
Technical Report, Meta, 2024
Edify Image Generation
Technical Report, NVIDIA, 2024

Facial Animation

SPACE: Speech-driven Portrait Animation with Controllable Expression
International Conference on Computer Vision (ICCV), 2023
Implicit Warping for Animation with Image Sets
Arun Mallya, Ting-Chun Wang, Ming-Yu Liu
Neural Information Processing Systems (NeurIPS), 2022
One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing (oral)
Ting-Chun Wang, Arun Mallya, Ming-Yu Liu
Computer Vision and Pattern Recognition (CVPR), 2021

Neural Rendering

Implicit Neural Representations with Levels-of-Experts
Neural Information Processing Systems (NeurIPS), 2022
GANcraft: Unsupervised 3D Neural Rendering of Minecraft Worlds (oral)
International Conference on Computer Vision (ICCV), 2021

Model Efficiency / Interesting Properties

See through Gradients: Image Batch Recovery via GradInversion
Hongxu Yin, Arun Mallya, Arash Vahdat, Jose Alvarez, Pavlo Molchanov, Jan Kautz
Computer Vision and Pattern Recognition (CVPR), 2021
Dreaming to Distill: Data-free Knowledge Transfer via DeepInversion (oral)
Hongxu Yin, Pavlo Molchanov, Zhizhong Li, Jose Alvarez, Arun Mallya, Derek Hoiem, Niraj Jha, Jan Kautz
Computer Vision and Pattern Recognition (CVPR), 2020
Piggyback: Adding Multiple Tasks to a Single, Fixed Network by Learning to Mask
Arun Mallya, Dillon Davis, Svetlana Lazebnik
European Conference on Computer Vision (ECCV), 2018
PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning
Arun Mallya, Svetlana Lazebnik
Computer Vision and Pattern Recognition (CVPR), 2018

Tutorials & Workshops

  1. Machine Learning with Synthetic Data, CVPR 2022
  2. Accelerating Computer Vision with Mixed Precision, ECCV 2020
  3. Accelerating Computer Vision with Mixed Precision, ICCV 2019

Writeups & Notes

Hosted on GitHub. Edit requests, additions, and corrections are welcome.

  1. A Backpropagation Refresher
  2. An Illustrated Explanation of the LSTM Forward-Backward Pass
  3. Introduction to RNNs
  4. Introduction to RNNs — II
  5. Jupyter notebook to find Receptive Field Size and Effective Stride (supports dilated convs)
  6. Visualization of neuron connections and receptive field of a CNN (including dilation)