Publications

(For a full list see below)

Conferences

IEEE Conference on Computer Vision and Pattern Recognition

Energy-Based Learning for Scene Graph Generation
M. Suhail, A. Mittal, B. Siddiquie, C. Broaddus, J. Eledath, G. Medioni, L. Sigal

Traditional scene graph generation methods are trained using cross-entropy losses that treat objects and relationships as independent entities. Such a formulation, however, ignores the structure in the output space, in an inherently structured prediction problem. In this work, we introduce a novel energy-based learning framework for generating scene graphs. The proposed formulation allows for efficiently incorporating the structure of scene graphs in the output space. This additional constraint in the learning framework acts as an inductive bias and allows models to learn efficiently from a small number of labels. We use the proposed energy-based framework to train existing state-of-the-art models and obtain a significant performance improvement, of up to 21% and 27%, on the Visual Genome and GQA benchmark datasets, respectively. Furthermore, we showcase the learning efficiency of the proposed framework by demonstrating superior performance in the zero- and few-shot settings where data is scarce.

MIST: Multiple Instance Spatial Transformer Network
B. Angles, Y. Jin, S. Kornblith, A. Tagliasacchi, K. M. Yi

We propose a deep network that can be trained to tackle image reconstruction and classification problems that involve detection of multiple object instances, without any supervision regarding their whereabouts. The network learns to extract the most significant top-K patches, and feeds these patches to a task-specific network (e.g., an auto-encoder or classifier) to solve a domain-specific problem. The challenge in training such a network is the non-differentiable top-K selection process. To address this issue, we lift the training optimization problem by treating the result of top-K selection as a slack variable, resulting in a simple, yet effective, multi-stage training. Our method is able to learn to detect recurrent structures in the training dataset by learning to reconstruct images. It can also learn to localize structures when only knowledge of the occurrence of the object is provided, and in doing so it outperforms the state-of-the-art.

DeRF: Decomposed Radiance Fields
D. Rebain, W. Jiang, S. Yazdani, K. Li, K. Yi, and A. Tagliasacchi

With the advent of Neural Radiance Fields (NeRF), neural networks can now render novel views of a 3D scene with quality that fools the human eye. Yet, generating these images is very computationally intensive, limiting their applicability in practical scenarios. In this paper, we propose a technique based on spatial decomposition capable of mitigating this issue. Our key observation is that there are diminishing returns in employing larger (deeper and/or wider) networks. Hence, we propose to spatially decompose a scene and dedicate smaller networks for each decomposed part. When working together, these networks can render the whole scene. This allows us near-constant inference time regardless of the number of decomposed parts. Moreover, we show that a Voronoi spatial decomposition is preferable for this purpose, as it is provably compatible with the Painter's Algorithm for efficient and GPU-friendly rendering. Our experiments show that for real-world scenes, our method provides up to 3x more efficient inference than NeRF (with the same rendering quality), or an improvement of up to 1.0 dB in PSNR (for the same inference cost).

VaB-AL: Incorporating Class Imbalance and Difficulty with Variational Bayes for Active Learning
J. Choi, K. M. Yi, J. Kim, J. Choo, B. Kim, J.-Y. Chang, Y. Gwon, H. J. Chang

Active Learning for discriminative models has largely been studied with the focus on individual samples, with less emphasis on how classes are distributed or which classes are hard to deal with. In this work, we show that this is harmful. We propose a method based on Bayes’ rule that can naturally incorporate class imbalance into the Active Learning framework. We derive that three terms should be considered together when estimating the probability of a classifier making a mistake for a given sample: i) the probability of mislabelling a class, ii) the likelihood of the data given a predicted class, and iii) the prior probability on the abundance of a predicted class. Implementing these terms requires a generative model and an intractable likelihood estimation. Therefore, we train a Variational Auto Encoder (VAE) for this purpose. To further tie the VAE with the classifier and facilitate VAE training, we use the classifier’s deep feature representations as input to the VAE. By considering all three probabilities, and especially the data imbalance, we can substantially improve the potential of existing methods under a limited data budget. We show that our method can be applied to classification tasks on multiple different datasets – including one that is a real-world dataset with heavy data imbalance – significantly outperforming the state of the art.

Full List

Energy-Based Learning for Scene Graph Generation
M. Suhail, A. Mittal, B. Siddiquie, C. Broaddus, J. Eledath, G. Medioni, L. Sigal
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021

Canonical Capsules: Unsupervised Capsules in Canonical Pose
W. Sun, A. Tagliasacchi, B. Deng, S. Sabour, S. Yazdani, G. E. Hinton, and K. M. Yi
arXiv:2012.04718, 2020

LatentKeypointGAN: Controlling GANs via Latent Keypoints
X. He, B. Wandt and H. Rhodin
arXiv:2103.15812, 2021

MIST: Multiple Instance Spatial Transformer Network
B. Angles, Y. Jin, S. Kornblith, A. Tagliasacchi, K. M. Yi
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021

DeRF: Decomposed Radiance Fields
D. Rebain, W. Jiang, S. Yazdani, K. Li, K. Yi, and A. Tagliasacchi
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021

VaB-AL: Incorporating Class Imbalance and Difficulty with Variational Bayes for Active Learning
J. Choi, K. M. Yi, J. Kim, J. Choo, B. Kim, J.-Y. Chang, Y. Gwon, H. J. Chang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021

Image Matching Across Wide Baselines: From Paper to Practice
Y. Jin, D. Mishkin, A. Mishchuk, J. Matas, P. Fua, K. M. Yi, E. Trulls
International Journal of Computer Vision (IJCV), 2020

Optimizing Through Learned Errors for Accurate Sports Field Registration
W. Jiang, J. C. G. Higuera, B. Angles, W. Sun, M. Javan, K. M. Yi
Winter Conference on Applications of Computer Vision (WACV), 2020

Linearized Multi-Sampling for Differentiable Image Transformation
W. Jiang, W. Sun, A. Tagliasacchi, E. Trulls, K. M. Yi
IEEE/CVF International Conference on Computer Vision (ICCV), 2019

Mixture-Kernel Graph Attention Network for Situation Recognition
M. Suhail and L. Sigal
IEEE/CVF International Conference on Computer Vision (ICCV), 2019

ATTENTIONRNN: A Structured Spatial Attention Mechanism
S. Khandelwal and L. Sigal
IEEE/CVF International Conference on Computer Vision (ICCV), 2019

Watch, Listen and Tell: Multi-modal Weakly Supervised Dense Event Captioning
T. Rahman, B. Xu and L. Sigal
IEEE/CVF International Conference on Computer Vision (ICCV), 2019

GraphGROUND: Graph-based Language Grounding
M. Bajaj, L. Wang and L. Sigal
IEEE/CVF International Conference on Computer Vision (ICCV), 2019

DwNet: Dense warp-based network for pose-guided human video generation
P. Zablotskaia, A. Siarohin, B. Zhao and L. Sigal
British Machine Vision Conference (BMVC), 2019

Spatio-temporal Relational Reasoning for Video Question Answering
G. Singh, L. Sigal and J. Little
British Machine Vision Conference (BMVC), 2019

A Less Biased Evaluation of Out-of-distribution Sample Detectors
A. Shafaei, M. Schmidt, J. J. Little
British Machine Vision Conference (BMVC), 2019

Pan-tilt-zoom SLAM for Sports Videos
J. Lu, J. Chen and J. J. Little
British Machine Vision Conference (BMVC), 2019

Image Generation from Layout
B. Zhao, L. Meng, W. Yin and L. Sigal
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019

Modular Generative Adversarial Networks
B. Zhao, B. Chang, Z. Jie and L. Sigal
European Conference on Computer Vision (ECCV), 2018

Probabilistic Video Generation using Holistic Attribute Control
J. He, A. Lehrmann, J. Marino, G. Mori and L. Sigal
European Conference on Computer Vision (ECCV), 2018

A Neural Multi-sequence Alignment TeCHnique (NeuMATCH)
P. Dogan, B. Li, L. Sigal and M. Gross
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018

Show Me a Story: Towards Coherent Neural Story Illustration
H. Ravi, L. Wang, C. Muniz, L. Sigal, D. Metaxas and M. Kapadia
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018

Predicting Personality from Book Preferences with User-Generated Content Labels
N. Annalyn, M. W. Bos, L. Sigal and B. Li
IEEE Transactions on Affective Computing (TAC), 2018

Where should cameras look at soccer games: improving smoothness using the overlapped hidden Markov model
J. Chen and J. J. Little
Computer Vision and Image Understanding (CVIU), 2017

Story Albums: Creating Fictional Stories from Personal Photograph Sets
O. Radiano, Y. Graber, M. Mahler, L. Sigal and A. Shamir
Computer Graphics Forum, Volume 36, 2017

Non-parametric Structured Outputs Networks
A. Lehrmann and L. Sigal
Neural Information Processing Systems (NIPS), 2017

Visual Reference Resolution using Attention Memory for Visual Dialog
P. H. Seo, A. Lehrmann, B. Han and L. Sigal
Neural Information Processing Systems (NIPS), 2017

Weakly-supervised Visual Grounding of Phrases with Linguistic Structures
F. Xiao, L. Sigal and Y. J. Lee
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017

Learn How to Choose: Independent Detectors versus Composite Visual Phrases

Winter Conference on Applications of Computer Vision (WACV), 2017

Learning Online Smooth Predictions for Realtime Camera Planning using Recurrent Decision Trees
J. Chen, H. M. Le, P. Carr, Y. Yue, J. J. Little
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016

Real-time Physics-based Motion Capture with Sparse Sensors
S. Andrews, I. Huerta, T. Komura, L. Sigal and K. Mitchell
European Conference on Visual Media Production (CVMP), 2016

Heterogeneous Knowledge Transfer in Video Emotion Recognition, Attribution and Summarization
B. Xu, Y. Fu, Y.-G. Jiang, B. Li and L. Sigal
IEEE Transactions on Affective Computing (TAC), 2016

Learning Language-Visual Embedding for Movie Understanding with Natural-Language
A. Torabi, N. Tandon and L. Sigal
arXiv:1609.08124, 2016

Semi-supervised Vocabulary-informed Learning
Y. Fu and L. Sigal
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016

Learning Activity Progression in LSTMs for Activity Detection and Early Detection
S. Ma, L. Sigal and S. Sclaroff
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016

Harnessing Object and Scene Semantics for Large-Scale Video Understanding
Z. Wu, Y. Fu, Y.-G. Jiang and L. Sigal
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016

Video Emotion Recognition with Transferred Deep Feature Encodings
B. Xu, Y. Fu, Y.-G. Jiang, B. Li and L. Sigal
ACM International Conference in Multimedia Retrieval (ICMR), 2016

Knowledge Transfer with Interactive Learning of Semantic Relationships
J. Choi, S. Hwang, L. Sigal and L. Davis
AAAI Conference on Artificial Intelligence (AAAI), 2016

Exploiting View-Specific Appearance Similarities Across Classes for Zero-shot Pose Prediction: A Metric Learning Approach
A. Kuznetsova, S. Hwang, B. Rosenhahn and L. Sigal
AAAI Conference on Artificial Intelligence (AAAI), 2016

Learning to Generate Posters of Scientific Papers
Y. Qiang, Y. Fu, Y. Guo, Z.-H. Zhou and L. Sigal
AAAI Conference on Artificial Intelligence (AAAI), 2016

Play and Learn: Using Video Games to Train Computer Vision Models
A. Shafaei, J. J. Little, M. Schmidt
BMVC (2016)

Real-Time Human Motion Capture with Multiple Depth Cameras
A. Shafaei, J. J. Little
CRV (2016)

Storyline Representation of Egocentric Videos with an Application to Story-based Search
B. Xiong, G. Kim and L. Sigal
IEEE International Conference on Computer Vision (ICCV), 2015

Learning from Synthetic Data Using a Stacked Multichannel Autoencoder
X. Zhang, Y. Fu, S. Jiang, L. Sigal and G. Agam
IEEE International Conference on Machine Learning and Applications (ICMLA), 2015

Cross-Domain Matching with Squared-Loss Mutual Information
M. Yamada, L. Sigal, M. Raptis, M. Toyoda, Y. Chang and M. Sugiyama
IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2015

A Perceptual Control Space for Garment Simulation
L. Sigal, M. Mahler, S. Diaz, K. McIntosh, E. Carter, T. Richards and J. Hodgins
ACM Transactions on Graphics (Proc. SIGGRAPH), 2015

Discovering Collective Narratives of Theme Parks from Large Collections of Visitors Photo Streams
G. Kim and L. Sigal
KDD 2015

Hierarchical Maximum-Margin Clustering
G.-T. Zhou, S. Hwang, M. Schmidt, L. Sigal and G. Mori
arXiv:1502.01827, 2015

Joint Photo Stream and Blog Post Summarization and Exploration
G. Kim, S. Moon, L. Sigal
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015

Ranking and Retrieval of Image Sequences from Multiple Paragraph Queries
G. Kim, S. Moon, L. Sigal
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015

Space-Time Tree Ensemble for Action Recognition
S. Ma, L. Sigal, S. Sclaroff
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015

Expanding Object Detector's Horizon: Incremental Learning Framework for Object Detection in Videos
A. Kuznetsova, S.-J. Hwang, B. Rosenhahn, L. Sigal
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015

Learning to Select and Order Vacation Photographs
F. Sadeghi, J. R. Tena, A. Farhadi, L. Sigal
IEEE Winter Conference on Applications of Computer Vision (WACV), 2015

Family Member Identification from Photo Collections
Q. Dai, P. Carr, L. Sigal, D. Hoiem
IEEE Winter Conference on Applications of Computer Vision (WACV), 2015

Unlabelled 3D Motion Examples Improve Cross View Action Recognition
A. Gupta, A. Shafaei, J. J. Little and R. J. Woodham
BMVC (2014)

A Unified Semantic Embedding: Relating Taxonomies and Attributes
S.-J. Hwang, L. Sigal
Neural Information Processing Systems (NIPS), 2014

Parameterizing Object Detectors in the Continuous Pose Space
K. He, L. Sigal, S. Sclaroff
European Conference on Computer Vision (ECCV), 2014

Nonparametric Clustering with Distance Dependent Hierarchies
S. Ghosh, M. Raptis, L. Sigal, E. Sudderth
Conference on Uncertainty in Artificial Intelligence (UAI), 2014

Joint Summarization of Large-scale Collections of Web Images and Videos for Storyline Reconstruction
G. Kim, L. Sigal, E. P. Xing
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014

Domain Adaptation for Structured Regression
M. Yamada, Y. Chang and L. Sigal
International Journal of Computer Vision (IJCV), Special Issue on Domain Adaptation for Vision Applications, 2014

High-Dimensional Feature Selection by Feature-Wise Kernelized Lasso
M. Yamada, W. Jitkrittum, L. Sigal, E. P. Xing and M. Sugiyama
Neural Computation (NC), 26(1):185-207, 2014

Covariate Shift Adaptation for Discriminative 3D Pose Estimation
M. Yamada, L. Sigal and M. Raptis
IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2013

Action is in the Eye of the Beholder: Eye-gaze Driven Model for Spatio-Temporal Action Localization
N. Shapovalova, M. Raptis, L. Sigal, G. Mori
Neural Information Processing Systems (NIPS), 2013

From Subcategories to Visual Composites: A Multi-Level Framework for Object Detection
T. Lan, M. Raptis, L. Sigal, G. Mori
IEEE International Conference on Computer Vision (ICCV), 2013

Poselet Key-framing: A Model for Human Activity Recognition
M. Raptis, L. Sigal
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013

Dynamical Simulation Priors for Human Motion Tracking
M. Vondrak, L. Sigal and O. C. Jenkins
IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 35(1):52-65, 2013

Canonical Locality Preserving Latent Variable Model for Discriminative Pose Inference
Y. Tian, L. Sigal, F. De la Torre and Y. Jia
Image and Vision Computing (IVC), 31(3):223-230, 2013

Destination Flow for Crowd Simulation
S. Pellegrini, J. Gall, L. Sigal, L. van Gool
Workshop on Analysis and Retrieval of Tracked Events and Motion in Imagery Streams (ARTEMIS'12), 2012

No Bias Left Behind: Covariate Shift Adaptation for Discriminative 3D Pose Estimation
M. Yamada, L. Sigal, M. Raptis
European Conference on Computer Vision (ECCV), 2012

Multi-linear Data-Driven Dynamic Hair Model with Efficient Hair-Body Collision Handling
P. Guan, L. Sigal, V. Reznitskaya, J. K. Hodgins
ACM/Eurographics Symposium on Computer Animation (SCA), 2012

Video-based 3D Motion Capture through Biped Control
M. Vondrak, L. Sigal, J. K. Hodgins and O. C. Jenkins
ACM Transactions on Graphics (Proc. SIGGRAPH), 2012

Human Context: Modeling human-human interactions for monocular 3D pose estimation
M. Andriluka and L. Sigal
VII Conference on Articulated Motion and Deformable Objects (AMDO), 2012

Social Roles in Hierarchical Models for Human Activity Recognition
T. Lan, L. Sigal and G. Mori
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012

Human attributes from 3D pose tracking
M. Livne, L. Sigal, N. Troje and D. Fleet
Computer Vision and Image Understanding (CVIU), 116:648-660, 2012

Shared kernel information embedding for discriminative inference
R. Memisevic, L. Sigal and D. Fleet
IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 34(4):778-790, 2012

Loose-limbed People: Estimating Human Pose and Motion using Non-parametric Belief Propagation
L. Sigal, M. Isard, H. Haussecker and M. J. Black
International Journal of Computer Vision (IJCV), 98(1):15-48, 2012

Recognizing Character-directed Utterances in Multi-child Interactions
H. Hajishirzi, J. Lehman, K. Kumatani, L. Sigal, and J. Hodgins
late-breaking report section of Human Robot Interaction (HRI), 2012

Facial Expression Transfer with Input-Output Temporal Restricted Boltzmann Machines
M. Zeiler, G. Taylor, L. Sigal, I. Matthews and R. Fergus
Neural Information Processing Systems (NIPS), 2011

Visual Analysis of Humans: Looking at People
T. Moeslund, A. Hilton, V. Krüger and L. Sigal
ISBN 978-0-85729-996-3. To be published by Springer Verlag in October 2011

Benchmark Datasets for Pose Estimation and Tracking
M. Andriluka, L. Sigal and M. J. Black
In Visual Analysis of Humans: Looking at People, T. Moeslund, A. Hilton, V. Krüger and L. Sigal (Eds), ISBN 978-0-85729-996-3. To be published by Springer Verlag in October 2011

Human Pose Estimation
L. Sigal
Encyclopedia of Computer Vision, Springer, 2011

Motion Capture from Body-Mounted Cameras
T. Shiratori, H. S. Park, L. Sigal, Y. Sheikh and J. K. Hodgins
ACM Transactions on Graphics (Proc. SIGGRAPH), July 2011

Inferring 3D Body Pose Using Variational Semi-parametric Regression
Y. Tian, Y. Jia, Y. Shi, Y. Liu, J. Hao and L. Sigal
IEEE International Conference on Image Processing (ICIP), 2011

Latent Gaussian Mixture Regression for Human Pose Estimation
Y. Tian, L. Sigal, H. Badino, F. De la Torre and Y. Liu
Asian Conference on Computer Vision (ACCV), 2010

Human Attributes from 3D Pose Tracking
L. Sigal, D. Fleet, N. Troje, M. Livne
European Conference on Computer Vision, ECCV 2010.

Stable Spaces for Real-time Clothing
E. de Aguiar, L. Sigal, A. Treuille and J. K. Hodgins
ACM Trans. Graphics (Proc. SIGGRAPH), July 2010

Dynamical Binary Latent Variable Models for 3D Human Pose Tracking
G. Taylor, L. Sigal, D. Fleet, G. Hinton
IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2010

HumanEva: Synchronized Video and Motion Capture Dataset and Baseline Algorithm for Evaluation of Articulated Human Motion
L. Sigal, A. Balan and M. J. Black
International Journal of Computer Vision (IJCV), Special Issue on Evaluation of Articulated Human Motion and Pose Estimation, 2010

Estimating Contact Dynamics
M. Brubaker, L. Sigal, D. Fleet
IEEE International Conference on Computer Vision, ICCV 2009

Dynamics and Control of Multibody Systems
M. Vondrak, L. Sigal and O. C. Jenkins
Motion Control, A. Lazinica (Eds), ISBN 978-953-7619-X-X, 2009

Shared Kernel Information Embedding for Discriminative Inference
L. Sigal, R. Memisevic, D. Fleet
IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2009

Video-Based People Tracking
M. Brubaker, L. Sigal and D. Fleet
Handbook on Ambient Intelligence and Smart Environments, H. Nakashima, H. Aghajan, and J.C. Augusto (Eds), Springer Verlag, 2009

Physical Simulation for Probabilistic Motion Tracking
M. Vondrak, L. Sigal and O. C. Jenkins
IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2008