Pre-prints

Yixin Lin*, Jan Humplik*, Sandy H. Huang*, Leonard Hasenclever, Francesco Romano, Stefano Saliceti, Daniel Zheng, Jose Enrique Chen, Catarina Barros, Adrian Collister, Matt Young, Adil Dostmohamed, Ben Moran, Ken Caluwaerts, Marissa Giustina, Joss Moore, Kieran Connell, Francesco Nori, Nicolas Heess, Steven Bohez, Arunkumar Byravan. Proc4Gem: Foundation models for physical agency through procedural generation. 2025. [arXiv]

Abbas Abdolmaleki*, Sandy H. Huang*, Giulia Vezzani, Bobak Shahriari, Jost Tobias Springenberg, Shruti Mishra, Dhruva TB, Arunkumar Byravan, Konstantinos Bousmalis, Andras Gyorgy, Csaba Szepesvari, Raia Hadsell, Nicolas Heess, Martin Riedmiller. On multi-objective policy optimization as a tool for reinforcement learning: Case studies in offline RL and finetuning. 2021. [arXiv]

Sandy H. Huang, Nicolas Papernot, Ian Goodfellow, Yan Duan, Pieter Abbeel. Adversarial attacks on neural network policies. 2017. [arXiv] [videos]

Conference & Journal Papers

Markus Wulfmeier, Michael Bloesch, Nino Vieillard, Arun Ahuja, Jorg Bornschein, Sandy H. Huang, Artem Sokolov, Matt Barnes, Guillaume Desjardins, Alex Bewley, Sarah Bechtle, Jost Springenberg, Nikola Momchev, Olivier Bachem, Matthieu Geist, Martin Riedmiller. Imitating language via scalable inverse reinforcement learning. NeurIPS 2024. [pdf]

Dhruva Tirumala, Markus Wulfmeier, Ben Moran, Sandy H. Huang, Jan Humplik, Guy Lever, Tuomas Haarnoja, Leonard Hasenclever, Arunkumar Byravan, Nathan Batchelor, Neil Sreendra, Kushal Patel, Marlon Gwira, Francesco Nori, Martin Riedmiller, Nicolas Heess. Learning robot soccer from egocentric vision with deep reinforcement learning. CoRL 2024. [pdf]

Tuomas Haarnoja*, Ben Moran*, Guy Lever*, Sandy H. Huang*, Dhruva Tirumala, Jan Humplik, Markus Wulfmeier, Saran Tunyasuvunakool, Noah Y. Siegel, Roland Hafner, Michael Bloesch, Kristian Hartikainen, Arunkumar Byravan, Leonard Hasenclever, Yuval Tassa, Fereshteh Sadeghi, Nathan Batchelor, Federico Casarini, Stefano Saliceti, Charles Game, Neil Sreendra, Kushal Patel, Marlon Gwira, Andrea Huber, Nicole Hurley, Francesco Nori, Raia Hadsell, Nicolas Heess. Learning agile soccer skills for a bipedal robot with deep reinforcement learning. Science Robotics. April 2024. [Science Robotics] [arXiv] [videos]

Thomas Lampe, Abbas Abdolmaleki, Sarah Bechtle, Sandy H. Huang, Jost Tobias Springenberg, Michael Bloesch, Oliver Groth, Roland Hafner, Tim Hertweck, Michael Neunert, Markus Wulfmeier, Jingwei Zhang, Francesco Nori, Nicolas Heess, Martin Riedmiller. Mastering stacking of diverse shapes with large-scale iterative reinforcement learning on real robots. ICRA 2024. [arXiv]

Dhruva Tirumala, Thomas Lampe, Jose Enrique Chen, Tuomas Haarnoja, Sandy H. Huang, Guy Lever, Ben Moran, Tim Hertweck, Leonard Hasenclever, Martin Riedmiller, Nicolas Heess, Markus Wulfmeier. Replay across experiments: A natural extension of off-policy RL. ICLR 2024. [pdf]

Joe Watson, Sandy H. Huang, Nicolas Heess. Coherent soft imitation learning. NeurIPS 2023. [pdf]

Eliza Kosoy*, Adrian Liu*, Jasmine L. Collins, David Chan, Jessica B. Hamrick, Nan Rosemary Ke, Sandy H. Huang, Bryanna Kaufmann, John Canny, Alison Gopnik. Learning causal overhypotheses through exploration in children and computational models. Conference on Causal Learning and Reasoning 2022. [pdf]

Sandy H. Huang*, Abbas Abdolmaleki*, Giulia Vezzani, Philemon Brakel, Daniel J. Mankowitz, Michael Neunert, Steven Bohez, Yuval Tassa, Nicolas Heess, Martin Riedmiller, Raia Hadsell. A constrained multi-objective reinforcement learning framework. CoRL 2022. [pdf]

Olivia Watkins, Sandy H. Huang, Julius Frost, Kush Bhatia, Eric Weiner, Pieter Abbeel, Trevor Darrell, Bryan Plummer, Kate Saenko, Anca Dragan. Explaining robot policies. Applied AI Letters. 2021. [pdf]

Abbas Abdolmaleki*, Sandy H. Huang*, Leonard Hasenclever, Michael Neunert, H. Francis Song, Martina Zambelli, Murilo F. Martins, Nicolas Heess, Raia Hadsell, Martin Riedmiller. A distributional view on multi-objective policy optimization. ICML 2020. [pdf]

Sandy H. Huang*, Isabella Huang*, Ravi Pandya*, Anca D. Dragan. Nonverbal robot feedback for human teachers. CoRL 2019. [arXiv] (oral, 5.3%)

Sandy H. Huang, David Held, Pieter Abbeel, Anca D. Dragan. Enabling robots to communicate their objectives. Autonomous Robots 2019. [pdf]

Ravi Pandya, Sandy H. Huang, Dylan Hadfield-Menell, Anca D. Dragan. Human-AI learning performance in multi-armed bandits. AIES 2019.

Sandy H. Huang, Kush Bhatia, Pieter Abbeel, Anca D. Dragan. Establishing appropriate trust via critical states. IROS 2018. [arXiv]

Minae Kwon, Sandy H. Huang, Anca D. Dragan. Expressing robot incapability. HRI 2018. [pdf] [arXiv] [video] (best paper award finalist)

Sandy H. Huang, David Held, Pieter Abbeel, Anca D. Dragan. Enabling robots to communicate their objectives. RSS 2017. [pdf] [arXiv] (invited to special issue)

Sandy H. Huang, Jia Pan, George Mulcaire, Pieter Abbeel. Leveraging appearance priors in non-rigid registration, with application to manipulation of deformable objects. IROS 2015. [pdf]

Dylan Hadfield-Menell, Alex X. Lee, Chelsea Finn, Eric Tzeng, Sandy H. Huang, Pieter Abbeel. Beyond lowest-warping cost action selection in trajectory transfer. ICRA 2015. [pdf]

Alex X. Lee, Sandy H. Huang, Dylan Hadfield-Menell, Eric Tzeng, Pieter Abbeel. Unifying scene registration and trajectory optimization for learning from demonstrations with application to manipulation of deformable objects. IROS 2014. [pdf]

Sandy H. Huang, Paea LePendu, Srinivasan V. Iyer, Ming Tai-Seale, David Carrell, Nigam H. Shah. Toward personalizing treatment for depression: predicting diagnosis and severity. JAMIA 21 (6), 1069-1075. 2014. [pdf]

Caroline Suen*, Sandy H. Huang*, Chantat Eksombatchai*, Rok Sosic, Jure Leskovec. NIFTY: A system for large scale information flow tracking and clustering. WWW 2013. [pdf] [interactive]