About me

I’m a Member of Technical Staff at AMI Labs, where I think about how to build visually intelligent systems. My research spans multimodal generation, understanding, and open-world perception, with the goal of building a unified world model.

I build large-scale systems and research how they should work. I’ve explored concurrent mixed-modal generation [OneFlow], temporally expansive video generation [Flowception], scaling laws for multimodal pretraining [Beyond Language Modeling], and tokenization-free language modeling [Byte Latent Transformer, ACL Outstanding Paper]. On the systems side, I built Opacus (4M+ downloads) for differentially private model training, and Papaya (MLSys 2022), a large-scale asynchronous federated learning platform deployed to millions of users.

Previously, I was at Meta (FAIR), working on large multimodal models and federated learning. I graduated cum laude from UC Davis with a double major in Statistics and Computer Science (B.S., 2018) and an M.S. in Computer Science (2019).

Publications (see all)

2026

Beyond Language Modeling: An Exploration of Multimodal Pretraining

  • Shengbang Tong*, David Fan*, John Nguyen*, Ellis Brown, Gaoyue Zhou, Shengyi Qian, Boyang Zheng, Théophane Vallaeys, Junlin Han, Rob Fergus, Naila Murray, Marjan Ghazvininejad, Mike Lewis, Nicolas Ballas, Amir Bar, Michael Rabbat, Jakob Verbeek, Luke Zettlemoyer, Koustuv Sinha, Yann LeCun, Saining Xie
  • *Joint first author
  • Website

2025

Flowception: Temporally Expansive Flow Matching for Video Generation

  • Tariq Berrada Ifriqi, John Nguyen, Karteek Alahari, Jakob Verbeek, Ricky T. Q. Chen

OneFlow: Concurrent Mixed-Modal and Interleaved Generation with Edit Flows

  • John Nguyen, Marton Havasi, Tariq Berrada, Luke Zettlemoyer, Ricky T. Q. Chen
  • Website

VUGEN: Visual Understanding priors for GENeration

  • Xingyi Chen, Tom Vallaeys, Maha Elbayad, John Nguyen, Jakob Verbeek

2024

Byte Latent Transformer: Patches Scale Better Than Tokens

  • Artidoro Pagnoni, Ram Pasunuru*, Pedro Rodriguez*, John Nguyen*, Benjamin Muller, Margaret Li, Chunting Zhou, Lili Yu, Jason Weston, Luke Zettlemoyer, Gargi Ghosh, Mike Lewis, Ari Holtzman, Srinivasan Iyer
  • *Joint second author
  • Outstanding Paper Award, ACL 2025

Now It Sounds Like You: Learning Personalized Vocabulary On Device

  • Sid Wang, Ashish Shenoy, Pierce Chuang, John Nguyen
  • AAAI 2024 Spring Symposium

2023

READ: Recurrent Adaptation of Large Transformers

  • John Nguyen*, Sid Wang*, Ke Li, Carole-Jean Wu
  • NeurIPS 2023 R0-FoMo: Robustness of Few-shot and Zero-shot Learning in Foundation Models Workshop

On Noisy Evaluation in Federated Hyperparameter Tuning

  • Kevin Kuo, Pratiksha Thaker, Mikhail Khodak, John Nguyen, Daniel Jiang, Ameet Talwalkar, Virginia Smith
  • Conference on Machine Learning and Systems (MLSys), 2023

Where to Begin? Exploring the Impact of Pre-Training and Initialization in Federated Learning

  • John Nguyen, Jianyu Wang, Kshitiz Malik, Maziar Sanjabi, Michael Rabbat
  • Spotlight at International Conference on Learning Representations (ICLR) 2023
  • Presentation

2022

Toward Fair Federated Recommendation Learning: Characterizing the Inter-Dependence of System and Data Heterogeneity

  • Kiwan Maeng, Haiyu Lu, Luca Melis, John Nguyen, Mike Rabbat, Carole-Jean Wu
  • Best Paper Finalist Award at the ACM Conference Series on Recommender Systems (RecSys), 2022

Papaya: Practical, Private, and Scalable Federated Learning

  • Dzmitry Huba, John Nguyen, Kshitiz Malik, Ruiyu Zhu, Mike Rabbat, Ashkan Yousefpour, Carole-Jean Wu, Hongyuan Zhan, Pavel Ustinov, Harish Srinivas, Kaikai Wang, Anthony Shoumikhin, Jesik Min, Mani Malek
  • Conference on Machine Learning and Systems (MLSys), 2022

Federated Learning with Buffered Asynchronous Aggregation

  • John Nguyen, Kshitiz Malik, Hongyuan Zhan, Ashkan Yousefpour, Mike Rabbat, Mani Malek, Dzmitry Huba
  • International Conference on Artificial Intelligence and Statistics (AISTATS), 2022
  • Presentation

2021

Opacus: User-Friendly Differential Privacy Library in PyTorch

  • Ashkan Yousefpour*, Igor Shilov*, Alexandre Sablayrolles*, Davide Testuggine, Karthik Prasad, Mani Malek, John Nguyen, Sayan Ghosh, Akash Bharadwaj, Jessica Zhao, Graham Cormode, Ilya Mironov
  • *Equal contribution
  • Privacy in Machine Learning (PriML) workshop, NeurIPS 2021