A. Tuan Nguyen

"The past is just practice"

I am a Research Scientist at Meta, working on large multi-modal language models. Prior to joining Meta, I obtained a PhD (DPhil) in Machine Learning from the University of Oxford. During my time at Oxford, I was fortunate to work with Prof. Philip Torr, Prof. Yarin Gal, and Dr. Gunes Baydin. My PhD thesis focused on addressing the distribution shift problem in machine learning. Before my doctoral studies, I pursued a Master’s degree at the Korea Advanced Institute of Science and Technology (KAIST). During this period, I worked as a research assistant at KAIST’s MLAI Group, supervised by Prof. Sung Ju Hwang.

My (might-be-outdated) resume is available here.

You can reach out to me via email: a.tuan.nguyen at outlook.com

News

Aug 12, 2024 A paper (uCAP) has been accepted to ECCV as an oral presentation.
Apr 8, 2024 I joined Meta (New York) as a Research Scientist.
May 7, 2023 Arrived in Menlo Park for an internship at Meta. Let’s connect!
Feb 27, 2023 A paper accepted to CVPR 2023!
Sep 15, 2022 A paper gets accepted to NeurIPS 2022! Hope to see you in New Orleans.
May 15, 2022 A paper gets accepted to ICML 2022!
Jan 21, 2022 Two papers get accepted to ICLR 2022!
Dec 15, 2021 Received an internship offer from Meta (Facebook). Going to San Francisco for a four-month internship!
Sep 28, 2021 A paper gets accepted to NeurIPS 2021!
Dec 1, 2020 A paper gets accepted to AAAI 2021! Many thanks to my supervisor and collaborators.

Selected Publications

  1. ECCV uCAP: An Unsupervised Prompting Method for Vision-Language Models
     A. Tuan Nguyen, Kai Sheng Tai, Sirius Chen, Satya Narayan Shukla, Hanchao Yu, Philip Torr, Taipeng Tian, and Ser-Nam Lim
     European Conference on Computer Vision (Oral), 2024

    This paper addresses a significant limitation that prevents Contrastive Language-Image Pretrained Models (CLIP) from achieving optimal performance on downstream image classification tasks. The key problem with CLIP-style zero-shot classification is that it requires domain-specific context in the form of prompts to better align the class descriptions to the downstream data distribution. In particular, prompts for vision-language models are domain-level texts (e.g., “a centered satellite image of ...”) which, together with the class names, are fed into the text encoder to provide more context for the downstream dataset. These prompts are typically manually tuned, which is time-consuming and often sub-optimal. To overcome this bottleneck, this paper proposes uCAP, a method to automatically learn domain-specific prompts/contexts using only unlabeled in-domain images. We achieve this by modeling the generation of images given the class names and a domain-specific prompt with an unsupervised likelihood distribution, and then performing inference of the prompts. We validate the proposed method across various models and datasets, showing that uCAP consistently outperforms manually tuned prompts and related baselines on the evaluated datasets: ImageNet, CIFAR-10, CIFAR-100, OxfordPets (up to 2%), SUN397 (up to 5%), and Caltech101 (up to 3%).

    @article{nguyen2024ucap,
      abbr = {ECCV},
      selected = {true},
      bibtex_show = {true},
      pdf = {nguyen2024ucap.pdf},
      title = {uCAP: An Unsupervised Prompting Method for Vision-Language Models},
      author = {Nguyen, A. Tuan and Tai, Kai Sheng and Chen, Sirius and Shukla, Satya Narayan and Yu, Hanchao and Torr, Philip and Tian, Taipeng and Lim, Ser-Nam},
      journal = {European Conference on Computer Vision (Oral)},
      year = {2024}
    }
  2. CVPR TIPI: Test Time Adaptation with Transformation Invariance
     A. Tuan Nguyen, Thanh Nguyen-Tang, Ser-Nam Lim, and Philip Torr
     IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

    When deploying a machine learning model to a new environment, we often encounter the distribution shift problem, meaning the target data distribution is different from the model’s training distribution. In this paper, we assume that labels are not provided for this new domain, and that we do not store the source data (e.g., for privacy reasons). It has been shown that even small shifts in the data distribution can affect the model’s performance severely. Test Time Adaptation offers a means to combat this problem, as it allows the model to adapt during test time to the new data distribution, using only unlabeled test data batches. To achieve this, the predominant approach is to optimize a surrogate loss on the test-time unlabeled target data. In particular, minimizing the prediction’s entropy on target samples (Wang et al., 2020) has received much interest as it is task-agnostic and does not require altering the model’s training phase (e.g., does not require adding a self-supervised task during training on the source domain). However, as the target data’s batch size is often small in real-world scenarios (e.g., autonomous driving models process only a few frames at a time in real time), we argue that this surrogate loss is not optimal since it often collapses with small batch sizes. To tackle this problem, in this paper, we propose to use an invariance regularizer as the surrogate loss during test-time adaptation, motivated by our theoretical results regarding the model’s performance under input transformations. The resulting method (TIPI: Test tIme adaPtation with transformation Invariance) is validated with extensive experiments in various benchmarks (CIFAR-10-C, CIFAR-100-C, ImageNet-C, DIGITS, and VisDA17). Remarkably, TIPI is robust against small batch sizes (as small as 2 in our experiments), and consistently outperforms TENT (Wang et al., 2020) in all settings.

    @article{nguyen2023tipi,
      abbr = {CVPR},
      selected = {true},
      html = {https://openaccess.thecvf.com/content/CVPR2023/html/Nguyen_TIPI_Test_Time_Adaptation_With_Transformation_Invariance_CVPR_2023_paper.html},
      bibtex_show = {true},
      pdf = {CVPR_nguyen2023tipi.pdf},
      title = {TIPI: Test Time Adaptation with Transformation Invariance},
      author = {Nguyen, A. Tuan and Nguyen-Tang, Thanh and Lim, Ser-Nam and Torr, Philip},
      journal = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
      year = {2023}
    }
  3. NeurIPS FedSR: A Simple and Effective Domain Generalization Method for Federated Learning
     A. Tuan Nguyen, Philip Torr, and Ser-Nam Lim
     Advances in Neural Information Processing Systems, 2022

    Federated Learning (FL) refers to the decentralized and privacy-preserving machine learning framework in which multiple clients collaborate (with the help of a central server) to train a global model without sharing their data. However, most existing FL methods only focus on maximizing the model’s performance on the source clients’ data (e.g., mobile users) without considering its generalization ability to unknown target data (e.g., a new user). In this paper, we incorporate the problem of Domain Generalization (DG) into Federated Learning to tackle the aforementioned issue. However, virtually all existing DG methods require a centralized setting where data is shared across the domains, which violates the principles of decentralized FL, and they are hence not applicable. To this end, we propose a simple yet novel representation learning framework, namely FedSR, which enables domain generalization while still respecting the decentralized and privacy-preserving nature of this FL setting. Motivated by classical machine learning algorithms, we aim to learn a simple representation of the data for better generalization. In particular, we enforce an L2-norm regularizer on the representation and a conditional mutual information (between the representation and the data given the label) regularizer to encourage the model to only learn essential information (while ignoring spurious correlations such as the background). Furthermore, we provide theoretical connections between the above two objectives and representation alignment in domain generalization. Extensive experimental results suggest that our method significantly outperforms relevant baselines in this particular problem.

    @article{nguyen2022fedsr,
      abbr = {NeurIPS},
      pdf = {NeurIPS_nguyen2022fedsr.pdf},
      selected = {true},
      html = {https://openreview.net/forum?id=mrt90D00aQX},
      bibtex_show = {true},
      title = {FedSR: A Simple and Effective Domain Generalization Method for Federated Learning},
      author = {Nguyen, A. Tuan and Torr, Philip and Lim, Ser-Nam},
      journal = {Advances in Neural Information Processing Systems},
      year = {2022}
    }
  4. ICLR KL Guided Domain Adaptation
     A. Tuan Nguyen, Toan Tran, Yarin Gal, Philip H. S. Torr, and Atılım Güneş Baydin
     International Conference on Learning Representations, 2022

    Domain adaptation is an important problem and is often needed for real-world applications. In this problem, instead of i.i.d. datapoints, we assume that the source (training) data and the target (testing) data have different distributions. With that setting, the empirical risk minimization training procedure often does not perform well, since it does not account for the change in the distribution. A common approach in the domain adaptation literature is to learn a representation of the input that has the same distributions over the source and the target domain. However, these approaches often require additional networks and/or optimizing an adversarial (minimax) objective, which can be very expensive or unstable in practice. To tackle this problem, we first derive a generalization bound for the target loss based on the training loss and the reverse Kullback–Leibler (KL) divergence between the source and the target representation distributions. Based on this bound, we derive an algorithm that minimizes the KL term to obtain a better generalization to the target domain. We show that with a probabilistic representation network, the KL term can be estimated efficiently via minibatch samples without any additional network or a minimax objective. This leads to a theoretically sound alignment method which is also very efficient and stable in practice. Experimental results also suggest that our method outperforms other representation-alignment approaches.

    @article{nguyen2022kl,
      abbr = {ICLR},
      bibtex_show = {true},
      selected = {true},
      pdf = {ICLR_nguyen2022kl.pdf},
      html = {https://openreview.net/forum?id=0JzqUlIVVDd},
      title = {KL Guided Domain Adaptation},
      author = {Nguyen, A. Tuan and Tran, Toan and Gal, Yarin and Torr, Philip H. S. and Baydin, At{\i}l{\i}m G{\"u}ne{\c{s}}},
      journal = {International Conference on Learning Representations},
      year = {2022}
    }
  5. NeurIPS Domain Invariant Representation Learning with Domain Density Transformations
     A. Tuan Nguyen, Toan Tran, Yarin Gal, and Atılım Güneş Baydin
     Advances in Neural Information Processing Systems, 2021

    Domain generalization refers to the problem where we aim to train a model on data from a set of source domains so that the model can generalize to unseen target domains. Naively training a model on the aggregate set of data (pooled from all source domains) has been shown to perform suboptimally, since the information learned by that model might be domain-specific and generalize imperfectly to target domains. To tackle this problem, a predominant domain generalization approach is to learn some domain-invariant information for the prediction task, aiming at a good generalization across domains. In this paper, we propose a theoretically grounded method to learn a domain-invariant representation by enforcing the representation network to be invariant under all transformation functions among domains. We next introduce the use of generative adversarial networks to learn such domain transformations in a possible implementation of our method in practice. We demonstrate the effectiveness of our method on several widely used datasets for the domain generalization problem, on all of which we achieve competitive results with state-of-the-art models.

    @article{nguyen2021domain,
      abbr = {NeurIPS},
      bibtex_show = {true},
      selected = {true},
      pdf = {NeurIPS_nguyen2021domain.pdf},
      html = {https://papers.nips.cc/paper/2021/hash/2a2717956118b4d223ceca17ce3865e2-Abstract.html},
      title = {Domain Invariant Representation Learning with Domain Density Transformations},
      author = {Nguyen, A. Tuan and Tran, Toan and Gal, Yarin and Baydin, At{\i}l{\i}m G{\"u}ne{\c{s}}},
      journal = {Advances in Neural Information Processing Systems},
      year = {2021}
    }
  6. AAAI Clinical Risk Prediction with Temporal Probabilistic Asymmetric Multi-Task Learning
     A. Tuan Nguyen, Hyewon Jeong, Eunho Yang, and Sung Ju Hwang
     Proceedings of the AAAI Conference on Artificial Intelligence, 2021

    Although recent multi-task learning methods have been shown to be effective in improving the generalization of deep neural networks, they should be used with caution for safety-critical applications, such as clinical risk prediction. This is because even if they achieve improved task-average performance, they may still yield degraded performance on individual tasks, which may be critical (e.g., prediction of mortality risk). Existing asymmetric multi-task learning methods tackle this negative transfer problem by performing knowledge transfer from tasks with low loss to tasks with high loss. However, using loss as a measure of reliability is risky since low loss could result from overfitting. In the case of time-series prediction tasks, knowledge learned for one task (e.g., predicting the sepsis onset) at a specific timestep may be useful for learning another task (e.g., prediction of mortality) at a later timestep, but the lack of a loss at each timestep makes it difficult to measure the reliability at each timestep. To capture such dynamically changing asymmetric relationships between tasks in time-series data, we propose a novel temporal asymmetric multi-task learning model that performs knowledge transfer from certain tasks/timesteps to relevant uncertain tasks, based on the feature-level uncertainty. We validate our model on multiple clinical risk prediction tasks against various deep learning models for time-series prediction, which our model significantly outperforms without any sign of negative transfer. Further qualitative analysis of learned knowledge graphs by clinicians shows that they are helpful in analyzing the predictions of the model.

    @article{nguyen2021clinical,
      abbr = {AAAI},
      bibtex_show = {true},
      selected = {true},
      pdf = {AAAI_nguyen2021clinical.pdf},
      html = {https://ojs.aaai.org/index.php/AAAI/article/view/17097},
      title = {Clinical Risk Prediction with Temporal Probabilistic Asymmetric Multi-Task Learning},
      volume = {35},
      url = {https://ojs.aaai.org/index.php/AAAI/article/view/17097},
      number = {10},
      journal = {Proceedings of the AAAI Conference on Artificial Intelligence},
      author = {Nguyen, A. Tuan and Jeong, Hyewon and Yang, Eunho and Hwang, Sung Ju},
      year = {2021},
      pages = {9081-9091}
    }
The best way to reach me is via email.
