Chunbin Lin


Chunbin Lin, Ph.D.

Staff Software Engineer @ MongoDB. Ex-Amazon, Ex-Visa, UCSD PhD.
Email: [email protected]
LinkedIn, DBLP, Google Scholar

Chunbin Lin is currently a staff software engineer at MongoDB, working on vector search in Atlas Search. Before that, he worked on Visa's GenAI Platform team and on the Redshift team at Amazon Web Services (AWS). He also interned at IBM, Informatica, and NUS. He obtained his PhD in Computer Science from the University of California, San Diego (UCSD), where his advisor was Prof. Yannis Papakonstantinou.

Major Projects
  • Vector Search - Design and optimize vector search infrastructure to support large-scale, high-dimensional data. Focus on HNSW index construction and maintenance, including custom merge policies and memory-aware graph building strategies. Ensure high-quality, deterministic search results with efficient query execution and robust index management.
  • GenAI Platform - Develop a high-performance platform that integrates various LLMs, including OpenAI models, Anthropic models, and open-source alternatives. It provides load balancing, rate limiting, and content inspection services, ensuring a low-latency, high-availability system.
  • Machine Learning Platform - Build a scalable and efficient platform for feature creation, model training and inference, and model deployment. Enable AutoML to automate the end-to-end ML lifecycle, simplifying model development and optimization.
  • Database Optimization - Improve database performance through advanced query processing techniques, including join optimization and result cache enhancements. Leverage machine learning-based workload management to predict query execution time, memory requirements, and CPU cycles for optimal scheduling.

Publications

Selected Conference Papers

C23 Auto-WLM: ML-enhanced workload management in Amazon Redshift. Gaurav Saxena, Mohammad Rahman, Naresh Chainani, Chunbin Lin, George Caragea, Fahim Chowdhury, Ryan Marcus, Tim Kraska, Ippokratis Pandis, Balakrishnan (Murali) Narayanaswamy. Proceedings of ACM Conference on Management of Data (SIGMOD), 2023.
C22 Multivariate Time Series Data Imputation Using Attention-Based Mechanism. Jingqi Zhao, Chuitian Rong, Chunbin Lin, Xin Dang. Neurocomputing, 2023.
C21 Highly Efficient String Similarity Search and Join over Compressed Indexes. Guorui Xiao, Jin Wang, Chunbin Lin, Carlo Zaniolo. IEEE International Conference on Data Engineering (ICDE), 2022.
C20 Workload-Aware Performance Tuning for Autonomous DBMSs. Zhengtong Yan, Jiaheng Lu, Naresh Chainani, Chunbin Lin. IEEE International Conference on Data Engineering (ICDE), 2021.
C19 Evaluating List Intersection on SSDs for Parallel I/O Skipping. Jianguo Wang, Chunbin Lin, Yannis Papakonstantinou, Steven Swanson. IEEE International Conference on Data Engineering (ICDE), 2021.
C18 Plato: Approximate Analytics over Compressed Time Series with Tight Deterministic Error Guarantees. Chunbin Lin, Etienne Boursier, Yannis Papakonstantinou. Proceedings of the VLDB Endowment (PVLDB), 2020.
C17 Fast Error-tolerant Location-aware Query Autocompletion. Jin Wang, Chunbin Lin. IEEE International Conference on Data Engineering (ICDE), 2020.
C16 Motif Discovery Using Similarity-Constraints Deep Neural Networks. Chuitian Rong, Ziliang Chen, Chunbin Lin, Jianming Wang. International Conference on Database Systems for Advanced Applications (DASFAA), 2020.
C15 Synergy of Database Techniques and Machine Learning Models for String Similarity Search and Join. Jiaheng Lu, Chunbin Lin, Jin Wang, Chen Li. Proceedings of the 28th ACM International Conference on Information and Knowledge Management (CIKM), pages: 2975-2976, 2019.
C14 MF-Join: Efficient Fuzzy String Similarity Join with Multi-level Filtering. Jin Wang, Chunbin Lin, Carlo Zaniolo. IEEE International Conference on Data Engineering (ICDE), 2019.
C13 Scalable Metric Similarity Join Using MapReduce. Jiacheng Wu, Yong Zhang, Jin Wang, Chunbin Lin, Yingjia Fu, Chunxiao Xing. IEEE International Conference on Data Engineering (ICDE), 2019.
C12 An Efficient Sliding Window Approach for Approximate Entity Extraction with Synonyms. Jin Wang, Chunbin Lin, Mingda Li, Carlo Zaniolo. International Conference on Extending Database Technology (EDBT), 2019.
C11 An Experimental Study of Bitmap Compression vs. Inverted List Compression. Jianguo Wang, Chunbin Lin, Yannis Papakonstantinou, Steven Swanson. Proceedings of ACM Conference on Management of Data (SIGMOD), pages: 993-1008, 2017.
C10 MILC: Inverted List Compression in Memory. Jianguo Wang, Chunbin Lin, Ruining He, Moojin Chae, Yannis Papakonstantinou, Steven Swanson. Proceedings of the VLDB Endowment (PVLDB), Volume 10, Issue 8, pages: 853-864, 2017.
C9 Towards heterogeneous keyword search. Chunbin Lin, Jianguo Wang, Chuitian Rong. Proceedings of the ACM Turing 50th Celebration Conference-China (ACM TUR-C), pages: 1-6, 2017.
C8 Fast and Scalable Distributed Set Similarity Joins for Big Data Analytics. Chuitian Rong, Chunbin Lin, Yasin Silva, Jianguo Wang, Wei Lu, Xiaoyong Du. Proceedings of the International Conference on Data Engineering (ICDE), pages: 1059-1070, 2017.
C7 Answer yes/no queries in search engines. Chunbin Lin. The Conference on Innovative Data Systems Research (CIDR), 2017.
C6 Fast In-Memory SQL Analytics on Typed Graphs. Chunbin Lin, Benjamin Mandel, Yannis Papakonstantinou, Matthias Springer. Proceedings of the VLDB Endowment (PVLDB), Volume 10, Issue 3, pages: 265-276, 2016.
C5 HippogriffDB: Balancing I/O and GPU Bandwidth in Big Data Analytics. Jing Li, Hung-Wei Tseng, Chunbin Lin, Yannis Papakonstantinou, Steven Swanson. Proceedings of the VLDB Endowment (PVLDB), Volume 9, Issue 14, pages: 1647-1658, 2016.
C4 Sherlock: Sparse Hierarchical Embeddings for Visually-aware One-class Collaborative Filtering. Ruining He, Chunbin Lin, Jianguo Wang, Julian McAuley. Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), pages: 3740-3746, 2016.
C3 String similarity measures and joins with synonyms. Jiaheng Lu, Chunbin Lin, Wei Wang, Chen Li, Haiyong Wang. Proceedings of ACM Conference on Management of Data (SIGMOD), pages: 373-384, 2013.
C2 Processing XML Twig Pattern Query with Wildcards. Huayu Wu, Chunbin Lin, Tok Wang Ling, Jiaheng Lu. International Conference on Database and Expert Systems Applications (DEXA), pages: 326-341, 2012.
C1 Optimal top-k generation of attribute combinations based on ranked lists. Jiaheng Lu, Pierre Senellart, Chunbin Lin, Xiaoyong Du, Shan Wang, Xinxing Chen. Proceedings of ACM Conference on Management of Data (SIGMOD), pages: 409-420, 2012.

Selected Journal Articles

(* denotes corresponding author)
J3 Boosting Approximate Dictionary-based Entity Extraction with Synonyms. Jin Wang, Chunbin Lin*, Mingda Li, Carlo Zaniolo. Information Sciences, Volume 530, pages: 1-21, 2020 (Impact Factor: 5.524).
J2 Optimal Algorithms for Selecting Top-k Combinations of Attributes: Theory and Applications. Chunbin Lin*, Jiaheng Lu, Zhewei Wei, Jianguo Wang, Xiaokui Xiao. The International Journal on Very Large Data Bases (VLDB Journal), Volume 27, Issue 1, pages: 27-52, 2018.
J1 Boosting the Quality of Approximate String Matching by Synonyms. Jiaheng Lu, Chunbin Lin, Wei Wang, Chen Li, Xiaokui Xiao. ACM Transactions on Database Systems (TODS), Volume 40, Issue 3, pages: 1-42, 2015.

Selected Demo Papers

D5 GQFast: Fast Graph Exploration with Context-Aware Autocompletion. Chunbin Lin, Jianguo Wang, Yannis Papakonstantinou. Proceedings of the International Conference on Data Engineering (ICDE), pages: 1389-1390, 2017.
D4 SpiderX: Fast XML Exploration System. Chunbin Lin, Jianguo Wang. Proceedings of International World Wide Web Conference (WWW), pages: 237-241, 2017.
D3 Location-sensitive Query Auto-completion. Chunbin Lin, Jianguo Wang, Jiaheng Lu. Proceedings of International World Wide Web Conference (WWW), pages: 819-820, 2017.
D2 Fashionista: A Fashion-aware Graphical System for Exploring Visually Similar Items. Ruining He, Chunbin Lin, Julian McAuley. Proceedings of the International Conference on World Wide Web (WWW), pages: 199-202, 2016.
D1 LotusX: A Position-Aware XML Graphical Search System with Auto-Completion. Chunbin Lin, Jiaheng Lu, Tok Wang Ling, Bogdan Cautis. Proceedings of the International Conference on Data Engineering (ICDE), pages: 1265-1268, 2012.

Academic Service

Session Chair
  • International Conference on Very Large Data Bases (VLDB'20)
  • ACM SIGMOD International Conference on Management of Data (SIGMOD'22)
Program Committee (PC)
  • ACM SIGMOD International Conference on Management of Data (SIGMOD'19, SIGMOD'21, SIGMOD'22)
  • International Conference on Very Large Data Bases (VLDB'21, VLDB'22)
  • IEEE International Conference on Data Engineering (ICDE'22, ICDE'23)
  • ACM Special Interest Group on Knowledge Discovery and Data Mining (SIGKDD'22)
  • International Conference on Extending Database Technology (EDBT'21)
  • AAAI Conference on Artificial Intelligence (AAAI'20, AAAI'21, AAAI'22, AAAI'23)
  • The Web Conference (WWW'19)
  • International Joint Conference on Artificial Intelligence (IJCAI'19, IJCAI'20, IJCAI'21, IJCAI'22, IJCAI'23)
Journal Reviewer
  • The International Journal on Very Large Data Bases (VLDB Journal)
  • World Wide Web Journal
  • Transactions on Knowledge and Data Engineering (TKDE)
  • Information Sciences
  • Information Systems

News

    • 12/2024. I joined the MongoDB Atlas Search team
    • 12/2022. I am honored to be a Program Committee member of IJCAI'23, KDD'23
    • 12/2022. US patent "Predicting query performance for prioritizing query execution" is granted. US-11537616-B1
    • 08/2022. I am honored to be a Program Committee member of WSDM'23, AAAI'23, and DASFAA'23
    • 06/2022. I am honored to serve as a session chair of SIGMOD'22
    • 03/2022. I am honored to be a Program Committee member of ICDE'23
    • 12/2021. I am honored to be a Program Committee member of SIGKDD'22
    • 11/2021. Paper "Highly Efficient String Similarity Search and Join over Compressed Indexes" is accepted by ICDE'22
    • 07/2021. I am honored to be a Program Committee member of IJCAI'22
    • 07/2021. I am honored to be a Program Committee member of AAAI'22
    • 05/2021. I am honored to be a Program Committee member of ICDE'22
    • 04/2021. I am honored to be a Program Committee member of VLDB'22
    • 02/2021. I am honored to be a Program Committee member of SIGMOD'22
    • 12/2020. Tutorial paper "Workload-Aware Performance Tuning for Autonomous DBMSs" is accepted by ICDE'21
    • 08/2020. I am honored to be a Program Committee member of VLDB'21
    • 08/2020. I am honored to serve as a session chair of VLDB'20 for four sessions
    • 06/2020. I am honored to be a Program Committee member of SIGMOD'21
    • 06/2020. I am honored to be a Program Committee member of EDBT'21
    • 04/2020. Paper "Boosting Approximate Dictionary-based Entity Extraction with Synonyms" is accepted by Information Sciences
    • 02/2020. Paper "Approximate Analytics System over Compressed Time Series with Tight Deterministic Error Guarantees" is accepted by PVLDB'20
    • 02/2020. Paper "Deep Neural Networks based Motif Discovery with Similarity Guarantees" is accepted by DASFAA'20
    • 02/2020. Paper "Fast Error-tolerant Location-aware Query Autocompletion" is accepted by ICDE'20
    • 11/2019. Gave a talk about "string similarity joins" at Renmin University of China
    • 11/2019. Gave a talk about "approximate query processing" at Tsinghua University
    • 11/2019. Gave a tutorial talk "synergy of database techniques and machine learning models for string similarity search and join" at CIKM'19
