I have been a computational linguist at Shanghai Jiao Tong University since 2021. Previously, I worked at the Japanese National Institute of Information and Communications Technology from 2016 to 2020. I was a joint Ph.D. student at Shanghai Jiao Tong University and the French National Centre for Scientific Research from 2012 to 2016.
Language intelligence is a form of intelligence in which humans (and machines) learn about the outside world by reading natural language, form highly abstract linguistic thought, and then understand, improve, and create within that world. We aim to explore and extend the boundaries of language intelligence in the following areas:
1. Computational Linguistics: An interdisciplinary field spanning computer science, linguistics, cognitive science, psychology, etc.
2. Language Modeling: The mechanism and application.
3. Machine Translation: I have worked on it for over ten years and will never give it up.
Language Intelligence and Computational Linguistics Lab (MT Lab)
I have always been fortunate to work with these brilliant young researchers (Previously at the Machine Translation Lab from 2017 to 2022).
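For the 2026 intake, the lab plans to admit about 3-4 Ph.D. students and 1-2 Master's students through the summer camp (Master's applicants should pass the summer camp first and then contact me).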
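Recruitment Direction 1 (basic research): Exploring theory of mind in language models
--Explores the similarities and differences in how human intelligence and machine intelligence form, with a main focus on how machines evolve generalization ability from memorization ability. In earlier work, we built a children's cognition database together with the National Children's Medical Center.
--Suited to students with a CS-related background who are interested in psychology and cognitive science.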
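Recruitment Direction 2 (applied research): Investigating the reasoning ability of large language models
--Explores how reasoning ability forms in language models, and builds benchmarks that go far beyond the capability of current model training data. Earlier work identified mechanisms such as Overthinking and, together with Alibaba and Tencent, built well-known benchmarks such as Deepmath103K and PolyMath.
--Suited to students with strong logical thinking who can collaborate efficiently with other researchers at large tech companies.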
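Recruitment Direction 3 (admitted by Prof. Jianfeng Xu (许建峰), jointly supervised with me): Foundational theory of information science and artificial intelligence
--Addresses the core problem that information science lacks a unified, universal foundational theory. Building on earlier results in "Objective Information Theory", this direction develops a general, interpretable, and predictable theoretical framework for artificial intelligence and validates it with thorough experiments.
--Suited to students with a mathematics or CS-related background who aspire to breakthroughs in the foundational theory of information science that stand up to practical testing.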
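Recruitment Direction 4 (admitted by Prof. Jianfeng Xu (许建峰), jointly supervised with me): Applications of artificial intelligence in judicial information system engineering
--Addresses the broad demand for artificial intelligence in the judicial field by studying smart-justice information system architecture and agent development techniques.
--Suited to students with a CS-related background who aspire to advance breakthroughs in applied artificial intelligence technology.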
@SJTU
Ph.D. Students:
Lizhen Xu (2025-)
Ziyin Zhang (2025-)
Wenhong Zhu (2024-)
Qingyuan Tian (2024-)
Yang Han (2023-)
Yiming Wang (2023-)
Xingyu Chen (2022-)
Zhiwei He (2021-)
Master Students:
2024-: Chenxi Yang, Haonan Zang, and Jianing Guo
2023-: Ziyin Zhang, Xiaofeng Wang, and Lizhen Xu
2022-: Hongkun Hao (-->Alibaba DAMO Academy), Yiming Ai (-->China Merchants Bank), Tianxiang Hu (-->The Pudong Government), Wenhong Zhu (-->Ph.D. Student, SJTU), and Tian Xia (-->Kuaishou Technology)
2021-: Ruize Gao (-->Alibaba DAMO Academy)
Undergraduate Students:
-2025: Haoxiang Sun, Yaoyao Wang, Tianyi Liang (ALL-->Master Student, SJTU), and Binlin Zhou (-->Ph.D. Student, PSU)
-2024: Chenxi Yang (-->Master Student, SJTU)
-2023: Ziyin Zhang and Xiaofeng Wang (ALL-->Master Student, SJTU)
-2022: Yushen Chen, Hongkun Hao, Yiming Ai, and Tianxiang Hu (ALL-->Master Student, SJTU)
-2021: Xiaoyi Bao (-->Microsoft), Ruiyi Wang (-->Master Student, CMU)
CS3966: Natural Language Processing and Large Language Model (for the John Class), 2024-
CS3602: Natural Language Processing (for the CS and AI majors), 2021-
CS438: Information Extraction, 2021-2023
CS247: Data Mining, 2021-2022