Ziyang Ma (马子阳)

Ph.D. student,
Shanghai Jiao Tong University.
800 Dongchuan RD. Minhang District,
Shanghai, China.
E-mail: zym.22@sjtu.edu.cn

Biography

Hi👋 nice to meet you!

I am currently a Ph.D. student at Shanghai Jiao Tong University (SJTU) and the SJTU Artificial Intelligence Institute, and a member of the Cross Media (X-) Language Intelligence Lab (X-LANCE) in the Department of Computer Science and Engineering. I am co-supervised by Prof. Xie Chen and Prof. Yanmin Qian, and work closely with Prof. Kai Yu. As the first Ph.D. student supervised by Prof. Chen, I will try my best in the next five exciting years! 💪

During my undergraduate years, I was a research assistant at the InteLligent media research center (iLearn), working closely with Prof. Xuemeng Song and Prof. Liqiang Nie.

My research usually follows the KISS philosophy. My recent work focuses on speech, language, audio, and music processing with Self-Supervised Learning (SSL) and Large Language Models (LLMs). If you are also interested, please feel free to contact me.

Education

  • Ph.D., Computer Science and Engineering, Shanghai Jiao Tong University, 2022.09-Now

  • B.E., Computer Science and Technology, Shandong University, 2018.09-2022.06

Interests

  • Self-Supervised Learning

  • Speech and Audio Processing

  • Natural Language Processing

  • Multimedia and Multimodal

NEWS

  • [2024.5] BAT was accepted by ICML 2024.

  • [2024.4] MER24 Challenge@IJCAI and MRAC24 Workshop@ACM Multimedia are coming! [Baseline Paper][Baseline Code][Challenge Homepage]

  • [2024.4] EAT was accepted by IJCAI 2024.

  • [2024.3] 🎉 We won 1st place in Categorical Emotion Recognition at the Odyssey 2024 Emotion Recognition Challenge.

  • [2024.1] Check out our Repo for EAT, a new audio representation model with both effectiveness and efficiency.

  • [2023.12] Check out our Repo for emotion2vec, the first universal speech emotion representation model.

  • [2023.12] 4 papers were accepted by IEEE ICASSP 2024.

  • [2023.9] Check out our Repo for Fast-HuBERT. We achieve a 5.2x speedup in HuBERT pre-training without a performance drop.

  • [2023.9] 2 papers were accepted by IEEE ASRU 2023.

  • [2023.8] MT4SSL was named to the ISCA Interspeech Best Student Paper Shortlist. Congrats!

Research

Selected Publications

Thanks to all the collaborators for their great work!

Check out Google Scholar for more information.

Speech, Language, Audio, Music Processing with SSL

Speech, Language, Audio, Music Processing with LLM

Experiences

Research Intern, Speech Lab, Alibaba DAMO Academy, 2023.06-2024.02

Research Intern, NLC Group, Microsoft Research Asia (MSRA), 2022.02-2022.08

  • Investigated joint pre-training of speech and text to improve the accuracy of ASR and other downstream tasks.

  • Led by Furu Wei, supervised by Shujie Liu, and worked closely with Yu Wu and Long Zhou.

Research Intern, Video Group, MEGVII Research, 2021.04-2021.06

  • Investigated vehicle re-identification with the Transformer architecture.

  • Supervised by Chi Zhang.

Research Assistant, InteLligent media research center (iLearn), Shandong University, 2020.09-2021.09

Academic Service

Conference Reviewer / TPC Member

  • ACL Rolling Review (ACL ARR) 2024

  • IEEE International Conference on Acoustics, Speech and Signal Processing (IEEE ICASSP) 2023, 2024

  • AAAI Conference on Artificial Intelligence 2022

  • ACM International Conference on Multimedia (ACM MM) 2022

Journal Reviewer

  • IEEE Signal Processing Letters (IEEE SPL)

  • IEEE Transactions on Multimedia (IEEE TMM)

  • IEEE Transactions on Circuits and Systems for Video Technology (IEEE TCSVT)

Accomplishments

Awards

  • SPS Travel Grant, IEEE, 2024.02

  • Best Presentation Award in Student Forum, the 18th National Conference on Man-Machine Speech Communication (NCMMSC), 2023.12

  • Interspeech Best Student Paper Shortlist, ISCA, 2023.08

  • Excellent Graduate, Department of Education, Shandong Province, China, 2022.06

  • "Intelligent Pedestal" Scholarship, Huawei, 2021.12

  • SIGMM Student Travel Grant, ACM, 2021.11

  • National Scholarship, Ministry of Education, China, 2021.10

Competitions

Activities

  • [SpeechHome Invited Talk]: INTERSPEECH 2023 Pre-presentation, 2023.07

  • [Datawhale Invited Talk]: How to conduct audio-driven talking head? An introduction and solution sharing, 2022.11

  • Member of Datawhale, 2022.09-Now

  • Teaching Assistant, Computer Science and Technology, Shandong University, 2021.03-2021.06

  • Member of Elite Class, Computer Science and Technology, Shandong University, 2020.09-2022.06