MaoSong2022

Hi there 👋 I'm Mao Song (毛松)

I'm a researcher at Shanghai Artificial Intelligence Laboratory (Shanghai AI Lab), currently focusing on Multimodal Large Language Models (MLLMs) and on the potential of unified understanding and generation models as a path towards AGI.

My academic journey includes a Master's degree from ShanghaiTech University (supervised by Professor Wang Hao) and a Bachelor's degree from Beijing Institute of Technology (BIT).

My Recent Work & Interests:

Multimodal Large Language Models (MLLMs): I developed DocParser, a tool for processing academic papers from arXiv using their LaTeX source files. Leveraging DocParser, we released DocGenome, a rich academic dataset with annotations covering layout, OCR, and entity relationships, intended to improve MLLM understanding of text-rich images.
Unified Understanding & Generation Models: While I haven't initiated a specific project in this area yet, I believe it represents a crucial step towards achieving true Artificial General Intelligence.

As a newcomer to this domain, I am actively learning from foundational works like Qwen-VL and InternVL, aiming to contribute meaningfully to this emerging field.

Explore More:
