Yuhang Ma
AI researcher at Bytedance
Work Email: mayuhang.26942 (at) bytedance.com
Personal Email: yuhang_ma0307 (at) 163.com / astronaut0307 (at) gmail.com
Research Interests
AIGC; multimodal pretraining, including text-to-image, LLM, and MLLM pretrained models; conditioned text-to-image generation, with a focus on IP consistency; multi-agent systems
Biography
Yuhang Ma (马宇航) is an AI researcher at Bytedance, focusing on conditioned image generation. Previously, she worked at Fuxi AI Lab, NetEase Inc. (2022-2025), where she was responsible for pretraining the Danqing text-to-image generation model (WeChat mini app "丹青约"), fine-tuning the Danqing VLM and LLM models, and IP consistency research. She obtained her master's degree from University College London and her bachelor's degree from Hunan University. She studied at the National University of Singapore as a visiting student in the 2019 winter semester.
Publications
(* indicates equal contribution, † indicates project leader)
@misc{huang2025comfygptselfoptimizingmultiagentcomprehensive,
  title={ComfyGPT: A Self-Optimizing Multi-Agent System for Comprehensive ComfyUI Workflow Generation}, 
  author={Oucheng Huang and Yuhang Ma and Zeng Zhao and Mingrui Wu and Jiayi Ji and Rongsheng Zhang and Zhipeng Hu and Xiaoshuai Sun and Rongrong Ji},
  year={2025},
  eprint={2503.17671},
  archivePrefix={arXiv},
  primaryClass={cs.MA},
  url={https://arxiv.org/abs/2503.17671}, 
}
Yuhang Ma*†, Wenting Xu*, Chaoyi Zhao*, Keqiang Sun, Qinfeng Jin, Zeng Zhao, Changjie Fan, Zhipeng Hu
@misc{ma2024storynizorconsistentstorygeneration,
  title={Storynizor: Consistent Story Generation via Inter-Frame Synchronized and Shuffled ID Injection}, 
  author={Yuhang Ma and Wenting Xu and Chaoyi Zhao and Keqiang Sun and Qinfeng Jin and Zeng Zhao and Changjie Fan and Zhipeng Hu},
  year={2024},
  eprint={2409.19624},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2409.19624}, 
}
Yuhang Ma*†, Wenting Xu*, Jiji Tang*, Qinfeng Jin, Rongsheng Zhang, Zeng Zhao, Changjie Fan, Zhipeng Hu
@misc{ma2024characteradapterpromptguidedregioncontrol,
  title={Character-Adapter: Prompt-Guided Region Control for High-Fidelity Character Customization}, 
  author={Yuhang Ma and Wenting Xu and Jiji Tang and Qinfeng Jin and Rongsheng Zhang and Zeng Zhao and Changjie Fan and Zhipeng Hu},
  year={2024},
  eprint={2406.16537},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2406.16537}, 
}
Mushui Liu*, Yuhang Ma*†, Xinfeng Zhang, Zhen Yang, Zeng Zhao, Bai Liu, Changjie Fan, Zhipeng Hu
@misc{liu2024llm4genleveragingsemanticrepresentation,
  title={LLM4GEN: Leveraging Semantic Representation of LLMs for Text-to-Image Generation}, 
  author={Mushui Liu and Yuhang Ma and Xinfeng Zhang and Yang Zhen and Zeng Zhao and Zhipeng Hu and Bai Liu and Changjie Fan},
  year={2024},
  eprint={2407.00737},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2407.00737}, 
}
Jingqun Tang*, Qiao Su*, Benlei Cui*, Yuhang Ma, Sheng Zhang, Dimitrios Kanoulas
@inproceedings{tang2022you,
  title={You can even annotate text with voice: Transcription-only-supervised text spotting},
  author={Tang, Jingqun and Qiao, Su and Cui, Benlei and Ma, Yuhang and Zhang, Sheng and Kanoulas, Dimitrios},
  booktitle={Proceedings of the 30th ACM International Conference on Multimedia},
  pages={4154--4163},
  year={2022}
}
Career
[05/2022-03/2025] Fuxi AI Lab, NetEase Inc.
[03/2025-Present] Bytedance
Hobbies
Werewolf (狼人杀) fanatic, League of Legends (LOL), piano, vlogging, and a cat person with 4 cats.