I am a senior student at the College of Computer Science and Technology, Southeast University, majoring in Artificial Intelligence, and will graduate with a B.S. in Engineering in 2025! Additionally, I am an incoming Master's student at LAMDA@Nanjing University, advised by Prof. Hanjia Ye.

🔥 News

  • 2024.11:  🎉🎉 I won the President's Scholarship (top 1%)!
  • 2024.08:  🎉🎉 One paper was accepted by the AAAI 2024 Alignment Workshop!
  • 2024.07:  🎉🎉 One paper was accepted by NeurIPS 2024!
  • 2023.07:  🎉🎉 I won a silver medal on Kaggle!

📝 Publications

AAAI2024 workshop

Reinforcement Learning from Multi-role Debates as Feedback for Bias Mitigation in LLMs

Ruoxi Cheng†, Haoxuan Ma†, Shuirong Cao†, Jiaqi Li, Aihua Pei, Zhiqiang Wang‡, Pengliang Ji, Haoyu Wang, Jiaqi Huo († Equal contribution)

Project Page

  • We propose Reinforcement Learning from Multi-role Debates as Feedback (RLDF), a novel approach to bias mitigation that replaces the human feedback used in traditional RLHF. We use LLMs in multi-role debates to create a dataset that includes both high-bias and low-bias instances for training the reward model in reinforcement learning.
NeurIPS2024

Lever LM: Configuring In-Context Sequence to Lever Large Vision Language Models

Xu Yang, Yingzhe Peng, Haoxuan Ma, Shuo Xu, Chi Zhang, Yucheng Han, Hanwang Zhang

Project Page

  • We propose to use a tiny Language Model (LM), e.g., a Transformer with 67M parameters, to lever much larger Vision-Language Models (LVLMs) with 9B parameters. Specifically, we use this tiny Lever-LM to configure effective in-context demonstration (ICD) sequences to improve the In-Context Learning (ICL) performance of LVLMs.
Preprint

Inverse Reinforcement Learning with Dynamic Reward Scaling for LLM Alignment

Ruoxi Cheng†, Haoxuan Ma†, Weixin Wang†, Zhiqiang Wang, Xiaoshuang Jia, Simeng Qin, Xiaochun Cao, Yang Liu, Xiaojun Jia († Equal contribution)

Project Page

  • We propose DR-IRL, which Dynamically adjusts Rewards through Inverse Reinforcement Learning. We first construct a balanced safety dataset covering seven harmful categories using Chain-of-Draft (CoD) template prompts. We then train category-specific reward models via IRL, using this dataset as demonstrations. Finally, we propose GRPO-S (Group Relative Policy Optimization-Scaling), a variant of GRPO that scales the reward during optimization according to task difficulty: data-level hardness measured by CLIP similarity, and model-level responsiveness measured by reward gaps.
Preprint

Fine-grained Masked-image Language Alignment

YiCheng Xiao†, Yu Chen†, Haoxuan Ma†, Jiale Hong†, Caorui Li, Lingxiang Wu, Zheng Wang, Kuan Zhu, Haiyun Guo, Jinqiao Wang († equal contribution)

Project Page

  • We propose Fine-grained Masked-image Language Alignment (FMLA), a novel fine-tuning approach that exploits the local semantic alignment between masks and complex long texts. Our FMLA model can effectively represent images at any granularity (whether local or global) while leveraging the LLM to process complex long texts. This makes it the first model capable of simultaneously handling local visual prompt input and long text input, thereby overcoming the granularity limitations in both the visual and textual domains.
Preprint

SegBins: Self-Supervised Monocular Depth Estimation Based On Depth Bins And Semantic Segmentation

Yicheng Xiao†, Haoxuan Ma†, Zhenhao Shen, Jinfei Qi, RuiFeng Xie, Zixiang Zhang, Haoxiao Wang, Weijie Wang, Peilin Sun, Jiale Hong, Jingyang Fan, Xiaolin Fang, Haiyun Guo, Jinqiao Wang († equal contribution)

Project Page

  • We propose a new self-supervised monocular depth estimation framework that enhances spatial interaction information and applies multi-layer feature fusion to extract latent geometric priors of the scenes in images. The extracted features are then classified into multiple depth bins to obtain probabilities, which are combined to form the depth.

🎖 Honors and Awards

  • 2024.11 President’s Scholarship (top 1%) at Southeast University
  • 2023.11 Zhishan Scholarship at Southeast University
  • 2023.09 Suzhou Industrial Scholarship at Southeast University
  • 2023.06 Merit Student at Southeast University
  • 2022.11 Zhishan Scholarship at Southeast University
  • 2022.09 Lenovo Research Institute Scholarship at Southeast University

📖 Education

  • 2021.08 - 2025.05 (now), Southeast University, College of Computer Science and Technology.

💻 Internships