Research
I'm interested in computer vision, generative AI. I mainly focus on diffusion-based image and video generation.
UniCon: A Simple Approach to Unifying Diffusion-based Conditional Generation
Xirui Li , Charles Herrmann, Kelvin C.K. Chan, Yinxiao Li, Deqing Sun, Chao Ma, Ming-Hsuan Yang
arXiv , 2024.
arxiv /
project page /
A simple, unified framework to handle diverse conditional generation tasks involving a specific image-condition correlation in one diffusion model.
Your browser does not support the video tag.
VidToMe: Video Token Merging for Zero-Shot Video Editing
Xirui Li , Chao Ma, Xiaokang Yang, Ming-Hsuan Yang
CVPR , 2024.
arxiv /
project page /
code /
A zero-shot video editing method utilizing a pretrained image diffusion model. The key idea is to enforce video temporal consistency by merging self-attention tokens across frames.
Frame Fusion with Vehicle Motion Prediction for 3D Object Detection
Xirui Li , Feng Wang, Naiyan Wang, Chao Ma
ICRA , 2024.
arxiv /
A detection enhancement method which improves 3D object detection results by forwarding and fusing history detection results.