Xiaobao Wei

I am a second-year dual PhD student at the Institute of Software, Chinese Academy of Sciences (ISCAS) and Peking University (PKU), supervised by Prof. Hui Chen of ISCAS, Prof. Shanghang Zhang of PKU, and Ming Lu of Intel Labs China. I received my B.S. in Robotics Engineering from Beihang University in 2023, along with the Beijing Distinguished Graduate Award.

I serve as a reviewer for conferences and journals including CVPR, ICLR, NeurIPS, ICML, ICME, AISTATS, RA-L, and ICSVT.

My research focuses on neural fields and 3D vision.

Email / GitHub / Google Scholar

Research
[AAAI 2025] GraphAvatar: Compact Head Avatars with GNN-Generated 3D Gaussians
Xiaobao Wei, Peng Chen, Ming Lu, Hui Chen, Feng Tian
Paper / Code

We propose GraphAvatar, a compact method that uses Graph Neural Networks (GNNs) to generate 3D Gaussians for head avatar animation.

MixedGaussianAvatar: Realistically and Geometrically Accurate Head Avatar via Mixed 2D-3D Gaussians
Peng Chen, Xiaobao Wei, Qingpo Wuwu, Xinyi Wang, Xingyu Xiao, Ming Lu
Paper / Project / Code

We use 2DGS to maintain the surface geometry and employ 3DGS for color correction in areas where the rendering quality of 2DGS is insufficient, reconstructing a realistic and geometrically accurate 3D head avatar.

EMD: Explicit Motion Modeling for High-Quality Street Gaussian Splatting
Xiaobao Wei*, Qingpo Wuwu*, Zhongyu Zhao, Zhuangzhe Wu, Nan Huang, Ming Lu, Ningning Ma, Shanghang Zhang
Paper / Code

We propose Explicit Motion Decomposition (EMD), which models the motions of dynamic objects by introducing learnable motion embeddings to the Gaussians, enhancing the decomposition in street scenes.

GazeGaussian: High-Fidelity Gaze Redirection with 3D Gaussian Splatting
Xiaobao Wei, Peng Chen, Guangyu Li, Ming Lu, Hui Chen, Feng Tian
Paper / Code

We propose GazeGaussian, a high-fidelity gaze redirection method that uses a two-stream 3DGS model to represent the face and eye regions separately.

PLGS: Robust Panoptic Lifting with 3D Gaussian Splatting
Yu Wang, Xiaobao Wei, Ming Lu, Guoliang Kang
Paper

We propose a new method called PLGS that enables 3DGS to generate consistent panoptic segmentation masks from noisy 2D segmentation masks while maintaining superior efficiency compared to NeRF-based methods.

S3Gaussian: Self-Supervised Street Gaussians for Autonomous Driving
Nan Huang, Xiaobao Wei, Wenzhao Zheng, Pengju An, Ming Lu, Wei Zhan, Masayoshi Tomizuka, Kurt Keutzer, Shanghang Zhang
Paper / Code

We propose a self-supervised street Gaussian (S3Gaussian) method to decompose dynamic and static elements in driving scenes without costly annotations.

[ECCV 2024] I-MedSAM: Implicit Medical Image Segmentation with Segment Anything
Xiaobao Wei*, Jiajun Cao*, Yizhu Jin, Ming Lu, Guangyu Wang, Shanghang Zhang
Paper / Code

We propose I-MedSAM, which combines the benefits of continuous representations and SAM to improve cross-domain ability and delineate boundaries accurately.

[Neural Networks 2024] Multi-scale full spike pattern for semantic segmentation
Qiaoyi Su, Weihua He, Xiaobao Wei, Bo Xu, Guoqi Li
Paper / Code

We propose the multi-scale full spike segmentation network (MFS-Seg), which is based on a directly trained deep SNN and represents the first attempt to train a deep SNN with surrogate gradients for semantic segmentation.

[CVPR 2024] NTO3D: Neural Target Object 3D Reconstruction with Segment Anything
Xiaobao Wei, Renrui Zhang, Jiarui Wu, Jiaming Li, Yandong Guo, Shanghang Zhang
Paper / Code

We propose NTO3D, a high-quality Neural Target Object 3D reconstruction method that leverages the benefits of both neural fields and SAM.

DiffusionTalker: Personalization and Acceleration for Speech-Driven 3D Face Diffuser
Peng Chen*, Xiaobao Wei*, Ming Lu, Yitong Zhu, Naiming Yao, Xingyu Xiao, Hui Chen
Paper / Code

We propose DiffusionTalker, a diffusion-based method that utilizes contrastive learning to personalize 3D facial animation and knowledge distillation to accelerate 3D animation generation.

[CVPR 2023] Open-Vocabulary Point-Cloud Object Detection without 3D Annotation
Yuheng Lu*, Chenfeng Xu*, Xiaobao Wei, Xiaodong Xie, Masayoshi Tomizuka, Kurt Keutzer, Shanghang Zhang
Paper / Code

We propose OV-3DET, which leverages advanced image/vision-language pre-trained models to achieve Open-Vocabulary 3D point-cloud DETection.

[ECCV 2022] MTTrans: Cross-domain object detection with mean teacher transformer
Jinze Yu, Jiaming Liu, Xiaobao Wei, Haoyi Zhou, Yohei Nakata, Denis Gudovskiy, Tomoyuki Okuno, Jianxin Li, Kurt Keutzer, Shanghang Zhang
Paper / Code

We propose MTTrans, an end-to-end cross-domain detection Transformer based on the mean teacher framework, which fully exploits unlabeled target-domain data in object detection training and transfers knowledge between domains via pseudo labels.

[ICGNC 2022] Center-of-Mass-Based Robust Grasp Pose Adaptation Using RGBD Camera and Force/Torque Sensing
Shang Liu*, Xiaobao Wei*, Lulu Wang, Jing Zhang, Boyu Li, Haosong Yue
Paper

When a robotic arm grasps an object with uneven mass distribution, the object's gravity generates additional moments that may cause it to drop. To solve this problem, we present a method that requires neither extra wrist and tactile sensors nor large numbers of experiments for learning.

[CCC 2021] Time-varying group formation-tracking control for heterogeneous multi-agent systems with switching topologies and time-varying delays
Shiyu Zhou, Xiaobao Wei, Xiwang Dong, Yongzhao Hua, Zhang Ren
Paper

We investigate the group formation-tracking problem for heterogeneous multi-agent systems (HMASs) with both switching networks and communication delays.

Internships

2023.07-2023.08: Ai2Robotics 智平方科技, NeRF for Driving Scenes
2024.01-2024.06: AMD, End-to-end Driving at Scale
2024.07-Now: NIO 蔚来汽车, 3DGS for Driving Scenes

Last updated: Nov. 2023
Web page design credit to Jon Barron