Semantic VSLAM System with Inertial, Wheel, and Surround-view Sensors for Autonomous Indoor Parking

Feature-based VSLAM often suffers from perspective and illumination changes in GPS-denied parking lots. To address this, we use semantic features extracted from surround-view cameras to achieve long-term stable and robust localization. Thanks to accurate multi-camera calibration and inverse perspective mapping (IPM), semantic features in the surround-view image can be projected onto the ground plane to construct a more stable and consistent feature submap. In the back end, semantic features are added to the original tightly-coupled optimization over IMU, wheel, and visual measurements as point-to-point-style residuals. Finally, loop closure detection eliminates accumulated drift.
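The core of IPM is back-projecting a pixel ray and intersecting it with the ground plane. A minimal sketch of that geometry, assuming a pinhole intrinsic matrix K and a known camera-to-vehicle extrinsic (the function name and conventions are illustrative, not the project's actual API):

```python
import numpy as np

def ipm_project(u, v, K, R_vc, t_vc):
    """Back-project pixel (u, v) onto the ground plane z = 0 in the vehicle frame.

    K    : 3x3 camera intrinsic matrix
    R_vc : 3x3 rotation, camera frame -> vehicle frame
    t_vc : camera position in the vehicle frame (t_vc[2] is camera height)
    Returns the 3D ground point in the vehicle frame.
    """
    # Viewing ray in the camera frame, rotated into the vehicle frame.
    ray_c = np.linalg.inv(K) @ np.array([u, v, 1.0])
    ray_v = R_vc @ ray_c
    # Intersect the ray t_vc + s * ray_v with the plane z = 0.
    s = -t_vc[2] / ray_v[2]
    return t_vc + s * ray_v
```

Because the intersection depends only on calibration and the planar-ground assumption, the resulting ground points are metric and viewpoint-independent, which is what makes the semantic submap stable across perspective change.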

Tightly-coupled Visual-Inertial-Wheel Odometry for Ground Robot

VINS has additional unobservable directions, such as scale, when a ground robot is constrained to particular motion patterns. Furthermore, accelerometer measurements on a ground robot are far noisier than those on an aerial robot. For these reasons, wheel measurements are integrated into VINS: referencing excellent open-source projects such as VIW-Fusion, we implement wheel odometer pre-integration, the corresponding residuals, and extrinsic parameter calibration. In addition, GPU-accelerated feature extraction and optical flow are integrated into the system to speed up the front end. The back-end optimization is also improved to detect outliers in the IMU and wheel pre-integrations and the visual measurements, and to remove or down-weight them. FAST-LIO2 is also integrated based on a factor graph. Furthermore, sparsification of the graph optimization is on the to-do list.
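The idea behind wheel pre-integration is to accumulate the relative pose increment between two keyframes once, independent of the absolute state, so the optimizer can re-linearize cheaply. A minimal planar sketch for a differential-drive model (body velocities v, w assumed already derived from encoder ticks; bias/covariance propagation omitted):

```python
import numpy as np

def preintegrate_wheel(measurements, dt):
    """Accumulate the relative 2D pose (dx, dy, dtheta) between two
    keyframes from wheel odometry samples.

    measurements : list of (v, w) pairs -- linear and angular body
                   velocity, e.g. computed from left/right encoder ticks.
    dt           : sampling interval of the encoder readings.
    """
    dx = dy = dth = 0.0
    for v, w in measurements:
        # Integrate in the frame of the first keyframe, so the result
        # does not depend on the absolute robot pose.
        dx += v * np.cos(dth) * dt
        dy += v * np.sin(dth) * dt
        dth += w * dt
    return dx, dy, dth
```

In the real back end this increment enters the factor graph as a relative-pose residual between consecutive keyframes, alongside the IMU pre-integration and visual factors.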

Autonomous Driving for Tracked Robot in Off-road Environment

In off-road environments, the assumption of horizontal ground is usually invalid, so IMU and wheel encoder measurements are integrated in closed form on SE(3), which can then be used to correct motion distortion. In addition, LPD-Net (which I reproduced myself) is integrated into LIO-SAM to detect loop closures with a coarse-to-fine sequence matching strategy, which helps build a more accurate map for map-based localization. PLReg3D then learns local and global descriptors jointly for global localization at the initialization step. Finally, a loosely-coupled method based on a pose graph provides the robot with a robust and accurate pose.
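One way to see why the fusion must live on SE(3) rather than SE(2): on sloped terrain the wheel-measured speed acts along the body's (tilted) forward axis, so the rotation from the gyro must be applied before translating. A minimal single-step sketch under these assumptions (Rodrigues exponential map, constant rates over dt, biases ignored):

```python
import numpy as np

def so3_exp(phi):
    """Rodrigues formula: rotation vector -> rotation matrix."""
    theta = np.linalg.norm(phi)
    if theta < 1e-9:
        return np.eye(3)
    a = phi / theta
    A = np.array([[0.0, -a[2], a[1]],
                  [a[2], 0.0, -a[0]],
                  [-a[1], a[0], 0.0]])
    return np.eye(3) + np.sin(theta) * A + (1.0 - np.cos(theta)) * (A @ A)

def integrate_pose(R, p, gyro, v_body, dt):
    """One closed-form pose step on SE(3): rotate by the gyro increment,
    then translate along the (possibly tilted) body x-axis at the
    wheel-measured speed v_body."""
    R_new = R @ so3_exp(gyro * dt)
    p_new = p + R_new @ np.array([v_body * dt, 0.0, 0.0])
    return R_new, p_new
```

Interpolating these incremental poses over a LiDAR sweep gives the per-point transforms used to undistort the scan.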

Semi-supervised 3D detection

Current detection models in autonomous driving rely heavily on annotated data, which is expensive for autonomous driving companies to obtain. To this end, large-scale unlabeled collected data is exploited through self- and semi-supervised training. In this project, I adapt SESS, Mean Teacher, Pseudo-Label, and 3DIoUMatch to my detection model. The picture below visualizes a labeled scan and an unlabeled scan.
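A mechanism shared by SESS and Mean Teacher is the exponential moving average (EMA) teacher: the teacher's weights track a smoothed copy of the student's, yielding more stable pseudo-targets on the unlabeled scans. A minimal sketch with weights represented as a plain dict (the framework-specific details are omitted):

```python
def ema_update(teacher, student, alpha=0.999):
    """Mean Teacher update: teacher <- alpha * teacher + (1 - alpha) * student.

    teacher, student : dicts mapping parameter name -> value.
    alpha            : EMA decay; close to 1 means a slowly-moving teacher.
    """
    for name in teacher:
        teacher[name] = alpha * teacher[name] + (1.0 - alpha) * student[name]
    return teacher
```

After each student optimization step, the updated teacher runs inference on unlabeled scans, and a consistency (or pseudo-label) loss pulls the student's predictions toward the teacher's.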