OrcVIO: Object residual constrained Visual-Inertial Odometry
Department of Electrical and Computer Engineering, University of California San Diego
IROS 2020
Overview
Introduction of OrcVIO.
Animations
This animation shows color-coded object-level tracks of semantic keypoints and green tracks of geometric features.
This animation shows the 2D IoU between the annotated bounding boxes and those detected by YOLO. In the label, id is our object id, while gt is the id in the annotation.
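For reference, the 2D IoU used in this comparison is the standard intersection-over-union of axis-aligned boxes. The snippet below is a minimal illustrative sketch, not the project's evaluation code, and assumes boxes in (x1, y1, x2, y2) pixel format.

def iou_2d(box_a, box_b):
    """2D intersection-over-union of two axis-aligned boxes.

    Boxes are given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    """
    # Intersection rectangle
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Example: annotated box vs. a hypothetical YOLO detection
print(iou_2d((100, 100, 200, 200), (120, 110, 210, 190)))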
This animation shows the reprojected objects. The object state is reprojected onto the image: the object detection is the blue rectangle, the object shape is the red wireframe, and the green ellipse is the reprojection of the ellipsoid that we use to represent objects.
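The green ellipse comes from projecting the object ellipsoid through the camera. Below is a minimal numpy sketch of this idea using the standard dual-quadric projection C* ~ P Q* P^T; the intrinsics, pose, and semi-axes here are illustrative placeholders, not values used by OrcVIO.

import numpy as np

# Hypothetical camera intrinsics and world-to-camera pose, for illustration only.
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)
t = np.array([[0.0], [0.0], [5.0]])   # object 5 m in front of the camera
P = K @ np.hstack([R, t])             # 3x4 projection matrix

# Dual quadric of an ellipsoid centered at the world origin with
# semi-axes (a, b, c): Q* = diag(a^2, b^2, c^2, -1).
a, b, c = 2.0, 1.0, 1.5
Q_star = np.diag([a**2, b**2, c**2, -1.0])

# The image of the ellipsoid is an ellipse whose dual conic is C* ~ P Q* P^T.
C_star = P @ Q_star @ P.T
C_star /= -C_star[2, 2]               # normalize so the bottom-right entry is -1

# With this normalization the ellipse center (in pixels) is -C*[:2, 2].
center = -C_star[:2, 2]
print("projected ellipse center (pixels):", center)   # ~ (320, 240) here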
Presentation
IROS 2020 short version
IROS 2020 long version
Chinese version
More results for OrcVIO
Semantic keypoint detection
Our approach uses StarMap for semantic keypoint detection. As can be observed in the upper row of the figure below, it can handle a certain degree of viewpoint, scale, and visibility variation, since StarMap uses a large training set to prevent overfitting.
Nonetheless, the lower row shows some failure cases due to occlusion or instance variation. Wrong detections or too few detections cause problems in our approach.
Semantic keypoint detection from StarMap.
Keypoint detection covariance
We use Monte Carlo Dropout to obtain the semantic keypoint covariances. The figure below shows how we insert the Dropout layer into the StarMap network and the average covariance obtained on a sampled subset of the KITTI dataset.
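The sketch below illustrates the Monte Carlo Dropout idea in a hypothetical form: run a stochastic forward pass (dropout kept active at test time) several times on the same image and take the sample covariance of the predicted keypoint coordinates. The forward_pass function is a stand-in, not the actual StarMap interface.

import numpy as np

def mc_dropout_keypoint_covariance(forward_pass, image, num_samples=20):
    """Estimate per-keypoint covariance via Monte Carlo Dropout.

    forward_pass(image) is assumed to be a stochastic forward pass of the
    keypoint network with dropout active at test time; it returns an
    (N, 2) array of pixel coordinates for N semantic keypoints.
    """
    samples = np.stack([forward_pass(image) for _ in range(num_samples)])  # (S, N, 2)
    mean = samples.mean(axis=0)                                            # (N, 2)
    centered = samples - mean                                              # (S, N, 2)
    # 2x2 sample covariance for each keypoint.
    cov = np.einsum('sni,snj->nij', centered, centered) / (num_samples - 1)
    return mean, cov

# Toy usage with a fake stochastic detector standing in for the real network.
rng = np.random.default_rng(0)
fake_forward = lambda img: np.array([[100.0, 50.0], [200.0, 80.0]]) + rng.normal(0, 2.0, (2, 2))
mean, cov = mc_dropout_keypoint_covariance(fake_forward, image=None, num_samples=50)
print(mean.shape, cov.shape)   # 2 keypoints, each with a 2x2 covariance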
Semantic keypoint uncertainty obtained from approximate Bayesian inference through the stacked hourglass convolutional neural network.
Below is a closer view of the keypoint covariance on one car.
Semantic keypoint uncertainty on one car.
Detection and tracking on KITTI
The front end can work with either color or grayscale images. Below is an example of using color images as input.
Object state reprojection on KITTI odometry sequences
KITTI odometry 06
The bottom left window shows the object tracking for semantic keypoints and bounding boxes.
The right window shows the trajectory estimation and object mapping.
KITTI raw data 09 26 0117
The top left window shows the semantic keypoint tracking, while the bottom left window shows the geometric feature tracking. The right window shows the trajectory estimation and object mapping.
Forest scene
This demo shows the performance of OrcVIO in a forest using a RealSense sensor.
The red line represents the estimated trajectory of OrcVIO.
The purple ellipsoid is the covariance of the pose.
Indoor scene with chairs and monitors
This demo shows the construction of an object map for the lab scene with chairs and monitors, using a RealSense sensor.
The red line is the estimated trajectory, and the axes mark the current pose. The black dots are the reconstructed geometric landmarks, whereas the green dots are the estimated semantic keypoints.
The blue ellipsoids are the chairs and the orange ellipsoids are the monitors mapped by OrcVIO.
Outdoor scene at UCSD campus
This demo shows the construction of an object map for the outdoor scene with chairs, bikes, and cars, using an
INDEMIND sensor. The red line is the estimated trajectory, and the axes mark the current pose. The green dots are the
estimated semantic keypoints. The blue ellipsoids are the chairs mapped by OrcVIO, whereas the red ellipsoids are the
bikes, and the black ellipsoids are the cars.
Object map with 40 cars in Unity simulator
Upper row: Unity simulation scene. Lower row: reconstructed objects, where the
orange line is the estimated trajectory, the green ellipsoids
are the reconstructed cars, and the blue meshes are the groundtruth car positions.
Object map with car and door categories in Unity simulator
We propose a tightly coupled visual-inertial odometry and object state
optimization algorithm. (a) A simulated scene from Unity, where a quadrotor flies over cars and doors. (b)
Color-coded semantic keypoint tracklets on cars and doors. (c) Estimated trajectory (green) that coincides
with the groundtruth trajectory (red), and the object map with reconstructed cars (green ellipsoids), doors
(red ellipsoids), semantic keypoints (yellow spheres), and the groundtruth objects (blue meshes).
Demo with cars, doors, and barriers in Unity simulator
More results for OrcVIO Lite
KITTI odometry 06
OrcVIO Lite uses only bounding boxes and no semantic keypoints, which makes it more suitable for real-time experiments. The test on KITTI odometry 06 uses grayscale images for both the front end and the back end.
The red line is the estimated trajectory, while the purple ellipsoid is the covariance of the pose. The white points are the geometric landmarks, and the colored dots are the active features. The black spheres are the reconstructed cars.
Flea3 camera
This test also uses OrcVIO Lite, with grayscale images from a Flea3 camera as input.
The red line is the estimated trajectory, while the purple ellipsoid is the covariance of the pose. The white points are the geometric landmarks, and the colored dots are the active features. The black spheres are the reconstructed cars.
Jackal robot
OrcVIO Lite runs on a Jackal robot, which is equipped with a RealSense sensor.
The yellow path is the VIO trajectory, whereas the red ellipsoids are the detected barrels. The marker size for the barrels is exaggerated for illustration purposes.
The barrel detection and tracking results are shown on the top left, where the bounding boxes show detections and the lines are the tracklets of the bounding boxes.
The tracklets are very long since the Jackal undergoes large viewpoint changes. In this case, the SORT tracker is modified to use centroid distance instead of IoU for affinity; a minimal sketch of such an affinity is given below.
Due to the low-quality IMU, the inertial data is not reliable and there is significant drift. The frequent stops also make the sequence challenging for VIO.
Despite these difficulties, OrcVIO Lite is still able to localize the robot and map the barrels.
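The sketch below shows one way a centroid-distance affinity can be computed for a SORT-style tracker. It is an illustrative stand-in rather than the modified tracker code, and the distance scale is an assumed tuning parameter.

import numpy as np

def centroid_affinity(boxes_a, boxes_b, scale=100.0):
    """Affinity matrix between two sets of boxes based on centroid distance.

    Boxes are (x1, y1, x2, y2). The affinity decays with the Euclidean
    distance between box centers instead of using IoU, which drops to zero
    as soon as the boxes stop overlapping under large viewpoint change.
    scale (in pixels) is an assumed tuning knob.
    """
    centers_a = np.stack([(boxes_a[:, 0] + boxes_a[:, 2]) / 2,
                          (boxes_a[:, 1] + boxes_a[:, 3]) / 2], axis=1)
    centers_b = np.stack([(boxes_b[:, 0] + boxes_b[:, 2]) / 2,
                          (boxes_b[:, 1] + boxes_b[:, 3]) / 2], axis=1)
    dists = np.linalg.norm(centers_a[:, None, :] - centers_b[None, :, :], axis=2)
    return np.exp(-dists / scale)   # in [0, 1], larger means more similar

# Example: one detection matched against two track predictions.
dets = np.array([[100, 100, 150, 160]], dtype=float)
tracks = np.array([[105, 102, 152, 158], [400, 300, 450, 360]], dtype=float)
print(centroid_affinity(dets, tracks))   # higher affinity for the nearby track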
More results for OrcVIO Stereo
EuRoC V1 01
OrcVIO Stereo uses a stereo camera instead of a monocular camera to increase robustness. This demo shows its performance on the EuRoC V1 01 sequence.
The red line is the ground-truth trajectory, while the blue line is the estimated trajectory.
EuRoC MH 01
This demo shows the performance of OrcVIO Stereo on the EuRoC MH 01 sequence.
The red line is the ground-truth trajectory, while the blue line is the estimated trajectory.
This demo shows the performance of the Python version of OrcVIO Stereo on the EuRoC MH 01 sequence.
The green line is the ground-truth trajectory, while the black line is the estimated trajectory.
Jackal robot
OrcVIO Stereo runs on a Jackal robot, which is equipped with a RealSense sensor.
The small linear acceleration due to the constant velocity, together with the limited angular velocity, makes this scenario very challenging.
Drift can be noticed when the Jackal runs over uneven terrain, e.g., a curb.
OrcVIO Stereo also runs on a Jackal robot, with more features, along a loopy trajectory.
The accuracy is improved because there are more features, as can be seen from the completed loops with very small drift.
Due to the small drift, the point clouds are also reconstructed well; for instance, the corridors can be clearly seen.
Racecar
OrcVIO Stereo runs on a racecar, which is equipped with a RealSense D435i, in the lab.
OrcVIO Stereo runs on a racecar, which is equipped with a RealSense D435i. The mapping module
maps the chairs in the lab.
Publication
@inproceedings{shan2020orcvio,
title={OrcVIO: Object residual constrained Visual-Inertial Odometry},
author={Shan, Mo and Feng, Qiaojun and Atanasov, Nikolay},
booktitle={2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
pages={5104--5111},
year={2020},
organization={IEEE}
}
Codebase
We have made public different flavours of OrcVIO, including mono, stereo, mapping, mapping-lite, etc., as depicted in the summary above.