Multimodal Estimation of Movement and Depth Based on Events for Scene Analysis

PhD Thesis, defended in January 2024


Vincent Brebion
Heudiasyc (Heuristics and Diagnosis of Complex Systems) Laboratory, CNRS, Université de technologie de Compiègne (UTC), 60319 Compiègne Cedex, France

Abstract

Event cameras offer new perception capabilities, enabling the analysis of highly dynamic scenes under complex lighting. In the context of this thesis, two low-level perception tasks were examined in particular: (1) optical flow estimation and (2) depth estimation.

In the case of optical flow, an optimization-based approach was developed, enabling real-time optical flow estimation with a single high-resolution event camera. Our approach provides accurate results and, at the time of publication, was the only event-based optical flow method running in real time on high-resolution event cameras.

For depth estimation, a learning-based data-fusion method between a LiDAR and an event camera was proposed for estimating dense depth maps, in the form of a convolutional neural network (ALED). A novel notion of "two depths per event" was also introduced, along with a novel simulated dataset containing high-resolution LiDAR data, event data, and perfect ground-truth depth maps. Compared to the state of the art, an error reduction of up to 61% was achieved, demonstrating the quality of the network and the benefits brought by our novel dataset.

An extension of this depth estimation work was also proposed, this time using a recurrent attention-based network (DELTA) to better model the spatial and temporal relations between the LiDAR and the event data. Compared to ALED, DELTA improves results across all metrics, especially at short range (the most critical for robotic applications), where the average error is reduced by up to a factor of four.


Citation
@phdthesis{Brebion2024MultimodalEO,
  title={Multimodal Estimation of Movement and Depth Based on Events for Scene Analysis},
  author={Vincent Brebion},
  school={Université de technologie de Compiègne},
  year={2024},
  month={January},
  type={PhD thesis}
}

Acknowledgment

This work was supported in part by the Hauts-de-France Region and in part by the SIVALab Joint Laboratory (Renault Group - Université de technologie de Compiègne (UTC) - Centre National de la Recherche Scientifique (CNRS)).