Goals of the chair
Recent Earth Observation (EO) systems have opened up new opportunities for land survey systems that provide critical information for climate change monitoring, mitigation, and adaptation. Monitoring Essential Climate and Biodiversity Variables (EVs) provides key information to understand climate, biodiversity and environmental changes.
However, retrieving EVs from multi-source data is challenging because of the specific characteristics of EO data, such as indirect observation of the variables of interest, varying spatial resolution and irregularly sampled time series.
Principal investigators
- Jordi Inglada (Senior Expert CNES, CESBIO)
- Nicolas Dobigeon (Full Professor Toulouse-INP)
- Mathieu Fauvel (1st grade Researcher INRAe, CESBIO)
- Silvia Valero (CPJ Researcher, IRD – OMP)
Deep learning (DL) models offer promising solutions to learn complex patterns from huge amounts of data.
However, most of the recent models lack physical consistency and interpretability. Furthermore, they are not able to process data with irregular and unaligned sampling, which is common in multi-modal EO.
Training also requires large amounts of labeled data, which are scarce and noisy in EO.
Consequently, current models are of limited use in large-scale EO systems.
Co-chairs
- Thomas Oberlin (Professor, ISAE-SUPAERO, Université de Toulouse)
- Julien Michel (Research Engineer CNES, CESBIO)
- Selime Gürol (Senior researcher, ALGO team, CERFACS)
This project will develop new self-supervised representation learning methods to produce semantically meaningful probabilistic representations from high-dimensional multi-modal EO data.
The originality lies in integrating prior knowledge from physical models into DL, which enables advances in uncertainty estimation and interpretability.
The proposed hybrid AI system will blend physical priors and DL to pre-train models that can learn (1) semantically meaningful representations related to EVs and (2) task-agnostic generic embeddings (AI-ready data) that can be used by downstream tasks.
The system will process multi-modal data to capture complementary spatio-temporal patterns. Physics-guided DL methods will be designed to condition the decoding of generic embeddings to retrieve and forecast EVs and their uncertainties.
To ensure the continuity of land monitoring, the system will use new data assimilation strategies combining satellite observations with pre-trained model forecasts.
Continual learning will be used to update the models in response to new EO data.
Non-stationary and long-term trends beyond the temporal range of the initial training will be accounted for.
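To make the data assimilation idea mentioned above concrete, here is a minimal sketch of how a model forecast and a satellite observation could be combined when both are treated as scalar Gaussian estimates. This is the textbook Kalman-style update, not the project's actual strategy. The function name `assimilate`, the leaf-area-index interpretation and the numeric values are all hypothetical.

```python
from typing import Optional, Tuple


def assimilate(forecast_mean: float, forecast_var: float,
               obs: Optional[float], obs_var: float) -> Tuple[float, float]:
    """Fuse a Gaussian forecast with a Gaussian observation (scalar Kalman update).

    Returns the posterior mean and variance. If no observation is available
    (obs is None, e.g. a cloudy acquisition), the forecast is returned unchanged.
    """
    if forecast_var < 0 or obs_var < 0:
        raise ValueError("variances must be non-negative")
    if obs is None:
        return forecast_mean, forecast_var
    total = forecast_var + obs_var
    if total == 0:
        raise ValueError("forecast and observation cannot both have zero variance")
    # The gain weights the observation by the relative uncertainty of the forecast.
    gain = forecast_var / total
    mean = forecast_mean + gain * (obs - forecast_mean)
    var = (1.0 - gain) * forecast_var
    return mean, var


# Hypothetical example: forecast LAI of 2.0 and observed LAI of 4.0,
# both with variance 1.0, so the two sources are trusted equally.
mean, var = assimilate(2.0, 1.0, 4.0, 1.0)
print(mean, var)  # 3.0 0.5
```

The posterior variance is never larger than that of the forecast or the observation. When an acquisition is missing, the forecast alone carries the estimate forward, which is how such a scheme could maintain monitoring continuity.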
The project raises scientific questions regarding joint probabilistic representation learning, incorporation of physical prior information, efficient use of pre-trained models, and continuous model updating with newly acquired data and new on-orbit sensors.
- Joint representation learning for multi-modal remote sensing data that are heterogeneous in time, space and measurement.
- Evolving generative models that capture data and their uncertainties, taking into account complex multi-modal distributions.
- Introducing inductive and learning biases through physical simulators of the bio/geo-physical processes at work in the observed landscapes.
- Decoding generic probabilistic representations of remote sensing data into maps of essential climate and biodiversity variables, at user-defined spatio-temporal resolutions, with associated uncertainties.
- Design of an end-to-end self-supervised multi-task learning framework.
- Industrial Partners: Cerfacs, CS Sopra Steria, Magellium, Thales Alenia Space, Thales Services Numériques
- EVOLAND (with Vito, CLS, CNES, DLR, GAF embh): developing and testing innovative methods, algorithms and candidate Copernicus Land Monitoring Service prototypes that integrate novel EO/in-situ data and the latest machine learning techniques to continuously monitor the status, dynamics and biomass of the land surface (Jordi Inglada, Julien Michel, Silvia Valero)
- Sentinel-2 Agriculture (2014-2017, with CS Group, ESA): developing an operational system for cropland mapping (crop mask, crop type, vegetation status) using satellite remote sensing imagery (Jordi Inglada, Silvia Valero)
- A training and inference infrastructure for big Earth Observation models that scales to petabytes of data.
- A new deep learning architecture for probabilistic representation learning from large-scale multi-modal Earth Observation data.
- Production of essential biodiversity and climate variables with improved resolution and accuracy compared with state-of-the-art methods.
- Assessment on selected real-world use cases.