Data Science and Machine Learning for Planet Earth

Earth Observation and AI

Earth Observation (EO) involves the collection and analysis of data about the Earth’s physical, chemical, and biological systems using remote sensing technologies, including satellites, drones, and ground-based instruments. EO data is crucial for understanding climate systems, monitoring environmental changes, managing natural resources, and supporting disaster response. However, the growing volume and complexity of EO datasets present significant challenges for traditional data processing methods.

In this context, Artificial Intelligence (AI) offers transformative potential by enhancing the extraction of meaningful insights from EO data. AI can automate the detection of patterns and anomalies across vast datasets, improve predictive modeling of environmental processes, and reduce the time needed for data analysis. Despite these benefits, applying AI to EO requires addressing challenges such as data heterogeneity, model interpretability, and the need for robust validation across diverse geographies and timeframes. Through innovative research, we aim to bridge these gaps and advance the use of AI in EO to support climate monitoring, environmental adaptation, and sustainable development.

Research Example: Monitoring Coral Reefs

The world’s shallow-water marine ecosystems, including coral reefs, are among the most biodiverse and ecologically critical environments on the planet. These ecosystems are highly vulnerable to climate change, pollution, and human activity, making continuous monitoring essential for understanding their health and long-term viability. However, inconsistent satellite image quality, cloud cover, and variable water depth have historically limited the effectiveness of traditional remote sensing methods.

Our research focuses on developing advanced machine learning workflows to automate the characterization and monitoring of shallow-water environments using Earth Observation time-series data. By combining Principal Component Analysis (PCA) for image quality assessment, XGBoost-based cloud removal, and depth correction algorithms, we aim to improve the accuracy and consistency of marine habitat classifications. Unsupervised methods, such as KMeans clustering and superpixel analysis, then allow reef ecosystems to be classified autonomously, without extensive manual input.
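Two of the steps named above, PCA-based quality screening and KMeans clustering, can be illustrated with a minimal sketch on synthetic data. This is not the published workflow: the function names (pca_quality_scores, flag_outliers, kmeans), the median-based outlier rule, the farthest-point initialisation, and the toy two-habitat scene are all assumptions made for illustration, and cloud removal, depth correction, and superpixels are left out. It uses plain NumPy; in practice a library such as scikit-learn would normally supply PCA and KMeans.

```python
import numpy as np

def pca_quality_scores(stack, n_components=2):
    """Distance of each image from the typical image, measured in PCA space.

    stack: array of shape (time, height, width), one band per acquisition.
    """
    t = stack.shape[0]
    X = stack.reshape(t, -1).astype(float)
    X = X - X.mean(axis=0)                       # centre each pixel over time
    U, S, _ = np.linalg.svd(X, full_matrices=False)
    k = min(n_components, len(S))
    pcs = U[:, :k] * S[:k]                       # per-image principal component scores
    return np.linalg.norm(pcs - np.median(pcs, axis=0), axis=1)

def flag_outliers(scores, n_mads=5.0):
    """Flag scores far above the median (assumed rule: median + n_mads * MAD)."""
    med = np.median(scores)
    mad = np.median(np.abs(scores - med))
    return scores > med + n_mads * mad

def kmeans(features, k, n_iter=50):
    """Lloyd's algorithm with deterministic farthest-point initialisation."""
    centers = [features[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(features - c, axis=1) for c in centers], axis=0)
        centers.append(features[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        updated = np.array([
            features[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(updated, centers):
            break
        centers = updated
    return labels, centers

# Synthetic time series: two "habitats" (left and right half) seen 8 times.
rng = np.random.default_rng(0)
scene = np.full((16, 16), 0.2)
scene[:, 8:] = 0.8
stack = scene + rng.normal(0.0, 0.01, size=(8, 16, 16))
stack[5, 2:8, 2:8] += 1.0                        # simulated cloud in acquisition 5

# Step 1: screen out poor-quality acquisitions.
scores = pca_quality_scores(stack)
flagged = flag_outliers(scores)
clean = stack[~flagged]

# Step 2: cluster pixels, using each pixel's retained time series as its features.
features = clean.reshape(clean.shape[0], -1).T
labels, _ = kmeans(features, k=2)
label_map = labels.reshape(scene.shape)

print("flagged acquisitions:", np.flatnonzero(flagged))
print("first row of the label map:", label_map[0])
```

Screening before clustering matters here because an unflagged cloudy acquisition would add a bright patch to the pixel features and could pull a cluster toward the cloud rather than toward a habitat.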

Examples of Earth Observation data systematically corrected using our new machine learning workflow and then segmented into different depositional environments. From AlZayer et al., in review.

This work offers new opportunities for scalable, long-term monitoring of coral reefs, which are vital indicators of climate change impacts in marine environments. By reducing the reliance on labor-intensive fieldwork and improving the interpretation of multi-temporal satellite data, our approach provides valuable tools for conservation efforts and environmental management. As marine ecosystems face increasing pressures, these AI-driven techniques will play a critical role in preserving biodiversity and building resilience against future environmental challenges.

Earth Observation and AI Beyond Earth

One exciting aspect of combining artificial intelligence with satellite data is that the same methods can be applied beyond our own planet. For instance, in Platt et al., 2024, we demonstrated the use of deep learning on hyperspectral data from the Mars Reconnaissance Orbiter.

Hyperspectral imaging has revolutionized our understanding of Mars’ surface mineralogy, with data from instruments like the Compact Reconnaissance Imaging Spectrometer for Mars (CRISM) offering unprecedented insights into the planet’s geological history. However, sensor degradation over time has rendered much of the recently acquired data noisy and difficult to interpret, limiting the ability to identify key minerals and features on Mars.

Our research addresses this challenge through the development of Noise2Noise4Mars (N2N4M), a self-supervised machine learning model designed to denoise CRISM hyperspectral images without the need for pristine, noise-free reference data. Conventional learning-based denoisers are trained on clean reference data, which is scarce in planetary science applications. In contrast, our approach learns to reduce noise using only the noisy images themselves, making it particularly well suited to remote sensing in space exploration.
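The idea behind Noise2Noise-style training can be shown with a small numerical sketch: when the noise is zero-mean and independent between two noisy copies of the same signal, a denoiser fitted with a squared-error loss against the second noisy copy ends up close to one fitted against the clean signal. The example below is only an illustration of that principle, not the N2N4M model. It swaps the neural network for a linear sliding-window filter fitted by least squares, uses made-up synthetic "spectra", and simulates the noisy pairs directly; how N2N4M builds its training pairs from CRISM data is not described here. All function names and parameters are hypothetical.

```python
import numpy as np

def make_spectra(n, length, rng):
    """Hypothetical smooth spectra: a flat continuum with three Gaussian absorptions."""
    x = np.linspace(0.0, 1.0, length)
    centers = rng.uniform(0.1, 0.9, size=(n, 3, 1))
    widths = rng.uniform(0.05, 0.15, size=(n, 3, 1))
    depths = rng.uniform(0.1, 0.4, size=(n, 3, 1))
    return 1.0 - (depths * np.exp(-0.5 * ((x - centers) / widths) ** 2)).sum(axis=1)

def design_matrix(spectra, half):
    """Sliding windows of width 2*half+1 around every sample, plus a bias column."""
    padded = np.pad(spectra, ((0, 0), (half, half)), mode="edge")
    win = np.lib.stride_tricks.sliding_window_view(padded, 2 * half + 1, axis=1)
    win = win.reshape(-1, 2 * half + 1)
    return np.hstack([win, np.ones((len(win), 1))])

def fit_linear_denoiser(inputs, targets, half=4):
    """Least-squares fit of a window filter mapping inputs to targets."""
    A = design_matrix(inputs, half)
    w, *_ = np.linalg.lstsq(A, targets.reshape(-1), rcond=None)
    return w

def denoise(spectra, w, half=4):
    return (design_matrix(spectra, half) @ w).reshape(spectra.shape)

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

rng = np.random.default_rng(0)
sigma = 0.05                                      # assumed noise level

# Training data: two independent noisy copies of the same underlying spectra.
clean = make_spectra(400, 64, rng)
noisy_a = clean + rng.normal(0.0, sigma, clean.shape)
noisy_b = clean + rng.normal(0.0, sigma, clean.shape)

w_n2n = fit_linear_denoiser(noisy_a, noisy_b)     # noisy input, noisy target
w_sup = fit_linear_denoiser(noisy_a, clean)       # noisy input, clean target (reference only)

# Evaluation on held-out spectra; the clean signal is used only for scoring.
test_clean = make_spectra(100, 64, rng)
test_noisy = test_clean + rng.normal(0.0, sigma, test_clean.shape)
rmse_noisy = rmse(test_noisy, test_clean)
rmse_n2n = rmse(denoise(test_noisy, w_n2n), test_clean)
rmse_sup = rmse(denoise(test_noisy, w_sup), test_clean)

print(f"RMSE noisy input:         {rmse_noisy:.4f}")
print(f"RMSE noisy-target filter: {rmse_n2n:.4f}")
print(f"RMSE clean-target filter: {rmse_sup:.4f}")
```

In this toy setting the filter trained on noisy targets should perform almost identically to the one trained on clean targets, which is what makes the approach attractive when clean reference data does not exist.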

This innovation significantly enhances the quality and usability of CRISM data, enabling more detailed mapping of Martian surface features and improving classification accuracy for important sites, such as proposed lander locations. By improving our ability to analyze degraded datasets, this work contributes to future Mars missions and advances our broader understanding of planetary geology, unlocking new opportunities to explore the Red Planet with greater precision.

Examples of the effectiveness of our N2N4M denoising method and its impact on detecting hydrated clays on Mars. From Platt et al., 2024.