Exploring Geospatial AI for Aquaculture Mapping - Ode Partners
What we learned using AI to enable fish farm monitoring at scale.
Fish and shellfish from farms could help sustainably feed more people around the world (FAO 2023). But right now, we don’t have complete maps of where these farms are or how they are changing over time.
We teamed up with Seafood Watch, a program of the nonprofit Monterey Bay Aquarium. They work with businesses to improve their sourcing and producers to improve their fishing or aquaculture practices. The program also creates tools for consumers to choose sustainable seafood options. To do this work, Seafood Watch needs to systematically assess aquaculture operations for environmental sustainability. They need to understand where fish farms are located, what kinds of fish they raise, and how they raise them.
To date, there has not been a consistently updated, global map of aquaculture. Seafood Watch has relied on reports, farm visits, and periodically assembled datasets. This takes significant effort and is tough to scale. We set out to find a faster, easier way to create up-to-date maps of land-based aquaculture, building on previous modeling efforts.
Representation of image “chips” – square parcels of land covering 6.553 km2. Source: Google Earth 2025
Towards up-to-date, global aquaculture mapping with geospatial AI
Geospatial AI approaches hold promise to make traditional workflows cheaper and easier to scale. Geospatial foundation models, in particular, centralize computational processing upfront to deliver generalized, learned representations of Earth’s surface. These representations, or embeddings, can serve as a helpful starting point for various remote sensing tasks, and could allow us to map aquaculture more quickly and easily across different regions and over time.
With lightweight machine learning models trained on these embeddings, we hypothesized that we could produce global aquaculture maps at a cost that would enable annual to semi-annual updates that were previously impractical. Over the course of several weeks, we worked with Seafood Watch to understand how well lightweight geospatial AI approaches, such as the Clay model, could map aquaculture pond extent. To start, we focused on using Clay to map land-based aquaculture ponds in Andhra Pradesh, India.
In this article, we’ll cover some challenges we encountered and the ways we addressed them, which we hope can be useful to other teams exploring geospatial AI approaches to environmental modeling. These challenges included adapting to Clay chip-level outputs, differentiating aquaculture from other similar land use, and interannual embedding variability.
Initial set-up and results
Drawing on aquaculture datasets created by Seafood Watch and Clark University’s Center for Geospatial Analytics (2024) as ground truth, we trained a model to predict the presence of aquaculture in a given area. We used the Clay model to embed composites of multispectral Sentinel-2 imagery (ESA 2025) at a 10m resolution from January through April in a given year. While our primary ground truth dataset predicts aquaculture presence within 10m x 10m pixels, the main unit of analysis for Clay is different: the “chip”.
This lightweight application of Clay, like other machine learning approaches (e.g., convolutional neural networks), makes predictions at the chip level, with each chip representing a larger area of land. In this case, we created composite image chips of 256 x 256 pixels, each representing a 2560m x 2560m bounding box, or a square area of 6.553 km2.
We used pixel-level labels from the Clark Labs dataset and estimated pixel counts around Seafood Watch’s point dataset to form our ground truth. To mark a chip as positive (aquaculture present), we set our threshold at 0.1% of the chip’s area: aquaculture had to cover at least 6,553 m2 of the chip for it to be labeled as containing aquaculture. Clay produces a 1024-dimensional vector embedding for each chip, and these embeddings served as input to a lightweight machine learning model, a multilayer perceptron (MLP) classifier.
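As a sketch of this setup, with random arrays standing in for real Clay embeddings and ground-truth pixel counts (the `MLPClassifier` hyperparameters here are illustrative, not the ones we used):

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Stand-ins for Clay outputs: one 1024-dimensional embedding per chip.
n_chips, embed_dim = 200, 1024
embeddings = rng.normal(size=(n_chips, embed_dim))

# Chip labels derived from pixel-level ground truth: a chip is positive
# when aquaculture covers at least 0.1% of its 256 x 256 pixels.
pixels_per_chip = 256 * 256
aqua_pixel_counts = rng.integers(0, 700, size=n_chips)  # synthetic counts
labels = (aqua_pixel_counts / pixels_per_chip >= 0.001).astype(int)

# Lightweight classifier trained on the embeddings.
clf = MLPClassifier(hidden_layer_sizes=(256,), max_iter=200, random_state=0)
clf.fit(embeddings, labels)
probabilities = clf.predict_proba(embeddings)[:, 1]  # P(aquaculture) per chip
```

The heavy lifting (embedding imagery with Clay) happens once upfront; the classifier itself trains in seconds on a laptop, which is what makes this workflow cheap to re-run.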
Based on this threshold, initial, out-of-the-box results were promising – for chip classification (predicting the presence of aquaculture within a chip), we achieved an F1 score, a composite measure of classification performance, of 0.73 on a test region of Andhra Pradesh. Although there are limited direct benchmarks for this work, recent studies using Sentinel-2 to classify pixel-level land use and land cover have achieved F1 scores ranging from 0.66 to 0.99 (Paul et al. 2025, Aryal et al. 2023). We even found areas that the MLP flagged as aquaculture but that previous modeling efforts had missed. To push the model’s performance further, we explored a number of questions and issues surfaced by the model results.
Identifying small to large areas of aquaculture
THE CHALLENGE
One key challenge to address in working with chips is that the feature of interest may vary in size, while the chip size remains the same. Aquaculture can be done in large farming operations with hundreds of neighboring ponds, or in small outcroppings. When dividing the land areas into chips, we have to identify aquaculture ponds that may be in just one corner of a given chip, all the way up through aquaculture areas that cover an entire chip.
4 neighboring chips with varying aquaculture pond coverage. Ponds are highlighted in pink. Source: Google Earth 2025
HOW WE APPROACHED IT
To account for the variability in chips that should be classified as having aquaculture present, we introduced “soft” labels. Soft labels allow the model to learn uncertainty when the coverage of aquaculture in a chip is low. Functionally, it means assigning a value in the interval [0,1] as the chip label, rather than 0 or 1 (Sierra et al. 2025). Because we calculate the loss against the soft label, the model learns more nuanced information during training.
To understand how this helps the model learn, let’s take the example of a chip with very high coverage and another with very low coverage of aquaculture ponds. If the model predicts that aquaculture is not present in the chip with very high coverage, the loss will be high. For the chip with relatively low coverage, if the model predicts no aquaculture, the loss will be lower. This loss difference signals to the model that there is an inherent difference between the two chips – conceptually, that it was very wrong about the high coverage chip, whereas for the low coverage chip, it was wrong, but not quite as wrong as the other prediction.
As a result of introducing these soft labels, we saw a ~0.05 increase in the test F1 score, demonstrating that this labeling strategy provided useful information for the model training. This technique helped us achieve strong predictive capabilities at low levels of chip coverage, identifying chips with just 0.1% of pixels labeled as aquaculture.
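A minimal sketch of the soft-label computation (footnote [3]) and the loss contrast described above. The coverage threshold `t = 0.05` is illustrative only, since we don't restate our exact value here:

```python
import math

def soft_label(aqua_pixels, chip_pixels=256 * 256, t=0.05):
    """Soft label y = min(c / t, 1), where c is the fraction of positive
    pixels in the chip and t is the coverage above which a chip counts
    as fully positive."""
    c = aqua_pixels / chip_pixels
    return min(c / t, 1.0)

def bce(y, p):
    """Binary cross-entropy for one chip; handles soft labels y in (0, 1)."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# A model wrongly predicting p = 0.05 ("no aquaculture") is penalized far
# more on a fully covered chip (y = 1) than on a sparsely covered one.
high_coverage_loss = bce(1.0, 0.05)             # ~3.00
low_coverage_loss = bce(soft_label(655), 0.05)  # soft y ~= 0.2 -> ~0.64
```

The asymmetry in these two losses is exactly the signal described above: the model is told it was very wrong about the high-coverage chip, and only somewhat wrong about the low-coverage one.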
Temporal strategies to address false positives
THE CHALLENGE
A common remote sensing challenge that we encountered relates to differentiating between similar kinds of land use. As we reviewed predicted aquaculture with Seafood Watch experts, we found that some of the areas the model had identified as aquaculture were actually salt pans or rice paddies. In India, these have a lot of similarities to aquaculture – they are generally rectangular trenches dug in low-lying areas, and are periodically filled with water and drained over the harvesting season. From satellite imagery, it can be hard to distinguish these types of land use from aquaculture.
Salt evaporation ponds. Source: NASA
Flooded rice fields. Source: EUMETSAT
HOW WE APPROACHED IT
Through research and consultation with Seafood Watch experts, we realized that these features have different temporal patterns, as the seasonality and timing of flooding and draining differs for each type of farming. When they are drained, you can often visually distinguish salt pans, rice fields, and aquaculture ponds.
We wanted to create embeddings that could capture these different farming events and allow the model to differentiate between these land use types. Building on an approach by Greenstreet et al. that uses different percentiles of imagery composites in a machine learning model, we created two embeddings for any given area, each representing a different “slice” of the imagery. Typical imagery composites slice the image down the middle, taking the median value of each pixel over the course of several months to create a single image to embed with Clay. Instead, we took the 25th and 75th percentile values for each pixel to create two images: the former representing what aquaculture ponds look like when filled, and the latter when drained. For every chip, each of the two images was embedded, and the resulting embeddings were concatenated into a single 2048-dimensional embedding.
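The percentile compositing step can be sketched with NumPy. Random data stands in for a time stack of Sentinel-2 chips, and `embed` is a placeholder for Clay inference (not shown here):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stack of imagery for one chip over the season:
# (time steps, spectral bands, height, width).
stack = rng.random((8, 6, 256, 256))

# Per-pixel 25th and 75th percentile composites over time. For water
# features, lower reflectance values tend to correspond to ponds when
# filled, and higher values to ponds when drained.
p25 = np.percentile(stack, 25, axis=0)  # shape (6, 256, 256)
p75 = np.percentile(stack, 75, axis=0)

def embed(chip):
    # Placeholder for Clay inference, which returns a 1024-dim vector.
    return np.zeros(1024)

# Concatenate the two embeddings into one 2048-dim feature vector.
features = np.concatenate([embed(p25), embed(p75)])
```

Because both composites come from the same imagery stack, this doubles the feature dimension without requiring any additional satellite data.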
Implementing this percentile-based embedding approach resulted in a ~0.03 F1 increase, and in manual review we saw fewer of these misclassified examples. Depending on the prevalence of rice or salt farming in the areas where the model is applied, this approach could have an even greater impact on model performance.
25th percentile and 75th percentile image composites, with red boxes highlighting areas where aquaculture ponds can be seen filled vs. drained.
Applying the model across multiple years
THE CHALLENGE
The last challenge we’ll discuss is training a model that can be applied across geographies and time. While the model performed well in regions geographically outside of the training data, when we tested performance on years outside of our training data, we saw a significant drop in performance: the model incorrectly classified many more chips as aquaculture compared to years it had seen during training.
Principal component analysis (PCA) of Andhra Pradesh embeddings in 2018 vs. 2022, before transformation.
HOW WE APPROACHED IT
The model’s overprediction in a new year pointed to a potential shift in the characteristics of the underlying dataset, or a domain shift. We used principal component analysis (PCA), a dimensionality reduction technique which can allow us to analyze or visualize embeddings by representing them more simply. When looking at the first two principal components of the embeddings and coloring by year, there was a clear distinction between embeddings from different years. We hypothesize that the heavy presence of agriculture in the region and year-to-year climate and agricultural productivity changes resulted in a shift in the embedding space.
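The diagnostic itself is simple to reproduce with scikit-learn. Here, synthetic embeddings stand in for ours, with the year-to-year shift injected artificially:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Stand-ins for chip embeddings from two years, with a synthetic
# domain shift added to the second year.
emb_year_a = rng.normal(size=(300, 64))
emb_year_b = rng.normal(size=(300, 64)) + 2.0

# Project all embeddings onto their first two principal components.
pca = PCA(n_components=2)
coords = pca.fit_transform(np.vstack([emb_year_a, emb_year_b]))
# Plotting coords[:300] vs. coords[300:], colored by year, reveals two
# separated clusters, mirroring the distinction we saw in our data.
```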
This shift led to the poor out-of-sample performance, and these differences need to be accounted for before embeddings are passed to the model. After experimenting with different transformations, we found that an orthogonal Procrustes embedding transformation (scipy) could re-align the embedding space for areas evaluated outside of the training data year (2022). This accounted for year-to-year differences while maintaining model performance, producing predictions with much better alignment across years. The uncorrected embeddings produced significant over-prediction of aquaculture, shown by the extensive coverage of yellow and green in the left-hand plot, while the corrected predictions in the right-hand plot depict a much more realistic distribution.
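A minimal sketch of the alignment using `scipy.linalg.orthogonal_procrustes`. For illustration, the "new year" embeddings here are an exact rotation of the training-year embeddings, so the recovery is perfect; real embeddings would only be approximately re-aligned, and would typically be centered first, since Procrustes does not model translation:

```python
import numpy as np
from scipy.linalg import orthogonal_procrustes

rng = np.random.default_rng(0)
# Stand-in embeddings for the same chips in the training year (2022).
ref = rng.normal(size=(500, 32))

# Simulate a domain-shifted year as an arbitrary rotation of ref.
Q, _ = np.linalg.qr(rng.normal(size=(32, 32)))
new = ref @ Q

# Solve for the orthogonal matrix R minimizing ||new @ R - ref||_F,
# then map the new year's embeddings back into the training-year space.
R, _ = orthogonal_procrustes(new, ref)
aligned = new @ R
```

In practice this requires embeddings of corresponding areas from both years to fit R; once fitted, the same R re-aligns every chip in the new year before it is passed to the classifier.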
Predicted probabilities for aquaculture in 2025 in Andhra Pradesh. The left plot shows model predictions before applying the transformation, with a much greater proportion of predicted positive chips (higher values). The right plot shows predictions after the transformation was applied.
Final results
Ultimately, this approach achieved around 0.85 recall (i.e., 85 out of every 100 aquaculture areas were identified), 0.75 precision (i.e., 75 out of every 100 predicted aquaculture areas were correct), and a 0.80 F1 score on test sets of the coastal region of Andhra Pradesh. We calibrated findings against the Clark Labs dataset to calculate area estimates, identifying total aquaculture land area in Andhra Pradesh in 2025 as approximately 1,232 km2, with a 95% confidence interval between 1,048 and 1,441 km2. Drawing on Seafood Watch’s expertise, we also identified areas outside of the original dataset that, upon visual inspection, appeared to be aquaculture. And we could easily map areas much further inland than the original dataset, which came from an approach evaluated primarily on areas within 10 km of the coastline: with this approach, we identified aquaculture farming up to 200 km inland.
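As a quick sanity check, these metrics are internally consistent, since F1 is the harmonic mean of precision and recall:

```python
precision, recall = 0.75, 0.85
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 2))  # 0.8
```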
Map of 2025 predicted aquaculture areas. Yellow areas indicate our predicted aquaculture areas in 2025, while pink indicates Clark Labs’ predicted aquaculture in 2022.
Conclusion
We’ve covered some of our learnings and adaptive approaches to operationalize geospatial AI towards cheaper, more scalable ways to produce global, up-to-date aquaculture maps. To address varying sizes of aquaculture farms and adapt to a chip-based classification approach, we used soft labels to improve model training and performance. We also found that percentile imagery approaches can be applied in geospatial AI to distinguish environmental features based on temporal signals. Finally, in applying our embeddings-based workflow across different years, we learned that embedding domains may shift, but can be addressed using algorithms such as orthogonal Procrustes.
We share these approaches in hopes they may spark new ideas or be useful for other geospatial AI modeling work across various environmental and humanitarian applications. This could include critical tasks such as delineating wetlands for conservation, identifying illegal mining sites for resource governance, or mapping infrastructure for flood mitigation. With such promising findings from initial exploration, an exciting next step is to further investigate and refine the operational path to scale across regions. As more global applications of geospatial AI are developed, experimenting with different modeling strategies can help maximize performance across diverse regions with differing aquaculture and land use patterns. Especially as efforts move toward global individual pond mapping, an important area for future work will be designing and assessing full workflows for cost-effectiveness and robustness across geographies, bringing global, up-to-date aquaculture maps within reach.
Footnotes
[1] We used Sentinel-2’s red, red edge, green, blue, near infrared, and short-wave infrared bands.
[2] Based on our experience and previous embedding model performance with Random Forest and MLP architectures, we chose to use an MLP here, without extensive exploration of other model types. We hope to investigate the impacts of model choice further as the work progresses.
[3] We define soft labels for a given chip based on the fraction of positive pixels. Soft labels are computed as y_i = min(c_i / t, 1), where c_i is the fraction of positive pixels in a 256 x 256 px chip and t is a maximum coverage threshold, beyond which chips are treated as fully positive.
[4] We use binary cross-entropy (BCE) as the loss function, which can inherently account for labels in the interval [0,1].