Although many algorithms used within the GEE are based on Open Source software, understanding what happens on Google’s servers before an image of Earth is generated is not trivial.
A European Union-funded project OpenEO (Open Earth Observation) now looks at ensuring “cross cloud backend reproducibility”.
It develops a language (API) that can be used to ask questions to different cloud back-ends, including GEE.
In this way, it aims to make it relatively easy to compare and verify their results, and also to find, e.g., the cheapest options for particular types of computations.
The O2R project has many practical tips on how to back-up and share code and data and enable computational reproducibility at relatively low cost.
EO is a basis for objective monitoring of land dynamics, but not all land management indicators can be mapped directly using RS technology alone.
Some land management indicators need to be estimated by interpolating sampled (i.e. observed/measured) values at point locations, or by using predictive mapping approaches.
Some variables are simply more complex and can’t be mapped using spectral reflectances only.
For example, no EO system can (yet) be used to directly estimate soil organic carbon (SOC) density.
Technology simply does not yet exist to do this and, therefore, we need to use point samples and Predictive Mapping to try to estimate SOC at each pixel.
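As a rough illustration of what estimating a value at each pixel from point samples involves, the sketch below uses inverse distance weighting (IDW), one of the simplest interpolators. It is a stand-in chosen for brevity, not the predictive mapping method discussed in this article, and the coordinates and SOC values are made up.

```python
import math

def idw_predict(samples, x, y, power=2.0):
    """Estimate a value at (x, y) from (sx, sy, value) samples using
    inverse distance weighting. Illustrative only."""
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        dist = math.hypot(x - sx, y - sy)
        if dist == 0.0:
            # The pixel centre coincides with a sample: use the observation.
            return value
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical point samples: (x, y, SOC density); values are made up.
samples = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0), (0.0, 10.0, 30.0)]

# Predict on a small 3 x 3 grid of pixel centres (rows are y, columns are x).
grid = [[idw_predict(samples, x, y) for x in (0.0, 5.0, 10.0)]
        for y in (0.0, 5.0, 10.0)]
```

Every predicted pixel is a weighted average of the observations, so it always stays within the range of the sampled values. That is one reason why approaches that also use covariate layers are usually preferred for variables such as SOC.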
Luckily, there is now also ever more point data with environmental variables being made available across borders via Open Data licenses and through international initiatives.
Established field observation compilations include: the Global Biodiversity Information Facility (GBIF) observations, Global Historical Climatology Network station data, WoSIS Soil Profile Database points, sPlot global plant database, BirdLife, to mention just a few of the most widely known.
More and more point measurements and observations are also contributed through Citizen Science and crowdsourcing projects such as iNaturalist (Fig. 4), Geo-wiki, Citizen Weather Observer Program, and similar (Irwin, 2018).
Even commercial companies are now open to sharing their proprietary data and using it for global data mining projects (provided that their users agree of course).
One example is the continuous measurements of temperature, pressure, noise levels etc. from the Netatmo weather stations (estimated at over 150k devices sold worldwide).
Many users are also open to sharing weather measurements from their mobile phone devices.
The Plantix app is currently producing a global database of photographs of plant diseases and growth aberrations.
We mention here only a few major global or continental data repositories and initiatives.
There are, of course, many more repositories available in local languages and for local areas.
As global compilations of ground observations and measurements become larger and more consistent, there is an increasing need to extract value-added global maps from them.
There is now a burgeoning interest in using exascale high-performance computing to process emerging novel, in-situ observation networks to transform the quality and cost-effectiveness of high-resolution weather forecasting, agricultural management and Earth system modeling in general.
An especially exciting development in the years to come will likely be a hybrid approach of process-based modeling plus machine learning, which combines the best of data and the best of our geophysical/geochemical knowledge (Reichstein et al.).
Fig. 4: iNaturalist.org: a Citizen Science system. Screenshot showing geographical density and type of data (usually field photographs with description and interpretation of species names etc.) contributed.
To summarize, the most promising path to building trust in data seems to be by achieving computational reproducibility (documenting all processing steps and providing all relevant metadata required to reproduce exactly the same results).
There are now increasingly robust ways to achieve reproducibility, even for projects that are relatively computationally heavy.
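A minimal sketch of what documenting all processing steps can look like in practice: each run writes a small manifest with a checksum of the input, the parameters (including the random seed), the software version and a checksum of the output. The field names and the toy bootstrap step are assumptions made for this illustration, not a standard.

```python
import hashlib
import json
import platform
import random

def sha256_of(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def run_analysis(values, seed):
    """Toy processing step: mean of one bootstrap resample of the input."""
    rng = random.Random(seed)  # local generator, so the seed fixes the result
    resample = [rng.choice(values) for _ in values]
    return sum(resample) / len(resample)

def run_with_manifest(values, seed):
    """Run the analysis and return (result, manifest describing the run)."""
    input_bytes = json.dumps(values).encode("utf-8")
    result = run_analysis(values, seed)
    manifest = {
        "input_sha256": sha256_of(input_bytes),
        "parameters": {"seed": seed, "method": "bootstrap_mean"},
        "python_version": platform.python_version(),
        "output_sha256": sha256_of(repr(result).encode("utf-8")),
    }
    return result, manifest

values = [2.1, 3.4, 1.9, 4.0, 2.8]  # made-up measurements
result_a, manifest_a = run_with_manifest(values, seed=42)
result_b, manifest_b = run_with_manifest(values, seed=42)
print(json.dumps(manifest_a, indent=2))
```

Two runs with the same input and seed produce identical manifests, so anyone repeating the analysis can compare their output checksum against the published one.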
Due to the increasing provision of free RS data associated with the launch of new EO satellites, it seems that mainstream data analytics will move from local networks to (virtual) clouds and Big Earth Data cubes, i.e. data analysis through web-based workflows (Sudmanns et al.).
In that context, GEE will remain a valuable and (hopefully) increasingly trustworthy place to process global public RS data and produce new valuable maps of the status of our environment.
But society also needs more Open and not-for-profit infrastructures such as OpenStreetMap to ensure longevity.
Global field observation repositories, Citizen Science and Machine Learning will, in any case, play an increasing role in generating more reliable maps of more holistic land management indicators.
LandGIS: our contribution to a global land commons
We (the OpenGeoHub foundation) have recently started providing hosting and data science services to help produce and share the most up-to-date, fully documented (potentially to the level of full reproducibility) data sets on the actual and potential status of multiple environmental measures through a system we call “LandGIS”, which is available via https://landgis.
Initially, LandGIS provides access to new and existing data on soil properties/classes, relief, geology, land cover/use/degradation, climate, current and potential vegetation, through a simple web-mapping interface allowing for interactive queries and overlays.
This is a genuine Open Land Data and Services system where anyone can contribute and share global maps and make them accessible to hundreds of thousands of researchers and businesses.
LandGIS in action: a 6-minute video tutorial on how to access, use and download data.
LandGIS is based on the following six main pillars:
Open data license (Open Data Commons Open Database License and/or Creative Commons Attribution-ShareAlike) with a copy of data placed on zenodo.org,
Fully-documented, reproducible procedures with most of the code available via a GitHub repository,
Predictions based on the state-of-the-art Ensemble Machine Learning techniques implemented using Open Source software,
Distribution of data based on the Open Geospatial Consortium (OGC) standards: GDAL, WMS, WCS and similar,
Diversity of web-services optimized for high traffic usage (cloud-optimized GeoTIFFs),
Managed, open user and developer communities.
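To make the OGC pillar more concrete, the sketch below builds a WCS 2.0.1 `GetCoverage` request URL for a bounding box. The endpoint, coverage identifier and axis labels are hypothetical placeholders, not actual LandGIS values. For a real server they have to be read from its `GetCapabilities` and `DescribeCoverage` responses.

```python
from urllib.parse import urlencode

def wcs_getcoverage_url(endpoint, coverage_id, lon_range, lat_range,
                        lon_axis="Long", lat_axis="Lat",
                        fmt="image/tiff"):
    """Build a WCS 2.0.1 GetCoverage URL (key-value-pair encoding)."""
    # The 'subset' key is repeated once per axis, so the parameters are
    # passed as a list of pairs rather than a dict.
    params = [
        ("service", "WCS"),
        ("version", "2.0.1"),
        ("request", "GetCoverage"),
        ("coverageId", coverage_id),
        ("subset", f"{lon_axis}({lon_range[0]},{lon_range[1]})"),
        ("subset", f"{lat_axis}({lat_range[0]},{lat_range[1]})"),
        ("format", fmt),
    ]
    return endpoint + "?" + urlencode(params)

url = wcs_getcoverage_url(
    "https://example.org/wcs",   # hypothetical endpoint
    "soil_great_groups_250m",    # hypothetical coverage identifier
    lon_range=(5.0, 6.0),
    lat_range=(51.5, 52.5),
)
print(url)
```

Requesting only a subset keeps downloads small, which is the same idea behind serving cloud-optimized GeoTIFFs: clients fetch the window they need instead of the whole global layer.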
We contribute to LandGIS the results of our own data mining and spatial prediction processes, including mapping of potential natural vegetation (e.g. Hengl et al. 2018) and 3D mapping of soil properties and classes.
However, we also host maps contributed by others, especially if the data are already peer-reviewed and fully documented.
Fig. 5: LandGIS interface showing predicted global distribution of USDA great groups (soil types) based on a global compilation of soil profiles.
Data available for download at doi: 10.
To illustrate our general approach to producing usable information to improve global land management, consider the example of soil type mapping.
USDA and USGS have invested decades, and possibly billions of dollars, to collect data on soils and produce and maintain knowledge about soils, mainly through the soil classification system “USDA Soil Taxonomy”.
As a result of decades of field work and laboratory analysis of thousands of soil samples, USDA and USGS produced a repository of over 350,000 field observations of soil types (in this case we focus on soil great groups).
We combined these point data with other national and international compilations to produce the world’s most consistent and complete training data set of USDA soil great groups.
We then overlaid these points over some 300 global covariate layers which represent soil forming factors, and then fitted spatial prediction models using Random Forest (Hengl et al.).
Although we achieved only limited classification accuracy outside the USA (where the majority of the training points are located), having a large enough training data set allows us to produce initial maps of soil types (at a relatively fine resolution of 250 m) even for countries where we had essentially no training points at all (Fig. 5).
Because we have fully automated overlay, modeling and spatial prediction, as soon as we obtain more contributed observations of soil types, we can update these initial predictions and gradually produce better and more usable / useful maps.
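The overlay step described above can be sketched in a few lines: each training point is mapped to the pixel it falls in, and the covariate values at that pixel become the predictors for that observation. The grid layout, covariate names and values below are invented for illustration. A real workflow would read the rasters with a geospatial library and pass the resulting table to a Random Forest implementation.

```python
def point_to_pixel(x, y, x_min, y_max, pixel_size):
    """Convert map coordinates to (row, col) in a north-up grid
    whose top-left corner is (x_min, y_max)."""
    col = int((x - x_min) // pixel_size)
    row = int((y_max - y) // pixel_size)
    return row, col

def overlay(points, covariates, x_min, y_max, pixel_size):
    """Build a training table: one record per point inside the grid."""
    first = next(iter(covariates.values()))
    n_rows, n_cols = len(first), len(first[0])
    table = []
    for x, y, label in points:
        row, col = point_to_pixel(x, y, x_min, y_max, pixel_size)
        if not (0 <= row < n_rows and 0 <= col < n_cols):
            continue  # point lies outside the covariate grid
        record = {name: grid[row][col] for name, grid in covariates.items()}
        record["label"] = label
        table.append(record)
    return table

# Two made-up 2 x 2 covariate layers on a 250 m grid.
covariates = {
    "elevation": [[120.0, 135.0], [110.0, 128.0]],
    "ndvi": [[0.61, 0.48], [0.72, 0.55]],
}
# Hypothetical observations: (x, y, soil great group).
points = [(100.0, 400.0, "Hapludalfs"), (300.0, 100.0, "Udorthents"),
          (900.0, 100.0, "Haplustolls")]

training = overlay(points, covariates, x_min=0.0, y_max=500.0, pixel_size=250.0)
```

Because the table is rebuilt from scratch on every run, newly contributed points only need to be appended to `points` before refitting, which is what makes the automated update cycle cheap.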
Fig. 6: Example of a general workflow of how LandGIS can be used to recommend optimal soil use practices at a farm scale based on accurately predicting the soil type (in this case: USDA soil great groups).
Note that this is a circular process: with each new contribution (training points) we can produce increasingly detailed and accurate soil maps.
Other LandGIS functionalities in development include:
Multi-user: Relevant and useful functionality for various participants in environmental activities, including landowners, community leaders, scientific advisory agencies, commercial contractors, donors and investors.
Multi-module: various activities in environmental management are being integrated, including project discovery, farm due diligence, fundraising, implementation, automated land management KPI (Key-Performance-Indicator) tracking.
Integrations with enterprise IT and social media: enhanced security and data protection, together with the interoperability features that enable these integrations.
Context-based customization and enrichment: spatially auto-filled user data (saving manual setup effort), peer activity-based features, location-based alerts for specific conditions (e.g. frost warning, temperature thresholds etc.).
Integration of ‘blockchain’ capability to support environmental token trading.
So, in summary, there is still a need for an OpenLandMap type system to allow for archiving and sharing environmental variables and land management indicators.
With LandGIS, we have shown that new, value-added information can be produced immediately and affordably using “old legacy data”.
We have demonstrated technology and knowledge transfer opportunities (from data-rich countries to data-poor countries), which we believe amount to a win-win scenario.
We release all code used to generate LandGIS layers as open source, allowing full replicability of state-of-the-art spatial analytics by anyone, including as a basis for commercial services.
We have released all our data as open data, allowing anyone, including businesses, to build upon these soil and other environmental data, hopefully in ways we can’t even imagine.
…, & Salmon, J. Mapping the world’s degraded lands. Applied Geography, 57, 12–21.
…, Hancher, M., Dixon, M., Ilyushchenko, S., Thau, D., & Moore, R. Google Earth Engine: Planetary-scale geospatial analysis for everyone. Remote Sensing of Environment, 202, 18–27.
…, Potapov, P., Moore, R., Hancher, M., Turubanova, S., Tyukavina, A., … & Kommareddy, A. High-resolution global maps of 21st-century forest cover change. Science, 342(6160), 850–853.
…, de Jesus, J., Heuvelink, G., Gonzalez, M., Kilibarda, M., Blagotić, A., … & Guevara, M. SoilGrids250m: Global gridded soil information based on machine learning. PLoS ONE, 12(2), e0169748.
…, Walsh, M., Sanderman, J., Wheeler, I., Harrison, S., & Prentice, I. Global mapping of potential natural vegetation: an assessment of machine learning algorithms for estimating land potential. PeerJ, 6, e5457.
…, MacMillan, R. Predictive Soil Mapping with R. OpenGeoHub foundation, Wageningen, the Netherlands, 370 pages, www. org, ISBN: 978–0–359–30635–0.
…, See, L., Tsendbazar, N., & Fritz, S. Towards an Integrated Global Land Cover Monitoring and Mapping System. Remote Sensing, 8(12).
…, & McCabe, M. Daily Retrieval of NDVI and LAI at 3 m Resolution via the Fusion of CubeSat, Landsat, and MODIS Data. Remote Sensing, 10(6), 890.
Irwin (2018). No PhDs needed: how citizen science is transforming research. Nature, 562, 480–482. doi: 10.1038/d41586-018-07106-5
Klein Goldewijk, K. de Vos and G. van Drecht (2011). The HYDE 3.1 spatially explicit database of human induced land use change over the past 12,000 years, Global Ecology and Biogeography 20(1): 73–86.
…, Shakun, J., Clark, P., & Mix, A. A reconstruction of regional and global temperature for the past 11,300 years. Science, 339(6124), 1198–1201.
…, Rizzoli, P., Wecklich, C., González, C., Bueso-Bello, J., Valdo, P., … & Moreira, A. The global forest/non-forest map from TanDEM-X interferometric SAR data. Remote Sensing of Environment, 205, 352–373.
… & Peters, J. Trends in global CO2 and total greenhouse gas emissions: 2018 report. PBL Netherlands Environmental Assessment Agency, The Hague.
Advancements in medium and high resolution Earth observation for land-surface imaging: Evolutions, future trends and contributions to sustainable development. Advances in Space Research, 57(1), 110–126.
…, Cottam, A., Gorelick, N., & Belward, A. High-resolution mapping of global surface water and its long-term changes. Nature, 540(7633), 418.
…, Camps-Valls, G., Stevens, B., Jung, M., Denzler, J., & Carvalhais, N. Deep learning and process understanding for data-driven Earth system science. Nature, 566(7743), 195.
…, Drouet, L., Caldeira, K., & Tavoni, M. Country-level social cost of carbon. Nature Climate Change, 8(10), 895.
…, Hansen, M., Stehman, S., Potapov, P., Tyukavina, A., Vermote, E., & Townshend, J. Global land change from 1982 to 2016. Nature, 560(7720), 639.
…, Tiede, D., Lang, S., Bergstedt, H., Trost, G., Augustin, H., … & Blaschke, T. Big Earth data: disruptive changes in Earth observation data management and analysis? International Journal of Digital Earth, 1–19.
…, Rondinini, C., Pettorelli, N., Mora, B., Leidner, A., Szantoi, Z., … & Koh, L. Free and open-access satellite data are key to biodiversity conservation. Biological Conservation, 182, 173–176.
…, Thenkabail, P., Gumma, M., Teluguntla, P., Poehnelt, J., Congalton, R., … & Thau, D. Automated cropland mapping of continental Africa using Google Earth Engine cloud computing. ISPRS Journal of Photogrammetry and Remote Sensing, 126, 225–244.
…, Wulder, M., Roy, D., Woodcock, C., Hansen, M., Radeloff, V., … & Pekel, J. Benefits of the free and open Landsat data policy. Remote Sensing of Environment, 224, 382–385.