Ninja code: make a satellite mosaic image with the least clouds possible
This topic started as a 2021 New Year greeting, a joke between a few Earth Observation (EO) geeks¹. But it attracted more interest than I expected, with more than 2600 views in 24 hours, so I thought it was worth publishing directly usable code instead of “leaving it as an exercise to the reader”.
The initial post was a snippet of 9 lines of ninja code for the Google Earth Engine (GEE) JavaScript Code Editor. It builds a set of mosaic images, each as cloudless as possible over a month, from an unlimited sequence of images, for stats, classification or AI. See an example in the title image. When I started EO in 1981, the same would have needed more than 500 lines of Fortran plus libraries and hours of execution. Now it’s 9 lines of code and the result is instantaneous.
To achieve this, we use a mix of computer science and EO. Let’s start with the code below, which you can copy, paste and run directly in the GEE Code Editor. The detailed explanations follow. If you are totally new, you will find more hand-holding in annex² and annex³.
The fully functional code
Copy-paste the following in the Code Editor of GEE and click “Run”.
You can stop here, unless you are curious to reproduce the pattern in your own GEE code.
Detailed explanation of the code
You can jump directly to the tutorial on the “map” function, below, and come back later to this detailed explanation.
Preparation code
- line 9: set the center point of analysis, longitude-latitude in degrees. You may want to change to your favorite location.
- line 13: set the month for which we want to calculate the most cloudless image possible over several years, for example to do statistics on the vegetation index, or to train classification or AI algorithms. Here we use April, which is the end of the dry season in Tamil Nadu. You may want to change it to November, which is one of the cloudiest months.
- line 15: set the list of years of study. Sentinel-2 only has images covering the complete year since 2017. If you use MODIS, you can have images since 2000.
Actual ninja code: 1 line :-)
- line 16: first and main “ninja code”: it builds a collection of images, each of them being itself a mosaic over all images of the month, using the least cloudy pixel possible. We use the JavaScript function “map” over the function “makeBestImage”. See more explanations below.
Illustration of the use of the result
- lines 17–22: illustrative example of how to use the result, not directly part of the ninja code. As an exercise, you may want to use the JavaScript “reduce” function to calculate the average and standard deviation of these cloudless images.
Where it all happens…
- lines 25–41: this is the function makeBestImage that is given as argument to the JavaScript “map” function of line 16. In line 33, it returns the anonymous function that we will see now.
- lines 33–40: anonymous function, function(year). It is used by the JavaScript “map” function and it receives as its unique argument from “map” one element “year” of the list on which “map” iterates.
- line 34: set the start date of the range over which the cloud-free mosaic is made.
- line 35: set the duration of the range. You may want to try and change it from “1 month” to “4 weeks” or “2 weeks” etc. If the revisit is frequent enough, maybe 2 weeks are enough?
- line 36: extract into “filtered” a reduced set of images on which to apply the mosaic.
- line 37: another “ninja code” using “map” to calculate the NDVI of each pixel.
- line 38: not so “ninja” here, we just call a GEE “mosaic” function to make the mosaic image from the pixels of all “filtered” images with the best NDVI index.
- lines 44–54: helper function to calculate the index.
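To make the idea behind lines 36–38 more concrete, here is a minimal sketch in plain JavaScript, outside GEE. It is only a toy model: each “image” is assumed to be a small array of pixels with hypothetical fields nir and red, and bestNdviMosaic is an illustrative name, not a GEE call. The real mosaic runs on rasters on the GEE servers.

```javascript
// Toy model: an "image" is an array of pixels; each pixel holds two reflectances.
// The field names nir and red are assumptions made for this illustration.
const images = [
  [{ nir: 0.30, red: 0.25 }, { nir: 0.60, red: 0.10 }], // pixel 0 is cloudy
  [{ nir: 0.50, red: 0.10 }, { nir: 0.28, red: 0.26 }], // pixel 1 is cloudy
];

// Add an NDVI value to every pixel of one image: (NIR - Red) / (NIR + Red).
function addNDVI(image) {
  return image.map(function (p) {
    return { nir: p.nir, red: p.red, ndvi: (p.nir - p.red) / (p.nir + p.red) };
  });
}

// For each pixel position, keep the pixel with the highest NDVI across all images.
function bestNdviMosaic(imageList) {
  return imageList[0].map(function (_, i) {
    return imageList
      .map(function (image) { return image[i]; })
      .reduce(function (best, p) { return p.ndvi > best.ndvi ? p : best; });
  });
}

const mosaic = bestNdviMosaic(images.map(addNDVI));
console.log(mosaic.map(function (p) { return p.ndvi.toFixed(2); }));
```

Clouds reflect red and near-infrared light in roughly similar amounts, so cloudy pixels tend to have an NDVI close to zero. That is why keeping the pixel with the highest NDVI usually selects a clear view of vegetated ground.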
Short tutorial on JavaScript “map” function
Let’s do some basic computer science :-)
There are 3 built-in JavaScript functions that are especially useful in GEE because they let GEE massively parallelize your code, i.e. dispatch it over potentially dozens of virtual machines executing in parallel. They are “map”, “reduce” and “filter”. See in-depth explanations here. In short, they are a vast improvement over the good old “for” loop of our grand-dads.
- In a “for” loop, you ask the compiler to tell the CPU to repeatedly execute some instructions, for example calling a function over all elements of an array.
- With a “map” function, you ask the GEE server⁵ to generate as many copies as required of a function and map them to each element of an array.
When you write a “for” loop, you tell GEE “please do exactly as I tell you, iteration by iteration, I might do things more complicated than simply iterating over an array”. In this case, GEE refrains from parallelizing.
On the contrary, when working with GEE, which is a cloud-based service with virtual machines, if you use a “map” call you actually tell GEE “hey, this function is working on an array, feel free to parallelize it as much as you can”.
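The difference in style can be tried in plain JavaScript, for example in Node or a browser console. This is only a sketch with made-up NDVI values: on an ordinary local array nothing is parallelized, and the point is simply to see how “map”, “filter” and “reduce” describe what to do with the elements, without spelling out the iteration.

```javascript
// Hypothetical NDVI values, one per year, used only for illustration.
const ndviByYear = [0.42, 0.55, 0.61, 0.48];

// Classic for loop: we spell out the iteration order ourselves.
const scaledLoop = [];
for (let i = 0; i < ndviByYear.length; i++) {
  scaledLoop.push(Math.round(ndviByYear[i] * 100));
}

// map: we only describe what to do with one element.
const scaled = ndviByYear.map(function (v) { return Math.round(v * 100); });

// filter: keep the elements that pass a test.
const green = scaled.filter(function (v) { return v >= 50; });

// reduce: fold all elements into a single value, here the mean.
const mean = scaled.reduce(function (sum, v) { return sum + v; }, 0) / scaled.length;

console.log(scaled, green, mean);
```

The loop and the “map” call build the same array. The “map” version never mentions an index or an order, which is what leaves a service like GEE free to distribute the work.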
Notice 3 things here:
- “map” is a built-in function of the class array⁴: you call “map” using the dot notation, array.map(function).
- first class variable: you give “map” an argument that is a function, and its callback makeBestImage even returns a function. Functions are treated as “first class objects” in JavaScript.
- callback functions: a callback function like makeBestImage can execute in a different scope. This allows you to use, in a wider context, functions like “map” that have only one argument.
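These three points can be seen together in a small plain JavaScript sketch. The function makeMonthRange below is a hypothetical stand-in for makeBestImage: it only computes the start and end dates of the month for each year, where the real function would go on to build the mosaic. It is not the article's code, just the same pattern on ordinary arrays.

```javascript
// Hypothetical stand-in for makeBestImage: it only computes the date range.
function makeMonthRange(month) {
  // The returned anonymous function is what map calls, once per year.
  // It still "sees" month because it was created inside makeMonthRange.
  return function (year) {
    const start = new Date(Date.UTC(year, month - 1, 1));
    const end = new Date(Date.UTC(year, month, 1)); // first day of the next month
    return {
      start: start.toISOString().slice(0, 10),
      end: end.toISOString().slice(0, 10),
    };
  };
}

const years = [2017, 2018, 2019];
const aprilRanges = years.map(makeMonthRange(4)); // dot notation, function as argument
console.log(aprilRanges[0]);
```

Note that makeMonthRange(4) is called once, before “map” starts iterating. What “map” receives is the anonymous function, which takes a single argument, the year, and remembers the month.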
Let’s continue with callbacks. In EO programming and blockchain programming you will use callbacks very often.
Passing functions as arguments or as returns, and callbacks
The problem to solve here is the following: when “map” operates on an array, it knows only the element of the array it is working on and nothing else, so how can we give more information to the function that “map” associates with the array element?
For example, you may want to call “map” to calculate not only the NDVI of images (NIR and Red), but to reuse it to calculate any normalized difference index like NDWI-Gao (NIR, Short Wave IR) or NDWI-McFeeters (NIR, Green) etc. Instead of hard-coding the spectral bands in the function, you give it 2 arguments, over the shoulder of “map”, as follows (compare with line 37 above):
imageArray_withNDVI = imageArray.map(addIndex(NIR, Red));
When this line runs, addIndex(), the callback, does nothing but return an anonymous function. This function is the true iterator: “map” applies it to each image, and it calculates the normalized difference of the 2 bands in the context of the caller and adds the resulting band to its argument “image”. Compare with lines 51–54 above.
function addIndex(NIR, RED) {
  return function (image) {
    var ndvi = image.normalizedDifference([NIR, RED]).rename('NDVI');
    return image.addBands(ndvi);
  };
}
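The same pattern can be tried outside GEE. The sketch below is an illustration under stated assumptions, not the GEE API: each “image” is a plain object whose keys are band names (B8, B4 and B3 are used here as NIR, Red and Green, as in Sentinel-2), and this plain JavaScript addIndex takes a hypothetical third argument, name, so that the result is not always called NDVI.

```javascript
// Toy "image": a plain object whose keys are band names (an assumption for this sketch).
const pixelImages = [
  { B8: 0.5, B4: 0.1, B3: 0.2 },
  { B8: 0.3, B4: 0.3, B3: 0.1 },
];

// Factory: receives the extra arguments and returns the one-argument
// function that map needs.
function addIndex(bandA, bandB, name) {
  return function (image) {
    const value = (image[bandA] - image[bandB]) / (image[bandA] + image[bandB]);
    // Return a copy with the new band added, leaving the input untouched.
    return Object.assign({}, image, { [name]: value });
  };
}

// NDVI: (NIR - Red) / (NIR + Red)
const withNDVI = pixelImages.map(addIndex('B8', 'B4', 'NDVI'));
// NDWI in the McFeeters form: (Green - NIR) / (Green + NIR)
const withNDWI = pixelImages.map(addIndex('B3', 'B8', 'NDWI'));

console.log(withNDVI, withNDWI);
```

One function body serves every normalized difference index. Only the arguments given to addIndex change, and “map” never needs to know about them.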
The following picture may be easier to understand.
- On the left column, we let “map” apply a function directly to all elements of an array and make another array with the results. This function can have only one argument, the element of the array.
- On the right column, we let “map” apply a function that itself calls a function that returns a result to “map”. The first function is the callback that can have as many arguments as needed.
In our example of ninja code, we use this setup twice, once to map makeBestImage and once to map addNDVI. Master it and you will use it all the time in GEE. You now have the tools to read and understand the code above.
We’ll also use callbacks when doing blockchain programming, but we’ll leave this topic to another article. Stay tuned. EO and blockchain programming are not so different, actually.
[2] You don’t have a GEE account to run the code? Here is how to obtain a free GEE account.
[3] You are new to Earth Observation? Here is a hands-on interactive exercise to explore vegetation indexes (NDVI, NDWI, LAI and more).
[4] Actually, “map”, “reduce” and “filter” work on arrays, lists, and collections, not only on arrays.
[5] Because “map” is a server-side function, you may experience surprises or dubious error messages from GEE when you insert client-side functions in a “mapped” scope, for example if you use print().