Modules

ee_quick_start()

Quick start function to initialize Earth Engine with automatic credential detection.

Automatically detects and uses Earth Engine credentials from the GEE_KEY environment variable. Supports both service account JSON files and project tokens, providing informative feedback about the initialization process.

Environment Variables

GEE_KEY : str
    Earth Engine authentication key. Can be either:
    - Path to service account JSON file (ends with .json)
    - Project token string for standard authentication

Returns:

Type Description
None

Prints initialization status messages but doesn't return values.

Examples:

>>> import os
>>> os.environ['GEE_KEY'] = '/path/to/service-account.json'
>>> ee_quick_start()
Earth Engine initialized successfully using AgriGEE.lite...
Notes

For service account authentication, the function also sets the GOOGLE_APPLICATION_CREDENTIALS environment variable for use with Google Cloud Storage.
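The detection rule described above can be sketched in isolation. This is a minimal illustration, not the library's implementation: `classify_gee_key` is a hypothetical helper that mirrors the documented rule (a value ending in .json is treated as a service account key file, anything else as a project token) without contacting Earth Engine.

```python
import os


def classify_gee_key(environ=None):
    """Report how a GEE_KEY value would be interpreted (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    gee_key = environ.get("GEE_KEY")
    if gee_key is None:
        return "missing"  # ee_quick_start() would print a hint and not initialize
    if gee_key.endswith(".json"):
        return "service_account"  # path to a service account key file
    return "project_token"  # any other string is passed on as the project


print(classify_gee_key({"GEE_KEY": "/path/to/service-account.json"}))
print(classify_gee_key({"GEE_KEY": "my-project"}))
print(classify_gee_key({}))
```

A check like this can be useful before calling ee_quick_start() in scripts where a missing variable should fail loudly rather than only print a message.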

Source code in agrigee_lite/ee_utils.py
def ee_quick_start() -> None:
    """
    Quick start function to initialize Earth Engine with automatic credential detection.

    Automatically detects and uses Earth Engine credentials from the GEE_KEY
    environment variable. Supports both service account JSON files and project
    tokens, providing informative feedback about the initialization process.

    Environment Variables
    ---------------------
    GEE_KEY : str
        Earth Engine authentication key. Can be either:
        - Path to service account JSON file (ends with .json)
        - Project token string for standard authentication

    Returns
    -------
    None
        Prints initialization status messages but doesn't return values.

    Examples
    --------
    >>> import os
    >>> os.environ['GEE_KEY'] = '/path/to/service-account.json'
    >>> ee_quick_start()
    Earth Engine initialized successfully using AgriGEE.lite...

    Notes
    -----
    For service account authentication, the function also sets the
    GOOGLE_APPLICATION_CREDENTIALS environment variable for use with Google Cloud Storage.
    """

    if not ee_is_authenticated():
        if "GEE_KEY" in os.environ:
            gee_key = os.environ["GEE_KEY"]

            if gee_key.endswith(".json"):  # with service account
                credentials = ee.ServiceAccountCredentials(gee_key, gee_key)
                ee.Initialize(credentials)

                os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = gee_key

                with open(gee_key) as f:
                    key_data = json.load(f)
                    print(
                        f"Earth Engine initialized successfully using AgriGEE.lite with service account. Project: {key_data.get('project_id', 'Unknown')}, Email: {key_data.get('client_email', 'Unknown')}."
                    )

            else:  # using token
                ee.Initialize(opt_url="https://earthengine-highvolume.googleapis.com", project=gee_key)
                print(f"Earth Engine initialized successfully using AgriGEE.lite using token (project={gee_key}).")

        else:
            print(
                "Earth Engine not initialized. Please set the GEE_KEY environment variable to your Earth Engine key. You can find more information in the AgriGEE.lite documentation."
            )

get_all_tasks()

Retrieve status information for all Earth Engine tasks.

Fetches comprehensive information about all Earth Engine operations/tasks associated with the authenticated account, including metadata, timing, resource usage, and cost estimates.

Returns:

Type Description
DataFrame

DataFrame containing task information with the following columns:
- attempt: Task attempt number
- create_time: Task creation timestamp
- description: Task description/name
- destination_uris: Output destination URIs
- done: Boolean indicating completion status
- end_time: Task completion timestamp
- name: Internal task name
- priority: Task priority level
- progress: Completion progress (0.0 to 1.0)
- script_uri: Source script URI
- start_time: Task start timestamp
- state: Current task state (RUNNING, COMPLETED, FAILED, etc.)
- total_batch_eecu_usage_seconds: Total EECU usage in seconds
- type: Task type (EXPORT_IMAGE, EXPORT_TABLE, etc.)
- update_time: Last update timestamp
- estimated_cost_usd_tier_1: Estimated cost in US Dollars for Tier 1 pricing
- estimated_cost_usd_tier_2: Estimated cost in US Dollars for Tier 2 pricing
- estimated_cost_usd_tier_3: Estimated cost in US Dollars for Tier 3 pricing

Notes

Cost estimates are based on EECU usage and standard pricing tiers. If no tasks exist, returns an empty DataFrame with the same column structure.
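The cost columns are plain arithmetic on the EECU usage, which the following sketch reproduces. The hourly rates (0.40, 0.28 and 0.16 USD per EECU-hour) are the constants that appear in the source listing on this page; `estimate_costs` is a hypothetical helper name, and current Earth Engine pricing may differ from these values.

```python
# USD per EECU-hour for each pricing tier, copied from the source listing.
TIER_RATES_USD_PER_EECU_HOUR = {"tier_1": 0.40, "tier_2": 0.28, "tier_3": 0.16}


def estimate_costs(eecu_seconds):
    """Convert batch EECU-seconds into an estimated USD cost per tier."""
    eecu_hours = eecu_seconds / 3600  # usage is reported in seconds
    return {tier: eecu_hours * rate for tier, rate in TIER_RATES_USD_PER_EECU_HOUR.items()}


# A task that consumed two EECU-hours.
print(estimate_costs(7200))
```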

Source code in agrigee_lite/ee_utils.py
def ee_get_tasks_status() -> pd.DataFrame:
    """
    Retrieve status information for all Earth Engine tasks.

    Fetches comprehensive information about all Earth Engine operations/tasks
    associated with the authenticated account, including metadata, timing,
    resource usage, and cost estimates.

    Returns
    -------
    pd.DataFrame
        DataFrame containing task information with the following columns:
        - attempt: Task attempt number
        - create_time: Task creation timestamp
        - description: Task description/name
        - destination_uris: Output destination URIs
        - done: Boolean indicating completion status
        - end_time: Task completion timestamp
        - name: Internal task name
        - priority: Task priority level
        - progress: Completion progress (0.0 to 1.0)
        - script_uri: Source script URI
        - start_time: Task start timestamp
        - state: Current task state (RUNNING, COMPLETED, FAILED, etc.)
        - total_batch_eecu_usage_seconds: Total EECU usage in seconds
        - type: Task type (EXPORT_IMAGE, EXPORT_TABLE, etc.)
        - update_time: Last update timestamp
        - estimated_cost_usd_tier_1: Estimated cost in US Dollars for Tier 1 pricing
        - estimated_cost_usd_tier_2: Estimated cost in US Dollars for Tier 2 pricing
        - estimated_cost_usd_tier_3: Estimated cost in US Dollars for Tier 3 pricing

    Notes
    -----
    Cost estimates are based on EECU usage and standard pricing tiers.
    If no tasks exist, returns an empty DataFrame with the same column structure.
    """
    tasks = ee.data.listOperations()

    if tasks:
        records = []
        for op in tasks:
            metadata = op.get("metadata", {})

            record = {
                "attempt": metadata.get("attempt"),
                "create_time": metadata.get("createTime"),
                "description": metadata.get("description"),
                "destination_uris": metadata.get("destinationUris", [None])[0],
                "done": op.get("done"),
                "end_time": metadata.get("endTime"),
                "name": op.get("name"),
                "priority": metadata.get("priority"),
                "progress": metadata.get("progress"),
                "script_uri": metadata.get("scriptUri"),
                "start_time": metadata.get("startTime"),
                "state": metadata.get("state"),
                "total_batch_eecu_usage_seconds": metadata.get("batchEecuUsageSeconds", 0.0),
                "type": metadata.get("type"),
                "update_time": metadata.get("updateTime"),
            }
            records.append(record)

        df = pd.DataFrame(records)
        df["create_time"] = pd.to_datetime(df.create_time, format="mixed")
        df["end_time"] = pd.to_datetime(df.end_time, format="mixed")
        df["start_time"] = pd.to_datetime(df.start_time, format="mixed")
        df["update_time"] = pd.to_datetime(df.update_time, format="mixed")

        df["estimated_cost_usd_tier_1"] = (df.total_batch_eecu_usage_seconds / (60 * 60)) * 0.40
        df["estimated_cost_usd_tier_2"] = (df.total_batch_eecu_usage_seconds / (60 * 60)) * 0.28
        df["estimated_cost_usd_tier_3"] = (df.total_batch_eecu_usage_seconds / (60 * 60)) * 0.16

    else:  # If no tasks are found, create an empty DataFrame with the same columns
        df = pd.DataFrame(
            columns=[
                "attempt",
                "create_time",
                "description",
                "destination_uris",
                "done",
                "end_time",
                "name",
                "priority",
                "progress",
                "script_uri",
                "start_time",
                "state",
                "total_batch_eecu_usage_seconds",
                "type",
                "update_time",
                # Keep the cost columns so the empty frame matches the populated one
                "estimated_cost_usd_tier_1",
                "estimated_cost_usd_tier_2",
                "estimated_cost_usd_tier_3",
            ]
        )

    return df

quadtree_clustering(gdf, max_size=1000)

Cluster geometries in a GeoDataFrame using a quadtree and simplify clusters.

Parameters:

Name Type Description Default
gdf GeoDataFrame

GeoDataFrame containing geometries (Polygon, MultiPolygon, or Point).

required
max_size int

Maximum number of geometries per cluster (default is 1000).

1000

Returns:

Type Description
GeoDataFrame

GeoDataFrame with cluster labels and simplified geometries.
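The idea behind quadtree clustering can be illustrated on bare centroid coordinates. The sketch below is an assumption about the general technique, not a copy of the library's build_quadtree_iterative (which is not shown on this page): `quadtree_clusters` is a hypothetical function that keeps splitting a bounding box into four quadrants until every cell holds at most max_size points.

```python
def quadtree_clusters(points, max_size):
    """Group point indices into quadtree cells of at most max_size points.

    Illustrative only; cells that cannot be split any further (for example,
    coincident points) are kept whole even if they exceed max_size.
    """
    if not points:
        return []
    clusters = []
    stack = [list(range(len(points)))]  # cells still to be examined
    while stack:
        idx = stack.pop()
        if len(idx) <= max_size:
            clusters.append(idx)
            continue
        xs = [points[i][0] for i in idx]
        ys = [points[i][1] for i in idx]
        mid_x = (min(xs) + max(xs)) / 2
        mid_y = (min(ys) + max(ys)) / 2
        # Assign every point to one of the four quadrants around the midpoint.
        quadrants = {}
        for i in idx:
            key = (points[i][0] > mid_x, points[i][1] > mid_y)
            quadrants.setdefault(key, []).append(i)
        if len(quadrants) == 1:
            clusters.append(idx)  # no further split is possible
            continue
        stack.extend(quadrants.values())
    return clusters


centroids = [(0, 0), (1, 1), (10, 10), (11, 11), (0, 10)]
for cluster_id, members in enumerate(quadtree_clusters(centroids, max_size=2)):
    print(cluster_id, members)
```

Because cells follow the spatial layout of the data, geometries that share a cluster are close to each other, which is what makes the per-cluster simplification step meaningful.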

Source code in agrigee_lite/misc.py
def quadtree_clustering(gdf: gpd.GeoDataFrame, max_size: int = 1000) -> gpd.GeoDataFrame:
    """
    Cluster geometries in a GeoDataFrame using a quadtree and simplify clusters.

    Parameters
    ----------
    gdf : geopandas.GeoDataFrame
        GeoDataFrame containing geometries (Polygon, MultiPolygon, or Point).
    max_size : int, optional
        Maximum number of geometries per cluster (default is 1000).

    Returns
    -------
    geopandas.GeoDataFrame
        GeoDataFrame with cluster labels and simplified geometries.
    """
    gdf = gdf.copy()

    # Centroid columns (ignore CRS warning)
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=UserWarning)
        gdf["centroid_x"] = gdf.geometry.centroid.x
        gdf["centroid_y"] = gdf.geometry.centroid.y

    # Build quadtree and label clusters
    clusters = build_quadtree_iterative(gdf, max_size=max_size)

    cluster_array = np.zeros(len(gdf), dtype=int)
    for i, cluster_indexes in enumerate(clusters):
        cluster_array[cluster_indexes] = i

    gdf["cluster_id"] = cluster_array
    gdf = gdf.sort_values(by=["cluster_id", "centroid_x"]).reset_index(drop=True)

    unique_cluster_ids = gdf["cluster_id"].unique()

    with concurrent.futures.ProcessPoolExecutor() as executor:
        futures = {
            executor.submit(_simplify_cluster, gdf[gdf.cluster_id == cluster_id][["geometry"]], cluster_id): cluster_id
            for cluster_id in unique_cluster_ids
        }

        for future in tqdm(
            concurrent.futures.as_completed(futures),
            total=len(futures),
            desc="Simplifying clusters",
            smoothing=0.5,
        ):
            cluster_id, simplified_geom = future.result()
            gdf.loc[gdf["cluster_id"] == cluster_id, "geometry"] = simplified_geom.values

    return gdf

random_points_from_gdf(gdf, num_points_per_geometry=10, buffer=-10)

Generate random points from geometries in a GeoDataFrame, with optional buffering.

Parameters:

Name Type Description Default
gdf GeoDataFrame

GeoDataFrame containing geometries (Polygon, MultiPolygon, or Point).

required
num_points_per_geometry int

Number of points to generate per geometry (default is 10).

10
buffer int

Buffer distance to apply to geometries before generating points (default is -10).

-10

Returns:

Type Description
GeoDataFrame

GeoDataFrame of generated points merged with original attributes.
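The effect of a negative buffer can be shown on a simplified case. The source listing reprojects to an estimated UTM CRS before buffering, so the buffer distance is in metres. The sketch below is not the library's sampling routine: `sample_inside_rectangle` is a hypothetical helper that works on an axis-aligned rectangle in metre coordinates, shrinks it by the buffer, and draws uniform random points from what remains.

```python
import random


def sample_inside_rectangle(bounds, num_points, buffer=-10.0, seed=0):
    """Sample points uniformly inside a rectangle after applying a buffer.

    bounds is (min_x, min_y, max_x, max_y) in metres. A negative buffer
    shrinks the rectangle inward; a positive one grows it.
    """
    min_x, min_y, max_x, max_y = bounds
    min_x, min_y = min_x - buffer, min_y - buffer
    max_x, max_y = max_x + buffer, max_y + buffer
    if min_x >= max_x or min_y >= max_y:
        return []  # the inward buffer consumed the whole shape
    rng = random.Random(seed)
    return [(rng.uniform(min_x, max_x), rng.uniform(min_y, max_y)) for _ in range(num_points)]


field = (0.0, 0.0, 100.0, 50.0)
print(sample_inside_rectangle(field, num_points=3))
print(sample_inside_rectangle((0.0, 0.0, 15.0, 15.0), num_points=3))
```

The second call returns an empty list: a 15 m square disappears entirely under a -10 m buffer, so very small geometries may yield no points.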

Source code in agrigee_lite/misc.py
def random_points_from_gdf(
    gdf: gpd.GeoDataFrame, num_points_per_geometry: int = 10, buffer: int = -10
) -> gpd.GeoDataFrame:
    """
    Generate random points from geometries in a GeoDataFrame, with optional buffering.

    Parameters
    ----------
    gdf : geopandas.GeoDataFrame
        GeoDataFrame containing geometries (Polygon, MultiPolygon, or Point).
    num_points_per_geometry : int, optional
        Number of points to generate per geometry (default is 10).
    buffer : int, optional
        Buffer distance to apply to geometries before generating points (default is -10).

    Returns
    -------
    geopandas.GeoDataFrame
        GeoDataFrame of generated points merged with original attributes.
    """
    if buffer != 0:
        gdf = gdf.copy()
        gdf = quadtree_clustering(gdf)
        # Buffer in a projected (metre-based) UTM CRS, using the `buffer` argument
        gdf["geometry"] = gdf.to_crs(gdf.estimate_utm_crs()).buffer(buffer).to_crs("EPSG:4326")

    gdf["geometry_id"] = pd.factorize(gdf["geometry"])[0]
    points_gdf = generate_grid_random_points_from_gdf(gdf, num_points_per_geometry)
    points_gdf = points_gdf.merge(
        gdf.drop(columns=["geometry"]).reset_index().rename(columns={"index": "original_index"}),
        on="geometry_id",
        how="inner",
    )
    points_gdf = points_gdf[points_gdf.geometry.x != 0].reset_index(drop=True)

    return points_gdf

images(geometry, start_date, end_date, satellite, invalid_images_threshold=0.5, max_parallel_downloads=40, force_redownload=False, image_indices=None)

Download multiple satellite images for a given geometry and date range.

Parameters:

Name Type Description Default
geometry Polygon or MultiPolygon

The area of interest as a shapely Polygon or MultiPolygon.

required
start_date Timestamp or str

Start date for image collection.

required
end_date Timestamp or str

End date for image collection.

required
satellite AbstractSatellite

The satellite configuration to use for image collection.

required
invalid_images_threshold float

Threshold for filtering images based on valid pixels (0.0-1.0), by default 0.5.

0.5
max_parallel_downloads int

Maximum number of parallel downloads, by default 40.

40
force_redownload bool

Whether to force re-download of existing files, by default False.

False
image_indices list[int] or None

List of specific image indices to download (e.g., [0, 1] for first two images). If None, all images in the date range will be downloaded, by default None.

None

Returns:

Type Description
list[str]

List of image names (dates in YYYY-MM-DD format) that were downloaded.
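In the source listing, invalid_images_threshold is applied relative to the best image in the collection: an image is kept when its valid-pixel count is at least the threshold times the maximum count. image_indices is then applied to the filtered list, and out-of-range indices are ignored. The following pure-Python sketch mirrors that selection logic on local data; `select_images` and the sample catalog are hypothetical and do not call Earth Engine.

```python
def select_images(valid_pixels_by_date, invalid_images_threshold=0.5, image_indices=None):
    """Return the dates kept after the valid-pixel filter and optional index selection."""
    if not valid_pixels_by_date:
        return []
    max_valid = max(count for _, count in valid_pixels_by_date)
    cutoff = max_valid * invalid_images_threshold
    dates = [date for date, count in valid_pixels_by_date if count >= cutoff]
    if image_indices is not None:
        # Indices refer to positions in the filtered list; invalid ones are dropped.
        dates = [dates[i] for i in image_indices if 0 <= i < len(dates)]
    return dates


catalog = [("2024-01-01", 1000), ("2024-01-06", 400), ("2024-01-11", 900)]
print(select_images(catalog))                        # ['2024-01-01', '2024-01-11']
print(select_images(catalog, image_indices=[1, 5]))  # ['2024-01-11']
```

Note that index 1 refers to the second image that survived the filter, not the second image in the original date range.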

Source code in agrigee_lite/get/image.py
def download_multiple_images(  # noqa: C901
    geometry: Polygon | MultiPolygon,
    start_date: pd.Timestamp | str,
    end_date: pd.Timestamp | str,
    satellite: AbstractSatellite,
    invalid_images_threshold: float = 0.5,
    max_parallel_downloads: int = 40,
    force_redownload: bool = False,
    image_indices: list[int] | None = None,
) -> list[str]:
    """
    Download multiple satellite images for a given geometry and date range.

    Parameters
    ----------
    geometry : Polygon or MultiPolygon
        The area of interest as a shapely Polygon or MultiPolygon.
    start_date : pd.Timestamp or str
        Start date for image collection.
    end_date : pd.Timestamp or str
        End date for image collection.
    satellite : AbstractSatellite
        The satellite configuration to use for image collection.
    invalid_images_threshold : float, optional
        Threshold for filtering images based on valid pixels (0.0-1.0), by default 0.5.
    max_parallel_downloads : int, optional
        Maximum number of parallel downloads, by default 40.
    force_redownload : bool, optional
        Whether to force re-download of existing files, by default False.
    image_indices : list[int] or None, optional
        List of specific image indices to download (e.g., [0, 1] for first two images).
        If None, all images in the date range will be downloaded, by default None.

    Returns
    -------
    list[str]
        List of image names (dates in YYYY-MM-DD format) that were downloaded.
    """

    start_date = start_date.strftime("%Y-%m-%d") if isinstance(start_date, pd.Timestamp) else start_date
    end_date = end_date.strftime("%Y-%m-%d") if isinstance(end_date, pd.Timestamp) else end_date

    ee_geometry = ee.Geometry(geometry.__geo_interface__)
    ee_feature = ee.Feature(
        ee_geometry,
        {"s": start_date, "e": end_date, "0": 1},
    )
    ee_expression = satellite.imageCollection(ee_feature)

    metadata_dict: dict[str, str] = {}
    metadata_dict |= log_dict_function_call_summary([
        "geometry",
        "start_date",
        "end_date",
        "satellite",
        "max_parallel_downloads",
        "force_redownload",
    ])
    metadata_dict |= satellite.log_dict()
    metadata_dict["start_date"] = start_date
    metadata_dict["end_date"] = end_date
    metadata_dict["centroid_x"] = geometry.centroid.x
    metadata_dict["centroid_y"] = geometry.centroid.y

    if ee_expression.size().getInfo() == 0:
        print("No images found for the specified parameters.")
        return []

    max_valid_pixels = ee_expression.aggregate_max("ZZ_USER_VALID_PIXELS")
    threshold = ee.Number(max_valid_pixels).multiply(invalid_images_threshold)
    ee_expression = ee_expression.filter(ee.Filter.gte("ZZ_USER_VALID_PIXELS", threshold))

    image_names = ee_expression.aggregate_array("ZZ_USER_TIME_DUMMY").getInfo()
    image_indexes = ee_expression.aggregate_array("system:index").getInfo()

    # Filter images by indices if provided
    if image_indices is not None:
        # Ensure indices are valid
        valid_indices = [i for i in image_indices if 0 <= i < len(image_indexes)]
        if not valid_indices:
            print("No valid image indices provided.")
            return []

        image_names = [image_names[i] for i in valid_indices]
        image_indexes = [image_indexes[i] for i in valid_indices]

    output_path = pathlib.Path("data/temp/images") / f"{create_dict_hash(metadata_dict)}"
    output_path.mkdir(parents=True, exist_ok=True)

    if force_redownload:
        for f in output_path.glob("*.zip"):
            f.unlink()

    downloader = DownloaderStrategy(download_folder=output_path)

    already_downloaded_files = {int(x.stem) for x in output_path.glob("*.zip")}
    all_chunks = set(range(len(image_indexes)))
    pending_chunks = sorted(all_chunks - already_downloaded_files)

    pbar = tqdm(total=len(pending_chunks), desc=f"Downloading images ({output_path.name})", unit="feature")

    def update_pbar():
        pbar.n = downloader.num_completed_downloads
        pbar.refresh()
        pbar.set_postfix({
            "aria2_errors": downloader.num_downloads_with_error,
            "active_downloads": downloader.num_unfinished_downloads,
        })

    def download_task(chunk_index):
        try:
            img = ee.Image(ee_expression.filter(ee.Filter.eq("system:index", image_indexes[chunk_index])).first())
            # Use only the image date as filename (GEE standard format)
            image_date = image_names[chunk_index]
            filename = f"{image_date}"
            url = img.getDownloadURL({"name": filename, "region": ee_geometry})
            downloader.add_download([(chunk_index, url)])
            return chunk_index, True  # noqa: TRY300
        except Exception as _:
            return chunk_index, False

    while downloader.num_completed_downloads < len(pending_chunks):
        with ThreadPoolExecutor(max_workers=max_parallel_downloads) as executor:
            futures = {executor.submit(download_task, chunk): chunk for chunk in pending_chunks}

            failed_chunks = []
            for future in as_completed(futures):
                chunk, success = future.result()
                if not success:
                    failed_chunks.append(chunk)
                    logging.warning(f"Download images - {output_path} - Failed to initiate download for chunk {chunk}.")

                update_pbar()

                while downloader.num_unfinished_downloads >= max_parallel_downloads:
                    time.sleep(1)
                    update_pbar()

        while downloader.num_unfinished_downloads > 0:
            time.sleep(1)
            update_pbar()

        pending_chunks = sorted(set(failed_chunks + downloader.failed_downloads))

    update_pbar()
    pbar.close()

    return image_names

multiple_sits(gdf, satellite, reducers=None, original_index_column_name='original_index', start_date_column_name='start_date', end_date_column_name='end_date', subsampling_max_pixels=1000, chunksize=10, max_parallel_downloads=40, force_redownload=False)

Download satellite time series for multiple geometries using parallel processing.

Parameters:

Name Type Description Default
gdf GeoDataFrame

GeoDataFrame containing geometries and temporal information.

required
satellite AbstractSatellite

Satellite configuration object.

required
reducers set[str] or None

Set of reducer names to apply, by default None.

None
original_index_column_name str

Name of the column to store original indices, by default "original_index".

'original_index'
start_date_column_name str

Name of the start date column, by default "start_date".

'start_date'
end_date_column_name str

Name of the end date column, by default "end_date".

'end_date'
subsampling_max_pixels float

Maximum pixels for sampling: >1 = absolute count, ≤1 = fraction of area (e.g., 0.5 = 50% sampling), by default 1_000.

1000
chunksize int

Number of features to process per chunk, by default 10.

10
max_parallel_downloads int

Maximum number of parallel downloads, by default 40.

40
force_redownload bool

Whether to force re-download of existing data, by default False.

False

Returns:

Type Description
DataFrame

Combined DataFrame containing satellite time series for all geometries.
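The chunksize parameter determines how the input rows are split into download units, and chunks whose CSV files already exist are skipped on a rerun. The sketch below restates that bookkeeping from the source listing in standalone form; `pending_chunks` and `chunk_bounds` are hypothetical helper names.

```python
def pending_chunks(num_features, chunksize, downloaded_chunk_ids):
    """Return the chunk ids that still need to be downloaded."""
    num_chunks = (num_features + chunksize - 1) // chunksize  # ceiling division
    return sorted(set(range(num_chunks)) - set(downloaded_chunk_ids))


def chunk_bounds(chunk_id, chunksize, num_features):
    """Return the half-open row range [start, stop) covered by a chunk."""
    start = chunk_id * chunksize
    return start, min(start + chunksize, num_features)


# 25 features in chunks of 10 give three chunks; chunk 0 is already on disk.
print(pending_chunks(25, 10, downloaded_chunk_ids=[0]))  # [1, 2]
print(chunk_bounds(2, 10, 25))                           # (20, 25)
```

Smaller chunks mean more requests but smaller individual downloads; the last chunk is simply shorter when the row count is not a multiple of chunksize.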

Source code in agrigee_lite/get/sits.py
def download_multiple_sits(  # noqa: C901
    gdf: gpd.GeoDataFrame,
    satellite: AbstractSatellite,
    reducers: set[str] | None = None,
    original_index_column_name: str = "original_index",
    start_date_column_name: str = "start_date",
    end_date_column_name: str = "end_date",
    subsampling_max_pixels: float = 1_000,
    chunksize: int = 10,
    max_parallel_downloads: int = 40,
    force_redownload: bool = False,
) -> pd.DataFrame:
    """
    Download satellite time series for multiple geometries using parallel processing.

    Parameters
    ----------
    gdf : gpd.GeoDataFrame
        GeoDataFrame containing geometries and temporal information.
    satellite : AbstractSatellite
        Satellite configuration object.
    reducers : set[str] or None, optional
        Set of reducer names to apply, by default None.
    original_index_column_name : str, optional
        Name of the column to store original indices, by default "original_index".
    start_date_column_name : str, optional
        Name of the start date column, by default "start_date".
    end_date_column_name : str, optional
        Name of the end date column, by default "end_date".
    subsampling_max_pixels : float, optional
        Maximum pixels for sampling: >1 = absolute count, ≤1 = fraction of area (e.g., 0.5 = 50% sampling), by default 1_000.
    chunksize : int, optional
        Number of features to process per chunk, by default 10.
    max_parallel_downloads : int, optional
        Maximum number of parallel downloads, by default 40.
    force_redownload : bool, optional
        Whether to force re-download of existing data, by default False.

    Returns
    -------
    pd.DataFrame
        Combined DataFrame containing satellite time series for all geometries.
    """
    if len(gdf) == 0:
        return pd.DataFrame()

    gdf = sanitize_and_prepare_input_gdf(
        gdf, satellite, original_index_column_name, 1000, start_date_column_name, end_date_column_name
    )

    metadata_dict: dict[str, str] = {}
    metadata_dict |= log_dict_function_call_summary(["gdf", "satellite", "max_parallel_downloads", "force_redownload"])
    metadata_dict |= satellite.log_dict()

    output_path = (
        pathlib.Path("data/temp/sits")
        / f"{create_gdf_hash(gdf, start_date_column_name, end_date_column_name)}_{create_dict_hash(metadata_dict)}"
    )

    if force_redownload:
        for f in output_path.glob("*"):
            f.unlink()

    output_path.mkdir(parents=True, exist_ok=True)

    downloader = DownloaderStrategy(download_folder=output_path)

    num_chunks = (len(gdf) + chunksize - 1) // chunksize

    already_downloaded_files = [int(x.stem) for x in output_path.glob("*.csv")]
    logging.info(
        "%s - %d chunks already downloaded and will be skipped.", output_path, len(already_downloaded_files)
    )
    initial_download_chunks = sorted(set(range(num_chunks)) - set(already_downloaded_files))

    pbar = tqdm(
        total=len(initial_download_chunks) * chunksize,
        desc=f"Building download URLs ({output_path.name})",
        unit="feature",
        smoothing=0,
    )

    def update_pbar():
        pbar.n = downloader.num_completed_downloads * chunksize
        pbar.refresh()
        pbar.set_postfix({
            "not_sent_to_server": len(not_sent_to_server),
            "aria2_errors": downloader.num_downloads_with_error,
            "active_downloads": downloader.num_unfinished_downloads,
        })

    to_download_chunks = initial_download_chunks
    while downloader.num_completed_downloads != len(initial_download_chunks):
        not_sent_to_server = []

        for current_chunk in to_download_chunks:
            while downloader.num_unfinished_downloads >= max_parallel_downloads:
                time.sleep(1)
                update_pbar()

            sub = gdf.iloc[current_chunk * chunksize : (current_chunk + 1) * chunksize]
            ee_expression = build_ee_expression(
                sub,
                satellite,
                reducers,
                subsampling_max_pixels,
                original_index_column_name,
                start_date_column_name,
                end_date_column_name,
            )

            try:
                url = ee_expression.getDownloadURL(
                    filetype="csv",
                    selectors=build_selectors(satellite, reducers),
                    filename=f"{current_chunk}",
                )

                downloader.add_download([(current_chunk, url)])

            except KeyboardInterrupt:
                pbar.close()
                raise

            except ee.ee_exception.EEException:
                logging.exception("%s - Chunk id = %s - Failed to get download URL.", output_path, current_chunk)
                not_sent_to_server.append(current_chunk)

            update_pbar()

        to_download_chunks = sorted(set(not_sent_to_server) | set(downloader.failed_downloads))
        while len(to_download_chunks) == 0 and (downloader.num_completed_downloads != len(initial_download_chunks)):
            time.sleep(1)
            to_download_chunks = sorted(set(not_sent_to_server) | set(downloader.failed_downloads))
            update_pbar()

    pbar.close()
    whole_result_df = pd.DataFrame()
    for f in output_path.glob("*.csv"):
        df = pd.read_csv(f)
        whole_result_df = pd.concat([whole_result_df, df], ignore_index=True)

    whole_result_df = prepare_output_df(whole_result_df, satellite, original_index_column_name)

    return whole_result_df

multiple_sits_gcs(gdf, satellite, bucket_name, reducers=None, original_index_column_name='original_index', start_date_column_name='start_date', end_date_column_name='end_date', subsampling_max_pixels=1000, cluster_size=500, force_redownload=False, wait=True)

Download satellite time series using Google Earth Engine tasks to Google Cloud Storage.

Parameters:

Name Type Description Default
gdf GeoDataFrame

GeoDataFrame containing geometries and temporal information.

required
satellite AbstractSatellite

Satellite configuration object.

required
bucket_name str

Google Cloud Storage bucket name for exports.

required
reducers set[str] or None

Set of reducer names to apply, by default None.

None
original_index_column_name str

Name of the column to store original indices, by default "original_index".

'original_index'
start_date_column_name str

Name of the start date column, by default "start_date".

'start_date'
end_date_column_name str

Name of the end date column, by default "end_date".

'end_date'
subsampling_max_pixels float

Maximum pixels for sampling: >1 = absolute count, ≤1 = fraction of area (e.g., 0.5 = 50% sampling), by default 1_000.

1000
cluster_size int

Maximum cluster size for spatial grouping, by default 500.

500
force_redownload bool

Whether to force re-download of existing data, by default False.

False
wait bool

Whether to wait for task completion, by default True.

True

Returns:

Type Description
None or DataFrame

If wait is True, returns DataFrame with combined results. If wait is False, returns None.
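The dual meaning of subsampling_max_pixels (an absolute count above 1, a fraction at or below 1) can be made concrete with a small sketch. This is only one plausible reading of the parameter description, not the library's code: `resolve_sample_size` is a hypothetical helper, and the rounding and capping behaviour shown here are assumptions.

```python
def resolve_sample_size(total_pixels, subsampling_max_pixels=1_000):
    """Translate subsampling_max_pixels into a pixel count (assumed semantics)."""
    if subsampling_max_pixels <= 0:
        raise ValueError("subsampling_max_pixels must be positive")
    if subsampling_max_pixels > 1:
        # Absolute cap: never sample more pixels than the geometry contains.
        return min(total_pixels, int(subsampling_max_pixels))
    # Fraction of the available pixels, e.g. 0.5 keeps half of them.
    return int(total_pixels * subsampling_max_pixels)


print(resolve_sample_size(5000))       # 1000
print(resolve_sample_size(300))        # 300
print(resolve_sample_size(5000, 0.5))  # 2500
```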

Source code in agrigee_lite/get/sits.py
def download_multiple_sits_chunks_gcs(
    gdf: gpd.GeoDataFrame,
    satellite: AbstractSatellite,
    bucket_name: str,
    reducers: set[str] | None = None,
    original_index_column_name: str = "original_index",
    start_date_column_name: str = "start_date",
    end_date_column_name: str = "end_date",
    subsampling_max_pixels: float = 1_000,
    cluster_size: int = 500,
    force_redownload: bool = False,
    wait: bool = True,
) -> None | pd.DataFrame:
    """
    Download satellite time series using Google Earth Engine tasks to Google Cloud Storage.

    Parameters
    ----------
    gdf : gpd.GeoDataFrame
        GeoDataFrame containing geometries and temporal information.
    satellite : AbstractSatellite
        Satellite configuration object.
    bucket_name : str
        Google Cloud Storage bucket name for exports.
    reducers : set[str] or None, optional
        Set of reducer names to apply, by default None.
    original_index_column_name : str, optional
        Name of the column to store original indices, by default "original_index".
    start_date_column_name : str, optional
        Name of the start date column, by default "start_date".
    end_date_column_name : str, optional
        Name of the end date column, by default "end_date".
    subsampling_max_pixels : float, optional
        Maximum pixels for sampling: >1 = absolute count, ≤1 = fraction of area (e.g., 0.5 = 50% sampling), by default 1_000.
    cluster_size : int, optional
        Maximum cluster size for spatial grouping, by default 500.
    force_redownload : bool, optional
        Whether to force re-download of existing data, by default False.
    wait : bool, optional
        Whether to wait for task completion, by default True.

    Returns
    -------
    None or pd.DataFrame
        If wait is True, returns DataFrame with combined results.
        If wait is False, returns None.
    """
    from smart_open import open  # noqa: A004

    if len(gdf) == 0:
        logging.warning("Empty GeoDataFrame, nothing to download")
        return None

    def download_multiple_sits_task_gcs(
        gdf: gpd.GeoDataFrame,
        satellite: AbstractSatellite,
        bucket_name: str,
        file_path: str,
        reducers: set[str] | None = None,
        original_index_column_name: str = "original_index",
        start_date_column_name: str = "start_date",
        end_date_column_name: str = "end_date",
        subsampling_max_pixels: float = 1_000,
        taskname: str = "",
    ) -> ee.batch.Task:
        """
        Create a Google Earth Engine export task to Google Cloud Storage.

        Parameters
        ----------
        gdf : gpd.GeoDataFrame
            GeoDataFrame containing geometries and temporal information.
        satellite : AbstractSatellite
            Satellite configuration object.
        bucket_name : str
            Google Cloud Storage bucket name.
        file_path : str
            File path within the bucket for the exported file.
        reducers : set[str] or None, optional
            Set of reducer names to apply, by default None.
        original_index_column_name : str, optional
            Name of the column to store original indices, by default "original_index".
        start_date_column_name : str, optional
            Name of the start date column, by default "start_date".
        end_date_column_name : str, optional
            Name of the end date column, by default "end_date".
        subsampling_max_pixels : float, optional
            Maximum pixels for sampling: >1 = absolute count, ≤1 = fraction of area (e.g., 0.5 = 50% sampling), by default 1_000.
        taskname : str, optional
            Custom task name, by default "".

        Returns
        -------
        ee.batch.Task
            Earth Engine export task object.
        """
        if taskname == "":
            taskname = file_path

        ee_expression = build_ee_expression(
            gdf,
            satellite,
            reducers,
            subsampling_max_pixels,
            original_index_column_name,
            start_date_column_name,
            end_date_column_name,
        )

        task = ee.batch.Export.table.toCloudStorage(
            bucket=bucket_name,
            collection=ee_expression,
            description=taskname,
            fileFormat="CSV",
            fileNamePrefix=file_path,
            selectors=build_selectors(satellite, reducers),
        )

        return task

    gdf = sanitize_and_prepare_input_gdf(
        gdf, satellite, original_index_column_name, cluster_size, start_date_column_name, end_date_column_name
    )

    task_mgr = GEETaskManager()
    tasks_df = ee_get_tasks_status()
    tasks_df = tasks_df[tasks_df.description.str.startswith("agl_")].reset_index(drop=True)
    completed_or_running_tasks = set(
        tasks_df.description.apply(lambda x: x.split("_", 1)[0] + "_" + x.split("_", 2)[2]).tolist()
    )  # The task is the same, no matter who started it

    username = getpass.getuser().replace("_", "")
    hashname = create_gdf_hash(gdf, start_date_column_name, end_date_column_name)

    gcs_save_folder = f"agl/{satellite.shortName}_{hashname}"
    metadata_dict: dict[str, str] = {}
    metadata_dict |= log_dict_function_call_summary(["gdf", "satellite"])
    metadata_dict |= satellite.log_dict()
    metadata_dict["user"] = username
    metadata_dict["creation_date"] = pd.Timestamp.now().strftime("%Y-%m-%d %H:%M:%S")

    with open(f"gs://{bucket_name}/{gcs_save_folder}/metadata.json", "w") as f:
        json.dump(metadata_dict, f, indent=4)

    with open(f"gs://{bucket_name}/{gcs_save_folder}/geodataframe.parquet", "wb") as f:
        gdf.to_parquet(f, compression="brotli")

    file_uris = []

    for cluster_id in tqdm(sorted(gdf.cluster_id.unique())):
        cluster_id = int(cluster_id)

        if (force_redownload) or (
            f"agl_multiple_sits_{satellite.shortName}_{hashname}_{cluster_id}" not in completed_or_running_tasks
        ):
            # TODO: Also skip if the file already exists in GCS
            task = download_multiple_sits_task_gcs(
                gdf[gdf.cluster_id == cluster_id],
                satellite,
                bucket_name=bucket_name,
                file_path=f"{gcs_save_folder}/{cluster_id}",
                reducers=reducers,
                original_index_column_name=original_index_column_name,
                start_date_column_name=start_date_column_name,
                end_date_column_name=end_date_column_name,
                subsampling_max_pixels=subsampling_max_pixels,
                taskname=f"agl_{username}_multiple_sits_{satellite.shortName}_{hashname}_{cluster_id}",
            )

            task_mgr.add(task)

        file_uris.append(f"gs://{bucket_name}/{gcs_save_folder}/{cluster_id}.csv")

    task_mgr.start()

    if wait:
        task_mgr.wait()

        df = pd.DataFrame()
        for file_uri in file_uris:
            with open(file_uri, "r") as f:
                sub_df = pd.read_csv(f)
                df = pd.concat([df, sub_df], ignore_index=True)

        df = prepare_output_df(df, satellite, original_index_column_name)
        return df
    else:
        return None
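
The skip logic in the listing above relies on two naming conventions: task descriptions of the form `agl_{username}_multiple_sits_...`, and one CSV per cluster under `agl/{shortName}_{hashname}/`. A minimal, library-free sketch of both follows. The helper names `normalize_task_name` and `chunk_uri`, and the sample values (`alice`, `s2sr`, `ab12cd`, `my-bucket`), are made up for illustration.

```python
def normalize_task_name(description: str) -> str:
    """Drop the username segment: 'agl_<user>_<rest>' -> 'agl_<rest>'."""
    prefix, _user, rest = description.split("_", 2)
    return f"{prefix}_{rest}"


def chunk_uri(bucket_name: str, short_name: str, hashname: str, cluster_id: int) -> str:
    """CSV location of one exported cluster, following the f-strings in the listing."""
    return f"gs://{bucket_name}/agl/{short_name}_{hashname}/{cluster_id}.csv"


# A task started by another user still counts as "already exported".
existing = {normalize_task_name("agl_alice_multiple_sits_s2sr_ab12cd_0")}
candidate = "agl_multiple_sits_s2sr_ab12cd_0"
needs_export = candidate not in existing
```

Because underscores are stripped from the username before the task name is built, splitting on the first two underscores always isolates the username segment.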

multiple_sits_gdrive(gdf, satellite, reducers=None, original_index_column_name='original_index', start_date_column_name='start_date', end_date_column_name='end_date', subsampling_max_pixels=1000, cluster_size=500, gee_save_folder='AGL_EXPORTS', force_redownload=False, wait=True)

Download satellite time series using Google Earth Engine tasks to Google Drive.

Parameters:

Name Type Description Default
gdf GeoDataFrame

GeoDataFrame containing geometries and temporal information.

required
satellite AbstractSatellite

Satellite configuration object.

required
reducers set[str] or None

Set of reducer names to apply, by default None.

None
original_index_column_name str

Name of the column to store original indices, by default "original_index".

'original_index'
start_date_column_name str

Name of the start date column, by default "start_date".

'start_date'
end_date_column_name str

Name of the end date column, by default "end_date".

'end_date'
subsampling_max_pixels float

Maximum pixels for sampling: >1 = absolute count, ≤1 = fraction of area (e.g., 0.5 = 50% sampling), by default 1_000.

1000
cluster_size int

Maximum cluster size for spatial grouping, by default 500.

500
gee_save_folder str

Google Drive folder name for saving exports, by default "AGL_EXPORTS".

'AGL_EXPORTS'
force_redownload bool

Whether to force re-download of existing data, by default False.

False
wait bool

Whether to wait for task completion, by default True.

True
Source code in agrigee_lite/get/sits.py
def download_multiple_sits_chunks_gdrive(
    gdf: gpd.GeoDataFrame,
    satellite: AbstractSatellite,
    reducers: set[str] | None = None,
    original_index_column_name: str = "original_index",
    start_date_column_name: str = "start_date",
    end_date_column_name: str = "end_date",
    subsampling_max_pixels: float = 1_000,
    cluster_size: int = 500,
    gee_save_folder: str = "AGL_EXPORTS",
    force_redownload: bool = False,
    wait: bool = True,
) -> None:
    """
    Download satellite time series using Google Earth Engine tasks to Google Drive.

    Parameters
    ----------
    gdf : gpd.GeoDataFrame
        GeoDataFrame containing geometries and temporal information.
    satellite : AbstractSatellite
        Satellite configuration object.
    reducers : set[str] or None, optional
        Set of reducer names to apply, by default None.
    original_index_column_name : str, optional
        Name of the column to store original indices, by default "original_index".
    start_date_column_name : str, optional
        Name of the start date column, by default "start_date".
    end_date_column_name : str, optional
        Name of the end date column, by default "end_date".
    subsampling_max_pixels : float, optional
        Maximum pixels for sampling: >1 = absolute count, ≤1 = fraction of area (e.g., 0.5 = 50% sampling), by default 1_000.
    cluster_size : int, optional
        Maximum cluster size for spatial grouping, by default 500.
    gee_save_folder : str, optional
        Google Drive folder name for saving exports, by default "AGL_EXPORTS".
    force_redownload : bool, optional
        Whether to force re-download of existing data, by default False.
    wait : bool, optional
        Whether to wait for task completion, by default True.
    """
    if len(gdf) == 0:
        return None

    def download_multiple_sits_task_gdrive(
        gdf: gpd.GeoDataFrame,
        satellite: AbstractSatellite,
        file_stem: str,
        reducers: set[str] | None = None,
        original_index_column_name: str = "original_index",
        start_date_column_name: str = "start_date",
        end_date_column_name: str = "end_date",
        subsampling_max_pixels: float = 1_000,
        taskname: str = "",
        gee_save_folder: str = "AGL_EXPORTS",
    ) -> ee.batch.Task:
        """
        Create a Google Earth Engine export task to Google Drive.

        Parameters
        ----------
        gdf : gpd.GeoDataFrame
            GeoDataFrame containing geometries and temporal information.
        satellite : AbstractSatellite
            Satellite configuration object.
        file_stem : str
            Base filename for the exported file.
        reducers : set[str] or None, optional
            Set of reducer names to apply, by default None.
        original_index_column_name : str, optional
            Name of the column to store original indices, by default "original_index".
        start_date_column_name : str, optional
            Name of the start date column, by default "start_date".
        end_date_column_name : str, optional
            Name of the end date column, by default "end_date".
        subsampling_max_pixels : float, optional
            Maximum pixels for sampling: >1 = absolute count, ≤1 = fraction of area (e.g., 0.5 = 50% sampling), by default 1_000.
        taskname : str, optional
            Custom task name, by default "".
        gee_save_folder : str, optional
            Google Drive folder name, by default "AGL_EXPORTS".

        Returns
        -------
        ee.batch.Task
            Earth Engine export task object.
        """
        if taskname == "":
            taskname = file_stem

        ee_expression = build_ee_expression(
            gdf,
            satellite,
            reducers,
            subsampling_max_pixels,
            original_index_column_name,
            start_date_column_name,
            end_date_column_name,
        )

        task = ee.batch.Export.table.toDrive(
            collection=ee_expression,
            description=taskname,
            fileFormat="CSV",
            fileNamePrefix=file_stem,
            folder=gee_save_folder,
            selectors=build_selectors(satellite, reducers),
        )

        return task

    gdf = sanitize_and_prepare_input_gdf(
        gdf, satellite, original_index_column_name, cluster_size, start_date_column_name, end_date_column_name
    )

    task_mgr = GEETaskManager()

    tasks_df = ee_get_tasks_status()
    tasks_df = tasks_df[tasks_df.description.str.startswith("agl_")].reset_index(drop=True)
    completed_or_running_tasks = set(
        tasks_df.description.apply(lambda x: x.split("_", 1)[0] + "_" + x.split("_", 2)[2]).tolist()
    )  # The task is the same, no matter who started it

    username = getpass.getuser().replace("_", "")
    hashname = create_gdf_hash(gdf, start_date_column_name, end_date_column_name)

    for cluster_id in tqdm(
        sorted(gdf.cluster_id.unique()), desc=f"Creating GEE tasks ({satellite.shortName}_{hashname}_{cluster_size})"
    ):
        cluster_id = int(cluster_id)

        if (force_redownload) or (
            f"agl_multiple_sits_{satellite.shortName}_{hashname}_{cluster_id}" not in completed_or_running_tasks
        ):
            task = download_multiple_sits_task_gdrive(
                gdf[gdf.cluster_id == cluster_id],
                satellite,
                f"{satellite.shortName}_{hashname}_{cluster_id}",
                reducers=reducers,
                original_index_column_name=original_index_column_name,
                start_date_column_name=start_date_column_name,
                end_date_column_name=end_date_column_name,
                subsampling_max_pixels=subsampling_max_pixels,
                # Named agl_{username}_multiple_sits_... so that dropping the username
                # yields exactly the key checked against completed_or_running_tasks.
                taskname=f"agl_{username}_multiple_sits_{satellite.shortName}_{hashname}_{cluster_id}",
                gee_save_folder=gee_save_folder,
            )

            task_mgr.add(task)

    task_mgr.start()  # Starting all tasks at once lets the user cancel before anything is submitted to GEE

    if wait:
        task_mgr.wait()

ANADEM

Bases: SingleImageSatellite

Satellite abstraction for ANADEM (Altimetric and Topographic Attributes of the Brazilian Territory).

ANADEM is a DEM-derived (Digital Elevation Model) product designed to support land analysis based on elevation, slope, and aspect characteristics across the Brazilian territory. It is particularly useful for ecological zoning, terrain classification, hydrological modeling, and environmental risk assessment.

Parameters:

Name Type Description Default
bands list of str

List of bands to select. Defaults to ['elevation', 'slope', 'aspect'].
  • 'elevation': Ground elevation in meters.
  • 'slope': Degree of inclination derived from elevation.
  • 'aspect': Direction of slope (0-360°), where 0 = North.

None
border_pixels_to_erode float

Number of pixels to erode from the geometry border to reduce edge artifacts.

1
min_area_to_keep_border int

Minimum area (in m²) required to retain geometry after border erosion.

50_000
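
In the `compute()` method of the ANADEM source, `border_pixels_to_erode` is converted to a distance in meters with `round(borderPixelsToErode * pixelSize)` before being passed to `ee_safe_remove_borders`, and a value of 0 skips erosion entirely. The sketch below reproduces only that conversion. `erosion_distance_m` is a hypothetical name, and how `ee_safe_remove_borders` applies `min_area_to_keep_border` is not shown on this page.

```python
from typing import Optional


def erosion_distance_m(border_pixels_to_erode: float, pixel_size_m: int = 30) -> Optional[int]:
    """Return the inward buffer in meters, or None when erosion is disabled."""
    if border_pixels_to_erode == 0:
        return None
    # Same rounding as the source: a whole number of meters.
    return round(border_pixels_to_erode * pixel_size_m)
```

With ANADEM's 30 m pixels, the default of 1 pixel corresponds to a 30 m inward buffer.
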
Satellite Information

+--------------+-----------------------+
| Field        | Value                 |
+--------------+-----------------------+
| Name         | ANADEM                |
| Resolution   | 30 meters             |
| Source       | UFRGS, ANA            |
| Coverage     | Brazil                |
| Derived From | SRTM + auxiliary DEMs |
+--------------+-----------------------+

Band Information

+-----------+----------------------+--------------------------+
| Band Name | Description          | Unit / Range             |
+-----------+----------------------+--------------------------+
| elevation | Ground elevation     | meters above sea level   |
| slope     | Terrain slope        | degrees (0°-90°)         |
| aspect    | Orientation of slope | degrees (0°-360° from N) |
+-----------+----------------------+--------------------------+

Notes
  • The slope and aspect bands are computed from the elevation layer using the ee.Terrain.products() utility.
  • The compute() method calculates:
    • Mean elevation over the region.
    • Share of valid pixels in each slope class, as a fraction between 0 and 1:
      • Flat (0-3°), Gentle (3-8°), Undulating (8-20°), Strong (20-45°), Mountainous (45-75°), and Steep (>75°).
    • Share of valid pixels in each aspect class, as a fraction between 0 and 1:
      • North, NE, East, SE, South, SW, West, NW.
  • These statistics are returned as a FeatureCollection with a single feature containing the computed values.
  • Reference paper: https://www.mdpi.com/2072-4292/16/13/2321
  • Data source: https://hge-iph.github.io/anadem/
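
The slope breakdown described in the notes can be mimicked on a plain list of slope values. This is a pure-Python sketch of the binning only, not the Earth Engine implementation. `slope_class` and `slope_class_fractions` are hypothetical names. The bin edges follow the comparisons in the class source, where each class count is divided by the total count, so the results are fractions between 0 and 1.

```python
from collections import Counter

# Half-open bins [low, high) in degrees, mirroring the gte/lt comparisons in the source.
HALF_OPEN_BINS = [("flat", 0, 3), ("gentle", 3, 8), ("undulating", 8, 20), ("strong", 20, 45)]
CLASS_NAMES = ["flat", "gentle", "undulating", "strong", "mountainous", "steep"]


def slope_class(slope_deg: float) -> str:
    if not 0 <= slope_deg <= 90:
        raise ValueError("slope must be within 0-90 degrees")
    for name, low, high in HALF_OPEN_BINS:
        if low <= slope_deg < high:
            return name
    # Exactly 75 degrees stays in "mountainous"; only values above 75 are "steep".
    return "mountainous" if slope_deg <= 75 else "steep"


def slope_class_fractions(slopes_deg):
    """Fraction (0-1) of samples per class; empty input yields an empty dict."""
    if not slopes_deg:
        return {}
    counts = Counter(slope_class(s) for s in slopes_deg)
    return {name: counts[name] / len(slopes_deg) for name in CLASS_NAMES}


fractions = slope_class_fractions([1, 2, 2.9, 5, 10, 30, 75, 80])
```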
Source code in agrigee_lite/sat/dem.py
class ANADEM(SingleImageSatellite):
    """
    Satellite abstraction for ANADEM (Altimetric and Topographic Attributes of the Brazilian Territory).

    ANADEM is a DEM-derived (Digital Elevation Model) product designed to support land analysis
    based on elevation, slope, and aspect characteristics across the Brazilian territory.
    It is particularly useful for ecological zoning, terrain classification, hydrological modeling,
    and environmental risk assessment.

    Parameters
    ----------
    bands : list of str, optional
        List of bands to select. Defaults to ['elevation', 'slope', 'aspect'].
        - 'elevation': Ground elevation in meters.
        - 'slope': Degree of inclination derived from elevation.
        - 'aspect': Direction of slope (0-360°), where 0 = North.
    border_pixels_to_erode : float, default=1
        Number of pixels to erode from the geometry border to reduce edge artifacts.
    min_area_to_keep_border : int, default=50_000
        Minimum area (in m²) required to retain geometry after border erosion.

    Satellite Information
    ---------------------
    +------------------------------------+-----------------------------+
    | Field                              | Value                       |
    +------------------------------------+-----------------------------+
    | Name                               | ANADEM                      |
    | Resolution                         | 30 meters                   |
    | Source                             | UFRGS, ANA                  |
    | Coverage                           | Brazil                      |
    | Derived From                       | SRTM + auxiliary DEMs       |
    +------------------------------------+-----------------------------+

    Band Information
    ----------------
    +-----------+----------------------------+---------------------------+
    | Band Name | Description                | Unit / Range              |
    +-----------+----------------------------+---------------------------+
    | elevation | Ground elevation           | meters above sea level    |
    | slope     | Terrain slope              | degrees (0°-90°)          |
    | aspect    | Orientation of slope       | degrees (0°-360° from N)  |
    +-----------+----------------------------+---------------------------+

    Notes
    -----
    - The slope and aspect bands are computed from the elevation layer using the
    `ee.Terrain.products()` utility.
    - The `compute()` method calculates:
        - Mean elevation over the region.
        - Share of valid pixels in each slope class, as a fraction between 0 and 1:
            - Flat (0-3°), Gentle (3-8°), Undulating (8-20°),
            Strong (20-45°), Mountainous (45-75°), and Steep (>75°).
        - Share of valid pixels in each aspect class, as a fraction between 0 and 1:
            - North, NE, East, SE, South, SW, West, NW.
    - These statistics are returned as a `FeatureCollection` with a single feature
    containing the computed values.
    - Reference paper: https://www.mdpi.com/2072-4292/16/13/2321
    - Data source: https://hge-iph.github.io/anadem/
    """

    def __init__(
        self,
        bands: list[str] | None = None,
        border_pixels_to_erode: float = 1,
        min_area_to_keep_border: int = 50_000,
    ):
        if bands is None:
            bands = ["elevation", "slope", "aspect"]

        super().__init__()

        self.imageName: str = "projects/et-brasil/assets/anadem/v1"
        self.pixelSize: int = 30
        self.shortName: str = "anadem"

        self.selectedBands: list[tuple[str, str]] = [(band, band) for band in bands]

        self.startDate = "1900-01-01"
        self.endDate = "2050-01-01"
        self.minAreaToKeepBorder = min_area_to_keep_border
        self.borderPixelsToErode = border_pixels_to_erode

        self.toDownloadSelectors = self._build_to_download_selectors()

    def _build_to_download_selectors(self) -> list[str]:
        selectors = []

        band_aliases = [alias for _, alias in self.selectedBands]

        if "elevation" in band_aliases:
            selectors += ["40_elevation_mean"]

        if "slope" in band_aliases:
            selectors += [
                "41_slope_flat",
                "42_slope_gentle",
                "43_slope_undulating",
                "44_slope_strong",
                "45_slope_mountainous",
                "46_slope_steep",
            ]

        if "aspect" in band_aliases:
            selectors += [
                "47_cardinal_n",
                "48_cardinal_ne",
                "49_cardinal_e",
                "50_cardinal_se",
                "51_cardinal_s",
                "52_cardinal_sw",
                "53_cardinal_w",
                "54_cardinal_nw",
            ]

        return selectors

    def image(self, ee_feature: ee.Feature) -> ee.Image:
        image = ee.Image(self.imageName).updateMask(ee.Image(self.imageName).neq(-9999))

        requested_bands = [b for b, _ in self.selectedBands]

        if any(b in requested_bands for b in ["slope", "aspect"]):
            terrain = ee.Terrain.products(image)
            image = image.addBands(terrain.select(["slope", "aspect"]))

        selected_band_names = [b for b, _ in self.selectedBands]
        renamed_band_names = [alias for _, alias in self.selectedBands]

        return image.select(selected_band_names, renamed_band_names)

    def compute(
        self,
        ee_feature: ee.Feature,
        subsampling_max_pixels: float,
        reducers: set[str] | None = None,
    ) -> ee.FeatureCollection:
        ee_geometry = ee_feature.geometry()

        if self.borderPixelsToErode != 0:
            ee_geometry = ee_safe_remove_borders(
                ee_geometry, round(self.borderPixelsToErode * self.pixelSize), self.minAreaToKeepBorder
            )
            ee_feature = ee_feature.setGeometry(ee_geometry)

        ee_img = self.image(ee_feature)
        ee_img = ee_map_valid_pixels(ee_img, ee_geometry, self.pixelSize)

        selected_band_names = [alias for _, alias in self.selectedBands]

        stats_dict = {
            "00_indexnum": ee_feature.get("0"),
        }

        # --- Elevation mean ---
        if "elevation" in selected_band_names:
            elevation_mean = (
                ee_img.select("elevation")
                .reduceRegion(
                    reducer=ee.Reducer.mean(),
                    geometry=ee_geometry,
                    scale=self.pixelSize,
                    maxPixels=subsampling_max_pixels,
                    bestEffort=True,
                )
                .get("elevation")
            )
            stats_dict["40_elevation_mean"] = elevation_mean

        # --- Slope class breakdown ---
        if "slope" in selected_band_names:
            slope = ee_img.select("slope")

            slope_classes = {
                "41_slope_flat": slope.gte(0).And(slope.lt(3)),
                "42_slope_gentle": slope.gte(3).And(slope.lt(8)),
                "43_slope_undulating": slope.gte(8).And(slope.lt(20)),
                "44_slope_strong": slope.gte(20).And(slope.lt(45)),
                "45_slope_mountainous": slope.gte(45).And(slope.lte(75)),
                "46_slope_steep": slope.gt(75),
            }

            valid_mask = ee_img.select("slope").mask()
            total_pixels = (
                ee.Image(1)
                .updateMask(valid_mask)
                .reduceRegion(
                    reducer=ee.Reducer.count(),
                    geometry=ee_geometry,
                    scale=self.pixelSize,
                    maxPixels=subsampling_max_pixels,
                    bestEffort=True,
                )
                .getNumber("constant")
            )

            for class_name, mask in slope_classes.items():
                count = (
                    ee.Image(1)
                    .updateMask(mask)
                    .reduceRegion(
                        reducer=ee.Reducer.count(),
                        geometry=ee_geometry,
                        scale=self.pixelSize,
                        maxPixels=subsampling_max_pixels,
                        bestEffort=True,
                    )
                    .getNumber("constant")
                )
                percent = count.divide(total_pixels)
                stats_dict[class_name] = percent

        # --- Aspect class breakdown ---
        if "aspect" in selected_band_names:
            aspect = ee_img.select("aspect")

            aspect_classes = {
                "47_cardinal_n": aspect.gte(337.5).Or(aspect.lt(22.5)),
                "48_cardinal_ne": aspect.gte(22.5).And(aspect.lt(67.5)),
                "49_cardinal_e": aspect.gte(67.5).And(aspect.lt(112.5)),
                "50_cardinal_se": aspect.gte(112.5).And(aspect.lt(157.5)),
                "51_cardinal_s": aspect.gte(157.5).And(aspect.lt(202.5)),
                "52_cardinal_sw": aspect.gte(202.5).And(aspect.lt(247.5)),
                "53_cardinal_w": aspect.gte(247.5).And(aspect.lt(292.5)),
                "54_cardinal_nw": aspect.gte(292.5).And(aspect.lt(337.5)),
            }

            valid_aspect = aspect.mask()
            total_aspect_pixels = (
                ee.Image(1)
                .updateMask(valid_aspect)
                .reduceRegion(
                    reducer=ee.Reducer.count(),
                    geometry=ee_geometry,
                    scale=self.pixelSize,
                    maxPixels=subsampling_max_pixels,
                    bestEffort=True,
                )
                .getNumber("constant")
            )

            for class_name, mask in aspect_classes.items():
                count = (
                    ee.Image(1)
                    .updateMask(mask)
                    .reduceRegion(
                        reducer=ee.Reducer.count(),
                        geometry=ee_geometry,
                        scale=self.pixelSize,
                        maxPixels=subsampling_max_pixels,
                        bestEffort=True,
                    )
                    .getNumber("constant")
                )
                percent = count.divide(total_aspect_pixels)
                stats_dict[class_name] = percent

        # --- ValidPixelCount ---
        valid_pixel_count = (
            ee_img.select(selected_band_names[0])
            .mask()
            .reduceRegion(
                reducer=ee.Reducer.count(),
                geometry=ee_geometry,
                scale=self.pixelSize,
                maxPixels=subsampling_max_pixels,
                bestEffort=True,
            )
            .getNumber(selected_band_names[0])
        )
        stats_dict["99_validPixelsCount"] = valid_pixel_count

        stats_feature = ee.Feature(None, stats_dict)
        return ee.FeatureCollection([stats_feature])
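
The eight aspect ranges used in `compute()` are 45° sectors centered on the compass directions, with North wrapping around 0°. A compact way to express the same binning outside Earth Engine is sketched below. `aspect_sector` is a hypothetical helper, and the class itself uses explicit range comparisons instead.

```python
SECTORS = ["n", "ne", "e", "se", "s", "sw", "w", "nw"]


def aspect_sector(aspect_deg: float) -> str:
    """Map an aspect in degrees (0 = North, clockwise) to one of eight sectors."""
    # Shift by half a sector (22.5 degrees) so that North, which straddles 0, becomes bin 0.
    index = int(((aspect_deg + 22.5) % 360) // 45)
    return SECTORS[index]
```

The boundaries match the source: 22.5° falls in NE, 202.5° in SW, and 337.5° is already North.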

Landsat5

Bases: AbstractLandsat

Satellite abstraction for Landsat 5 (TM sensor, Collection 2).

Landsat 5 was launched in 1984 and provided more than 29 years of Earth observation data. This class supports both TOA and SR products, with optional cloud masking using the QA_PIXEL band.

Parameters:

Name Type Description Default
bands set of str

Set of bands to select. Defaults to ['blue', 'green', 'red', 'nir', 'swir1', 'swir2'].

None
indices set of str

Spectral indices to compute from the selected bands.

None
use_sr bool

Whether to use surface reflectance products ('SR_B*' bands). If False, uses top-of-atmosphere reflectance ('B*' bands).

True
tier int

Landsat collection tier to use (1 or 2). Tier 1 has highest geometric accuracy.

1
use_cloud_mask bool

Whether to apply QA_PIXEL-based cloud masking. If False, no cloud mask is applied.

True
min_valid_pixel_count int

Minimum number of valid (non-cloud) pixels required to retain an image.

12
toa_cloud_filter_strength int

Strength of the additional cloud filter applied to TOA imagery (if use_sr=False). Used in the remove_l_toa_tough_clouds step.

15
border_pixels_to_erode float

Number of pixels to erode from the geometry border.

1
min_area_to_keep_border int

Minimum area (in m²) required to retain geometry after border erosion.

50_000
Cloud Masking

Cloud masking is based on the QA_PIXEL band, using bit flags defined by USGS:
  • Applied to both TOA and SR products when use_cloud_mask=True.
  • For TOA collections, an additional filter (remove_l_toa_tough_clouds) is applied to remove low-quality observations based on a simple cloud scoring method.
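
To make the bit-flag idea concrete, here is a minimal sketch of testing QA_PIXEL values in plain Python. The bit positions are those documented by USGS for Landsat Collection 2. Which flags this class actually treats as invalid is not shown on this page, so the `BAD_FLAGS` selection and the `is_clear` helper are assumptions for illustration.

```python
# Bit positions in the Landsat Collection 2 QA_PIXEL band (USGS).
QA_FILL = 1 << 0
QA_DILATED_CLOUD = 1 << 1
QA_CLOUD = 1 << 3
QA_CLOUD_SHADOW = 1 << 4
QA_CLEAR = 1 << 6

# Assumption: the set of flags that disqualify a pixel is chosen here for illustration.
BAD_FLAGS = QA_FILL | QA_DILATED_CLOUD | QA_CLOUD | QA_CLOUD_SHADOW


def is_clear(qa_pixel: int) -> bool:
    """True when none of the disqualifying bits is set."""
    return (qa_pixel & BAD_FLAGS) == 0
```

In Earth Engine the same test is typically written with `bitwiseAnd` on the QA_PIXEL band and applied through `updateMask`.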

Satellite Information

+---------------------+----------------------+
| Field               | Value                |
+---------------------+----------------------+
| Name                | Landsat 5 TM         |
| Sensor              | TM (Thematic Mapper) |
| Platform            | Landsat 5            |
| Temporal Resolution | 16 days              |
| Pixel Size          | 30 meters            |
| Coverage            | Global               |
+---------------------+----------------------+

Collection Dates

+---------+------------+------------+
| Product | Start Date | End Date   |
+---------+------------+------------+
| TOA     | 1984-03-01 | 2013-05-05 |
| SR      | 1984-03-01 | 2012-05-05 |
+---------+------------+------------+

Band Information

+-----------+----------+---------+---------------------+
| Band Name | TOA Name | SR Name | Spectral Wavelength |
+-----------+----------+---------+---------------------+
| blue      | B1       | SR_B1   | 450-520 nm          |
| green     | B2       | SR_B2   | 520-600 nm          |
| red       | B3       | SR_B3   | 630-690 nm          |
| nir       | B4       | SR_B4   | 770-900 nm          |
| swir1     | B5       | SR_B5   | 1550-1750 nm        |
| swir2     | B7       | SR_B7   | 2090-2350 nm        |
+-----------+----------+---------+---------------------+
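
The band table translates directly into a lookup from friendly band names to collection band names. The following sketch assumes nothing beyond the table itself. `landsat5_band_names` is a hypothetical helper, not a function of the library.

```python
# Band number per friendly name, taken from the table (note that swir2 is band 7).
BAND_NUMBER = {"blue": 1, "green": 2, "red": 3, "nir": 4, "swir1": 5, "swir2": 7}


def landsat5_band_names(bands, use_sr=True):
    """Map friendly names to 'SR_B*' (surface reflectance) or 'B*' (TOA) names."""
    prefix = "SR_B" if use_sr else "B"
    return {band: f"{prefix}{BAND_NUMBER[band]}" for band in bands}
```

For example, `landsat5_band_names(["red", "nir"])` gives the two bands needed for an NDVI computed on surface reflectance.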

Notes
  • Landsat 5 TOA Collection (Tier 1): https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LT05_C02_T1_TOA

  • Landsat 5 TOA Collection (Tier 2): https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LT05_C02_T2_TOA

  • Landsat 5 SR Collection (Tier 1): https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LT05_C02_T1_L2

  • Landsat 5 SR Collection (Tier 2): https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LT05_C02_T2_L2

  • Cloud mask reference (QA_PIXEL flags): https://www.usgs.gov/media/files/landsat-collection-2-pixel-quality-assessment

  • TOA cloud filtering (Simple Cloud Score): https://developers.google.com/earth-engine/guides/landsat?hl=pt-br#simple-cloud-score
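The four catalog links above follow a regular pattern: sensor code, Collection 2, tier, then product level (`TOA` or `L2`). A small sketch of how an Earth Engine asset ID could be assembled from the constructor arguments is shown below. The function name is hypothetical, and how `AbstractLandsat` builds the ID internally is not shown in this section.

```python
def landsat_collection_id(sensor_code: str, tier: int = 1, use_sr: bool = True) -> str:
    """Build a Collection 2 asset ID such as 'LANDSAT/LT05/C02/T1_L2'."""
    if tier not in (1, 2):
        raise ValueError("tier must be 1 or 2")
    # Surface reflectance lives in the Level-2 ("L2") collections.
    product = "L2" if use_sr else "TOA"
    return f"LANDSAT/{sensor_code}/C02/T{tier}_{product}"


# Example: the Landsat 5 sensor code used by this class is "LT05".
print(landsat_collection_id("LT05", tier=1, use_sr=True))
print(landsat_collection_id("LT05", tier=2, use_sr=False))
```

The catalog URLs use underscores where the asset IDs use slashes, so `LANDSAT_LT05_C02_T1_TOA` in a link corresponds to `LANDSAT/LT05/C02/T1_TOA` in code.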

Source code in agrigee_lite/sat/landsat.py
class Landsat5(AbstractLandsat):
    """
    Satellite abstraction for Landsat 5 (TM sensor, Collection 2).

    Landsat 5 was launched in 1984 and provided more than 29 years of Earth observation data.
    This class supports both TOA and SR products, with optional cloud masking using the QA_PIXEL band.

    Parameters
    ----------
    bands : set of str, optional
        Set of bands to select. Defaults to ['blue', 'green', 'red', 'nir', 'swir1', 'swir2'].
    indices : set of str, optional
        Spectral indices to compute from the selected bands.
    use_sr : bool, default=True
        Whether to use surface reflectance products ('SR_B*' bands).
        If False, uses top-of-atmosphere reflectance ('B*' bands).
    tier : int, default=1
        Landsat collection tier to use (1 or 2). Tier 1 has the highest geometric accuracy.
    use_cloud_mask : bool, default=True
        Whether to apply QA_PIXEL-based cloud masking. If False, no cloud mask is applied.
    min_valid_pixel_count : int, default=12
        Minimum number of valid (non-cloud) pixels required to retain an image.
    toa_cloud_filter_strength : int, default=15
        Strength of the additional cloud filter applied to TOA imagery (if use_sr=False).
        Used in the `remove_l_toa_tough_clouds` step.
    border_pixels_to_erode : float, default=1
        Number of pixels to erode from the geometry border.
    min_area_to_keep_border : int, default=50_000
        Minimum area (in m²) required to retain geometry after border erosion.

    Cloud Masking
    -------------
    Cloud masking is based on the QA_PIXEL band, using bit flags defined by USGS:
    - Applied to both TOA and SR products when `use_cloud_mask=True`
    - For TOA collections, an additional filter (`remove_l_toa_tough_clouds`) is applied
    to remove low-quality observations based on a simple cloud scoring method.

    Satellite Information
    ---------------------
    +----------------------------+------------------------+
    | Field                      | Value                  |
    +----------------------------+------------------------+
    | Name                       | Landsat 5 TM           |
    | Sensor                     | TM (Thematic Mapper)   |
    | Platform                   | Landsat 5              |
    | Temporal Resolution        | 16 days                |
    | Pixel Size                 | 30 meters              |
    | Coverage                   | Global                 |
    +----------------------------+------------------------+

    Collection Dates
    ----------------
    +-------------+------------+------------+
    | Product     | Start Date | End Date   |
    +-------------+------------+------------+
    | TOA         | 1984-03-01 | 2013-05-05 |
    | SR          | 1984-03-01 | 2012-05-05 |
    +-------------+------------+------------+

    Band Information
    ----------------
    +-----------+----------+-----------+------------------------+
    | Band Name | TOA Name | SR Name   | Spectral Wavelength    |
    +-----------+----------+-----------+------------------------+
    | blue      | B1       | SR_B1     | 450-520 nm             |
    | green     | B2       | SR_B2     | 520-600 nm             |
    | red       | B3       | SR_B3     | 630-690 nm             |
    | nir       | B4       | SR_B4     | 770-900 nm             |
    | swir1     | B5       | SR_B5     | 1550-1750 nm           |
    | swir2     | B7       | SR_B7     | 2090-2350 nm           |
    +-----------+----------+-----------+------------------------+

    Notes
    -----
    - Landsat 5 TOA Collection (Tier 1):
        https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LT05_C02_T1_TOA

    - Landsat 5 TOA Collection (Tier 2):
        https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LT05_C02_T2_TOA

    - Landsat 5 SR Collection (Tier 1):
        https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LT05_C02_T1_L2

    - Landsat 5 SR Collection (Tier 2):
        https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LT05_C02_T2_L2

    - Cloud mask reference (QA_PIXEL flags):
        https://www.usgs.gov/media/files/landsat-collection-2-pixel-quality-assessment

    - TOA cloud filtering (Simple Cloud Score):
        https://developers.google.com/earth-engine/guides/landsat?hl=pt-br#simple-cloud-score
    """

    def __init__(
        self,
        bands: set[str] | None = None,
        indices: set[str] | None = None,
        use_sr: bool = True,
        tier: int = 1,
        use_cloud_mask: bool = True,
        min_valid_pixel_count: int = 12,
        toa_cloud_filter_strength: int = 15,
        border_pixels_to_erode: float = 1,
        min_area_to_keep_border: int = 50_000,
    ):
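        # Map friendly band names to TOA band IDs ("B*").
        toa = {"blue": "B1", "green": "B2", "red": "B3", "nir": "B4", "swir1": "B5", "swir2": "B7"}
        # Map friendly band names to surface reflectance band IDs ("SR_B*").
        sr = {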
            "blue": "SR_B1",
            "green": "SR_B2",
            "red": "SR_B3",
            "nir": "SR_B4",
            "swir1": "SR_B5",
            "swir2": "SR_B7",
        }
        super().__init__(
            indices=indices,
            sensor_code="LT05",
            toa_band_map=toa,
            sr_band_map=sr,
            short_base="l5",
            start_date="1984-03-01",
            end_date="2013-05-05",
            bands=bands,
            use_sr=use_sr,
            tier=tier,
            use_cloud_mask=use_cloud_mask,
            min_valid_pixel_count=min_valid_pixel_count,
            toa_cloud_filter_strength=toa_cloud_filter_strength,
            border_pixels_to_erode=border_pixels_to_erode,
            min_area_to_keep_border=min_area_to_keep_border,
            use_pan_sharpening=False,  # Landsat 5 TM has no panchromatic band
        )

Landsat7

Bases: AbstractLandsat

Satellite abstraction for Landsat 7 (ETM+ sensor, Collection 2).

Landsat 7 was launched in 1999 and provided over two decades of data. This class supports both TOA and SR products, with optional cloud masking using the QA_PIXEL band.

Parameters:

Name Type Description Default
bands set of str

Set of bands to select. Defaults to ['blue', 'green', 'red', 'nir', 'swir1', 'swir2'].

None
indices set of str

Spectral indices to compute from the selected bands.

None
use_sr bool

Whether to use surface reflectance products ('SR_B*' bands). If False, uses top-of-atmosphere reflectance ('B*' bands).

True
tier int

Landsat collection tier to use (1 or 2). Tier 1 has the highest geometric accuracy.

1
use_cloud_mask bool

Whether to apply QA_PIXEL-based cloud masking. If False, no cloud mask is applied.

True
min_valid_pixel_count int

Minimum number of valid (non-cloud) pixels required to retain an image.

12
toa_cloud_filter_strength int

Strength of the additional cloud filter applied to TOA imagery (if use_sr=False). Used in the remove_l_toa_tough_clouds step.

15
border_pixels_to_erode float

Number of pixels to erode from the geometry border.

1
min_area_to_keep_border int

Minimum area (in m²) required to retain geometry after border erosion.

50_000
use_pan_sharpening bool

If True, applies pan sharpening to the RGB bands using the 15m-resolution panchromatic band (B8). Only applicable when use_sr=False. Raises ValueError if used with SR products.

False
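Because `use_sr` defaults to True, turning on `use_pan_sharpening` only works when `use_sr=False` is passed as well. The guard described above can be sketched as follows; the function name and error message are illustrative, not the library's.

```python
def check_pan_sharpening(use_sr: bool, use_pan_sharpening: bool) -> None:
    """Raise ValueError when pan sharpening is requested for SR products."""
    # The panchromatic band (B8) is only mapped for TOA imagery.
    if use_pan_sharpening and use_sr:
        raise ValueError("Pan sharpening requires TOA imagery; pass use_sr=False.")


# Valid combination: TOA imagery with pan sharpening enabled.
check_pan_sharpening(use_sr=False, use_pan_sharpening=True)
```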
Cloud Masking

Cloud masking is based on the QA_PIXEL band, using bit flags defined by USGS:

  • Applied to both TOA and SR products when use_cloud_mask=True.

  • For TOA collections, an additional filter (remove_l_toa_tough_clouds) is applied to remove low-quality observations based on a simple cloud scoring method.

Satellite Information

+----------------------------+------------------------+
| Field                      | Value                  |
+----------------------------+------------------------+
| Name                       | Landsat 7 ETM+         |
| Sensor                     | ETM+ (Enhanced TM Plus)|
| Platform                   | Landsat 7              |
| Temporal Resolution        | 16 days                |
| Pixel Size                 | 30 meters              |
| Coverage                   | Global                 |
+----------------------------+------------------------+

Collection Dates

+-------------+------------+------------+
| Product     | Start Date | End Date   |
+-------------+------------+------------+
| TOA         | 1999-04-15 | 2022-04-06 |
| SR          | 1999-04-15 | 2022-04-06 |
+-------------+------------+------------+

Band Information

+-----------+----------+-----------+------------------------+
| Band Name | TOA Name | SR Name   | Spectral Wavelength    |
+-----------+----------+-----------+------------------------+
| blue      | B1       | SR_B1     | 450-520 nm             |
| green     | B2       | SR_B2     | 520-600 nm             |
| red       | B3       | SR_B3     | 630-690 nm             |
| nir       | B4       | SR_B4     | 770-900 nm             |
| swir1     | B5       | SR_B5     | 1550-1750 nm           |
| swir2     | B7       | SR_B7     | 2090-2350 nm           |
| pan       | B8       | n/a       | 520-900 nm             |
+-----------+----------+-----------+------------------------+

Notes
  • Landsat 7 TOA Collection (Tier 1): https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LE07_C02_T1_TOA

  • Landsat 7 TOA Collection (Tier 2): https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LE07_C02_T2_TOA

  • Landsat 7 SR Collection (Tier 1): https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LE07_C02_T1_L2

  • Landsat 7 SR Collection (Tier 2): https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LE07_C02_T2_L2

  • Cloud mask reference (QA_PIXEL flags): https://www.usgs.gov/media/files/landsat-collection-2-pixel-quality-assessment

  • TOA cloud filtering (Simple Cloud Score): https://developers.google.com/earth-engine/guides/landsat?hl=pt-br#simple-cloud-score

Source code in agrigee_lite/sat/landsat.py
class Landsat7(AbstractLandsat):
    """
    Satellite abstraction for Landsat 7 (ETM+ sensor, Collection 2).

    Landsat 7 was launched in 1999 and provided over two decades of data.
    This class supports both TOA and SR products, with optional cloud masking using the QA_PIXEL band.

    Parameters
    ----------
    bands : set of str, optional
        Set of bands to select. Defaults to ['blue', 'green', 'red', 'nir', 'swir1', 'swir2'].
    indices : set of str, optional
        Spectral indices to compute from the selected bands.
    use_sr : bool, default=True
        Whether to use surface reflectance products ('SR_B*' bands).
        If False, uses top-of-atmosphere reflectance ('B*' bands).
    tier : int, default=1
        Landsat collection tier to use (1 or 2). Tier 1 has the highest geometric accuracy.
    use_cloud_mask : bool, default=True
        Whether to apply QA_PIXEL-based cloud masking. If False, no cloud mask is applied.
    min_valid_pixel_count : int, default=12
        Minimum number of valid (non-cloud) pixels required to retain an image.
    toa_cloud_filter_strength : int, default=15
        Strength of the additional cloud filter applied to TOA imagery (if use_sr=False).
        Used in the `remove_l_toa_tough_clouds` step.
    border_pixels_to_erode : float, default=1
        Number of pixels to erode from the geometry border.
    min_area_to_keep_border : int, default=50_000
        Minimum area (in m²) required to retain geometry after border erosion.
    use_pan_sharpening : bool, default=False
        If True, applies pan sharpening to the RGB bands using the 15m-resolution panchromatic band (B8).
        Only applicable when `use_sr=False`. Raises ValueError if used with SR products.

    Cloud Masking
    -------------
    Cloud masking is based on the QA_PIXEL band, using bit flags defined by USGS:
    - Applied to both TOA and SR products when `use_cloud_mask=True`
    - For TOA collections, an additional filter (`remove_l_toa_tough_clouds`) is applied
    to remove low-quality observations based on a simple cloud scoring method.

    Satellite Information
    ---------------------
    +----------------------------+------------------------+
    | Field                      | Value                  |
    +----------------------------+------------------------+
    | Name                       | Landsat 7 ETM+         |
    | Sensor                     | ETM+ (Enhanced TM Plus)|
    | Platform                   | Landsat 7              |
    | Temporal Resolution        | 16 days                |
    | Pixel Size                 | 30 meters              |
    | Coverage                   | Global                 |
    +----------------------------+------------------------+

    Collection Dates
    ----------------
    +-------------+------------+------------+
    | Product     | Start Date | End Date   |
    +-------------+------------+------------+
    | TOA         | 1999-04-15 | 2022-04-06 |
    | SR          | 1999-04-15 | 2022-04-06 |
    +-------------+------------+------------+

    Band Information
    ----------------
    +-----------+----------+-----------+------------------------+
    | Band Name | TOA Name | SR Name   | Spectral Wavelength    |
    +-----------+----------+-----------+------------------------+
    | blue      | B1       | SR_B1     | 450-520 nm             |
    | green     | B2       | SR_B2     | 520-600 nm             |
    | red       | B3       | SR_B3     | 630-690 nm             |
    | nir       | B4       | SR_B4     | 770-900 nm             |
    | swir1     | B5       | SR_B5     | 1550-1750 nm           |
    | swir2     | B7       | SR_B7     | 2090-2350 nm           |
    | pan       | B8       | n/a       | 520-900 nm             |
    +-----------+----------+-----------+------------------------+

    Notes
    -----
    - Landsat 7 TOA Collection (Tier 1):
        https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LE07_C02_T1_TOA

    - Landsat 7 TOA Collection (Tier 2):
        https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LE07_C02_T2_TOA

    - Landsat 7 SR Collection (Tier 1):
        https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LE07_C02_T1_L2

    - Landsat 7 SR Collection (Tier 2):
        https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LE07_C02_T2_L2

    - Cloud mask reference (QA_PIXEL flags):
        https://www.usgs.gov/media/files/landsat-collection-2-pixel-quality-assessment

    - TOA cloud filtering (Simple Cloud Score):
        https://developers.google.com/earth-engine/guides/landsat?hl=pt-br#simple-cloud-score
    """

    def __init__(
        self,
        bands: set[str] | None = None,
        indices: set[str] | None = None,
        use_sr: bool = True,
        tier: int = 1,
        use_cloud_mask: bool = True,
        min_valid_pixel_count: int = 12,
        toa_cloud_filter_strength: int = 15,
        border_pixels_to_erode: float = 1,
        min_area_to_keep_border: int = 50_000,
        use_pan_sharpening: bool = False,
    ):
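        # Map friendly band names to TOA band IDs ("B*"); the panchromatic band is TOA-only.
        toa = {"blue": "B1", "green": "B2", "red": "B3", "nir": "B4", "swir1": "B5", "swir2": "B7", "pan": "B8"}
        # Map friendly band names to surface reflectance band IDs ("SR_B*").
        sr = {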
            "blue": "SR_B1",
            "green": "SR_B2",
            "red": "SR_B3",
            "nir": "SR_B4",
            "swir1": "SR_B5",
            "swir2": "SR_B7",
        }
        super().__init__(
            indices=indices,
            sensor_code="LE07",
            toa_band_map=toa,
            sr_band_map=sr,
            short_base="l7",
            start_date="1999-04-15",
            end_date="2022-04-06",
            bands=bands,
            use_sr=use_sr,
            tier=tier,
            use_cloud_mask=use_cloud_mask,
            min_valid_pixel_count=min_valid_pixel_count,
            toa_cloud_filter_strength=toa_cloud_filter_strength,
            border_pixels_to_erode=border_pixels_to_erode,
            min_area_to_keep_border=min_area_to_keep_border,
            use_pan_sharpening=use_pan_sharpening,
        )

Landsat8

Bases: AbstractLandsat

Satellite abstraction for Landsat 8 (OLI/TIRS sensor, Collection 2).

Landsat 8 was launched in 2013 and remains in operation, delivering high-quality Earth observation data. This class supports both TOA and SR products, with optional cloud masking using the QA_PIXEL band.

Parameters:

Name Type Description Default
bands set of str

Set of bands to select. Defaults to ['blue', 'green', 'red', 'nir', 'swir1', 'swir2'].

None
indices set of str

Spectral indices to compute from the selected bands.

None
use_sr bool

Whether to use surface reflectance products ('SR_B*' bands). If False, uses top-of-atmosphere reflectance ('B*' bands).

True
tier int

Landsat collection tier to use (1 or 2). Tier 1 has the highest geometric accuracy.

1
use_cloud_mask bool

Whether to apply QA_PIXEL-based cloud masking. If False, no cloud mask is applied.

True
min_valid_pixel_count int

Minimum number of valid (non-cloud) pixels required to retain an image.

12
toa_cloud_filter_strength int

Strength of the additional cloud filter applied to TOA imagery (if use_sr=False). Used in the remove_l_toa_tough_clouds step.

15
border_pixels_to_erode float

Number of pixels to erode from the geometry border.

1
min_area_to_keep_border int

Minimum area (in m²) required to retain geometry after border erosion.

50_000
use_pan_sharpening bool

If True, applies pan sharpening to the RGB bands using the 15m-resolution panchromatic band (B8). Only applicable when use_sr=False. Raises ValueError if used with SR products.

False
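For readers unfamiliar with pan sharpening, one widely used approach converts the RGB values to HSV, swaps the value channel for the higher-resolution panchromatic reflectance, and converts back. The per-pixel sketch below uses only the standard library. Whether this class uses the HSV method is not stated in this section, so treat the technique choice and the function name as assumptions.

```python
import colorsys


def pan_sharpen_pixel(red: float, green: float, blue: float, pan: float):
    """HSV pan sharpening for one pixel; all inputs are reflectances in [0, 1]."""
    # Keep hue and saturation from the 30 m RGB bands.
    hue, saturation, _ = colorsys.rgb_to_hsv(red, green, blue)
    # Take brightness (value) from the 15 m panchromatic band instead.
    return colorsys.hsv_to_rgb(hue, saturation, pan)


sharpened = pan_sharpen_pixel(0.4, 0.2, 0.1, pan=0.8)
print(sharpened)
```

The output keeps the color of the original pixel while its brightest channel takes the panchromatic value, which is where the extra spatial detail comes from.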
Cloud Masking

Cloud masking is based on the QA_PIXEL band, using bit flags defined by USGS:

  • Applied to both TOA and SR products when use_cloud_mask=True.

  • For TOA collections, an additional filter (remove_l_toa_tough_clouds) is applied to remove low-quality observations based on a simple cloud scoring method.

Satellite Information

+----------------------------+------------------------+
| Field                      | Value                  |
+----------------------------+------------------------+
| Name                       | Landsat 8 OLI/TIRS     |
| Sensor                     | OLI + TIRS             |
| Platform                   | Landsat 8              |
| Temporal Resolution        | 16 days                |
| Pixel Size                 | 30 meters              |
| Coverage                   | Global                 |
+----------------------------+------------------------+

Collection Dates

+-------------+------------+------------+
| Product     | Start Date | End Date   |
+-------------+------------+------------+
| TOA         | 2013-04-11 | present    |
| SR          | 2013-04-11 | present    |
+-------------+------------+------------+
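In the source for this class, "present" is represented by the far-future placeholder `end_date="2050-01-01"`. A requested time window can be intersected with the mission window as sketched below. The helper name is hypothetical, and the library's own date handling is not shown in this section.

```python
from datetime import date


def clip_to_mission(
    start: str,
    end: str,
    mission_start: str = "2013-04-11",
    mission_end: str = "2050-01-01",
):
    """Intersect [start, end) with the mission window; None when they don't overlap."""
    lo = max(date.fromisoformat(start), date.fromisoformat(mission_start))
    hi = min(date.fromisoformat(end), date.fromisoformat(mission_end))
    if lo >= hi:
        return None
    return lo.isoformat(), hi.isoformat()


# A request that starts before launch is trimmed to the first available date.
print(clip_to_mission("2010-01-01", "2014-01-01"))
```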

Band Information

+-----------+----------+-----------+------------------------+
| Band Name | TOA Name | SR Name   | Spectral Wavelength    |
+-----------+----------+-----------+------------------------+
| blue      | B2       | SR_B2     | 450-515 nm             |
| green     | B3       | SR_B3     | 525-600 nm             |
| red       | B4       | SR_B4     | 630-680 nm             |
| nir       | B5       | SR_B5     | 845-885 nm             |
| swir1     | B6       | SR_B6     | 1560-1660 nm           |
| swir2     | B7       | SR_B7     | 2100-2300 nm           |
| pan       | B8       | n/a       | 500-680 nm             |
+-----------+----------+-----------+------------------------+
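The friendly band names exist because band numbering shifts between sensors: red and near-infrared are B3 and B4 on Landsat 5 and 7 but B4 and B5 on Landsat 8. The sketch below shows why that matters for a spectral index such as NDVI. The lookup table copies values from the band tables on this page, while the function names are made up for this example.

```python
# TOA band IDs per sensor code, copied from the band tables on this page.
TOA_BANDS = {
    "LT05": {"red": "B3", "nir": "B4"},
    "LE07": {"red": "B3", "nir": "B4"},
    "LC08": {"red": "B4", "nir": "B5"},
}


def ndvi_band_ids(sensor_code: str, use_sr: bool = False):
    """Return the (nir, red) band IDs needed for NDVI on a given sensor."""
    bands = TOA_BANDS[sensor_code]
    prefix = "SR_" if use_sr else ""  # SR names are the TOA names with an SR_ prefix
    return prefix + bands["nir"], prefix + bands["red"]


def ndvi(nir: float, red: float) -> float:
    """Normalized difference vegetation index from two reflectance values."""
    total = nir + red
    return (nir - red) / total if total else 0.0
```

Code written against the friendly names ("nir", "red") therefore works unchanged across sensors, while code written against raw band IDs does not.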

Notes
  • Landsat 8 TOA Collection (Tier 1): https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC08_C02_T1_TOA

  • Landsat 8 TOA Collection (Tier 2): https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC08_C02_T2_TOA

  • Landsat 8 SR Collection (Tier 1): https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC08_C02_T1_L2

  • Landsat 8 SR Collection (Tier 2): https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC08_C02_T2_L2

  • Cloud mask reference (QA_PIXEL flags): https://www.usgs.gov/media/files/landsat-collection-2-pixel-quality-assessment

  • TOA cloud filtering (Simple Cloud Score): https://developers.google.com/earth-engine/guides/landsat?hl=pt-br#simple-cloud-score

Source code in agrigee_lite/sat/landsat.py
class Landsat8(AbstractLandsat):
    """
    Satellite abstraction for Landsat 8 (OLI/TIRS sensor, Collection 2).

    Landsat 8 was launched in 2013 and remains in operation, delivering high-quality Earth observation data.
    This class supports both TOA and SR products, with optional cloud masking using the QA_PIXEL band.

    Parameters
    ----------
    bands : set of str, optional
        Set of bands to select. Defaults to ['blue', 'green', 'red', 'nir', 'swir1', 'swir2'].
    indices : set of str, optional
        Spectral indices to compute from the selected bands.
    use_sr : bool, default=True
        Whether to use surface reflectance products ('SR_B*' bands).
        If False, uses top-of-atmosphere reflectance ('B*' bands).
    tier : int, default=1
        Landsat collection tier to use (1 or 2). Tier 1 has the highest geometric accuracy.
    use_cloud_mask : bool, default=True
        Whether to apply QA_PIXEL-based cloud masking. If False, no cloud mask is applied.
    min_valid_pixel_count : int, default=12
        Minimum number of valid (non-cloud) pixels required to retain an image.
    toa_cloud_filter_strength : int, default=15
        Strength of the additional cloud filter applied to TOA imagery (if use_sr=False).
        Used in the `remove_l_toa_tough_clouds` step.
    border_pixels_to_erode : float, default=1
        Number of pixels to erode from the geometry border.
    min_area_to_keep_border : int, default=50_000
        Minimum area (in m²) required to retain geometry after border erosion.
    use_pan_sharpening : bool, default=False
        If True, applies pan sharpening to the RGB bands using the 15m-resolution panchromatic band (B8).
        Only applicable when `use_sr=False`. Raises ValueError if used with SR products.

    Cloud Masking
    -------------
    Cloud masking is based on the QA_PIXEL band, using bit flags defined by USGS:
    - Applied to both TOA and SR products when `use_cloud_mask=True`
    - For TOA collections, an additional filter (`remove_l_toa_tough_clouds`) is applied
    to remove low-quality observations based on a simple cloud scoring method.

    Satellite Information
    ---------------------
    +----------------------------+------------------------+
    | Field                      | Value                  |
    +----------------------------+------------------------+
    | Name                       | Landsat 8 OLI/TIRS     |
    | Sensor                     | OLI + TIRS             |
    | Platform                   | Landsat 8              |
    | Temporal Resolution        | 16 days                |
    | Pixel Size                 | 30 meters              |
    | Coverage                   | Global                 |
    +----------------------------+------------------------+

    Collection Dates
    ----------------
    +-------------+------------+------------+
    | Product     | Start Date | End Date   |
    +-------------+------------+------------+
    | TOA         | 2013-04-11 | present    |
    | SR          | 2013-04-11 | present    |
    +-------------+------------+------------+

    Band Information
    ----------------
    +-----------+----------+-----------+------------------------+
    | Band Name | TOA Name | SR Name   | Spectral Wavelength    |
    +-----------+----------+-----------+------------------------+
    | blue      | B2       | SR_B2     | 450-515 nm             |
    | green     | B3       | SR_B3     | 525-600 nm             |
    | red       | B4       | SR_B4     | 630-680 nm             |
    | nir       | B5       | SR_B5     | 845-885 nm             |
    | swir1     | B6       | SR_B6     | 1560-1660 nm           |
    | swir2     | B7       | SR_B7     | 2100-2300 nm           |
    | pan       | B8       | n/a       | 500-680 nm             |
    +-----------+----------+-----------+------------------------+

    Notes
    -----
    - Landsat 8 TOA Collection (Tier 1):
        https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC08_C02_T1_TOA

    - Landsat 8 TOA Collection (Tier 2):
        https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC08_C02_T2_TOA

    - Landsat 8 SR Collection (Tier 1):
        https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC08_C02_T1_L2

    - Landsat 8 SR Collection (Tier 2):
        https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC08_C02_T2_L2

    - Cloud mask reference (QA_PIXEL flags):
        https://www.usgs.gov/media/files/landsat-collection-2-pixel-quality-assessment

    - TOA cloud filtering (Simple Cloud Score):
        https://developers.google.com/earth-engine/guides/landsat?hl=pt-br#simple-cloud-score
    """

    def __init__(
        self,
        bands: set[str] | None = None,
        indices: set[str] | None = None,
        use_sr: bool = True,
        tier: int = 1,
        use_cloud_mask: bool = True,
        min_valid_pixel_count: int = 12,
        toa_cloud_filter_strength: int = 15,
        border_pixels_to_erode: float = 1,
        min_area_to_keep_border: int = 50_000,
        use_pan_sharpening: bool = False,
    ):
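        # Map friendly band names to TOA band IDs ("B*"); the panchromatic band is TOA-only.
        toa = {"blue": "B2", "green": "B3", "red": "B4", "nir": "B5", "swir1": "B6", "swir2": "B7", "pan": "B8"}
        # Map friendly band names to surface reflectance band IDs ("SR_B*").
        sr = {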
            "blue": "SR_B2",
            "green": "SR_B3",
            "red": "SR_B4",
            "nir": "SR_B5",
            "swir1": "SR_B6",
            "swir2": "SR_B7",
        }
        super().__init__(
            indices=indices,
            sensor_code="LC08",
            toa_band_map=toa,
            sr_band_map=sr,
            short_base="l8",
            start_date="2013-04-11",
            end_date="2050-01-01",  # far-future placeholder; the mission is still operating
            bands=bands,
            use_sr=use_sr,
            tier=tier,
            use_cloud_mask=use_cloud_mask,
            min_valid_pixel_count=min_valid_pixel_count,
            toa_cloud_filter_strength=toa_cloud_filter_strength,
            border_pixels_to_erode=border_pixels_to_erode,
            min_area_to_keep_border=min_area_to_keep_border,
            use_pan_sharpening=use_pan_sharpening,
        )

Landsat9

Bases: AbstractLandsat

Satellite abstraction for Landsat 9 (OLI-2/TIRS-2 sensor, Collection 2).

Landsat 9 is the latest mission in the Landsat program, launched in 2021. It is nearly identical to Landsat 8 and provides continuity for high-quality multispectral Earth observation. This class supports both TOA and SR products, with optional cloud masking using the QA_PIXEL band.

Parameters:

Name Type Description Default
bands set of str

Set of bands to select. Defaults to ['blue', 'green', 'red', 'nir', 'swir1', 'swir2'].

None
indices set of str

Spectral indices to compute from the selected bands.

None
use_sr bool

Whether to use surface reflectance products ('SR_B*' bands). If False, uses top-of-atmosphere reflectance ('B*' bands).

True
tier int

Landsat collection tier to use (1 or 2). Tier 1 has the highest geometric accuracy.

1
use_cloud_mask bool

Whether to apply QA_PIXEL-based cloud masking. If False, no cloud mask is applied.

True
min_valid_pixel_count int

Minimum number of valid (non-cloud) pixels required to retain an image.

12
toa_cloud_filter_strength int

Strength of the additional cloud filter applied to TOA imagery (if use_sr=False). Used in the remove_l_toa_tough_clouds step.

15
border_pixels_to_erode float

Number of pixels to erode from the geometry border.

1
min_area_to_keep_border int

Minimum area (in m²) required to retain geometry after border erosion.

50_000
use_pan_sharpening bool

If True, applies pan sharpening to the RGB bands using the 15m-resolution panchromatic band (B8). Only applicable when use_sr=False. Raises ValueError if used with SR products.

False
Cloud Masking

Cloud masking is based on the QA_PIXEL band, using bit flags defined by USGS:

  • Applied to both TOA and SR products when use_cloud_mask=True.

  • For TOA collections, an additional filter (remove_l_toa_tough_clouds) is applied to remove low-quality observations based on a simple cloud scoring method.

Satellite Information

+----------------------------+------------------------+
| Field                      | Value                  |
+----------------------------+------------------------+
| Name                       | Landsat 9 OLI-2/TIRS-2 |
| Sensor                     | OLI-2 + TIRS-2         |
| Platform                   | Landsat 9              |
| Temporal Resolution        | 16 days                |
| Pixel Size                 | 30 meters              |
| Coverage                   | Global                 |
+----------------------------+------------------------+

Collection Dates

+-------------+------------+------------+
| Product     | Start Date | End Date   |
+-------------+------------+------------+
| TOA         | 2021-11-01 | present    |
| SR          | 2021-11-01 | present    |
+-------------+------------+------------+

Band Information

+-----------+----------+-----------+------------------------+
| Band Name | TOA Name | SR Name   | Spectral Wavelength    |
+-----------+----------+-----------+------------------------+
| blue      | B2       | SR_B2     | 450-515 nm             |
| green     | B3       | SR_B3     | 525-600 nm             |
| red       | B4       | SR_B4     | 630-680 nm             |
| nir       | B5       | SR_B5     | 845-885 nm             |
| swir1     | B6       | SR_B6     | 1560-1660 nm           |
| swir2     | B7       | SR_B7     | 2100-2300 nm           |
| pan       | B8       | n/a       | 500-680 nm             |
+-----------+----------+-----------+------------------------+

Notes
  • Landsat 9 TOA Collection (Tier 1): https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC09_C02_T1_TOA

  • Landsat 9 TOA Collection (Tier 2): https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC09_C02_T2_TOA

  • Landsat 9 SR Collection (Tier 1): https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC09_C02_T1_L2

  • Landsat 9 SR Collection (Tier 2): https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC09_C02_T2_L2

  • Cloud mask reference (QA_PIXEL flags): https://www.usgs.gov/media/files/landsat-collection-2-pixel-quality-assessment

  • TOA cloud filtering (Simple Cloud Score): https://developers.google.com/earth-engine/guides/landsat?hl=pt-br#simple-cloud-score

Source code in agrigee_lite/sat/landsat.py
class Landsat9(AbstractLandsat):
    """
    Satellite abstraction for Landsat 9 (OLI-2/TIRS-2 sensor, Collection 2).

    Landsat 9 is the latest mission in the Landsat program, launched in 2021. It is nearly identical to Landsat 8
    and provides continuity for high-quality multispectral Earth observation. This class supports both TOA and SR
    products, with optional cloud masking using the QA_PIXEL band.

    Parameters
    ----------
    bands : set of str, optional
        Set of bands to select. Defaults to ['blue', 'green', 'red', 'nir', 'swir1', 'swir2'].
    indices : set of str, optional
        Spectral indices to compute from the selected bands.
    use_sr : bool, default=True
        Whether to use surface reflectance products ('SR_B*' bands).
        If False, uses top-of-atmosphere reflectance ('B*' bands).
    tier : int, default=1
        Landsat collection tier to use (1 or 2). Tier 1 has highest geometric accuracy.
    use_cloud_mask : bool, default=True
        Whether to apply QA_PIXEL-based cloud masking. If False, no cloud mask is applied.
    min_valid_pixel_count : int, default=12
        Minimum number of valid (non-cloud) pixels required to retain an image.
    toa_cloud_filter_strength : int, default=15
        Strength of the additional cloud filter applied to TOA imagery (if use_sr=False).
        Used in the `remove_l_toa_tough_clouds` step.
    border_pixels_to_erode : float, default=1
        Number of pixels to erode from the geometry border.
    min_area_to_keep_border : int, default=50_000
        Minimum area (in m²) required to retain geometry after border erosion.
    use_pan_sharpening : bool, default=False
        If True, applies pan sharpening to the RGB bands using the 15m-resolution panchromatic band (B8).
        Only applicable when `use_sr=False`. Raises ValueError if used with SR products.

    Cloud Masking
    -------------
    Cloud masking is based on the QA_PIXEL band, using bit flags defined by USGS:
    - Applied to both TOA and SR products when `use_cloud_mask=True`
    - For TOA collections, an additional filter (`remove_l_toa_tough_clouds`) is applied
      to remove low-quality observations based on a simple cloud scoring method.

    Satellite Information
    ---------------------
    +----------------------------+------------------------+
    | Field                      | Value                  |
    +----------------------------+------------------------+
    | Name                       | Landsat 9 OLI-2/TIRS-2 |
    | Sensor                     | OLI-2 + TIRS-2         |
    | Platform                   | Landsat 9              |
    | Temporal Resolution        | 16 days                |
    | Pixel Size                 | 30 meters              |
    | Coverage                   | Global                 |
    +----------------------------+------------------------+

    Collection Dates
    ----------------
    +-------------+------------+------------+
    | Product     | Start Date | End Date   |
    +-------------+------------+------------+
    | TOA         | 2021-11-01 | present    |
    | SR          | 2021-11-01 | present    |
    +-------------+------------+------------+

    Band Information
    ----------------
    +-----------+----------+-----------+---------------------------+
    | Band Name | TOA Name | SR Name   | Spectral Wavelength       |
    +-----------+----------+-----------+---------------------------+
    | blue      | B2       | SR_B2     | 450-515 nm                |
    | green     | B3       | SR_B3     | 525-600 nm                |
    | red       | B4       | SR_B4     | 630-680 nm                |
    | nir       | B5       | SR_B5     | 845-885 nm                |
    | swir1     | B6       | SR_B6     | 1560-1660 nm              |
    | swir2     | B7       | SR_B7     | 2100-2300 nm              |
    | pan       | B8       | n/a       | 520-900 nm (panchromatic) |
    +-----------+----------+-----------+---------------------------+

    Notes
    -----
    - Landsat 9 TOA Collection (Tier 1):
        https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC09_C02_T1_TOA

    - Landsat 9 TOA Collection (Tier 2):
        https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC09_C02_T2_TOA

    - Landsat 9 SR Collection (Tier 1):
        https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC09_C02_T1_L2

    - Landsat 9 SR Collection (Tier 2):
        https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC09_C02_T2_L2

    - Cloud mask reference (QA_PIXEL flags):
        https://www.usgs.gov/media/files/landsat-collection-2-pixel-quality-assessment

    - TOA cloud filtering (Simple Cloud Score):
        https://developers.google.com/earth-engine/guides/landsat?hl=pt-br#simple-cloud-score
    """

    def __init__(
        self,
        bands: set[str] | None = None,
        indices: set[str] | None = None,
        use_sr: bool = True,
        tier: int = 1,
        use_cloud_mask: bool = True,
        min_valid_pixel_count: int = 12,
        toa_cloud_filter_strength: int = 15,
        border_pixels_to_erode: float = 1,
        min_area_to_keep_border: int = 50_000,
        use_pan_sharpening: bool = False,
    ):
        toa = {"blue": "B2", "green": "B3", "red": "B4", "nir": "B5", "swir1": "B6", "swir2": "B7", "pan": "B8"}
        sr = {
            "blue": "SR_B2",
            "green": "SR_B3",
            "red": "SR_B4",
            "nir": "SR_B5",
            "swir1": "SR_B6",
            "swir2": "SR_B7",
        }
        super().__init__(
            indices=indices,
            sensor_code="LC09",
            toa_band_map=toa,
            sr_band_map=sr,
            short_base="l9",
            start_date="2021-11-01",
            end_date="2050-01-01",
            bands=bands,
            use_sr=use_sr,
            tier=tier,
            use_cloud_mask=use_cloud_mask,
            min_valid_pixel_count=min_valid_pixel_count,
            toa_cloud_filter_strength=toa_cloud_filter_strength,
            border_pixels_to_erode=border_pixels_to_erode,
            min_area_to_keep_border=min_area_to_keep_border,
            use_pan_sharpening=use_pan_sharpening,
        )
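
The constructor above passes two band maps to `AbstractLandsat`, whose code is not shown here. The sketch below illustrates how friendly band names might be resolved to product band names, and why pan sharpening is limited to TOA (the SR map has no `pan` entry). `resolve_landsat9_bands` is a hypothetical helper, and the sorting of band names is an assumption borrowed from the `Modis8Days` listing.

```python
from typing import Iterable, Optional

# Band maps copied from the Landsat9 constructor in the source listing.
TOA_BANDS = {"blue": "B2", "green": "B3", "red": "B4", "nir": "B5", "swir1": "B6", "swir2": "B7", "pan": "B8"}
SR_BANDS = {"blue": "SR_B2", "green": "SR_B3", "red": "SR_B4", "nir": "SR_B5", "swir1": "SR_B6", "swir2": "SR_B7"}
DEFAULT_BANDS = ("blue", "green", "red", "nir", "swir1", "swir2")


def resolve_landsat9_bands(
    bands: Optional[Iterable[str]] = None,
    use_sr: bool = True,
    use_pan_sharpening: bool = False,
) -> list:
    """Hypothetical helper: map friendly band names to product band names."""
    if use_sr and use_pan_sharpening:
        # The docstring states that pan sharpening only applies to TOA products.
        raise ValueError("use_pan_sharpening requires use_sr=False")
    band_map = SR_BANDS if use_sr else TOA_BANDS
    selected = sorted(DEFAULT_BANDS if bands is None else bands)
    unknown = [band for band in selected if band not in band_map]
    if unknown:
        raise ValueError(f"bands not available for this product: {unknown}")
    return [band_map[band] for band in selected]


print(resolve_landsat9_bands({"red", "nir"}))                # surface reflectance names
print(resolve_landsat9_bands({"red", "nir"}, use_sr=False))  # top-of-atmosphere names
```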

MapBiomas

Bases: DataSourceSatellite

Satellite abstraction for MapBiomas Brazil Collection 10 Land Use and Land Cover (LULC) data.

This class wraps the official MapBiomas Collection 10 LULC classification product for Brazil. The dataset provides annual land use and land cover classifications from 1985 to 2023 at 30-meter resolution. For each input geometry and year, the class reports the majority class (10_class) and the fraction of pixels that match it (11_percent).

It is suitable for long-term land cover trend analysis, ecosystem monitoring, and environmental assessments.

Parameters:

border_pixels_to_erode : float, default=1
    Number of border pixels (in pixels) to erode from the input geometry before analysis. Helps remove classification noise from edges. Use 0 to disable.
min_area_to_keep_border : int, default=50000
    Minimum area in square meters to retain the eroded region. Used to avoid discarding small geometries entirely.
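
The erosion width is given in pixels, while the source listing passes a distance in meters to `ee_safe_remove_borders`, computed as `round(borderPixelsToErode * pixelSize)`. A minimal sketch of that conversion follows; `erosion_distance_m` is a hypothetical name used only for illustration.

```python
from typing import Optional


def erosion_distance_m(border_pixels_to_erode: float, pixel_size_m: int) -> Optional[int]:
    """Convert an erosion width in pixels to meters; None means erosion is skipped."""
    if border_pixels_to_erode == 0:
        return None  # the listing skips erosion entirely when the value is 0
    return round(border_pixels_to_erode * pixel_size_m)


print(erosion_distance_m(1, 30))     # MapBiomas default: 1 pixel at 30 m
print(erosion_distance_m(0.5, 250))  # MODIS default: half a pixel at 250 m
print(erosion_distance_m(0, 30))     # erosion disabled
```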
Bands

+-------------+------------------------+-------------------------------------------------------------+
| Band Name   | Type                   | Description                                                 |
+-------------+------------------------+-------------------------------------------------------------+
| 10_class    | Categorical (int)      | Most frequent land use/cover class in the geometry per year |
| 11_percent  | Float (0-1)            | Fraction of geometry pixels matching the majority class     |
+-------------+------------------------+-------------------------------------------------------------+

Classes:

Each integer value in the `10_class` band corresponds to a LULC class defined by MapBiomas.
Refer to `self.classes` for full label and color mapping. Examples include:
- 3: Forest Formation
- 14: Farming
- 24: Urban Area
- 26: Water
- 39: Soybean
- 46: Coffee
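
A short sketch of how a `10_class` value could be turned into a label. The entries are a subset copied from the `self.classes` mapping in the source listing; `class_label` and its fallback text are hypothetical and not part of agrigee_lite.

```python
# Subset of the mapping defined in MapBiomas.__init__ (see the source listing).
CLASSES = {
    3: {"label": "Forest Formation", "color": "#1f8d49"},
    14: {"label": "Farming", "color": "#ffefc3"},
    24: {"label": "Urban Area", "color": "#d4271e"},
    26: {"label": "Water", "color": "#2532e4"},
    39: {"label": "Soybean", "color": "#f5b3c8"},
    46: {"label": "Coffee", "color": "#d68fe2"},
}


def class_label(class_id: int) -> str:
    """Return the label for a class ID, or a placeholder for unmapped IDs."""
    entry = CLASSES.get(int(class_id))
    return entry["label"] if entry else f"Unknown ({class_id})"


print(class_label(39))
print(class_label(2))
```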
Processing Overview
  1. The MapBiomas classification image (mapbiomas_brazil_collection10_coverage_v2) is loaded.
  2. For each year between the start and end date of the input feature:
     - The modal (most frequent) class is computed (10_class).
     - Its pixel agreement (the fraction of pixels matching that class) is calculated (11_percent).
  3. Optionally, the geometry is eroded to avoid edge noise.
  4. Final features are returned as an annual time series of LULC summaries.
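
The per-year summary can be illustrated without Earth Engine. The sketch below computes the modal class and its agreement for a plain list of pixel values. It stands in for the `ee.Reducer.mode()` and `ee.Reducer.mean()` calls in the source listing; `summarize_year` is a hypothetical name, and tie-breaking here may differ from Earth Engine's.

```python
from collections import Counter


def summarize_year(pixel_classes: list) -> tuple:
    """Return (majority_class, fraction_of_pixels_matching_it) for one year."""
    if not pixel_classes:
        raise ValueError("no valid pixels in the geometry")
    counts = Counter(pixel_classes)
    majority_class, majority_count = counts.most_common(1)[0]
    return majority_class, majority_count / len(pixel_classes)


# Eight pixels inside a field: five soybean (39), two pasture (15), one forest (3).
pixels = [39, 39, 39, 15, 3, 39, 15, 39]
clazz, percent = summarize_year(pixels)
print(clazz, percent)  # 39 0.625
```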
Dataset Information

+-------------------------+------------------------------------------------------+
| Field                   | Value                                                |
+-------------------------+------------------------------------------------------+
| Dataset                 | MapBiomas Brazil Collection 10                       |
| Temporal Coverage       | 1985 - 2023                                          |
| Spatial Resolution      | 30 meters                                            |
| Projection              | EPSG: 4674 (SIRGAS 2000)                             |
| Source Imagery          | Landsat (TM, ETM+, OLI)                              |
| Classification Method   | Random Forest + Temporal Filtering                   |
+-------------------------+------------------------------------------------------+

Notes
  • Official MapBiomas dataset (Earth Engine): https://developers.google.com/earth-engine/datasets/catalog/projects_mapbiomas-public_assets_brazil_lulc_collection10_mapbiomas_brazil_collection10_coverage_v2

  • ATBD (Algorithm Theoretical Basis Document) Collection 10: https://mapbiomas.org/downloads?cama_set_language=en

  • Only the majority class (classification_YEAR) is used here — secondary confidence or transitions are not included.
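
The `classification_YEAR` band names follow from the feature's start and end dates. The sketch below mirrors the inclusive year sequence that the source listing builds with `ee.List.sequence(start_year, end_year)`. `classification_bands` is a hypothetical helper, and ISO-formatted date strings are an assumption about the input.

```python
from datetime import date


def classification_bands(start: str, end: str) -> list:
    """List the MapBiomas band names for every year from start to end, inclusive."""
    start_year = date.fromisoformat(start).year
    end_year = date.fromisoformat(end).year
    return [f"classification_{year}" for year in range(start_year, end_year + 1)]


print(classification_bands("2019-10-01", "2021-03-31"))
```

A feature spanning a crop season that crosses New Year therefore yields one summary per calendar year touched, not one per season.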

Source code in agrigee_lite/sat/mapbiomas.py
class MapBiomas(DataSourceSatellite):
    """
    Satellite abstraction for MapBiomas Brazil Collection 10 Land Use and Land Cover (LULC) data.

    This class wraps the official MapBiomas Collection 10 LULC classification product for Brazil.
    The dataset provides annual land use and land cover classifications from 1985 to 2023 at 30-meter resolution.
    For each input geometry and year, the class reports the majority class (`10_class`) and the fraction of
    pixels that match it (`11_percent`).

    It is suitable for long-term land cover trend analysis, ecosystem monitoring, and environmental assessments.

    Parameters
    ----------
    border_pixels_to_erode : float, default=1
        Number of border pixels (in pixels) to erode from the input geometry before analysis.
        Helps remove classification noise from edges. Use 0 to disable.
    min_area_to_keep_border : int, default=50000
        Minimum area in square meters to retain the eroded region.
        Used to avoid discarding small geometries entirely.

    Bands
    -----
    +-------------+------------------------+-------------------------------------------------------------+
    | Band Name   | Type                   | Description                                                 |
    +-------------+------------------------+-------------------------------------------------------------+
    | 10_class    | Categorical (int)      | Most frequent land use/cover class in the geometry per year |
    | 11_percent  | Float (0-1)            | Fraction of geometry pixels matching the majority class     |
    +-------------+------------------------+-------------------------------------------------------------+

    Classes
    -------
    Each integer value in the `10_class` band corresponds to a LULC class defined by MapBiomas.
    Refer to `self.classes` for full label and color mapping. Examples include:
    - 3: Forest Formation
    - 14: Farming
    - 24: Urban Area
    - 26: Water
    - 39: Soybean
    - 46: Coffee

    Processing Overview
    -------------------
    1. The MapBiomas classification image (`mapbiomas_brazil_collection10_coverage_v2`) is loaded.
    2. For each year between the start and end date of the input feature:
       - The modal (most frequent) class is computed (`10_class`)
       - Its pixel agreement (% of pixels matching that class) is calculated (`11_percent`)
    3. Optionally, the geometry is eroded to avoid edge noise.
    4. Final features are returned as an annual time series of LULC summaries.

    Dataset Information
    -------------------
    +-------------------------+------------------------------------------------------+
    | Field                   | Value                                                |
    +-------------------------+------------------------------------------------------+
    | Dataset                 | MapBiomas Brazil Collection 10                       |
    | Temporal Coverage       | 1985 - 2023                                          |
    | Spatial Resolution      | 30 meters                                            |
    | Projection              | EPSG: 4674 (SIRGAS 2000)                             |
    | Source Imagery          | Landsat (TM, ETM+, OLI)                              |
    | Classification Method   | Random Forest + Temporal Filtering                   |
    +-------------------------+------------------------------------------------------+

    Notes
    -----
    - Official MapBiomas dataset (Earth Engine):
    https://developers.google.com/earth-engine/datasets/catalog/projects_mapbiomas-public_assets_brazil_lulc_collection10_mapbiomas_brazil_collection10_coverage_v2

    - ATBD (Algorithm Theoretical Basis Document) Collection 10:
    https://mapbiomas.org/downloads?cama_set_language=en

    - Only the majority class (`classification_YEAR`) is used here — secondary confidence or transitions are not included.

    """

    def __init__(
        self,
        border_pixels_to_erode: float = 1,
        min_area_to_keep_border: int = 50_000,
    ) -> None:
        super().__init__()
        self.imageAsset: str = (
            "projects/mapbiomas-public/assets/brazil/lulc/collection10/mapbiomas_brazil_collection10_coverage_v2"
        )
        self.pixelSize: int = 30
        self.startDate = "1985-01-01"
        self.endDate = "2024-01-01"
        self.shortName = "mapbiomasmajclass"
        self.selectedBands = [
            (None, "10_class"),
            (None, "11_percent"),
        ]

        self.classes = {
            1: {"label": "Forest", "color": "#1f8d49"},
            3: {"label": "Forest Formation", "color": "#1f8d49"},
            4: {"label": "Savanna Formation", "color": "#7dc975"},
            5: {"label": "Mangrove", "color": "#04381d"},
            6: {"label": "Floodable Forest", "color": "#007785"},
            9: {"label": "Forest Plantation", "color": "#7a5900"},
            10: {"label": "Herbaceous and Shrubby Vegetation", "color": "#d6bc74"},
            11: {"label": "Wetland", "color": "#519799"},
            12: {"label": "Grassland", "color": "#d6bc74"},
            14: {"label": "Farming", "color": "#ffefc3"},
            15: {"label": "Pasture", "color": "#edde8e"},
            18: {"label": "Agriculture", "color": "#E974ED"},
            19: {"label": "Temporary Crop", "color": "#C27BA0"},
            20: {"label": "Sugar cane", "color": "#db7093"},
            21: {"label": "Mosaic of Uses", "color": "#ffefc3"},
            22: {"label": "Non vegetated area", "color": "#d4271e"},
            23: {"label": "Beach, Dune and Sand Spot", "color": "#ffa07a"},
            24: {"label": "Urban Area", "color": "#d4271e"},
            25: {"label": "Other non Vegetated Areas", "color": "#db4d4f"},
            26: {"label": "Water", "color": "#2532e4"},
            27: {"label": "Not Observed", "color": "#ffffff"},
            29: {"label": "Rocky Outcrop", "color": "#ffaa5f"},
            30: {"label": "Mining", "color": "#9c0027"},
            31: {"label": "Aquaculture", "color": "#091077"},
            32: {"label": "Hypersaline Tidal Flat", "color": "#fc8114"},
            33: {"label": "River, Lake and Ocean", "color": "#2532e4"},
            35: {"label": "Palm Oil", "color": "#9065d0"},
            36: {"label": "Perennial Crop", "color": "#d082de"},
            39: {"label": "Soybean", "color": "#f5b3c8"},
            40: {"label": "Rice", "color": "#c71585"},
            41: {"label": "Other Temporary Crops", "color": "#f54ca9"},
            46: {"label": "Coffee", "color": "#d68fe2"},
            47: {"label": "Citrus", "color": "#9932cc"},
            48: {"label": "Other Perennial Crops", "color": "#e6ccff"},
            49: {"label": "Wooded Sandbank Vegetation", "color": "#02d659"},
            50: {"label": "Herbaceous Sandbank Vegetation", "color": "#ad5100"},
            62: {"label": "Cotton (beta)", "color": "#ff69b4"},
            75: {"label": "Photovoltaic Power Plant (beta)", "color": "#c12100"},
        }

        self.minAreaToKeepBorder = min_area_to_keep_border
        self.borderPixelsToErode = border_pixels_to_erode
        self.toDownloadSelectors = ["10_class", "11_percent"]

    def compute(
        self,
        ee_feature: ee.Feature,
        subsampling_max_pixels: float | None = None,
        reducers: set[str] | None = None,
    ) -> ee.FeatureCollection:
        ee_geometry = ee_feature.geometry()

        if self.borderPixelsToErode != 0:
            ee_geometry = ee_safe_remove_borders(
                ee_geometry, round(self.borderPixelsToErode * self.pixelSize), self.minAreaToKeepBorder
            )
            ee_feature = ee_feature.setGeometry(ee_geometry)

        mb_image = ee.Image(self.imageAsset)
        mb_image = ee_map_valid_pixels(mb_image, ee_geometry, self.pixelSize)

        ee_start = ee.Feature(ee_feature).get("s")
        ee_end = ee.Feature(ee_feature).get("e")
        start_year = ee.Date(ee_start).get("year")
        end_year = ee.Date(ee_end).get("year")
        indexnum = ee.Feature(ee_feature).get("0")

        years = ee.List.sequence(start_year, end_year)

        def _feat_for_year(year: ee.Number) -> ee.Feature:
            year_num = ee.Number(year).toInt()
            year_str = year_num.format()
            band_in = ee.String("classification_").cat(year_str)
            img = mb_image.select([band_in], [year_str])

            mode_dict = img.reduceRegion(
                reducer=ee.Reducer.mode(),
                geometry=ee_geometry,
                scale=self.pixelSize,
                maxPixels=ee_get_number_of_pixels(ee_geometry, subsampling_max_pixels, self.pixelSize),
                bestEffort=True,
            )
            clazz = ee.Number(mode_dict.get(year_str)).round()

            percent = (
                img.eq(clazz)
                .reduceRegion(
                    reducer=ee.Reducer.mean(),
                    geometry=ee_geometry,
                    scale=self.pixelSize,
                    maxPixels=ee_get_number_of_pixels(ee_geometry, subsampling_max_pixels, self.pixelSize),
                    bestEffort=True,
                )
                .get(year_str)
            )

            timestamp = ee.String(year_str).cat("-01-01")

            stats = ee.Feature(
                None,
                {
                    "00_indexnum": indexnum,
                    "01_timestamp": timestamp,
                    "10_class": clazz,
                    "11_percent": percent,
                },
            )

            stats = stats.set("99_validPixelsCount", mb_image.get("ZZ_USER_VALID_PIXELS"))

            return stats

        features = years.map(_feat_for_year)
        return ee.FeatureCollection(features)

    def __str__(self) -> str:
        return self.shortName

    def __repr__(self) -> str:
        return self.shortName

Modis8Days

Bases: OpticalSatellite

Satellite abstraction for MODIS Terra and Aqua (8-day composites).

MODIS (Moderate Resolution Imaging Spectroradiometer) is a key instrument aboard NASA's Terra and Aqua satellites, providing global coverage for land, ocean, and atmospheric monitoring at frequent intervals.

Parameters:

bands : list of str, default=None
    List of bands to select. Defaults to ['red', 'nir'].
indices : list of str, default=None
    List of spectral indices to compute from selected bands.
use_cloud_mask : bool, default=True
    Whether to apply a cloud mask based on the QA 'State' band (bits 0-1). If True, only pixels with cloud state == 0 (clear) are retained.
min_valid_pixel_count : int, default=2
    Minimum number of valid (non-cloud) pixels required to retain an image.
border_pixels_to_erode : float, default=0.5
    Number of pixels to erode from the geometry border.
min_area_to_keep_border : int, default=190_000
    Minimum area (in m²) required to retain geometry after border erosion.
Cloud Masking

Cloudy pixels are masked using bits 0-1 of the 'State' QA band, which encode cloud state:
- 00: clear
- 01: cloudy
- 10: mixed
- 11: not set

The masking keeps only pixels with value 00 (clear) if use_cloud_mask=True.
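
The bit test can be reproduced on plain integers. The sketch below applies the same `& 3` mask as `_mask_modis8days_clouds` in the source listing to single QA values; `cloud_state` and `keep_pixel` are hypothetical names for illustration.

```python
# Meaning of bits 0-1 of the 'State' QA band, as listed above.
CLOUD_STATES = {0b00: "clear", 0b01: "cloudy", 0b10: "mixed", 0b11: "not set"}


def cloud_state(state_qa: int) -> str:
    """Decode bits 0-1 of a 'State' QA value."""
    return CLOUD_STATES[state_qa & 0b11]


def keep_pixel(state_qa: int, use_cloud_mask: bool = True) -> bool:
    """A pixel survives masking only when its cloud state is 00 (clear)."""
    if not use_cloud_mask:
        return True
    return (state_qa & 0b11) == 0


for qa in (0b1000, 0b1001, 0b0110):
    print(bin(qa), cloud_state(qa), keep_pixel(qa))
```

Higher bits of the QA value are ignored, so a pixel whose other flags are set is still kept as long as bits 0-1 are both zero.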

Satellite Information

+----------------------------+------------------------+
| Field                      | Value                  |
+----------------------------+------------------------+
| Name                       | MODIS (8-day)          |
| Platforms                  | Terra, Aqua            |
| Temporal Resolution        | 8 days                 |
| Pixel Size                 | 250 meters             |
| Coverage                   | Global                 |
+----------------------------+------------------------+

Collection Dates

+--------+------------+------------+
| Source | Start Date | End Date   |
+--------+------------+------------+
| Terra  | 2000-02-18 | present    |
| Aqua   | 2002-07-04 | present    |
+--------+------------+------------+

Band Information

+-----------+----------------+----------------+------------------------+
| Band Name | Original Band  | Resolution     | Spectral Wavelength    |
+-----------+----------------+----------------+------------------------+
| red       | sur_refl_b01   | 250 meters     | 620-670 nm             |
| nir       | sur_refl_b02   | 250 meters     | 841-876 nm             |
+-----------+----------------+----------------+------------------------+
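
The source listing for this class rescales both bands with `add(100).divide(16_100)` before any indices are computed. Assuming the raw values lie in the MOD09Q1 valid range of -100 to 16000, this maps them onto 0 to 1. It is not the same as applying the product's 0.0001 scale factor. A sketch on plain numbers, with `rescale_reflectance` as a hypothetical name:

```python
def rescale_reflectance(raw_value: int) -> float:
    """Shift and divide a raw MOD09Q1 value, mirroring add(100).divide(16_100)."""
    return (raw_value + 100) / 16_100


print(rescale_reflectance(-100))    # lower end of the assumed valid range
print(rescale_reflectance(16_000))  # upper end of the assumed valid range
```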

Notes

Cloud Mask Reference (QA 'State' band documentation): https://lpdaac.usgs.gov/documents/925/MOD09_User_Guide_V61.pdf

MODIS Collections on Google Earth Engine:
- Terra (MOD09Q1): https://developers.google.com/earth-engine/datasets/catalog/MODIS_061_MOD09Q1
- Aqua (MYD09Q1): https://developers.google.com/earth-engine/datasets/catalog/MODIS_061_MYD09Q1

Source code in agrigee_lite/sat/modis.py
class Modis8Days(OpticalSatellite):
    """
    Satellite abstraction for MODIS Terra and Aqua (8-day composites).

    MODIS (Moderate Resolution Imaging Spectroradiometer) is a key instrument aboard NASA's Terra and Aqua satellites,
    providing global coverage for land, ocean, and atmospheric monitoring at frequent intervals.

    Parameters
    ----------
    bands : list of str, optional
        List of bands to select. Defaults to ['red', 'nir'].
    indices : list of str, optional
        List of spectral indices to compute from selected bands.
    use_cloud_mask : bool, default=True
        Whether to apply a cloud mask based on the QA 'State' band (bits 0-1).
        If True, only pixels with cloud state == 0 (clear) are retained.
    min_valid_pixel_count : int, default=2
        Minimum number of valid (non-cloud) pixels required to retain an image.
    border_pixels_to_erode : float, default=0.5
        Number of pixels to erode from the geometry border.
    min_area_to_keep_border : int, default=190_000
        Minimum area (in m²) required to retain geometry after border erosion.

    Cloud Masking
    -------------
    Cloudy pixels are masked using bits 0-1 of the 'State' QA band, which encode cloud state:
        - 00: clear
        - 01: cloudy
        - 10: mixed
        - 11: not set

    The masking keeps only pixels with value 00 (clear) if `use_cloud_mask=True`.

    Satellite Information
    ---------------------
    +----------------------------+------------------------+
    | Field                      | Value                  |
    +----------------------------+------------------------+
    | Name                       | MODIS (8-day)          |
    | Platforms                  | Terra, Aqua            |
    | Temporal Resolution        | 8 days                 |
    | Pixel Size                 | 250 meters             |
    | Coverage                   | Global                 |
    +----------------------------+------------------------+

    Collection Dates
    ----------------
    +--------+------------+------------+
    | Source | Start Date | End Date   |
    +--------+------------+------------+
    | Terra  | 2000-02-18 | present    |
    | Aqua   | 2002-07-04 | present    |
    +--------+------------+------------+

    Band Information
    ----------------
    +-----------+----------------+----------------+------------------------+
    | Band Name | Original Band  | Resolution     | Spectral Wavelength    |
    +-----------+----------------+----------------+------------------------+
    | red       | sur_refl_b01   | 250 meters     | 620-670 nm             |
    | nir       | sur_refl_b02   | 250 meters     | 841-876 nm             |
    +-----------+----------------+----------------+------------------------+

    Notes
    -----
    Cloud Mask Reference (QA 'State' band documentation):
        https://lpdaac.usgs.gov/documents/925/MOD09_User_Guide_V61.pdf

    MODIS Collections on Google Earth Engine:
        - Terra (MOD09Q1): https://developers.google.com/earth-engine/datasets/catalog/MODIS_061_MOD09Q1
        - Aqua  (MYD09Q1): https://developers.google.com/earth-engine/datasets/catalog/MODIS_061_MYD09Q1
    """

    def __init__(
        self,
        bands: list[str] | None = None,
        indices: list[str] | None = None,
        use_cloud_mask: bool = True,
        min_valid_pixel_count: int = 2,
        border_pixels_to_erode: float = 0.5,
        min_area_to_keep_border: int = 190_000,
    ) -> None:
        bands = sorted({"red", "nir"}) if bands is None else sorted(bands)

        indices = [] if indices is None else sorted(indices)

        super().__init__()

        self.shortName = "modis8days"
        self.pixelSize = 250
        self.startDate = "2000-02-18"
        self.endDate = "2050-01-01"

        self._terra = "MODIS/061/MOD09Q1"
        self._aqua = "MODIS/061/MYD09Q1"

        self.availableBands = {
            "red": "sur_refl_b01",
            "nir": "sur_refl_b02",
        }

        self.selectedBands: list[tuple[str, str]] = [(band, f"{(n + 10):02}_{band}") for n, band in enumerate(bands)]

        self.selectedIndices: list[tuple[str, str, str]] = [
            (self.availableIndices[indice_name], indice_name, f"{(n + 40):02}_{indice_name}")
            for n, indice_name in enumerate(indices)
        ]

        self.useCloudMask = use_cloud_mask
        self.minValidPixelCount = min_valid_pixel_count
        self.minAreaToKeepBorder = min_area_to_keep_border
        self.borderPixelsToErode = border_pixels_to_erode

        self.toDownloadSelectors = [numeral_band_name for _, numeral_band_name in self.selectedBands] + [
            numeral_indice_name for _, _, numeral_indice_name in self.selectedIndices
        ]

    @staticmethod
    def _mask_modis8days_clouds(img: ee.Image) -> ee.Image:
        """Mask cloudy pixels based on bits 0-1 of 'State' QA band."""
        qa = img.select("State")
        cloud_state = qa.bitwiseAnd(3)  # 3 == 0b11
        return img.updateMask(cloud_state.eq(0))

    def imageCollection(self, ee_feature: ee.Feature) -> ee.ImageCollection:
        ee_geometry = ee_feature.geometry()

        if self.borderPixelsToErode != 0:
            ee_geometry = ee_safe_remove_borders(
                ee_geometry, round(self.borderPixelsToErode * self.pixelSize), self.minAreaToKeepBorder
            )
            ee_feature = ee_feature.setGeometry(ee_geometry)

        ee_filter = ee.Filter.And(
            ee.Filter.bounds(ee_geometry),
            ee.Filter.date(ee_feature.get("s"), ee_feature.get("e")),
        )

        def _base(path: str) -> ee.ImageCollection:
            collection = ee.ImageCollection(path).filter(ee_filter)
            if self.useCloudMask:
                collection = collection.map(self._mask_modis8days_clouds)

            return collection.select(
                list(self.availableBands.values()),
                list(self.availableBands.keys()),
            )

        terra = _base(self._terra)
        aqua = _base(self._aqua)

        modis_imgc = terra.merge(aqua)

        # Shift and divide so that raw values in the MOD09Q1 valid range (-100 to 16000) map onto 0 to 1.
        modis_imgc = modis_imgc.map(
            lambda img: ee.Image(img).addBands(ee.Image(img).add(100).divide(16_100), overwrite=True)
        )

        if self.selectedIndices:
            modis_imgc = modis_imgc.map(
                partial(ee_add_indexes_to_image, indexes=[expression for (expression, _, _) in self.selectedIndices])
            )

        modis_imgc = modis_imgc.select(
            [natural_band_name for natural_band_name, _ in self.selectedBands]
            + [indice_name for _, indice_name, _ in self.selectedIndices],
            [numeral_band_name for _, numeral_band_name in self.selectedBands]
            + [numeral_indice_name for _, _, numeral_indice_name in self.selectedIndices],
        )

        modis_imgc = ee_filter_img_collection_invalid_pixels(
            modis_imgc, ee_geometry, self.pixelSize, self.minValidPixelCount
        )

        return ee.ImageCollection(modis_imgc)

    def compute(
        self,
        ee_feature: ee.Feature,
        subsampling_max_pixels: float,
        reducers: set[str] | None = None,
    ) -> ee.FeatureCollection:
        geom = ee_feature.geometry()
        geom = ee_safe_remove_borders(geom, self.pixelSize // 2, 190_000)
        ee_feature = ee_feature.setGeometry(geom)

        modis = self.imageCollection(ee_feature)

        feats = modis.map(
            partial(
                ee_map_bands_and_doy,
                ee_feature=ee_feature,
                pixel_size=self.pixelSize,
                subsampling_max_pixels=ee_get_number_of_pixels(geom, subsampling_max_pixels, self.pixelSize),
                reducer=ee_get_reducers(reducers),
            )
        )
        return feats
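
The constructor above derives numbered output names from the selected bands and indices: bands are sorted and numbered from 10, indices are sorted and numbered from 40. The sketch below reproduces that naming with plain lists. `numbered_names` is a hypothetical helper, and `"ndvi"` is only an assumed example of an index name.

```python
from typing import Iterable, Optional


def numbered_names(bands: Optional[Iterable[str]] = None, indices: Optional[Iterable[str]] = None) -> list:
    """Build the download selectors the way Modis8Days.__init__ does."""
    bands = sorted({"red", "nir"}) if bands is None else sorted(bands)
    indices = [] if indices is None else sorted(indices)
    band_names = [f"{(n + 10):02}_{band}" for n, band in enumerate(bands)]
    index_names = [f"{(n + 40):02}_{name}" for n, name in enumerate(indices)]
    return band_names + index_names


print(numbered_names())
print(numbered_names(["red"], ["ndvi"]))
```

Because the names are sorted before numbering, the default selection puts `nir` before `red`.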

_mask_modis8days_clouds(img) staticmethod

Mask cloudy pixels based on bits 0-1 of 'State' QA band.

Source code in agrigee_lite/sat/modis.py
@staticmethod
def _mask_modis8days_clouds(img: ee.Image) -> ee.Image:
    """Mask cloudy pixels based on bits 0-1 of 'State' QA band."""
    qa = img.select("State")
    cloud_state = qa.bitwiseAnd(3)  # 3 == 0b11
    return img.updateMask(cloud_state.eq(0))
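
The bit test used by this mask can be checked without Earth Engine. The sketch below applies the same `& 3` logic to plain Python integers; the helper name `is_clear_8days` and the sample QA values are hypothetical and serve only to exercise the test.

```python
def is_clear_8days(state_qa: int) -> bool:
    """Return True when bits 0-1 of a 'State' QA value are both zero."""
    return (state_qa & 0b11) == 0  # 0b11 == 3, the same mask as in the method


# Hypothetical QA values (not taken from real imagery).
samples = [0b0000, 0b0001, 0b0010, 0b0100, 0b1111]
clear_flags = [is_clear_8days(value) for value in samples]
print(clear_flags)
```

Only the two lowest bits matter: `0b0100` still counts as clear because higher bits are ignored by the mask.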

ModisDaily

Bases: OpticalSatellite

⚠️⚠️⚠️ Note: Despite this cloud mask, daily MODIS imagery tends to have a high presence of residual clouds. It is recommended to use Modis8Days for cleaner data. ⚠️⚠️⚠️

Satellite abstraction for MODIS Terra and Aqua (Daily composites).

MODIS (Moderate Resolution Imaging Spectroradiometer) is a key instrument aboard NASA's Terra and Aqua satellites, offering daily global coverage for environmental and land surface monitoring.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| bands | set of str | Set of bands to select. Defaults to ['red', 'nir']. | None |
| indices | set of str | Set of spectral indices to compute from the selected bands. | None |
| use_cloud_mask | bool | Whether to apply cloud masking using bit 10 of the 'state_1km' QA band. When set to False, no cloud filtering is applied (results may be ULTRA NOISY). | True |
| min_valid_pixel_count | int | Minimum number of valid (non-cloud) pixels required to retain an image. | 2 |
| border_pixels_to_erode | float | Number of pixels to erode from the geometry border. | 0.5 |
| min_area_to_keep_border | int | Minimum area (in m²) required to retain geometry after border erosion. | 190_000 |

Cloud Masking

Cloudy pixels are masked using bit 10 of the 'state_1km' QA band:

- 0: clear
- 1: cloudy

Only pixels with bit 10 equal to 0 (clear) are retained.

Satellite Information

+----------------------------+------------------------+
| Field                      | Value                  |
+----------------------------+------------------------+
| Name                       | MODIS (Daily)          |
| Platforms                  | Terra, Aqua            |
| Temporal Resolution        | 1 day                  |
| Pixel Size                 | 250 meters             |
| Coverage                   | Global                 |
+----------------------------+------------------------+

Collection Dates

+--------+------------+------------+
| Source | Start Date | End Date   |
+--------+------------+------------+
| Terra  | 2000-02-24 | present    |
| Aqua   | 2002-07-04 | present    |
+--------+------------+------------+

Band Information

+-----------+----------------+----------------+------------------------+
| Band Name | Original Band  | Resolution     | Spectral Wavelength    |
+-----------+----------------+----------------+------------------------+
| red       | sur_refl_b01   | 250 meters     | 620-670 nm             |
| nir       | sur_refl_b02   | 250 meters     | 841-876 nm             |
+-----------+----------------+----------------+------------------------+

Notes

Cloud Mask Reference (QA 'state_1km' band documentation): https://lpdaac.usgs.gov/documents/925/MOD09_User_Guide_V61.pdf

MODIS Collections on Google Earth Engine:

- Terra (MOD09GQ - reflectance): https://developers.google.com/earth-engine/datasets/catalog/MODIS_061_MOD09GQ
- Terra (MOD09GA - QA band): https://developers.google.com/earth-engine/datasets/catalog/MODIS_061_MOD09GA
- Aqua (MYD09GQ - reflectance): https://developers.google.com/earth-engine/datasets/catalog/MODIS_061_MYD09GQ
- Aqua (MYD09GA - QA band): https://developers.google.com/earth-engine/datasets/catalog/MODIS_061_MYD09GA

Source code in agrigee_lite/sat/modis.py
class ModisDaily(OpticalSatellite):
    """
    ⚠️⚠️⚠️  Note: Despite this cloud mask, daily MODIS imagery tends to have **a high presence of residual clouds**. It is recommended to use Modis8Days for cleaner data. ⚠️⚠️⚠️

    Satellite abstraction for MODIS Terra and Aqua (Daily composites).

    MODIS (Moderate Resolution Imaging Spectroradiometer) is a key instrument aboard NASA's Terra and Aqua satellites,
    offering daily global coverage for environmental and land surface monitoring.

    Parameters
    ----------
    bands : set of str, optional
        Set of bands to select. Defaults to ['red', 'nir'].
    indices : set of str, optional
        Set of spectral indices to compute from the selected bands.
    use_cloud_mask : bool, default=True
        Whether to apply cloud masking using bit 10 of the 'state_1km' QA band.
        When set to False, no cloud filtering is applied (results may be ULTRA NOISY).
    min_valid_pixel_count : int, default=2
        Minimum number of valid (non-cloud) pixels required to retain an image.
    border_pixels_to_erode : float, default=0.5
        Number of pixels to erode from the geometry border.
    min_area_to_keep_border : int, default=190_000
        Minimum area (in m²) required to retain geometry after border erosion.

    Cloud Masking
    -------------
    Cloudy pixels are masked using bit 10 of the 'state_1km' QA band:
        - 0: clear
        - 1: cloudy

    Only pixels with bit 10 equal to 0 (clear) are retained.

    Satellite Information
    ---------------------
    +----------------------------+------------------------+
    | Field                      | Value                  |
    +----------------------------+------------------------+
    | Name                       | MODIS (Daily)          |
    | Platforms                  | Terra, Aqua            |
    | Temporal Resolution        | 1 day                  |
    | Pixel Size                 | 250 meters             |
    | Coverage                   | Global                 |
    +----------------------------+------------------------+

    Collection Dates
    ----------------
    +--------+------------+------------+
    | Source | Start Date | End Date   |
    +--------+------------+------------+
    | Terra  | 2000-02-24 | present    |
    | Aqua   | 2002-07-04 | present    |
    +--------+------------+------------+

    Band Information
    ----------------
    +-----------+----------------+----------------+------------------------+
    | Band Name | Original Band  | Resolution     | Spectral Wavelength    |
    +-----------+----------------+----------------+------------------------+
    | red       | sur_refl_b01   | 250 meters     | 620-670 nm             |
    | nir       | sur_refl_b02   | 250 meters     | 841-876 nm             |
    +-----------+----------------+----------------+------------------------+

    Notes
    -----
    Cloud Mask Reference (QA 'state_1km' band documentation):
        https://lpdaac.usgs.gov/documents/925/MOD09_User_Guide_V61.pdf

    MODIS Collections on Google Earth Engine:
        - Terra (MOD09GQ - reflectance): https://developers.google.com/earth-engine/datasets/catalog/MODIS_061_MOD09GQ
        - Terra (MOD09GA - QA band):     https://developers.google.com/earth-engine/datasets/catalog/MODIS_061_MOD09GA
        - Aqua  (MYD09GQ - reflectance): https://developers.google.com/earth-engine/datasets/catalog/MODIS_061_MYD09GQ
        - Aqua  (MYD09GA - QA band):     https://developers.google.com/earth-engine/datasets/catalog/MODIS_061_MYD09GA
    """

    def __init__(
        self,
        bands: set[str] | None = None,
        indices: set[str] | None = None,
        use_cloud_mask: bool = True,
        min_valid_pixel_count: int = 2,
        border_pixels_to_erode: float = 0.5,
        min_area_to_keep_border: int = 190_000,
    ) -> None:
        bands = sorted({"red", "nir"}) if bands is None else sorted(bands)

        indices = [] if indices is None else sorted(indices)

        super().__init__()

        self.shortName = "modis"
        self.pixelSize = 250
        self.startDate = "2000-02-24"
        self.endDate = "2050-01-01"

        self._terra_vis = "MODIS/061/MOD09GQ"
        self._terra_qa = "MODIS/061/MOD09GA"
        self._aqua_vis = "MODIS/061/MYD09GQ"
        self._aqua_qa = "MODIS/061/MYD09GA"

        self.availableBands = {
            "red": "sur_refl_b01",
            "nir": "sur_refl_b02",
        }

        self.selectedBands: list[tuple[str, str]] = [(band, f"{(n + 10):02}_{band}") for n, band in enumerate(bands)]

        self.selectedIndices: list[tuple[str, str, str]] = [
            (self.availableIndices[indice_name], indice_name, f"{(n + 40):02}_{indice_name}")
            for n, indice_name in enumerate(indices)
        ]

        self.useCloudMask = use_cloud_mask
        self.minValidPixelCount = min_valid_pixel_count
        self.minAreaToKeepBorder = min_area_to_keep_border
        self.borderPixelsToErode = border_pixels_to_erode

        self.toDownloadSelectors = [numeral_band_name for _, numeral_band_name in self.selectedBands] + [
            numeral_indice_name for _, _, numeral_indice_name in self.selectedIndices
        ]

    @staticmethod
    def _mask_modis_clouds(img: ee.Image) -> ee.Image:
        """Bit-test bit 10 of *state_1km* (value 0 = clear)."""
        qa = img.select("state_1km")
        bit_mask = 1 << 10
        return img.updateMask(qa.bitwiseAnd(bit_mask).eq(0))

    def imageCollection(self, ee_feature: ee.Feature) -> ee.ImageCollection:
        """
        Build the merged Terra + Aqua image collection, applying the bit-10
        cloud mask when use_cloud_mask is enabled.
        """
        ee_geometry = ee_feature.geometry()
        ee_filter = ee.Filter.And(
            ee.Filter.bounds(ee_geometry),
            ee.Filter.date(ee_feature.get("s"), ee_feature.get("e")),
        )

        def _base(vis: str, qa: str) -> ee.ImageCollection:
            collection = ee.ImageCollection(vis).linkCollection(ee.ImageCollection(qa), ["state_1km"]).filter(ee_filter)
            if self.useCloudMask:
                collection = collection.map(self._mask_modis_clouds)

            return collection.select(
                list(self.availableBands.values()),
                list(self.availableBands.keys()),
            )

        terra = _base(self._terra_vis, self._terra_qa)
        aqua = _base(self._aqua_vis, self._aqua_qa)

        modis_imgc = terra.merge(aqua)

        modis_imgc = modis_imgc.map(
            lambda img: ee.Image(img).addBands(ee.Image(img).add(100).divide(16_100), overwrite=True)
        )

        if self.selectedIndices:
            modis_imgc = modis_imgc.map(
                partial(ee_add_indexes_to_image, indexes=[expression for (expression, _, _) in self.selectedIndices])
            )

        modis_imgc = modis_imgc.select(
            [natural_band_name for natural_band_name, _ in self.selectedBands]
            + [indice_name for _, indice_name, _ in self.selectedIndices],
            [numeral_band_name for _, numeral_band_name in self.selectedBands]
            + [numeral_indice_name for _, _, numeral_indice_name in self.selectedIndices],
        )

        modis_imgc = ee_filter_img_collection_invalid_pixels(
            modis_imgc, ee_geometry, self.pixelSize, self.minValidPixelCount
        )

        return ee.ImageCollection(modis_imgc)

    def compute(
        self,
        ee_feature: ee.Feature,
        subsampling_max_pixels: float,
        reducers: set[str] | None = None,
    ) -> ee.FeatureCollection:
        ee_geometry = ee_feature.geometry()

        if self.borderPixelsToErode != 0:
            ee_geometry = ee_safe_remove_borders(
                ee_geometry, round(self.borderPixelsToErode * self.pixelSize), self.minAreaToKeepBorder
            )
            ee_feature = ee_feature.setGeometry(ee_geometry)

        modis = self.imageCollection(ee_feature)

        feats = modis.map(
            partial(
                ee_map_bands_and_doy,
                ee_feature=ee_feature,
                pixel_size=self.pixelSize,
                subsampling_max_pixels=ee_get_number_of_pixels(ee_geometry, subsampling_max_pixels, self.pixelSize),
                reducer=ee_get_reducers(reducers),
            )
        )
        return feats
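
The constructor above renames every selected band and index with a two-digit numeric prefix (bands start at 10, indices at 40). The following sketch reproduces only that naming step in plain Python; the index name `ndvi` is a hypothetical placeholder, since the contents of `availableIndices` are not shown on this page.

```python
# Same defaults as ModisDaily.__init__: bands are sorted alphabetically first.
bands = sorted({"red", "nir"})
selected_bands = [(band, f"{(n + 10):02}_{band}") for n, band in enumerate(bands)]

# Hypothetical index name; the real names come from availableIndices.
indices = sorted({"ndvi"})
selected_index_names = [f"{(n + 40):02}_{name}" for n, name in enumerate(indices)]

# Mirrors how toDownloadSelectors concatenates band names and index names.
to_download = [renamed for _, renamed in selected_bands] + selected_index_names
print(selected_bands)
print(to_download)
```

Because the bands are sorted before numbering, `nir` receives prefix 10 and `red` receives prefix 11 regardless of the order in which they were passed.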

_mask_modis_clouds(img) staticmethod

Bit-test bit 10 of state_1km (value 0 = clear).

Source code in agrigee_lite/sat/modis.py
@staticmethod
def _mask_modis_clouds(img: ee.Image) -> ee.Image:
    """Bit-test bit 10 of *state_1km* (value 0 = clear)."""
    qa = img.select("state_1km")
    bit_mask = 1 << 10
    return img.updateMask(qa.bitwiseAnd(bit_mask).eq(0))
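
As a quick check of the bit arithmetic, the same test can be run on plain integers. This is a sketch only: the helper name `is_clear_daily` and the sample values are hypothetical, and no Earth Engine call is involved.

```python
BIT_10 = 1 << 10  # 1024, the same mask value built in _mask_modis_clouds


def is_clear_daily(state_1km: int) -> bool:
    """Return True when bit 10 of a 'state_1km' QA value is zero."""
    return (state_1km & BIT_10) == 0


# Hypothetical QA values chosen to sit on both sides of bit 10.
samples = [0, 1023, 1024, 1025, 2048]
clear_flags = [is_clear_daily(value) for value in samples]
print(clear_flags)
```

A value such as 1023 has every bit below bit 10 set and still passes, which shows that this mask ignores all other QA flags.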

imageCollection(ee_feature)

Build the merged Terra + Aqua image collection, applying the bit-10 cloud mask when use_cloud_mask is enabled.

Source code in agrigee_lite/sat/modis.py
def imageCollection(self, ee_feature: ee.Feature) -> ee.ImageCollection:
    """
    Build the merged Terra + Aqua image collection, applying the bit-10
    cloud mask when use_cloud_mask is enabled.
    """
    ee_geometry = ee_feature.geometry()
    ee_filter = ee.Filter.And(
        ee.Filter.bounds(ee_geometry),
        ee.Filter.date(ee_feature.get("s"), ee_feature.get("e")),
    )

    def _base(vis: str, qa: str) -> ee.ImageCollection:
        collection = ee.ImageCollection(vis).linkCollection(ee.ImageCollection(qa), ["state_1km"]).filter(ee_filter)
        if self.useCloudMask:
            collection = collection.map(self._mask_modis_clouds)

        return collection.select(
            list(self.availableBands.values()),
            list(self.availableBands.keys()),
        )

    terra = _base(self._terra_vis, self._terra_qa)
    aqua = _base(self._aqua_vis, self._aqua_qa)

    modis_imgc = terra.merge(aqua)

    modis_imgc = modis_imgc.map(
        lambda img: ee.Image(img).addBands(ee.Image(img).add(100).divide(16_100), overwrite=True)
    )

    if self.selectedIndices:
        modis_imgc = modis_imgc.map(
            partial(ee_add_indexes_to_image, indexes=[expression for (expression, _, _) in self.selectedIndices])
        )

    modis_imgc = modis_imgc.select(
        [natural_band_name for natural_band_name, _ in self.selectedBands]
        + [indice_name for _, indice_name, _ in self.selectedIndices],
        [numeral_band_name for _, numeral_band_name in self.selectedBands]
        + [numeral_indice_name for _, _, numeral_indice_name in self.selectedIndices],
    )

    modis_imgc = ee_filter_img_collection_invalid_pixels(
        modis_imgc, ee_geometry, self.pixelSize, self.minValidPixelCount
    )

    return ee.ImageCollection(modis_imgc)
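
The rescaling step in `imageCollection` adds 100 to each band and divides by 16 100. Assuming the stored reflectance values span the nominal MOD09 valid range of -100 to 16000 (an assumption taken from the product description, not stated on this page), that maps the range onto [0, 1]. A minimal sketch with a hypothetical helper name:

```python
def rescale_reflectance(dn: float) -> float:
    """Apply the same arithmetic as the map step: (dn + 100) / 16100."""
    return (dn + 100) / 16_100


# Endpoints and midpoint of the assumed valid range.
low = rescale_reflectance(-100)
mid = rescale_reflectance(7_950)
high = rescale_reflectance(16_000)
print(low, mid, high)
```

Note that this is a min-max style normalization, which differs from multiplying by the 0.0001 scale factor to obtain physical reflectance.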

PALSAR2ScanSAR

Bases: RadarSatellite

Satellite abstraction for ALOS PALSAR-2 ScanSAR (Level 2.2).

PALSAR-2 is an L-band Synthetic Aperture Radar (SAR) sensor onboard the ALOS-2 satellite, operated by JAXA. This class provides preprocessing and abstraction for the Level 2.2 ScanSAR data product with 25-meter resolution. Optionally applies the MSK quality mask.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| bands | set of str | Set of bands to select. Defaults to ['hh', 'hv']. | None |
| indices | set of str | Radar indices to compute (e.g., polarization ratios). Defaults to []. | None |
| use_quality_mask | bool | Whether to apply the MSK bitmask quality filter. If False, all pixels are retained, including those marked as low-quality or invalid. | True |
| min_valid_pixel_count | int | Minimum number of valid (unmasked) pixels required to retain an image. | 20 |
| border_pixels_to_erode | float | Number of pixels to erode from the geometry border. | 1 |
| min_area_to_keep_border | int | Minimum area (in m²) required to retain geometry after border erosion. | 35_000 |

Quality Masking

When use_quality_mask=True, the MSK band is used to filter out invalid pixels. The first 3 bits of the MSK band indicate data quality:

- 1 → Valid
- 5 → Invalid

Only pixels with value 1 are retained.

Satellite Information

+----------------------------+-------------------------------+
| Field                      | Value                         |
+----------------------------+-------------------------------+
| Name                       | ALOS PALSAR-2 ScanSAR         |
| Sensor                     | PALSAR-2 (L-band SAR)         |
| Platform                   | ALOS-2                        |
| Revisit Time               | ~14 days                      |
| Pixel Size                 | ~25 meters                    |
| Coverage                   | Japan + selected global areas |
+----------------------------+-------------------------------+

Collection Dates

+----------------+-------------+------------+
| Collection     | Start Date  | End Date   |
+----------------+-------------+------------+
| Level 2.2      | 2014-08-04  | present    |
+----------------+-------------+------------+

Band Information

+-----------+---------+------------+-------------------------------------------+
| Band Name | Type    | Resolution | Description                               |
+-----------+---------+------------+-------------------------------------------+
| hh        | L-band  | ~25 m      | Horizontal transmit and receive           |
| hv        | L-band  | ~25 m      | Horizontal transmit, vertical receive     |
| msk       | Bitmask | ~25 m      | MSK quality band (used only if enabled)   |
+-----------+---------+------------+-------------------------------------------+

Notes
  • Earth Engine Dataset: https://developers.google.com/earth-engine/datasets/catalog/JAXA_ALOS_PALSAR-2_Level2_2_ScanSAR

  • MSK Quality Mask Details (bit pattern): https://www.eorc.jaxa.jp/ALOS/en/palsar_fnf/data/Format_PALSAR-2.html

Source code in agrigee_lite/sat/palsar.py
class PALSAR2ScanSAR(RadarSatellite):
    """
    Satellite abstraction for ALOS PALSAR-2 ScanSAR (Level 2.2).

    PALSAR-2 is an L-band Synthetic Aperture Radar (SAR) sensor onboard the ALOS-2 satellite,
    operated by JAXA. This class provides preprocessing and abstraction for the Level 2.2
    ScanSAR data product with 25-meter resolution. Optionally applies the MSK quality mask.

    Parameters
    ----------
    bands : set of str, optional
        Set of bands to select. Defaults to ['hh', 'hv'].
    indices : set of str, optional
        Radar indices to compute (e.g., polarization ratios). Defaults to [].
    use_quality_mask : bool, default=True
        Whether to apply the MSK bitmask quality filter. If False, all pixels are retained,
        including those marked as low-quality or invalid.
    min_valid_pixel_count : int, default=20
        Minimum number of valid (unmasked) pixels required to retain an image.
    border_pixels_to_erode : float, default=1
        Number of pixels to erode from the geometry border.
    min_area_to_keep_border : int, default=35_000
        Minimum area (in m²) required to retain geometry after border erosion.

    Quality Masking
    ---------------
    When `use_quality_mask=True`, the `MSK` band is used to filter out invalid pixels.
    The first 3 bits of the `MSK` band indicate data quality:
        - 1 → Valid
        - 5 → Invalid
    Only pixels with value 1 are retained.

    Satellite Information
    ---------------------
    +----------------------------+-------------------------------+
    | Field                      | Value                         |
    +----------------------------+-------------------------------+
    | Name                       | ALOS PALSAR-2 ScanSAR         |
    | Sensor                     | PALSAR-2 (L-band SAR)         |
    | Platform                   | ALOS-2                        |
    | Revisit Time               | ~14 days                      |
    | Pixel Size                 | ~25 meters                    |
    | Coverage                   | Japan + selected global areas |
    +----------------------------+-------------------------------+

    Collection Dates
    ----------------
    +----------------+-------------+------------+
    | Collection     | Start Date  | End Date   |
    +----------------+-------------+------------+
    | Level 2.2      | 2014-08-04  | present    |
    +----------------+-------------+------------+

    Band Information
    ----------------
    +-----------+---------+------------+-------------------------------------------+
    | Band Name | Type    | Resolution | Description                               |
    +-----------+---------+------------+-------------------------------------------+
    | hh        | L-band  | ~25 m      | Horizontal transmit and receive           |
    | hv        | L-band  | ~25 m      | Horizontal transmit, vertical receive     |
    | msk       | Bitmask | ~25 m      | MSK quality band (used only if enabled)   |
    +-----------+---------+------------+-------------------------------------------+

    Notes
    -----
    - Earth Engine Dataset:
        https://developers.google.com/earth-engine/datasets/catalog/JAXA_ALOS_PALSAR-2_Level2_2_ScanSAR

    - MSK Quality Mask Details (bit pattern):
        https://www.eorc.jaxa.jp/ALOS/en/palsar_fnf/data/Format_PALSAR-2.html
    """

    def __init__(
        self,
        bands: set[str] | None = None,
        indices: set[str] | None = None,
        use_quality_mask: bool = True,
        min_valid_pixel_count: int = 20,
        border_pixels_to_erode: float = 1,
        min_area_to_keep_border: int = 35000,
    ):
        bands = sorted({"hh", "hv"}) if bands is None else sorted(bands)

        indices = [] if indices is None else sorted(indices)

        super().__init__()

        self.imageCollectionName: str = "JAXA/ALOS/PALSAR-2/Level2_2/ScanSAR"
        self.pixelSize: int = 25
        self.startDate: str = "2014-08-04"
        self.endDate: str = "2050-01-01"
        self.shortName: str = "palsar2"

        self.availableBands: dict[str, str] = {"hh": "HH", "hv": "HV"}

        self.selectedBands: list[tuple[str, str]] = [(band, f"{(n + 10):02}_{band}") for n, band in enumerate(bands)]

        self.selectedIndices: list[tuple[str, str, str]] = [
            (self.availableIndices[indice_name], indice_name, f"{(n + 40):02}_{indice_name}")
            for n, indice_name in enumerate(indices)
        ]

        self.use_quality_mask = use_quality_mask
        self.minValidPixelCount = min_valid_pixel_count
        self.minAreaToKeepBorder = min_area_to_keep_border
        self.borderPixelsToErode = border_pixels_to_erode

        self.toDownloadSelectors = [numeral_band_name for _, numeral_band_name in self.selectedBands] + [
            numeral_indice_name for _, _, numeral_indice_name in self.selectedIndices
        ]

    @staticmethod
    def _mask_quality(img: ee.Image) -> ee.Image:
        """
        Apply MSK quality mask to exclude invalid data.

        MSK bits 0-2 indicate data quality:
            1 = valid data
            5 = invalid

        Parameters
        ----------
        img : ee.Image

        Returns
        -------
        ee.Image
        """
        mask = img.select("MSK")
        quality = mask.bitwiseAnd(0b111)
        valid = quality.eq(1)
        return img.updateMask(valid)

    def imageCollection(self, ee_feature: ee.Feature) -> ee.ImageCollection:
        ee_geometry = ee_feature.geometry()
        ee_start = ee_feature.get("s")
        ee_end = ee_feature.get("e")

        ee_filter = ee.Filter.And(ee.Filter.bounds(ee_geometry), ee.Filter.date(ee_start, ee_end))

        palsar_img = ee.ImageCollection(self.imageCollectionName).filter(ee_filter)

        if self.use_quality_mask:
            palsar_img = palsar_img.map(self._mask_quality)

        palsar_img = palsar_img.select(
            [self.availableBands[b] for b, _ in self.selectedBands], [b for b, _ in self.selectedBands]
        )

        palsar_img = palsar_img.map(
            lambda img: ee.Image(img).addBands(ee.Image(img).pow(2).log10().multiply(10).subtract(83), overwrite=True)
        )

        if self.selectedIndices:
            palsar_img = palsar_img.map(
                partial(ee_add_indexes_to_image, indexes=[expression for (expression, _, _) in self.selectedIndices])
            )

        palsar_img = palsar_img.select(
            [natural_band_name for natural_band_name, _ in self.selectedBands]
            + [indice_name for _, indice_name, _ in self.selectedIndices],
            [numeral_band_name for _, numeral_band_name in self.selectedBands]
            + [numeral_indice_name for _, _, numeral_indice_name in self.selectedIndices],
        )

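        # Use the configured threshold rather than a hard-coded 20 so that
        # min_valid_pixel_count actually takes effect.
        palsar_img = ee_filter_img_collection_invalid_pixels(
            palsar_img, ee_geometry, self.pixelSize, self.minValidPixelCount
        )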

        return palsar_img

    def compute(
        self,
        ee_feature: ee.Feature,
        subsampling_max_pixels: float,
        reducers: set[str] | None = None,
    ) -> ee.FeatureCollection:
        ee_geometry = ee_feature.geometry()

        if self.borderPixelsToErode != 0:
            ee_geometry = ee_safe_remove_borders(
                ee_geometry, round(self.borderPixelsToErode * self.pixelSize), self.minAreaToKeepBorder
            )
            ee_feature = ee_feature.setGeometry(ee_geometry)

        palsar_img = self.imageCollection(ee_feature)

        features = palsar_img.map(
            partial(
                ee_map_bands_and_doy,
                ee_feature=ee_feature,
                pixel_size=self.pixelSize,
                subsampling_max_pixels=ee_get_number_of_pixels(ee_geometry, subsampling_max_pixels, self.pixelSize),
                reducer=ee_get_reducers(reducers),
            )
        )

        return features
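
The map step in `imageCollection` converts the raw HH/HV digital numbers to decibels with `10 * log10(DN^2) - 83`. A plain-Python sketch of the same arithmetic follows; the helper name `dn_to_db` and the sample digital numbers are hypothetical.

```python
import math


def dn_to_db(dn: float) -> float:
    """Convert a positive digital number to dB: 10 * log10(dn**2) - 83."""
    return 10 * math.log10(dn**2) - 83.0


# Hypothetical digital numbers used only to illustrate the formula.
db_for_1 = dn_to_db(1)
db_for_10 = dn_to_db(10)
print(db_for_1, db_for_10)
```

In this scalar version a digital number of 0 raises `ValueError`, so such values would need to be masked or filtered beforehand.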

_mask_quality(img) staticmethod

Apply MSK quality mask to exclude invalid data.

MSK bits 0-2 indicate data quality:

- 1 = valid data
- 5 = invalid

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| img | Image | | required |

Returns:

| Type | Description |
|------|-------------|
| Image | |

Source code in agrigee_lite/sat/palsar.py
@staticmethod
def _mask_quality(img: ee.Image) -> ee.Image:
    """
    Apply MSK quality mask to exclude invalid data.

    MSK bits 0-2 indicate data quality:
        1 = valid data
        5 = invalid

    Parameters
    ----------
    img : ee.Image

    Returns
    -------
    ee.Image
    """
    mask = img.select("MSK")
    quality = mask.bitwiseAnd(0b111)
    valid = quality.eq(1)
    return img.updateMask(valid)
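
The quality test keeps a pixel only when the three lowest MSK bits equal 1. The sketch below runs the same comparison on plain integers; the helper name `is_valid_msk` and the sample values are hypothetical.

```python
def is_valid_msk(msk: int) -> bool:
    """Return True when bits 0-2 of an MSK value equal 1 (valid data)."""
    return (msk & 0b111) == 1


# Hypothetical MSK values: 1 and 9 share the same low three bits (001).
samples = [0, 1, 5, 9]
valid_flags = [is_valid_msk(value) for value in samples]
print(valid_flags)
```

Bits above bit 2 do not influence the result, which is why 9 (`0b1001`) passes while 5 (`0b101`) does not.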

SatelliteEmbedding

Bases: DataSourceSatellite

Satellite abstraction for the Google Satellite Embedding collection.

This collection contains annual, manually curated embeddings derived from multi-sensor satellite data.

IMPORTANT: The value reported as the median is always the value at the geometry's center point (in order to maintain z-sphere normalization), and the standard deviation is computed over the geometry with its borders eroded.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| bands | list of str | List of bands to select. Defaults to all 64 embeddings. | None |
| min_valid_pixel_count | int | Minimum number of valid (non-eroded) pixels required to retain an image. | 1 |
| border_pixels_to_erode | float | Number of pixels to erode from the geometry border. | 1 |
| min_area_to_keep_border | int | Minimum area (in m²) required to retain geometry after border erosion. | 35_000 |

Satellite Information

+-----------------------------+-----------------------+
| Field                       | Value                 |
+-----------------------------+-----------------------+
| Name                        | Satellite Embedding   |
| Embedding Dimensions        | 64 (A00 to A63)       |
| Pixel Size                  | ~10 meters            |
| Temporal Resolution         | Annual                |
| Coverage                    | Global                |
+-----------------------------+-----------------------+

Collection Dates

+------------+------------+
| Start Date | End Date   |
+------------+------------+
| 2017-01-01 | 2024-01-02 |
+------------+------------+

Notes

Satellite Embedding V1:

- Dataset: https://developers.google.com/earth-engine/datasets/catalog/GOOGLE_SATELLITE_EMBEDDING_V1_ANNUAL?hl=pt-br#bands
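
When `bands` is omitted, the constructor generates the 64 zero-padded band names and then applies the same numeric-prefix renaming used by the other satellite classes. A short sketch of just that naming logic, using the expressions from `__init__`:

```python
# Default band names: zero-padded to two digits, A00 through A63.
bands = [f"A{i:02}" for i in range(64)]

# Each band is paired with a prefixed name, starting at 10.
selected_bands = [(band, f"{(n + 10):02}_{band}") for n, band in enumerate(bands)]

print(bands[0], bands[-1], len(bands))
print(selected_bands[0], selected_bands[-1])
```

With 64 bands the prefixes run from 10 to 73, so they stay two digits wide.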

Source code in agrigee_lite/sat/embeddings.py
class SatelliteEmbedding(DataSourceSatellite):
    """
    Satellite abstraction for the Google Satellite Embedding collection.

    This collection contains annual, manually curated embeddings derived from multi-sensor satellite data.

    IMPORTANT: The value reported as the median is always the value at the geometry's center point (in order to maintain z-sphere normalization), and the standard deviation is computed over the geometry with its borders eroded.

    Parameters
    ----------
    bands : list of str, optional
        List of bands to select. Defaults to all 64 embeddings.
    min_valid_pixel_count : int, default=1
        Minimum number of valid (non-eroded) pixels required to retain an image.
    border_pixels_to_erode : float, default=1
        Number of pixels to erode from the geometry border.
    min_area_to_keep_border : int, default=35_000
        Minimum area (in m²) required to retain geometry after border erosion.

    Satellite Information
    ---------------------
    +-----------------------------+-----------------------+
    | Field                       | Value                 |
    +-----------------------------+-----------------------+
    | Name                        | Satellite Embedding   |
    | Embedding Dimensions        | 64 (A00 to A63)       |
    | Pixel Size                  | ~10 meters            |
    | Temporal Resolution         | Annual                |
    | Coverage                    | Global                |
    +-----------------------------+-----------------------+

    Collection Dates
    ----------------
    +------------+------------+
    | Start Date | End Date   |
    +------------+------------+
    | 2017-01-01 | 2024-01-02 |
    +------------+------------+

    Notes
    -----
    Satellite Embedding V1:
        - Dataset: https://developers.google.com/earth-engine/datasets/catalog/GOOGLE_SATELLITE_EMBEDDING_V1_ANNUAL?hl=pt-br#bands
    """

    def __init__(
        self,
        bands: list[str] | None = None,
        min_valid_pixel_count: int = 1,
        border_pixels_to_erode: float = 1,
        min_area_to_keep_border: int = 35000,
    ):
        super().__init__()

        if bands is None:
            bands = [f"A{i:02}" for i in range(64)]

        self.imageCollectionName: str = "GOOGLE/SATELLITE_EMBEDDING/V1/ANNUAL"
        self.pixelSize: int = 10
        self.startDate: str = "2017-01-01"
        self.endDate: str = "2024-01-01"
        self.shortName: str = "satembed"

        self.availableBands: dict[str, str] = {b: b for b in bands}
        self.selectedBands: list[tuple[str, str]] = [(band, f"{(n + 10):02}_{band}") for n, band in enumerate(bands)]

        self.minValidPixelCount = min_valid_pixel_count
        self.minAreaToKeepBorder = min_area_to_keep_border
        self.borderPixelsToErode = border_pixels_to_erode

    def imageCollection(self, ee_feature: ee.Feature) -> ee.ImageCollection:
        ee_geometry = ee_feature.geometry()
        ee_start = ee_feature.get("s")
        ee_end = ee_feature.get("e")

        ee_filter = ee.Filter.And(
            ee.Filter.bounds(ee_geometry),
            ee.Filter.date(ee_start, ee_end),
        )

        imgcol = (
            ee.ImageCollection(self.imageCollectionName)
            .filter(ee_filter)
            .select(
                list(self.availableBands.values()),
                list(self.availableBands.keys()),
            )
        )

        imgcol = imgcol.select(
            [natural for natural, _ in self.selectedBands],
            [renamed for _, renamed in self.selectedBands],
        )

        imgcol = ee_filter_img_collection_invalid_pixels(imgcol, ee_geometry, self.pixelSize, self.minValidPixelCount)

        return imgcol

    def compute(
        self,
        ee_feature: ee.Feature,
        subsampling_max_pixels: float,
        reducers: set[str] | None = None,
    ) -> ee.FeatureCollection:
        ee_geometry = ee_feature.geometry()

        if self.borderPixelsToErode != 0:
            ee_geometry = ee_safe_remove_borders(
                ee_geometry, round(self.borderPixelsToErode * self.pixelSize), self.minAreaToKeepBorder
            )
            ee_feature = ee_feature.setGeometry(ee_geometry)

        imgcol = self.imageCollection(ee_feature)

        def compute_stats(
            ee_img: ee.Image, ee_feature: ee.Feature, pixel_size: int, subsampling_max_pixels: ee.Number
        ) -> ee.Feature:
            ee_img = ee.Image(ee_img)

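            # Note: despite its name (and the "_median" suffix added below), this value is
            # the first pixel sampled at the geometry centroid, not a median over the region.
            median = ee_img.reduceRegion(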
                reducer=ee.Reducer.first(),
                geometry=ee_feature.geometry().centroid(0.001),
                scale=pixel_size,
                maxPixels=subsampling_max_pixels,
                bestEffort=True,
            )

            stddev = ee_img.reduceRegion(
                reducer=ee.Reducer.stdDev(),
                geometry=ee_geometry,
                scale=pixel_size,
                maxPixels=subsampling_max_pixels,
                bestEffort=True,
            )

            stddev = stddev.rename(stddev.keys(), stddev.keys().map(lambda k: ee.String(k).cat("_stdDev")), True)
            median = median.rename(median.keys(), median.keys().map(lambda k: ee.String(k).cat("_median")), True)

            props = ee.Dictionary(median).combine(stddev)

            props = props.set("00_indexnum", ee_feature.get("0"))
            props = props.set("01_timestamp", ee.Date(ee_img.date()).format("YYYY-MM-dd"))
            props = props.set("99_validPixelsCount", ee_img.get("ZZ_USER_VALID_PIXELS"))

            return ee.Feature(None, props)

        features = imgcol.map(
            partial(
                compute_stats,
                ee_feature=ee_feature,
                pixel_size=self.pixelSize,
                subsampling_max_pixels=ee_get_number_of_pixels(ee_geometry, subsampling_max_pixels, self.pixelSize),
            )
        )

        return ee.FeatureCollection(features)
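
The property names produced by `compute` can be hard to read off the Earth Engine calls. The following is a plain-Python sketch of how they appear to be assembled for one image, based on the selector numbering and the `_median` / `_stdDev` suffixes in the source above. The helper name `build_property_names` is hypothetical, no Earth Engine call is made, and the result is sorted only for readability.

```python
from __future__ import annotations


def build_property_names(bands: list[str] | None = None) -> list[str]:
    """Return the property names one output feature would carry (illustration only)."""
    if bands is None:
        # Default band names A00 .. A63, as built in __init__.
        bands = [f"A{i:02}" for i in range(64)]

    # Same numbering scheme as selectedBands: position + 10, zero-padded to two digits.
    selectors = [f"{(n + 10):02}_{band}" for n, band in enumerate(bands)]

    # Each selector yields one centroid value ("_median") and one spread value ("_stdDev").
    names = [f"{s}_median" for s in selectors] + [f"{s}_stdDev" for s in selectors]

    # Bookkeeping properties set explicitly in compute_stats.
    names += ["00_indexnum", "01_timestamp", "99_validPixelsCount"]
    return sorted(names)


names = build_property_names(["A00", "A01"])
print(names)
```

With the default 64 bands the same scheme gives selectors from `10_A00` up to `73_A63`.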

Sentinel1GRD

Bases: RadarSatellite

⚠️⚠️⚠️ Sentinel-1 Availability Warning

Due to the failure of the Sentinel-1B satellite in December 2021, the constellation has been operating solely with Sentinel-1A. This has led to reduced data availability in many regions — particularly in the Southern Hemisphere — with revisit times increasing from ~6 days to ~12 days or more. Some areas may experience significant temporal gaps, especially after early 2022. ⚠️⚠️⚠️

Satellite abstraction for Sentinel-1 Ground Range Detected (GRD) product.

Sentinel-1 is a constellation of two polar-orbiting satellites (Sentinel-1A and 1B) operated by ESA, equipped with C-band Synthetic Aperture Radar (SAR). It provides all-weather, day-and-night imaging of Earth's surface.

This class wraps the Sentinel-1 GRD product and allows users to select polarizations, filter by orbit pass, and apply edge masks to remove low-backscatter areas (e.g., layover).
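
As a rough illustration of the polarization and orbit-pass selection, the sketch below applies the same two conditions to plain dictionaries instead of Earth Engine images. The property names `transmitterReceiverPolarisation` and `orbitProperties_pass` come from the class source; the helper `keep_scene` and the three sample scenes are made up for this example.

```python
def keep_scene(meta: dict, wanted: list, ascending: bool = True) -> bool:
    """Return True if a scene carries every wanted polarization and the requested orbit pass."""
    available = meta["transmitterReceiverPolarisation"]
    has_all = all(p in available for p in wanted)  # every selected band must be present
    wanted_pass = "ASCENDING" if ascending else "DESCENDING"
    return has_all and meta["orbitProperties_pass"] == wanted_pass


# Hypothetical scene metadata, for illustration only.
scenes = [
    {"id": "a", "transmitterReceiverPolarisation": ["VV", "VH"], "orbitProperties_pass": "ASCENDING"},
    {"id": "b", "transmitterReceiverPolarisation": ["VV"], "orbitProperties_pass": "ASCENDING"},
    {"id": "c", "transmitterReceiverPolarisation": ["VV", "VH"], "orbitProperties_pass": "DESCENDING"},
]

kept = [s["id"] for s in scenes if keep_scene(s, ["VV", "VH"])]
print(kept)  # ['a']
```

Scene `b` is dropped because it lacks VH, and scene `c` because it is a descending pass.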

Parameters:

bands : set of str, default=None
    Set of polarizations to select. Defaults to {'vv', 'vh'}.
indices : set of str, default=None
    Set of radar indices (e.g. ratios). Defaults to [].
ascending : bool, default=True
    If True, selects ASCENDING orbit passes. If False, selects DESCENDING.
use_edge_mask : bool, default=True
    Whether to apply an edge mask to remove extreme low-backscatter areas (commonly occurring near the edges of acquisitions or in layover/shadow zones).
min_valid_pixel_count : int, default=20
    Minimum number of valid (unmasked) pixels required to retain an image.
border_pixels_to_erode : float, default=1
    Number of pixels to erode from the geometry border.
min_area_to_keep_border : int, default=35_000
    Minimum area (in m²) required to retain geometry after border erosion.

Edge Masking

Sentinel-1 radar images often contain low-backscatter areas near image borders or over layover zones. This class applies a threshold-based edge mask (< -30 dB) to reduce artifacts.
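
The thresholding logic can be pictured without Earth Engine. The sketch below is a plain-Python stand-in for the `lt(-30)` / `updateMask` combination used by `_mask_edge`: a pixel stays valid only if it was already valid and its backscatter is not below the threshold. The function name `mask_edge` and the sample values are assumptions made for illustration.

```python
def mask_edge(values_db, valid, threshold_db=-30.0):
    """Return updated validity flags after removing pixels strictly below the threshold."""
    return [ok and not (v < threshold_db) for v, ok in zip(values_db, valid)]


backscatter_db = [-12.5, -31.0, -30.0, -45.2, -8.1]  # made-up backscatter values in dB
valid = [True, True, True, False, True]               # made-up incoming mask

print(mask_edge(backscatter_db, valid))  # [True, False, True, False, True]
```

Because the comparison is strict, a pixel at exactly -30 dB is kept.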

Satellite Information

+-------------------------------+---------------------------------------+
| Field                         | Value                                 |
+-------------------------------+---------------------------------------+
| Name                          | Sentinel-1                            |
| Agency                        | ESA (Copernicus)                      |
| Instrument                    | C-band Synthetic Aperture Radar (SAR) |
| Revisit Time (full mission)   | ~6 days (1A + 1B constellation)       |
| Revisit Time (post-2021)      | ~12 days (only 1A active)             |
| Orbit Type                    | Sun-synchronous (polar)               |
| Pixel Size                    | ~10 meters                            |
| Coverage                      | Global                                |
+-------------------------------+---------------------------------------+

Collection Dates

+------------------+-------------+-----------+
| Product          | Start Date  | End Date  |
+------------------+-------------+-----------+
| GRD              | 2014-10-03  | present   |
+------------------+-------------+-----------+

Band Information

+------------+-----------+-------------+---------------------------------------+
| Band Name  | Frequency | Resolution  | Description                           |
+------------+-----------+-------------+---------------------------------------+
| VV         | 5.405 GHz | ~10 meters  | Vertical transmit/receive             |
| VH         | 5.405 GHz | ~10 meters  | Vertical transmit, horizontal receive |
+------------+-----------+-------------+---------------------------------------+

Notes
  • Official GRD collection (Earth Engine): https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S1_GRD

  • Sentinel-1 User Guide: https://sentinels.copernicus.eu/web/sentinel/user-guides/sentinel-1-sar

  • Orbit direction filter: https://developers.google.com/earth-engine/sentinel1#orbit-direction

Source code in agrigee_lite/sat/sentinel1.py
class Sentinel1GRD(RadarSatellite):
    """
    ⚠️⚠️⚠️ Sentinel-1 Availability Warning
    ---------------------------------
    Due to the failure of the Sentinel-1B satellite in December 2021, the constellation has been operating solely
    with Sentinel-1A. This has led to reduced data availability in many regions — particularly in the Southern
    Hemisphere — with revisit times increasing from ~6 days to ~12 days or more. Some areas may experience
    significant temporal gaps, especially after early 2022. ⚠️⚠️⚠️

    Satellite abstraction for Sentinel-1 Ground Range Detected (GRD) product.

    Sentinel-1 is a constellation of two polar-orbiting satellites (Sentinel-1A and 1B)
    operated by ESA, equipped with C-band Synthetic Aperture Radar (SAR). It provides
    all-weather, day-and-night imaging of Earth's surface.

    This class wraps the Sentinel-1 GRD product and allows users to select polarizations,
    filter by orbit pass, and apply edge masks to remove low-backscatter areas (e.g., layover).

    Parameters
    ----------
    bands : set of str, optional
        Set of polarizations to select. Defaults to {'vv', 'vh'}.
    indices : set of str, optional
        Set of radar indices (e.g. ratios). Defaults to [].
    ascending : bool, default=True
        If True, selects ASCENDING orbit passes. If False, selects DESCENDING.
    use_edge_mask : bool, optional
        Whether to apply an edge mask to remove extreme low-backscatter areas
        (commonly occurring near the edges of acquisitions or in layover/shadow zones).
        Default is True.
    min_valid_pixel_count : int, default=20
        Minimum number of valid (unmasked) pixels required to retain an image.
    border_pixels_to_erode : float, default=1
        Number of pixels to erode from the geometry border.
    min_area_to_keep_border : int, default=35_000
        Minimum area (in m²) required to retain geometry after border erosion.

    Edge Masking
    ------------
    Sentinel-1 radar images often contain low-backscatter areas near image borders or over layover zones.
    This class applies a threshold-based edge mask (`< -30 dB`) to reduce artifacts.

    Satellite Information
    ---------------------
    +-------------------------------+---------------------------------------+
    | Field                         | Value                                 |
    +-------------------------------+---------------------------------------+
    | Name                          | Sentinel-1                            |
    | Agency                        | ESA (Copernicus)                      |
    | Instrument                    | C-band Synthetic Aperture Radar (SAR) |
    | Revisit Time (full mission)   | ~6 days (1A + 1B constellation)       |
    | Revisit Time (post-2021)      | ~12 days (only 1A active)             |
    | Orbit Type                    | Sun-synchronous (polar)               |
    | Pixel Size                    | ~10 meters                            |
    | Coverage                      | Global                                |
    +-------------------------------+---------------------------------------+

    Collection Dates
    ----------------
    +------------------+-------------+-----------+
    | Product          | Start Date  | End Date  |
    +------------------+-------------+-----------+
    | GRD              | 2014-10-03  | present   |
    +------------------+-------------+-----------+

    Band Information
    ----------------
    +------------+-----------+-------------+---------------------------------------+
    | Band Name  | Frequency | Resolution  | Description                           |
    +------------+-----------+-------------+---------------------------------------+
    | VV         | 5.405 GHz | ~10 meters  | Vertical transmit/receive             |
    | VH         | 5.405 GHz | ~10 meters  | Vertical transmit, horizontal receive |
    +------------+-----------+-------------+---------------------------------------+

    Notes
    -----
    - Official GRD collection (Earth Engine):
      https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S1_GRD

    - Sentinel-1 User Guide:
      https://sentinels.copernicus.eu/web/sentinel/user-guides/sentinel-1-sar

    - Orbit direction filter:
      https://developers.google.com/earth-engine/sentinel1#orbit-direction
    """

    def __init__(
        self,
        bands: set[str] | None = None,
        indices: set[str] | None = None,
        ascending: bool = True,
        use_edge_mask: bool = True,
        min_valid_pixel_count: int = 20,
        border_pixels_to_erode: float = 1,
        min_area_to_keep_border: int = 35000,
    ):
        bands = sorted({"vv", "vh"}) if bands is None else sorted(bands)

        indices = [] if indices is None else sorted(indices)

        super().__init__()

        self.ascending: bool = ascending
        self.use_edge_mask: bool = use_edge_mask
        self.minValidPixelCount = min_valid_pixel_count
        self.minAreaToKeepBorder = min_area_to_keep_border
        self.borderPixelsToErode = border_pixels_to_erode
        self.imageCollectionName: str = "COPERNICUS/S1_GRD"
        self.pixelSize: int = 10

        # full mission start (S-1A launch)
        self.startDate: str = "2014-10-03"
        self.endDate: str = "2050-01-01"
        self.shortName: str = "s1a" if ascending else "s1d"

        # original → product band
        self.availableBands: dict[str, str] = {"vv": "VV", "vh": "VH"}

        self.selectedBands: list[tuple[str, str]] = [(band, f"{(n + 10):02}_{band}") for n, band in enumerate(bands)]

        self.selectedIndices: list[tuple[str, str, str]] = [
            (self.availableIndices[indice_name], indice_name, f"{(n + 40):02}_{indice_name}")
            for n, indice_name in enumerate(indices)
        ]

        self.toDownloadSelectors = [numeral_band_name for _, numeral_band_name in self.selectedBands] + [
            numeral_indice_name for _, _, numeral_indice_name in self.selectedIndices
        ]

    @staticmethod
    def _mask_edge(img: ee.Image) -> ee.Image:
        """
        Remove extreme low-backscatter areas (edges / layover)

        Parameters
        ----------
        img : ee.Image
            Unfiltered Sentinel-1 image

        Returns
        -------
        ee.Image
            Filtered Sentinel-1 image
        """

        edge = img.lt(-30.0)
        valid = img.mask().And(edge.Not())
        return img.updateMask(valid)

    def imageCollection(self, ee_feature: ee.Feature) -> ee.ImageCollection:
        ee_geometry = ee_feature.geometry()
        ee_start = ee_feature.get("s")
        ee_end = ee_feature.get("e")

        ee_filter = ee.Filter.And(ee.Filter.bounds(ee_geometry), ee.Filter.date(ee_start, ee_end))

        polarization_filter = ee.Filter.And(*[
            ee.Filter.listContains("transmitterReceiverPolarisation", self.availableBands[b])
            for b, _ in self.selectedBands
        ])

        orbit_filter = ee.Filter.eq("orbitProperties_pass", "ASCENDING" if self.ascending else "DESCENDING")

        s1_img = (
            ee.ImageCollection(self.imageCollectionName)
            .filter(ee_filter)
            .filter(polarization_filter)
            .filter(orbit_filter)
        )

        if self.use_edge_mask:
            s1_img = s1_img.map(self._mask_edge)

        s1_img = s1_img.select(list(self.availableBands.values()), list(self.availableBands.keys()))

        if self.selectedIndices:
            s1_img = s1_img.map(
                partial(ee_add_indexes_to_image, indexes=[expression for (expression, _, _) in self.selectedIndices])
            )

        s1_img = s1_img.select(
            [natural_band_name for natural_band_name, _ in self.selectedBands]
            + [indice_name for _, indice_name, _ in self.selectedIndices],
            [numeral_band_name for _, numeral_band_name in self.selectedBands]
            + [numeral_indice_name for _, _, numeral_indice_name in self.selectedIndices],
        )

        s1_img = ee_filter_img_collection_invalid_pixels(s1_img, ee_geometry, self.pixelSize, self.minValidPixelCount)

        return ee.ImageCollection(s1_img)

    def compute(
        self,
        ee_feature: ee.Feature,
        subsampling_max_pixels: float,
        reducers: set[str] | None = None,
    ) -> ee.FeatureCollection:
        ee_geometry = ee_feature.geometry()

        if self.borderPixelsToErode != 0:
            ee_geometry = ee_safe_remove_borders(
                ee_geometry, round(self.borderPixelsToErode * self.pixelSize), self.minAreaToKeepBorder
            )
            ee_feature = ee_feature.setGeometry(ee_geometry)

        s1_img = self.imageCollection(ee_feature)

        features = s1_img.map(
            partial(
                ee_map_bands_and_doy,
                ee_feature=ee_feature,
                pixel_size=self.pixelSize,
                subsampling_max_pixels=ee_get_number_of_pixels(ee_geometry, subsampling_max_pixels, self.pixelSize),
                reducer=ee_get_reducers(reducers),
            )
        )

        return features
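
In `compute`, the border erosion distance passed to `ee_safe_remove_borders` is `round(borderPixelsToErode * pixelSize)`, and erosion is skipped entirely when `borderPixelsToErode` is 0. The short sketch below only reproduces that arithmetic; the helper name `erosion_distance_m` is hypothetical, and reading the result as metres assumes the 10 m pixel size set in `__init__`.

```python
def erosion_distance_m(border_pixels_to_erode: float, pixel_size: int = 10) -> int:
    """Convert a border width in pixels to a rounded distance, as done in compute()."""
    return round(border_pixels_to_erode * pixel_size)


for px in (1, 1.5, 0.25):
    print(px, "->", erosion_distance_m(px))
```

Note that Python's built-in `round` rounds exact halves to the nearest even integer, so a fractional setting such as 0.25 pixels (2.5 before rounding) becomes 2 rather than 3.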

_mask_edge(img) staticmethod

Remove extreme low-backscatter areas (edges / layover)

Parameters:

img : Image, required
    Unfiltered Sentinel-1 image

Returns:

Image
    Filtered Sentinel-1 image

Source code in agrigee_lite/sat/sentinel1.py
@staticmethod
def _mask_edge(img: ee.Image) -> ee.Image:
    """
    Remove extreme low-backscatter areas (edges / layover)

    Parameters
    ----------
    img : ee.Image
        Unfiltered Sentinel-1 image

    Returns
    -------
    ee.Image
        Filtered Sentinel-1 image
    """

    edge = img.lt(-30.0)
    valid = img.mask().And(edge.Not())
    return img.updateMask(valid)

Sentinel2

Bases: OpticalSatellite

Satellite abstraction for Sentinel-2 (HARMONIZED collections).

Sentinel-2 is a constellation of twin Earth observation satellites, operated by ESA, designed for land monitoring, vegetation, soil, water cover, and coastal areas.

Parameters:

bands : set of str, default=None
    Set of bands to select. Defaults to all 10 available bands (those most used for vegetation and soil analysis).
indices : set of str, default=None
    Set of spectral indices to compute from the selected bands.
use_sr : bool, default=True
    If True, uses surface reflectance (BOA, 'S2_SR_HARMONIZED'). If False, uses top-of-atmosphere reflectance ('S2_HARMONIZED').
cloud_probability_threshold : float, default=0.7
    Minimum Cloud Score Plus value (`cs_cdf` band) for a pixel to be considered cloud-free.
min_valid_pixel_count : int, default=20
    Minimum number of valid (non-cloud) pixels required to retain an image.
border_pixels_to_erode : float, default=1
    Number of pixels to erode from the geometry border.
min_area_to_keep_border : int, default=35_000
    Minimum area (in m²) required to retain geometry after border erosion.

Satellite Information

+------------------------------------+------------------------+
| Field                              | Value                  |
+------------------------------------+------------------------+
| Name                               | Sentinel-2             |
| Revisit Time                       | 5 days                 |
| Revisit Time (cloud-free estimate) | ~7 days                |
| Pixel Size                         | 10 meters              |
| Coverage                           | Global                 |
+------------------------------------+------------------------+

Collection Dates

+----------------------------+------------+------------+
| Collection Type            | Start Date | End Date   |
+----------------------------+------------+------------+
| TOA (Top of Atmosphere)    | 2016-01-01 | present    |
| SR (Surface Reflectance)   | 2019-01-01 | present    |
+----------------------------+------------+------------+

Band Information

+-----------+---------------+--------------+------------------------+
| Band Name | Original Band | Resolution   | Spectral Wavelength    |
+-----------+---------------+--------------+------------------------+
| blue      | B2            | 10 m         | 492 nm                 |
| green     | B3            | 10 m         | 559 nm                 |
| red       | B4            | 10 m         | 665 nm                 |
| re1       | B5            | 20 m         | 704 nm                 |
| re2       | B6            | 20 m         | 739 nm                 |
| re3       | B7            | 20 m         | 780 nm                 |
| nir       | B8            | 10 m         | 833 nm                 |
| re4       | B8A           | 20 m         | 864 nm                 |
| swir1     | B11           | 20 m         | 1610 nm                |
| swir2     | B12           | 20 m         | 2186 nm                |
+-----------+---------------+--------------+------------------------+

Notes

Cloud Masking:
    This class uses the Cloud Score Plus dataset to estimate cloud probability:
    https://developers.google.com/earth-engine/datasets/catalog/GOOGLE_CLOUD_SCORE_PLUS_V1_S2_HARMONIZED

Sentinel-2 Collections:
    - TOA: https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S2_HARMONIZED
    - SR: https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S2_SR_HARMONIZED

Source code in agrigee_lite/sat/sentinel2.py
class Sentinel2(OpticalSatellite):
    """
    Satellite abstraction for Sentinel-2 (HARMONIZED collections).

    Sentinel-2 is a constellation of twin Earth observation satellites,
    operated by ESA, designed for land monitoring, vegetation, soil, water cover, and coastal areas.

    Parameters
    ----------
    bands : set of str, optional
        Set of bands to select. Defaults to all 10 available bands (those most used for vegetation and soil analysis).
    indices : set of str, optional
        Set of spectral indices to compute from the selected bands.
    use_sr : bool, default=True
        If True, uses surface reflectance (BOA, 'S2_SR_HARMONIZED').
        If False, uses top-of-atmosphere reflectance ('S2_HARMONIZED').
    cloud_probability_threshold : float, default=0.7
        Minimum threshold to consider a pixel as cloud-free.
    min_valid_pixel_count : int, default=20
        Minimum number of valid (non-cloud) pixels required to retain an image.
    border_pixels_to_erode : float, default=1
        Number of pixels to erode from the geometry border.
    min_area_to_keep_border : int, default=35_000
        Minimum area (in m²) required to retain geometry after border erosion.

    Satellite Information
    ---------------------
    +------------------------------------+------------------------+
    | Field                              | Value                  |
    +------------------------------------+------------------------+
    | Name                               | Sentinel-2             |
    | Revisit Time                       | 5 days                 |
    | Revisit Time (cloud-free estimate) | ~7 days                |
    | Pixel Size                         | 10 meters              |
    | Coverage                           | Global                 |
    +------------------------------------+------------------------+

    Collection Dates
    ----------------
    +----------------------------+------------+------------+
    | Collection Type            | Start Date | End Date   |
    +----------------------------+------------+------------+
    | TOA (Top of Atmosphere)    | 2016-01-01 | present    |
    | SR (Surface Reflectance)   | 2019-01-01 | present    |
    +----------------------------+------------+------------+

    Band Information
    ----------------
    +-----------+---------------+--------------+------------------------+
    | Band Name | Original Band | Resolution   | Spectral Wavelength    |
    +-----------+---------------+--------------+------------------------+
    | blue      | B2            | 10 m         | 492 nm                 |
    | green     | B3            | 10 m         | 559 nm                 |
    | red       | B4            | 10 m         | 665 nm                 |
    | re1       | B5            | 20 m         | 704 nm                 |
    | re2       | B6            | 20 m         | 739 nm                 |
    | re3       | B7            | 20 m         | 780 nm                 |
    | nir       | B8            | 10 m         | 833 nm                 |
    | re4       | B8A           | 20 m         | 864 nm                 |
    | swir1     | B11           | 20 m         | 1610 nm                |
    | swir2     | B12           | 20 m         | 2186 nm                |
    +-----------+---------------+--------------+------------------------+

    Notes
    -----
    Cloud Masking:
        This class uses the **Cloud Score Plus** dataset to estimate cloud probability:
        https://developers.google.com/earth-engine/datasets/catalog/GOOGLE_CLOUD_SCORE_PLUS_V1_S2_HARMONIZED

    Sentinel-2 Collections:
        - TOA: https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S2_HARMONIZED
        - SR:  https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S2_SR_HARMONIZED
    """

    def __init__(
        self,
        bands: set[str] | None = None,
        indices: set[str] | None = None,
        use_sr: bool = True,
        cloud_probability_threshold: float = 0.7,
        min_valid_pixel_count: int = 20,
        border_pixels_to_erode: float = 1,
        min_area_to_keep_border: int = 35000,
    ):
        bands = (
            sorted({"blue", "green", "red", "re1", "re2", "re3", "nir", "re4", "swir1", "swir2"})
            if bands is None
            else sorted(bands)
        )

        indices = [] if indices is None else sorted(indices)

        super().__init__()
        self.useSr = use_sr
        self.imageCollectionName = "COPERNICUS/S2_SR_HARMONIZED" if use_sr else "COPERNICUS/S2_HARMONIZED"
        self.pixelSize: int = 10

        self.startDate: str = "2019-01-01" if use_sr else "2016-01-01"
        self.endDate: str = "2050-01-01"
        self.shortName: str = "s2sr" if use_sr else "s2"

        self.availableBands: dict[str, str] = {
            "blue": "B2",
            "green": "B3",
            "red": "B4",
            "re1": "B5",
            "re2": "B6",
            "re3": "B7",
            "nir": "B8",
            "re4": "B8A",
            "swir1": "B11",
            "swir2": "B12",
        }

        self.selectedBands: list[tuple[str, str]] = [(band, f"{(n + 10):02}_{band}") for n, band in enumerate(bands)]

        self.selectedIndices: list[tuple[str, str, str]] = [
            (self.availableIndices[indice_name], indice_name, f"{(n + 40):02}_{indice_name}")
            for n, indice_name in enumerate(indices)
        ]

        self.cloudProbabilityThreshold = cloud_probability_threshold
        self.minValidPixelCount = min_valid_pixel_count
        self.minAreaToKeepBorder = min_area_to_keep_border
        self.borderPixelsToErode = border_pixels_to_erode

        self.toDownloadSelectors = [numeral_band_name for _, numeral_band_name in self.selectedBands] + [
            numeral_indice_name for _, _, numeral_indice_name in self.selectedIndices
        ]

    def imageCollection(self, ee_feature: ee.Feature) -> ee.ImageCollection:
        ee_geometry = ee_feature.geometry()

        ee_start_date = ee_feature.get("s")
        ee_end_date = ee_feature.get("e")

        ee_filter = ee.Filter.And(ee.Filter.bounds(ee_geometry), ee.Filter.date(ee_start_date, ee_end_date))

        s2_img = (
            ee.ImageCollection(self.imageCollectionName)
            .filter(ee_filter)
            .select(
                list(self.availableBands.values()),
                list(self.availableBands.keys()),
            )
        )

        s2_img = s2_img.map(lambda img: ee.Image(img).addBands(ee.Image(img).divide(10000), overwrite=True))

        if self.selectedIndices:
            s2_img = s2_img.map(
                partial(ee_add_indexes_to_image, indexes=[expression for (expression, _, _) in self.selectedIndices])
            )

        s2_img = s2_img.select(
            [natural_band_name for natural_band_name, _ in self.selectedBands]
            + [indice_name for _, indice_name, _ in self.selectedIndices],
            [numeral_band_name for _, numeral_band_name in self.selectedBands]
            + [numeral_indice_name for _, _, numeral_indice_name in self.selectedIndices],
        )

        s2_cloud_mask = (
            ee.ImageCollection("GOOGLE/CLOUD_SCORE_PLUS/V1/S2_HARMONIZED")
            .filter(ee_filter)
            .select(["cs_cdf"], ["cloud"])
        )

        s2_img = s2_img.combine(s2_cloud_mask)

        s2_img = s2_img.map(lambda img: ee_cloud_probability_mask(img, self.cloudProbabilityThreshold, True))
        s2_img = ee_filter_img_collection_invalid_pixels(s2_img, ee_geometry, self.pixelSize, self.minValidPixelCount)

        return ee.ImageCollection(s2_img)

    def compute(
        self,
        ee_feature: ee.Feature,
        subsampling_max_pixels: float,
        reducers: set[str] | None = None,
    ) -> ee.FeatureCollection:
        ee_geometry = ee_feature.geometry()

        if self.borderPixelsToErode != 0:
            ee_geometry = ee_safe_remove_borders(
                ee_geometry, round(self.borderPixelsToErode * self.pixelSize), self.minAreaToKeepBorder
            )
            ee_feature = ee_feature.setGeometry(ee_geometry)

        s2_img = self.imageCollection(ee_feature)

        features = s2_img.map(
            partial(
                ee_map_bands_and_doy,
                ee_feature=ee_feature,
                pixel_size=self.pixelSize,
                subsampling_max_pixels=ee_get_number_of_pixels(ee_geometry, subsampling_max_pixels, self.pixelSize),
                reducer=ee_get_reducers(reducers),
            )
        )

        return features
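
The numbered output selectors follow from the sorting and numbering done in `__init__`: bands are sorted alphabetically and numbered from 10, indices are sorted and numbered from 40. The sketch below reproduces only that naming step in plain Python. The function name `numbered_selectors` is hypothetical, and `"ndvi"` is used as an assumed index name since the contents of `availableIndices` are not shown here.

```python
DEFAULT_BANDS = {"blue", "green", "red", "re1", "re2", "re3", "nir", "re4", "swir1", "swir2"}


def numbered_selectors(bands=None, indices=None):
    """Return the numbered selector names for the given band and index sets."""
    bands = sorted(DEFAULT_BANDS) if bands is None else sorted(bands)
    indices = [] if indices is None else sorted(indices)

    band_selectors = [f"{(n + 10):02}_{band}" for n, band in enumerate(bands)]
    index_selectors = [f"{(n + 40):02}_{name}" for n, name in enumerate(indices)]
    return band_selectors + index_selectors


print(numbered_selectors({"red", "nir"}, {"ndvi"}))  # ['10_nir', '11_red', '40_ndvi']
print(numbered_selectors())
```

Because the numbering follows alphabetical order, the number attached to a band depends on which other bands are selected: `red` is `11_red` in the first call but `17_red` with the default band set.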

TwoSatelliteFusion

Bases: OpticalSatellite

A satellite fusion class that combines data from exactly two optical satellites for synchronized analysis.

This class enables the fusion of data from two different optical satellites by finding common observation dates and merging their image collections. It ensures temporal alignment between the two satellite datasets, making it possible to perform comparative analysis or create composite datasets from dual satellite sources.

The class is specifically designed for two-satellite fusion and automatically handles:

- Temporal intersection calculation between the two satellite date ranges
- Spatial resolution alignment using the finest available resolution
- Band renaming with prefixes to distinguish between the two satellite sources
- Image collection synchronization based on common observation dates
- Unified processing pipeline for both satellite datasets

Parameters:

satellite_a : OpticalSatellite, required
    The first optical satellite configuration object.
satellite_b : OpticalSatellite, required
    The second optical satellite configuration object.

Attributes:

sat_a : OpticalSatellite
    Reference to the first satellite object.
sat_b : OpticalSatellite
    Reference to the second satellite object.
startDate : str
    The latest start date between both satellites (ISO format).
endDate : str
    The earliest end date between both satellites (ISO format).
pixelSize : float
    The finest spatial resolution between both satellites.
shortName : str
    Combined short name identifier for the fused satellite configuration.
toDownloadSelectors : list[str]
    Combined selectors from both satellites with distinguishing prefixes.

Examples:

>>> from agrigee_lite.sat.landsat import Landsat8
>>> from agrigee_lite.sat.sentinel2 import Sentinel2
>>> from agrigee_lite.sat.unified_satellite import TwoSatelliteFusion
>>>
>>> l8 = Landsat8()
>>> s2 = Sentinel2()
>>> fusion = TwoSatelliteFusion(l8, s2)
>>>
>>> # The fused satellite will only cover the temporal overlap
>>> print(fusion.startDate)  # Latest of the two start dates
>>> print(fusion.endDate)    # Earliest of the two end dates
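
The attribute descriptions (latest start date, earliest end date, finest pixel size) can be illustrated with plain dictionaries. This is a sketch based on those descriptions only, not the class implementation; `fused_window` is a hypothetical helper, and the values for the first satellite are placeholders rather than the library's real Landsat 8 settings.

```python
def fused_window(a: dict, b: dict) -> dict:
    """Combine two satellite configs the way the attribute descriptions suggest."""
    return {
        # ISO 8601 date strings compare chronologically, so max/min work directly.
        "startDate": max(a["startDate"], b["startDate"]),
        "endDate": min(a["endDate"], b["endDate"]),
        # The finest resolution is the smallest pixel size.
        "pixelSize": min(a["pixelSize"], b["pixelSize"]),
    }


sat_a = {"startDate": "2013-04-11", "endDate": "2050-01-01", "pixelSize": 30}  # placeholder values
sat_b = {"startDate": "2019-01-01", "endDate": "2050-01-01", "pixelSize": 10}  # Sentinel2 (use_sr=True) values

fused = fused_window(sat_a, sat_b)
print(fused)
```

With these inputs the fused window starts on 2019-01-01, so observations that only the first satellite has before that date fall outside the overlap.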
Source code in agrigee_lite/sat/unified_satellite.py
class TwoSatelliteFusion(OpticalSatellite):
    """
    A satellite fusion class that combines data from exactly two optical satellites for synchronized analysis.

    This class enables the fusion of data from two different optical satellites by finding
    common observation dates and merging their image collections. It ensures temporal alignment
    between the two satellite datasets, making it possible to perform comparative analysis or
    create composite datasets from dual satellite sources.

    The class is specifically designed for two-satellite fusion and automatically handles:
    - Temporal intersection calculation between the two satellite date ranges
    - Spatial resolution alignment using the finest available resolution
    - Band renaming with prefixes to distinguish between the two satellite sources
    - Image collection synchronization based on common observation dates
    - Unified processing pipeline for both satellite datasets

    Parameters
    ----------
    satellite_a : OpticalSatellite
        The first optical satellite configuration object.
    satellite_b : OpticalSatellite
        The second optical satellite configuration object.

    Attributes
    ----------
    sat_a : OpticalSatellite
        Reference to the first satellite object.
    sat_b : OpticalSatellite
        Reference to the second satellite object.
    startDate : str
        The latest start date between both satellites (ISO format).
    endDate : str
        The earliest end date between both satellites (ISO format).
    pixelSize : float
        The finest spatial resolution between both satellites.
    shortName : str
        Combined short name identifier for the fused satellite configuration.
    toDownloadSelectors : list[str]
        Combined selectors from both satellites with distinguishing prefixes.

    Examples
    --------
    >>> from agrigee_lite.sat.landsat import Landsat8
    >>> from agrigee_lite.sat.sentinel import Sentinel2
    >>> from agrigee_lite.sat.unified_satellite import TwoSatelliteFusion
    >>>
    >>> l8 = Landsat8()
    >>> s2 = Sentinel2()
    >>> fusion = TwoSatelliteFusion(l8, s2)
    >>>
    >>> # The fused satellite will only cover the temporal overlap
    >>> print(fusion.startDate)  # Latest of the two start dates
    >>> print(fusion.endDate)    # Earliest of the two end dates
    """

    def __init__(self, satellite_a: OpticalSatellite, satellite_b: OpticalSatellite):
        super().__init__()
        self.sat_a = satellite_a
        self.sat_b = satellite_b

        # Get the intersection between start date and end date of both satellites
        self.startDate = max(
            datetime.fromisoformat(satellite_a.startDate), datetime.fromisoformat(satellite_b.startDate)
        ).isoformat()
        self.endDate = min(
            datetime.fromisoformat(satellite_a.endDate), datetime.fromisoformat(satellite_b.endDate)
        ).isoformat()

        self.pixelSize = min(satellite_a.pixelSize, satellite_b.pixelSize)
        self.shortName = f"fusion_{satellite_a.shortName}_{satellite_b.shortName}"
        self.toDownloadSelectors = [
            f"8{selector}{satellite_a.shortName}" for selector in satellite_a.toDownloadSelectors
        ] + [f"7{selector}{satellite_b.shortName}" for selector in satellite_b.toDownloadSelectors]

    def imageCollection(self, ee_feature: ee.Feature) -> ee.ImageCollection:
        # Build each satellite's own collection for the feature
        sat_a = self.sat_a.imageCollection(ee_feature)
        sat_b = self.sat_b.imageCollection(ee_feature)

        # Keep only the dates on which both satellites have observations
        sat_a_dates = extract_dates(sat_a)
        sat_b_dates = extract_dates(sat_b)

        common_dates = intersect_lists(sat_a_dates, sat_b_dates)

        sat_a_filtered = filter_by_common_dates(sat_a, common_dates)
        sat_b_filtered = filter_by_common_dates(sat_b, common_dates)

        # Rename bands so the two sources stay distinguishable after merging
        sat_a_filtered = rename_bands(sat_a_filtered, "8", self.sat_a.shortName)
        sat_b_filtered = rename_bands(sat_b_filtered, "7", self.sat_b.shortName)

        # Attach sat_b's bands to the sat_a images that share the same match property
        merged = sat_a_filtered.linkCollection(
            sat_b_filtered, matchPropertyName="ZZ_USER_TIME_DUMMY", linkedBands=sat_b_filtered.first().bandNames()
        )

        return merged

    def compute(
        self,
        ee_feature: ee.Feature,
        subsampling_max_pixels: float,
        reducers: set[str] | None = None,
    ) -> ee.FeatureCollection:
        ee_geometry = ee_feature.geometry()
        ee_geometry = ee_safe_remove_borders(ee_geometry, self.pixelSize, 35000)
        ee_feature = ee_feature.setGeometry(ee_geometry)

        fused_collection = self.imageCollection(ee_feature)

        features = fused_collection.map(
            partial(
                ee_map_bands_and_doy,
                ee_feature=ee_feature,
                pixel_size=self.pixelSize,
                subsampling_max_pixels=ee_get_number_of_pixels(ee_geometry, subsampling_max_pixels, self.pixelSize),
                reducer=ee_get_reducers(reducers),
            )
        )

        return features

    def log_dict(self) -> dict:
        d = {}
        d["sat_a"] = self.sat_a.log_dict()
        d["sat_b"] = self.sat_b.log_dict()
        return d
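
The synchronization step in `imageCollection` can be illustrated without Earth Engine. The sketch below works on plain Python lists of (date, value) records rather than `ee.ImageCollection` objects. `common_dates_sketch` and `keep_common` are hypothetical helpers written for this illustration. They are not the library's `extract_dates`, `intersect_lists`, or `filter_by_common_dates`, whose implementations are not shown on this page.

```python
def common_dates_sketch(dates_a, dates_b):
    """Return the sorted dates present in both lists."""
    return sorted(set(dates_a) & set(dates_b))


def keep_common(records, common):
    """Keep only the records whose date is one of the common dates."""
    allowed = set(common)
    return [rec for rec in records if rec[0] in allowed]


# Made-up observations: (ISO date, band value), one record per date
obs_a = [("2020-01-01", 0.21), ("2020-01-17", 0.25), ("2020-02-02", 0.30)]
obs_b = [("2020-01-06", 0.19), ("2020-01-17", 0.24), ("2020-02-02", 0.31)]

common = common_dates_sketch([d for d, _ in obs_a], [d for d, _ in obs_b])
synced_a = keep_common(obs_a, common)
synced_b = keep_common(obs_b, common)

# Pair the two sources date by date, loosely analogous to linking the collections.
# This relies on both lists being in date order with one record per date.
paired = {d: (va, vb) for (d, va), (_, vb) in zip(synced_a, synced_b)}
```

Only the dates observed by both sources survive, so every remaining record in one source has a counterpart in the other.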

multiple_sits(gdf, satellite, band_or_indice_to_plot, reducer='median', ax=None, color='blue', alpha=0.5)

Visualize satellite time series for multiple geometries with normalized temporal alignment.

Creates overlaid line plots for multiple geometries, with time series normalized to year fractions to enable comparison across different years. Each geometry's time series is plotted as a semi-transparent line, making it easy to identify patterns and outliers across the dataset.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| gdf | GeoDataFrame | GeoDataFrame containing multiple geometries and their temporal information. Must have the required date columns for satellite time series processing. | required |
| satellite | AbstractSatellite | Satellite configuration object. | required |
| band_or_indice_to_plot | str | Name of the band or vegetation index to visualize. | required |
| reducer | str | Temporal reducer to apply (e.g., "median", "mean"), by default "median". | 'median' |
| ax | Axes or None | Matplotlib axes object for plotting. If None, creates a new plot, by default None. | None |
| color | str | Color for the plot lines, by default "blue". | 'blue' |
| alpha | float | Transparency level for individual lines (0.0 to 1.0), by default 0.5. Lower values help visualize overlapping time series. | 0.5 |

Returns:

| Type | Description |
|---|---|
| None | The function creates a plot but doesn't return any value. |

Notes

This function converts each timestamp to a year fraction with the year_fraction function and then subtracts the series minimum rounded to the nearest whole number. Each series is therefore expressed as an offset from a year boundary, making it possible to overlay multiple years of data for pattern analysis.
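
The normalization can be sketched with the standard library alone. This is a rough illustration under one loud assumption: `year_fraction_sketch` is a hypothetical conversion (calendar year plus the elapsed share of that year) and may differ from the library's `year_fraction`, whose implementation is not shown here. The subtraction of the rounded minimum follows the plotting loop in the source listing.

```python
from datetime import date


def year_fraction_sketch(d):
    """Hypothetical conversion: year plus the elapsed share of that year."""
    start = date(d.year, 1, 1)
    days_in_year = (date(d.year + 1, 1, 1) - start).days
    return d.year + (d - start).days / days_in_year


def normalize(dates):
    """Convert dates to year fractions and subtract the rounded minimum."""
    fractions = [year_fraction_sketch(d) for d in dates]
    offset = round(min(fractions))
    return [f - offset for f in fractions]


# Two made-up series from different years end up on the same axis
series_2020 = normalize([date(2020, 1, 1), date(2020, 7, 2)])
series_2021 = normalize([date(2021, 1, 1), date(2021, 7, 2)])
```

Both series now run from 0.0 to roughly 0.5, so they can be drawn on the same axis. Under this assumed conversion, a series whose first observation falls in the second half of a year has its minimum rounded up to the next year, so its normalized values start below zero.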

Source code in agrigee_lite/vis/sits.py
def visualize_multiple_sits(
    gdf: gpd.GeoDataFrame,
    satellite: AbstractSatellite,
    band_or_indice_to_plot: str,
    reducer: str = "median",
    ax: Any = None,
    color: str = "blue",
    alpha: float = 0.5,
) -> None:
    """
    Visualize satellite time series for multiple geometries with normalized temporal alignment.

    Creates overlaid line plots for multiple geometries, with time series normalized
    to year fractions to enable comparison across different years. Each geometry's
    time series is plotted as a semi-transparent line, making it easy to identify
    patterns and outliers across the dataset.

    Parameters
    ----------
    gdf : gpd.GeoDataFrame
        GeoDataFrame containing multiple geometries and their temporal information.
        Must have the required date columns for satellite time series processing.
    satellite : AbstractSatellite
        Satellite configuration object.
    band_or_indice_to_plot : str
        Name of the band or vegetation index to visualize.
    reducer : str, optional
        Temporal reducer to apply (e.g., "median", "mean"), by default "median".
    ax : matplotlib.axes.Axes or None, optional
        Matplotlib axes object for plotting. If None, creates a new plot, by default None.
    color : str, optional
        Color for the plot lines, by default "blue".
    alpha : float, optional
        Transparency level for individual lines (0.0 to 1.0), by default 0.5.
        Lower values help visualize overlapping time series.

    Returns
    -------
    None
        The function creates a plot but doesn't return any value.

    Notes
    -----
    This function converts each timestamp to a year fraction with the
    `year_fraction` function and then subtracts the series minimum rounded to
    the nearest whole number. Each series is therefore expressed as an offset
    from a year boundary, making it possible to overlay multiple years of data
    for pattern analysis.
    """
    import matplotlib.pyplot as plt

    long_sits = download_multiple_sits(gdf, satellite, reducers=[reducer])

    if len(long_sits) == 0:
        return None

    for indexnumm in long_sits.indexnum.unique():
        indexnumm_df = long_sits[long_sits.indexnum == indexnumm].reset_index(drop=True).copy()
        indexnumm_df["timestamp"] = indexnumm_df.timestamp.apply(year_fraction)
        indexnumm_df["timestamp"] = indexnumm_df["timestamp"] - indexnumm_df["timestamp"].min().round()

        y = indexnumm_df[band_or_indice_to_plot].values

        if ax is None:
            plt.plot(
                indexnumm_df.timestamp,
                y,
                color=color,
                alpha=alpha,
            )
        else:
            ax.plot(indexnumm_df.timestamp, y, color=color, alpha=alpha, label=satellite.shortName)