3 Data Sources
The MST integrates species distribution data from 8 source datasets plus 1 derived merged dataset, all stored in a DuckDB database at 0.05° resolution (~4 km cells). Each dataset provides spatial predictions of species occurrence, ranging from continuous suitability models (AquaMaps) to habitats associated with a core area, critical habitat or distinct population segment (NMFS, FWS, SWOT+DPS) to those based on simple range maps (BirdLife, IUCN, FWS).
3.1 Dataset Overview
Table 3.1 summarizes all 9 datasets in the MST database.
Datasets with Is Mask = Yes can also serve as spatial masks during model merging.
| Dataset Key | Display Name | Cell Value Encoding | Is Mask | Sort |
|---|---|---|---|---|
| am_0.05 | AquaMaps SDM | Continuous suitability 0–100% | No | 1 |
| ca_nmfs | NMFS Core Area | Core: 100% | Yes | 2 |
| ch_nmfs | NMFS Critical Habitat | EN: 100%, TN: 50% | Yes | 3 |
| ch_fws | FWS Critical Habitat | EN: 100%, TN: 50% | Yes | 4 |
| rng_fws | FWS Range | EN: 100%, TN: 50%, LC: 1% | Yes | 5 |
| bl | BirdLife Range | CR: 50%, EN: 25%, VU: 5%, NT: 2%, LC: 1% | Yes | 6 |
| rng_iucn | IUCN Range | CR: 50%, EN: 25%, VU: 5%, NT: 2%, LC: 1% | Yes | 7 |
| rng_turtle_swot_dps | SWOT+DPS Turtle Range | EN: 100%, TN: 50% | Yes | 8 |
| ms_merge | Merged Model | MAX across source datasets | No | 0 |
The sort order determines priority during model merging: datasets with continuous predictions (AquaMaps) are considered first, followed by progressively coarser range maps. The merged model (ms_merge) takes the MAX value across all available source datasets for each species in each cell.
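The MAX rule can be illustrated with a minimal Python sketch. The row layout (dataset key, species, cell, value), the species labels and the function name `merge_max` are assumptions made for illustration; they are not the MST database schema or the actual pipeline code.

```python
# Hypothetical rows: (dataset_key, species_id, cell_id, value_pct).
rows = [
    ("am_0.05", "sp1", 101, 37),
    ("rng_iucn", "sp1", 101, 25),
    ("am_0.05", "sp1", 102, 60),
    ("ch_nmfs", "sp1", 102, 100),
    ("bl", "sp2", 101, 5),
]


def merge_max(rows):
    """Keep the largest value per (species, cell) across all source datasets."""
    merged = {}
    for _dataset, species, cell, value in rows:
        key = (species, cell)
        merged[key] = max(merged.get(key, 0), value)
    return merged


merged = merge_max(rows)
print(merged[("sp1", 102)])  # critical habitat (100) outranks AquaMaps (60)
```

In this sketch the result does not depend on the order in which datasets are read, because MAX is order-independent.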
3.2 Source Datasets
3.2.1 AquaMaps SDM (am_0.05)
AquaMaps provides standardized species distribution models (SDMs) for over 17,000 marine species based on environmental envelope models (Kaschner et al. 2019). Each species model defines habitat suitability as a function of depth, temperature, salinity, primary productivity, ice concentration, and distance from land.
- Species count: ~17,550 models
- Native resolution: 0.5° (c-squares), downscaled to 0.05° using bilinear interpolation
- Cell values: continuous suitability from 0–100%
- Coverage: global ocean
AquaMaps serves as the foundational dataset for most species, providing the highest spatial resolution and broadest taxonomic coverage.
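To make the downscaling step concrete, the following is a minimal pure-Python sketch of bilinear interpolation from a coarse grid to one ten times finer (0.5° to 0.05°). It assumes values sit at cell centres, clamps at the grid edge, and ignores missing (land) cells; the function names are hypothetical, and the tool and settings used in the actual workflow are not described here.

```python
def bilinear(grid, row, col):
    """Interpolate grid at a fractional (row, col) position, clamping at edges."""
    n_rows, n_cols = len(grid), len(grid[0])
    row = min(max(row, 0.0), n_rows - 1.0)
    col = min(max(col, 0.0), n_cols - 1.0)
    r0, c0 = int(row), int(col)
    r1, c1 = min(r0 + 1, n_rows - 1), min(c0 + 1, n_cols - 1)
    fr, fc = row - r0, col - c0
    top = grid[r0][c0] * (1 - fc) + grid[r0][c1] * fc
    bottom = grid[r1][c0] * (1 - fc) + grid[r1][c1] * fc
    return top * (1 - fr) + bottom * fr


def downscale(grid, factor=10):
    """Resample a coarse grid onto a grid `factor` times finer."""
    n_rows, n_cols = len(grid), len(grid[0])
    fine = []
    for i in range(n_rows * factor):
        # Centre of fine row i, expressed in coarse cell-centre coordinates.
        r = (i + 0.5) / factor - 0.5
        fine.append(
            [bilinear(grid, r, (j + 0.5) / factor - 0.5) for j in range(n_cols * factor)]
        )
    return fine


# Two coarse columns with suitability 0% and 100%.
coarse = [[0.0, 100.0],
          [0.0, 100.0]]
fine = downscale(coarse)
```

Interpolation smooths the transition between coarse cells but adds no new information: the effective detail remains that of the 0.5° source.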
3.2.2 NMFS Core Area (ca_nmfs)
Core areas delineated by the National Marine Fisheries Service [NMFS; National Marine Fisheries Service (2019)] identify regions of concentrated use for species under NMFS jurisdiction.
- Species count: limited to species with designated core areas
- Cell values: 100% within core area boundaries
- Role: contributes to merged model as a mask dataset
3.2.3 NMFS Critical Habitat (ch_nmfs)
Critical habitat designated under the Endangered Species Act (ESA) by NMFS for marine species (National Marine Fisheries Service 2025).
- Species count: 34 species
- Cell values: Endangered species = 100%, Threatened species = 50%
- Role: dual function — contributes cell values AND forms part of the spatial mask for model merging
3.2.4 FWS Critical Habitat (ch_fws)
Critical habitat designated under the ESA by the U.S. Fish and Wildlife Service [FWS; U.S. Fish and Wildlife Service (2025b)] for marine and coastal species.
- Species count: 29 species
- Cell values: Endangered species = 100%, Threatened species = 50%
- Role: dual function — contributes cell values AND forms part of the spatial mask
3.2.5 FWS Range (rng_fws)
Current range maps maintained by FWS for ESA-listed marine and coastal species (U.S. Fish and Wildlife Service 2025a).
- Species count: 106 species
- Cell values: Endangered = 100%, Threatened = 50%, Least Concern = 1%
- Role: mask dataset, providing spatial extent for species not covered by AquaMaps
3.2.6 BirdLife Range (bl)
Expert-reviewed range maps from BirdLife International’s Birds of the World [BOTW; BirdLife International (2024)] dataset, representing the most authoritative global seabird distribution data.
- Species count: 573 seabird species
- Cell values: scaled by IUCN Red List category — CR: 50%, EN: 25%, VU: 5%, NT: 2%, LC: 1%
- Role: mask dataset, providing spatial constraints for seabird species
- Note: BirdLife range maps are expert-delineated polygons; cell values reflect conservation status rather than habitat suitability
3.2.7 IUCN Range (rng_iucn)
Range maps from the IUCN Red List of Threatened Species spatial data (IUCN 2025), covering a broad array of marine taxa.
- Cell values: scaled by IUCN Red List category — CR: 50%, EN: 25%, VU: 5%, NT: 2%, LC: 1%
- Role: mask dataset, providing spatial extent for species with IUCN range data
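The two status-based encodings used by the range and critical habitat datasets can be written as lookup tables. This sketch copies the percentages from Table 3.1; the dictionary and function names are hypothetical, and how categories outside these tables (for example Data Deficient) are handled is not specified here, so the sketch simply raises an error for them.

```python
# IUCN Red List scale used by bl and rng_iucn.
REDLIST_PCT = {"CR": 50, "EN": 25, "VU": 5, "NT": 2, "LC": 1}

# ESA scale used by ch_nmfs, ch_fws, rng_fws and rng_turtle_swot_dps
# (the LC entry applies to rng_fws only).
ESA_PCT = {"EN": 100, "TN": 50, "LC": 1}


def cell_value(scheme, category):
    """Return the cell value (%) for a status category under a given scheme."""
    table = {"redlist": REDLIST_PCT, "esa": ESA_PCT}[scheme]
    return table[category]  # KeyError for categories not listed in the table


print(cell_value("redlist", "EN"))  # 25
print(cell_value("esa", "EN"))      # 100
```

Note that the same code, EN, maps to different values under the two schemes, so the scheme must travel with the category.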
3.2.8 SWOT+DPS Turtle Range (rng_turtle_swot_dps)
Sea turtle distributions from the original AquaMaps and IUCN range maps were overly broad, extending well beyond where species actually occur. Furthermore, several sea turtle species have differential ESA protection across their range according to NOAA Fisheries Distinct Population Segments (DPS).
This dataset replaces the IUCN range map as the global spatial mask for all 6 sea turtle species. It combines two data sources:
- SWOT Global Distributions from the State of the World’s Sea Turtles (Wallace et al. 2023), which define more realistic species range polygons.
- NMFS Distinct Population Segments (DPS) from the NOAA Fisheries ESA Species Range Geodatabase (NOAA Fisheries 2025), which identify sub-populations listed as Endangered under the ESA.
For species with DPS data (Loggerhead CC, Green CM, Olive Ridley LO): Endangered DPS areas are coded at 100%, and the remainder of the SWOT range is coded as Threatened at 50%. For purely Endangered species (Leatherback DC, Hawksbill EI, Kemp’s Ridley LK): the entire SWOT range is coded at 100%.
- Species count: 6 sea turtle species
- Cell values: Endangered = 100%, Threatened = 50%
- Role: mask dataset; replaces rng_iucn as the global mask for turtles, producing more restrictive distributions dominated by ESA risk values. See workflow: ingest_turtles-swot-dps.
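The coding rule for turtle cells can be sketched as follows. Cell sets are represented as plain Python sets of hypothetical cell IDs, and the function name is invented for illustration. One assumption is made explicit: Endangered DPS cells falling outside the SWOT range are dropped, which may differ from the actual ingest workflow.

```python
def code_turtle_cells(swot_cells, endangered_dps_cells=None):
    """Return {cell_id: value_pct} for one sea turtle species.

    swot_cells: cell IDs inside the SWOT range polygon.
    endangered_dps_cells: cell IDs inside Endangered DPS areas, or None
        when the species is listed as Endangered throughout its range.
    """
    if endangered_dps_cells is None:
        # Purely Endangered species: the whole SWOT range is coded at 100%.
        return {cell: 100 for cell in swot_cells}
    # Assumption: only cells within the SWOT range are kept.
    return {
        cell: 100 if cell in endangered_dps_cells else 50
        for cell in swot_cells
    }


# A species with DPS data (e.g. Loggerhead) versus a purely Endangered one.
with_dps = code_turtle_cells({1, 2, 3, 4}, endangered_dps_cells={3, 4, 9})
all_endangered = code_turtle_cells({1, 2})
```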
3.3 Merged Model (ms_merge)
The merged model is the derived output of the model merging pipeline (see Chapter 6). For each of the 9,819 valid species, cell values are computed as:
- MAX across all source dataset values for that species in each cell
- Masked to the spatial extent defined by IUCN, NMFS CH, and FWS CH ranges (when available)
- Floored at minimum values for MMPA-protected species (20%) and MBTA-protected species (10%)
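The three steps can be combined for a single species and cell as in the sketch below. The order of operations (mask, then MAX, then floor), the treatment of cells outside the mask as absent, and the function name `finalize_cell` are assumptions made for illustration; Chapter 6 describes the actual pipeline.

```python
MMPA_FLOOR = 20  # minimum value (%) for MMPA-protected species
MBTA_FLOOR = 10  # minimum value (%) for MBTA-protected species


def finalize_cell(source_values, in_mask=True, is_mmpa=False, is_mbta=False):
    """Merged value (%) for one species in one cell, or None if excluded.

    source_values: values from all source datasets covering this cell.
    in_mask: False when a mask exists for the species and excludes this cell;
        leave True when no mask is available.
    """
    if not in_mask or not source_values:
        return None
    value = max(source_values)
    if is_mmpa:
        value = max(value, MMPA_FLOOR)
    if is_mbta:
        value = max(value, MBTA_FLOOR)
    return value


print(finalize_cell([5, 12], is_mmpa=True))   # 20: raised to the MMPA floor
print(finalize_cell([60, 30], is_mmpa=True))  # 60: already above the floor
```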
3.4 Standard Grid
All datasets are aligned to a standard 0.05° × 0.05° latitude-longitude grid covering US waters within BOEM Program Areas. At mid-latitudes, each cell represents approximately 4 × 4 km (16 km²). The grid is stored as a reference raster with unique cell_id values, enabling efficient joins between spatial data and tabular attributes in DuckDB.
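As an illustration of how a coordinate maps to a grid cell, the sketch below derives a row-major cell index from longitude and latitude. The numbering scheme (starting at 1 in the north-west corner), the global extent used in the example and the function name are all hypothetical; the actual cell_id values come from the reference raster.

```python
import math

RES = 0.05  # grid resolution in degrees


def cell_index(lon, lat, xmin, ymax, n_cols):
    """Row-major index (from 1) of the cell containing (lon, lat).

    xmin, ymax: west and north edges of the grid (hypothetical extent).
    n_cols: number of columns in the grid.
    """
    col = math.floor((lon - xmin) / RES)
    row = math.floor((ymax - lat) / RES)
    return row * n_cols + col + 1


# Hypothetical global extent: 360 / 0.05 = 7200 columns.
first = cell_index(-179.975, 89.975, xmin=-180.0, ymax=90.0, n_cols=7200)
second = cell_index(-179.925, 89.975, xmin=-180.0, ymax=90.0, n_cols=7200)
next_row = cell_index(-179.975, 89.925, xmin=-180.0, ymax=90.0, n_cols=7200)
```

Once every dataset carries the same integer key, combining datasets reduces to an ordinary equality join on that key rather than a spatial operation.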
3.5 Primary Productivity
Primary productivity is part of BOEM’s explicit management mandate. Section 18(a)(2) of the Outer Continental Shelf Lands Act (OCSLA), as amended in 1978, specifies 8 factors that the USDOI must consider in the timing and location of OCS oil and gas activities, including “the relative environmental sensitivity and marine productivity of different areas of the OCS.”
We use satellite-derived net primary productivity (NPP) from the Vertically Generalized Production Model [VGPM; Behrenfeld and Falkowski (1997)] product from Oregon State’s Ocean Productivity Lab, based on VIIRS satellite data for 2014–2023. Values are converted to metric tons C km⁻² yr⁻¹ and averaged across the time period.
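The unit conversion can be checked with a short sketch. It assumes the source values are daily rates in mg C m⁻² day⁻¹ (the units in which VGPM products are commonly distributed) and uses a 365-day year; both are assumptions, as is the function name.

```python
# mg -> metric tons (1e-9), per m2 -> per km2 (1e6), per day -> per year (365).
MG_M2_DAY_TO_T_KM2_YR = 365 * 1e6 / 1e9  # = 0.365


def mean_annual_npp(daily_rates_mg_m2_day):
    """Average NPP rates (mg C m-2 day-1) and convert to t C km-2 yr-1."""
    mean_rate = sum(daily_rates_mg_m2_day) / len(daily_rates_mg_m2_day)
    return mean_rate * MG_M2_DAY_TO_T_KM2_YR


# Two hypothetical rates for one cell, averaging 1000 mg C m-2 day-1.
npp = mean_annual_npp([500.0, 1500.0])
print(round(npp, 1))  # 365.0 metric tons C km-2 yr-1
```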
