erDiagram %% https://mermaid.js.org/syntax/entityRelationshipDiagram.html %% tables sdm_datasets { str ds_key PK "dataset key" str name str link str citation str response_type } sdm_models { str ds_key PK "dataset key" int mdl_id PK "model ID" str description "description of model" str sp_key FK "species key" str population "subpopulation" str time_period "start/duration (ISO 8601)" str var "variable, e.g.: cv, n, se" } sdm_species { str ds_key PK "dataset key" str sp_key PK "species key" int aphia_id FK "singular taxa ID" arr aphia_ids FK "array of all taxa IDs" str sp_scientific str sp_common str description } sdm_geometries { str ds_key PK "dataset key" int geom_id PK "geometry ID" geo geom "geometry"} sdm_values { str ds_key PK "dataset key" int mdl_id PK "model ID" int geom_id PK "geometry ID" dbl val "value"} taxa { int aphia_id PK "unique ID at MarineSpecies.org" str scientific_name str common_name str rank str iucn_cat "IUCN Red List extinction risk category" } %% relationships sdm_datasets ||--|{ sdm_models : ds_key sdm_datasets ||--|{ sdm_species : ds_key sdm_datasets ||--|{ sdm_geometries : ds_key sdm_models ||--|{ sdm_species : "ds_key,sp_key" sdm_geometries ||--|{ sdm_values : "ds_key,geom_id" sdm_models ||--|{ sdm_values : "ds_key,mdl_id" taxa ||--|{ sdm_species : aphia_id taxa }o--|{ sdm_species : aphia_ids
Create SDM Tables
storing species distribution model outputs in a database for analysis
1 Species Distribution Models
The distribution of models varies geographically by extent, spatial analytical units, taxa, time period and output variable. To store the original data, we ingest data into the following database public
schema:
Specific columns:
- sdm_datasets:
name_short
short name in form of {source_broad} {regions} {taxa_groups} {response_type}name_long
original nameresponse_type
one of: occurrence, range, suitability, probability or densitytaxa_groups
one or more of: fish, invertebrates, marine mammals, cetaceans, sea turtles, seabirds, etc.
- sdm_models`
time_period
time interval in format of {start}/{duration} per ISO 8601, e.g. “2019-02/P1M” is a duration of one month starting February, 2019. See https://en.wikipedia.org/wiki/ISO_8601#Time_intervals.
1.1 Data Dictionary
Data types: - str
string: variable-length character string, up to 1 GB; - ser
serial: 1 to 2,147,483,647; - bser
bigserial: 1 to 9,223,372,036,854,775,807; - int
integer: -2,147,483,648 to 2,147,483,647; - dbl
double precision: 15 decimal digits of precision;
2 Database Table and Column Naming Conventions
- Table names are plural and use all lower case.
- Unique identifiers are suffixed with:
*_id
for unique integer keys;*_key
for unique string keys;*_seq
for auto-incrementing sequence integer keys.
- Column names are singular and use snake_case.
- Foreign keys are named with the singular form of the table they reference, followed by _id.
- Primary keys are named id.