Build CMIP6 experiment output file index — init_cmip6

init_cmip6_index() searches for CMIP6 model output files using esgf_query() and returns a data.table::data.table() containing the actual NetCDF file URLs to download. If save is TRUE, the index is also stored in the user data directory for future use.

init_cmip6_index(
  activity = "ScenarioMIP",
  variable = c("tas", "tasmax", "tasmin", "hurs", "hursmax", "hursmin", "pr", "rsds",
    "rlds", "psl", "sfcWind", "clt"),
  frequency = "day",
  experiment = c("ssp126", "ssp245", "ssp370", "ssp585"),
  source = c("AWI-CM-1-1-MR", "BCC-CSM2-MR", "CESM2", "CESM2-WACCM", "EC-Earth3",
    "EC-Earth3-Veg", "GFDL-ESM4", "INM-CM4-8", "INM-CM5-0", "MPI-ESM1-2-HR",
    "MRI-ESM2-0"),
  variant = "r1i1p1f1",
  replica = FALSE,
  latest = TRUE,
  resolution = c("100 km", "50 km"),
  limit = 10000L,
  data_node = NULL,
  years = NULL,
  save = FALSE
)

Arguments

activity

A character vector indicating activity identifiers. Default: "ScenarioMIP". Possible values:

"AerChemMIP": Aerosols and Chemistry Model Intercomparison Project,
"C4MIP": Coupled Climate Carbon Cycle Model Intercomparison Project,
"CDRMIP": Carbon Dioxide Removal Model Intercomparison Project,
"CFMIP": Cloud Feedback Model Intercomparison Project,
"CMIP": CMIP DECK: 1pctCO2, abrupt4xCO2, amip, esm-piControl, esm-historical, historical, and piControl experiments,
"CORDEX": Coordinated Regional Climate Downscaling Experiment,
"DAMIP": Detection and Attribution Model Intercomparison Project,
"DCPP": Decadal Climate Prediction Project,
"DynVarMIP": Dynamics and Variability Model Intercomparison Project,
"FAFMIP": Flux-Anomaly-Forced Model Intercomparison Project,
"GMMIP": Global Monsoons Model Intercomparison Project,
"GeoMIP": Geoengineering Model Intercomparison Project,
"HighResMIP": High-Resolution Model Intercomparison Project,
"ISMIP6": Ice Sheet Model Intercomparison Project for CMIP6,
"LS3MIP": Land Surface, Snow and Soil Moisture,
"LUMIP": Land-Use Model Intercomparison Project,
"OMIP": Ocean Model Intercomparison Project,
"PAMIP": Polar Amplification Model Intercomparison Project,
"PMIP": Palaeoclimate Modelling Intercomparison Project,
"RFMIP": Radiative Forcing Model Intercomparison Project,
"SIMIP": Sea Ice Model Intercomparison Project,
"ScenarioMIP": Scenario Model Intercomparison Project,
"VIACSAB": Vulnerability, Impacts, Adaptation and Climate Services Advisory Board,
"VolMIP": Volcanic Forcings Model Intercomparison Project

variable

A character vector indicating variable identifiers. The 12 variables most relevant to EPW are set as defaults. If NULL, all possible variables are returned. Default: c("tas", "tasmax", "tasmin", "hurs", "hursmax", "hursmin", "pr", "rsds", "rlds", "psl", "sfcWind", "clt"), where:

tas: Near-surface (usually, 2 meter) air temperature, units: K.
tasmax: Maximum near-surface (usually, 2 meter) air temperature, units: K.
tasmin: Minimum near-surface (usually, 2 meter) air temperature, units: K.
hurs: Near-surface relative humidity, units: %.
hursmax: Maximum near-surface relative humidity, units: %.
hursmin: Minimum near-surface relative humidity, units: %.
psl: Sea level pressure, units: Pa.
rsds: Surface downwelling shortwave radiation, units: W m-2.
rlds: Surface downwelling longwave radiation, units: W m-2.
sfcWind: Near-surface (usually, 10 meters) wind speed, units: m s-1.
pr: Precipitation, units: kg m-2 s-1.
clt: Total cloud area fraction for the whole atmospheric column, as seen from the surface or the top of the atmosphere. Units: %.
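
The units listed above are the native CMIP6 units, which often differ from those used in weather files. As a minimal sketch in base R, the conversions below use made-up sample values (not real model output) and hypothetical variable names:

```r
# Hypothetical sample values in the native units listed above
tas_k   <- c(273.15, 293.15, 303.15)  # air temperature, K
pr_flux <- c(0, 1e-5, 5e-5)           # precipitation flux, kg m-2 s-1
psl_pa  <- c(101325, 100000)          # sea level pressure, Pa

# K to degrees Celsius
tas_c <- tas_k - 273.15

# 1 kg m-2 of liquid water is a 1 mm layer, and a day has 86400 s,
# so a flux in kg m-2 s-1 times 86400 gives mm per day
pr_mm_day <- pr_flux * 86400

# Pa to hPa
psl_hpa <- psl_pa / 100

print(tas_c)
print(pr_mm_day)
print(psl_hpa)
```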

frequency

A character vector of sampling frequency. If NULL, all possible frequencies are returned. Default: "day". Possible values:

"1hr": sampled hourly,
"1hrCM": monthly-mean diurnal cycle resolving each day into 1-hour means,
"1hrPt": sampled hourly, at specified time point within an hour,
"3hr": sampled every 3 hours,
"3hrPt": sampled 3 hourly, at specified time point within the time period,
"6hr": sampled every 6 hours,
"6hrPt": sampled 6 hourly, at specified time point within the time period,
"day": daily mean samples,
"dec": decadal mean samples,
"fx": fixed (time invariant) field,
"mon": monthly mean samples,
"monC": monthly climatology computed from monthly mean samples,
"monPt": sampled monthly, at specified time point within the time period,
"subhrPt": sampled sub-hourly, at specified time point within an hour,
"yr": annual mean samples,
"yrPt": sampled yearly, at specified time point within the time period

experiment

A character vector indicating root experiment identifiers. The Tier-1 experiments of activity ScenarioMIP are set as defaults. If NULL, all possible experiments are returned. Default: c("ssp126", "ssp245", "ssp370", "ssp585").

source

A character vector indicating model identifiers. Defaults are set to the 11 sources that provide outputs for all 4 default ScenarioMIP experiments at daily frequency, i.e. "AWI-CM-1-1-MR", "BCC-CSM2-MR", "CESM2", "CESM2-WACCM", "EC-Earth3", "EC-Earth3-Veg", "GFDL-ESM4", "INM-CM4-8", "INM-CM5-0", "MPI-ESM1-2-HR" and "MRI-ESM2-0". If NULL, all possible sources are returned.

variant

A character vector indicating variant labels. Each label is constructed from 4 indices stored as global attributes, in the format r<k>i<l>p<m>f<n> described below. Default: "r1i1p1f1". If NULL, all possible variants are returned.

r: realization_index (<k>) = realization number (integer >0)
i: initialization_index (<l>) = index for variant of initialization method (integer >0)
p: physics_index (<m>) = index for model physics variant (integer >0)
f: forcing_index (<n>) = index for variant of forcing (integer >0)
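
To make the label format concrete, here is a small base R sketch that splits variant labels into their four indices. The helper name parse_variant() is hypothetical and is not part of the documented API; it only illustrates the r<k>i<l>p<m>f<n> pattern described above.

```r
# Hypothetical helper: split variant labels into their four integer indices.
# Labels that do not match r<k>i<l>p<m>f<n> with all indices > 0 yield NA.
parse_variant <- function(x) {
  pattern <- "^r([0-9]+)i([0-9]+)p([0-9]+)f([0-9]+)$"
  matches <- regmatches(x, regexec(pattern, x))
  out <- t(vapply(matches, function(parts) {
    if (length(parts) != 5L) return(rep(NA_integer_, 4L))
    vals <- as.integer(parts[-1L])  # drop the full match, keep the 4 groups
    if (all(vals > 0L)) vals else rep(NA_integer_, 4L)
  }, integer(4L)))
  colnames(out) <- c("realization", "initialization", "physics", "forcing")
  rownames(out) <- x
  out
}

res <- parse_variant(c("r1i1p1f1", "r10i2p1f3", "not-a-variant"))
print(res)
```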

replica

Whether the record is the "master" copy, or a replica. Use FALSE to return only originals and TRUE to return only replicas. Use NULL to return both the master and the replicas. Default: FALSE.

latest

Whether the record is the latest available version, or a previous version. Use TRUE to return only the latest version of all records and FALSE to return previous versions. Default: TRUE.

resolution

A character vector indicating approximate horizontal resolution. Default: c("100 km", "50 km"). If NULL, all possible resolutions are returned.

limit

An integer indicating the maximum number of matched records to return. Should be <= 10,000. Default: 10000.

data_node

A character vector indicating data nodes to be queried. Default: NULL, which means all possible data nodes.

years

An integer vector indicating the target years to be included in the data files. Files that cover none of these years are excluded. If NULL, no subsetting on years is performed. Default: NULL.
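
One plausible reading of this filter is that a file is kept when its time span contains at least one target year. The base R sketch below illustrates that idea on a made-up table; the file names, dates, and the covers_any_year() helper are all hypothetical, and this is an assumption about the behaviour, not the function's actual implementation.

```r
# Made-up file records with start and end times, for illustration only
files <- data.frame(
  file = c("a.nc", "b.nc", "c.nc"),
  datetime_start = as.POSIXct(c("2015-01-01", "2050-01-01", "2065-01-01"), tz = "UTC"),
  datetime_end   = as.POSIXct(c("2049-12-31", "2064-12-31", "2100-12-31"), tz = "UTC"),
  stringsAsFactors = FALSE
)
years <- c(2060L, 2080L)

# Hypothetical helper: TRUE when [start, end] contains at least one target year
covers_any_year <- function(start, end, years) {
  start_year <- as.integer(format(start, "%Y"))
  end_year   <- as.integer(format(end, "%Y"))
  vapply(seq_along(start_year), function(i) {
    any(years >= start_year[i] & years <= end_year[i])
  }, logical(1))
}

kept <- files[covers_any_year(files$datetime_start, files$datetime_end, years), ]
print(kept$file)
```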

save

If TRUE, the results are saved to the user data directory. Default: FALSE.

Value

A data.table::data.table with 22 columns:

No.	Column	Type	Description
1	`file_id`	Character	Model output file universal identifier
2	`dataset_id`	Character	Dataset universal identifier
3	`mip_era`	Character	Activity's associated CMIP cycle. Will always be `"CMIP6"`
4	`activity_drs`	Character	Activity DRS (Data Reference Syntax)
5	`institution_id`	Character	Institution identifier
6	`source_id`	Character	Model identifier
7	`experiment_id`	Character	Root experiment identifier
8	`member_id`	Character	A compound construction from `sub_experiment_id` and `variant_label`
9	`table_id`	Character	Table identifier
10	`frequency`	Character	Sampling frequency
11	`grid_label`	Character	Grid identifier
12	`version`	Character	Approximate date of model output file
13	`nominal_resolution`	Character	Approximate horizontal resolution
14	`variable_id`	Character	Variable identifier
15	`variable_long_name`	Character	Variable long name
16	`variable_units`	Character	Units of variable
17	`datetime_start`	POSIXct	Start date and time of simulation
18	`datetime_end`	POSIXct	End date and time of simulation
19	`file_size`	Character	Model output file size in Bytes
20	`data_node`	Character	Data node to download the model output file
21	`dataset_pid`	Character	A unique string that helps identify the dataset
22	`tracking_id`	Character	A unique string that helps identify the output file
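
Because file_size is listed above as a character column, it needs converting before any arithmetic. The sketch below estimates download volume per model using a small mock table with made-up values that mimics three of the columns; a plain data.frame stands in for the returned data.table so that the example runs without extra packages.

```r
# Mock stand-in for the returned index (made-up values, 3 of the 22 columns)
idx <- data.frame(
  source_id   = c("CESM2", "CESM2", "EC-Earth3"),
  variable_id = c("tas", "pr", "tas"),
  file_size   = c("1500000000", "500000000", "2000000000"),
  stringsAsFactors = FALSE
)

# Use as.numeric() rather than as.integer(): R integers are 32-bit,
# so sizes above about 2.1 GB would overflow to NA
idx$size_gb <- as.numeric(idx$file_size) / 1e9

# Total download volume per model, in GB
total_by_source <- tapply(idx$size_gb, idx$source_id, sum)
print(total_by_source)
```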

Details

For details on where the file index is stored, see rappdirs::user_data_dir().

Note

The limit argument applies only to the dataset query. For each matched dataset, init_cmip6_index() will try to get all model output files that match the dataset id.

Examples

if (FALSE) {
init_cmip6_index()
}
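
A narrower query than the defaults can be assembled from the arguments documented above. The following is a sketch only: the chosen values are illustrative, and the call is guarded so that it runs only in an interactive session where init_cmip6_index() is already available (the providing package attached) and the ESGF servers are reachable.

```r
# Illustrative argument values; every name used here is documented above
query_args <- list(
  variable   = c("tas", "hurs"),
  experiment = "ssp585",
  source     = "EC-Earth3",
  years      = c(2050L, 2080L),
  save       = FALSE
)

# Guarded call: skipped in non-interactive runs or when the function is missing
if (interactive() && exists("init_cmip6_index", mode = "function")) {
  idx <- do.call(init_cmip6_index, query_args)
  print(head(idx))
}
```

Arguments left out of the list keep their documented defaults, so this query still targets ScenarioMIP at daily frequency.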