Extract data — extract_data • epwshiftr

extract_data() takes an epw_cmip6_coord object generated by match_coord() and extracts CMIP6 data for the specified coordinates and years of interest.

extract_data(
  coord,
  years = NULL,
  unit = FALSE,
  out_dir = NULL,
  by = NULL,
  keep = is.null(out_dir),
  compress = 100
)

Arguments

coord

An epw_cmip6_coord object created using match_coord().

years

An integer vector indicating the target years to be included in the data file. All other years will be excluded. If NULL, no subsetting on years will be performed. Default: NULL.

unit

If TRUE, units will be added to values using units::set_units(). Default: FALSE.

out_dir

The directory to save extracted data using fst::write_fst(). If NULL, all data will be kept in memory by default. Default: NULL.

by

A character vector of variable names used to split data during extraction. Should be a subset of:

"experiment": root experiment identifiers
"source": model identifiers
"variable": variable identifiers
"activity": activity identifiers
"frequency": sampling frequency
"variant": variant label
"resolution": approximate horizontal resolution

If NULL and out_dir is given, the data are not split and are written to a file named data.fst. Default: NULL.

keep

Whether to keep the extracted data in memory. Default: TRUE if out_dir is NULL, and FALSE otherwise.

compress

A single integer in the range 0 to 100, indicating the amount of compression to use. Lower values mean larger file sizes. Default: 100.
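The out_dir, by, keep and compress arguments work together when the extracted data should go to disk rather than stay in memory. The following is a minimal sketch, not a definitive workflow: extract_to_disk() is a hypothetical helper introduced here for illustration, and it assumes that out_dir already exists, that match_coord() can be called as in the Examples section, and that data.fst is written directly inside out_dir.

```r
# Hypothetical helper (not part of epwshiftr): extract CMIP6 data to disk,
# then read the saved file back as a data.table.
extract_to_disk <- function(epw_path, out_dir, years = NULL) {
  coord <- epwshiftr::match_coord(epw_path)

  # With `by = NULL` and an `out_dir`, the file name "data.fst" is used.
  # `keep = FALSE` avoids also holding the extracted data in memory.
  epwshiftr::extract_data(
    coord,
    years = years,
    out_dir = out_dir,
    by = NULL,
    keep = FALSE,
    compress = 100
  )

  # Assumption: the file sits directly inside `out_dir`.
  fst::read_fst(file.path(out_dir, "data.fst"), as.data.table = TRUE)
}
```

Passing, for example, by = c("experiment", "source") instead splits the output during extraction. The resulting file names are not described on this page, so inspect out_dir with list.files() in that case.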

Value

An epw_cmip6_data object, which is essentially a list of 3 elements:

epw: An eplusr::Epw object whose longitude and latitude are used to extract CMIP6 data. It is the same object as created in match_coord().
meta: A list containing basic metadata of the input EPW, including city, state_province, country, latitude and longitude.
data: An empty data.table::data.table() if keep is FALSE, or a data.table::data.table() of 14 columns if keep is TRUE:

No.	Column	Type	Description
1	`activity_drs`	Character	Activity DRS (Data Reference Syntax)
2	`institution_id`	Character	Institution identifier
3	`source_id`	Character	Model identifier
4	`experiment_id`	Character	Root experiment identifier
5	`member_id`	Character	A compound construction from `sub_experiment_id` and `variant_label`
6	`table_id`	Character	Table identifier
7	`lon`	Double	Longitude of extracted location
8	`lat`	Double	Latitude of extracted location
9	`dist`	Double	The spherical distance in km between EPW location and grid coordinates
10	`datetime`	POSIXct	Datetime for the predicted value
11	`variable`	Character	Variable identifier
12	`description`	Character	Variable long name
13	`units`	Character	Units of variable
14	`value`	Double	The actual predicted value
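Because data is a long-format data.table with one row per location, datetime and variable, summaries are usually a grouped aggregation. The sketch below uses a small made-up table with a subset of the columns listed above. The model name "ModelA" and all values are invented for illustration and are not output from extract_data().

```r
library(data.table)

# Toy stand-in for the `data` element; names and values are made up.
dt <- data.table(
  source_id     = "ModelA",
  experiment_id = "ssp585",
  datetime      = as.POSIXct(
    c("2050-01-01", "2050-07-01", "2051-01-01", "2051-07-01"),
    tz = "UTC"
  ),
  variable      = "tas",
  units         = "K",
  value         = c(280, 300, 282, 302)
)

# Mean value per model, experiment, variable and calendar year.
annual <- dt[
  ,
  .(mean_value = mean(value)),
  by = .(source_id, experiment_id, variable, units,
         year = data.table::year(datetime))
]

print(annual)
```

The same pattern should apply to the real table, for instance by adding lon and lat to by when more than one grid point is matched.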

Details

extract_data() supports common calendars, including 365_day and 360_day, thanks to the PCICt package.

extract_data() uses future.apply underneath. You can use your preferred future backend to run data extraction in parallel. By default, extract_data() uses the future::sequential backend, which runs everything sequentially.
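Switching the backend is done with future::plan() before calling extract_data(). The following is a minimal sketch: extract_parallel() is a hypothetical wrapper, and the choice of future::multisession with 2 workers is only an assumption for illustration. Any backend supported by the future package could be substituted.

```r
# Hypothetical wrapper (not part of epwshiftr): run extract_data() under a
# parallel future backend and restore the previous plan afterwards.
extract_parallel <- function(coord, years = NULL, workers = 2) {
  # plan() returns the previous plan, so it can be restored on exit.
  old_plan <- future::plan(future::multisession, workers = workers)
  on.exit(future::plan(old_plan), add = TRUE)

  epwshiftr::extract_data(coord, years = years)
}
```

Restoring the previous plan with on.exit() keeps the change local to this call, so other code in the session continues to use whatever backend was set before.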

Examples

if (FALSE) {
  # Match the EPW location to CMIP6 grid coordinates
  coord <- match_coord("path_to_an_EPW")
  # Extract data for 2030 to 2060 only
  extract_data(coord, years = 2030:2060)
}