fre.pp.split_netcdf_script module

fre.pp.split_netcdf_script.fre_outfile_name(infile, varname)

Builds split-variable filenames the way that fre expects them (and in a way that should work for any .nc file).

infile: string name of a file with a . somewhere in the filename
varname: string to add to the infile

This is expected to work with files named in the following way:

Fre input format: date.component(.tileX).nc
Fre output format: date.component.var(.tileX).nc

but it should also work on any file named filename.nc.
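The naming rule above can be illustrated with a minimal sketch. This is not the fre implementation; split_var_filename is a hypothetical helper, and treating the tile element as "tile" followed by digits is an assumption drawn from the sample filenames.

```python
import re


def split_var_filename(infile: str, varname: str) -> str:
    """Hypothetical helper: place varname before an optional .tileX element,
    otherwise directly before the file extension."""
    parts = infile.split(".")
    if len(parts) < 2:
        raise ValueError(f"expected a '.' in the filename, got {infile!r}")
    # Default insertion point: just before the final extension (e.g. "nc").
    insert_at = len(parts) - 1
    # ASSUMPTION: a tile element looks like "tile" followed by digits.
    if len(parts) >= 3 and re.fullmatch(r"tile\d+", parts[-2]):
        insert_at -= 1
    parts.insert(insert_at, varname)
    return ".".join(parts)


print(split_var_filename("19790101.atmos_tracer.tile6.nc", "temp"))
print(split_var_filename("filename.nc", "temp"))
```

With these inputs the variable name lands before the tile element in the first case and before the extension in the second.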

fre.pp.split_netcdf_script.parse_yaml_for_varlist(yamlfile, yamlcomp, hist_source='none')

Given a yaml config file, parses the structure looking for the list of variables to postprocess (https://github.com/NOAA-GFDL/fre-workflows/issues/51) and returns “all” if no such list is found.

yamlfile: .yml file used for fre pp configuration
yamlcomp: string, one of the components in the yamlfile
hist_source: string, optional; allows you to check that the hist_source is under the specified component
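The lookup described above might be sketched as follows, operating on an already-loaded configuration dictionary rather than a .yml file. The layout used here (postprocess, components, type, sources, history_file, variables) is an assumption made for illustration and is not confirmed by this page; varlist_for_component is a hypothetical name, and raising an error when hist_source is missing is likewise an assumption about the behavior.

```python
def varlist_for_component(config: dict, component: str, hist_source: str = "none"):
    """Hypothetical sketch: return the component's variable list, or "all"."""
    # ASSUMPTION: components are listed under config["postprocess"]["components"],
    # each a dict with a "type" name and optional "sources" / "variables" keys.
    components = config.get("postprocess", {}).get("components", [])
    matches = [c for c in components if c.get("type") == component]
    if not matches:
        raise ValueError(f"component {component!r} not found in config")
    comp = matches[0]

    if hist_source != "none":
        # ASSUMPTION: each source is a dict carrying a "history_file" name.
        sources = [s.get("history_file") for s in comp.get("sources", [])]
        if hist_source not in sources:
            raise ValueError(
                f"history source {hist_source!r} is not under component {component!r}"
            )

    # Fall back to the string "all" when no variable list is given.
    return comp.get("variables", "all")


example = {
    "postprocess": {
        "components": [
            {"type": "atmos",
             "sources": [{"history_file": "atmos_month"}],
             "variables": ["temp", "ps"]},
            {"type": "ocean",
             "sources": [{"history_file": "ocean_month"}]},
        ]
    }
}
print(varlist_for_component(example, "atmos"))
print(varlist_for_component(example, "ocean"))
```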

fre.pp.split_netcdf_script.set_coord_encoding(dset, vcoords)

Gets the encoding settings needed for xarray to write out the coordinates as expected. We need the list of all vars (varnames) because that’s how you get coords for the metadata vars (i.e. nv or bnds for time_bnds).

dset: xarray dataset object
varname: name (string) of data variable we intend to write to file
varnames: list of all variables (strings) in the dataset; needed to get names of all coordinate variables, since coordinate status is defined only in relation to a variable

Note: this code removes _FillValue from coordinates. CF-compliant files do not have _FillValue on coordinates, and xarray does not have a good way to get _FillValue from coordinates. Letting xarray set _FillValue for coordinates when coordinates have a _FillValue gets you wrong metadata, and bad metadata is worse than no metadata. Dropping the attribute if it’s present seems to be the lesser of two evils.
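As a rough illustration of the note above, the following sketch builds a per-coordinate encoding dictionary that asks xarray not to write _FillValue. It is not the fre code; coord_encoding_without_fill is a hypothetical helper, and the coordinate names in the example are made up.

```python
def coord_encoding_without_fill(coord_names):
    """Hypothetical helper: build an encoding dict that suppresses _FillValue
    for every coordinate name given."""
    # In xarray, an encoding entry of {"_FillValue": None} tells the writer
    # not to emit a _FillValue attribute for that variable.
    return {name: {"_FillValue": None} for name in coord_names}


encoding = coord_encoding_without_fill(["time", "lat", "lon"])
print(encoding)
```

A dictionary of this shape is what the encoding argument of xarray's Dataset.to_netcdf accepts, keyed by variable name.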

fre.pp.split_netcdf_script.set_var_encoding(dset, varnames)

Gets the encoding settings needed for xarray to write out the variables as expected. Mostly addressed to time_bnds, because xarray can drop the units attribute.

dset: xarray dataset object
varnames: list of variables (strings) that will be written to file

fre.pp.split_netcdf_script.split_file_xarray(infile, outfiledir, var_list='all', verbose=False)

Given a netcdf infile containing one or more data variables, writes out a separate file for each data variable in the file, including the variable name in the filename. If var_list is specified, only the vars in var_list are written to file; if no vars in the file match the vars in var_list, no files are written.

infile: input netcdf file
outfiledir: writeable directory to which to write netcdf files
var_list: python list of string variable names or a string “all”
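The var_list filtering described above can be sketched in isolation, without reading a netcdf file. select_vars_to_write is a hypothetical helper and the variable names are invented for the example; the real function may filter differently.

```python
def select_vars_to_write(data_vars, var_list="all"):
    """Hypothetical helper: pick which data variables get their own file."""
    # The string "all" means every data variable in the file is written.
    if var_list == "all":
        return list(data_vars)
    wanted = set(var_list)
    # Keep the file's own ordering; variables absent from the file are ignored.
    return [name for name in data_vars if name in wanted]


in_file = ["temp", "ps", "precip"]
print(select_vars_to_write(in_file))
print(select_vars_to_write(in_file, ["ps", "salt"]))
print(select_vars_to_write(in_file, ["salt"]))
```

An empty result corresponds to the case where no files are written.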

fre.pp.split_netcdf_script.split_netcdf(inputDir, outputDir, component, history_source, use_subdirs, yamlfile)

Given a directory of netcdf files, splits those netcdf files into separate files for each data variable and copies the data variable files of interest to the output directory. Intended to work with data structured for fre-workflows and fre-workflows file naming conventions.

Sample infile name convention: “19790101.atmos_tracer.tile6.nc”

inputDir - directory containing netcdf files
outputDir - directory to which to write netcdf files
component - the ‘component’ element we are currently working with in the yaml
history_source - a history_file under a ‘source’ under the ‘component’ that we are working with. Is used to identify the files in inputDir.
use_subdirs - whether to recursively search through inputDir under the subdirectories. Used when regridding.
yamlfile - a .yml config file for fre postprocessing
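How history_source and use_subdirs could be used to locate input files is sketched below. This is an illustration under assumptions, not the fre implementation: matches_history_source and find_history_files are hypothetical names, and the filename pattern is inferred from the sample name “19790101.atmos_tracer.tile6.nc”.

```python
import re
from pathlib import Path


def matches_history_source(filename: str, history_source: str) -> bool:
    """Hypothetical check: does filename look like date.history_source(.tileX).nc?"""
    # ASSUMPTION: the tile element, when present, is "tile" followed by digits.
    pattern = rf".+\.{re.escape(history_source)}(\.tile\d+)?\.nc"
    return re.fullmatch(pattern, filename) is not None


def find_history_files(input_dir, history_source, use_subdirs=False):
    """Hypothetical helper: list matching .nc files, optionally recursing."""
    root = Path(input_dir)
    # rglob walks subdirectories; glob looks only at the top level.
    candidates = root.rglob("*.nc") if use_subdirs else root.glob("*.nc")
    return sorted(
        p for p in candidates
        if p.is_file() and matches_history_source(p.name, history_source)
    )


print(matches_history_source("19790101.atmos_tracer.tile6.nc", "atmos_tracer"))
print(matches_history_source("19790101.atmos_month.nc", "atmos_tracer"))
```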