Preprocessing Functions

The preprocessing module helps correct dive drift and depth offsets. The offset is calculated using a rolling time window, similar to what is explained here.

There are two methods for the main function, correct_depth_offset():

  • max: zeros the local maximum and uses that difference as the offset for the remaining values
  • mean: uses the time window and a maximum depth (surface_threshold) to find the average offset within the window
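As a rough illustration of the max method described above, here is a minimal sketch in plain Python. It is not divebomb's implementation: the function name rolling_max_offset is hypothetical, the window is treated as a trailing window ending at each sample, and depths are assumed to be stored as negative values below the surface, so that the maximum in a window is the sample closest to the surface.

```python
def rolling_max_offset(times, depths, window=3600):
    """Zero the local maximum in each trailing time window.

    Assumptions (not taken from divebomb itself):
      * times are in seconds and sorted in ascending order
      * depths are negative below the surface, so the window maximum
        is the sample closest to the surface
    Returns (corrected_depths, offsets).
    """
    offsets = []
    for t in times:
        # Samples inside the trailing window (t - window, t]; this always
        # includes the current sample, so the list is never empty.
        in_window = [d for s, d in zip(times, depths) if t - window < s <= t]
        offsets.append(max(in_window))
    # Subtracting the offset moves the local maximum to zero.
    corrected = [d - o for d, o in zip(depths, offsets)]
    return corrected, offsets


times = [0, 10, 20, 30]
depths = [-1.5, -10.0, -1.0, -8.0]
corrected, offsets = rolling_max_offset(times, depths, window=25)
```

The quadratic loop keeps the sketch dependency-free; with pandas, a time-indexed rolling maximum would express the same idea more compactly.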

divebomb.preprocessing.calculate_window_mean(window, surface_threshold, df)
Parameters:
  • window – an int to determine the size for a rolling median
  • surface_threshold – the maximum depth that will be considered for the offset
  • df – pandas DataFrame of the dive data
Returns:

An average offset in meters using the defined window
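The idea of a windowed mean offset can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: window_mean_offset is a hypothetical name, the window is a trailing window in seconds, only samples no deeper than surface_threshold contribute, and a window with no near-surface samples falls back to an offset of 0.0.

```python
def window_mean_offset(times, depths, window, surface_threshold):
    """Average the near-surface depths inside each trailing time window.

    Assumptions (not taken from divebomb itself):
      * times are in seconds, depths in meters
      * a sample counts as near-surface when abs(depth) <= surface_threshold
      * a window without near-surface samples yields an offset of 0.0
    Returns one offset per sample.
    """
    offsets = []
    for t in times:
        near_surface = [
            d
            for s, d in zip(times, depths)
            if t - window < s <= t and abs(d) <= surface_threshold
        ]
        if near_surface:
            offsets.append(sum(near_surface) / len(near_surface))
        else:
            offsets.append(0.0)
    return offsets


times = [0, 10, 20, 30]
depths = [1.0, 12.0, 2.0, 15.0]
offsets = window_mean_offset(times, depths, window=25, surface_threshold=4)
```

Filtering by surface_threshold keeps deep samples from a dive out of the average, so the offset reflects only readings taken at or near the surface.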

divebomb.preprocessing.correct_depth_offset(data, window=3600, columns={'depth': 'depth', 'time': 'time'}, aux_file='corrected_depth_auxillary_data.nc', method='max', surface_threshold=4)
Parameters:
  • data – The dataset consisting of a time and a depth column
  • window – time window (in seconds) to use in the calculation
  • aux_file – a netCDF file to which all of the calculated offsets and the window size are written
  • columns – column renaming dictionary if needed
  • method – either ‘max’ or ‘mean’, selecting the calculation method; default is ‘max’
  • surface_threshold – maximum depth (in meters) to consider when using the mean method
Returns:

A DataFrame with a corrected depth

divebomb.preprocessing.zlib_encoding(ds)

A helper function for xarray that compresses all variables written to netCDF

Parameters:
  • ds – an xarray Dataset
Returns:

A dictionary indicating zlib compression for all variables
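
A dictionary of this kind can be sketched without xarray. The function below is a hypothetical stand-in, not divebomb's code: it takes any iterable of variable names and maps each one to a zlib setting, which is the per-variable form that xarray's to_netcdf accepts through its encoding argument.

```python
def zlib_encoding_sketch(variable_names):
    """Build a per-variable encoding dict that turns on zlib compression.

    Hypothetical stand-in for illustration; accepts any iterable of
    variable names instead of an xarray Dataset.
    """
    return {name: {"zlib": True} for name in variable_names}


encoding = zlib_encoding_sketch(["time", "depth"])
```

With an xarray Dataset ds, the names could come from ds.data_vars, and the result could be passed as ds.to_netcdf(path, encoding=encoding). Whether the real helper also covers coordinate variables is not stated here.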