easyclimate.filter.emd

easyclimate.filter.emd#

Empirical Mode Decomposition and Ensemble Empirical Mode Decomposition (EEMD)

Functions#

filter_emd(input_data, time_step[, time_array, ...])

Empirical Mode Decomposition

filter_eemd(input_data, time_step[, time_array, ...])

Ensemble Empirical Mode Decomposition (EEMD)

Module Contents#

easyclimate.filter.emd.filter_emd(input_data: xarray.DataArray, time_step: Literal['ns', 'us', 'ms', 's', 'm', 'h', 'D', 'W', 'M', 'Y'], time_array=None, time_dim: str = 'time', spline_kind: Literal['cubic', 'akima', 'pchip', 'cubic_hermite', 'slinear', 'quadratic', 'linear'] = 'cubic', nbsym: int = 2, max_iteration: int = 1000, energy_ratio_thr: float = 0.2, std_thr: float = 0.2, svar_thr: float = 0.001, total_power_thr: float = 0.005, range_thr: float = 0.001, extrema_detection: Literal['simple', 'parabol'] = 'simple', max_imf: int = -1, dtype=np.float64)#

Empirical Mode Decomposition

Method of decomposing signal into Intrinsic Mode Functions (IMFs) based on algorithm presented in Huang et al (1998).

Algorithm was validated with Rilling et al (2003). Matlab’s version from 3.2007.

Threshold which control the goodness of the decomposition:

  • std_thr: Test for the proto-IMF how variance changes between siftings.

  • svar_thr: Test for the proto-IMF how energy changes between siftings.

  • total_power_thr: Test for the whole decomp how much of energy is solved.

  • range_thr: Test for the whole decomp whether the difference is tiny.

Parameters#

input_data: xarray.DataArray

Input signal data to be decomposed.

time_step: str

Time step unit for datetime conversion (e.g., ‘s’, ‘ms’, ‘us’, ‘ns’).

time_array: array-like, optional

Custom time array for the input signal. If None, uses input_data’s time dimension.

time_dim: str, default: time.

The time coordinate dimension name.

spline_kind: Literal["cubic", "akima", "pchip", "cubic_hermite", "slinear", "quadratic", "linear"], default: “cubic”

Type of spline used for envelope interpolation.

nbsym: int, default: 2

Number of points to add at signal boundaries for mirroring.

max_iteration: int, default: 1000

Maximum number of iterations per single sifting in EMD.

energy_ratio_thr: float, default: 0.2

Threshold value on energy ratio per IMF check.

std_thr: float, default: 0.2

Threshold value on standard deviation per IMF check.

svar_thr: float, default: 0.001

Threshold value on scaled variance per IMF check.

total_power_thr: float, default: 0.005

Threshold value on total power per EMD decomposition.

range_thr: float, default: 0.001

Threshold for amplitude range (after scaling) per EMD decomposition.

extrema_detection: Literal["simple", "parabol"], default: “simple”

Method used to finding extrema.

max_imf: int, default: -1

IMF number to which decomposition should be performed. Negative value means all.

dtype: numpy.dtype, default: np.float64

Data type used for calculations.

Reference#

easyclimate.filter.emd.filter_eemd(input_data: xarray.DataArray, time_step, time_array=None, time_dim: str = 'time', noise_seed: None | int = None, trials: int = 100, noise_width: float = 0.05, parallel: bool = False, processes: None | int = None, separate_trends: bool = False, spline_kind: Literal['cubic', 'akima', 'pchip', 'cubic_hermite', 'slinear', 'quadratic', 'linear'] = 'cubic', nbsym: int = 2, max_iteration: int = 1000, energy_ratio_thr: float = 0.2, std_thr: float = 0.2, svar_thr: float = 0.001, total_power_thr: float = 0.005, range_thr: float = 0.001, extrema_detection: Literal['simple', 'parabol'] = 'parabol', dtype=np.float64)#

Ensemble Empirical Mode Decomposition (EEMD)

Ensemble empirical mode decomposition (EEMD) is noise-assisted technique (Wu & Huang, 2009), which is meant to be more robust than simple Empirical Mode Decomposition (EMD). The robustness is checked by performing many decompositions on signals slightly perturbed from their initial position. In the grand average over all IMF results the noise will cancel each other out and the result is pure decomposition.

Parameters#

input_data: xarray.DataArray

Input signal data to be decomposed.

time_step: str

Time step unit for datetime conversion (e.g., ‘s’, ‘ms’, ‘us’, ‘ns’).

time_array: array-like, optional

Custom time array for the input signal. If None, uses input_data’s time dimension.

time_dim: str, default: “time”

The time coordinate dimension name.

noise_seed: int or None, default: None

Set seed for noise generation.

Warning

Given the nature of EEMD, each time you decompose a signal you will obtain a different set of components. That’s the expected consequence of adding noise which is going to be random. To make the decomposition reproducible, one needs to set a seed for the random number generator used in EEMD.

trials: int, default: 100

Number of trials or EMD performance with added noise.

noise_width: float, default: 0.05

Standard deviation of Gaussian noise (\(\hat\sigma\)). It’s relative to absolute amplitude of the signal, i.e. \(\hat\sigma = \sigma\cdot|\max(S)-\min(S)|\), where \(\sigma\) is noise_width.

parallel: bool, default: False

Flag whether to use multiprocessing in EEMD execution. Since each EMD(s+noise) is independent this should improve execution speed considerably. Note that it’s disabled by default because it’s the most common problem when EEMD takes too long time to finish. If you set the flag to True, make also sure to set processes to some reasonable value.

processes: int or None, default: None

Number of processes harness when executing in parallel mode. The value should be between 1 and max that depends on your hardware. If None, uses all available cores.

separate_trends: bool, default: False

Flag whether to isolate trends from each EMD decomposition into a separate component. If True, the resulting EEMD will contain ensemble only from IMFs and the mean residue will be stacked as the last element.

spline_kind: Literal["cubic", "akima", "pchip", "cubic_hermite", "slinear", "quadratic", "linear"], default: “cubic”

Type of spline used for envelope interpolation.

nbsym: int, default: 2

Number of points to add at signal boundaries for mirroring.

max_iteration: int, default: 1000

Maximum number of iterations per single sifting in EMD.

energy_ratio_thr: float, default: 0.2

Threshold value on energy ratio per IMF check.

std_thr: float, default: 0.2

Threshold value on standard deviation per IMF check.

svar_thr: float, default: 0.001

Threshold value on scaled variance per IMF check.

total_power_thr: float, default: 0.005

Threshold value on total power per EMD decomposition.

range_thr: float, default: 0.001

Threshold for amplitude range (after scaling) per EMD decomposition.

extrema_detection: Literal["simple", "parabol"], default: “parabol”

Method used to finding extrema.

dtype: numpy.dtype, default: np.float64

Data type used for calculations.

Returns#

xarray.Dataset

Dataset containing the input data and the decomposed eIMFs (Ensemble IMFs).

Reference#

Wu, Z., & Huang, N. E. (2009). Ensemble empirical mode decomposition: a noise-assisted data analysis method. Advances in Adaptive Data Analysis, 01(01), 1-41. https://doi.org/10.1142/S1793536909000047