.. DO NOT EDIT. .. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY. .. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE: .. "auto_gallery/plot_basic_statistical_analysis.py" .. LINE NUMBERS ARE GIVEN BELOW. .. only:: html .. note:: :class: sphx-glr-download-link-note :ref:`Go to the end ` to download the full example code. .. rst-class:: sphx-glr-example-title .. _sphx_glr_auto_gallery_plot_basic_statistical_analysis.py: Basic Statistical Analysis =================================== Before proceeding with all the steps, first import some necessary libraries and packages .. GENERATED FROM PYTHON SOURCE LINES 8-12 .. code-block:: Python import easyclimate as ecl import matplotlib.pyplot as plt import cartopy.crs as ccrs .. GENERATED FROM PYTHON SOURCE LINES 13-25 Obtain the sea ice concentration (SIC) data from the Barents-Kara Seas (30°−90°E, 65°−85°N). .. seealso:: Luo, B., Luo, D., Ge, Y. et al. Origins of Barents-Kara sea-ice interannual variability modulated by the Atlantic pathway of El Niño–Southern Oscillation. Nat Commun 14, 585 (2023). https://doi.org/10.1038/s41467-023-36136-5 .. tip:: You can download following datasets here: - :download:`Download mini_HadISST_ice.nc ` - :download:`Download mini_HadISST_sst.nc ` .. GENERATED FROM PYTHON SOURCE LINES 25-28 .. code-block:: Python sic_data_Barents_Sea = ecl.open_tutorial_dataset("mini_HadISST_ice").sic sic_data_Barents_Sea .. raw:: html
<xarray.DataArray 'sic' (time: 508, lat: 20, lon: 60)> Size: 2MB
    [609600 values with dtype=float32]
    Coordinates:
      * time     (time) datetime64[ns] 4kB 1981-01-31 1981-02-28 ... 2023-04-30
      * lat      (lat) float32 80B 65.5 66.5 67.5 68.5 69.5 ... 81.5 82.5 83.5 84.5
      * lon      (lon) float32 240B 30.5 31.5 32.5 33.5 34.5 ... 86.5 87.5 88.5 89.5
    Attributes:
        standard_name:  sea_ice_area_fraction
        long_name:      Monthly 1 degree resolution sea ice concentration
        units:          1
        cell_methods:   time: lat: lon: median


.. GENERATED FROM PYTHON SOURCE LINES 29-33 And tropical SST dataset. .. seealso:: Rayner, N. A.; Parker, D. E.; Horton, E. B.; Folland, C. K.; Alexander, L. V.; Rowell, D. P.; Kent, E. C.; Kaplan, A. (2003) Global analyses of sea surface temperature, sea ice, and night marine air temperature since the late nineteenth century J. Geophys. Res.Vol. 108, No. D14, 4407 https://doi.org/10.1029/2002JD002670 (pdf ~9Mb) .. GENERATED FROM PYTHON SOURCE LINES 33-36 .. code-block:: Python sst_data = ecl.open_tutorial_dataset("mini_HadISST_sst").sst sst_data .. raw:: html
<xarray.DataArray 'sst' (time: 504, lat: 30, lon: 360)> Size: 22MB
    [5443200 values with dtype=float32]
    Coordinates:
      * time     (time) datetime64[ns] 4kB 1981-01-16T12:00:00 ... 2022-12-16T12:...
      * lat      (lat) float32 120B -14.5 -13.5 -12.5 -11.5 ... 11.5 12.5 13.5 14.5
      * lon      (lon) float32 1kB -179.5 -178.5 -177.5 -176.5 ... 177.5 178.5 179.5
    Attributes:
        standard_name:  sea_surface_temperature
        long_name:      sst
        units:          C
        cell_methods:   time: lat: lon: mean


.. GENERATED FROM PYTHON SOURCE LINES 37-40 Mean States for Special Month ------------------------------------ :py:func:`easyclimate.get_specific_months_data ` allows us to easily obtain data on the SIC for December alone. .. GENERATED FROM PYTHON SOURCE LINES 40-43 .. code-block:: Python sic_data_Barents_Sea_12 = ecl.get_specific_months_data(sic_data_Barents_Sea, 12) sic_data_Barents_Sea_12 .. raw:: html
<xarray.DataArray 'sic' (time: 42, lat: 20, lon: 60)> Size: 202kB
    [50400 values with dtype=float32]
    Coordinates:
      * time     (time) datetime64[ns] 336B 1981-12-31 1982-12-31 ... 2022-12-31
      * lat      (lat) float32 80B 65.5 66.5 67.5 68.5 69.5 ... 81.5 82.5 83.5 84.5
      * lon      (lon) float32 240B 30.5 31.5 32.5 33.5 34.5 ... 86.5 87.5 88.5 89.5
    Attributes:
        standard_name:  sea_ice_area_fraction
        long_name:      Monthly 1 degree resolution sea ice concentration
        units:          1
        cell_methods:   time: lat: lon: median


.. GENERATED FROM PYTHON SOURCE LINES 44-45 Now we try to draw the mean states of the SIC in the Barents-Kara for the December. .. GENERATED FROM PYTHON SOURCE LINES 46-67 .. code-block:: Python draw_sic_mean_state = sic_data_Barents_Sea_12.mean(dim="time") fig, ax = plt.subplots( subplot_kw={ "projection": ccrs.Orthographic(central_longitude=70, central_latitude=70) } ) ax.gridlines(draw_labels=["bottom", "left"], color="grey", alpha=0.5, linestyle="--") ax.coastlines(edgecolor="black", linewidths=0.5) draw_sic_mean_state.plot.contourf( ax=ax, # projection on data transform=ccrs.PlateCarree(), # Colorbar is placed at the bottom cbar_kwargs={"location": "right"}, cmap="Blues", levels=21, ) .. image-sg:: /auto_gallery/images/sphx_glr_plot_basic_statistical_analysis_001.png :alt: plot basic statistical analysis :srcset: /auto_gallery/images/sphx_glr_plot_basic_statistical_analysis_001.png :class: sphx-glr-single-img .. rst-class:: sphx-glr-script-out .. code-block:: none .. GENERATED FROM PYTHON SOURCE LINES 68-75 Linear Trend ------------------------------------ As we all know, the area of Arctic sea ice has been decreasing more and more in recent years due to the impact of global warming. We can obtain the change of SIC by solving the linear trend of SIC data from 1981-2022. :py:func:`easyclimate.calc_linregress_spatial ` can provide the calculation results of solving the linear trend for each grid point. .. GENERATED FROM PYTHON SOURCE LINES 75-81 .. code-block:: Python sic_data_Barents_Sea_12_linear_trend = ecl.calc_linregress_spatial( sic_data_Barents_Sea_12, dim="time" ).compute() sic_data_Barents_Sea_12_linear_trend .. raw:: html
<xarray.Dataset> Size: 58kB
    Dimensions:           (lat: 20, lon: 60)
    Coordinates:
      * lat               (lat) float32 80B 65.5 66.5 67.5 68.5 ... 82.5 83.5 84.5
      * lon               (lon) float32 240B 30.5 31.5 32.5 33.5 ... 87.5 88.5 89.5
    Data variables:
        slope             (lat, lon) float64 10kB nan nan ... 4.943e-05 0.0002091
        intercept         (lat, lon) float64 10kB nan nan nan ... 0.9792 0.9733
        rvalue            (lat, lon) float64 10kB nan nan nan ... 0.04545 0.1818
        pvalue            (lat, lon) float64 10kB nan nan nan ... 0.775 0.2493
        stderr            (lat, lon) float64 10kB nan nan ... 0.0001718 0.0001788
        intercept_stderr  (lat, lon) float64 10kB nan nan nan ... 0.004091 0.004259


.. GENERATED FROM PYTHON SOURCE LINES 82-83 The `slope` is our desired linear trend, let's try to plot the linear trend of each grid point. .. GENERATED FROM PYTHON SOURCE LINES 83-102 .. code-block:: Python draw_sic_slope = sic_data_Barents_Sea_12_linear_trend.slope fig, ax = plt.subplots( subplot_kw={ "projection": ccrs.Orthographic(central_longitude=70, central_latitude=70) } ) ax.gridlines(draw_labels=["bottom", "left"], color="grey", alpha=0.5, linestyle="--") ax.coastlines(edgecolor="black", linewidths=0.5) draw_sic_slope.plot.contourf( ax=ax, transform=ccrs.PlateCarree(), cbar_kwargs={"location": "right"}, cmap="RdBu_r", levels=21, ) .. image-sg:: /auto_gallery/images/sphx_glr_plot_basic_statistical_analysis_002.png :alt: plot basic statistical analysis :srcset: /auto_gallery/images/sphx_glr_plot_basic_statistical_analysis_002.png :class: sphx-glr-single-img .. rst-class:: sphx-glr-script-out .. code-block:: none .. GENERATED FROM PYTHON SOURCE LINES 103-105 The `pvalue` is the corresponding p-value, and we can determine a significance level (e.g., significance level is set to 0.05) in order to plot the region of significance. .. GENERATED FROM PYTHON SOURCE LINES 105-120 .. code-block:: Python draw_sic_pvalue = sic_data_Barents_Sea_12_linear_trend.pvalue fig, ax = plt.subplots( subplot_kw={ "projection": ccrs.Orthographic(central_longitude=70, central_latitude=70) } ) ax.gridlines(draw_labels=["bottom", "left"], color="grey", alpha=0.5, linestyle="--") ax.coastlines(edgecolor="black", linewidths=0.5) ecl.plot.draw_significant_area_contourf( draw_sic_pvalue, ax=ax, thresh=0.05, transform=ccrs.PlateCarree() ) .. image-sg:: /auto_gallery/images/sphx_glr_plot_basic_statistical_analysis_003.png :alt: plot basic statistical analysis :srcset: /auto_gallery/images/sphx_glr_plot_basic_statistical_analysis_003.png :class: sphx-glr-single-img .. GENERATED FROM PYTHON SOURCE LINES 121-123 Further, we can superimpose the linear trend and the region of significance to study the linear trend of the region of significance (since the linear trend of the region of non-significance is often spurious). .. GENERATED FROM PYTHON SOURCE LINES 123-146 .. code-block:: Python fig, ax = plt.subplots( subplot_kw={ "projection": ccrs.Orthographic(central_longitude=70, central_latitude=70) } ) ax.gridlines(draw_labels=["bottom", "left"], color="grey", alpha=0.5, linestyle="--") ax.coastlines(edgecolor="black", linewidths=0.5) # SIC slope draw_sic_slope.plot.contourf( ax=ax, transform=ccrs.PlateCarree(), cbar_kwargs={"location": "right"}, cmap="RdBu_r", levels=21, ) # SIC 95% significant level ecl.plot.draw_significant_area_contourf( draw_sic_pvalue, ax=ax, thresh=0.05, transform=ccrs.PlateCarree() ) .. image-sg:: /auto_gallery/images/sphx_glr_plot_basic_statistical_analysis_004.png :alt: plot basic statistical analysis :srcset: /auto_gallery/images/sphx_glr_plot_basic_statistical_analysis_004.png :class: sphx-glr-single-img .. GENERATED FROM PYTHON SOURCE LINES 147-164 Regression ------------------------------------ Regression analysis is a statistical technique used to investigate the connection between a dependent variable and one or more independent variables. It is frequently employed in climatology to analyze trends and patterns in climatic data, identify correlations between different climatic parameters, and create models that can predict future changes. By identifying patterns and connections in massive datasets, regression analysis offers several benefits for weather research. For instance, regression analysis can be used to pinpoint the elements that affect global temperatures, such as solar radiation, atmospheric greenhouse gases, and volcanic eruptions. Climate scientists can create models that can accurately predict future changes by including these variables in a regression model. Moreover, regression analysis can assist climate experts in spotting natural fluctuations in climate data, like El Niño events, and in determining how human activities like deforestation and fossil fuel combustion affect the environment. Regression analysis can also evaluate the effectiveness of various mitigation tactics, such as carbon pricing policies or renewable energy initiatives. Overall, regression analysis is a potent tool for analyzing complex climate data and producing reliable projections of upcoming alterations. .. seealso:: - Regression Analysis: Definition, Types, Usage & Advantages. Website: https://www.questionpro.com/blog/regression-analysis/ - The Advantages of Regression Analysis & Forecasting. Website: https://smallbusiness.chron.com/advantages-regression-analysis-forecasting-61800.html In this subsection we try to regress the Niño 3.4 index on the Barents-Kara December SIC data. Before performing the regression analysis, we can see that the longitude range of the SST data is **-180°~180°**, try to convert the longitude range to **0°~360°** using :py:func:`easyclimate.utility.transfer_xarray_lon_from180TO360 `. .. GENERATED FROM PYTHON SOURCE LINES 164-167 .. code-block:: Python sst_data_0_360 = ecl.utility.transfer_xarray_lon_from180TO360(sst_data) sst_data_0_360 .. raw:: html
<xarray.DataArray 'sst' (time: 504, lat: 30, lon: 360)> Size: 22MB
    [5443200 values with dtype=float32]
    Coordinates:
      * time     (time) datetime64[ns] 4kB 1981-01-16T12:00:00 ... 2022-12-16T12:...
      * lat      (lat) float32 120B -14.5 -13.5 -12.5 -11.5 ... 11.5 12.5 13.5 14.5
      * lon      (lon) float32 1kB 0.5 1.5 2.5 3.5 4.5 ... 356.5 357.5 358.5 359.5
    Attributes:
        standard_name:  sea_surface_temperature
        long_name:      sst
        units:          C
        cell_methods:   time: lat: lon: mean


.. GENERATED FROM PYTHON SOURCE LINES 168-174 Further, :py:func:`easyclimate.remove_seasonal_cycle_mean ` is used to remove the climate state of each month in order to obtain the individual month anomalies. The figure below illustrates the November SST anomaly in the tropical equatorial Pacific during the 1982-83 super El Niño. .. seealso:: Philander, S. Meteorology: Anomalous El Niño of 1982–83. Nature 305, 16 (1983). https://doi.org/10.1038/305016a0 .. GENERATED FROM PYTHON SOURCE LINES 174-191 .. code-block:: Python sst_data_anormaly = ecl.remove_seasonal_cycle_mean(sst_data_0_360) fig, ax = plt.subplots( figsize=(10, 4), subplot_kw={"projection": ccrs.PlateCarree(central_longitude=180)} ) sst_data_anormaly.sel(lon=slice(120, 290)).isel(time=22).plot.contourf( ax=ax, transform=ccrs.PlateCarree(), cbar_kwargs={"location": "bottom", "pad": 0.1}, cmap="RdBu_r", levels=21, ) ax.gridlines(draw_labels=["bottom", "left"], color="grey", alpha=0.5, linestyle="--") ax.coastlines(edgecolor="black", linewidths=0.5) .. image-sg:: /auto_gallery/images/sphx_glr_plot_basic_statistical_analysis_005.png :alt: time = 1982-11-16, month = 11 :srcset: /auto_gallery/images/sphx_glr_plot_basic_statistical_analysis_005.png :class: sphx-glr-single-img .. rst-class:: sphx-glr-script-out .. code-block:: none .. GENERATED FROM PYTHON SOURCE LINES 192-197 The Niño3.4 index is commonly used as an indicator for detecting ENSO, and `easyclimate` provides :py:func:`easyclimate.field.air_sea_interaction.calc_index_nino34 ` to calculate the index using SST original dataset. .. seealso:: Anthony G. Bamston, Muthuvel Chelliah & Stanley B. Goldenberg (1997) Documentation of a highly ENSO‐related sst region in the equatorial pacific: Research note, Atmosphere-Ocean, 35:3, 367-383, DOI: https://doi.org/10.1080/07055900.1997.9649597 .. GENERATED FROM PYTHON SOURCE LINES 197-203 .. code-block:: Python nino34_monthly_index = ecl.field.air_sea_interaction.calc_index_nino34(sst_data_0_360) nino34_monthly_index.plot( figsize=(8, 3), ) .. image-sg:: /auto_gallery/images/sphx_glr_plot_basic_statistical_analysis_006.png :alt: plot basic statistical analysis :srcset: /auto_gallery/images/sphx_glr_plot_basic_statistical_analysis_006.png :class: sphx-glr-single-img .. rst-class:: sphx-glr-script-out .. code-block:: none /home/runner/work/easyclimate/easyclimate/src/easyclimate/core/utility.py:836: UserWarning: It seems that the input data longitude range is not from -180° to 180°. Please carefully check your data. warnings.warn( [] .. GENERATED FROM PYTHON SOURCE LINES 204-205 :py:func:`easyclimate.calc_yearly_climatological_mean ` is then used to solve for the annual average of the monthly index data .. GENERATED FROM PYTHON SOURCE LINES 205-209 .. code-block:: Python nino34_12_index = ecl.get_specific_months_data(nino34_monthly_index, 12) nino34_dec_yearly_index = ecl.calc_yearly_climatological_mean(nino34_12_index) nino34_dec_yearly_index .. raw:: html
<xarray.DataArray 'Nino34_index' (time: 42)> Size: 168B
    array([-0.0535326 ,  2.2554278 , -0.86200035, -1.0462521 , -0.55916315,
            1.08696   ,  0.9741691 , -1.8894404 , -0.06711823,  0.2840484 ,
            1.4100466 ,  0.06721933,  0.16351025,  1.0622212 , -0.6885618 ,
           -0.31378114,  2.3521311 , -1.3440579 , -1.4383495 , -0.71757525,
           -0.21213251,  1.1417981 ,  0.34162298,  0.59405655, -0.5856687 ,
            0.7327775 , -1.5969441 , -0.6306157 ,  1.3947657 , -1.5393288 ,
           -0.895552  , -0.05969566, -0.22111896,  0.6862107 ,  2.4407508 ,
           -0.424509  , -0.74715966,  0.8113012 ,  0.60371643, -1.0101154 ,
           -0.8180708 ,         nan], dtype=float32)
    Coordinates:
      * time     (time) datetime64[ns] 336B 1981-01-01 1982-01-01 ... 2022-01-01
    Attributes:
        standard_name:  sea_surface_temperature
        long_name:      sst
        units:          C
        cell_methods:   time: lat: lon: mean


.. GENERATED FROM PYTHON SOURCE LINES 210-212 Unlike solving for linear trend without passing in `x`, regression analysis must use the parameter `x` to pass in the object to be regressed. Care must be taken to ensure that the `time` dimensions are identical. .. GENERATED FROM PYTHON SOURCE LINES 212-218 .. code-block:: Python sic_reg_nino34 = ecl.calc_linregress_spatial( sic_data_Barents_Sea_12, x=nino34_dec_yearly_index.data ) sic_reg_nino34 = sic_reg_nino34.compute() sic_reg_nino34 .. rst-class:: sphx-glr-script-out .. code-block:: none /home/runner/work/easyclimate/easyclimate/src/easyclimate/core/stat.py:105: UserWarning: Assuming that the coordinate value of 'time' in `data_input` and `x` is same. Ignoring. warnings.warn( .. raw:: html
<xarray.Dataset> Size: 58kB
    Dimensions:           (lat: 20, lon: 60)
    Coordinates:
      * lat               (lat) float32 80B 65.5 66.5 67.5 68.5 ... 82.5 83.5 84.5
      * lon               (lon) float32 240B 30.5 31.5 32.5 33.5 ... 87.5 88.5 89.5
    Data variables:
        slope             (lat, lon) float64 10kB nan nan nan nan ... nan nan nan
        intercept         (lat, lon) float64 10kB nan nan nan nan ... nan nan nan
        rvalue            (lat, lon) float64 10kB nan nan nan nan ... nan nan nan
        pvalue            (lat, lon) float64 10kB nan nan nan nan ... nan nan nan
        stderr            (lat, lon) float64 10kB nan nan nan nan ... nan nan nan
        intercept_stderr  (lat, lon) float64 10kB nan nan nan nan ... nan nan nan


.. GENERATED FROM PYTHON SOURCE LINES 219-220 Here is an attempt to plot the results of the regression analysis. .. GENERATED FROM PYTHON SOURCE LINES 220-244 .. code-block:: Python draw_sic_slope = sic_reg_nino34.slope draw_sic_pvalue = sic_reg_nino34.pvalue fig, ax = plt.subplots( subplot_kw={ "projection": ccrs.Orthographic(central_longitude=70, central_latitude=70) } ) ax.gridlines(draw_labels=["bottom", "left"], color="grey", alpha=0.5, linestyle="--") ax.coastlines(edgecolor="black", linewidths=0.5) draw_sic_slope.plot.contourf( ax=ax, transform=ccrs.PlateCarree(), cbar_kwargs={"location": "right"}, cmap="RdBu_r", levels=21, ) ecl.plot.draw_significant_area_contourf( draw_sic_pvalue, ax=ax, thresh=0.05, transform=ccrs.PlateCarree() ) .. image-sg:: /auto_gallery/images/sphx_glr_plot_basic_statistical_analysis_007.png :alt: plot basic statistical analysis :srcset: /auto_gallery/images/sphx_glr_plot_basic_statistical_analysis_007.png :class: sphx-glr-single-img .. GENERATED FROM PYTHON SOURCE LINES 245-250 Detrend ------------------------------------ Sea ice area shows an approximately linear trend of decreasing due to global warming. We remove the linear trend from SIC in order to study the variability of SIC itself. In addition, here we explore the differences between the trend followed by regional averaging and regional averaging followed by detrending approaches. .. GENERATED FROM PYTHON SOURCE LINES 250-258 .. code-block:: Python sic_data_Barents_Sea_12_spatial_mean = sic_data_Barents_Sea_12.mean(dim=("lat", "lon")) sic_data_Barents_Sea_12_spatial_detrendmean = ecl.calc_detrend_spatial( sic_data_Barents_Sea_12, time_dim="time" ).mean(dim=("lat", "lon")) sic_data_Barents_Sea_12_time_detrendmean = ecl.calc_detrend_spatial( sic_data_Barents_Sea_12_spatial_mean, time_dim="time" ) .. GENERATED FROM PYTHON SOURCE LINES 259-260 The results show that there is no significant difference between these two detrending methods to study the variability of SIC in the Barents-Kara Seas. .. GENERATED FROM PYTHON SOURCE LINES 260-277 .. code-block:: Python fig, ax = plt.subplots(2, 1, sharex=True) sic_data_Barents_Sea_12_spatial_mean.plot(ax=ax[0]) ax[0].set_xlabel("") ax[0].set_title("Original") sic_data_Barents_Sea_12_spatial_detrendmean.plot( ax=ax[1], label="detrend -> spatial mean" ) sic_data_Barents_Sea_12_time_detrendmean.plot( ax=ax[1], ls="--", label="spatial mean -> detrend" ) ax[1].set_xlabel("") ax[1].set_title("Detrend") ax[1].legend() .. image-sg:: /auto_gallery/images/sphx_glr_plot_basic_statistical_analysis_008.png :alt: Original, Detrend :srcset: /auto_gallery/images/sphx_glr_plot_basic_statistical_analysis_008.png :class: sphx-glr-single-img .. rst-class:: sphx-glr-script-out .. code-block:: none .. GENERATED FROM PYTHON SOURCE LINES 278-298 Weighted Spatial Data ------------------------------------ When calculating regional averages in high-latitude areas, considering weights is necessary because regions at different latitudes cover unequal surface areas on the Earth. Since the Earth is approximately a spheroid, areas closer to the poles have a different distribution of surface area on the spherical surface. One common way to incorporate weights is by using the cosine of latitude, i.e., multiplying by :math:`\cos (\varphi)`, where :math:`\varphi` represents the latitude of a location. This is because areas at higher latitudes, close to the poles, have higher latitudes and smaller cosine values, allowing for a smaller weight to be applied to these regions when calculating averages. In summary, considering weights is done to more accurately account for the distribution of surface area on the Earth, ensuring that contributions from different regions are weighted according to their actual surface area when calculating averages or other regional statistical measures. :py:func:`easyclimate.utility.get_weighted_spatial_data ` can help us create an :py:class:`xarray.core.weighted.DataArrayWeighted ` object. This object will automatically consider and calculate weights in subsequent area operations, thereby achieving the operation of the weighted spatial average. .. GENERATED FROM PYTHON SOURCE LINES 298-306 .. code-block:: Python sic_data_Barents_Sea_12_detrend = ecl.calc_detrend_spatial( sic_data_Barents_Sea_12, time_dim="time" ) grid_detrend_data_weighted_obj = ecl.utility.get_weighted_spatial_data( sic_data_Barents_Sea_12_detrend, lat_dim="lat", lon_dim="lon" ) print(type(grid_detrend_data_weighted_obj)) .. rst-class:: sphx-glr-script-out .. code-block:: none .. GENERATED FROM PYTHON SOURCE LINES 307-308 Solve for regional averaging for `grid_detrend_data_weighted_obj` objects (the role of weights is considered at this point) .. GENERATED FROM PYTHON SOURCE LINES 308-313 .. code-block:: Python sic_data_Barents_Sea_12_spatial_detrend_weightedmean = ( grid_detrend_data_weighted_obj.mean(dim=("lat", "lon")) ) sic_data_Barents_Sea_12_spatial_detrend_weightedmean .. raw:: html
<xarray.DataArray 'sic' (time: 42)> Size: 168B
    array([ 0.00733505, -0.03790337,  0.00323731, -0.07254851, -0.01491249,
            0.00765501, -0.00645751,  0.04360954, -0.0093088 , -0.00049735,
           -0.02658244, -0.00516641, -0.01209194, -0.0102047 ,  0.02112157,
           -0.01286527,  0.07362372,  0.10449733,  0.00810461,  0.00628584,
           -0.0582574 ,  0.08276358,  0.08993486,  0.02257021, -0.04379203,
           -0.02342463, -0.04546626, -0.01492424,  0.03098435,  0.08953879,
           -0.04640904, -0.12334438,  0.03096083,  0.08812726, -0.04354496,
           -0.1620483 , -0.02880639, -0.05406104,  0.10325759, -0.0727714 ,
            0.11438236, -0.00260423], dtype=float32)
    Coordinates:
      * time     (time) datetime64[ns] 336B 1981-12-31 1982-12-31 ... 2022-12-31
    Attributes:
        standard_name:  sea_ice_area_fraction
        long_name:      Monthly 1 degree resolution sea ice concentration
        units:          1
        cell_methods:   time: lat: lon: median


.. GENERATED FROM PYTHON SOURCE LINES 314-315 We can find some differences between the data considering latitude weights and those not considering latitude weights. .. GENERATED FROM PYTHON SOURCE LINES 315-331 .. code-block:: Python fig, ax = plt.subplots(2, 1, sharex=True) sic_data_Barents_Sea_12_spatial_mean.plot(ax=ax[0]) ax[0].set_xlabel("") ax[0].set_title("Original") sic_data_Barents_Sea_12_spatial_detrendmean.plot(ax=ax[1], label="Regular mean") sic_data_Barents_Sea_12_spatial_detrend_weightedmean.plot( ax=ax[1], ls="--", label="Weighted mean" ) ax[1].set_xlabel("") ax[1].set_title("Detrend") ax[1].legend() fig.show() .. image-sg:: /auto_gallery/images/sphx_glr_plot_basic_statistical_analysis_009.png :alt: Original, Detrend :srcset: /auto_gallery/images/sphx_glr_plot_basic_statistical_analysis_009.png :class: sphx-glr-single-img .. GENERATED FROM PYTHON SOURCE LINES 332-355 Skewness ------------------------------------ Skewness is a measure of the asymmetry of a probability distribution. It quantifies the extent to which a distribution deviates from being symmetric around its mean. A distribution with a skewness value of zero is considered symmetric, meaning that it has equal probabilities of occurring above and below its mean. However, when the skewness value is non-zero, the distribution becomes asymmetric, indicating that there is a higher likelihood of occurrence on one side of the mean than the other. Skewness can be positive or negative, depending on whether the distribution is skewed towards larger or smaller values. In climate analysis, skewness can arise due to various factors such as changes in atmospheric circulation patterns, uneven temperature or precipitation distributions, or differences in measurement instruments. .. seealso:: - Distributions of Daily Meteorological Variables: Background. Website: https://psl.noaa.gov/data/atmoswrit/distributions/background/index.html - Bakouch HS, Cadena M, Chesneau C. A new class of skew distributions with climate data analysis. J Appl Stat. 2020 Jul 13;48(16):3002-3024. doi: https://doi.org/10.1080/02664763.2020.1791804. PMID: 35707257; PMCID: PMC9042114. The skewness is calculated using :py:func:`easyclimate.calc_skewness_spatial `. The result of the calculation contains the skewness and p-value. .. GENERATED FROM PYTHON SOURCE LINES 355-364 .. code-block:: Python sic_data_Barents_Sea_12_detrend = ecl.calc_detrend_spatial( sic_data_Barents_Sea_12, time_dim="time" ) sic_data_Barents_Sea_12_skew = ecl.calc_skewness_spatial( sic_data_Barents_Sea_12_detrend, dim="time" ) sic_data_Barents_Sea_12_skew .. raw:: html
<xarray.Dataset> Size: 10kB
    Dimensions:   (lat: 20, lon: 60)
    Coordinates:
      * lat       (lat) float32 80B 65.5 66.5 67.5 68.5 69.5 ... 81.5 82.5 83.5 84.5
      * lon       (lon) float32 240B 30.5 31.5 32.5 33.5 ... 86.5 87.5 88.5 89.5
    Data variables:
        skewness  (lat, lon) float32 5kB nan nan nan nan ... 0.06856 0.1416 0.07797
        pvalue    (lat, lon) float32 5kB nan nan nan nan ... 0.1773 0.01788 0.4496


.. GENERATED FROM PYTHON SOURCE LINES 366-392 .. code-block:: Python fig, ax = plt.subplots( subplot_kw={ "projection": ccrs.Orthographic(central_longitude=70, central_latitude=70) } ) ax.gridlines(draw_labels=["bottom", "left"], color="grey", alpha=0.5, linestyle="--") ax.coastlines(edgecolor="black", linewidths=0.5) # SIC slope sic_data_Barents_Sea_12_skew.skewness.plot.contourf( ax=ax, transform=ccrs.PlateCarree(), cbar_kwargs={"location": "right"}, cmap="RdBu_r", levels=21, ) # SIC 95% significant level ecl.plot.draw_significant_area_contourf( sic_data_Barents_Sea_12_skew.pvalue, ax=ax, thresh=0.05, transform=ccrs.PlateCarree(), ) .. image-sg:: /auto_gallery/images/sphx_glr_plot_basic_statistical_analysis_010.png :alt: plot basic statistical analysis :srcset: /auto_gallery/images/sphx_glr_plot_basic_statistical_analysis_010.png :class: sphx-glr-single-img .. GENERATED FROM PYTHON SOURCE LINES 393-408 Kurtosis ------------------------------------ Kurtosis is a measure of the "tailedness" of a probability distribution. It describes how heavy or light the tails of a distribution are relative to a standard normal distribution. A distribution with high kurtosis has heavier tails and a greater propensity for extreme events, whereas a distribution with low kurtosis has lighter tails and fewer extreme events. Kurtosis is particularly useful in climate analysis because it can reveal information about the frequency and intensity of extreme weather events such as hurricanes, droughts, or heatwaves. .. seealso:: - Distributions of Daily Meteorological Variables: Background. Website: https://psl.noaa.gov/data/atmoswrit/distributions/background/index.html - Bakouch HS, Cadena M, Chesneau C. A new class of skew distributions with climate data analysis. J Appl Stat. 2020 Jul 13;48(16):3002-3024. doi: https://doi.org/10.1080/02664763.2020.1791804. PMID: 35707257; PMCID: PMC9042114. The skewness is calculated using :py:func:`easyclimate.calc_kurtosis_spatial `. .. GENERATED FROM PYTHON SOURCE LINES 408-414 .. code-block:: Python sic_data_Barents_Sea_12_kurt = ecl.calc_kurtosis_spatial( sic_data_Barents_Sea_12_detrend, dim="time" ) sic_data_Barents_Sea_12_kurt .. raw:: html
<xarray.DataArray 'sic' (lat: 20, lon: 60)> Size: 5kB
    array([[       nan,        nan,        nan, ...,        nan,        nan,
                   nan],
           [       nan,        nan,        nan, ...,        nan,        nan,
                   nan],
           [       nan,        nan,        nan, ...,        nan,        nan,
                   nan],
           ...,
           [16.684706 , 18.235487 , 15.48351  , ...,  2.6224325,  2.4516895,
             3.073731 ],
           [11.324911 , 11.565623 , 10.90156  , ...,  2.3805072,  2.4513185,
             2.1890404],
           [ 2.8223338,  3.5768723,  3.2213914, ...,  2.0441246,  1.835163 ,
             2.2185273]], shape=(20, 60), dtype=float32)
    Coordinates:
      * lat      (lat) float32 80B 65.5 66.5 67.5 68.5 69.5 ... 81.5 82.5 83.5 84.5
      * lon      (lon) float32 240B 30.5 31.5 32.5 33.5 34.5 ... 86.5 87.5 88.5 89.5
    Attributes:
        standard_name:  sea_ice_area_fraction
        long_name:      Monthly 1 degree resolution sea ice concentration
        units:          1
        cell_methods:   time: lat: lon: median


.. GENERATED FROM PYTHON SOURCE LINES 415-416 Consider plotting kurtosis below .. GENERATED FROM PYTHON SOURCE LINES 416-434 .. code-block:: Python fig, ax = plt.subplots( subplot_kw={ "projection": ccrs.Orthographic(central_longitude=70, central_latitude=70) } ) ax.gridlines(draw_labels=["bottom", "left"], color="grey", alpha=0.5, linestyle="--") ax.coastlines(edgecolor="black", linewidths=0.5) # SIC slope sic_data_Barents_Sea_12_kurt.plot.contourf( ax=ax, transform=ccrs.PlateCarree(), cbar_kwargs={"location": "right"}, cmap="RdBu_r", levels=21, ) .. image-sg:: /auto_gallery/images/sphx_glr_plot_basic_statistical_analysis_011.png :alt: plot basic statistical analysis :srcset: /auto_gallery/images/sphx_glr_plot_basic_statistical_analysis_011.png :class: sphx-glr-single-img .. rst-class:: sphx-glr-script-out .. code-block:: none .. GENERATED FROM PYTHON SOURCE LINES 435-457 Composite Analysis ------------------------------------ In the process of climate analysis, **composite analysis** is a statistical and integrative method used to study the spatial and temporal distribution of specific climate events or phenomena. The primary purpose of this analysis method is to identify common features and patterns among multiple events or time periods by combining their information. Specifically, the steps of composite analysis typically include the following aspects: 1. **Event Selection**: Firstly, a set of events related to the research objective is chosen. These events could be specific climate phenomena such as heavy rainfall, drought, or temperature anomalies. 2. **Data Collection**: Collect meteorological data related to the selected events, including observational data, model outputs, or remote sensing data. 3. **Event Alignment**: Time-align the chosen events to ensure that they are within the same temporal framework for analysis. 4. **Data Combination**: Combine data values at corresponding time points into a composite dataset. Averages or weighted averages are often used to reduce the impact of random noise. The advantages of this method include: - **Highlighting Common Features**: By combining data from multiple events or time periods, composite analysis can highlight common features, aiding in the identification of general patterns in climate events. - **Noise Reduction**: By averaging data, composite analysis helps to reduce the impact of random noise, resulting in more stable and reliable analysis outcomes. - **Spatial Consistency**: Through spatial averaging, this method helps reveal the consistent spatial distribution of climate events, providing a more comprehensive understanding. - **Facilitating Comparisons**: Composite analysis makes it convenient to compare different events or time periods as it integrates them into a unified framework. Here we try to extract the El Niño and La Niña events using the standard deviation of the Niño 3.4 index as a threshold. .. GENERATED FROM PYTHON SOURCE LINES 457-460 .. code-block:: Python nino34_dec_yearly_index_std = nino34_dec_yearly_index.std(dim="time").data nino34_dec_yearly_index_std .. rst-class:: sphx-glr-script-out .. code-block:: none array(1.0746759, dtype=float32) .. GENERATED FROM PYTHON SOURCE LINES 461-463 :py:func:`easyclimate.get_year_exceed_index_upper_bound ` is able to obtain the years that exceed the upper bound of the Niño 3.4 exponential threshold, and similarly :py:func:`easyclimate.get_year_exceed_index_lower_bound ` can obtain the years that exceed the lower bound of the exponential threshold. .. GENERATED FROM PYTHON SOURCE LINES 463-472 .. code-block:: Python elnino_year = ecl.get_year_exceed_index_upper_bound( nino34_dec_yearly_index, thresh=nino34_dec_yearly_index_std ) lanina_year = ecl.get_year_exceed_index_lower_bound( nino34_dec_yearly_index, thresh=-nino34_dec_yearly_index_std ) print("El niño years: ", elnino_year) print("La niña years: ", lanina_year) .. rst-class:: sphx-glr-script-out .. code-block:: none El niño years: [1982 1986 1991 1997 2002 2009 2015] La niña years: [1988 1998 1999 2007 2010] .. GENERATED FROM PYTHON SOURCE LINES 473-475 Further we use :py:func:`easyclimate.get_specific_years_data ` to extract data for the El Niño years within `sic_data_Barents_Sea_12_detrend`. The results show that six temporal levels were extracted. .. GENERATED FROM PYTHON SOURCE LINES 475-486 .. code-block:: Python sic_data_Barents_Sea_12_detrend_elnino = ecl.get_specific_years_data( sic_data_Barents_Sea_12_detrend, elnino_year ) sic_data_Barents_Sea_12_detrend_elnino.name = "El niño years" sic_data_Barents_Sea_12_detrend_lanina = ecl.get_specific_years_data( sic_data_Barents_Sea_12_detrend, lanina_year ) sic_data_Barents_Sea_12_detrend_lanina.name = "La niña years" sic_data_Barents_Sea_12_detrend_elnino .. raw:: html
<xarray.DataArray 'El niño years' (time: 7, lat: 20, lon: 60)> Size: 34kB
    array([[[        nan,         nan,         nan, ...,         nan,
                     nan,         nan],
            [        nan,         nan,         nan, ...,         nan,
                     nan,         nan],
            [        nan,         nan,         nan, ...,         nan,
                     nan,         nan],
            ...,
            [ 0.04322249,  0.01938468,  0.04936236, ..., -0.00439435,
             -0.00613469, -0.00160259],
            [-0.01840574, -0.01017874,  0.00120759, ..., -0.00662297,
             -0.00390106, -0.011338  ],
            [-0.00223053, -0.01166272, -0.0034039 , ..., -0.01167816,
              0.00072545, -0.0035426 ]],

           [[        nan,         nan,         nan, ...,         nan,
                     nan,         nan],
            [        nan,         nan,         nan, ...,         nan,
                     nan,         nan],
            [        nan,         nan,         nan, ...,         nan,
                     nan,         nan],
    ...
            [ 0.04887038,  0.04507154,  0.04859674, ...,  0.00190991,
              0.00191987,  0.00413281],
            [ 0.03411204,  0.03380483,  0.03195328, ...,  0.00738251,
              0.00765401,  0.00765753],
            [-0.00430715, -0.00485522,  0.00482559, ...,  0.01009589,
              0.00939119,  0.01081306]],

           [[        nan,         nan,         nan, ...,         nan,
                     nan,         nan],
            [        nan,         nan,         nan, ...,         nan,
                     nan,         nan],
            [        nan,         nan,         nan, ...,         nan,
                     nan,         nan],
            ...,
            [-0.04431903, -0.0514425 , -0.04157346, ...,  0.00886643,
              0.0092653 ,  0.0109629 ],
            [-0.01755071, -0.00753218, -0.01010329, ..., -0.0039497 ,
             -0.00422269, -0.00478798],
            [-0.01699084, -0.01778686, -0.02779007, ..., -0.01173216,
             -0.01090533, -0.0104413 ]]], shape=(7, 20, 60), dtype=float32)
    Coordinates:
      * time     (time) datetime64[ns] 56B 1982-12-31 1986-12-31 ... 2015-12-31
      * lat      (lat) float32 80B 65.5 66.5 67.5 68.5 69.5 ... 81.5 82.5 83.5 84.5
      * lon      (lon) float32 240B 30.5 31.5 32.5 33.5 34.5 ... 86.5 87.5 88.5 89.5
    Attributes:
        standard_name:  sea_ice_area_fraction
        long_name:      Monthly 1 degree resolution sea ice concentration
        units:          1
        cell_methods:   time: lat: lon: median


.. GENERATED FROM PYTHON SOURCE LINES 487-488 Similarly, 5 temporal levels were extracted for the La Niña years. .. GENERATED FROM PYTHON SOURCE LINES 488-490 .. code-block:: Python sic_data_Barents_Sea_12_detrend_lanina .. raw:: html
<xarray.DataArray 'La niña years' (time: 5, lat: 20, lon: 60)> Size: 24kB
    array([[[           nan,            nan,            nan, ...,
                        nan,            nan,            nan],
            [           nan,            nan,            nan, ...,
                        nan,            nan,            nan],
            [           nan,            nan,            nan, ...,
                        nan,            nan,            nan],
            ...,
            [ 7.0033133e-02,  5.2870691e-02,  5.9192181e-02, ...,
              2.2562087e-02,  2.1210730e-02,  1.5227437e-02],
            [ 5.9931576e-02,  5.8484256e-02,  4.9151063e-02, ...,
              2.2044897e-02,  2.4222255e-02,  2.6216567e-02],
            [ 5.5085778e-02,  5.5405617e-02,  5.3980470e-02, ...,
              2.6493847e-02,  2.0428896e-02,  2.5203109e-02]],

           [[           nan,            nan,            nan, ...,
                        nan,            nan,            nan],
            [           nan,            nan,            nan, ...,
                        nan,            nan,            nan],
            [           nan,            nan,            nan, ...,
                        nan,            nan,            nan],
    ...
              2.9244423e-03,  2.8046966e-03,  5.1894188e-03],
            [ 2.4666309e-02,  2.4250507e-02,  2.2638798e-02, ...,
              1.7826498e-02,  1.8279552e-02,  1.8472672e-02],
            [-1.3412595e-02, -3.8779974e-03, -4.3025017e-03, ...,
              2.0705163e-02,  1.9490004e-02,  2.1231115e-02]],

           [[           nan,            nan,            nan, ...,
                        nan,            nan,            nan],
            [           nan,            nan,            nan, ...,
                        nan,            nan,            nan],
            [           nan,            nan,            nan, ...,
                        nan,            nan,            nan],
            ...,
            [ 6.8338752e-02,  6.5652549e-02,  6.8568349e-02, ...,
              1.4026761e-03,  1.4774799e-03,  3.6044717e-03],
            [ 4.3834925e-02,  4.3582022e-02,  4.1610479e-02, ...,
              7.1604848e-03,  7.3412657e-03,  7.2499514e-03],
            [ 3.5245597e-02,  3.4656227e-02,  3.4389675e-02, ...,
              9.7911954e-03,  9.3417764e-03,  1.0603964e-02]]],
          shape=(5, 20, 60), dtype=float32)
    Coordinates:
      * time     (time) datetime64[ns] 40B 1988-12-31 1998-12-31 ... 2010-12-31
      * lat      (lat) float32 80B 65.5 66.5 67.5 68.5 69.5 ... 81.5 82.5 83.5 84.5
      * lon      (lon) float32 240B 30.5 31.5 32.5 33.5 34.5 ... 86.5 87.5 88.5 89.5
    Attributes:
        standard_name:  sea_ice_area_fraction
        long_name:      Monthly 1 degree resolution sea ice concentration
        units:          1
        cell_methods:   time: lat: lon: median


.. GENERATED FROM PYTHON SOURCE LINES 491-492 We now plot the distribution of SIC during El Niño and La Niña years. .. GENERATED FROM PYTHON SOURCE LINES 492-525 .. code-block:: Python fig, ax = plt.subplots( 1, 2, subplot_kw={ "projection": ccrs.Orthographic(central_longitude=70, central_latitude=70) }, figsize=(10, 4), ) for axi in ax.flat: axi.gridlines( draw_labels=["bottom", "left"], color="grey", alpha=0.5, linestyle="--" ) axi.coastlines(edgecolor="black", linewidths=0.5) sic_data_Barents_Sea_12_detrend_elnino.mean(dim="time").plot.contourf( ax=ax[0], transform=ccrs.PlateCarree(), cbar_kwargs={"location": "bottom"}, cmap="RdBu_r", levels=21, ) ax[0].set_title("El niño years") sic_data_Barents_Sea_12_detrend_lanina.mean(dim="time").plot.contourf( ax=ax[1], transform=ccrs.PlateCarree(), cbar_kwargs={"location": "bottom"}, cmap="RdBu_r", levels=21, ) ax[1].set_title("La niña years") .. image-sg:: /auto_gallery/images/sphx_glr_plot_basic_statistical_analysis_012.png :alt: El niño years, La niña years :srcset: /auto_gallery/images/sphx_glr_plot_basic_statistical_analysis_012.png :class: sphx-glr-single-img .. rst-class:: sphx-glr-script-out .. code-block:: none Text(0.5, 1.0, 'La niña years') .. GENERATED FROM PYTHON SOURCE LINES 526-529 So is there a significant difference in the distribution of SIC between these two events? :py:func:`easyclimate.calc_ttestSpatialPattern_spatial ` provides a two-sample t-test operation to investigate whether there is a significant difference between the means of the two samples. .. GENERATED FROM PYTHON SOURCE LINES 529-536 .. code-block:: Python sig_diff = ecl.calc_ttestSpatialPattern_spatial( sic_data_Barents_Sea_12_detrend_elnino, sic_data_Barents_Sea_12_detrend_lanina, dim="time", ) sig_diff .. raw:: html
<xarray.Dataset> Size: 20kB
    Dimensions:    (lat: 20, lon: 60)
    Coordinates:
      * lat        (lat) float32 80B 65.5 66.5 67.5 68.5 ... 81.5 82.5 83.5 84.5
      * lon        (lon) float32 240B 30.5 31.5 32.5 33.5 ... 86.5 87.5 88.5 89.5
    Data variables:
        statistic  (lat, lon) float64 10kB nan nan nan nan ... -4.9 -2.239 -2.086
        pvalue     (lat, lon) float64 10kB nan nan nan ... 0.0006233 0.04909 0.06352


.. GENERATED FROM PYTHON SOURCE LINES 537-538 We can find that there is little difference in the effect on SIC under different ENSO events. .. GENERATED FROM PYTHON SOURCE LINES 538-564 .. code-block:: Python fig, ax = plt.subplots( subplot_kw={ "projection": ccrs.Orthographic(central_longitude=70, central_latitude=70) } ) ax.gridlines(draw_labels=["bottom", "left"], color="grey", alpha=0.5, linestyle="--") ax.coastlines(edgecolor="black", linewidths=0.5) # SIC slope diff = sic_data_Barents_Sea_12_detrend_lanina.mean( dim="time" ) - sic_data_Barents_Sea_12_detrend_elnino.mean(dim="time") diff.plot.contourf( ax=ax, transform=ccrs.PlateCarree(), cbar_kwargs={"location": "right"}, cmap="RdBu_r", levels=21, ) ax.set_title("La niña minus El niño", loc="left") ecl.plot.draw_significant_area_contourf( sig_diff.pvalue, ax=ax, thresh=0.1, transform=ccrs.PlateCarree() ) .. image-sg:: /auto_gallery/images/sphx_glr_plot_basic_statistical_analysis_013.png :alt: La niña minus El niño :srcset: /auto_gallery/images/sphx_glr_plot_basic_statistical_analysis_013.png :class: sphx-glr-single-img .. rst-class:: sphx-glr-timing **Total running time of the script:** (0 minutes 17.827 seconds) .. _sphx_glr_download_auto_gallery_plot_basic_statistical_analysis.py: .. only:: html .. container:: sphx-glr-footer sphx-glr-footer-example .. container:: sphx-glr-download sphx-glr-download-jupyter :download:`Download Jupyter notebook: plot_basic_statistical_analysis.ipynb ` .. container:: sphx-glr-download sphx-glr-download-python :download:`Download Python source code: plot_basic_statistical_analysis.py ` .. container:: sphx-glr-download sphx-glr-download-zip :download:`Download zipped: plot_basic_statistical_analysis.zip `