Alternative Approach to Evaluating Interpolation Methods of Small and Imbalanced Data Sets

The research concerns an alternative approach to the evaluation of interpolation methods for mapping small and imbalanced data sets. A basic statistical analysis of the standard cross-validation procedure is not always conclusive. In the case of the investigated data set (which is inconsistent with normal distribution), three interpolation methods have been selected as the most reliable (according to standard cross-validation). However, the maps resulting from these methods clearly differ from each other. This is why a comprehensive statistical analysis of the studied data is a necessity. We propose an alternative approach that evaluates a broadened scope of parameters describing the data distribution. The general idea of the methodology is to compare not only the standard deviation of the estimator but also three additional parameters, making the final assessment much more accurate. The analysis has been carried out with the use of Golden Software Surfer, which provides a wide range of interpolation methods and numerous adjustable parameters.


Introduction
The research concerns the evaluation of the influence of the Stanisław Siedlecki Polish Polar Station (Hornsund, Svalbard; Fig. 1) on the environment with the use of magnetic methods.
The study area is located in the southern part of Spitsbergen and covers approximately 17.5 ha (500 × 350 meters). There is mostly tundra vegetation growing on initial lithosols and frost-deformed regosols [6].
It is essential to emphasize that this area lies within the territory of South Spitsbergen National Park and has been identified as an Important Bird Area [1]. Hence, legal regulations have strongly limited the number of samples, so only 73 topsoil specimens were collected from the vicinity of the station (Fig. 2).
Consequently, a grave problem arose: how to visualize, interpolate, and finally interpret such a small data set? In this paper, we propose a methodology for coping with the evaluation of interpolation methods for mapping small and imbalanced data sets.

Magnetic Susceptibility, Descriptive Statistics, and Chi-Squared Test
For each sample, magnetic susceptibility was measured using a Multi-Function Kappabridge (AGICO, Czech Republic). The susceptibility values were normalized by a mass unit (kg), and the specific magnetic susceptibility (χ) was calculated. This parameter corresponds to the magnetic mineral content that is usually accompanied by heavy metals and polycyclic aromatic hydrocarbons [8], [13]. Thus, the area where magnetic susceptibility is enhanced can be considered as having been influenced by the activity of the Polish Polar Station (PPS).
After our measurements, a basic statistical analysis was carried out. Figure 3 presents a decile plot and histogram. Deciles are values that divide the sorted data into ten equal parts; e.g., the 3rd decile is the value below which 30% of the values lie [12]. Consequently, the 5th decile is equal to the median value.
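The decile definition above can be checked numerically. The sketch below uses arbitrary toy values, not the measured susceptibilities:

```python
import numpy as np

# Toy values standing in for sorted susceptibility measurements
data = np.array([7.98, 9.1, 11.4, 13.2, 15.0, 19.6, 24.3, 31.0, 55.2, 138.86])

deciles = np.percentile(data, np.arange(10, 100, 10))  # 1st..9th decile
print(deciles)

# The 3rd decile (deciles[2]) has 30% of the values below it,
# and the 5th decile coincides with the median:
print(np.isclose(deciles[4], np.median(data)))  # True
```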
Magnetic susceptibility values span a wide range, from 7.98·10⁻⁸ m³·kg⁻¹ to 138.86·10⁻⁸ m³·kg⁻¹. The mean is 19.43·10⁻⁸ m³·kg⁻¹ (dashed line in Figure 4a), and the median is 13.20·10⁻⁸ m³·kg⁻¹. The mean value is higher than the 75th percentile, which means that more than 75% of the samples feature magnetic susceptibility lower than the mean. The high level of discrepancy between the mean and median values indicates the strong asymmetry of the data distribution. This fact is also proven by the high positive value of skewness (4.24). Moreover, the standard deviation is very high: 18.68·10⁻⁸ m³·kg⁻¹ (over 96% of the mean). The data distribution is leptokurtic; kurtosis equals 23.35 [7]. Subsequently, we carried out the chi-squared test with a significance level of 5%. It was proven that the investigated data is not consistent with normal distribution [15].
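The descriptive statistics and the chi-squared normality test can be reproduced along the following lines. This is a minimal sketch: the sample is a synthetic lognormal stand-in for the 73 measured values (the real data is not listed in the paper), and the binned goodness-of-fit test is one common way to realize a chi-squared normality check:

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(0)
# Synthetic stand-in for the 73 susceptibility values (in 10^-8 m^3/kg);
# a lognormal sample mimics the strong right skew described above.
chi_mass = rng.lognormal(mean=2.6, sigma=0.7, size=73)

mean, med = chi_mass.mean(), np.median(chi_mass)
std = chi_mass.std(ddof=1)
z = (chi_mass - mean) / chi_mass.std()  # population std for moment ratios
skew = np.mean(z ** 3)
kurt = np.mean(z ** 4) - 3.0            # excess kurtosis

print(f"mean={mean:.2f}  median={med:.2f}  std={std:.2f}")
print(f"skewness={skew:.2f}  excess kurtosis={kurt:.2f}")

# Binned chi-square goodness-of-fit against a fitted normal distribution;
# expected bin probabilities come from the normal CDF (via erf).
def norm_cdf(x):
    return 0.5 * (1.0 + erf((x - mean) / (std * sqrt(2.0))))

observed, edges = np.histogram(chi_mass, bins=8)
probs = np.diff([norm_cdf(e) for e in edges])
expected = observed.sum() * probs / probs.sum()  # renormalize to n
chi2_stat = ((observed - expected) ** 2 / expected).sum()
print(f"chi-square statistic = {chi2_stat:.1f} (dof = {8 - 1 - 2})")
# Compare against the 5% critical value chi2(0.95, 5 dof) = 11.07;
# a statistic above it rejects normality.
```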
This kind of data requires exceptional caution during both the interpolation and interpretation processes. Furthermore, the literature does not recommend any single interpolation method for creating a reliable map of the studied variable in such cases. Hence, we have decided to examine all of the interpolation methods available in Golden Software Surfer.

Interpolation Simulations Using Surfer
Golden Software Surfer provides a wide range of interpolation methods with numerous adjustable parameters [3]:
1. Inverse Distance to a Power is a weighted average interpolator. Weights of the data are determined in accordance with the inverse of the distance from a grid node. Moreover, a weighting power can be adjusted. This parameter controls how quickly the weighting factors drop with distance from a grid node. If the weighting power is high, the influence of the points far from the grid node is insignificant. Thus, this method can be either an exact or a smoothing interpolator. In our case, power values of 1, 2, 3, and 4 were used.
2. Modified Shepard's Method is similar to the previous one and uses the same type of interpolator. Nonetheless, it eliminates (or at least reduces) the bull's-eye effect by using the local least-squares method.
3. Kriging is a geostatistical gridding method that is most frequently recommended by the literature [3, 4, 10, 14]. This method is based on the theory of regionalized variables. Variograms, anisotropy, and other options help to find trends in the analyzed data.
4. Minimum Curvature is one of the smoothing interpolators. It helps to generate a smooth surface that honors the data as closely as possible.
5. The Natural Neighbor method is based on Delaunay triangulation, where weights are proportional to the areas of particular Voronoi cells.
6. Nearest Neighbor is a simple interpolation method assigning the value of the nearest point to each grid node. This method is useful in cases where the data is already evenly spaced.
7. Polynomial Regression defines large-scale trends and patterns in the analyzed data using various types of polynomials.
8. Radial Basis Function is a set of exact interpolation methods. The basis kernel functions are analogous to the variograms in Kriging. Additionally, a smoothing factor can be introduced in order to produce a smoother surface.
9. The Triangulation with Linear Interpolation method uses optimal Delaunay triangulation. As a result, no triangle edge is intersected by another triangle. Each triangle determines a plane over the grid nodes lying within it, with the tilt and elevation of the plane defined by the three original data points.
10. Local Polynomial assigns grid node values with the use of a weighted least-squares fit within an adjustable search ellipse. Local polynomials can be of degree 1, 2, or 3.
11. The Moving Average interpolation method is based on averaging the data within a user-defined search ellipse for each grid node [3].
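To illustrate the weighting scheme of the first method, a minimal inverse-distance-to-a-power interpolator can be sketched in a few lines. This is a simplified stand-in for Surfer's gridding engine, not its actual implementation:

```python
import numpy as np

def idw(xy, values, grid_xy, power=2.0, eps=1e-12):
    """Inverse Distance to a Power: weighted average with w_i = 1 / d_i^p.

    xy      : (n, 2) sample coordinates
    values  : (n,)   sample values
    grid_xy : (m, 2) grid-node coordinates
    power   : weighting power p; larger p makes distant points negligible
    """
    # Pairwise distances between every grid node and every sample point
    d = np.linalg.norm(grid_xy[:, None, :] - xy[None, :, :], axis=2)
    w = 1.0 / (d + eps) ** power
    w /= w.sum(axis=1, keepdims=True)   # normalize weights per grid node
    return w @ values

# Toy usage: four samples, one grid node equidistant from all of them,
# so the result is the plain mean of the values.
pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
vals = np.array([1.0, 2.0, 3.0, 4.0])
node = np.array([[0.5, 0.5]])
print(idw(pts, vals, node, power=2))   # [2.5]
```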
Altogether, a series of 83 simulations was carried out using the aforementioned 11 interpolation methods (Tab. 1). Each simulation had different values of the parameters characteristic of the particular method.

Preliminary Interpolation Assessment -Standard Cross-Validation
In order to be sure that the presentation of the data in the form of a map is reliable, an evaluation of the quality of the interpolation was needed. For this purpose, standard cross-validation is routinely used. During this procedure, Surfer removes one point from the data set; then, using the remaining data and a particular algorithm, it interpolates a new value at this point, called the Estimated value (E_i).
The difference between the Estimated value and the measured one (M_i) at the particular point is called the Residual (R_i) (Equation (1)) [3, 12]:

R_i = E_i − M_i (1)
Afterwards, the program repeats the process for the rest of the points from the data set. After the Residual is calculated for each point, various types of statistical analysis can be carried out and used as a quantitative, objective proxy for interpolation method quality [2]. The literature [3] suggests focusing on a comparison between the standard deviation and standard error of the residuals during the selection of the best method.
The standard error is defined by the following equation:

D_R = S_R / √n (2)

where: S_R - standard deviation of the residuals, n - sample size.
Being aware of the proportionality between the standard error (D_R) and the standard deviation (S_R), we analyzed only the latter parameter.
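The leave-one-out procedure Surfer automates can be sketched as follows. The gridding algorithm here is a plain nearest-neighbour placeholder, and the coordinates and values are synthetic, so this only illustrates the mechanics of computing R_i, S_R, and D_R:

```python
import numpy as np

def loo_residuals(xy, values, interpolate):
    """Leave-one-out cross-validation: for each point i, drop it,
    re-interpolate at its location, and store R_i = E_i - M_i."""
    n = len(values)
    residuals = np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i
        e_i = interpolate(xy[mask], values[mask], xy[i:i + 1])[0]
        residuals[i] = e_i - values[i]
    return residuals

def nearest(xy, values, targets):
    """Toy gridding algorithm: value of the nearest sample point."""
    d = np.linalg.norm(targets[:, None, :] - xy[None, :, :], axis=2)
    return values[d.argmin(axis=1)]

rng = np.random.default_rng(1)
xy = rng.uniform(0, 500, size=(73, 2))      # 73 points, as in the study
vals = rng.lognormal(2.6, 0.7, size=73)     # skewed synthetic values

R = loo_residuals(xy, vals, nearest)
S_R = R.std(ddof=1)                         # standard deviation of residuals
D_R = S_R / np.sqrt(len(R))                 # standard error, Equation (2)
print(f"S_R = {S_R:.3f}, D_R = {D_R:.3f}")
```

Repeating this with each gridding algorithm and comparing the resulting S_R values reproduces the ranking step described above.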
The results of the cross-validation procedure for the studied data set are shown in Table 2. The last column in Table 2 contains R_E, calculated as the relative difference between the analyzed parameter value for a particular interpolation method and the parameter value for the best one. The conducted analysis has revealed that Minimum Curvature is the best interpolation method (having the smallest value of standard deviation). The second is Polynomial Regression; the third, Local Polynomial; and the fourth, Radial Basis Function.
Despite the fact that there is only a 9.6% difference between the first four methods (in compliance with the cross-validation algorithm), their graphical representations are significantly disparate (Fig. 4). What is more, these kinds of discrepancies can be observed in almost half of the utilized methods (Figs. 5, 6, Tabs. 3-5). It is important to emphasize that each interpolation method has slightly different characteristics. For instance, Moving Average is most reliable in cases with a large volume of data; Local Polynomial is most applicable to data sets that are locally smooth; and Polynomial Regression shows only the underlying large-scale trends and patterns [9]. Hence, some of the methods are not suitable for interpolation of the studied data. However, we have considered all of the available methods, because the proposed methodology should be as universal as possible and applicable to any data set.
The other important issue is the use of the Kriging method. It is often recommended by the literature [2, 3, 14] as one of the most flexible and useful methods. Nevertheless, it only achieved eighth place in the case of the investigated data set (Tab. 2). The reason is that Kriging is most applicable to data that is consistent with normal distribution. Otherwise, there is no clear suggestion as to which interpolation method is most reliable.
After a critical visual assessment by the researchers, it becomes quite obvious that only Minimum Curvature is reliable among the best three methods (Tab. 1, Fig. 4a-c). However, such an evaluation would not be clear if the other methods were taken into consideration. For instance, the choice between Minimum Curvature and Radial Basis Function seems to be impossible without any statistical analysis (Fig. 4a, d).
Furthermore, the conducted simulations have proven that neither the standard cross-validation procedure nor the researcher's subjective assessment alone may be reliable in assessing the quality of interpolation, especially in the case of data that is not consistent with normal distribution.
The literature recommends the Univariate and Willmott statistical methods in such situations [16]. They are partially based on linear regression analysis and absolute difference measures. However, Willmott [16] emphasizes that the above-mentioned parameters can be misleading. The data set analyzed in this paper contains a number of outliers (values even several times higher than the mean; cf. Fig. 3); consequently, it is inconsistent with normal distribution. The aforementioned facts increase the risk of an unreliable evaluation of interpolation quality; this is why we propose an alternative approach that evaluates a broadened scope of parameters describing the data distribution.

Alternative Approach -Error Index
The general idea of the proposed methodology is to compare not only the standard deviation of the estimator (Step 1) but also three additional parameters in order to make the final assessment much more accurate. Besides the standard deviation of the Residuals, our analysis considers:
- the mean of the absolute relative residuals (Step 2):
  (1/n) · Σ |R_i / M_i| (3)
- the mean residual (Step 3):
  (1/n) · Σ R_i (4)
- the difference between the standard deviations of the measured and estimated values (Step 4):
  |S_M − S_E| (5)
where: R_i - residual of the particular data point, E_i - estimated value of the particular data point, M_i - measured value of the particular data point, n - number of points, S_M - standard deviation of the measured values, S_E - standard deviation of the estimated values.
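Given the cross-validation outputs, the four parameters are straightforward to compute. In the sketch below the measured/estimated arrays are hypothetical, and Step 4 is taken as the absolute difference of the two standard deviations:

```python
import numpy as np

def error_parameters(measured, estimated):
    """Four per-method parameters of the proposed assessment.

    measured, estimated : arrays of cross-validation M_i and E_i values
    Returns (S_R, mean |R_i/M_i|, mean R_i, |S_M - S_E|).
    """
    R = estimated - measured                          # residuals, Eq. (1)
    s_r  = R.std(ddof=1)                              # Step 1: std of residuals
    marr = np.mean(np.abs(R / measured))              # Step 2: mean |R_i / M_i|
    mr   = R.mean()                                   # Step 3: mean residual
    dsd  = abs(measured.std(ddof=1)
               - estimated.std(ddof=1))               # Step 4: |S_M - S_E|
    return s_r, marr, mr, dsd

# Hypothetical cross-validation results for one interpolation method
measured = np.array([10.0, 20.0, 15.0, 30.0])
estimated = np.array([12.0, 18.0, 15.0, 33.0])
s_r, marr, mr, dsd = error_parameters(measured, estimated)
print(s_r, marr, mr, dsd)   # mean residual mr = 0.75, marr = 0.1
```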
It is important to emphasize that our goal is not to assess each parameter separately but to draw one final conclusion from all of them, making the assessment more accurate and thorough.
After calculating all of the parameters, we normalized each of them to a scale from 1 to 10, where 1 is the best mark and 10 is the worst. The sum of these normalized values for each method gives a new parameter, which we call the Error Index. Therefore, the best interpolation method is the one with the lowest value of the Error Index. Admittedly, the conclusion is the same as that drawn from standard cross-validation, but this will not always be the case (especially for small and imbalanced data sets).
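The exact normalization formula is not specified above; a linear min-max rescale of each parameter to the interval [1, 10] is one plausible reading, sketched here on hypothetical parameter values (absolute values are taken so that a mean residual near zero counts as good):

```python
import numpy as np

def error_index(param_table):
    """param_table: (methods, 4) array of the four raw parameters.

    Each column is linearly rescaled to [1, 10] (1 = best = smallest
    magnitude); the Error Index is the row-wise sum of the marks.
    Assumption: magnitudes are compared, hence np.abs.
    """
    p = np.abs(np.asarray(param_table, dtype=float))
    lo, hi = p.min(axis=0), p.max(axis=0)
    # Guard against a constant column (hi == lo): everyone gets mark 1
    marks = 1 + 9 * (p - lo) / np.where(hi > lo, hi - lo, 1)
    return marks.sum(axis=1)

# Hypothetical raw parameters for three interpolation methods
table = np.array([
    [2.1, 0.30,  0.5, 1.2],   # method A
    [3.4, 0.45, -0.2, 2.0],   # method B
    [1.8, 0.25,  0.1, 0.9],   # method C: smallest in every column
])
ei = error_index(table)
print(ei.argmin())   # 2 -> method C has the lowest Error Index
```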
The conducted research has proven that, in the cases of small and imbalanced data sets, selection of the best interpolation method could be problematic.
Firstly, the researcher should be highly aware of the basics of each method. Each algorithm is constructed for specific sets of data; for instance, Moving Average is most applicable to large and very large data sets (more than 1000 data points); Nearest Neighbor is most useful when it comes to regularly (or almost regularly) spaced data points; methods based on polynomials are suitable for defining local/global trends and patterns in data; Kriging is based on the theory of regionalized variables, which works best with data consistent with normal distribution; etc. [2, 3, 9, 14].
Secondly, there are some methods that result in relatively similar maps (e.g., Minimum Curvature and Radial Basis Function; Fig. 4a, c) and, simultaneously, have considerably different statistical results. A basic statistical analysis of the standard cross-validation procedure used during the evaluation of gridding method quality is not always conclusive, especially in the cases of small and imbalanced data sets.
Finally, as shown in the conducted analysis, nearly every single method takes a different place depending on which statistical parameter has been taken into consideration in a particular step. Hence, it is vital to remember that the statistics are not related to the physical basis of the investigated phenomena; therefore, the researcher's reliable assessment of the final map is crucial to the proper interpretation of the studied variable.

Conclusions
- The final results have revealed that Minimum Curvature is the best interpolation method for the investigated data set.
- Small and imbalanced data sets require careful consideration during interpolation method selection, because some algorithms might lead to unrealistic distortions and, consequently, a misinterpretation of the studied data.
- Both the researcher's knowledge about the physical basis of the studied parameter and a critical assessment of the final map are essential to the reliable interpretation of the investigated variable.
- Standard cross-validation analysis is not always conclusive. The use of the Error Index allows us to evaluate a broadened scope of parameters describing the data distribution, which makes the final assessment more accurate and thorough.
- The proposed approach has been applied to only one data set. Hence, in order to assess its functionality, a further analysis of different data sets is strongly recommended.

Table 1.
Number of simulations conducted using each interpolation method

Table 2.
Results of the cross-validation procedure for the studied data

Table 3.
Results of the second parameter (Eq. 3) for the studied data

Table 4.
Results of the third parameter (Eq. 4) for the studied data

Table 5.
Results of the fourth parameter (Eq. 5) for the studied data

Table 6.
Final results of all normalized parameters and the Error Index