Predicting continuous form of soil-water characteristics curve from limited particle size distribution data

Detailed information derived from a soil moisture characteristics curve (SMC) helps in water flow and solute transport management. Hence, prediction of the SMC from the soil particle size distribution (PSD), which is easy to measure, would be convenient. In this study, we combine an integrated robust PSD-based model and the Van Genuchten SMC model to predict a continuous form of the SMC using sand, silt and clay percentages for 50 soils selected from the UNSODA database. We compare the performance of the proposed approach with some previous prediction models. The results indicated that the SMC can be predicted and modelled properly by using sand, silt, clay and bulk density data. The model's bias was attributed to high fine particle and organic carbon (OC) content. We concluded that the independence of the proposed method from any database and empirical coefficients makes predictions more reliable and applicable for large-scale water and solute transport management.


INTRODUCTION
The soil's unsaturated zone forms a pivotal part of the hydrological cycle as it connects surface water to groundwater through the porous medium of soil. Therefore, a comprehensive evaluation of unsaturated soil proves useful for studying water flow and solute transport (Harter and Hopmans, 2004). One of the most important challenges in soil physics is the estimation of the hydraulic conductivity curve (HCC) and the soil moisture characteristic curve (SMC) (Futter et al., 2007; Balland et al., 2008). The SMC, which indicates the functional relationship between soil water content and matric potential, is used to model solute transport and water flow in the vadose zone (Hunt et al., 2013). However, due to temporal and spatial variability, direct measurement of the hydraulic properties is labour-intensive, costly and inaccurate (Schaap and Leij, 1998; Christiaens and Feyen, 2001; Islam et al., 2006; Abbasi et al., 2011). Therefore, considerable efforts have been made to estimate the SMC indirectly (Antinoro et al., 2014).
Easily available soil properties have been used extensively as a basis for alternative methods to estimate the HCC and SMC. In recent years, researchers have paid considerable attention to predicting the SMC in terms of pore size distribution (PoSD) using basic soil physical properties (Nimmo et al., 2007; Mohammadi and Meskini-Vishkaee, 2013). These approaches, which are dubbed transfer functions, can be classified into three groups:
• Statistical techniques (pedo-transfer functions) or neural network models determine the correlation of basic soil properties (for instance sand, silt and clay percentages and organic matter content) to SMC points or parameters (Dashtaki et al., 2010; Vereecken et al., 2010; Abbasi et al., 2011). Available and reliable soil databases provide a variety of inputs for statistical models and, therefore, these models have been widely used (Hwang and Choi, 2006). For instance, ROSETTA software uses neural network analysis to estimate soil hydraulic parameters with hierarchical pedo-transfer functions. However, some researchers have shown that ROSETTA software did not estimate the Van Genuchten model (VG model) parameters properly (Yang and You, 2013).
• Physico-empirical models express the relation of particle size distribution (PSD) with PoSD. Arya and Paris (1981) made the first attempt to develop a physico-empirical model, which connects the soil moisture content and void volume. They estimated the pore diameter from the particle size (AP model).

Therefore, the objectives of this study were (i) to adjust the MV-VG model for the prediction of the SMC using only sand, silt and clay percentages; (ii) to compare the performance of the proposed approach with the results from the MV-VG model using the Unsaturated Soil Database (UNSODA); and (iii) to evaluate the performance of the adjusted MV-VG model against the ROSETTA software prediction results.

Scaling approach
Empirical parameters of the soil water characteristic curve and database-dependent models are error sources in models which describe soil hydraulic functions. Elimination of such systematic error using scaling approaches greatly improves the SMC accuracy. Meskini-Vishkaee et al. (2014) proposed a scaled SMC model based on the VG model, assuming that the residual water content equals zero (θ_r = 0):
$$\theta = \theta_s \left[ 1 + (\alpha h)^{n^*} \right]^{-m} \qquad (1)$$
where θ (L³·L⁻³) is the soil moisture content, θ_s (L³·L⁻³) is the saturated soil moisture content, h (L) is the matric suction, m and α are fitting coefficients and the parameter n* is the scaled pore size distribution index.
where n is a fitting coefficient and λ is defined as: The parameter ξ_max (-) equals 1.41432 and ξ (-) is a coefficient depending on the arrangement state of soil particles and is defined as: where e (-) is the void ratio given by:
$$e = \frac{\rho_s - \rho_b}{\rho_b} \qquad (5)$$
where ρ_b (M·L⁻³) and ρ_s (M·L⁻³) are bulk and particle densities, respectively.
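As a minimal sketch of how these quantities combine, the Python snippet below computes the void ratio from bulk and particle densities and evaluates a Van Genuchten-type curve with θ_r = 0. It assumes the standard form θ = θ_s[1 + (αh)^n*]^(−m) and the usual definition e = (ρ_s − ρ_b)/ρ_b. The function names and the value of m are hypothetical, and n* is passed in directly rather than derived from λ.

```python
def void_ratio(bulk_density, particle_density):
    """Void ratio e = (rho_s - rho_b) / rho_b (dimensionless)."""
    if bulk_density <= 0 or particle_density < bulk_density:
        raise ValueError("expected 0 < bulk_density <= particle_density")
    return (particle_density - bulk_density) / bulk_density


def scaled_vg_theta(h, theta_s, alpha, n_star, m):
    """Moisture content at matric suction h, assuming zero residual water content.

    Assumed form: theta = theta_s * [1 + (alpha * h) ** n_star] ** (-m).
    """
    if h < 0:
        raise ValueError("matric suction h must be non-negative")
    return theta_s * (1.0 + (alpha * h) ** n_star) ** (-m)


if __name__ == "__main__":
    # Illustrative inputs only: m = 0.3 is a made-up value, not taken from the study.
    e = void_ratio(bulk_density=1.4, particle_density=2.65)
    print(f"void ratio e = {e:.3f}")
    for suction in (0.0, 10.0, 100.0, 1000.0, 15000.0):
        theta = scaled_vg_theta(suction, theta_s=0.445, alpha=0.0129,
                                n_star=1.229, m=0.3)
        print(f"h = {suction:8.1f}  theta = {theta:.3f}")
```

Because the curve is a closed-form function of h, it can be evaluated at any suction, which is what makes the predicted SMC continuous rather than a set of discrete points.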

Developed soil water characteristic curve
Mohammadi and Vanclooster (2011) presented a conceptual robust model (MV model) to predict the soil matric suction, h_i, from the particle size, assuming the pore space geometry: where h_i (L) is the matric suction of the i-th size fraction and r_i (L) is the radius of the i-th size fraction. The MV model predicts the water characteristic curve accurately because it is independent of any database and contains no empirical parameters. However, its simplifying assumptions, which ignore the considerable effects of clay surface forces, lead to under-prediction in the dry range of the SMC. Following the Arya and Paris model (AP model), the mathematical relation between the moisture content (θ_i) and h_i is defined as: where w_i (-) is the particle mass fraction of the i-th fraction. Eq. 1 is the scaled form of the Van Genuchten SMC model when θ_r = 0.
Combining Eqs 6 to 8 gives: Eq. 9 is fitted to the PSD data to estimate the VG model parameters (m, n and α), which are then used as the input parameters of Eq. 1 as an SMC predictor model. Since Eq. 9 includes 3 fitting parameters, it should be fitted to the full range of PSD data containing at least 4 measured points. For limited-availability data points, Eq. 9 can be represented with the assumption of m = 1 − 1/n.
Eq. 10 can be used to fit PSD data comprising the sand, silt and clay percentages only. In summary, fitting Eq. 10 or Eq. 9 allows the estimation of the SMC parameters (n, m, α). Given that ρ_b is known, the scaling factor and subsequently n* can be calculated, and the continuous form of the SMC is predicted using Eq. 1.
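To illustrate what the limited PSD input looks like, the sketch below turns sand, silt and clay percentages into three cumulative PSD points. The particle-diameter limits are those of the USDA classification (clay < 0.002 mm, silt < 0.05 mm, sand < 2 mm), which is an assumption here because the size limits are not stated in the text. The function name is hypothetical.

```python
# Upper particle-diameter limits (mm) for each texture class.
# Assumption: USDA limits; the size limits used in the study are not stated.
USDA_UPPER_LIMITS_MM = (("clay", 0.002), ("silt", 0.05), ("sand", 2.0))


def cumulative_psd(sand_pct, silt_pct, clay_pct):
    """Return [(upper_diameter_mm, cumulative_mass_fraction), ...], finest class first."""
    total = sand_pct + silt_pct + clay_pct
    if total <= 0:
        raise ValueError("texture percentages must sum to a positive value")
    # Normalise so the fractions sum to 1 even if the percentages do not sum to 100.
    fractions = {
        "clay": clay_pct / total,
        "silt": silt_pct / total,
        "sand": sand_pct / total,
    }
    points = []
    running = 0.0
    for name, upper_mm in USDA_UPPER_LIMITS_MM:
        running += fractions[name]
        points.append((upper_mm, running))
    return points


if __name__ == "__main__":
    for diameter, fraction in cumulative_psd(sand_pct=40, silt_pct=40, clay_pct=20):
        print(f"d < {diameter} mm: cumulative fraction {fraction:.2f}")
```

With m tied to n through m = 1 − 1/n, only two parameters (n and α) remain free, so three cumulative points are enough to constrain the fit, whereas the three-parameter form needs at least four.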

MATERIALS AND METHODS
Fifty soil samples from the UNSODA database (Nemes et al., 2001) having PSD data with at least 4 fractions were selected to estimate the SMC. The selected codes are presented in Table 1. The UNSODA database contains unsaturated hydraulic characteristics of 790 soil samples from all over the world, especially Europe and America. These data are used to develop estimations for water flow and solute transport management.
Equation 9 was used to fit the full range of PSD data with at least 4 measured points (Method 1: full PSD method) and Eq. 10 was used to fit the PSD data by assuming that only the sand, silt and clay percentages are known (Method 2: limited PSD method). To estimate the unknown coefficients of Eq. 9 and Eq. 10, the trust region algorithm of Matlab 8.3 (The Mathworks Inc., Natick, MA, USA) was used.
The parameters e, ξ and λ were calculated directly from the available bulk and particle densities. For most UNSODA soil samples, θ_s data are available. For those samples with no θ_s data, we followed the suggestion of Chan and Govindaraju (2004), who assumed the saturated moisture content to be equal to the moisture content measured at the lowest matric suction.
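The fallback for missing θ_s values can be sketched as follows. This is a minimal illustration, assuming that the retention data are available as (matric suction, moisture content) pairs and that the "lowest matric potential" refers to the wettest measured point, i.e. the smallest suction. The function name and data layout are hypothetical.

```python
def saturated_moisture(theta_s, retention_points):
    """Return theta_s if reported, else the moisture content at the lowest suction.

    retention_points: iterable of (matric_suction, moisture_content) pairs.
    """
    if theta_s is not None:
        return theta_s
    points = list(retention_points)
    if not points:
        raise ValueError("no retention data available to estimate theta_s")
    # Pick the measured point closest to saturation (smallest suction).
    _, theta_at_lowest_suction = min(points, key=lambda point: point[0])
    return theta_at_lowest_suction


if __name__ == "__main__":
    measured = [(100.0, 0.30), (10.0, 0.41), (1000.0, 0.20)]  # made-up values
    print(saturated_moisture(None, measured))
    print(saturated_moisture(0.45, measured))
```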
The ROSETTA software was also used to estimate the SMC parameters of the VG model using the SSCBD model option (sand, silt and clay percentages and bulk density as model predictors).

Statistical analysis
To quantify the accuracy of each prediction, the root mean square error (RMSE) between the measured and predicted moisture content was computed:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\theta_{i(p)} - \theta_{i(m)}\right)^{2}}$$
where N is the number of measured moisture contents, and θ_i(p) and θ_i(m) are the predicted and measured moisture content at the i-th matric suction, respectively. The coefficient of determination (R²) is also presented to evaluate the correlation between the measured and predicted moisture content. Relative improvement (RI) was calculated to compare the prediction methods (McBratney, 2002):
$$\mathrm{RI} = \frac{\mathrm{RMSE}_f - \mathrm{RMSE}_s}{\mathrm{RMSE}_f}$$
where RMSE_f and RMSE_s are the RMSE of Method 1 (as the reference model) and of Method 2 or ROSETTA (as the comparative approaches), respectively. A positive value of the dimensionless RI indicates that the accuracy of the predicted moisture contents improves by using Method 2 or the ROSETTA approach.
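The two criteria above can be sketched in a few lines of Python. This is an illustrative implementation, assuming RI is the RMSE reduction relative to the reference RMSE, which is consistent with a positive RI indicating improvement. The function names and sample values are hypothetical.

```python
import math


def rmse(predicted, measured):
    """Root mean square error between predicted and measured moisture contents."""
    if len(predicted) != len(measured) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - m) ** 2 for p, m in zip(predicted, measured))
    return math.sqrt(squared / len(measured))


def relative_improvement(rmse_reference, rmse_comparative):
    """RI = (RMSE_f - RMSE_s) / RMSE_f; positive when the comparative method is better."""
    if rmse_reference <= 0:
        raise ValueError("reference RMSE must be positive")
    return (rmse_reference - rmse_comparative) / rmse_reference


if __name__ == "__main__":
    measured = [0.42, 0.35, 0.24, 0.12]      # made-up moisture contents
    reference = [0.40, 0.31, 0.20, 0.08]     # e.g. predictions of a reference method
    comparative = [0.43, 0.34, 0.22, 0.10]   # e.g. predictions of a second method
    rmse_f = rmse(reference, measured)
    rmse_s = rmse(comparative, measured)
    print(f"RMSE_f = {rmse_f:.4f}, RMSE_s = {rmse_s:.4f}, "
          f"RI = {relative_improvement(rmse_f, rmse_s):.3f}")
```

Note that an RI averaged over individual soils generally differs from the RI computed from average RMSE values, since the ratio is taken soil by soil.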
To compare the measured and predicted moisture content for the dataset of 50 UNSODA soils, the mean absolute error (MAE) and mean bias error (MBE) were used, defined as:
$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|P_i - M_i\right|$$
and
$$\mathrm{MBE} = \frac{1}{N}\sum_{i=1}^{N}\left(P_i - M_i\right)$$
where M_i and P_i are the measured and predicted values of moisture content, respectively, and N is the number of measured and predicted points. MAE shows the average error magnitude and MBE shows the average bias of each method. A positive MBE value indicates over-prediction.
Methods 1 and 2 are based on the MV model. This model assumes that all soil particles are spherical and that soil structure can only influence soil bulk density. The effects of soil organic matter content, particle surface energy, and lens and film water volume are not accounted for by this model (Mohammadi and Vanclooster, 2011; Mohammadi and Meskini-Vishkaee, 2013). Therefore, the under-prediction by Method 1 and Method 2 can be partially attributed to the assumptions of the MV model. For all sample soils represented in Fig. 1, Method 1 and Method 2 provide consistent predictions, especially for the wet range of the SMC.
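The MAE and MBE criteria defined in this section can be sketched as below. The sign convention (predicted minus measured) is an assumption chosen so that a positive MBE corresponds to over-prediction, as stated in the text. The function names and sample values are hypothetical.

```python
def mae(predicted, measured):
    """Mean absolute error: average magnitude of the prediction error."""
    if len(predicted) != len(measured) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - m) for p, m in zip(predicted, measured)) / len(measured)


def mbe(predicted, measured):
    """Mean bias error with the assumed convention predicted - measured.

    Positive values indicate over-prediction, negative values under-prediction.
    """
    if len(predicted) != len(measured) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p - m for p, m in zip(predicted, measured)) / len(measured)


if __name__ == "__main__":
    measured = [0.42, 0.35, 0.24, 0.12]    # made-up moisture contents
    predicted = [0.44, 0.33, 0.27, 0.13]
    print(f"MAE = {mae(predicted, measured):.4f}")
    print(f"MBE = {mbe(predicted, measured):+.4f}")
```

Errors of opposite sign cancel in the MBE but not in the MAE, so a method can show a small bias while still having a sizeable average error.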

RESULTS AND DISCUSSION
The results of fitting Eq. 9 (Method 1: full PSD method) and Eq. 10 (Method 2: limited PSD method) are presented in Table 2.
Table 2 shows that the average value of θ_s is 0.445 for all selected soils and varies from 0.324 for sandy loam soils to 0.557 for silty clay loam soils. For Methods 1 and 2, the average values of n* were 1.374 and 1.229, respectively. Following Eqs 3 to 5, the λ value is computed from bulk and particle densities; the average λ values of Method 1 and Method 2 are therefore the same (0.756). For Method 1 and Method 2, the geometric average values of α are 0.0101 and 0.0129, respectively. The prediction results of Method 1, Method 2 and the ROSETTA software are summarized in Table 3 by comparing the statistical criteria, including RMSE, R² and RI. The average RMSEs of Method 1, Method 2 and the ROSETTA software are 0.048 (varying from 0.023 for silty clay soils to 0.080 for sandy soils), 0.034 (varying from 0.009 for sandy soils to 0.064 for loam soils) and 0.069 (varying from 0.027 for sandy loam soils to 0.139 for silty clay loam soils), respectively. In terms of RMSE, Method 1 and Method 2 predicted consistently better than ROSETTA software. The RMSEs derived from Method 1 and Method 2 are smaller than the 0.060 and 0.2071 obtained from the scaling approach of Meskini-Vishkaee et al. (2014) and the MV model of Mohammadi and Vanclooster (2011), respectively. Comparison of the performances of Method 1 and Method 2 with the performance of ROSETTA software reveals that ROSETTA software is not capable of predicting the SMC accurately in fine-textured soils because of their fine particle content. The average value of R² for all selected UNSODA soil textures is 0.958, 0.975 and 0.910 for Method 1, Method 2 and the ROSETTA software, respectively (Table 3). In terms of R² values, Method 1 and Method 2 performed consistently better than the ROSETTA software. The small difference between the R² values of Method 1 and Method 2 is not statistically significant.

Overall comparison of developed model
Comparison of Method 1 and Method 2 according to the average RI value (0.126) indicated that, in general, the accuracy of the MV-VG model does not increase with an increased number of measured points of the PSD curve. However, the average RI value per texture class varies from -0.425 for clay soils to 0.637 for loamy sand soils. The RI values for the comparison of Method 1 and Method 2 were negative for clay, loam, silty clay loam and silty clay textured soils, revealing that a low number of model inputs reduces the SMC accuracy in fine to moderate textured soils (Table 3). The average RI value for the comparison of ROSETTA and Method 2 is 0.342 (varying from -0.230 for loam soils to 0.788 for silty loam soils), which indicates that, although ROSETTA and the limited PSD method require the same input data, Method 2 (limited PSD method) predicts the SMC more accurately.
In many pedological studies, sand, silt and clay percentages are measured routinely and this information is usually available in most soil survey reports. Method 2 can therefore be used to predict the SMC easily. Moreover, Method 2 does not need any empirical coefficient or database-dependent parameter. This advantage allows for prediction of the soil hydraulic characteristics regardless of spatio-temporal variations; thus the SMC can be estimated for large-scale studies.
Table 4 presents the statistical criteria used to compare the measured vs. predicted moisture content: mean absolute error, mean bias error and R² of linear regression. In terms of MAE, Method 2 predicts the SMC more accurately. This is also evident in Table 3 from the average values of RMSE and R² obtained for Method 2 (0.034, 0.975), Method 1 (0.048, 0.958) and ROSETTA software (0.069, 0.910). The MBE values show that Method 2 over-predicts the SMC while Method 1 and ROSETTA under-predict.
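Comparisons of measured and predicted soil moisture content of the full dataset for Method 1, Method 2 and the ROSETTA software are shown in Fig. 2 (a-c), respectively. In general, the 1:1 line marks equal measured and predicted moisture content and reveals the bias of measured vs. predicted moisture content of the dataset.
Optimal values of the criteria in Table 4: MAE = 0, MBE = 0 and R² = 1, where R² is the determination coefficient of the linear regression in Fig. 2 (a-c).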

CONCLUSIONS
In this study, we adapted the MV-VG model for the prediction of the SMC using only sand, silt and clay percentages, and we evaluated the performance of this approach against experimental data and the results of ROSETTA software. Results showed that the continuous form of the SMC can be predicted accurately assuming that sand, silt and clay percentages are the only known properties of the soil. Full PSD data are not usually available, while sand, silt and clay percentages are measured conventionally in all soil analyses. The advantages of the proposed method for proper SMC prediction can be summarized as follows: (i) This method does not depend on a database or any empirical parameter. (ii) The proposed approach predicts continuous forms of the SMC for all tested soils. (iii) In comparison with the well-known ROSETTA software, this method is capable of predicting the SMC more accurately, especially in the dry range of the SMC. Since sand, silt and clay percentages are readily available soil properties and their spatial-temporal variability is approximately constant, the proposed method can be used as an alternative for predicting the SMC in large-scale studies.

Figure 1 (a-h) Examples of measured and predicted water retention curves for Method 1 (Eq. 9), Method 2 (Eq. 10) and ROSETTA software for: (a) clay soil, (b) loam soil, (c) sandy soil, (d) sandy clay loam soil, (e) sandy loam soil, (f) silty loam soil, (g) silty clay soil and (h) silty clay loam soil.

The RMSE of the ROSETTA software in the current study (0.069) is approximately the same as that reported for ROSETTA software by Meskini-Vishkaee et al. (2014) (0.0745).

TABLE 3 Comparison of the average RMSE, R² and RI values of Method 1 (Eq. 9), Method 2 (Eq. 10) and ROSETTA software in predicting the SMC. Standard deviations are presented in parentheses.
July 2018. Published under a Creative Commons Attribution Licence.

TABLE 4 Statistical comparison of measured vs. predicted moisture content
Linear regression (dashed red line) is used to evaluate the best-fitting line through predicted and measured moisture content. The slope values of the linear regression between measured and predicted moisture content were 0.94 (Method 1), 0.93 (Method 2) and 0.77 (ROSETTA software). The R² values of the linear regressions were 0.917, 0.950 and 0.827 for Method 1, Method 2 and the ROSETTA software, respectively. According to the slope values, Fig. 2 (a-c) and the MBE criterion, Method 2 slightly over-predicts moisture contents, while the ROSETTA software and Method 1 under-predict. Comparison of the proposed methods revealed that Method 1 and Method 2 generally predict the SMC more accurately than ROSETTA software, according to statistical criteria including RMSE, RI, MBE and R².
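A slope and R² of this kind can be obtained with ordinary least squares, as in the sketch below. It assumes that predicted moisture content is regressed on measured moisture content (the text does not state which variable is on which axis). The function name and the sample values are hypothetical.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept; returns (slope, intercept, r2)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired observations")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain at least two distinct values")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = sxy ** 2 / (sxx * syy)  # squared Pearson correlation
    return slope, intercept, r2


if __name__ == "__main__":
    measured = [0.10, 0.20, 0.30, 0.40]    # made-up moisture contents (x axis)
    predicted = [0.11, 0.19, 0.27, 0.36]   # made-up predictions (y axis)
    slope, intercept, r2 = linear_fit(measured, predicted)
    print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, R2 = {r2:.3f}")
```

A slope below 1 means the fitted line is flatter than the 1:1 line, so the sign of the resulting bias at a given moisture content also depends on the intercept.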