Environmental Determinants of Surface Water Quality Based on Environmetric Methods

A multivariate statistical technique, exploratory factor analysis (FA), was used to assess natural and anthropogenic impacts on surface water quality in two river basins in western Turkey (the Büyük Menderes and Küçük Menderes River Basins). The method attempts to explain the correlations between the observations in terms of underlying factors that are not directly observable, and to reduce a large number of water quality variables to a smaller number of attributes grouped into common factors. Furthermore, confirmatory factor analysis (CFA) was used to determine the reliability of the separated factors and their dimensionality. A path diagram was then designed to investigate the structural model. Results revealed that ionic composition and water oxygenation were the factors controlling overall water quality in the region. Since the components of the first factor were observed at higher levels in the Büyük Menderes Basin, it can be concluded that this region was more severely affected by anthropogenic activities than the Küçük Menderes Basin. Overall, the results indicate that FA confirmed by CFA can be used to identify probable pollution sources of surface waters.


Introduction
The assessment of river water quality is usually based on comparing analytically determined monitoring values of particular physicochemical parameters with the allowable threshold values defined in national or international legislation. A much more reliable approach for the classification, modeling and interpretation of data obtained from surface water monitoring studies appears to be chemometrics/environmetrics using intelligent data analysis and data mining. Only multivariate statistical methods can describe the complex relationships in an ecosystem [1].
These techniques, including but not limited to factor analysis, cluster analysis (CA) and discriminant analysis, have been applied to interpret large data matrices and to characterize and evaluate surface water quality reliably, in order to enhance understanding of the spatio-temporal variation caused by natural and anthropogenic processes related to seasonal changes. They have been recognized as powerful tools for identifying the physical, chemical and biological characteristics that affect water systems, supporting efficient management and effective solutions to pollution problems [2].
Environmental data, including water quality data, are also characterized by high variability. Much information is lost when only univariate graphical or statistical methods are used for data evaluation and interpretation. Chemometric/environmetric methods, in particular methods of multivariate data analysis, help to extract the latent information in such data [3].
In this study, water quality data obtained from two river basins in Turkey (the Büyük Menderes and Küçük Menderes Basins) were evaluated using factor analysis. Confirmatory factor analysis (CFA) was also used to determine the reliability of the separated factors and their dimensionality. A path diagram was then designed to investigate the structural model.

Study Area
Büyük Menderes Basin: The Büyük Menderes River Basin is located in Western Anatolia and has a total land area of about 25,000 km², which is approximately 3% of the total surface area of Turkey (Fig. 1). The total length of the main river in the catchment basin is 584 km. The population living in the basin is approximately 2.4 million. Land use in the basin is dominated by forest (45%) and agriculture (40%), followed by pasture and meadow (10%), surface waters (1%), urban areas (1%) and other uses. The main sources of pollution in the basin are domestic and agricultural activities. There is also relatively limited pollution from industrial sources. The water is mainly used for agricultural, domestic and industrial supply [4,5]. In this study, water quality samples collected monthly over one year from 8 monitoring stations were used to determine the environmental factors explaining variation in water quality in the region.
Küçük Menderes Basin: The Küçük Menderes River Basin is located in western Turkey. The catchment area is 3502 km² with a river length of 129 km (Fig. 2). The basin is very productive, and land uses are as follows: agricultural area (52%), olive trees (12%), orchards (4%), arable land (2%) and others. Furthermore, industrial sites are concentrated in the west. The Küçük Menderes River and its tributaries constitute the only surface water system in the study area, with an annual average discharge rate of 9.5 m³/s. The basin has hot, dry summers and mild, rainy winters. The mean annual precipitation calculated for the study area is 640 mm. The surface water quality class is unsuitable for many uses in the region, mainly because of uncontrolled industrial and agricultural discharges [7,8].
In this study, data collected monthly over one year from 9 stations were evaluated to fingerprint water quality in the region.

Study Method
In this study, sampling and laboratory analysis were carried out according to the standard procedures described in APHA [9]. The water quality variables analyzed were electrical conductivity (EC), total dissolved solids (TDS), chloride (Cl⁻), nitrate nitrogen (NO₃-N), dissolved oxygen (DO), biochemical oxygen demand (BOD), sulphate (SO₄²⁻), sodium (Na⁺), calcium (Ca²⁺) and magnesium (Mg²⁺). The statistical analyses (FA and confirmatory FA) were performed using IBM SPSS 24 and LISREL (structural equation modeling software).
Factor analysis is a statistical method used to find a small set of unobserved variables (also called latent variables, or factors) which can account for the covariance among a larger set of observed variables (also called manifest variables). A factor is an unobservable variable that is assumed to influence observed variables [11].
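As an illustration of this definition, the common factor model can be written as follows. This is the standard textbook formulation for standardized variables with uncorrelated factors, not a formula taken from the study; the symbols x (observed variables), f (latent factors), Λ (factor loadings), ε (unique errors) and Ψ (unique variances) are notation assumed here.

```latex
x = \Lambda f + \varepsilon, \qquad
\operatorname{Cov}(f) = I, \qquad
\operatorname{Cov}(\varepsilon) = \Psi \ \text{(diagonal)}, \qquad
\operatorname{Cov}(f, \varepsilon) = 0

\Sigma = \operatorname{Cov}(x) = \Lambda \Lambda^{\top} + \Psi
```

Under these assumptions, the covariance among the observed variables is carried entirely by the term ΛΛᵀ, which is the sense in which a small set of factors accounts for the covariance among a larger set of manifest variables.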
Confirmatory factor analysis (CFA) is theory- or hypothesis-driven. With CFA it is possible to place substantively meaningful constraints on the factor model. Researchers can specify the number of factors or set the effect of one latent variable on observed variables to particular values. CFA allows researchers to test hypotheses about a particular factor structure (e.g., that the factor loading between the first factor and the first observed variable is zero). It is common to display confirmatory factor models as path diagrams in which squares represent observed variables and circles represent latent variables. Single-headed arrows imply a direction of assumed causal influence, and double-headed arrows represent covariance between two latent variables. Latent variables "cause" the observed variables, as shown by the single-headed arrows pointing away from the circles and towards the manifest variables [11].
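The kind of constraint described above can be sketched in a few lines of Python. This is a minimal illustration, not the LISREL model used in the study: the six indicators, the loading values and the factor correlation of 0.3 are all hypothetical. It shows how fixing some loadings to zero (no single-headed arrow) and allowing the two factors to covary (double-headed arrow) determines the covariance matrix that the model implies.

```python
import numpy as np

# Hypothetical pattern for six illustrative indicators and two factors.
# True  = loading is free (a single-headed arrow in the path diagram)
# False = loading is constrained to zero (no arrow)
free = np.array([
    [True, False],
    [True, False],
    [True, False],
    [False, True],
    [False, True],
    [False, True],
])

# Illustrative standardized loadings (assumed values, not study results).
loadings = np.array([
    [0.9, 0.0],
    [0.8, 0.0],
    [0.7, 0.0],
    [0.0, 0.8],
    [0.0, 0.6],
    [0.0, 0.5],
])

# Factor covariance matrix: the off-diagonal entry is the double-headed
# arrow between the two latent variables (assumed value).
phi = np.array([
    [1.0, 0.3],
    [0.3, 1.0],
])


def implied_covariance(loadings, phi):
    """Return the model-implied covariance Lambda * Phi * Lambda^T + Theta.

    Theta (unique variances) is chosen so that every observed variable
    has unit variance, i.e. the variables are treated as standardized.
    """
    common = loadings @ phi @ loadings.T
    theta = np.diag(1.0 - np.diag(common))
    return common + theta


sigma = implied_covariance(loadings, phi)
print(np.round(sigma, 3))
```

In an actual CFA the free loadings are estimated by making this implied matrix as close as possible to the sample covariance matrix, and the remaining discrepancy is what the fit statistics summarize.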

Results
Descriptive statistics of the data sets are presented in Table 1 and Table 2. The correlation matrix of the variables was generated, and factors were extracted by the centroid method and rotated by varimax rotation. Total variance, factor loadings and cumulative variance are given in Tables 3 and 4. The factor analysis generated two significant factors, which explained 67% of the variance in the data set for the Büyük Menderes Basin and 60% for the Küçük Menderes Basin. Factor components in the two study areas were quite similar.
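The extract-then-rotate workflow described above can be sketched as follows. This is a hedged illustration only: the study used the centroid method in SPSS, whereas this sketch swaps in principal-component extraction from the correlation matrix because it is easy to express with NumPy. The data are synthetic (96 rows by 10 variables, generated from two assumed latent factors), and the function names are hypothetical.

```python
import numpy as np


def extract_loadings(corr, n_factors):
    """Principal-component extraction from a correlation matrix.

    Returns the unrotated loadings and the fraction of total variance
    explained by each retained factor.
    """
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1][:n_factors]
    loadings = eigvecs[:, order] * np.sqrt(eigvals[order])
    explained = eigvals[order] / corr.shape[0]
    return loadings, explained


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of a loading matrix."""
    n_vars, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    previous = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        column_ss = np.diag((rotated ** 2).sum(axis=0))
        target = rotated ** 3 - rotated @ column_ss / n_vars
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        criterion = s.sum()
        # Stop once the criterion no longer increases appreciably.
        if previous != 0.0 and criterion / previous < 1.0 + tol:
            break
        previous = criterion
    return loadings @ rotation


# Synthetic data: two assumed latent factors driving ten variables.
rng = np.random.default_rng(0)
latent = rng.normal(size=(96, 2))
true_loadings = np.array([
    [0.9, 0.0], [0.9, 0.1], [0.8, 0.0], [0.8, 0.1], [0.7, 0.0],
    [0.7, 0.1], [0.6, 0.0], [0.1, 0.8], [0.0, 0.7], [0.1, 0.6],
])
data = latent @ true_loadings.T + 0.5 * rng.normal(size=(96, 10))

corr = np.corrcoef(data, rowvar=False)
unrotated, explained = extract_loadings(corr, n_factors=2)
rotated = varimax(unrotated)

print("cumulative variance explained:", explained.sum())
print(np.round(rotated, 2))
```

Because varimax is an orthogonal rotation, it redistributes variance between the factors without changing each variable's communality or the cumulative variance explained; it only makes the loading pattern easier to interpret.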
Total dissolved solids (TDS) is a measure of all constituents dissolved in water. The principal constituents are usually calcium, magnesium, sodium and potassium cations and carbonate, hydrogencarbonate, chloride, sulfate and nitrate anions. TDS in water supplies originates from natural sources, sewage, urban and agricultural run-off, and industrial wastewater. Salts used for road de-icing can also contribute to the TDS loading of water supplies [12,13].
BOD and NO₃-N levels are indicators of organic pollution, and the concentrations of both variables were low in both basins. Therefore this factor, despite including BOD, could represent water oxygenation rather than pollution.
Based on these observations, it can be concluded that the two factors represent two different processes:
F1: ionic composition
F2: water oxygenation
The confirmatory factor models are displayed as path diagrams in Figs. 3 and 4.
Since the levels of the first-factor variables were comparatively higher in the Büyük Menderes Basin, it can be concluded that this region was more affected by anthropogenic activities than the Küçük Menderes Basin.

Conclusions
This study aimed to extract hidden factors explaining the structure of the database and to quantify the influence of possible sources on the water quality parameters of the two selected rivers. The data set comprised 10 variables: EC, TDS, Cl⁻, NO₃-N, DO, BOD, SO₄²⁻, Na⁺, Ca²⁺ and Mg²⁺. The factor analysis results revealed two factors representing two different processes: ionic composition and water oxygenation. The path diagrams confirmed these results. Multivariate statistical techniques, namely factor analysis and confirmatory factor analysis, are important analytical techniques for processing water quality parameters and powerful tools for classification as well as for identifying pollution sources.