Universal Journal of Agricultural Research Vol. 11(5), pp. 836 - 848
DOI: 10.13189/ujar.2023.110509


Rice Yield Modeling Using Machine Learning Algorithms Based on Environmental and Agronomic Data of Pampanga River Basin, Philippines


Maria Cristina V. David *
Department of Civil Engineering, Pampanga State Agricultural University, Philippines

ABSTRACT

This study investigated the environmental and agronomic factors that influence rice crop yields in the Pampanga River basin in Central Luzon, Philippines. Specifically, it examined fifteen (15) environmental and agronomic variables, including rainfall, temperature, wind speed, humidity, solar radiation, El Niño–Southern Oscillation (ENSO) classification, storms, tropical cyclones, rice crop yield (RCY), and type of irrigation or water source; the data covered the years 2009 to 2018. Correlation analysis and principal component analysis (PCA) were used to identify trends and patterns and to single out the environmental and agronomic variables that may affect rice crop yields. RCY was predicted using three machine learning (ML) algorithms: Linear Regression (LR), Artificial Neural Network (ANN), and Random Forest (RF). These models were evaluated using the MSE (Mean Squared Error), MAE (Mean Absolute Error), and R2 (Coefficient of Determination) metrics. Findings indicate that the correlation analysis and PCA identified average monthly rainfall (Rain-Ave) (0.387), total monthly rainfall (Rain-Total) (0.376), pressure at station (P-station) (0.388), and wind speed (Wind) (0.351) as among the variables with relatively high PCA loadings. Further, the RF model performed consistently across all evaluation methods, with lower MAE, MSE, and RMSE (Root Mean Squared Error) scores and higher R2 scores than the other models. The LR model also performed reasonably well on the MAE and RMSE metrics, although it lagged behind the RF model. Surprisingly, the ANN model performed poorly despite being the best performer during the model training phase.
This study noted that overfitting is a major concern and recommended that studies with limited data use ML algorithms other than ANN, which requires extensive data. With limited data, an ANN can overfit, leading to unsatisfactory results and poor generalization and performance on unseen data, among other issues.
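
For readers less familiar with the evaluation metrics named in the abstract, the following is a minimal sketch of how MAE, MSE, RMSE, and R2 are commonly computed from observed and predicted values. It is not the author's code: the function name `regression_metrics` and the sample yield values are hypothetical and chosen only for illustration.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R2 for paired observed/predicted values."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error
    rmse = math.sqrt(mse)                   # same units as the target

    # R2 compares the residual sum of squares with the spread of the
    # observations around their own mean.
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


# Made-up yields (not data from the study), used only to exercise the function.
observed = [4.0, 5.0, 6.0, 5.0]
predicted = [4.5, 5.0, 5.5, 5.0]
scores = regression_metrics(observed, predicted)
print(scores)
```

Lower MAE, MSE, and RMSE and a higher R2 indicate a closer fit. Computing the same scores on both the training data and held-out data is one common way to spot the kind of overfitting the abstract reports for the ANN model.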

KEYWORDS
Crop Modeling, Environmental and Agronomic Data, Machine Learning Algorithms, Philippines

Cite This Paper in IEEE or APA Citation Styles
(a). IEEE Format:
[1] Maria Cristina V. David, "Rice Yield Modeling Using Machine Learning Algorithms Based on Environmental and Agronomic Data of Pampanga River Basin, Philippines," Universal Journal of Agricultural Research, Vol. 11, No. 5, pp. 836 - 848, 2023. DOI: 10.13189/ujar.2023.110509.

(b). APA Format:
Maria Cristina V. David (2023). Rice Yield Modeling Using Machine Learning Algorithms Based on Environmental and Agronomic Data of Pampanga River Basin, Philippines. Universal Journal of Agricultural Research, 11(5), 836 - 848. DOI: 10.13189/ujar.2023.110509.