Research Progress


Improving potato leaf chlorophyll content prediction using a machine learning model with a hybrid dataset

Date: 2025-05-26

【Paper title】Improving potato leaf chlorophyll content prediction using a machine learning model with a hybrid dataset

【Authors】Haibo Yang (杨海波), Yuncai Hu (胡云才), Hang Yin (尹航), Qingyu Jin (金庆宇), Fei Li (李斐), Kang Yu (于康)

【Abstract】

Combining proximal remote sensing and machine learning (ML) has become a common approach to monitoring leaf chlorophyll content (LCC) for crop stress, productivity assessment, and nutrient management. However, the robustness of ML models is constrained by the limited number of in-situ training samples, because sample analysis is time-consuming and labour-intensive. To address this limitation in monitoring potato LCC, this study calibrated ML models on hybrid datasets that combined a limited number of in-situ measured samples with sets of PROSAIL-simulated samples of different sizes. The calibrated models were then evaluated on independent field-measured data. During LCC sampling, canopy reflectance data (400–950 nm) were collected using a passive bi-directional spectrometer and an unmanned aerial vehicle (UAV) carrying a hyperspectral sensor. Five types of ML models were trained for LCC prediction: partial least squares regression (PLSR), Gaussian process regression (GPR), random forest (RF), gradient boosting machines (GBM), and blending. The scalability of the best ML models was evaluated using hyperspectral data extracted from UAV images. The results indicated that ML models trained on the hybrid dataset outperformed those trained on only the limited in-situ measured dataset or only the PROSAIL-simulated dataset when predicting the LCC of different potato cultivars. Nevertheless, when the number of in-situ measured samples was limited, the number of simulated samples in the hybrid dataset influenced the prediction accuracy and robustness of the ML model. The RF model generalized best for both the handheld passive spectrometer data (R² = 0.67, RPD = 1.55, and RMSE = 0.08 g m⁻²) and the UAV image data (R² = 0.88, RPD = 1.97, and RMSE = 0.06 g m⁻²). Our results indicate the potential of integrating limited in-situ samples with simulated data to achieve accurate and robust estimation of potato LCC. This study offers a key solution for crop chlorophyll monitoring in scenarios with restricted data availability.
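
For readers unfamiliar with the accuracy measures quoted in the abstract (R², RPD, and RMSE), the following is a minimal sketch of how such measures are commonly computed. It assumes R² is the coefficient of determination and RPD is the sample standard deviation of the observations divided by the RMSE. The abstract does not state the exact definitions the authors used, so their values may follow a different convention. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the paper.

```python
import math
import statistics


def regression_metrics(observed, predicted):
    """Return R2, RMSE and RPD for paired observed/predicted values.

    Assumed definitions (not confirmed by the abstract):
      R2   = 1 - SS_res / SS_tot
      RMSE = sqrt(SS_res / n)
      RPD  = sample standard deviation of observations / RMSE
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")

    n = len(observed)
    mean_obs = statistics.fmean(observed)

    # Residual and total sums of squares.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R2 and RPD are undefined")

    rmse = math.sqrt(ss_res / n)
    r2 = 1.0 - ss_res / ss_tot
    # A perfect prediction has RMSE = 0, so RPD is reported as infinity.
    rpd = statistics.stdev(observed) / rmse if rmse > 0 else math.inf
    return {"r2": r2, "rmse": rmse, "rpd": rpd}


# Hypothetical LCC values in g m^-2, for illustration only (not data from the paper).
observed = [0.30, 0.42, 0.55, 0.61, 0.48]
predicted = [0.34, 0.40, 0.50, 0.66, 0.45]
metrics = regression_metrics(observed, predicted)
print({name: round(value, 3) for name, value in metrics.items()})
```

Higher R² and RPD and lower RMSE indicate better agreement between predicted and measured LCC. RMSE carries the unit of the measured quantity, which is why the abstract reports it in g m⁻².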

【Keywords】machine learning; remote monitoring; potatoes; chlorophyll content; generalization ability



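Copyright: College of Resources and Environment (资源与环境学院), Inner Mongolia Agricultural University (内蒙古农业大学)

Address: College of Resources and Environment, Inner Mongolia Agricultural University, No. 29 Ordos East Street, Saihan District, Hohhot, Inner Mongolia

Postal code: 010011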