Spatial Data Makes AI Crop-Yield Predictions Better
This article is part of our exclusive IEEE Journal Watch series in partnership with IEEE Xplore.
Researchers from Zhejiang University and risk-management company Tongdun Technology, both based in Hangzhou, China, have improved crop-yield predictions using deep-learning techniques. It's a promising method that can account for the way crop yield is affected by the location of farmland, and can help produce more accurate predictions for farmers and policymakers.
Predicting crop yield is an important part of agriculture that has historically consisted of tracking factors like weather and soil conditions. Making accurate predictions gives farmers an edge when making financial decisions for their businesses and helps governments avoid catastrophes like famine. Climate change and increasing food production have made accurate predictions more important than ever as there's less room for error. Climate change is increasing the risk of low crop yields in multiple regions, which could cause a global crisis.
Many of the variables used to predict crop yield, like the climate, soil quality, and crop-management methods, are still the same, but modeling techniques have become more sophisticated in recent years. Deep-learning techniques can calculate not only how variables like precipitation and temperature affect crop yield, but also how they affect each other. The benefits of increased rain, for example, can be canceled out by extremely hot temperatures. Because of these interactions, modeling the variables together can lead to different results than looking at each variable independently.
In their study, the researchers used a recurrent neural network, which is a deep-learning tool that tracks the relationships of different variables through time, to help "capture complex temporal dependencies" affecting crop yield. Variables relating to crop yield that are affected by time include temperature, sunlight, and precipitation, said Chao Wu, a researcher at Zhejiang University and one of the paper's authors. Wu said these factors "change over time, interact with each other in complex ways, and their impact on crop yield is usually cumulative."
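To give a sense of what "tracking variables through time" means, here is a minimal sketch of a single recurrent unit. It is not the researchers' model: the function name `run_rnn`, the weights, and the toy weekly readings are all assumptions chosen for illustration. The point is only that the unit's state depends on the order of its inputs, not just their total.

```python
import math

def run_rnn(sequence, w_in, w_rec):
    """Feed readings one time step at a time through a single recurrent unit.

    The hidden state h carries a fading memory of earlier steps, so the
    result depends on when each reading occurred, not only on its size.
    """
    h = 0.0
    for x in sequence:
        h = math.tanh(w_in * x + w_rec * h)
    return h

# Hypothetical weights and readings (say, a heat spike early vs. late in the season).
W_IN, W_REC = 1.0, 0.5
early_spike = run_rnn([1.0, 0.0, 0.0], W_IN, W_REC)
late_spike = run_rnn([0.0, 0.0, 1.0], W_IN, W_REC)

print(f"early spike -> {early_spike:.3f}, late spike -> {late_spike:.3f}")
```

Both sequences contain the same readings, yet the final states differ because the early spike has partly faded by the last step. A trained network would learn its weights from data rather than use fixed ones.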
This tool is also able to infer the effect of variables that are difficult to quantify, such as steady improvements in breeding and agricultural cultivation techniques, Wu said. As a result, their model benefited from capturing larger trends that stretched beyond a single year.
The researchers also wanted to incorporate spatial information, such as the proximity between two regions of farmland, to help determine whether their crop yields are likely to be similar. To do so, they combined their recurrent neural network with a graph neural network representing geographic distance to determine how predictions for particular locations would be affected by the area around them. In other words, the researchers could include information about adjacent regions for each area of farmland, and help the model learn from relationships across time and space.
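The idea of letting each region borrow information from its neighbors can be sketched in a few lines. The example below is an illustration under stated assumptions, not the architecture from the paper: the region names, coordinates, the 100-kilometer cutoff, and the plain averaging step are all hypothetical, and a real graph neural network would apply learned weights rather than a simple mean.

```python
import math

# Hypothetical farmland regions: name -> (x_km, y_km) on a flat grid.
REGIONS = {
    "A": (0.0, 0.0),
    "B": (30.0, 40.0),     # 50 km from A
    "C": (300.0, 400.0),   # 500 km from A
}

def build_neighbors(coords, max_km):
    """Link two regions when their straight-line distance is within max_km."""
    neighbors = {name: [] for name in coords}
    names = sorted(coords)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if math.dist(coords[a], coords[b]) <= max_km:
                neighbors[a].append(b)
                neighbors[b].append(a)
    return neighbors

def aggregate(features, neighbors):
    """One message-passing step: average a region's features with its neighbors'."""
    out = {}
    for name, own in features.items():
        group = [own] + [features[n] for n in neighbors[name]]
        out[name] = [sum(col) / len(group) for col in zip(*group)]
    return out

neighbors = build_neighbors(REGIONS, max_km=100.0)
# Hypothetical per-region feature vectors (for example, two climate summaries).
features = {"A": [1.0, 10.0], "B": [3.0, 20.0], "C": [9.0, 90.0]}
smoothed = aggregate(features, neighbors)
print(smoothed)
```

Regions A and B are close enough to be linked, so each ends up with a blend of both feature vectors, while the distant region C keeps its own values. In a combined model, a step like this would feed spatially informed features into the time-series component.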
The researchers tested their new method on U.S. soybean yield data published by the National Agricultural Statistics Service. They input climate data including precipitation, sunlight, and vapor pressure; soil data like electrical conductivity, acidity, and soil composition; and management data like the percentage of fields planted. The model was trained on soybean yield data between 1980 and 2013, and tested using data from 2015 to 2017. The proposed method performed significantly better than models trained using non-deep-learning methods, and better than other deep-learning models that did not take spatial relationships into account.
In their future work, the researchers want to make the training data more dynamic and add security features to the model-training process. Currently, the model is trained on data that has been aggregated, which doesn't allow the possibility of keeping proprietary data private. This could be a problem if data like crop yields and farm-management practices is seen by competitors and used to gain an unfair advantage in the marketplace, Wu said. Agricultural data like farm location and crop yields could also make farmers vulnerable as targets of scams and theft. The possibility of data disclosure could also deter participation, decreasing the amount of data available to train on and negatively affecting the accuracy of trained models.
Researchers hope to use a federated learning approach to train future crop-yield models, which would allow the training to update a global model while keeping different sources of data isolated from one another.
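The article does not say which federated-learning scheme the researchers plan to use. As one common illustration, the sketch below uses federated averaging on a deliberately tiny model with a single weight. The function names, the learning rate, and the two "farm" datasets are assumptions made up for this example.

```python
def local_update(w, data, lr=0.01, steps=20):
    """Run a few gradient-descent steps on one client's private data.

    Model: y is approximated by w * x; loss: mean squared error.
    """
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def federated_round(global_w, clients):
    """Each client trains locally; the server sees only weights and sample counts."""
    total = sum(len(data) for data in clients)
    updates = [(local_update(global_w, data), len(data)) for data in clients]
    # Weighted average of client weights, proportional to local sample counts.
    return sum(w * n for w, n in updates) / total

# Hypothetical private datasets held by two parties; both follow y = 2x.
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0), (4.0, 8.0), (5.0, 10.0)],
]

w = 0.0
for _ in range(10):
    w = federated_round(w, clients)

print(f"global weight after 10 rounds: {w:.4f}")
```

The raw (x, y) pairs never leave their owners in this setup. Only the locally updated weight and a sample count are shared, which is the isolation property described above. Production systems typically add further protections, which this sketch does not attempt.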
The researchers presented their findings at the 26th International Conference on Computer Supported Cooperative Work in Design, held from 24 to 26 May in Rio de Janeiro.