At the beginning of November, my colleague Honza and I attended the Space Application Hackathon, where our team won the earth observation category. Specifically, we worked out two use cases: classification of agricultural land use and crop yield prediction. Here is our write-up from the event.
We had wanted to test our technical skills in a hackathon for some time, and as we are both interested in space technologies, this one seemed like an obvious choice. The hackathon was organized by the ESA Business Incubation Center as part of the Czech Space Year.
The event was hosted by IBM and took place at their headquarters in Prague. We arrived on Friday just in time for the welcome talk and the introduction to the challenges to be solved during the hackathon. The challenges covered satellite navigation, earth observation, general space technology and blockchain in space, each offering unique and interesting problems to tackle. After some initial hesitation, Honza and I decided to go for the earth observation challenge, even though working with satellite imagery was uncharted territory for both of us.
Now it was time to build the teams. We decided to team up with Tomáš Kapler, who offered to take care of the presentation and the business side of things while Honza and I worked on the engineering itself.
In our project, we focused on the use of satellite imagery in agriculture. More specifically, we worked out two use cases: classification of agricultural land use and crop yield prediction.
The goal of the first use case was to develop a model that assigns the correct crop to each piece of farmland in a satellite image. We based the model on the assumption that different types of crops have different spectral characteristics in the visible and near-infrared parts of the spectrum. The classification is pixel-wise, as we are not looking for any particular shapes in the images. We used the RGB channels and the normalized difference vegetation index (NDVI) as features of each pixel. The NDVI is defined as:
NDVI = (NIR - R) / (NIR + R)
Here, R and NIR are the spectral reflectances in the red and near-infrared spectral bands. The NDVI value correlates with the amount of living vegetation in the area: the higher the value, the denser the green vegetation.
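To make the formula concrete, here is a minimal sketch of how the NDVI could be computed per pixel with NumPy. This is not our hackathon code: the function name, the sample reflectance values and the choice to return 0 where NIR + R is zero are all assumptions made for illustration.

```python
import numpy as np


def ndvi(nir, red):
    """Per-pixel NDVI = (NIR - R) / (NIR + R) for two reflectance arrays."""
    nir = np.asarray(nir, dtype=np.float64)
    red = np.asarray(red, dtype=np.float64)
    total = nir + red
    out = np.zeros_like(total)
    # Divide only where the denominator is non-zero; other pixels stay 0
    # (an illustrative convention, not something the formula prescribes).
    np.divide(nir - red, total, out=out, where=total != 0)
    return out


# Made-up reflectances for a 2x2 image.
nir = np.array([[0.50, 0.30], [0.05, 0.0]])
red = np.array([[0.10, 0.30], [0.15, 0.0]])
index = ndvi(nir, red)
```

In this toy image, the pixel that reflects far more near-infrared than red light gets a high NDVI, which is what dense vegetation typically looks like, while the pixel with more red than near-infrared gets a negative value.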
We chose a gradient boosting classifier as our model and trained it by combining geolocated land use data with Landsat satellite imagery. The land use data came from the American Midwest and listed the types of crops cultivated in each part of the area of interest. We turned this data into training labels by assigning each pixel in our training images the crop type recorded for its location.
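The pixel-wise approach can be sketched roughly as follows, assuming scikit-learn's GradientBoostingClassifier. Everything in this example is synthetic and hypothetical: the two-class scene, the mean reflectances, the noise level and the hyperparameters are invented for illustration and are not the data or settings we used.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)


def synthetic_scene(height=40, width=40):
    """Fake scene: left half is class 0, right half is class 1."""
    labels = np.zeros((height, width), dtype=int)
    labels[:, width // 2:] = 1
    # Hypothetical mean reflectances (red, green, blue, NIR) per class.
    means = np.array([[0.08, 0.12, 0.06, 0.50],
                      [0.15, 0.14, 0.10, 0.35]])
    bands = means[labels] + rng.normal(0.0, 0.01, size=(height, width, 4))
    return bands, labels


def pixel_features(red, green, blue, nir):
    """Stack R, G, B and NDVI into one feature row per pixel."""
    ndvi = (nir - red) / (nir + red + 1e-12)  # small epsilon avoids 0/0
    return np.stack([red, green, blue, ndvi], axis=-1).reshape(-1, 4)


train_bands, train_labels = synthetic_scene()
test_bands, test_labels = synthetic_scene()

# Split the last axis into four band images and build per-pixel features.
X_train = pixel_features(*np.moveaxis(train_bands, -1, 0))
X_test = pixel_features(*np.moveaxis(test_bands, -1, 0))

model = GradientBoostingClassifier(n_estimators=50, random_state=0)
model.fit(X_train, train_labels.ravel())

# Predict a class for every pixel and fold the result back into image shape.
predicted = model.predict(X_test).reshape(test_labels.shape)
accuracy = (predicted == test_labels).mean()
```

Because every pixel is an independent sample, the model never sees field shapes or neighbouring pixels; it relies only on the spectral features of each pixel.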
The image below shows the model's performance. In this case, we tried to classify fields of soybeans and wheat in North Dakota.
In our second use case, we built a crop yield prediction model. The idea was to use NDVI data from early spring to estimate the seasonal crop yield per hectare of land. To achieve this, we took historical region-level crop yields in the Czech Republic from 2000 to 2012 and Landsat NDVI images from March and April of each year.
We added elevation as another feature and preprocessed the data using PCA (principal component analysis) to improve performance. The model itself was a regressor based on a non-linear SVM (support vector machine).
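A pipeline of this kind could look roughly like the sketch below, assuming scikit-learn. The data is randomly generated and the relationship between NDVI, elevation and yield is invented. The feature names, the standardization step, the number of principal components, the RBF kernel and the value of C are all assumptions made for illustration, not our actual configuration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
n_samples = 120  # hypothetical number of region-year pairs

# Synthetic features: mean NDVI in March and April, plus mean elevation.
ndvi_march = rng.uniform(0.2, 0.6, n_samples)
ndvi_april = ndvi_march + rng.uniform(0.0, 0.2, n_samples)
elevation_m = rng.uniform(150, 800, n_samples)
X = np.column_stack([ndvi_march, ndvi_april, elevation_m])

# Invented yield (tonnes per hectare), used only to have something to fit.
y = 2.0 + 6.0 * ndvi_april - 0.002 * elevation_m + rng.normal(0, 0.1, n_samples)

# Standardize (our assumption), reduce with PCA, then fit a non-linear SVM regressor.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=2),
    SVR(kernel="rbf", C=10.0),
)

model.fit(X[:90], y[:90])
predicted = model.predict(X[90:])
```

Wrapping the steps in a single pipeline means the scaling and the PCA projection are fitted on the training rows only and then reused unchanged for prediction.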
While Honza and I were hacking on the models, Tomáš developed business proposals for how such technology could be used to optimize food production as a whole.
We finished work on Saturday afternoon just in time for the final presentations in front of a panel of experts from IBM, ESA and other partners of the hackathon. After all was said and done, the panel announced our team as the winner of the earth observation challenge!
Honza H. (left) and Honza C. (right) after the result announcement
The hackathon was definitely a cool experience: we learned a lot about the current state of the space industry, and it was a great opportunity to practice our data science skills.