Detecting Solar Panels From Satellite Imagery

Contributors to this project include Viggy Kumaresan, Azucena Morales, Yifei Wang, and Sicong Zhao.

Why Bother With Solar Panel Detection?

Solar power currently accounts for 1% of the world’s electricity generation, but estimates of solar energy production predict a potential 65-fold growth by 2050, which would make solar power one of the largest sources of energy across the globe [1]. Solar photovoltaic (solar PV) power installed on rooftops is estimated to make up 30% of this energy generation. In recent years, solar PV power has already begun playing an increasingly large role in US electricity generation. From 2008 through 2017, annual solar generation grew 39-fold, an increase of 75,123 GWh [2].

As consumer solar PV adoption increases, so does interest in solar consumption habits. Policymakers, for example, rely on accurate measures of solar PV saturation to inform the structuring of tax incentive programs [3, 4]. Solar manufacturers also rely on detailed solar PV market insights to acquire new customers and customize sales strategies [5, 6].

Traditional data sources, such as consumer surveys and market research, are costly and time-consuming to collect, and can often give only a partial or biased view of the market. Satellite imagery, on the other hand, gives us overhead views of households all over the country and does not rely on self-reported data. The use of aerial imagery for solar PV detection may result in more consistent and cost-effective assessment of solar adoption. Research exploring the use of machine learning for satellite image classification has shown high rates of detection accuracy with a low rate of incorrect classification [7, 8]. Specifically, random forest classification and CNN approaches have been shown to accurately measure the size, shape, and capacity of solar PV from satellite imagery [9, 10, 11].


Images Containing Solar PV
Solar PV instances varied in size and color, but shared similar shapes.
Images Not Containing Solar PV
Surrounding landscape was similar across both images with and without solar PV.

Data Sources

The primary dataset contains 1,500 satellite images in TIFF format, with each image consisting of 101 x 101 pixels and three color channels (red, green, blue). Each image is therefore represented as a 101 x 101 x 3 array, for a total of 30,603 numerical entries with values ranging from 0 to 255.
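To make the array representation concrete, here is a minimal sketch in Python with NumPy. It uses a randomly generated array as a stand-in for one decoded tile, since the project's file-loading code is not shown here; the variable names are illustrative only.

```python
import numpy as np

HEIGHT, WIDTH, CHANNELS = 101, 101, 3

# Stand-in for one decoded TIFF tile: 8-bit RGB values in [0, 255].
# (Assumption: real tiles would be read from disk with an image library.)
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(HEIGHT, WIDTH, CHANNELS), dtype=np.uint8)

# Flatten to a single feature vector, the form that pixel-based models
# such as logistic regression or KNN would typically consume.
features = image.reshape(-1)

print(image.shape)    # (101, 101, 3)
print(features.size)  # 30603
```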

Each image in the dataset has been labeled (by humans) as one of two classes: either containing a solar PV or not containing a solar PV. There are 555 images that contain a solar PV and 995 images without a solar PV, which means that the classes are not entirely balanced. An additional 558 labeled images are housed on Kaggle.com, with an unknown number of images in each class. These images were used as a test set during the model selection process.
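As a rough illustration of what the imbalance means in practice, the sketch below takes the two per-class counts at face value and computes each class's share, along with the accuracy a model would get by always guessing the larger class. The inverse-frequency class weights at the end are a common remedy shown purely as an assumption; nothing here says they were used in this project.

```python
# Per-class counts as stated in the text above.
counts = {"solar_pv": 555, "no_solar_pv": 995}
total = sum(counts.values())

# Share of each class in the labeled data.
shares = {label: n / total for label, n in counts.items()}

# Accuracy of a trivial model that always predicts the larger class.
majority_baseline = max(shares.values())

# Hypothetical inverse-frequency weights: the rarer class gets a larger weight.
class_weights = {label: total / (len(counts) * n) for label, n in counts.items()}

for label in counts:
    print(label, round(shares[label], 3), round(class_weights[label], 3))
print("majority-class baseline accuracy:", round(majority_baseline, 3))
```

Any reported accuracy is more informative when read against this kind of majority-class baseline.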


Methods

Additional Image Features
Example effect of relative luminance and gradient channels. The additional features are intended to help distinguish images with and without solar PV by making the differences between classes more salient.

Data Preprocessing

During the first step of preprocessing, features were added based on two properties we know to be true of solar panels: they absorb light, and they are angular in shape. To capture these properties, gradient and luminance features were added to each image [16].
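One way such channels could be computed is sketched below. The exact formulas used in the project are not given here, so several details are assumptions: the luminance weights are the standard coefficients for sRGB primaries, gamma linearization is skipped for brevity, the gradient is a central-difference magnitude taken on the luminance map, and the function name is hypothetical.

```python
import numpy as np

def add_luminance_and_gradient(image):
    """Return an (H, W, 5) float array: R, G, B, luminance, gradient magnitude."""
    rgb = image.astype(np.float64) / 255.0
    # Assumed weights for sRGB primaries; gamma linearization is omitted here.
    luminance = rgb @ np.array([0.2126, 0.7152, 0.0722])
    # Central-difference gradient along rows and columns of the luminance map.
    d_row, d_col = np.gradient(luminance)
    gradient = np.hypot(d_row, d_col)
    return np.dstack([rgb, luminance, gradient])

# Toy tile: a dark, sharp-edged rectangle (standing in for a panel)
# on a bright, uniform background.
tile = np.full((101, 101, 3), 200, dtype=np.uint8)
tile[40:60, 30:70] = 20
stacked = add_luminance_and_gradient(tile)

print(stacked.shape)  # (101, 101, 5)
```

In this toy tile the luminance channel is low inside the rectangle, and the gradient channel is nonzero only along its edges, which is the kind of contrast the extra features are meant to expose.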

During the second step of preprocessing, random modifications were applied to images, including horizontal flips, image shears, and zooms. By increasing the diversity of images in the training set, we can ideally reduce overfitting and ultimately increase out-of-sample accuracy [13, 14, 15].
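The three kinds of modification can be sketched with plain NumPy as follows. This is an illustration under stated assumptions rather than the project's pipeline: the function names, the nearest-neighbor zoom, the wrap-around shear, and the parameter ranges in `random_augment` are all hypothetical choices.

```python
import numpy as np

def horizontal_flip(image):
    """Mirror the image left to right."""
    return image[:, ::-1]

def center_zoom(image, factor):
    """Zoom in by `factor` (>= 1): crop the center, resize by nearest neighbor."""
    height, width = image.shape[:2]
    crop_h, crop_w = int(round(height / factor)), int(round(width / factor))
    top, left = (height - crop_h) // 2, (width - crop_w) // 2
    rows = top + np.arange(height) * crop_h // height
    cols = left + np.arange(width) * crop_w // width
    return image[rows[:, None], cols]

def horizontal_shear(image, max_shift):
    """Shift rows sideways by an amount growing from top to bottom (edges wrap)."""
    shifts = np.linspace(0, max_shift, image.shape[0]).round().astype(int)
    return np.stack([np.roll(row, s, axis=0) for row, s in zip(image, shifts)])

def random_augment(image, rng):
    """Apply a random flip, zoom, and shear; the ranges are illustrative."""
    if rng.random() < 0.5:
        image = horizontal_flip(image)
    image = center_zoom(image, factor=rng.uniform(1.0, 1.2))
    return horizontal_shear(image, max_shift=int(rng.integers(0, 6)))

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(101, 101, 3), dtype=np.uint8)
augmented = random_augment(img, rng)
print(augmented.shape)  # (101, 101, 3)
```

Each call produces a slightly different variant of the same tile while keeping its shape, so the label stays valid and the model sees more varied examples.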

Convolutional Neural Networks (CNN)

CNNs are a type of neural network for processing data that has a grid-like topology. They are widely used in computer vision and have been successful at tasks such as classifying images, clustering images, and object recognition [17]. At their core, CNNs use the convolution operation instead of matrix multiplication in at least one of their layers. Like other neural networks, they are composed of a sequence of layers. The layers of a CNN have neurons arranged in three dimensions: width, height, and depth. There are many different CNN architectures, but in essence, CNNs handle pixels in the context of their surroundings, making them an ideal choice for image recognition [18, 19].
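To show what the convolution operation does to a neighborhood of pixels, here is a minimal single-channel sketch in NumPy. It is written with explicit loops for readability, not speed, and it computes cross-correlation (no kernel flip), which is what deep learning libraries conventionally call convolution. The function name and the hand-picked edge kernel are illustrative; in a trained CNN the kernel values are learned.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over a 2-D `image` without padding, summing the products."""
    k_h, k_w = kernel.shape
    out_h = image.shape[0] - k_h + 1
    out_w = image.shape[1] - k_w + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Each output value depends on a whole k_h x k_w neighborhood.
            out[i, j] = np.sum(image[i:i + k_h, j:j + k_w] * kernel)
    return out

# Toy image: dark left half, bright right half (a vertical edge).
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hand-picked kernel that responds to left-to-right brightness increases.
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)

response = conv2d_valid(image, kernel)
print(response[0])  # [0. 3. 3. 0.]
```

The response is zero over the flat regions and large where the window straddles the edge, which is the sense in which a CNN reads pixels in the context of their surroundings.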

Results

Compared with a variety of modeling approaches, including logistic regression, k-nearest neighbors (KNN), Gaussian naive Bayes, and random forest, a CNN approach proved to be the strongest, with a validation accuracy of 0.957. Moreover, the addition of a gradient feature increased the accuracy of many modeling techniques, while the luminance feature did not.

Model                   Metric                Validation Set Performance
Chance                  AUC                   0.500
KNN                     AUC                   0.639
Gaussian Naive Bayes    AUC                   0.643
Logistic Regression     AUC                   0.785
Random Forest           AUC                   0.824
CNN                     Validation Accuracy   0.957
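For readers unfamiliar with the AUC metric in the table, the sketch below computes it from its pairwise definition: the probability that a randomly chosen positive image receives a higher score than a randomly chosen negative one, with ties counted as half. The function name and toy data are illustrative and are not the project's evaluation code. Note that the CNN row reports accuracy rather than AUC, so it is not on the same scale as the other rows.

```python
import numpy as np

def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    # Compare every positive score with every negative score.
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (pos.size * neg.size))

labels = [0, 0, 1, 1]
print(roc_auc(labels, [0.1, 0.4, 0.35, 0.8]))  # 0.75
print(roc_auc(labels, [0.1, 0.2, 0.8, 0.9]))   # 1.0 (perfect ranking)
print(roc_auc(labels, [0.5, 0.5, 0.5, 0.5]))   # 0.5 (uninformative scores)
```

The last case shows why 0.500 is listed as the chance level: a model whose scores carry no information about the class ranks positives above negatives half the time.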

Final Thoughts

A CNN approach produced the highest detection accuracy. Instead of treating each pixel as an individual feature, a CNN looks at pixels in context. This approach is well suited to image recognition because pixels only have meaning in relation to surrounding pixels. CNNs and similar neural network approaches appear to be a feasible and strong choice for similar image detection analyses. That said, adding features such as image gradients can improve performance even when using simple models such as logistic regression.

Next Steps

Future research should also examine the use of supervised learning techniques to not only identify the existence or absence of solar PV, but also to estimate the energy consumption associated with each installation. This could be achieved by integrating other disparate data sources, such as geographic data and energy usage patterns. Ultimately, increasingly accurate information around both consumer solar adoption and solar energy consumption would aid policymakers in the structuring of new green policies and incentive programs.


References

[1] Energy Transition Outlook 2018. DNV GL.

[2] "Renewables on the Rise 2018 - Environment America."

[3] Matasci, S. "Solar tax credit". EnergySage. (6 Jan. 2019)

[4] Solangi, K. H., et al. "A review on global solar energy policy." Renewable and sustainable energy reviews 15.4 (2011): 2149-2163.

[5] Janes, M., 2014. "Predictive models help determine which consumers buy solar equipment and why". Phys.org. (6 Jan. 2019)

[6] Rai, Varun, D. Cale Reeves, and Robert Margolis. "Overcoming barriers and uncertainties in the adoption of residential solar PV." Renewable Energy 89 (2016): 498-505.

[7] Mnih, Volodymyr, and Geoffrey E. Hinton. "Learning to detect roads in high-resolution aerial images." European Conference on Computer Vision. Springer, Berlin, Heidelberg, 2010.

[8] Kubat, Miroslav, Robert C. Holte, and Stan Matwin. "Machine learning for the detection of oil spills in satellite radar images." Machine learning 30.2-3 (1998): 195-215.

[9] Malof, Jordan M., et al. "Automatic detection of solar photovoltaic arrays in high resolution aerial imagery." Applied energy 183 (2016): 229-240.

[10] Malof, Jordan M., Leslie M. Collins, and Kyle Bradbury. "A deep convolutional neural network, with pre-training, for solar photovoltaic array detection in aerial imagery." 2017 IEEE International Geoscience and Remote Sensing Symposium (IGARSS). IEEE, 2017.

[11] Golovko, Vladimir, et al. "Convolutional neural network based solar photovoltaic panel detection in satellite photos." 2017 9th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS). Vol. 1. IEEE, 2017.

[13] Chollet, F. "Building powerful image classification models using very little data". The Keras Blog. (June 5, 2016).

[14] Venkatesh, T. “Simple Image Classification using Convolutional Neural Network — Deep Learning in python”

[15] Lewinson, E. “Mario vs. Wario: Image Classification in Python.” (24 Jul 2018).

[16] Stokes, M., Anderson, M., Chandrasekar, S., & Motta, R. “A Standard Default Color Space for the Internet.” (1996, November 5).

[17] Karpathy, A. “CS231n Convolutional Neural Networks for Visual Recognition.” (2018).

[18] Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. "Deep Learning." MIT Press, 2016.

[19] Skymind “A Beginner's Guide to Convolutional Neural Networks (CNNs).”

[20] Prabhu, R. "Understanding of Convolutional Neural Network (CNN)." (3 Mar. 2018).