Machine Learning in GIS
WHAT IS GIS?
USES OF MACHINE LEARNING IN GIS:
MACHINE LEARNING APPROACHES IN GIS:
WHAT HAS CHANGED IN THE RECENT YEARS?
SOME APPLICATIONS OF MACHINE LEARNING IN GEOSPATIAL TECHNOLOGIES:
CONCLUSION:
REFERENCES:
Figure 1 Support Vector Machines
Figure 2 K-means clustering
Figure 3 Semi-variogram built using Empirical Bayesian Networks
WHAT IS GIS?
A Geographic Information System (GIS) is a framework used to collect, classify, and analyze
data according to its location in space and time. What all of these data points have in common
is that each ultimately refers to a physical location on Earth. With the advent of location
services such as GPS, GIS has become an integral part of almost every service we use. With GIS,
organizations around the world can plan their operations around the demographics they serve
and make the most of the resources at hand. GIS links otherwise unrelated data sets through
their shared geography, benefiting almost every sector from agriculture to space technology.
Support Vector Machines:- Support vector machines are supervised machine learning
models. In their basic form they are binary classifiers: they predict which of two classes
a given set of input values belongs to. They are mainly used for classification or
regression, and they can also produce probability estimates in conjunction with methods
such as Platt scaling. In GIS they are most commonly used for satellite imagery
classification. For example, in satellite imagery of a particular location it is very
difficult to distinguish between features from their shape alone, without elevation
details. Trees and grasslands both appear green, and concrete structures such as
buildings and roads look alike. This is where support vector machines come into play.
They work by placing a boundary, the hyperplane, in the feature space so that it
separates the classes and the distance between the hyperplane and the nearest data point
of each class (the margin) is maximized. Since they are supervised machine learning
models, they must be trained with labeled data sets of surface features such as trees,
grasslands, concrete buildings, etc. With more and better labeled examples, these models
classify structures more accurately and can greatly improve satellite imagery
interpretation.
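The margin idea described above can be sketched in a few lines of pure Python. This is only an illustration under stated assumptions: the two per-pixel features, the sample values, and all function names are hypothetical, and the training loop is a simple sub-gradient descent on the regularised hinge loss (a common way to approximate a maximum-margin linear boundary), not the solver a production GIS package would use.

```python
# Minimal linear SVM sketch (pure Python). All data and names are hypothetical.

def decision_value(w, b, x):
    """Signed score of x; its sign says which side of the hyperplane x lies on."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(samples, labels, lam=0.01, lr=0.1, epochs=200):
    """Sub-gradient descent on the L2-regularised hinge loss.

    labels must be +1 or -1. Returns the weight vector w and the bias b.
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            # Regularisation step: shrinking w favours a wider margin.
            w = [wj * (1.0 - lr * lam) for wj in w]
            # Hinge-loss step: only points inside the margin move the boundary.
            if y * decision_value(w, b, x) < 1.0:
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if decision_value(w, b, x) >= 0 else -1


# Assumed per-pixel features: (near-infrared reflectance, texture score).
# +1 = vegetation, -1 = built-up surface. Labels are required because
# SVMs are supervised models.
pixels = [(0.80, 0.20), (0.20, 0.80), (0.90, 0.30), (0.10, 0.70),
          (0.70, 0.10), (0.30, 0.90), (0.85, 0.25), (0.15, 0.75)]
classes = [1, -1, 1, -1, 1, -1, 1, -1]

w, b = train_linear_svm(pixels, classes)
print(predict(w, b, (0.75, 0.15)))  # a vegetation-like pixel
print(predict(w, b, (0.25, 0.85)))  # a built-up-like pixel
```

Real imagery would have many more bands and classes, and classes that are not linearly separable would need a kernel or a different feature set.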
K-means clustering:- K-means clustering is an unsupervised machine learning algorithm
that takes a large amount of unlabeled data as its training set and groups it into
clusters without being given any expected outcomes. K is the number of target clusters
required for the final segmentation. The algorithm divides the data into K clusters so
that the sum of squared distances between each point and the center of its cluster is
minimized. When dealing with satellite imagery, k-means clustering groups objects or
features according to the similarities between them. These similarities may be
characteristics of the topography or the number of similar structures such as concrete
buildings, and the approach can be extended to cluster locations by other factors such
as the number of schools in an area, traffic volume, or crime rate.
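As a rough sketch of the procedure described above, the following pure-Python version alternates between assigning points to the nearest cluster center and recomputing the centers (Lloyd's algorithm). The coordinates are made-up incident locations, the function names are illustrative, and the evenly spaced starting centers are a simplification chosen to keep the example deterministic.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    # Deterministic start: take k points spaced evenly through the input list.
    # Real applications usually prefer random or k-means++ initialisation.
    stride = len(points) // k
    centroids = [points[i * stride] for i in range(k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster with the nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


def within_cluster_ss(points, labels, centroids):
    """Sum of squared distances from each point to its cluster centroid."""
    return sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))


# Hypothetical incident locations (x, y) in kilometres; no labels are supplied.
incidents = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
             (5.0, 5.0), (5.2, 4.9), (4.8, 5.3),
             (9.0, 1.0), (9.1, 1.2), (8.8, 0.9)]

labels, centroids = kmeans(incidents, k=3)
print(labels)
print(within_cluster_ss(incidents, labels, centroids))
```

The within-cluster sum of squares is the quantity the algorithm tries to reduce; comparing it across different values of K is one common way to choose K.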
Empirical Bayesian Networks:- Bayesian networks are probabilistic models that describe
how likely an event is to occur and what its potential outcomes are. They represent a set
of random variables and their conditional dependencies as a directed acyclic graph. They
are most commonly used to identify the most probable cause of an observed event. In
combination with kriging interpolation, Bayesian networks not only describe specific
activities or natural phenomena at a particular geo-location, but also make predictions
at unsampled locations by fitting a mathematical function known as a semi-variogram,
which describes how the difference between measurements grows with the distance between
them. This approach has proved useful in predicting natural phenomena such as rainfall
in a particular place or heat waves across a particular region: existing training data
sets are used to determine how the values vary at nearby locations, and the models are
then cross-validated.
Figure 3 Semi-variogram built using Empirical Bayesian Networks
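To make the semi-variogram more concrete, here is a minimal sketch of the classical empirical estimator: for each distance band (lag), take half the average squared difference between all pairs of measurements separated by that distance. The station coordinates and rainfall values are invented for illustration, and the function name is an assumption. The Bayesian procedure mentioned above is not reproduced here; this shows only the basic quantity that kriging methods fit a model to.

```python
import math
from collections import defaultdict


def empirical_semivariogram(coords, values, lag_width):
    """Classical (Matheron) estimator, binned by separation distance.

    gamma(h) = sum of (z_i - z_j)^2 over the pairs in lag bin h, divided by
    2 * N(h), where N(h) is the number of pairs in that bin.
    Returns {bin_index: gamma}; bin k covers distances in
    [k * lag_width, (k + 1) * lag_width).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])  # Euclidean separation
            k = int(d // lag_width)              # index of the lag bin
            sums[k] += (values[i] - values[j]) ** 2
            counts[k] += 1
    return {k: sums[k] / (2 * counts[k]) for k in sorted(sums)}


# Hypothetical rain gauges along a 3 km transect, with rainfall in millimetres.
stations = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
rainfall_mm = [10.0, 12.0, 11.0, 15.0]

gamma = empirical_semivariogram(stations, rainfall_mm, lag_width=1.0)
for lag_bin, value in gamma.items():
    print(lag_bin, value)
```

With so few stations the estimate is noisy; in practice a smooth model curve is fitted to these binned values and then used to weight nearby observations when predicting at unsampled locations.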
WHAT HAS CHANGED IN THE RECENT YEARS?
Data Collection Abilities:- The world around us is becoming more data driven every day.
There are now countless ways to collect data, and with advances in technology and the
availability of sensors, drones, and satellite imagery, it has become much easier to
collect detailed data, including geospatial data.
Computing Abilities:- Our computing abilities have grown many times over, and enormous
amounts of data can now be processed in a single run. The cost of computing has also
fallen considerably, with the floating point operations per second increasing by an
order of magnitude of over 7.5 every year.
Algorithmic Improvements:- With the incorporation of artificial intelligence and machine
learning, learning algorithms have improved considerably over the last few decades. Fed
with enormous training datasets, both labeled and unlabeled, they are becoming faster,
more reliable, more accurate, and more self-sufficient.
CONCLUSION:
By using Geographic Information Systems in combination with machine learning algorithms,
organizations around the world can plan their operations around the demographics they serve
and make the most of the resources at hand. GIS links otherwise unrelated data sets through
their shared geography, benefiting almost every sector from agriculture to space technology.