Big Data and Geospatial Analysis
Perhaps one of the most hotly debated topics in recent years has been the question of "GIS and Big Data". Much of the discussion has focused on the data themselves: huge volumes of 2D and 3D spatial and spatio-temporal data are now being collected and stored, so how can they be accessed, and how can massive datasets be mapped and interpreted effectively? Less attention has been paid to questions regarding the analysis of Big Data, although this has risen up the agenda in recent times.

Examples include the use of density analysis to represent map request events, with Esri demonstrating that (given sufficient resources) large numbers of point events can be processed and analyzed using kernel density techniques within a very short timeframe (under a minute); data filtering, to extract subsets of data that are of particular interest; and data mining, which is broader than simple filtering. For real-time data, sequential analysis has also been applied successfully: the data are received as a stream and are used to build up a dynamic map, or to cumulatively generate statistical values that may be mapped and/or used to trigger events or alarms (both the density estimation and the streaming statistics are sketched in the examples below). To this extent the analysis resembles that conducted on smaller datasets, but with data and processing architectures specifically designed to cope with the volumes involved, and with a focus on data exploration as a key mechanism for discovery.

Miller and Goodchild (2014) have argued that considerable care is required when working with Big Data, since significant issues arise from each of the "four Vs of Big Data": the sheer Volume of data; the Velocity of data arrival; the Variety of forms of data and their origins; and the Veracity of such data. As such, geospatial research has had to adapt to harness new forms of data so that they validly represent real-world phenomena.
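To make the density-analysis example concrete, the following minimal sketch (in Python, using numpy and scipy's gaussian_kde) estimates a kernel density surface over a set of point events. The event coordinates here are synthetic stand-ins rather than Esri's map-request data, and scipy's Gaussian kernel is just one of several possible kernel choices:

import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(42)

# Hypothetical event locations: (x, y) coordinates of 10,000 point events.
events = rng.normal(loc=0.0, scale=1.0, size=(10_000, 2))

# gaussian_kde expects an array of shape (n_dims, n_points).
kde = gaussian_kde(events.T)

# Evaluate the density on a regular 100 x 100 grid covering the events.
xs = np.linspace(events[:, 0].min(), events[:, 0].max(), 100)
ys = np.linspace(events[:, 1].min(), events[:, 1].max(), 100)
xx, yy = np.meshgrid(xs, ys)
density = kde(np.vstack([xx.ravel(), yy.ravel()])).reshape(xx.shape)

# 'density' is a raster-like surface that could be rendered as a heat map.
print(density.shape, float(density.max()))

At Big Data scales the same computation would be distributed or approximated (for example by binning the events onto a grid first), but the underlying density estimate is the same.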
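The sequential, stream-based analysis described above can likewise be sketched with a small example: Welford's online algorithm maintains a running mean and variance one observation at a time, and a simple threshold rule triggers an alarm. The simulated stream and the 3-sigma rule are illustrative assumptions only, not a specific product's method:

import math
import random

class RunningStats:
    """Incrementally updated mean and variance (Welford's algorithm)."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the current mean

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def std(self) -> float:
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0

stats = RunningStats()
random.seed(1)
for t in range(10_000):
    value = random.gauss(10.0, 2.0)  # stand-in for one streamed reading
    stats.update(value)
    # After a warm-up period, flag readings more than 3 sigma from the mean.
    if stats.n > 100 and abs(value - stats.mean) > 3 * stats.std:
        print(f"t={t}: alarm, value {value:.2f} deviates from mean {stats.mean:.2f}")

Because each update takes constant time and memory, the same statistics can be maintained over an unbounded stream, which is what makes this style of analysis suited to real-time Big Data feeds.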