Space-Time Cluster Analysis
Data has both a spatial and a temporal context: everything happens someplace and occurs at some point in time. Several tools, including Hot Spot Analysis, Cluster and Outlier Analysis, and Grouping Analysis, allow you to usefully exploit those aspects of your data. When you consider both the spatial and the temporal context of your data, you can answer questions like the following:
- Where are the space-time crime hot spots? If you are a crime analyst, you might use the results from space-time Hot Spot Analysis to make sure that your police resources are allocated as effectively as possible. You want those resources to be in the right places at the right times.
- Where are the spending anomalies? In an effort to identify fraud, you might use Cluster and Outlier Analysis to scrutinize spending behaviors looking for outliers in space and time. A sudden change in spending patterns or frequency could suggest suspicious activity.
- What are the characteristics of bacteria outbreaks? Suppose you are studying salmonella samples taken from dairy farms in your state. To characterize individual outbreaks, you can run Grouping Analysis on your sample data, constraining group membership in both space and time. Samples close in time and space are most likely to be associated with the same outbreak.
Several tools in the Spatial Statistics toolbox work by assessing each feature within the context of its neighboring features. When neighbor relationships are defined in terms of both space and time, traditional spatial analyses become space-time analyses. To define neighbor relationships using both spatial and temporal parameters, use the Generate_Spatial_Weights_Matrix tool and select the SPACE_TIME_WINDOW option for the Conceptualization of Spatial Relationships parameter. Then specify both a Threshold Distance and a time interval (Date/Time Interval Type and Date/Time Interval Value). If, for example, you provide a distance of 1 kilometer and a time interval of 7 days, features found within 1 kilometer of each other that also have date/time stamps within 7 days of each other will be analyzed together. Conversely, features within 1 kilometer of each other whose date/time stamps are more than 7 days apart will not be considered neighbors.
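The space-time window rule described above can be sketched in plain Python. This is only an illustration of the neighbor test, not the tool's implementation: the function name, the feature layout (projected x and y in meters plus a time stamp), and the inclusive treatment of features exactly on the distance or time boundary are all assumptions.

```python
from datetime import datetime, timedelta
from math import hypot


def space_time_neighbors(features, target_id, max_dist_m=1000.0, max_days=7):
    """Return ids of features inside the target's space-time window.

    `features` maps an id to (x_m, y_m, timestamp); this layout is
    hypothetical and assumes projected coordinates in meters.
    """
    tx, ty, tt = features[target_id]
    window = timedelta(days=max_days)
    neighbors = []
    for fid, (x, y, t) in features.items():
        if fid == target_id:
            continue
        close_in_space = hypot(x - tx, y - ty) <= max_dist_m
        close_in_time = abs(t - tt) <= window  # window applies before and after
        if close_in_space and close_in_time:
            neighbors.append(fid)
    return neighbors


# Made-up sample features for illustration only.
features = {
    "A": (0.0, 0.0, datetime(2011, 1, 31)),
    "B": (600.0, 0.0, datetime(2011, 2, 3)),   # 600 m away, 3 days later
    "C": (600.0, 0.0, datetime(2011, 2, 20)),  # 600 m away, 20 days later
    "D": (5000.0, 0.0, datetime(2011, 2, 1)),  # 5 km away, 1 day later
}
print(space_time_neighbors(features, "A"))  # ['B']
```

Feature C is close in space but too far away in time, and feature D is close in time but too far away in space, so only B qualifies as a neighbor of A.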
Beyond Time Snapshots
One common approach to understanding spatial and temporal trends in your data is to break it up into a series of time snapshots. You might, for example, create separate datasets for week one, week two, week three, week four, and week five. You could then analyze each week separately and present the results of your analysis as either a series of maps or as an animation. While this is an effective way to show trends, how you decide to break up the data is somewhat arbitrary. If you are analyzing your data week to week, for example, how do you decide where the break falls? Should you break the data between Sunday and Monday? Perhaps Monday through Thursday, and then again Friday through Sunday? And is there something special about analyzing the data in week-long intervals? Might not daily analysis or monthly analysis be more effective? The choice matters when the division (separating Sunday events from Monday events, for example) splits features that really should be related. In the example below, six features fall within a 1 km and 7-day space-time window of the feature labeled Jan 31; only one feature will be included as a neighbor, however, if the data is analyzed using monthly snapshots.
When you define feature relationships using the SPACE_TIME_WINDOW, you are not creating snapshots of the data. Instead, all the data is used in the analysis. Features that are near each other in space and time will be analyzed together, because all feature relationships are assessed relative to the location and time stamp of the target feature; in the example above (A.), a 1 km, 7-day space-time window finds six neighbors for the feature labeled Jan 31.
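The difference between a monthly snapshot and a space-time window can be illustrated with a small Python sketch. The dates below are invented to mirror the counts in the Jan 31 example (six neighbors in the window, one in the snapshot) and are assumed to belong to features that all lie within 1 km of the target.

```python
from datetime import date

# Hypothetical dates; every feature is assumed to be within 1 km of the target.
target = date(2011, 1, 31)
nearby = [
    date(2011, 1, 29),
    date(2011, 2, 1),
    date(2011, 2, 2),
    date(2011, 2, 3),
    date(2011, 2, 5),
    date(2011, 2, 7),
]

# Monthly snapshot: only features from the target's calendar month are kept.
same_month = [d for d in nearby
              if (d.year, d.month) == (target.year, target.month)]

# Space-time window: features within 7 days before or after the target are kept.
in_window = [d for d in nearby if abs((d - target).days) <= 7]

print(len(same_month), len(in_window))  # 1 6
```

The month boundary falls one day after the target, so the snapshot discards five features that the window, which is centered on the target's own time stamp, retains.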
Suppose you were analyzing wildfires in a region. If you were to run the Hot Spot Analysis tool using the default FIXED_DISTANCE_BAND conceptualization to define feature relationships, the result would be a map showing you locations of statistically significant wildfire hot spots and cold spots. If you then ran the analysis again, but this time defined spatial relationships in terms of a SPACE_TIME_WINDOW, you might find that some of the hot spot areas are seasonal. Understanding this temporal characteristic of wildfires can have important implications for how you allocate fire resources.
Visualizing Space-Time Results
Heat maps typically show high-intensity areas (hot spots) in red and low-intensity areas (cold spots) in blue. In the graphic below, for example, the red areas are places getting the largest number of 911 emergency calls. The blue areas are locations getting relatively few calls. How might you add information about the temporal dimension of 911 call frequencies to the map below? How might you effectively map things like individual outbreaks, a series of crime sprees, reverberations in the adoption of a new technology, or the seasonal oscillations of storm patterns?
Representing three-dimensional data (x and y location, plus time) is difficult to do with a two-dimensional map. Notice that in the example below, you can't discern that there are two distinct hot spots (near each other in space, but separated by time), until the data is viewed in three dimensions. By extruding the features based on a time field, it becomes clearer which features are related and which are separated by time.
There are at least two ways to visualize the output from space-time analyses. Three-dimensional visualization is effective with a smaller study area when you have a limited number of features; this approach allows you to present space-time relationships in a single map. Another powerful method for portraying space-time processes is through animation. The examples below focus specifically on visualization of space-time clusters.
To animate your space-time clusters, enable time on your result features, open the Time Slider from the Tools toolbar, and click Play. Set a time window wide enough that each step of the animation shows a meaningful portion of your data. If you are new to creating animations, follow the links below.
Another powerful way to visualize the results of a space-time cluster analysis is to use 3D visualization. With this method, time becomes the third dimension, with point features extruded to reflect temporal progression. In the 3D graphic above, for example, the oldest events are nearest to the ground, and the more recent events hover at higher elevations (appearing closer to the viewer).
To create a 3D representation of your data like the one above, you'll need to use ArcGlobe (included with the standard installation of ArcGIS for Desktop).
First, run your space-time cluster analysis in ArcGlobe, then create a new field in the output feature class to reflect the height of each feature. For this example, the heights will be based on the number of days that have passed since the first event in the dataset occurred. To calculate the time lapse, you will use a VB script and the date function called DateDiff, as shown below.
If you have trouble adding a new field to the output feature class because of a lock, save your ArcGlobe document and reopen it, or export the output feature class to a new dataset, add it to your map document, and symbolize it to match the output feature class.
Next, sort your features by date so that you can identify the earliest date. You will use this to calculate the new time lapse field values. Right-click the new field you just created and choose Field Calculator. In the Field Calculator, click the Date type functions and select DateDiff from the right-hand side of the calculator, as illustrated below. Type DateDiff ( "d", "3/1/2011", [DateField] ), replacing the date string with the earliest date in your feature class and [DateField] with the name of the field that stores each feature's date ("d" indicates that the difference interval should be in days). The result is written to the new field you created.
The example above uses VB to compute the time lapse. The equivalent Python expression, assuming the date field is named Date_Con and its values are formatted as month/day/year, would be:
(datetime.datetime.strptime(!Date_Con!, "%m/%d/%Y").date() - datetime.date(2011, 3, 1)).days
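The time lapse calculation can also be sketched as a standalone Python function that finds the earliest date itself, so you do not have to sort the features and type it in by hand. This is a minimal sketch outside the Field Calculator: the function name is hypothetical, and it assumes the dates are available as month/day/year strings.

```python
from datetime import datetime


def days_since_first(date_strings, fmt="%m/%d/%Y"):
    """Return the days elapsed since the earliest date, one value per input.

    Assumes every string matches `fmt`; strptime raises ValueError otherwise.
    """
    dates = [datetime.strptime(s, fmt).date() for s in date_strings]
    first = min(dates)  # earliest event in the dataset
    return [(d - first).days for d in dates]


# Made-up sample dates for illustration only.
samples = ["3/15/2011", "3/1/2011", "4/1/2011"]
print(days_since_first(samples))  # [14, 0, 31]
```

The earliest event receives a time lapse of 0, which places it at ground level once the values are used as elevations.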
The next step is to change the ArcGlobe display properties so that the features in your dataset will appear elevated. To do this, right-click the output feature class and choose Properties. On the properties dialog box, click the Elevation tab. In the Elevation from features section, choose Use constant value or expression, then click the Calculator button and specify the new field you created with the DateDiff function. ArcGlobe will now elevate your features based on the time lapse field. If your features do not show enough vertical separation, try multiplying the time lapse field by a constant. In the Use constant value or expression property of the Elevation tab, the expression would look something like [TimeLapse] * 100, as illustrated below.
You can then use the ArcGlobe navigation tool to tilt and view the cluster results from various angles and viewpoints. The resultant map might look something like this: