Generate Spatial Weights Matrix (Spatial Statistics)
Summary
Constructs a spatial weights matrix (SWM) file to represent the spatial relationships among features in a dataset.
Usage
Output from this tool is a spatial weights matrix file (SWM). Tools, such as Hot_Spot_Analysis, that require you to specify a Conceptualization of Spatial Relationships will accept a spatial weights matrix file; select GET_SPATIAL_WEIGHTS_FROM_FILE for the Conceptualization of Spatial Relationships parameter, and for the Weights Matrix File parameter, specify the full path to the spatial weights file you create using this tool.
This tool also reports characteristics of the resultant spatial weights matrix file: number of features, connectivity, minimum, maximum and average number of neighbors. This summary is accessible from the Results window and may be viewed by right-clicking on the Messages entry in the Results window and selecting View. Using this summary, ensure that all features have at least 1 neighbor. In general, especially with large datasets, a minimum of 8 neighbors and a low value for feature connectivity is desirable.
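The kind of check described above can be sketched in plain Python. This is only an illustration, not the tool's implementation: the `summarize_neighbors` function and the dictionary layout are hypothetical, and connectivity is assumed here to mean the percentage of all possible feature pairings that carry a nonzero weight.

```python
def summarize_neighbors(neighbors):
    """Summarize a neighbor mapping of the form {feature_id: [neighbor_ids]}."""
    counts = [len(ids) for ids in neighbors.values()]
    n = len(counts)
    links = sum(counts)
    return {
        "features": n,
        "min_neighbors": min(counts),
        "max_neighbors": max(counts),
        "avg_neighbors": links / float(n),
        # Assumption: connectivity = nonzero links as a share of n * n pairings.
        "pct_connectivity": 100.0 * links / (n * n),
        # Features listed here violate the "at least 1 neighbor" guideline.
        "without_neighbors": [fid for fid, ids in neighbors.items() if not ids],
    }


example = {1: [2, 3], 2: [1], 3: [1], 4: []}
summary = summarize_neighbors(example)
print(summary)
```

In this made-up example, feature 4 has no neighbors, so the relationships would need to be redefined before using the weights in an analysis.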
For space/time analyses, select SPACE_TIME_WINDOW for the Conceptualization of Spatial Relationships parameter. You define space by specifying a Threshold Distance value; you define time by specifying a Date/Time Field and both a Date/Time Type (such as HOURS or DAYS) and a Date/Time Interval Value. The Date/Time Interval Value is an integer. For example, if you enter 1000 feet, select HOURS, and provide a Date/Time Interval Value of 3, features within 1,000 feet and occurring within 3 hours of each other would be considered neighbors.
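The space/time neighbor rule in the example above can be sketched as follows. The function name and tuple layout are hypothetical, coordinates are assumed to be in projected units (feet here), and treating both limits as inclusive is an assumption rather than documented tool behavior.

```python
from datetime import datetime, timedelta
from math import hypot


def are_space_time_neighbors(a, b, threshold_distance, interval):
    """a and b are (x, y, timestamp) tuples; interval is a timedelta."""
    close_in_space = hypot(a[0] - b[0], a[1] - b[1]) <= threshold_distance
    close_in_time = abs(a[2] - b[2]) <= interval
    return close_in_space and close_in_time


window = timedelta(hours=3)  # HOURS with a Date/Time Interval Value of 3
p = (0.0, 0.0, datetime(2020, 1, 1, 12, 0))
q = (600.0, 800.0, datetime(2020, 1, 1, 14, 30))  # 1,000 feet away, 2.5 hours later
r = (600.0, 800.0, datetime(2020, 1, 1, 16, 0))   # 1,000 feet away, 4 hours later

print(are_space_time_neighbors(p, q, 1000.0, window))
print(are_space_time_neighbors(p, r, 1000.0, window))
```

Under these assumptions, `p` and `q` are neighbors, while `p` and `r` are not because they fall outside the 3-hour window.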
The spatial weights matrix file (*.swm) was designed to allow you to generate, store, reuse, and share your conceptualization of the relationships among a set of features. To improve performance the file is created in a binary file format. Feature relationships are stored as a sparse matrix, so only nonzero relationships are written to the SWM file. In general, tools will perform well even when the SWM file contains more than 15 million nonzero relationships. If a memory error is encountered when using the SWM file, however, you should revisit how you are defining your feature relationships. As a rule of thumb, you should aim for a spatial weights matrix where every feature has at least 1 neighbor, most have about 8 neighbors, and no feature has more than about 1,000 neighbors.
When the Input Feature Class is not projected (that is, when coordinates are given in degrees, minutes, and seconds) or when the output coordinate system is set to a Geographic Coordinate System, distances are computed using chordal measurements. Chordal distance measurements are used because they can be computed quickly and provide very good estimates of true geodesic distances, at least for points within about thirty degrees of each other. Chordal distances are based on a sphere rather than the true oblate ellipsoid shape of the earth. Given any two points on the earth's surface, the chordal distance between them is the length of a line, passing through the three dimensional earth, to connect those two points. Chordal distances are reported in meters.
Caution: Be sure to project your data if your study area extends beyond 30 degrees. Chordal distances are not a good estimate of geodesic distances beyond 30 degrees.
When chordal distances are used in the analysis, the Threshold Distance parameter, if specified, should be given in meters.
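To see why chordal distances work well for nearby points and poorly for distant ones, the following sketch compares a chord with the corresponding great-circle arc on a sphere. It is an illustration only: the function names are hypothetical and the radius of 6,371,000 meters is an assumed mean earth radius, which may differ from the sphere the tool uses.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6371000.0  # assumed mean radius; the tool's sphere may differ


def chordal_distance(lon1, lat1, lon2, lat2, radius=EARTH_RADIUS_M):
    """Straight-line distance through the sphere between two lon/lat points."""
    def to_xyz(lon, lat):
        lon, lat = radians(lon), radians(lat)
        return (radius * cos(lat) * cos(lon),
                radius * cos(lat) * sin(lon),
                radius * sin(lat))

    x1, y1, z1 = to_xyz(lon1, lat1)
    x2, y2, z2 = to_xyz(lon2, lat2)
    return sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2)


def great_circle_distance(lon1, lat1, lon2, lat2, radius=EARTH_RADIUS_M):
    """Arc length along the sphere's surface, recovered from the chord."""
    chord = chordal_distance(lon1, lat1, lon2, lat2, radius)
    return 2.0 * radius * asin(min(1.0, chord / (2.0 * radius)))


for separation in (1, 30, 90):
    chord = chordal_distance(0, 0, separation, 0)
    arc = great_circle_distance(0, 0, separation, 0)
    print(separation, round(chord), round(arc), round(100.0 * (arc - chord) / arc, 2))
```

On this spherical model the chord is roughly 1 percent shorter than the arc at 30 degrees of separation and roughly 10 percent shorter at 90 degrees.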
Prior to ArcGIS 10.2.1, you would see a warning message if the parameters and environment settings you selected would result in calculations being performed using Geographic Coordinates (degrees, minutes, seconds). This warning advised you to project your data into a Projected Coordinate System so that distance calculations would be accurate. Beginning at 10.2.1, however, this tool calculates chordal distances whenever Geographic Coordinate System calculations are required.
Caution: Because of this change, there is a small chance that you will need to modify models that incorporate this tool if your models were created prior to ArcGIS 10.2.1 and if your models include hard-coded Geographic Coordinate System parameter values. If, for example, a distance parameter is set to something like 0.0025 degrees, you will need to convert that fixed value from degrees to meters and resave your model.
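A rough way to convert such a hard-coded value is sketched below. This is an approximation under stated assumptions, not a prescribed conversion: the function name is hypothetical, the earth is treated as a sphere with an assumed radius of 6,371,000 meters, and east-west degrees shrink with latitude, so the appropriate value depends on where your study area lies.

```python
from math import cos, radians

EARTH_RADIUS_M = 6371000.0  # assumed mean earth radius


def degrees_to_meters(degrees, at_latitude=None):
    """Approximate arc length in meters for an angular distance in degrees.

    With at_latitude given, the value is treated as degrees of longitude
    measured east-west at that latitude.
    """
    meters = radians(degrees) * EARTH_RADIUS_M
    if at_latitude is not None:
        meters *= cos(radians(at_latitude))
    return meters


print(degrees_to_meters(0.0025))        # north-south, or east-west at the equator
print(degrees_to_meters(0.0025, 60.0))  # east-west at 60 degrees latitude
```

With these assumptions, 0.0025 degrees is a little under 280 meters at the equator and about half that when measured east-west at 60 degrees latitude.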
For line and polygon features, feature centroids are used in distance computations. For multipoints, polylines, or polygons with multiple parts, the centroid is computed using the weighted mean center of all feature parts. The weighting for point features is 1, for line features is length, and for polygon features is area.
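The weighted mean center for a multipart feature can be sketched as follows. The function name and input layout are hypothetical; each part is assumed to be supplied as its own centroid together with its weight (1 for a point, length for a line, area for a polygon).

```python
def weighted_mean_center(parts):
    """parts is a list of ((x, y), weight) tuples, one per feature part."""
    total = float(sum(weight for _, weight in parts))
    x = sum(cx * weight for (cx, _), weight in parts) / total
    y = sum(cy * weight for (_, cy), weight in parts) / total
    return (x, y)


# A two-part polygon: a large part (area 30) and a small part (area 10).
polygon_parts = [((0.0, 0.0), 30.0), ((10.0, 20.0), 10.0)]
center = weighted_mean_center(polygon_parts)
print(center)
```

The resulting center sits closer to the larger part, which is the effect of weighting polygon parts by area.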
The Unique ID field is linked to feature relationships derived from running this tool. Consequently, the Unique ID values must be unique for every feature and typically should be in a permanent field that remains with the feature class. If you don't have a Unique ID field, you can create one by adding a new integer field (Add Field) to your feature class table and calculating the field values to be equal to the FID or OBJECTID field (Calculate Field). Because the FID/OID field values may change when you copy or edit a feature class, you cannot use these fields directly for the Unique ID parameter.
The Number of Neighbors parameter may override the Threshold Distance parameter for Inverse or Fixed Distance Conceptualizations of Spatial Relationships. If you specify a threshold distance of 10 miles and 3 for the number of neighbors, all features will receive a minimum of 3 neighbors even if the threshold has to be increased to find them. The threshold distance is only increased in those cases where the minimum number of neighbors is not met.
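The interaction between the two parameters can be sketched in plain Python. This is a simplified illustration of the behavior described above, not the tool's algorithm; the function name and the point dictionary are hypothetical, and ties in distance are broken by feature ID purely for determinism.

```python
from math import hypot


def find_neighbors(points, threshold, min_neighbors=0):
    """points is {feature_id: (x, y)}; returns {feature_id: [neighbor_ids]}."""
    result = {}
    for fid, (x, y) in points.items():
        # Rank every other feature by distance from this one.
        ranked = sorted(
            (hypot(x - ox, y - oy), oid)
            for oid, (ox, oy) in points.items()
            if oid != fid
        )
        within = [oid for dist, oid in ranked if dist <= threshold]
        # Extend the search only for features that fall short of the minimum.
        if len(within) < min_neighbors:
            within = [oid for dist, oid in ranked[:min_neighbors]]
        result[fid] = within
    return result


points = {1: (0, 0), 2: (1, 0), 3: (2, 0), 4: (50, 0)}
neighbors = find_neighbors(points, threshold=5, min_neighbors=2)
print(neighbors)
```

Feature 4 has no other feature within the threshold, so it alone is assigned its two nearest features; the other features keep the neighbors found within the threshold.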
The CONVERT_TABLE option for the Conceptualization of Spatial Relationships parameter may be used to convert an ASCII spatial weights matrix file to a SWM formatted spatial weights matrix file. First, you will need to put your ASCII weights into a formatted table (using Excel, for example).
Caution: If your table includes weights for self-potential, they will be omitted from the SWM output file, and the default self-potential value will be used in analyses. The default self-potential value for the Hot_Spot_Analysis tool is one, but this value can be overwritten by specifying a Self-Potential Field value; for all other tools, the default self-potential value is zero.
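As an alternative to a spreadsheet, a short script can restructure ASCII weights into a table with the required fields. The sketch below is illustrative only: it assumes a hypothetical whitespace-delimited layout of feature ID, neighbor ID, and weight on each line with no header row, and it assumes the Unique ID field is named MyID. Whether a given table format suits your workflow is something to verify before running the tool.

```python
import csv
import io


def ascii_weights_to_rows(text, id_field="MyID"):
    """Parse 'id neighbor_id weight' lines into dictionaries."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        rows.append({
            id_field: int(parts[0]),
            "NID": int(parts[1]),
            "WEIGHT": float(parts[2]),
        })
    return rows


def write_table(rows, handle, id_field="MyID"):
    """Write rows with the Unique ID field, NID, and WEIGHT columns."""
    writer = csv.DictWriter(handle, fieldnames=[id_field, "NID", "WEIGHT"],
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


ascii_text = "1 2 0.5\n1 3 0.25\n\n2 1 0.5\n"
buffer = io.StringIO()  # stands in for a file opened for writing
write_table(ascii_weights_to_rows(ascii_text), buffer)
print(buffer.getvalue())
```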
For polygon features, you will almost always want to choose ROW for the Row Standardization parameter. Row Standardization mitigates bias when the number of neighbors each feature has is a function of the aggregation scheme or sampling process, rather than reflecting the actual spatial distribution of the variable you are analyzing.
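Row standardization divides each weight by the sum of the weights in its row, so every feature's neighbor weights sum to 1. A minimal sketch, assuming a hypothetical nested-dictionary layout for the weights:

```python
def row_standardize(weights):
    """weights is {feature_id: {neighbor_id: weight}}; returns a new mapping."""
    standardized = {}
    for fid, row in weights.items():
        total = float(sum(row.values()))
        standardized[fid] = {
            nid: (w / total if total else 0.0) for nid, w in row.items()
        }
    return standardized


# Polygon A touches four neighbors; polygon B touches only one.
binary = {"A": {"B": 1, "C": 1, "D": 1, "E": 1}, "B": {"A": 1}}
standardized = row_standardize(binary)
print(standardized)
```

After standardization, each of A's four neighbors carries a weight of 0.25 and B's single neighbor carries a weight of 1.0, so a feature does not gain influence simply because the aggregation scheme gave it more neighbors.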
The Modeling Spatial Relationships help topic provides additional information about this tool's parameters.
Tools that can use a spatial weights matrix file project feature geometry to the output coordinate system prior to analysis, and all mathematical computations are based on the output coordinate system. Consequently, if the output coordinate system setting does not match the spatial reference of the input feature class, do one of the following for every analysis that uses the spatial weights matrix file: set the output coordinate system to match the settings used when the file was created, or project the input feature class so that it matches the spatial reference associated with the file.
When using shapefiles, keep in mind that they cannot store null values. Tools or other procedures that create shapefiles from nonshapefile inputs may store or interpret null values as zero. In some cases, nulls are stored as very large negative values in shapefiles. This can lead to unexpected results. See Geoprocessing considerations for shapefile output for more information.
Syntax
Parameter | Explanation | Data Type |
Input_Feature_Class | The feature class for which spatial relationships of features will be assessed. | Feature Class |
Unique_ID_Field | An integer field containing a different value for every feature in the Input Feature Class. | Field |
Output_Spatial_Weights_Matrix_File | The full path for the spatial weights matrix file (SWM) you want to create. | File |
Conceptualization_of_Spatial_Relationships | Specifies how spatial relationships among features are conceptualized. Note: Polygon Contiguity methods are only available with an ArcGIS for Desktop Advanced license. | String |
Distance_Method (Optional) | Specifies how distances are calculated from each feature to neighboring features. | String |
Exponent (Optional) | Parameter for inverse distance calculation. Typical values are 1 or 2. | Double |
Threshold_Distance (Optional) | Specifies a cutoff distance for Inverse Distance and Fixed Distance conceptualizations of spatial relationships. Enter this value using the units specified in the environment output coordinate system. Defines the size of the Space window for the Space Time Window conceptualization of spatial relationships. A value of zero indicates that no threshold distance is applied. When this parameter is left blank, a default threshold value is computed based on output feature class extent and the number of features. | Double |
Number_of_Neighbors (Optional) | An integer reflecting either the minimum or the exact number of neighbors. For K Nearest Neighbors, each feature will have exactly this specified number of neighbors. For Inverse Distance or Fixed Distance, each feature will have at least this many neighbors (the threshold distance will be temporarily extended to ensure this many neighbors, if necessary). When there are island polygons and one of the Contiguity Conceptualizations of Spatial Relationships is selected, this specified number of nearest polygons will be associated with those island polygons. | Long |
Row_Standardization (Optional) | Row standardization is recommended whenever feature distribution is potentially biased due to sampling design or to an imposed aggregation scheme. | Boolean |
Input_Table (Optional) | A table containing numeric weights relating every feature to every other feature in the input feature class. Required fields are the Input Feature Class Unique ID field, NID (neighbor ID), and WEIGHT. | Table |
Date_Time_Field (Optional) | A date field with a timestamp for each feature. | Field |
Date_Time_Interval_Type (Optional) | The units to use for measuring time. | String |
Date_Time_Interval_Value (Optional) | An integer reflecting the number of time units comprising the time window. For example, if you select HOURS for the Date/Time Interval Type and 3 for the Date/Time Interval Value, the time window would be 3 hours; features within the specified space window and within the specified time window would be neighbors. | Long |
Code Sample
The following Python window script demonstrates how to use the GenerateSpatialWeightsMatrix tool.
import arcpy
arcpy.env.workspace = "C:/data"
arcpy.GenerateSpatialWeightsMatrix_stats("911Count.shp", "MYID", "euclidean6Neighs.swm", "K_NEAREST_NEIGHBORS", "#", "#", "#", 6, "NO_STANDARDIZATION")
The following stand-alone Python script demonstrates how to use the GenerateSpatialWeightsMatrix tool.
# Analyze the spatial distribution of 911 calls in a metropolitan area
# using the Hot-Spot Analysis Tool (Local Gi*)

# Import system modules
import arcpy

# Overwrite existing output by default
arcpy.env.overwriteOutput = True

# Local variables...
workspace = "C:/Data"

try:
    # Set the current workspace (to avoid having to specify the full path
    # to the feature classes each time)
    arcpy.env.workspace = workspace

    # Copy the input feature class and integrate the points to snap
    # together at 500 feet
    # Process: Copy Features and Integrate
    cf = arcpy.CopyFeatures_management("911Calls.shp", "911Copied.shp",
                                       "#", 0, 0, 0)
    integrate = arcpy.Integrate_management("911Copied.shp #", "500 Feet")

    # Use Collect Events to count the number of calls at each location
    # Process: Collect Events
    ce = arcpy.CollectEvents_stats("911Copied.shp", "911Count.shp", "Count", "#")

    # Add a unique ID field to the count feature class
    # Process: Add Field and Calculate Field
    af = arcpy.AddField_management("911Count.shp", "MyID", "LONG", "#", "#", "#", "#",
                                   "NON_NULLABLE", "NON_REQUIRED", "#",
                                   "911Count.shp")
    cf = arcpy.CalculateField_management("911Count.shp", "MyID", "[FID]", "VB")

    # Create Spatial Weights Matrix for Calculations
    # Process: Generate Spatial Weights Matrix...
    swm = arcpy.GenerateSpatialWeightsMatrix_stats("911Count.shp", "MYID",
                                                   "euclidean6Neighs.swm",
                                                   "K_NEAREST_NEIGHBORS",
                                                   "#", "#", "#", 6,
                                                   "NO_STANDARDIZATION")

    # Hot Spot Analysis of 911 Calls
    # Process: Hot Spot Analysis (Getis-Ord Gi*)
    hs = arcpy.HotSpots_stats("911Count.shp", "ICOUNT", "911HotSpots.shp",
                              "GET_SPATIAL_WEIGHTS_FROM_FILE",
                              "EUCLIDEAN_DISTANCE", "NONE",
                              "#", "#", "euclidean6Neighs.swm")

except arcpy.ExecuteError:
    # If an error occurred when running a tool, print out the error messages.
    print(arcpy.GetMessages())
Environments
- Output Coordinate System
Feature geometry is projected to the Output Coordinate System prior to analysis, so values entered for the Threshold Distance parameter should be given in the units of the Output Coordinate System. All mathematical computations are based on the spatial reference of the Output Coordinate System. When the Output Coordinate System is based on degrees, minutes, and seconds, geodesic distances are estimated using chordal distances in meters.