Grouping Analysis (Spatial Statistics)

License Level: Basic, Standard, Advanced

Summary

Groups features based on feature attributes and optional spatial/temporal constraints.

Learn more about how Grouping Analysis works

Illustration

Group Analysis diagram

Syntax

GroupingAnalysis_stats (Input_Features, Unique_ID_Field, Output_Feature_Class, Number_of_Groups, Analysis_Fields, Spatial_Constraints, {Distance_Method}, {Number_of_Neighbors}, {Weights_Matrix_File}, {Initialization_Method}, {Initialization_Field}, {Output_Report_File}, {Evaluate_Optimal_Number_of_Groups})
Parameter | Explanation | Data Type
Input_Features

The feature class or feature layer you want to create groups for.

Feature Layer
Unique_ID_Field

An integer field containing a different value for every feature in the Input Features dataset.

Field
Output_Feature_Class

The new output feature class created containing all features, the analysis fields specified, and a field indicating which group each feature belongs to.

Feature Class
Number_of_Groups

The number of groups to create. The Output Report parameter will be disabled for more than 15 groups.

Long
Analysis_Fields
[analysis_field,...]

A list of fields you want to use to distinguish one group from another. The Output Report parameter will be disabled for more than 15 fields.

Field
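
In Python, the code samples on this page pass multiple analysis fields as a single semicolon-delimited string. The following is a minimal sketch of building that string from a list. The helper name build_analysis_fields and the constant MAX_REPORT_FIELDS are hypothetical and not part of arcpy; the 15-field check only mirrors the report limit described above.

```python
MAX_REPORT_FIELDS = 15  # report limit taken from the parameter description

def build_analysis_fields(fields):
    """Join field names into one semicolon-delimited string.

    Returns the string and a flag telling whether the field count
    still allows the Output Report to be created.
    """
    cleaned = [f.strip() for f in fields if f.strip()]
    if not cleaned:
        raise ValueError("At least one analysis field is required")
    return ";".join(cleaned), len(cleaned) <= MAX_REPORT_FIELDS

fields_arg, report_ok = build_analysis_fields(
    ["Join_Count", "TOTPOP_CY", "VACANT_CY", "UNEMP_CY"])
print(fields_arg)
```

The resulting string can be passed as the Analysis_Fields argument in the same way the code samples do.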
Spatial_Constraints

Specifies if and how spatial relationships among features should constrain the groups created.

  • CONTIGUITY_EDGES_ONLY: Groups contain contiguous polygon features. Only polygons that share an edge can be part of the same group.
  • CONTIGUITY_EDGES_CORNERS: Groups contain contiguous polygon features. Only polygons that share an edge or a vertex can be part of the same group.
  • DELAUNAY_TRIANGULATION: Features in the same group will have at least one natural neighbor in common with another feature in the group. Natural neighbor relationships are based on Delaunay Triangulation. Conceptually, Delaunay Triangulation creates a nonoverlapping mesh of triangles from feature centroids. Each feature is a triangle node, and nodes that share edges are considered neighbors.
  • K_NEAREST_NEIGHBORS: Features in the same group will be near each other; each feature will be a neighbor of at least one other feature in the group. Neighbor relationships are based on the nearest K features, where you specify an integer value, K, for the Number of Neighbors parameter.
  • GET_SPATIAL_WEIGHTS_FROM_FILE: Spatial, and optionally temporal, relationships are defined by a spatial weights file (.swm). Create the spatial weights matrix file using the Generate Spatial Weights Matrix tool.
  • NO_SPATIAL_CONSTRAINT: Features will be grouped using data space proximity only. Features do not have to be near each other in space or time to be part of the same group.
String
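
The two CONTIGUITY options differ only in whether polygons that touch at a single vertex count as neighbors. The following is a minimal sketch on a hypothetical grid of square cells identified by (row, column). The function grid_neighbors is an illustration of the two neighbor definitions, not part of arcpy and not how the tool builds groups.

```python
def grid_neighbors(cell, n_rows, n_cols, include_corners):
    """Return the neighbors of a cell in a regular grid.

    include_corners=False mimics CONTIGUITY_EDGES_ONLY (shared edge).
    include_corners=True mimics CONTIGUITY_EDGES_CORNERS (shared edge or vertex).
    """
    row, col = cell
    neighbors = []
    for d_row in (-1, 0, 1):
        for d_col in (-1, 0, 1):
            if d_row == 0 and d_col == 0:
                continue  # skip the cell itself
            if not include_corners and d_row != 0 and d_col != 0:
                continue  # diagonal cells touch only at a vertex
            r, c = row + d_row, col + d_col
            if 0 <= r < n_rows and 0 <= c < n_cols:
                neighbors.append((r, c))
    return neighbors

# Center cell of a 3 x 3 grid under both definitions
edges_only = grid_neighbors((1, 1), 3, 3, include_corners=False)
edges_corners = grid_neighbors((1, 1), 3, 3, include_corners=True)
print("edges only: %d, edges and corners: %d"
      % (len(edges_only), len(edges_corners)))
```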
Distance_Method
(Optional)

Specifies how distances are calculated from each feature to neighboring features.

  • EUCLIDEAN: The straight-line distance between two points (as the crow flies).
  • MANHATTAN: The distance between two points measured along axes at right angles (city block); calculated by summing the absolute differences between the x- and y-coordinates.
String
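
The difference between the two distance methods can be seen with a small calculation. This is a minimal sketch for two planar points; the function names are illustrative and not part of arcpy.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two (x, y) points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def manhattan(p, q):
    """Sum of the absolute differences in x and y (city block distance)."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

a, b = (0.0, 0.0), (3.0, 4.0)
print("EUCLIDEAN: %.1f, MANHATTAN: %.1f" % (euclidean(a, b), manhattan(a, b)))
# prints "EUCLIDEAN: 5.0, MANHATTAN: 7.0"
```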
Number_of_Neighbors
(Optional)

This parameter is enabled whenever the Spatial Constraints parameter is K_NEAREST_NEIGHBORS or one of the CONTIGUITY methods. The default number of neighbors is 8. For K_NEAREST_NEIGHBORS, this integer value reflects the exact number of nearest neighbor candidates to consider when building groups. A feature will not be included in a group unless one of the other features in that group is a K nearest neighbor. For the CONTIGUITY methods, this value reflects the exact number of neighbor candidates to consider for island polygons only. Since island polygons have no contiguous neighbors, they will be assigned neighbors that are not contiguous but are close by.

Long
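
To make the K nearest neighbor relationship concrete, the following minimal sketch finds the K nearest neighbors of each point by brute force. The function k_nearest_neighbors and the sample coordinates are hypothetical; the tool's internal neighbor search is not documented here and may differ.

```python
import math

def k_nearest_neighbors(points, k):
    """Map each point ID to the IDs of its k nearest other points.

    points is a dict of {id: (x, y)}; distances are Euclidean.
    """
    result = {}
    for pid, (x, y) in points.items():
        others = [(math.hypot(x - ox, y - oy), oid)
                  for oid, (ox, oy) in points.items() if oid != pid]
        others.sort()  # nearest first; ties are broken by ID
        result[pid] = [oid for _, oid in others[:k]]
    return result

points = {1: (0, 0), 2: (1, 0), 3: (5, 0), 4: (6, 0)}
knn = k_nearest_neighbors(points, k=1)
print(knn)
```

With K set to 1 in this sample, features 1 and 2 are each other's only neighbor candidate, as are features 3 and 4, so no neighbor relationship links the two pairs directly.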
Weights_Matrix_File
(Optional)

The path to a file containing spatial weights that define spatial relationships among features.

File
Initialization_Method
(Optional)

Specifies how initial seeds are obtained when the Spatial Constraints parameter is NO_SPATIAL_CONSTRAINT. Seeds are used to grow groups. If you indicate you want three groups, for example, the analysis will begin with three seeds.

  • FIND_SEED_LOCATIONS: Seed features will be selected to optimize performance.
  • GET_SEEDS_FROM_FIELD: Nonzero entries in the Initialization Field will be used as starting points to grow groups.
  • USE_RANDOM_SEEDS: Initial seed features will be selected randomly.
String
Initialization_Field
(Optional)

The numeric field identifying seed features. Features with a value of 1 for this field will be used to grow groups.

Field
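
Before running the tool with GET_SEEDS_FROM_FIELD, it can help to check which features are flagged as seeds. The following is a minimal sketch over plain Python dictionaries standing in for attribute rows. The helper select_seed_ids and the field name SEED are hypothetical, and the comparison with the number of groups is an assumption based on the statement that an analysis with three groups begins with three seeds.

```python
def select_seed_ids(rows, id_field, seed_field):
    """Return the IDs of rows whose seed field holds a nonzero value."""
    return [row[id_field] for row in rows if row.get(seed_field)]

# Hypothetical attribute rows; SEED is an illustrative field name
rows = [
    {"TARGET_FID": 1, "SEED": 0},
    {"TARGET_FID": 2, "SEED": 1},
    {"TARGET_FID": 3, "SEED": 0},
    {"TARGET_FID": 4, "SEED": 1},
    {"TARGET_FID": 5, "SEED": 1},
]
number_of_groups = 3

seed_ids = select_seed_ids(rows, "TARGET_FID", "SEED")
seeds_match_groups = len(seed_ids) == number_of_groups
print("seed features: %s, matches group count: %s"
      % (seed_ids, seeds_match_groups))
```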
Output_Report_File
(Optional)

The full path for the .pdf report file to be created summarizing group characteristics. This report provides a number of graphs to help you compare the characteristics of each group. Creating the report file can add substantial processing time.

File
Evaluate_Optimal_Number_of_Groups
(Optional)
  • EVALUATE: Groupings from 2 to 15 will be evaluated.
  • DO_NOT_EVALUATE: No evaluation of the number of groups will be performed. This is the default.
Boolean
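
This page does not state how candidate groupings are scored. As an illustration only, the sketch below assumes a pseudo F-statistic (the Calinski-Harabasz ratio of between-group to within-group variance), where larger values indicate better-separated groups. The function pseudo_f and the sample data are hypothetical and not part of arcpy.

```python
def pseudo_f(points, labels):
    """Pseudo F-statistic for a grouping: (B / (k - 1)) / (W / (n - k)).

    B is the between-group sum of squares, W the within-group sum of
    squares, n the number of features, and k the number of groups.
    """
    n = len(points)
    dims = len(points[0])
    groups = sorted(set(labels))
    k = len(groups)
    overall = [sum(p[d] for p in points) / float(n) for d in range(dims)]
    between = 0.0
    within = 0.0
    for g in groups:
        members = [p for p, lab in zip(points, labels) if lab == g]
        centroid = [sum(p[d] for p in members) / float(len(members))
                    for d in range(dims)]
        between += len(members) * sum(
            (centroid[d] - overall[d]) ** 2 for d in range(dims))
        within += sum(
            (p[d] - centroid[d]) ** 2 for p in members for d in range(dims))
    return (between / (k - 1)) / (within / (n - k))

# One analysis field, four features, two candidate groupings
values = [(0.0,), (1.0,), (10.0,), (11.0,)]
good = pseudo_f(values, [1, 1, 2, 2])  # similar values grouped together
poor = pseudo_f(values, [1, 2, 1, 2])  # dissimilar values grouped together
print("good: %.2f, poor: %.2f" % (good, poor))
```

Under this assumption, comparing the statistic across groupings of 2 through 15 groups would point to the number of groups that best separates the features.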

Code Sample

GroupingAnalysis example 1 (Python window)

The following Python window script demonstrates how to use the GroupingAnalysis tool.

import arcpy
import arcpy.stats as SS
arcpy.env.workspace = r"C:\GA"
SS.GroupingAnalysis("Dist_Vandalism.shp", "TARGET_FID", "outGSF.shp", "4",
                    "Join_Count;TOTPOP_CY;VACANT_CY;UNEMP_CY",
                    "NO_SPATIAL_CONSTRAINT", "EUCLIDEAN", "", "", "FIND_SEED_LOCATIONS", "",
                    "outGSF.pdf", "DO_NOT_EVALUATE")

GroupingAnalysis example 2 (stand-alone Python script)

The following stand-alone Python script demonstrates how to use the GroupingAnalysis tool.

# Grouping Analysis of Vandalism data in a metropolitan area
# using the Grouping Analysis Tool

# Import system modules
import arcpy, os
import arcpy.stats as SS

# Allow geoprocessing tools to overwrite existing output
arcpy.env.overwriteOutput = True

try:
    # Set the current workspace (to avoid having to specify the full path to
    # the feature classes each time)
    arcpy.env.workspace = r"C:\GA"

    # Join the vandalism incident points to the reporting district polygons
    # Process: Spatial Join
    fieldMappings = arcpy.FieldMappings()
    fieldMappings.addTable("ReportingDistricts.shp")
    fieldMappings.addTable("Vandalism2006.shp")

    sj = arcpy.SpatialJoin_analysis("ReportingDistricts.shp", "Vandalism2006.shp", "Dist_Vand.shp",
                               "JOIN_ONE_TO_ONE",
                               "KEEP_ALL",
                               fieldMappings,
                               "COMPLETELY_CONTAINS", "", "")
    
    # Use the Grouping Analysis tool to create groups based on different variables or analysis fields
    # Process: Group Similar Features
    ga = SS.GroupingAnalysis("Dist_Vand.shp", "TARGET_FID", "outGSF.shp", "4",
                             "Join_Count;TOTPOP_CY;VACANT_CY;UNEMP_CY",
                             "NO_SPATIAL_CONSTRAINT", "EUCLIDEAN", "", "", "FIND_SEED_LOCATIONS", "",
                             "outGSF.pdf", "DO_NOT_EVALUATE")
    
    # Use the Summary Statistics tool to get the mean of the variables used to group
    # Process: Summary Statistics
    SumStat = arcpy.Statistics_analysis("outGSF.shp", "outSS",
                                        "Join_Count MEAN;VACANT_CY MEAN;"
                                        "TOTPOP_CY MEAN;UNEMP_CY MEAN",
                                        "GSF_GROUP")

except arcpy.ExecuteError:
    # If an error occurred when running a tool, print out the error messages.
    print(arcpy.GetMessages())

Environments

Output Coordinate System

Feature geometry is projected to the Output Coordinate System prior to analysis, so values entered for the Distance Band/Threshold Distance parameter should match those specified in the Output Coordinate System. All mathematical computations are based on the Output Coordinate System spatial reference.

Licensing Information

ArcGIS for Desktop Basic: Yes
ArcGIS for Desktop Standard: Yes
ArcGIS for Desktop Advanced: Yes
4/18/2013