Grouping Analysis (Spatial Statistics)

License Level: Basic, Standard, Advanced


Groups features based on feature attributes and optional spatial/temporal constraints.


Group Analysis diagram



GroupingAnalysis_stats (Input_Features, Unique_ID_Field, Output_Feature_Class, Number_of_Groups, Analysis_Fields, Spatial_Constraints, {Distance_Method}, {Number_of_Neighbors}, {Weights_Matrix_File}, {Initialization_Method}, {Initialization_Field}, {Output_Report_File}, {Evaluate_Optimal_Number_of_Groups})
Parameter | Explanation | Data Type

Input_Features

The feature class or feature layer you want to create groups for.

Feature Layer

Unique_ID_Field

An integer field containing a different value for every feature in the Input Features dataset.
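
The uniqueness requirement can be checked before running the tool. The following is a minimal sketch in plain Python (no arcpy required); the helper name has_unique_ids is hypothetical, and the values are assumed to have already been read from the candidate field into a list.

```python
def has_unique_ids(values):
    """Return True when every value is an integer and no value repeats."""
    values = list(values)
    # bool is a subclass of int in Python, so exclude it explicitly
    all_ints = all(isinstance(v, int) and not isinstance(v, bool) for v in values)
    return all_ints and len(set(values)) == len(values)


print(has_unique_ids([0, 1, 2, 3]))   # every value differs
print(has_unique_ids([0, 1, 1, 3]))   # the value 1 repeats
```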


Output_Feature_Class

The new output feature class created containing all features, the analysis fields specified, and a field indicating which group each feature belongs to.

Feature Class

Number_of_Groups

The number of groups to create. The Output Report parameter will be disabled for more than 15 groups.


Analysis_Fields

A list of fields you want to use to distinguish one group from another. The Output Report parameter will be disabled for more than 15 fields.
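
The 15-group and 15-field limits on the Output Report parameter can be checked in a script before a report path is supplied. This is a minimal sketch in plain Python; the helper name report_allowed is hypothetical, and passing the analysis fields as a semicolon-delimited string is an assumption made for illustration.

```python
MAX_REPORT_ITEMS = 15  # limit stated for both groups and analysis fields


def report_allowed(number_of_groups, analysis_fields):
    """Return True when the Output Report parameter would stay enabled.

    analysis_fields is a semicolon-delimited string such as "A;B;C"
    (an assumed convention for this sketch).
    """
    fields = [f.strip() for f in analysis_fields.split(";") if f.strip()]
    return (number_of_groups <= MAX_REPORT_ITEMS
            and len(fields) <= MAX_REPORT_ITEMS)


print(report_allowed(4, "Join_Count;TOTPOP_CY;VACANT_CY;UNEMP_CY"))
print(report_allowed(16, "Join_Count"))
```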


Spatial_Constraints

Specifies if and how spatial relationships among features should constrain the groups created.

  • CONTIGUITY_EDGES_ONLY: Groups contain contiguous polygon features. Only polygons that share an edge can be part of the same group.
  • CONTIGUITY_EDGES_CORNERS: Groups contain contiguous polygon features. Only polygons that share an edge or a vertex can be part of the same group.
  • DELAUNAY_TRIANGULATION: Features in the same group will have at least one natural neighbor in common with another feature in the group. Natural neighbor relationships are based on Delaunay Triangulation. Conceptually, Delaunay Triangulation creates a nonoverlapping mesh of triangles from feature centroids. Each feature is a triangle node and nodes that share edges are considered neighbors.
  • K_NEAREST_NEIGHBORS: Features in the same group will be near each other; each feature will be a neighbor of at least one other feature in the group. Neighbor relationships are based on the nearest K features, where you specify an integer value, K, for the Number of Neighbors parameter.
  • GET_SPATIAL_WEIGHTS_FROM_FILE: Spatial, and optionally temporal, relationships are defined by a spatial weights file (.swm). Create the spatial weights matrix file using the Generate Spatial Weights Matrix tool.
  • NO_SPATIAL_CONSTRAINT: Features will be grouped using data space proximity only. Features do not have to be near each other in space or time to be part of the same group.
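
Because these keywords are passed as plain strings, a misspelled keyword is easy to miss in a script. The following is a minimal sketch of a pre-flight check in plain Python; the helper name check_spatial_constraint is hypothetical and is not part of arcpy. It uses difflib from the standard library to suggest the closest valid keyword.

```python
import difflib

# Keywords listed for the Spatial_Constraints parameter
SPATIAL_CONSTRAINTS = (
    "CONTIGUITY_EDGES_ONLY",
    "CONTIGUITY_EDGES_CORNERS",
    "DELAUNAY_TRIANGULATION",
    "K_NEAREST_NEIGHBORS",
    "GET_SPATIAL_WEIGHTS_FROM_FILE",
    "NO_SPATIAL_CONSTRAINT",
)


def check_spatial_constraint(value):
    """Return value unchanged if it is a listed keyword, else raise ValueError."""
    if value in SPATIAL_CONSTRAINTS:
        return value
    # Suggest the closest listed keyword, if any is reasonably similar
    close = difflib.get_close_matches(value, SPATIAL_CONSTRAINTS, n=1)
    hint = " Did you mean %s?" % close[0] if close else ""
    raise ValueError("Unknown spatial constraint %r.%s" % (value, hint))


try:
    check_spatial_constraint("NO_SPATIAL_CONSRAINT")  # missing a "T"
except ValueError as err:
    print(err)
```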

Distance_Method (Optional)

Specifies how distances are calculated from each feature to neighboring features.

  • EUCLIDEAN: The straight-line distance between two points (as the crow flies)
  • MANHATTAN: The distance between two points measured along axes at right angles (city block); calculated by summing the (absolute) difference between the x- and y-coordinates
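
The two distance definitions can be compared directly. This is a minimal sketch in plain Python for two (x, y) points in a projected coordinate system; the function names are illustrative and are not part of arcpy.

```python
import math


def euclidean(p, q):
    """Straight-line distance between two (x, y) points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def manhattan(p, q):
    """City-block distance: sum of the absolute x and y differences."""
    return abs(q[0] - p[0]) + abs(q[1] - p[1])


a, b = (0.0, 0.0), (3.0, 4.0)
print(euclidean(a, b))  # 5.0
print(manhattan(a, b))  # 7.0
```

For any pair of points the Manhattan distance is at least as large as the Euclidean distance, and the two are equal only when the points differ along a single axis.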

Number_of_Neighbors (Optional)

This parameter is enabled whenever the Spatial Constraints parameter is K_NEAREST_NEIGHBORS or one of the CONTIGUITY methods. The default number of neighbors is 8. For K_NEAREST_NEIGHBORS, this integer value reflects the exact number of nearest neighbor candidates to consider when building groups. A feature will not be included in a group unless one of the other features in that group is a K nearest neighbor. For the CONTIGUITY methods, this value reflects the exact number of neighbor candidates to consider for island polygons only. Since island polygons have no contiguous neighbors, they will be assigned neighbors that are not contiguous but are close by.
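
The K nearest neighbor relationship can be illustrated with a brute-force search. This is a minimal sketch in plain Python, assuming features are represented by (x, y) centroids and distances are Euclidean; it shows only the neighbor lookup, not how the tool builds groups, and the function name is hypothetical.

```python
import math


def k_nearest_neighbors(points, k):
    """Map each point index to the indices of its k nearest other points."""
    neighbors = {}
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        # Sort by distance to p; ties are broken by index for repeatability
        others.sort(key=lambda j: (math.hypot(points[j][0] - p[0],
                                              points[j][1] - p[1]), j))
        neighbors[i] = others[:k]
    return neighbors


centroids = [(0, 0), (1, 0), (5, 0), (6, 0)]
print(k_nearest_neighbors(centroids, 1))  # {0: [1], 1: [0], 2: [3], 3: [2]}
```

Note that the relationship is not necessarily symmetric: feature A can be among the K nearest neighbors of feature B without B being among those of A.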


Weights_Matrix_File (Optional)

The path to a file containing spatial weights that define spatial relationships among features.


Initialization_Method (Optional)

Specifies how initial seeds are obtained when the Spatial Constraints parameter selected is NO_SPATIAL_CONSTRAINT. Seeds are used to grow groups. If you indicate you want three groups, for example, the analysis will begin with three seeds.

  • FIND_SEED_LOCATIONS: Seed features will be selected to optimize performance.
  • GET_SEEDS_FROM_FIELD: Nonzero entries in the Initialization Field will be used as starting points to grow groups.
  • USE_RANDOM_SEEDS: Initial seed features will be selected randomly.

Initialization_Field (Optional)

The numeric field identifying seed features. Features with a value of 1 for this field will be used to grow groups.
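
To see which features a GET_SEEDS_FROM_FIELD run would start from, the seed field can be inspected beforehand. This is a minimal sketch in plain Python, assuming the attribute rows have already been read into dictionaries; the field name SEED and the helper name seed_ids are hypothetical.

```python
def seed_ids(rows, id_field, init_field):
    """Return the IDs of rows whose initialization field is nonzero."""
    return [row[id_field] for row in rows if row.get(init_field, 0) != 0]


rows = [
    {"TARGET_FID": 0, "SEED": 0},
    {"TARGET_FID": 1, "SEED": 1},
    {"TARGET_FID": 2, "SEED": 0},
    {"TARGET_FID": 3, "SEED": 1},
]
seeds = seed_ids(rows, "TARGET_FID", "SEED")
print(seeds)       # [1, 3]
print(len(seeds))  # compare against the requested number of groups
```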


Output_Report_File (Optional)

The full path for the .pdf report file to be created summarizing group characteristics. This report provides a number of graphs to help you compare the characteristics of each group. Creating the report file can add substantial processing time.

Evaluate_Optimal_Number_of_Groups (Optional)

  • EVALUATE: Groupings from 2 to 15 will be evaluated.
  • DO_NOT_EVALUATE: No evaluation of the number of groups will be performed. This is the default.

Code Sample

GroupingAnalysis example 1 (Python window)

The following Python window script demonstrates how to use the GroupingAnalysis tool.

import arcpy
import arcpy.stats as SS
arcpy.env.workspace = r"C:\GA"
SS.GroupingAnalysis("Dist_Vandalism.shp", "TARGET_FID", "outGSF.shp", "4",
                    # Analysis fields used to distinguish the groups
                    "Join_Count;TOTPOP_CY;VACANT_CY;UNEMP_CY",
                    "NO_SPATIAL_CONSTRAINT", "EUCLIDEAN", "", "", "FIND_SEED_LOCATIONS", "",
                    "outGSF.pdf", "DO_NOT_EVALUATE")
GroupingAnalysis example 2 (stand-alone Python script)

The following stand-alone Python script demonstrates how to use the GroupingAnalysis tool.

# Grouping Analysis of Vandalism data in a metropolitan area
# using the Grouping Analysis Tool

# Import system modules
import arcpy, os
import arcpy.stats as SS

# Allow existing output to be overwritten
arcpy.env.overwriteOutput = True

try:
    # Set the current workspace (to avoid having to specify the full path to
    # the feature classes each time)
    arcpy.env.workspace = r"C:\GA"

    # Join the vandalism incident points to the reporting district polygons
    # Process: Spatial Join (default join operation, join type, and field mapping)
    sj = arcpy.SpatialJoin_analysis("ReportingDistricts.shp", "Vandalism2006.shp", "Dist_Vand.shp",
                                    match_option="COMPLETELY_CONTAINS")

    # Use Grouping Analysis tool to create groups based on different variables or analysis fields
    # Process: Grouping Analysis
    ga = SS.GroupingAnalysis("Dist_Vand.shp", "TARGET_FID", "outGSF.shp", "4",
                             "Join_Count;TOTPOP_CY;VACANT_CY;UNEMP_CY",
                             "NO_SPATIAL_CONSTRAINT", "EUCLIDEAN", "", "", "FIND_SEED_LOCATIONS", "",
                             "outGSF.pdf", "DO_NOT_EVALUATE")

    # Use Summary Statistics tool to get the mean of the variables used to group
    # Process: Summary Statistics
    # The case field SS_GROUP is an assumption: it is taken to be the group
    # field that Grouping Analysis writes to the output feature class.
    SumStat = arcpy.Statistics_analysis("outGSF.shp", "outSS",
                                        "Join_Count MEAN;VACANT_CY MEAN;TOTPOP_CY MEAN;UNEMP_CY MEAN",
                                        "SS_GROUP")

except arcpy.ExecuteError:
    # If an error occurred when running the tool, print out the error message.
    print(arcpy.GetMessages())


Environments

Output Coordinate System

Feature geometry is projected to the Output Coordinate System prior to analysis. All mathematical computations are based on the Output Coordinate System spatial reference. When the Output Coordinate System is based on degrees, minutes, and seconds, geodesic distances are estimated using chordal distances.
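
A chordal distance is the straight-line distance through the earth between two surface locations, rather than the distance along the surface. The following is a minimal sketch in plain Python that assumes a spherical earth with a mean radius of 6,371,000 meters; both the spherical model and the function name are assumptions for illustration, and the tool's own computation may differ.

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumed mean radius of a spherical earth


def chordal_distance(lon1, lat1, lon2, lat2, radius=EARTH_RADIUS_M):
    """Straight-line (chord) distance in meters between two lon/lat points."""
    def to_xyz(lon, lat):
        # Convert degrees to a 3D Cartesian position on the sphere
        lam, phi = math.radians(lon), math.radians(lat)
        return (radius * math.cos(phi) * math.cos(lam),
                radius * math.cos(phi) * math.sin(lam),
                radius * math.sin(phi))

    x1, y1, z1 = to_xyz(lon1, lat1)
    x2, y2, z2 = to_xyz(lon2, lat2)
    return math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2 + (z2 - z1) ** 2)


# Two points one degree of longitude apart on the equator
print(chordal_distance(0.0, 0.0, 1.0, 0.0))
```

For nearby features the chord and the surface distance are nearly identical; the difference grows with separation, and antipodal points are exactly one diameter apart by chord.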

Licensing Information

ArcGIS for Desktop Basic: Yes
ArcGIS for Desktop Standard: Yes
ArcGIS for Desktop Advanced: Yes