Find Identical (Data Management)

License Level:BasicStandardAdvanced

Summary

Reports any records in a feature class or table that have identical values in a list of fields, and generates a table listing these identical records. If the field Shape is selected, feature geometries are compared.

The Delete Identical tool can be used to find and delete identical records.

Illustration

Find Identical illustration
In this example, points with the OBJECTIDs of 1,2, 3, 8, 9, and 10 are spatially coincident (blue highlight). The output table identifies those spatially coincident points that share the same CATEGORY.

Usage

Syntax

FindIdentical_management (in_dataset, out_dataset, fields, {xy_tolerance}, {z_tolerance}, {output_record_option})
ParameterExplanationData Type
in_dataset

The table or feature class for which identical records will be found.

Table View
out_dataset

The output table reporting identical records. The FEAT_SEQ field in the output table will have the same value for identical records.

Table
fields
[fields,...]

The field or fields whose values will be compared to find identical records.

Field
xy_tolerance
(Optional)

The xy tolerance that will be applied to each vertex when evaluating if there is an identical vertex in another feature. This parameter is enabled only when Shape is selected as one of the fields.

Linear unit
z_tolerance
(Optional)

The Z tolerance that will be applied to each vertex when evaluating if there is an identical vertex in another feature. This parameter is enabled only when Shape is selected as one of the fields.

Double
output_record_option
(Optional)

Choose if you want only duplicated records in the output table.

  • ALLAll input records will have corresponding records in the output table. This is the default.
  • ONLY_DUPLICATESOnly duplicate records will have corresponding records in the output table. The output will be empty if no duplicate is found.
Boolean

Code Sample

FindIdentical example 1 (Python window)

The following Python window script demonstrates how to use the FindIdentical function in immediate mode.

import arcpy

# Find identical records based on a text field and a numeric field.
arcpy.FindIdentical_management("C:/data/fireincidents.shp", "C:/output/duplicate_incidents.dbf", ["ZONE", "INTENSITY"])
FindIdentical example 2 (stand-alone script)

The following stand-alone script demonstrates how to use the FindIdentical tool to identify duplicate records of a table or feature class.

# Name: FindIdentical_Example2.py
# Description: Finds duplicate features in a dataset based on location (Shape field) and fire intensity

import arcpy
from arcpy import env

env.overwriteOutput = True

# Set workspace environment
env.workspace = "C:/data/findidentical.gdb"

try:
    # Set input feature class
    in_dataset = "fireincidents"
    
    # Set the fields upon which the matches are found
    fields = ["Shape", "INTENSITY"]
    
    # Set xy tolerance
    xy_tol = ".02 Meters"
    
    out_table = "duplicate_incidents"

    # Execute Find Identical 
    arcpy.FindIdentical_management(in_dataset, out_table, fields, xy_tol)
    print arcpy.GetMessages()

except arcpy.ExecuteError:
    print arcpy.GetMessages(2)
    
except Exception as ex:
    print ex.args[0]
FindIdentical example 3: Output only duplicate records (stand-alone script)

Demonstrates the use of the optional parameter Output only duplicated records. If checked on tool dialog box, or if set, the value of ONLY_DUPLICATES, then all unique records are removed. keeping only the duplicates from the output/

# Name: FindIdentical_Example3.py
# Description: Demonstrates the use of the optional parameter Output only duplicated records.

import arcpy
from arcpy import env

env.overwriteOutput = True

# Set workspace environment
env.workspace = "C:/data/redlands.gdb"

try:
    
    in_data = "crime"
    out_data = "crime_dups"
    
    # Note that XY Tolerance and Z Tolerance parameters are not used
    # In that case, any optional parameter after them must assign
    # the value with the name of that parameter    
    arcpy.FindIdentical_management(in_data, out_data, ["Shape"], output_record_option="ONLY_DUPLICATES")
    
    print arcpy.GetMessages()
    
except Exception as ex:
    print arcpy.GetMessages(2)
    print ex.args[0]
FindIdentical example 4: Group identical records by FEAT_SEQ value

Reads the output of FindIdentical tool and groups identical records by FEAT_SEQ value.

import arcpy

from itertools import groupby
from operator import itemgetter

# Set workspace environment
arcpy.env.workspace = r"C:\data\redlands.gdb"

# Run Find Identical on feature geometry only.
result = arcpy.FindIdentical_management("parcels", "parcels_dups", ["Shape"])
    
# List of all output records as IN_FID and FEAT_SEQ pair - a list of lists
out_records = []   
for row in arcpy.SearchCursor(result.getOutput(0), fields="IN_FID; FEAT_SEQ"):
    out_records.append([row.IN_FID, row.FEAT_SEQ])

# Sort the output records by FEAT_SEQ values
# Example of out_records = [[3, 1], [5, 3], [1, 1], [4, 3], [2, 2]]
out_records.sort(key = itemgetter(1))
    
# records after sorted by FEAT_SEQ: [[3, 1], [1, 1], [2, 2], [5, 3], [4, 3]]
# records with same FEAT_SEQ value will be in the same group (i.e., identical)
identicals_iter = groupby(out_records, itemgetter(1))
    
# now, make a list of identical groups - each group in a list.
# example identical groups: [[3, 1], [2], [5, 4]]
# i.e., IN_FID 3, 1 are identical, and 5, 4 are identical.
identical_groups = [[item[0] for item in data] for (key, data) in identicals_iter]

print identical_groups

Environments

Related Topics

Licensing Information

ArcGIS for Desktop Basic: No
ArcGIS for Desktop Standard: No
ArcGIS for Desktop Advanced: Yes
5/7/2015