Find Identical (Data Management)
Summary
Reports any records in a feature class or table that have identical values in a list of fields, and generates a table listing these identical records. If the field Shape is selected, feature geometries are compared.
The Delete Identical tool can be used to find and delete identical records.
Usage
- Records are identical if the values in the selected input fields are the same for those records. Values from multiple fields in the input dataset can be compared. If more than one field is specified, records are matched by the values in the first field, then by the values in the second field, and so on.
- With feature class or feature layer input, select the Shape field in the Field(s) parameter to compare feature geometries and find identical features by location. The XY Tolerance and Z Tolerance parameters are valid only when Shape is selected as one of the input fields.
- If the Shape field is selected and the input features have M or Z values enabled, the M or Z values are also used to determine identical features.
- Check the Output only duplicated records parameter if you want only the duplicated records in the output table. If this parameter is unchecked (the default), the output will have the same number of records as the input dataset.
- The output table will contain two fields: IN_FID and FEAT_SEQ.
  - The IN_FID field can be used to join the records of the output table back to the input dataset.
  - Identical records have the same FEAT_SEQ value, while nonidentical records have sequential values. FEAT_SEQ values have no relationship to the IDs of the input records.
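The relationship between the compared fields, IN_FID, and FEAT_SEQ can be illustrated without arcpy. The following is a minimal pure-Python sketch, not the tool's actual implementation: the function name `assign_feat_seq`, the dictionary-based record layout, and the choice to number groups in order of first appearance are all assumptions made for illustration, and the real tool may number groups differently.

```python
def assign_feat_seq(records, fields, only_duplicates=False):
    """Return (IN_FID, FEAT_SEQ) pairs for the given records.

    records: dict mapping an object ID to a dict of field values
             (a hypothetical stand-in for an input table).
    fields:  names of the fields whose values are compared.
    """
    seq_by_key = {}  # tuple of compared values -> FEAT_SEQ
    rows = []
    for fid in sorted(records):
        # Records match only if every compared field has the same value.
        key = tuple(records[fid][name] for name in fields)
        if key not in seq_by_key:
            # Assumption: groups are numbered in order of first appearance.
            seq_by_key[key] = len(seq_by_key) + 1
        rows.append((fid, seq_by_key[key]))
    if only_duplicates:
        # Keep only rows whose FEAT_SEQ occurs more than once.
        counts = {}
        for _, seq in rows:
            counts[seq] = counts.get(seq, 0) + 1
        rows = [(fid, seq) for fid, seq in rows if counts[seq] > 1]
    return rows


# Hypothetical sample data
incidents = {
    1: {"ZONE": "A", "INTENSITY": 3},
    2: {"ZONE": "B", "INTENSITY": 3},
    3: {"ZONE": "A", "INTENSITY": 3},
    4: {"ZONE": "A", "INTENSITY": 5},
}
all_rows = assign_feat_seq(incidents, ["ZONE", "INTENSITY"])
dup_rows = assign_feat_seq(incidents, ["ZONE", "INTENSITY"], only_duplicates=True)
# Join the output rows back to the input through IN_FID.
joined = [(fid, seq, incidents[fid]["ZONE"]) for fid, seq in dup_rows]
print(all_rows)
print(dup_rows)
```

With the default option, every input record appears once in the result; with `only_duplicates=True`, only records that share a FEAT_SEQ value with another record remain.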
Syntax
Parameter | Explanation | Data Type |
in_dataset | The table or feature class for which identical records will be found. | Table View |
out_dataset | The output table reporting identical records. The FEAT_SEQ field in the output table will have the same value for identical records. | Table |
fields [fields,...] | The field or fields whose values will be compared to find identical records. | Field |
xy_tolerance (Optional) | The XY tolerance that will be applied to each vertex when evaluating whether there is an identical vertex in another feature. This parameter is enabled only when Shape is selected as one of the fields. | Linear unit |
z_tolerance (Optional) | The Z tolerance that will be applied to each vertex when evaluating whether there is an identical vertex in another feature. This parameter is enabled only when Shape is selected as one of the fields. | Double |
output_record_option (Optional) | Specifies whether only duplicated records are written to the output table. Set to ONLY_DUPLICATES to keep only the duplicates; by default, all input records are reported. | Boolean |
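This page does not describe how the tolerances are applied internally. As a rough illustration only, the sketch below assumes that two vertices match when their planar distance is within the XY tolerance and their Z difference is within the Z tolerance. The function `vertices_match` and this matching rule are hypothetical, and all values are assumed to be in the same linear unit.

```python
import math


def vertices_match(a, b, xy_tolerance=0.0, z_tolerance=0.0):
    """Return True if vertices a and b, given as (x, y, z) tuples, match.

    Assumed rule (not confirmed by the tool documentation): the planar
    distance must be within xy_tolerance and the absolute Z difference
    must be within z_tolerance.
    """
    planar_distance = math.hypot(a[0] - b[0], a[1] - b[1])
    z_difference = abs(a[2] - b[2])
    return planar_distance <= xy_tolerance and z_difference <= z_tolerance


# Two vertices about 0.014 units apart match under a 0.02 tolerance.
print(vertices_match((0.0, 0.0, 0.0), (0.01, 0.01, 0.0), xy_tolerance=0.02))
# A 0.03 offset exceeds the same tolerance.
print(vertices_match((0.0, 0.0, 0.0), (0.03, 0.0, 0.0), xy_tolerance=0.02))
```

With both tolerances left at zero, only vertices with exactly equal coordinates match under this assumed rule.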
Code Sample
The following Python window script demonstrates how to use the FindIdentical function in immediate mode.
import arcpy
# Find identical records based on a text field and a numeric field.
arcpy.FindIdentical_management("C:/data/fireincidents.shp", "C:/output/duplicate_incidents.dbf", ["ZONE", "INTENSITY"])
The following stand-alone script demonstrates how to use the FindIdentical tool to identify duplicate records of a table or feature class.
# Name: FindIdentical_Example2.py
# Description: Finds duplicate features in a dataset based on location (Shape field) and fire intensity
import arcpy
from arcpy import env
env.overwriteOutput = True
# Set workspace environment
env.workspace = "C:/data/findidentical.gdb"
try:
    # Set input feature class
    in_dataset = "fireincidents"
    # Set the fields upon which the matches are found
    fields = ["Shape", "INTENSITY"]
    # Set xy tolerance
    xy_tol = ".02 Meters"
    # Set output table
    out_table = "duplicate_incidents"
    # Execute Find Identical
    arcpy.FindIdentical_management(in_dataset, out_table, fields, xy_tol)
    print(arcpy.GetMessages())
except arcpy.ExecuteError:
    # Report geoprocessing error messages
    print(arcpy.GetMessages(2))
except Exception as ex:
    # Report any other error
    print(ex.args[0])
Demonstrates the use of the optional Output only duplicated records parameter. If the parameter is checked on the tool dialog box, or set to ONLY_DUPLICATES in a script, all unique records are removed from the output, keeping only the duplicates.
# Name: FindIdentical_Example3.py
# Description: Demonstrates the use of the optional parameter Output only duplicated records.
import arcpy
from arcpy import env
env.overwriteOutput = True
# Set workspace environment
env.workspace = "C:/data/redlands.gdb"
try:
    in_data = "crime"
    out_data = "crime_dups"
    # The XY Tolerance and Z Tolerance parameters are not used here,
    # so any optional parameter after them must be passed by keyword
    # (the parameter name followed by its value).
    arcpy.FindIdentical_management(in_data, out_data, ["Shape"], output_record_option="ONLY_DUPLICATES")
    print(arcpy.GetMessages())
except Exception as ex:
    print(arcpy.GetMessages(2))
    print(ex.args[0])
Reads the output of the Find Identical tool and groups identical records by FEAT_SEQ value.
import arcpy
from itertools import groupby
from operator import itemgetter
# Set workspace environment
arcpy.env.workspace = r"C:\data\redlands.gdb"
# Run Find Identical on feature geometry only.
result = arcpy.FindIdentical_management("parcels", "parcels_dups", ["Shape"])
# List of all output records as IN_FID and FEAT_SEQ pair - a list of lists
out_records = []
for row in arcpy.SearchCursor(result.getOutput(0), fields="IN_FID; FEAT_SEQ"):
    out_records.append([row.IN_FID, row.FEAT_SEQ])
# Sort the output records by FEAT_SEQ values
# Example of out_records = [[3, 1], [5, 3], [1, 1], [4, 3], [2, 2]]
out_records.sort(key = itemgetter(1))
# records after sorted by FEAT_SEQ: [[3, 1], [1, 1], [2, 2], [5, 3], [4, 3]]
# records with same FEAT_SEQ value will be in the same group (i.e., identical)
identicals_iter = groupby(out_records, itemgetter(1))
# now, make a list of identical groups - each group in a list.
# example identical groups: [[3, 1], [2], [5, 4]]
# i.e., IN_FID 3, 1 are identical, and 5, 4 are identical.
identical_groups = [[item[0] for item in data] for (key, data) in identicals_iter]
print(identical_groups)
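As an alternative to sorting and `groupby`, the same grouping can be sketched with a dictionary keyed by FEAT_SEQ. This variant is an illustration rather than part of the original sample. The helper name `group_identicals` is hypothetical, and it works on plain [IN_FID, FEAT_SEQ] pairs such as `out_records` above, so it needs no arcpy call.

```python
from collections import defaultdict


def group_identicals(out_records, only_duplicates=False):
    """Group IN_FID values by their FEAT_SEQ value.

    out_records: iterable of [IN_FID, FEAT_SEQ] pairs.
    Returns a list of IN_FID groups ordered by FEAT_SEQ.
    """
    groups = defaultdict(list)
    for in_fid, feat_seq in out_records:
        groups[feat_seq].append(in_fid)
    ordered = [groups[seq] for seq in sorted(groups)]
    if only_duplicates:
        # Drop groups that contain a single, nonidentical record.
        ordered = [group for group in ordered if len(group) > 1]
    return ordered


sample = [[3, 1], [5, 3], [1, 1], [4, 3], [2, 2]]
print(group_identicals(sample))
print(group_identicals(sample, only_duplicates=True))
```

The dictionary approach skips the explicit sort of the records and makes it easy to filter out single-record groups when only the duplicates are of interest.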