Working with shapefiles


Summary
This topic discusses differences in behavior between geodatabases and shapefiles that developers working with shapefiles should be aware of. It is not intended to be a comprehensive list; rather, it addresses the problems most commonly encountered by developers who are used to working with geodatabases.


Dataset and workspace limitations

When working with the Geodatabase application programming interface (API), shapefiles can often be used like feature classes. For example, to open a shapefile, developers can use the IFeatureWorkspace.OpenFeatureClass method and (with a couple of exceptions) the IFeatureClass interface can be used to work with a shapefile. Since shapefiles use dBASE (DBF) files to store their attributes, DBF files can be opened and used like tables (that is, using the ITable interface).
Beyond feature classes and tables, however, most dataset types are not supported by shapefile workspaces. For example, neither relationship classes nor feature datasets can be created in shapefile workspaces, and by extension, neither can datasets that require feature datasets (such as topologies and geometric networks). Network datasets are a notable exception to this rule.
Domains and rules are also not supported with shapefile workspaces. As such, the IValidation interface is not supported by shapefiles, so any validation performed by client code must be manual.
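As a rough illustration of what manual validation can look like, the following Python sketch checks attribute rows against rules held in client code. It is a plain-Python model, not ArcObjects code: the field names (ROAD_TYPE, LANES), the allowed values, and the validate_row function are all hypothetical and stand in for the domains a geodatabase would otherwise enforce.

```python
# Hypothetical stand-in for a geodatabase coded-value domain. A shapefile
# cannot store this rule, so it lives in client code instead.
ALLOWED_ROAD_TYPES = {"highway", "arterial", "local"}


def validate_row(row):
    """Return a list of human-readable problems found in one attribute row."""
    problems = []

    # Coded-value style check: the value must be one of a fixed set.
    road_type = row.get("ROAD_TYPE")
    if road_type not in ALLOWED_ROAD_TYPES:
        problems.append("ROAD_TYPE %r is not an allowed value" % (road_type,))

    # Range style check: the value must be an integer from 1 through 12.
    lanes = row.get("LANES")
    if not isinstance(lanes, int) or not 1 <= lanes <= 12:
        problems.append("LANES %r is outside the range 1-12" % (lanes,))

    return problems
```

Client code would run a check like this before writing each row, since nothing in the shapefile workspace will reject an invalid value.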
Developers should also be aware of the following limitations:
  • Alias and model names are not supported (this includes datasets as well as fields).
  • Length and area fields are not maintained for shapefiles.
  • Annotation feature classes and dimension feature classes are not supported.
  • QueryDefs are not supported.
  • Representation classes are not supported.
  • Dataset and workspace behavior cannot be customized using class extensions or workspace extensions.
  • Functionality related to the DistributedGDB library (for example, Extensible Markup Language [XML] import and export) is not supported.

ObjectIDs vs. feature IDs

Shapefiles always contain a feature ID (FID) field that can generally be used like an ObjectID. For example, methods such as IFeatureClass.GetFeature can be used with shapefiles by using FIDs as parameters, and selection sets created from shapefiles use FIDs in lieu of ObjectIDs. Like ObjectIDs, FIDs cannot be edited, and the value of IField.Type for FID fields returns a value of esriFieldTypeOID.
The major difference between ObjectIDs and FIDs is that whereas ObjectIDs are permanent identifiers of a feature or row in a geodatabase, FIDs simply represent the current (zero-based) position of a feature or a row in a shapefile. This means that a feature can have different FIDs from one session to the next, even if it has not been modified in any way.
Consider a shapefile with an integer field and three point features as shown in the following table:
FID    ID
0      0
1      1
2      2
If an edit session is started, the feature with an FID of 1 is deleted, and the edits are saved, the remaining features are renumbered and the shapefile's attribute table changes as shown in the following table:
FID    ID
0      0
1      2
When a workflow requires that features have a static identifier, use an integer field other than the FID field.
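The renumbering described above can be modeled in a few lines of Python. This is a simplified sketch, not ArcObjects code: the ShapefileModel class and its methods are hypothetical, rows are kept in a list, and the FID is treated as the list index. It also skips the edit session and applies the renumbering immediately, whereas the text describes it taking effect after edits are saved.

```python
class ShapefileModel:
    """Toy model: rows live in a list, and a row's FID is its list index."""

    def __init__(self, rows):
        self._rows = list(rows)

    def delete(self, fid):
        # Removing a row shifts every later row down by one position.
        del self._rows[fid]

    def table(self):
        # FIDs are derived from position on every read, never stored.
        return [(fid, row["ID"]) for fid, row in enumerate(self._rows)]

    def fid_for_id(self, stable_id):
        # Look a feature up by the user-managed integer field, not by FID.
        for fid, row in enumerate(self._rows):
            if row["ID"] == stable_id:
                return fid
        return None


shp = ShapefileModel([{"ID": 0}, {"ID": 1}, {"ID": 2}])
before = shp.table()   # (FID, ID) pairs before the delete
shp.delete(1)          # delete the feature whose FID is 1
after = shp.table()    # the feature with ID 2 now has FID 1
```

The ID field keeps identifying the same feature across the delete, which is why a separate integer field is the safer choice for a static identifier.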

M-awareness and z-awareness

Geodatabase feature classes have the option of using geometries with m-values, geometries with z-values, geometries with m- and z-values, or geometries with neither m- nor z-values. Shapefiles do not have the option of having only z-values (for a shapefile to have z-values, it must also have m-values).
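The rule above can be written as a small check. The following Python sketch is illustrative only: the function names are hypothetical, and it encodes nothing beyond the statement in this section that a shapefile with z-values must also have m-values.

```python
def shapefile_supports(has_z, has_m):
    """Return True if this z/m combination can be stored in a shapefile."""
    # Z-only is the one combination the rule excludes.
    return not (has_z and not has_m)


def shapefile_flags(has_z, has_m):
    """Return the (has_z, has_m) pair a shapefile would need to use."""
    # Requesting z-values forces m-values on as well.
    return (has_z, has_m or has_z)
```

A check like this is useful when converting a z-only geodatabase feature class, because the target shapefile ends up m-aware whether or not the source data carries m-values.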

Field and index restrictions

Shapefiles are restricted in the types of fields that can be used relative to a geodatabase. For example, the following field types are not supported:
  • Globally unique identifier (GUID)
  • GlobalID
  • Binary large object (BLOB)
  • Raster
Additionally, field names are restricted to 10 characters, and as previously mentioned, alias and model names are not supported.
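Before exporting a geodatabase schema to a shapefile, it can help to check the fields against these restrictions. The following Python sketch assumes fields are described as (name, type) pairs; the type labels are hypothetical strings chosen for this example, not the esriFieldType constants, and check_fields is not part of any Esri API.

```python
MAX_FIELD_NAME_LENGTH = 10

# Hypothetical labels for the unsupported field types listed above.
UNSUPPORTED_TYPES = {"GUID", "GlobalID", "BLOB", "Raster"}


def check_fields(fields):
    """Return a list of problems for an iterable of (name, type_name) pairs."""
    problems = []
    for name, type_name in fields:
        if len(name) > MAX_FIELD_NAME_LENGTH:
            problems.append(
                "%s: name exceeds %d characters" % (name, MAX_FIELD_NAME_LENGTH)
            )
        if type_name in UNSUPPORTED_TYPES:
            problems.append(
                "%s: field type %s is not supported" % (name, type_name)
            )
    return problems
```

For example, a field named POPULATION_2020 would be reported because its name is 15 characters long, and a field of type BLOB would be reported regardless of its name.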

DateTime values

DateTime field values in shapefiles are not true DateTime values; rather, they are simply Date values. If a DateTime value is stored in a shapefile DateTime field, the date portion of the value will be maintained correctly but the time value will not. To model DateTime values in shapefiles, create one or more additional fields to store the time portion of the value. For example, one field could be used to store the number of seconds since midnight or three fields could be used to store hours, minutes, and seconds separately.
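The seconds-since-midnight approach can be sketched as follows. This Python example only shows the arithmetic for splitting and rejoining a value; the function names are hypothetical, and writing the two parts to actual shapefile fields is left to the client code. Fractions of a second are dropped in this sketch.

```python
from datetime import datetime, time, timedelta


def split_datetime(value):
    """Split a datetime into (date, whole seconds since midnight)."""
    seconds = value.hour * 3600 + value.minute * 60 + value.second
    return value.date(), seconds


def join_datetime(day, seconds):
    """Rebuild a datetime from a date and a seconds-since-midnight count."""
    return datetime.combine(day, time.min) + timedelta(seconds=seconds)
```

The date part would go in the shapefile's date field and the seconds count in an additional integer field, so that the full value can be rebuilt when the row is read back.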

Null values

Null values are not supported by shapefiles. One approach to working around this is to represent nulls using values that would not typically occur in the data. For example, in a shapefile containing cities, a value of -9999 could be used to represent a null (unknown) population.
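A sentinel value only works if every reader and writer treats it consistently. The following Python sketch shows one way to keep that logic in one place, using the -9999 population example from the text. The function names are hypothetical, and the sketch assumes that -9999 never occurs as a real population value.

```python
NULL_POPULATION = -9999  # sentinel taken from the example above


def encode_population(value):
    """Convert an in-memory value (None for unknown) to its stored form."""
    return NULL_POPULATION if value is None else value


def decode_population(stored):
    """Convert a stored value back, mapping the sentinel to None."""
    return None if stored == NULL_POPULATION else stored


def mean_population(stored_values):
    """Average the known populations, skipping sentinel values."""
    known = [v for v in map(decode_population, stored_values) if v is not None]
    return sum(known) / len(known) if known else None
```

Decoding before any calculation matters: averaging the stored values directly would pull the result toward the sentinel instead of ignoring the unknown rows.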

Attribute index names

When an attribute index is created for a shapefile, the name of the index reflects the value assigned to the IIndexEdit.Name property at creation time as long as the shapefile remains open. Once the shapefile is out of memory and reopened, the index name no longer reflects the value of the Name property; instead, it has the same name as the field it was created on.






Development licensing          Deployment licensing
ArcGIS for Desktop Basic       ArcGIS for Desktop Basic
ArcGIS for Desktop Standard    ArcGIS for Desktop Standard
ArcGIS for Desktop Advanced    ArcGIS for Desktop Advanced
Engine Developer Kit           Engine