Feature class basics

Feature classes are homogeneous collections of common features, each having the same spatial representation, such as points, lines, or polygons, and a common set of attribute columns, for example, a line feature class for representing road centerlines. The four most commonly used feature classes are points, lines, polygons, and annotation (the geodatabase name for map text).

In the illustration below, these are used to represent four datasets for the same area: (1) manhole cover locations as points, (2) sewer lines, (3) parcel polygons, and (4) street name annotation.

The four most commonly used feature classes in the geodatabase

In this diagram, you might also have noted the potential requirement to model some advanced feature properties. For example, the sewer lines and manhole locations make up a storm sewer network, a system with which you can model runoff and flows. Also, note how adjacent parcels share common boundaries. Most parcel users want to maintain the integrity of shared feature boundaries in their datasets using a topology.

As mentioned earlier, users often need to model such spatial relationships and behaviors in their geographic datasets. In these cases, you can extend these basic feature classes by adding a number of advanced geodatabase elements, such as topologies, network datasets, terrains, and address locators.

You can learn more about adding such advanced behaviors to your geodatabases in Extending feature classes.

Types of feature classes

Vector features (geographic objects with vector geometry) are versatile and frequently used geographic data types, well suited for representing features with discrete boundaries, such as streets, states, and parcels. A feature is an object that stores its geographic representation, which is typically a point, line, or polygon, as one of its properties (or fields) in the row. In ArcGIS, feature classes are homogeneous collections of features with a common spatial representation and set of attributes stored in a database table, for example, a line feature class for representing road centerlines.

NoteNote:

When creating a feature class, you'll be asked to set the type of features to define the type of feature class (point, line, polygon, and so forth).

Generally, feature classes are thematic collections of points, lines, or polygons, but there are seven feature class types. The first three are supported in databases and geodatabases. The last four are only supported in geodatabases.

Feature geometry and feature coordinates

Feature classes contain both the geometric shape of each feature as well as descriptive attributes. Each feature geometry is primarily defined by its feature type (point, line, or polygon). But additional geometric properties can also be defined. For example, features can be single part or multipart, have 3D vertices, have linear measures (called m-values), and contain parametrically defined curves. This section provides a short overview of these capabilities.

Single-part and multipart lines and polygons

Line and polygon feature classes can be composed of single parts or multiple parts. For example, a state can contain multiple parts (Hawaii's islands) but is considered to be a single state feature.

Single-part and multipart lines and polygons

Vertices, segments, elevation, and measurements

Feature geometry is primarily composed of coordinate vertices. Segments in lines and polygon features span vertices. Segments can be straight edges or parametrically defined curves. Vertices in features can also include z-values to represent elevation measures and m-values to represent measurements along line features.

Feature geometry defined both by coordinates and segments

Segment types in line and polygon features

Lines and polygons are defined by two key elements: an ordered list of vertices that define the shape of the line or polygon and the types of line segments used between each pair of vertices. Each line and polygon can be thought of as an ordered set of vertices that can be connected to form the geometric shape. Another way to express each line and polygon is as an ordered series of connected segments where each segment has a type: straight line, circular arc, elliptical arc, or Bézier curve.

Parcel feature boundaries with a mix of straight line and curved segments

The default segment type is a straight line between two vertices. However, when you need to define curves or parametric shapes, you have three additional segment types that can be defined: circular arcs, elliptical arcs, and Bézier curves. These shapes are often used for representing built environments such as parcel boundaries and roadways.

Vertical measurements using z-values

Feature coordinates can include x,y and x,y,z vertices. Z-values are most commonly used to represent elevations, but they can represent other measurements such as annual rainfall or air quality.

Elevation measures (x,y,z)

Features can have x,y coordinates and, optionally, added z-elevation values.

Linear measurements using m-values

Linear feature vertices can also include m-values. Some GIS applications employ a linear measurement system used to interpolate distances along linear features, such as roads, streams, and pipelines. You can assign an m-value to each vertex in a feature. A commonly used example is a highway milepost measurement system used by departments of transportation for recording pavement conditions, speed limits, accident locations, and other incidents along highways. Two commonly used units of measure are milepost distance from a set location, such as a county line, and distance from a reference marker.

Coordinate systems for linear referencing include m's—(x,y,m) or (x,y,z,m)

Vertices for measurements can be either x,y,m or x,y,z,m.

Support for these data types is often referred to as linear referencing. The process of geolocating events that occur along these measurement systems is referred to as dynamic segmentation.

Measured coordinates form the building blocks for these systems. In the linear referencing implementation in ArcGIS, the term route refers to any linear feature, such as a city street, highway, river, or pipe, that has a unique identifier and a common measurement system along each linear feature. A collection of routes with a common measurement system can be built on a line feature class as follows:

Line feature class holding routes with measured coordinates and a route identifier for each feature

See An overview of linear referencing for more information.

Feature tolerances

Locational accuracy and support for a high-precision data management framework are critical in GIS data management. A key requirement is the ability to store coordinate information with enough precision. The precision of a coordinate describes the number of digits that is used to record the location. This defines the resolution at which spatial data is collected and managed.

Since geodatabases and databases can record high-precision coordinates, users can build datasets with high accuracy levels and with greater resolution as data capture tools and sensors improve over time (data entry from survey and civil engineering, cadastral and COGO data capture, increased imagery resolution, lidar, building plans from CAD, and so on).

ArcGIS records coordinates using integer numbers and can handle locations with very high precision. In various ArcGIS operations, feature coordinates are processed and managed using some key geometric properties. These properties are defined during the creation of each feature class or feature dataset.

The following geometric properties help to define coordinate resolution and processing tolerances used in various spatial processing and geometric operations:

X,y resolution

The x,y resolution of a feature class or a feature dataset is the numeric precision used to store the x,y coordinate values. Precision is important for accurate feature representation, analysis, and mapping.

The x,y resolution defines the number of decimal places or significant digits used to store feature coordinates (in both x and y). You can think of the resolution as defining a very fine grid mesh onto which all coordinates are snapped. Coordinate values are actually stored and operated on as integers in ArcGIS. Therefore, sometimes this grid mesh is referred to as an integer grid or coordinate grid.

The resolution defines the distance between the mesh in a coordinate grid onto which all coordinates fit. The x,y resolution is expressed in the units of the data (based on its coordinate system), such as in state plane feet, UTM meters, or Albers meters.

The default x,y resolution for feature classes is 0.0001 meters or its equivalent in the units of the dataset's coordinate system. For example, if a feature class is stored in state plane feet, the default precision will be 0.0003281 feet (0.003937 inches). If coordinates are in latitude-longitude, the default x,y resolution is 0.000000001 degrees.

The graphic below provides a conceptual view of a coordinate grid onto which all coordinate values snap to the grid mesh. The grid covers the extent of each dataset. The fineness of this mesh (the distance between the lines in the grid) is defined by the x,y resolution, which is very small.

X,y resolution grid mesh

If necessary, you can override the default x,y resolution value and set another for each feature class or feature dataset. Setting a smaller x,y resolution value can potentially increase data storage and processing time of datasets compared with those using larger values for x,y resolution.

X,y tolerance

When you create a feature class, you are asked to set the x,y tolerance. The x,y tolerance is used to set the minimum distance between coordinates in clustering operations, such as topology validation, buffer generation, and polygon overlay, as well as in some editing operations.

Feature processing operations are influenced by the x,y tolerance, which determines the minimum distance separating all feature coordinates (nodes and vertices) during those operations. By definition, it also defines the distance a coordinate can move in x or y (or both) during clustering operations.

The x,y tolerance is an extremely small distance (the default is 0.001 meters in on-the-ground units). It is used to resolve inexact intersection locations of coordinates during clustering operations. When processing feature classes using geometry operations, coordinates whose x distance and y distance are within the x,y tolerance of each other are considered to be coincident (in other words, share the same x,y location). Thus, the clustered coordinates are moved to a common location.

X,y tolerance used to match coordinates that are coincident (within the tolerance of each other)

Typically, the less accurate coordinate is moved to the location of the more accurate coordinate, or a new location is computed as a weighted average distance between the coordinates in the cluster. In these cases, the weighted average distance is based on the accuracy ranks of the clustered coordinates.

For more information about how accuracy ranks are set for each feature class, see Topology in ArcGIS.

The clustering process works by moving across the map and identifying clusters of coordinates that fall within the x,y tolerance of one another. ArcGIS uses this algorithm to discover, clean up, and manage shared geometry between features. This means that coordinates are deemed to be coincident (and are snapped to the same shared coordinate location). This is fundamental to many GIS operations and concepts. For example, see An overview of topology in ArcGIS.

The maximum distance a coordinate could move to its new location during such operations is the square root of 2 times the x,y tolerance. The clustering algorithm is iterative, so it is possible in some cases for coordinate locations to shift more than this distance.

The default x,y tolerance is set to 0.001 meters or its equivalent in the units of the dataset's real-world coordinate system (in other words, 0.001 meters on the ground). For example, if your coordinate system is recorded in state plane feet, the default x,y tolerance is 0.003281 feet (0.03937 inches).

X,y tolerance = 10 times the resolution

The default value for the x,y tolerance is 10 times the default x,y resolution, and this is recommended for most cases. You have the option to set a larger tolerance value for data that has less coordinate accuracy or a smaller value for a dataset with extremely high accuracy.

It is important to note that the x,y tolerance is not intended to be used to generalize geometry shapes. Instead, it's intended to integrate line work and boundaries during topological operations. That means integrating coordinates that fall within very small distances of one another. Because coordinates can move in both x and y by as much as the x,y tolerance, many potential problems can be resolved by processing datasets with commands that use the x,y tolerance. These include handling of extremely small overshoots or undershoots, automatic sliver removal of duplicate segments, and coordinate thinning along boundary lines.

Here are some useful tips:

  • Generally, you can use an x,y tolerance that is 10 times x,y resolution and expect good results.
  • To keep coordinate movement small, keep the x,y tolerance small. However, an x,y tolerance that is too small (such as 3 times the x,y resolution or less) may not properly integrate the line work of coincident boundaries and coordinates.
  • Conversely, if your x,y tolerance is too large, feature coordinates may collapse on one another. This can compromise the accuracy of feature boundary representations.
  • Your x,y tolerance should never approach your data capture resolution. For example, at a map scale of 1:12,000, 1 inch equals 1,000 feet, and 1/50 of an inch equals 20 feet. You'll want to keep the coordinate movement using the x,y tolerance well under these numbers. Remember, the default x,y tolerance in this case would be 0.0003281 feet, which is a very reasonable default value for x,y tolerance; in fact, it is best to use the default x,y tolerance values in all but extreme cases.
  • In topologies, you can set the coordinate rank of each feature class. You'll want to set the coordinate rank of your most accurate features (for instance, surveyed features) to 1 and less accurate features to 2, 3, and so on, in descending levels of accuracy. This will cause other feature coordinates with a higher accuracy rank number (and therefore, a lower coordinate accuracy) to be adjusted to the more accurate features with a lower rank number.

Feature class storage

Each feature class is managed in a single table. A shape column in each row is used to hold the geometry or shape of each feature.

Feature classes stored as tables with a shape column

In the feature class table, the following are true:

If you create a line feature class in a geodatabase, an additional field is added to the feature class automatically to record the length of the line. If you create a polygon feature class, two additional fields are added automatically to record the length (perimeter) and area of each polygon feature. The units of measure for these values depends on the spatial reference defined for the feature class. The names of these fields vary depending on the database and spatial type you use. These are required fields and cannot be modified.

Extending feature classes

Each feature class is a collection of geographic features with the same geometry type (point, line, or polygon), the same attributes, and the same spatial reference. Feature classes stored in geodatabases can be extended as needed to achieve a number of objectives. Here are some of the ways that you can extend feature classes using the geodatabase and why.

Working with feature classes in the geodatabase

Use

If you need to

Feature dataset

Hold a collection of spatially related feature classes, or build topologies, networks, cadastral datasets, and terrains.

Subtypes

Manage a set of feature subclasses in a single feature class. This is often used on feature class tables to manage different behaviors on subsets of the same feature type.

Attribute domains

Specify a list of valid values or a range of valid values for attribute columns. Use domains to help ensure the integrity of attribute values. Domains are often used to enforce data classifications (such as road class, zoning codes, and land-use classifications).

Relationship classes

Build relationships between feature classes and other tables using a common key. For example, find the related rows in a second table based on rows selected in the feature class.

Topology

Model how features share geometry. For example, adjacent counties share a common boundary. Also, county polygons nest within and completely cover states.

Network dataset

Model transportation connectivity and flow. You must have the ArcGIS Network Analyst extension to ArcGIS for Desktop installed.

Geometric networks

Model utility networks and tracing.

Terrain dataset

Model triangulated irregular networks (TINs) and manage large lidar and sonar point collections. You must have the ArcGIS 3D Analyst extension to ArcGIS for Desktop installed.

Address locator

Geocode addresses.

Parcel fabric

Integrate and maintain survey information for subdivisions and parcel plans as part of a continuous parcel fabric data model in the geodatabase. Also, make incremental accuracy improvements of the parcel fabric as new subdivision plans and parcel descriptions are entered.

Linear referencing

Locate events along linear features with measurements.

Cartographic representations

Manage multiple cartographic representations and advanced cartographic drawing rules.

Versioning

Manage a number of key GIS workflows for data management; for example, support long update transactions, historical archives, and multiuser editing. It requires the use of ArcSDE geodatabases.

Working with feature classes in the geodatabase
7/30/2013