Feature class basics
Feature classes are homogeneous collections of common features, each having the same spatial representation, such as points, lines, or polygons, and a common set of attribute columns. For example, a line feature class can represent road centerlines. The four most commonly used feature classes are points, lines, polygons, and annotation (the geodatabase name for map text).
In the illustration below, these are used to represent four datasets for the same area: (1) manhole cover locations as points, (2) sewer lines, (3) parcel polygons, and (4) street name annotation.
In this diagram, you might also have noted the potential requirement to model some advanced feature properties. For example, the sewer lines and manhole locations make up a storm sewer network, a system with which you can model runoff and flows. Also, note how adjacent parcels share common boundaries. Most parcel users want to maintain the integrity of shared feature boundaries in their datasets using a topology.
As mentioned earlier, users often need to model such spatial relationships and behaviors in their geographic datasets. In these cases, you can extend these basic feature classes by adding a number of advanced geodatabase elements, such as topologies, network datasets, terrains, and address locators.
You can learn more about adding such advanced behaviors to your geodatabases in Extending feature classes.
Types of feature classes
Vector features (geographic objects with vector geometry) are versatile and frequently used geographic data types, well suited for representing features with discrete boundaries, such as streets, states, and parcels. A feature is an object that stores its geographic representation, which is typically a point, line, or polygon, as one of its properties (or fields) in the row. In ArcGIS, feature classes are homogeneous collections of features with a common spatial representation and set of attributes stored in a database table, for example, a line feature class for representing road centerlines.
When creating a feature class, you'll be asked to set the type of features to define the type of feature class (point, line, polygon, and so forth).
Generally, feature classes are thematic collections of points, lines, or polygons, but there are seven feature class types. The first three are supported in databases and geodatabases. The last four are only supported in geodatabases.
- Points: Features that are too small to represent as lines or polygons as well as point locations (such as GPS observations).
- Lines: Represent the shape and location of geographic objects, such as street centerlines and streams, too narrow to depict as areas. Lines are also used to represent features that have length but no area, such as contour lines and boundaries.
- Polygons: A set of many-sided area features that represents the shape and location of homogeneous feature types such as states, counties, parcels, soil types, and land-use zones.
- Annotation: Map text including properties for how the text is rendered. For example, in addition to the text string of each annotation, other properties are included such as the shape points for placing the text, its font and point size, and other display properties. Annotation can also be feature linked and can contain subclasses.
- Dimensions: A special kind of annotation that shows specific lengths or distances, for example, to indicate the length of a side of a building or land parcel boundary or the distance between two features. Dimensions are heavily used in design, engineering, and facilities applications for GIS.
- Multipoints: Features that are composed of more than one point. Multipoints are often used to manage arrays of very large point collections, such as lidar point clusters, which can contain literally billions of points. Using a single row for such point geometry is not feasible. Clustering these into multipoint rows enables the geodatabase to handle massive point sets.
- Multipatches: A 3D geometry used to represent the outer surface, or shell, of features that occupy a discrete area or volume in three-dimensional space. Multipatches comprise planar 3D rings and triangles that are used in combination to model a three-dimensional shell. Multipatches can be used to represent anything from simple objects, such as spheres and cubes, to complex objects, such as iso-surfaces and buildings.
Feature geometry and feature coordinates
Feature classes contain both the geometric shape of each feature as well as descriptive attributes. Each feature geometry is primarily defined by its feature type (point, line, or polygon). But additional geometric properties can also be defined. For example, features can be single part or multipart, have 3D vertices, have linear measures (called m-values), and contain parametrically defined curves. This section provides a short overview of these capabilities.
Single-part and multipart lines and polygons
Line and polygon feature classes can be composed of single parts or multiple parts. For example, a state can contain multiple parts (Hawaii's islands) but is considered to be a single state feature.
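As a rough illustration of the multipart idea, the sketch below models one polygon feature as a list of parts, where each part is a ring of x,y vertices. The class name, field names, and coordinates are hypothetical and are not part of any ArcGIS API; the point is only that several parts still make up a single feature (a single row).

```python
from dataclasses import dataclass


@dataclass
class PolygonFeature:
    """One row in a hypothetical polygon feature class."""
    object_id: int
    name: str
    parts: list  # each part is a ring: a list of (x, y) vertices


def ring_area(ring):
    """Unsigned area of a simple ring, using the shoelace formula."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def feature_area(feature):
    """Area of the whole feature: the sum over all of its parts."""
    return sum(ring_area(part) for part in feature.parts)


# A single feature made of two separate unit squares (two "islands").
state = PolygonFeature(
    object_id=1,
    name="Example State",
    parts=[
        [(0, 0), (1, 0), (1, 1), (0, 1)],
        [(3, 0), (4, 0), (4, 1), (3, 1)],
    ],
)
print(len(state.parts), feature_area(state))
```

The area is reported once for the feature as a whole, which mirrors how a multipart state is treated as one feature with one set of attributes.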
Vertices, segments, elevation, and measurements
Feature geometry is primarily composed of coordinate vertices. Segments in lines and polygon features span vertices. Segments can be straight edges or parametrically defined curves. Vertices in features can also include z-values to represent elevation measures and m-values to represent measurements along line features.
Segment types in line and polygon features
Lines and polygons are defined by two key elements: an ordered list of vertices that define the shape of the line or polygon and the types of line segments used between each pair of vertices. Each line and polygon can be thought of as an ordered set of vertices that can be connected to form the geometric shape. Another way to express each line and polygon is as an ordered series of connected segments where each segment has a type: straight line, circular arc, elliptical arc, or Bézier curve.
The default segment type is a straight line between two vertices. However, when you need to define curves or parametric shapes, you have three additional segment types that can be defined: circular arcs, elliptical arcs, and Bézier curves. These shapes are often used for representing built environments such as parcel boundaries and roadways.
Vertical measurements using z-values
Feature coordinates can include x,y and x,y,z vertices. Z-values are most commonly used to represent elevations, but they can represent other measurements such as annual rainfall or air quality.
Features can have x,y coordinates and, optionally, added z-elevation values.
Linear measurements using m-values
Linear feature vertices can also include m-values. Some GIS applications employ a linear measurement system used to interpolate distances along linear features, such as roads, streams, and pipelines. You can assign an m-value to each vertex in a feature. A commonly used example is a highway milepost measurement system used by departments of transportation for recording pavement conditions, speed limits, accident locations, and other incidents along highways. Two commonly used units of measure are milepost distance from a set location, such as a county line, and distance from a reference marker.
Vertices for measurements can be either x,y,m or x,y,z,m.
Support for these data types is often referred to as linear referencing. The process of geolocating events that occur along these measurement systems is referred to as dynamic segmentation.
Measured coordinates form the building blocks for these systems. In the linear referencing implementation in ArcGIS, the term route refers to any linear feature, such as a city street, highway, river, or pipe, that has a unique identifier and a common measurement system along each linear feature. A collection of routes with a common measurement system can be built on a line feature class.
See An overview of linear referencing for more information.
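To make the role of m-values more concrete, here is a minimal sketch of locating an event along a route by linear interpolation between the two vertices whose measures bracket it. It assumes a route stored as a list of (x, y, m) vertices with non-decreasing m-values; the function name and sample route are hypothetical, and this is not the ArcGIS dynamic segmentation implementation.

```python
def locate_along(route, measure):
    """Return the (x, y) location at the given measure along a route.

    route is a list of (x, y, m) vertices with non-decreasing m-values.
    """
    for (x0, y0, m0), (x1, y1, m1) in zip(route, route[1:]):
        if m0 <= measure <= m1:
            if m1 == m0:
                # Degenerate segment: both vertices carry the same measure.
                return (float(x0), float(y0))
            t = (measure - m0) / (m1 - m0)
            return (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
    raise ValueError("measure is outside the route's measure range")


# Hypothetical route: 10 units east, then 10 units north, measured 0 to 20.
route = [(0, 0, 0), (10, 0, 10), (10, 10, 20)]
print(locate_along(route, 5))
print(locate_along(route, 15))
```

An event recorded only as "route X, measure 15" can then be placed on the map without storing its own x,y coordinates.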
Feature tolerances
Locational accuracy and support for a high-precision data management framework are critical in GIS data management. A key requirement is the ability to store coordinate information with enough precision. The precision of a coordinate describes the number of digits that is used to record the location. This defines the resolution at which spatial data is collected and managed.
Since geodatabases and databases can record high-precision coordinates, users can build datasets with high accuracy levels and with greater resolution as data capture tools and sensors improve over time (data entry from survey and civil engineering, cadastral and COGO data capture, increased imagery resolution, lidar, building plans from CAD, and so on).
ArcGIS records coordinates using integer numbers and can handle locations with very high precision. In various ArcGIS operations, feature coordinates are processed and managed using some key geometric properties. These properties are defined during the creation of each feature class or feature dataset.
The following geometric properties help to define coordinate resolution and processing tolerances used in various spatial processing and geometric operations:
- X,y resolution: The precision with which coordinates within a feature class are recorded
- X,y tolerance: A cluster tolerance used to cluster features with coincident geometry; used in topology, feature overlay, and related operations
- Z-tolerance and z-resolution: The tolerance and resolution properties for the vertical coordinate dimension in 3D datasets (for example, an elevation measure)
- M-tolerance and m-resolution: The tolerance and resolution properties for measures along line features used in linear referencing datasets (for example, the distance along a road in meters)
X,y resolution
The x,y resolution of a feature class or a feature dataset is the numeric precision used to store the x,y coordinate values. Precision is important for accurate feature representation, analysis, and mapping.
The x,y resolution defines the number of decimal places or significant digits used to store feature coordinates (in both x and y). You can think of the resolution as defining a very fine grid mesh onto which all coordinates are snapped. Coordinate values are actually stored and operated on as integers in ArcGIS. Therefore, sometimes this grid mesh is referred to as an integer grid or coordinate grid.
The resolution defines the distance between the mesh in a coordinate grid onto which all coordinates fit. The x,y resolution is expressed in the units of the data (based on its coordinate system), such as in state plane feet, UTM meters, or Albers meters.
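The grid-mesh idea can be sketched as follows. This is a conceptual simplification, assuming a hypothetical grid origin of 0.0 and a resolution of 0.0001 units; it does not reproduce how ArcGIS actually encodes coordinates internally.

```python
def to_grid(value, origin, resolution):
    """Integer index of the grid line nearest to a coordinate value."""
    return round((value - origin) / resolution)


def from_grid(index, origin, resolution):
    """Coordinate value of a given integer grid index."""
    return origin + index * resolution


RESOLUTION = 0.0001  # distance between grid lines, in dataset units
ORIGIN = 0.0         # hypothetical grid origin

x = 123.45678
ix = to_grid(x, ORIGIN, RESOLUTION)          # stored as an integer
snapped = from_grid(ix, ORIGIN, RESOLUTION)  # value read back from the grid
print(ix, snapped)
```

In this model, a coordinate never moves by more than half the resolution when it is snapped to the grid.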
The default x,y resolution for feature classes is 0.0001 meters or its equivalent in the units of the dataset's coordinate system. For example, if a feature class is stored in state plane feet, the default x,y resolution will be 0.0003281 feet (0.003937 inches). If coordinates are in latitude-longitude, the default x,y resolution is 0.000000001 degrees.
The graphic below provides a conceptual view of a coordinate grid onto which all coordinate values snap to the grid mesh. The grid covers the extent of each dataset. The fineness of this mesh (the distance between the lines in the grid) is defined by the x,y resolution, which is very small.
If necessary, you can override the default x,y resolution value and set another for each feature class or feature dataset. Setting a smaller x,y resolution value can potentially increase data storage and processing time of datasets compared with those using larger values for x,y resolution.
X,y tolerance
When you create a feature class, you are asked to set the x,y tolerance. The x,y tolerance is used to set the minimum distance between coordinates in clustering operations, such as topology validation, buffer generation, and polygon overlay, as well as in some editing operations.
Feature processing operations are influenced by the x,y tolerance, which determines the minimum distance separating all feature coordinates (nodes and vertices) during those operations. By definition, it also defines the distance a coordinate can move in x or y (or both) during clustering operations.
The x,y tolerance is an extremely small distance (the default is 0.001 meters in on-the-ground units). It is used to resolve inexact intersection locations of coordinates during clustering operations. When processing feature classes using geometry operations, coordinates whose x distance and y distance are within the x,y tolerance of each other are considered to be coincident (in other words, share the same x,y location). Thus, the clustered coordinates are moved to a common location.
Typically, the less accurate coordinate is moved to the location of the more accurate coordinate, or a new location is computed as a weighted average distance between the coordinates in the cluster. In these cases, the weighted average distance is based on the accuracy ranks of the clustered coordinates.
For more information about how accuracy ranks are set for each feature class, see Topology in ArcGIS.
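The coincidence test and the rank-based adjustment described above can be sketched for a single pair of coordinates. This is a deliberately simplified illustration, not the ArcGIS clustering algorithm: it handles only two points, treats a distance exactly equal to the tolerance as coincident, and uses a plain midpoint when ranks are equal. All function names are hypothetical.

```python
def is_coincident(a, b, xy_tolerance):
    """True if both the x distance and the y distance are within tolerance."""
    return (abs(a[0] - b[0]) <= xy_tolerance
            and abs(a[1] - b[1]) <= xy_tolerance)


def cluster_pair(a, rank_a, b, rank_b, xy_tolerance):
    """Return the adjusted locations of two coordinates.

    A lower rank number means higher accuracy, so the coordinate with the
    higher rank number moves to the other one. Equal ranks meet halfway.
    """
    if not is_coincident(a, b, xy_tolerance):
        return a, b
    if rank_a < rank_b:
        return a, a
    if rank_b < rank_a:
        return b, b
    mid = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
    return mid, mid


surveyed = (0.0, 0.0)         # rank 1 (more accurate)
digitized = (0.0005, 0.0005)  # rank 2 (less accurate)
print(cluster_pair(surveyed, 1, digitized, 2, 0.001))
```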
The clustering process works by moving across the map and identifying clusters of coordinates that fall within the x,y tolerance of one another. ArcGIS uses this algorithm to discover, clean up, and manage shared geometry between features. This means that coordinates are deemed to be coincident (and are snapped to the same shared coordinate location). This is fundamental to many GIS operations and concepts. For example, see An overview of topology in ArcGIS.
The maximum distance a coordinate could move to its new location during such operations is the square root of 2 times the x,y tolerance. The clustering algorithm is iterative, so it is possible in some cases for coordinate locations to shift more than this distance.
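The square-root-of-2 factor follows from a coordinate moving by the full tolerance in both x and y at once, which is the diagonal of a square whose side is the x,y tolerance. A small sketch of that arithmetic, with a hypothetical function name, covering a single clustering step only:

```python
import math


def max_single_step_shift(xy_tolerance):
    """Diagonal distance when a coordinate moves by the tolerance in x and y."""
    return math.hypot(xy_tolerance, xy_tolerance)


# With the default tolerance of 0.001 meters this is roughly 0.0014 meters.
print(max_single_step_shift(0.001))
```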
The default x,y tolerance is set to 0.001 meters or its equivalent in the units of the dataset's real-world coordinate system (in other words, 0.001 meters on the ground). For example, if your coordinate system is recorded in state plane feet, the default x,y tolerance is 0.003281 feet (0.03937 inches).
The default value for the x,y tolerance is 10 times the default x,y resolution, and this is recommended for most cases. You have the option to set a larger tolerance value for data that has less coordinate accuracy or a smaller value for a dataset with extremely high accuracy.
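The default values quoted in feet and inches can be checked with a short conversion. The sketch assumes the international foot (0.3048 meters); many state plane systems use the US survey foot instead, but the difference does not show up at the number of digits quoted here.

```python
METERS_PER_FOOT = 0.3048  # international foot (an assumption, see above)
INCHES_PER_FOOT = 12

DEFAULT_XY_RESOLUTION_M = 0.0001
DEFAULT_XY_TOLERANCE_M = 0.001


def meters_to_feet(meters):
    return meters / METERS_PER_FOOT


for label, meters in (("x,y resolution", DEFAULT_XY_RESOLUTION_M),
                      ("x,y tolerance", DEFAULT_XY_TOLERANCE_M)):
    feet = meters_to_feet(meters)
    inches = feet * INCHES_PER_FOOT
    print(f"{label}: {feet:.7f} ft ({inches:.6f} in)")
```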
It is important to note that the x,y tolerance is not intended to be used to generalize geometry shapes. Instead, it's intended to integrate line work and boundaries during topological operations. That means integrating coordinates that fall within very small distances of one another. Because coordinates can move in both x and y by as much as the x,y tolerance, many potential problems can be resolved by processing datasets with commands that use the x,y tolerance. These include handling of extremely small overshoots or undershoots, automatic sliver removal of duplicate segments, and coordinate thinning along boundary lines.
Here are some useful tips:
- Generally, you can use an x,y tolerance that is 10 times x,y resolution and expect good results.
- To keep coordinate movement small, keep the x,y tolerance small. However, an x,y tolerance that is too small (such as 3 times the x,y resolution or less) may not properly integrate the line work of coincident boundaries and coordinates.
- Conversely, if your x,y tolerance is too large, feature coordinates may collapse on one another. This can compromise the accuracy of feature boundary representations.
- Your x,y tolerance should never approach your data capture resolution. For example, at a map scale of 1:12,000, 1 inch equals 1,000 feet, and 1/50 of an inch equals 20 feet. You'll want to keep the coordinate movement using the x,y tolerance well under these numbers. Remember, the default x,y tolerance in this case would be 0.003281 feet, which is a very reasonable default value for x,y tolerance; in fact, it is best to use the default x,y tolerance values in all but extreme cases.
- In topologies, you can set the coordinate rank of each feature class. You'll want to set the coordinate rank of your most accurate features (for instance, surveyed features) to 1 and less accurate features to 2, 3, and so on, in descending levels of accuracy. This will cause other feature coordinates with a higher accuracy rank number (and therefore, a lower coordinate accuracy) to be adjusted to the more accurate features with a lower rank number.
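The map-scale arithmetic in the tips above can be reproduced with a short calculation. The function name is hypothetical; the figures are the ones used in the 1:12,000 example, compared against the default x,y tolerance of 0.003281 feet.

```python
def ground_feet(map_inches, scale_denominator):
    """Ground distance in feet for a distance measured on the map in inches."""
    return map_inches * scale_denominator / 12.0


SCALE = 12000
DEFAULT_XY_TOLERANCE_FT = 0.003281

one_inch = ground_feet(1, SCALE)           # ground feet per map inch
fiftieth_inch = ground_feet(1 / 50, SCALE)  # ground feet per 1/50 map inch
print(one_inch, fiftieth_inch)

# How many times smaller the default tolerance is than 1/50 inch on the map.
print(fiftieth_inch / DEFAULT_XY_TOLERANCE_FT)
```

The default tolerance is several thousand times smaller than the 20-foot figure, which is why it rarely needs to be changed.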
Feature class storage
Each feature class is managed in a single table. A shape column in each row is used to hold the geometry or shape of each feature.
In the feature class table, the following are true:
- Each feature class is a table.
- Individual features are held as rows.
- Feature attributes are recorded in columns.
- The shape column holds each feature's geometry (point, line, polygon, and so forth).
- The ObjectID column holds the unique identifier for each feature.
If you create a line feature class in a geodatabase, an additional field is added to the feature class automatically to record the length of the line. If you create a polygon feature class, two additional fields are added automatically to record the length (perimeter) and area of each polygon feature. The units of measure for these values depend on the spatial reference defined for the feature class. The names of these fields vary depending on the database and spatial type you use. These are required fields and cannot be modified.
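The table layout described in this section can be pictured with a small in-memory sketch of a line feature class. The field names (OBJECTID, SHAPE, SHAPE_LENGTH, ROAD_NAME) and the sample rows are hypothetical, since the actual names vary by database and spatial type; the length field is computed here only to show that it is derived from the geometry.

```python
import math


def polyline_length(vertices):
    """Sum of straight-segment lengths between consecutive (x, y) vertices."""
    return sum(math.hypot(x1 - x0, y1 - y0)
               for (x0, y0), (x1, y1) in zip(vertices, vertices[1:]))


# Each row is one feature; each key is one column.
rows = [
    {"OBJECTID": 1, "SHAPE": [(0, 0), (3, 4)], "ROAD_NAME": "Main St"},
    {"OBJECTID": 2, "SHAPE": [(0, 0), (0, 2), (2, 2)], "ROAD_NAME": "Oak Ave"},
]

# The length column is maintained from the geometry, not entered by hand.
for row in rows:
    row["SHAPE_LENGTH"] = polyline_length(row["SHAPE"])

for row in rows:
    print(row["OBJECTID"], row["ROAD_NAME"], row["SHAPE_LENGTH"])
```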
Extending feature classes
Each feature class is a collection of geographic features with the same geometry type (point, line, or polygon), the same attributes, and the same spatial reference. Feature classes stored in geodatabases can be extended as needed to achieve a number of objectives. Here are some of the ways that you can extend feature classes using the geodatabase and why.
| Use | If you need to |
|---|---|
| Feature datasets | Hold a collection of spatially related feature classes, or build topologies, networks, cadastral datasets, and terrains. |
| Subtypes | Manage a set of feature subclasses in a single feature class. This is often used on feature class tables to manage different behaviors on subsets of the same feature type. |
| Attribute domains | Specify a list of valid values or a range of valid values for attribute columns. Use domains to help ensure the integrity of attribute values. Domains are often used to enforce data classifications (such as road class, zoning codes, and land-use classifications). |
| Relationship classes | Build relationships between feature classes and other tables using a common key. For example, find the related rows in a second table based on rows selected in the feature class. |
| Topology | Model how features share geometry. For example, adjacent counties share a common boundary. Also, county polygons nest within and completely cover states. |
| Network datasets | Model transportation connectivity and flow. You must have the ArcGIS Network Analyst extension to ArcGIS for Desktop installed. |
| Geometric networks | Model utility networks and tracing. |
| Terrains | Model triangulated irregular networks (TINs) and manage large lidar and sonar point collections. You must have the ArcGIS 3D Analyst extension to ArcGIS for Desktop installed. |
| Address locators | Geocode addresses. |
| Parcel fabric | Integrate and maintain survey information for subdivisions and parcel plans as part of a continuous parcel fabric data model in the geodatabase. Also, make incremental accuracy improvements of the parcel fabric as new subdivision plans and parcel descriptions are entered. |
| Linear referencing | Locate events along linear features with measurements. |
| Cartographic representations | Manage multiple cartographic representations and advanced cartographic drawing rules. |
| Versioning | Manage a number of key GIS workflows for data management; for example, support long update transactions, historical archives, and multiuser editing. This requires the use of ArcSDE geodatabases. |
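
The two kinds of attribute constraint mentioned above, a list of valid values and a range of valid values, can be sketched as simple checks. The codes, descriptions, and limits below are made up for illustration, and the functions are not part of any geodatabase API.

```python
# Hypothetical list of valid values for a road class column.
ROAD_CLASS_DOMAIN = {1: "Highway", 2: "Arterial", 3: "Local"}

# Hypothetical range of valid values for a speed limit column.
SPEED_LIMIT_RANGE = (5, 85)


def in_coded_domain(value, domain):
    """True if the value is one of the listed valid codes."""
    return value in domain


def in_range_domain(value, bounds):
    """True if the value lies within the inclusive valid range."""
    low, high = bounds
    return low <= value <= high


print(in_coded_domain(2, ROAD_CLASS_DOMAIN))   # listed code
print(in_coded_domain(9, ROAD_CLASS_DOMAIN))   # unlisted code
print(in_range_domain(55, SPEED_LIMIT_RANGE))  # inside the range
```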