Essentials of joining tables
Joining data is typically used to append the fields of one table to those of another through an attribute or field common to both tables. You can define the join based on attributes, on a predefined geodatabase relationship class, or on location (also referred to as a spatial join). The join by relationship class option is listed only if you are joining geodatabase data for which a relationship class has already been defined in the geodatabase.
Several tables or layers can be joined to a single table or layer, and relationship class joins can be mixed with attribute joins. When a join table is removed, all data from tables that were joined after it is also removed, but data from previously joined tables remains. Symbology or labeling that is based on an appended column is returned to a default state when the join is removed.
In most cases, appended columns are named <TableName>.<FieldName>. This naming convention helps prevent duplicate field names when the target table and a join table have common field names. If you prefer not to see the fully qualified field names, click the Table window's Table Options button and click Show Field Aliases to toggle this option on or off. When this option is on, a check mark is displayed beside it on the Options menu, and your fields are not prefixed with the table name.
The following is a joined table with the field names prefixed with the table name:
The following is a joined table with only the field alias showing:
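The naming convention can be pictured with a minimal Python sketch. The table and field names (`counties`, `rainfall`, `NAME`, and so on) and the `qualify` helper are hypothetical and only illustrate how the <TableName>.<FieldName> prefix keeps a field name shared by both tables distinct; this is not how ArcMap builds the names internally.

```python
def qualify(table_name, field_names):
    """Prefix each field name with its table name: <TableName>.<FieldName>."""
    return [f"{table_name}.{field}" for field in field_names]

# Hypothetical target table and join table that share a NAME field.
target_fields = qualify("counties", ["FID", "NAME"])
join_fields = qualify("rainfall", ["NAME", "INCHES"])

joined_fields = target_fields + join_fields
# Unqualified, NAME would appear twice; qualified, every name is unique.
all_unique = len(set(joined_fields)) == len(joined_fields)
```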
Learn more about joining and relating tables
Summarizing your data before joining it
Depending on how your data is organized, you may have to start by summarizing the data in your table before you join it to a layer. When you summarize a table, ArcMap creates a new table containing summary statistics derived from your table. You can create various summary statistics including count, average, sum, minimum, and maximum.
For example, suppose you want to create weather maps by state instead of county, but the weather information you have is organized by county. You could summarize the county data by state—for instance, finding the average rainfall for all counties within a state—then join the newly created output table to a state layer to create a weather map of rainfall by state.
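The summarize-then-join idea can be sketched in plain Python, outside ArcMap. The county records, rainfall values, field names (`STATE_NAME`, `avg_rainfall`), and helper functions below are all hypothetical; the sketch only shows the shape of the workflow, not the Summarize command itself.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical county records: (state, county, rainfall in inches).
county_rainfall = [
    ("Ohio", "Franklin", 40.0),
    ("Ohio", "Athens", 42.0),
    ("Utah", "Salt Lake", 16.0),
    ("Utah", "Summit", 20.0),
]

def summarize_by_state(rows):
    """Group county rows by state and compute a count and an average."""
    groups = defaultdict(list)
    for state, _county, rainfall in rows:
        groups[state].append(rainfall)
    return {
        state: {"count": len(values), "avg_rainfall": mean(values)}
        for state, values in groups.items()
    }

def join_summary(states, summary):
    """Append the summary fields to each state record, matching on STATE_NAME."""
    return [{**record, **summary.get(record["STATE_NAME"], {})} for record in states]

# Hypothetical attribute records of a state layer.
state_layer = [{"STATE_NAME": "Ohio"}, {"STATE_NAME": "Utah"}]

summary = summarize_by_state(county_rainfall)
joined_states = join_summary(state_layer, summary)
```

Because the summary table has exactly one row per state, the join to the state layer is one-to-one, which is the reason for summarizing first.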
Editing and joining tables
When editing joined data, you cannot edit the joined columns directly. To edit the joined data, you must first add the joined tables or layers to ArcMap. You can then perform edits on this data separately. These changes will be reflected in the joined columns.
Join validation
You can analyze a join before creating it by using the Validate Join button on the Join Data dialog box. Join validation allows you to assess any potential problems that you might encounter when creating a join. Join validation analyzes the two participating datasets to determine if there are any common problems with the data. The following is a list of what is checked in the data:
- Check for field names that start with an invalid character.
- Check for field names that contain an invalid character.
- Check for field names that match reserved words.
- Check for nongeodatabase MS Access tables.
Each of these four problems can cause join fields to display null values in the attribute table or cause selection and record counts to be misleading. When analyzing coverage data, join validation excludes the number sign (#), the dollar sign ($), and the hyphen (-) from the invalid characters check. When analyzing ArcSDE software-connected data, it excludes the period (.). You will still receive a warning if a field name begins with any of these characters.
Join validation checks for the following characters:
Invalid starting characters: `~@#$%^&*()-+=|\\,<>?/{}.!'[]:;_0123456789
Invalid contained characters: `~@#$%^&*()-+=|\\,<>?/{}.!'[]:;
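If you want to screen field names before opening the Join Data dialog box, the first three checks can be approximated in Python. This is a rough sketch, not the Validate Join implementation: the character sets are copied from the lists above, the reserved words are only the sample named later in this topic, the case-insensitive comparison is an assumption, and `check_field_name` is a hypothetical helper.

```python
# Character sets copied from the lists above ("\\" is a single backslash in Python).
INVALID_CONTAINED = set("`~@#$%^&*()-+=|\\,<>?/{}.!'[]:;")
INVALID_START = INVALID_CONTAINED | set("_0123456789")

# Sample only: the reserved words given as examples in this topic, not a full list.
SAMPLE_RESERVED_WORDS = {
    "date", "day", "month", "table", "text",
    "user", "when", "where", "year", "zone",
}

def check_field_name(name):
    """Return a list of problem descriptions for one field name (empty if none)."""
    if not name:
        return ["field name is empty"]
    problems = []
    if name[0] in INVALID_START:
        problems.append(f"starts with invalid character {name[0]!r}")
    bad = sorted({ch for ch in name if ch in INVALID_CONTAINED})
    if bad:
        problems.append("contains invalid character(s): " + " ".join(bad))
    # Assumption: reserved words are compared without regard to case.
    if name.lower() in SAMPLE_RESERVED_WORDS:
        problems.append("matches a reserved word")
    return problems

# Report only the names that have at least one problem.
report = {}
for field in ["POP2010", "x-coordinate", "2010_POP", "Date"]:
    field_problems = check_field_name(field)
    if field_problems:
        report[field] = field_problems
```

The sketch does not apply the coverage and ArcSDE exclusions described above.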
Join validation also reports how many records will match if the join is created. From this count you can compute the percentage of records that matched and, if the number is not what you expected, look for other errors in the data. A low count can occur when a text field is used for the join and a value that should match contains a spelling mistake or differs in uppercase or lowercase characters, so no match is found. If join validation counts more matched records than there are records in the source dataset, a warning is displayed that a 1:M or M:M relationship exists between the participating data. In that case, do not use a join to associate the datasets; use a relate or relationship class instead.
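The match count and the 1:M warning can be reproduced on two lists of key values. This is a sketch under assumptions: `validate_match_counts` is a hypothetical helper, the keys are made up, and the warning rule is simply the one stated above (more matches than records in the source dataset).

```python
from collections import Counter

def validate_match_counts(target_keys, join_keys):
    """Count how records in the target match records in the join table."""
    join_counts = Counter(join_keys)
    # Target records that find at least one partner in the join table.
    matched_targets = sum(1 for key in target_keys if join_counts[key] > 0)
    # Total number of matching pairs; exceeds the target count when keys repeat.
    total_matches = sum(join_counts[key] for key in target_keys)
    percent = 100.0 * matched_targets / len(target_keys) if target_keys else 0.0
    return {
        "matched_targets": matched_targets,
        "total_matches": total_matches,
        "percent_matched": percent,
        "one_to_many_warning": total_matches > len(target_keys),
    }

# Hypothetical keys: "A" and "C" repeat in the join table, "b" differs from "B"
# only by case, and "D" has no partner.
result = validate_match_counts(["A", "B", "C", "D"], ["A", "A", "B", "b", "C", "C"])
```

Here three of the four target records match (75 percent), yet five matching pairs are found, so the sketch flags a one-to-many relationship.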
Performance tips for joining data
Data from appended fields can be used to symbolize and label features and perform queries and many other operations. Accessing the joined data is slower than accessing data from the target table because of the additional work needed to maintain the join.
The following tips can be used when working with joined data to improve performance:
- You can perform a join either with the Join Data dialog box, accessed by right-clicking a layer in ArcMap, or with a set of geoprocessing tools: the Spatial Join tool, Add Join tool, and Remove Join tool. Use the geoprocessing tools when working with particularly large datasets to get the best performance. You can also include these tools in geoprocessing models and scripts when you want to automate repetitive or complex steps involving joins. Because the tools perform the behind-the-scenes join processing slightly differently than the Join Data dialog box, try them if you encounter unexpected issues with the join functionality on that dialog box.
- Create attribute indexes on the join fields. If your joins involve only shapefiles, dBASE files, coverages, or INFO files, indexing will not improve performance when drawing or working with the Table window. Performance will be improved while editing, however. In all other cases, attribute indexes will improve overall performance.
- When joining data from the same geodatabase, choose the Keep only matching records option. In some cases, this option produces different results but allows the join to be processed by the database. You will find that this is normally faster for operations that require accessing the data in the joined columns (symbolizing, labeling, and so on).
The default Keep all records option always performs processing on the client. Performance is normally good for operations that don't require accessing joined data (such as drawing with default symbolization). An operation may become much slower, though, if accessing joined data is needed.
- Cross-database joins, where the target table and the join table are from different data sources, may have poorer performance, especially when the join table is from a geodatabase or an OLE DB connection. Performance is much better when the join table is from a file-based data source (such as shapefiles, dBASE files, and coverages) and the target table has an ObjectID field (most data sources).
- Joining multiple tables or layers to a single layer can be costly in terms of performance. If all the data is from the same ArcSDE server and you choose Keep only matching records when joining, performance should not be greatly affected.
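The two record options mentioned in the tips above differ in their results as well as their speed: Keep all records behaves like an outer join, and Keep only matching records like an inner join. The following sketch shows only that difference in output, not where the processing happens. The records, field names, and `join_records` helper are hypothetical, and taking the first match for a repeated key is an assumption made for the sketch.

```python
def join_records(target, join_table, key, keep_all=True):
    """Append join-table fields to target records that share the same key value."""
    lookup = {}
    for row in join_table:
        # Assumption for this sketch: if a key repeats, the first row wins.
        lookup.setdefault(row[key], row)
    join_fields = [field for field in join_table[0] if field != key] if join_table else []

    result = []
    for row in target:
        match = lookup.get(row[key])
        if match is None:
            if not keep_all:
                continue  # Keep only matching records: drop unmatched targets.
            appended = {field: None for field in join_fields}  # Nulls for no match.
        else:
            appended = {field: match[field] for field in join_fields}
        result.append({**row, **appended})
    return result

# Hypothetical data: the record with ID 2 has no partner in the join table.
target = [{"ID": 1, "NAME": "a"}, {"ID": 2, "NAME": "b"}, {"ID": 3, "NAME": "c"}]
population = [{"ID": 1, "POP": 10}, {"ID": 3, "POP": 30}]

all_records = join_records(target, population, "ID", keep_all=True)
matching_only = join_records(target, population, "ID", keep_all=False)
```

With Keep all records, the unmatched record stays in the table with null appended values; with Keep only matching records, it disappears from the result.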
Reasons joining tables may fail
After performing a join, the values in the fields from the joined table might appear empty or null. Null values can be the result of several factors:
- Values in the specified fields for the join do not match.
Joins are case sensitive, so be aware of this when using string fields to create a join. For example, NEW YORK will not join with New York. To convert string values to the proper case, see the task in Making field calculations.
- The name of the table or feature class, or field names in the table or feature class, include spaces or special characters.
Special characters include hyphens, such as in x-coordinate and y-coordinate; parentheses; brackets; and symbols such as $, %, and #. Essentially, eliminate anything that is not alphanumeric or an underscore, and avoid starting field names with a number or an underscore. Be sure to edit the field names in delimited text files or other tables to remove unsupported characters before trying to use the files in ArcGIS. Geodatabase feature class, table, and field names can have up to 64 characters. (More specifically, you can only enter up to 52 characters for a personal geodatabase feature class name because the system appends characters to total 64.) Shapefile and .dbf field names can be up to 10 characters long. For INFO tables, use up to 16 letters or numbers. See Adding and deleting fields for more field naming guidelines.
- The field names in the table are Microsoft Access reserved words.
Some examples include date, day, month, table, text, user, when, where, year, and zone. For a list of reserved words, see the Microsoft support article (KB 286335).
- The table is stored in a Microsoft Access database that is not a personal geodatabase.
You should access Microsoft Access tables in ArcGIS through an OLE DB connection rather than attempt to add the database directly to ArcMap. See Working with Microsoft Access files in ArcGIS to learn how to add an OLE DB connection.
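For the first cause in the list above, values that fail to match only because of case, a quick check outside ArcMap can find the affected values before you recalculate the field. This is a minimal sketch; `find_case_mismatches` is a hypothetical helper, and the city names other than the NEW YORK example are made up.

```python
def find_case_mismatches(target_values, join_values):
    """Return (target value, join value) pairs that differ only by case."""
    exact = set(join_values)
    folded = {}
    for value in join_values:
        # Map each case-folded form back to the first original spelling seen.
        folded.setdefault(value.casefold(), value)

    mismatches = []
    for value in target_values:
        # An exact match joins fine; report only values that match after folding.
        if value not in exact and value.casefold() in folded:
            mismatches.append((value, folded[value.casefold()]))
    return mismatches

mismatches = find_case_mismatches(
    ["NEW YORK", "Boston", "Chicago"],
    ["New York", "Boston"],
)
```

Values such as Chicago, which have no partner even after case folding, are not reported; they point to a different problem, such as a missing record or a spelling mistake.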