The geodatabase compress operation

The geodatabase compress operation removes unnecessary states and rows from the system tables that track versions and versioned edits.

Tip:

To understand compression, you must first understand how versioning works. If you are unfamiliar with this concept, see A quick tour of versioning.

What is a compress operation?

The compress operation removes the states that are no longer referenced by a version and can move rows in the delta tables to the business table. A compress operation can be performed only by the geodatabase administrator, and it operates against all states in the geodatabase, regardless of who owns the version.

Compress operations are necessary because, as a geodatabase is edited over time, the delta tables increase in size and the number of states increases. The larger the tables and the more states, the more data ArcGIS must process every time you display or query a version. Therefore, the greatest impact on performance is not the number of versions but the amount of change contained in the delta tables for each version. As a result, versions can have different query response times.

To maintain database performance, the geodatabase administrator must periodically run a compress operation to remove unused data.

You can compress a geodatabase with the Compress command in ArcGIS for Desktop, the Compress geoprocessing tool, or a Python script. See Compressing an enterprise geodatabase for information on compressing a geodatabase from the Catalog tree, or Compress for information on the geoprocessing tool or script.
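As a minimal sketch of the scripted route, the function below calls the Compress geoprocessing tool through arcpy. The connection file path in the usage comment is hypothetical, and the optional arcpy_module parameter is an illustrative addition (not part of the tool) that lets you dry-run the function on a machine without ArcGIS installed.

```python
def compress_geodatabase(sde_connection, arcpy_module=None):
    """Run the Compress geoprocessing tool on an enterprise geodatabase.

    sde_connection: path to a connection file (.sde) that connects as the
        geodatabase administrator.
    arcpy_module: optional stand-in for arcpy (illustrative, for dry runs).
    """
    if arcpy_module is None:
        # Imported lazily so the function can be defined without ArcGIS.
        import arcpy as arcpy_module
    # The tool takes the geodatabase connection as its only input.
    arcpy_module.Compress_management(sde_connection)
    # Return the geoprocessing messages produced by the tool.
    return arcpy_module.GetMessages()


# Usage (hypothetical connection file, run as the geodatabase administrator):
# print(compress_geodatabase(r"C:\connections\gdb_admin.sde"))
```

Passing the connection file as an argument keeps the script reusable across geodatabases and makes it easy to schedule.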

What happens during a compress operation?

The compress operation first reads the instance's state tree configuration into memory. Using this information, compression deletes all states that do not participate in any version's lineage. Deleting a state deletes all the rows from the delta tables that are associated with that state.
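The pruning step can be pictured with a toy model. The sketch below is not the geodatabase's implementation; it assumes a state tree stored as a simple parent map with state 0 as the root, and delta rows represented as hypothetical (state_id, object_id) pairs. A state survives only if it lies on the path from some version's state back to the root.

```python
def prune_unreferenced_states(parent_of, version_state, delta_rows):
    """Toy model of the first compress step.

    parent_of: dict of state ID -> parent state ID (the root maps to None).
    version_state: dict of version name -> state ID the version references.
    delta_rows: list of (state_id, object_id) tuples standing in for rows
        in the delta tables.
    Returns (deleted_states, surviving_delta_rows).
    """
    referenced = {0}  # assumption: the root state is always kept
    for state in version_state.values():
        # Walk from the version's state up to the root: its lineage.
        while state is not None and state not in referenced:
            referenced.add(state)
            state = parent_of[state]
    deleted = set(parent_of) - referenced
    # Deleting a state also deletes the delta rows tied to that state.
    surviving = [row for row in delta_rows if row[0] not in deleted]
    return deleted, surviving


# Tree: 0 -> 1 -> 2 (DEFAULT), 1 -> 3 -> 4 (QA), 0 -> 5 -> 6 (no version)
parent_of = {0: None, 1: 0, 2: 1, 3: 1, 4: 3, 5: 0, 6: 5}
version_state = {"DEFAULT": 2, "QA": 4}
delta_rows = [(2, 101), (4, 102), (5, 103), (6, 104)]

deleted, surviving = prune_unreferenced_states(
    parent_of, version_state, delta_rows)
print(sorted(deleted))  # [5, 6]
print(surviving)        # [(2, 101), (4, 102)]
```

States 5 and 6 belong to no version's lineage, so they and their delta rows are removed, while everything that DEFAULT or QA can still see is left alone.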

Next, the compress operation collapses each candidate lineage of states into one state. A candidate lineage is a collection of states that can be compressed into one state without affecting the logical representation of any table in a given version.

The final step, when applicable, is to move rows from the delta tables into the base (or business) tables.

For each step of the operation, a database transaction is started and stopped for each table being compressed. These transactions keep each table consistent during each step of the process.
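The per-table transaction idea can be illustrated outside ArcGIS. The sketch below uses Python's built-in sqlite3 module and two hypothetical tables, parcels (base) and parcels_delta (delta); it is an analogy for the behavior described above, not the geodatabase's actual schema or code. Rows are moved from the delta table to the base table inside one transaction, so a failure partway through leaves both tables as they were.

```python
import sqlite3


def move_delta_rows_to_base(conn, base_table, delta_table):
    """Move all rows from a toy delta table to its base table atomically.

    Table names are trusted identifiers in this sketch.
    Returns True on success, False if the transaction was rolled back.
    """
    rows = conn.execute(
        f"SELECT objectid, name FROM {delta_table}").fetchall()
    try:
        with conn:  # commits on success, rolls back on any exception
            conn.executemany(
                f"INSERT INTO {base_table} (objectid, name) VALUES (?, ?)",
                rows)
            conn.execute(f"DELETE FROM {delta_table}")
    except sqlite3.Error:
        return False
    return True


def count(conn, table):
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parcels (objectid INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE parcels_delta (objectid INTEGER, name TEXT)")
with conn:  # commit the setup rows so a later rollback cannot undo them
    conn.execute("INSERT INTO parcels VALUES (1, 'lot A')")
    conn.executemany("INSERT INTO parcels_delta VALUES (?, ?)",
                     [(2, "lot B"), (3, "lot C")])

first_ok = move_delta_rows_to_base(conn, "parcels", "parcels_delta")

# Second run: row 4 inserts cleanly, then row 1 violates the primary key,
# so the whole move is rolled back, including row 4.
with conn:
    conn.executemany("INSERT INTO parcels_delta VALUES (?, ?)",
                     [(4, "lot D"), (1, "duplicate of lot A")])
second_ok = move_delta_rows_to_base(conn, "parcels", "parcels_delta")

print(first_ok, second_ok)
print(count(conn, "parcels"), count(conn, "parcels_delta"))
```

After the failed second run, the base table still holds three rows and the delta table still holds its two pending rows, which mirrors why an interrupted step leaves the tables logically correct.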

The compress operation can be stopped while it is executing because the operation is designed to be transactionally consistent. Therefore, if the operation encounters an error, fails, or abruptly stops, the versioned tables being compressed are still logically correct with respect to any version's representation. One reason you might stop the compress operation is if you run it while users are connected to the geodatabase, then discover the compression is consuming a large amount of system resources. In that case, you might want to stop the operation and run it again when fewer or no users are connected.

Fully compressing a geodatabase

In a fully compressed geodatabase, there are no rows in the delta tables and the state tree is trimmed back to zero. Performance improvement is greatest if the geodatabase is fully compressed. To achieve this, reconcile and post all versions, delete the versions, and disconnect all users before you run the compress operation.

You can see the results of each compress operation in the COMPRESS_LOG table in the geodatabase (SDE_compress_log in geodatabases in SQL Server and PostgreSQL). You can also check the VERSIONS table (SDE_versions in geodatabases in SQL Server and PostgreSQL) to see if the state ID for the DEFAULT version has returned to zero. If it has and there are no other outstanding versions, full compression has been achieved.
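The check described above can be expressed as a small helper. This is a sketch under assumptions: it expects one (name, owner, state_id) tuple per row of the VERSIONS table, and those column names are assumed rather than taken from this page. How you fetch the rows (for example, with a SQL client connected as the geodatabase administrator) is left to you.

```python
def is_fully_compressed(version_rows):
    """Return True if DEFAULT is the only version and its state ID is 0.

    version_rows: iterable of (name, owner, state_id) tuples, one per row
        of the VERSIONS table (column names are an assumption).
    """
    rows = list(version_rows)
    only_default = len(rows) == 1 and rows[0][0].upper() == "DEFAULT"
    return only_default and rows[0][2] == 0


# Sample rows (made-up owners and state IDs, for illustration only)
print(is_fully_compressed([("DEFAULT", "sde", 0)]))
print(is_fully_compressed([("DEFAULT", "sde", 0), ("QA", "editor", 15)]))
```

The first call reports full compression; the second does not, because another version is still outstanding.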

It may not always be possible to reconcile, post, delete versions, and disconnect all users before a compress operation. For instance, if you are tracking history using versions or need to maintain design versions for a project, the historic and design versions hold a state within the state tree; therefore, these states will not be removed during compression of the geodatabase. You can successfully compress without doing all these steps, and you will still see some performance improvements.

Frequency of compress operations

The frequency with which you need to perform a compress operation is based on the amount of editing that takes place in your geodatabase. If you have a high volume of edits, you should probably compress the geodatabase once a day. For average or low edit volumes, you should compress at least once a week.

Note:

It is important not to wait too long between compress operations; the greater the amount of versioned editing activity that takes place, the longer it will take to compress the geodatabase. If you do not compress the geodatabase at least once a week, compression could take several hours to complete when you do finally run it.

After compressing a geodatabase

You should update the statistics on your geodatabase after you have run a compress operation. The geodatabase administrator should update statistics on the system tables, and individual users can update statistics on their datasets that were edited. For information on updating statistics, see Using the Analyze Datasets tool to update statistics on geodatabase system tables or Updating statistics on a geodatabase using Analyze.
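A minimal sketch of the statistics update, assuming the Analyze Datasets geoprocessing tool is called through arcpy. The connection file paths and dataset names in the usage comments are hypothetical, and the arcpy_module parameter is an illustrative addition for dry runs without ArcGIS. Called without datasets, the function analyzes only the system tables (the administrator's task); called with a list of datasets, it analyzes their base, delta, and archive tables (the data owner's task).

```python
def update_statistics(sde_connection, datasets=None, arcpy_module=None):
    """Update statistics with the Analyze Datasets geoprocessing tool.

    sde_connection: path to a connection file (.sde).
    datasets: None to analyze the geodatabase system tables, or a list of
        dataset names owned by the connected user.
    arcpy_module: optional stand-in for arcpy (illustrative, for dry runs).
    """
    if arcpy_module is None:
        import arcpy as arcpy_module  # requires an ArcGIS installation
    if datasets is None:
        # Geodatabase administrator: system tables only.
        arcpy_module.AnalyzeDatasets_management(sde_connection, "SYSTEM")
    else:
        # Data owner: base, delta, and archive tables of the listed datasets.
        arcpy_module.AnalyzeDatasets_management(
            sde_connection, "NO_SYSTEM", datasets,
            "ANALYZE_BASE", "ANALYZE_DELTA", "ANALYZE_ARCHIVE")
    return arcpy_module.GetMessages()


# Usage (hypothetical connection files and dataset names):
# update_statistics(r"C:\connections\gdb_admin.sde")
# update_statistics(r"C:\connections\editor.sde", ["gdb.editor.parcels"])
```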

7/30/2013