The archive process

Enabling archiving on a versioned dataset creates the archive class and populates it with the data currently present in the DEFAULT version. The archive class uses the gdb_from_date and gdb_to_date fields to record when each change was archived. Enabling archiving on nonversioned data adds the gdb_from_date and gdb_to_date fields directly to the class's base table.

Representing time

It is important to understand how ArcGIS represents time when change is recorded. History can be recorded as valid time, transaction time, or Coordinated Universal Time (UTC). Valid time is the actual moment at which a change occurred in the real world and is typically recorded by the user who is applying the change. Transaction time is the time an event was recorded in the database. Transaction times are generated automatically by the system. UTC is the primary standard used to regulate clocks and time over the Internet.

For archiving on versioned data, ArcGIS uses transaction time, which is based on the current server time, to record changes to the data when changes are saved or posted to the DEFAULT version. Transaction time and the time the event occurred in the real world are rarely the same, because time elapses between an event happening in the real world and its being recorded in the database. For example, a parcel is sold on May 14, 2006; however, the change is not recorded to the data until June 5, 2006. The transaction time of June 5, 2006, is recorded in the archive class for this change.

When the edit occurs, ArcGIS archives the transaction to the archive class. The difference between the time of the real-world event and the transaction time may seem insignificant, but it becomes apparent when queries are performed against the archived information. Backlogs in editing and updating data are not uncommon in production systems, and they cause transaction time to lag behind valid time.
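The parcel sale example can be worked through in a few lines of Python. This is only an illustration of the two kinds of time, not ArcGIS code; the helper name recorded_as_of and the May 20 query moment are assumptions made for the example.

```python
from datetime import datetime

# Valid time: when the sale happened in the real world (supplied by a user).
valid_time = datetime(2006, 5, 14)
# Transaction time: when the edit was saved or posted (generated by the system).
transaction_time = datetime(2006, 6, 5)

def recorded_as_of(moment, transaction_time):
    """Hypothetical helper: True if the change is already archived at `moment`."""
    return transaction_time <= moment

# Lag between the real-world event and its being recorded in the database.
lag = transaction_time - valid_time

# A query for May 20, 2006 falls after the sale but before it was recorded.
query_moment = datetime(2006, 5, 20)
sale_visible = recorded_as_of(query_moment, transaction_time)

print(lag.days, sale_visible)
```

A query against the archive for May 20, 2006, would therefore still show the parcel under its previous owner, even though the sale had already taken place.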

The difference between valid and transaction time is also an issue in situations where history is recorded in a multiuser environment with many different users or departments editing the database. The sequence in which changes are performed and logged in the database may not be the same order in which those changes occurred in the real world.

Archiving on nonversioned data uses UTC to represent time. Changes to the data are recorded when edits are saved during an edit session.

Enabling archiving on nonversioned data

Upon enabling archiving, gdb_from_date and gdb_to_date attributes are added to the base table. The gdb_from_date attribute for all rows is time-stamped with the date and time of the enable archiving operation. The gdb_to_date attribute for all rows is time-stamped with 12/31/9999. Any row whose gdb_to_date is 12/31/9999 is the current representation of the object. When edits are saved, the geodatabase automatically archives the changes.
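As a rough illustration of the enable step on nonversioned data, the sketch below stamps every row of an in-memory table. It is a simplified model, not the geodatabase implementation: the table is a list of dictionaries, enable_archiving is a hypothetical function, and the exact time of day stored for 12/31/9999 is an assumption.

```python
from datetime import datetime, timezone

# Stand-in for the 12/31/9999 "still current" value (time of day is assumed).
CURRENT = datetime(9999, 12, 31, tzinfo=timezone.utc)

def enable_archiving(base_table, now=None):
    """Hypothetical model: add gdb_from_date and gdb_to_date to every row."""
    # Nonversioned archiving represents time in UTC.
    now = now or datetime.now(timezone.utc)
    for row in base_table:
        row["gdb_from_date"] = now      # same time stamp for all existing rows
        row["gdb_to_date"] = CURRENT    # marks the row as the current representation
    return now

# Illustrative base table with made-up parcel numbers.
parcels = [{"parcel_id": 115}, {"parcel_id": 116}]
enabled_at = enable_archiving(parcels)
```

After the call, every existing row carries the same gdb_from_date and a gdb_to_date of 12/31/9999.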

Enabling archiving on versioned data

Upon enabling archiving, all rows representing the DEFAULT version for the given class are copied to the archive class with the same time stamp. The gdb_from_date attribute for all rows is time-stamped with the date and time of the enable archiving operation. The gdb_to_date attribute for all rows is time-stamped with 12/31/9999. Any row whose gdb_to_date is 12/31/9999 is the current representation of the object in the DEFAULT version. When edits are saved or posted to the DEFAULT version, the geodatabase automatically archives the changes to the archive class. This means the following:

Updating the archive table is performed within a single database transaction. If any errors are encountered during the transaction, the entire archive operation is rolled back, and the save or posting operation is therefore not completed. Once the error has been rectified, perform the save or post operation again.
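The all-or-nothing behavior described above can be sketched with Python's built-in sqlite3 module. This only models the archive table; the table layout, the archive_edits function, and the use of SQLite are assumptions for illustration and say nothing about how a particular DBMS or ArcGIS implements the operation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE archive ("
    "parcel_id INTEGER NOT NULL, "
    "gdb_from_date TEXT NOT NULL, "
    "gdb_to_date TEXT NOT NULL)"
)
# One existing row; the 2005-07-01 date is made up for the example.
conn.execute("INSERT INTO archive VALUES (116, '2005-07-01', '9999-12-31')")
conn.commit()

def archive_edits(conn, stamp, new_ids):
    """Hypothetical archive operation: insert all rows or none of them."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            for pid in new_ids:
                conn.execute(
                    "INSERT INTO archive VALUES (?, ?, '9999-12-31')",
                    (pid, stamp),
                )
        return True
    except sqlite3.IntegrityError:
        return False

# The second row is invalid (NULL id), so the whole operation is rolled back.
ok = archive_edits(conn, "2005-07-09", [117, None])
count = conn.execute("SELECT COUNT(*) FROM archive").fetchone()[0]
```

Once the invalid row is corrected, calling archive_edits again with valid input succeeds, which mirrors performing the save or post operation again after the error has been rectified.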

For each archive operation, the DEFAULT historical marker is updated with the value of the archive operation. This ensures that when choosing the DEFAULT historical marker when working with a historical version, the current representation of the archive class is equivalent to the versioned class's representation in the transactional DEFAULT version.

Accessing the archive class can actually consume fewer database resources than working with the equivalent versioned class.

Application developers interested in the event that captures the moment of the archive operation can refer to the OnArchiveUpdated event on the IVersionEvents2 interface in the software developer kit.

Queries on historical versions are on the archive class:

Archive table

Queries on transactional versions are still on the base and delta tables:

Base, adds, and deletes tables

Adding a feature

This feature in a cadastral database shows parcel number 116 and its corresponding row. For versioned data this row would appear in the archive class. For nonversioned data this row would be in the base table for parcels. The gdb_from_date shows the time and date of creation, while the gdb_to_date shows 12/31/9999, because the feature has not been modified or deleted since enabling archiving.

Adding a feature

When a feature is inserted (parcel 117), a row is inserted with the gdb_from_date updated with the time stamp of this post operation. The gdb_to_date attribute in the new row shows 12/31/9999 because this feature has yet to be updated or deleted.

Adding a feature 2
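The effect of inserting parcel 117 can be modeled with a small Python sketch. The list-of-dictionaries table, the archive_insert function, and the gdb_from_date shown for parcel 116 are assumptions made for illustration; the time stamp for parcel 117 is the creation time used later in this topic.

```python
from datetime import datetime

# Stand-in for the 12/31/9999 "still current" value.
CURRENT = datetime(9999, 12, 31)

def archive_insert(archive, parcel_id, stamp):
    """Hypothetical model: add one row for a newly created feature."""
    row = {
        "parcel_id": parcel_id,
        "gdb_from_date": stamp,   # time stamp of the save or post operation
        "gdb_to_date": CURRENT,   # not yet updated or deleted
    }
    archive.append(row)
    return row

# Parcel 116 already exists; its gdb_from_date here is made up.
archive = [
    {"parcel_id": 116,
     "gdb_from_date": datetime(2005, 7, 1, 9, 0, 0),
     "gdb_to_date": CURRENT},
]

new_row = archive_insert(archive, 117, datetime(2005, 7, 9, 14, 23, 43))
```

Existing rows are left untouched; only a new row is added.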
Note:

Certain editing operations, such as creating features with the Auto-Complete Polygon tool and validating a geodatabase topology, insert vertices on existing features to maintain coincidence between adjacent features. For example, if you use the Auto-Complete Polygon tool to create a new polygon that adjoins an existing polygon, vertices are added to the existing polygon at the locations where the sketch of the new feature crosses the existing feature.

Updating a feature

When a feature is updated, the gdb_to_date of its existing row is set to the time stamp of the archive operation, and a new row is inserted to show the current representation of the feature. The gdb_from_date in this new row is set to the time of the archive operation, while the gdb_to_date shows 12/31/9999, since the new row has yet to be modified or deleted.

The following diagram shows two parcels, 116 and 117, with their corresponding gdb_from_date and gdb_to_date attributes prior to performing the update operation.

Updating a feature

If the parcel boundary for parcel 117 is extended, the gdb_to_date is updated with the time stamp of the archive operation, and a new row is created. The gdb_from_date attribute in this new row is set with the time and date of the archive operation.

Updating a feature updates the gdb_to_date

For example, queries that investigate moments prior to the update (7/12/2005 5:34:22 PM) show parcel 117 as it existed before the update. Queries for moments before 7/9/2005 2:23:43 PM do not show parcel 117 because it had not yet been created. Queries for any moment after the update (7/14/2005 3:45:23 AM) show parcel 117 in its current representation with the extended boundary.
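The update and the three moment queries can be reproduced with a minimal Python model. The functions archive_update and as_of are hypothetical, the shape values are placeholders, and the update time of 7/13/2005 12:00 PM is an assumption chosen to fall between the two query moments above. Treating gdb_from_date as inclusive and gdb_to_date as exclusive is also an assumption of this sketch.

```python
from datetime import datetime

# Stand-in for the 12/31/9999 "still current" value.
CURRENT = datetime(9999, 12, 31)

def archive_update(archive, parcel_id, new_shape, stamp):
    """Hypothetical model: retire the current row and add the new representation."""
    for row in archive:
        if row["parcel_id"] == parcel_id and row["gdb_to_date"] == CURRENT:
            row["gdb_to_date"] = stamp
    archive.append({"parcel_id": parcel_id, "shape": new_shape,
                    "gdb_from_date": stamp, "gdb_to_date": CURRENT})

def as_of(archive, moment):
    """Rows that were the current representation at `moment`."""
    return [r for r in archive
            if r["gdb_from_date"] <= moment < r["gdb_to_date"]]

archive = [{"parcel_id": 117, "shape": "original boundary",
            "gdb_from_date": datetime(2005, 7, 9, 14, 23, 43),
            "gdb_to_date": CURRENT}]

update_time = datetime(2005, 7, 13, 12, 0, 0)  # assumed; not given in the text
archive_update(archive, 117, "extended boundary", update_time)

before_creation = as_of(archive, datetime(2005, 7, 9, 14, 0, 0))
before_update = as_of(archive, datetime(2005, 7, 12, 17, 34, 22))
after_update = as_of(archive, datetime(2005, 7, 14, 3, 45, 23))
```

In this model the archive holds two rows for parcel 117 after the update, and each moment query returns at most one of them.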

Learn more about querying the archive class

Deleting a feature

When a feature is deleted, the gdb_to_date is updated with the time stamp of the archive operation. The following diagram shows parcels 116 and 117 with their corresponding gdb_from_date and gdb_to_date attributes.

Deleting a feature

If parcel 117 is now deleted, the gdb_to_date attribute is updated with the time stamp of the archive operation.

Deleting a feature updates the gdb_to_date
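A deletion can be modeled the same way. In this sketch, which is an illustration rather than the geodatabase implementation, the hypothetical archive_delete function only closes the current row; the dates for parcel 116 and for the deletion are made up.

```python
from datetime import datetime

# Stand-in for the 12/31/9999 "still current" value.
CURRENT = datetime(9999, 12, 31)

def archive_delete(archive, parcel_id, stamp):
    """Hypothetical model: close the current row; no row is removed."""
    closed = 0
    for row in archive:
        if row["parcel_id"] == parcel_id and row["gdb_to_date"] == CURRENT:
            row["gdb_to_date"] = stamp  # time stamp of the archive operation
            closed += 1
    return closed

archive = [
    {"parcel_id": 116, "gdb_from_date": datetime(2005, 7, 1, 9, 0, 0),
     "gdb_to_date": CURRENT},
    {"parcel_id": 117, "gdb_from_date": datetime(2005, 7, 9, 14, 23, 43),
     "gdb_to_date": CURRENT},
]

deleted_at = datetime(2005, 7, 20, 10, 0, 0)  # assumed deletion time
closed = archive_delete(archive, 117, deleted_at)
```

The row for parcel 117 stays in the table, so moments before the deletion can still be queried.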

Technical note on archiving with versioned data

The following scenario can create a time gap in the archive class:

An editor is directly editing the DEFAULT version and deletes an object in an edit session.

The editor then saves the edits, which updates the gdb_to_date attribute of the archive class with the time stamp of the deletion of that object.

If the same object is updated in a child version and reconciled with the DEFAULT version, there will be a conflict.

If, during the conflict resolution process, the editor chooses to replace the conflict with the updated representation of the row, the row will be restored in the DEFAULT version when the version is posted. The archive operation inserts a new row into the archive class, sets its gdb_from_date attribute to the time stamp of the post operation, and sets its gdb_to_date to 12/31/9999.

Therefore, when the editor looks at the object's lineage through time, there is a gap between the gdb_to_date of the deleted row and the gdb_from_date of the restored row, covering the period when the object did not exist in the DEFAULT version.
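One way to spot such gaps is to sort an object's archive rows by gdb_from_date and compare each row with the next. The sketch below is a hypothetical check over in-memory rows; lineage_gaps is not an ArcGIS function, and all dates are made up to match the scenario.

```python
from datetime import datetime

# Stand-in for the 12/31/9999 "still current" value.
CURRENT = datetime(9999, 12, 31)

def lineage_gaps(rows):
    """Return (gap_start, gap_end) pairs where consecutive rows do not meet."""
    ordered = sorted(rows, key=lambda r: r["gdb_from_date"])
    gaps = []
    for earlier, later in zip(ordered, ordered[1:]):
        if later["gdb_from_date"] > earlier["gdb_to_date"]:
            gaps.append((earlier["gdb_to_date"], later["gdb_from_date"]))
    return gaps

# Lineage of one object: deleted in DEFAULT, then restored by a later post.
deleted_at = datetime(2005, 8, 1, 9, 0, 0)     # assumed
restored_at = datetime(2005, 8, 3, 16, 30, 0)  # assumed
lineage = [
    {"gdb_from_date": datetime(2005, 7, 9, 14, 23, 43),
     "gdb_to_date": deleted_at},
    {"gdb_from_date": restored_at, "gdb_to_date": CURRENT},
]

gaps = lineage_gaps(lineage)
```

An ordinary update produces no gap in this check, because the retired row's gdb_to_date equals the new row's gdb_from_date.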


3/13/2015