Replicating Workflow Manager repositories
This topic applies to ArcGIS for Desktop Standard and ArcGIS for Desktop Advanced only.
Key concepts
ArcGIS Workflow Manager repository replication allows organizations to maintain multiple Workflow Manager repositories locally in different geographic regions for performance reasons, while the users at each location continue to work on the same set of jobs. ArcGIS Workflow Manager replication is not an extension of geodatabase replication. It is accomplished with Workflow Manager services or configuration files. ArcGIS Workflow Manager replication is a two-way replication.
You will define a collection of Workflow Manager repositories that you want to have identical Workflow Manager contents. This collection of repositories is known as a repository cluster.
One Workflow Manager repository is designated as the parent repository and is tasked with coordinating the synchronization between each repository. After the parent repository is specified, each child repository is added. The child repositories can be designated as connected repositories, where the synchronization happens automatically, or disconnected repositories, where the synchronization occurs by creating files that can be manually imported.
- Repository Cluster—A cluster is a collection of Workflow Manager repositories that participate in repository replication. Each repository is identified as a node. Nodes can be added to the cluster at any time. For example, a cluster might be created for the Workflow Manager repositories at Redlands, California; Washington, D.C.; St. Louis, Missouri; and Denver, Colorado. The repositories at these locations must have ArcGIS Workflow Manager installed, and the postinstallation must have been executed to create the necessary tables. For the purpose of replication, the minimum configuration can be imported.
- Parent node—Each cluster must have a parent node. In connected replication, the parent node initiates all replication and synchronization processes. The parent node must be a repository that has the configuration you want to distribute to users in the other locations.
- Child node—A cluster can have more than one child node. The children have identical elements when replicas are created and changes are synchronized.
- Elements—Elements include configuration items such as job types and step types, but they are not limited to configuration items. Your jobs are also elements, and they are all replicated and synchronized.
Note:
Workflow Manager replication does not replicate or synchronize data workspaces or spatial notification rules between repositories.
- Last Sync date—This date is a property of each node in the cluster. It is used when you create a replica or synchronize changes. As changes are made to each element, the date and time of the change are recorded. If an element has a newer date and time stamp, the application imports it to the parent and pushes the change to all the children in the cluster.
- Connected or Not—This is the status of the node. A connected node is online, and replication and synchronization changes are relayed immediately. If the node is disconnected, specify a file location to export the configuration file to, and manually import the file later using tools in the Workflow Manager geoprocessing toolbox.
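The terms above can be pictured as a small data model. The following Python sketch is illustrative only: the class and field names (Node, Cluster, server_url, export_folder) are hypothetical and are not part of any Workflow Manager API. It assumes that a connected node needs a service URL and a disconnected node needs a folder for configuration files.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Node:
    """One repository in the cluster (hypothetical model, not a product API)."""
    repository_name: str
    connected: bool = True
    server_url: Optional[str] = None      # assumed to be required when connected
    export_folder: Optional[str] = None   # assumed to be required when disconnected
    last_sync: Optional[datetime] = None  # stays empty until a replica is created


@dataclass
class Cluster:
    """A parent node plus any number of child nodes."""
    parent: Node
    children: List[Node] = field(default_factory=list)

    def add_child(self, node: Node) -> None:
        # Reject nodes that lack the information their status needs.
        if node.connected and not node.server_url:
            raise ValueError("a connected node needs a service URL")
        if not node.connected and not node.export_folder:
            raise ValueError("a disconnected node needs an export folder")
        self.children.append(node)

    def nodes(self) -> List[Node]:
        # The parent is listed first, followed by the children in insertion order.
        return [self.parent] + self.children
```

For example, a cluster with a connected parent and one disconnected child could be built with Cluster(parent=Node("Redlands", server_url=...)) followed by add_child(Node("Denver", connected=False, export_folder=...)).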
Create replica
This is the process of making all child repositories an identical copy of the parent repository. It exports the configuration from the parent repository, deletes the existing configuration in the child repositories, and imports the parent repository configuration into the child repositories. Replicas can be created in the ArcGIS Workflow Manager Administrator or by using the ArcGIS Workflow Manager geoprocessing tools.
Synchronize changes
This is the process of synchronizing changes made in parent and child repositories. Changes from child repositories are sent to the parent repository and consolidated changes are sent to all child repositories. Changes can be synchronized in the ArcGIS Workflow Manager Administrator or by using the ArcGIS Workflow Manager geoprocessing tools.
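The difference between the two operations can be sketched in a few lines of Python. This is a simplified model, not the product's implementation: each repository is a plain dictionary of element names mapped to a content string and a last-modified time, and the function names are hypothetical. It assumes that the element with the newer time stamp wins and does not model deletions or conflicts.

```python
import copy
from datetime import datetime
from typing import Dict, List, Tuple

# element name -> (content, last-modified time); a stand-in for job types,
# step types, jobs, and other replicated elements
Repository = Dict[str, Tuple[str, datetime]]


def create_replica(parent: Repository, children: List[Repository]) -> None:
    """Make every child an identical copy of the parent (destructive)."""
    for child in children:
        child.clear()                        # existing child contents are deleted
        child.update(copy.deepcopy(parent))  # parent contents are imported


def merge_newer(target: Repository, source: Repository) -> None:
    """Copy into target every element that is missing or older there."""
    for name, (content, modified) in source.items():
        if name not in target or modified > target[name][1]:
            target[name] = (content, modified)


def synchronize(parent: Repository, children: List[Repository]) -> None:
    """Two-way sync: children to parent, then consolidated parent to children."""
    for child in children:
        merge_newer(parent, child)
    for child in children:
        merge_newer(child, parent)
```

In this model, create_replica discards whatever the children held, while synchronize only adds or updates elements, which mirrors why creating a replica is normally done once when the cluster is defined.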
Connected replication
Connected replication is ideal for locations that have network access as it requires ArcGIS Workflow Manager for Server and is accomplished using Workflow Manager Services. When using connected replication, the information is automatically exchanged using the Workflow Manager Services with minimal user interaction. Workflow Manager for Server must be installed at each node, and the parent and child repositories must be published as Workflow Manager Services. There are two ways to set up connected replication:
- ArcGIS Workflow Manager Administrator—Use the Manage Replication dialog box to create replicas and synchronize changes.
- Geoprocessing tools—Use ArcGIS Workflow Manager geoprocessing tools to create replicas and synchronize changes.
Parent and child services must be online for connected replication.
Creating the Workflow Manager repositories
Run the post installation for all repositories participating in your cluster.
- From the Start menu, navigate to the ArcGIS Workflow Manager menu and click Workflow Manager Post Installation.
See the Workflow Manager post installation topic for more details.
- Specify and note the repository name on the last page of the Post Installation utility.
- Repeat the steps for all the repositories participating in the cluster.
Note:
The Windows login for the user that creates the initial replica from the parent repository must be added to all child repositories. The user should be granted administrator access or given Manage Replication privileges.
Creating Workflow Manager services
Create Workflow Manager services for all repositories participating in your cluster. Publish the Workflow Manager services using ArcGIS Workflow Manager Administrator for each repository in the cluster.
- Create a Workflow Manager service for the parent repository.
- Create a Workflow Manager service for all child repositories participating in your cluster.
Creating and adding nodes to the cluster in administrator
Clusters are created through the administrator on the parent repository. You can use the Manage Replication tools for adding nodes.
- Open the Workflow Manager Administrator and connect to the parent repository.
- Right-click the database connection and click Manage Replication.
The repository name for the parent is already completed on the Manage Replication dialog box.
- By default, the Connected check box is checked.
- Specify the ArcGIS Workflow Manager Server URL you configured in the steps above. For example, http://yourserver:6080/ArcGIS/rest/Services/Parent/WMServer.
The Last Sync column is empty if a replica hasn't been created.
- Click the Add button to add a new child to the cluster.
- Specify the repository name, then set the Connected check box and the server URL for the child as you did for the parent in steps 3 and 4.
Note:
Click the Save button at any time to store the added information.
- Repeat steps 5 and 6 for all the other nodes.
- After adding all the nodes in the cluster, click Save.
Note:
Use only the REST URL for your Workflow Manager services for replication at the current release.
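Before you enter service URLs for many nodes, a quick check can catch a URL that is not a REST endpoint. The Python sketch below is a heuristic only: it assumes REST URLs follow the pattern of the example above (a path that contains /rest/services/ and ends in /WMServer, compared without regard to case). The function name is hypothetical, and the check does not contact the server.

```python
from urllib.parse import urlparse


def looks_like_rest_wm_url(url: str) -> bool:
    """Heuristic check that a URL resembles the REST example in this topic."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return False
    # Compare in lowercase because the example mixes "ArcGIS" and "Services".
    path = parsed.path.lower().rstrip("/")
    return "/rest/services/" in path and path.endswith("/wmserver")
```

A URL that passes this check can still be wrong or offline; the check only guards against pasting a non-REST endpoint by mistake.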
Creating new replicas using administrator
The Create New Replica button sends the contents of the parent repository to all the other nodes in the cluster. This operation deletes the contents of all the child nodes; therefore, run this operation initially when defining the cluster.
- Open the Workflow Manager Administrator and connect to the parent repository.
- Right-click the database connection and click Manage Replication.
- Click Create New Replica.
This process might run for several minutes, depending on the size of the parent repository.
- Click OK when replication completes.
Synchronizing replicas using administrator
The synchronize process compares the differences between all the children in the cluster, imports them to the parent node, and sends the changes to all the other nodes in the cluster.
- Open the Workflow Manager Administrator and connect to the parent repository.
- Right-click the database connection and click Manage Replication.
- Click Synchronize Replicas.
This process might run for several minutes, depending on the number of changes in the parent and child repositories.
- Click OK when synchronization completes.
Disconnected replication
Disconnected replication is ideal for locations where network connectivity is a problem or when Workflow Manager for Server is not available at each location. It can be used when the parent repository is on a server but the child nodes are not connected to a server, or when neither the parent nor the child repositories are connected to a server. There are two ways to set up disconnected replication:
- ArcGIS Workflow Manager Administrator—Use the Manage Replication dialog box to create a configuration file consisting of all elements and jobs in the parent repository.
- Geoprocessing tools—Use ArcGIS Workflow Manager geoprocessing tools to create configuration files to create replicas and synchronize changes.
Creating the Workflow Manager repositories
Run the post installation for all repositories participating in your cluster.
- From the Start menu, navigate to the ArcGIS Workflow Manager menu and click Workflow Manager Post Installation.
See the Workflow Manager post installation topic for more details.
- Specify and note the Repository Name on the last page of the Post Installation utility.
- Repeat the steps for all repositories participating in the cluster.
Note:
The Windows login for the user that creates the initial replica from the parent repository must be added to all child repositories. Grant the user administrator access or Manage Replication privileges.
Disconnected replication—parent repository connected
In some disconnected replication setups, the parent repository is connected to a server but none of the child nodes has access to a server. In this case, the parent repository is published as a service, and the configuration from the parent repository is stored as a configuration file. This configuration file is used to create replicas in the child nodes, and changes are also synchronized using configuration files. This scenario uses both the replication tools in Workflow Manager Administrator and the geoprocessing tools.
Creating the Workflow Manager service for the parent
If the parent repository is connected to a server, create the Workflow Manager service for the parent repository.
- Create a Workflow Manager service for the parent repository.
Creating and adding nodes to the cluster in administrator
If the parent repository is connected to a server, the disconnected replication can be partially managed in the administrator. Clusters can be created through the administrator on the parent repository and the Manage Replication tools can be used for adding the nodes.
- Open the Workflow Manager Administrator and connect to the parent repository.
- Right-click the database connection and click Manage Replication.
The repository name for the parent is already completed on the Manage Replication dialog box.
- By default, the Connected check box is checked. Uncheck the box for disconnected replication.
- Specify the ArcGIS Workflow Manager Server URL for the parent repository you published in the steps above. For example, http://yourserver:6080/ArcGIS/rest/Services/Parent/WMServer.
The Last Sync column is empty if a replica hasn't been created.
- By default, the Connected check box is checked. Uncheck the box for disconnected replication.
- Click the Add button to add a new child to the cluster.
- Specify the repository name for the child repository.
Note:
Click the Save button at any time to store the added information.
- By default, the Connected check box is checked. Uncheck the box for disconnected replication.
- Specify the folder location to store the created parent repository configuration file.
The Last Sync column is empty if a replica hasn't been created.
- Repeat steps 6, 7, and 8 for all the other nodes.
- After adding all the nodes in the cluster, click Save.
Creating new replicas using administrator
In disconnected replication, if the parent repository is connected to a server, the Create New Replica button creates a configuration file with the elements and jobs of the parent repository at the specified folder location. This operation does not delete the contents of the child nodes, because the process cannot communicate with them. Run this operation initially when defining the cluster.
- Open the Workflow Manager Administrator and connect to the parent repository.
- Right-click the database connection and click Manage Replication.
- Click Create New Replica.
This process might run for several minutes, depending on the size of the parent repository.
- Click OK when replication completes.
Synchronizing replicas using geoprocessing tools
- Open ArcCatalog or ArcMap and expand the Workflow Manager toolbox.
- Open the Export Job Data tool.
- Specify the folder location for the Folder to export to parameter.
- Specify the folder location of the child repository connection file for the Input Database Path (.jtc) parameter.
If no connection file is specified, the current default Workflow Manager database is used.
- Specify the repository name of the child repository for the Repository Name parameter.
If no repository name is specified, the current default Workflow Manager database repository name is used.
- Specify the date when replicas were created or changes were last synchronized for the Export Since parameter.
- Click OK on the tool dialog box.
The child repository configuration consisting of changes is exported to the specified folder location as a .jxl file. Only changes made after the date specified for the Export Since parameter are exported.
- Open the Import Job Data tool.
- Select the file created in step 7 as the input for the Input JXL/Acknowledgement parameter.
- Check the check box for the Merge parameter.
- Specify the folder location of the parent repository connection file for the Input Database Path (.jtc) parameter.
If no connection file is specified, the current default Workflow Manager database is used.
- Specify the repository name of the parent repository for the Repository Name parameter.
If no repository name is specified, the current default Workflow Manager database repository name is used.
- Click OK on the tool dialog box.
The child repository configuration with changes is imported to the specified parent repository and merged with contents of the parent repository.
- Repeat steps 2 through 13 for all child repositories to send changes to the parent repository.
- Open the Export Job Data tool.
- Specify the folder location for the Folder to export to parameter.
- Specify the folder location of the parent repository connection file for the Input Database Path (.jtc) parameter.
If no connection file is specified, the current default Workflow Manager database is used.
- Specify the repository name of the parent repository for the Repository Name parameter.
If no repository name is specified, the current default Workflow Manager database repository name is used.
- Specify the date when the replicas were created or changes were last synchronized for the Export Since parameter.
- Click OK on the tool dialog box.
The parent repository configuration consisting of changes from the parent repository as well as all child repositories is exported to the specified folder location as a .jxl file. Only changes made after the date specified for the Export Since parameter are exported.
- Open the Import Job Data tool.
- Select the file created in step 20 as the input for the Input JXL/Acknowledgement parameter.
- Check the check box for the Merge parameter.
- Specify the folder location of the child repository connection file for the Input Database Path (.jtc) parameter.
If no connection file is specified, the current default Workflow Manager database is used.
- Specify the repository name of the child repository for the Repository Name parameter.
If no repository name is specified, the current default Workflow Manager database repository name is used.
- Click OK on the tool dialog box.
The parent repository configuration with all consolidated changes is imported to the specified child repository and merged with contents of the child repository.
- Repeat steps 21 through 26 for all child repositories.
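The effect of the Export Since parameter in the steps above can be illustrated with a small filter. This Python sketch models a repository as a dictionary of element names mapped to a content string and a last-modified time; the function name is hypothetical. It assumes that an element is exported only when it changed strictly after the Export Since date, and that an empty Export Since value exports everything. Whether the real tool treats the boundary as inclusive is not stated in this topic.

```python
from datetime import datetime
from typing import Dict, Optional, Tuple

# element name -> (content, last-modified time)
Repository = Dict[str, Tuple[str, datetime]]


def changes_since(repo: Repository, export_since: Optional[datetime]) -> Repository:
    """Return the elements that an export with this Export Since value would carry."""
    if export_since is None:
        # No date given: export the full repository contents.
        return dict(repo)
    # Assumption: only elements modified after the date are included.
    return {name: element for name, element in repo.items()
            if element[1] > export_since}
```

Passing the date of the last synchronization keeps each .jxl file limited to the changes the other repositories have not seen yet.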
Disconnected replication—all repositories disconnected
In cases where none of the repositories participating in the cluster are connected to a server, all information is exchanged through configuration files. The configuration files are created using the Export Job Data and Import Job Data geoprocessing tools.
Creating new replicas using geoprocessing tools
- Open ArcCatalog or ArcMap and expand the Workflow Manager toolbox.
- Open the Export Job Data tool.
- Specify the folder location for the Folder to export to parameter.
- Specify the folder location of the parent repository connection file for the Input Database Path (.jtc) parameter.
If no connection file is specified, the current default Workflow Manager database is used.
- Specify the repository name of the parent repository for the Repository Name parameter.
If no repository name is specified, the current default Workflow Manager database repository name is used.
- Do not specify any value for the Export Since parameter.
- Click OK on the tool dialog box.
The parent repository configuration is exported to the specified folder location as a .jxl file.
- Open the Import Job Data tool.
- Select the file created in step 7 as the input for the Input JXL/Acknowledgement parameter.
- Uncheck the check box for the Merge parameter.
The check box must be checked and unchecked again to pass the information to the dialog box.
- Specify the folder location of the child repository connection file for the Input Database Path (.jtc) parameter.
If no connection file is specified, the current default Workflow Manager database is used.
- Specify the repository name of the child repository for the Repository Name parameter.
If no repository name is specified, the current default Workflow Manager database repository name is used.
- Click OK on the tool dialog box.
The parent repository configuration is imported to the specified child repository, and the entire contents of the child repository are replaced with the contents of the parent repository; therefore, the child repository becomes identical to the parent repository.
- Repeat steps 8 through 13 for each child repository.
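The procedure above reduces to one export followed by one import per child. The Python sketch below only builds that sequence as data so it can be reviewed or logged before anything is run. The Step type and function name are hypothetical, and the sketch does not call the geoprocessing tools.

```python
from typing import List, NamedTuple, Optional


class Step(NamedTuple):
    tool: str                    # "Export Job Data" or "Import Job Data"
    repository: str              # repository the tool runs against
    merge: Optional[bool]        # only meaningful for imports
    export_since: Optional[str]  # only meaningful for exports


def replica_plan(parent: str, children: List[str]) -> List[Step]:
    """List the tool runs needed to create replicas without a server."""
    # Export the whole parent repository: Export Since is left empty.
    steps = [Step("Export Job Data", parent, None, None)]
    # Import into each child with Merge unchecked so the child is replaced.
    steps += [Step("Import Job Data", child, False, None) for child in children]
    return steps
```

For a parent and two children, the plan holds three steps: one export from the parent and two imports with Merge unchecked.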
Synchronizing replicas using geoprocessing tools
In cases where none of the repositories participating in the cluster are connected to a server, all information is exchanged and synchronized through configuration files. The configuration files are created using the Export Job Data and Import Job Data geoprocessing tools.
- Open ArcCatalog or ArcMap and expand the Workflow Manager toolbox.
- Open the Export Job Data tool.
- Specify the folder location for the Folder to export to parameter.
- Specify the folder location of the child repository connection file for the Input Database Path (.jtc) parameter.
If no connection file is specified, the current default Workflow Manager database is used.
- Specify the repository name of the child repository for the Repository Name parameter.
If no repository name is specified, the current default Workflow Manager database repository name is used.
- Specify the date when the replicas were created or changes were last synchronized for the Export Since parameter.
- Click OK on the tool dialog box.
The child repository configuration consisting of changes is exported to the specified folder location as a .jxl file. Only changes made after the date specified for the Export Since parameter are exported.
- Open the Import Job Data tool.
- Select the file created in step 7 as the input for the Input JXL/Acknowledgement parameter.
- Check the check box for the Merge parameter.
- Specify the folder location of the parent repository connection file for the Input Database Path (.jtc) parameter.
If no connection file is specified, the current default Workflow Manager database is used.
- Specify the repository name of the parent repository for the Repository Name parameter.
If no repository name is specified, the current default Workflow Manager database repository name is used.
- Click OK on the tool dialog box.
The child repository configuration with changes is imported to the specified parent repository and merged with contents of the parent repository.
- Repeat steps 2 through 13 for all child repositories to send changes to the parent repository.
- Open the Export Job Data tool.
- Specify the folder location for the Folder to export to parameter.
- Specify the folder location of the parent repository connection file for the Input Database Path (.jtc) parameter.
If no connection file is specified, the current default Workflow Manager database is used.
- Specify the repository name of the parent repository for the Repository Name parameter.
If no repository name is specified, the current default Workflow Manager database repository name is used.
- Specify the date when replicas were created or changes were last synchronized for the Export Since parameter.
- Click OK on the tool dialog box.
The parent repository configuration consisting of changes from the parent repository as well as all child repositories is exported to the specified folder location as a .jxl file. Only changes made after the date specified for the Export Since parameter are exported.
- Open the Import Job Data tool.
- Select the file created in step 20 as the input for the Input JXL/Acknowledgement parameter.
- Check the check box for the Merge parameter.
- Specify the folder location of the child repository connection file for the Input Database Path (.jtc) parameter.
If no connection file is specified, the current default Workflow Manager database is used.
- Specify the repository name of the child repository for the Repository Name parameter.
If no repository name is specified, the current default Workflow Manager database repository name is used.
- Click OK on the tool dialog box.
The parent repository configuration with all consolidated changes is imported to the specified child repository and merged with contents of the child repository.
- Repeat steps 21 through 26 for all child repositories.
The export and import of configuration files using geoprocessing tools can be scripted to automate the workflow.
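As a starting point for such a script, the Python sketch below captures only the order of operations in one synchronization round. The export and import steps are passed in as functions so that the ordering can be tried without ArcGIS installed. In a real script, those functions would wrap the Export Job Data and Import Job Data geoprocessing tools. The function signatures, the recording stand-ins, and the sample names here are assumptions for illustration, not the tools' actual parameters.

```python
from typing import Callable, List

# Hypothetical wrapper signatures (not the geoprocessing tools' real parameters):
ExportFn = Callable[[str, str, str], str]    # (folder, repository, export_since) -> .jxl path
ImportFn = Callable[[str, str, bool], None]  # (jxl_path, repository, merge)


def run_sync_round(parent: str, children: List[str], folder: str,
                   export_since: str, export_fn: ExportFn,
                   import_fn: ImportFn) -> None:
    """Run one disconnected synchronization round in the documented order."""
    # Phase 1: export each child's changes and merge them into the parent.
    for child in children:
        jxl_path = export_fn(folder, child, export_since)
        import_fn(jxl_path, parent, True)
    # Phase 2: export the consolidated parent changes once,
    # then merge that file into every child.
    consolidated = export_fn(folder, parent, export_since)
    for child in children:
        import_fn(consolidated, child, True)


# Stand-ins that record calls instead of running geoprocessing tools.
log = []


def fake_export(folder: str, repository: str, export_since: str) -> str:
    log.append(("export", repository))
    return f"{folder}/{repository}.jxl"


def fake_import(jxl_path: str, repository: str, merge: bool) -> None:
    log.append(("import", repository, merge))


run_sync_round("Parent", ["ChildA", "ChildB"], "exports", "2020-01-01",
               fake_export, fake_import)
```

Keeping the tool calls behind two small functions also makes it easier to add logging or error handling around each export and import.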
Deleting nodes from a cluster
Nodes added to the cluster can also be deleted using the tools available in the Workflow Manager Administrator. When a node is deleted, a message is sent to the parent and other nodes to ensure that the existing relationship is cleared from the system tables.
- Open the Workflow Manager Administrator and connect to the parent repository.
- Right-click the database connection and click Manage Replication.
- Click the Delete button.
Geoprocessing tools are available in the Workflow Manager toolbox to create replicas, synchronize replicas, delete nodes, and export and import data. This gives you the option to run these operations as scheduled tasks through a Python script. See An overview of the Workflow Manager toolbox for more information.
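A scheduled script needs to remember when the last synchronization ran so that it can supply that date the next time. One possible approach, sketched below in Python, is to keep the time stamp in a small JSON file. The file layout and function names are hypothetical, and the sketch does not call any Workflow Manager tools.

```python
import json
from datetime import datetime
from pathlib import Path
from typing import Optional


def load_last_sync(state_file: Path) -> Optional[datetime]:
    """Return the stored synchronization time, or None before the first run."""
    if not state_file.exists():
        return None
    data = json.loads(state_file.read_text(encoding="utf-8"))
    return datetime.fromisoformat(data["last_sync"])


def save_last_sync(state_file: Path, when: datetime) -> None:
    """Record the time of a completed synchronization in ISO 8601 format."""
    payload = {"last_sync": when.isoformat()}
    state_file.write_text(json.dumps(payload), encoding="utf-8")
```

A scheduled task could call load_last_sync at the start of a run and save_last_sync only after every export and import has finished, so that a failed run is retried with the same date.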