Allocation of server resources to caching
ArcGIS Server creates cache tiles using a geoprocessing service named CachingTools. This service is configured for you in the System folder when you create the ArcGIS Server site. The number of instances you allow for the CachingTools service determines how much power your machine can dedicate toward caching jobs.
Additionally, at least one instance of the map, globe, or image service that you are caching must always be running. Increasing the number of instances of that map, globe, or image service does not affect how fast tiles are created.
In 10.0 and earlier versions, to increase the number of operating system processes working on a caching job, you increased the number of instances of the map or globe service being cached. In 10.1 and later releases, you increase the number of instances of the CachingTools geoprocessing service instead.
Choosing the number of instances to allow for the CachingTools service
At any time, you can use Manager to adjust the maximum number of instances of the CachingTools geoprocessing service that you want to make available for working on caching jobs. The minimum and maximum values apply to each individual GIS server; thus, if your maximum is set to a value of 3 and you have four GIS servers in the cluster running the CachingTools service, you could have up to 12 instances of CachingTools running.
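The per-server arithmetic above can be sketched in a few lines of Python. This is only an illustration of the multiplication described in the text; the function name and parameters are hypothetical and are not part of any ArcGIS API.

```python
def total_caching_instances(max_per_server: int, server_count: int) -> int:
    """Upper bound on CachingTools instances across a cluster.

    Hypothetical helper: the maximum applies to each GIS server
    individually, so the cluster-wide ceiling is the product.
    """
    if max_per_server < 0 or server_count < 0:
        raise ValueError("counts must be non-negative")
    return max_per_server * server_count


# Example from the text: a maximum of 3 on each of four GIS servers.
print(total_caching_instances(3, 4))  # 12
```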
This behavior allows you to add and remove GIS servers from the site to increase or reduce the amount of resources dedicated to caching. You can add a GIS server even while a caching job is running; the new server is detected and assigned tiles to create.
If you choose to allow too many instances of the CachingTools service, your machine can become overwhelmed and inefficient. If you choose to allow too few instances, your machine may be underutilized. Finding the best number can be a process of trial and error. A good starting point is to allow a maximum of n + 1 instances, where n is the number of CPU cores on a single machine in your cluster. If you're deploying your site on Amazon Web Services, use 2n + 1 where n is the number of virtual cores on a single EC2 instance in your site.
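As a rough sketch of the starting-point rule above, the following Python computes n + 1, or 2n + 1 for Amazon Web Services. The function name and the on_aws flag are hypothetical. Note also that os.cpu_count() reports logical CPUs, which may differ from the core count the guideline has in mind, so treat the result as a first guess to refine by trial and error.

```python
import os


def suggested_max_instances(cores: int, on_aws: bool = False) -> int:
    """Starting value for the CachingTools maximum on one machine.

    Hypothetical helper implementing the rule of thumb from the text:
    n + 1 in general, 2n + 1 on an EC2 instance (n = virtual cores).
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return 2 * cores + 1 if on_aws else cores + 1


# Assumption: the logical CPU count approximates n on this machine.
cores = os.cpu_count() or 1
print(suggested_max_instances(cores))
print(suggested_max_instances(cores, on_aws=True))
```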
The CachingTools service must run with its execution mode as Asynchronous. This is the default value.
Choosing the number of instances to work on a caching job
Tools such as Manage Map Server Cache Tiles allow you to choose how many instances of CachingTools will work on the job. You can choose to divide the available instances of CachingTools among several running jobs. A job might not utilize its maximum number of instances of CachingTools if those instances are being used by other jobs. If a caching job is using all the CachingTools instances, other requested jobs are queued until the first job finishes.
Scenarios
Suppose you want to create a cache and you have four GIS servers in a site with one cluster. You've configured each server to allow a maximum of five instances of CachingTools. The maximum number of instances you can dedicate toward any caching job is 20 (four servers with five instances each).
If you want to run two simultaneous caching jobs on this site and maintain an evenly distributed load, you can dedicate 10 instances toward each job.
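The even split in this scenario can be sketched as follows. The helper is hypothetical and only shows the arithmetic; how any remainder is handed out when the total does not divide evenly is an assumption of this sketch, not documented server behavior.

```python
def split_instances(total: int, job_count: int) -> list[int]:
    """Divide a pool of CachingTools instances as evenly as possible.

    Hypothetical helper: any remainder is given to the first jobs,
    one extra instance each.
    """
    if total < 0 or job_count < 1:
        raise ValueError("total must be >= 0 and job_count >= 1")
    base, extra = divmod(total, job_count)
    return [base + 1 if i < extra else base for i in range(job_count)]


# Four servers with five instances each, shared by two jobs.
print(split_instances(4 * 5, 2))  # [10, 10]
```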
Allowing for elasticity
If your site runs in a cloud environment that can automatically add GIS servers in response to demand, you might not want a fixed maximum number of instances to limit the job. In this situation, you can enter a value of -1 to indicate that there is no limit on the number of instances that can work on the job. All the available instances of CachingTools will be used for the job, no matter how many GIS servers are added to your site.
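One way to picture the meaning of -1 is the small Python sketch below. It models only the rule stated in this topic: a job gets at most what it requested and at most what is available, and -1 removes the requested cap. The names are hypothetical, and the server's actual scheduling logic is not shown here.

```python
UNLIMITED = -1  # value entered to indicate "no limit" for a job


def instances_for_job(requested: int, available: int) -> int:
    """Number of CachingTools instances a job could use right now.

    Hypothetical model: 'available' is the count of instances not
    being used by other jobs across all GIS servers in the cluster.
    """
    if available < 0:
        raise ValueError("available must be non-negative")
    if requested == UNLIMITED:
        return available
    if requested < 1:
        raise ValueError("requested must be -1 or a positive number")
    return min(requested, available)


print(instances_for_job(10, 20))         # 10: capped by the request
print(instances_for_job(UNLIMITED, 20))  # 20: every available instance
print(instances_for_job(UNLIMITED, 25))  # 25: grows after a server is added
```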
Setting the number of jobs that can run simultaneously
If too many publishers request caches to be built at the same time, the server can get overwhelmed, even if you dedicate only a small number of instances toward each job. The CachingControllers service (in the System folder) determines how many jobs can run at the same time.
The default maximum number of instances for the CachingControllers service is 3, meaning only three caching jobs can run at once. If the server receives a request for a fourth caching job, it will be queued until one of the other jobs has finished. If you want to allow four jobs to run at once, you can set the maximum number of instances of CachingControllers to 4.
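The queuing behavior described above can be modeled with a short simulation. This is a sketch under stated assumptions: the function names are hypothetical, and first-in, first-out ordering of queued jobs is assumed for illustration rather than taken from the text.

```python
from collections import deque


def admit_jobs(requested_jobs, max_controllers=3):
    """Split incoming jobs into running and queued.

    Hypothetical model: one CachingControllers instance per running
    job, so at most max_controllers jobs run at once.
    """
    running = list(requested_jobs[:max_controllers])
    queued = deque(requested_jobs[max_controllers:])
    return running, queued


def finish_job(job, running, queued):
    """Mark a job finished and start the next queued job, if any."""
    running.remove(job)
    if queued:
        running.append(queued.popleft())  # assumed FIFO order


running, queued = admit_jobs(["job1", "job2", "job3", "job4"])
print(running, list(queued))  # ['job1', 'job2', 'job3'] ['job4']

finish_job("job2", running, queued)
print(running, list(queued))  # ['job1', 'job3', 'job4'] []
```

Raising max_controllers to 4 in this model lets all four jobs start immediately, which mirrors setting the maximum number of instances of CachingControllers to 4.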
Using clusters
Clusters are used in large ArcGIS sites to divide work among subsets of GIS servers. Caching jobs are elastic and spread to all available GIS servers in the cluster on which the CachingTools service is running.
When you configure your site for the first time, there is only one cluster, named default. If you want to constrain your caching jobs to a subset of machines, you should create a new cluster and assign the CachingTools service to run on that cluster. You can then potentially assign your other services to a different cluster so that they are not overrun by processes from the caching job.
You can create a cache for a service that is not running in the same cluster as the CachingTools geoprocessing service. For example, you might have a map service, Spain, that is running on Cluster A and your CachingTools service running on Cluster B. With this configuration, you can still create a cache of Spain.
You should always run the CachingTools service and the CachingControllers service on the same cluster.