Data store scenarios for image services

How you organize your data and register it with the server determines whether the data must be packaged and copied to the server when you publish.

Before reading this topic, make sure you understand the fundamentals of how ArcGIS for Server stores and accesses data.

As you know, raster datasets and mosaic datasets can comprise many different files. This topic will help to guide you in organizing this data so you get the desired results when publishing.

One key point: if the path to the data is registered in the data store, ArcGIS for Server assumes that all the data is there. This rarely causes a problem when the registered location is shared with the server, but it can when the location is a duplicate. In that case, make sure you haven't changed one location without changing the other. For example, if you build pyramids for a raster dataset in your local location but don't copy them to the duplicated location on the server, they won't be copied during the publishing operation, because the server assumes that all the data files are already duplicated. Things get a little more complicated with mosaic datasets because more files are involved.
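Because the server does not check a duplicated location for you, it can help to compare the two folders yourself before publishing. The following is a minimal sketch in plain Python, not part of ArcGIS for Server; the function name and the idea of comparing by relative file path are assumptions made for illustration. It only reports files that exist locally but are absent from the server copy (such as pyramid files built after the data was duplicated). It does not detect files that exist in both places but differ in content.

```python
import os


def files_missing_on_server(local_root, server_root):
    """Return relative paths that exist under local_root but not under server_root."""
    missing = []
    for folder, _subdirs, files in os.walk(local_root):
        for name in files:
            # Path of this file relative to the top of the local location.
            rel = os.path.relpath(os.path.join(folder, name), local_root)
            # The duplicate is expected at the same relative path on the server side.
            if not os.path.exists(os.path.join(server_root, rel)):
                missing.append(rel)
    return sorted(missing)
```

Running this against the local and server folders before you publish gives a quick list of files that still need to be copied.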

One complexity is the location of the source data for your mosaic dataset. For simplicity, let's assume that the source data is a collection of raster datasets—including pyramid, statistics, and metadata files. All of this data can reside in a location where the mosaic dataset has read access.

Next, there is the mosaic dataset. For simplicity, let's assume it is stored in a file geodatabase. If you build overviews, these are stored in a folder next to the geodatabase. This folder has the same name as the geodatabase with a *.Overviews extension. If you've added lidar data, or generated cache for a raster item, there will be another folder stored next to the geodatabase with the same geodatabase name and a *.Cache extension. Both overview and cache storage locations are stored next to the geodatabase by default, but you can choose to store them elsewhere—which just adds to the complexity of your data organization.
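The default folder names described above can be derived from the geodatabase path. This is a small sketch only: the geodatabase name Imagery.gdb is hypothetical, and the code assumes that "the same name as the geodatabase" means the .gdb extension is replaced by .Overviews or .Cache. The folders exist only if overviews or a cache have actually been built, and this does not apply if you chose a custom storage location.

```python
from pathlib import PureWindowsPath


def default_sidecar_folders(geodatabase):
    """Return the default overview and cache folder paths next to a file geodatabase."""
    gdb = PureWindowsPath(geodatabase)
    return {
        # Same folder and base name as the geodatabase, different extension.
        "overviews": gdb.with_suffix(".Overviews"),
        "cache": gdb.with_suffix(".Cache"),
    }


folders = default_sidecar_folders(r"D:\Collections\Imagery.gdb")
print(folders["overviews"])
print(folders["cache"])
```

A list like this is a useful checklist of everything that has to travel with the mosaic dataset when it is duplicated or packaged.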

When you publish, you want to be sure that the server has access to all of the content managed by the mosaic dataset; so you must set the data store and prepare the mosaic dataset correctly. Each scenario below will expand on this issue specifically when using mosaic datasets since a mosaic dataset references data that can reside anywhere (as long as it can be read).

Scenario 1: All data is on a shared location

This is probably the simplest way to organize your data. In this scenario, everything is stored in a location that is shared with both you and the server. You need to register this location with the server, and connect to it in the Catalog window (or ArcCatalog) to share the data from it as an image service. This is also a fast way to publish, since no data is moved.

Scenario 2: All data is duplicated

In this scenario, your data is stored in two locations: one accessed by the server and one you connect to in the Catalog window. This setup is commonly used when the server is in the cloud or runs on a Linux operating system.

You must ensure that the data is exactly the same. For example, if you modify the mosaic dataset by adding a new image or modifying the footprints, you must ensure that the copy the server accesses is updated. You also must ensure that the paths to the data are modified accordingly. A mosaic dataset contains hard-coded paths to all of its content. Therefore, if the location for your content is D:\MyData and the data on the server is \\Blue\ServerData, you need to make sure the paths in the mosaic dataset are updated on the \\Blue\ServerData location. You can update these paths before or after the mosaic dataset (and associated files) is duplicated on the server location; see Repairing paths in a mosaic dataset.
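The path change in this example is a prefix swap: everything under D:\MyData must resolve to the same relative location under \\Blue\ServerData. The sketch below shows that mapping in plain Python. It does not modify a mosaic dataset (use the path repair workflow referenced above for that); the function name and the file Landsat\scene1.tif are hypothetical.

```python
from pathlib import PureWindowsPath


def remap_path(path, local_root, server_root):
    """Translate a path under local_root to the matching path under server_root."""
    # relative_to raises ValueError if the path is not under local_root,
    # which flags content that lives outside the duplicated location.
    relative = PureWindowsPath(path).relative_to(PureWindowsPath(local_root))
    return str(PureWindowsPath(server_root) / relative)


print(remap_path(r"D:\MyData\Landsat\scene1.tif", r"D:\MyData", r"\\Blue\ServerData"))
```

Applying the same mapping to every path the mosaic dataset stores is a quick way to confirm that each file has a counterpart on the server.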

Before publishing, ensure that you register the local and server locations as duplicate locations, and ensure that the data is duplicated and the paths are correct. Then you can publish the mosaic dataset by pointing to the location on your local machine. The server will know that the location is duplicated and therefore it won't move any data. Like scenario 1, this also makes publishing fast.

Scenario 3: There is no registered data location

Like scenario 1, this is uncomplicated, because you don't have to worry about where your data is or whether the server can access it, or the correct version of it. In this scenario, all the data is packaged and moved to the server when it's published. This works well for small collections of data, but it is not recommended for medium to large collections because of the time it can take to package and move the data. You might choose this option when you don't have any access to the location used by the server, or if you're publishing a small raster dataset. Moving gigabytes or more of data this way is not efficient.

Scenario 4: Only the source data is in a registered location

In this scenario, the source data location is not the same location as the mosaic dataset. This source data location can be shared or duplicated.

Example 1: shared source data

The source data location is shared on \\yellow\RasterData. On your local machine you create a mosaic dataset and add the data from \\yellow\RasterData. Then when you publish the mosaic dataset the process will include packaging the mosaic dataset and the associated files (such as the contents of the *.Overviews folder), moving it to the server, and updating the hard-coded file path locations (or ensuring their relative locations remain the same). This could take a long time if there are a lot of overviews.

Example 2: duplicated source data

The source data is duplicated—on the server it is on P:\SourceData\RasterData, and on your local machine it's on D:\RasterData.

In this example, you must ensure that the mosaic dataset is not created within D:\RasterData, because the server assumes that these two locations are duplicated, and it will not check P:\SourceData\RasterData to see if the mosaic dataset is there when publishing.

Create your mosaic dataset in a unique location, such as D:\Collections. Then when you publish the mosaic dataset the process will include packaging the mosaic dataset and the associated files (such as the contents of the *.Overviews folder), moving it to the server, and updating the hard-coded file path locations (or ensuring their relative locations remain the same).
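A simple pre-publishing check can catch a mosaic dataset that was created inside the duplicated source location by mistake. The following is a sketch under stated assumptions: the function name and the geodatabase name Mosaics.gdb are hypothetical, and the check is a plain path comparison that does not query the server's data store. It compares whole path components, so D:\RasterData2 is not mistaken for a subfolder of D:\RasterData.

```python
from pathlib import PureWindowsPath


def is_inside(path, root):
    """Return True if path is root itself or lies anywhere beneath it."""
    candidate = PureWindowsPath(path)
    registered = PureWindowsPath(root)
    # PureWindowsPath compares case-insensitively, as Windows paths do.
    return candidate == registered or registered in candidate.parents


print(is_inside(r"D:\RasterData\Mosaics.gdb", r"D:\RasterData"))
print(is_inside(r"D:\Collections\Mosaics.gdb", r"D:\RasterData"))
```

If the check returns True for your mosaic dataset, move or re-create it in a separate location, such as D:\Collections, before publishing.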

9/1/2015