Overview

Organizations with large collections of imagery often have difficulty managing it and making it accessible to users, whether within the organization or external to it. The traditional method of distributing imagery was to share files physically, but this was inefficient and created problems, especially when working with large datasets. The following topics discuss various methods of serving imagery effectively and provide an overview of the workflows that help in better image management.

Image services

Image services enable web-based access to imagery for both visualization and analysis. ArcGIS for Server provides the ability to serve individual raster datasets as image services. These image services enable a wide range of applications to quickly access imagery, because only the required data is returned to the client application. In addition to returning the original pixel value, the server can also perform on-the-fly processing, by processing the pixels as they are accessed and returning rendered and compressed versions of the imagery. The ArcGIS for Server Image Extension extends these capabilities by using mosaic datasets as a source. Mosaic datasets can reference large collections of imagery, maintain metadata, and define how images should be processed into multiple image products. The image extension enables the serving of mosaic datasets not only as searchable catalogs, but also as dynamic mosaics in which the multiple images are fused together based on predefined or user-defined rules. In this way, fast and simple access is provided to large collections of imagery.
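To make "only the required data is returned" more concrete, the following minimal sketch builds a request for the exportImage operation of an image service. The service URL is a hypothetical placeholder, and the parameter names (bbox, bboxSR, size, format, f) are those of the ArcGIS REST API as commonly documented; verify them against the documentation for your server version.

```python
from urllib.parse import urlencode

# Hypothetical service URL; replace with a real image service endpoint.
SERVICE_URL = "https://example.com/arcgis/rest/services/Imagery/ImageServer"


def export_image_url(service_url, bbox, width, height, wkid=3857,
                     image_format="jpgpng"):
    """Build an exportImage request for one view extent and display size."""
    params = {
        "bbox": ",".join(str(v) for v in bbox),   # xmin,ymin,xmax,ymax
        "bboxSR": wkid,                           # spatial reference of bbox
        "size": "{0},{1}".format(width, height),  # pixels the client displays
        "format": image_format,
        "f": "image",                             # return the image itself
    }
    return "{0}/exportImage?{1}".format(service_url, urlencode(params))


url = export_image_url(SERVICE_URL,
                       (-13050000, 3850000, -13040000, 3860000), 800, 600)
print(url)
```

The client asks only for the extent it is showing and the number of pixels it can display, so the server returns a small rendered image regardless of how large the underlying collection is.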

Mosaic datasets

Mosaic datasets are the optimum data model for the efficient management and serving of large collections of imagery. Mosaic datasets catalog the rasters along with their metadata, but also enable the definition of additional metadata for each raster as well as processing to be applied when the data is accessed. The mosaic dataset can then be either directly accessed by ArcGIS applications, such as ArcGIS for Desktop, or served as image services, making them accessible to a wide range of desktop, web, and mobile devices. You can quickly access the imagery as well as associated metadata for use as a background image in applications or for detailed interpretation or analysis.

Mosaic datasets are generally created using tools in ArcGIS for Desktop. A range of tools is provided to create the mosaic dataset, add the rasters, define or refine metadata, add processing, and modify properties. These same tools can also be incorporated into scripts that can be run either in Desktop or on Server in fully automated environments.

When mosaic datasets are accessed directly by ArcGIS for Desktop, the application determines the view extent for the imagery that needs to be accessed, performs the required processing, and displays the image. When connecting through ArcGIS for Server, the client application defines the extent and scale of the imagery required and the server determines the imagery that needs to be accessed, performs the required processing, and returns only the required processed pixels to the application. As a user of these dynamic image services, you can access the rendered imagery, a catalog of the imagery, and metadata and also define additional processes to be applied to the imagery. You can also define additional queries and filters to limit which images are displayed. When there are overlapping images, you can define the mosaic method rules that control the display order of the imagery. In this way, ArcGIS performs dynamic mosaicking by combining different sources into a virtual image and on-the-fly processing by transforming the pixels to required products as they are accessed. According to the requirements (and when allowed by the administrator), you can export this processed imagery or download the original imagery.
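The queries, filters, and mosaic method rules described above are passed to a dynamic image service as a mosaic rule. The sketch below assembles one as a Python dictionary. The keys and values (mosaicMethod, sortField, sortValue, ascending, where, mosaicOperation) follow the ArcGIS REST API mosaicRule object as commonly documented; the field name CloudCover is an assumption about the service's attribute table, not something defined in this guide.

```python
import json


def build_mosaic_rule(where=None, sort_field=None, sort_value=None,
                      ascending=True):
    """Return a mosaicRule dictionary for an image service request."""
    if sort_field:
        # Order overlapping images by how close an attribute is to a base value.
        rule = {
            "mosaicMethod": "esriMosaicAttribute",
            "sortField": sort_field,
            "sortValue": sort_value,
            "ascending": ascending,
        }
    else:
        # No ordering preference given: use a simple spatial method.
        rule = {"mosaicMethod": "esriMosaicNorthwest"}
    if where:
        rule["where"] = where  # limits which rasters take part in the mosaic
    rule["mosaicOperation"] = "MT_FIRST"  # top-most image supplies the pixel
    return rule


# Hypothetical field name: prefer the least cloudy images, skip cloudy ones.
rule = build_mosaic_rule(where="CloudCover < 10",
                         sort_field="CloudCover", sort_value="0")
print(json.dumps(rule, sort_keys=True))
```

The serialized rule would typically be sent as the mosaicRule parameter of a request, so the same service can present different orderings to different clients without any change to the mosaic dataset.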

This rich functionality enables the full information content in imagery to be accessed for visual interpretation as well as analysis.

As an imagery manager, you may want to provide your users quick access to the imagery, but preferably without creating intermediate products or using lots of different services. Users prefer simple user experiences and applications where a single layer provides access to all the appropriate imagery that they require for a task within an application. A typical user would prefer to zoom directly into their area of interest and immediately see the most appropriate imagery without needing to first define a source or search for appropriate imagery. They may also want to query metadata of the imagery to ensure that it is suitable for some analysis, or query the availability of other imagery, and optionally, change the display of the imagery. They may also want to define processing to be performed to enhance the image or perform some analysis. Such user experiences can be achieved using image services.

Image services are designed to be used directly so that the users do not need to download the imagery to their local machines. There may be valid reasons for a user to download, such as exporting a section of imagery covering a specified extent, taking imagery to the field, or downloading the original pixels for local analysis. This is allowed, but in many cases it is discouraged, as it creates multiple redundant copies of the imagery and makes image management and maintenance more complex.

Caching imagery

Map caching is an alternative method of serving imagery, by which imagery is cached as tiles that are returned to the client applications. Caching provides the most scalable method for serving imagery, because accessing pregenerated cache tiles puts very little load on a server and exploits the inherent caching capabilities of networks and browsers for static content. Caching imagery is most valuable when imagery is only required as a background and no filtering or control of the image order is required. Cached imagery is generally highly compressed and limited to 3-band 8-bit imagery, so it is not suitable for analysis. Caching does require additional storage, but in cases where the original imagery need not be accessed (and so can be archived), it can also reduce the storage requirement on the server. The optimum method of creating a cache of imagery is to first create a mosaic dataset. The mosaic dataset can then be served as an image service, converted to a cache, or served as an image service with caching. When serving image services with caching, you can connect to and use the default cached view of a service, which puts very little load on the server. When you need to better interpret, query, or analyze the imagery, you can change to the dynamic image service. For more information, see How applications access and use the image service cache.

With ArcGIS 10.1 for Desktop SP1, desktop caching was added. This enables the creation of a cache from either a raster dataset or a mosaic dataset directly within ArcGIS for Desktop. This imagery cache can be used as a new compressed raster dataset served as a map service from ArcGIS for Server or through ArcGIS Online.
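To get a feel for the additional storage that caching requires, the following sketch counts the tiles that intersect an extent at each cache level. It assumes the Web Mercator tiling scheme used by ArcGIS Online (256 x 256 pixel tiles, one tile covering the world at level 0); other tiling schemes would need different constants. The function names are illustrative only.

```python
import math

# Assumption: Web Mercator tiling scheme, one tile for the world at level 0.
HALF_WORLD = 20037508.342789244  # metres from the origin to the scheme edge


def tiles_for_extent(xmin, ymin, xmax, ymax, level):
    """Count the cache tiles that intersect an extent at one level."""
    per_side = 2 ** level
    tile_size = 2 * HALF_WORLD / per_side  # tile width in metres

    def index(offset):
        # Clamp so the far edge of the world maps to the last tile.
        return min(per_side - 1, max(0, int(math.floor(offset / tile_size))))

    cols = index(xmax + HALF_WORLD) - index(xmin + HALF_WORLD) + 1
    rows = index(HALF_WORLD - ymin) - index(HALF_WORLD - ymax) + 1
    return cols * rows


def tiles_in_cache(extent, min_level, max_level):
    """Total number of tiles over a range of levels."""
    return sum(tiles_for_extent(*extent, level=lv)
               for lv in range(min_level, max_level + 1))


world = (-HALF_WORLD, -HALF_WORLD, HALF_WORLD, HALF_WORLD)
print(tiles_in_cache(world, 0, 2))  # 1 + 4 + 16 tiles
```

Each additional level roughly quadruples the tile count for a given extent, which is why the largest scales dominate the size of a cache.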

Geoprocessing

The third way of serving imagery is through geoprocessing services. This is generally used when you do not need to visualize the image but want the result of some analysis on the imagery. Preferably, the required analysis is performed on a server close to where the imagery is located, such that the data transfer over the networks is minimized. A typical example would be an automated feature identification or the creation of a viewshed from elevation data. Both these processes potentially require access to a large amount of imagery but return a small feature set. Geoprocessing services have the advantage that large collections of imagery can be analyzed without the user needing access to the source imagery. For such imagery-based geoprocessing, it is advantageous to use mosaic datasets or image services as the source.

Creating mosaic datasets

Mosaic datasets are used as the basis of image management. In principle, the creation of a mosaic dataset is simple.

The general steps as well as details for each of the available functions can be found in the ArcGIS Help. This guide assumes that you have a basic knowledge and experience of creating mosaic datasets from different types of imagery.

For simple collections of imagery, these general steps can be followed to create mosaic datasets (using default values) that can be used directly, served as image services, or used for caching.

Mosaic datasets can be used to manage a wide range of different imagery, from preprocessed orthos to elevation data to nonprocessed satellite imagery. The volume and number of images can quickly become massive. To enable flexibility and scalability in managing such large volumes of imagery, there are many different tools to work with mosaic datasets and many parameters that can be set to optimize them for different types of imagery. The imagery management workflows defined in this guide provide templates of best practices, including scripts and sample data for working with different types of imagery. These workflows assume that larger collections of imagery are to be managed. In such cases, it is often impractical to work with a single mosaic dataset for all imagery, so the workflows follow a pattern of using source, derived, and referenced mosaic datasets. These are described in Imagery Data Management Patterns and Recommendations. This pattern breaks a potentially complex task into smaller tasks. It makes it easier to manage multiple sources, perform quality assurance of the mosaic datasets, and maintain the services.

This guidebook further expands on these concepts and provides greater detail in the implementation of workflows for different types of imagery. The following diagram provides an overview.

Image management workflow

Source mosaic datasets

For each collection of similar images, a source mosaic dataset is created. A source mosaic dataset could represent all imagery from a specific type of sensor or represent imagery that was acquired as a project covering a known extent or period in time. The number of images in each source mosaic dataset typically ranges from tens to hundreds of thousands of images. All imagery in a source mosaic dataset should be similar in terms of the number of bands, bit depth, and type of metadata. Typically, a single raster type is used to add all rasters to a source mosaic dataset. Imagery in a source mosaic dataset typically also has similar scales (or pixel size), but may be in different projections. A source mosaic dataset represents a single manageable unit typically used for aspects such as checking that metadata is defined correctly, defining specific processes to be applied, or doing quality assurance. Each record in the source mosaic dataset defines a dataset with specific metadata.

Typically, if modifications to the raster item (within the mosaic dataset) such as clipping images to a footprint or applying a stretch or orthorectification are required, they are defined and refined in the source mosaic dataset. The spatial reference of a source mosaic dataset should be the best choice to encompass all imagery. For example, do not use a state plane projection to contain data across an entire country. Instead, use a projection suitable to contain the entire country's data. The spatial reference system should be selected so that all the imagery to be added fits within the extent or horizon of the selected spatial reference system. The number of bands and bit depth of the source mosaic dataset are set to be suitable to contain all the data. For example, a source mosaic dataset with high-resolution satellite imagery, such as GeoEye-1, IKONOS, or QuickBird, would be defined as 4-band, 16-bit (not 1-band 8-bit). In some but not all cases, overviews may be generated for a source mosaic dataset to enable visualization at smaller scales. Source mosaic datasets are generally not made accessible to the end users or served as image services.

Source mosaic datasets do not have to be static; over time additional rasters can be added. In some workflows, source mosaic datasets are created manually, while for others the creation of source mosaic datasets may be fully automated.
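As a minimal sketch of the automated case, the function below creates a 4-band, 16-bit source mosaic dataset and adds rasters to it using the Create Mosaic Dataset and Add Rasters To Mosaic Dataset tools. The function name, paths, and the choice of the generic Raster Dataset raster type are assumptions for illustration; the arcpy module is passed in as gp so the logic can be read and exercised without an ArcGIS installation.

```python
def build_source_mosaic(gp, workspace, name, spatial_ref, image_folder):
    """Create a 4-band, 16-bit source mosaic dataset and add rasters to it.

    gp is the arcpy module, passed in by the caller.
    """
    # Bands and bit depth are set to suit all the data in the collection.
    gp.CreateMosaicDataset_management(
        workspace, name, spatial_ref, 4, "16_BIT_UNSIGNED")
    mosaic = workspace + "/" + name
    # Assumption: the generic raster type. A sensor-specific raster type
    # would normally be chosen for satellite products.
    gp.AddRastersToMosaicDataset_management(
        mosaic, "Raster Dataset", image_folder)
    return mosaic


# Typical use (hypothetical paths; requires ArcGIS):
#   import arcpy
#   sr = arcpy.SpatialReference(3857)
#   build_source_mosaic(arcpy, "C:/data/imagery.gdb", "Source_QB",
#                       sr, "C:/data/quickbird")
```

Running the same function again with a different folder and name gives one source mosaic dataset per collection of similar images, which matches the unit of management described above.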

Derived mosaic datasets

Derived mosaic datasets are created from multiple source mosaic datasets. The derived mosaic dataset typically defines an imagery product that is to be served, regardless of the source, for example, natural color imagery for visual interpretation, multispectral imagery for analysis, or digital surface models that best represent the terrain. There may be multiple sources contributing to a derived mosaic dataset. The spatial reference of the derived mosaic dataset is set to encompass all the imagery, and the number of bands and bit depth is set to be appropriate for the product. The imagery is added to the derived mosaic dataset primarily by using the Table raster type. This enables all (or a queried selection of) records from a source mosaic dataset to be added. In some cases only a subset of all the source mosaic datasets will be added to a derived mosaic dataset. For example, images with too much cloud may be excluded based on metadata provided in the source mosaic dataset. All properties and metadata from the source records are copied to the derived mosaic dataset. Optionally, additional functions can be applied to transform the data. For example, the Extract Bands function may be used to convert imagery from 4-band to 3-band or a stretch applied to convert from 16-bit to 8-bit. Multiple derived mosaic datasets may use the same source mosaic datasets. For example, a derived mosaic dataset for natural color imagery and one for enabling multispectral analysis may use the same source mosaic dataset from a high-resolution satellite.

In some cases, imagery is directly added to a derived mosaic dataset. For example, an image source such as NaturalVue (available on ArcGIS Online as an image service or cached map service providing global 15-meter resolution imagery) may be added to provide a background image for natural color imagery, or an overview image from some other source may be added to provide context at small scales. If no suitable overview exists for the derived mosaic dataset, then overviews may be built. Derived mosaic datasets do not need to be static, and over time, the source mosaic datasets from which they are derived may change or new source mosaic datasets may be added. Two approaches can be used to update a derived mosaic dataset. One is to run the Synchronize Mosaic Dataset tool, which checks all sources for changes and applies them (see Synchronizing a mosaic dataset). Alternatively, if the process of creating the derived mosaic dataset is automated, the derived mosaic dataset can be re-created, as the process is generally very fast. Derived mosaic datasets may be directly served, but since serving a mosaic dataset can lock tables, referenced mosaic datasets are often used instead.
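The creation and refresh of a derived mosaic dataset can be sketched as follows. The function names are illustrative, and the arcpy module is passed in as gp. Note one swapped-in technique: rather than querying the source records as they are added, this sketch adds all records with the Table raster type and then removes unwanted ones with the Remove Rasters From Mosaic Dataset tool. Any field used in the exclusion clause (for example, a cloud cover field) is an assumption about your metadata.

```python
def build_derived_mosaic(gp, workspace, name, spatial_ref, num_bands,
                         pixel_type, source_mosaics, exclude_where=None):
    """Create a derived mosaic dataset from one or more source mosaic datasets."""
    gp.CreateMosaicDataset_management(
        workspace, name, spatial_ref, num_bands, pixel_type)
    derived = workspace + "/" + name
    for source in source_mosaics:
        # The Table raster type adds records from another mosaic dataset.
        gp.AddRastersToMosaicDataset_management(derived, "Table", source)
    if exclude_where:
        # Swapped-in technique: drop unwanted records after adding them.
        gp.RemoveRastersFromMosaicDataset_management(derived, exclude_where)
    return derived


def refresh_derived_mosaic(gp, derived):
    """Pick up changes made to the sources since the last build."""
    gp.SynchronizeMosaicDataset_management(derived)


# Typical use (hypothetical names; requires ArcGIS):
#   import arcpy
#   build_derived_mosaic(arcpy, "C:/data/imagery.gdb", "Derived_MS", sr,
#                        4, "16_BIT_UNSIGNED",
#                        ["C:/data/imagery.gdb/Source_QB"],
#                        exclude_where="CloudCover > 20")
```

Because the build is a single function call, re-creating the derived mosaic dataset from scratch is a practical alternative to synchronizing it.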

Referenced mosaic datasets

Referenced mosaic datasets are typically created from derived mosaic datasets. They typically reference derived mosaic datasets and define parameters that are either defaults or enforce specific rules to be applied when the imagery is accessed. For example, from a derived mosaic representing elevation data, a referenced mosaic dataset may be created to define a hillshade or slope map product. Referenced mosaic datasets are also often created to define different restrictions. For example, downloading may be restricted in one service, but enabled in another that is used for geoprocessing. Referenced mosaic datasets are also used to create subsets. For example, a referenced mosaic dataset may be defined with a limited boundary or query to limit access to a specific area or type of imagery.
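A referenced mosaic dataset that exposes only a subset of a derived mosaic dataset might be created as in the sketch below, which uses the Create Referenced Mosaic Dataset tool and its where_clause parameter. The function name, the paths, and the field used in the query are hypothetical; the arcpy module is passed in as gp.

```python
def build_referenced_mosaic(gp, derived, referenced, where_clause=None):
    """Create a referenced mosaic dataset, optionally limited by a query."""
    kwargs = {}
    if where_clause:
        # Only records matching the query are visible through the reference.
        kwargs["where_clause"] = where_clause
    gp.CreateReferencedMosaicDataset_management(derived, referenced, **kwargs)
    return referenced


# Typical use (hypothetical names and field; requires ArcGIS):
#   import arcpy
#   build_referenced_mosaic(arcpy,
#                           "C:/data/imagery.gdb/Derived_MS",
#                           "C:/data/services.gdb/Ref_MS_2013",
#                           where_clause="AcquisitionYear = 2013")
```

Properties such as default mosaic method or download limits would then be set on the referenced mosaic dataset rather than on the derived one, so several services with different rules can share the same underlying records.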

Publishing image services

Referenced mosaic datasets are then published as image services, making them accessible to web applications. In addition, server functions can also be associated with each image service to enable client applications to quickly redefine functions to be applied as the imagery is accessed. Optionally, the cache is also created for an image service. Alternatively, the cache is generated from the mosaic dataset and served as part of a map service or through ArcGIS Online.
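Publishing can also be scripted. The following sketch assumes the ArcGIS 10.1-era arcpy publishing workflow: create a service definition draft with CreateImageSDDraft, stage it with the Stage Service tool, and upload it with the Upload Service Definition tool. The function name and all paths are hypothetical, and the arcpy module is passed in as gp; check the tool signatures against the help for your release before relying on them.

```python
def publish_image_service(gp, mosaic, service_name, sddraft, sd,
                          connection_file):
    """Draft, stage, and upload an image service definition."""
    # 1. Describe the service in a draft (.sddraft) file.
    gp.CreateImageSDDraft(mosaic, sddraft, service_name,
                          "ARCGIS_SERVER", connection_file)
    # 2. Stage the draft into a service definition (.sd) file.
    gp.StageService_server(sddraft, sd)
    # 3. Upload the service definition to the server and publish it.
    gp.UploadServiceDefinition_server(sd, connection_file)
    return service_name


# Typical use (hypothetical paths; requires ArcGIS and a server connection):
#   import arcpy
#   publish_image_service(arcpy, "C:/data/services.gdb/Ref_MS_2013",
#                         "Imagery_2013", "C:/temp/Imagery_2013.sddraft",
#                         "C:/temp/Imagery_2013.sd", "C:/conn/server.ags")
```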

Overview of workflow steps

These defined workflows are split into sections covering aspects such as preprocessing of imagery followed by creation of the source, derived, and referenced mosaic datasets, then followed by sections on aspects of publishing and optimizing the services.

The defined workflows support a transactional rather than a linear approach. Since in most cases the creation of a mosaic dataset involves defining parameters, rather than creating intermediate files, the exact order in which each part of the workflow is performed is often not important. This enables parts of the full workflow to be initially skipped to quickly get a usable service, and to later refine the mosaic dataset in an iterative manner.

ArcGIS is a powerful system for image management, and as a result, there are many functions that can be run and parameters that can be set as part of the workflows to create mosaic datasets and serve them as image services. These documented workflows provide a range of recommendations related to the specific type of imagery as well as information on which functions to run and which parameters to set.

In conjunction with each of the image workflows, there is a section on the Imagery community, called Image Workflow Management, that provides an overview of each workflow and includes links to sample data as well as scripts that form the templates. The scripts make use of both Python and the ModelBuilder environment to create appropriate mosaic datasets. The sample data provided is only a small sample of typical data sources and can be replaced with your own data. The scripts call the appropriate functions with suitable parameters, and set the appropriate mosaic dataset properties. At their core is a set of scripts referred to as the Mosaic Dataset Configuration Script (MDCS), which is written in Python and uses a set of configuration files to define the processes to be run as well as their parameters. Users are encouraged to review the content of the scripts along with the configuration files to better understand the setting of the parameters and make changes wherever necessary. The scripts transform the manual process of running a large number of individual functions with appropriate parameters into a process of configuring and running one or two scripts. In most cases, the management of large collections of imagery can be achieved by suitably configuring these scripts.
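The idea of a configuration-driven script can be illustrated with a small sketch. This is not the MDCS implementation or its file format; the layout of the step list, the function name, and the paths are assumptions made only to show how a list of tool names and parameters can replace a series of manual tool runs. The arcpy module is passed in as gp.

```python
# Each step names a geoprocessing tool and its parameters. This layout is an
# assumption for illustration; it is not the MDCS configuration format.
CONFIG = [
    {"tool": "CreateMosaicDataset_management",
     "args": ["C:/data/imagery.gdb", "Source_Ortho", "C:/data/area.prj"]},
    {"tool": "AddRastersToMosaicDataset_management",
     "args": ["C:/data/imagery.gdb/Source_Ortho", "Raster Dataset",
              "C:/data/orthos"]},
    {"tool": "BuildFootprints_management",
     "args": ["C:/data/imagery.gdb/Source_Ortho"]},
]


def run_steps(gp, steps, log=print):
    """Run each configured tool in order and return the names that ran."""
    done = []
    for step in steps:
        tool = getattr(gp, step["tool"])  # look the tool up by name
        log("Running " + step["tool"])
        tool(*step.get("args", []), **step.get("kwargs", {}))
        done.append(step["tool"])
    return done


# Typical use (requires ArcGIS):
#   import arcpy
#   run_steps(arcpy, CONFIG)
```

Changing a workflow then means editing the configuration rather than the code, and because each step only defines parameters on the mosaic dataset, steps can be added later and the script rerun.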

The remaining topics in this section of the guide will provide an overview of each of these steps to cover aspects that are common to multiple workflows.

10/28/2013