Imagery - Source mosaic datasets

Source mosaic datasets are created primarily to simplify the data ingestion process and enable parameters to be refined prior to adding the source mosaic datasets to the derived mosaic datasets. The following are typical steps that are performed:

Create a new mosaic dataset

It is recommended to set the coordinate system for each source mosaic dataset to match that of the source data, which facilitates quality assurance/quality control testing. Ensure that the new mosaic dataset is explicitly set to 1 band with a pixel type of 32_BIT_FLOAT, so that all computations are done as floats. If this is not set and the source is, for example, SRTM, which is stored as 16-bit integers, then computations will be performed as integers, which creates artifacts in products such as hillshade and slope.
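The settings above can be sketched in Python as follows. This is a minimal illustration, assuming the arcpy geoprocessing function CreateMosaicDataset and its usual parameter names; the helper names, workspace path, and mosaic dataset name are hypothetical, and the call should be checked against the installed ArcGIS version.

```python
def create_mosaic_kwargs(workspace, name, coordinate_system):
    """Build keyword arguments for a 1-band, 32-bit float mosaic dataset.

    The keys are assumed to match the parameters of
    arcpy.management.CreateMosaicDataset.
    """
    return {
        "in_workspace": workspace,
        "in_mosaicdataset_name": name,
        # Same coordinate system as the source data, to simplify QA/QC.
        "coordinate_system": coordinate_system,
        # Elevation is a single band of float values.
        "num_bands": 1,
        "pixel_type": "32_BIT_FLOAT",
    }


def create_source_mosaic(workspace, name, coordinate_system):
    """Create the source mosaic dataset (requires an ArcGIS installation)."""
    import arcpy  # imported lazily so the helper above works without ArcGIS

    kwargs = create_mosaic_kwargs(workspace, name, coordinate_system)
    return arcpy.management.CreateMosaicDataset(**kwargs)


# Hypothetical geodatabase and dataset name, for illustration only.
example = create_mosaic_kwargs("C:/data/elevation.gdb", "SRTM_Source", "WGS 1984")
```

Keeping the argument construction separate from the arcpy call makes it possible to review or log the settings before anything is created.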

Add rasters

The majority of elevation sources can be ingested using the RasterDataset raster type. The exception is when using lidar data. For organizations that have very well-defined metadata standards for their elevation data, it may be advantageous to have a specialized raster type created that ingests the metadata automatically. In most cases, though, metadata is uniform for all the rasters in a collection, and the method described below is sufficient.

The Update Cell Size Ranges flag should be turned on when creating the source mosaic dataset.

It is recommended to not include the Update Overviews flag at this step; see considerations on creating overviews below.
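As a sketch of the two flags described above, the following builds the arguments for adding rasters with cell size ranges updated and overviews left out. It assumes the arcpy function AddRastersToMosaicDataset, its usual parameter names, and the keyword values UPDATE_CELL_SIZES and NO_OVERVIEWS; the helper names and paths are hypothetical.

```python
def add_rasters_kwargs(mosaic_dataset, input_path):
    """Build keyword arguments for adding elevation rasters to a source mosaic.

    The keys are assumed to match the parameters of
    arcpy.management.AddRastersToMosaicDataset.
    """
    return {
        "in_mosaic_dataset": mosaic_dataset,
        # Generic raster type suitable for most elevation sources (not lidar).
        "raster_type": "Raster Dataset",
        "input_path": input_path,
        # Update cell size ranges while adding to a source mosaic dataset.
        "update_cellsize_ranges": "UPDATE_CELL_SIZES",
        # Overviews are considered separately, so do not build them here.
        "update_overviews": "NO_OVERVIEWS",
    }


def add_rasters(mosaic_dataset, input_path):
    """Add the rasters (requires an ArcGIS installation)."""
    import arcpy  # imported lazily so the helper above works without ArcGIS

    return arcpy.management.AddRastersToMosaicDataset(
        **add_rasters_kwargs(mosaic_dataset, input_path)
    )
```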

Metadata

Metadata should be populated according to the recommendations listed above at the time each source mosaic dataset is populated with elevation data. Typically all data in a collection has the same metadata, so the simplest method is to use the Calculate tool to fill values for all fields.

The process to create mosaic datasets and populate metadata may be automated (see the sample scripts available in ArcGIS Online). Metadata to be repeated in all records of a single collection (one source mosaic dataset) may be stored in an external configuration file. The scripts then populate the metadata values for all records.

In cases where metadata is different for each record, often the best way to ingest this is to create a separate table that contains the associated metadata values, then link the table to the footprint table of the mosaic dataset and copy across the associated values.
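The per-record case can be sketched in plain Python as a lookup keyed on the raster name. This is only an illustration of the matching logic: the rows are modeled as dictionaries, the function name and the VerticalAccuracy field are hypothetical, and in practice the rows would come from the mosaic dataset's footprint table (for example, through a table join or an update cursor).

```python
def copy_metadata(footprint_rows, metadata_by_name, fields):
    """Copy metadata values onto footprint rows by matching on the Name field.

    Returns the updated rows and the names that had no match in the
    external metadata table, so unmatched records can be reviewed.
    """
    updated = []
    unmatched = []
    for row in footprint_rows:
        merged = dict(row)  # work on a copy, leave the input untouched
        extra = metadata_by_name.get(row["Name"])
        if extra is None:
            unmatched.append(row["Name"])
        else:
            for field in fields:
                merged[field] = extra[field]
        updated.append(merged)
    return updated, unmatched


# Hypothetical records for illustration.
rows = [{"Name": "tile_01"}, {"Name": "tile_02"}]
lookup = {"tile_01": {"DatasetID": "Lidar_2012", "VerticalAccuracy": 0.15}}
result, missing = copy_metadata(rows, lookup, ["DatasetID", "VerticalAccuracy"])
```

Reporting the unmatched names rather than failing silently helps catch records whose names differ between the two tables.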

Calculate cell sizes

When adding rasters to a source mosaic dataset, the option to update cell size ranges should be on. This ensures that pixel sizes are computed, especially when the source data already contains overviews. Note that when creating derived mosaic datasets, this option should be off.

Refine geometry

As described in the common workflow, this step refers to x,y adjustments to improve the horizontal accuracy of other types of image data. With regard to using elevation data within ArcGIS, it is assumed that the horizontal accuracy has been validated by the source organization, so horizontal adjustments should not be required.

For each source mosaic dataset, review any available control data to verify the input data has been properly georeferenced. If any discrepancies are identified, review metadata to ensure the spatial reference system (SRS) and vertical datum are correctly specified for the data.

Footprints and NoData

As discussed in the Preprocessing section, correct definition and handling of NoData values is important for elevation workflows. It is recommended that NoData values be defined for each data source. In this way, by default, the system will utilize data values from lower-priority rasters in cases where NoData exists, but if required, users can also lock to a specific raster and see individual rasters with NoData defined explicitly. The properties of the mosaic dataset (defined in a later chapter) are typically set so that the system will utilize NoData values for a raster and not clip elevation data to footprints.

The correct definition of footprints is still important, as they are used to optimize the search for suitable elevation datasets covering any area and are returned as geometries in metadata queries. With elevation data there are often cases where there are very large extents of NoData pixels, and it is best to have the footprint exclude these areas. This is typical for elevation data along transportation corridors, rivers, or power lines. In most cases it is best to run the Build Footprints tool with the Radiometry option set to refine the footprints from the default envelope. This will change the footprint to better approximate the data extents. Depending on the complexity of the NoData area, the number of vertices for this footprint can be increased beyond the default 20, but typically the number of vertices should be kept lower than 300.

In cases where more accurate footprints exist to define the extent of the data, these can be imported using the Import Footprint or Boundary tool.

There are cases such as some bathymetry and lidar-based projects where the source data contains extraneous data, and it is required to clip away these areas using a footprint. For such data collections, it is recommended to use the footprint to define the required areas and set the mosaic dataset to clip based on footprints. As NoData is being handled, the extents of these footprints need not be exact, and care should be taken to ensure there are not too many vertices (preferably fewer than 300) as this can affect performance.

When footprints are recomputed, the boundary of the mosaic dataset is typically also updated to be the union of the footprints. For source mosaic datasets of elevation data, this can result in a very complex boundary polygon that has little value. In automated workflows, the boundary is generally not updated with the footprint, but instead the boundary is built using the Envelope option.
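The footprint and boundary steps in this section might be scripted along the following lines. This is a sketch, assuming the arcpy functions BuildFootprints and BuildBoundary with their usual parameter names and the keyword values RADIOMETRY, NO_BOUNDARY, and ENVELOPE; the helper names are hypothetical, and the 300-vertex check simply encodes the guidance above as a hard limit for illustration.

```python
MAX_RECOMMENDED_VERTICES = 300  # guidance from the text, not a tool limit


def footprint_kwargs(mosaic_dataset, num_vertices=20):
    """Arguments to refine footprints by radiometry without touching the boundary."""
    if num_vertices >= MAX_RECOMMENDED_VERTICES:
        raise ValueError("keep the vertex count below 300 to protect performance")
    return {
        "in_mosaic_dataset": mosaic_dataset,
        # Shrink footprints from the default envelope to the valid data extent.
        "reset_footprint": "RADIOMETRY",
        "approx_num_vertices": num_vertices,
        # The boundary is rebuilt separately as a simple envelope.
        "update_boundary": "NO_BOUNDARY",
    }


def boundary_kwargs(mosaic_dataset):
    """Arguments to build the boundary as an envelope."""
    return {
        "in_mosaic_dataset": mosaic_dataset,
        "simplification_method": "ENVELOPE",
    }


def refine_footprints(mosaic_dataset, num_vertices=20):
    """Run both steps (requires an ArcGIS installation)."""
    import arcpy  # imported lazily so the helpers above work without ArcGIS

    arcpy.management.BuildFootprints(**footprint_kwargs(mosaic_dataset, num_vertices))
    arcpy.management.BuildBoundary(**boundary_kwargs(mosaic_dataset))
```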

Refine radiometry

As described in the Common Workflow, for other types of image data, this step generally refers to brightness adjustments to improve color balancing. With regard to elevation data, since the pixel values represent height, this step implies adjusting height values. It is assumed that all data in a single source mosaic dataset will have the same height unit of measure and vertical datum.

Typically, for source mosaic datasets, heights do not need to be changed. It is not uncommon, however, for the vertical datum (or units) of elevation to be incorrectly defined, so it is advisable to perform some quality assurance steps to check the height values against known control. For creating derived mosaic datasets, some adjustment of height may be required to take into consideration different units and datums.

Seamlines

In a source mosaic dataset, all the data sources are from the same collection, and in most cases no seamline blending is required. There can be cases where it is required to define better blending between different datasets within a source mosaic dataset. The workflow for this process is similar to that for a derived mosaic dataset, so it will be skipped in this section.

Mosaic dataset properties

Most properties do not need to be set for the source mosaic datasets, as they are generally used only for quality assurance purposes. The properties defined for derived mosaic datasets can be applied, though, and in automated workflows this is generally done.

Overviews

It is recommended that existing lower-resolution data (for example, contiguous datasets such as GMTED and SRTM) be used to provide low-resolution views for the derived mosaic datasets. Each of these datasets is considered a source and separate source mosaic datasets are created.

Many elevation datasets will also have pyramids generated, and in many cases it is therefore not necessary to generate overviews for a source mosaic dataset. Typically, higher-resolution elevation data should not be used at very small scales, so excluding overviews has the advantage of automatically turning the data off at small scales.

As noted in the standard workflow, there can be issues with pyramids in cases where the original data source has been clipped into edge-joined tiles and the number of columns or rows is not a power of two, which can result in gaps at some smaller scales. In such cases, and for some large datasets such as SRTM that are cut into tiles, it is advantageous to generate overviews, even if pyramids exist. Overviews also enable users to use a WHERE clause to display imagery with a specific Dataset ID at all scales.

When creating overviews for elevation data, it is recommended to generate them with a sampling factor of 3 with nearest neighbor sampling so that no pixel shifts occur.
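The overview settings above could be expressed as follows. This is a sketch, assuming the arcpy functions DefineOverviews and BuildOverviews, the parameter names overview_factor and resampling_method, and the keyword value NEAREST; the helper names are hypothetical and the call should be verified against the installed version.

```python
def overview_kwargs(mosaic_dataset):
    """Arguments to define elevation overviews without shifting pixels."""
    return {
        "in_mosaic_dataset": mosaic_dataset,
        # Each overview level covers 3 x 3 pixels of the level below it.
        "overview_factor": 3,
        # Nearest neighbor keeps original height values and avoids pixel shifts.
        "resampling_method": "NEAREST",
    }


def build_elevation_overviews(mosaic_dataset):
    """Define and then generate the overviews (requires an ArcGIS installation)."""
    import arcpy  # imported lazily so the helper above works without ArcGIS

    arcpy.management.DefineOverviews(**overview_kwargs(mosaic_dataset))
    arcpy.management.BuildOverviews(mosaic_dataset)
```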

Note that when overviews have been created for a source mosaic dataset, it is necessary to rerun the field calculation to set suitable metadata (for example, DatasetID) for the newly created overview records. Not all metadata fields may be appropriate for the overviews.

It is best to determine at which scales the data from a source mosaic dataset is no longer valid and ensure that the MaxPS values for all records are below this value. The following approximations for conversion between scale and pixel size can be used for this:

PixelSize in m = Scale factor * 0.0254/96
PixelSize in dd = Scale factor * 0.0254/96/111111
PixelSize in ft = Scale factor * 0.0254/96/0.3048
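The three approximations above can be wrapped in small Python helpers, together with the inverse conversion that is useful when choosing a MaxPS value. This is a minimal sketch; the function and constant names are illustrative, and the constants are the ones in the formulas (96 pixels per inch, 0.0254 m per inch, roughly 111111 m per decimal degree, 0.3048 m per foot).

```python
METERS_PER_INCH = 0.0254
PIXELS_PER_INCH = 96        # assumed display resolution
METERS_PER_DEGREE = 111111  # rough value, only valid as an approximation
METERS_PER_FOOT = 0.3048


def pixel_size_m(scale):
    """Approximate pixel size in meters for a map scale of 1:scale."""
    return scale * METERS_PER_INCH / PIXELS_PER_INCH


def pixel_size_dd(scale):
    """Approximate pixel size in decimal degrees for a map scale of 1:scale."""
    return pixel_size_m(scale) / METERS_PER_DEGREE


def pixel_size_ft(scale):
    """Approximate pixel size in feet for a map scale of 1:scale."""
    return pixel_size_m(scale) / METERS_PER_FOOT


def scale_for_pixel_size_m(pixel_size):
    """Inverse of pixel_size_m: the scale at which a pixel size in meters is displayed."""
    return pixel_size * PIXELS_PER_INCH / METERS_PER_INCH


# A 1:100,000 display corresponds to a pixel size of roughly 26.46 m.
print(round(pixel_size_m(100000), 2))
```

For example, if data from a source mosaic dataset should not be shown at scales smaller than 1:100,000, the MaxPS values for its records would be kept below roughly 26.46 m.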

10/28/2013