Imagery - Data sources and formats

Each workflow specializes in a type of imagery, such as high-resolution satellite imagery or imagery from frame cameras, or elevation. The Data sources chapter in each workflow will provide information about types of data and how the imagery can be obtained as well as recommendations concerning the format and suitable metadata.

The general recommendation for working with imagery is to leave the imagery in its original form. When imagery is processed in such a way that the pixels are resampled (for example, to change the projection), this generally leads to degradation in quality, possible artifacts, creation of NoData areas, and additional management issues due to additional copies of the imagery. In some cases it is advisable to change the format of the imagery. This does not involve resampling the imagery, but it does result in data duplication. The original data is often then archived. Including lossy or lossless compression in the format conversion is optional. When imagery may be used for analysis or high-quality interpretation, it is preferable (and sometimes necessary) to ensure that the pixel values do not change.

Many traditional workstation imagery applications work on the concept of first reading the complete image into memory and then optionally allowing a user to make changes to the image before saving it. This is not a scalable approach. To enable scalability for large numbers of large images, ArcGIS reads on demand only the appropriate imagery from the disk system. The performance of the disk system as well as the format of the imagery can have a significant effect on the performance of the imaging system. The following four factors have the greatest influence on the suitability of a format.

Tiling of the rasters—In many cases rasters are stored as a simple (raw) array of pixels on disk. When the rasters are large and only a small extent of an image needs to be read, the system can skip quickly to the appropriate row in the file, but it will still need to read the complete row due to the way files are broken and read as blocks on disk. On such larger files (more than 3,000 columns) it is advisable to use a tiled image format that breaks the image internally into smaller tiles (typically 256x256 pixels), making it faster to access a group of pixels that might represent a rectangular extent. Most modern file formats are either tiled or include options for tiling. The TIFF format can be tiled or nontiled. By default, ArcGIS writes a tiled TIFF.
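
The effect of tiling described above can be estimated with simple arithmetic. The following Python sketch is illustrative only: the function names and the example window are hypothetical, and it assumes that a raw file must read every requested row in full while a tiled file must read every overlapping tile in full.

```python
def bytes_read_stripped(width, y_size, bytes_per_pixel):
    """Raw row-major file: each requested row is read across the full image width."""
    return y_size * width * bytes_per_pixel


def bytes_read_tiled(x_off, y_off, x_size, y_size, bytes_per_pixel, tile=256):
    """Tiled file: every tile that overlaps the requested window is read in full."""
    first_col, last_col = x_off // tile, (x_off + x_size - 1) // tile
    first_row, last_row = y_off // tile, (y_off + y_size - 1) // tile
    tiles = (last_col - first_col + 1) * (last_row - first_row + 1)
    return tiles * tile * tile * bytes_per_pixel


# Hypothetical request: a 512 x 512 window starting at (1000, 1000)
# in a 20,000-column, 3-band, 8-bit image (3 bytes per pixel).
stripped = bytes_read_stripped(20000, 512, 3)
tiled = bytes_read_tiled(1000, 1000, 512, 512, 3)
print(stripped, tiled, round(stripped / tiled, 1))
```

In this example the window overlaps nine 256x256 tiles, so the tiled layout touches far fewer bytes than the raw layout. Real files add header and compression effects that this estimate ignores.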

Volume of data required to be read—Any group of pixels must be read from disk before it can be displayed. If the data volume to be read is reduced, this can improve performance. Compression of imagery can substantially reduce the size of the data read. Typically, a natural color image can be compressed 5–10 times with negligible difference in image quality. Such compression substantially reduces the volume of data read from disk and can have a positive influence on performance, especially on systems with slower drive systems. Lossless compression algorithms, such as LZW, can be effectively used to compress imagery that contains a large number of NoData values, but generally provide minimal compression of optical imagery. Lossy compression algorithms can substantially reduce data volumes but can add unacceptable artifacts to imagery, especially if it is to be used for analysis. Compressing imagery also increases the CPU load for reading the imagery, and some algorithms are much more CPU efficient than others, so some compression formats are better suited than others.

Amount of processing power required to decompress the image—The compression of imagery requires that the system decompress the imagery before any processing can be applied. Some compression formats, such as JPEG 2000, are CPU intensive to decompress. For workstation applications where the complete image is first read into memory or in applications that just stream pixels (or wavelets) without any processing, such compressions are useful, but they are not recommended for use on servers that perform processing. Wavelet-based compression is also not recommended for streaming to web applications, as web browsers cannot natively read wavelet-compressed images and therefore require special plug-ins, which discourages use. If imagery is suitable for lossy compression, then JPEG compression (not JPEG format) is recommended, as this is CPU inexpensive to decompress due to the relative simplicity of the algorithms. The ubiquitous use of JPEG compression for many years has also resulted in the codecs being hardwired into many CPUs. Although JPEG compression does not provide quite as high compression as some of the wavelet compression algorithms, the difference is relatively minor. Typically, an 8-bit natural color image can be compressed about 8 times using JPEG compression, which gives similar quality to a wavelet-compressed image with 11 times compression. In relation to the 8 times compression, the additional 30 percent compression provided by wavelet compression algorithms is not warranted, since the CPU load to read formats such as JPEG 2000 can be 4–10 times greater. For lossless compressed imagery, LZW compression provides good compression with minimal CPU load. Other lossless compression algorithms can provide a bit more compression, but the additional CPU load generally outweighs the small increase in compression.

ArcGIS also uses a special Limited Error Raster Compression (LERC) that provides high compression of elevation data with control of the precision of the data returned and minimal CPU load. LERC is most valuable when working with elevation data such as lidar. It is currently not available in a file format, but it is used internally and can be used for the efficient transmission of floating point data over networks.
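
The comparison of 8 times and 11 times compression above can be worked through as plain arithmetic. This Python sketch uses a hypothetical 1 GB uncompressed image; only the two compression ratios come from the text.

```python
def compressed_size(raw_bytes, ratio):
    """Size after compression for a given compression ratio."""
    return raw_bytes / ratio


raw = 1_000_000_000                  # hypothetical uncompressed size in bytes
jpeg = compressed_size(raw, 8)       # about 8 times with JPEG
wavelet = compressed_size(raw, 11)   # about 11 times with wavelet compression

ratio_gain = 11 / 8 - 1              # how much higher the wavelet ratio is
size_saving = 1 - wavelet / jpeg     # how much smaller the wavelet file is

print(f"JPEG: {jpeg / 1e6:.0f} MB, wavelet: {wavelet / 1e6:.0f} MB")
print(f"ratio gain: {ratio_gain:.1%}, size saving: {size_saving:.1%}")
```

Depending on whether the gain is measured on the ratio or on the resulting file size, it works out to roughly 27 to 38 percent, which is in line with the figure of about 30 percent quoted above and is small next to a 4–10 times higher CPU load.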

Existence of pyramids—Pyramids are reduced-resolution datasets stored with imagery that are used to read imagery at lower resolutions. Pyramids are recommended especially for larger files. Pyramids can often be compressed even if the base data is not compressed. More details about creating pyramids are defined in the preprocessing section below.
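
As a rough illustration of what pyramids cost in storage, the following Python sketch lists the reduced-resolution levels for an image, assuming each level halves the width and height of the previous one. The function name and the stopping size of 256 pixels are assumptions made for this example, not documented ArcGIS behavior.

```python
def pyramid_levels(width, height, min_size=256):
    """Return (width, height) of each reduced-resolution level.

    Assumes a factor-of-two reduction per level and stops once both
    dimensions are at or below min_size (a hypothetical cut-off).
    """
    levels = []
    w, h = width, height
    while w > min_size or h > min_size:
        w, h = (w + 1) // 2, (h + 1) // 2   # round up so no pixels are dropped
        levels.append((w, h))
    return levels


levels = pyramid_levels(20000, 16000)
overhead = sum(w * h for w, h in levels) / (20000 * 16000)
print(len(levels), levels[0], f"{overhead:.1%}")
```

Under these assumptions the pyramid levels together hold about one third as many pixels as the base image, which is why compressing the pyramids can be worthwhile even when the base data is left uncompressed.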

Location and type of metadata—When a file is opened to read the pixels, it is often necessary to also read the metadata of the file to obtain georeferencing information and ensure properties such as the spatial reference have not been changed. Therefore, the way in which the metadata is stored with the file can have an influence on the access speed. Formats such as GeoTIFF store metadata in tags that can be quickly accessed. ArcGIS will usually write metadata to a small .aux.xml file stored next to the file to enable faster access while providing an extensible method for storing additional metadata.

Recommended imagery formats

If imagery is to be converted, follow these recommendations:

For 8-bit or 16-bit, 1-, 3-, or 4-band imagery where lossy compression is not suitable, use TIFF, tiled 128 or 256. If there are large NoData areas in the image, use LZW compression.

For 8-bit, 3-band natural color imagery that has already been orthorectified, color balanced, mosaicked, and cut into tiles, the imagery is used primarily as background imagery, and it is generally best to convert it directly to a map cache or store it as tiled TIFF with JPEG YCBCR compression. Typically, a quality value of around 80 is used, which provides approximately 8 times compression. YCBCR-based JPEG compression internally converts the image to a different color space, improving the compression.

For 16-bit or 32-bit 1-band elevation data, use TIFF, LZW compression, tiled 128 or 256. For 16-bit elevation, be sure that JPEG is not used.

For 8- or 16-bit imagery where lossy compression is suitable, use TIFF with JPEG compression. The quality factor should be checked by testing on some sample imagery. In many cases, a quality factor of 90 is suitable. Note that ArcGIS supports a 12-bit version of JPEG. Therefore, when compressing 16-bit pan imagery using JPEG, only the first 12 bits of the imagery will be used. Many modern sensors have a sensitivity in the range of 11 to 14 bits, and using 12-bit compression maintains the majority of the image content but excludes the last (often noisy) bits.

For 8-bit or 16-bit 3-band non-natural color imagery (such as false color imagery or scanned maps), when lossy compression is suitable, use TIFF, with JPEG (RGB) compression. In RGB JPEG compression, each band is compressed separately.

For 8-bit or 16-bit 4-band RGBI imagery, which is often captured by modern digital sensors: if, as above, the data has been orthorectified and enhanced, then some of the original image information has been lost, potentially limiting its use for some forms of analysis. For such imagery, lossy compression may be suitable, but care should be taken to quantify the effects on any intended future analysis. It is then recommended to convert such imagery into a 3-band RGB image and a 1-band NIR image and use the above recommendations for compressing each. Splitting off a separate RGB image enables better compression, and most users will likely access the RGB component more than the IR. In ArcGIS, the two files can easily be remerged virtually to create an RGBI image suitable for displaying as false color or computing NDVI. Typically for such imagery the compression quality is set higher, to 90 or 95, so that compression does not add significant artifacts to NDVI.
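
To make the NDVI step concrete, the sketch below recombines separately stored RGB and NIR pixel values and computes NDVI with the standard formula (NIR - Red) / (NIR + Red). The function names and pixel values are hypothetical, the code works on plain Python lists rather than ArcGIS rasters, and returning 0.0 for a zero denominator is a convention chosen for this example.

```python
def merge_rgb_nir(rgb_pixels, nir_pixels):
    """Recombine separately stored RGB and NIR samples into (R, G, B, NIR) tuples."""
    return [(r, g, b, n) for (r, g, b), n in zip(rgb_pixels, nir_pixels)]


def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    total = nir + red
    if total == 0:
        return 0.0          # assumption: treat empty pixels as 0 rather than failing
    return (nir - red) / total


# Hypothetical pixel values for two locations.
rgb = [(60, 80, 50), (120, 110, 100)]
nir = [180, 130]

rgbi = merge_rgb_nir(rgb, nir)
values = [ndvi(n, r) for r, g, b, n in rgbi]
print(values)
```

Because NDVI is a ratio of small differences between two bands, compression artifacts in either band show up directly in the result, which is the reason for the higher quality setting mentioned above.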

Many sensors include 4-band multispectral imagery and 1-band higher-resolution pan imagery. If maintaining the IR band, it is recommended not to pan-sharpen such imagery in advance, because pan-sharpening changes the multispectral properties of the bands, significantly increases the file sizes, and reduces the suitability of the imagery for analysis. Instead, use the capability of ArcGIS to pan-sharpen on the fly, which is very fast and preserves the integrity of the spectral bands so that they can be used more accurately for analysis. If the imagery must be compressed to reduce size, consider compressing only the pan band using JPEG compression. The pan band is typically much larger than the multispectral image and is not used for spectral analysis. Limited JPEG compression (for example, Q90) has minimal effect on visual interpretation, computation of tie points, or DSM generation.

Note that ArcGIS supports BigTIFF, which enables the size of a TIFF file to be larger than the original limitation of 4 GB. It is relatively rare to need to use BigTIFF for imagery from sensors, especially if pan-sharpening is performed on the fly or if compression is used. BigTIFF is the most useful for the storage of large processed rasters such as elevation.
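
Whether a raster is likely to exceed the 4 GB limit can be estimated from its dimensions. The Python sketch below is a rough check only: the function names are hypothetical, the expected compression ratio is a value the user supplies, and pyramids and file headers are ignored.

```python
CLASSIC_TIFF_LIMIT = 2 ** 32   # 4 GB, the limit of 32-bit file offsets


def uncompressed_bytes(width, height, bands, bits_per_sample):
    """Raw pixel volume of a raster, ignoring headers and pyramids."""
    return width * height * bands * (bits_per_sample // 8)


def may_need_bigtiff(width, height, bands, bits_per_sample, expected_ratio=1.0):
    """True if the estimated file size reaches the classic TIFF limit."""
    size = uncompressed_bytes(width, height, bands, bits_per_sample)
    return size / expected_ratio >= CLASSIC_TIFF_LIMIT


# Hypothetical rasters:
print(may_need_bigtiff(40000, 40000, 1, 32))                     # large float elevation grid
print(may_need_bigtiff(30000, 30000, 3, 8))                      # 8-bit RGB scene
print(may_need_bigtiff(40000, 40000, 4, 16, expected_ratio=8))   # 4-band scene, ~8x JPEG
```

The first raster exceeds the limit, while the two imagery examples stay below it, which matches the observation that large processed rasters such as elevation are the most common case for BigTIFF.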

When using JPEG compression, the recommended quality values to use can range from 80 to 95. It is best to try different factors on sample images and review the differences to determine an optimal value.
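
The recommendations in this section can be condensed into a small lookup. The following Python sketch is one possible reading of them, not an ArcGIS API: the function, its parameters, and the kind labels are hypothetical, and the quality values are starting points that should still be checked on sample imagery.

```python
def recommend(bits, bands, lossy_ok, kind="imagery", large_nodata=False):
    """Return (format, compression, jpeg_quality) for a raster to be converted.

    kind is a hypothetical label: "elevation", "natural_color",
    "false_color", "rgbi", or "imagery" for anything else.
    """
    if kind == "elevation":
        # Elevation values must not change, so JPEG is ruled out.
        return ("tiled TIFF", "LZW", None)
    if not lossy_ok:
        # LZW mainly pays off when there are large NoData areas.
        return ("tiled TIFF", "LZW" if large_nodata else None, None)
    if kind == "natural_color" and bits == 8 and bands == 3:
        return ("tiled TIFF or map cache", "JPEG YCBCR", 80)
    if kind == "false_color" and bands == 3:
        # No specific quality is given for this case; 90 is the general value.
        return ("tiled TIFF", "JPEG RGB", 90)
    if kind == "rgbi" and bands == 4:
        # The text suggests 90 or 95 to protect NDVI results.
        return ("tiled TIFF, split into RGB + NIR", "JPEG", 95)
    return ("tiled TIFF", "JPEG", 90)


print(recommend(32, 1, False, kind="elevation"))
print(recommend(8, 3, True, kind="natural_color"))
print(recommend(16, 4, True, kind="rgbi"))
```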

Cutting large images into tiles

It is generally not advisable to cut single image datasets into smaller edge-joined images. Edge-joining images can cause artifacts to form at the edges of images. These artifacts can arise when images are reprojected, because the resampling requires some pixels around the pixel being processed. Pyramids also become more problematic with edge-joined images, as gaps at the edges of tiles can occur due to images not having a size that is a power of two. Artifacts can also become apparent if imagery is processed, such as applying convolution filters or creating hillshades on elevation, which also require edge pixels.
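
The edge effect can be shown with a one-dimensional example. This Python sketch applies a 3-sample moving average (a very simple convolution filter) to a row of hypothetical pixel values, once as a whole and once after cutting the row into two edge-joined pieces. Repeating the end sample at each edge is an assumption made for the example.

```python
def mean3(values):
    """3-sample moving average; edges are handled by repeating the end sample."""
    padded = [values[0]] + list(values) + [values[-1]]
    return [(padded[i] + padded[i + 1] + padded[i + 2]) / 3
            for i in range(len(values))]


row = [10, 10, 10, 10, 40, 40, 40, 40]    # hypothetical pixel values

whole = mean3(row)                         # filter the uncut row
tiled = mean3(row[:4]) + mean3(row[4:])    # filter two edge-joined tiles separately

seam_diff = [i for i, (a, b) in enumerate(zip(whole, tiled)) if a != b]
print(whole)
print(tiled)
print(seam_diff)
```

The two results differ only at the pixels on either side of the cut, because each tile lacks the neighboring pixels that the filter needs. The same applies to resampling and hillshade calculations.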

Reformatting imagery

If imagery is not in an optimal format and it is possible to reformat the files to an optimal format, then doing so is generally recommended.

The following formats should generally be reformatted:

.jpg—JPEG files larger than 3,000 columns are slower to read, because JPEG is not tiled; therefore, access to the last pixel of the file requires the complete file to be decompressed. When converting to TIFF with JPEG compression, try to use the same quality factor and type (YCBCR or RGB) as the original data.

.asc—ASCII text files, sometimes used to store elevation, are inherently slow to read as they are unnecessarily large and need to be interpreted.

.dem—Internally, some variations of the format store the numbers as ASCII.

.jp2—It is recommended to test the difference in performance after converting a sample file to TIFF. There are many variants of JPEG 2000. Some can be very costly to decompress or to access the pyramids. In some cases, it is advisable to leave the format as JP2, but create an additional set of pyramids (optionally skipping the first pyramid level).

.ecw—This proprietary format has limitations on use in a server environment. The format is often used for preprocessed imagery, so a more suitable format for storage and serving is a map cache. Conversion may result in the file size increasing by about 30-40 percent, but the result will be in a web-optimized format. Since wavelet artifacts are different from JPEG artifacts, the conversion of highly compressed ECW to JPEG often results in unnecessary additional artifacts. Where possible, it is advantageous to obtain and compress the original imagery if available.

.flt—Sometimes used for elevation data. Conversion is recommended if the number of columns is greater than 2,000.

.tif—If untiled (for example, the raw format from most data providers), it is advisable to convert to tiled TIFF. A typical example would be imagery from most satellite vendors, which generally deliver TIFF files as RAW and untiled to be compatible with legacy systems that may not be able to handle tiled TIFF. Tiling these files will improve performance and put reduced load on disk storage systems.

It is generally not necessary to reformat the following formats:

.nitf—Generally read quickly, and extensive metadata support and complexity make other formats unsuitable. There are some formats of NITF that have forms of JPEG 2000 compression pyramids that are very slow to read, in which case consider creating additional pyramids. In some cases, to improve performance it is necessary to convert to a different compressed NITF format.

.sid (MrSID)—The MG2 and MG3 versions of the format are read relatively quickly and include pyramids. The MG4 format appears to be slower to read and it is advisable to test.
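
The two lists above can be turned into a first-pass triage of a folder of files. The Python sketch below is a hypothetical helper, not part of ArcGIS: it looks only at the file extension plus an optional column count and tiling flag that the caller must supply, and it answers "test" wherever the guidance above calls for testing or the needed information is missing. The extra extensions (.jpeg, .tiff, .ntf) are assumed spellings of the same formats.

```python
from pathlib import PurePath


def triage(filename, columns=None, tiled=None):
    """Return "reformat", "keep", or "test" for a raster file."""
    ext = PurePath(filename).suffix.lower()
    if ext in (".asc", ".dem", ".ecw"):
        return "reformat"
    if ext in (".jpg", ".jpeg"):
        if columns is None:
            return "test"
        return "reformat" if columns > 3000 else "keep"
    if ext == ".flt":
        if columns is None:
            return "test"
        return "reformat" if columns > 2000 else "keep"
    if ext in (".tif", ".tiff"):
        if tiled is None:
            return "test"
        return "keep" if tiled else "reformat"
    if ext in (".nitf", ".ntf", ".sid"):
        return "keep"       # still worth testing slow NITF variants and MG4
    return "test"           # .jp2 and anything not covered above


for name, cols, is_tiled in [("scene.jpg", 8000, None),
                             ("dem.flt", 1500, None),
                             ("vendor.TIF", 12000, False),
                             ("mosaic.jp2", None, None)]:
    print(name, triage(name, columns=cols, tiled=is_tiled))
```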

Reformatting imagery is not always necessary and may not be required at the start of a project, when time to load the imagery, additional data storage, or support by legacy systems is a concern. It is often advantageous to leave the imagery in its nonoptimized form initially so that it becomes accessible sooner, and then, at a later time, optimize the format of the imagery in place. If the imagery is originally TIFF, this can be achieved without change to the mosaic datasets. If it is originally in another format, it is also possible to convert the format and change the name of the reference in the mosaic dataset so that other properties of the mosaic dataset need not be changed.

The best way to reformat imagery is usually to use the Copy Raster tool in batch (or ModelBuilder or Python). A frequent alternative is to use the Compress Imagery tool that is available on the imagery gallery of the ArcGIS Resources website.

Storage system performance

In an imagery environment with large volumes of imagery, it is not possible to read the complete imagery into memory, so it is necessary for the system to read the imagery as required from the disk system on demand. Therefore, the performance of the disk system is important and becomes more significant when using formats that are not tiled and not compressed, as they put more load on the disk. Naturally, a larger number of users simultaneously accessing the server puts additional load on it. When the imagery is stored in a different location from the server and connected via a network, the network can quickly become a significant bottleneck. Many issues related to poor performance of an image server implementation are related to poor storage systems. The performance of disk subsystems varies considerably, and in most cases the performance is either very good or very bad. Unlike CPU and memory performance, where the difference is generally measured in percentage, the performance of different storage systems can often vary by a factor of 10. However, the price of the disk system is not a good measure of performance or suitability for image server tasks. The following are some general recommendations:

For smaller dedicated systems—Direct Attached Storage (DAS) systems generally provide high performance at lowest cost, but provide limited scalability. In some cases, it can be advantageous to have a server configured specifically for imagery and have the imagery on a dedicated DAS. To scale by a factor of two and/or have 100 percent redundancy, duplicate the server and mirror the DAS. This method has limited scalability.

For larger systems—Using a NAS (or a SAN with a NAS head) is generally recommended. This enables simpler scaling in the number of servers. Often it is advisable to have a separate NAS dedicated to imagery. This enables the NAS to be connected to the servers using a dedicated switch and NICs or using a dedicated fiber channel or similar, and also removes potential contention with other traffic on the network.

In a cloud infrastructure, NAS solutions are often not available since they are inherently not truly elastic, in which case it is often recommended to use a combination of DAS (ephemeral) storage for imagery that is most likely to be accessed and a SAN (for example, EBS) type storage for the remainder of the data.

Most storage solutions include RAID options to provide redundancy in case of drive failure, as well as improved performance, and are typically configured as RAID 3 or 5.

10/28/2013