How Dendrogram works
A dendrogram is a diagram that shows the attribute distances between each pair of sequentially merged classes. To avoid crossing lines, the diagram is graphically arranged so that members of each pair of classes to be merged are neighbors in the diagram.
The Dendrogram tool uses a hierarchical clustering algorithm. The program first computes the distances between each pair of classes in the input signature file. It then merges the closest pair of classes, followed by the next closest pair, and so on until all classes are merged. After each merge, the distances between all pairs of classes are updated. The distances at which the class signatures are merged are used to construct the dendrogram.
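The merge loop described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the tool's implementation: the function names, the per-class cell counts, and the use of a plain Euclidean distance between class means are all assumptions made for the example.

```python
import math
from itertools import combinations


def mean_distance(a, b):
    """Euclidean distance between two class mean vectors (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def merge_sequence(means, counts):
    """Repeatedly merge the closest pair of classes until one class remains.

    means  -- dict mapping class ID -> list of per-layer means
    counts -- dict mapping class ID -> number of sample cells (hypothetical input)
    Returns a list of (remaining_id, merged_id, distance) tuples in merge order.
    """
    means = {k: list(v) for k, v in means.items()}
    counts = dict(counts)
    history = []
    while len(means) > 1:
        # Distances are recomputed for every remaining pair on each pass.
        m, n = min(
            combinations(sorted(means), 2),
            key=lambda p: mean_distance(means[p[0]], means[p[1]]),
        )
        d = mean_distance(means[m], means[n])
        total = counts[m] + counts[n]
        # Count-weighted mean of the merged class; it keeps the lower ID (m < n).
        means[m] = [
            (counts[m] * x + counts[n] * y) / total
            for x, y in zip(means[m], means[n])
        ]
        counts[m] = total
        del means[n], counts[n]
        history.append((m, n, d))
    return history
```

Each tuple in the returned list corresponds to one row of a merge table: the class ID that remains, the class ID absorbed into it, and the distance at which the merge happened.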
When the Use variance in distance calculations option is unchecked (NO_STD), the distance dmn between a pair of classes m and n is measured as a distance between their means:
where:
m and n : The IDs of classes
i : A layer number
µ : A mean of class m or n in layer i
When the variance option is checked (STD), the Dendrogram tool measures distances between pairs of classes based on their means and variances using the following formula:
where V is a variance of a class m or n in layer i.
The new statistics (means and variances) describing the merged class are based on the original mean and variance of the samples that comprise the merged class. Therefore, the merged class is produced using the pooled mean and variance. The two signatures that are used to create the merged class are replaced by a single signature of the combined class. The new mean signature is computed based on the locations in the multidimensional attribute space of all member cells of the merged class. The new signature retains the lower number of the two input classes for the merged class ID.
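As an illustration of how merged-class statistics can be derived from the two input signatures, the sketch below recombines per-layer means and variances as though all member cells were placed in a single class. The function name, the availability of per-class cell counts, and the use of sample variance (dividing by n - 1) are assumptions for the example; the tool's exact pooling formula is not stated here.

```python
def pool_statistics(n_a, mean_a, var_a, n_b, mean_b, var_b):
    """Combine per-layer means and sample variances of two classes.

    n_a, n_b       -- number of cells in each class (assumed to be known)
    mean_*, var_*  -- lists with one value per layer
    Returns (n, mean, var) for the merged class.
    """
    n = n_a + n_b
    mean = [(n_a * ma + n_b * mb) / n for ma, mb in zip(mean_a, mean_b)]
    var = []
    for ma, va, mb, vb, m in zip(mean_a, var_a, mean_b, var_b, mean):
        # Sum of squared deviations about the merged mean: the spread within
        # each class plus the offset of each class mean from the merged mean.
        ss = (n_a - 1) * va + n_a * (ma - m) ** 2
        ss += (n_b - 1) * vb + n_b * (mb - m) ** 2
        var.append(ss / (n - 1))
    return n, mean, var
```

For example, a class with cell values 1 and 3 (mean 2, variance 2) merged with a class with values 5, 7, and 9 (mean 7, variance 4) gives the same mean and variance as the five values taken together.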
The merge levels, that is, the distances at which each pair of classes is merged, can be interpolated using the scale bars of the dendrogram graph. Because the graph is drawn with text characters, its resolution is coarse and the merge levels are rounded for display. The precise merge levels are listed as DISTANCE in the table associated with the dendrogram.
Variances, not covariances, are used for distance computation after a pair of classes is merged. The algorithm used by Dendrogram does not use Mahalanobis distance to determine the distance between classes. Therefore, the distances between classes and the merged classes may not match the results from those grid tools that are based on Mahalanobis distance, such as Edit Signatures, Maximum Likelihood Classification, and Class Probability.
The dendrogram can help reduce misclassification in your analysis by providing the information needed to combine or separate classes. If two classes are statistically too close (that is, they are difficult to differentiate based on their statistics), misclassifications can result. In this case, consider merging the classes. There are no definitive rules for when classes should or should not be merged. The decision depends on the heterogeneity of your study area and data, the number of classes you are trying to classify the data into, and your goals. For instance, if your study area is very heterogeneous, it can support many distinct classes, so merging classes may not be necessary. If, on the other hand, your data is more homogeneous and you are attempting to classify it into too many classes, the classes may be statistically too close, and merging some of them may be appropriate.
If your analysis does not require detailed classes, you may wish to merge the classes to more general categories to lessen the chance for misclassifications. The dendrogram identifies which classes are statistically closest, but it is up to you, using your knowledge of the area and your goals, to determine when it is appropriate to merge classes.
For example, suppose you have specified one class as general wetlands and a second class as bogs. If the statistics determined from the training samples are very similar for the two classes, they will be close in the resulting dendrogram. If you are only interested in identifying wet areas, you may wish to merge bogs into the general wetlands class.
The dendrogram can identify not only which classes might be merged but also when it might be beneficial to add classes. If a class is statistically far from another class, you may want to add classes to further refine the classification. For example, you may have specified one class as crops and a second class as grass, and these two classes may be far apart on the resulting dendrogram. If you have a high-resolution multiband raster and are analyzing the agricultural output of the area, the higher-resolution data may allow you to refine the crops and grass classes into specific crop types.
Example
In the following example, classes 3 and 5 are the nearest neighbors in attribute space; therefore, they are merged at level 3.443. This value indicates the relative degree of similarity, which can also be viewed as the distance in multidimensional space. The two classes are merged and treated as a single class. The statistics for the merged class and the distances from the merged class to the other classes are computed. The next two closest classes are then identified. The two candidates are classes 4 and 6. The distance between them is 3.609, and they are merged. The process iterates. All classes are sequentially merged into bigger classes until all classes are merged into a single class.
- Settings used in the Dendrogram tool dialog box:
Input signature file : isoclust12.gsg
Output dendrogram file : isodendro.txt
Use variance in distance calculations : {default}
Line width of dendrogram : 78
The output dendrogram file would be as follows:
Distances between pairs of combined classes (in the sequence of merging):

Remaining   Merged   Between-Class
Class       Class    Distance
----------------------------------
    3         5       3.442680
    4         6       3.608904
    7         9       3.899360
    2         7       3.795288
    3         4       4.883098
    2         8       6.073256
    1         3       6.257798
    1         2       9.350019
----------------------------------

Dendrogram of /discb/topdir/myspace/isoclust12.gsg

C                               DISTANCE
L
A
S 0      1.0     2.1     3.1     4.1     5.2     6.2     7.2     8.3     9.3
S |-------|-------|-------|-------|-------|-------|-------|-------|------
5 -------------------------|
                           |----------|
3 -------------------------|          |
                                      |----------|
6 ---------------------------|        |          |
                             |--------|          |-------------------|
4 ---------------------------|                   |                   |
                                                 |                   |
1 -----------------------------------------------|                   |
                                                                     |-
9 -----------------------------|                                     |
                               |                                     |
7 ---------------------------------------------|                     |
                               |               |                     |
2 ------------------------------|              |---------------------|
                                               |
8 ---------------------------------------------|
  |-------|-------|-------|-------|-------|-------|-------|-------|------
  0      1.0     2.1     3.1     4.1     5.2     6.2     7.2     8.3     9.3
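Because the precise merge distances appear only in the table at the top of the output file, it can be convenient to read that table programmatically. The following is a minimal sketch that assumes the table rows consist of two integer class IDs followed by a decimal distance, as in the example output; the function name parse_merge_table is hypothetical and is not part of the Dendrogram tool.

```python
def parse_merge_table(text):
    """Extract (remaining_class, merged_class, distance) rows from the file text.

    Lines that do not consist of exactly two integers and one decimal number
    (headers, separators, and the dendrogram graph itself) are skipped.
    """
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue
        try:
            rows.append((int(parts[0]), int(parts[1]), float(parts[2])))
        except ValueError:
            # Not a data row, for example "Class Class Distance".
            continue
    return rows
```

Sorting the returned rows by their third element lists the merges from the statistically closest pair of classes to the most distant.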