Find Similar Locations

The Find Similar Locations task measures the similarity of candidate locations to one or more reference locations.

Based on criteria you specify, Find Similar Locations can answer questions such as the following:

To answer questions such as these, you provide the reference locations (the inputLayer parameter), the candidate locations (the searchLayer parameter), and the fields representing the criteria you want to match. For example, the inputLayer might be a layer containing your top performing stores or the villages hardest hit by the disease. The searchLayer contains your candidate locations to search. This might be all of your stores or all other villages. Finally, you supply a list of fields to use for measuring similarity. The Find Similar Locations task will rank all of the candidate locations by how closely they match your reference locations across all of the fields you have selected.

Request URL

http://<analysis url>/FindSimilarLocations/submitJob

Request parameters

Parameter

Description

inputLayer

(Required)

The inputLayer contains one or more reference locations against which features in the searchLayer will be evaluated for similarity. For example, the inputLayer might contain your top performing stores or the villages hardest hit by a disease.

Syntax: As described in detail in the Feature input topic, this parameter can be one of the following:

  • A URL to a feature service layer with an optional filter to select specific features
  • A URL to a big data catalog service layer with an optional filter to select specific features
  • A feature collection

Examples:

  • {"url": <feature service layer url>, "filter": <where clause>}
  • {"layerDefinition": {}, "featureSet": {}, "filter": <where clause>}

It is not uncommon for inputLayer and searchLayer to be the same feature service. For example, the feature service contains locations of all stores, one of which is your top performing store. If you want to rank the remaining stores from most to least similar to your top performing store, you can provide a filter for both inputLayer and searchLayer. The filter on inputLayer would select the top performing store, while the filter on searchLayer would select all stores except for the top performing store. You can use the optional filter parameter to specify reference locations.

If there is more than one reference location, similarity will be based on averages for the fields you specify in the analysisFields parameter. For example, if there are two reference locations and you are interested in matching population, the task will look for candidate locations in searchLayer with populations that are most like the average population of the two reference locations. If the values for the reference locations are 100 and 102, for example, the task will look for candidate locations with populations near 101. Consequently, you will want to use fields for which the reference locations have similar values. If, for example, the population value for one reference location is 100 and the other is 100,000, the task will look for candidate locations with population values near the average of those two values: 50,050. Notice that this averaged value is nothing like the population for either of the reference locations.
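
The averaging described above can be illustrated with a short Python sketch. It is not the service's implementation; the function name `reference_targets` and the field name `population` are hypothetical, and the sketch only reproduces the arithmetic from the two examples in the text.

```python
def reference_targets(reference_rows, analysis_fields):
    """Average each analysis field across the reference locations."""
    return {
        field: sum(row[field] for row in reference_rows) / len(reference_rows)
        for field in analysis_fields
    }

# Two reference locations with similar values: the target stays representative.
close_pair = [{"population": 100}, {"population": 102}]
# Two reference locations with very different values: the target matches neither.
far_pair = [{"population": 100}, {"population": 100000}]

print(reference_targets(close_pair, ["population"]))  # target near 101
print(reference_targets(far_pair, ["population"]))    # target near 50,050
```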

searchLayer

(Required)

The layer containing candidate locations that will be evaluated against the reference locations.

Syntax: As described in detail in the Feature input topic, this parameter can be one of the following:

  • A URL to a feature service layer with an optional filter to select specific features
  • A URL to a big data catalog service layer with an optional filter to select specific features
  • A feature collection

Examples:

  • {"url": <feature service layer url>, "filter": <where clause>}
  • {"layerDefinition": {}, "featureSet": {}, "filter": <where clause>}

analysisFields

(Required)

A list of fields whose values are used to determine similarity. They must be numeric fields, and the fields must exist on both the inputLayer and the searchLayer. Depending on the matchMethod selected, the task will find features that are most similar based on values or profiles of the fields.

mostOrLeastSimilar

(Required)

The features you want to be returned. You can search for features that are either most similar or least similar to the inputLayer, or search both the most and least similar.

Values: MostSimilar | LeastSimilar | Both

Example: "mostOrLeastSimilar": "MostSimilar"

matchMethod

(Required)

The method used to measure how closely candidate locations match the reference locations. The AttributeValues method uses the squared differences of standardized values. This is the default. The AttributeProfiles method uses cosine similarity mathematics to compare the profile of standardized values. Using AttributeProfiles requires the use of at least two analysis fields.

Values: AttributeValues | AttributeProfiles

Example: "matchMethod": "AttributeValues"
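
To make the difference between the two methods concrete, the following Python sketch compares a sum of squared differences with cosine similarity on values that are assumed to be standardized already. The function names and sample vectors are hypothetical, and how the service standardizes fields or scales its reported indexes is not described in this topic.

```python
import math

def squared_difference_index(candidate, target):
    """Sum of squared differences between two vectors of standardized values."""
    return sum((c - t) ** 2 for c, t in zip(candidate, target))

def cosine_similarity(candidate, target):
    """Cosine of the angle between two profiles (1.0 means the same direction)."""
    dot = sum(c * t for c, t in zip(candidate, target))
    norms = math.sqrt(sum(c * c for c in candidate)) * math.sqrt(sum(t * t for t in target))
    return dot / norms

# Hypothetical standardized values for two analysis fields.
target = [1.0, -0.5]        # averaged reference locations
same_values = [1.0, -0.5]   # identical values
same_profile = [2.0, -1.0]  # same proportions, larger magnitudes

print(squared_difference_index(same_values, target))   # 0.0
print(squared_difference_index(same_profile, target))  # 1.25
print(cosine_similarity(same_profile, target))         # approximately 1.0
```

In this sketch the third candidate differs from the target by value but has the same profile, which is the kind of distinction the two matchMethod options draw.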

numberOfResults

The number of ranked candidate locations output to similarResultLayer. If numberOfResults is not set, 10 locations will be returned. The maximum number of results is 10000.

Example: "numberOfResults": 15

appendFields

An optional list of fields from the search layer to add to the results. By default, all fields from the search layer are appended.

Examples:

  • "appendFields": "Id Number"
  • "appendFields": "Id Number, Code"

outputName

(Required)

The task will create a feature service of the results. You define the name of the service.

context

Context contains additional settings that affect task execution. For this task, there are four settings:

  • Extent (extent)—A bounding box that defines the analysis area. Only those features that intersect the bounding box will be analyzed.
  • Processing spatial reference (processSR)—The features will be projected into this coordinate system for analysis.
  • Output spatial reference (outSR)—The features will be projected into this coordinate system after the analysis to be saved. The output spatial reference for the spatiotemporal big data store is always WGS84.
  • Data store (dataStore)— Results will be saved to the specified data store. The default is the spatiotemporal big data store.

Syntax:
{
"extent" : {extent},
"processSR" : {spatial reference},
"outSR" : {spatial reference},
"dataStore":{data store}
}
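
As an illustration, a context value could be assembled and serialized in Python as sketched below. The coordinates and well-known IDs are hypothetical sample values, and the shape of the extent and spatial reference objects follows the usual ArcGIS REST JSON conventions, which is an assumption here because this topic does not spell them out. The dataStore setting is omitted because its value format is not described in this topic.

```python
import json

# Hypothetical analysis area and coordinate systems.
context = {
    "extent": {
        "xmin": -122.68, "ymin": 45.53,
        "xmax": -122.45, "ymax": 45.60,
        "spatialReference": {"wkid": 4326},
    },
    "processSR": {"wkid": 3857},
    "outSR": {"wkid": 4326},
}

# The serialized string is what would be sent as the context parameter.
context_param = json.dumps(context)
print(context_param)
```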

f

The response format. The default response format is html.

Values: html | json
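
The parameters above can be combined into a submitJob request as in the following Python sketch, which only builds the URL and form body and does not send anything. The helper name `build_submit_job`, the example URLs, the field names, and the filters are hypothetical. Sending analysisFields as a comma-separated string and passing outputName through unchanged are assumptions, since this topic does not specify their encoding.

```python
import json
from urllib.parse import urlencode

def build_submit_job(analysis_url, input_layer, search_layer, analysis_fields,
                     output_name, most_or_least="MostSimilar",
                     match_method="AttributeValues", number_of_results=None,
                     token=None):
    """Return (url, form_body) for a submitJob request; nothing is sent."""
    params = {
        "inputLayer": json.dumps(input_layer),
        "searchLayer": json.dumps(search_layer),
        # Assumption: field lists are comma-separated, like the appendFields examples.
        "analysisFields": ", ".join(analysis_fields),
        "mostOrLeastSimilar": most_or_least,
        "matchMethod": match_method,
        # Assumption: outputName is passed through exactly as supplied by the caller.
        "outputName": output_name,
        "f": "json",
    }
    if number_of_results is not None:
        params["numberOfResults"] = number_of_results
    if token is not None:
        params["token"] = token
    url = analysis_url.rstrip("/") + "/FindSimilarLocations/submitJob"
    return url, urlencode(params)

# Hypothetical layer: one store is the reference, all others are candidates.
stores = "https://example.com/arcgis/rest/services/Stores/FeatureServer/0"
url, body = build_submit_job(
    "https://example.com/analysis",
    {"url": stores, "filter": "store_id = 17"},
    {"url": stores, "filter": "store_id <> 17"},
    ["population", "median_income"],
    "SimilarStores",
    number_of_results=15,
)
print(url)
print(body)
```

The two filters mirror the single-service case described under inputLayer: the same layer is used for both parameters, with the reference store excluded from the candidates.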

Response

When you submit a request, the service assigns a unique job ID for the transaction.

Syntax:
{
"jobId": "<unique job identifier>",
"jobStatus": "<job status>"
}

After the initial request is submitted, you can use jobId to periodically check the status of the job and messages as described in Checking job status. Once the job has successfully completed, use the jobId to retrieve the results. To track the status, you can make a request of the following form:

http://<analysis url>/FindSimilarLocations/jobs/<jobId>
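
Periodic status checking might be structured as in the Python sketch below. The helper `wait_for_job` is hypothetical, and the caller supplies `fetch_status`, a function that performs the status request and returns the parsed JSON. Only esriJobSucceeded appears in this topic; the failure statuses listed in the code are assumed from the standard ArcGIS job status values.

```python
import time

# Assumption: statuses that mean the job will never succeed.
FAILURE_STATES = {"esriJobFailed", "esriJobCancelled", "esriJobTimedOut"}

def wait_for_job(fetch_status, poll_seconds=5.0, max_polls=120, sleep=time.sleep):
    """Poll until the job succeeds; raise if it fails or the poll budget runs out.

    fetch_status is a caller-supplied function returning the parsed status
    response, a dict with "jobId" and "jobStatus" keys.
    """
    for _ in range(max_polls):
        status = fetch_status()
        state = status.get("jobStatus")
        if state == "esriJobSucceeded":
            return status
        if state in FAILURE_STATES:
            raise RuntimeError(f"Job {status.get('jobId')} ended with status {state}")
        sleep(poll_seconds)
    raise TimeoutError("Job did not finish within the polling budget")

# Usage with canned responses instead of a live service.
responses = iter([
    {"jobId": "j1", "jobStatus": "esriJobExecuting"},
    {"jobId": "j1", "jobStatus": "esriJobSucceeded"},
])
final = wait_for_job(lambda: next(responses), sleep=lambda seconds: None)
print(final["jobStatus"])  # esriJobSucceeded
```

Passing the fetch and sleep functions in as arguments keeps the loop independent of any particular HTTP client and makes it easy to exercise without a server.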

Accessing results

When the status of the job request is esriJobSucceeded, you can access the results of the analysis by making a request of the following form:

http://<analysis url>/FindSimilarLocations/jobs/<jobId>/results/output?token=<your token>&f=json

Parameter

Description

output

Contains features from the inputLayer and the searchLayer. The number of features from the searchLayer is based on the value of the numberOfResults parameter. Fields added to outputName include all the fields from the searchLayer and the following:

  • location_type—A string denoting if the feature was an input reference location or a search location.
  • simrank—The similarity rank. Contains the rank for search locations, where 1 equals the candidate location most similar to the reference locations. Contains zero for reference locations.
  • dissimrank—The dissimilarity rank. Contains the rank for search locations, where -1 equals the candidate location most dissimilar to the reference locations. Contains zero for reference locations.
  • simindex—Quantifies how similar a candidate is compared to the reference locations. If a candidate location matches a reference location exactly, the value is zero. The larger the value, the more dissimilar a candidate is from the reference locations.
  • cosimindex—Quantifies how similar a candidate's profile is compared to the reference location's profile. If the profile of a candidate location matches the profile of a reference location exactly, the value is zero. The larger the value, the more dissimilar a candidate is from the reference locations.
  • labelrank—Used for labelling and rendering the outputs. The higher the values, the more similar; the lower the values, the more dissimilar. Values of 0 represent the reference locations.

Request example:
{"url": 
"http://<analysis url>/FindSimilarLocations/jobs/<jobId>/results/output"}

The result has properties for parameter name, data type, and value. The contents of value depend on the outputName parameter provided in the initial request. The value contains the URL of the feature service layer.

{
"paramName":"output", 
"dataType":"GPRecordSet",
"value":{"url":"<hosted featureservice layer url>"}
}
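
A client could pull the layer URL out of this result as in the following Python sketch. The helper name `result_layer_url` and the sample URL are hypothetical; the keys match the response shown above.

```python
import json

def result_layer_url(result_text):
    """Return the feature service layer URL from an output result response."""
    result = json.loads(result_text)
    if result.get("paramName") != "output":
        raise ValueError("Unexpected result parameter: %r" % result.get("paramName"))
    return result["value"]["url"]

# Hypothetical response body for a finished job.
sample = json.dumps({
    "paramName": "output",
    "dataType": "GPRecordSet",
    "value": {"url": "https://example.com/arcgis/rest/services/SimilarStores/FeatureServer/0"},
})
print(result_layer_url(sample))
```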

See Feature output for more information about how the result layer is accessed.
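
Once the features of the result layer have been read, the simrank field described earlier can be used to list the best matches. The Python sketch below assumes the attributes have already been loaded as dictionaries; the helper `top_matches`, the `name` field, and the sample rows are hypothetical.

```python
def top_matches(features, n=5):
    """Return the n most similar candidates, best match first.

    Reference locations carry a simrank of zero, so they are skipped.
    """
    candidates = [f for f in features if f["simrank"] > 0]
    return sorted(candidates, key=lambda f: f["simrank"])[:n]

rows = [
    {"name": "Reference", "simrank": 0, "simindex": 0.0},
    {"name": "Store C", "simrank": 2, "simindex": 0.8},
    {"name": "Store A", "simrank": 1, "simindex": 0.1},
]
print([f["name"] for f in top_matches(rows)])  # ['Store A', 'Store C']
```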

7/5/2017