Memory tuning in PostgreSQL

Settings in postgresql.conf

To facilitate loading large amounts of data into a PostgreSQL database, you may need to increase the shared_buffers and max_locks_per_transaction values in the postgresql.conf file.

shared_buffers

The shared_buffers parameter designates the amount of memory used for shared memory buffers. The PostgreSQL documentation indicates that, for performance reasons, you likely need to use a setting greater than the minimum value of 128 KB or 16 KB times the number set for the max_connections value. It is recommended that shared_buffers be set to use several tens of megabytes for production installations.

When loading large amounts of data, you will most likely need a higher shared_buffers setting than the default value of 32 MB. After altering this parameter in the postgresql.conf file, you must restart the database cluster.

max_locks_per_transaction

The max_locks_per_transaction value indicates the number of database objects that can be locked simultaneously. In most cases, the default value of 64 is sufficient. However, when loading a large number of datasets (for example, several thousand) at once, the number of concurrent object locks for the transaction can exceed 64.

It is not a one-to-one relationship between concurrent locks and the number of datasets; in other words, if you are loading 3,000 datasets, you don't need to increase the max_locks_per_transaction to 3,000. Start by uncommenting the max_locks_per_transaction property and increasing its value to 100 prior to bulk loads.

When you change the max_locks_per_transaction parameter value, you must restart the server.

Increasing the value of either of these parameters could cause the database to request more shared memory than your operating system has available. For information on how you can increase the shared memory setting for your operating system, see "Managing Kernel Resources" in the PostgreSQL documentation.

cursor_tuple_fraction

The cursor_tuple_fraction value is used by the PostgreSQL planner to estimate what fraction of rows returned by a query are needed. By default, cursor_tuple_fraction is set to 0.1 in the postgresql.conf file, which means the first 10 percent of rows (or features) in the result set are returned rapidly, but the remainder take longer to be returned.

If you want to alter what percentage of the result set the PostgreSQL planner displays, you can alter the value for the cursor_tuple_fraction property in the postgresql.conf file.

Improving third-party SQL spatial query performance

When you execute SQL queries outside of ArcGIS that return the ST_Geometry spatial columns from a business table, it could improve query performance if you set a system environment variable, ST_GEOMETRY_OUTPUT_FORMAT, to output to the ST_Geometry type rather than the extended Well-Known Text (WKT) representation.

By default, the ST_GEOMETRY_OUTPUT_FORMAT is set to ST_GEOMETRY, which means a hexabinary representation is returned. This is required to create a usable backup of the geodatabase. You can change this variable to TYPE if you want to improve SQL query performance. If you set this variable because you plan to do a number of spatial SQL queries, be sure to remove it after you have finished your queries, then restart the PostgreSQL database cluster.

The variable should be set on the computer where PostgreSQL is running.

For a Linux OS, set the variable for the shell from which you are issuing SQL queries. For a bash shell, the syntax is as follows:

ST_GEOMETRY_OUTPUT_FORMAT=TYPE

For a csh shell, the syntax is

setenv ST_GEOMETRY_OUTPUT_FORMAT TYPE

On Windows, create a system environment variable in the system properties.

Variable name: ST_GEOMETRY_OUTPUT_FORMAT

Variable value: TYPE

After setting the variable, you must restart the PostgreSQL database cluster.

8/21/2013