XML columns in a geodatabase in DB2
XML is an open standard for defining data elements within documents. To store XML data in an IBM DB2 database, you can use ArcSDE XML or native DB2 (PureXML) XML data types.
You can store user-defined XML documents in either type of XML. The XML_COLUMN_STORAGE DBTUNE parameter controls the type of XML used. By default, this parameter is set to DB_XML. Therefore, by default, the PureXML data type is used.
You must install DB2 Text Search Extender to use the ArcSDE XML data type.
If you have been using Net Search Extender, you must migrate to Text Search Extender. See the DB2 Information Center for details (http://publib.boulder.ibm.com/infocenter/db2luw/v9r7/index.jsp?topic=/com.ibm.db2.luw.admin.ts.doc/doc/c0053115.html).
XML columns in ArcGIS for Desktop
XML columns are not completely supported in the geodatabase. Therefore, the following are true:
- You cannot preview the contents of an XML column in a table in ArcCatalog. The column will be blank or have unintelligible characters in it.
- On the Fields tab of the table Properties dialog box, no data type is listed for an XML column.
- You cannot copy and paste a table containing an XML column from one geodatabase to another.
- If you import or export the table using geoprocessing tools, you have to exclude the XML column.
- You cannot register as versioned a table with an XML column using ArcCatalog.
- If you open a table that contains an XML column in ArcMap, the XML column appears blank.
- To edit a table that contains an XML column in ArcMap, you can use a nonversioned editing session. However, you cannot edit the XML column itself, only the other fields in the table.
XML columns in a DB2 DBMS
If you use the native DB2 PureXML, there are no extra geodatabase system tables to track it; it is tracked in DB2's system tables. For information on DB2's PureXML, consult IBM's DB2 documentation set.
There are two different types of XML columns you can use: ArcSDE XML or DB2 PureXML. Which one you use in a dataset you create in your geodatabase is controlled by the XML_STORAGE parameter in the DBTUNE table.
For ArcSDE XML, there are three ArcSDE geodatabase system tables used to manage XML columns: SDE_XML_COLUMNS, SDE_XML_INDEX_TAGS, and SDE_XML_INDEXES. These tables are owned by the geodatabase administrator user. There are two additional tables for each XML column that are used to store and index XML documents: the SDE_XML_DOC<COLUMN_ID> and SDE_XML_IDX_<COLUMN_ID> tables. These tables are owned by the user who owns the business table containing the XML column.
The following tables are created when using ArcSDE XML. These tables are used to store and index XML documents.
Do not alter any of these tables using SQL.
SDE_XML_COLUMNS
When you add an ArcSDE XML column to a business table, a row is added to the SDE_XML_COLUMNS table. This table occurs once in each ArcSDE database.
Field name |
Field type |
Description |
Null? |
---|---|---|---|
COLUMN_ID |
INTEGER |
The identifier of the XML column and the primary key for the table This value is assigned by ArcSDE at the time the ArcSDE XML column is created. |
NOT NULL |
REGISTRATION_ID |
INTEGER |
The identifier of the business table containing the ArcSDE XML column and foreign key to the TABLE_REGISTRY system table |
NOT NULL |
COLUMN_NAME |
VARCHAR(32) |
The name of the ArcSDE XML column in the business table |
NOT NULL |
INDEX_ID |
INTEGER |
The identifier of the XPath index associated with the ArcSDE XML column (if one exists) and foreign key to the SDE_XML_INDEXES table |
|
MINIMUM_ID |
INTEGER |
The value of the initial number used in the ArcSDE XML column of the business table to identify individual XML documents |
|
CONFIG_KEYWORD |
VARCHAR(32) |
The DBTUNE configuration keyword for which the parameters determine how the XML document, the XML XPath index tables, and the text indexes created on those tables are defined in the database For more information on DBTUNE keywords and their parameters, see What are DBTUNE configuration keywords and parameters? and DBTUNE configuration parameters for DB2. |
|
XFLAGS |
INTEGER |
A value indicating whether the original documents in the XML document table are stored compressed or decompressed By default, documents are compressed; compressed documents provide better performance. |
SDE_XML_INDEXES
This table occurs once in each geodatabase. It contains one row for each ArcSDE XML column that has an XPath index.
Field name |
Field type |
Description |
Null? |
---|---|---|---|
INDEX_ID |
INTEGER |
The identifier of the XPath index and the primary key for the table |
NOT NULL |
INDEX_NAME |
VARCHAR(32) |
The name of the XPath index |
NOT NULL |
OWNER |
VARCHAR(32) |
The database user who owns the ArcSDE XML column |
NOT NULL |
INDEX_TYPE |
INTEGER |
A value indicating the type of XPath index With ArcSDE 9.1 and later releases, the value is 2 for the index type SE_XML_INDEX_DEFINITION and 1 for the index type SE_XML_INDEX_TEMPLATE. |
NOT NULL |
DESCRIPTION |
VARCHAR(64) |
Text identifying the XPath index If an index definition file was used to create the index, the index description could be specified at the top of the file. |
SDE_XML_INDEX_TAGS
An ArcSDE XML column might optionally have an XPath index, which lets you search the content of a specific XML element or attribute in each document. The definition of which elements and attributes are included in or excluded from each XPath index is recorded in this table.
This table occurs once in each geodatabase. It contains one row for each XPath associated with the XPath index of an ArcSDE XML column.
Field name |
Field type |
Description |
Null? |
---|---|---|---|
INDEX_ID |
INTEGER |
The identifier of the XPath index associated with an ArcSDE XML column (if one exists) and foreign key to the SDE_XML_INDEXES table |
NOT NULL |
TAG_ID |
INTEGER |
The identifier of an XPath or tag |
NOT NULL |
TAG_NAME |
VARCHAR(1024) |
An absolute XPath identifying an XML element or attribute that may occur in an XML document For example, /metadata/mdDateSt identifies an XML element and /metadata/dataIdInfo/tpCat/TopicCatCd/@value identifies an XML attribute. These XPaths must not contain asterisks (*) to refer to a group of XML elements or attributes—each element or attribute is matched exactly using the XPaths specified in this table. Elements cannot be named * in a valid XML document. |
NOT NULL |
DATA_TYPE |
INTEGER |
A value indicating whether the XML element or attribute is indexed as a number, a varchar, or text A 1 indicates the content of the tag is indexed as text; a 2 indicates the content of the tag is indexed as a number; a 3 indicates the content of the tag is indexed as a varchar. |
NOT NULL |
TAG_ALIAS |
INTEGER |
A number that may be used to identify an XPath For example, the Z39.50 communication protocol uses numeric codes to refer to content that may be searched. |
|
DESCRIPTION |
VARCHAR(64) |
Text identifying the content that should be contained in the XML element or attribute |
|
IS_EXCLUDED |
INTEGER |
A value indicating whether the XML element is included in or excluded from the XPath index
|
NOT NULL |
SDE_XML_DOC<COLUMN_ID>
The SDE_XML_DOC<COLUMN_ID> table stores the XML document and maintains a full-text index on the content of the XML document. The geodatabase contains one of these tables for each ArcSDE XML column. The number in the table name is the ArcSDE XML column identifier. This table contains one row for each XML document stored in the ArcSDE XML column.
Field name |
Field type |
Description |
Null? |
---|---|---|---|
SDE_XML_ID |
INTEGER |
The identifier for an XML document stored in the ArcSDE XML column and primary key for the table |
NOT NULL |
DOC_PROPERTY |
INTEGER |
A value indicating whether any conflicts were found when adding the content of an XML document to the XPath index 1 = A conflict was found, for example, when an element is supposed to be indexed numerically but the document contains a string in that element instead. NULL value = There were no problems indexing the document. |
|
XML_DOC |
BLOB |
The XML document |
NOT NULL |
XML_DOC_VAL |
BLOB |
The content of the entire XML document with all XML tags and other markup removed A text index is built on this column by default; this index is used to respond to full-text queries. |
SDE_XML_IDX<COLUMN_ID>
The SDE_XML_IDX<COLUMN_ID> table is created for ArcSDE XML columns that have an XPath text index. This table stores the text or number content for each XPath that is indexed.
The ID number in the table name is the internal registration number for the ArcSDE XML column.
Field name |
Field type |
Description |
Null? |
---|---|---|---|
XML_KEY_COLUMN |
INTEGER |
The identifier for the indexed value and primary key for the table |
NOT NULL |
SDE_XML_ID |
INTEGER |
The identifier for the XML document that contains the indexed value |
NOT NULL |
TAG_ID |
INTEGER |
The identifier for the tag associated with the XPath index of the ArcSDE XML column, which identifies where in the document the value is stored |
NOT NULL |
DOUBLE_TAG |
FLOAT |
The indexed value, when the tag is defined as DOUBLE in the XPath index definition |
|
STRING_TAG |
VARCHAR2(256) |
The indexed value, when the tag is defined as VARCHAR in the XPath index definition |
|
TEXT_TAG |
CLOB |
The indexed value, when the tag is defined as STRING in the XPath index definition |
The following is a diagram of a table with an ArcSDE XML column and the system tables used to track it. Dashed lines indicate implicit relationships; a solid line denotes explicitly defined relationships between tables.
By default, when XML documents are stored in an XML column in a geodatabase in DB2, the text indexes associated with the column are not automatically updated to include text from the new documents. Until the text indexes have been updated, any XML documents that have not been indexed cannot be found by a search.
ArcSDE does not provide any utilities that can be used to manually trigger updates to an XML column's text indexes.
XML columns in an XML document
You cannot export a table containing an XML column to an XML workspace document. You can export it to an XML recordset document, but there is nothing within the document to distinguish the column as XML.