XML Tutorial - XML Master Professional Application Developer Edition
Volume 6 : XML Data Transfer / XML Security

Tatsuya Kimura

Scope of Questions Appearing in Section 5

Some type of communications protocol and security technology is required when exchanging XML format files over the Internet and other networks. Section 5 of the exam tests your knowledge related to these communications protocols and security technologies. Specifically, the following six specifications appear on the exam:

SOAP 1.1

SOAP 1.1 is an XML-based standard protocol that defines the message specification when transmitting XML documents via a network. Since this message specification does not depend on a particular programming language or OS, data transfer can be conducted among and between systems that use different languages or operating systems.

WSDL 1.1

WSDL 1.1 is a specification for coding network services-related information (access point and interface specifications, etc.) in XML. Note, however, that WSDL does not define a protocol when sending/receiving messages. SOAP is generally used as the protocol for message transmission.

XML-Signature Syntax and Processing

XML-Signature Syntax and Processing ("XML-Signature," below) is a specification for adding digital signatures to XML documents. When exchanging data for business use via the Internet, digital signatures are necessary to prove that data has not been altered, as well as to prove the identity of the sender. XML-Signature is the mechanism standardized as a specification for attaching digital signatures to XML documents.

XML Encryption Syntax and Processing

XML Encryption Syntax and Processing ("XML Encryption," below) is a specification for encrypting XML documents. This specification defines the encryption mechanism used when exchanging data over the Internet.

Canonical XML Version 1.0

Canonical XML Version 1.0 ("Canonical XML," below) is a specification that provides the means for serializing an XML document in normalized format. Based on this specification, if the results of serializing two XML documents in normalized format are identical, then the original two XML documents are considered to be logically equivalent.

Exclusive XML Canonicalization Version 1.0

Exclusive XML Canonicalization Version 1.0 ("Exclusive XML Canonicalization," below) is a specification that provides a method for normalizing portions of XML documents, as opposed to Canonical XML, which defines the normalized format for entire XML documents. This specification is based on Canonical XML, adding changes related to XML namespaces handling.

Adding Digital Signatures and XML Document Normalization

XML document normalization through Canonical XML or Exclusive XML Canonicalization is indispensable when adding digital signatures to XML documents based on XML-Signature. Let me explain the reasons.

XML description format allows for comparative flexibility. For example, if the phone element has no content, you can either write "<phone></phone>" or "<phone/>". Either method is logically equivalent, despite the difference in notation. Accordingly, a comparison of XML documents before and after transmission would show no alteration, despite the coding difference.

When a digital signature has been added, however, any difference in notation between the document before and after transmission is treated as an alteration of the data, because the signature is computed over the serialized characters rather than over the logical structure. Canonical XML therefore defines a single normalized form, so that XML documents that are logically equivalent despite such differences in notation are treated as identical documents.
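The effect can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the standard library's canonicalize() function implements Canonical XML 2.0 rather than Version 1.0, although both write an empty element as a start/end tag pair, and the SHA-256 digest stands in for the digest step of a real XML-Signature implementation. The digest() helper and the variable names are hypothetical.

```python
import hashlib
from xml.etree.ElementTree import canonicalize  # C14N 2.0, used as a stand-in for 1.0


def digest(text):
    """Hash the serialized characters, as a signature's digest step would."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


before = "<phone></phone>"  # notation used by the sender
after = "<phone/>"          # logically equivalent notation seen by the receiver

# Hashing the raw text: the notations differ, so the digests differ.
raw_match = digest(before) == digest(after)

# Hashing the normalized form: both notations normalize to the same text.
canonical_match = digest(canonicalize(before)) == digest(canonicalize(after))

print(raw_match, canonical_match)
```

Without normalization the receiver would conclude that the data had been altered; with normalization applied on both sides, the two digests agree.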

Exclusive XML Canonicalization, as mentioned above, normalizes portions of an XML document and resolves the namespace issues that arise when doing so. Let's take a look at a specific example. In a data transfer using SOAP, the XML data is embedded within the SOAP message. At this point, the namespace declarations of the SOAP message may affect the embedded XML data. Let's take, for example, the following XML document prior to transmission:

<ns:Order xmlns:ns="http://example.com/order">
 <ns:product>table</ns:product>
 <ns:quantity>2</ns:quantity>
</ns:Order>

Describing this XML document in a SOAP message results in the following:

<SOAP-ENV:Envelope
 xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
<SOAP-ENV:Body>
<ns:Order xmlns:ns="http://example.com/order">
 <ns:product>table</ns:product>
 <ns:quantity>2</ns:quantity>
</ns:Order>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>

The receiving end of the SOAP message picks up the XML data that has been internally notated; however, the following XML data may be picked up, depending on the process:

<ns:Order xmlns:ns="http://example.com/order"
 xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
 <ns:product>table</ns:product>
 <ns:quantity>2</ns:quantity>
</ns:Order>

As you can see, a namespace declaration xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" has been added to the extracted XML data that wasn't in the original. This is because a namespace declaration in the SOAP message has affected the XML data described internally. This namespace declaration is using "SOAP-ENV" as the namespace prefix; accordingly, if this prefix is not used within the XML data, the presence of a namespace declaration xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" has no effect on the content of the XML document. Therefore, while two XML documents may be logically equivalent, the content is regarded as having been altered when a digital signature is used.

In this case, using Exclusive XML Canonicalization excludes the effect of the SOAP message namespace declaration, letting you treat logically equivalent XML data as identical XML data.
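The behavior can be approximated with the Python standard library. Note the assumption: Python does not ship an Exclusive XML Canonicalization 1.0 implementation, so this sketch uses canonicalize(), which implements Canonical XML 2.0. Like Exclusive XML Canonicalization, that version writes only the namespace declarations that are actually used, which is the property of interest here. The variable names are hypothetical.

```python
from xml.etree.ElementTree import canonicalize  # C14N 2.0, a stand-in for Exclusive C14N

# The document as written by the sender.
original = """<ns:Order xmlns:ns="http://example.com/order">
 <ns:product>table</ns:product>
 <ns:quantity>2</ns:quantity>
</ns:Order>"""

# The document as it may be extracted from the SOAP message,
# with the envelope's namespace declaration carried along.
extracted = """<ns:Order xmlns:ns="http://example.com/order"
 xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
 <ns:product>table</ns:product>
 <ns:quantity>2</ns:quantity>
</ns:Order>"""

canonical_original = canonicalize(original)
canonical_extracted = canonicalize(extracted)

# The unused SOAP-ENV declaration is dropped, so both forms coincide.
print(canonical_original == canonical_extracted)
```

A production system would use a library that implements Exclusive XML Canonicalization 1.0 itself, since the two specifications differ in other respects.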

SOAP Message Security

XML-Signature and XML Encryption, introduced above, are no more than rules for digital signatures and encryption. In actual practice, you have to determine specific matters beforehand, such as the format of the SOAP message after digital signing/encryption and the verification method for digital signatures. "Web Services Security v1.0" ("WS-Security," below), drafted by the OASIS standards organization, is what determines these matters. *1

*1 WS-Security is not included within the scope of the XML Master Professional Application Developer exam.

WS-Security consists of a total of five specifications, built around "SOAP Message Security 1.0."

SOAP Message Security 1.0 defines the SOAP message format after digital signing/encryption, procedures for adding and verifying digital signatures, procedures for encryption/decryption, etc. This specification requires the implementation of XML-Signature and XML Encryption. In addition, it strongly recommends you implement Exclusive XML Canonicalization for XML document normalization.

In this way, SOAP, XML-Signature, Exclusive XML Canonicalization and other specifications work in tandem to ensure the security of exchanged XML data.

In addition to testing your understanding related to the specifications introduced above, Section 5 also presents questions related to XML and system linking. Here is a rundown of the major points in this area:

  • Cautions/points for consideration when using XML to link applications and databases, or when using XML to link two or more applications
  • Understanding of RFC 3023 (defines XML media types)
  • Understanding of data binding (maps XML data structure and data type to a programming language)
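As a rough illustration of the last point, data binding can be sketched in Python by mapping the order document used earlier in this volume to a typed object. The Order class, its from_xml() method, and the choice of int for the quantity are assumptions made for this example; real data binding tools usually generate such classes from a schema.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET

ORDER_NS = {"ns": "http://example.com/order"}


@dataclass
class Order:
    """Hypothetical class that the ns:Order element is bound to."""
    product: str
    quantity: int

    @classmethod
    def from_xml(cls, text):
        root = ET.fromstring(text)
        return cls(
            # Element content arrives as text; the binding layer converts types.
            product=root.findtext("ns:product", namespaces=ORDER_NS),
            quantity=int(root.findtext("ns:quantity", namespaces=ORDER_NS)),
        )


order_xml = """<ns:Order xmlns:ns="http://example.com/order">
 <ns:product>table</ns:product>
 <ns:quantity>2</ns:quantity>
</ns:Order>"""

order = Order.from_xml(order_xml)
print(order)
```

The application then works with order.product and order.quantity as ordinary typed values instead of navigating the XML tree.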

Let's take on a practice question based on what we have covered so far:

Example of Questions Appearing in Section 5 - (1)

The following WSDL Definition (written in WSDL 1.1) defines the specification for a certain Web service. Select the answer that correctly describes this definition.

[WSDL Definition]

<definitions name="StockQuote"
 targetNamespace="http://example.com/stockquote.wsdl"
 xmlns:tns="http://example.com/stockquote.wsdl"
 xmlns:xsd1="http://example.com/stockquote.xsd"
 xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
 xmlns:xsd="http://www.w3.org/1999/XMLSchema"
 xmlns="http://schemas.xmlsoap.org/wsdl/">
 
 <types>
  <schema targetNamespace="http://example.com/stockquote.xsd"
   xmlns="http://www.w3.org/1999/XMLSchema">
   <complexType name="inputType">
    <sequence>
     <element name="tickerSymbol" type="string"/>
    </sequence>
   </complexType>
  </schema>
 </types>
 
 <message name="GetLastTradePriceInput">
  <part name="input" type="xsd1:inputType"/>
 </message>
 <message name="GetLastTradePriceOutput">
  <part name="output" type="xsd:long"/>
 </message>
 
 <portType name="StockQuotePortType">
  <operation name="getLastTradePrice">
   <input message="tns:GetLastTradePriceInput"/>
   <output message="tns:GetLastTradePriceOutput"/>
  </operation>
 </portType>
</definitions>

Options

  A. HTTP is a protocol for transmitting a message to the Web service
  B. A SOAP message is the message transmitted to the Web service
  C. The function provided by the Web service is request/response type
  D. The xsd:long type coded for the type attribute of the child part element for the second message element is not defined within the types element; accordingly, this is an error as a WSDL definition description

Answer

C

Commentary

In addition to the elements types, message and portType used in this question, WSDL definitions also allow for the use of the elements binding and service. Before getting into the commentary for this question, let's take a brief look at each of these elements.

First, the elements types, message and portType are used for defining the functions (operations) provided by Web services. The binding element defines the specific protocol used for message transmission, while the service element is used to indicate the access point. In this way, WSDL definitions allow the programmer to describe the definitions for protocol and access points separately from the definitions for operations. This allows one to use multiple protocols and access points for one operation.

Given the preceding information, it's time to take a look at the practice question. The binding element for defining protocol is not coded under [WSDL Definition] provided in the question; accordingly, answers A and B are incorrect.

In [WSDL Definition], the operation element within the portType element has an input child element followed by an output child element, which indicates that the operation provided is a request/response type *2. Accordingly, answer C is the correct answer.

*2 In addition to this, there is a one-way type operation that can be coded by using the element input only.

The "xsd:long type" in answer D is a data type defined by "XML Schema Part 2: Datatypes." In this case, the type definition via types element is not necessary, and therefore, answer D is incorrect.
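The reasoning in this commentary can also be checked mechanically. The sketch below is an illustration, not part of the exam material: it parses a trimmed copy of the question's definition (the types and message elements are left out for brevity) and classifies each operation by the order of its input and output child elements, following the four operation types that WSDL 1.1 defines. The function names classify_operations() and has_child() are hypothetical.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"

# Trimmed copy of the question's definition (types and message omitted).
WSDL_TEXT = """\
<definitions name="StockQuote"
 targetNamespace="http://example.com/stockquote.wsdl"
 xmlns:tns="http://example.com/stockquote.wsdl"
 xmlns="http://schemas.xmlsoap.org/wsdl/">
 <portType name="StockQuotePortType">
  <operation name="getLastTradePrice">
   <input message="tns:GetLastTradePriceInput"/>
   <output message="tns:GetLastTradePriceOutput"/>
  </operation>
 </portType>
</definitions>
"""

# Order of the input/output children -> WSDL 1.1 operation type.
OPERATION_TYPES = {
    ("input", "output"): "request-response",
    ("input",): "one-way",
    ("output", "input"): "solicit-response",
    ("output",): "notification",
}


def local_name(tag):
    """Strip the '{namespace}' part that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def classify_operations(wsdl_text):
    """Return {operation name: operation type} for every portType operation."""
    root = ET.fromstring(wsdl_text)
    path = f"{{{WSDL_NS}}}portType/{{{WSDL_NS}}}operation"
    result = {}
    for operation in root.iterfind(path):
        order = tuple(
            local_name(child.tag)
            for child in operation
            if local_name(child.tag) in ("input", "output")
        )
        result[operation.get("name")] = OPERATION_TYPES.get(order, "unknown")
    return result


def has_child(wsdl_text, name):
    """True if the definitions element has a direct WSDL child with this name."""
    return ET.fromstring(wsdl_text).find(f"{{{WSDL_NS}}}{name}") is not None


print(classify_operations(WSDL_TEXT))
print(has_child(WSDL_TEXT, "binding"), has_child(WSDL_TEXT, "service"))
```

The operation is classified as request-response, which matches answer C, and no binding or service element is found, which is why nothing can be said about HTTP or SOAP (answers A and B).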

Example of Questions Appearing in Section 5 - (2)

Each of the answers below is normalized using Canonical XML. Select the two answers whose normalized result is not identical to the normalized form of [XML Document].

[XML Document]

<parent>
 <child1>DATA1</child1>
 <child2>DATA2</child2>
</parent>

Options

  A. <parent>
     <child1>DATA1</child1>
     
     <child2>DATA2</child2>
    </parent>
  B. <parent>
     <child1>
     DATA1
     </child1>
     <child2>DATA2</child2>
    </parent>
  C. <parent>
     <child1
      >DATA1</child1>
     <child2
      >DATA2</child2>
    </parent>
  D. <?xml version="1.0"?>
    <parent>
     <child1>DATA1</child1>
     <child2>DATA2</child2>
    </parent>

Answer

A, B

Commentary

As explained above, Canonical XML defines rules when normalizing XML documents. In this practice question, the rules below are followed when performing normalization based on Canonical XML.

  1. Character encoding method is UTF-8
  2. XML declaration and document type declaration are not output
  3. When coding an empty element, use a combination of start and end tags, rather than an empty element tag
  4. Extraneous whitespace inside start and end tags is deleted

Comparing answers A, B and C to the [XML Document] presented in this question shows added whitespace, including line breaks, in differing locations. Of these, answer C has added whitespace inside the start tags; this whitespace is subject to normalization based on rule "4." above (in the end, this whitespace is deleted).

Meanwhile, in answers A and B the whitespace has been added outside the start and end tags, that is, within element content. Such whitespace is treated as character data and is not subject to deletion.

The XML document in answer D shows almost identical coding to [XML Document]. The only point of difference is the addition of the XML declaration "<?xml version="1.0"?>". According to rule "2.", this XML declaration is not included in the post-normalized format.

Given the preceding, the correct answers are A and B.
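The answer can be confirmed with a short Python sketch. It rests on two assumptions: the standard library's canonicalize() function implements Canonical XML 2.0 rather than Version 1.0, which gives the same result for the rules exercised here (no namespaces, attributes or comments are involved), and the answers are written with the same one-space indentation as [XML Document]. The names DOCUMENT and OPTIONS are hypothetical.

```python
from xml.etree.ElementTree import canonicalize  # C14N 2.0, a stand-in for 1.0

DOCUMENT = "<parent>\n <child1>DATA1</child1>\n <child2>DATA2</child2>\n</parent>"

OPTIONS = {
    # A: an extra whitespace-only line between the child elements
    "A": "<parent>\n <child1>DATA1</child1>\n \n <child2>DATA2</child2>\n</parent>",
    # B: line breaks added inside the content of child1
    "B": "<parent>\n <child1>\n DATA1\n </child1>\n <child2>DATA2</child2>\n</parent>",
    # C: line breaks added inside the start tags
    "C": "<parent>\n <child1\n  >DATA1</child1>\n <child2\n  >DATA2</child2>\n</parent>",
    # D: an XML declaration added in front of the document
    "D": '<?xml version="1.0"?>\n' + DOCUMENT,
}

reference = canonicalize(DOCUMENT)
matches = {label: canonicalize(text) == reference for label, text in OPTIONS.items()}
different = sorted(label for label, same in matches.items() if not same)

print(different)
```

Only the answers whose extra whitespace lies in element content remain different after normalization.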

For your reference, the results of normalizing [XML Document] and answers A and B are shown below (answers C and D produce the same result as [XML Document]).

[XML Document]

<parent>
 <child1>DATA1</child1>
 <child2>DATA2</child2>
</parent>

Options

  A. <parent>
     <child1>DATA1</child1>
     
     <child2>DATA2</child2>
    </parent>
  B. <parent>
     <child1>
     DATA1
     </child1>
     <child2>DATA2</child2>
    </parent>
