Using XML and Web Services
Although XML is not a common source for large volumes of data, it is an integral technology standard in the realm of data. This section considers XML from a couple of different perspectives. First, you will work with the Web Service Task to interact with a public web service. Second, you will use the XML Source adapter to extract data from an XML document embedded in a file. In one of the web service examples, you will also use the XML Task to read the XML file.
Configuring the Web Service Task
In very simple terms, a web service is to the web as a function is to a code module. It accepts a message in XML, including arguments, and returns the answer in XML. The value of XML technology is that it enables computer systems that are completely foreign to each other to communicate in a common language. When using web services, this transfer of XML data occurs across the enterprise or across the Internet using the HTTP protocol. Many web services — for example, stock-tickers and movie listings — are freely available for anyone’s use. Some web services, of course, are private or require a fee. Two common and useful applications are to enable orders or other data to be exchanged easily by corporate partners, and to receive information from a service — either one that you pay for or a public service that is exposed free on the Internet. In the following examples, you’ll learn how to use a web service to get the weather forecast of a U.S. zip code by subscribing to a public web service, and how to use the Web Service Task to perform currency conversion. Keep in mind that the web service task depends on the availability of a server. The Web Service Task could return errors if the server is unreachable or if the server is experiencing any internal errors.
Weather by Zip Code Example
This example demonstrates how to use a web service to retrieve data:
- Create a new package and create an HTTP Connection by right-clicking in the Connection Managers pane and choosing New Connection.
- Choose HTTP and click Add to bring up the HTTP Connection Manager
Editor. Type http://www.webservicex.net/WeatherForecast.asmx?wsdl as the Server URL (see Figure 14-17). In this case, you’ll use a publicly available web service, so you won’t have to worry about any credentials or certificates. If you must supply proxy information to browse the web, fill that in on the Proxy tab.
- Before continuing, click the Test Connection button, and then click OK to accept the Connection Manager.
- Add a Web Service Task from the Toolbox to the Control Flow workspace.
- Double-click the Web Service Task to bring up the Web Service Task Editor. Select the General pane. Make sure that the HttpConnection property is set to the HTTP connection you created in step number 2.
- In order for a web service to be accessed by a client, a Web Service Definition Language (WSDL) file must be available that describes how the web service works — that is, the methods available and the parameters that the web service expects. The Web Service Task provides a way to automatically download this file.
- In the WSDLFile property, enter the fully qualified path c:\ProSSIS\Data\weather.wsdl where you want the WSDL file to be created (see below screen shot).
- Set the OverwriteWSDLFile property to True and then click Download
WSDL to create the file. If you are interested in learning more about the
file’s XML structure, you can open it with Internet Explorer.
By downloading the WSDL file, the Web Service Task now knows the web service definition.
- Select the Input pane of the Web Service Task Editor. Then, next to the Service property, open the dropdown list and select the one service provided, called WeatherForecast.
- After selecting the WeatherForecast service, click in the Method property and choose the GetWeatherByZipCode option.
- Web services are not limited to providing just one method. If multiple methods are provided, you’ll see all of them listed. Notice another option called GetWeatherByPlaceName. You would use this if you wanted to enter a city instead of a zip code. Once the GetWeatherByZipCode method is selected, a list of arguments appears. In this case, a ZipCode property is presented. Enter a zip code location of a U.S. city (such as 30303 for Atlanta, or, if you live in the U.S., your own zip code). See the below screen shot.
- Now that everything is set up to invoke the web service, you need to tell the Web Service Task what to do with the result. Switch to the Output property page of the Web Service Task Editor. Choose File Connection in the dropdown of the OutputType property. You can also store the output in a variable to be referenced later in the package.
- In the File property, open the dropdown list and choose <new connection>.
- When you are presented with the File Connection Manager Editor, change the Usage type property to Create file and change the File property to C:\ProSSIS\Data\weatheroutput.xml, as shown in below screen shot.
- Select OK in the File Connection Manager Editor, and OK in the Web Service Task Editor to finish configuring the SSIS package.
Now you’re ready to run the package. After executing it, wait for the Web Service Task to complete successfully. If all went well, use Internet Explorer to open the XML file returned by the web service
(c:\ProSSIS\data\weatheroutput.xml) and view the weather forecast for the zip code. It will look something like this:
<?xml version=”1.0″ encoding=”utf-16″ ?>
<Day>Thursday, September 01, 2011</Day>
<Day>Friday, September 02, 2011</Day>
The Currency Conversion Example
In this second example, you learn how to use a web service to get a value that can be used in the package to perform a calculation. To convert a price list to another currency, you’ll use the value with the Derived Column Transformation:
- Begin by creating a new SSIS package. This example requires three variables. To set them up, ensure that the Control Flow tab is selected. If the Variables window is not visible, right-click in the design area and select Variables. Set up the three variables as shown in the following table. At this time, you don’t need initial values. (You can also use package parameters instead of variables for this example.)
- Add a Connection Manager pointing to the AdventureWorks database.
- Add a second connection. This time, create an HTTP Connection Manager and set the Server URL to http://www.webservicex.net/CurrencyConvertor.asmx?wsdl.
NOTE Note that this web service was valid at the time of publication, but the authors cannot guarantee its future availability.
- Drag a Web Service Task to the design area and double-click the task to open the Web Service Task Editor. Set the HTTPConnection property to the Connection Manager you just created.
- Type in a location to store the WSDLFile, such as c:\ProSSIS\data\CurrencyConversion.wsdl, and then click the Download WSDL button as you did in the last example to download the WSDL file.
- Click Input to see the web service properties. Select CurrencyConvertor as the Service property and ConversionRate as the Method.
- Two parameters will be displayed: FromCurrency and ToCurrency. Set FromCurrency equal to USD, and ToCurrency equal to EUR (see below screen shot).
- Click Output and set the OutputType to Variable.
- The variable name to use is User::XMLAnswer (see below screen shot). Click OK to accept the configuration.
NOTE At this point, you may be interested in viewing the XML that it returned from the web service. You can save the XML in a file instead of a variable. Then, after running the task, examine the file. Alternately, you can set a breakpoint on the task and view the variable at runtime. See Error and Event Handling Topic to learn more about breakpoints and debugging.The value of the XML returned will look something like this:
<?xml version=”1.0″ encoding=”utf-8″>
- Because (for the sake of the example) you just need the number and not the XML, add an XML Task to the designer to evaluate the XML.
- Drag the precedence constraint from the Web Service Task to the XML Task, and then open the XML Task Editor by double-clicking the XML Task.
- Change the OperationType to XPATH. The properties available will change to include those specific for the XPATH operation. Set the properties to match those in the following table:
A discussion about the XPATH query language is beyond the scope of this Tutorial, but this XML is very simple with only a root element that can be accessed by using the slash character (/). Values are returned from the query as a list with a one-character unprintable row delimiter. In this case, only one value is returned, but it still has the row delimiter that you can’t use.
You have a couple of options here. You could save the value to a file, then import using a File Source Component into a SQL Server table, and finally use the Execute SQL Task to assign the value to a variable; but in this example, you will get a chance to use the Script Task to eliminate the extra character:
- Add a Script Task to the design area and drag the precedence constraint from the XML Task to the Script Task.
- Open the Script Task Editor and select the Script pane.
- In order for the Script Task to access the package variables, they must be listed in the ReadOnlyVariables or ReadWriteVariables properties (as appropriate considering whether you will be updating the variable value in the script) in a semicolon-delimited list. Enter User::Answer in the ReadOnlyVariables property and User::ConversionRate in the ReadWriteVariables property (see below screen shot).
- Click Design Script to open the code window. A Microsoft Visual Studio Tools for Applications environment opens. The script will save the value returned from the web service call to a variable. One character will be removed from the end of the value, leaving only the conversion factor. This is then converted to a double and saved in the ConversionRate variable for use in a later step.
- Replace Sub Main with the following code (Currency_Script.txt):Public Sub Main()
Dim strConversion As String
strConversion = Dts.Variables(“User::Answer”).Value.ToString
strConversion = strConversion.Remove(strConversion.Length -1,1)
Dts.TaskResult = Dts.Results.Success
- Close the scripting environment, and then click OK to accept the Script Task configuration.
- Add a Data Flow Task to the design area and connect the Script Task to the Data Flow Task. The Control Flow area should resemble what is shown in below screen shot.
- Move to the Data Flow tab and add a Connection Manager pointing to the AdventureWorks database, if you did not do so when getting started with this example.
- Drag an OLE DB Source Component to the design area.
- Open the OLE DB Source Editor and set the OLE DB Connection Manager property to the AdventureWorks connection. Change the data access mode property to SQL Command. Type the following query (Currency_Select.txt) in the command window:SELECT ProductID, ListPrice
WHERE ListPrice> 0
- Click OK to accept the properties.
- Add a Derived Column Transformation to the design area.
- Drag the Data Flow Path from the OLE DB Source to the Derived Column Component.
- Double-click to open the Derived Column Transformation Editor dialog. Variables, columns, and functions are available for easily building an expression. Add a derived column called EuroListPrice. In the Expression field, type the following (Currency_Expression.txt):ListPrice * @[User::ConversionRate]
- The Data Type should be a decimal with a scale of 2. Click OK to accept the properties (see below screen shot).
- Add a Flat File Destination Component to the Data Flow design area. Drag the Data Flow Path from the Derived Column Component to the Flat File Destination Component.
- Bring up the Flat File Destination Editor and click New to open the Flat File Format dialog.
- Choose Delimited and click OK. The Flat File Connection Manager Editor will open.
- Browse to or type in the path to a file, C:\ProSSIS\data\USD_EUR.txt. Here you can modify the file format and other properties if required (see below screen shot). Check Column names in the first data row.
- Click OK to dismiss the Flat File Connection Manager Editor dialog and return to the Flat File Destination Editor.
- Click Mappings and then click OK.
- Run the package and then open the file that was created by the destination adapter, C:\ProSSIS\data\USD_EUR.txt. You should see a list of products along with the list price and the list price converted to euros.
Many web services are available for you to try. See www.xmethods.net for a list of services, some of which are free. In the next section, you learn how to import an XML file into relational tables.
Working with XML Data as a Source
SQL Server provides many ways to work with XML. The XML Source adapter is yet another jewel in the SSIS treasure chest. It enables you to import an XML file directly into relational tables if that is what you need. In this example, you import an RSS (Really Simple Syndication) file from the web.
To use the XML Source adapter in SSIS, you first connect to an XML file, and then you need to provide the XSD definition of the XML structure, so that SSIS can read the file and correctly interpret the XML elements and attribute structure. Don’t have an XSD? No problem; SSIS can self-generate the XSD from within the XML Source adapter. There is no guarantee that the generated XSD will work with another XML file coming from the same source, which is why it is better to have an XSD definition that is provided by the XML Source that will universally apply to the related files used by SSIS.
- Create a new Integration Services package to get started.
- Add a Data Flow Task to the Control Flow design area and then click the Data Flow tab to view the Data Flow design area.
- Add an XML Source and name it CD Collection.
- Double-click the CD Collection Component to open the XML Source Editor.
- Make sure that the Connections Manager property page is selected on the left.
- Select XML File Location for the data access mode. For the XML location property, type in the following address:
C:\ProSSIS\Data\cd_catalog.xmlIf you click the Browse button, a regular File Open dialog opens. It isn’t obvious at first that you can use a URL instead of a file on disk.The XML file must be defined with an XML Schema Definition (XSD), which describes the elements in the XML file. Some XML files have an inline XSD, which you can determine by opening the file and looking for xsd tags. There are many resources and tutorials available on the web if you want to learn more about XML schemas. If the file you are importing has an in-line schema, make sure that Use inline schema is checked. If an XSD file is available, you can enter the path in the XSD location property (see Figure 14-27). In this case, you will create the XSD file right in the Source adapter.
- Click Generate XSD and put the file in a directory on your machine (such as c:\ProSSIS\data\cd_catalog.xsd). Once the file is generated, you can open it with Internet Explorer to view it if you are interested in learning more.
- Now that the CD Collection Component understands the XML file, click Columns. You will see all the columns defined in the XML file, as below screen shot shows.
Even though the XML document is one file, it represents two tables with a one-to-many relationship. If you browse to C:\ProSSIS\Data\cd_catalog.xml, you’ll see a channel, which describes the source of the information, usually news, and several items, or articles, defined. One note of caution here: if you are importing into tables with primary/foreign key constraints, there is no guarantee that the parent rows will be inserted before the child rows. Be sure to keep that in mind as you design your XML solution.
The properties of the channel and item tags match the columns displayed in the XML Source Editor. At this point, you can choose which fields you are interested in importing and change the output names if required. When you use the XML Source adapter, all the output name selections will be available as downstream paths. When you use an output path from the XML Source adapter, you will be able to choose which output you want to use.
- Create a new Connection Manager pointing to the AdventureWorks database or another test database.
- Add an OLE DB Destination Component to the design area and name it CD Collection _Table.
- Drag the Data Flow Path (blue arrow on the screen) from the XML Source to CD Collection Table.
- Double-click the CD Collection Table icon to bring up the OLE DB Destination Editor. Make sure that the OLE DB Connection Manager property is set to point to your sample database.
- Next to Use Table or View, click New. A window with a table definition will pop up. Click OK to create the table. Click Mappings and then click OK to accept the configuration.
- Add another OLE DB Destination Component and name it Errors. Drag a red Data Flow Path from the XML Source to the Errors Component. Click OK. The Configure Error Output dialog will open.
- In the Truncation property of the Description row, change the value to Redirect row (see below screen shot) and click OK.
- Double-click the OLE DB Destination that you named Errors and make sure it is pointing to the test database. Click New next to “Name of the table or the view,” and then click OK to create an Errors table. Click Mappings to accept the mappings, and then click OK to save the configuration. The Data Flow design area should now resemble below screen shot.
Run the package. If it completed successfully, some of the rows will be added to the NewsItem table. Any row with a description that exceeds 2,000 characters will end up in the Errors table.