Accessing Heterogeneous Data
What’s In This Topic?
- Dealing with Excel and Access data
- Integrating with Oracle
- Working with XML files and web services
- Extracting from flat files
- Integrating with ODBC
In this Topic, you will learn about importing and working with data from heterogeneous, or various non–SQL Server, sources. In today’s enterprise environments, data may exist in many diverse systems, such as Oracle, DB2, Teradata, SQL Azure, SQL Parallel Data Warehouse (PDW), Office documents, XML, or flat files, to name just a few. The data may be generated within the company, or it may be delivered through the Internet from a trading partner. Whether you need to import data from a spreadsheet to initially populate a table in a new database application or pull data from other sources for your data warehouse, accessing heterogeneous data is probably a big part of your job.
You can load data into SQL Server using SSIS through any ODBC-compliant, OLE DB–compliant, or ADO.NET managed source. Many ODBC, OLE DB, and .NET providers are supplied by Microsoft for sources like Excel, Access, DB2, FoxPro, Sybase, Oracle, Teradata, and dBase. Others are available from database vendors. A variety of Data Source Components are found in SSIS. These include Excel, Flat File, XML, ADO.NET (which is used to connect to .NET Sources), OLE DB (which allows connections to many different types of data), and Raw File (a special source used to read data that has been previously exported to a Raw File Destination). If the supplied Data Sources do not meet your needs, you can also create custom Data Sources.
SSIS can consume many of these sources from out-of-the-box features. In addition, Microsoft has also provided a set of free downloads in the SQL Server feature pack for advanced data source extraction. They include a set of source components from Attunity, third-party components that Microsoft has licensed for use with SSIS. The Attunity connectors allow advanced sourcing from Oracle (with bulk load capabilities), Teradata, and ODBC sources.
Below screen shot shows highlights the Source Assistant within the Data Flow Toolbox. It shows the various source options within SSIS. Many of them require the installation of a client tool; the gray information window at the bottom of the figure describes where to find the additional application if required.
- Excel and MS Access (versions 2013 and earlier): Excel is often used as a quick way to store data because spreadsheets are easy to set up and use. Access applications are frequently upsized to SQL Server as the size of the database and number of users increase.
- Oracle: Even companies running their business on Oracle or another of SQL Server’s competitors sometimes make use of SQL Server because of its cost-effective reporting and business intelligence solutions. XML and Web Services: XML and web services (which is XML delivered through HTTP) are standards that enable very diverse systems to share data. The XML Data Source enables you to work with XML as you would with almost any other source of data.
- Flat Files: Beyond just standard delimited files, SSIS can parse flat files of various types and code page encoding, which allows files to be received from and exported to different operating systems and non-Windows-based systems. This reduces the need to convert flat files before or after working with them in SSIS.
- ODBC: Many organizations maintain older systems that use legacy ODBC providers for data access. Because of the complexities and cost of migrating systems to newer versions, ODBC is still a common source.
- Teradata: Teradata is a data warehouse database engine that scales out on multiple nodes. Large organizations that can afford Teradata’s licensing and ongoing support fees often use it for centralized warehouse solutions.
- Other Heterogeneous Sources: The sources listed previously are the most common; however, this only touches on the extent of Data Sources that SSIS can access. The last section of this Topic provides third-party resources for when you are trying to access other sources such as SAP or Sybase.