Excel and Access
Recommended by 0 users
SSIS deals with Excel and Access data in a similar fashion because they use the same underlying provider technology for data access. For Microsoft Office 2003 and earlier, the data storage technology is called the JET Engine, which stands for Join Engine Technology; therefore, when you access these legacy releases of Excel or Access, you will be using the JET OLE DB Provider (32-bit only).
Office 2007 introduced a new engine called ACE that is essentially a newer version of the JET but supports the new file formats of Excel and Access. ACE stands for Access Engine and is used for Office 2007 and later. In addition, with the release of Office 2010, Microsoft provided a 64-bit version of the ACE provider. You will find both the 32-bit and 64-bit drivers under the name “Microsoft Office 12.0 Access Database Engine OLE DB Provider” in the OLE DB provider list. Therefore, when connecting to Access or Excel in these versions, you will use the ACE OLE DB Provider. If you have the 64-bit version of Office 2010 or 2013 installed, the next section will also review working with the 32-bit provider, because it can be confusing. Later in this section you will learn how to connect to both Access and Excel for both the JET and ACE engines. 64-Bit Support
In older versions of Office (Office 2007 and earlier), only a 32-bit driver was available. That meant if you wanted to extract data from Excel or Access, you had to run SSIS in 32-bit mode. Beginning with Office 2010, however, a 64-bit version of the Office documents became available that enables you to extract data from Excel and Access using SSIS on a 64-bit server in native mode. In order to use the 64-bit versions of the ACE engine, you need to install the 64- bit access provider either by installing the 64-bit version of Microsoft Office 2010 or later or by installing the 64-bit driver from Microsoft’s download site: http://www.microsoft.com/en-us/download/details.aspx?id=39358.
However, even though a 64-bit version of the ACE provider is available, you cannot develop packages with the 64-bit driver. This is because Visual Studio is a 32-bit application and is unable to see a 64-bit driver. With ACE, if you try to install both 32-bit and 64-bit, you will receive the error shown in below screen shot.
Therefore, the 64-bit driver can be used for a test or production server where packages are not executed through the design-time environment. The approach to using the 64-bit driver is to design your package with the 32-bit driver and then deploy your package to a server that has the 64-bit ACE driver installed.
To be sure, SSIS can run natively on a 64-bit machine (just like it can on a 32- bit machine). This means that when the operating system is running the X64 version of Windows Server 2003, Windows 7, Windows 8, Windows Server 2008, or a future version of Windows Server, you can natively install and run SQL Server in the X64 architecture (an IA64 Itanium build is also available from Microsoft support). When you execute a package in either 64-bit or 32- bit mode, the driver needs to either work in both execution environments or, like the ACE provider, have the right version for either the 32-bit or 64-bit execution mode.
When you install SSIS with the native X64 installation bits, you also get the 32-bit runtime executables that you can use to run packages that need access to 32-bit drivers not supported in the 64-bit environment. When working on a 64-bit machine, you can run packages in 32-bit emulation mode through the SSDT design environment and through the 32-bit version of DTExec. In addition, when using the SSIS Server Catalog in SQL 2014, you are also able to run packages in 32-bit or 64-bit mode.
Here are the details:
Visual Studio 2012: By default, when you are in a native 64-bit environment and you run a package, you are running the package in 64-bit mode. However, you can change this behavior by modifying the properties of your SSIS project. below screen shot shows the Run64bitRuntime property on the Debugging property page. When you set this to False, the package runs in 32-bit emulation mode even though the machine is 64-bit.
- 32-bit version of DTExec: By default, a 64-bit installation of SSIS references the 64-bit version of DTExec, usually found in the C:\Program Files\Microsoft SQL Server\120\DTS\Binn folder. However, a 32-bit version is also included in C:\Program Files (X86)\Microsoft SQL Server\120\DTS\Binn, and you can reference that directly if you want a package to run in 32-bit emulation mode in order to access the ACE and JET providers.
- 32-bit version for packages deployed to SSIS catalog: When running a package that has been deployed to the SSIS 2014 catalog, an advanced configuration option, “32-bit runtime,” will allow your package to be executed in legacy 32-bit execution mode. This option is available both in SQL Agent and in the package execution UI in the SSIS 2014 catalog. The default is to have this option unchecked so that packages run in 64-bit mode.
Be careful not to run all your packages in 32-bit emulation mode when running on a 64-bit machine, just the ones that need 32-bit support. The 32- bit emulation mode limits the memory accessibility and the performance. The best approach is to modularize your packages by developing more packages with less logic in them. One benefit to this is the packages that need 32-bit execution can be separated and run separately.
Working with Excel Files
Excel is a common source and destination because it is often the favorite “database” software of many people without database expertise (especially in your accounting department!). SSIS has Data Flow Source and Destination Components made just for Excel that ease the connection setup, whether connecting to Excel 2003 or earlier or to Excel 2007 or later (the JET and ACE providers).
You can be sure that these components will be used in many SSIS packages, because data is often imported from Excel files into a SQL Server database or exported into Excel for many high-level tasks such as sales forecasting. Because Excel is so easy to work with, it is common to find inconsistencies in the data. For example, while lookup lists or data type enforcement is possible to implement, it is less likely for an Excel workbook to have it in place. It’s often possible for the person entering data to type a note in a cell where a date should go. Of course, cleansing the data is part of the ETL process, but it may be even more of a challenge when importing from Excel.
Exporting to All Versions of Excel
In this section, you will use SSDT to create SSIS packages to export data to Excel files. The first example shows how to create a package that exports a worksheet that the AdventureWorks inventory staff will use to record the physical inventory counts. Follow these steps to learn how to export data to Excel:
- Create a new SSIS package in SSDT and Rename the package Export Excel.dtsx.
- Drag a Data Flow Task from the Toolbox to the Control Flow design area and then switch to the Data Flow tab.
- Add an OLE DB Source Component.
- Create a Connection Manager pointing to the AdventureWorks database.
- Double-click the OLE DB Source Component to bring up the OLE DB Source Editor. Make sure that Connection Manager is selected on the left.
- Choose the AdventureWorks Connection Manager for the OLE DB Connection Manager property. The data access mode should be set to SQL Command. In this case, you will write a query (Excel_Export_SQL.txt) to specify which data to export:SELECT ProductID, LocationID, Shelf, Bin,
Null as PhysicalCount
ORDER by LocationID, Shelf, Bin
- If you select Columns in the left pane, you have the opportunity to deselect some of the columns or change the name of the output columns. Click OK to accept the configuration.
- Drag an Excel Destination Component from the SSIS Toolbox, found under the Other Destinations grouping and drag the Data Flow Path (blue arrow on your screen) from the OLE DB Source to the Excel Destination. Double-click the Excel Destination.
- Click the New button for the Connection Manager, and in the Excel Connection Manager window, choose Microsoft Excel 2007 from the Excel Version dropdown, and then enter the path to your destination (C:\ProSSIS\Data\Inventory_Worksheet.xlsx).
- Select OK in the Excel Connection Manager window, and then click New on the Name of Excel sheet dropdown to create a new worksheet.
- In the Create Table window, you can leave the name of the worksheet or change it and modify the columns as below screen shot shows. Click OK to create the new worksheet.
- The data access mode should be set to Table or View (more about this later). Click OK to create a new worksheet with the appropriate column headings in the Excel file. Make sure that Name of the Excel sheet is set to Inventory Worksheet.
- You must click Mappings on the left to set the mappings between the source and the destination. Each one of the Available Input Columns should match up exactly with an Available Output Column. Click OK to accept the Inventory Worksheet settings.
Run the package to export the product list. The fields selected in the Production.Inventory table will export to the Excel file, and your inventory crew members can each use a copy of this file to record their counts.
Importing from Excel 2003 and Earlier
For this example of importing Excel data, assume that you work for AdventureWorks and the AdventureWorks inventory crew divided up the assignments according to product location. As each assignment is completed, a partially filled-out worksheet file is returned to you. In this example, you create a package to import the data from each worksheet that is received:
- Open SQL Server Data Tools (SSDT) and create a new SSIS package.
- Drag a Data Flow Task to the Control Flow design pane.
- Click the Data Flow and add an Excel Source and an OLE DB Destination Component. Rename the Excel Source to Inventory Worksheet and rename the OLE DB Destination to Inventory Import.
- Drag the blue Data Flow Path from the Inventory Worksheet Component to the Inventory Import Component.
NOTE The OLE DB Destination sometimes works better than the SQL Server Destination Component for importing data from non-SQL Server sources! When using the SQL Server Destination Component, you cannot import into integer columns or varchar columns from an Excel spreadsheet, and must import into double precision and nvarchar columns. The SQL Server Destination Component does not support implicit data type conversions and works as expected when moving data from SQL Server as a source to SQL Server as a destination.
- Create a Connection Manager for the Excel file you have been working with by following the instructions in the previous section (select Microsoft Excel 97-2003 in the Excel version dropdown).
- Rename the Excel Connection Manager in the Properties window to Inventory Source.
- Create a Connection Manager pointing to the AdventureWorks database.
- Double-click the Inventory Worksheet Component to bring up the Excel Source Editor (see Below Screen shot).
- For this example the data access mode should be set to SQL Command because you only want to import rows with the physical count filled in. Type the following query (Excel_Import_SQL.txt) into the SQL command text box (see below screen shot):
SELECT ProductID, PhysicalCount, LocationID, Shelf, Bin
WHERE PhysicalCount IS NOT NULL
- Double-click the Inventory Import Component to bring up the OLE DB Destination Editor. Make sure the AdventureWorks connection is chosen. Under Data access mode, choose Table or View.
- Click the New button next to Name of the table or the view to open the Create Table dialog.
- Change the name of the table to InventoryImport and click OK to create the table. Select Mappings. Each field from the worksheet should match up to a field in the new table.
- Click OK to accept the configuration.
While this is a simple example, it illustrates just how easy it is to import from and export to Excel files.
Importing from Excel 2007 and Later
Setting up an SSIS package to import from Excel 2007 and later is very similar to setting up the connection when exporting to Excel 2007.
When you set up the connection, choose Excel 2007 from the Excel version dropdown (step 5 above). Once you have set up the connection as already shown in above 2 screen shot, you need to create an OLE DB Source adapter in the Data Flow. You can either reference the worksheet directly or specify a query that extracts data from specific Excel columns. Below screen shot shows a worksheet directly referenced, called “Inventory_Worksheet$.”
Working with Access
MS Access is the departmental database of choice for countless individual users and small workgroups. It has many great features and wizards that enable a small application or prototype to be quickly developed. Often, when an application has outgrown its humble Access origins, discussions about moving the data to SQL Server emerge. Many times, the client will be rewritten as a web or desktop application using VB.NET or another language. Sometimes the plan is to link to the SQL Server tables, utilizing the existing Access front end. Unfortunately, if the original application was poorly designed, moving the data to SQL Server will not improve performance. This section demonstrates how you can use SSIS to integrate with Microsoft Access. Access 2007 and later use the same ACE provider as Excel does, so as you work with Access in a 32-bit or 64-bit mode, please refer to the 64-bit discussion of Excel in the previous section.
Configuring an Access Connection Manager
Once the Connection Manager is configured properly, importing from Access is simple. First, look at the steps required to set up the Connection Manager:
- Create a new SSIS package and create a new Connection Manager by right-clicking in the Connection Managers section of the design surface.
- Select New OLE DB Connection to bring up the Configure OLE DB Connection Manager dialog.
- Click New to open the Connection Manager. In the Provider dropdown list, choose one of the following access provider types: Microsoft Jet 4.0 OLE DB Provider (for Access 2003 and earlier) Microsoft Office 12.0 Access Database Engine OLE DB Provider (for Access 2007 and later) If you do not see the Microsoft Office 12.0 Access Database Engine provider in the list, you need to install the 32-bit ACE driver described earlier.
- Click OK after making your selection.
- The Connection Manager dialog changes to an Access-specific dialog. In the server or file name box, enter the path to the Northwind database, C:\ProSSIS\Data\Northwind.mdb, as below screen shot shows. You are using the Northwind MS Access sample database for this example.
- By default, the database user name is blank, with a blank password. If security has been enabled for the Access database, a valid user name and password must be entered. Enter the password on the All pane in the Security section. The user Password property is also available in the properties window. Check the Save my password option.
- If, conversely, a database password has been set, enter the database password in the Password property on the Connection pane. This also sets the ODBC:Database Password property found on the All tab.
- If both user security and a database password have been set up, enter both passwords on the All pane. In the Security section, enter the user password and the database password for the Jet OLEDB:New Database Password property. Check the Save my password option. Be sure to test the connection and click OK to save the properties.
Importing from Access
Once you have the Connection Manager created, follow these steps to import from Access:
- Using the project you created in the last section with the Access Connection Manager already configured, add a Data Flow Task to the Control Flow design area.
- Click the Data Flow tab to view the Data Flow design area. Add an OLE DB Source Component and name it Customers.
- Double-click the Customers icon to open the OLE DB Source Editor. Set the OLE DB Connection Manager property to the Connection Manager that you created in the last section.
- Select Table or View from the Data access mode dropdown list. Choose the Customers table from the list under Name of the table or the view (see Below screen shot).
- Click Columns on the left of the Source Editor to choose which columns to import and change the output names if needed.
- Click OK to accept the configuration.
- Create a Connection Manager pointing to AdventureWorks.
- Create an OLE DB Destination Component and name it NW_Customers. Drag the connection (blue arrow on your screen) from the Customers Source Component to the NW_Customers Destination Component.
- Double-click the Destination Component to bring up the OLE DB Destination Editor and configure it to use the AdventureWorks Connection Manager.
- You can choose an existing table or you can click New to create a new table as the Data Destination. If you click New, notice that the Create Table designer does not script any keys, constraints, defaults, or indexes from Access. It makes its best guess as to the data types, which may not be the right ones for your solution. When building a package for use in a production system, you will probably want to design and create the SQL Server tables in advance.
- For now, click New to bring up the table definition (see below screen shot). Notice that the table name is the same as the Destination Component, so change the name to NW_Customers if you did not name the OLE DB Destination as instructed previously.
- Click OK to create the new table.
- Click Mappings on the left to map the source and destination columns.
- Click OK to accept the configuration.
- Run the package. All the Northwind customers should now be listed in the SQL Server table. Check this by clicking New Query in Microsoft SQL Server Management Studio. Run the following query (Access_Import.txt) to see the results:
SELECT * FROM NW_Customers
- Empty the table to prepare for the next example by running this query:TRUNCATE TABLE NW_CUSTOMERS
Using a Parameter
Another interesting feature is the capability to pass a parameter from a package variable to a SQL command. The following steps demonstrate how:
NOTE In Access, you can create a query that prompts the user for parameters at runtime. You can import most Access select queries as tables, but data from an Access parameter query cannot be imported using SSIS.
- Select the package you started in the last section.
- Navigate back to the Control Flow tab and right-click the design area.
- Choose Variables and add a variable by clicking the Add Variable icon. Name it CustomerID. Change the Data Type to String, and give it a value of ANTON (see below screen shot). Close the Variables window and navigate back to the Data Flow tab.
NOTE The design area or component that is selected determines the scope of the variable when it is created. The scope can be set to the package if it is created right after clicking the Control Flow design area. You can also set the scope to a Control Flow Task, Data Flow Component, or Event Handler Task.
- Double-click the Customers Component to bring up the OLE DB Source Editor and change the data access mode to SQL Command. A SQL Command text box and some buttons appear. You can click the Build Query button to bring up a designer to help build the command or click Browse to open a file with the command you want to use. For this example, type in the following SQL statement
(Access_Import_Parameter.txt) (see below screen shot):
SELECT CustomerID, CompanyName, ContactName, ContactTitle,
Address, City, Region, PostalCode, Country, Phone, Fax
WHERE (CustomerID = ?)
- The ? symbol is used as the placeholder for the parameter in the query. Map the parameters to variables in the package by clicking the Parameters button. Choose User::CustomerID from the Variables list and click OK (see below screen shot).
Note that you cannot preview the data after setting up the parameter because the package must be running to load the value into the parameter.
- Click OK to accept the new configuration and run the package. This time, only one record is imported (see below screen shot).
You can also return to SQL Server Management Studio to view the results:
SELECT * FROM NW_Customers
If you wish to use multiple parameters in your SQL command, use multiple question marks (?) in the query and map them in order to the parameters in the parameter mapping. To do this, you set up a second package-level variable for CompanyName and set the value to Island Trading. Change the query in the Customers Component to the following (Access_Import_Parameter2.txt):
SELECT CustomerID, CompanyName,
ContactName, ContactTitle, Address, City, Region,
PostalCode, Country, Phone, Fax
WHERE (CustomerID = ?) OR
(CompanyName = ?)
Now the Parameters dialog will show the two parameters. Associate each parameter with the appropriate variable (see below screen shot).
Importing data from Access is a simple process as long as Access security has not been enabled. Often, porting an Access application to SQL Server is the desired result. Make sure you have a good Tutorial or other resource to help ensure success.