Data Flow Task
The most important task in SSIS is the Data Flow Task. The SSIS Data Flow Task can be selected directly from the SSDT Toolbox, and then the source and destinations are defined within the task. The Data Flow Task isn’t merely a mapping transform for input and output columns. This task has its own design surface like the Control Flow, where you can arrange task-like components called transforms to manipulate data as it flows in a pipeline from the source to a destination. The Data Flow, as you can imagine, is the heart of SSIS, because it encapsulates all the data transformation aspects of ETL.
Data Flows can split the data in the pipeline based on a data element and handle each stream separately. In the Data Flow, the header line of the file can be split off and examined separately from the detail lines. As the pipeline exits the data-cleansing process, the streams can be sent to separate destinations or converged to a final combined destination. Note that you may have several different Data Flows within an SSIS package. For each of the Data Flow Tasks, you add to the control surface, you’ll have a corresponding Data Flow surface. This task is so important and such an integral part of moving data in SSIS.
If you would like to become a SQL Server Integration Services (SSIS) certified professional, then visit Mindmajix - A Global online training platform: "SSIS Training and Certification Course" . This course will help you to achieve excellence in this domain.
SSIS (SQL Server Integration Services) is an integral part of the Microsoft SQL Server Database software. The Data Flow Task in SSIS is one of the most important and intricate tasks, and also one of the reasons for SSIS to be considered as fast ETL tools. In this blog, we'll dig deeper into the SSIS Data Flow Task on a conceptual level, in the following order:
- What is data flow in SSIS?
- SSIS Data Flow – Component Types, Source, Destination
- How to add an SSIS Data Flow Task?
What is data flow in SSIS?
Data flow in SSIS defines the flow of data from a source to a destination. The Data Flow Task is the most frequently used task in SSIS, which contains the data transformation logic (ETL processes). It moves the data from source to destination and adds transforms in them to merge, update or split data.
A Data Flow Task comprises multiple data flows, and a data flow consists of various sources and destinations. Collectively with a connection manager, data flow ensures connection to a transformation, data source, and data transfer to a particular destination. Adding a Data Flow Task to a package control flow makes it possible for the package to extract, transform, and load data.
At run time, the Data Flow Tasks build an execution plan from the data flow, and the data flow engine implements the plan. We can create a Data Flow Task without data flow, but the task will perform only if it consists of at least one data flow.
The below diagram depicts the Data Flow Task with multiple data flows. The second data flow consists of various sources and destinations.
If a task copies various sets of data, and the order in which the data copied is not significant, then we include multiple data flows in the Data Flow Task. Through that, the data flow engine finds the execution order of multiple data flows within individual Data Flow Task. For finding the order, the package should make use of multiple Data Flow Tasks.
SSIS Data Flow - Component Types, Sources, and Destinations
In SSIS toolbox, components are classified into Data connection and data transformation.
1. Data connection
1.1. Favorites – Here, we can find the source and destination assistant for creating a provider to a database table or some file.
Subscribe to our youtube channel to get new updates..!
1.2. Other Sources – This helps you to choose the connection provider. For example, choosing OLE DB Source to connect to SQL Server.
- CDC Source
- ADO NET Source
- Flat File Source
- Excel Source
- OLE DB Source
- ODBC Source
- XML Source
- Raw File Source
1.3. Other Destination – This helps to choose the connection provider. For example, we would select the OLE source to connect to SQL Server. Data imports to destination.
- Data Mining Model Training
- ADO NET Destination
- Dimension processing
- DataReader Destination
- Flat File Destination
- OLE DB Destination
- Excel Destination
- ODBC Destination
- Raw File Destination
- Partion Processing
- SQL Server Compact Destination
- Recordset Destination
- SQL Server Destination
2. Data transformation
2.1. Common – Here, we will find the most commonly used tools for data transformation, such as data conversion, derived column, etc.
- Data Conversion
- Balanced Data Distributor
- Aggregate component
- Conditional Split
- Derived Column
- Lookup component
- Data Streaming Destination
- HDFS File Destination
- HDFS FIle Source
- Merge Join component
- Merge component
- OLE DB Command
- Sort component
- Multicast component
- OData Source
- Script Component
- Slowly Changing dimension
- Row Count component
- Union All component
2.2. Other Transformations
- Character Map
- Data Mining Query
- CDC Splitter
- Cache Transform
- Copy Column
- DQS Cleansing
- Fuzzy Grouping
- Export Column
- Fuzzy Lookup
- Import Column
- Percentage Sampling
- Term Lookup
How to add an SSIS Data Flow Task?
1. For this, we'll work with the package DataFlowTask.dtsx for creating an SSIS Data Flow Task and use the flat file as a data source and a SQL Server table as the destination
2. The SSIS Data Flow Task is an exceptional task that has its own tab in the Integrated Development Environment (IDE) as shown below. If you click on the tab, you will see a new SSIS Toolbox comprising Data Flow specific components.
3. Next, Drag & drop the Data Flow Task into the Control Flow of the package then click "Data Flow" tab. SSIS Data Flows are typically used to move data from one location to another.
4. Drag and drop source assistant into data flow editor. Next, click on the flat file and choose the flat file connection manager and click on the OK button. Using the data flow design panel or by drag and drop file sources from other sources in the toolbox directly into the design panel, we can build a flat file source.
5. Drag and drop OLE DB Destination into Data Flow design panel. Next, choose the Flat File Source and put its dark blue arrow onto the OLE DB Destination.
6. Right-click on the OLE DB Destination and select the Edit option to open the OLE DB Destination Editor. Then click on the New option and select localhost.Test.
7. Next, select the student table in the table or the view drop-down list.
8. Click the Mappings tab and in the Input Column, choose "Column 0" for name and Column 1 for age.
9. Click the OK button. Then Run the package.
10. From the result, you can see three rows added. Check the student table in the test database, and you can see the three records inserted.
Configuring the Data Flow:
We can configure a Data Flow Task by adding the components to the Data Flow tab. Various data flow components supported by SSIS are as follows:
- Sources: Where the data comes from
- Destinations: Where you want to put the data
- Transformations: How you can modify the data
The Data Flow Task will start from a source and end with a destination, but not always. We can include any no of transformations required to prepare the data for the destination.
In this blog, we shared an introduction to SSIS Data Flow Task and how to add a Data Flow Task with only a source and a destination. We hope you’ll now be able to compose your first SSIS Data Flow!
Explore SQL Server Integration Services (SSIS) Sample Resumes! Download & Edit for Free..! Download Now!
List of Related Microsoft Certification Courses:
|SCCM||SQL Server DBA|
|Team Foundation Server||BizTalk Server Administrator|