Are you looking for the SSIS interview questions? Wait! You are on the right track. I have compiled the top 50 SSIS interview questions and answers in this blog to help you crack SSIS interviews effortlessly. Whether you are a novice or an advanced learner, this blog is the best place to upgrade your knowledge of SSIS. Mainly, I have included many scenario-based SSIS questions to make you an expert SSIS developer.
Data transformation and integration are crucial for efficient data exploration in this data-driven world. In this way, SSIS is one of the robust platforms for data transformation and integration solutions. Companies widely employ SSIS for cleaning, mining, and loading data, so the demand for SSIS developers in the job market is constantly rising.
According to Glassdoor, the salary of an SSIS developer in India is over 7 LPA on average. In the USA, they make over 103k USD per year on average, says Talent. So, it is a great choice if you choose to make a career in the SSIS domain. Keeping this in mind, I have curated the top 50 SSIS questions and answers from the various SSIS topics in this blog. It will help you quickly breeze through your SSIS interviews and land your dream career.
We have categorized SSIS Interview Questions - 2024 (Updated) into 9 levels they are:
1. What are the Differences between SSIS and Informatica?
2. What is the SSIS package and what does it do?
3. What is the Difference between Execute TSQL Task and Execute SQL Task?
4. What is the use of a config file in SSIS?
5. What kind of containers can you use with SSIS packages?
6. What is the Manifest file in SSIS?
7. What is the difference between Merge and Union All?
8. How to back up or retrieve the SSIS packages?
9. What is the Different between Control Flow and Data Flow?
10. What is the Data Profiling task?
Before jumping into SSIS interview questions and answers, we will go through the primary and secondary skills required for SSIS developers in this section. Additionally, we will walk through the job responsibilities of SSIS developers at various expertise levels.
Let's look at the essential skills needed for SSIS developers.
Let's outline the secondary skills that are necessary for SSIS developers.
Let’s read through the job responsibilities of fresh SSIS developers.
Let’s find out the job responsibilities of SSIS developers with middle-level expertise.
Let’s look at the job responsibilities of senior-level SSIS developers below.
Well! Understanding the SSIS developer skills and responsibilities will help you prepare for SSIS interviews effectively and excel in your career.
I have compiled the basic SSIS interview questions and answers in this section. This section will provide you with a solid foundation in SSIS packages, package deployment, SSIS control flow and data flow, and many more.
Ans: SSIS or SQL Server Integration Services is the upgraded Data Transformation Services (DTS) platform. SSIS is also an ETL tool that handles data Extraction, Transformation, and Loading. We use SSIS for data profiling and transformation, file system manipulation, etc.
SSIS is a robust platform that supports creating workflows and loading data in data warehouses. We can use SSIS to automate the maintenance of SQL Server databases, update multidimensional cube data, and more.
If you want to enrich your career and become a certified professional in SSIS, then enroll in "SSIS Online Training". This course will help you to achieve excellence in this domain. |
Ans: An SSIS control flow allows developers to program graphically. A control flow includes many operations, such as data flow tasks, execution SQL tasks, etc. It also consists of the order and relationship between the operations.
A control flow has many components, each performing a specific task. The essential elements of a control flow include a sequence container, a ‘for loop’ container, a 'for each' loop container, a task host container, and a few more. Besides, control flow components are interconnected, which helps to control data flow.
Related Article: SSIS Control Flow |
Ans: The data transformation stage in an ETL process includes a set of functions or rules to modify the input data into the required format for a specific purpose. Data transformation includes data cleansing and standardisation to enhance the data quality. Thus, we can make accurate analyses using the transformed data.
For example, we may need the key performance metrics of a specific business by analysing the last six months of business data. However, the data may have many missing values and corrupt data. We must remove these errors first before using the data for further analysis. That’s where data transformation comes into the scene to improve the data quality.
Ans: SSIS supports two types of variables: user-defined variables and system variables. Package developers define user-defined variables, whereas integration services define system variables. We can create user-defined variables for all SSIS container types, such as packages, sequence containers, 'for each' loop containers, etc.
The main thing is that we can create many user-defined variables for a package. But at the same, we cannot create additional system variables.
An SSIS package includes connections, control and data flow elements, variables, event handlers, and configurations. We can assemble these elements in an SSIS package using graphical design tools. For example, we can create SSIS packages using the Business Intelligence Development Studio (BIDS).
We can store SSIS packages in the SSIS package store. SSIS packages can merge data from heterogeneous sources into the SQL Server. We can also use SSIS packages to clean and standardise data, populate data warehouses, and automate administrative tasks.
Ans: SSIS provides four types of containers for building SSIS packages. The containers are sequence, task host, for loop, and 'for-each' loop container.
The sequence container helps to group the same type of tasks. In a way, it is an organisation container that supports building complex SSIS packages.
The task host container provides services to a single task. This container is not configured separately. But it is configured while setting the properties for the task it holds.
The ‘for loop’ container executes tasks for a specific iteration. For example, we can use the ‘for loop’ container to update a record ten times. So, it eliminates the need to create ten different packages for execution. Also, it avoids running the packages ten times.
The ‘for each’ loop container helps to define a control flow in an SSIS Package. The ‘for-each’ enumerator enables the looping in the package. The ‘for-each’ container repeats the control flow for every enumerator member.
Ans: Deployment is the process in which packages convert from development mode into executables mode. For deploying the SSIS package, you can directly deploy the package by right-clicking the Integration Services project and build it.
This will save the package.dtsx file on the projectbin folder. Also, you can create the deployment utility using which the package can be deployed at either SQL Server or as a file on any location.
For creating deployment utility, follow these steps:
SQL Server 2022 (16.x) is the latest version of SQL Server. It was released on 16/11/2022. The support end date is 11/02/2028.
Below are the other details of the newest version of SQL Server.
RTM (no SP) | 16.0.1000.6 |
Latest CU | CU12 (16.0.4115.5,March 2024) |
Ans:
Features | SSIS | Informatica |
Ease of Use | Easy to setup | It's a bit tricky for beginners. |
Maintenance | Easy to maintain | Difficult to maintain |
Productivity | Moderate | High |
Cost | Cost-effective | Expensive |
Support | Community support is good | Need escalation for better support |
Connectivity | Moderate | Offers excellent connectivity |
The following are the enhancements made in the latest SQL server version.
A stored procedure is a pack of SQL statements stored in a relational database with an assigned name. It is a code that we can reuse any number of times.
We use stored procedures to:
The SQL server profiler is an interface with which we can create and manage traces. It helps to analyse and replay trace results. We can save events in a trace file for further analysis. We can also replay the events while diagnosing an issue.
For example, we can use the SQL Server profiler to monitor a production environment tightly. It helps to identify the stored procedures that run slowly and reduce performance.
Performance tuning is the process of examining and resolving the issues affecting an SQL database's efficiency. If we want to improve the performance of SQL queries, we need to tune SQL queries and databases for better results.
Disk I/O, CPU utilisation, and memory utilisation are a few critical metrics used to track the performance of resources. They help to make efficient SQL performance tuning. Moreover, performance tuning helps to improve the reliability and accuracy of the data stored in databases.
SSIS has a robust architecture that sharply separates data transformation from the package control flow. The architecture consists of two engines: the integration services runtime engine and the integration services dataflow engine.
The runtime engine controls the execution of SSIS packages, whereas the data flow engine manages the data flow task. The runtime engine helps to set options for logging, event handlers, and variables. The data flow engine helps to extract, transform, and load data.
Data modelling is the process of evaluating data and their relationships. We perform data modelling to fulfil application data requirements. Data modelling plays a crucial role in storing data in databases.
SQL data modelling includes three levels: conceptual, logical, and physical. We can identify the main entities, attributes, and domain relationships at the conceptual level. We can apply the rules and constraints of database systems at the logical level. The physical model includes the storage, network, hardware, and database security.
Well! You have gone through the basic-level SSIS interview questions and answers. The fundamental knowledge you have gained in this section will be the stepping stone to learning advanced SSIS concepts in the following sections.
I have collated in-depth SSIS interview questions for experienced learners in this section. It will help elevate your SSIS expertise to the next level.
Ans: The manifest file in SSIS consists of the vital information for deployment. It has the metadata of packages, security details, configurations, etc. It is a simple, readable XML file. We need manifest files to deploy SSIS packages to a target destination using a wizard. The advantage of the manifest file is that it comes with a user-friendly interface. The downside of the manifest file is that it lacks flexibility.
Ans: The file system deployment stores package files on a network or local drive. So, we can use the SQL agent job to schedule when the packages should run.
Ans: The msdb database stores SSIS packages. So, if the SQL Server goes down, we can restore the packages from the MSDB database. Additionally, we can back up SSIS packages using the full database backup.
Ans: Know that every data flow task will have at least a single data flow. It allows users to transform, clean, and modify data. A data flow task creates an execution plan from the data flow at runtime. Then, the dataflow engine executes the data flow task.
Further, adding a package control with a data flow task allows the package to extract, transform, and load data.
Ans: Data profiling is the process of analyzing the source data to better understand what condition the data is in, in terms of cleanliness, patterns, numbers or nulls, and so on. data profiling tasks usually be used at the beginning of the development cycle to support the design of the destination database schema. Note that this task is not used when you develop the normal recurring ETL packages.
Ans: We apply the multicast transformation to send input data to multiple output paths. This transformation can send a path to various destinations. The multicast transformation is similar to the split transformation because both send data to various outputs. The main thing about multicast transformation is that we cannot specify which part of the data will have which output.
Ans: The merge transformation merges data from two paths into a single one. The transformation breaks data flow into different paths that handle specific errors. Then, we must merge the paths back into the main data flow downstream once the errors have been removed. It also supports integrating data from two separate data sources. We must also perform data sorting in the merge transformation.
On the other hand, the Union All transformation works like the merge transformation but does not require data sorting. It combines outputs from multiple sources into a single result set.
Ans: The OLE DB command transform is a robust method designed to execute an SQL statement for each row in the input data. The SQL statement can be an inline statement or a stored procedure call. Further, the OLE DB command transform is the same as that of the ADO command object.
Ans: The execute package task enables developers to build SSIS solutions. They are called parent packages that execute other packages called ‘child packages’.
We must separate packages into discrete functional workflows if we want fast-paced development and testing cycles. In SSIS, most of the configurable properties are in the package tab of the executable package task editor.
If the child package is located on an SQL Server, we must use the OLE DB Connection Manager to hold the package. It allows developers to choose the child package within the connection to set the package.
Ans: SSIS provides three options for configuring transactions: Supported, Not Supported, and Required. Let’s take a closer look at them in the following.
Supported: A container does not start a transaction in this configuration option. But at the same time, it joins with any transaction started by its parent container. For example, consider a package with four executable SQL tasks to start a transaction. The tasks update the database after the execution. The updates are rolled back if any task fails. There will be no database updates if the package does not start a transaction.
Not Supported: The container doesn't start a transaction in this configuration option. Similarly, it doesn't join an existing transaction. The parent container transactions do not affect child containers in this configuration.
Required: In this configuration, a container initiates a transaction when no other container has already been started by its parent container. If a transaction already exists, the container joins the transaction. For example, if a package includes a sequence container that uses the ‘Required’ option, the sequence container would start its transaction. The sequence container will join the package transaction if the package is configured to use the ‘Required’ Option.
Ans: SSRS stands for SQL Server Reporting Services. It is a framework with tools like the report builder, report manager, report designer, and report server. It is a web interface that generates interactive reporting solutions in various formats.
SSAS stands for SQL Server Analysis Services. It is a multidimensional tool that supports online analytical processing. It helps to analyse the large amounts of data stored in relational databases.
SSIS stands for SQL Server Integration Services. It comes with ETL capabilities to transform and load data in data warehouses. The critical components of SSIS are the SSIS designer, import and export wizard, and SSIS API programming.
Good! We hope that this section has significantly enhanced your knowledge of SSIS. Now, you are ready to dive deep into the advanced concepts of SSIS packages that are coming up in the following section.
I have composed the advanced SSIS interview questions in this section, which will help to boost your SSIS knowledge to greater heights.
Ans: In the following, let’s examine the differences between the ‘Execute TSQL task’ and ‘Execute SQL Task’.
Execute the TSQL Task:
Execute SQL Task:
Ans: In SSIS, a task will be executed if the condition set by the precedence constraint is met. The constraints allow tasks to take different paths based on other tasks' success or failure.
Ans: A configuration file in SSIS enables the properties of package objects to be updated dynamically during runtime.
We can store configuration files in multiple ways, as listed below:
XML configuration file: Store the config file as an XML file.
Ans:
All process base tasks are part of control flow while ETL related tasks are part of Dataflow which is again a subset of control flow.
Ans: In SSIS, a checkpoint is a crucial property to restart a project after it has failed. We create a checkpoint file by setting the property as ‘True’. A Checkpoint file stores the information about the package execution.
Ans: We use DTExec.exe to run SSIS packages from the command line. We get DTExec.exe and the necessary tools with SQL SERVER 2005 Express Edition. It comes with the Advanced Services and Microsoft SQL Server 2005 Express Edition Toolkit.
Ans: The execute SQL task runs in the SQL Server. The task checks tables in the SQL Server to see whether they have over 1000 rows. We can fail a package manually if it has less than 1000 rows.
Ans: We can use the code below to force a fail inside a SQL statement.
You should get the results you want.
Ans: FYI, I’ve just had the same error.
I switched to Windows authentication, disconnected, then tried to log in again with SQL authentication. This time I was told my password had expired. I changed the password and it all worked again.
In this section, I have gathered a set of SSIS scenario-based interview questions and included the perfect answers. It will help you to improve your hands-on skills and make you job-ready.
Ans: SSIS (SQL Server Integration Services) is an upgrade of DTS (Data Transformation Services), which is a feature of the previous version of SQL Server. SSIS packages can be created in BIDS (Business Intelligence Development Studio). These can be used to merge data from heterogeneous data sources into SQL Server. They can also be used to populate data warehouses, to clean and standardize data, and to automate administrative tasks.
SQL Server Integration Services (SSIS) is a component of Microsoft SQL Server 2005. It replaces Data Transformation Services, which has been a feature of SQL Server since Version 7.0. Unlike DTS, which was included in all versions, SSIS is only available in the “Standard” and “Enterprise” editions.
Integration Services provides a platform to build data integration and workflow applications. The primary use of SSIS is data warehousing as the product features a fast and flexible tool for data extraction, transformation, and loading (ETL).). The tool may also be used to automate the maintenance of SQL Server databases, update multidimensional cube data, and perform other functions.
Ans: I have an SSIS package that has a child package that is failing. The error message isn’t very helpful.
The attempt to add a row to the Data Flow task buffer failed with error code 0xC0047020
The problem seems to be I am running out of Virtual Memory to complete the job.
I found a forum thread that may help solve the problem.
From the solutions offered I am unsure though how to:
Increase default buffer size
Allocate a child package into its own memory allocation.
I am running the package daily in SQL Server 2005. I was running fine daily up until the 12th. I am assuming the dat file that we are using to import data into the database grew to a size that was too large for the database to handle. It’s only an 8.90MB CSV file though. The import is a straight column to column import.
The problem child package in step 1 and fails and continues and successfully completes the next 8 steps.
Ans: The reason I ask is that SSIS memory is allocated from me to leave the area of memory that sits outside of the SQL Server process space.
See HERE for details on configuring the amount of memory available to me to leave a portion of memory.
For generic performance tuning of SSIS consult the following article.
Integration Services: Performance Tuning Techniques
I hope this makes sense but feel free to drop me a line once you have digested the material.
Ans: If you have access to SQL Server Agent through SQL Server Management Studio, here are the steps to create a job using the Graphical User Interface. The steps show how to create an SQL job to run SSIS using SQL Agent Service Account and also how to create a proxy to run under a different using different credentials.
Ans: VARCHAR(MAX) column values will be stored IN the table row, space permitting. So if you have a single VARCHAR(MAX) field and it’s 200, 300 bytes, chances are it’ll be stored inline with the rest of your data. No problem or additional overhead here.
Only when the entire data of a single row cannot fit on a single SQL Server page (8K) anymore, only then will SQL Server move VARCHAR(MAX) data into overflow pages.
So all in all, I think you get the best of both worlds – inline storage when possible, overflow storage when necessary.
Ans: That answer points to this article: SSIS package problem
Here are the proposed solutions – have you evaluated them?
Method 1: First, we must create an SQL Server Agent proxy account. The account will allow the SQL server to run the job using credentials. This account and the account created in the package should be the same. This method can decrypt sensitive information and meet users' critical requirements. This method has some limitations when we move packages from one computer to another.
Method 2: We set the SSIS package ProtectionLevel property to ServerStorage in this method. It allows packages to be stored in the SQL Server database. Also, it allows access control through SQL Server database roles.
Method 3: In this method, we must set the first set of the SSIS package ‘ProtectionLevel’ property. We must set this property as ‘EncryptSensitiveWithPassword’. So, we must use a password for encryption. We must change the ‘SQL Server Agent job step’ command line to use the password.
Method 4: In this method, we use the SSIS package configuration files to store sensitive data. We store the data in a secured folder. If we don’t want to encrypt the package, we must change the ‘ProtectionLevel’ property to ‘Don’tSaveSensitive’ status. The needed information is taken from the configuration file when we run the SSIS package.
Method 5: We must create a package template with a protection level. It must differ from the default settings.
Ans: If your ETL is mostly E and L, with very little T, and if you can write your SPs so they don’t rely on cursors, then going the SP-only route is probably fine.
For more complex processes, particularly those that involve heavy transforms, slowly changing dimensions, data mining lookups, etc, SSIS has three advantages.
First, it manages memory very efficiently, which can result in big performance improvements compared to T-SQL alone.
Second, the graphical interface lets you build large, complex, and reliable transform much more easily than hand-crafted T-SQL.
And third, SSIS lets you more easily interact with additional external sources, which can be very handy for things like data cleansing.
Ans: You are copying some user data from one SQL Server to another. We can call them Alpha and Beta. The SSIS package runs on Beta and gets the Alpha rows that meet a specific condition. The package then adds the rows to Beta's table.
Now, you want to add new rows to the Beta. You can use the code below.
But this doesn’t work in an SSIS package. How will you resolve this problem?
We are adding only new people and are not looking for changed data in existing records. So, we need to store the last ID in the DB. The below code will help us.
Now, we can use the code below to insert the row's last ID transferred.
When selecting the OLEDB source, set the data access mode to SQL Command and apply the below code.
Hope you are using ID int IDENTITY for your Primary Key. We must have the "last changed" column in every table to monitor for data changes in existing records. Besides, we need to store the time of the last transfer.
Ans: Your example seems simple, it looks like you are adding only new people, not looking for changed data in existing records. In this case, store the last ID in the DB.
CREATE TABLE dbo.LAST (RW int, LastID Int)
go
INSERT INTO dbo.LAST (RW, LastID) VALUES (1,0)
Now you can use this to insert the last ID of the row transferred.
UPDATE dbo.LAST SET LastID = @myLastID WHERE RW = 1
When selecting OLEDB source, set data access mode to SQL Command and use
DECLARE @Last int
SET @Last = (SELECT LastID FROM dbo.LAST WHERE RW = 1)
SELECT * FROM AlphaPeople WHERE ID > @Last;
Note, I do assume that you are using ID int IDENTITY for your PK.
If you have to monitor for data changes of existing records, then have the “last changed” column in every table, and store the time of the last transfer.
A different technique would involve setting up a linked server on Beta to Alpha and running your example without using SSIS. I would expect this to be way slower and more resource-intensive than the SSIS solution.
INSERT INTO dbo.BetaPeople
SELECT * FROM [Alpha].[myDB].[dbo].[AlphaPeople]
WHERE ID NOT IN (SELECT ID FROM dbo.BetaPeople)
Ans: There is a 4000-character limit to the DT_WSTR (Unicode string) data type used by SSIS. Since expressions are converted to DT_WSTR, they also cannot exceed 4000 characters.
You’ll get an error if you exceed the limit at design time. However, you will not receive any obvious warnings or errors if you do so at run time. The operation will fail silently, and the expression may be evaluated incorrectly.
Note: this does not mean that strings or SSIS variables cannot exceed 4000 characters. The limitation only applies to expressions and other places that DT_WSTR is used.
Here’s an example that reproduces the issue:
Even though the variable is not empty, the constraint will incorrectly evaluate to False because the expression exceeds 4000 characters. However, this will not cause any errors and the package will not fail.
If you notice strange behavior in your SSIS logic, take a look at your expressions and ensure that they don’t exceed 4000 characters when evaluated.
Ans: I have a SQL Task that needs to run a simple update to update a single row.
I have set the SQLStatement to:
update agency set AgencyLastBatchSeqNo = ? where agencyID = ?
On the Parameter Mapping page, I gave set Parameter 0 and Parameter 1 to variables that I know contain the right values. I have also set the Parameter Name values correctly.
In the database, the column AgencyLastBatchSeqNo is an int, AgencyID is a big int. Does anyone have a reference to find what the data types map to in SSIS? I have guessed at SHORT for the int and LONG for the big int.
When I run the task I get the following error:
[Execute SQL Task] Error: Executing the query “update agency set AgencyLastBatchSeqNo = ? where AgencyID = ?” failed with the following error: “Parameter name is unrecognized.”. Possible failure reasons: Problems with the query, “ResultSet” property not set correctly, parameters not set correctly, or connection not established correctly.
Ans: The answer to this is to change the Parameter Name value in the Parameter Mapping screen.
Given the following query
SELECT Id, AnimalName FROM dbo.Farm WHERE Farm_id =?
Assuming my Parameter is an integer Variable named User:: Farm_id
Choose the following values on the Parameter Mapping Screen
Variable Name: User::Farm_id
Direction: Input
Data Type: LONG
Parameter Name: 0
Parameter Size: -1
Originally the Parameter Name will be “NewParameterName”. Simply change this to the ordinal position of your variable marker (“?”)
Ans: Persist Security = true means that the Password used for SQL authentication is not removed from the ConnectionString property of the connection.
When Integrated Security = true is used then the Persist Security is completely irrelevant since it only applies to SQL authentication, not to windows/Integrated/SSPI.
Ans: From MS:
In Script component code, you use typed accessor properties to access certain package features such as variables and connection managers.
The PreExecute method can access only read-only variables. The PostExecute method can access both read-only and read/write variables.
For more information about these methods, see Coding and Debugging the Script Component. Comparing the Script Task and the Script Component
It looks like Dts is only available in the Script Task.
Here is what the code looks like:
Skip code block
Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper
Public Class ScriptMain
Inherits UserComponent
Dim updateSQL As String
Public Overrides Sub PostExecute()
Me.ReadWriteVariables(“SQL_ATTR_Update”).Value = “Test”
End Sub
Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
‘updateSQL = Row.ITMID + Row.PRCCAT
End Sub
End Class
Ans: On the script tab, make sure you put the variable in either the read-only variables or readwritevariables text boxes.
Here is a simple script that I use to format the errors in a data flow (saved in a RecordSet Variable) into the body of an email. Basically, I read the record set variable into a data table and process it row by row with the for loops. After this task completes I examine the value of uvErrorEmailNeeded to determine if there is anything to email using a conditional process flow connector. You will also need to add a reference to system.xml in your vb script. This is in SQL 2005.
Code block
Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime
Imports System.Xml
Imports System.Data.OleDb
Public Class ScriptMain
Public Sub Main()
Dim oleDA As New OleDbDataAdapter
Dim dt As New DataTable
Dim col As DataColumn
Dim row As DataRow
Dim sMsg As String
Dim sHeader As String
oleDA.Fill(dt, Dts.Variables(“uvErrorTable”).Value)
If dt.Rows.Count > 0 Then
Dts.Variables(“uvErrorEmailNeeded”).Value = True
For Each col In dt.Columns
sHeader = sHeader & col.ColumnName & vbTab
Next
sHeader = sHeader & vbCrLf
For Each row In dt.Rows
For Each col In dt.Columns
sMsg = sMsg & row(col.Ordinal).ToString & vbTab
Next
sMsg = sMsg & vbCrLf
Next
Dts.Variables(“uvMessageBody”).Value = “Error task. Error list follows:” & vbCrLf & sHeader& sMsg & vbCrLf & vbCrLf
End I
Dts.TaskResult = Dts.Results.Success
End Sub
End Class
Ans: There is no best value, it depends greatly on the design of the database, the number of users, the kind of hardware you are operating one, etc. That is why you need to test for yourself with your system.
Ans: If you connect to the Integration Services instance on the server (a different choice in the dropdown from “Database Engine” when you connect in SQL Server Management Studio), they’ll be under the MSDB folder under Stored Packages.
When you start management studio and connect to a database, make sure you have the server type set to Integration Services instead of Database Engine.
Ans: Between your control flow tasks, click on the arrow and choose Edit. When you do this, you get a dialog that allows you to check the “constraint” (success, completion or failure) of the task, an “expression” (i.e. you can have your execute SQL task return a value, store that value in a variable, and check the value of that variable in an expression to determine whether to continue down the path you are editing), an “expression and a constraint”, and an “expression or a constraint”.
These last two are the same except for the logic. “Expression and constraint” requires a true condition on both the expression and the constraint, “expression or constraint” requires a true condition on only one of the expressions and the constraint.
Ans: We have an SSIS job that has been running for over a year with no issue. The job takes a data set from a select statement in an oracle db and transfers the result to a table on a SQL Server 2005 instance.
As of this morning, we receive the following error message:
Error: 2010-05-26 05:06:47.71 Code: 0xC02090F5 Source: [job_name] DataReader Source [793] Description: The component “DataReader Source” (793) was unable to process the data. End Error Error: 2010-05-26 05:06:47.71 Code: 0xC0047038 Source: job_name Description: SSIS Error Code DTS_E_PRIMEOUTPUTFAILED.
The PrimeOutput method on component “DataReader Source” (793) returned error code 0xC02090F5. The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing. There may be error messages posted before this with more information about the failure. End Error Error: 2010-05-26 05:06:47.71 Code: 0xC0047021 Source: P… The package execution fa… The step failed.
Ans: There may be error messages posted before this with more information about the failure.
Did you look for other more specific error messages?
Are you logging errors or steps as they run? If so did you look in the logging table? If you aren’t logging in your SSIS package, I’d set that up as part of this fix, it will make it immeasurably easier to find the problem.
Things I would consider: schema changes, permissions changes, any recent software, operating systems updates to the servers involved, data mismatches (the first time perhaps that the Oracle table held data that couldn’t be inserted into the SQL table – check columns that don't directly match first, string data that might get truncated, dates stored as strings that need to convert to DateTime, etc.).
Ans: Do you mean what are the pros and cons of using SQL Server Agent Jobs for scheduling running SSIS packages and command shell executions? I don’t really know the pros of windows scheduler, so I’ll stick to listing the pros of SQL Server Agent Jobs.
On one occasion we found a situation where Windows Scheduler was able to do something we couldn’t do with a SQL Server Agent Job. During the early days after a SAN migration, we had some scripts for snapshotting and cloning drives that didn’t work in a SQL Server Agent Job.
So we used a Windows Scheduler task to run the code for a while. After about a month, we figured out what we were missing and were able to move the step back to the SQL Server Agent Job.
Regarding SSIS over exe stored procedure calls.
Let me know if you need me to elaborate on any points. Good luck!
Ans: Every column has a boolean property called ColumnName_IsNull (in your case should be Row.ID_IsNull). I think you should set it to true to set the columns value NULL
Ans: Set the property RetainSameConnection on the Connection Manager to True so that a temporary table created in one Control Flow task can be retained in another task.
Here is a sample SSIS package is written in SSIS 2008 R2 that illustrates using temporary tables.
Walkthrough:
Create a stored procedure that will create a temporary table named ##tmpStateProvince and populate with few records. The sample SSIS package will first call the stored procedure and then will fetch the temporary table data to populate the records into another database table.
The sample package will use the database named Sora Use the below create stored procedure script.
Skip code block
USE Sora;
GO
CREATE PROCEDURE dbo.PopulateTempTable
AS
BEGIN
SET NOCOUNT ON;
IF OBJECT_ID(‘TempDB..##tmpStateProvince’) IS NOT NULL
DROP TABLE ##tmpStateProvince;
CREATE TABLE ##tmpStateProvince
(
CountryCode nvarchar(3) NOT NULL
, StateCode nvarchar(3) NOT NULL
, Name nvarchar(30) NOT NULL
);
INSERT INTO ##tmpStateProvince
(CountryCode, StateCode, Name)
VALUES
(‘CA’, ‘AB’, ‘Alberta’),
(‘US’, ‘CA’, ‘California’),
(‘DE’, ‘HH’, ‘Hamburg’),
(‘FR’, ’86’, ‘Vienne’),
(‘AU’, ‘SA’, ‘South Australia’),
(‘VI’, ‘VI’, ‘Virgin Islands’);
END
GO
Create a table named dbo.StateProvince that will be used as the destination table to populate the records from the temporary table. Use the below create table script to create the destination table.
Skip code block
USE Sora;
GO
CREATE TABLE dbo.StateProvince
(
StateProvinceID int IDENTITY(1,1) NOT NULL
, CountryCode nvarchar(3) NOT NULL
, StateCode nvarchar(3) NOT NULL
, Name nvarchar(30) NOT NULL
CONSTRAINT [PK_StateProvinceID] PRIMARY KEY CLUSTERED
([StateProvinceID] ASC)
) ON [PRIMARY];
GO
Create an SSIS package using Business Intelligence Development Studio (BIDS). Right-click on the Connection Managers tab at the bottom of the package and click New OLE DB Connection… to create a new connection to access SQL Server 2008 R2 database.
Click New… on Configure OLE DB Connection Manager.
Perform the following actions on the Connection Manager dialog.
SelectNative OLE DBSQL Server Native Client 10.0 from Provider since the package will connect to SQL Server 2008 R2 database
Enter theServer name, like MACHINENAMEINSTANCE
SelectUse Windows Authentication from Log on to the server section or whichever you prefer.
Select the database from select or enter a database name, the sample uses the database name Sora.
ClickTest Connection
ClickOK on the Test connection succeeded
ClickOK on Connection Manager
The newly created data connection will appear on Configure OLE DB Connection Manager. ClickOK.
OLE DB connection manager KIWISQLSERVER2008R2.Sora will appear under the Connection Manager tab at the bottom of the package. Right-click the connection manager and click Properties
Set the property RetainSameConnection on the connection KIWISQLSERVER2008R2.Sora to the value True.
Right-click anywhere inside the package and then click Variables to view the variables pane. Create the following variables.
A new variable namedPopulateTempTable of data type String in the package scopeSO_5631010 and set the variable with the value EXEC dbo.PopulateTempTable.
A new variable namedFetchTempData of data type String in the package scope SO_5631010and set the variable with the value SELECT CountryCode, StateCode, Name FROM ##tmpStateProvince
Drag and drop an Execute SQL Task onto the Control Flow tab. Double-click the Execute SQL Task to view the Execute SQL Task Editor.
On the General page of the Execute SQL Task Editor, perform the following actions.
Set the name to Create and populate a temp table
Set the connection Type to OLE DB
Set the connection to KIWISQLSERVER2008R2.Sora
SelectVariable from SQLSourceType
SelectUser::PopulateTempTable from SourceVariable
ClickOK
Drag and drop a Data Flow Task onto the Control Flow tab. Rename the Data Flow Task as Transfer temp data to a database table. Connect the green arrow from the Execute SQL Task to the Data Flow Task.
Double-click the Data Flow Task to switch to the Data Flow tab. Drag and drop an OLE DB Source onto the Data Flow tab. Double-click OLE DB Source to view the OLE DB Source Editor.
On the Connection Manager page of the OLE DB Source Editor, perform the following actions.
SelectKIWISQLSERVER2008R2.Sora from OLE DB Connection Manager
SelectSQL command from variable from Data access mode
SelectUser::FetchTempData from Variable name
ClickColumns page
Clicking the Columns page on OLE DB Source Editor will display the following error because the table##tmpStateProvince specified in the source command variable does not exist and SSIS is unable to read the column definition.
To fix the error, execute the statement EXEC dbo.PopulateTempTable using SQL Server Management Studio (SSMS) on the database Sora so that the stored procedure will create the temporary table. After executing the stored procedure, click the Columns page on OLE DB Source Editor, you will see the column information. Click OK.
Drag and drop OLE DB Destination onto the Data Flow tab. Connect the green arrow from OLE DB Source to OLE DB Destination. Double-click OLE DB Destination to open OLE DB Destination Editor.
On the Connection Manager page of the OLE DB Destination Editor, perform the following actions.
SelectKIWISQLSERVER2008R2.Sora from OLE DB Connection Manager
selectable or view – fast load from Data access mode
Select[dbo].[StateProvince] from Name of the table or the view
ClickMappings page
Click Mappings page on the OLE DB Destination Editor would automatically map the columns if the input and output column names are the same. Click OK. Column StateProvinceID does not have a matching input column and it is defined as an IDENTITY column in the database. Hence, no mapping is required.
Data Flow tab should look something like this after configuring all the components.
Click the OLE DB Source on the Data Flow tab and press F4 to view Properties. Set the propertyValidateExternalMetadata to False so that SSIS would not try to check for the existence of the temporary table during the validation phase of the package execution.
Execute the query select * from dbo.StateProvince in the SQL Server Management Studio (SSMS) to find the number of rows in the table. It should be empty before executing the package.
Execute the package. Control Flow shows successful execution.
In Data Flow tab, you will notice that the package successfully processed 6 rows. The stored procedure created early in this posted inserted 6 rows into the temporary table.
Execute the query select * from dbo.StateProvince in the SQL Server Management Studio (SSMS) to find the 6 rows successfully inserted into the table. The data should match with rows founds in the stored procedure.
The above example illustrated how to create and use a temporary table within a package.
Ans: If you are just interested in knowing the columns being processed and not interested in the info for further use, one possible option is making use of the SSIS logging feature. Here is how it works for data flow tasks.
I hope that helps.
Yes, learning SSIS is a cakewalk. Anyone can quickly learn to design SSIS packages and make data integration effortlessly. MindMajix offers SSIS training by highly skilled trainers with industry-designed course modules. Once you complete the training, you will become job-ready and stay ahead of the curve.
Yes, learning SSIS is relevant in 2024. First and foremost, learning SSIS is not time-consuming. You can easily integrate ETL processes and data with SSIS. Above all, big data and data analysis are the buzzwords today. SSIS greatly helps to transfer and transform data.
According to AmbitionBox, Accenture, TCS, Cognizant, Infosys, LTIMindtree, Wipro, Infosys, Deloitte, etc., companies hire SSIS developers for attractive salaries.
Yes, SSIS is a good ETL tool. SSIS is the best tool for developers to manage a large volume of complex data. With SSIS, you can create, schedule, and manage data integration efficiently. SSIS employs a virtual design interface that you can use to create packages to define ETL workflows.
Jotted down are the advantages of SSIS.
It’s time to wrap! We hope this blog’s SSIS interview questions have given you a broad knowledge of SSIS packages, data transformations, ETL design, database design and development, and many more. Now, you are ready to face your SSIS interviews with ultimate confidence.
However, if you go through professional SSIS training, it will help boost your SSIS skills and simplify your efforts to clear SSIS interviews. MindMajix offers top-rated SSIS training, which you can register and gain certification. It will help you elevate your SSIS skills to new heights and make a lot of success stories in your career.
Our work-support plans provide precise options as per your project tasks. Whether you are a newbie or an experienced professional seeking assistance in completing project tasks, we are here with the following plans to meet your custom needs:
Name | Dates | |
---|---|---|
SSIS Training | Jan 21 to Feb 05 | View Details |
SSIS Training | Jan 25 to Feb 09 | View Details |
SSIS Training | Jan 28 to Feb 12 | View Details |
SSIS Training | Feb 01 to Feb 16 | View Details |
Ravindra Savaram is a Technical Lead at Mindmajix.com. His passion lies in writing articles on the most popular IT platforms including Machine learning, DevOps, Data Science, Artificial Intelligence, RPA, Deep Learning, and so on. You can stay up to date on all these technologies by following him on LinkedIn and Twitter.