Talend Interview Questions
Q. What is the full name of Talend?
Talend Open Studio
Q. What is Talend Open Studio?
Talend Open Studio for Data Integration is an open source data integration product developed by Talend and designed to combine, convert and update data in various locations across a business.
Q. When was Talend Open Studio launched?
Launched in October 2006
Q. In which programming language is Talend Open Studio written?
Java
Q. What is the most current version of Talend Open Studio?
Talend Open Studio 5.6.0
Q. What is the difference between ETL and ELT?
ETL: Extract, Transform, and Load (ETL) is a process that involves extracting data from outside sources, transforming it to fit operational needs (sometimes using staging tables), then loading it into the end target database or data warehouse. This approach is reasonable as long as many different databases are involved in your data warehouse landscape. In this scenario you have to transport data from one place to another anyway, so it’s a legitimate way to do the transformation work in a separate specialized engine.
ELT: Extract, Load, Transform (ELT) is a process where data is extracted, then loaded into a staging table in the database, transforming it where it sits in the database and then loading it into the target database or data warehouse.
Q. What is the use of tLoqateAddressRow component in Talend?
This component is used to correct the mailing addresses associated with customer data, which helps ensure a single customer view and better delivery of customer mailings.
Q. Can I change the background color of the Job designer?
Yes. Change the background color of the Job designer by clicking Preferences on the Window menu, followed by Talend, Appearance, Designer, and then Colors.
Q. Can you define a schema at run time?
No, schemas must be defined during design, not run time.
Q. Can you define a variable that is accessible from multiple Jobs?
Yes, you can declare a static variable in a routine, and add the setter/getter methods for this variable in the routine. The variable is then accessible from different Jobs.
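A minimal sketch of such a routine is shown here. The class name SharedVars and the variable batchId are hypothetical, chosen only for illustration; the pattern is simply a static field with a static setter and getter.

```java
// Hypothetical user routine: a plain Java class with static members.
class SharedVars {
    // Static, so one value is shared by every caller in the same JVM.
    private static String batchId = null;

    // Setter, called by the Job that produces the value.
    public static synchronized void setBatchId(String value) {
        batchId = value;
    }

    // Getter, called by any Job that needs the value.
    public static synchronized String getBatchId() {
        return batchId;
    }

    public static void main(String[] args) {
        SharedVars.setBatchId("2015-01-31");
        System.out.println(SharedVars.getBatchId());
    }
}
```

Because a static field lives inside one Java process, the value is visible only to Jobs that run in the same JVM.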
Q. Can you save your personal settings in the DQ Portal?
No, you can’t.
Q. Can you edit generated code directly?
No, this is not possible; you cannot directly edit the code generated for a Talend Job.
If you want to include your own Java code in a Job, use one of these methods:
- Use a tJava, tJavaRow, or tJavaFlex component.
- Create a routine by right-clicking Routines under Code in the Repository and then clicking Create routine.
Q. Can you use ASCII or Binary Transfer mode in SFTP?
No. Secure (or SSH) File Transfer Protocol (SFTP) is not FTP. It was defined as an extension to SSH and assumes an underlying secure channel. There is no relationship between FTP and SFTP, so concepts such as “transfer mode” or “current remote directory” that exist in FTP do not exist in SFTP.
For the same reason, there is no transfer option when you select ‘SFTP Support’ on a tFTPxxx component.
Q. Which component is used to sort data?
tSortRow
Q. What is the default pattern of a Date column in Talend?
By default, the date pattern for a column of type Date in a schema is “dd-MM-yyyy”.
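As an illustration of what this pattern means, the sketch below applies “dd-MM-yyyy” with Java's standard SimpleDateFormat class, whose pattern letters it follows. The class name DatePatternDemo and its helper methods are hypothetical and are not part of Talend.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical helper showing how the "dd-MM-yyyy" pattern reads and writes dates.
class DatePatternDemo {
    static final String DEFAULT_PATTERN = "dd-MM-yyyy";

    // Parses text such as "31-01-2015"; rejects impossible dates like "31-02-2015".
    static Date parse(String text) {
        SimpleDateFormat fmt = new SimpleDateFormat(DEFAULT_PATTERN);
        fmt.setLenient(false);
        try {
            return fmt.parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a dd-MM-yyyy date: " + text, e);
        }
    }

    // Formats a date as day-month-year, each part zero-padded.
    static String format(Date date) {
        return new SimpleDateFormat(DEFAULT_PATTERN).format(date);
    }

    public static void main(String[] args) {
        Date d = parse("31-01-2015");
        System.out.println(format(d)); // prints 31-01-2015
    }
}
```

Note that “MM” (month) is upper case; lower-case “mm” means minutes in this pattern syntax.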
Q. What is a component?
Basically, a component is a functional piece that performs a single operation. For example, tMysqlInput extracts data from a MySQL table, and tFilterRow filters data based on a condition.
Physically, a component is a set of files stored within a folder named after the component name. All native components are located in:
<Talend Studio installation dir>/plugins/org.talend.designer.components.localprovider_<version>/components/ directory.
Each component is a sub-folder under this directory; the folder name is the component name. Graphically, a component is an icon that you can drag and drop from the Palette to the workspace. Technically, a component is a snippet of generated Java code that is part of a Job, which is a Java class. A Job is made of one or more components or connectors. The Job name will be the class name, and each component in a Job will be translated to a snippet of generated Java code. The Java code will be compiled automatically when you save the Job.
Q. What is the difference between “Insert or Update” and “Update or Insert”?
- Insert or Update: First tries to insert a record, but if a record with a matching primary key already exists, instead updates that record.
- Update or Insert: First tries to update a record with a matching primary key, but if none already exists, instead inserts the record.
From a results point of view, there are no differences between the two, nor are there significant performance differences. In general, choose the action that matches what you expect to be more common: Insert or Update if you think there are more inserts than updates, Update or Insert if you think there are more updates than inserts.
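The two actions can be sketched with an in-memory map standing in for a table keyed by primary key. This is only a simulation of the behavior described above, not the code Talend generates; the class UpsertDemo and all of its members are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical simulation: a map plays the role of a table keyed by primary key.
class UpsertDemo {
    final Map<Integer, String> table = new HashMap<>();
    int statements = 0; // counts every simulated INSERT or UPDATE attempt

    // Simulated INSERT: fails (returns false) when the key already exists.
    boolean insert(int id, String value) {
        statements++;
        if (table.containsKey(id)) return false;
        table.put(id, value);
        return true;
    }

    // Simulated UPDATE: changes nothing (returns false) when the key is absent.
    boolean update(int id, String value) {
        statements++;
        if (!table.containsKey(id)) return false;
        table.put(id, value);
        return true;
    }

    // Insert or Update: try the insert first, fall back to an update.
    void insertOrUpdate(int id, String value) {
        if (!insert(id, value)) update(id, value);
    }

    // Update or Insert: try the update first, fall back to an insert.
    void updateOrInsert(int id, String value) {
        if (!update(id, value)) insert(id, value);
    }

    public static void main(String[] args) {
        UpsertDemo demo = new UpsertDemo();
        demo.insertOrUpdate(1, "first");   // one statement: the insert succeeds
        demo.insertOrUpdate(1, "second");  // two statements: insert fails, then update
        System.out.println(demo.table.get(1) + " after " + demo.statements + " statements");
    }
}
```

Both methods leave the table in the same state; the only difference is that the fallback path costs a second attempt, which is why the more common case should come first.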
Q. What is the difference between Built-In and Repository?
Built-in: all information is stored locally in the Job. You can enter and edit all information manually.
Repository: all information is stored in the repository.
You can import read-only information into the Job from the repository. If you want to modify the information, you must take one of the following actions:
- Convert the information from Repository to Built-in and then edit the built-in information.
- Modify the information in the Repository. Once you have made the changes, you are prompted to update the changes into the Job.
Q. Built-In vs. Repository: which is better?
It depends on how you use the information. Use Built-In for information that you only use once or very rarely. Use Repository for information that you want to use repeatedly in multiple components or Jobs, such as a database connection.
Q. What is the difference between OnSubjobOK and OnComponentOK?
OnSubjobOK and OnComponentOK are trigger links, which can link to another subjob.
The main difference between OnSubjobOK and OnComponentOK lies in the execution order of the linked subjob. With OnSubjobOK, the linked subjob starts only when the previous subjob completely finishes. With OnComponentOK, the linked subjob starts when the previous component finishes.
Q. How can you normalize delimited data in Talend Open Studio?
By using the tNormalize component
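To show what normalizing delimited data means, here is a plain Java sketch that turns one row holding a delimited value into one row per item. It illustrates the idea only and is not the component's implementation; the class NormalizeDemo and its method are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration of normalization: one delimited value becomes many rows.
class NormalizeDemo {
    // Splits the delimited value and emits one {key, item} row per item.
    static List<String[]> normalize(String key, String delimited, String separator) {
        List<String[]> rows = new ArrayList<>();
        // Pattern.quote treats the separator literally, so "|" or "." also work.
        for (String item : delimited.split(Pattern.quote(separator))) {
            rows.add(new String[] { key, item.trim() });
        }
        return rows;
    }

    public static void main(String[] args) {
        for (String[] row : normalize("fruit", "apple, banana, cherry", ",")) {
            System.out.println(row[0] + " | " + row[1]);
        }
    }
}
```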
Q. What is tMap?
tMap is an advanced component that integrates itself as a plugin into Talend Studio. tMap transforms and routes data from single or multiple sources to single or multiple destinations. Its editor lets you define the routing and transformation properties.
Q. What types of joins are supported by the tMap component?
Inner, outer, unique, first, and all joins
Q. What is tDenormalizeSortedRow?
tDenormalizeSortedRow combines all sorted input rows into groups. Distinct values of the denormalized sorted row are joined with item separators. Because the input flow is already sorted, tDenormalizeSortedRow can synthesize it group by group, which saves memory.
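The idea can be sketched in plain Java as a single pass over rows that are already sorted by key. This is an illustration under that assumption, not the component's code; the class DenormalizeSortedDemo and its output format ("key=values") are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration: denormalize rows that arrive sorted by key.
class DenormalizeSortedDemo {
    // Each input row is {key, value}; rows with the same key must be adjacent.
    static List<String> denormalize(List<String[]> sortedRows, String separator) {
        List<String> out = new ArrayList<>();
        String currentKey = null;
        // Only the current group's distinct values are held in memory.
        Set<String> values = new LinkedHashSet<>();
        for (String[] row : sortedRows) {
            if (currentKey != null && !currentKey.equals(row[0])) {
                // Key changed: the previous group is complete, so emit it.
                out.add(currentKey + "=" + String.join(separator, values));
                values.clear();
            }
            currentKey = row[0];
            values.add(row[1]);
        }
        if (currentKey != null) {
            out.add(currentKey + "=" + String.join(separator, values));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] { "a", "1" });
        rows.add(new String[] { "a", "2" });
        rows.add(new String[] { "a", "1" });
        rows.add(new String[] { "b", "3" });
        System.out.println(denormalize(rows, ",")); // prints [a=1,2, b=3]
    }
}
```

Since a group is emitted as soon as the key changes, the sketch never needs to hold more than one group at a time, which is the memory benefit that sorted input makes possible.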
Q. Which Talend component is used to transform data using built-in .NET classes?
tDotNETRow helps you facilitate data transform by utilizing custom or built-in .NET classes.
Q. What is tJoin?
tJoin joins two tables by doing an exact match on several columns. It compares columns from the main flow with reference columns from the lookup flow and outputs the main flow data and/or the rejected data.
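The matching behavior described above can be sketched in plain Java: each main-flow row is looked up by an exact key match and sent either to the joined output or to the rejects. This is an illustration only, using a single key column for brevity; the class JoinDemo, its fields, and the row layout are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of an exact-match join with a reject output.
class JoinDemo {
    final List<String> matched = new ArrayList<>();
    final List<String> rejected = new ArrayList<>();

    // mainRows: {customerId, orderId}; lookup: customerId -> customer name.
    void join(List<String[]> mainRows, Map<String, String> lookup) {
        for (String[] row : mainRows) {
            String name = lookup.get(row[0]); // exact match on the key column
            if (name != null) {
                matched.add(row[1] + ":" + name); // main data enriched by the lookup
            } else {
                rejected.add(row[1]); // no reference row, so the row is rejected
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> customers = new HashMap<>();
        customers.put("c1", "Alice");
        List<String[]> orders = new ArrayList<>();
        orders.add(new String[] { "c1", "order-1" });
        orders.add(new String[] { "c9", "order-2" });

        JoinDemo demo = new JoinDemo();
        demo.join(orders, customers);
        System.out.println("matched=" + demo.matched + " rejected=" + demo.rejected);
    }
}
```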
Q. What do you understand by MDM in Talend?
Master data management (MDM), through which an organization builds and manages a single, consistent, accurate view of key enterprise data, has demonstrated substantial business value, including improvements to operational efficiency, marketing effectiveness, strategic planning, and regulatory compliance. To date, however, MDM has been the privilege of a relatively small number of large, resource-rich organizations. Thwarted by the prohibitive costs of proprietary MDM software and the great difficulty of building and maintaining an in-house MDM solution, most organizations have had to forgo MDM despite its clear value.
Q. What’s new in v5.6?
This technical note highlights the important new features and capabilities of version 5.6 of Talend’s comprehensive suite of Platform, Enterprise and Open Studio solutions.
With version 5.6, Talend:
- Extends its big data leadership position enabling firms to move beyond batch processing and into real-time big data by providing technical previews for the Apache Spark, Apache Spark Streaming and Apache Storm frameworks.
- Enhances its support for the Internet of Things (IoT) by introducing support for key IoT protocols (MQTT, AMQP) to gather and collect information from machines, sensors, or other devices.
- Improves Big Data performance: MapReduce executes on average 24% faster in v5.6 than in v5.5, and 53% faster than in v5.4.2, while Big Data profiling performance is typically 20 times faster in v5.6 compared to v5.5.
- Enables faster updates to MDM data models and provides deeper control of data lineage, more visibility and control.
- Offers further enterprise application connectivity and support by continuing to add to its extensive list of over 800 connectors and components with enhanced support for enterprise applications such as SAP BAPI and Tables, Oracle 12 GoldenGate CDC, Microsoft HDInsight, Marketo and Salesforce.com.