This recipe shows how to find data issues in a Talend job by watching the data as it flows between components in the Talend debug mode.
Getting ready
Open the jo_cook_ch10_0030_useDebugMode job.
How to achieve it…
The steps for using the Talend debug mode are as follows:
How it works…
Viewing the data in real time as it progresses through the job lets us see that the third row failed. Because the reported error is a null pointer exception and the only null field in that row is the age, we can confirm that the input age value is incorrect.
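As a minimal sketch of how a null age can surface as a null pointer exception, the failure can be reproduced in plain Java. It assumes the age column is a nullable Integer that ends up unboxed to a primitive int; the job's schema is not shown here, so that is an assumption, and the class and method names are hypothetical rather than anything Talend generates.

```java
// Hypothetical illustration; this is not the code Talend generates.
class NullAgeDemo {

    // Unboxing a null Integer into an int throws NullPointerException.
    static int unsafeAge(Integer age) {
        return age; // auto-unboxing calls age.intValue()
    }

    // Guarded version: substitute a default when the input is null.
    static int safeAge(Integer age, int defaultAge) {
        return age == null ? defaultAge : age;
    }

    public static void main(String[] args) {
        Integer[] ages = {34, 27, null, 51}; // made-up data: the third row has no age
        for (int i = 0; i < ages.length; i++) {
            try {
                System.out.println("row " + (i + 1) + ": age=" + unsafeAge(ages[i]));
            } catch (NullPointerException e) {
                System.out.println("row " + (i + 1) + ": failed with NullPointerException");
            }
        }
    }
}
```

A guard such as safeAge is one possible way to keep a row flowing; whether a default value is acceptable depends on the job.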
There’s more…
You will notice that the execution of a job is slowed down considerably by using this method for debugging because of the amount of data that is refreshed on the screen.
For larger datasets, it is better to debug the code using either logging or Java methods.
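One way to keep logging cheap on a large dataset is to print only every Nth row. The following is a rough sketch of that idea in plain Java; the SampledLogger class is hypothetical and not part of Talend, and in a job the same counter logic would live inside a tJavaRow.

```java
// Hypothetical helper: logs one row out of every N to limit console output.
class SampledLogger {
    private final int every; // log one row out of this many
    private long seen = 0;   // rows seen so far

    SampledLogger(int every) {
        if (every < 1) {
            throw new IllegalArgumentException("every must be >= 1");
        }
        this.every = every;
    }

    // Returns true for rows 1, 1 + every, 1 + 2 * every, and so on.
    boolean shouldLog() {
        return (seen++ % every) == 0;
    }

    public static void main(String[] args) {
        SampledLogger logger = new SampledLogger(1000);
        for (int row = 1; row <= 5000; row++) {
            if (logger.shouldLog()) {
                System.out.println("processed row " + row);
            }
        }
    }
}
```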
Occasionally, it is necessary to delve deeper into the Java code generated by Talend in order to locate and understand the cause of a bug. This recipe is a very light introduction to debugging Talend code using the Java debugging mode in Talend.
Getting ready
Open the jo_cook_ch10_0040_useJavaDebugger job.
How to accomplish it…
The steps for using the Java debugger to debug Talend jobs are as follows:
Adding a breakpoint to allow inspection of data:
A Simple Example
As can be seen from the following example, tLogRow has been used to display the contents of a text file.
Basic Settings
The Basic settings tab allows you to control the presentation of this data. You can try the different options, to see which suit your needs.
If you are running your Job in Talend Studio, be careful not to display large volumes of data, as this can make Talend Studio unresponsive.
Function: tJavaRow allows you to enter custom code and integrate it into a Talend program. With tJavaRow, you enter the Java code to be applied to each row of the flow.
Purpose: tJavaRow allows you to broaden the functionality of Talend Jobs using the Java language.
Although tLogRow is flexible and very useful, it does have some limitations, in that it only prints what is defined in a schema. tJavaRow doesn’t have the same limitations. This recipe will show you how it can be utilized.
Getting ready
Open the jo_cook_ch10_0060_tJavaRow job.
How to do it…
The steps for using tJavaRow to display row information are as follows:
// Test for change of key and print heading lines if key has changed
if (Numeric.sequence(input_row.name, 1, 1) == 1) {
    System.out.println("\n\n******************** Records for customer name: " + input_row.name + " ***********************");
    System.out.printf("%-20s %-20s %-30s %-3s \n", "name", "DOB", "timestamp", "age");
}
// Print formatted output fields
System.out.printf("%-20s %-20s %-30s %-3s \n", input_row.name,
    TalendDate.formatDate("dd/MM/yyyy", input_row.dateOfBirth),
    TalendDate.formatDate("dd/MM/yyyy HH:mm:ss", input_row.timestamp), input_row.age + "");
How it works…
System.out.println() is the Java method that prints a line of text to the console, and it is what you will most commonly use when logging with tJava (and tJavaRow). System.out.printf() prints a formatted string.
The if statement uses a sequence generated from the name to test whether this record is the first for the name (sequence is 1). If the sequence is 1, then the heading lines are printed.
The data is then formatted and printed for each line.
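The heading-on-first-row logic can be tried outside Talend with a small stand-in. In this sketch the sequence method only imitates a per-name counter like the one Numeric.sequence is used for above; it is an assumption written for illustration, not Talend's implementation, and the HeadingDemo class, the shortened column list, and the sample names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class HeadingDemo {
    // Stand-in for a named sequence: one counter per sequence name.
    private static final Map<String, Integer> SEQUENCES = new HashMap<>();

    static int sequence(String name, int start, int step) {
        Integer current = SEQUENCES.get(name);
        int next = (current == null) ? start : current + step;
        SEQUENCES.put(name, next);
        return next;
    }

    // Formats one row, preceded by heading lines when it is the first row for the name.
    static String format(String name, int age) {
        StringBuilder out = new StringBuilder();
        if (sequence(name, 1, 1) == 1) {
            out.append(String.format("%n*** Records for customer name: %s ***%n", name));
            out.append(String.format("%-20s %-3s%n", "name", "age"));
        }
        out.append(String.format("%-20s %-3s%n", name, age));
        return out.toString();
    }

    public static void main(String[] args) {
        // Made-up rows: the heading appears once per distinct name.
        System.out.print(format("Smith", 34));
        System.out.print(format("Smith", 35));
        System.out.print(format("Jones", 41));
    }
}
```

Because the counter is keyed by name, the heading is printed the first time a name is seen, wherever that row appears in the input.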
Tip
If you use tJavaRow within a flow, then make sure that you remember to propagate the data using the Generate code option.
Note also that if you simply want to capture the value of, say, a globalMap field, you could add a temporary field to the schema and then use tLogRow. Remember to delete the temporary field once your testing is over.
Function: tJava enables you to enter custom code and integrate it into a Talend program. This code is executed only once.
Purpose: tJava makes it possible to extend the functionality of a Talend Job using custom Java commands.
tJava is a very useful component for logging purposes, because it can be used in its own sub job. This enables tJava to be used to print job status information at given points in the process. The following recipe demonstrates this.
Getting ready
Open the jo_cook_ch10_0070_loggingWithtJava job.
How to do it…
The steps for using tJava to display status messages and variables are as follows:
1. Open tJava_1 and add the following code:
System.out.println("\n\nSearching directory " + context.cookbookData + "chapter10 for files matching wildcard *jo*\n\n");
2. Open tJava_2 and add the following code:
System.out.println("Processing file: "+ ((String)globalMap.get("tFileList_1_CURRENT_FILE")));
3. Open tJava_3 and add the following code:
System.out.println("\n\nCompleted......" + ((Integer)globalMap.get("tFileList_1_NB_FILE")) + " files found\n\n");
How it works…
tJava_1 and tJava_3 simply print out process status information (the start and end of the process). tJava_2, however, is more interesting.
The tFileList component uses an iterate link so that the components following it are executed multiple times. This means tJava_2 is called once for each file found in the source directory that matches the wildcard expression.
Thus tJava_2 is used to log information regarding each of the files being processed, which is a very useful piece of log information.
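The once-per-file behaviour can be imitated in plain Java. This is only a sketch: in a real job tFileList populates the globalMap keys, whereas here the loop sets them itself, and the IterateDemo class and the file names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class IterateDemo {
    // Emulates the iterate loop: one "Processing file" line per file, then a summary line.
    static List<String> run(List<String> files) {
        Map<String, Object> globalMap = new HashMap<>();
        List<String> log = new ArrayList<>();
        int count = 0;
        for (String file : files) {
            // In Talend, tFileList would set this key before each iteration.
            globalMap.put("tFileList_1_CURRENT_FILE", file);
            count++;
            log.add("Processing file: " + (String) globalMap.get("tFileList_1_CURRENT_FILE"));
        }
        // In Talend, tFileList would set this key itself.
        globalMap.put("tFileList_1_NB_FILE", count);
        log.add("Completed......" + (Integer) globalMap.get("tFileList_1_NB_FILE") + " files found");
        return log;
    }

    public static void main(String[] args) {
        // Made-up file names, for illustration only.
        for (String line : run(Arrays.asList("jo_a.txt", "jo_b.txt"))) {
            System.out.println(line);
        }
    }
}
```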
Ravindra Savaram is a Technical Lead at Mindmajix.com. His passion lies in writing articles on the most popular IT platforms including Machine learning, DevOps, Data Science, Artificial Intelligence, RPA, Deep Learning, and so on. You can stay up to date on all these technologies by following him on LinkedIn and Twitter.