Java debugger and tJavaRow in Talend

This recipe will show how we can find Talend data issues by watching the data as it flows between components using the Talend debug mode.


Using the Talend debug mode – row-by-row execution

Getting ready

Open the jo_cook_ch10_0030_useDebugMode job.

How to achieve it…

The steps for using the Talend debug mode are as follows:

  • Open the run tab, and select the Debug Run option on the left-hand side as shown in the following screenshot:
  • Click on Traces Debug. The job will execute, and you can watch the data in each row as it progresses along the main flow of the sub-job until the error is hit and the job fails.


How it works…

Viewing the data as it progresses through the job in real time allows us to see that the third row failed. Because the reported error is a null pointer exception and the only field in that row holding a null value is age, we can conclude that the input age value is the cause.
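The failure described above can be reproduced outside Talend. The following is a minimal sketch, assuming that age is held as a nullable Integer and that the failing expression unboxes it. The class name, the method names, and the sample ages are hypothetical and are not taken from the job.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone sketch; this is not the code Talend generates.
class NullAgeDemo {

    // Unboxing a null Integer into an int throws NullPointerException.
    static int ageNextYear(Integer age) {
        return age + 1;
    }

    // Returns the zero-based index of the first row whose age fails, or -1 if none do.
    static int firstFailingRow(List<Integer> ages) {
        for (int i = 0; i < ages.size(); i++) {
            try {
                ageNextYear(ages.get(i));
            } catch (NullPointerException e) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Assumed sample data: the third row carries a null age.
        List<Integer> ages = Arrays.asList(23, 41, null, 35);
        System.out.println("First failing row index: " + firstFailingRow(ages)); // index 2, the third row
    }
}
```

A null check placed before the arithmetic would let such a row be logged or rejected instead of stopping the job.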

There’s more…

You will notice that the execution of a job is slowed down considerably by using this method for debugging because of the amount of data that is refreshed on the screen.

In case of larger datasets, it is better to use either logging or Java methods to debug the code.


Using the JAVA DEBUGGER to debug Talend jobs

Occasionally, it is necessary to delve deeper into the Java code generated by Talend in order to locate and understand the cause of a bug. This recipe is a very light introduction to debugging Talend code using the Java debugging mode in Talend.


Getting ready

Open the jo_cook_ch10_0040_useJavaDebugger job.

How to accomplish it…

The steps for using the Java debugger to debug Talend jobs are as follows:

  • Select the Debug Run option from the Run dialogue and click on the down arrow for the run type. Select Java Debug to run using the Java option.
  • Confirm the perspective switch by clicking Yes.
  • Click the resume icon to start the job running.
  • The job will execute and return an error. Scroll through the console output (bottom panel), and you will see the error, as shown in the following screenshot:
  • Click the hyperlink for line 2574. This will take you to the line that is causing an error.

Adding a breakpoint to allow inspection of data:

  • Right-click on the line number, and select Toggle Breakpoint. The line now has a blue button next to it.
  • Run the job again using the Debug button, then the resume button, and it will stop before the line is executed.
  • Highlight the code age using the mouse, then right-click and select Inspect. You will see that the value is 23.
  • Right-click again and select the option Watch. Repeat this for the customer name. You will see that these fields have been added to the watch list in the top right-hand corner, as shown in the following screenshot:
  • Click the resume button twice more, and you will eventually hit the row where the value is null. This will give you the name of the customer (J Smith) for the erroneous age.
  • End the job by clicking on the resume button.


Using tLogRow:

tLogRow Component Reference

  • The tLogRow component is part of the Logs & Errors family of components.
  • tLogRow allows you to write the data (rows) flowing through your Job to the console.

A Simple Example

As can be seen from the following example, tLogRow has been used to display the contents of a text file.


Basic Settings

The Basic settings tab allows you to control the presentation of this data. You can try the different options, to see which suit your needs.


Debugging

If you are running your Job in Talend Studio, be careful not to display large volumes of data, as this can make Talend Studio unresponsive.
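One way to keep console output manageable is to print only a sample of the rows. The following is a minimal sketch in plain Java, assuming a hypothetical RowSampler helper. It is not a tLogRow setting or a Talend routine; similar counting logic could be placed in custom Java code within a job.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of Talend.
class RowSampler {
    private final int maxRows;
    private int seen = 0;

    RowSampler(int maxRows) {
        this.maxRows = maxRows;
    }

    // Counts the row and returns true only for the first maxRows rows.
    boolean shouldPrint() {
        seen++;
        return seen <= maxRows;
    }

    int rowsSeen() {
        return seen;
    }

    // Prints at most maxRows rows and returns the rows that were printed.
    static List<String> sample(List<String> rows, int maxRows) {
        RowSampler sampler = new RowSampler(maxRows);
        List<String> printed = new ArrayList<>();
        for (String row : rows) {
            if (sampler.shouldPrint()) {
                System.out.println(row);
                printed.add(row);
            }
        }
        int hidden = sampler.rowsSeen() - printed.size();
        System.out.println("... " + hidden + " more rows not shown");
        return printed;
    }

    public static void main(String[] args) {
        // Assumed sample rows for illustration only.
        sample(Arrays.asList("row 1", "row 2", "row 3", "row 4"), 2);
    }
}
```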

Using TJavaRow to display row information

Function: tJavaRow allows you to enter customized code that is integrated into a Talend program. With tJavaRow, you enter the Java code to be applied to each row of the flow.

Purpose: tJavaRow allows you to broaden the functionality of Talend Jobs using the Java language.

Although tLogRow is flexible and very useful, it does have some limitations, in that it only prints what is defined in a schema. tJavaRow doesn’t have the same limitations. This recipe will show you how it can be utilized.

Getting ready

Open the jo_cook_ch10_0060_tJavaRow job.

How to do it…

The steps for using tJavaRow to display row information are as follows:

  • Run the job. You will see the data in the console output sorted by customer key.
  • Remove the tLogRow component, and add a tJavaRow component in its place.
  • Open the tJavaRow component and add the following code:
// Test for a change of key and print heading lines if the key has changed
if (Numeric.sequence(input_row.name, 1, 1) == 1) {
    System.out.println("\n\n******************** Records for customer name: " + input_row.name + " ***********************");
    System.out.printf("%-20s %-20s %-30s %-3s \n", "name", "DOB", "timestamp", "age");
}
// Print formatted output fields
System.out.printf("%-20s %-20s %-30s %-3s \n", input_row.name,
    TalendDate.formatDate("dd/MM/yyyy", input_row.dateOfBirth),
    TalendDate.formatDate("dd/MM/yyyy HH:mm:ss", input_row.timestamp),
    input_row.age + "");
  • Run the job, and you will see that the simple list of records is now grouped within headings, as shown in the following screenshot:

How it works…

System.out.println() is the Java method that prints a line of text to the console, and it is what you will most commonly use when logging with tJava and tJavaRow. System.out.printf() prints a formatted string.
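The column layout produced by a format string such as the one in this recipe can be checked in isolation. The following is a small sketch, assuming a hypothetical formatRow helper, shortened column widths, and made-up sample values; only the %-Ns specifiers mirror the recipe.

```java
// Hypothetical stand-alone sketch of left-justified, fixed-width columns.
class PrintfDemo {

    // %-20s left-justifies a string in a 20-character column; %-12s and %-3s
    // do the same in 12 and 3 characters. Shorter values are padded with spaces.
    static String formatRow(String name, String dob, String age) {
        return String.format("%-20s %-12s %-3s", name, dob, age);
    }

    public static void main(String[] args) {
        // Heading line followed by one assumed sample record.
        System.out.println(formatRow("name", "DOB", "age"));
        System.out.println(formatRow("J Smith", "01/02/1990", "23"));
    }
}
```

Because every row goes through the same format string, the heading and the data line up in the console regardless of the length of each value, provided no value exceeds its column width.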

The if statement uses a sequence generated from the name to test whether this record is the first for the name (sequence is 1). If the sequence is 1, then the heading lines are printed.
The data is then formatted and printed for each line.
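The heading logic can be tried outside Talend with a stand-in for the sequence routine. The following is a minimal sketch, assuming that the routine returns the start value on the first call for a given sequence name and adds the step on each later call. The sequence helper, the class name, and the sample rows are hypothetical; the real job calls Numeric.sequence.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone sketch; the Map-based sequence only imitates
// the behaviour assumed for Talend's Numeric.sequence routine.
class GroupHeadingDemo {

    // Returns start on the first call for a name, then the previous value plus step.
    static int sequence(Map<String, Integer> sequences, String name, int start, int step) {
        Integer current = sequences.get(name);
        int next = (current == null) ? start : current + step;
        sequences.put(name, next);
        return next;
    }

    // Renders rows of {name, age}, adding a heading before the first row of each name.
    static String render(String[][] rows) {
        Map<String, Integer> sequences = new HashMap<>();
        StringBuilder out = new StringBuilder();
        for (String[] row : rows) {
            if (sequence(sequences, row[0], 1, 1) == 1) {
                out.append("\n*** Records for customer name: ").append(row[0]).append(" ***\n");
            }
            out.append(String.format("%-20s %-3s\n", row[0], row[1]));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumed sample rows, already sorted by name.
        String[][] rows = { {"A Jones", "31"}, {"A Jones", "32"}, {"J Smith", "23"} };
        System.out.print(render(rows));
    }
}
```

Because the sequence is keyed on the name itself, a heading is printed once per distinct name, which is why sorted input yields tidy groups.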

Tip

If you use tJavaRow within a flow, then make sure that you remember to propagate the data using the Generate code option.

Note also that, if you simply want to capture the value of, say a globalMap field, you could just add a temporary field to the schema, and then use tLogRow, but remember to delete the temporary fields once your testing is over.

Using tJava to display status messages and variables

Function: tJava enables you to enter custom code and integrate it into a Talend program. This code is executed only once.

Purpose: tJava makes it possible to extend the functionality of a Talend Job using custom Java commands.

tJava is a very useful component for logging purposes, because it can be used in its own sub job. This enables tJava to be used to print job status information at given points in the process. The following recipe demonstrates this.

Getting ready

Open the jo_cook_ch10_0070_loggingWithtJava job.

How to do it…

The steps for using tJava to display status messages and variables are as follows:

1. Open tJava_1 and add the following code:

System.out.println("\n\nSearching directory " + context.cookbookData
    + "chapter10 for files matching wildcard *jo*\n\n");

2. Open tJava_2 and add the following code:

System.out.println("Processing file: "+ ((String)globalMap.get("tFileList_1_CURRENT_FILE")));

3. Open tJava_3 and add the following code:

System.out.println("\n\nCompleted......" + ((Integer)globalMap.get("tFileList_1_NB_FILE"))
    + " files found\n\n");

How it works…

tJava_1 and tJava_3 simply print process status information (the start and end of the process). tJava_2, however, is more interesting.

The tFileList component uses an iterator link to enable the components following it to be executed multiple times. This means tJava_2 is called once for each file found in the source directory that matches the wildcard expression.

Thus tJava_2 is used to log information regarding each of the files being processed, which is a very useful piece of log information.
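The once-per-file behaviour can be pictured with a plain Java loop. The following is a simplified sketch, assuming an ordinary HashMap in place of Talend's globalMap and a for loop in place of the iterate link. The class name, the run method, and the sample file names are hypothetical; only the two globalMap keys come from the recipe.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-alone sketch; this is not the code Talend generates.
class FileListLoggingDemo {

    // Returns the log lines that the three tJava components would print.
    static List<String> run(List<String> files) {
        Map<String, Object> globalMap = new HashMap<>();
        List<String> log = new ArrayList<>();

        // tJava_1: runs once, before the iteration starts.
        log.add("Searching for files");

        int count = 0;
        for (String file : files) {
            // Stand-in for tFileList publishing the current file on each iteration.
            count++;
            globalMap.put("tFileList_1_CURRENT_FILE", file);
            // tJava_2: runs once per file.
            log.add("Processing file: " + (String) globalMap.get("tFileList_1_CURRENT_FILE"));
        }
        globalMap.put("tFileList_1_NB_FILE", count);

        // tJava_3: runs once, after the iteration has finished.
        log.add("Completed......" + (Integer) globalMap.get("tFileList_1_NB_FILE") + " files found");
        return log;
    }

    public static void main(String[] args) {
        // Assumed sample file names for illustration only.
        for (String line : run(Arrays.asList("jo_a.txt", "jo_b.txt"))) {
            System.out.println(line);
        }
    }
}
```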

Last updated: 03 Apr 2023
About Author

Ravindra Savaram is a Technical Lead at Mindmajix.com. His passion lies in writing articles on the most popular IT platforms including Machine learning, DevOps, Data Science, Artificial Intelligence, RPA, Deep Learning, and so on. You can stay up to date on all these technologies by following him on LinkedIn and Twitter.
