
Talend - Administering Files

by Ravindra Savaram
Last modified: February 12th 2021

Moving, Copying, Renaming, and Deleting Files and Folders

As well as reading from and writing to files, Talend has a set of components that allow developers to perform file functions without the need to call native operating system commands. This recipe shows the basic file management components.



  • Getting ready

Open the job jo_cook_ch08_0100_basicFileCommands.

  • How to achieve it…

In the following recipes, note that Talend uses the Linux-style forward slash (/) in file paths, as opposed to the Windows backslash (\).
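One practical reason forward slashes are convenient: Talend generates Java, and in a Java string literal a backslash must be escaped, whereas a forward slash needs no escaping. The following is a minimal sketch only; the helper toTalendPath and the example path are hypothetical and not part of Talend.

```java
// Hypothetical helper, not part of Talend: converts a Windows-style path
// to the forward-slash form used throughout these recipes.
class PathStyle {
    static String toTalendPath(String path) {
        return path.replace('\\', '/');
    }

    public static void main(String[] args) {
        // "C:\\data\\in.txt" is the Java literal for C:\data\in.txt
        System.out.println(toTalendPath("C:\\data\\in.txt")); // prints C:/data/in.txt
    }
}
```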


  • Copying a file to another directory

       step:1-Drag tFileCopy to the job.

       step:2-Set the file name to be


       step:3-Set the output directory to be


      step:4-Run the job, and you will see that the new file has been created. This is a simple copy.

  • Copying files to a different name

      step:1-Open tFileCopy, tick the Rename box, and then add a Destination filename of


        step:2-Run the job, and you will see that there is now a renamed copy of the file.

  • Renaming a file

        step:1-Open tFileCopy, and change the Input filename to be


        step:2-You will see that the input file is in the same directory as the Destination directory.

        step:3-Change the Destination filename to txt.

        step:4-Click the box Remove source file.

        step:5-Run the job, and you will see that the original file has been renamed.

  • Moving a file

This is the same as copying a file to another directory, but with the Remove source file box ticked.

  • Deleting a file

To delete a file, simply add the file path to the tFileDelete component.

  • How it works…

As you can see, tFileCopy is used to copy, move, and rename files, depending on the options selected.

The tFileDelete component is simply used to delete files.
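For readers who think in code, the four operations can be approximated with java.nio.file. This is a sketch of equivalent plain Java, not the code Talend generates; the class and method names (BasicFileCommands, copy, copyAndRemoveSource) and the file names are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Plain-Java analogues of the tFileCopy / tFileDelete options (illustrative only).
class BasicFileCommands {
    // Copy to another directory. Passing newName == null keeps the original
    // name, which corresponds to leaving the Rename box unticked.
    static Path copy(Path source, Path destinationDir, String newName) throws IOException {
        Files.createDirectories(destinationDir);
        String name = (newName == null) ? source.getFileName().toString() : newName;
        return Files.copy(source, destinationDir.resolve(name), StandardCopyOption.REPLACE_EXISTING);
    }

    // Copy, then remove the source. This is a rename when destinationDir is the
    // source's own directory, and a move when it is a different directory.
    static Path copyAndRemoveSource(Path source, Path destinationDir, String newName) throws IOException {
        Path target = copy(source, destinationDir, newName);
        if (!Files.isSameFile(source, target)) {
            Files.delete(source);
        }
        return target;
    }

    static void delete(Path file) throws IOException {
        Files.delete(file);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("fileCommands");
        Path original = Files.write(dir.resolve("original.txt"), "hello".getBytes());

        Path copied = copy(original, dir.resolve("out"), null);                 // simple copy
        Path renamed = copyAndRemoveSource(original, dir, "renamed.txt");       // rename
        Path moved = copyAndRemoveSource(renamed, dir.resolve("moved"), null);  // move
        delete(copied);                                                         // delete

        System.out.println(Files.exists(moved) + " " + Files.exists(original) + " " + Files.exists(copied));
    }
}
```

The point of the sketch is that rename and move are the same operation (copy plus removal of the source); only the destination directory differs.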

  • There’s more…

You should have noticed that the tFileDelete and tFileCopy components allow us to tick boxes to copy and delete directories as required. It goes without saying that the utmost care must be taken when deleting files, and especially when deleting directories, using Talend.
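One way to build that care into custom code is to refuse any delete that falls outside an expected working directory. The sketch below is plain Java, not a Talend feature; the class SafeDirectoryDelete and the idea of an "allowed root" are assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Illustrative guard around a recursive directory delete.
class SafeDirectoryDelete {
    // Deletes dir and everything below it, but only if dir lies strictly
    // inside allowedRoot; otherwise nothing is deleted and an exception is thrown.
    static void deleteRecursively(Path allowedRoot, Path dir) throws IOException {
        Path root = allowedRoot.toRealPath();
        Path target = dir.toRealPath();
        if (target.equals(root) || !target.startsWith(root)) {
            throw new IllegalArgumentException("Refusing to delete " + target + " (not inside " + root + ")");
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(target)) {
            // Reverse order so that children are deleted before their parents.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("safeDelete");
        Path sub = Files.createDirectories(root.resolve("work/nested"));
        Files.write(sub.resolve("data.txt"), "x".getBytes());

        deleteRecursively(root, root.resolve("work"));
        System.out.println(Files.exists(root.resolve("work"))); // prints false
    }
}
```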




Capturing File Information

Another useful Talend feature is the ability to capture information about a file for use in downstream processing, typically to validate the file before it is processed.

  • Getting ready

Open the jo_cook_ch08_0110_fileInformation job.

  • How to accomplish it…

The steps for capturing file information are as follows:

step:1-Drag a tFileProperties component from the right-hand panel.

step:2-Open tFileProperties, and set the file name to


step:3-Drag tFlowToIterate to the canvas, and link the row from tFileProperties to it. Name the row "properties".

step:4-Drag tFileRowCount to the canvas and set the filename to match the tFileProperties component.

step:5-Add an OnSubjobOk trigger from tFileProperties to tFileRowCount, and then another to tFixedFlowInput, so that your job looks like the one shown in the following figure:

[Figure: Capture file]

  • Open tFixedFlowInput.
  • Add ((Long)globalMap.get("properties.size")) to the field fileSize.
  • Add ((Integer)globalMap.get("tFileRowCount_1_COUNT")) to the field numberOfRows.
  • Your tFixedFlowInput should look like the one in the following figure:

[Figure: RUN Window]

Run the job, and you will see the file information in the console.

  • How it works…

The tFileProperties component captures file information and passes the data in a row to the next component. The tFlowToIterate component is used as a shorthand method for adding the file information to globalMap.

The tFileRowCount component counts the number of rows in a file, and presents the count as a globalMap variable.

The final sub job shows the data held in globalMap being used in a process flow.
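The flow described above can be imitated in plain Java. The sketch below is not Talend's generated code: it uses an ordinary HashMap as a stand-in for globalMap, reuses the two key names from the steps, and adds a hypothetical key properties.mtime. It shows why the casts to Long and Integer are needed when reading the values back.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Stream;

// Illustrative stand-in for tFileProperties + tFlowToIterate + tFileRowCount.
class FileInformation {
    static Map<String, Object> capture(Path file) throws IOException {
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("properties.size", Files.size(file));                            // stored as Long
        globalMap.put("properties.mtime", Files.getLastModifiedTime(file).toMillis()); // stored as Long
        // ISO-8859-1 accepts any byte sequence, which is enough for counting lines.
        try (Stream<String> lines = Files.lines(file, StandardCharsets.ISO_8859_1)) {
            globalMap.put("tFileRowCount_1_COUNT", (int) lines.count());               // stored as Integer
        }
        return globalMap;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("fileInformation", ".txt");
        Files.write(file, "a;1\nb;2\nc;3\n".getBytes(StandardCharsets.ISO_8859_1));

        Map<String, Object> globalMap = capture(file);
        // The map holds Object values, so each read needs a cast, as in tFixedFlowInput.
        Long fileSize = (Long) globalMap.get("properties.size");
        Integer numberOfRows = (Integer) globalMap.get("tFileRowCount_1_COUNT");
        System.out.println(fileSize + " bytes, " + numberOfRows + " rows"); // prints 12 bytes, 3 rows
    }
}
```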

  • There’s more…

The final sub job simply prints out some of the information; however, a good real-life example is to check the file size from the properties against the file size written in a file trailer record or a validation file. This would ensure that a file transmitted from a third-party application, for example, is received in its entirety before it is processed by the receiving application.
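As a rough sketch of such a check in plain Java, assume the sender ships a separate validation file whose only content is the expected size in bytes. That file layout, the class FileSizeCheck, and the file names are assumptions for illustration; a real interface would define its own format.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative completeness check: actual size versus the size the sender declared.
class FileSizeCheck {
    // Assumes validationFile contains nothing but the expected byte count, e.g. "12".
    static boolean isComplete(Path dataFile, Path validationFile) throws IOException {
        String declared = new String(Files.readAllBytes(validationFile), StandardCharsets.US_ASCII).trim();
        long expectedSize = Long.parseLong(declared);
        return Files.size(dataFile) == expectedSize;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("fileSizeCheck");
        Path data = Files.write(dir.resolve("data.txt"), "a;1\nb;2\nc;3\n".getBytes(StandardCharsets.US_ASCII));
        Path validation = Files.write(dir.resolve("data.ctl"), "12\n".getBytes(StandardCharsets.US_ASCII));

        System.out.println(isComplete(data, validation)); // prints true
    }
}
```

In a Talend job, the same comparison would use the size value held in globalMap rather than reading it from the file system again.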

  • Tip

One field in tFileProperties can be difficult to use: mtime_string, the file's last-modified datetime, which holds the date as a complex string format. If you need to read this into a date column, use the following expression:

TalendDate.parseDateLocale("EEE MMM dd HH:mm:ss z yyyy",input_row.mtime_string,"EN") 

where EN is the locale that you may need to change.
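The pattern in the tip has the same layout as the output of java.util.Date.toString(). Outside Talend, the same parse can be sketched with SimpleDateFormat; this is a plain-Java counterpart, not the implementation of TalendDate.parseDateLocale, and the sample string is made up.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Illustrative plain-Java parse of a date string such as "Fri Feb 12 10:15:30 GMT 2021".
class MtimeStringParser {
    static Date parse(String mtimeString) throws ParseException {
        // Locale.ENGLISH plays the role of "EN": day and month names are read as English.
        SimpleDateFormat format = new SimpleDateFormat("EEE MMM dd HH:mm:ss z yyyy", Locale.ENGLISH);
        return format.parse(mtimeString);
    }

    public static void main(String[] args) throws ParseException {
        Date modified = parse("Fri Feb 12 10:15:30 GMT 2021");
        System.out.println(modified.getTime()); // milliseconds since the epoch
    }
}
```

If the day and month names in the string are in another language, the locale has to change to match, which is why the locale is a parameter in the Talend call.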


Creating and writing files depending on the input data

Sometimes multiple files must be written from a single data source, with each file name depending on the data held within the row. This recipe shows how this can be achieved.

  • Getting ready

Open the jo_cook_ch08_0140_filesFromInputData job.

  • How to accomplish it…

The steps for creating and writing files depending on the input data are as follows:

        step:1-Run the job, and you will see that the file dummy.txt has been created and populated with six rows.

        step:2-Open the tJavaRow component, and you will see that the move of data from input to output has already been performed.

        step:3-Add in the following code after the generated code:

// test for change of input_row.key
if (Numeric.sequence(input_row.key, 1, 1) == 1) {
    // if this is the first record then do not flush and close - do not want to create dummy.txt
    // otherwise if sequence > 1 then we will close the previous file
    if (Numeric.sequence("all", 1, 1) != 1) {
        outtFileOutputDelimited_1.flush();
        outtFileOutputDelimited_1.close();
    }
    // build the new file name: output directory + key + ".txt"
    // (the directory expression is missing from the source; outputDirectory is a placeholder)
    fileName_tFileOutputDelimited_1 = outputDirectory + "/" + input_row.key + ".txt";
    // create new writer for the new filename. Talend uses this for writing the record
    outtFileOutputDelimited_1 = new java.io.BufferedWriter(new java.io.OutputStreamWriter(
        new java.io.FileOutputStream(fileName_tFileOutputDelimited_1, false), "ISO-8859-15"));
}

Run the job. You will see that, besides the dummy file, there are three additional files: a.txt containing the records with key a, b.txt containing the records with key b, and c.txt containing the records with key c.

  • How it works…

The code in tJavaRow makes use of the fact that Talend code is a series of loops within loops. Because the tJavaRow loop is within the tFileOutputDelimited loop in the generated Java code, we can change variables within the inner loop, which will affect the processing within the outer loop.

The variable that we will change is the writer that Talend uses for the tFileOutputDelimited component.

  • tJavaRow code explained

The Numeric.sequence command uses input_row.key as the sequence name, so a new sequence is created the first time each key value is seen. Thus, when the sequence returns 1, we know that the key has changed. This relies on the input rows being grouped by key.
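The behaviour of a named sequence can be sketched outside Talend with a map of counters. The class below is a stand-in written for illustration, not Talend's Numeric routine; it only mimics the documented behaviour of returning the start value on the first call for a name and adding the step on each later call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative stand-in for Numeric.sequence(name, start, step).
class KeyChangeDetector {
    private final Map<String, Integer> sequences = new HashMap<>();

    // Returns start on the first call for a given name, then previous + step.
    int sequence(String name, int start, int step) {
        Integer current = sequences.get(name);
        int next = (current == null) ? start : current + step;
        sequences.put(name, next);
        return next;
    }

    // Returns the keys at which a change is detected, in input order.
    static List<String> newKeys(List<String> keys) {
        KeyChangeDetector detector = new KeyChangeDetector();
        List<String> changes = new ArrayList<>();
        for (String key : keys) {
            if (detector.sequence(key, 1, 1) == 1) {
                changes.add(key); // first row for this key
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        System.out.println(newKeys(Arrays.asList("a", "a", "b", "b", "c", "c"))); // prints [a, b, c]
    }
}
```

Note that a key that reappears later in unsorted input (a, b, a) would not be detected as a change, because its sequence is already past 1.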

Once we know that the key has changed, we can close the previous file. Then we create a new file name consisting of the output directory plus the input_row.key suffixed with .txt. Thus, if the key changes to a, we create a file named a.txt.

The next statement then creates a new writer for the tFileOutputDelimited component and Talend will use this writer when writing to the output.
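The writer-swapping idea can also be shown as a standalone Java program. This sketch is not the generated Talend job: it tracks seen keys with a Set in place of Numeric.sequence, and the row layout (key, value), the ";" separator, and the class name FilesFromInputData are assumptions. As in the recipe, it expects rows to arrive grouped by key.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.Set;

// Illustrative key-driven file splitting: one output file per key value.
class FilesFromInputData {
    // Each row is {key, value}; rows are assumed to be grouped by key.
    static void write(String[][] rows, Path outputDir) throws IOException {
        Charset charset = Charset.forName("ISO-8859-15");
        Set<String> seenKeys = new HashSet<>();
        BufferedWriter writer = null;
        try {
            for (String[] row : rows) {
                if (seenKeys.add(row[0])) {          // true only the first time a key is seen
                    if (writer != null) {
                        writer.close();              // flushes and closes the previous file
                    }
                    writer = Files.newBufferedWriter(outputDir.resolve(row[0] + ".txt"), charset);
                }
                writer.write(row[0] + ";" + row[1]);
                writer.newLine();
            }
        } finally {
            if (writer != null) {
                writer.close();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        String[][] rows = {
            {"a", "1"}, {"a", "2"}, {"b", "3"}, {"b", "4"}, {"c", "5"}, {"c", "6"}
        };
        Path outputDir = Files.createTempDirectory("filesFromInputData");
        write(rows, outputDir);
        System.out.println(Files.readAllLines(outputDir.resolve("a.txt"), Charset.forName("ISO-8859-15")));
    }
}
```

Unlike the tJavaRow version, this sketch owns its writer, so no dummy file is created before the first key is seen.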

