Function: tFilterRow filters input rows by setting conditions on the selected columns.
Purpose: tFilterRow helps parameterize filters on the source data.
The following scenario is a Java Job that uses a simple condition and a regular expression to filter a list of records. This scenario will output two tables: the first will list all Italian records where first names are shorter than six characters; the second will list all rejected records. An error message for each rejected record will be displayed in the same table to explain why such a record has been rejected.
Drop tFixedFlowInput, tFilterRow and tLogRow from the Palette onto the design workspace.
Connect the tFixedFlowInput to the tFilterRow, using a Row > Main link. Then, connect the tFilterRow to the tLogRow, using a Row > Filter link.
Drop tLogRow from the Palette onto the design workspace and rename it as reject. Then, connect the tFilterRow to the reject, using a Row > Reject link.
Double-click tFixedFlowInput to display its Basic settings view and define its properties.
Select the Use Inline Content(delimited file) option in the Mode area to define the input mode.
In the Value column, you must type in your values between double quotes for all data types, except for the Integer type, which does not need quotes.
Thus, the first table lists the Italian records whose first names are shorter than six characters, and the second table lists all rejected records, that is, the records that do not match the filter condition. Each rejected record has a corresponding error message that explains the reason for its rejection.
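The filter and reject flows described in this scenario can be sketched in plain Java. This is only an illustration of the logic, not the code Talend generates: the column names firstName and country, the sample rows, the wording of the error messages and the regular expression are all assumptions, since the scenario does not list them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class FilterRowSketch {
    // Hypothetical regular expression: one to five characters, i.e. shorter than six.
    static final Pattern SHORT_NAME = Pattern.compile("^.{1,5}$");

    /** Returns null when the row passes the filter, otherwise the reason for rejection. */
    static String check(String firstName, String country) {
        List<String> errors = new ArrayList<>();
        // Simple condition (assumed column): the record must be Italian.
        if (!"Italy".equals(country)) {
            errors.add("country is not Italy");
        }
        // Regular expression condition on the first name.
        if (firstName == null || !SHORT_NAME.matcher(firstName).matches()) {
            errors.add("first name is not shorter than six characters");
        }
        return errors.isEmpty() ? null : String.join("; ", errors);
    }

    public static void main(String[] args) {
        // Made-up sample rows: {firstName, country}.
        String[][] rows = {
            {"Paolo", "Italy"},
            {"Francesca", "Italy"},
            {"John", "France"}
        };
        for (String[] row : rows) {
            String error = check(row[0], row[1]);
            if (error == null) {
                // Corresponds to the Row > Filter link.
                System.out.println("filter | " + row[0] + " | " + row[1]);
            } else {
                // Corresponds to the Row > Reject link, with the error message appended.
                System.out.println("reject | " + row[0] + " | " + row[1] + " | " + error);
            }
        }
    }
}
```

Each row ends up in exactly one of the two flows, and a rejected row carries its error message alongside the original columns, which mirrors the two tables described above.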
Often, input data must be filtered into multiple outputs according to given criteria, for instance splitting customer data by region, as in this example, or by team. Another very common example is splitting the input data into validated records and records that have been rejected because they failed a quality check (see Checking a column against a list of allowed values, in VALIDATING DATA for examples of using tMap to filter invalid rows).
This recipe shows how the tMap output Expression filters are used to perform the kind of filtering described above.
Open the job jo_cook_ch04_0060_multipleOutputs.
How to achieve it…
How it works…
tMap evaluates the outputs from the top of the output table list downwards, passing an input row to each output according to that output's settings.
tMap will only pass data to an output if:
It is sometimes helpful to think of this list as a set of if-then-else criteria.
It is recommended that lists of outputs be ordered like if-then-else to make understanding easier. It is also recommended that multiple tMaps be used in the scenario where many outputs are created, depending upon complex conditions. It is not that tMap cannot handle a high level of complexity, rather the impact of changes may be difficult to calculate if there are many inputs, outputs, joins, and conditions.
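As a rough illustration of the if-then-else reading, the sketch below writes an ordered list of three outputs as a Java conditional. The region values and the output names uk, usa and restOfWorld are hypothetical and are not taken from the job in this recipe.

```java
class IfThenElseSketch {
    /** Returns the name of the output that would receive a row for the given region. */
    static String outputFor(String region) {
        if ("UK".equals(region)) {
            // First output in the list, with a filter on region.
            return "uk";
        } else if ("USA".equals(region)) {
            // Second output in the list, with its own filter.
            return "usa";
        } else {
            // Last output in the list, catching the rows rejected by the outputs above.
            return "restOfWorld";
        }
    }

    public static void main(String[] args) {
        for (String region : new String[] {"UK", "USA", "France"}) {
            System.out.println(region + " -> " + outputFor(region));
        }
    }
}
```

The analogy is exact only when the filters are mutually exclusive. Unlike an if-then-else chain, tMap can pass the same row to more than one output when several filters accept it, which is one more reason to order the outputs so that the list reads clearly.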
In this recipe, we have multiple copies of the input being created using input criteria. It is worth noting that the outputs do not need to be copies of each other.
It is also worth noting that if no criteria are specified for any output, then tMap will copy every input row to every output. What is more, each of the outputs can have a different format and different rules for the same input row. In this instance, tMap becomes a means of creating multiple different views of the same input data.
It is also possible to specify multiple outputs with catch output reject enabled. This means that multiple views of rejected data can also be created.
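The behavior described in the last two paragraphs can be modelled with a small router. This is a simplified sketch, not Talend's generated code, and it rests on two assumptions: a regular output receives a row when it has no filter or its filter is true, and an output with catch output reject receives a row only when no regular output above it accepted that row. The class, method and output names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class TMapRoutingSketch {
    /** One output table: a name, an optional filter, and a catch-output-reject flag. */
    static final class Output {
        final String name;
        final Predicate<String> filter;   // null means no filter is set
        final boolean catchReject;

        Output(String name, Predicate<String> filter, boolean catchReject) {
            this.name = name;
            this.filter = filter;
            this.catchReject = catchReject;
        }
    }

    /** Walks the outputs top-down and returns the names of those that receive the row. */
    static List<String> route(String region, List<Output> outputs) {
        List<String> targets = new ArrayList<>();
        boolean accepted = false;   // true once a regular output has taken the row
        for (Output out : outputs) {
            boolean passes = out.filter == null || out.filter.test(region);
            if (!out.catchReject && passes) {
                targets.add(out.name);
                accepted = true;
            } else if (out.catchReject && !accepted && passes) {
                targets.add(out.name);
            }
        }
        return targets;
    }

    /** Two regular outputs without filters: every row is copied to both. */
    static List<Output> unfiltered() {
        return Arrays.asList(
            new Output("copyA", null, false),
            new Output("copyB", null, false));
    }

    /** One filtered output followed by two reject outputs: two views of rejected rows. */
    static List<Output> withRejects() {
        return Arrays.asList(
            new Output("uk", r -> "UK".equals(r), false),
            new Output("rejectSummary", null, true),
            new Output("rejectDetail", null, true));
    }

    public static void main(String[] args) {
        System.out.println(route("France", unfiltered()));
        System.out.println(route("UK", withRejects()));
        System.out.println(route("France", withRejects()));
    }
}
```

In this model the two reject outputs must be listed after the regular output, in line with the recommendation to order outputs like if-then-else. Each reject output could also carry its own filter and columns, giving a different view of the same rejected row.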
Ravindra Savaram is a Content Lead at Mindmajix.com. His passion lies in writing articles on the most popular IT platforms including Machine learning, DevOps, Data Science, Artificial Intelligence, RPA, Deep Learning, and so on. You can stay up to date on all these technologies by following him on LinkedIn and Twitter.