Creating Validation Rules and Binary Error Codes in Talend
Creating validation rules for more complex requirements
Validation rules restrict what users can enter in a given field, and also help ensure that your database users enter the proper types or amounts of data.
Types of validation rules
You can create two basic types of validation rules:
- Field validation rules : Use a field validation rule to check the value that you enter in a field when you leave the field. For example, suppose you have a Date field, and you enter >=#01/01/2007# in the Validation Rule property of that field. Your rule now requires users to enter dates on or after January 1, 2007. If you enter a date earlier than 2007 and then try to place the focus on another field, Access prevents you from leaving the current field until you fix the problem.
- Record (or table) validation rules : Use a record validation rule to control when you can save a record (a row in a table). Unlike field validation rules, record validation rules refer to other fields in the same table. You create, record validation rules when you need to check the values into one field against the values in another. For example, suppose your business requires you to ship products within 30 days and, if you don’t ship within that time, you must refund part of the purchase price to your customer. You can define a record validation rule such as [RequiredDate]<=[OrderDate]+30 to ensure that someone doesn’t enter a ship date (the value in the RequiredDate field) too far into the future.
The following steps explain how to add validation rules to control, and how to lock controls and thus prevent users from altering data.
Where you can use validation rules
You can define validation rules for tables and for controls on forms. When you define rules for tables, those rules apply when you import data. To add validation rules to a table, you open the table in Design view and set various table properties. To add validation rules to a form, you open the form in Design view and add rules to the properties of individual controls.
What you can put in a validation rule
Your validation rules can contain expressions — functions that return a single value. You can use expressions to perform calculations, manipulate characters, or test data. When you create validation rules, you use expressions primarily to test data. For example, an expression can check for one of a series of values, such as “Tokyo” Or “Moscow” Or “Paris” Or “Helsinki”. Expressions can also perform mathematical operations. For example, the expression <100 forces users to enter values less than 100. The expression ([OrderDate] – [ShipDate]) calculates the number of days that elapsed between the time an order was placed and the time it shipped.
The steps in the following sections explain how to validate data for tables, forms, queries, and import operations.
Create a validation rule
- In the Navigation Pane, right-click the form that you want to change, and then click Design View.
- Right-click the control that you want to change, and then click Properties to open the property sheet for the control.
- Click the All tab, and then enter your validation rule in the Validation Rule property box.-or-Click next to start the Expression Builder and create an expression.
- Enter a message in the Validation Text property box.
Save your changes.
Creating binary error codes to store multiple test results
Sometimes, it is desirable to perform multiple checks on a row at the same time, so that when a row is rejected, all of the problems with the data can be identified from a single error message. An excellent method of recording this is to create a binary error code.
A binary error code is a binary number, where each of the digit position represents the result of a validation test: 0 being pass and 1 being fail.
For example, 1101 = failed test 1 (rightmost digit), test 3 and test 4 and passed test 2. This binary value can be held as a decimal integer, in this case 13.
Open the job jo_cook_ch03_0070_binaryErrorCode.
How to do it…
- Open tMap and create six new Integer type variables:
nameTest, dateOfBirthTest, timestampTest, ageTest, countryOfBirthTest and errorCode.
- Copy the following lines into the Expressions:
customer.name.equals("") ? 1 << 0 : 0
customer.dateOfBirth == null ? 1 << 1 : 0
customer.timestamp == null ? 1 << 2 : 0
customer.age == null ? 1 << 3 : 0 customer.countryOfBirth.equals("") ? 1 << 4 : 0
Var.nameTest + Var.dateOfBirthTest + Var.timestampTest + Var.ageTest + Var.countryOfBirthTest
- Add a condition in the ValidRows output
Var.errorCode == 0
- Set the tMap Settings for the rejects output to Catch output reject.
- Your tMap should now look like this:
- Run the job. You should see that the error codes are populated for all the rows where at least one field is null.
The operator << performs a bitwise shift of the value by the relevant number of places. For example 1<<3 would place a 1 in the 4th position of a binary number (0 being the first position).
So if the field is null, the variable is assigned a bit-shifted value, otherwise it is set to 0.
By adding the numbers together, we eventually arrive at a decimal value which represents a 1 in each of the positions where a null is found.
This may be simpler to explain using an example. The following is the output from tLogRow. In this case, it is one of the rejects where three nulls have been found.
So from this output the binary value will be built as shown:
- The nameTest variable is assigned 0
- The dateOfBirthTest variable is assigned 1 << 1 = 10 (Binary) = 2 (Decimal)
- The timestampTest variable is assigned 1 << 2 = 100 (Binary) = 4 (Decimal)
- The ageTest variable is assigned 1 << 3 =1000 (Binary) = 8 (Decimal)
- The countryOfBirthTest variable is assigned 0
So the decimal total is 0+2+4+8+0 = 14
An alternative to using the << operator is to assign the actual decimal values to each position: 1,2,4,8 (2 power 0, 2 power 1, and so on) being positioned 0 to 3. Again, adding the values gives us the desired integer result.
Decrypting the error code
Decrypting a binary error message is achieved by testing the individual bits of the integer. This can be achieved by using the shift function to create the binary bit position and performing a bitwise AND against the integer value. If the result is greater than 0, then the position is set.
For instance, if we have the value 0101 (7) in an integer column:
0101 & 1 (where the 1 equates to 1 <<0) = 1 (test 1 failed)
0101 & 10 (where 10 equates to 1<<1) = 0 (test 2 passed)
0101 & 100 (where 100 equates to 1<<2) = 100 (test 3 failed)
0101 & 1000 (where 1000 equates to 1<<3) = 0 (test 4 passed)
So the logic for our errors will look like this: