How to Deal with Data Quality Problems
Dealing with Data Quality Problems
In the vast majority of cases, useful data sets are not static, but are being updated, added to and purged constantly. Data quality monitoring aims to provide data quality information that is also being constantly updated, and can be used to detect issues quickly, before the bad data piles up.
Tracking quality requires collecting a lot of data. Tableau is very good at visualizing data and making it understandable. Tableau also provides tools for working around unclean data without intervening at the database level. However, the best course of action when you find errors is to report them to the IT professional responsible for data quality in the database you are using.
Quick Solutions in Tableau
There are several different ways you can correct data problems within Tableau that don’t involve changing the source data.
You can rename fields in the Data pane. For example, you could rename a field named Customer Segment in the data source to be Business Segment in Tableau. You can also rename user-created fields. Renaming a field does not change the name of the field in the underlying data source; rather, the field is given a new name that appears only in Tableau workbooks. The changed field name is saved with the workbook and is kept when you export the data source. You can rename any type of field: dimensions, measures, sets, or parameters.
To rename a field, right-click it and rename it. Field member names can also be aliased. Neither change alters the source database; Tableau “remembers” the new names without touching the source data.
You can create a group by selecting headers in the view or selecting marks. You can also create a group from a dimension in the Data pane. Regardless of how you create a group, a new group field is added to the Data pane. You can use the group field like other fields in the view, adding it to the Columns or Rows shelves, to the Marks card, or to the Filters shelf.
Suppose a company name has been entered in all of these ways: A&M, A and M, A+M. In Tableau, you can Ctrl+click each of these names, group them, and then create a name alias for the ad hoc group. All the versions of the name then appear as one record in Tableau: A&M. The grouping and the name alias are saved as part of Tableau’s metadata.
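To make the grouping idea concrete outside Tableau, here is a minimal Python sketch of the same technique: a lookup that maps each raw spelling to one alias and is applied only at reporting time. The names `COMPANY_GROUPS`, `grouped_name`, and `totals_by_group`, and the sample amounts, are hypothetical and chosen for illustration; this is not how Tableau implements groups internally.

```python
from collections import defaultdict

# Hypothetical group definition: each raw spelling maps to one display alias.
COMPANY_GROUPS = {
    "A&M": "A&M",
    "A and M": "A&M",
    "A+M": "A&M",
}


def grouped_name(raw_name):
    """Return the group alias for a raw name, or the raw name if it is ungrouped."""
    return COMPANY_GROUPS.get(raw_name, raw_name)


def totals_by_group(rows):
    """Sum amounts per grouped name without modifying the input rows."""
    totals = defaultdict(float)
    for name, amount in rows:
        totals[grouped_name(name)] += amount
    return dict(totals)


# Made-up sample rows standing in for records in the source data.
rows = [("A&M", 100.0), ("A and M", 50.0), ("A+M", 25.0), ("Zenith", 10.0)]

if __name__ == "__main__":
    print(totals_by_group(rows))
```

As with a Tableau group, the source rows keep their original spellings; only the reported view merges them.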
Sometimes the name of something in the database is not a useful term for reporting purposes. For example, everybody on the team enters the customer type as P1, P2, G1, or G2, where the code denotes the size of the customer in annual revenue. P2, for instance, could stand for “Platinum level 2” and mean that the customer has an annual revenue of $1m to $5m. In Tableau, you can right-click on P2 and alias it with a more meaningful description.
Although Tableau’s built-in mapping works very well, there will be occasions when geographic locations are not recognized. Tableau warns you by placing a small gray pill in the lower-right area of your map. Clicking that pill lets you edit the offending locations or filter them out of the view. The same options are accessible from Tableau’s Map menu.
When a measure contains null values, they are usually plotted in a view as zero. However, sometimes that changes the view and you’d rather just suppress null values altogether. You can format each measure to handle null values in its own way.
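The difference between plotting nulls as zero and suppressing them is easy to see with a small calculation. The following Python sketch is only an illustration under assumed data: the list `monthly_sales` and the helper names are hypothetical, and `None` stands in for a null value.

```python
def nulls_as_zero(values):
    """Treat every null (None) as zero, as a view that plots nulls at zero would."""
    return [0 if v is None else v for v in values]


def suppress_nulls(values):
    """Drop nulls entirely, so they do not appear in the view at all."""
    return [v for v in values if v is not None]


def average(values):
    """Plain arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Made-up measure with two missing months.
monthly_sales = [120, None, 80, None, 100]

if __name__ == "__main__":
    print(average(nulls_as_zero(monthly_sales)))   # nulls pull the average down
    print(average(suppress_nulls(monthly_sales)))  # average of known values only
```

With this sample, treating nulls as zero gives an average of 60.0, while suppressing them gives 100.0, which is why the choice can visibly change a view.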
When you see the word null appear in a view, that means Tableau can’t match the record. You can filter out nulls, group them with non-null members of the set, or correct the join that is causing the null. A null value can arise for many reasons. If you aren’t sure how to correct the null, seek assistance from a qualified technical resource.
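One common way an unmatched record produces a null is a left join in which a key has no partner in the second table. The Python sketch below illustrates that case with made-up data; the tables `orders` and `customers` and the functions `left_join` and `unmatched` are hypothetical, and Tableau's own join handling may differ in detail.

```python
# Made-up tables: order 2 refers to a customer that is missing from `customers`.
orders = [
    {"order_id": 1, "customer_id": "C1"},
    {"order_id": 2, "customer_id": "C9"},
]
customers = {"C1": "Acme", "C2": "Birch"}


def left_join(order_rows, customer_names):
    """Keep every order; customer_name is None (null) when the key has no match."""
    return [
        {**row, "customer_name": customer_names.get(row["customer_id"])}
        for row in order_rows
    ]


def unmatched(joined_rows):
    """Return the rows whose join produced a null, for reporting or filtering."""
    return [row for row in joined_rows if row["customer_name"] is None]


if __name__ == "__main__":
    joined = left_join(orders, customers)
    print(unmatched(joined))
```

Listing the unmatched rows like this gives you something specific to hand to whoever maintains the source data, rather than only filtering the nulls out of the view.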
Correcting Your Source Data
Although it’s quick and easy to address data quality issues directly in Tableau, it’s important to bear in mind that the changes you make in Tableau benefit only those using the same Tableau file. There is no substitute for correcting the underlying data in the data source. Report the errors to the responsible staff quickly and provide them with your Tableau report. Include the details so that the database can be corrected.
To edit the data source
- On the Data menu, select a data source, and then select Edit Data Source.
- On the data source page, make the changes to the data source.
In the next chapter, you’ll learn how to create more nuanced charts that go beyond the basic visualizations provided by the Show Me button, adding features that enhance a chart’s meaning without cluttering your view of the information.