Informatica Data Quality Tutorial
This tutorial gives an overview of Informatica Data Quality (IDQ) and covers its fundamentals.
- Informatica Data Quality is a suite of applications and components that you can integrate with Informatica PowerCenter to deliver enterprise-strength data quality capability in a wide range of scenarios.
- The core components are Data Quality Workbench and Data Quality Server:
- Data Quality Workbench. Use it to design, test, and deploy data quality processes, called plans. Workbench lets you test and execute plans as needed, enabling rapid data investigation and testing of data quality methodologies.
- Data Quality Server. Use it to enable plan and file sharing and to run plans in a networked environment. Data Quality Server supports networking through service domains and communicates with Workbench over TCP/IP.
- Both Workbench and Server install with a Data Quality engine and a Data Quality repository. Users cannot create or edit plans with Server, but they can run a plan on any Data Quality engine independently of Workbench, either through runtime commands or from PowerCenter.
- When running data quality plans on a Data Quality engine, users can apply parameter files to the runtime commands. Parameter files modify plan operations.
- Informatica also provides a Data Quality Integration plug-in for PowerCenter. This plug-in enables PowerCenter users to add data quality plan instructions to a PowerCenter transformation and to run the plan on the Data Quality engine from a PowerCenter session.
- In Data Quality, a plan is a self-contained set of data analysis or data enhancement processes. A plan is composed of one or more of the following types of component:
o Data sources provide the input data for the plan.
o Data sinks collect the data output from the plan.
o Operational components perform the data analysis or data enhancement actions on the data they receive.
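The source, operational component, and sink structure described above can be sketched in Python. This is only a conceptual illustration and not Informatica code: the Plan class, the Record type, and the two operations (trim_whitespace, uppercase_country) are hypothetical names invented for this sketch, and real plans are designed in Workbench rather than written by hand.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable

# Hypothetical record type: one row of input data as field name -> value.
Record = dict[str, str]


@dataclass
class Plan:
    """Conceptual model of a plan: a source, ordered operations, and a sink."""
    source: Iterable[Record]                      # data source: provides input
    operations: list[Callable[[Record], Record]] = field(default_factory=list)
    sink: list[Record] = field(default_factory=list)  # data sink: collects output

    def run(self) -> list[Record]:
        # Pass every source record through each operation in order,
        # then collect the result in the sink.
        for record in self.source:
            for operation in self.operations:
                record = operation(record)
            self.sink.append(record)
        return self.sink


def trim_whitespace(record: Record) -> Record:
    """Illustrative operational component: strip stray spaces from all fields."""
    return {key: value.strip() for key, value in record.items()}


def uppercase_country(record: Record) -> Record:
    """Illustrative operational component: normalize the country code's case."""
    return {**record, "country": record["country"].upper()}


plan = Plan(
    source=[{"name": "  Ada ", "country": "uk"}],
    operations=[trim_whitespace, uppercase_country],
)
result = plan.run()
print(result)
```

Keeping the operations as an ordered list mirrors the idea that a plan is self-contained: the same plan definition can be run against a different source without changing the operations.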
Role of Dictionaries
- Plans can use reference dictionaries to identify, repair, or remove inaccurate or duplicate data values. Informatica Data Quality plans can use three types of reference data:
- Standard dictionary files. These files are installed with Informatica Data Quality and can be used by several types of component in Workbench. All dictionaries installed with Data Quality are text dictionaries. These are plain-text files saved in .DIC file format. They can be created and edited manually.
- Database dictionaries. Informatica Data Quality users with database expertise can create and specify dictionaries that are linked to database tables, and that thus can be updated dynamically when the underlying data is updated.
- Third-party reference data. These data files are produced by third-party vendors and offered to Informatica customers as premium product options. The reference data provided by third-party vendors is typically in database format.
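To make the difference between text dictionaries and database dictionaries concrete, here is a minimal Python sketch of dictionary-based standardization. Everything in it is an assumption made for illustration: the comma-separated layout (preferred label first, variants after it) is not claimed to be the actual .DIC file format, and the street_types table and all function names are hypothetical.

```python
import sqlite3
from typing import Optional


def load_text_dictionary(text: str) -> dict[str, str]:
    """Build a variant -> label lookup from an ASSUMED layout:
    one entry per line, preferred label first, variants after it."""
    lookup: dict[str, str] = {}
    for line in text.splitlines():
        entries = [entry.strip() for entry in line.split(",") if entry.strip()]
        if not entries:
            continue  # skip blank lines
        label = entries[0]
        for variant in entries:  # the label also maps to itself
            lookup[variant.lower()] = label
    return lookup


def standardize(value: str, lookup: dict[str, str]) -> Optional[str]:
    """Return the preferred label for a value, or None if it is unknown."""
    return lookup.get(value.strip().lower())


# Text dictionary: a static snapshot, edited by hand.
SAMPLE_DICTIONARY = "Street,St,St.,Str\nAvenue,Ave,Av\n"
text_lookup = load_text_dictionary(SAMPLE_DICTIONARY)

# Database dictionary: lookups read the table at query time, so changes
# to the underlying data take effect without reloading anything.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE street_types (variant TEXT PRIMARY KEY, label TEXT)")
conn.executemany(
    "INSERT INTO street_types VALUES (?, ?)",
    [("st", "Street"), ("ave", "Avenue")],
)


def db_standardize(connection: sqlite3.Connection, value: str) -> Optional[str]:
    """Look up a value in the hypothetical street_types table."""
    row = connection.execute(
        "SELECT label FROM street_types WHERE variant = ?",
        (value.strip().lower(),),
    ).fetchone()
    return row[0] if row else None


before = db_standardize(conn, "Blvd")  # not in the table yet
conn.execute("INSERT INTO street_types VALUES (?, ?)", ("blvd", "Boulevard"))
after = db_standardize(conn, "Blvd")   # picked up after the table changed
```

The text lookup is fixed once it is loaded, while the database lookup reflects the new row immediately, which is the dynamic-update behavior described for database dictionaries.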
For more information about Informatica, please refer to http://www.informatica.com
We have compiled a few more articles to get you acquainted with the Informatica Data Quality course. We cover these topics in depth in our Informatica Data Quality online training sessions.