Informatica Tutorial

This Informatica tutorial gives an overview of Informatica and covers its basics and components.

What is Informatica?

An introduction to Informatica PowerCenter

PowerCenter is Informatica's data integration software. It enables its users to integrate various data stores (such as databases and flat files) and transform their data in multiple ways. Data integration software such as PowerCenter provides out-of-the-box integration with many databases and applications, thus allowing developers to focus on their data transformation requirements.

Designer (a PowerCenter client tool) is a graphical integrated development editor used to develop mappings (units of code) in PowerCenter. When a mapping is developed, it points to a source (from where data is read) and a target (where data is written). The mapping can then be executed to read data, transform it, and finally load the transformed data into the target.

When a database changes, for example when a customer moves a database from Microsoft SQL Server to Oracle, the developer can simply change the mapping to point to the new source and target without having to worry about integration challenges. Assuming connectivity between PowerCenter and the Oracle database is established, developers need not worry whether their code will integrate properly with the new database. If the database is supported, it works. The mapping runs the same regardless of the database to which it points. Developers do not have to bind any drivers or JAR files (Java archive files) to the code. As long as the Informatica Administrator has established a connection between PowerCenter and the database, developers just have to point their mappings to the pre-defined connections. This keeps the business logic separate from its integration aspects, making data integration a lot simpler.
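The separation between business logic and connections can be illustrated outside PowerCenter. The following is a minimal Python sketch, not PowerCenter code: the function run_mapping, the table names, and the sample rows are hypothetical, and in-memory SQLite databases stand in for the real source and target. The point is only that the transformation logic receives ready-made connections and never names a database product. (The "?" placeholder style is specific to SQLite; other drivers may use a different one.)

```python
import sqlite3


def run_mapping(source_conn, target_conn):
    """Read customers from the source, transform them, load them into the target."""
    rows = source_conn.execute("SELECT id, name FROM customers").fetchall()
    # Business logic: trim and upper-case the name. Nothing here depends on
    # which database sits behind either connection.
    transformed = [(cust_id, name.strip().upper()) for cust_id, name in rows]
    target_conn.executemany(
        "INSERT INTO customers_clean (id, name) VALUES (?, ?)", transformed
    )
    target_conn.commit()
    return len(transformed)


def make_source():
    # Hypothetical source: an in-memory database with two sample rows.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)", [(1, " alice "), (2, "bob")]
    )
    return conn


def make_target():
    # Hypothetical target: a separate in-memory database with an empty table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers_clean (id INTEGER, name TEXT)")
    return conn


source = make_source()
target = make_target()
loaded = run_mapping(source, target)
result = target.execute(
    "SELECT id, name FROM customers_clean ORDER BY id"
).fetchall()
print(loaded, result)
```

Swapping either database would mean changing only make_source or make_target; run_mapping stays untouched, which mirrors how a mapping is re-pointed to a different pre-defined connection.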

Informatica is for IT organizations that are standardizing data integration at an enterprise level, across numerous projects and departments.

PowerCenter Advanced Edition adds a metadata manager and data lineage, a business glossary, and team-based development and versioning.

Informatica PowerCenter Premium Edition protects critical business processes against data integration failures. It combines automated, auditable, and repeatable data validation testing with monitoring and proactive alerts to warn stakeholders when issues arise in your development and operational processes.

Informatica PowerCenter Real Time is an add-on for PowerCenter Standard, Advanced, or Premium that increases business agility and performance through real-time connectivity and web services.

Informatica PowerCenter for Big Data is an add-on for PowerCenter Standard, Advanced, or Premium that reduces Big Data management costs while handling growing data volumes and complexity through Hadoop-based data integration processing, unstructured file parsing, and natural language processing.

Informatica Cloud Integration provides sophisticated cloud-based data integration services for organizations that rely on cloud computing or have both in-cloud and on-premise applications. Combining pay-as-you-go affordability with increased. Cloud Integration integrates seamlessly with PowerCenter Enterprise, enabling your smaller and one-off projects to grow or merge with larger initiatives.

Informatica Key Features

Role-based tools support iterative business/IT collaboration and development, as well as business and departmental self-service.

Rapid prototyping and data profiling access and combine data from multiple sources to ensure requirements are met early and throughout the development process.

Graphical, intuitive, metadata-driven views of data flows, impact analysis and lineage provide better change management and governance.

Specialized, high-performance pushdown of data transformation processing makes optimal use of database resources.
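Pushdown means that transformation logic is expressed as SQL and executed by the database rather than row by row in the integration engine. The Python sketch below illustrates the idea only; it is not how PowerCenter generates its SQL. The orders table, its rows, and both function names are hypothetical, and an in-memory SQLite database stands in for a real one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("east", 10.0), ("west", 5.0), ("east", 2.5)],
)


def totals_in_engine(conn):
    """Pull every row out of the database and aggregate in the engine."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM orders"):
        totals[region] = totals.get(region, 0.0) + amount
    return totals


def totals_pushed_down(conn):
    """Let the database aggregate; only the summary rows come back."""
    query = "SELECT region, SUM(amount) FROM orders GROUP BY region"
    return dict(conn.execute(query).fetchall())


engine_totals = totals_in_engine(conn)
pushdown_totals = totals_pushed_down(conn)
print(engine_totals == pushdown_totals)
```

Both functions return the same totals; the difference is where the work happens and how many rows travel between the database and the engine.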

Informatica PowerCenter components

The PowerCenter architecture is based on a client-server model. It consists of the PowerCenter server and the PowerCenter client tools. Below is a list of the components that are part of the PowerCenter architecture.

Informatica PowerCenter Domain

In simple terms, a domain can be defined as an environment. You will have a PowerCenter domain for each environment. For example, if you have Development, Test and Production environments, you essentially create 3 different domains, one for each environment. Domain information is stored in a set of tables, which are created and configured as part of the PowerCenter server installation. These domain tables store metadata related to the services within PowerCenter, users, groups, etc.

Informatica Node

A node is a machine participating in the Informatica domain. Typically, a node consists of CPU, memory and disk. A node can be active or passive depending on the services it is hosting. An Informatica domain can consist of more than one node. These nodes can run a wide variety of operating systems, such as Windows, Linux, HP-UX, etc. Informatica server software is installed on each node participating in a domain.

PowerCenter Services

A domain consists of several services, such as the license service, the PowerCenter Repository Service and the PowerCenter Integration Service. Each of these services provides unique functionality to clients.

PowerCenter Repository Service (PCRS)

A PowerCenter repository is a set of tables created when your Informatica Administrator creates a PowerCenter Repository Service during the post-installation process. All the code that a developer builds is stored inside the repository. The repository contains hundreds of tables, and PowerCenter stores the developer's code across these tables in its own internal format. The tables are hard to read and comprehend manually, so they should be left alone unless there is a dire need to look at them. Along with the developer's code, the repository also contains metadata such as the definitions of the tables used by the mappings, source and target connections, etc.

When the developer runs a workflow (a job in PowerCenter), its information is fetched from the repository. Thereafter, the runtime statistics are stored back in the repository. Hence, the repository is a key, live element of the PowerCenter architecture.

PowerCenter Integration Service (PCIS)

An Integration Service is the engine that actually runs PowerCenter workflows (jobs). The Integration Service continuously interacts with the PowerCenter repository to fetch the information of the job it is about to start, and it keeps the repository up to date regarding the status of the job, including the processed row counts. Each workflow is assigned to an Integration Service. Each Integration Service can run one or more workflows at the same time. Workflows can also be scheduled to run on an Integration Service at a specific date and time.
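The fetch, run, and write-back cycle between the engine and the repository can be sketched in Python as follows. This is an illustration under stated assumptions, not the real repository schema or any PowerCenter API: the tables workflows and run_stats, the JSON job definition, and the function run_workflow are all hypothetical, and an in-memory SQLite database plays the role of the repository.

```python
import json
import sqlite3


def create_repository():
    """Create a toy repository with job definitions and run statistics."""
    repo = sqlite3.connect(":memory:")
    repo.execute("CREATE TABLE workflows (name TEXT PRIMARY KEY, definition TEXT)")
    repo.execute(
        "CREATE TABLE run_stats (workflow TEXT, status TEXT, rows_processed INTEGER)"
    )
    return repo


def run_workflow(repo, name, input_rows):
    # 1. Fetch the job definition from the repository.
    row = repo.execute(
        "SELECT definition FROM workflows WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(f"workflow {name!r} is not in the repository")
    definition = json.loads(row[0])

    # 2. Execute the job: here it simply keeps values at or above a threshold.
    kept = [value for value in input_rows if value >= definition["min_value"]]

    # 3. Store the runtime statistics back in the repository.
    repo.execute(
        "INSERT INTO run_stats VALUES (?, ?, ?)", (name, "Succeeded", len(kept))
    )
    repo.commit()
    return kept


repo = create_repository()
repo.execute(
    "INSERT INTO workflows VALUES (?, ?)",
    ("wf_load_orders", json.dumps({"min_value": 10})),
)
output = run_workflow(repo, "wf_load_orders", [5, 10, 25])
stats = repo.execute(
    "SELECT workflow, status, rows_processed FROM run_stats"
).fetchall()
print(output, stats)
```

After the run, the statistics table holds one row for the workflow with its status and processed row count, which is the kind of information a monitoring client would read back.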


Grid

A grid is a collection of nodes. A PowerCenter Integration Service can run on an individual node or on a grid. When an Integration Service runs on a grid, it automatically load balances the workflows that it is executing, so that the resources (nodes) are optimally utilized. When a node in the domain fails, the Integration Service can be configured to fail over the workflows running on that node to one or more other nodes, providing a seamless failover of the jobs.
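Load balancing and failover on a grid can be pictured with a small Python sketch. It is a simplification under stated assumptions: the Node and Grid classes and the workflow names are hypothetical, and the "least number of running workflows" rule is a stand-in chosen for brevity, not the actual dispatch logic of the Integration Service.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    alive: bool = True
    workflows: list = field(default_factory=list)


class Grid:
    def __init__(self, nodes):
        self.nodes = list(nodes)

    def _least_loaded(self):
        # Only live nodes are candidates for new work.
        candidates = [n for n in self.nodes if n.alive]
        if not candidates:
            raise RuntimeError("no live nodes left in the grid")
        return min(candidates, key=lambda n: len(n.workflows))

    def dispatch(self, workflow):
        """Place a workflow on the live node running the fewest workflows."""
        node = self._least_loaded()
        node.workflows.append(workflow)
        return node.name

    def fail_node(self, name):
        """Mark a node as failed and move its workflows to surviving nodes."""
        failed = next(n for n in self.nodes if n.name == name)
        failed.alive = False
        orphaned, failed.workflows = failed.workflows, []
        return {wf: self.dispatch(wf) for wf in orphaned}


grid = Grid([Node("node1"), Node("node2")])
placements = [grid.dispatch(wf) for wf in ("wf_a", "wf_b", "wf_c")]
moved = grid.fail_node("node1")
print(placements, moved)
```

A real grid would also weigh resource availability on each node; the sketch only shows the two behaviours described above, spreading work across nodes and re-dispatching it when a node fails.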

Putting it all together

Now that we have a basic understanding of each component, let’s take a look at it all together. See the picture below.


The above picture represents a single Informatica domain containing 3 nodes. Of these, two nodes (node 1 and node 2) together form a grid. An Integration Service runs on top of this grid. Node 3 hosts a PowerCenter Repository Service, whose repository tables reside in schema 1 of the database server. Schema 2 of the same database server hosts the domain metadata tables. Informatica server software is installed on all 3 nodes.

While there are many possible configurations for the given nodes, the one above is an example for understanding how the components fit together in the Informatica PowerCenter architecture.
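The configuration described above can also be written down as a small data model. The Python sketch below is only a way to restate the layout; the class names, the service names IS_Grid and RS_Main, and the domain name Domain_Dev are hypothetical and do not correspond to any Informatica configuration format.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    name: str
    kind: str       # "integration" or "repository"
    runs_on: list   # node names; more than one node means a grid


@dataclass
class Domain:
    name: str
    metadata_schema: str
    nodes: list = field(default_factory=list)
    services: list = field(default_factory=list)

    def services_on(self, node):
        """Return the names of the services hosted on the given node."""
        return [s.name for s in self.services if node in s.runs_on]


domain = Domain(
    name="Domain_Dev",
    metadata_schema="schema2",   # domain metadata tables
    nodes=["node1", "node2", "node3"],
    services=[
        # Integration Service running on the grid formed by node1 and node2.
        Service("IS_Grid", "integration", ["node1", "node2"]),
        # Repository Service on node3; its tables would live in schema 1.
        Service("RS_Main", "repository", ["node3"]),
    ],
)

for node in domain.nodes:
    print(node, domain.services_on(node))
```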

Informatica PowerCenter clients

PowerCenter has more than one client.

Informatica Administrator is a thin client. It is a web-based application that can be accessed from a browser by pointing to its URL. Administrator is used to administer the PowerCenter environment, including:
• Creating and managing services (like repository service, integration service, etc.)
• Authentication and authorization
• Create and manage users
• Create and manage groups
• Establish and maintain AD authentication
• Manage privileges and roles
• Manage connection objects
• Manage domain logs
• Start, shut down, and restart services
• Back up and restore repositories and the domain

As the name suggests, Repository Manager is used to manage the repository. Typically, it is used to perform the following:
• Create and manage folders
• Manage folder permissions
• Query the repository objects
• Deployments: Copy folders and objects from one repository to another

Designer is one of the tools where PowerCenter developers spend most of their time. This is where they define the data flows, called mappings, and import source and target table definitions for use in the mappings. This is also where they debug their mappings, build reusable transformations, etc. In short, this is where the development of the data integration process happens.

Workflow Manager is where sessions and workflows are developed and managed. Workflow Manager is also used to develop and manage:
• Relational connections
• Application and other connections
• Session tasks, command tasks and other tasks
• Reusable tasks such as command tasks
• Reusable sessions
• Worklets
• Assign workflows to run on Integration Services and grids
• Start, stop, recover workflows

Workflow Monitor is a read-only client where developers can monitor and keep track of their workflows and sessions. One can view the current status of running or completed sessions, row counts for each source and target, error messages, and detailed session and workflow logs. Developers can also restart and recover their workflows from Workflow Monitor.

For more information about Informatica, please refer to www.informatica.com.

Copy Rights Reserved © Mindmajix.com All rights reserved. Disclaimer.