SAS ETL Studio Overview
What Is SAS ETL Studio?
SAS ETL Studio, a Java application, is a visual design tool that helps organizations quickly build, implement, and manage ETL processes from source to destination, regardless of the data sources or platforms.
Users can standardize metadata across the organization and perform in-depth transformations with minimal programming or manual work to meet enterprise data integration requirements and to support business and analytic intelligence.
SAS ETL Studio enables you to perform the following tasks:
- Extract data from operational data stores
- Transform this data
- Load the extracted data into your data warehouse or data mart.
SAS ETL Studio enables you to manage ETL process flows by supporting:
- specification of metadata for sources, such as tables in an operational system
- specification of metadata for targets – the tables and other data stores in a data warehouse
- creation of jobs that specify how data is extracted, transformed, and loaded from a source to a target.
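The relationship between source metadata, target metadata, and a job can be illustrated with a small sketch. This is not SAS ETL Studio code or its API; it is a generic Python illustration in which the column definitions, the table name `sales_by_region`, and the functions `extract`, `transform`, `load`, and `run_job` are all hypothetical names chosen for the example.

```python
import csv
import io
import sqlite3

# Hypothetical source metadata: columns of an operational table and their types.
SOURCE_COLUMNS = {"order_id": int, "amount": float, "region": str}

# Hypothetical target metadata: the name of a table in a data mart.
TARGET_TABLE = "sales_by_region"


def extract(text):
    """Read rows from CSV text and coerce each column to its declared type."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {name: cast(row[name]) for name, cast in SOURCE_COLUMNS.items()}
        for row in reader
    ]


def transform(rows):
    """Normalize the region name and total the amounts per region."""
    totals = {}
    for row in rows:
        region = row["region"].strip().upper()
        totals[region] = totals.get(region, 0.0) + row["amount"]
    return totals


def load(conn, totals):
    """Write the per-region totals into the target table."""
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {TARGET_TABLE} "
        "(region TEXT PRIMARY KEY, total REAL)"
    )
    conn.executemany(
        f"INSERT OR REPLACE INTO {TARGET_TABLE} VALUES (?, ?)",
        sorted(totals.items()),
    )
    conn.commit()


def run_job(source_text, conn):
    """A 'job' in this sketch is just extract -> transform -> load."""
    load(conn, transform(extract(source_text)))


SAMPLE = "order_id,amount,region\n1,10.5,east\n2,4.5,East\n3,7.0,west\n"
conn = sqlite3.connect(":memory:")
run_job(SAMPLE, conn)
result = conn.execute(
    f"SELECT region, total FROM {TARGET_TABLE} ORDER BY region"
).fetchall()
print(result)
```

In the product itself, the metadata and the job are defined visually and stored in a metadata repository rather than written by hand as shown here.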
SAS ETL Studio: Change Management
In SAS ETL Studio, the change management facility enables multiple SAS ETL Studio users to work with the same metadata repository at the same time – without overwriting each other’s changes.
An Example of Working with Change Management in SAS ETL Studio
The following is a general description of what it is like to work under change-management control in SAS ETL Studio.
- When you open a metadata profile whose default repository is change-managed, metadata in the change-managed repository is displayed in the Inventory tree and the Custom tree. Metadata in the Project repository is displayed in the Project tree.
- The Project tree contains any metadata that has been checked out of the change-managed repository and any new metadata objects that have been added.
- Typically, users do not have the privileges needed to add or update metadata directly in a change-managed repository. Instead, you must check metadata objects out of and into the change-managed repository.
- To update an existing metadata object that is under change management, use the Inventory tree or the Custom tree to check out the object from the change-managed repository. The object will appear in the Project tree, where you can update the object’s metadata.
- After an object has been checked out by one person, it is locked so that it cannot be updated by another person until the object has been checked back in.
- You do not have to check out a library in order to add metadata about a table in that library.
- If two or more parent objects share a common object such as a table, a primary key, a note, or a document, and you check out one of these parent objects, only you will be able to check out the other parent objects that share the common object. (Other users will not be able to access the common object that you have checked out, and that shared object is required in order to check out the other parent objects.)
- When you add a new metadata object, it goes directly into the Project repository. The object will appear in the Project tree, where you can update the object’s default metadata.
- The Fetch option is used to get a copy of a metadata object for testing purposes. The copied object is not checked out, so the original object is not locked. The copied object can be modified, but it cannot be checked in. Fetched items will remain in the Project repository until they are deleted.
- When you are finished working with all objects in the Project repository (and the Project tree), use the Check In feature to remove the objects from the Project repository and store them in the change-managed repository. A check-in operation checks in all of the metadata objects that are in the Project repository; you cannot check in some objects and leave others behind. It may therefore be convenient to work with small sets of related objects in the Project repository.
- To remove a metadata object from the Project repository, use the Delete option or the Undo Check Out option. To remove a metadata object from both the Project repository and the change-managed repository, use the Destroy option.
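The check-out, fetch, and check-in rules described above can be summarized in a toy model. The following Python sketch is not the SAS ETL Studio implementation or API; the classes `ChangeManagedRepository` and `ProjectRepository` and all of their methods are hypothetical, and the model deliberately ignores shared child objects, privileges, and the Destroy option.

```python
class ProjectRepository:
    """Hypothetical per-user work area (the Project tree in the text)."""

    def __init__(self, user):
        self.user = user
        self.checked_out = {}  # locked copies that can be checked in
        self.new = {}          # newly added objects
        self.fetched = {}      # unlocked copies for testing only


class ChangeManagedRepository:
    """Hypothetical shared repository with one lock per object."""

    def __init__(self, objects):
        self.objects = dict(objects)  # object name -> metadata
        self.locks = {}               # object name -> user holding the lock

    def check_out(self, name, project):
        holder = self.locks.get(name)
        if holder is not None:
            raise PermissionError(f"{name} is checked out by {holder}")
        self.locks[name] = project.user
        project.checked_out[name] = dict(self.objects[name])

    def fetch(self, name, project):
        # A fetched copy takes no lock and can never be checked in.
        project.fetched[name] = dict(self.objects[name])

    def undo_check_out(self, name, project):
        # Discard the working copy and release the lock.
        del project.checked_out[name]
        del self.locks[name]

    def check_in(self, project):
        # Check-in is all-or-nothing for checked-out and new objects.
        for name, meta in {**project.checked_out, **project.new}.items():
            self.objects[name] = meta
            self.locks.pop(name, None)
        project.checked_out.clear()
        project.new.clear()
        # Fetched copies stay in the project until they are deleted.


repo = ChangeManagedRepository({"Customers": {"columns": 3}})
alice = ProjectRepository("alice")
bob = ProjectRepository("bob")

repo.check_out("Customers", alice)
try:
    repo.check_out("Customers", bob)
    bob_blocked = False
except PermissionError:
    bob_blocked = True  # the object is locked by alice

repo.fetch("Customers", bob)  # allowed: fetching does not lock
alice.checked_out["Customers"]["columns"] = 4
alice.new["Orders"] = {"columns": 5}
repo.check_in(alice)
```

Because the check-in in this sketch moves everything at once, it mirrors the advice above to keep the set of objects in the Project repository small and related.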
SAS ETL Studio: Data Surveyor Wizards
Data Surveyor wizards are optional, licensed components that provide access to the metadata in enterprise applications, such as
- SAP R/3
- Oracle Applications.
SAS ETL Studio: CWM-Compliant Metadata
The metadata maintained by SAS ETL Studio is CWM (Common Warehouse Metamodel) compliant and portable to other CWM-compliant applications. Likewise, metadata from other CWM-compliant applications (for example, data modeling tools) can be imported easily into SAS ETL Studio.
For example, you could use a data modeling tool to create a model for a set of tables, save the model in CWM format, and then use the Metadata Importer wizard to import the model into a metadata repository. In SAS ETL Studio, you could view the properties of each table and verify that the appropriate metadata was imported. The tables could then be used in SAS ETL Studio jobs.
SAS ETL Studio: Data Quality
SAS ETL Studio is fully integrated with the data quality software from DataFlux Corporation. Both products now use the same Quality Knowledge Base (QKB), which contains rules, routines, and schemes necessary to integrate data quality into the ETL process.
The Process Library in SAS ETL Studio contains two data quality transformation templates: Create Match Code and Apply Lookup Standardization. These templates enable you to increase the value of your data through data analysis and data cleansing.
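As a rough illustration of what lookup standardization means in general, the sketch below maps known variants of a value to one standard form. It is a toy Python example, not the Apply Lookup Standardization template and not DataFlux code; the `SCHEME` contents and the functions `normalize` and `apply_scheme` are hypothetical, and the rules in a real Quality Knowledge Base are not shown in this document.

```python
# Hypothetical standardization scheme: known variants -> one standard form.
SCHEME = {
    "intl business machines": "IBM",
    "i.b.m.": "IBM",
    "ibm corp": "IBM",
    "sas institute": "SAS",
    "sas institute inc": "SAS",
}


def normalize(value):
    """Lowercase and collapse whitespace so lookups ignore trivial differences."""
    return " ".join(value.lower().split())


def apply_scheme(values, scheme=SCHEME):
    """Replace each value with its standard form; keep unmatched values as-is."""
    return [scheme.get(normalize(v), v) for v in values]


standardized = apply_scheme(
    ["IBM Corp", "  I.B.M. ", "SAS Institute  Inc", "Acme Ltd"]
)
print(standardized)
```

Values that do not appear in the scheme pass through unchanged, which keeps the transformation safe to apply to columns that are only partly covered by the lookup.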
Extending SAS ETL Studio Functionality
The SAS ETL Studio functionality is extended by Java plug-ins packaged with the product.
Further extensions can be implemented by:
- writing additional plug-ins (Java programming required)
- using the Transformation Generator Wizard (no Java programming required).
Server Connections and SAS ETL Studio
As a client, SAS ETL Studio must connect to a SAS Metadata Server to read or write metadata. It must also connect to other servers to run SAS code, to connect to a third-party database management system, or to perform other tasks.
Interaction with SAS Application Servers
SAS ETL Studio can use different types of application servers: