Mindmajix

Data Warehousing Concepts in SSIS

Data Warehousing Concepts

DATA:  data is composed of observable and recordable facts that are often found in operational for transactions systems.

OLTP:  OLTP is nothing but observation of online transaction processing. The system is an applicable application that modifies data the instance it receives and has a large number of concurrent users.

OLAP: OLAP is abbreviation of online analytical processing this system is an application that collects manager processes and presents multidimensional data for analysis and management purpose.

DATA MINING: data mining is the process of analyzing data from different perspectives and summarizing it in to useful information.

BI:  BI is the leveraging of BW to help make business decisions and recommendations. Information and data rules engines are leveraged here to help make these decisions along with statistical analysis tools and data mining tools.

DWH DEFINITIONS:  Bill Inmon, considered to be the father of the data ware housing provides the following definition

     “A data ware house is a Subject oriented, Integrated, Non volatile, And time variant.

Collection of data in support of management’s decision seam Kelly, another leading data ware housing Practitioner defines the data ware house in the following way.

      The data in the data ware house is,

                               “Separate

                                Available

                                Integrated

                                Time stamped

                                Subject oriented

                                Non volatile

                                Accessible “

Subject oriented 

In the data ware house, data is not stored by operational applications, but by business subjects.

Operational applications data ware house subject

Operational Applications

       Screenshot_65

Data ware house Subjects

 Screenshot_66

Integrated

Data inconsistencies are removed; data from diversified operational applications is integrated

 Screenshot_67

Here are some of the items that would need standardization:

  • Naming conventions
  • Codes
  • Data attributing
  • Measurements

 Time variant data:

      The time- variant nature of the data in a data ware house

  • Allows for analysis of the past
  • Relates information to the presents
  • Enables forecasts for the future

 Non volatile data:

 Usually the data in the data ware house is not updated for deleted,

 Screenshot_68

Data granularity:

Data granularity in a data ware house refers to the level of detail data. The lower level is detail, the finer the data granularity.

  • Depending on the requirements multiple levels of details may be present. Many data ware house have at least dual levels of granularity.

Three data levels in a banking data ware house

Daily detail monthly summery quarterly summery
Account Account Account
Available date Month Month
Amount Number of transactions Number of transactions
Deposit/withdrawal With drawlsDepositsBeginning balance

Ending balance

With drawlsDepositsBeginning balance

Ending balance

 Data warehouse approaches

RALPH KIMBAL

(Bottom up approach)W.H 2 NMON

(Top down approach)Here data marts designed first, later from data marts for ware houseEnterprise DWH concentrated first, next data marts

 Screenshot_69

Top- down approach:

    The advantages of this approach are:

  • A truly corporate efforts, an enterprise view of data
  • Inherently architected – not a union of the separate data marts
  • Single, central storage of data about the content
  • Centralized rules and control
  • May sea quick results if implemented with iterations

  The disadvantages are:

  • Takes longer to build even with an iterative method
  • High exposure risk to failure
  • Needs high level of cross – fundamental skills
  • High outlay without proof of concept.

Bottom – up approach:

  The advantages of these approaches are:

  • Faster and easier implementation of manageable pieces
  • Favorable returns on investment and proof of concept
  • Less risk of failure
  • Inherently, incremental: can schedule important data marts first
  • Allows project team to learn and grow

The disadvantages are:

  • Each data mart has its own narrow view of data
  • Permits redundant data in every data mart
  • Perpetuates inconsistent and irreconcilable data
  • Proliferates, unmanageable interfaces.

DWH Life Cycle

  Screenshot_70

Difference between data warehouse and OLTP

s.no

OLTP

DWH

1.

2.

3.

4.

5.

6.

7.

8.

9.

10.

Designed for transaction processing

Volatile

Store current data only

Store detail data

More joins

Normalized (more tables )

Less indexes

Low level managers access

Read, update, insert, delete

Access frequency high and in sec

Designed for decision support

Store non-volatile

Store historical data

Store summarized data

Less joins

De normalized (less tables )

More indexes

High level manager access

Read

Low, sec to minutes

Types of OLAP

ROLAP: (Relational OLAP):

Applied on relational sources both data and aggregate information store under relational sources.

Ex: BO, congo’s crystal reports, micro strategy

MOLAP: (multi dimensional OLAP):

Here, analysis will be done on multi dimensional applicatioHere, data and aggregate information store under multi dimensional sources.

Eg: congo’s SSAS, hyperion, Micro-strategy etc…………

HOLAP: (hybrid OLAP)

Here data store in relational sources and aggregated values store under multi dimensional sources       “cubes”

Eg: cognos, BO, Micro strategic……………

ETL

a) Code based.

Development cost, testing, maintenance

Eg:    sas base, sas access, tera data utilities, sql, plsql.

b)    GUI based etl- informatica, data stage, abinitio etc …..

Data Extraction

a) relational sources: ORACLE, SQL SERVER, TD, DB2, INFORMIX, SY BASE, RED BRICK

b) ERP sources: SAP R/3, DEOPLESOFT, J.D. EDWARDS, BANN, RAMCO MARSHALL.

c) Main frames files: COBOL FILES, IMS FILES, JCL FILES, DB2, JCL

d) File sources: FLAT FILES (TEXT FILES), XML FILES

e) Other sources: WEB LOGIC FILES, TIBCO M2 SERVICES, EXCEL SHEET, and PDF ETC……………….

Enroll for Live Instructor Led MSBI Training


 

0 Responses on Data Warehousing Concepts in SSIS"

Leave a Message

Your email address will not be published. Required fields are marked *

Copy Rights Reserved © Mindmajix.com All rights reserved. Disclaimer.
Course Adviser

Fill your details, course adviser will reach you.