Satish MSBI World: April 2014

Tuesday 29 April 2014

Unified Dimensional Model in SSAS

UDM (Unified Dimensional Model)

The UDM is central to your Analysis Services database architecture.UDM is your friend because it helps you narrow the gap between end users and data they need. Analysis Services provides you with features that help you design a model that will serve the needs of end users. It provides you with a way to bring data from multiple heterogeneous sources into single model.

Pre-UDM versions of Analysis Services only supported one fact table per cube. Therefore, only one data source could be used for specifying the fact table of cube.

Figure: The UDM provides a bridge between end users and their data

The above figure shows the architecture of the Unified Dimensional Model that is implemented in Analysis Services. It helps you to integrate data from various data sources such as Oracle, Sql Server, DB2 and TeraData into a single model that merges the underlying schemas into a single schema.To restrict end user view by using concept Perspectives in Analysis Services.
In OLAP, the data analyzed by end users is often historical data that might be a few days, months, or even year old. XML for Analysis (XMLA) standard, which allows client tools to retrieve data from Analysis Services. To write OLAP queries retrieve data within a few seconds.

The UDM consists of several components as follows
• Data source – It represents a connection to the database where the data is stored.
• Data source view (DSV) - Abstracts the underlying database schema. Although a DSV might seen redundant, it can be very useful by letting you augment the schema.
• Dimensional Model – After you’ve created a DSV, the next step is to build the dimensional model.

Key elements of the UDM are as follows:
• Heterogeneous data access support: UDM helps you to integrate and encapsulate data from heterogeneous data sources. It helps you combine various schemas into a single model.
• Real time data access with high performance: The UDM provides end users with real time data access. The UDM creates a MOLAP cache of underlying data. It helps in providing real time data access with the speed of an OLAP database due to MOLAP cache.
• Rich metadata, ease of use for exploration, and navigation of Data: UDM provides a consolidated view of the underlying data sources with the richness of metadata provided by the OLAP world. End users are able to exploit this Meta data to navigate and explore data in support of marking business decisions.
• Rich analytics support: UDM provides rich analytics such as Key performance Indicators and Actions that help in understanding your business with ease and automatically take appropriate actions based on changes in data by using MDX (Multidimensional expressions) language.
• Model for Reporting and Analysis: It provides you with the capability of not only querying the aggregated data that are typically used for analysis, but also has the ability to provide for detailed reporting up to the transaction level across multiple heterogeneous data sources.

Additional Features in Analysis services 2008:
• In Analysis services 2008, the UDM is tightly integrated with data mining features. You can also create a data mining dimension from data mining model.

• Several MDX functions are also added in Analysis services 2008.
• In a dimension that can have hundreds of attributes. But in real time attributes with in dimension are not used in many queries. Only a subset of the attributes might be heavily used. Including unnecessary attributes and hierarchies in your UDM cases performance decreases in some cases.
The Analysis Services 2008 design tools also allow you to better visualize the relationships between the attributes that you need and help you build a better dimension model. (It’s known as Fine Tuning Dimensions.)

Merge UDM in Analysis Services 2008 is become a powerful, enterprise class product that you can use to build large scale OLAP databases and implement strategic business analysis applications against those databases.

Finally you learn about the overall architecture of Analysis Services 2008 followed by the concept of the unified Dimensional Model (UDM), it helps you to have a unified view of your entire Datawarehouse.

Wednesday 9 April 2014

What is Fact or Measure & Types with Examples

Measure or Fact Table
Fact tables contain measurements of individual business processes.
It contains two sections in the table
(a) Foreign Keys to Dimensions
(b) Facts or Measures

Fully Additive Facts:
It can be summarized across any and all dimensions information stored in Fact Table.

Example:
Revenue Quantity

Semi Additive Facts:
Semi Additive Facts are Facts that can be summed up for some of the dimensions in the Fact Table.

Example:
Inventory Quantities can be summed up through the model and Dealer dimension but not through the Time Dimension

Non Additive Facts:
Non Additive Facts are Facts that cannot be summed across any dimensions in the Fact Table.
All Ratios are Non Additive Facts.

Example:
In the below Diagram shows Margin rate is non additive.

Margin rate=Margin_Amount / Revenue

Fact less Fact Table:
A Fact Table contains no measures or facts that type of Facts are known as Fact less Fact Table. Sometimes store a “1” for convenience.

Example:
A fact table which has only key elements is a fact less fact. There are no measures in this table

What is Dimension & Types of Dimensions with Examples

Dimension Table
It is the Master Table of Cube. It contains textual information more and more. Generally
A Specific business information (or) A particular task information is stored in this table. It is a collection
Of Hierarchies, Categories and logics. This can be used for a user to traverse in hierarchical nodes.

Types of Dimensions

Confirmed Dimension
The Dimension which is shared by two or more Fact tables (if it is used in multiple projects) is called as confirmed Dimension.
(Or)
The Dimension which is created only once and it is used in many schemas then it is called Confirmed Dimension.

Example: Customer Dimension can be used across the saving & current Fact Tables in Banking Environment.

Role Playing Dimension
A database Dimension that acts as multiple dimensions with in a cube is called as Role Playing Dimension. From the same table if we have multiple foreign keys in fact tables then the table acts differently for each key attribute.

Example:
Time Dimension is one of the best Example of Role playing Dimension, you can have one Time Dimension called Date and then you can add ShipDate, DueDate and OrderDate as Cube Dimensions.

Junk Dimension

It is a group of flags which gives true or false, yes or no, type of information. The attributes in the junk dimension do not belongs to the fact table. It contains a Unique key for all possible combinations of flags and use that unique key in the fact table.
This is not relegated to any Data warehouse schema and is used only for reference.

Example:

Degenerated Dimension
It is a Dimension which is derived from fact tables and does not have its Entry in Dimension Tables.

Example:

In above table contains Fact Internet Sales information directly taken from OLTP Database. We will create dimension table depends on Fact Table like this

Note: IF number of Rows in Fact Table = Number of Rows in Dimension Table (That time De-Generated Dimension is not possible)

Slowly Changing Dimension
* link as shown below *
Slowly changing dimension & scd-types URL Link

Tuesday 8 April 2014

Slowly Changing Dimension (SCD) & Types of SCD's

Slowly Changing Dimension

Dimension source data may change over time, rather than changing on regular schedule, time base. In Datawarehouse there is need to track changes in dimension attributes in order to report historical data.

Slowly Changing Dimensions are often categorized into three types namely Type1, Type2 and Type3. The Following Example clearly explains the types of SCD’s.

Type 1: Updates existing record with modifications. (Does not Maintains History)

In the year of 2012, if the salary of the person to 19000, then the old values of the columns “Year” and “Salary” have to be updated with the new values. In this Type1, there is no way to find out the old salary of person Person Id – 1 in year 2012 since the table now contains only the new salary and year information.

Type 2: Creating an additional record (Does Maintains History)

In this Type 2, the old values will not be replaced but new values are stored in to another row of table. So at any point of time, the difference the old values and new values can be retrieved from database and early be compared. This is very useful for Real Time for reporting purposes.

Type 3: Creating New Fields. Keep old and new values in the existing row (Requires a design Change)

In this Type 3, the latest update to changed values can be seen. Example mentioned below how to add new columns and maintains old columns of that column changes as shown below.

The problem with the Type3 used in Real Time, if the salary of person1 changes, then added new columns to the same row that type u can change structure of cube every time.