If you're looking for SSAS Interview Questions for Experienced or Freshers, you are in right place. There are a lot of opportunities from many reputed companies in the world. According to research, SSAS has a market share of about 26.35%.
So, You still have the opportunity to move ahead in your career in the SSAS certification guide. Mindmajix offers Advanced SSAS Interview Questions 2024 that help you in cracking your interview & acquire a dream career as SSAS Developer.
If you want to enrich your career and become a professional in SSAS, then enroll in "SSASTraining" - This course will help you to achieve excellence in this domain. |
Data Mining Architecture & Techniques | ||
Resources | Explanation | |
Architecture | Data sources | WWW or Internet is a big source of Data |
Database | The database contains data that is ready to be processed | |
Data mining engine | It contains modules used to perform data mining tasks | |
Pattern evaluation module | It mentions the measure of the differentiation of patterns | |
GUI | It is the communication between user and data mining system | |
Knowledge Base | It is a guide for result patterns, based on models it interacts | |
Types Of Mining Architecture | No-coupling Data Mining | It retrieves data from particular data sources |
Loose Coupling Data Mining | It retrieves data from a particular database | |
Semi-Tight Coupling Data Mining | In this, it uses few features of data warehouse sys | |
Tight Coupling Data mining | The data layer, Data mining application layer, Front-end layer | |
Data Mining Techniques | Decision Trees | A common technique used for mining, The root acts as a condition |
Sequential patterns | It is used to identify events, similar patterns of transaction data | |
Clustering | By an automatic method, similar characteristics clusters have to form | |
Prediction | It defines the relationship between dependent & independent instances | |
Association | It is a relation technique, used to recognize the pattern | |
Classification | It depends on ML, Used to the classified item of the particular set to predefined groups | |
Technological Drivers | Database size | For maintaining & Processing data we need powerful sys |
Query complexity | It is used to analyze complex queries in a large number. |
Microsoft SQL Server 2014 Analysis Services (SSAS) delivers online analytical processing (OLAP) and data mining functionality for business intelligence applications. Analysis Services supports OLAP by letting us design, create, and manage multidimensional structures that contain data aggregated from other data sources, such as relational databases. For data mining applications, Analysis Services lets us design, create, and visualize data mining models that are constructed from other data sources by using a wide variety of industry-standard data mining algorithms.
Analysis Services is a middle-tier server for analytical processing, OLAP, and Data mining.
It manages multidimensional cubes of data and provides access to heaps of information including aggregation of data. One can create data mining models from data sources and use them for Business Intelligence also including reporting features.
Analysis service provides a combined view of the data used in OLAP or Data mining. Services here refer to OLAP, Data mining. Analysis services assist in creating, designing, and managing multidimensional structures containing data from varied sources. It provides a wide array of data mining algorithms for specific trends and needs.
Some of the key features are:
1. Ease of use with a lot of wizards and designers.
2. Flexible data model creation and management
3. Scalable architecture to handle OLAP
4. Provides integration of administration tools, data sources, security, caching, and reporting, etc.
5. Provides extensive support for custom applications
Related Article: SSAS Tutorial |
Explore - OLTP VS OLAP |
A-DATA SOURCE contains the connection information used by SSAS to connect to the underlying database to load the data into SSAS during processing. A Data Source primarily contains the following information (apart from various other properties like Query timeout, Isolation, etc.):
SSAS Supports both .Net and OLE DB Providers. Following are some of the major sources supported by SSAS: SQL Server, MS Access, Oracle, Teradata, IBM DB2, and another relational database with the appropriate OLE DB provider.
Impersonation allows SSAS to assume the identity/security context of the client application which is used by SSAS to perform the server-side data operations like data access, PROCESSING, etc. As part of impersonation, the following options are available in SSAS:
A-DATA SOURCE VIEW (DSV) is a logical view of the underlying database schema and offers a layer of abstraction for the underlying database schema. This layer acts as a source for SSAS and captures the scheme-related information from the underlying database. The schematic information present in DSV includes the following:
A Named Calculation is a new column added to a Table in DSV and is based on an expression. This capability allows you to add an extra column into your DSV which is based on one or more columns from the underlying data source Table(s)/View(s) combined using an expression without requiring the addition of a physical column in the underlying database Table(s)/View(s).
The expression used in the Named Calculation should conform to the underlying data source dialect. For example, if the underlying data source is SQL Server, then it should conform to T-SQL, If is it Oracle, then it should conform to PL/SQL, etc…
Named Calculations can be used in many scenarios, following are some of the common scenarios:
(
SELECT Country
FROM Address
WHERE AddressID = Employee.AddressID
)
Following are some of the pros and cons of using Tables and Named Queries in DSV. Tables in the below comparison refer to the Table in DSV which references a single Table or a View in the underlying source database.
Tables | Named Queries |
Named Calculations can be added to Tables in DSV. | Named Calculations cannot be added to Named Queries in DSV. |
Named Calculations cannot be added to Named Queries in DSV. | Keys and Relationships have to be set explicitly in the DSV. |
Only one Table/View from the underlying data source can be referenced in DSV. | More than one Table/View from the underlying data source can be referenced using a SQL Expression in the DSV. |
Any filter/limiting conditions cannot be applied on a table in DSV. | Filter/limiting conditions can be applied as part of the SQL expression in the Named Query in the DSV. |
Although Named Calculations and Named Queries can be used to extend the functionality of SSAS to address the evolving business needs, it is always a good practice to first build a good DIMENSIONAL MODEL at the beginning of a Data Warehousing/SSAS project.
Explore - SSAS Processing |
The role of a Unified Dimensional Model (UDM) is to provide a bridge between the user and the data sources. A UDM is constructed over one or more physical data sources, and then the end-user issues query against the UDM using one of a variety of client tools, such as Microsoft Excel.
At a minimum, when the UDM is constructed merely as a thin layer over the data source, the advantages to the end-user are a simpler, more readily understood model of the data, isolation from heterogeneous backend data sources, and improved performance for summary type queries. In some scenarios, a simple UDM like this is constructed totally automatically. With greater investment in the construction of the UDM, additional benefits accrue from the richness of metadata that the model can provide.
The UDM provides the following benefits:
Analysis Services is the only component in SQL Server using which we can perform Analysis and Forecast operations.
A database is called an OLAP Database if the database satisfies the FASMI rules :
The difference between a derived measure and a calculated measure is when the calculation is performed. A derived measure is calculated before aggregations are created, and the values of the derived measure are stored in the cube. A calculated measure is calculated after aggregations are created, and the values of a calculated measure aren’t stored in the cube. The primary criterion for choosing between a derived measure and a calculated measure is not efficiency, but accuracy.
A partition in Analysis Services is the physical location of stored cube data. Every cube has at least one partition by default. Each time we create a measure group, another partition is created. Queries run faster against a partitioned cube because Analysis Services only needs to read data from the partitions that contain the answers to the queries.
Queries run even faster when partition also stores aggregations, the pre-calculated totals for additive measures. Partitions are a powerful and flexible means of managing cubes, especially large cubes.
Nonempty behavior is an important property for ratio calculations. If the denominator Is empty, and MDX expression will return an error just as it would if the denominator Were equal to zero. By selecting one or more measures for the Non-Empty Behavior property, we are establishing a requirement that each selected measure first be evaluated before the calculation expression is evaluated. If each selected measure is empty, then The expression is also treated as empty and no error is returned.
Under normal circumstances, each level in a hierarchy in Microsoft SQL Server Analysis Services (SSAS) has the same number of members above it as any other member at the same level. In a ragged hierarchy, the logical parent member of at least one member is not in the level immediately above the member. When this occurs, the hierarchy descends to different levels for different drill-down paths. Expanding through every level for every drill down the path is then unnecessarily complicated.
The role of an Analysis Services information worker is the traditional “domain expert” role in business intelligence (BI) someone who understands the data employed by a solution and is able to translate the data into business information. The role of an Analysis Services information worker often has one of the following job titles: Business Analyst (Report Consumer), Manager (Report Consumer), Technical Trainer, Help Desk/Operation, or Network Administrator.
We can create aggregations for faster MDX statements using Aggregation Wizard or thru UBO – Usage-Based Optimizations. Always, prefer the UBO method in real-time performance troubleshooting.
The Enable/Disable Writeback dialog box enables or disables writeback for a measure group in a cube. Enabling writeback on a measure group defines a writeback partition and creates a writeback table for that measure group. Disabling writeback on a measure group removes the writeback partition but does not delete the writeback table, to avoid unanticipated data loss.
Processing is a critical and resource-intensive operation in the data warehouse lifecycle and needs to be carefully optimized and executed. Analysis Services offers high performance and scalable processing architecture with a comprehensive set of controls for database administrators.
We can process an OLAP database, individual cube, Dimension, or a specific Partition in a cube.
The following table lists the business intelligence enhancements that are available in Microsoft SQL Server Analysis Services (SSAS). The table also shows the cube or dimension to which each business intelligence enhancement applies, and indicates whether an enhancement can be applied to an object that was created without using a data source and for which no schema has been generated.
Enhancement | Type | Applied to | No data source | |
Time Intelligence | Cube | Cube | No | |
Account Intelligence | Dimension | Dimension or cube | No | |
Dimension Intelligence |
| Dimension or cube | Yes | |
Custom Aggregation |
| Dimension (unary operator) or cube | No | |
Semiadditive Behavior | Cube | Cube | Yes | |
Custom Member Formula | Dimension | Dimension or cube | No | |
Custom Sorting and Uniqueness Settings | Dimension | Dimension or cube | Yes | |
Dimension Writeback | Dimension | Dimension or cube | Yes |
This is a great question because you only know this answer by experience. If you ask me this question, the answer practically rushes out of me. “CrossJoin, Descendants, and NonEmpty, in addition to Sum, Count, and Aggregate. My personal favorite is CrossJoin because it allows me to identify non-contiguous slices of the cube and aggregate even though those cube cells don’t roll up to a natural ancestor.” Indeed, CrossJoin has easily been my bread and butter.
The reflexive answer is “in the Measures dimension” but this is the obvious answer. So I always follow up with another question. “If you want to create a calculated member that intersects all measures, where do you put it?” A high percentage of candidates can’t answer this question, and the answer is “In a dimension other than Measures.” If they can answer it, I immediately ask them why. The answer is “Because a member in a dimension cannot intersect its own relatives in that dimension.”
Simply using the bottom count will return customers with null sales. You will have to combine it with NONEMPTY or FILTER.
SELECT { [Measures].[Internet Sales Amount] } ON COLUMNS ,
BOTTOMCOUNT(
NONEMPTY(DESCENDANTS( [Customer].[Customer Geography].[All Customers]
, [Customer].[Customer Geography].[Customer] )
, ( [Measures].[Internet Sales Amount] ) )
, 10
, ( [Measures].[Internet Sales Amount] )
) ON ROWS
FROM [Adventure Works]
WHERE ( [Date].[Calendar].[Calendar Year].&[2003] ) ;
By default Analysis Services returns members in an order specified during attribute design. Attribute properties that define ordering are “OrderBy” and “OrderByAttribute”. Let's say we want to see order counts for each year. In Adventure Works MDX query would be:
SELECT {[Measures].[Reseller Order Quantity]} ON 0
, [Date].[Calendar].[Calendar Year].Members ON 1
FROM [Adventure Works];
Same query using TopCount:
SELECT
{[Measures].[Reseller Order Quantity]} ON 0,
TopCount([Date].[Calendar].[Calendar Year].Members,3, [Measures].[Reseller Order Quantity]) ON 1
FROM [Adventure Works];
Use could usefunctionSet.Item(0)
Example:
SELECT {{[Date].[Calendar].[Calendar Year].Members
}.Item(0)}
ON 0
FROM [Adventure Works]
You can use the ALTER CUBE statement.
Syntax:
ALTER CUBE CurrentCube | YourCubeName UPDATE DIMENSION , DEFAULT_MEMBER=”;
A data mart is a subset of an organizational data store, usually oriented to a specific purpose or major data subject that may be distributed to support business needs. Data marts are analytical data stores designed to focus on specific business functions for a specific community within an organization.
Data marts are often derived from subsets of data in a DATA WAREHOUSE, though in the bottom-up data warehouse design methodology the data warehouse is created from the union of organizational data marts.
They are 3 types of data mart they are
A data warehouse is complete data whereas Datamart is a subset of the same.
Example:
All the organization data may be related to the finance department, HR, banking dept are stored in a data warehouse whereas in data mart only finance data or HR department data will be stored. So data warehouse is a collection of different data marts.
We need to identify the bottlenecks to tune the performance, to overcome the bottleneck we need to follow the following.
This question is either to test whether you are really experienced or when he does not have any questions to ask ..
You can tell any area where you feel difficult to work. But always the best answers will be the following.
Steps to create a cube in SSAS
The data source is the Physical Connection information that the analysis service uses to connect to the database that hosts the data. The data source contains the connection string which specifies the server and the database hosting the data as well as any necessary authentication credentials.
A data source view is a persistent set of tables from a data source that supplies the data for a particular cube. BIDS also includes a wizard for creating data source views, which you can invoke by right-clicking on the Data Source Views folder in Solution Explorer.
A named calculation is a SQL expression represented as a calculated column. This expression appears and behaves as a column in the table. A named calculation lets you extend the relational schema of existing tables or views in a data source view without modifying the tables or views in the underlying data source.
Named calculation is used to create a new column in the DSV using hardcoded values or by using existing columns or even with both.
The named query in DSV is similar to View in Database. This is used to create a Virtual table in DSV which will not impact the underlying database. A named query is mainly used to merge the two or more tables in the data source view or to filter the columns of a table.
A named query is used to join multiple tables, to remove unnecessary columns from a table of a database. You can achieve the same in the database using Views but this Named Queries will be the best bet when you don’t have access to create Views in the database.
By using named calculations we can add a new column to an existing table in the data source view. Named Calculation is explained above.
A dimension table contains hierarchical data by which you’d like to summarize. A dimension table contains specific business information, a dimension table that contains the specific name of each member of the dimension. The name of the dimension member is called an “attribute”
The key attribute in the dimension must contain a unique value for each member of the dimension. This key attribute is called the “primary key column”
The primary key column of each dimension table corresponding to one of the key columns in any related fact table.
A fact table contains the basic information that you wish to summarize. The table that stores the detailed value for a measure is called the fact table. In simple and best we can define as “The table which contains METRICS” that are used to analyze the business.
It consists of 2 sections
This is a very important interview question. The “Factless Fact Table” is a table that is similar to Fact Table except for having any measure; I mean that this table just has the links to the dimensions. These tables enable you to track events; indeed they are for recording events.
Factless fact tables are used for tracking a process or collecting stats. They are called so because the fact table does not have aggregatable numeric values or information. They are mere key values with reference to the dimensions from which the stats can be collected
Attribute relationships are the way of telling the analysis service engine that how the attributes are related to each other. It will help to relate two or more attributes to each other. Processing time will be decreased if proper relationships are given. This increases the Cube Processing performance and MDX query performance too.
In Microsoft SQL Server Analysis Services, attributes within a dimension are always related either directly or indirectly to the key attribute. When you define a dimension based on a star schema, which is where all dimension attributes are derived from the same relational table, an attribute relationship is automatically defined between the key attribute and each non-key attribute of the dimension. When you define a dimension based on a snowflake schema, which is where dimension attributes are derived from multiple related tables, an attribute relationship is automatically defined as follows:
They are 2 types of attribute relationships they are
Rigid: In Rigid relationships where the relationship between the attributes is fixed, attributes will not change levels or their respective attribute relationships.
Example: The time dimension. We know that month “January 2009″ will ONLY belong to Year “2009″ and it won't be moved to any other year.
Flexible: In Flexible relationship between the attributes is changed.
Example: An employee and department. An employee can be in the accounts department today but it is possible that the employee will be in the Marketing department tomorrow.
They are 3 types of dimensions:
Confirm dimension: It is the dimension that is sharable across the multiple facts or data model. This is also called Role Playing Dimensions.
Junk dimension: A number of very small dimensions might be lumped (a small irregularly shaped) together to form a single dimension, a junk dimension – the attributes are not closely related. Grouping Random flags and text Attributes in a dimension and moving them to a separate sub dimension is known as the junk dimension.
Degenerated dimension: In this degenerate dimension contains their values in the fact table and the dimension is not available in the dimension table. Degenerated Dimension is a dimension key without a corresponding dimension.
Example: In the PointOfSale Transaction Fact table, we have:
Date Key (FK), Product Key (FK), Store Key (FK), Promotion Key (FP), and POS Transaction Number
Date Dimension corresponds to Date Key, Production Dimension corresponds to Production Key. In a traditional parent-child database, POS Transactional Number would be the key to the transaction header record that contains all the info valid for the transaction as a whole, such as the transaction date and store identifier. But in this dimensional model, we have already extracted this info into other dimensions. Therefore, POS Transaction Number looks like a dimension key in the fact table but does not have the corresponding dimension table.
They are 3 types of database schema they are
Explore - Star schema and Snowflake schema in QlikView |
We can hide the attribute by selecting “AttributeHierarchyVisible = False” in the properties of the attribute.
By selecting “ AttributeHierarchyEnabled = False”, we can make an attribute, not in process.
In Analysis Service we generally see all dimension has All member. This is because of the IsAggregatable property of the attribute. You can set its value to false so that it will not show All members. It's the default member for that attribute.
If you hide this member then you will have to set other attribute values to the default member else it will pick some value as default and this will create confusion in browsing data if someone is not known to change in default member.
A key column of any attribute: Contains the column or columns that represent the key for the attribute, which is the column in the underlying relational table in the data source view to which the attribute is bound. The value of this column for each member is displayed to users unless a value is specified for the NameColumn property.
Name column of an attribute: Identifies the column that provides the name of the attribute that is displayed to users, instead of the value in the key column for the attribute. This column is used when the key column value for an attribute member is cryptic or not otherwise useful to the user, or when the key column is based on a composite key. The NameColumn property is not used in parent-child hierarchies; instead, the NameColumn property for child members is used as the member names in a parent-child hierarchy.
Value columns of an attribute: Identifies the column that provides the value of the attribute. If the NameColumn element of the attribute is specified, the same DataItem values are used as default values for the ValueColumn element. If the NameColumn element of the attribute is not specified and the KeyColumns collection of the attribute contains a single KeyColumn element representing a key column with a string data type, the same DataItem values are used as default values for the ValueColumn element.
A hierarchy is a very important part of any OLAP engine and allows users to drill down from summary levels hierarchies represent the way users expect to explore data at a more detailed level
Hierarchy is made up of multiple levels creating the structure based on end-user requirements.
->years->quarter->month-> week, are all the levels of calendar hierarchy
They are 2 types of hierarchies are
Natural hierarchy: This means that the attributes are intuitively related to one another. There is a clear relationship from the top of the hierarchy to the bottom.
Example: An example of this would be date: year, quarter and month follow from each other, and in part, define each other.
Unnatural hierarchy: This means that the attributes are not clearly related.
Example: An example of this might be geography; we may have a country -> state -> city, but it is not clear where Province might sit.
An attribute hierarchy is created for every attribute in a dimension, and each hierarchy is available for dimensioning fact data. This hierarchy consists of an “All” level and a detail level containing all members of the hierarchy.
you can organize attributes into user-defined hierarchies to provide navigation paths in a cube. Under certain circumstances, you may want to disable or hide some attributes and their hierarchies.
AttributeHierarchyDisplayFolder: Identifies the folder in which to display the associated attribute hierarchy to end-users. For example, if I set the property value as “Test” to all the Attributes of a dimension then a folder with the name “Test” will be created and all the Attributes will be placed into the same.
AttributeHierarchyEnabled: Determines whether an attribute hierarchy is generated by Analysis Services for the attribute. If the attribute hierarchy is not enabled, the attribute cannot be used in a user-defined hierarchy and the attribute hierarchy cannot be referenced in Multidimensional Expressions (MDX) statements.
AttributeHierarchyOptimizedState: Determines the level of optimization applied to the attribute hierarchy. By default, an attribute hierarchy is FullyOptimized, which means that Analysis Services builds indexes for the attribute hierarchy to improve query performance. The other option, NotOptimized, means that no indexes are built for the attribute hierarchy. Using NotOptimized is useful if the attribute hierarchy is used for purposes other than querying because no additional indexes are built for the attribute. Other uses for an attribute hierarchy can be helping to order another attribute.
Determines whether the associated attribute hierarchy is ordered. The default value is True. However, if an attribute hierarchy will not be used for querying, you can save processing time by changing the value of this property to False.
AttributeHierarchyVisible: Determines whether the attribute hierarchy is visible to client applications. The default value is True. However, if an attribute hierarchy will not be used for querying, you can save processing time by changing the value of this property to False.
There are three standard storage modes in OLAP applications
MOLAP (Multidimensional Online Analytical Processing): MOLAP is the most used storage type. It's designed to offer maximum query performance to the users. the data and aggregations are stored in a multidimensional format, compressed and optimized for performance. This is both good and bad. When a cube with MOLAP storage is processed, the data is pulled from the relational database, the aggregations are performed, and the data is stored in the AS database. The data inside the cube will refresh only when the cube is processed, so latency is high.
Advantages:
ROLAP (Relational Online Analytical Processing): ROLAP does not have a high latency disadvantage of MOLAP. With ROLAP, the data and aggregations are stored in relational format. This means that there will be zero latency between the relational source database and the cube.
The disadvantage of this mode is the performance, this type gives the poorest query performance because no objects benefit from multidimensional storage.
Advantages:
Hybrid Online Analytical Processing (HOLAP): HOLAP is a combination of MOLAP and ROLAP. HOLAP stores the detailed data in the relational database but stores the aggregations in multidimensional format. Because of this, the aggregations will need to be processed when changes are occurring. With HOLAP you kind of have medium query performance: not as slow as ROLAP, but not as fast as MOLAP. If, however, you were only querying aggregated data or using a cached query, query performance would be similar to MOLAP. But when you need to get that detailed data, performance is closer to ROLAP.
Advantages:
Translation: The translation feature in the analysis service allows you to display caption and attribute names that correspond to a specific language. It helps in providing GLOBALIZATION to the Cube.
All the dimensions that are created using the NEW DIMENSION Wizard are database dimensions. In other words, the dimensions which are at the Database level are called Database Dimensions.
A cube dimension is an instance of a database dimension within a cube is called a cube dimension. A database dimension can be used in multiple cubes, and multiple cube dimensions can be based on a single database dimension
To add a dimension to a cube follow these steps.
Slowly changing dimensions (SCD) determine how the historical changes in the dimension tables are handled. Implementing the SCD mechanism enables users to know to which category an item belonged to in any given date.
It is a concept of STORING Historical Changes and whenever an IT guy finds a new way to store then a new Type will come into the picture. Basically, there are 3 types of SCD they are given below
Roleplay dimensions: We already discussed this. This is nothing but CONFIRMED Dimensions. A dimension can play a different role in a fact table you can recognize a roleplay dimension when there are multiple columns in a fact table that each has foreign keys to the same dimension table.
Ex1: There are three dimension keys in the factinternalsales, factresellersales tables which all refer to the dim time table, the same time dimension is used to track sales by that contain either of these fact tables, the corresponding role-playing dimension is automatically added to the cube.
Ex2: In retail banking, for checking account cube we could have transaction date dimension and effective date dimension. Both dimensions have a date, month, quarter, and year attributes. The formats of attributes are the same on both dimensions, for example, the date attribute is in ‘dd-mm-yyyy’ format. Both dimensions have members from 1993 to 2010.
Measure groups: These measure groups can contain different dimensions and be at different granularity but so long as you model your cube correctly, your users will be able to use measures from each of these measure groups in their queries easily and without worrying about the underlying complexity.
Creating multiple measure groups: To create a new measure group in the Cube Editor, go to the Cube Structure tab and right-click on the cube name in the Measures pane, and select ‘New Measure Group’. You’ll then need to select the fact table to create the measure group from and then the new measure group will be created; any columns that aren’t used as foreign key columns in the DSV will automatically be created as measures, and you’ll also get an extra measure of aggregation type Count. It’s a good idea to delete any measures you are not going to use at this stage.
Measures: Measures are the numeric values that our users want to aggregate, slice, dice, and otherwise analyze, and as a result, it’s important to make sure they behave the way we want them to. One of the fundamental reasons for using Analysis Services is that unlike a relational database it allows us to build into our cube design business rules about measures: how they should be formatted, how they should aggregate up, how they interact with specific dimensions, and so on.
An attribute is a specification that defines a property of an object, element, or file. It may also refer to or set the specific value for a given instance of such.
A surrogate key is the SQL generated key which acts as an alternate primary key for the table in the database, Data warehouses commonly use a surrogate key to uniquely identify an entity. A surrogate is not generated by the user but by the system. A primary difference between a primary key and a surrogate key in a few databases is that the primary key uniquely identifies a record while a Surrogate key uniquely identifies an entity.
Ex: An employee may be recruited before the year 2000 while another employee with the same name may be recruited after the year 2000. Here, the primary key will uniquely identify the record while the surrogate key will be generated by the system (say a serial number) since the SK is NOT derived from the data.
They are six relations between the dimension and measure group, they are
No relationship: The dimension and measure group are not related.
Regular: The dimension table is joined directly to the fact table.
Referenced: The dimension table is joined to an intermediate table, which in turn, is joined to the fact table.
Many to many: The dimension table is to an intermediate fact table, the intermediate fact table is joined, in turn, to an intermediate dimension table to which the fact table is joined.
Data mining: The target dimension is based on a mining model built from the source dimension. The source dimension must also be included in the cube.
Fact table: The dimension table is the fact table.
Calculations are an item in the cube that is evaluated at runtime
Calculated members: You can create customized measures or dimension members, called calculated members, by combining cube data, arithmetic operators, numbers, and/or functions.
Example: You can create a calculated member called Marks that converts dollars to marks by multiplying an existing dollar measure by a conversion rate. Marks can then be displayed to end-users in a separate row or column. Calculated member definitions are stored, but their values exist only in memory. In the preceding example, values in marks are displayed to end-users but are not stored as cube data.
In Analysis Services, af is a collection of calculations that are associated with a measure group in a cube that is used to evaluate business success. We use KPI to see the business at the particular point, this is represented with some graphical items such as traffic signals, ganze, etc.
Actions are a powerful way of extending the value of SSAS cubes for the end-user. They can click on a cube or portion of a cube to start an application with the selected item as a parameter, or to retrieve information about the selected item.
One of the objects supported by a SQL Server Analysis Services cube is the action. An action is an event that a user can initiate when accessing cube data. The event can take a number of forms. For example, a user might be able to view a Reporting Services report, open a Web page, or drill through to detailed information related to the cube data
Analysis Services supports three types of actions.
Report action: Report action Returns a Reporting Services report that is associated with the cube data on which the action is based.
Drill through: Drillthrough Returns a result set that provides detailed information related to the cube data on which the action is based.
Standard: Standard has five action subtypes that are based on the specified cube data.
Dataset: Returns a multidimensional dataset.
Proprietary: Returns a string that can be interpreted by a client application.
Rowset: Returns a tabular rowset.
Statement: Returns a command string that can be run by a client application.
URL: Returns a URL that can be opened by a client application, usually a browser.
You can use the Partition Wizard to define partitions for a measure group in a cube. By default, a single partition is defined for each measure group in a cube. Access and processing performance, however, can degrade for large partitions. By creating multiple partitions, each containing a portion of the data for a measure group, you can improve the access and processing performance for that measure group.
In 2005 a MAX of 2000 partitions can be created per measure group and that limit is lifted in later versions.
In any version, the MINIMUM is ONE Partition per measure group.
Aggregations provide performance improvements by allowing Microsoft SQL Server Analysis Services (SSAS) to retrieve pre-calculated totals directly from cube storage instead of having to recalculate data from an underlying data source for each query. To design these aggregations, you can use the Aggregation Design Wizard. This wizard guides you through the following steps:
Perspectives are a way to reduce the complexity of cubes by hidden elements like measure groups, measures, dimensions, hierarchies, etc. It’s nothing but slicing off a cube, forex we are having retail and hospital data, and the end-user is subscribed to see only hospital data, then we can create perspective according to it.
Build: Verifies the project files and creates several local files.
Deploy: Deploy the structure of the cube(Skeleton) to the server.
Process: Read the data from the source and build the dimensions and cube structures
Elaborating the same is given below.
Build: It is used to process the data of the cube database. The build is a version of a PROGRAM. As a rule, a build is a pre-release version and as such is identified by a build number, rather than by a release number. Reiterative (repeated) builds are an important part of the development process. Throughout development, application components are collected and repeatedly COMPILED for testing purposes, to ensure a reliable final product. Build tools, such as MAKE or ANT, enable developers to automate some programming tasks. As a verb, to build can mean either to write CODE or to put individual coded components of a program together.
Deployment: During the development of an Analysis Services project in the Business Intelligence Development Studio, you frequently deploy the project to a development server in order to create the Analysis Services database defined by the project. This is required to test the project.
for example, to browse cells in the cube, browse dimension members, or verify key performance indicators (KPIs) formulas.
The maximum size of the dimension is 4 GB.
The basic unit of storage and analysis in Analysis Services is the cube. A cube is a collection of data that’s been aggregated to allow queries to return data quickly.
For example, a cube of order data might be aggregated by time period and by title, making the cube fast when you ask questions concerning orders by week or orders by title.
Explore - SSAS Analysis Services Cube Deployment Methods |
The full form of AMO is Analysis Management Objects. This is used to create or alter cubes from the .NET code.
SSIS | Power BI |
SSRS | SQL Server |
SCCM | SQL Server DBA |
SharePoint | BizTalk Server |
Team Foundation Server | BizTalk Server Administrator |
Name | Dates | |
---|---|---|
SSAS Training | Nov 02 to Nov 17 | View Details |
SSAS Training | Nov 05 to Nov 20 | View Details |
SSAS Training | Nov 09 to Nov 24 | View Details |
SSAS Training | Nov 12 to Nov 27 | View Details |
Ravindra Savaram is a Technical Lead at Mindmajix.com. His passion lies in writing articles on the most popular IT platforms including Machine learning, DevOps, Data Science, Artificial Intelligence, RPA, Deep Learning, and so on. You can stay up to date on all these technologies by following him on LinkedIn and Twitter.