SAS is one of the most widely used software tools in the data analytics industry. It helps you build strong data analytics skills, and it can also be used for report writing. This article gives a brief overview of SAS and data analytics.
The science of drawing insights from raw information sources is known as data analytics. It uses qualitative and quantitative techniques and processes with the aim of improving productivity and business gain. These techniques help enterprises reveal trends and metrics that would otherwise be lost in the mass of information. Data is first extracted and then categorized so that behavioral data and patterns can be identified and analyzed.
Data Analysis techniques vary according to business requirements. Data analytics is primarily used in business-to-consumer (B2C) applications. Global enterprises collect and analyze data that has been obtained from various customers, business processes, market economics, or practical experience. All of this helps in thorough decision-making.
Data analytics can be categorized into four basic types. They are as follows.
Descriptive analytics
It gives an idea of what has happened over a given period of time. Was there an increase in the number of views? Are sales this month stronger than last month?
Diagnostic analytics
It is more focused on why something has happened. It includes more diverse data inputs and involves a bit of hypothesizing. Were the beer sales affected by the weather? Did the latest marketing campaign have an impact on sales?
Predictive analytics
It includes the analysis of what is likely to happen in the near term. Were the sales affected the last time we had a hot summer? How many weather models are predicting a hot summer this year?
Prescriptive analytics
This moves into the territory of suggesting a plan of action. If the likelihood of a hot summer, averaged over these five weather models, is above 58%, then we should add an evening shift to the brewery and rent an additional tank in order to increase output.
SAS can be considered an integrated system of software solutions. It is an application suite that can mine, manage, alter, and retrieve data gathered from a wide range of sources and then perform statistical analysis on it. A graphical point-and-click user interface is provided for non-technical users, and more advanced options can be accessed through the SAS language.
The SAS software suite comprises a number of components. Some of the SAS components are:
Base SAS is the most widely used component of SAS. It provides a data management facility, and you can perform data analysis with it. It offers a very flexible, extensible fourth-generation programming language and a web-based interface for data access, transformation, and reporting. It is considered the foundation for all SAS software.
Apart from the easy-to-learn and flexible programming language, you get a web-based programming interface; ready-to-use programs for information storage and retrieval, data manipulation, reporting, and descriptive statistics; a centralized metadata repository; and a macro facility that helps reduce programming time and maintenance problems.
Base SAS provides many benefits such as:
With SAS/GRAPH, you can represent data in the form of graphs, which makes data visualization easier. SAS/GRAPH is vast, and it is a very powerful tool for creating a wide range of business and scientific graphs. A graph presents data in a way that is distinctly different from tables made of rows and columns of text and numbers, and SAS/GRAPH lets you do this in a dazzling array of colors and with two-dimensional (2-D) and three-dimensional (3-D) shapes.
The sheer size and complexity of SAS/GRAPH speak to its power. Apart from the many kinds of charts and graphs, starting from a simple plot of data on XY axes, the SAS user can present data as horizontal and vertical bars, pie charts, stars, and many other graphical forms. With these choices, there are tens of thousands of ways to change the look of a graph using options, symbols, and other means.
Graphing in SAS can seem intimidating to a new programmer, since it occupies a whole wing of the house of SAS. There are many different ways to change the size, look, and presentation of data within graphical output using SAS/GRAPH.
There are five SAS/GRAPH statistical graphics procedures, each of which has been designed with a specific purpose in mind.
The SGPLOT
This procedure is designed to create single-cell graphs in which multiple plots are overlaid on a single set of axes. Its syntax supports most of the different kinds of plots and graph features.
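As a minimal sketch of overlaying plots on one set of axes, the step below draws a grouped scatter plot with a regression line on top. It assumes the SASHELP.CLASS sample data set that ships with SAS is available; the axis labels are illustrative only.

```sas
/* Overlay a scatter plot and a fitted regression line on one set of axes. */
/* Assumes the SASHELP.CLASS sample data set is installed.                 */
proc sgplot data=sashelp.class;
  scatter x=height y=weight / group=sex;   /* one marker style per group   */
  reg x=height y=weight / nomarkers;       /* fit line only, no extra dots */
  xaxis label="Height";
  yaxis label="Weight";
run;
```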
The SGPANEL
This procedure is useful in creating classification panels for one or more classification variables. Every graph cell in the panel can have different kinds of plots - simple plots, multiple plots, or overlaid plots.
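A small sketch of a classification panel, again assuming the SASHELP.CLASS sample data set: each value of the classification variable gets its own cell, and each cell holds two overlaid plots.

```sas
/* One panel cell per value of SEX; each cell overlays a histogram */
/* and a density curve of HEIGHT.                                  */
proc sgpanel data=sashelp.class;
  panelby sex / columns=2;
  histogram height;
  density height;
run;
```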
The SGSCATTER
This procedure is useful in creating paneled graphs containing multiple scatter plots. One has the ability to create three different types of layouts.
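The three layouts correspond to the PLOT, COMPARE, and MATRIX statements. The sketch below shows one call of each, assuming the SASHELP.CLASS sample data set; the choice of variables is arbitrary.

```sas
/* PLOT: independent scatter plots, each with its own axes. */
proc sgscatter data=sashelp.class;
  plot weight*height weight*age;
run;

/* COMPARE: plots that share a common axis. */
proc sgscatter data=sashelp.class;
  compare x=(age height) y=weight;
run;

/* MATRIX: every variable plotted against every other variable. */
proc sgscatter data=sashelp.class;
  matrix age height weight / group=sex;
run;
```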
The SGRENDER
This procedure can be considered as a utility procedure that helps in outputting graphs from templates that are written in the Graph Template Language.
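To illustrate, the sketch below first defines a small Graph Template Language template and then renders it with SGRENDER. The template name `scatter_sketch` is hypothetical, and the data set is assumed to be the SASHELP.CLASS sample.

```sas
/* Define a graph template (the name scatter_sketch is made up). */
proc template;
  define statgraph scatter_sketch;
    begingraph;
      entrytitle "Weight by Height";
      layout overlay;
        scatterplot x=height y=weight;
      endlayout;
    endgraph;
  end;
run;

/* Render the template against a data set. */
proc sgrender data=sashelp.class template=scatter_sketch;
run;
```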
The SGDESIGN
This procedure is useful in creating graphical output which is based on a graph file that has been created by using the SAS/GRAPH ODS Graphics Designer application.
SAS/STAT helps you perform various statistical analyses, such as analysis of variance, regression, multivariate, survival, and psychometric analysis. It serves both specialized and enterprise-wide statistical needs, ranging from traditional analysis of variance and linear regression to Bayesian inference and high-performance modeling tools for very large data.
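As one small example of the regression analysis mentioned above, the following sketch fits a linear model with PROC REG. It assumes that SAS/STAT is licensed and that the SASHELP.CLASS sample data set is available; the choice of predictors is purely illustrative.

```sas
/* Linear regression of WEIGHT on HEIGHT and AGE. */
proc reg data=sashelp.class;
  model weight = height age;
run;
quit;   /* PROC REG is interactive, so QUIT ends it */
```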
SAS/STAT software provides statistical techniques. These techniques are used for applications that span industries. Let’s have a brief look at a couple of them below.
Manufacturing: Helps in identifying all the important factors which go into the manufacturing of a machine.
Food: Helps in identifying and targeting the right audience for a new food item.
Telecommunications: Helps in determining which factors are involved in communication with a pilot at an airport.
Government: Helps in predicting the opinions of the public using statistical sampling techniques.
Environmental Research: Helps in describing air pollution patterns with the use of spatial statistics.
Health: Helps in determining factors influencing the healthcare of patients.
Retail: Helps in predicting customer behavior on the launch of new products.
SAS/STAT provides many benefits such as:
SAS/ETS is more suitable for Time Series Analysis. SAS/ETS provides a wide range of forecasting, time series, and econometric techniques that are helpful in modeling, forecasting, and simulation of business processes for enhanced strategic and tactical planning. SAS/ETS finds its use in estimating the effect that factors such as customer demographics, pricing decisions, economic and market conditions, and marketing activities have on business.
It also helps whenever you need to analyze or predict processes that happen over a period of time, or to analyze models that include simultaneous relationships. It is useful whenever time dependencies, simultaneous relationships, or dynamic processes complicate data analysis. For instance, a study of environmental quality might use the time series analysis tools of SAS/ETS software to analyze pollution emissions data. A pharmacokinetic study might use the SAS/ETS features for nonlinear systems to model the dynamics of drug metabolism in different tissues.
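A minimal time series sketch, assuming SAS/ETS is licensed and the SASHELP.AIR sample data set (monthly values in AIR, dated by DATE) is available. The model orders and the output data set name `work.air_forecast` are illustrative choices, not a recommended specification.

```sas
/* Difference the series at lags 1 and 12, fit a seasonal moving-average */
/* model, and forecast 12 months ahead.                                  */
proc arima data=sashelp.air;
  identify var=air(1,12) nlag=24;
  estimate q=(1)(12) noint method=ml;
  forecast id=date interval=month lead=12 out=work.air_forecast;
run;
quit;
```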
The use of SAS depends on what you want to achieve.
SAS can also be helpful in many ways
Ex: SAS tables, Microsoft Excel tables, and database files.
Ex: Data can be subset, combined with other data, and extended with new columns.
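The data-handling example above can be sketched in a few DATA steps. This assumes the SASHELP.CLASS sample data set; the lookup table `work.sex_labels` and the derived column `ratio` are made up for illustration.

```sas
/* Subset: keep rows with AGE >= 13 and create a new column. */
data work.teens;
  set sashelp.class;
  where age >= 13;
  ratio = weight / height;
run;

/* A small hypothetical lookup table to combine with the subset. */
data work.sex_labels;
  length sex $ 1 sex_label $ 6;
  input sex $ sex_label $;
  datalines;
F Female
M Male
;
run;

/* Match-merging requires both inputs to be sorted by the BY variable. */
proc sort data=work.teens;
  by sex;
run;

data work.teens_labeled;
  merge work.teens (in=in_teens) work.sex_labels;
  by sex;
  if in_teens;   /* keep only rows that came from the subset */
run;
```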
If you consider yourself a professional analyst, you have probably already realized how important it is to work from the keyboard. Besides letting you work faster and more efficiently, getting your work done without reaching for the mouse also makes a strong impression on your colleagues and peers.
To reduce your workload and make tasks easier, take a look at the following SAS keyboard shortcuts.
DESCRIPTION | SHORTCUT KEY |
Find text | Ctrl + F |
Find and replace text | Ctrl + H |
Copy Selection | Ctrl + C |
Paste | Ctrl + V |
Cut Selection | Ctrl + X |
Go to a particular line | Ctrl + G |
Run or submit a program | F3 or F8 |
Comment the selected code (/) | Ctrl + / |
Uncomment the selected code (/) | Ctrl + Shift + / |
Stop Processing or Cancel Submitted Statement | Ctrl + Break |
Convert selected text to upper case | Ctrl + Shift + U |
Convert selected text to lower case | Ctrl + Shift + L |
Move to top | CTRL+Home |
Move to end | CTRL+End |
To move the cursor to the matching DO/END statement | Alt + [ or Alt + ] |
To move the cursor to match brace/parentheses | Ctrl + [ or Ctrl + ] |
Move to beginning of line | Home |
To close the active window | CTRL+F4 |
To exit the SAS system | ALT+F4 |
Some more important Shortcuts that work only in SAS Enterprise Guide are as follows.
Shortcut key | Description |
Ctrl + right arrow | Move to the last column |
Ctrl + left arrow | Move to the first column |
Ctrl + I | Format ugly code (Select the code and press Ctrl I) |
F2 | Rename dataset |
Ctrl+Home | Go to the first record, the first column |
Ctrl+G | Go to a specific row or column |
Ctrl+End | Go to the last record, last column |
Some other Useful SAS Keyboard Shortcuts are given below:
Description | Shortcut Key |
Collapse all folding blocks | Alt + Ctrl + Number pad - |
Expand all folding blocks | Alt + Ctrl + Number pad + |
Get Help for a SAS procedure | Place the cursor within a procedure name and press F1 |
Move cursor to previous case change | Alt + Left |
Open file window | Ctrl + O |
Save as window | Ctrl + S |
Clear window | Ctrl + E |
Paste program below | F4 |
Bring up word tip | Alt + F1 + No Selection |
Hide the current word tip | Esc |
Move cursor to next case change | Alt + Right |
Undo edit | Ctrl + Z |
Redo edit | Ctrl + Y |
System options window | Ctrl + I |
Convert the selected text to uppercase | Ctrl + Shift + U |
Submit selected code | F8 |
Log | F6 |
Output | F7 |
Editor | F5 |
File window | Ctrl + Q |
Explorer window | Ctrl + W |
Titles window | Ctrl + T |
Context Help | F1 |
Next window | Ctrl + F6 |
Next Window | Ctrl + Tab |
Convert the selected text to lowercase | Ctrl + Shift + L |
Cascade | Shift + F5 |
Below are the steps to create keyboard shortcuts in SAS:
The latest version of SAS available is SAS 9.4. The following are the instructions for installing SAS 9.4 on Windows 7 and 8. The installation might take an hour or more. Similar steps can be followed on later versions of Windows.
There are many different ways to run SAS programs. They usually differ in the speed with which they run, the amount of computer resources they require, and the amount of interaction you have with the program (i.e., the kinds of changes you can make while the program is running).
In the SAS windowing environment, you interact with SAS directly through a series of windows.
Using these windows, you can accomplish common tasks such as locating and organizing files, entering and editing programs, reviewing log information, viewing procedure output, setting options, and more. You can also issue operating system commands from within this environment, or suspend the current SAS windowing environment session, enter operating system commands, and resume the session later.
The SAS windowing environment is a quick and convenient way to program in SAS. It is particularly useful for learning SAS and for developing programs on small test files. Although it uses more computer resources than other techniques, it can save a lot of program development time.
Another important feature of SAS is SAS/ASSIST software. It provides a point-and-click interface that lets you select the tasks you want to perform; SAS then submits the SAS statements needed to complete those tasks. You do not need to know how to program in the SAS language to use SAS/ASSIST.
SAS/ASSIST works by submitting SAS statements, just as shown earlier in this section. In that way, it provides a number of features, but it does not represent the total functionality of the software. To perform tasks other than those available in SAS/ASSIST, you need to learn to program in SAS.
In noninteractive mode, you first prepare a file that contains SAS statements and any system statements required by your operating environment, and then submit the program. The program starts running immediately and occupies your current workstation session. While the program is running, you cannot continue to work in that session, and you have minimal interaction with the program.
The log and procedure output go to prespecified destinations and are not visible until the program ends. To modify the program or correct errors, you must edit and resubmit it.
Noninteractive execution might be faster than batch execution. This is due to the fact that the computer system runs the program immediately instead of waiting to schedule the program among other programs.
As in noninteractive mode, you prepare a file that contains SAS statements and any system statements required by your operating environment, and then submit the program. You can then work on a different task at your workstation.
While you are working, the operating environment schedules your job for execution (along with jobs submitted by other people) and runs it. When execution is complete, you can view the log and the procedure output. The key characteristic of batch execution is that it is completely independent of other activities at your workstation. You cannot view the program while it is running, and you cannot correct errors at the time they occur. The log and procedure output go to prespecified destinations, and you can view them only after the program has finished running.
To modify the SAS program, you edit it with the editor supported by your operating environment and submit a new batch job. At sites that charge for computer resources, batch processing is a relatively cost-effective way to execute programs. It is particularly useful for large programs or when you need your workstation for other tasks while the program is executing.
Nevertheless, the batch mode might not be efficient for learning SAS or developing and testing new programs.
An interactive line-mode session takes a very different approach from batch and noninteractive mode. You enter one line of a SAS program at a time, and SAS executes each DATA or PROC step automatically as soon as it recognizes the end of the step. The procedure output is displayed immediately on the monitor.
Depending on your site's computer system and your workstation, you may be able to scroll backward and forward to see different parts of your log and procedure output; otherwise, you may lose them once they scroll off the top of the screen. The facilities for correcting errors and modifying programs are limited, which is one of the drawbacks of interactive line mode.
These sessions use fewer computer resources than a windowing environment. If you use line mode, you should be familiar with the RUN, %LIST, and %INCLUDE statements of the SAS language.
SAS Studio is the user interface in which you create SAS programs. Let us take a quick look at its various windows and their usage.
This is the first window you see when you enter the SAS environment. The Navigation Pane on the left is used to navigate through the various programming features. The Work Area on the right is used for writing and executing code.
Code auto-completion is a prominent feature: it helps you get the exact syntax of SAS keywords and provides a link to the documentation for each keyword.
Pressing the Run icon, which is the first icon from the left, starts the execution of the code. You can also press F3.
The complete log of the executed code is shown under the Log tab. All errors, warnings, and notes about the program's execution are described here. This is the window to look at when you need to troubleshoot your code.
The Results tab shows the results of the code execution. By default, they are formatted as HTML tables.
The Navigation Area provides features for creating and managing programs in the form of program tabs. It also offers pre-built functionality that you can use with your programs.
This is the tab to use when you want to create additional programs. You can also import the data to be analyzed, query existing data, and create folder shortcuts.
The Tasks tab lets you use built-in SAS programs by supplying only the input variables. For instance, under the Statistics folder, you will find a SAS program that performs linear regression when you supply only the SAS data set name and the variable names.
The Snippets tab is used to write SAS macros and to generate files from an existing data set.
All data sets in SAS are stored in SAS libraries. The temporary library is named WORK and is available only for a single session, whereas permanent libraries remain available across sessions.
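The difference between temporary and permanent libraries can be sketched as follows. The library reference `mylib` and the folder path are hypothetical and must be replaced with a directory you can write to; SASHELP.CLASS is assumed to be available as sample input.

```sas
/* Hypothetical folder: replace with a directory you can write to. */
libname mylib "/home/your_user/sasdata";

/* Stored in WORK: removed when the SAS session ends. */
data work.class_temp;
  set sashelp.class;
run;

/* Stored in MYLIB: still on disk in later sessions. */
data mylib.class_perm;
  set sashelp.class;
run;
```

In a later session, reissuing the same LIBNAME statement makes `mylib.class_perm` accessible again, while `work.class_temp` no longer exists.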
This tab is used to access files that are stored outside the SAS environment. The shortcuts to such files are stored under this tab.
SAS programming follows a simple flow: data sets are first created or read into memory, and the analysis is then performed on that data. Every SAS program consists of the following steps, which read the input data, analyze it, and produce the output. A RUN statement at the end of each step completes the execution of that step.
The program has the following structure:
This step involves two things.
The records are also captured by the data step.
The following is the syntax for the DATA statement.
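DATA data_set_name;          /* Name the data set. */
INPUT var1 var2 var3;        /* Define the variables in this data set. */
new_var = var1 + var2;       /* Create new variables. */
LABEL new_var = 'Label text'; /* Assign labels to variables. */
/* Enter the data after DATALINES, one record per line, then a lone semicolon. */
DATALINES;
1 2 3
;
RUN;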
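A concrete instance of the DATA step might look like the sketch below. The data set name, variable names, and values are all made up for illustration.

```sas
data work.scores;
  input name $ math science;      /* one character and two numeric variables */
  total = math + science;         /* create a new variable */
  label total = "Total score";    /* assign a label */
  datalines;
Asha 82 91
Ben 75 68
Chen 90 88
;
run;
```

Note that variables in an INPUT statement are separated by spaces, and the data lines end with a semicolon on a line of its own.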
This step involves the invocation of a SAS built-in procedure to analyze the data.
The syntax for the PROC statement is as follows.
PROC procedure_name options;   /* The name of the procedure. */
RUN;
Using conditional output statements, the data from the data sets can be displayed.
The syntax for the output step is as follows.
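For example, a PROC step that summarizes a data set could be sketched as follows, assuming the SASHELP.CLASS sample data set. The statistics requested are an arbitrary choice.

```sas
/* Count, mean, minimum, and maximum of HEIGHT and WEIGHT for each SEX. */
proc means data=sashelp.class n mean min max;
  class sex;
  var height weight;
run;
```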
PROC PRINT DATA = data_set;
OPTIONS;
RUN;
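A sketch of conditional output with PROC PRINT, assuming the SASHELP.CLASS sample data set. The WHERE statement supplies the condition, and the VAR statement selects the columns to display.

```sas
proc print data=sashelp.class noobs;
  where age > 13;              /* print only the rows that meet the condition */
  var name sex age height;     /* print only these columns */
  title "Students older than 13";
run;
title;                         /* clear the title for later steps */
```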
A SAS operator is a symbol that represents an arithmetic calculation, a comparison, or a logical operation; a SAS function; or grouping parentheses.
SAS uses two major types of operators: prefix operators and infix operators.
A prefix operator is applied to the variable, constant, function, or parenthetic expression that immediately follows it.
Ex: +, -, NOT
An infix operator is applied to the operands on each side of it.
Ex: arithmetic, comparison, logical (Boolean), minimum, maximum, and concatenation operators.
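The two operator types can be tried out in a small DATA step that writes its results to the log. The variable names and values are arbitrary; the comments give the value each expression should produce for a = 7 and b = 3.

```sas
data _null_;
  a = 7;
  b = 3;
  neg      = -a;                    /* prefix minus: -7 */
  flag     = not (a > b);           /* prefix NOT: 0, because a > b is true */
  total    = a + b;                 /* infix arithmetic: 10 */
  power    = a ** 2;                /* infix arithmetic: 49 */
  bigger   = a > b;                 /* infix comparison: 1 (true) */
  both     = (a > 5) and (b > 5);   /* infix logical: 0 (false) */
  smallest = a >< b;                /* infix minimum: 3 */
  largest  = a <> b;                /* infix maximum: 7 */
  joined   = "SAS" || "/" || "STAT";  /* infix concatenation: SAS/STAT */
  put neg= flag= total= power= bigger= both= smallest= largest= joined=;
run;
```

In a DATA step expression `<>` is the maximum operator, whereas in a WHERE expression it means "not equal", so it is worth using with care.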
For the combination of SAS data sets, the following methods can be used.
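As a hedged illustration of two such methods, the sketch below concatenates and interleaves two tiny data sets using the SET statement. The data set names and values are made up.

```sas
data work.a;
  input id value;
  datalines;
1 10
3 30
;
run;

data work.b;
  input id value;
  datalines;
2 20
4 40
;
run;

/* Concatenating: all rows of A followed by all rows of B (ids 1, 3, 2, 4). */
data work.concat;
  set work.a work.b;
run;

/* Interleaving: rows from both inputs in BY order (ids 1, 2, 3, 4). */
/* Both inputs must already be sorted by ID.                         */
data work.interleaved;
  set work.a work.b;
  by id;
run;
```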
Data Representation deals with how individual quantities and strings will be stored.
There are four ways to accomplish this.
Let us take a quick glance at "procedure steps" which allow us to call a SAS procedure in order to analyze or process a SAS dataset. The procedures are:
The general syntax for these procedures in SAS is as follows:
proc [NAME OF PROCEDURE] data=[NAME OF SAS DATA SET];
[Options for Procedure being used]
run;
Some of the other options which can be used in a procedure step include:
We hope you now have a good understanding of what SAS is and how it can help you progress in your career. For further reading, we recommend the following resources:
Ravindra Savaram is a Technical Lead at Mindmajix.com. His passion lies in writing articles on the most popular IT platforms including Machine learning, DevOps, Data Science, Artificial Intelligence, RPA, Deep Learning, and so on. You can stay up to date on all these technologies by following him on LinkedIn and Twitter.