SAS Tutorial

SAS Tutorial for Beginners

SAS is one of the most widely used software tools in the data analytics industry. Learning it builds a solid foundation in data analytics, and it can also be used for report writing. This article gives a brief introduction to SAS and data analytics.

In this SAS tutorial, you will learn the following topics.
  1. What is Data Analytics?
  2. What is SAS?
  3. SAS Components
  4. Need for SAS
  5. Keyboard shortcut for SAS
  6. Installation of SAS Programming
  7. Running a SAS Program
  8. SAS User Interface
  9. SAS Program Structure
  10. SAS Useful Resources

What is Data Analytics?

Data analytics is the science of drawing insights from raw information sources. It uses qualitative and quantitative techniques and processes with the aim of improving productivity and business gain. Data analytics techniques help enterprises reveal trends and metrics that would otherwise be lost in the mass of information. Data is first extracted and categorized so that behavioral data and patterns can be identified and then analyzed.


Data Analysis techniques vary according to business requirements. Data analytics is primarily used in business-to-consumer (B2C) applications. Global enterprises collect and analyze data that has been obtained from various customers, business processes, market economics, or practical experience. All of this helps in thorough decision-making.

Data analytics can be categorized into four basic types. They are as follows.

Descriptive analytics

It describes what has happened over a given period of time. Was there an increase in the number of views? Are sales this month stronger than last month?

Diagnostic analytics

It focuses on why something has happened. It involves more diverse data inputs and a bit of hypothesizing. Were beer sales affected by the weather? Did the latest marketing campaign have an impact on sales?

Predictive analytics

It analyzes what is likely to happen in the near term. Were sales affected the last time we had a hot summer? How many weather models are predicting a hot summer this year?

Prescriptive analytics

This moves into the territory of suggesting a plan of action. If the likelihood of a hot summer, averaged over these five weather models, is above 58%, then we should add an evening shift to the brewery and rent an additional tank in order to increase output.

What is SAS?

SAS is an integrated system of software solutions. It is an application suite that can mine, manage, alter, and retrieve data gathered from a wide range of sources and then perform statistical analysis on it. A graphical point-and-click user interface is provided for non-technical users, and more advanced options can be accessed through the SAS language.

SAS Components

The SAS software suite is made up of several components, each handling a different kind of task. Some of the SAS components are:

  • Base SAS
  • SAS/GRAPH
  • SAS/STAT
  • SAS/ETS

Base SAS

Base SAS is the most widely used component of SAS. It provides a data management facility, and one can perform data analysis with it. It combines a very flexible, extensible, fourth-generation programming language with a web-based interface for data access, transformation, and reporting. It is considered the foundation for all SAS software.

Apart from the easy-to-learn and flexible programming language, one gets a web-based programming interface; ready-to-use programs for data manipulation, information storage and retrieval, descriptive statistics, and reporting; a centralized metadata repository; and a macro facility that reduces programming time and maintenance problems.

Base SAS provides many benefits such as:

  • Integrates data across environments.
  • Reads, formats, and analyzes any data.
  • Makes programming fast and easy.
  • Delivers reports to mobile devices.
  • Maximizes the use of computing resources.
  • Can be combined with Hadoop.

SAS/GRAPH

With the help of SAS/GRAPH, one can represent data in the form of graphs, which makes data visualization easy. It is a vast and powerful tool for creating a wide range of business and scientific graphs. A graph represents data in an entirely different way from tables consisting of rows and columns of text and numbers, and SAS/GRAPH can do this in a dazzling array of colors, using two-dimensional (2-D) and three-dimensional (3-D) shapes.

The sheer size and complexity of SAS/GRAPH speak to its power. Charts and graphs range from a simple plot of data on X and Y axes to horizontal and vertical bar charts, pie charts, star charts, and many additional graphical tools. On top of those choices, there are tens of thousands of ways in which one can change the look of a graph with options, symbols, and other means.

Graphing in SAS can seem intimidating to the new programmer since it occupies a whole wing of the house of SAS. SAS/GRAPH offers many different ways to change the size, look, and presentation of data within a graphical output.

There are five SAS/GRAPH statistical graphics procedures, each of which has been designed with a specific purpose in mind. 

The SGPLOT

This procedure creates single-cell graphs in which multiple plots can be overlaid on a single set of axes. Its syntax supports most of the different kinds of plots and graph features.
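As a minimal sketch of such an overlay, the step below draws a scatter plot and a regression line on one set of axes. It assumes the sample data set SASHELP.CLASS that ships with SAS; the axis labels are illustrative choices.

```sas
/* Overlay two plots (scatter + regression fit) on a single set of axes. */
proc sgplot data=sashelp.class;
  scatter x=height y=weight / group=sex;   /* one marker style per value of SEX */
  reg x=height y=weight / nomarkers;       /* fit line only, markers come from SCATTER */
  xaxis label="Height (inches)";
  yaxis label="Weight (pounds)";
run;
```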

The SGPANEL

This procedure creates a panel of graph cells for the values of one or more classification variables. Each graph cell in the panel can contain either a simple plot or multiple overlaid plots.
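A short sketch of a classification panel, again assuming the SASHELP.CLASS sample data set: one cell is drawn per value of the classification variable SEX, and each cell overlays two plots.

```sas
/* One graph cell per value of SEX, arranged side by side. */
proc sgpanel data=sashelp.class;
  panelby sex / columns=2;
  scatter x=height y=weight;
  reg x=height y=weight / nomarkers;
run;
```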

The SGSCATTER

This procedure is useful in creating paneled graphs containing multiple scatter plots. One has the ability to create three different types of layouts.
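The three layouts correspond to the PLOT, COMPARE, and MATRIX statements of the procedure. The sketch below shows one call of each, assuming the SASHELP.CLASS sample data set.

```sas
/* PLOT: independent scatter plots, each written as y*x. */
proc sgscatter data=sashelp.class;
  plot weight*height weight*age / columns=2;
run;

/* COMPARE: scatter plots that share a common Y axis. */
proc sgscatter data=sashelp.class;
  compare x=(height age) y=weight;
run;

/* MATRIX: every variable plotted against every other variable. */
proc sgscatter data=sashelp.class;
  matrix age height weight / diagonal=(histogram);
run;
```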

The SGRENDER

This procedure can be considered as a utility procedure that helps in outputting graphs from templates that are written in the Graph Template Language.
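A minimal sketch of this workflow follows: a small Graph Template Language template is defined with PROC TEMPLATE and then rendered with PROC SGRENDER. The template name simple_scatter is hypothetical, and SASHELP.CLASS is assumed as the input data.

```sas
/* Define a template in the Graph Template Language (GTL). */
proc template;
  define statgraph simple_scatter;      /* hypothetical template name */
    begingraph;
      entrytitle "Weight by Height";
      layout overlay;
        scatterplot x=height y=weight;
      endlayout;
    endgraph;
  end;
run;

/* Render the template against a data set. */
proc sgrender data=sashelp.class template=simple_scatter;
run;
```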

The SGDESIGN

This procedure is useful in creating graphical output which is based on a graph file that has been created by using the SAS/GRAPH ODS Graphics Designer application.

SAS/STAT

SAS/STAT helps you perform various statistical analyses, such as analysis of variance, regression, multivariate, survival, and psychometric analysis. It serves both specialized and enterprise-wide statistical needs, ranging from traditional analysis of variance and linear regression to Bayesian inference and high-performance modeling tools for huge data.
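For illustration, the sketch below runs a linear regression and a one-way analysis of variance, two of the analyses named above. It assumes that SAS/STAT is licensed at your site and uses the SASHELP.CLASS sample data set.

```sas
/* Linear regression: model WEIGHT as a function of HEIGHT and AGE. */
proc reg data=sashelp.class;
  model weight = height age;
run;
quit;

/* One-way analysis of variance: does mean WEIGHT differ by SEX? */
proc glm data=sashelp.class;
  class sex;
  model weight = sex;
run;
quit;
```

Both procedures are interactive, so QUIT ends them after RUN.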

SAS/STAT software provides statistical techniques. These techniques are used for applications that span industries. Let’s have a brief look at a couple of them below.

Manufacturing: Helps in identifying all the important factors which go into the manufacturing of a machine.

Food: Helps in identifying and targeting the right audience for a new food item.

Telecommunications: Helps in determining which factors are involved in communication with a pilot at an airport.

Government: Helps in predicting the opinions of the public using statistical sampling techniques.

Environmental Research: Helps in describing air pollution patterns with the use of spatial statistics.

Health: This is useful in determining factors influencing the healthcare of patients.

Retail: Helps in predicting customer behavior on the launch of new products.

SAS/STAT provides many benefits such as:

  • Developing risk models.
  • Performing diagnostic statistics.
  • Applying the latest statistical techniques regardless of the size of the data.
  • Benefiting from technical support and web user communities.
  • Relying on algorithms that have been validated.
  • Simplifying work within a single environment.

SAS/ETS

SAS/ETS is more suitable for Time Series Analysis. SAS/ETS provides a wide range of forecasting, time series, and econometric techniques that are helpful in modeling, forecasting, and simulation of business processes for enhanced strategic and tactical planning. SAS/ETS finds its use in estimating the effect that factors such as customer demographics, pricing decisions, economic and market conditions, and marketing activities have on business.

It also helps whenever it is necessary to analyze or predict processes that happen over a period of time, or whenever time dependencies, simultaneous relationships, or dynamic processes complicate data analysis. For instance, an environmental quality study might use the time series analysis tools of SAS/ETS to analyze pollution emissions data. A pharmacokinetic study might use the SAS/ETS features for nonlinear systems to model the dynamics of drug metabolism in different tissues.
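As a hedged sketch of a time series forecast, the step below fits a seasonal ARIMA model to the monthly airline passenger series in SASHELP.AIR and forecasts 12 months ahead. It assumes that SAS/ETS is licensed; the model orders and the output data set name work.air_forecast are illustrative choices, not a recommendation.

```sas
proc arima data=sashelp.air;
  /* Difference the series at lag 1 and lag 12 to remove trend and seasonality. */
  identify var=air(1,12) noprint;
  /* Moving-average terms at lag 1 and lag 12, no intercept. */
  estimate q=(1)(12) noint method=ml;
  /* Forecast the next 12 months and save the result. */
  forecast lead=12 id=date interval=month out=work.air_forecast;
run;
quit;
```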

Need for SAS

The use of SAS depends on what you want to achieve.

  • It is used in data entry, retrieval, and management
  • It helps in report writing and graphics design
  • It finds its use in statistical and mathematical analysis
  • It is also popular for business forecasting and decision support
  • It also helps in operations research, project management, and applications development

SAS can also be helpful in many ways

  • It is helpful in accessing data in almost any format

          Ex: SAS tables, Microsoft Excel tables, and database files.

  • It helps in managing and manipulating existing data in order to get the data that you need.

          Ex: Data can be subset, combined with other data, and extended with new columns.

  • It helps in analyzing your data with statistical techniques that range from descriptive measures such as correlations, through logistic regression and mixed models, to sophisticated methods such as modern model selection and Bayesian hierarchical models.
  • It also presents the results of your analyses in a meaningful report in a wide variety of formats, including HTML, PDF, and RTF that you can share with others.
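The points above can be sketched in a few steps: subset an existing table, add a column, analyze it, and write the results to a PDF report. SASHELP.CLASS is a sample data set shipped with SAS; the data set name work.teens, the bmi column, and the file name teens_report.pdf are hypothetical.

```sas
/* Manage: subset an existing table and create a new column. */
data work.teens;
  set sashelp.class;
  where age >= 13;
  bmi = (weight / height**2) * 703;   /* weight in pounds, height in inches */
run;

/* Analyze and report: send descriptive statistics and correlations to a PDF. */
ods pdf file="teens_report.pdf";      /* hypothetical output file */
proc means data=work.teens n mean std min max;
  var height weight bmi;
run;
proc corr data=work.teens;
  var height weight;
run;
ods pdf close;
```

Replacing ODS PDF with ODS HTML or ODS RTF would produce the same report in those formats.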

Keyboard Shortcuts for SAS

If you consider yourself a professional analyst, you have probably already realized how important it is to work from the keyboard. Working without reaching for the mouse is quicker and more efficient, and it also makes a strong impression on your colleagues and peers.

To lessen the workload and make tasks easier, have a look at the following SAS keyboard shortcuts.

Description | Shortcut key
Find text | Ctrl + F
Find and replace text | Ctrl + H
Copy selection | Ctrl + C
Paste | Ctrl + V
Cut selection | Ctrl + X
Go to a particular line | Ctrl + G
Run or submit a program | F3 or F8
Comment the selected code | Ctrl + /
Uncomment the selected code | Ctrl + Shift + /
Stop processing or cancel submitted statement | Ctrl + Break
Convert selected text to upper case | Ctrl + Shift + U
Convert selected text to lower case | Ctrl + Shift + L
Move to top | Ctrl + Home
Move to end | Ctrl + End
Move the cursor to the matching DO/END statement | Alt + [ or Alt + ]
Move the cursor to the matching brace/parenthesis | Ctrl + [ or Ctrl + ]
Move to the beginning of the line | Home
Close the active window | Ctrl + F4
Exit the SAS system | Alt + F4

Some more important Shortcuts that work only in SAS Enterprise Guide are as follows.

Shortcut key | Description
Ctrl + Right arrow | Move to the last column
Ctrl + Left arrow | Move to the first column
Ctrl + I | Format code (select the code and press Ctrl + I)
F2 | Rename dataset
Ctrl + Home | Go to the first record, first column
Ctrl + G | Go to a specific row or column
Ctrl + End | Go to the last record, last column

Some other Useful SAS Keyboard Shortcuts are given below:

Description | Shortcut key
Collapse all folding blocks | Alt + Ctrl + Number pad -
Expand all folding blocks | Alt + Ctrl + Number pad +
Get Help for a SAS procedure | Place the cursor within a procedure name and press F1
Context Help | F1
Bring up word tip | Alt + F1 (no selection)
Hide the current word tip | Esc
Move cursor to next case change | Alt + Right
Move cursor to previous case change | Alt + Left
Undo edit | Ctrl + Z
Redo edit | Ctrl + Y
Convert the selected text to uppercase | Ctrl + Shift + U
Convert the selected text to lowercase | Ctrl + Shift + L
Open file window | Ctrl + O
Save as window | Ctrl + S
Clear window | Ctrl + E
Paste program below | F4
Submit selected code | F8
Editor | F5
Log | F6
Output | F7
File window | Ctrl + Q
Explorer window | Ctrl + W
Titles window | Ctrl + T
System options window | Ctrl + I
Next window | Ctrl + F6 or Ctrl + Tab
Cascade | Shift + F5

Below are the steps to create keyboard shortcuts in SAS:

  • Click the Enhanced Editor window in SAS.
  • On the toolbar, select Tools, then Options, and then click Keys.
  • Scroll down to the keystroke to which you would like to assign the series of commands, looking for one that has no assignment.
  • Add the command code under the Definition heading. For example: log; clear; output; clear;
  • Close the Keys window.
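The same assignment can also be made from program code instead of the Keys window. The sketch below assumes a SAS windowing environment session, where the DM statement submits windowing commands and the KEYDEF command redefines a key. The choice of F12 is arbitrary, and the trailing wpgm command, which returns focus to the editor, is an addition to the example above.

```sas
/* Assign "clear the log and the output" to F12 (arbitrary key choice).
   Works only in the SAS windowing environment, not in batch mode. */
dm "keydef f12 'log; clear; output; clear; wpgm'";
```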

Installation of SAS programming/Development environment

The latest version of SAS available is SAS 9.4. The following are the complete instructions for installing the Windows version of SAS 9.4 on Windows 7 and 8. The installation might take an hour or more. Similar steps can be followed for installation on any higher version of Windows.

Installation Requirements

  • The installation media from Software Acquisition: either a USB key or 6 DVD discs.
  • Microsoft Windows 7 SP1 in the Home Premium, Professional, Enterprise, or Ultimate edition.
  • Microsoft Windows 8, 8.1, or 10 in the Professional or Enterprise edition.
  • Windows administrator privileges. You must log in to Windows with an administrator account in order to install.
  • A minimum of 13 GB of free disk space for the installation of SAS Foundation. More may be needed depending on the products chosen for installation.
  • For more on the installation requirements, see the System Requirements PDF document for SAS 9.4 Foundation for Microsoft Windows or SAS 9.4 Foundation for Microsoft Windows for x64.

Installation Procedure

  • Start Windows.
  • Close all other Windows applications, including virus-scanning programs.
  • Insert the SAS 9.4 DVD (disc 1) into your DVD drive, or plug in the USB key.
    • If the installation does not start automatically, run “setup.exe” from the root of the disc SAS_99YTT3_ds01 or from the root of the USB key.
  • A User Account Control (UAC) prompt will then appear. Click Yes.
  • Choose the setup language in the “Choose Language” window, and then click OK.
  • The “SAS Deployment Wizard 9.4” splash screen will then appear.
  • On this screen, select “Install SAS Software” and then click Next.
  • The “Specify SAS Home” screen will appear.
  • On the “Select Deployment Type” screen, click Next.
  • On the “Select Products to Install” screen, specify the SAS products to be installed.
    • At a minimum, check SAS Foundation, which is mandatory. The remaining products are optional.
  • When the “Select SAS Foundation Products” dialog box appears, select any SAS Foundation products you want to install in addition to the checked defaults.
  • Click Next to continue.
    • If you are running a 64-bit version of Windows, you might be prompted to select the SAS Enterprise Guide mode.
      • Choose a mode and then click Next to continue.
    • On the “Select SAS Foundation Products” screen, keep the default selections or choose the products you want to install.
      • Click Next to continue.
  • On the “Specify SAS Installation Data File” screen, click Next.
    • If the SAS Installation Data File on the media has expired, go to https://software.unc.edu/sas and download an updated SAS Installation Data File.
  • On the “Select Language Support” screen, click the Clear All button to deselect all languages except English.
  • In the “Select Regional Settings” dialog box, either keep the default or choose other regional settings from the drop-down list.
    • Click Next to continue.
  • On the “Default Product for SAS File Types” screen, select SAS Foundation (64-bit) and then click Next.
  • Leave the default Host Name and Port Number.
    • Click Next to continue.
  • Do not click Next in the “Checking System” dialog box.
    • The wizard moves on to the next dialog box automatically as the installation process loads the installation package.

Running a SAS program

There are many different ways to run SAS programs. They differ in the speed with which they run, the amount of computer resources they require, and the amount of interaction you have with the program (i.e., the kinds of changes you can make while the program is running).

SAS Windowing Environment

The SAS windowing environment lets you interact with SAS directly through a series of windows.

Using these windows, one can accomplish common tasks such as locating and organizing files, entering and editing programs, reviewing log information, viewing procedure output, setting options, and much more. Operating system commands can also be issued from within this environment, or the current SAS windowing environment session can be suspended to enter operating system commands and then resumed later.

The SAS windowing environment is a quick and convenient way to program in SAS. It is especially useful for learning SAS and for developing programs on small test files. Though it requires more computer resources than other techniques, it can save a lot of program development time.

SAS/ASSIST Software

SAS/ASSIST software is another important feature of SAS. It provides a point-and-click interface that lets you select the tasks you want to perform. SAS then submits the SAS statements needed to complete those tasks. One need not know how to program in the SAS language in order to use SAS/ASSIST.

SAS/ASSIST works by submitting SAS statements on your behalf. It provides a number of features this way, but it does not represent the total functionality of SAS software. To perform tasks other than the ones available in SAS/ASSIST, you have to learn to program in SAS.

Modes of Programming in SAS

Non-Interactive Mode

In non-interactive mode, you first prepare a file that contains SAS statements and any system statements required by your operating environment, and then you submit the program. The program starts running immediately and occupies the current workstation session. While the program is running, you cannot continue to work in that session, and you have minimal interaction with the program.

The log and procedure output go to prespecified destinations and are not visible until the program ends. To modify the program or correct errors, one must edit and resubmit the program.

Non-interactive execution might be faster than batch execution, because the computer system runs the program immediately instead of waiting to schedule it among other programs.

Batch Mode

As in non-interactive mode, you prepare a file that contains the SAS statements and any system statements required by your operating environment, and then you submit the program. You can then work on a different task at your workstation.

While you are working, the operating environment schedules your job for execution (along with jobs submitted by other people) and runs it. The key characteristic of batch execution is that it is completely independent of other activities at your workstation. While the program is running, you cannot view it, and you cannot correct errors at the time they occur. The log and the procedure output go to prespecified destinations, and you can view them only after the program has finished running.

To modify the SAS program, use the editor supported by your operating environment and submit a new batch job. At sites that charge for computer resources, batch processing is a relatively cost-effective way to execute programs. It is particularly useful for large programs or when the workstation has to be used for other tasks while the program is executing.

Nevertheless, the batch mode might not be efficient for learning SAS or developing and testing new programs.

Interactive Line Mode

An interactive line-mode session follows a very different approach from batch and non-interactive mode. You enter one line of a SAS program at a time, and SAS executes each DATA or PROC step automatically as soon as it recognizes the end of the step. The procedure output is displayed on the monitor immediately.

Depending on your site's computer system and your workstation, you might be able to scroll backward and forward to see different parts of your log and procedure output, or you might lose them once they scroll off the top of the screen. One of the drawbacks of interactive line mode is that its facilities for correcting errors and modifying programs are limited.

These sessions use fewer computer resources than a windowing environment. To use this mode, one should be familiar with the RUN, %LIST, and %INCLUDE statements of the SAS language.

SAS User Interface

SAS Studio is the user interface using which you will be able to create SAS programs. Let us take a quick glance at the various windows and their usage.

SAS Main Window

This is the first window you will be seeing when you enter the SAS environment. To navigate through various programming features, the Navigation Pane which is on the left is used. On the right side is the Work Area which is useful for writing and executing the code.

Code Autocomplete

This is a prominent feature that helps you get the exact syntax of SAS keywords as you type. It also provides a link to the documentation for that particular keyword.

Program Execution

To start executing the code, click the run icon, which is the first icon from the left, or press F3.

Program Log

The total log of the executed code is shown under the Log tab. All the errors, warnings, or notes regarding the program’s execution are described here. In order to troubleshoot your code, this would be the window you should be looking for.

Program Result

The RESULTS tab shows the results of the code execution. By default, they are formatted as HTML tables.

Program Tabs

For the creation and management of programs, the Navigation Area contains features in the form of Program tabs. Apart from this, Pre-built functionalities are also given to be used with your program.

Server Files and Folders

When you want to create additional programs, this would be the tab where you have to go. Also, we can import the data which needs to be analyzed and query the existing data. We can also create Folder shortcuts. 

Tasks

In order to use the built-in SAS programs, we use the Tasks tab. We have to only supply the input variables. For instance, under the statistics folder, you will find a SAS program to do linear regression by supplying only the SAS data set name and variable names.

Snippets

The Snippets tab provides ready-made pieces of code, for example for writing SAS macros and for generating files from an existing data set, that can be inserted into your program.

Program Libraries

All data sets in SAS are stored in SAS libraries. The temporary library is named WORK, and its contents are available only for a single session. Permanent libraries remain available across sessions.
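The difference between the two kinds of library can be sketched as follows. The library reference mylib and its directory path are hypothetical, and the directory is assumed to exist and be writable.

```sas
/* Temporary: stored in WORK and deleted when the session ends. */
data work.class_temp;          /* "data class_temp;" would mean the same */
  set sashelp.class;
run;

/* Permanent: assign a library reference to a directory (hypothetical path). */
libname mylib "/home/your_user/sasdata";

data mylib.class_perm;         /* stays on disk after the session ends */
  set sashelp.class;
run;

/* Release the library reference; the stored data set is not deleted. */
libname mylib clear;
```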

File Shortcuts

In order to access files that have been stored outside the SAS environment, this tab is used. The shortcuts to access such files are stored under this tab.

SAS-program structure

SAS programming has the following flow. First, the data sets are created or read into memory, and then the analysis is done on this data. Every SAS program should consist of the following steps: reading the input data, analyzing it, and producing the output. Also, a RUN statement is required at the end of each step to complete the execution of that step.

The program has the following structure:

DATA Step

This step involves two things.

  • Loading the data set (that is required) into the SAS memory.
  • Identifying the data set variables (also known as columns).

The records are also captured by the data step.

The following is the syntax for the DATA statement.

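DATA data_set_name;                    /* Name the data set. */
  INPUT var1 var2 var3;                /* Define the variables in this data set. */
  new_var = expression;                /* Create new variables. */
  LABEL var1 = 'label text';           /* Assign labels to variables. */
  /* Enter the data after DATALINES, one record per line, and end it with a line that holds only a semicolon. */
  DATALINES;
...data lines...
;
RUN;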
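A concrete, runnable instance of this template might look like the sketch below. The data set name work.scores, the variable names, and the three records are made up for illustration.

```sas
data work.scores;
  input name $ math science;      /* $ marks NAME as a character variable */
  total = math + science;         /* new variable computed for every record */
  label math    = 'Math score'
        science = 'Science score'
        total   = 'Combined score';
  datalines;
Asha 82 91
Ben 75 68
Chen 90 85
;
run;
```

The step creates a data set with three observations and four variables: name, math, science, and total.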

PROC Step

This step involves the invocation of a SAS built-in procedure to analyze the data.

The syntax for the PROC statement is as follows.

PROC procedure_name options;   /* The name of the procedure. */
RUN;
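For instance, the following sketch invokes the built-in UNIVARIATE procedure of Base SAS to produce descriptive statistics for two variables. It assumes the SASHELP.CLASS sample data set.

```sas
/* Descriptive statistics (moments, quantiles, extremes) for two variables. */
proc univariate data=sashelp.class;
  var height weight;
run;
```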

The OUTPUT Step

The data from the data sets can be displayed, optionally restricted by conditions on which observations to show.

The syntax for displaying a data set with PROC PRINT is as follows.

PROC PRINT DATA = data_set;
  /* Optional statements, for example VAR or WHERE. */
RUN;
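A short sketch of conditional output, assuming the SASHELP.CLASS sample data set: only the observations that satisfy the WHERE condition are printed, and only the listed variables are shown. The title text is illustrative.

```sas
title 'Girls older than 13';
proc print data=sashelp.class noobs;   /* NOOBS hides the observation number column */
  where age > 13 and sex = 'F';
  var name age height weight;
run;
title;                                 /* clear the title for later steps */
```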

SAS-data operators

A SAS operator is a symbol that represents a comparison, an arithmetic calculation, or a logical operation; a SAS function; or grouping parentheses.

SAS uses two major types of operators:

Prefix operators

A prefix operator is applied to the variable, constant, function, or parenthetic expression that immediately follows it.

Ex: +, -, NOT

Infix operators

An infix operator is applied to the operands on each side of it.

Ex: arithmetic, comparison, logical (Boolean), minimum, maximum, and concatenation operators.
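The sketch below exercises one operator of each kind in a DATA step that creates no data set and only writes to the log. All variable names and values are made up for illustration.

```sas
data _null_;
  x = 3;
  y = 5;
  word1 = 'data';
  word2 = 'set';

  /* Prefix operators */
  negated  = -x;
  not_more = not (x > y);

  /* Infix operators */
  total    = x + y * 2;             /* arithmetic */
  is_less  = (x < y);               /* comparison: 1 = true, 0 = false */
  both     = (x > 0) and (y > 0);   /* logical */
  smaller  = x >< y;                /* minimum */
  larger   = x <> y;                /* maximum (in DATA step expressions) */
  joined   = word1 || word2;        /* concatenation */

  put negated= not_more= total= is_less= both= smaller= larger= joined=;
run;
```

The log line written by PUT shows negated=-3 not_more=1 total=13 is_less=1 both=1 smaller=3 larger=5 joined=dataset.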

SAS data set operations

The following methods can be used to combine SAS data sets.

  • concatenating
  • interleaving
  • one-to-one reading
  • one-to-one merging
  • match merging
  • updating
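Each of these methods can be sketched with the SET, MERGE, and UPDATE statements of the DATA step. The three tiny input data sets below are made up for illustration and are already in ID order, which the BY statement requires.

```sas
data work.a;
  input id score_a;
  datalines;
1 10
3 30
;
run;

data work.b;
  input id score_b;
  datalines;
2 20
3 35
;
run;

data work.trans;               /* transactions to apply to WORK.A */
  input id score_a;
  datalines;
3 33
4 40
;
run;

/* Concatenating: all rows of A, then all rows of B (4 rows). */
data work.concat;
  set work.a work.b;
run;

/* Interleaving: rows of A and B combined in ID order (4 rows). */
data work.interleave;
  set work.a work.b;
  by id;
run;

/* One-to-one reading: row n of A joined with row n of B;
   stops at the end of the shorter input (2 rows). */
data work.one_read;
  set work.a;
  set work.b;
run;

/* One-to-one merging: like one-to-one reading, but continues
   until the longer input is exhausted (2 rows here). */
data work.one_merge;
  merge work.a work.b;
run;

/* Match-merging: rows joined on equal ID values (IDs 1, 2, 3). */
data work.match;
  merge work.a work.b;
  by id;
run;

/* Updating: apply transactions to a master data set by ID.
   ID 3 gets the new score and ID 4 is added (3 rows). */
data work.updated;
  update work.a work.trans;
  by id;
run;
```

In the two one-to-one examples, both inputs contain ID, so the value read from WORK.B overwrites the value read from WORK.A.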

SAS data representation

Data Representation deals with how individual quantities and strings will be stored.

Four aspects are involved:

  • type (numeric vs. character)
  • character codes (ASCII, Unicode, ...)
  • numeric precision and rounding
  • SAS date and time conventions
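The four aspects can be observed in a small DATA step that writes to the log. The sketch assumes nothing beyond Base SAS; the variable names and values are made up for illustration.

```sas
data _null_;
  /* Type: numeric values are floating point, character values are fixed-length text. */
  n = 42;
  length c $ 5;
  c = 'SAS';

  /* Character codes: position of 'A' in the collating sequence of the session. */
  code = rank('A');

  /* Numeric precision and rounding: 0.1 and 0.2 have no exact binary form,
     so the direct comparison may be false while the rounded one is true. */
  total   = 0.1 + 0.2;
  exact   = (total = 0.3);
  rounded = (round(total, 0.01) = 0.3);

  /* Date and time conventions: a date is the number of days since
     1 January 1960, a time is the number of seconds since midnight. */
  d = '15MAR2024'd;
  t = '09:30:00't;

  put n= c= code= exact= rounded=;
  put d= t=;                    /* raw stored numbers */
  put d= date9. t= time8.;      /* the same values shown through formats */
run;
```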

SAS basic statistical procedure

Let us take a quick glance at "procedure steps" which allow us to call a SAS procedure in order to analyze or process a SAS dataset. The procedures are:

  • proc import
  • proc export
  • Viewing datasets
  • Summarising the contents of data sets
  • Obtaining summary statistics of data sets
  • Obtaining frequency tables
  • Obtaining linear models
  • Plotting data
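As a sketch of the first two procedures in the list, the steps below export a data set to a CSV file and read it back. The file name class.csv and the output data set name work.class_copy are hypothetical; the file is written to the current working directory of the SAS session.

```sas
/* Write a SAS data set to a CSV file. */
proc export data=sashelp.class
    outfile="class.csv"        /* hypothetical file name */
    dbms=csv
    replace;
run;

/* Read the CSV file back into a new SAS data set. */
proc import datafile="class.csv"
    out=work.class_copy
    dbms=csv
    replace;
  getnames=yes;                /* first row holds the variable names */
run;
```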

The general syntax for these procedures in SAS is as follows:

proc [NAME OF PROCEDURE] data=[NAME OF SAS DATA SET];
[Options for Procedure being used]
run;

Some of the other options which can be used in a procedure step include:

  • "var" - tells SAS which variables are to be processed.
  • "by" - tells SAS to run the procedure separately for each different value of the named variable(s). The data set should first be sorted by those variables.
  • "where" - selects only those observations for which the expression is true.
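The sketch below puts this general syntax and the three options together, assuming the SASHELP.CLASS sample data set. The sorted copy work.class_sorted is a made-up name.

```sas
/* Summarize the contents (variables, types, lengths) of a data set. */
proc contents data=sashelp.class;
run;

/* BY-group processing needs data sorted by the BY variable. */
proc sort data=sashelp.class out=work.class_sorted;
  by sex;
run;

/* Summary statistics for selected variables, per value of SEX,
   restricted to observations where the WHERE expression is true. */
proc means data=work.class_sorted n mean std;
  where age >= 12;
  by sex;
  var height weight;
run;

/* Frequency table of SEX by AGE. */
proc freq data=sashelp.class;
  tables sex*age;
run;
```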

SAS useful resources

We hope you now have a good understanding of what SAS is and how it can help you progress further in your career. For further reading, we recommend the following resources:

Books

  • The Little SAS Book
  • Learning SAS by Example
  • SAS For Dummies
  • Practical and Efficient SAS Programming
  • SAS Essentials: Mastering SAS for Data Analytics
  • Applied Statistics and the SAS Programming Language

 

About Author

Ravindra Savaram is a Technical Lead at Mindmajix.com. His passion lies in writing articles on the most popular IT platforms including Machine learning, DevOps, Data Science, Artificial Intelligence, RPA, Deep Learning, and so on. You can stay up to date on all these technologies by following him on LinkedIn and Twitter.
