This blog post covers the latest and most advanced SAS interview questions for beginners and professionals. If you are preparing for an interview, going through these questions will help you get familiar with the basics of SAS efficiently. Have a read ahead.
Statistical Analysis System (SAS) dates back to the 1970s. Today, it is one of the most widely used software tools across organisations and companies. It is a closed-source analysis tool that is used extensively in the corporate world to make strategic decisions.
SAS programming lets you process a massive volume of raw data into small, manageable datasets that support decision making. It helps perform a variety of statistical analyses, such as analysis of variance, psychometric analysis, and regression analysis. If you are considering a career in this domain, this post covers some of the advanced SAS interview questions.
We have categorised Advanced SAS Interview Questions into two levels:
If you are just beginning in this industry and are preparing for an upcoming interview, these advanced SAS interview questions for beginners will help you prepare better.
Ans: Some of the functions performed by SAS include:
Ans: A SAS program is written in the Editor window. It consists of a series of statements that must follow the proper syntax, in the correct order, for SAS to interpret them. Some of the syntax rules followed in SAS statements are:
Ans: SAS has two types of data: numeric and character.
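As a small illustration of the two types, here is a minimal sketch. The dataset name work.employees and the variables name and salary are hypothetical, chosen only for this example.

```sas
/* Hypothetical dataset and variable names, for illustration only */
data work.employees;
    length name $ 12;     /* character variable, up to 12 bytes */
    name   = 'Asha';      /* character values are quoted        */
    salary = 52000;       /* numeric variable                   */
run;

/* The Type column of the output lists each variable as Char or Num */
proc contents data=work.employees;
run;
```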
Ans: The Program Data Vector (PDV) is a logical area of memory in which SAS builds a dataset, one observation at a time. Some of the major functions of the PDV include:
Ans: A double trailing @@ in an INPUT statement tells SAS to hold the current record for the next INPUT statement, even across iterations of the DATA step, instead of moving to a new record.
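The behaviour of @@ can be sketched as follows. The dataset work.scores and its values are made up for the example; the first data line holds three name/score pairs, and the double trailing @@ lets each DATA step iteration pick up the next pair from the same line.

```sas
/* Hypothetical data: several name/score pairs per input line */
data work.scores;
    input name $ score @@;   /* @@ holds the line across DATA step iterations */
    datalines;
Ann 90 Raj 85 Lee 78
Mia 92
;
run;

proc print data=work.scores;  /* four observations, one per name/score pair */
run;
```

Without @@, SAS would move to a new line on every iteration and read only the first pair of each line.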
Ans:
Features | SAS | SPSS |
User Interface | Highly interactive UI | Moderately interactive UI |
Decision Making | Works along with Enterprise Miner | Possible to obtain answer tree |
Data Management | More advantageous than SPSS | Supports Data Management |
Documentation | Huge set of technical documentation | Lack of documentation |
Ans: Macros in SAS are used when we want the same program step, such as a PROC step, to run on several datasets. With macros, we can accomplish repetitive tasks efficiently and quickly. Moreover, a macro program can be reused several times.
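A minimal sketch of this idea follows. The macro name print_first and its parameters are hypothetical; the two datasets come from the SASHELP library that ships with SAS.

```sas
/* Hypothetical macro: print the first n observations of any dataset */
%macro print_first(dsn, n=5);
    title "First &n observations of &dsn";
    proc print data=&dsn(obs=&n);
    run;
    title;                        /* clear the title afterwards */
%mend print_first;

/* The same PROC step is reused on two different datasets */
%print_first(sashelp.class)
%print_first(sashelp.cars, n=3)
```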
Ans: SAS provides several system options that help debug macro problems. The results of using these options are written automatically to the SAS log. The options related to macro debugging are listed in alphabetical order.
Ans: The autocall facility is a feature of the SAS system that lets multiple programs use the same SAS macro code by storing the macro in a specific location. It benefits programmers by allowing better consistency and faster updates across all of the programs.
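As a rough sketch, an autocall library is usually enabled with the MAUTOSOURCE and SASAUTOS= system options. The directory path and the macro name say_hello below are assumptions for illustration; the example presumes a file named say_hello.sas, containing the macro definition, already exists in that directory.

```sas
/* Hypothetical location: each macro is saved in this directory
   in a file named after the macro, e.g. say_hello.sas */
options mautosource sasautos=('/shared/sas/macros' sasautos);

/* No %INCLUDE is needed: SAS searches the autocall libraries
   the first time the macro is invoked in the session */
%say_hello
```

Keeping the default sasautos fileref in the list preserves access to the autocall macros supplied with SAS.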
Ans:
Ans: There are five different ways to create macro variables.
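The original list is not reproduced here, so the sketch below shows only three commonly used techniques (%LET, CALL SYMPUTX, and the INTO clause of PROC SQL). The macro variable names cutoff, nobs, and avg_height are hypothetical.

```sas
/* 1. %LET statement */
%let cutoff = 14;

/* 2. CALL SYMPUTX inside a DATA step */
data _null_;
    set sashelp.class end=last;
    if last then call symputx('nobs', _n_);   /* number of rows read */
run;

/* 3. INTO clause in PROC SQL */
proc sql noprint;
    select mean(height) into :avg_height trimmed
    from sashelp.class;
quit;

/* Write the three values to the log */
%put &=cutoff &=nobs &=avg_height;
```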
Ans:
Ans: %GLOBAL creates a global macro variable, which remains accessible until the session ends and is then removed.
%LOCAL creates a local macro variable while a macro is executing. Once the macro finishes processing, the local variable is removed.
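The difference in scope can be sketched as follows. The names env, counter, and scope_demo are hypothetical, and the last line assumes that no global macro variable called counter already exists in the session.

```sas
%global env;                  /* visible until the session ends */
%let env = test;

%macro scope_demo;
    %local counter;           /* exists only while scope_demo runs */
    %let counter = 1;
    %put Inside macro: env=&env counter=&counter;
%mend scope_demo;

%scope_demo
%put Outside macro: env=&env;

/* %SYMEXIST returns 0 when the variable no longer exists */
%put counter exists: %symexist(counter);
```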
Ans: Both %NRSTR and %STR are macro quoting functions that mask the normal meaning of special tokens and of logical and comparison operators, so that they are treated as constant text. The only difference between the two is that %NRSTR also masks the macro triggers (& and %), while %STR does not.
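A short sketch of the two functions; the macro variable names step and company are made up for the example.

```sas
/* %STR masks the semicolons, so they become part of the value
   instead of ending the %LET statement */
%let step = %str(proc print data=sashelp.class; run;);
%put &=step;

/* %NRSTR also masks & and %, so &T is not treated as a
   macro variable reference */
%let company = %nrstr(AT&T);
%put &=company;
```

With %STR(AT&T) instead, the macro processor would try to resolve &T as a macro variable.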
Ans: To create multiple observations from a single input record, we use the double trailing @@.
If you are a professional or an expert in advanced SAS and looking forward to switching your job, refer to these advanced SAS interview questions for professionals to get a gist of what’s going to come.
Ans: The APPEND procedure adds the observations from one SAS dataset to the end of another dataset. PROC APPEND does not process the observations in the first (base) dataset.
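A minimal sketch of PROC APPEND, using two small hypothetical datasets (work.master and work.new_rows) that share the same variables:

```sas
/* Hypothetical base dataset */
data work.master;
    input id name $;
    datalines;
1 Ann
2 Raj
;
run;

/* Hypothetical dataset holding the rows to add */
data work.new_rows;
    input id name $;
    datalines;
3 Lee
;
run;

/* Adds the rows of work.new_rows to the end of work.master
   without reading the rows already in work.master */
proc append base=work.master data=work.new_rows;
run;
```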
Ans: PROC SORT is used to perform sorting, regardless of whether it is done on a single variable or on multiple variables. When the OUT= option is specified, a new sorted dataset is created and the original one remains untouched.
The syntax for this command is:
PROC SORT DATA=original OUT=Sorted;
BY variable;
RUN;
Here,
‘Original’ refers to the original dataset
‘Sorted’ refers to the resulting sorted dataset
‘Variable’ refers to the column on which sorting is done
Also, sorting can be done in both descending and ascending orders.
PROC SORT DATA=original OUT=Sorted;
BY DESCENDING variable;
RUN;
Ans: In SAS programming, we often need to convert a character variable into a numeric variable and vice versa. To convert a numeric variable to a character variable, we use PUT( ). Here, the format must match the type of the source variable, so a numeric source takes a numeric format. For instance:
char_var= PUT( num_var, 6.);
On the other hand, to convert a character variable to a numeric variable, INPUT( ) is used. Here, the source variable must always be a character variable. For instance:
Num_var= INPUT(char_var,2.0);
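The two conversions can be tried together in one short DATA step. This is only a sketch; the variable back and the widths used are chosen for illustration.

```sas
data _null_;
    num_var  = 1234;
    char_var = put(num_var, 6.);      /* numeric -> character, 6 columns wide */
    back     = input(char_var, 6.);   /* character -> numeric                 */
    put char_var= back=;              /* write both values to the log         */
run;
```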
Ans: In SAS programming, the table lookup values can get stored in the following ways:
Below-mentioned techniques are used to perform the table lookup in SAS:
Let’s consider an example that performs the table lookup in code with the help of IF/THEN statements:
data location;
set myinfo;
if AreaCode='226' then Location='Mumbai, India';
else if AreaCode='212' then Location='Delhi, India';
else Location='Unknown';
run;
Ans: The RETAIN statement keeps the value that was assigned to a variable. When the DATA step moves from the current iteration to the next one, RETAIN tells SAS to retain the values instead of resetting them to missing.
To understand better, consider a program in which the value of ‘h’ starts at 1 in the first observation and increases by 1 in each iteration, thanks to the RETAIN statement.
data abc;
set xyz;
RETAIN h 0;
h = h + 1;
run;
Ans: The MPRINT option writes to the SAS log each SAS statement that is generated by a macro. Use the MPRINT option when you suspect that the bug lies in code that was generated in a way you didn’t expect.
For instance, the below-mentioned program can generate a simple DATA step:
%macro second(param);
%let a = %eval(&param); &a
%mend second;
%macro first(exp);
data _null_;
var=%second(&exp);
put var=;
run;
%mend first;
options mprint;
%first(1+2)
When you submit these statements with the MPRINT option, the following lines are written to the SAS log:
MPRINT(FIRST): DATA _NULL_;
MPRINT(FIRST): VAR=
MPRINT(SECOND): 3
MPRINT(FIRST): ;
MPRINT(FIRST): PUT VAR=;
MPRINT(FIRST): RUN;
VAR=3
The MPRINT option displays the generated text and identifies the macro that has generated it.
Ans: SYMGET is a DATA step function that returns the value of a macro variable to the DATA step during its execution. However, it comes with a few restrictions and is not supported by the CAS engine. The syntax is SYMGET(argument).
SYMGET returns a character value whose maximum length is that of a DATA step character variable. If SYMGET cannot find the macro variable named by the argument, it returns a missing value and the program issues a message about an invalid argument.
You can use SYMGET in all SAS language programs because it resolves macro variables at program execution time.
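A minimal sketch of SYMGET in a DATA step. The macro variable region and the DATA step variable r are hypothetical names used only for this example.

```sas
%let region = East;          /* hypothetical macro variable */

data _null_;
    length r $ 8;            /* give the result an explicit length */
    r = symget('region');    /* resolved while the DATA step runs  */
    put r=;
run;
```

Note that the argument is the macro variable name as a character value, without the leading ampersand.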
Ans: The primary difference between a SAS view and a SAS dataset is the place where the data values are stored:
Feature | SAS View | SAS Dataset |
Merge Efficiency | One view is capable of performing a multi-table join. With SAS/CONNECT, a view can join datasets that are stored on varying host computers. | Multiple data steps are needed to merge datasets by common variables. |
Disk Space vs Processing Speed | A view doesn’t store any underlying data; thus, processing speed can get impacted | It stores the full data for faster processing |
Data Integrity | Data is dynamic; hence, whenever you refer to a view in a PROC step, the view gets executed and offers the data values as they currently exist in the underlying data | Here, the data remains static |
Data Preparation | Data gets processed in its existing form during execution | Variables can be sorted and indexed before use |
Separation of Data from the Consumers’ Data | A view can offer a custom, prepackaged perspective of the underlying data. The query of the view can be altered without changing the data | A custom perspective may need a duplicate of the dataset; modifying the data may need replacing the entire dataset. |
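To make the contrast concrete, here is a small sketch of a DATA step view built on sashelp.class. The view name work.tall_students and the filter are assumptions for illustration.

```sas
/* DATA step view: stores the instructions, not the rows */
data work.tall_students / view=work.tall_students;
    set sashelp.class;
    where height > 60;
run;

/* The view is executed here, reading sashelp.class as it
   currently exists in the underlying data */
proc print data=work.tall_students;
run;
```

Dropping the / view= option would instead write a regular dataset, a static copy of the rows taken at the time the step runs.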
Ans: The last observation can be read into a new dataset by using the END= option of the SET statement. For instance:
data work.calculus;
set work.comp end=last;
If last;
run;
Here, calculus is the new dataset to be created and comp is the existing dataset. Also, last is a temporary variable that is set to 1 when the SET statement reads the last observation.
Ans: SAS functions expect their argument values to be supplied across a single observation in a SAS dataset. SAS procedures, on the other hand, expect one variable value per observation and work across observations. For instance:
data average ;
set temp ;
avgtemp = mean( of T1-T24 ) ;
run ;
Here, the arguments of mean function have been taken across the observation. The mean function calculates the average of varying values in one observation.
proc sort ;
by month ;
run ;
proc means ;
by month ;
var avgtemp ;
run ;
PROC MEANS is used to calculate the average temperature by month. Here, the procedure computes the mean of avgtemp across the observations within each value of month.
Ans: The “+” operator returns a value that is missing in a situation where any of the arguments are missing. On the other hand, the SUM function returns the sum of all the non-missing arguments. For example:
data mydata;
input x y z;
cards;
33 3 3
24 3 4
24 3 4
. 3 2
23 . 3
54 4 .
35 4 2
;
run;
data mydata2;
set mydata;
a=sum(x,y,z);
p=x+y+z;
run;
In the output, the value of p is missing for the 4th, 5th, and 6th observations:
a p
39 39
31 31
31 31
5 .
26 .
58 .
41 41
Ans:
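/* Method 1: PROC SORT */
Proc sort data=SAS-Dataset nodups;
by var;
run;
/* Method 2: PROC SQL with DISTINCT */
Proc sql;
Create table New-SAS-Dataset as select distinct * from Old-SAS-Dataset;
quit;
/* Method 3: DATA step with FIRST. and LAST. variables (temp must be sorted by group). */
/* "unique" is a placeholder output name; the condition keeps only groups with exactly one observation. */
Data unique;
Set temp;
By group;
If first.group and last.group then output;
Run;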
Ans: To combine an iteration count with a condition in a single DO loop, we can use the following code:
data work;
do i=1 to 20 until(Sum>=20000);
Year+1;
Sum+2000;
Sum+Sum*.10;
end;
run;
This iterative DO statement executes the DO loop until Sum is greater than or equal to 20000 or until the loop has executed 20 times, whichever happens first.
Ans: To use the Scan function, we will apply:
scan(argument,n,delimiters)
Here, argument specifies the character variable or expression to scan, n specifies which word to read, and delimiters are the characters that separate the words, enclosed in quotation marks.
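A short sketch of SCAN in use; the string and the variable names text, second, and last are made up for the example.

```sas
data _null_;
    text   = 'SAS,SPSS,R';
    second = scan(text, 2, ',');    /* second comma-delimited word   */
    last   = scan(text, -1, ',');   /* negative n counts from the right */
    put second= last=;              /* second=SPSS last=R */
run;
```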
Ans: If both the datasets in the merge statement have been sorted by ID and every observation in one dataset has a corresponding observation in another dataset, a one to one merge will be applied. For instance:
data mydata1;
input id class $;
cards;
1 Sa
2 Sd
3 Rd
4 Uj
;
data mydata2;
input id class1 $;
cards;
1 Sac
2 Sdf
3 Rdd
4 Lks
;
data mymerge;
merge mydata1 mydata2;
run;
However, if the observations don’t match, match-merging will be used. For example:
data mydata1;
input id class $;
cards;
1 Sa
2 Sd
2 Sp
3 Rd
4 Uj
;
data mydata2;
input id class1 $;
cards;
1 Sac
2 Sdf
3 Rdd
3 Lks
5 Ujf
;
data mymerge;
merge mydata1 mydata2;
by id;
run;
Now that you have gone through this list of SAS interview questions, you should be better prepared for the upcoming interview, whether you are a fresher or an experienced SAS programmer. Most of these questions have been asked in previous interview rounds. Thus, before you appear for one, brush up on the concepts by referring to the questions and answers above.
Anjaneyulu Naini works as a content contributor for Mindmajix. He has a great understanding of today’s technology and statistical analysis environment, which includes key aspects such as analysis of variance and software. He is well aware of various technologies such as Python, Artificial Intelligence, Oracle, Business Intelligence, Altrex, etc. Connect with him on LinkedIn and Twitter.