Advanced SAS Interview Questions

This blog post covers advanced SAS interview questions for both beginners and professionals. If you are preparing for an interview, working through these questions will help you review the fundamentals of SAS as well as its more advanced features.

Statistical Analysis System (SAS) dates back to the 1970s. Today, it is one of the most widely used analytics tools across organisations and companies. It is a closed-source analysis tool that is used extensively in the corporate world to support strategic decisions.

SAS programming lets you process large volumes of raw data into small, manageable datasets that support decision making. It helps perform a variety of statistical analyses, such as analysis of variance, psychometric analysis and regression analysis. If you are considering a career in this domain, this post covers some of the advanced SAS interview questions.

We have categorised the Advanced SAS Interview Questions into two levels: questions for freshers and questions for experienced professionals.

Advanced SAS Interview Questions And Answers - Table of Contents

  1. What are some functions that the SAS performs?
  2. Differentiate between SAS vs SPSS.
  3. Define all ways to create a macro variable
  4. What is the meaning of Append procedure in SAS?
  5.  What is the purpose of the RETAIN statement?
  6. What is Symget?
  7. What is the difference between SAS views and SAS datasets? 
  8. State the difference between SAS functions and SAS procedures

Advanced SAS Interview Questions for Freshers

If you are just beginning in this industry and are preparing for an upcoming interview, these advanced SAS interview questions for beginners will help you prepare better. 

1. What are some functions that the SAS performs?

Ans: Some of the functions performed by SAS include:

  • Project management and data management
  • Statistical analysis
  • Data warehousing
  • Business planning
  • Operational research and decisional support
  • Information retrieval and quality management

2. What are the syntax rules followed in the SAS statements?

Ans: A SAS program is written in an editor window and consists of a series of statements that must follow SAS syntax and appear in a logical order for SAS to interpret them. Some of the syntax rules followed in SAS statements are:

  • Every statement ends with a semicolon (;)
  • SAS statements are not case sensitive
  • Extra spaces before or within a statement are ignored
  • Several statements can appear on a single line, as long as each one is terminated by a semicolon
  • There are two ways to include comments in a SAS program:
    • Text beginning with a forward slash and an asterisk (/*) and ending with an asterisk and a forward slash (*/); this form can span several lines
    • A statement beginning with an asterisk (*) and ending with a semicolon (;)
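The syntax rules above can be illustrated with a minimal sketch. The dataset name work.demo and the variables x and y are made up for illustration only.

```sas
* A statement-style comment starts with an asterisk and ends with a semicolon;

/* A block comment starts with slash-asterisk,
   may span several lines,
   and ends with asterisk-slash */

/* Several statements on one line, each terminated by a semicolon */
DATA work.demo; x = 1; y = 2; OUTPUT; RUN;

/* Lowercase keywords and leading spaces are accepted */
      proc print data=work.demo;
run;
```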

3. Name the data types that SAS contains.

Ans: SAS has two data types: numeric and character.

4. What do you mean by PDV? State its function.

Ans: The Program Data Vector (PDV) is a logical area of memory where SAS builds a dataset, one observation at a time. Key points about the PDV include:

  • While a dataset is being created, the PDV holds one observation at a time
  • The PDV contains two automatic variables, _N_ and _ERROR_; the former counts the number of times the DATA step has iterated and the latter is set to 1 when an error occurs during execution
  • The input buffer, which holds a record of data read from an external file, is created during compilation
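To see the PDV at work, a small sketch like the following can help. It writes the contents of the PDV, including _N_ and _ERROR_, to the log on every iteration. The dataset name and the sample records are assumptions made for illustration; the second record deliberately contains invalid numeric data.

```sas
data work.pdv_demo;
  input name $ score;
  /* PUT _ALL_ writes every variable in the PDV to the log,
     including the automatic variables _N_ and _ERROR_ */
  put _all_;
  datalines;
Asha 82
Ravi abc
Meena 91
;
run;
```

On the second iteration SAS cannot read "abc" as a number, so score is set to missing and _ERROR_ is set to 1 for that iteration.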

5. Explain the use of double trailing @@ in Input statements.

Ans: The double trailing @@ in an INPUT statement tells SAS to hold the current record across iterations of the DATA step, so that the next INPUT statement continues reading from the same record instead of moving to a new one.
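As a minimal sketch, the DATA step below reads several observations from each input line by using the double trailing @@. The dataset name and the sample values are hypothetical.

```sas
data work.scores;
  /* @@ holds the current line across DATA step iterations, so each
     name/score pair on a line becomes its own observation */
  input name $ score @@;
  datalines;
Asha 82 Ravi 75 Meena 91
Kiran 68 Divya 88
;
run;

proc print data=work.scores;
run;
```

Without the @@, SAS would read only the first name/score pair from each line and discard the rest.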

6. Differentiate between SAS vs SPSS.

Ans:

| Features | SAS | SPSS |
|---|---|---|
| User Interface | Highly interactive UI | Moderately interactive UI |
| Decision Making | Works along with Enterprise Miner | Possible to obtain answer tree |
| Data Management | More capable than SPSS | Supports data management |
| Documentation | Huge set of technical documentation | Lack of documentation |

7. What is the use of Macros in SAS?

Ans: Macros in SAS are used when we want to generate the same code repeatedly, for example to run the same PROC step on several datasets. With macros, repetitive tasks can be accomplished quickly and efficiently, and a macro program can be reused many times.

8. What different options in SAS can be used to debug the Macro program?

Ans: SAS provides a number of system options that help debug macro problems. The results of these options are written automatically to the SAS log. Listed alphabetically, the options related to macro debugging include:

  • MEMRPT
  • MERROR
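As a rough sketch, three commonly used macro debugging options (MPRINT, MLOGIC and SYMBOLGEN) can be switched on before a macro call and switched off afterwards. The macro %demo is hypothetical, and the example assumes the sample dataset sashelp.class is available in your installation.

```sas
/* Turn on macro debugging output in the SAS log */
options mprint mlogic symbolgen;

%macro demo(ds);
  proc print data=&ds(obs=3);
  run;
%mend demo;

%demo(sashelp.class)

/* Turn the options off again to keep later logs readable */
options nomprint nomlogic nosymbolgen;
```

MPRINT shows the generated SAS statements, MLOGIC traces macro execution, and SYMBOLGEN shows how each macro variable reference resolves.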

9. What is Autocall Facility?

Ans: The autocall facility is a SAS feature that lets multiple programs share the same macro code by storing each macro in a specific location, known as an autocall library. It benefits programmers by allowing better consistency and faster updates across all of the programs that use the macro.

10. What other SAS features do you use for error trapping and data validation?

Ans: 

  • Conditional statements, if then else.
  • Put statement
  • Debug option

11. In how many ways can we create macro variables?

Ans: There are five ways to create macro variables:

  • %Let
  • %Global and %Local
  • Call Symput
  • Proc SQL into clause
  • Macro Parameters

12. Define all ways to create a macro variable.

Ans: 

  • %LET: One of the simplest ways to define a macro variable is the %LET statement. It works much like an assignment statement in the DATA step. The keyword is followed by the name of the macro variable, an equal sign (=) and the text value to be assigned to the macro variable. The syntax is %LET macro-variable-name = text-value;
  • %GLOBAL and %LOCAL: These statements are generally used to force macro variables into a particular referencing scope or environment. When used, they create the macro variables with null values. For example: %global dsn;
  • CALL SYMPUT: The SYMPUT call routine is used to assign a DATA step value to a macro variable. It is not a macro-level statement but a DATA step routine. As such, it is used inside a DATA step and lets you assign the values of dataset variables directly to macro variables. The syntax is CALL SYMPUT(macro_varname, value);
  • PROC SQL INTO clause: PROC SQL can create macro variables by writing directly to the symbol tables. For example, the INTO clause can place the result of the SQL COUNT function, which counts the observations that match the WHERE clause, in a macro variable such as &CLN.
  • Macro parameters: There are two types of macro parameters, positional parameters and keyword (or named) parameters. The former are defined by listing, in the %MACRO statement, the macro variable names that will receive the parameter values. The latter are designated by following the name of the parameter with an equal sign (=), optionally followed by a default value.
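The five approaches above can be put side by side in a short sketch. All macro variable names (dsn, total, nobs, cln) and the macro %show are made up for illustration, and the example assumes the sample dataset sashelp.class is available.

```sas
/* 1. %LET */
%let dsn = sashelp.class;

/* 2. %GLOBAL creates the macro variable with a null value */
%global total;

/* 3. CALL SYMPUT inside a DATA step */
data _null_;
  set &dsn end=last;
  /* store the observation count as trimmed text */
  if last then call symput('nobs', strip(put(_n_, 8.)));
run;

/* 4. PROC SQL INTO clause */
proc sql noprint;
  select count(*) into :cln trimmed
  from &dsn
  where sex = 'F';
quit;

/* 5. Macro parameters: ds is positional, obs is a keyword parameter */
%macro show(ds, obs=5);
  proc print data=&ds(obs=&obs);
  run;
%mend show;

%show(&dsn, obs=3)

/* Write the resulting values to the log */
%put &=dsn &=total &=nobs &=cln;
```

The TRIMMED keyword and the &= form of %PUT assume a reasonably recent SAS release (9.3 or later).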

13. What is the difference between Global and Local Symbol Table?

Ans: The global symbol table holds global macro variables, such as those created with %GLOBAL. A global macro variable remains accessible until the session ends, after which it is removed.

The local symbol table holds local macro variables, such as those created with %LOCAL while a macro is executing. Once the macro finishes processing, its local macro variables are removed.


14. What is the difference between %NRSTR and %STR functions?

Ans: Both %NRSTR and %STR are macro quoting functions that hide the normal meaning of special tokens and of logical and comparison operators, so that they are treated as constant text. The only difference between the two is that %NRSTR also masks the macro triggers (& and %), while %STR does not.
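A small sketch of the difference, using hypothetical macro variable names step and firm:

```sas
/* %STR masks the semicolons, so the whole text is stored in &step */
%let step = %str(proc print data=sashelp.class; run;);

/* %NRSTR also masks the ampersand, so &T is not treated as a macro
   variable reference */
%let firm = %nrstr(AT&T);

%put step resolves to: &step;
%put firm resolves to: &firm;
```

If %STR were used for the second assignment instead, the macro processor would try to resolve &T and would typically write a warning about an unresolved reference to the log.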

15. How would you create multiple observations from a single observation?

Ans: To create multiple observations from a single record of raw data, we use the double trailing @@ in the INPUT statement.

Advanced SAS Interview Questions for Experienced

If you are a professional or an expert in advanced SAS and looking forward to switching your job, refer to these advanced SAS interview questions for professionals to get a gist of what’s going to come.

16. What is the meaning of Append procedure in SAS?

Ans: The APPEND procedure adds the observations from one SAS dataset to the end of another dataset. PROC APPEND does not process the observations in the first (base) dataset; it only reads the observations that are being added.
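A minimal sketch of PROC APPEND, with two small hypothetical datasets (work.master and work.new_rows) created only for the example:

```sas
data work.master;
  input id name $;
  datalines;
1 Asha
2 Ravi
;
run;

data work.new_rows;
  input id name $;
  datalines;
3 Meena
;
run;

/* BASE= names the dataset that receives the rows,
   DATA= names the dataset whose rows are added */
proc append base=work.master data=work.new_rows;
run;
```

After the step runs, work.master contains all three observations and work.new_rows is unchanged.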

17. Which command will you use to sort in the SAS program?

Ans: PROC SORT is used to perform sorting, whether it is done on a single variable or on multiple variables. When the OUT= option is specified, the sorted data is written to a new dataset and the original one remains untouched.

The syntax for this command is:

PROC SORT DATA=original OUT=Sorted;
BY variable;
RUN;

Here,
‘original’ refers to the original dataset
‘Sorted’ refers to the resulting sorted dataset
‘variable’ refers to the column on which the sorting is done

Also, sorting can be done in both descending and ascending orders.

PROC SORT DATA=original OUT=Sorted;
BY DESCENDING variable;
RUN;

18. How can you convert the character variable into a numeric variable and vice versa?

Ans: In SAS programming, we often need to convert a character variable into a numeric variable and vice versa. To convert a numeric variable to a character variable, we use PUT( ). The format supplied to PUT( ) must always match the type of the source variable, so a numeric source needs a numeric format. For instance:

char_var= PUT( num_var, 6.);

On the other hand, to convert a character variable to a numeric variable, INPUT( ) is used. Here, the source variable must always be a character variable, and a numeric informat produces a numeric result. For instance:

Num_var= INPUT(char_var,2.0);
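Both conversions can be tried together in a short sketch. The variable names and values below are hypothetical and exist only for the example.

```sas
data work.convert;
  num_var  = 1234;
  /* numeric to character: PUT with a numeric format */
  char_var = put(num_var, 6.);

  char_id  = '42';
  /* character to numeric: INPUT with a numeric informat */
  num_id   = input(char_id, 8.);

  /* write both results to the log */
  put char_var= num_id=;
run;
```

Running PROC CONTENTS on work.convert afterwards confirms that char_var is character and num_id is numeric.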

19. What are the ways in which a table lookup can be performed in SAS programming?

Ans: In SAS programming, the table lookup values can get stored in the following ways:

  • Code
  • Dataset
  • Array
  • Format
  • Hash object

The following techniques are used to perform the table lookup in SAS:

  • Merge, join, KEY= Option
  • SELECT/WHEN or IF/THEN statements
  • FORMAT statement, PUT function
  • Array Index Value
  • Hash Object Key Value

Let’s consider an example that shows the Code way to perform the table lookup with the help of IF/THEN statements:

data location;
set myinfo;
if AreaCode='226' then Location='Mumbai, India';
else if AreaCode='212' then Location='Delhi, India';
else Location='Unknown';
run;
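The same lookup can also be sketched with the Format approach from the list above, using PROC FORMAT and the PUT function. The format name $areafmt and the sample rows in myinfo are assumptions made for illustration.

```sas
/* Store the lookup values in a character format */
proc format;
  value $areafmt
    '226' = 'Mumbai, India'
    '212' = 'Delhi, India'
    other = 'Unknown';
run;

/* Small sample input so the example runs on its own */
data myinfo;
  input AreaCode $;
  datalines;
226
212
999
;
run;

data location;
  set myinfo;
  length Location $ 13;
  /* PUT applies the format to look up the location */
  Location = put(AreaCode, $areafmt.);
run;
```

One advantage of this design is that the lookup values live in one place, so adding an area code means changing only the PROC FORMAT step.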

20. What is the purpose of the RETAIN statement?

Ans: The purpose of the RETAIN statement in SAS programming is to keep a value once it has been assigned. When the DATA step moves from the current iteration to the next one, the RETAIN statement tells SAS to retain the values of the named variables instead of resetting them to missing.

To understand this better, let’s write a program in which the value of ‘h’ starts at 1 for the first observation and increases by 1 with each observation, using the RETAIN statement.

data abc;
set xyz;
RETAIN h 0;
h = h + 1;
run;

21. What is the use of MPRINT Option in Macros?

Ans: The MPRINT option writes to the SAS log each SAS statement that is generated by a macro. You can use the MPRINT option when you suspect that a bug lies in code that was generated in a way you didn’t expect.

For instance, the program below generates a simple DATA step:

%macro second(param);
   %let a = %eval(&param); &a
%mend second;

%macro first(exp);
   data _null_;
      var=%second(&exp);
      put var=;
   run;
%mend first;

options mprint;
%first(1+2)

When you submit these statements with the MPRINT option, the following lines are written to the SAS log:

MPRINT(FIRST):   DATA _NULL_;
MPRINT(FIRST):   VAR=
MPRINT(SECOND):  3
MPRINT(FIRST):  ;
MPRINT(FIRST):   PUT VAR=;
MPRINT(FIRST):   RUN;

VAR=3

The MPRINT option displays the generated text and identifies the macro that has generated it.

22. What is Symget?

Ans: SYMGET is a DATA step function that returns the value of a macro variable to the DATA step during its execution. However, it comes with a few restrictions and is not supported by the CAS engine. The syntax is SYMGET(argument).

SYMGET returns a character value that can be as long as the maximum length of a DATA step character variable. If SYMGET cannot find the macro variable named in the argument, it returns a missing value and the program issues a message about an invalid argument.

You can use SYMGET in all SAS language programs, since it resolves macro variables when the program executes.
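A minimal sketch of SYMGET in a DATA step. The macro variable region and the DATA step variable r are hypothetical names.

```sas
%let region = East;

data _null_;
  /* give the receiving variable an explicit length */
  length r $ 20;
  /* the argument is the macro variable name, without the ampersand */
  r = symget('region');
  put r=;
run;
```

Declaring the length up front is a design choice here: without it, the variable that receives the SYMGET result gets a much longer default length than the value usually needs.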

23. What is the difference between SAS views and SAS datasets?

Ans: The primary difference between SAS views and SAS datasets is where the data values are stored:

  • A view has metadata and instructions to retrieve data but it doesn’t store the data values
  • A dataset comprises both the data values and metadata

| Feature | SAS View | SAS Dataset |
|---|---|---|
| Merge Efficiency | One view is capable of performing a multi-table join. With SAS/CONNECT, a view can join datasets that are stored on varying host computers. | Multiple data steps are needed to merge datasets by common variables. |
| Disk Space vs Processing Speed | A view doesn’t store any underlying data; thus, processing speed can get impacted | It stores the full data for faster processing |
| Data Integrity | Data is dynamic; hence, whenever you refer to a view in a PROC step, the view gets executed and offers the data values as they currently exist in the underlying data | Here, the data remains static |
| Data Preparation | Data gets processed in the existing form during its execution | Variables can get sorted and indexed before use |
| Separation of Data from the Consumers’ Data | A view can offer a custom, prepackaged perspective of the underlying data. The query of the view can get altered without changing the data | A custom perspective may need a dataset duplication; modifying the data may need replacing the entire dataset. |

24. If there is an unsorted dataset, how will you read the last observation to a new dataset?

Ans: It is possible to read the last observation to a new dataset by using the END= option of the SET statement. For instance:

data work.calculus;
set work.comp end=last;
If last;
run;

Here, calculus is the new dataset to be created and comp is the existing dataset. Also, last is a temporary variable that is set to 1 when the SET statement reads the last observation, so the subsetting IF keeps only that observation.

25. State the difference between SAS functions and SAS procedures.

Ans: SAS functions expect their argument values to be supplied across the variables of a single observation in a SAS dataset. A SAS procedure, on the other hand, expects one variable value from each observation and works down the observations. For instance:

data average ;
set temp ;
avgtemp = mean( of T1 - T24 ) ;
run ;

Here, the arguments of mean function have been taken across the observation. The mean function calculates the average of varying values in one observation.

proc sort ;
by month ;
run ;
proc means ;
by month ;
var avgtemp ;
run ;

PROC MEANS is used here to calculate the average temperature by month: the procedure computes the mean of avgtemp within each value of the BY variable month.

26. Can you state the differences between using “+” operator and sum function?

Ans: The “+” operator returns a value that is missing in a situation where any of the arguments are missing. On the other hand, the SUM function returns the sum of all the non-missing arguments. For example:

data mydata;
input x y z;
cards;
33 3 3
24 3 4
24 3 4
. 3 2
23 . 3
54 4 .
35 4 2
;
run;
data mydata2;
set mydata;
a=sum(x,y,z);
p=x+y+z;
run;

In the output, the value of p is missing for the 4th, 5th and 6th observations:

a p
39 39
31 31
31 31
5 .
26 .
58 .
41 41

27. What is the process of deleting duplicate observations in SAS?

Ans: 

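  • Through the NODUPKEY option in PROC SORT, which keeps one observation for each value of the BY variable (NODUPRECS instead removes adjacent observations that are identical in every variable)
    Proc sort data=temp out=nodup nodupkey;
    by var;
    run;
  • Through an SQL query in PROC SQL
    Proc sql;
    Create table nodup as select distinct * from temp;
    quit;
  • Through a DATA step with BY-group processing (temp must already be sorted by group)
    Data nodup;
    Set temp;
    By group;
    If first.group;
    Run;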

28. How can you specify both a number of iterations and a condition in a single DO loop?

Ans: To specify both a number of iterations and a condition in a single DO loop, we can combine an iterative DO statement with an UNTIL clause, as in the following code:

data work;
do i=1 to 20 until(Sum>=20000);
Year+1;
Sum+2000;
Sum+Sum*.10;
end;
run;

This iterative DO statement executes the DO loop until Sum is greater than or equal to 20000 or until the loop has executed 20 times, whichever happens first.

29. Define the parameters of the SCAN function.

Ans: The syntax of the SCAN function is:

scan(argument,n,delimiters)

Here, argument specifies the character variable or expression to scan, n specifies which word to read, and delimiters lists the characters that separate words, enclosed in quotation marks.
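As a quick sketch, the following DATA step extracts the second comma-separated word from a string. The variable names text and second are made up for the example.

```sas
data _null_;
  text = 'SAS,interview,questions';
  /* read the 2nd word, using a comma as the only delimiter */
  second = scan(text, 2, ',');
  put second=;   /* writes second=interview to the log */
run;
```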

30. Define the difference between One to One Merge and Match Merge with an example.

Ans: A one-to-one merge combines observations by their position in the datasets, using a MERGE statement without a BY statement. It is suitable when both datasets are sorted by ID and every observation in one dataset has a corresponding observation in the other. For instance:

data mydata1;
input id class $;
cards;
1 Sa
2 Sd
3 Rd
4 Uj
;
data mydata2;
input id class1 $;
cards;
1 Sac
2 Sdf
3 Rdd
4 Lks
;
data mymerge;
merge mydata1 mydata2;
run;

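However, if the observations don’t correspond one to one, a match-merge is used, which combines observations by the values of a BY variable; both datasets must be sorted by that variable. For example: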

data mydata1;
input id class $;
cards;
1 Sa
2 Sd
2 Sp
3 Rd
4 Uj
;
data mydata2;
input id class1 $;
cards;
1 Sac
2 Sdf
3 Rdd
3 Lks
5 Ujf
;
data mymerge;
merge mydata1 mydata2;
by id;
run;

Conclusion

This list of SAS interview questions should help you prepare for your upcoming interview, whether you are a fresher or an experienced SAS programmer. Most of these questions have been asked in previous rounds of interviews. Before you appear for one, brush up on the concepts by working through the questions and answers above.

About Author

Anjaneyulu Naini works as a content contributor for Mindmajix. He has a great understanding of today’s technology and statistical analysis environment, which includes key aspects such as analysis of variance and software. He is well aware of various technologies such as Python, Artificial Intelligence, Oracle, Business Intelligence, Altrex, etc. Connect with him on LinkedIn and Twitter.