Mindmajix

Advanced SAS Interview Questions

Advanced SAS Interview Questions

Q. How PROC SQL differs from other PROC statements in SAS?

PROC SQL is unique and it differs from other SAS Procedures in the following ways:

a) Almost all the SAS procedures require RUN statement while PROC SQL does not require a RUN statement. PROC SQL executes each query automatically.

b) Unlike other SAS procedures the PROC SQL continues to run after you submit the step, so it is always necessary to submit a PROC step, DATA step or a QUIT statement to end the procedure.

c) Also PROC SQL has several statements which include clauses. This makes it unique compared to other SAS Procedures.

E.g.: The following program contains three statements – PROC SQL statement, SELECT statement and QUIT statement. The SELECT statement contains select clause, from clause, where clause and order by clause.

proc sql;

select type, slno, author

from exam.questionset1

where slno <30000

order by type;

quit;


Q. Suppose you are generating a report from the data set exam.questionset1 using PROC SQL. You wish to display the name of the column “author” as “writer” in the report. How do you write the query to modify the report?

The following program generates a report from exam.questionset1 using PROC SQL. The column named author is assigned an alias writer in the report using the keyword AS. Column alias will appear as column heading in the output.

proc sql;

select type, slno, author as writer

from exam.questionset1;

quit;


Q. While generating a report using PROC SQL, how do you sort the rows in descending order of any particular column?

The order by clause is used in the SELECT statement of PROC SQL to sort the rows in the output according to the values of a particular column. By default, order by clause sorts the rows in the ascending order. The rows can be sorted in the descending order of any column by specifying the keyword, desc, after the column name.

E.g.: The following program selects type, slno and author for all the rows having slno less than 30000. In the output, the rows are sorted according to the descending values of slno.

proc sql;

select type, slno, author

from exam.questionset1

where slno <30000

order by slno desc;

quit;


Q. What is referred to as “qualifying a column name”?

“Qualifying a column name” refers to the process of prefixing the SAS data set name to a column name. This is done mostly in those situations where you need to select the data from two data sets and the two data sets have the same named columns. If you want to include one of the same named columns then you will have to prefix the name of the data set from which you wish to take the column to the column name in the query.

E.g.: The following PROC SQL query joins two data sets exam.set1 and exam.set2. Both the data sets contain the same columnslno. So, to indicate the data set from which the value of slno is to be read, the name of the data set is prefixed to the column name in the select clause. This is referred to as “qualifying the column name”.

proc sql;

select type, set1.slno,author

from exam.set1, exam.set2

where set1.slno=set2.slno

order by slno;

quit;


Q. Is there any way to display all the cohunn names from a data set without mentioning the names of the columns?

Yes, an asterisk(*) can be used in the select clause to display all the columns in a data set.

E.g.: The following program selects all the columns from the data set exam.questionset1.

proc sql;

select *

from exam.questionset1

quit;


Q. What is the significance of FEEDBACK option?

The FEEDBACK option is a debugging tool which helps us to see what is being submitted to the processor. When asterisk(*) is used with select clause, FEEDBACK option is used in the PROC SQL statement, which in turn writes the list of column names to SAS log.

E.g.: In the following program FEEDBACK option is used in the PROC SQL statement. This causes a detailed list of column names (type, slno and author) to be written to SAS log.

proc sql feedback;

select *

from exam.questionset1

quit;


Q. While using PROC SQL, how do you limit the number of rows which is displayed in the output?

The OUTOBS= option can be used in the PROC SQL statement to limit the number of rows displayed. OUTOBS= option is similar to OBS= data set option.

E.g.: The following program selects type, slno and author for all the rows having slno less than 30000. In the output, the rows are sorted according to the descending values of slno. Only 10 rows are displayed in the output as OUTOBS=10 is added to the PROC SQL statement.

proc sql outobs=10;

select type, slno, author

from exam.questionset1

where slno <30000

order by slno desc;

quit;


Q. Which keyword is used to eliminate the rows containing duplicate values while using PROC SQL?

The DISTINCT keyword can be used in the SELECT statement to remove the rows that contain duplicate values. The DISTINCT keyword applies to all the columns mentioned in the SELECT statement.

E.g.: The following program selects type, slno and author from the data set exam.questionset1. Use of DISTINCT keyword removes the rows containing duplicate values.

proc sql ;

select distinct type, slno, author

from exam.questionset1

order by slno;

quit;


Q. Explain BETWEEN-AND operator with example?

The BETWEEN-AND operator is used to select rows based on a range of numeric or character values.

E.g.: The following program displays all the rows whose slno is between 3000 and 4000.

proc sql ;

select *

from exam.questionset1

where slno between 3000 and 4000;

quit;


Q. Explain the significance of CONTAINS operator.

The CONTAINS operator is used to select the rows in which a character column includes a particular string.

E.g.: The following program displays all the rows in which type column contains the string ‘tech’.

proc sql ;

select *

from exam.questionset1

where type contains ‘tech’;

quit;


Q. How does IN operator ftmction when used with PROC SQL?

The IN operator is used to select those rows which match with one value among a fixed list of values. The fixed list can contain either numeric or character values. The list of values is enclosed in parenthesis.

E.g.: The following program displays all the rows for which type column contains the string ‘technical’ or ‘fiction’ or ‘literature’. Also the character values in the list are enclosed in quotation marks.

proc sql ;

select *

from exam.questionset1

where type in (‘technical’, ‘fiction’, ‘literature’);

quit;


Q. Which operator is used with PROC SQL to retrieve those rows for which a particular column has missing values?

The IS MISSING operator can be used to select the rows for which particular column has missing values.

E.g.: Suppose you need to check if there are any rows for which the column type contains missing values then the following program can be used. This displays all the rows for which the type column contains missing values.

proc sql ;

select *

from exam.questionset1

where type is missing;

quit;


Q. Is there any other operator which can be used in the place of IS MISSING operator?

The IS NULL operator can be used to select the rows for which particular column has missing values. So IS NULL operator also can be used in the place of IS MISSING operator.

E.g.: Suppose you need to check if there are any rows for which the column type contains missing values then the following program can be used. This displays all the rows for which the type column contains missing values.

proc sql ;

select *

from exam.questionset1

where type is null;

quit;

Also if you know the data type of a column, you can use a comparison operator to check for the rows which contain missing values.

E.g.: The following program also selects those rows for which the type column contains missing values. Since type is a character column, a blank is used in the quotation to represent the missing values.

proc sql ;

select *

from exam.questionset1

where type = ‘ ‘ ;

quit;


Q. Explain the purpose of using the wild card operator underscore with LIKE operator in PROC SQL?

The LIKE operator is used to select the rows which have values that match with specific pattern of characters. The wild card operator underscore (_) is used to represent a single character value.

E.g.: The following program returns all those rows, whose author column contains values which start with T, end with M and which have a character in between. This program will return rows in which author column contains values like Tom, Tim, Tam etc.

proc sql ;

select *

from exam.questionset1

where author like ‘T_m’;

quit;


Q. Explain the purpose of using the wild card operator percent (%) with LIKE operator in PROC SQL?

The LIKE operator is used to select the rows which have values that match with specific pattern of characters. The wild card operator percent (%) is used to represent any number of character values.

E.g.: The following program returns all those rows, whose city column contains values which start with S, end with E and which have any number of characters in between. This program will return rows in which city column contains values like Sunnyvale, Susanville, Springville, Sanjose etc.

proc sql ;

select *

from exam.questionset1

where city like ‘S%e’;

quit;


Q. Explain the functionality of SOUNDS-LIKE (=*) operator?

The SOUNDS-LIKE (=*) operator is used to select the rows which have values that sound like another value you specify. The SOUNDS-LIKE operator uses soundex algorithm to compare the values of a column with the words specified.

E.g.: The following program returns all those rows, whose author column contains values that sound like Justin. This query returns the rows whose author column contains values like Justin, JestinJesstin etc.

proc sql ;

select *

from exam.questionset1

where author =* ‘Justin’;

quit;


Q. Is it possible to define a new column after performing a calculation using SELECT clause?

Yes, it is possible to define a new column after performing a calculation in select clause.

E.g.: The following program selects type, slno from the data set exam.questionset1. This also creates a new column, marks, by taking the sum of three columns, marksphy, markschem and marksbio.

proc sql ;

select type, slno, marksphy+markschem+marksbio as marks,

from exam.questionset1;

quit;


Q. How do you reference a column in the WHERE clause, whose value is calculated in the SELECT clause of the same PROC SQL query?

While using those columns in the where clause whose values are computed in the select clause of the same PROC SQL query, it is always necessary to specify the keyword CALCULATED. The absence of the keyword CALCULATED will generate an error message in the SAS log.

E.g.: The following program selects type, slno from the data set exam.questionset1. This also creates a new column, marks, by taking the sum of three columns, marksphy, markschem and marksbio. This displays those rows where the value of marks is greater than 80. Also while using the calculated column marks in the where clause , it is necessary to use the keyword CALCULATED.

proc sql ;


Q. How do you specify a label for a cohmm in the PROC SQL query?

The data set options LABEL= can be used after the column name in the select clause to label the column. By default PROC SQL uses the labels which are already stored in the data set. The use of LABEL= option overrides the default label.


Q. What is the difference between %LOCAL and %GLOBAL?

The %LOCAL that variable will be used only at the particular block only but in case of the %GLOBAL that variable will be used till the end of the SAS session


Q. What happens in a one-on-one merge? When would you use one?

If you want to merge two data set that have different variable and only one variable as a common variable with that unique variable we can merge the data set with one-on-one merge.


Q. How to know how &&var&i or &&dsn&i resolves?

It is very confusing some times to tell right away how &&var&i or &&dsn&i get resolved… but here is the simple technique you can use to know….
ex: We generally use &&var&i or &&dsn&i these macro variables when we are using a %do loop… to execute same code n number of times.
You have a data set and it has 5 variables … Patid sex age ethnic race wt ht;
%macro doit; %do i=1 %to &nvars; &&var&i %end; %mend doit;
So if the nvars value is 7, then the loop creates a macro varia


Q. What can you do with %PUT ?

Numerous options are available for the %PUT statement.
%PUT _all_: It prints all macro variables in the log that are available in all environments (global, local, user and automatic).
%PUT _automatic_: It prints all the SAS defined automatic macro variables in the log. (ex: &sysdate, &systime ,%sysdsn, %syserr etc)
%PUT _global_: It prints macro variables that are created by the user and available in all environments.
%PUT _local_: It prints macro variables that are created b


Q. Why and How to Use %PUT Statement?

%Put statement is similar to the PUT statement in data step, What it does is it writes text and values of macro variable after execution to the SAS System LOG. If you want to make sure your macro variable resolves as expected, you can make sure it with %PUT statement.
Unique advantage of %PUT over PUT is …you can use %PUT outside the data step whereas you cannot with PUT.
How to use %PUT:
%let program=AE;
%put program Name here as &program;
Above %put statement resolves


Q. What is auto call macro and how to create a auto call macro? What is the use of it? How to use it in SAS with macros?

SAS Enables the user to call macros that have been stored as SAS programs.
The auto call macro facility allows users to access the same macro code from multiple SAS programs. Rather than having the same macro code for in each program where the code is required, with an autocall macro, the code is in one location. This permits faster updates and better consistency across all the programs.Macro set-up:The fist step is to set-up a program that contains a macro, desired to be used in multiple p


Q. Write a macro to split the number of observations in a data-set .

%macro split (dsnorig, dsnsplit1, dsnsplit2, obs1); data &dsnsplit1; set &dsnorig (obs = &obs1); run; data &dsnsplit2; set &dsnorig (firstobs = %eval(&obs1 + 1)); run; %mend split;
%split(sasuser.admit,admit4,admit5,2)


Q. Write a macros for sorting common variables in various data-sets.

%macro sortit (datasetname,pid,inverstigator); PROC SORT DATA = &DATASETNAME; BY &PID &INVESTIGATOR; %mend sortit;
%sortit (ae,001,sarath);


Q. What are SYMGET and SYMPUT?

SYMPUT puts the value from a dataset into a macro variable where as SYMGET gets the value from the macro variable to the dataset.


Q. What system options would you use to help debug a macro?

The SAS System offers users a number of useful system options to help debug macro issues and problems. The results associated with using macro options are automatically displayed on the SAS Log.
Specific options related to macro debugging appear in alphabetical order in the table below:
MEMRPT: Specifies that memory usage statistics be displayed on the SAS Log.
MERROR: SAS will issue warning if we invoke a macro that SAS did not find. Presents Warning Messages when there are missp


Q. Automatic variables for macro?

Every time we invoke SAS, the macro processor automatically creates certain macro var. eg: &sysdate, &sysday.


Q. What is the maximum length of the macro variable?

32 characters long.


Q. Tell me more about the parameters in macro?

Parameters are macro variables whose value you set when you invoke a macro. To add the parameters to a macro, you simply name the macro vars names in parenthesis in the %macro statement.
Syntax:
%MACRO macro-name (parameter-1= , parameter-2= , ……parameter-n = ); macro-text%; %MEND macro-name; %macro_name(par1,par2,….parn);


Q. Describe the ways in which you can create macro variables?

There are the 5 ways to create macro variables:


Q. Tell me about % include and % eval?

The %include statement, despite its percent sign, is not a macro statement and is always executed in SAS, though it can be conditionally executed in a macro.It can be used to setting up a macro library. But this is a least approach.
The use of %include does not actually set up a library. The %include statement points to a file and when it executed the indicated file (be it a full program, macro definition, or a statement fragment) is inserted into the calling program at the location of the


Q. Tell me about call symput?

CALL SYMPUT takes a value from a data step and assigns it to a macro variable. I can then use this macro variable in later steps. To assign a value to a single macro variable,
We use CALL SYMPUT with this general form:
CALL SYMPUT (“macro-variable-name”, value);
Where macro-variable-name, enclosed in quotation marks, is the name of a macro variable, and value is the value I want to assign to that macro variable. Value can be the name of a variable whose value SAS will use, or it


Q. How are parameters passed to a macro?

A macro variable defined in parentheses in a %MACRO statement is a macro parameter. Macro parameters allow you to pass information into a macro.
Here is a simple example:
%macro plot(yvar= ,xvar= ); proc plot; plot &yvar*&xvar; run; %mend plot; %plot(age,sex)


Q. Can you execute macro within another macro? If so, how would SAS know where the current macro ended and the new one began?

Yes, I can execute macro within a macro, we call it as nesting of macros, which is allowed. Every macros beginning is identified the keyword %macro and end with %mend.


Q. If you need the value of a variable rather than the variable itself what would you use to load the value to a macro variable?

If we need a value of a macro variable then we must define it in such terms so that we can call them everywhere in the program. Define it as Global. There are different ways of assigning a global variable. Simplest method is %LET.
Ex: A, is macro variable. Use following statement to assign the value of a rather than the variable itself
%Let A=xyz; %put x=”&A”;
This will assign “xyz” to x, not the variable xyz to x.


Q. Can you execute a macro within a macro? Describe.

Yes, Such macros are called nested macros. They can be obtained by using symget and call symput macros.


Q. How do you add a number to a macro variable?

Using %eval function or %sysevalf function if the number is a floating number.


Q. What is the difference between %PUT and SYMBOLGEN?

%PUT is used to display user defined messages on log window after execution of a program where as % SYMBOLGEN is used to print the value of a macro variable resolved, in log window.


Q. What do you code to create a macro? End one?

We create a macro with %MACRO statement and end a macro with %MEND statement.


Q. If you use a SYMPUT in a DATA step, when and where can you use the macro variable?

The macro variable created by the CALL SYMPUT routine cannot be used in the same data-step in which it got created. Other than that we can use the macro variable at any time


Q. For what purposes have you used SAS macros?

If we want use a program step for executing to execute the same Proc step on multiple data sets. We can accomplish repetitive tasks quickly and efficiently. A macro program can be reused many times. Parameters passed to the macro program customize the results without having to change the code within the macro program. Macros in SAS make a small change in the program and have SAS echo that change thought that program.


Q. How would you define the end of a macro?

The end of the macro is defined by %Mend Statement


Q. How would you identify a macro variable?

with Ampersand (&) sign


Q. How can you create a macro variable with in data step?

with CALL SYMPUT


Q. How would you invoke a macro?

After I have defined a macro I can invoke it by adding the percent sign prefix to its name like this: % macro name a semicolon is not required when invoking a macro, though adding one generally does no harm.


Q. Have you used macros? For what purpose you have used?

Yes, I have used macros in creating analysis data-sets and tables where it is necessary to make a small change throughout the program and where it is necessary to use the code again and again.


Q. How to know how &&var&i or &&dsn&i resolves?

It is very confusing some times to tell rightaway how &&var&i or &&dsn&i get resolved… but here is the simple technique you can use to know….
ex: We generally use &&var&i or &&dsn&i these macro variables when we are using a %do loop… to execute same code n number of times. You have a dataset and it has 5 variables … Patid sex age ethnic race wt ht;
%macro doit;
%do i=1 %to &nvars;
&&var&i
%end;
%mend doit;
So if the nvars value is 7, then the loop creates a m


Q. Why and How to Use %PUT Statement ?

%Put statement is similar to the PUT statement in data step, What it does is it writes text and values of macro variable after execution to the SAS System LOG. If you want to make sure your macro variable resolves as expected, you can make sure it with %PUT statement.
Unique advantage of %PUT over PUT is …you can use %PUT outside the data step whereas you can’t with PUT.
How to use %PUT:
%let program=AE; %put program Name here as &program;
Above %put statement resolves to … %put


Q. Write a macro for sorting common variables in various data sets.

%macro sortit (datasetname,pid,inverstigator);
PROC SORT DATA = &DATASETNAME;
BY &PID &INVESTIGATOR;
%mend sortit;
%sortit (ae,001,sarath);


Q. Write a macro to determine distribution of Missing / Non-Missing Values.

%macro missrep (dsn, vars=_numeric_);
proc freq data=&dsn.;
tables &vars. / missing;
format _character_ $missf. _numeric_ missf.;
title1 “Distribution or Missing / Non-Missing Values”;
run;
%mend missrep;
%missrep(study.demog, vars=age gender bdate);


Q. Write a macro to determine the list of variables in a data set.

%macro varlist (dsn);
proc contents data = &dsn out = cont noprint;
run;
%mend;
%varlist(demo);
proc sql noprint;
select distinct name into:varname1-:varname22 from cont;
quit;
%do i =1 %to &sqlobs;
%put &i &&varname&i;
%end;
%mend varlist;
%varlist(adverse)


Q. What are SYMGET and SYMPUT?

SYMPUT puts the value from a dataset into a macro variable where as SYMGET gets the value from the macro variable to the dataset.


Q. What system options would you use to help debug a macro?

The SAS System offers users a number of useful system options to help debug macro issues and problems. The results associated with using macro options are automatically displayed on the SAS Log.
Specific options related to macro debugging appear in alphabetical order in the table below:
MEMRPT: Specifies that memory usage statistics be displayed on the SAS Log.
MERROR: SAS will issue warning if we invoke a macro that SAS did not find. Presents Warning Messages when there are missp


Q. What are automatic variables for macro?

Every time we invoke SAS, the macro processor automatically creates certain macro var. eg: &sysdate &sysday.


Q. What is the maximum length of the macro variable?

32 characters long.


Q. Describe about the parameters in macro?

Parameters are macro variables whose value you set when you invoke a macro. To add the parameters to a macro, you simply name the macro vars names in parenthesis in the %macro statement.
Syntax:
%MACRO macro-name (parameter-1= , parameter-2= , ……parameter-n = );
macro-text%;
macro-text%;
%macro_name(par1,par2,….par-n);


Q. Explain call symput?

CALL SYMPUT takes a value from a data step and assigns it to a macro variable. I can then use this macro variable in later steps. To assign a value to a single macro variable,
I use CALL SYMPUT with this general form: CALL SYMPUT (“macro-variable-name”, value); Where macro-variable-name, enclosed in quotation marks, is the name of a macro variable, and value is the value I want to assign to that macro variable. Value can be the name of a variable whose value SAS will use, or it can be a con


Q. How we can call macros with in data step?

We can call the macro with
CALL SYMPUT,
Proc SQL ,
%LET statement. and macro parameters.


Q. How would you code a macro statement to produce information on the SAS log?

OPTIONS MPRINT MLOGIC MERROR SYMBOLGEN;


Q. How are parameters passed to a macro?

A macro variable defined in parentheses in a %MACRO statement is a macro parameter. Macro parameters allow you to pass information into a macro.
Here is a simple example:
%macro plot(yvar= ,xvar= );
proc plot;
plot &yvar*&xvar;
run;
%mend plot;
%plot(age,sex)


Q. Can you execute macro within another macro? If so, how would SAS know where the current macro ended and the new one began?

Yes, I can execute macro within a macro, we call it as nesting of macros, which is allowed.
Ex: A, is macro variable. Use following statement to assign the value of a rather than the variable itself
Every macros beginning is identified the keyword %macro and end with %mend.


Q. If you need the value of a variable rather than the variable itself what would you use to load the value to a macro variable?

If we need a value of a macro variable then we must define it in such terms so that we can call them everywhere in the program. Define it as Global. There are different ways of assigning a global variable. Simplest method is %LET.
%Let A=xyz; %put x=”&A”;
This will assign “xyz” to x, not the variable xyz to x.


Q. Can you execute a macro within a macro? Describe.

Yes, Such macros are called nested macros. They can be obtained by using symget and call symput macros.


Q. How do you add a number to a macro variable?

Using %eval function or %sysevalf function if the number is a floating number.


Q. If you use a SYMPUT in a DATA step, when and where can you use the macro variable?

The macro variable created by the CALL SYMPUT routine cannot be used in the same datastep in which it got created. Other than that we can use the macro variable at any time..


Q. How long can a macro variable be? A token?

A component of SAS known as the word scanner breaks the program text into fundamental units called tokens.

  • Tokens are passed on demand to the compiler.
  • The compiler then requests token until it receives a semicolon.
  • Then the compiler performs the syntax check on the statement.

Q. What is the difference between %LOCAL and %GLOBAL?

% Local is a macro variable defined inside a macro.%Global is a macro variable defined in open code (outside the macro or can use anywhere).


Q. For what purposes have you used SAS macros?

If we want use a program step for executing to execute the same Proc step on multiple data sets. We can accomplish repetitive tasks quickly and efficiently. A macro program can be reused many times. Parameters passed to the macro program customize the results without having to change the code within the macro program. Macros in SAS make a small change in the program and have SAS echo that change thought that program.


Q. How would you define the end of a macro?

The end of the macro is defined by %Mend Statement


Q. How would you identify a macro variable?

With Ampersand (&) sign


Q. How can you create a macro variable with in data step?

with CALL SYMPUT


Q. How would you invoke a macro?

After I have defined a macro I can invoke it by adding the percent sign prefix to its name like this: % macro name a semicolon is not required when invoking a macro, though adding one generally does no harm.


0 Responses on Advanced SAS Interview Questions"

Leave a Message

Your email address will not be published. Required fields are marked *

Copy Rights Reserved © Mindmajix.com All rights reserved. Disclaimer.
Course Adviser

Fill your details, course adviser will reach you.