# Business Analyst with R Interview Questions

**Business Analyst with R Interview Questions and Answers:**

**Q.What is R?**

R is a programming language and software environment used for statistical computing, data analysis and graphics.

**Q.How are comments written in R?**

Comments are written by placing # at the start of the text, for example #division. R ignores everything from the # to the end of the line.

**Q.What is a t-test in R?**

A t-test is used to determine whether the means of two groups are equal or not. It is performed with the t.test() function.
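As a minimal sketch of the test described above, the following compares two small groups. The data values and the names group_a and group_b are made up for illustration.

```r
# Hypothetical scores for two independent groups
group_a <- c(5.1, 4.9, 5.6, 5.8, 6.0, 5.3)
group_b <- c(6.2, 6.8, 6.1, 7.0, 6.5, 6.9)

# Two-sample t-test (Welch's version is the default in t.test)
result <- t.test(group_a, group_b)

result$p.value   # a small p-value suggests the group means differ
result$estimate  # the sample mean of each group
```

The null hypothesis here is that both groups have the same mean.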

**Q.What are the disadvantages of R Programming?**

The disadvantages are:

- Lack of a standard GUI.
- Not well suited to big data, because objects are held in memory by default.
- Does not provide a spreadsheet view of data.

**Q.What is the use of the with() and by() functions in R?**

The with() function evaluates an expression using the variables of a dataset.

Syntax: with(data, expression)

The by() function applies a function to each level of a factor or factors.

Syntax: by(data, factorlist, function)
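The two calls can be sketched on a small data frame. The data frame sales and its columns are hypothetical and exist only for this example.

```r
# Hypothetical data frame used only for illustration
sales <- data.frame(
  region = factor(c("East", "West", "East", "West")),
  units  = c(10, 20, 30, 40),
  price  = c(2, 3, 2, 3)
)

# with(): evaluate an expression using the data frame's columns directly
revenue <- with(sales, sum(units * price))

# by(): apply a function to the rows that belong to each factor level
units_by_region <- by(sales, sales$region, function(d) sum(d$units))
```

Inside with(), units and price are looked up in sales, so sales$ does not need to be repeated.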

**Q.How are missing values represented in R?**

In R, missing values are represented by NA, which must be written in capital letters.
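A short sketch of how NA behaves in practice, using a made-up vector x:

```r
# Hypothetical vector with two missing values
x <- c(4, NA, 7, NA, 10)

missing_flags <- is.na(x)               # TRUE where a value is missing
n_missing     <- sum(missing_flags)     # number of missing values
raw_mean      <- mean(x)                # NA, because missing values propagate
clean_mean    <- mean(x, na.rm = TRUE)  # mean of the non-missing values
```

Most summary functions accept na.rm = TRUE to skip missing values.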

**Q.What is the use of the subset() and sample() functions in R?**

subset() is used to select variables and observations from a dataset, and sample() is used to draw a random sample of size n from a dataset.
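The following sketch shows both functions on a hypothetical data frame named people. The column names and values are assumptions made for the example.

```r
# Hypothetical data frame used only for illustration
people <- data.frame(
  id  = 1:6,
  age = c(23, 35, 41, 29, 52, 38)
)

# subset(): keep rows where age > 30 and only the id and age columns
over_30 <- subset(people, age > 30, select = c(id, age))

# sample(): draw 3 random row numbers, then use them to pick rows
set.seed(42)  # fixed seed so the draw is repeatable
picked <- people[sample(nrow(people), 3), ]
```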

**Q.What is transpose?**

Transpose reshapes data for analysis by swapping its rows and columns. It is performed by the t() function.
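For example, transposing a small matrix (the values are arbitrary) turns a 2 x 3 matrix into a 3 x 2 matrix:

```r
# A 2 x 3 matrix, filled column by column
m <- matrix(1:6, nrow = 2)

# t() swaps rows and columns, giving a 3 x 2 matrix
tm <- t(m)

dim(m)   # 2 3
dim(tm)  # 3 2
```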

**Q.What are the advantages of R?**

The advantages are:

- It is used for managing and manipulating data.
- No license restrictions.
- Free and open source software.
- Graphical capabilities of R are good.
- Runs on many operating systems and different hardware, including 32-bit and 64-bit processors.

**Q.Which function is used for adding datasets in R?**

The rbind() function is used for adding two datasets together, but the columns of the two datasets must be the same.

Syntax: rbind(x1, x2, ...) where x1, x2 are vectors, matrices or data frames.
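A minimal sketch with two hypothetical data frames, q1 and q2, that share the same columns:

```r
# Two hypothetical data frames with identical column names
q1 <- data.frame(id = 1:2, sales = c(100, 150))
q2 <- data.frame(id = 3:4, sales = c(120, 90))

# Stack the rows of q2 underneath the rows of q1
all_sales <- rbind(q1, q2)

nrow(all_sales)  # 4
```

With data frames, rbind() matches columns by name, so differing column names cause an error.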

**Q.How can you produce correlations and covariances?**

Correlations are produced by the cor() function and covariances by the cov() function.
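As a small illustration with made-up vectors where y is exactly twice x:

```r
# Hypothetical, perfectly linearly related vectors
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 6, 8, 10)

r  <- cor(x, y)  # Pearson correlation by default
cv <- cov(x, y)  # sample covariance
```

Both functions also accept a data frame or matrix and return a matrix of pairwise values.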

**Q.What is the difference between a matrix and a data frame?**

A data frame can contain columns of different data types, but a matrix can contain only one type of data.
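The difference shows up as soon as types are mixed. In this sketch the object names and values are hypothetical.

```r
# Mixing numbers and text in a matrix coerces everything to character
m <- cbind(c(1, 2), c("a", "b"))

# A data frame lets each column keep its own type
df <- data.frame(n = c(1, 2), s = c("a", "b"))

m_type  <- typeof(m)    # "character"
n_class <- class(df$n)  # "numeric"
```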

**Q.What is the difference between lapply and sapply?**

lapply always returns its output as a list, whereas sapply simplifies the output to a vector or matrix when possible.
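A minimal sketch of the two return types, using a hypothetical list nums:

```r
# Hypothetical input list
nums <- list(a = 1:3, b = 4:6)

as_list <- lapply(nums, sum)  # a list with elements a and b
as_vec  <- sapply(nums, sum)  # a named numeric vector

is.list(as_list)  # TRUE
is.list(as_vec)   # FALSE
```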

**Q.What is the difference between seq(4) and seq_along(4)?**

seq(4) produces the vector from 1 to 4, c(1,2,3,4), whereas seq_along(4) produces a vector as long as its argument. Since 4 is a vector of length 1, the result is c(1).
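The difference can be checked directly. The third call is added here only to show seq_along() on a longer vector.

```r
s1 <- seq(4)                    # 1 2 3 4
s2 <- seq_along(4)              # 1, because the input has length 1
s3 <- seq_along(c(10, 20, 30))  # 1 2 3, one index per element
```

seq_along() is the usual choice for loop indices because it returns an empty vector for empty input.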

**Q.Explain how you can start the R Commander GUI.**

The R Commander GUI is started by loading the Rcmdr package with library(Rcmdr).

**Q.What is the memory limit of R?**

On a 32-bit system the memory limit is 3Gb, although most versions are limited to 2Gb. On a 64-bit system the memory limit is 8Tb.

**Q.How many data structures does R have?**

There are 5 data structures in R: vector, matrix and array, which are homogeneous, and list and data frame, which are heterogeneous.
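One way to see the five structures side by side is to build a small instance of each. All values below are arbitrary.

```r
v  <- c(1, 2, 3)                                # vector: one type, one dimension
m  <- matrix(1:6, nrow = 2)                     # matrix: one type, two dimensions
a  <- array(1:24, dim = c(2, 3, 4))             # array: one type, any number of dimensions
l  <- list(1, "a", TRUE)                        # list: elements of any type
df <- data.frame(id = 1:2, name = c("x", "y"))  # data frame: columns of different types
```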

**Q.Explain how data is aggregated in R?**

Data is aggregated by collapsing it over one or more BY variables with a summary function. This is done with the aggregate() function, in which the BY variables must be given in a list.
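A minimal sketch of aggregate() on a hypothetical data frame staff, averaging salary within each department:

```r
# Hypothetical data frame used only for illustration
staff <- data.frame(
  dept   = c("A", "B", "A", "B"),
  salary = c(100, 200, 300, 400)
)

# The by argument must be a list; naming its element names the output column
avg_salary <- aggregate(staff$salary, by = list(dept = staff$dept), FUN = mean)
```

The result is a data frame with one row per department. The aggregated values are in a column named x.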

**Q.How many sorting algorithms are available?**

Five commonly used sorting algorithms are:

- Bubble Sort
- Selection Sort
- Merge Sort
- Quick Sort
- Bucket Sort

**Q.How do you create a new variable in R?**

A new variable is created with the assignment operator <-.

E.g. mydata$sum <- mydata$x1 + mydata$x2

**Q.What are R packages?**

Packages are collections of R functions, data and compiled code in a well-defined format. The directory where packages are stored is called the library.

**Q.What is the workspace in R?**

The workspace is the current R working environment, which includes any user-defined objects such as vectors and lists.

**Q.Which function is used for merging data frames horizontally in R?**

The merge() function is used to merge two data frames horizontally, by one or more common key variables.

E.g. Sum <- merge(dataframe1, dataframe2, by="ID")
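As a sketch of the horizontal merge, the two data frames below (customers and orders, both hypothetical) share an ID column:

```r
# Hypothetical data frames that share the key column ID
customers <- data.frame(ID = c(1, 2, 3), name = c("Ann", "Bob", "Cy"))
orders    <- data.frame(ID = c(1, 2, 4), total = c(50, 75, 20))

# By default merge() keeps only IDs present in both data frames
merged <- merge(customers, orders, by = "ID")
```

Passing all = TRUE keeps unmatched rows from both sides and fills the gaps with NA.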

**Q.Which function is used for merging data frames vertically in R?**

The rbind() function is used to merge two data frames vertically. Both data frames must have the same variables.

E.g. Sum <- rbind(dataframe1, dataframe2)

**Q.What is power analysis?**

Power analysis is used in experimental design. It determines the sample size required to detect an effect of a given size with a given degree of confidence.

**Q.Which package is used for power analysis in R?**

The pwr package is used for power analysis in R.

**Q.Which methods are used for exporting data in R?**

Data can be exported into many other formats, such as SPSS, SAS, Stata and Excel spreadsheets.

**Q.Which packages are used for exporting data?**

For Excel the xlsReadWrite package is used, and for SAS, SPSS and Stata the foreign package is used.

**Q.How are impossible values represented in R?**

In R, NaN (not a number) is used to represent impossible values, such as the result of 0/0.
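A short sketch of how NaN arises and how it relates to NA:

```r
impossible <- 0 / 0     # dividing zero by zero gives NaN

is.nan(impossible)      # TRUE
is.na(impossible)       # TRUE: NaN also counts as missing
is.nan(NA)              # FALSE: NA is missing, but not NaN
```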

**Q.Which command is used for storing R object into a file?**

The save command is used for storing R objects into a file.

Syntax: save(z, file="z.Rdata")

**Q.Which command is used for restoring R object from a file?**

The load command is used for restoring R objects from a file.

Syntax: load("z.Rdata")
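A minimal round trip might look like the sketch below. Writing to tempdir() is an assumption made so the example leaves no file in the working directory.

```r
z <- c(1, 2, 3)

# Hypothetical location in the session's temporary directory
path <- file.path(tempdir(), "z.Rdata")

save(z, file = path)  # write the object z to disk
rm(z)                 # remove it from the workspace
load(path)            # restore z under its original name
```

load() restores objects under the names they were saved with, so it can overwrite existing objects.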

**Q.What is the use of coin package in R?**

The coin package provides re-randomization or permutation based statistical tests.

**Q.Which function is used for sorting in R?**

The order() function is used to perform sorting. It returns the positions of the elements in sorted order, which are then used to index the data.
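For example, a hypothetical data frame results can be sorted by one of its columns like this:

```r
# Hypothetical data frame used only for illustration
results <- data.frame(name = c("Ann", "Bob", "Cy"), score = c(72, 95, 88))

# order() gives row positions in ascending order of score
ascending <- results[order(results$score), ]

# Negating a numeric column sorts in descending order
descending <- results[order(-results$score), ]
```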

**Q.What is the use of tapply?**

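tapply() applies a function to subsets of a vector, where the subsets are defined by the levels of one or more factors.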
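As a rough illustration, tapply() takes a vector, a grouping factor and a function, and returns one result per group. The vectors scores and team below are made up for the example.

```r
# Hypothetical scores and the team each score belongs to
scores <- c(80, 90, 70, 50)
team   <- factor(c("red", "blue", "red", "blue"))

# Mean score within each team
team_means <- tapply(scores, team, mean)
```

The result is named by factor level, so team_means["red"] retrieves the mean for the red team.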

**Q.How do you create axes in a graph?**

Custom axes are created using the axis() function.

**Q.What is the use of abline() function?**

The abline() function adds reference lines to a graph.

Syntax: abline(h=yvalues, v=xvalues)

**Q.Why is the vcd package used?**

vcd package provides different methods for visualizing multivariate categorical data.

**Q.What is GGobi?**

GGobi is an open source visualization program for exploring high-dimensional data.

**Q.What is iPlots?**

It is a package which provides interactive bar plots, mosaic plots, box plots, parallel plots, scatter plots and histograms.

**Q.What is the use of the lattice package?**

The lattice package improves on base R graphics by giving better defaults and the ability to easily display multivariate relationships.

**Q.What is fitdistr() function?**

It provides maximum likelihood fitting of univariate distributions. It is defined in the MASS package.

**Q.Which data structures are used to perform statistical analysis and create graphs?**

The data structures are vectors, arrays, data frames and matrices.

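**Q.What is the use of the sink() function?**

It defines the direction of output by diverting R output to a file or other connection. Calling sink() with no arguments returns output to the console.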
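A minimal sketch of redirecting output to a file and back. The file location in tempdir() is an assumption made for the example.

```r
# Hypothetical output file in the session's temporary directory
log_path <- file.path(tempdir(), "output.txt")

sink(log_path)               # send printed output to the file
print(summary(c(1, 2, 3)))   # this goes to the file, not the console
sink()                       # restore output to the console

captured <- readLines(log_path)  # read back what was written
```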

**Q.Why is the library() function used?**

library(package) loads an installed package. When called with no arguments, library() shows the packages which are installed.

**Q.Why is the search() function used?**

This function shows which packages are currently loaded.

**Q.On which types of data do binary operators work?**

Binary operators work on matrices, vectors and scalars.

**Q.What is the use of the doBy package?**

Its summaryBy() function is used to define the desired summary table using a model formula and a function.

**Q.Which function is used to create a frequency table?**

A frequency table is created by the table() function.
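For instance, counting the values in a hypothetical vector of survey answers:

```r
# Hypothetical survey answers
answers <- c("yes", "no", "yes", "yes", "no")

# Count how often each distinct value occurs
freq <- table(answers)

# Convert the counts to proportions
shares <- prop.table(freq)
```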

**Q.Define loglm() function.**

The loglm() function, from the MASS package, is used to fit log-linear models.

**Q.What is the use of the corrgram() function?**

The corrgram() function is used to plot correlograms.

**Q.How do you create scatterplot matrices?**

The pairs() function, or the splom() function from the lattice package, is used to create scatterplot matrices.

**Q.What is npmc?**

It is a package which gives nonparametric multiple comparisons.

**Q.What is the use of diagnostic plots?**

Diagnostic plots are used to check a regression model for normality, heteroscedasticity and influential observations.

**Q.Define anova() function.**

anova() is used to compare nested models.
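A sketch of such a comparison using the built-in mtcars dataset. The choice of predictors is arbitrary and made only for illustration.

```r
# Two nested linear models: the second adds one predictor to the first
fit_small <- lm(mpg ~ wt, data = mtcars)
fit_large <- lm(mpg ~ wt + hp, data = mtcars)

# F-test of whether the extra predictor improves the fit
comparison <- anova(fit_small, fit_large)

p_value <- comparison$`Pr(>F)`[2]  # p-value for adding hp
```

A small p-value suggests that the larger model fits significantly better than the smaller one.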

**Q.What is cv.lm() function?**

It is defined in the DAAG package and is used for k-fold cross-validation.

**Q.Define stepAIC() function.**

It is defined in the MASS package and performs stepwise model selection by exact AIC.

**Q.Define leaps().**

It performs all-subsets regression and is defined in the leaps package.

**Q.Define the relaimpo package.**

It is used to measure the relative importance of each of the predictors in the model.

**Q.Why is the car package used?**

It provides a wide variety of plots for regression, including scatter plots and added variable plots, as well as enhanced diagnostics.

**Q.Define the robust package.**

It provides a library of robust methods including regression.

**Q.What is robustbase?**

It is a package which provides basic robust statistics including model selection methods.