Business Analytics with R Interview Questions
Q. What is R?
R is a language and software environment for data analysis, used by analysts, quants, statisticians, data scientists and others.
Q. List some of the functions that R provides.
The functions that R provides include
- Mixed effects models
- GAM (generalized additive models), etc.
Q. How can you start the R Commander GUI?
Typing the command library("Rcmdr") into the R console starts the R Commander GUI (the Rcmdr package must be installed first).
Q. How can you import data in R?
You can use R Commander to import data into R. There are three ways to enter data into it:
- Enter data directly via Data > New Data Set
- Import data from a plain text (ASCII) file or from other files (SPSS, Minitab, etc.)
- Read a data set either by typing the name of the data set or by selecting the data set in the dialog box
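Outside R Commander, the same kind of import can be done from the console. A minimal sketch, assuming a comma-separated file; the file and its columns are made up and written to a temporary location so the example runs on its own:

```r
# Create a small CSV file in a temporary location (hypothetical data).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("name,score", "Asha,82", "Ben,75", "Chen,91"), csv_path)

# read.csv() reads a comma-separated text file into a data frame.
scores <- read.csv(csv_path, stringsAsFactors = FALSE)

print(scores)
str(scores)   # shows the type R inferred for each column
```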
Q. What does the R language not do?
- Though R can easily connect to a DBMS, it is not itself a database
- Base R does not provide a menu-driven graphical user interface for analysis
- Though it connects to Excel/MS Office easily, R does not provide a spreadsheet view of data
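Q. How are R commands and comments written?
In R, commands are typed at the console or in a script, one expression per line. A comment can go anywhere in the program; you preface it with a # sign, and R ignores the rest of that line. For example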
- # subtraction
- # division
- # note order of operations exists
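In a script, everything after the # on a line is ignored by R, and the command itself goes on its own line. A short sketch using the same comments, with made-up numbers:

```r
# addition
sum_result <- 5 + 3

# subtraction
diff_result <- 5 - 3

# division
quot_result <- 10 / 4

# note order of operations exists: multiplication happens before addition
ordered_result <- 2 + 3 * 4     # 14, not 20
grouped_result <- (2 + 3) * 4   # 20

print(c(sum_result, diff_result, quot_result, ordered_result, grouped_result))
```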
Q. How can you save your data in R?
There are many ways to save data in R, but the easiest in R Commander is this:
Go to Data > Active Data Set > Export Active Data Set. A dialogue box appears; when you click OK, it lets you save your data in the usual way.
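From the console, a data frame can also be saved without the menus. A minimal sketch, assuming a CSV file is the desired format; the data frame and the file path are made up for illustration:

```r
# A small data frame to save (hypothetical data).
sales <- data.frame(region = c("North", "South"), units = c(120, 95))

# Write it to a CSV file; row.names = FALSE leaves out the row numbers.
out_path <- file.path(tempdir(), "sales.csv")
write.csv(sales, out_path, row.names = FALSE)

# Read the file back to confirm the contents survived the round trip.
restored <- read.csv(out_path, stringsAsFactors = FALSE)
print(restored)
```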
Q. How can you produce correlations and covariances?
Use the cor() function to produce correlations and the cov() function to produce covariances.
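As a small illustration, both functions can be called on two numeric vectors. The variables below are made up:

```r
# Two made-up numeric variables with a perfectly linear relationship.
hours <- c(1, 2, 3, 4)
output <- c(2, 4, 6, 8)

r_value <- cor(hours, output)     # Pearson correlation by default
cov_value <- cov(hours, output)   # sample covariance

print(r_value)
print(cov_value)

# Passing a data frame returns a full correlation matrix.
print(cor(data.frame(hours, output)))
```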
Q. What are t-tests in R?
In R, the t.test() function produces a variety of t-tests. The t-test is one of the most common tests in statistics and is used to determine whether the means of two groups are equal to each other.
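A minimal sketch of a two-sample test, assuming two independent groups; the measurements are made up:

```r
# Made-up measurements for two independent groups.
group_a <- c(5.1, 4.9, 5.6, 5.8, 6.0, 5.3)
group_b <- c(6.2, 6.8, 6.1, 7.0, 6.5, 6.9)

# Two-sample t-test; by default R uses the Welch test (unequal variances).
result <- t.test(group_a, group_b)

print(result$statistic)  # t value
print(result$p.value)    # a small p-value suggests the means differ
```

Other variants are selected through arguments of the same function, for example paired = TRUE for paired samples or var.equal = TRUE for the pooled-variance test.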
Q. What are the with() and by() functions in R used for?
- The with() function is similar to DATA in SAS; it applies an expression to a dataset.
- The by() function applies a function to each level of a factor or factors. It is similar to BY processing in SAS.
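Both functions can be tried on mtcars, a data set that ships with R. This is only a sketch; the choice of columns is arbitrary:

```r
# mtcars is a data set that ships with R.

# with(): evaluate an expression using the columns of a data frame,
# without repeating mtcars$ in front of each name.
avg_mpg <- with(mtcars, mean(mpg))

# by(): apply a function to each level of a factor (here, cylinder count).
mpg_by_cyl <- by(mtcars$mpg, mtcars$cyl, mean)

print(avg_mpg)
print(mpg_by_cyl)
```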
Q. What data structures in R are used to perform statistical analyses and create graphs?
R has data structures such as
- Vectors
- Matrices
- Arrays
- Data frames
Q. What is the general format of matrices in R?
The general format is
mymatrix <- matrix(vector, nrow = r, ncol = c, byrow = FALSE,
                   dimnames = list(char_vector_rownames, char_vector_colnames))
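A concrete instance of this format, with made-up values and names:

```r
# Fill a 2 x 3 matrix from the vector 1:6, column by column (byrow = FALSE).
row_names <- c("r1", "r2")
col_names <- c("a", "b", "c")
mymatrix <- matrix(1:6, nrow = 2, ncol = 3, byrow = FALSE,
                   dimnames = list(row_names, col_names))

print(mymatrix)
print(mymatrix["r2", "b"])   # element in row r2, column b
print(dim(mymatrix))
```

With byrow = TRUE the same vector would fill the matrix row by row instead.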
Q. How are missing values represented in R?
In R, missing values are represented by NA (Not Available), while impossible values are represented by the symbol NaN (not a number).
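The difference can be checked with is.na() and is.nan(). A short sketch with made-up values:

```r
# NA marks a missing observation; NaN results from an undefined calculation.
ages <- c(23, NA, 31)
undefined <- 0 / 0

print(is.na(ages))        # TRUE only for the missing element
print(is.nan(undefined))  # TRUE
print(is.na(undefined))   # TRUE as well: is.na() also flags NaN

# Most summary functions return NA unless missing values are removed.
print(mean(ages))                 # NA
print(mean(ages, na.rm = TRUE))   # 27
```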
Q. What is transpose?
R provides various methods for reshaping data before analysis, and transposing is the simplest of them. The t() function is used to transpose a matrix or a data frame.
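A brief sketch of t() on a matrix and on a data frame, using made-up values:

```r
# A 2 x 3 matrix with made-up values.
m <- matrix(1:6, nrow = 2, ncol = 3)
m_t <- t(m)   # rows become columns: the result is 3 x 2

print(m)
print(m_t)

# t() on a data frame converts it to a matrix first.
df <- data.frame(x = c(1, 2), y = c(3, 4))
df_t <- t(df)
print(df_t)
```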
Q. How is data aggregated in R?
Aggregating data in R is easy: you collapse the data using one or more BY variables and a summary function. When using the aggregate() function, the BY variables must be supplied in a list.
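For example, the built-in mtcars data can be collapsed by cylinder count. A minimal sketch; the columns chosen are arbitrary:

```r
# Average mpg and hp for each cylinder count in the built-in mtcars data.
# The BY variable is wrapped in list(); naming it labels the output column.
agg <- aggregate(mtcars[, c("mpg", "hp")],
                 by = list(cyl = mtcars$cyl),
                 FUN = mean)

print(agg)
```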
Q. Which function is used for adding datasets in R?
The rbind() function can be used to append the rows of one data frame (dataset) to another. The two data frames must have the same variables, but the variables do not have to be in the same order.
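A small sketch with two made-up data frames whose columns are in a different order:

```r
# Two made-up data frames with the same variables in a different column order.
first_batch  <- data.frame(id = c(1, 2), score = c(80, 90))
second_batch <- data.frame(score = c(70, 60), id = c(3, 4))

# rbind() matches data frame columns by name, so the order does not matter.
combined <- rbind(first_batch, second_batch)

print(combined)
```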
Q. What are the subset() and sample() functions in R used for?
In R, the subset() function helps you select variables and observations, while the sample() function lets you draw a random sample of size n from a dataset.
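Both can be tried on the built-in mtcars data. A sketch only; the condition, the columns and the sample size are arbitrary choices:

```r
# subset(): keep rows that meet a condition and only selected columns.
efficient <- subset(mtcars, mpg > 25, select = c(mpg, cyl))
print(efficient)

# sample(): draw 5 row numbers at random, then use them to pick rows.
set.seed(42)   # fixed seed so the draw is repeatable
picked_rows <- sample(nrow(mtcars), 5)
random_cars <- mtcars[picked_rows, ]
print(random_cars)
```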
Q. How can you create a table in R without an external file?
Use the code
myTable = data.frame()
myTable = edit(myTable)
The first line creates an empty data frame. In an interactive session, edit() then opens a spreadsheet-like editor where you can enter your data.
Q. How is R different from other statistical tools available in the market? What are its strengths and weaknesses vis-à-vis SAS and SPSS?
R is fundamentally different from the SAS language (which is divided into procedures and data steps) and from the menu-driven SPSS. It is object-oriented and much more flexible, hence powerful, yet confusing to the novice, as there are multiple ways to do anything in R. Overall it is a very elegant language for statistics, and its strengths are enhanced by nearly 5000 packages developed by leading minds at universities around the world.
Q. Which R packages do you use the most and which ones are your favorites?
I use R Commander and Rattle a lot, along with the packages they depend on. I use car for regression, forecast for time series, and many packages for specific graphs. I have not mastered ggplot, but I do use it sometimes. Overall I am waiting for Hadley Wickham to come up with an updated book on his ecosystem of packages, as they are very formidable, completely comprehensive and easy to use in my opinion, so much so that I can get by with the occasional copy-and-paste code.
Q. What level of adoption do you see for R as a preferred tool in the industry? Are Indian businesses also keen to adopt R?
I see surprising growth for R in business, and I have had to turn down offers for consulting and training as I write my next book, R for Cloud Computing. Indian businesses are keen to cut costs, like businesses globally, but have the added advantage of a huge pool of young engineers and quantitatively trained people to choose from. So there is more interest in R in India, and it is growing thanks to the efforts of companies like SAP, Oracle, Revolution Analytics and RStudio, who have invested in R and are making it more popular. The R Project organization is dominated by academia, and this is reflected in its priorities, which are making the software better, faster and more stable, but the rest of the community has been making efforts to introduce it to industry.
Q. How did you start your career in analytics and how were you first acquainted with R?
I started my career after my MBA selling cars, which meant selling a lot of dreams and managing people who told lies to customers to sell cars. So I switched to Business Analytics thanks to GE in 2004, and I had the personal good luck of having Shrikant Dash, ex-CEO of GE Analytics, as my first US client. He was a tough guy and taught me a lot. I came to R only after leaving the cozy world of corporate analytics in 2007.
Q. How do you see Analytics evolving today in the industry as a whole? What are the most important contemporary trends that you see emerging in the Analytics space across the globe?
I don’t know how analytics will evolve, but it will grow bigger and move more towards the cloud and bigger data sizes. Big Data/Hadoop, Cloud Computing, Business Analytics and Optimization, and Text Mining are some of the buzzwords currently in fashion.