If you're looking for Business Analytics with R interview questions and answers, whether you are experienced or a fresher, you are at the right place. Many reputed companies around the world offer opportunities in this field. According to research, Business Analytics with R salaries range from $30,000 to $182,000, so you still have the opportunity to move ahead in your career. Mindmajix offers Advanced Business Analytics with R Interview Questions 2018 to help you crack your interview and build your dream career as a Business Analytics with R developer.
Q. Define R?
R is a programming language used for statistical computing and high-level graphics.
Q. Name and briefly explain the different types of data structures present in R.
The types of data structures found in R are described below:-
Vector - a sequence of data elements of the same type. The elements of a vector are known as components.
Lists - R objects whose elements may be of different types, such as numbers, strings, vectors or other lists.
Matrix - a two-dimensional data structure, often built by binding vectors of exactly the same length. All elements of a matrix are of the same type.
Data frame - more general than a matrix, because different columns are allowed to hold different data types. It combines the features of lists and matrices, and can be thought of as a rectangular list.
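As a rough illustration of the four structures described above, the following sketch builds one of each in base R. The names and values (scores, record, m, df) are made up for the example.

```r
# Vector: every component has the same type
scores <- c(72, 85, 90)

# List: components may have different types
record <- list(name = "Asha", scores = scores, passed = TRUE)

# Matrix: two-dimensional, one type; cbind() binds equal-length vectors
m <- cbind(first = c(1, 2, 3), second = c(4, 5, 6))

# Data frame: rectangular like a matrix, but columns may differ in type
df <- data.frame(name = c("Asha", "Ben", "Chen"), score = scores,
                 stringsAsFactors = FALSE)

str(record)        # structure of the list
dim(m)             # rows and columns of the matrix
sapply(df, class)  # one class per data frame column
```

Note that is.list(df) is TRUE, which is the sense in which a data frame is a "rectangular list".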
Q. How do you load a .csv file in R?
This is easy. Call the read.csv() function with the path to the file and assign the result to a variable; the contents are stored as a data frame.
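A minimal sketch of this, assuming a small sample file. The file is written to a temporary path first so that the example is self-contained; the column names and values are hypothetical.

```r
# Hypothetical sample file, written to a temporary location for illustration
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,region,sales",
             "1,North,120.5",
             "2,South,98.0"), csv_path)

# read.csv() takes the file path and returns a data frame
sales <- read.csv(csv_path, stringsAsFactors = FALSE)
str(sales)
```

In practice csv_path would simply be the path to your own file.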
Q. Name the various components of the grammar of graphics in R.
The components of the grammar of graphics include:-
1. Aesthetics layer
3. Geometry layer
4. Coordinates layer
5. Data layer
6. Themes layer
Q. Give a brief description of R Markdown. What is it used for?
R Markdown is the reporting tool provided with R. With R Markdown you can produce high-quality reports. The output format can be HTML, PDF or Word.
Q. How do you install packages in R?
The command used for installing packages in R is install.packages(), called with the package name in quotes, for example install.packages("ggplot2").
Q. Write the steps involved in building and evaluating a linear regression model in R.
The steps to be performed are:-
1. First, divide the data into train and test sets. This is an important step because the model is built on the train set and its performance is evaluated on the test set. For this purpose you can use the sample.split() function from the caTools package, which splits the data in a ratio that you specify.
2. Once the split is complete, build the model on the train set with the lm() function.
3. Next, use the predict() function to predict the values for the test set.
4. The final step is to calculate the RMSE (root mean squared error). A lower RMSE means the predictions are closer to the observed values.
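The four steps above can be sketched roughly as follows. To stay free of extra packages, this sketch swaps in base R's sample() for the split instead of caTools' sample.split(); the built-in mtcars data, the 70/30 ratio and the formula mpg ~ wt + hp are assumptions chosen only for illustration.

```r
set.seed(42)  # make the random split reproducible

# Step 1: split into train and test sets (about 70% / 30%)
n <- nrow(mtcars)
train_idx <- sample(seq_len(n), size = floor(0.7 * n))
train_set <- mtcars[train_idx, ]
test_set  <- mtcars[-train_idx, ]

# Step 2: build the linear model on the train set
fit <- lm(mpg ~ wt + hp, data = train_set)

# Step 3: predict the response for the test set
pred <- predict(fit, newdata = test_set)

# Step 4: root mean squared error on the test set
rmse <- sqrt(mean((test_set$mpg - pred)^2))
rmse
```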
Q. Name the packages used for data imputation in R.
The packages used for data imputation in R are:-
Q. Describe the confusion matrix in R.
A confusion matrix is used to find the accuracy of a classification model built in R. It is a cross-tabulation of the predicted and the observed classes. The function used to compute it is confusionMatrix() from the caret package.
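To make the cross-tabulation concrete, here is a minimal sketch that uses base R's table() rather than caret's confusionMatrix(). The observed and predicted labels are invented for the example.

```r
# Hypothetical observed and predicted class labels
observed  <- factor(c("yes", "no", "yes", "yes", "no", "no", "yes", "no"),
                    levels = c("no", "yes"))
predicted <- factor(c("yes", "no", "no", "yes", "no", "yes", "yes", "no"),
                    levels = c("no", "yes"))

# Cross-tabulate predicted against observed classes
cm <- table(Predicted = predicted, Observed = observed)
cm

# Accuracy: correctly classified cases (the diagonal) over all cases
accuracy <- sum(diag(cm)) / sum(cm)
accuracy
```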
Q. How do you write a custom function in R?
The syntax for writing a custom function is:-
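A minimal sketch of the general form, with a hypothetical function percent_change used only as an example:

```r
# General form:
#   function_name <- function(arg1, arg2 = default_value) {
#     ... body ...
#   }
# The value of the last evaluated expression is returned.

percent_change <- function(old, new) {
  (new - old) / old * 100
}

result <- percent_change(50, 75)
result
```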
Q. What functions are available in the dplyr package?
The functions that are available are:-
Q. How do you create a new R6 class in R?
The first step is to create an object template. This template consists of the data members and the functions that belong to the class. The parts of an R6 object template are:-
1. Class name
2. Private data members
3. Public member functions
Q. Explain the concept of a random forest.
A random forest is an ensemble classifier made from several decision tree models. The results of the individual trees are combined, and the combined result is usually more accurate than the result of any single tree. To build and evaluate a random forest, first split the data into train and test sets. The random forest is then built on the train set, and predictions are made on the test set.
Q. Write a brief description of shiny in R.
Q. Write down the advantages of using a function of the apply family in R.
Per-entry changes can be made to matrices and data frames with the help of the apply function. An example of an apply family function in R:-
apply(X, MARGIN, FUN, ...) Here X is the matrix or array. MARGIN determines whether the function is applied over rows (1), columns (2) or both (c(1, 2)). FUN is the function that is applied. The advantage is that every entry of a matrix or data frame can be processed with a single line of code, without writing explicit loops.
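As a small sketch of the three MARGIN settings, the following applies functions over the rows, the columns and the individual entries of a made-up 2 x 3 matrix:

```r
m <- matrix(1:6, nrow = 2)  # 2 rows, 3 columns, filled column by column

row_sums  <- apply(m, 1, sum)                           # MARGIN = 1: rows
col_means <- apply(m, 2, mean)                          # MARGIN = 2: columns
doubled   <- apply(m, c(1, 2), function(x) x * 2)       # both: every entry

row_sums
col_means
doubled
```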
Q. Which packages are used for data mining in R?
The packages used for data mining in R are:-
** data.table provides faster reading of large files.
** caret and rpart are used for machine learning models.
** arules is used for association rule learning.
** ggplot2 provides several kinds of plots for data visualization.
** tm is used for text mining.
** forecast provides functions for time series analysis.
Q. What does clustering mean in R?
A cluster is a group of objects that belong to the same class, and clustering is the process of grouping a set of abstract objects into classes of similar objects. A clustering algorithm should meet the following requirements:-
1. It should be scalable, in order to handle large databases.
2. It should be able to handle various types of attributes.
3. It should be able to detect clusters of arbitrary shape.
4. It should deal efficiently with high-dimensional spaces.
5. It should cope with databases that contain noisy, erroneous or missing data.
6. Its results should be usable, interpretable and comprehensible.
Q. Give the difference between k-means clustering and hierarchical clustering.
K-means clustering is a popular partitioning method in which each object is assigned to exactly one of K groups.
Hierarchical clustering is a method that builds a hierarchical decomposition of the given data objects. The two approaches to hierarchical clustering are:-
1. Agglomerative approach
2. Divisive approach
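The contrast can be sketched with base R's kmeans() and hclust(). The built-in iris measurements, the choice of three groups and the "average" linkage are assumptions for illustration; hclust() is agglomerative, and the divisive approach is not shown here.

```r
set.seed(1)                 # k-means starts from random centres
x <- scale(iris[, 1:4])     # standardise the four numeric columns

# K-means: partition the observations directly into K = 3 groups
km <- kmeans(x, centers = 3, nstart = 10)

# Hierarchical (agglomerative): build the full tree, then cut it into 3 groups
hc <- hclust(dist(x), method = "average")
hc_groups <- cutree(hc, k = 3)

# Compare the two groupings
table(kmeans = km$cluster, hierarchical = hc_groups)
```

With k-means the number of groups must be fixed before fitting, whereas the hierarchical tree can be cut at any level afterwards.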
Q. Describe the rattle package in R.
Rattle is a well-known GUI used for data mining in R. It presents statistical and visual summaries of data, helps in transforming data, and builds both supervised and unsupervised machine learning models. It also presents the performance of models graphically and scores new datasets for deployment into production.
Q. How do you put multiple plots on a single page?
It is quite easy to put multiple plots on a single page with base graphics. For example, if you need to plot 4 graphs on the same page, you can use:- par(mfrow = c(2, 2))
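A short sketch of how this might look in practice, using the built-in mtcars data; the four plot types are arbitrary choices for the example.

```r
# Switch to a 2 x 2 grid (filled row by row) and remember the old settings
old_par <- par(mfrow = c(2, 2))

plot(mtcars$wt, mtcars$mpg, main = "Scatter plot")
hist(mtcars$mpg, main = "Histogram")
boxplot(mpg ~ cyl, data = mtcars, main = "Box plot")
barplot(table(mtcars$gear), main = "Bar plot")

# Restore the previous layout so later plots fill the whole page again
par(old_par)
```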
Q. Give a brief description of the white noise model.
The white noise (WN) model is a basic time series model. It is one of the simplest examples of a stationary process. It has:-
1. A fixed, constant mean
2. A fixed, constant variance
3. No correlation over time
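As an illustrative sketch, a white noise series can be simulated with independent normal draws. The series length, mean and standard deviation below are arbitrary assumptions.

```r
set.seed(123)

# 200 independent draws with constant mean 5 and constant sd 2
wn <- ts(rnorm(200, mean = 5, sd = 2))

mean(wn)  # close to the fixed mean
var(wn)   # close to the fixed variance (sd^2 = 4)

# Lag-1 sample autocorrelation; should be near zero for white noise
lag1 <- acf(wn, plot = FALSE)$acf[2]
lag1
```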
Q. Describe principal component analysis.
It is a method for dimensionality reduction. Data becomes hard to work with when each observation is described by a large number of dimensions or features, which is why it is necessary to reduce the number of dimensions. The features of this method are:-
1. The data is transformed to a new space with the same number of dimensions or fewer. These dimensions are called principal components.
2. The first principal component holds the maximum amount of variance present in the features of the original data.
3. The second principal component is orthogonal to the first and captures the maximum amount of the remaining variability.
4. The entire principal components are.
Q. What is a factor?
Conceptually, factors are variables in R that take on a limited number of different values. Such variables are often referred to as categorical variables. Factors are used in statistical modeling, and storing categorical data as factors ensures that modeling functions treat it correctly.
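A small sketch of a factor with three levels; the level names are made up for the example.

```r
# A categorical variable with a fixed set of allowed values (levels)
priority <- factor(c("low", "high", "medium", "low"),
                   levels = c("low", "medium", "high"))

levels(priority)      # the distinct values the factor can take
table(priority)       # counts per level
as.integer(priority)  # internal integer codes, following the level order
```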
Q. Name the different import functions in R.
You can import data into R from various formats and sources. The import functions include:-
1. To read .csv files, there is the read.csv() function.
2. To read .sas7bdat files, there is the read_sas() function from the haven package.
3. To read Excel sheets, there is the read_excel() function from the readxl package.
4. And lastly, for SPSS data, you can use the read_sav() function from the haven package.
Q. Which functions are used for debugging in R?
The functions used for debugging in R are:-
Q. What is correlation?
Correlation is a measure of the strength of association between two variables.
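As a brief sketch, base R's cor() computes the Pearson correlation coefficient by default. The choice of the built-in mtcars columns wt and mpg is only for illustration.

```r
# Pearson correlation between car weight and fuel efficiency
r <- cor(mtcars$wt, mtcars$mpg)
r

# cor.test() adds a significance test and a confidence interval
cor.test(mtcars$wt, mtcars$mpg)
```

The coefficient lies between -1 and 1; here it is negative because heavier cars tend to have lower mpg.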
Q. Which function performs the cross-product of two tables in R?
The merge() function can be used to find the cross-product of two tables; calling it with by = NULL returns every combination of rows.
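A minimal sketch with two made-up tables. When merge() is called with by = NULL it returns the Cartesian product of the two data frames.

```r
# Hypothetical tables
products <- data.frame(product = c("laptop", "phone", "tablet"))
regions  <- data.frame(region = c("North", "South"))

# by = NULL: no key columns, so every row is paired with every row
combos <- merge(products, regions, by = NULL)
combos
```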
Q. How do you join multiple strings together in R?
You can join strings in R with functions such as paste() and str_c() (from the stringr package).
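A short sketch using only base R's paste() and paste0(); the strings are arbitrary examples.

```r
# paste() joins its arguments with a space by default
title <- paste("Business", "Analytics")

# paste0() uses no separator and is vectorised over its arguments
files <- paste0("file_", 1:3, ".csv")

# collapse joins the elements of one vector into a single string
joined <- paste(c("a", "b", "c"), collapse = "-")

title
files
joined
```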
Q. Which functions are used to perform left joins and right joins?
The left_join() and right_join() functions from the dplyr package are used.
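For readers without dplyr installed, the same two joins can be sketched in base R with merge(); this is a substitute technique, not the dplyr API. The customers and orders tables are hypothetical.

```r
customers <- data.frame(id = c(1, 2, 3), name = c("Asha", "Ben", "Chen"),
                        stringsAsFactors = FALSE)
orders    <- data.frame(id = c(2, 3, 4), amount = c(250, 90, 40))

# Left join: keep every row of the first table (all.x = TRUE)
left <- merge(customers, orders, by = "id", all.x = TRUE)

# Right join: keep every row of the second table (all.y = TRUE)
right <- merge(customers, orders, by = "id", all.y = TRUE)

left
right
```

Rows without a match in the other table are filled with NA.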
Q. How do you build a PCA model in R?
PCA can be done in R with the prcomp() function.
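A minimal sketch of prcomp() on the built-in USArrests data. The dataset and the decision to centre and scale the variables are assumptions for illustration.

```r
# Centre and scale so that variables on different units are comparable
pca <- prcomp(USArrests, center = TRUE, scale. = TRUE)

summary(pca)      # proportion of variance explained by each component
pca$rotation      # loadings: one column per principal component
head(pca$x, 3)    # the observations expressed in the new space
```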
Q. Describe the random walk model.
The random walk model is a simple non-stationary process that has no specified mean or variance. It can be built as the cumulative sum of a white noise series.
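As a rough sketch, a random walk can be simulated by cumulatively summing independent normal steps. The series length and the seed are arbitrary.

```r
set.seed(7)

steps <- rnorm(300)        # independent increments with mean 0
rw <- ts(cumsum(steps))    # random walk: running total of the increments

plot(rw, main = "Simulated random walk")

# Differencing the walk recovers the increments (all but the first)
all.equal(as.numeric(diff(rw)), steps[-1])
```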
Q. Name some R functions?
Q. Briefly describe R Commander.
R Commander is a free statistical GUI for R, provided by the "Rcmdr" package. Here is the series of R Commander plug-ins:
Q. How is R different from other statistical tools available in the market? What are its strengths and weaknesses vis-à-vis SAS and SPSS?
R is fundamentally different from the SAS language (which is divided into procedures and data steps) and the menu-driven SPSS. It is object oriented and much more flexible, hence powerful, yet confusing to the novice, as there are multiple ways to do anything in R. Overall it is a very elegant language for statistics, and its strengths are enhanced by nearly 5000 packages developed by leading minds at universities across the planet.
Q. Which R packages do you use the most and which ones are your favorites?
I use R Commander and Rattle a lot, along with their dependent packages. I use car for regression, forecast for time series, and many packages for specific graphs. I have not mastered ggplot, though I do use it sometimes. Overall I am waiting for Hadley Wickham to come up with an updated book on his ecosystem of packages, as they are formidable, comprehensive and easy to use in my opinion, so much so that I can get by with occasional copy-and-paste code.
Q. What level of adoption do you see for R as a preferred tool in the industry? Are Indian businesses also keen to adopt R?
I see surprising growth for R in business, and I have had to turn down offers for consulting and training as I write my next book, R for Cloud Computing. Indian businesses are keen to cut costs, like businesses globally, but have the added advantage of a huge pool of young engineers and quantitatively trained people to choose from. So there is more interest in R in India, and it is growing thanks to the efforts of companies like SAP, Oracle, Revolution Analytics and RStudio, who have invested in R and are making it more popular. The R Project organization is dominated by academia, which is reflected in its priority of making the software better, faster and more stable, but the rest of the community has been making efforts to introduce it to industry.
Q. How did you start your career in analytics and how were you first acquainted with R?
I started my career after my MBA selling cars, which meant selling a lot of dreams and managing people who told lies to sell cars. So I switched to Business Analytics thanks to GE in 2004, and I had the personal good luck of having Shrikant Dash, ex-CEO of GE Analytics, as my first US client. He was a tough guy and taught me a lot. I came to R only after leaving the cozy world of corporate analytics in 2007.
Q. How do you see Analytics evolving today in the industry as a whole? What are the most important contemporary trends that you see emerging in the Analytics space across the globe?
I don't know how analytics will evolve, but it will grow bigger and move more towards the cloud and larger data sizes. Big Data/Hadoop, Cloud Computing, Business Analytics and Optimization, and Text Mining are some of the buzzwords currently in fashion.