Data Analytics - Industrial Engineering & Manufacturing
Practice questions to test your knowledge and improve your understanding.
What will the following R code do?
    mydata$v2 <- mydata$v4 <- NULL
In the Google Analytics tool, which of the following analyses should be performed in order to identify the origin of a user's web traffic?
In data mining, which of the following options correctly defines Precision, which is used for assessing the quality of text retrieval?
In a generalized linear model, which of the following link functions belongs, by default, to the Poisson family?
The ______ of a worksheet defines its appearance.
In data mining, which of the following classification models is built by the kNN algorithm?
In association rule mining, an indication of how often the rule has been found to be true is represented by a term known as confidence. How is this term, confidence, represented for the rule A => B?
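As a reference point, the usual textbook definition of confidence can be written as below. The notation supp(.) for the support of an itemset is an assumption introduced here; it does not appear in the question.

```latex
\[
\operatorname{conf}(A \Rightarrow B)
  = \frac{\operatorname{supp}(A \cup B)}{\operatorname{supp}(A)}
  = P(B \mid A)
\]
```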
Find the output of the following R programming language code.
    a <- c(7,5,FALSE,4+4i)
    b <- c(6,0,TRUE,4+7i)
    print(a&&b)
In which of the following types of reasoning in data science are the conclusions reached probable, reasonable, plausible, and believable?
    Deductive reasoning
    Inductive reasoning
Which of the following options denotes the probability of avoiding a type II error in hypothesis testing?
Which of the following is the correct syntax of the PredictSupport (DMX) prediction function used with Microsoft linear regression algorithm?
What will be the output of the following code of the R programming language?
    b1 <- 17
    b2 <- 13
    z <- 5:7
    print(b1 %in% z)
    print(b2 %in% z)
Which of the following is the correct R syntax used for selecting certain rows from a data frame, based on specific logical criteria?
In logistic regression, which of the given methods is used to display the conditional density plot of the binary outcome, F, on the continuous x variable?
Which of the following statements is correct about the judgement sampling method?
Which of the following t-tests should be performed in order to compare means from two different groups?
Which of the following data mining algorithms is applied to a database containing a large number of transactions and also learns association rules?
With respect to advanced statistics, which of the following options is correct about the arima() function?
Which of the following options is the default CLUSTERING_METHOD used by the Microsoft clustering algorithm?
_______ reduces the number of bits in a file by identifying and eliminating redundancy.
With respect to the Microsoft sequence clustering algorithm, which of the following options is the correct syntax of the PredictCaseLikelihood (DMX) function?
As per the Microsoft association rules algorithm, which of the following options is the prediction function with a scalar value as the return type?
As per the Microsoft association rules algorithm, which of the following prediction functions has/have a Boolean return type?
Data types that are created by the programmer are known as ________.
In data mining, which of the following statements is NOT correct about the C4.5 algorithm?
In advanced statistics, which of the following statements is correct about the Dirichlet Regression method?
Using the following information, find the correct syntax of the R function used for creating binary files. Assume object as the binary file to be written, n as the number of bytes, and con as the connection object.
The values of X and Y are given in figure-1 of the image. Choose the correct value of 2X - 5Y from figure-2.
It is given that y is a Poisson variate and satisfies the condition P(y=4) = P(y=5). What are the values of mean and standard deviation of y?
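A short worked sketch of how the stated condition pins down the parameter, assuming the standard Poisson mass function. The rate symbol λ is introduced here for illustration and is not part of the question.

```latex
\begin{align*}
P(y = k) &= \frac{e^{-\lambda}\lambda^{k}}{k!} \\
\frac{e^{-\lambda}\lambda^{4}}{4!} &= \frac{e^{-\lambda}\lambda^{5}}{5!}
  \;\Longrightarrow\; \lambda = \frac{5!}{4!} = 5 \\
\text{mean} &= \lambda = 5, \qquad
\text{standard deviation} = \sqrt{\lambda} = \sqrt{5} \approx 2.236
\end{align*}
```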
In association rule mining, which of the following statements is correct about Frequent Itemset Generation of the two-step approach?
In data mining, according to Bayes' theorem, which of the following formulae represents posterior probability in terms of prior probability?
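For orientation, Bayes' theorem in its usual form is shown below. The symbols H (hypothesis) and X (observed evidence) are assumed names, not taken from the question.

```latex
\[
\underbrace{P(H \mid X)}_{\text{posterior}}
  = \frac{\overbrace{P(X \mid H)}^{\text{likelihood}}\;
          \overbrace{P(H)}^{\text{prior}}}
         {P(X)}
\]
```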
Regression equation of Z on V is given as follows:
    Z = c + dV
The relationship between two variables, a and b, is given as b + 6a = 20, and between another two variables, c and d, as 4c + 10d = 50. The regression coefficient of c on a is given as 0.90. Find the regression coefficient of d on b.
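One way to approach this, sketched under the assumption that a, b, c and d are treated as variables linked by the two exact linear relations, is to use the effect of a change of origin and scale on a regression coefficient. The notation b_{ca} and b_{db} for "regression coefficient of c on a" and "of d on b" is introduced here for illustration.

```latex
\begin{align*}
b &= 20 - 6a, \qquad d = 5 - 0.4\,c \\
b_{db} &= \frac{\operatorname{Cov}(d, b)}{\operatorname{Var}(b)}
        = \frac{(-0.4)(-6)\,\operatorname{Cov}(c, a)}{(-6)^{2}\,\operatorname{Var}(a)}
        = \frac{2.4}{36}\, b_{ca} \\
       &= \frac{2.4}{36} \times 0.90 = 0.06
\end{align*}
```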
In which of the following text mining methods, terms are analyzed on the sentence and document level?
For a given set of 25 items, the coefficient of correlation between x and y is 0.6. The values of the arithmetic mean of x and y are 14 and 18, respectively, and the values of the standard deviation of x and y are 4 and 6, respectively. If the pair (25, 18) has been wrongly taken as (18, 25), then find the correct value of the correlation coefficient.
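A worked sketch of the correction, assuming the standard deviations use the divisor n = 25 (the population form). The raw sums are first recovered from the given summary values, then adjusted by swapping the wrong pair for the right one.

```latex
\begin{align*}
\textstyle\sum x &= 25 \times 14 = 350, &
\textstyle\sum y &= 25 \times 18 = 450, \\
\textstyle\sum x^{2} &= 25\,(4^{2} + 14^{2}) = 5300, &
\textstyle\sum y^{2} &= 25\,(6^{2} + 18^{2}) = 9000, \\
\textstyle\sum xy &= 25\,(0.6 \times 4 \times 6 + 14 \times 18) = 6660.
\end{align*}
Removing $(18, 25)$ and adding $(25, 18)$:
\begin{align*}
\textstyle\sum x &= 350 - 18 + 25 = 357, &
\textstyle\sum y &= 450 - 25 + 18 = 443, \\
\textstyle\sum x^{2} &= 5300 - 18^{2} + 25^{2} = 5601, &
\textstyle\sum y^{2} &= 9000 - 25^{2} + 18^{2} = 8699, \\
\textstyle\sum xy &= 6660 - 18 \times 25 + 25 \times 18 = 6660.
\end{align*}
\[
r = \frac{n\sum xy - \sum x \sum y}
         {\sqrt{\bigl(n\sum x^{2} - (\sum x)^{2}\bigr)\bigl(n\sum y^{2} - (\sum y)^{2}\bigr)}}
  = \frac{25 \times 6660 - 357 \times 443}
         {\sqrt{(25 \times 5601 - 357^{2})(25 \times 8699 - 443^{2})}}
  = \frac{8349}{\sqrt{12576 \times 21226}}
  \approx 0.51
\]
```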
Which of the following clustering algorithms is used for grid-based partitioning?
For a group of 12 students, the sum of squares of differences in their ranks for science and math is given as 60. On the basis of the given information, find the value of the rank correlation coefficient.
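The arithmetic can be sketched with Spearman's rank correlation formula, assuming there are no tied ranks. The symbols ρ, n and d are the conventional ones and are not named in the question.

```latex
\[
\rho = 1 - \frac{6 \sum d^{2}}{n\,(n^{2} - 1)}
     = 1 - \frac{6 \times 60}{12\,(12^{2} - 1)}
     = 1 - \frac{360}{1716}
     \approx 0.79
\]
```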
In SQL Server data mining, which of the following algorithm types predicts one or more discrete variables that are based on other attributes in a dataset?
Diigo and Delicious are ________ tools.
In survival analysis, which of the following methods is used to model the hazard function on a set of predictor variables?
Which of the following statements is NOT correct about data science?
Find the output of the following code of the R programming language.
    lista <- list(5:7)
    print(lista)
    listb <- list(12:14)
    print(listb)
    x1 <- unlist(lista)
    x2 <- unlist(listb)
    print(x1)
    print(x2)
    r <- x1+x2
    print(r)
As per the Microsoft association rules algorithm, which of the following parameters specifies the minimum number of cases that must contain an itemset before the algorithm generates a rule?
In the generalized linear model of advanced statistics, which of the following is the default link function for the gaussian family?
What will be the output of the following R code?
    c(4,7,TRUE,3+7i) -> v1
    c(9,6,FALSE,3+7i) ->> v2
    print(v1)
    print(v2)
Which of the following statements are NOT correct about the Bayesian belief network?
Which of the following is the correct default value for the INSTABILITY_SENSITIVITY parameter used with the Microsoft time series algorithm?
Which of the following is the correct syntax of the command used for merging two data frames, myFrame1 and myFrame2, by ID and Country?
Consider the following data:
    Average cost of wafers = Rs. 35
    Average cost of chocolates = Rs. 37
    Standard deviation of cost of wafers = 2.0
    Standard deviation of cost of chocolates = 3.0
    Correlation coefficient between the costs of chocolates and wafers = 0.7
What will be the expected cost of chocolates when the cost of wafers is Rs. 40?
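A worked sketch using the regression line of chocolates on wafers. The symbols x (cost of wafers) and y (cost of chocolates) are assumed here for illustration.

```latex
\begin{align*}
y - \bar{y} &= r\,\frac{\sigma_{y}}{\sigma_{x}}\,(x - \bar{x}) \\
y - 37 &= 0.7 \times \frac{3.0}{2.0} \times (40 - 35) = 5.25 \\
y &= 42.25
\end{align*}
```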
Which of the following options is correct about the logistic regression technique?
Which of the following is the default value of the parameter HISTORICAL_MODEL_GAP used in the Microsoft time series algorithm?