CS 6264 Statistical Analysis with Software Application
___________ uses artifacts to present data
visually. |
data
visualization Correct |
_____________ includes identifying groups of data
records. |
Cluster analysis Correct |
_____________ is rated as the number
one business analytics software. |
Rapid miner Correct |
“All models are wrong but some are
useful” |
George E. P. Box Correct |
A bell shaped curve that is
symmetric about a vertical line. |
normal distribution Correct |
A bell-shaped distribution that is symmetric about
a vertical line? |
normal Correct |
A distribution where large distribution
are displayed. |
Grouped frequency
distribution Correct |
A frequently used method, as it enables binary
variables, sums of binary variables, or polytomous variables to be modelled. |
logistic regression Correct |
A matrix that has the same number of
rows and columns is called |
square Correct |
A negative correlation exists when___________. |
x increases y decreases Correct |
A network purporting to describe family
memberships. |
network topology Correct |
A new phenomenon for the explosion of
_________data |
interaction Correct |
A positive z-score means that the score
is |
Higher than the mean Correct |
A score of 50 lies 2 standard deviations above a
mean of 30. What is the value of the standard deviation? |
10 Correct |
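A quick check of the answer above, assuming the usual z-score relation z = (score - mean) / sd. The variable names are illustrative only.

```python
# Solve z = (score - mean) / sd for sd, using the values from the question.
score = 50
mean = 30
z = 2  # the score lies 2 standard deviations above the mean

sd = (score - mean) / z
print(sd)  # 10.0
```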
A special type of function where the
domain is a set of consecutive integers. |
sequence Correct |
A survey of 100 consumers said that the price
charged for a kilo of rice could be approximated by a normal distribution
with a mean of 35 and a standard deviation of 4. How many of them are below 39? |
84 Correct |
A survey of 100 consumers said that the
price charged for a kilo of rice could be approximated by a normal
distribution with a mean of 35 and a standard deviation of 4. How many of them
lie between 27 and 43? |
95 Correct |
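The two rice-price answers can be checked with a short sketch, assuming the prices follow the stated normal distribution exactly and that the counts are expected values rounded to whole consumers. It uses Python's standard `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

# Price model from the question: mean 35, standard deviation 4.
price = NormalDist(mu=35, sigma=4)
n_consumers = 100

# Expected number of consumers below 39 (one standard deviation above the mean).
below_39 = n_consumers * price.cdf(39)

# Expected number between 27 and 43 (within two standard deviations of the mean).
between_27_and_43 = n_consumers * (price.cdf(43) - price.cdf(27))

print(round(below_39), round(between_27_and_43))  # 84 95
```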
Addition and subtraction of matrices is only
possible if the two or more matrices |
have the same size. Correct |
A distribution with 4 modes is said to be a
_________distribution. |
multimodal Correct |
All representations are ________ |
imperfect Correct |
An example of an abstract computer. |
Turing machine Correct |
Another term for an empty set. |
null Correct |
Another term for text analytics. |
Another term for text
analytics. Correct |
Any way to get new expressions from old
ones. |
inference Correct |
A perfect positive correlation coefficient is equal
to |
1 Correct |
AUC means___________. |
Area Under the Curve Correct |
Data involving two variables are called
_________data. |
bivariate Correct |
Data involving two variables. |
bivariate Correct |
Earlier name for data science. |
datalogy Correct |
Empirical rule for a normal distribution that is 2
standard deviations above and below the mean is ________% of data. |
95 Correct |
Empirical rule for a normal
distribution that is 3 standard deviations above and below the mean covers
______% of the data. |
99.7 Correct |
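The empirical-rule percentages can be reproduced from the standard normal CDF. This is a minimal sketch using `statistics.NormalDist`; the `coverage` dictionary is just an illustrative name.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Percentage of data within k standard deviations of the mean, for k = 1, 2, 3.
coverage = {
    k: 100 * (std_normal.cdf(k) - std_normal.cdf(-k))
    for k in (1, 2, 3)
}

for k, pct in coverage.items():
    print(k, round(pct, 1))
```

The values for k = 2 and k = 3 round to the 95% and 99.7% quoted in the answers.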
Exabyte means ________ bytes |
billion billion Correct |
He coined the term “analysis of
algorithms”. |
Donald Knuth Correct |
He pointed out that until 2003, all of mankind had
generated just 5 exabytes of data. |
Eric Schmidt Correct |
He proposed the use of a penalized
likelihood function. |
Firth Correct |
He said that “In mathematics the art of
proposing a question must be held of higher value than solving it”. |
Georg Cantor Correct |
How many bytes of data are generated every two
days in today's world? |
5 exabytes Correct |
If in a distribution all scores
are distinct then _____________. |
there is no mode. Correct |
If A = {x | x is a distinct letter in the word
"MATHEMATICS"} and B = {x | x is a distinct letter in the word
"STATISTICS"}, then their intersection is |
{A,C,I,S,T} Correct |
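The intersection can be verified with Python's built-in `set` type, treating each word as a set of distinct letters. The names `a`, `b`, and `common` are illustrative.

```python
# Distinct letters of each word.
a = set("MATHEMATICS")
b = set("STATISTICS")

# Letters that appear in both words.
common = a & b
print(sorted(common))  # ['A', 'C', 'I', 'S', 'T']
```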
If A = {2,3} and B = {4,5}, which of the
following is the Cartesian product of the two sets? |
{(3,4), (3,5), (2,4), (2,5)} Correct |
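As a small illustration, the Cartesian product A x B pairs every element of A with every element of B. The sketch below uses `itertools.product` from the standard library; the order of the pairs does not matter for a set.

```python
from itertools import product

a = [2, 3]
b = [4, 5]

# Every ordered pair (x, y) with x from A and y from B.
pairs = list(product(a, b))
print(pairs)  # [(2, 4), (2, 5), (3, 4), (3, 5)]
```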
If R = {(3,3), (3,6), (5,5), (5,10), (6,12)} is a
binary relation, then its domain is |
{3,5,6} Correct |
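A short sketch of how the domain (and, for comparison, the range) of a binary relation can be read off its ordered pairs. It assumes the last pair is (6, 12); the names are illustrative.

```python
# The relation as a set of ordered pairs (last pair assumed to be (6, 12)).
relation = {(3, 3), (3, 6), (5, 5), (5, 10), (6, 12)}

# Domain: all first components. Range: all second components.
domain = {x for x, _ in relation}
value_range = {y for _, y in relation}

print(sorted(domain))       # [3, 5, 6]
print(sorted(value_range))  # [3, 5, 6, 10, 12]
```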
If the standard deviation of a distribution
is 3.5, the variance is |
12.25 Correct |
If there are 101 scores the median is equal to the
_____ranked score. |
51 Correct |
If there are 103 scores the median is
equal to the _____ranked score. |
52nd Correct |
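For an odd number of scores n, the median is the score at rank (n + 1) / 2. A minimal sketch of that rule, with `median_rank` as a hypothetical helper name:

```python
def median_rank(n: int) -> int:
    """Rank of the median score when n (odd) scores are sorted."""
    if n % 2 == 0:
        raise ValueError("with an even n the median falls between two ranks")
    return (n + 1) // 2

print(median_rank(101), median_rank(103))  # 51 52
```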
If the standard deviation of a distribution is 3,
the variance is |
9 Correct |
In 2,4,4,4,5,5,6,8,9 the range is |
7 Correct |
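The range is the largest score minus the smallest. A quick check on the data from the question:

```python
scores = [2, 4, 4, 4, 5, 5, 6, 8, 9]

# Range = maximum - minimum.
score_range = max(scores) - min(scores)
print(score_range)  # 7
```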
In the equation of the regression line represented
by Y= 1.24 X + 6.9 if X=2 then Y =? |
9.38 Correct |
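Substituting X = 2 into the regression line gives the answer directly. The function name `predict` is only illustrative; the slope and intercept come from the question.

```python
def predict(x: float, slope: float = 1.24, intercept: float = 6.9) -> float:
    """Value of Y on the regression line Y = slope * X + intercept."""
    return slope * x + intercept

# Rounded to two decimals to hide floating-point noise.
print(round(predict(2), 2))  # 9.38
```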
If α = babaa and β
= a^6b^5bb, what is the length of the concatenation of the two strings? |
18 Correct |
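The length can be checked by building the strings, reading a^6 as six copies of "a" and b^5 as five copies of "b" (the usual exponent notation for strings, assumed here).

```python
alpha = "babaa"
beta = "a" * 6 + "b" * 5 + "bb"  # a^6 b^5 bb

combined = alpha + beta
print(len(alpha), len(beta), len(combined))  # 5 13 18
```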
Analysis of algorithms is an important part of a broader _____________. |
computational
complexity theory Correct |
It corresponds to the case where the
dependent variable has more than 2 categories. |
multinomial logit
model Correct |
It does NOT require the assumption that the
parameters are normally distributed. |
profile likelihood Correct |
It enables the performance of a model
and enables a comparison to be made with other models. |
ROC Correct |
It extracts meaningful numerical indices from
information and make it available to statistical and |
Text analytics Correct |
It has the goal of discovering useful
information to support decision making. |
data analysis Correct |
It includes identifying groups of data
records. |
cluster analysis Correct |
It involves a commitment in viewing the world in
terms of individual entities and relations. |
logic Correct |
It is a perfect software
for machine learning. |
Orange Correct |
It is popular among financial data analysts. |
Knime Correct |
It is a collection of machine learning
algorithms for data mining task. |
WEKA Correct |
It is a free software programming language. |
R-programming Correct |
It is a language in which we say things
about the world. |
Medium of human
expression. Correct |
It is a method for discovering patterns in large
data sets. |
Data Mining Correct |
It is a module in rapid miner that
considers the workflow. |
studio Correct |
It is a perfect software which is written in
Python computing language. |
Orange Correct |
It is a powerful tool that shows the
network of data. |
Knime Correct |
It is a process of finding the computational
complexity of algorithms. |
analysis of algorithms Correct |
It is a process that goes on internally,
while most things it wishes to reason about exist only externally. |
reasoning Correct |
It is a theoretical classification that estimates
and anticipates the increase in running time for algorithms. |
run time analysis Correct |
It is a variety of formal
calculation, typically deduction. |
Intelligent Reasoning Correct |
It is used for prototyping in Rapid
miner. |
studio Correct |
It is used in organization’s strategic and
tactical business decision making. |
business intelligence Correct |
It is used to discover patterns in
large data sets |
Data mining Correct |
It is used to enable an entity to determine
consequences by thinking rather than acting. |
Knowledge
Representation Correct |
It makes complex data more
understandable and usable. |
data visualization Correct |
It offers a way to examine trends from
collected data and derive insights from it. |
Business Intelligence Correct |
It refers to the degree of relationship
between two variables? |
Correlation Correct |
It relates the length of an algorithm’s input to the
number of storage locations it uses. |
space complexity Correct |
It relates the length of an algorithm’s
input to the number of steps it takes. |
time complexity Correct |
It sees a set of prototypes in particular prototypical
diseases to be matched against the case at hand. |
INTERNIST Correct |
It transforms data into actionable
intelligence for business purposes. |
Business Intelligence Correct |
It views the world in terms of attribute-object-value
triples. |
rule based Correct |
It views the world in thinking of
prototypical objects. |
frame Correct |
It expands available data enormously since there
is so much more text being generated than numbers. |
Text mining Correct |
It refers to well based theories and sound business
judgement. |
Data Science Correct |
It shows a high correlation between the
incidence of flu and searches about flu on Google. |
Google Flu Trends Correct |
KR as a _________is a substitute for the thing
itself. |
surrogate Correct |
KR is a set of __________commitments. |
ontological Correct |
KR means __________________________. |
Knowledge Representation Correct |
Matrix B is |
invertible Correct |
Null strings are indicated by |
λ Correct |
On an examination given to 1000
students, Jef’s score of 80 was higher than the score of 480 students who
took the exam. What is the percentile for Jef’s score? |
48th Correct |
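A sketch of the percentile calculation, assuming the simple definition used by the answer: the percentage of examinees who scored below the given score. The variable names are illustrative.

```python
students = 1000
scored_below = 480  # students with a score lower than 80

# Percentile rank = percentage of scores below the given score.
percentile = 100 * scored_below / students
print(percentile)  # 48.0
```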
PAW means ____________. |
Predictive Analytics
World Correct |
Positive correlation means that_______________. |
as x increases y also
increases and vice versa Correct |
Primarily used for data pre-processing. |
Knime Correct |
Refers to using tools of statistics to present
data visually. |
data visualization Correct |
The _______value is the weighted
average of the value the random variable may assume. |
Expected Correct |
The area of the standard normal curve to the right
of z=0.82 is _______. |
0.206 Correct |
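The right-tail area is one minus the standard normal CDF at z. A minimal check using `statistics.NormalDist`:

```python
from statistics import NormalDist

z = 0.82

# Area to the right of z under the standard normal curve.
right_tail = 1 - NormalDist().cdf(z)
print(round(right_tail, 3))  # 0.206
```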
The classification table that XL Stat
can display. |
confusion matrix Correct |
The constant multiplicative factors by which
algorithms are related are called _______ constants. |
hidden Correct |
The creation of data from varied
sources and its quantification into information. |
datafication Correct |
The developer of FarmVille, a famous game on the
internet. |
Zynga Incorporated Correct |
The equation of the _______line
predicts the value of Y given X. |
Regression Correct |
The following are abstract notions EXCEPT |
casualty Correct |
The following are artifacts used in
data analysis EXCEPT: |
ANOVA Correct |
The following are data mining techniques EXCEPT: |
Collection Correct |
The following are distinct roles that
KR plays EXCEPT |
Medium for pragmatically
diligent interpretation Correct |
The following are large inputs EXCEPT |
Big beta notation Correct |
The following are software tools used
in data mining EXCEPT |
SPSS Correct |
The following are the 3V's of big data EXCEPT |
veracity Correct |
The following processes are used in
data analysis EXCEPT: |
collecting Correct |
The following provided inspirations of what
constitutes intelligent reasoning EXCEPT |
Sociology Correct |
The function describing the performance
of an algorithm is usually an upper bound determined from ______inputs. |
worst case Correct |
The goal is to transform raw data into
understandable business information. |
Data mining Correct |
The integral of all the values of a
random variable in a probability density function is equal to______. |
One Correct |
The intersection of the two sets A = {2,3} and B = {4,5}
is a |
null set Correct |
The method of correlation used for
ranked score is ________. |
Spearman rho Correct |
The method that does not require the assumption
that parameters are normally distributed. |
profile likelihood Correct |
The most common functions used to link
probability to the explanatory variables are the LOGIT model and
________model. |
PROBIT Correct |
The most widely used continuous probability
distribution. |
Normal Correct |
The normal distribution with a mean of
0 and standard deviation of 1. |
Standard Correct |
The number that occurs most frequently is
called________. |
Mode Correct |
The process of
inspecting, cleansing, transforming and modelling data with the goal of
discovering useful information. |
data analysis Correct |
The product of a 2x5 and 5x3 matrices is a
______matrix |
2x3 Correct |
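The size rule for matrix products (an m x n matrix times an n x p matrix gives an m x p matrix) can be sketched as a small helper. `product_shape` is a hypothetical name, not a library function.

```python
def product_shape(left: tuple, right: tuple) -> tuple:
    """Shape of the product of a (rows, cols) matrix with another (rows, cols) matrix."""
    rows_left, cols_left = left
    rows_right, cols_right = right
    if cols_left != rows_right:
        raise ValueError("inner dimensions must match for the product to exist")
    return (rows_left, cols_right)

print(product_shape((2, 5), (5, 3)))  # (2, 3)
print(product_shape((5, 6), (6, 8)))  # (5, 8)
```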
The proportion of a well defined
positive event is called _________________. |
sensitivity Correct |
The proportion of a well-defined classified positive
events. |
sensitivity Correct |
The range of the binary relation R = {(3,3), (3,6),
(5,5), (5,10), (6,12)} is |
{3,5,6,10,12} Correct |
The score NOT easily affected by extreme values. |
Median Correct |
The sets A = {x | x is a
distinct letter in the word "MATHEMATICS"} and B = {x | x is a distinct
letter in the word "STATISTICS"} are |
joint Correct |
The symbol used to indicate strings with no
elements. |
λ Correct |
The two sets A = {2,3} and B = {4,5} are
said to be |
disjoint Correct |
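Two sets are disjoint when their intersection is the empty (null) set. A quick check with Python's built-in `set`:

```python
a = {2, 3}
b = {4, 5}

# The intersection is empty, so the sets are disjoint.
common = a & b
print(common)           # set()
print(a.isdisjoint(b))  # True
```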
The creation of data from varied sources and its
quantification into information. |
Datafication Correct |
The explosion of _______ data is the main
reason why every 2 days 5 exabytes of data are generated. |
interaction Correct |
The person who said that “The future is
not google-able”. |
William Gibson Correct |
There are how many data mining techniques? |
7 Correct |
The score easily affected by extreme
values is the _________. |
Mean Correct |
These are the data skills that a good
data scientist needs to cultivate EXCEPT |
speaking Correct |
To estimate the parameters of the model, the
________ function is maximized. |
likelihood Correct |
What conditions must be satisfied in
the development of a probability function for a discrete random
variable? |
a and b Correct |
What is a great example of data product? |
google maps Correct |
What is the process of deriving useful
information from text? |
Text Analytics Correct |
What is the size of the product of a 5x 6 and a 6x
8 matrices? |
5x 8 Correct |
What is the value of the mean if a
score of 110 is 3 standard deviation above the mean? |
95 Correct |
What is the value of the mean in a normal
probability density function? |
50 Correct |
What is the value of the standard
deviation in a standard normal distribution? |
1 Correct |
What is the value of quartile 3
in 2,4,4,4,5,5,6,8,9 ? |
7 Correct |
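Quartile conventions differ between textbooks and software, so the following is only a sketch under one convention: the default "exclusive" method of `statistics.quantiles`, which agrees with the answer above for this data. Other methods may give a different Q3.

```python
from statistics import quantiles

scores = [2, 4, 4, 4, 5, 5, 6, 8, 9]

# n=4 splits the data into quartiles; the default method is "exclusive".
q1, q2, q3 = quantiles(scores, n=4)
print(q1, q2, q3)  # 4.0 5.0 7.0
```

Note that Q2 equals the median of the data, as another item in this list states.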
What percent of data will lie within 2
standard deviation of the mean? |
95 Correct |
What programming language does Orange use? |
python Correct |
What programming language is used in
Rapid miner? |
Java Correct |
What type of text is processed in text analytics? |
unstructured Correct |
Which function provides the value of a function at
any particular value of x but does NOT directly give the probability of the
random variable? |
Probability density Correct |
Which is NOT a basic representation
technology? |
graph Correct |
Which is NOT a component of KR? |
it adheres to the
function Correct |
Which is NOT a KR technology? |
roles Correct |
Which is NOT a value of r ? |
1.02 Correct |
Which is NOT interaction data? |
data base Correct |
Which is primarily written in C and in Fortran? |
R-programming Correct |
Which of the following does
NOT use continuous distribution? |
hypergeometric Correct |
Which of the following belong to the GLM? |
logistic Correct |
Which of the following data mining
techniques is predictive? |
classification Correct |
Which of the following is a continuous
distribution? |
Chi-square Correct |
Which of the following is a discrete
distribution? |
Hypergeometric Correct |
Which of the following is a predictive data mining
technique? |
regression Correct |
Which of the following is NOT a data
mining tool? |
Python Correct |
Which of the following is NOT a goal in data
mining? |
collecting data Correct |
Which of the following is NOT a method
used in data analysis? |
Statistics Analytics Correct |
Which of the following is NOT a module in rapid
Miner? |
loop Correct |
Which of the following is TRUE when a
distribution is normal? |
The correct answers
are: Mean, Median, Mode Correct |
Which of the following is TRUE? |
A + B = B+ A Correct |
Which of the following is used as a
method for Correlation? |
Pearson r Correct |
Which of the following pertains to predictive data
mining technique? |
Regression Correct |
Which of the following type of
text is processed in text analytics? |
unstructured Correct |
Which of the matrices is singular? |
A Correct |
Which pair belongs to the same family
of models called GLM? i) logistic ii) linear regression
iii.) multinomial regression iv)probability |
i and ii Correct |
Which of the following statements is
TRUE? |
Q2=median Correct |