|
It is a free software programming language. |
R-programming |
|
Which of the following is NOT a goal in data
mining? |
collecting data |
|
It transforms data into actionable intelligence
for business purposes. |
Business Intelligence |
|
It
has the goal of discovering useful information to support decision making. |
data analysis |
|
Which
of the following types of text is processed in text analytics? |
unstructured |
|
The
following are artifacts used in data analysis EXCEPT: |
ANOVA |
|
___________
uses artifacts to present data visually. |
data visualization |
|
It
extracts meaningful numerical indices from information and makes them available
to statistical and machine learning methods. |
Text analytics |
|
It
includes identifying groups of data records. |
cluster analysis |
|
Which
of the following data mining techniques is predictive? |
classification |
|
The
goal is to transform raw data into understandable business information. |
Data mining |
|
What
is the process of deriving useful information from text? |
Text Analytics |
|
_____________
includes identifying groups of data records. |
Cluster analysis |
|
It is
used in organization’s strategic and tactical business decision making. |
business intelligence |
|
It is
a powerful tool that shows the network of data. |
Knime |
|
What
programming language does Orange use? |
python |
|
The
following processes are used in data analysis EXCEPT: |
collecting |
|
Which
of the following is NOT a method used in data analysis? |
Statistics Analytics |
|
It is
a method for discovering patterns in large data sets. |
Data Mining |
|
It
makes complex data more understandable and usable. |
data visualization |
|
The
two sets A={2,3} and B={4,5} are said to be |
disjoint |
|
Which of the matrices is singular? |
A |
|
What
is the focus of data science? |
manipulate data efficiently and effectively |
|
What
is an organized collection of information and a set of operations used to
manage that information? |
ADT |
|
What
is the correct meaning of ADT? |
Abstract Data Type |
|
ML
means: |
Machine Learning |
|
Addition
and subtraction of matrices are possible only if the two or more matrices |
have the same size |
|
3A +
B = |
d. |
|
Which
is NOT a characteristic feature of data structure? |
It contains a fixed structure. |
|
It
refers to a data structure that grows and shrinks at execution time. |
dynamic |
|
If A={2,3}
and B={4,5}, which of the following is the Cartesian product of the two sets? |
{ (2,4), (2,5), (3,4), (3,5) } |
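A minimal sketch of the Cartesian product for the two sets in the question above, using Python's standard `itertools.product`. The variable names are illustrative only. The same sets also show the disjointness asked about elsewhere in this list.

```python
from itertools import product

A = {2, 3}
B = {4, 5}

# Cartesian product: every ordered pair (a, b) with a in A and b in B.
cartesian = set(product(A, B))

# Two sets are disjoint when they share no elements.
are_disjoint = A.isdisjoint(B)

print(sorted(cartesian))  # [(2, 4), (2, 5), (3, 4), (3, 5)]
print(are_disjoint)       # True
```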
|
What
is the earlier name for data science? |
datalogy |
|
Which
of the following is the transpose of B? |
a |
|
Which
of the following is TRUE? |
A + B = B+ A |
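The commutativity of matrix addition, and the same-size requirement, can be sketched as follows. The quiz's own matrices A and B are not reproduced here, so the 2x2 matrices below are hypothetical, and `add_matrices` is an illustrative helper rather than a library function.

```python
def add_matrices(left, right):
    """Add two matrices given as lists of rows; they must have the same size."""
    same_rows = len(left) == len(right)
    same_cols = all(len(a) == len(b) for a, b in zip(left, right))
    if not (same_rows and same_cols):
        raise ValueError("matrices must have the same size")
    return [[a + b for a, b in zip(row_l, row_r)]
            for row_l, row_r in zip(left, right)]

# Hypothetical 2x2 matrices, chosen only for illustration.
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]

print(add_matrices(A, B))                        # [[6, 8], [10, 12]]
print(add_matrices(A, B) == add_matrices(B, A))  # True
```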
|
What
is a data structure that has a fixed size? |
static |
|
Matrix
B is |
invertible |
|
The
intersection of the two sets A={2,3} and B={4,5} is a |
null set |
|
What
is the size of the product of a 5x6 and a 6x8 matrix? |
5x8 |
|
_______________
is a data structure in which every component has a unique predecessor and successor. |
linear |
|
An
array is a good example of _________data structure. |
Static |
|
Addition
and subtraction of matrices are possible only if the two or more matrices |
have the same size |
|
Which
of the following is TRUE? |
A + B = B+ A |
|
The
goal is to transform raw data into understandable business information. |
Data mining |
|
If A=
{ x/x is a distinct letter in the word "MATHEMATICS"} AND
B={x/x is a distinct letter in the word "STATISTICS"} then
their intersection is |
{A,C,I,S,T} |
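One way to check this answer is to let Python build the sets of distinct letters; a `set` built from a string drops repeated letters automatically. This is only a sketch, and the variable names are illustrative.

```python
# Distinct letters of each word.
A = set("MATHEMATICS")
B = set("STATISTICS")

# Intersection: letters that appear in both words.
common = A & B

print(sorted(common))  # ['A', 'C', 'I', 'S', 'T']
print(B < A)           # True: every letter of B is in A, but A has extra letters
```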
|
It is
a process of finding the computational complexity of algorithms. |
analysis of algorithms |
|
The
function describing the performance of an algorithm is usually an upper bound
determined from ______inputs. |
worst case |
|
Which
of the following data mining techniques is predictive? |
classification |
|
In α
=babaa β =a^6b^5bb, what is the length of the concatenation of
the two strings? |
18 |
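The count can be verified by building both strings, reading a^6 as six copies of "a" and b^5 as five copies of "b". A short sketch:

```python
alpha = "babaa"
beta = "a" * 6 + "b" * 5 + "bb"  # a^6 b^5 bb

# Concatenation simply joins the strings, so the lengths add: 5 + 13.
concatenated = alpha + beta

print(len(alpha), len(beta))  # 5 13
print(len(concatenated))      # 18
```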
|
He
coined the term “analysis of algorithms”. |
Donald Knuth |
|
Another
term for an empty set. |
null |
|
What
is the size of the product of a 5x 6 and a 6x 8 matrices? |
5x 8 |
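The size rule behind this answer is that an m x n matrix times an n x p matrix gives an m x p matrix, and the inner dimensions must agree. The helper below, `product_shape`, is a hypothetical name used only for this sketch.

```python
def product_shape(left, right):
    """Return the (rows, cols) of left @ right, given each shape as (rows, cols)."""
    left_rows, left_cols = left
    right_rows, right_cols = right
    if left_cols != right_rows:
        raise ValueError("inner dimensions must match")
    return (left_rows, right_cols)

print(product_shape((5, 6), (6, 8)))  # (5, 8)
print(product_shape((2, 5), (5, 3)))  # (2, 3)
```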
|
It is
a free software programming language. |
R-programming |
|
It
is popular among financial data analysts. |
Knime |
|
It is
used to discover patterns in large data sets |
Data mining |
|
Earlier
name for data science. |
datalogy |
|
Which
of the following is a predictive data mining technique? |
regression |
|
Another
term for text analytics. |
text mining |
|
Algorithm analysis is an important part of a
broader_____________. |
computational complexity theory |
|
It
is a perfect software for machine learning. |
Orange |
|
It is
a process of finding the computational complexity of algorithms. |
analysis of algorithms |
|
The
constant multiplicative factors by which the running times of different implementations
of the same algorithm are related are called _______ constants. |
hidden |
|
The
range of the binary relation R={ (3,3), (3,6), (5,5), (5,10), (6,12) }
is |
{3,5,6,10,12} |
|
It is
used in organization’s strategic and tactical business decision making. |
business intelligence |
|
Refers
to using tools of statistics to present data visually. |
data visualization |
|
Null
strings are indicated by |
λ |
|
Which
of the following is the transpose of B? |
a. |
|
It
offers a way to examine trends from collected data and derive insights
from it. |
Business Intelligence |
|
The
following are software tools used in data mining EXCEPT |
SPSS |
|
If R=
{ (3,3), (3,6), (5,5), (5,10), (6,12) } is a binary relation, then its
domain is |
{3,5,6} |
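The domain collects the first components of the ordered pairs and the range collects the second components. A minimal sketch for the relation in the question, taking the last pair to be (6,12):

```python
R = {(3, 3), (3, 6), (5, 5), (5, 10), (6, 12)}

# Domain: all first components; range: all second components.
domain = {x for x, _ in R}
range_ = {y for _, y in R}

print(sorted(domain))  # [3, 5, 6]
print(sorted(range_))  # [3, 5, 6, 10, 12]
```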
|
It
includes identifying groups of data records |
cluster analysis |
|
It is
a powerful tool that shows the network of data. |
Knime |
|
It
relates the length of an algorithm's input to the number of storage locations it uses. |
space complexity |
|
There
are how many data mining techniques? |
7 |
|
Which
of the matrices is singular? |
A |
|
A
special type of function where the domain is a set of consecutive
integers. |
sequence |
|
It is
used for prototyping in RapidMiner. |
studio |
|
An
example of an abstract computer. |
Turing machine |
|
The
process of inspecting, cleansing, transforming and modelling data with the goal
of discovering useful information. |
data analysis |
|
It
makes complex data more understandable and usable. |
data visualization |
|
The
symbol used to indicate strings with no elements. |
λ |
|
What
type of text is processed in text analytics? |
unstructured |
|
It is
a theoretical classification that estimates and anticipates the increase
in running time of an algorithm as its input size increases. |
run-time analysis |
|
The
following notations are used to estimate complexity for large inputs EXCEPT |
Big beta notation |
|
It
relates the length of an algorithm’s input to the number of steps it takes. |
time complexity |
|
Matrix
B is |
invertible |
|
Given the
sets A={x/x is a distinct letter in the word
"MATHEMATICS"} and B={x/x is a distinct letter in the
word "STATISTICS"}, the two sets are |
not equal (B is a proper subset of A) |
|
What
programming language is used in RapidMiner? |
Java |
|
The
product of a 2x5 and 5x3 matrices is a ______matrix |
2x3 |
|
A
matrix that has the same number of rows and columns is called |
Square |
|
“All
models are wrong but some are useful” |
George E. P. Box |
|
What
is a great example of a data product? |
Google Maps |
|
He
pointed out that until 2003, all of mankind had generated just 5 exabytes of
data. |
Eric Schmidt |
|
A new
phenomenon for the explosion of _________data |
interaction |
|
He
said that “In mathematics the art of proposing a question must be held
of higher value than solving it”. |
Georg Cantor |
|
The
creation of data from varied sources and its quantification into information. |
Datafication |
|
How
many bytes of data are generated every two days in today's world? |
5 exabytes |
|
It
refers to well based theories and sound business judgement. |
Data Science |
|
PAW
means____________. |
Predictive Analytics World |
|
Exabyte
means ________bytes |
billion billion |
|
The
developer of FarmVille, a famous game on the internet. |
Zynga Incorporated |
|
The
person who said that “The future is not google-able”. |
William Gibson |
|
It
shows a high correlation between the incidence of flu and searches about flu
on Google. |
Google Flu Trends |
|
It
expands available data enormously since there is so much more text being
generated than numbers. |
Text mining |
|
The
creation of data from varied sources and its quantification into information. |
datafication |
|
These
are the data skills that a good data scientist needs to cultivate EXCEPT |
speaking |
|
Which
is NOT interaction data? |
database |
|
The
following are the 3V's of big data EXCEPT |
veracity |
|
IOT
means |
Internet of things |
|
The
explosion of _______data is the main reason why every 2 days 5 exabytes of
data are generated. |
interaction |
|
Which
of the following is TRUE when a distribution is normal? |
Mean=Median=Mode |
|
What
is the mean for a standard normal distribution? |
0 |
|
Empirical
rule for a normal distribution that is 3 standard deviations above and below
the mean covers ______% of the data. |
99.7 |
|
Empirical
rule for a normal distribution that is 2 standard deviations above and below
the mean is ________% of data. |
95 |
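The empirical-rule percentages (68, 95, 99.7) are rounded values of the exact normal coverage. The sketch below computes them with `statistics.NormalDist` from the Python standard library (3.8+). `coverage_within` is an illustrative helper name.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def coverage_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    # Prints 68.3, 95.4 and 99.7 percent for k = 1, 2, 3.
    print(k, round(100 * coverage_within(k), 1))
```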
|
A
bell shaped curve that is symmetric about a vertical line. |
normal distribution |
|
A
distribution in which large data sets are displayed. |
Grouped frequency distribution |
|
A
survey of 100 consumers said that the price charged for a kilo of rice could
be approximated by a normal distribution with a mean of 35 and a standard
deviation of 4. How many of them lie between 27 and 43? |
95 |
|
What
percent of data will lie within 2 standard deviations of the mean? |
95 |
|
What
range of values lies within 3 standard deviations below and above the mean in a
normal distribution if the mean is 10 and the standard deviation is 2? |
4-16 |
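Range questions like this one reduce to mean ± k × SD. A small sketch, with `interval_around_mean` as a hypothetical helper name, reproduces this answer and two other ranges that appear in this list:

```python
def interval_around_mean(mean, sd, k):
    """Values lying within k standard deviations below and above the mean."""
    return (mean - k * sd, mean + k * sd)

print(interval_around_mean(10, 2, 3))  # (4, 16)
print(interval_around_mean(80, 3, 3))  # (71, 89)
print(interval_around_mean(35, 4, 2))  # (27, 43)
```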
|
Lists
the percent of data in each distribution. |
relative frequency distribution |
|
The
normal distribution with a mean of 0 and standard deviation of 1. |
Standard |
|
The
area of the standard normal curve to the right of z=0.82 is _______. |
0.206 |
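The area to the right of a z-value is 1 minus the cumulative probability up to that value. As a sketch, the table lookup can be replaced by `statistics.NormalDist` from the standard library:

```python
from statistics import NormalDist

z = 0.82

# Right-tail area = 1 - P(Z <= z) for the standard normal distribution.
right_tail = 1 - NormalDist().cdf(z)

print(round(right_tail, 3))  # 0.206
```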
|
What
is the value of the mean if a score of 110 is 3 standard deviations
above the mean? |
95 |
|
A
score of 50 lies 2 standard deviations above a mean of 30. What is the value
of the standard deviation? |
10 |
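This follows from rearranging the z-score formula z = (score - mean) / SD to SD = (score - mean) / z. A minimal sketch, with an illustrative function name:

```python
def standard_deviation_from_score(score, mean, z):
    """Solve z = (score - mean) / sd for sd."""
    return (score - mean) / z

print(standard_deviation_from_score(score=50, mean=30, z=2))  # 10.0
```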
|
A
graph used to indicate intervals in a frequency distribution is referred to
as a______________. |
Histogram |
|
What
range of values lie between 3 standard deviations above and below the mean if
the mean is 80 and the standard deviation is 3? |
71-89 |
|
A
bell-shaped distribution that is symmetric about a vertical line? |
normal |
|
A
survey of 100 consumers said that the price charged for a kilo of rice could
be approximated by a normal distribution with a mean of 35 and a standard
deviation of 4. How many are less than 39? |
84 |
|
By the empirical
rule for a normal distribution, ______% of the data lie within 1 standard
deviation below and above the mean. |
68 |
|
What
is the value of the standard deviation in a standard normal distribution? |
1 |
|
What
is the value of the mean if a score of 110 is 3 standard deviations above the
mean? |
95 |
|
Which
is NOT a value of r ? |
1.02 |
|
A
vegetable distributor knows that during the month of August, the
weights of tomatoes are normally distributed with a mean of 0.61 lb and
a standard deviation of 0.15 lb. What percent of the tomatoes weigh less than
0.71 lb? |
about 75 |
|
The
equation of the _______line predicts the value of Y given X. |
Regression |
|
What
range of values lie between 3 standard deviations above and below the mean if
the mean is 80 and the standard deviation is 3? |
71-89 |
|
Data
is NOT information unless we add_________. |
analytics |
|
The
creation of a data product contains 3 components EXCEPT |
time |
|
What
increases data volume? |
velocity |
|
A
negative correlation exists when___________. |
x increases y decreases |
|
The
following are elements in an analytic plan EXCEPT |
graphs |
|
As of
2014, there are _______ million tweets a day. |
500 |
|
The
value of X in the regression equation Y= 1.24 X + 6.9 if Y=13.1 is |
5 |
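Both regression questions in this list use the same line, Y = 1.24X + 6.9: one predicts Y from X, the other solves for X given Y. A minimal sketch, with illustrative function names:

```python
SLOPE = 1.24
INTERCEPT = 6.9

def predict_y(x):
    """Predicted Y on the regression line for a given X."""
    return SLOPE * x + INTERCEPT

def solve_for_x(y):
    """Invert the regression line: X = (Y - intercept) / slope."""
    return (y - INTERCEPT) / SLOPE

print(round(predict_y(2), 2))       # 9.38
print(round(solve_for_x(13.1), 2))  # 5.0
```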
|
According
to Hilary Mason which is NOT a skill that a good data scientist must
cultivate. |
critical thinking |
|
If
there are 103 scores the median is equal to the _____ranked score. |
52nd |
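For an odd number of ranked scores n, the median sits at rank (n + 1) / 2. A short sketch, with `median_rank` as an illustrative name:

```python
def median_rank(n):
    """Rank of the median in a ranked list of n scores, for odd n."""
    if n % 2 == 0:
        raise ValueError("for even n the median lies between two ranks")
    return (n + 1) // 2

print(median_rank(103))  # 52
```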
|
A
vegetable distributor knows that during the month of August, the
weights of tomatoes are normally distributed with a mean of 0.61 lb and
a standard deviation of 0.15 lb. How many can be expected to weigh between
0.31 and 0.91 lb in a shipment of 4500 tomatoes? |
4275 |
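The answer uses the empirical rule: 0.31 and 0.91 lb are 2 standard deviations either side of the mean, so about 95% of 4500 tomatoes fall in that range. The sketch below shows that calculation next to the exact normal probability from `statistics.NormalDist`, which gives a slightly larger count because the exact coverage is about 95.45%.

```python
from statistics import NormalDist

weights = NormalDist(mu=0.61, sigma=0.15)
shipment = 4500

# Empirical rule: about 95% of values lie within 2 standard deviations.
rule_of_thumb_count = 0.95 * shipment

# Exact normal probability of a weight between 0.31 lb and 0.91 lb.
exact_fraction = weights.cdf(0.91) - weights.cdf(0.31)
exact_count = exact_fraction * shipment

print(round(rule_of_thumb_count))  # 4275
print(round(exact_count))          # 4295
```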
|
It
partitions a ranked data into four equal groups. |
quartile |
|
A
perfect positive correlation coefficient is equal to |
1 |
|
In
the equation of the regression line represented by Y= 1.24 X + 6.9 if X=2
then Y =? |
9.38 |
|
It
refers to the degree of relationship between two variables? |
Correlation |
|
Who
said that "The future is not google-able"? |
William Gibson |
|
He is
someone who asks interesting questions on formal and informal theory. |
data scientist |
|
The
number that occurs most frequently is called________. |
Mode |
|
If
the standard deviation of a distribution is 3.5, the variance is |
12.25 |
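The variance is the square of the standard deviation, and the standard deviation is the square root of the variance. A brief sketch:

```python
import math

standard_deviation = 3.5

# Variance is the standard deviation squared.
variance = standard_deviation ** 2

print(variance)             # 12.25
print(math.sqrt(variance))  # 3.5
```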
|
The
difference between the highest and lowest value. |
range |
|
It
lists the percent of data in a distribution. |
relative frequency distribution |
|
The
normal distribution with a mean of 0 and standard deviation of 1. |
Standard |
|
Which
is NOT a correct correlation Coefficient? |
1.2 |
|
A
bell-shaped distribution that is symmetric about a vertical line? |
normal |
|
The
middle-most value in a ranked list of numbers. |
median |
|
Which
of the following is TRUE when a distribution is normal? |
Mean=Median=Mode |
|
It
expands available data enormously. |
text mining |
|
A
bell-shaped distribution that is symmetric about a vertical line. |
normal |
|
What
percent of data will lie within 2 standard deviations of the mean? |
95 |
|
Data
involving two variables. |
bivariate |
|
The
quantification of data into information. |
Datafication |
|
He
coined the term "data scientist" |
DJ Patil |
|
The
major outcome of correlation. |
prediction |
|
A
vegetable distributor knows that during the month of August, the
weights of tomatoes are normally distributed with a mean of 0.61 lb and
a standard deviation of 0.15 lb. How many can be expected to weigh more than
0.31 lb in a shipment of 6000 tomatoes? |
5850 |
|
The
method of correlation used for ranked score is ________. |
Spearman rho |
|
A
data set in which every score occurs the same number of times is said to have |
no mode |
|
Example
of a data product. |
Google Maps |
|
A
positive z-score means that the score is |
Higher than the mean |
|
A
graph that is used to indicate frequency distribution. |
histogram |
|
The
area of the standard normal curve to the right of z=0.82 is _______. |
0.206 |
|
Which
of the following is used as a method for Correlation? |
Pearson r |
|
A
score of 50 lies 2 standard deviations above a mean of 30. What is the value
of the standard deviation? |
10 |
|
The
score NOT easily affected by extreme values. |
Median |
|
On an
examination given to 1000 students, Jef’s score of 80 was higher than the
score of 480 students who took the exam. What is the percentile for Jef’s
score? |
48th |
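Here the percentile is taken as the percentage of scores that fall below the given score. A minimal sketch under that definition, with an illustrative function name:

```python
def percentile_rank(scores_below, total_scores):
    """Percentage of all scores that fall below the score of interest."""
    return 100 * scores_below / total_scores

print(percentile_rank(480, 1000))  # 48.0
```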
|
The
classification table that XLSTAT can display. |
confusion matrix |
|
Which
pair belongs to the same family of models called GLM ? |
Logistic and linear regression |
|
Classification
table is also called ________ |
confusion matrix |
|
It
corresponds to the case where the dependent variable has more than 2
categories. |
multinomial logit model |
|
It
displays the performance of a model and enables a comparison to be made with
other models. |
ROC |
|
He
proposed the use of a penalized likelihood function. |
Firth |
|
Which
belong to the GLM family? |
logistic and linear |
|
The
proportion of well-classified positive events. |
sensitivity |
|
What
does GLM mean? |
Generalized Linear model |
|
The
method that does NOT require the assumption that the parameters are normally
distributed. |
profile likelihood |
|
To
estimate the parameters of the model, the ________function is maximized. |
likelihood |
|
The
most common functions used to link probability to the explanatory variables
are the LOGIT model and ________model. |
PROBIT |
|
The
method used to iteratively find a solution to a multinomial logit model. |
Newton-Raphson algorithm |
|
SBC
means_________ |
Schwarz's Bayesian Criterion |
|
Displays
the performance of a model and enables a comparison to be made with other
models. |
ROC curve |
|
What
does ROC mean? |
Receiver Operating Characteristics |
|
A
frequently used method as it enables binary variables, sums of binary variables, or
polytomous variables to be modelled. |
logistic regression |
|
It
does NOT require the assumption that the parameters are normally distributed. |
profile likelihood |
|
The
proportion of a well-classified negative event. |
specificity |
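Sensitivity and specificity can both be read off a confusion matrix: sensitivity is the share of actual positives classified as positive, and specificity is the share of actual negatives classified as negative. The counts below are hypothetical, chosen only to illustrate the arithmetic.

```python
def sensitivity(true_pos, false_neg):
    """Proportion of well-classified positive events."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Proportion of well-classified negative events."""
    return true_neg / (true_neg + false_pos)

# Hypothetical confusion-matrix counts (not taken from the quiz).
tp, fn, tn, fp = 40, 10, 45, 5

print(sensitivity(tp, fn))  # 0.8
print(specificity(tn, fp))  # 0.9
```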
|
ROC
comes from ______theory. |
signal detection |
|
Which
is NOT a KR technology? |
roles |
|
It is
used to enable an entity to determine consequences by thinking rather than
acting. |
Knowledge Representation |
|
The
following are distinct roles that KR plays EXCEPT |
Medium for pragmatically diligent interpretation |
|
Any
way to get new expressions from old ones. |
inference |
|
A
network purporting to describe family memberships. |
network topology |
|
KR
means __________________________. |
Knowledge Representation |
|
It is
a variety of formal calculation, typically deduction. |
Intelligent Reasoning |
|
The
following are abstract notions EXCEPT |
casualty |
|
It
views the world in terms of attribute-object-value triples. |
rule-based |
|
The
following provided inspirations of what constitutes intelligent reasoning
EXCEPT |
Sociology |
|
It
involves a commitment in viewing the world in terms of individual entities
and relations. |
logic |
|
KR is
a set of __________commitments. |
ontological |
|
It
sees a set of prototypes, in particular prototypical diseases, to be matched
against the case at hand. |
INTERNIST |
|
Which
is NOT a basic representation technology? |
graph |
|
All representations are
________. |
imperfect |
|
Which
is NOT a component of KR? |
it adheres to the function |
|
KR as
a _________is a substitute for the thing itself |
surrogate |
|
It
views the world in thinking of prototypical objects. |
frame |
|
It is
a language in which we say things about the world. |
Medium of human expression |
|
It is
a process that goes on internally, while most things it wishes to reason about exist
only externally. |
reasoning |
|
Which
function provides the value of a function at any particular value of x but
does NOT directly give the probability of the random variable? |
Probability density |
|
It
involves a commitment in viewing the world in terms of individual entities
and relations between them. |
logic |
|
ROC
means |
Receiver Operating Characteristics |
|
Which
of the following is a discrete distribution? |
Hypergeometric |
|
Any
way to get new expressions from old ones |
inference |
|
The
_______value is the weighted average of the value the random variable may
assume |
Expected |
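The expected value weights each possible value by its probability and sums the results. The sketch below uses a fair six-sided die as a hypothetical example (it is not part of the quiz) and `fractions.Fraction` to keep the arithmetic exact.

```python
from fractions import Fraction

# Hypothetical random variable: the face shown by a fair six-sided die.
distribution = {face: Fraction(1, 6) for face in range(1, 7)}

# Expected value: each value weighted by its probability, then summed.
expected_value = sum(value * prob for value, prob in distribution.items())

print(expected_value)         # 7/2
print(float(expected_value))  # 3.5
```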
|
Which
is an example of a discrete random variable? |
number of books |
|
The
following are discrete distributions EXCEPT |
chi-square |
|
The
classification table that XLSTAT can display |
confusion matrix |
|
The most common
functions used to link probability to the explanatory variables are the LOGIT
model and ________model. |
PROBIT |
|
Which
of the following is a continuous distribution? |
Chi-square |
|
It is
a numerical function of the outcome of a statistical experiment. |
random variable |
|
The
most commonly used continuous probability distribution |
normal |
|
It is
a language in which we say things about the world |
Medium of human expression |
|
Which
is NOT a basic representation technology? |
graph |
|
Two
of the most widely used discrete probability distributions |
Poisson and binomial |
|
It
does NOT require the assumption that the parameters are normally distributed |
profile likelihood |
|
The
integral of all the values of a random variable in a probability density
function is equal to______. |
One |
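This property can be checked numerically. The sketch below integrates the standard normal density with a simple trapezoidal rule. The integration limits and step count are arbitrary choices made for illustration.

```python
from statistics import NormalDist

density = NormalDist(mu=0, sigma=1).pdf

# Trapezoidal rule over [-8, 8]; the tails beyond that are negligible.
lower, upper, steps = -8.0, 8.0, 16_000
width = (upper - lower) / steps
area = sum(
    0.5 * (density(lower + i * width) + density(lower + (i + 1) * width)) * width
    for i in range(steps)
)

print(round(area, 6))  # 1.0
```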
|
The
following provided inspirations of what constitutes intelligent reasoning
EXCEPT |
philosophy |
|
Which
is NOT a KR technology? |
roles |
|
The
following are distinct roles that KR plays EXCEPT |
Medium for pragmatically diligent interpretation |
|
What
is KR? |
Knowledge Representation |
|
The
most common function used to link probability to explanatory variables |
logit model |
|
It
does NOT require the assumption that the parameters are normally distributed |
profile likelihood |
|
A
model that corresponds to the case where the dependent variable has more
than two categories. |
multinomial logit model |
|
The
following are distinct roles that KR plays EXCEPT |
Medium for pragmatically diligent interpretation |
|
Which function provides
the value of a function at any particular value of x but does NOT directly give
the probability of the random variable? |
Probability density |
|
The
following are continuous distributions EXCEPT |
geometric |
|
To
estimate the parameters of the model, the ________function is maximized |
Likelihood |
|
It provides the height
or the value of the function at any particular value of x |
probability density function |
|
It
sees the medical world as made of empirical associations connecting symptoms
to diseases. |
MYCIN |
|
Any
way to get new expressions from old ones |
inference |
|
It
refers to a frequently used method as it enables binary or polytomous
variables to be modelled. |
logistic regression |
|
KR as
a _________is a substitute for the thing itself |
surrogate |
|
It is
used to enable an entity to determine consequences by thinking rather than
acting. |
KR |
|
A
network purporting to describe family memberships |
network topology |
|
The
most widely used continuous probability distribution |
Normal |
|
What
are the values of the mean and standard deviation in the normal probability
density function? |
mean = 50, s = 5 |
|
SBC
means_________ |
Schwarz's Bayesian Criterion |
|
The
most widely used continuous probability distribution |
Normal |
|
It
views the world in terms of attribute-object-value triples |
rule-based |
|
Which
of the following is a discrete distribution? |
Hypergeometric |
|
It is
a variety of formal calculation, typically deduction |
Intelligent Reasoning |
|
Which
of the following is a continuous distribution? |
Chi-square |
|
It
corresponds to the case where the dependent variable has more than 2
categories. |
multinomial logit model |
|
Which
is NOT a component of KR? |
it adheres to the function |
|
It
is often used as a model of the number of arrivals at a
facility in a given period of time. |
Poisson probability distribution |
Wednesday, February 24, 2021
Data Analysis
This course reviews and expands upon core topics in
probability and statistics through the study and practice of data analysis.