Which of the following data cleansing issues will be fixed when a DISTINCT function is applied?
A. Missing data
B. Duplicate data
C. Redundant data
D. Invalid data
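As a minimal sketch of what DISTINCT does and does not change, the snippet below uses Python's built-in sqlite3 module against a hypothetical in-memory table (the table name, column names, and rows are invented for illustration). Repeated rows collapse to one, while the row with a missing value is returned unchanged.

```python
import sqlite3

# Hypothetical in-memory table; names and rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Ana", "Austin"), ("Ana", "Austin"), ("Ben", None), ("Cy", "Boston")],
)

# Plain SELECT returns every stored row, including the repeated one.
all_rows = conn.execute("SELECT name, city FROM customers").fetchall()

# SELECT DISTINCT collapses rows that are identical across the selected columns.
distinct_rows = conn.execute(
    "SELECT DISTINCT name, city FROM customers"
).fetchall()
conn.close()

print(len(all_rows), len(distinct_rows))
```

Note that the row with the NULL city is still present after DISTINCT; the keyword only removes exact repeats.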
An analyst wants to determine whether a relationship between an individual's age and voting preferences exists. Which of the following is the best statistical method for the analyst to use?
A. P-value
B. Chi-squared
C. F-test
D. Z-score
Which one of the following is not considered an aggregate function?
A. SUM
B. MIN
C. SELECT
D. MAX
Given the following:

Which of the following is the most important thing for an analyst to do when transforming
the table for a trend analysis?
A. Fill in the missing cost where it is null.
B. Separate the table into two tables and create a primary key.
C. Replace the extended cost field with a calculated field.
D. Correct the dates so they have the same format.
An analyst in a consumer bank department wants to showcase the concentration of accounts opened in the United States by ZIP Code to describe the effectiveness of the bank's marketing campaigns. Which of the following would be the best way to visualize the data?
A. A stacked chart
B. A tree map
C. A waterfall chart
D. A geographic map
Which of the following is the first step an analyst should perform upon receiving a business request for analysis?
A. Determine the data needs and sources for analysis.
B. Initiate the analysis for exploratory data analysis.
C. Review the business questions to understand the scope.
D. Finalize the methodology to solve the problem.
An analyst for a small business with multiple locations is using each location’s quarterly sales reports from last year to create a single revenue report for the year. Which of the following data mining techniques should the analyst use to complete this task?
A. Data merge
B. Data append
C. Data blending
D. Data imputation
Which of the following would be the best way to identify multicollinear attributes in a data set?
A. Correlation coefficient
B. Chi-squared test
C. Two-sample F-test
D. Two-way ANOVA
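One common way to screen for multicollinear attributes is to compute pairwise Pearson correlation coefficients and flag pairs whose absolute value is close to 1. The sketch below does this in plain Python; the column names, the sample values, and the 0.9 cutoff are all assumptions chosen for illustration, not a prescribed standard.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical attributes: two encode the same quantity in different units.
columns = {
    "height_cm": [150, 160, 170, 180, 190],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "score": [3, 9, 4, 8, 5],
}

THRESHOLD = 0.9  # assumed cutoff; adjust to the analysis at hand

names = list(columns)
flagged = [
    (a, b)
    for i, a in enumerate(names)
    for b in names[i + 1:]
    if abs(pearson(columns[a], columns[b])) > THRESHOLD
]
print(flagged)
```

Here only the two height columns are flagged, since they carry nearly the same information.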
Angela is aggregating data from a CRM system with data from an employee system.
While performing an initial quality check, she realizes that her employee ID is not
associated with her identifier in the CRM system.
What kind of issue is Angela facing?
Choose the best answer.
A. ETL process.
B. Record linkage.
C. ELT process.
D. System integration.
The scores of ten students in a test are 17, 23, 30, 36, 45, 51, 58, 66, 72, 77.
Which of the following values is the range (a measure of dispersion) of these
scores?
A. 90
B. 60
C. 70
D. 80
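The range is the difference between the largest and smallest observations. As a quick check of the arithmetic, the sketch below applies that definition to the ten scores listed in the question (the variable names are illustrative).

```python
# Scores as given in the question.
scores = [17, 23, 30, 36, 45, 51, 58, 66, 72, 77]

# Range = maximum value minus minimum value.
score_range = max(scores) - min(scores)
print(score_range)  # 77 - 17 = 60
```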