Which of the following data cleansing issues will be fixed when a DISTINCT function is applied?

A. Missing data

B. Duplicate data

C. Redundant data

D. Invalid data

B.   Duplicate data
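As a minimal sketch of why DISTINCT addresses duplicates but not missing values, the example below uses Python's built-in sqlite3 module with a hypothetical `customers` table (the table name, columns, and rows are assumptions made up for illustration):

```python
import sqlite3

# Hypothetical in-memory table: one exact duplicate row and one missing city.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Ana", "Lima"), ("Ana", "Lima"), ("Ben", None), ("Cy", "Oslo")],
)

all_rows = conn.execute("SELECT name, city FROM customers").fetchall()

# DISTINCT collapses rows that are identical across every selected column.
distinct_rows = conn.execute(
    "SELECT DISTINCT name, city FROM customers ORDER BY name"
).fetchall()

print(len(all_rows), len(distinct_rows))
print(distinct_rows)
```

The duplicate "Ana" row is collapsed, while the row with the missing city is returned unchanged, so missing data would still need separate handling.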

An analyst wants to determine whether a relationship between an individual's age and voting preferences exists. Which of the following is the best statistical method for the analyst to use?

A. P-value

B. Chi-squared

C. F-test

D. Z-score

B.   Chi-squared
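A chi-squared test of independence works on counts of categorical values, so age would first be binned into groups. The sketch below computes the statistic by hand for a hypothetical 2x2 table; the age groups, candidates, and counts are invented for illustration and are not from the question:

```python
# Hypothetical counts: rows are age groups, columns are preferred candidates.
observed = [
    [10, 20],  # ages 18-39: candidate X, candidate Y
    [20, 10],  # ages 40+:   candidate X, candidate Y
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Sum (observed - expected)^2 / expected over every cell, where the expected
# count assumes age group and preference are independent.
chi_squared = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_squared += (count - expected) ** 2 / expected

degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)
print(round(chi_squared, 3), degrees_of_freedom)
```

With one degree of freedom, the 0.05 critical value is about 3.841, so a statistic above that would suggest that age group and preference are related in this made-up sample.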

Which one of the following is not considered an aggregate function?

A. SUM

B. MIN

C. SELECT

D. MAX

C.   SELECT
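To illustrate the distinction, the sketch below runs the three aggregate functions from the options inside a single SELECT statement, using Python's sqlite3 module and a hypothetical `sales` table (table name and values are assumptions):

```python
import sqlite3

# Hypothetical table with three sale amounts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?)", [(5,), (12,), (8,)])

# SELECT is the statement that retrieves rows; SUM, MIN, and MAX are the
# aggregate functions that each reduce the column to a single value.
total, smallest, largest = conn.execute(
    "SELECT SUM(amount), MIN(amount), MAX(amount) FROM sales"
).fetchone()

print(total, smallest, largest)
```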

Given the following:

Which of the following is the most important thing for an analyst to do when transforming the table for a trend analysis?

A. Fill in the missing cost where it is null.

B. Separate the table into two tables and create a primary key.

C. Replace the extended cost field with a calculated field.

D. Correct the dates so they have the same format.

D.   Correct the dates so they have the same format.
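The table for this question is not reproduced here, so the following is only a sketch of what putting dates into the same format might look like. The three input formats and the sample values are assumptions, not taken from the original table:

```python
from datetime import datetime

# Assumed input formats; a real table may need a different list.
KNOWN_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y")

def normalize_date(text):
    """Return the date as an ISO 8601 string (YYYY-MM-DD)."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # try the next candidate format
    raise ValueError(f"unrecognized date format: {text!r}")

raw_dates = ["2023-01-15", "02/20/2023", "05.03.2023"]
clean_dates = [normalize_date(d) for d in raw_dates]
print(clean_dates)
```

Once every date shares one sortable format, the rows can be ordered chronologically, which a trend analysis depends on.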

An analyst in a consumer bank department wants to showcase the concentration of accounts opened in the United States by ZIP Code to describe the effectiveness of the bank's marketing campaigns. Which of the following would be the best way to visualize the data?

A. A stacked chart

B. A tree map

C. A waterfall chart

D. A geographic map

D.   A geographic map

Which of the following is the first step an analyst should perform upon receiving a business request for analysis?

A. Determine the data needs and sources for analysis.

B. Initiate the analysis for exploratory data analysis.

C. Review the business questions to understand the scope.

D. Finalize the methodology to solve the problem.

C.   Review the business questions to understand the scope.

An analyst for a small business with multiple locations is using each location’s quarterly sales reports from last year to create a single revenue report for the year. Which of the following data mining techniques should the analyst use to complete this task?

A. Data merge

B. Data append

C. Data blending

D. Data imputation

A.   Data merge

Which of the following would be the best way to identify multicollinear attributes in a data set?

A. Correlation coefficient

B. Chi-squared test

C. Two-sample F-test

D. Two-way ANOVA

A.   Correlation coefficient
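One way to apply this is to compute the Pearson correlation coefficient for every pair of attributes and flag pairs whose absolute value is close to 1. The sketch below does so in plain Python; the attribute names, values, and the 0.9 threshold are hypothetical choices for illustration:

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical data set: height is recorded twice in different units.
data = {
    "height_cm": [150, 160, 170, 180, 190],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "shoe_size": [9, 6, 10, 7, 8],
}

THRESHOLD = 0.9  # assumed cutoff; the right value depends on the analysis

flagged = [
    (a, b)
    for a, b in combinations(data, 2)
    if abs(pearson(data[a], data[b])) > THRESHOLD
]
print(flagged)
```

In this made-up data only the two height columns are flagged, which suggests one of them is redundant as a predictor.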

Angela is aggregating data from a CRM system with data from an employee system.

While performing an initial quality check, she realizes that her employee ID is not associated with her identifier in the CRM system.

What kind of issue is Angela facing?

Choose the best answer.

A. ETL process.

B. Record linkage.

C. ELT process.

D. System integration.

B.   Record linkage.

Which of the following values is the range, a measure of dispersion, of the scores of ten students on a test?

The scores of ten students in a test are 17, 23, 30, 36, 45, 51, 58, 66, 72, 77.

A. 90

B. 60

C. 70

D. 80

B.   60
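The range is the largest value minus the smallest value. A short check using the scores given in the question:

```python
# Scores from the question.
scores = [17, 23, 30, 36, 45, 51, 58, 66, 72, 77]

# Range = maximum - minimum.
score_range = max(scores) - min(scores)
print(score_range)  # 77 - 17 = 60
```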
