A data analyst is working for a shipping company and calculating the volume of boxes
according to the following formula:
volume = height × width × depth.
Which of the following variable types describes volume?
A. Derived
B. Normalized
C. Concatenated
D. Aggregated
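The formula in this question can be sketched in Python as follows. This is a minimal illustration only: the `Box` class and its field names are hypothetical and do not come from the question, which names just the three measurements and the formula.

```python
from dataclasses import dataclass


@dataclass
class Box:
    # Hypothetical record holding the three stored measurements of one box.
    height: float
    width: float
    depth: float

    @property
    def volume(self) -> float:
        # volume is not stored; it is computed from the existing fields:
        # volume = height x width x depth
        return self.height * self.width * self.depth


box = Box(height=2.0, width=3.0, depth=4.0)
print(box.volume)  # 24.0
```

Here volume is never entered directly. It is recalculated from height, width, and depth each time it is read.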
What R package makes it easy to work with dates?
A. Lubridate.
B. Datemath.
C. Stringr.
D. ggplot.
Given the diagram below:

Which of the following data schemas is shown?
A. Key-value pairs
B. Online transactional processing
C. Data Lake
D. Relational database
A gambler thinks that a coin is fair and is equally likely to turn up heads or tails when the coin is flipped. Which of the following tests should the gambler use to test this hypothesis?
A. t-test
B. Chi-squared test
C. Rank sum test
D. Ratio test
Given the diagram below:

Which of the following steps is missing?
A. Remove redundant data.
B. Validate the data types.
C. Connect to the data API.
D. Normalize the data.
A column is being used to store strings of variable lengths. Performance is a concern, so the column needs to use as little space as possible. Which of the following data types best meets these requirements?
A. char
B. nchar
C. varchar
D. nvarchar
An analyst computed a new variable of income per day in the household by multiplying the number of days worked by the number of people working in the household and the income earned per day. Which of the following is the correct name for this new variable?
A. Derived
B. Categorical
C. Continuous
D. Control
A database consists of one fact table that is composed of multiple dimensions. Each dimension is represented by a denormalized table. This structure is an example of a:
A. Non-relational schema
B. Galaxy schema
C. Snowflake schema
D. Star schema
An analyst is explaining the company’s financial systems and reporting tools to a new coworker. Which of the following data quality dimensions are the most important? (Select three).
A. Data formatting
B. Data accuracy
C. Data maturity
D. Data field
E. Data completeness
F. Data consistency
G. Data diversity
H. Data deletion
Given the following data table:

Which of the following are appropriate reasons to undertake data cleansing? (Select two).
A. Non-parametric data
B. Missing data
C. Duplicate data
D. Invalid data
E. Redundant data
F. Normalized data