Question

    A dataset contains "USA," "United States," and "U.S." as

    country entries. What is the best way to resolve these inconsistencies?
    A Standardize entries using a predefined mapping Correct Answer Incorrect Answer
    B Remove inconsistent entries from the dataset Correct Answer Incorrect Answer
    C Use the entry that appears most frequently Correct Answer Incorrect Answer
    D Replace all entries with a null value Correct Answer Incorrect Answer
    E Leave the data as is to preserve authenticity Correct Answer Incorrect Answer

    Solution

    Standardizing data ensures uniformity and consistency across the dataset. A predefined mapping such as replacing "USA," "United States," and "U.S." with a single standardized value like "United States" avoids ambiguities during analysis. This method maintains data integrity and improves the dataset's usability for analytical tasks. Why Other Options Are Wrong : B) Removing inconsistent entries leads to data loss and reduced sample size. C) Choosing the most frequent entry ignores valid data variations and may bias results. D) Replacing entries with null values reduces the dataset's quality. E) Leaving the inconsistencies untreated causes challenges during data interpretation.

    Practice Next