Which data cleaning technique is most appropriate for handling missing data when missing values are randomly distributed across a dataset?
When missing values are randomly distributed, imputing them with the mean (for roughly symmetric continuous data) or the median (for skewed distributions) can be an effective technique. This approach keeps every row and column, so the dataset’s overall structure is maintained, and it limits the bias that missing values could otherwise introduce. Substituting a measure of central tendency preserves each variable’s center, although it reduces variance somewhat, so it works best when the share of missing values is small. Option A is incorrect because removing rows may cause significant data loss, especially if many rows contain missing values. Option C is incorrect because dropping columns with missing values reduces the number of features and may discard useful information. Option D is incorrect because placeholder values can introduce bias or mislead the analysis, especially if the placeholder skews the distribution. Option E is incorrect because ignoring missing values leaves gaps that make accurate analysis difficult.
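The imputation described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the helper name `impute`, the use of `None` to mark a missing value, and the sample `ages` and `incomes` columns are all assumptions made for the example.

```python
from statistics import mean, median


def impute(values, strategy="mean"):
    """Replace None entries with the mean or median of the observed values.

    `impute` is a hypothetical helper; None marks a missing value.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    if strategy == "mean":
        fill = mean(observed)
    elif strategy == "median":
        fill = median(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    # Keep observed values as they are; substitute the fill value for gaps.
    return [fill if v is None else v for v in values]


# Illustrative data only: a roughly symmetric column and a skewed one.
ages = [20.0, None, 30.0, 40.0, None]
incomes = [30.0, 32.0, None, 35.0, 500.0]

ages_filled = impute(ages, "mean")          # symmetric data: use the mean
incomes_filled = impute(incomes, "median")  # outlier present: use the median
```

The median is used for `incomes` because the single large value would pull a mean-based fill far above what most rows look like.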
The _________ property of the element is a whole number.
Which of the following is the most reactive element in the periodic table?
Which substance is commonly used as a thermometric material in thermometers due to its expansive properties under temperature changes?
What is the primary purpose of using bleaching powder in drinking water?
Litmus paper, used to test pH levels, is derived from which organism?
What is the primary use of calcium carbonate in antacid tablets?
The formula of ‘Quick Lime’ is __________
What are antibiotics?
What term describes the enthalpy change when a substance transitions from solid to liquid at its melting point?
Which gas is most abundant in the Earth's atmosphere?