Question

    In Python, which method in the Pandas library would you use to replace NaN values in a DataFrame with the median value of each column?

    A df.fillna(df.mean())
    B df.replace(df.median())
    C df.fillna(df.median())
    D df.dropna(inplace=True)
    E df.interpolate(method="median")

    Solution

    The correct answer is C. The fillna() method replaces NaN values in a DataFrame. Passing df.median() as its argument fills each column's missing values with that column's median, because fillna() matches the entries of the resulting Series to the column labels. Median imputation is more robust than mean imputation when a column contains outliers, since extreme values pull the mean but barely move the median. It also handles missing data without discarding rows, so the valid values those rows hold in other columns are kept.

    Option A (df.fillna(df.mean())) is incorrect because it fills NaNs with the mean rather than the median. Option B (df.replace(df.median())) is incorrect because the first argument of replace() specifies the values to be replaced, so this call does not target NaN values. Option D (df.dropna(inplace=True)) is incorrect because it removes rows containing NaNs instead of filling them. Option E (df.interpolate(method="median")) is incorrect because interpolate() has no "median" method; it estimates missing values from neighboring data points.
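
    As a minimal sketch of the median fill described above, the example below builds a small DataFrame and fills its gaps column by column. The column names and values are made up for illustration, and numeric_only=True is an added assumption so the call also works if a DataFrame contains non-numeric columns.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one NaN per column, plus an outlier in "income".
df = pd.DataFrame({
    "age": [25.0, np.nan, 31.0, 40.0],
    "income": [50000.0, 62000.0, np.nan, 1000000.0],
})

# Per-column medians; NaN entries are skipped by default.
medians = df.median(numeric_only=True)

# fillna() matches the Series index to the column labels,
# so each column's NaNs receive that column's own median.
filled = df.fillna(medians)

print(filled)
```

    Note that fillna() returns a new DataFrame here and leaves df unchanged, so the result has to be assigned to a name to be kept.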
