Question

    Which data transformation technique would be best for converting categorical variables, such as “Gender” (Male, Female), into a format usable in machine learning models?
    A. Data normalization
    B. One-hot encoding
    C. Logarithmic transformation
    D. Data binning
    E. Polynomial transformation

    Solution

    The correct answer is B. One-hot encoding converts a categorical variable into numerical form by representing each category as its own binary column. For the “Gender” variable, it would create two binary columns, “Male” and “Female.” Each observation has a value of 1 in the column matching its category and 0 in the other, which makes the data usable by machine learning algorithms that require numerical input. Because no category is assigned a larger or smaller number than another, one-hot encoding avoids implying an ordering that the categories do not have.

    The other options are incorrect because:
    • Option A (normalization) rescales numerical features to a common range; it does not convert categories into numbers.
    • Option C (logarithmic transformation) is applied to continuous data to reduce skew, not to categorical data.
    • Option D (binning) groups continuous data into categories rather than encoding existing categories.
    • Option E (polynomial transformation) applies to numerical features and is unrelated to categorical conversion.
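    The encoding described in the solution can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: the function name `one_hot_encode` and the sample values are hypothetical, and the column order (alphabetical) is an arbitrary choice made here.

```python
def one_hot_encode(values):
    """Return (categories, rows); each row holds a single 1 marking its category."""
    # Alphabetical ordering of the categories is an assumption of this sketch.
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}

    rows = []
    for value in values:
        row = [0] * len(categories)   # start with all zeros
        row[index[value]] = 1         # set the column for this value's category
        rows.append(row)
    return categories, rows


# Hypothetical sample data for the "Gender" variable.
genders = ["Male", "Female", "Female", "Male"]
categories, rows = one_hot_encode(genders)

print(categories)  # ['Female', 'Male']
print(rows)        # [[0, 1], [1, 0], [1, 0], [0, 1]]
```

    In practice, libraries provide this directly, for example `pandas.get_dummies` or scikit-learn's `OneHotEncoder`.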
