Question

    A data analyst at a bank is tasked with developing a credit scoring model to assess loan applicants' eligibility. Which of the following statistical methods would be most suitable for predicting the likelihood of default?
    A Linear Regression
    B Logistic Regression
    C Time Series Analysis
    D K-means Clustering
    E Decision Tree Regression

    Solution

    The correct answer is B, Logistic Regression.

    Logistic regression is particularly suited to predicting binary outcomes, such as whether a borrower will default (yes/no). In credit scoring, the objective is to assess an applicant’s risk level, which matches logistic regression's ability to estimate the probability of an outcome as a value between 0 and 1. The model combines financial and demographic indicators into a weighted sum, with each weight reflecting that variable’s impact on default risk, and passes the sum through the logistic function to obtain a default probability. Applying a cutoff to that probability separates high-risk from low-risk borrowers. Logistic regression is not inherently robust against outliers, however: extreme predictor values can still distort the coefficients, so volatile financial data should be checked before fitting.

    The other options are incorrect because:
    • Linear Regression assumes a continuous outcome variable and can produce predictions outside the 0 to 1 range, so it is poorly suited to binary prediction.
    • Time Series Analysis is used for forecasting values over time, not for classifying individual applicants by risk.
    • K-means Clustering is an unsupervised method that groups data into clusters; it does not directly predict a probability of default.
    • Decision Tree Regression is typically used for continuous outcomes and is not designed to model the probability of a binary event the way logistic regression is.
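
    To make the idea concrete, here is a minimal sketch in plain Python of how a logistic regression could turn applicant features into a default probability. Everything in it is an illustrative assumption rather than part of the question: the two features (debt-to-income ratio and number of late payments), the eight made-up applicants, the function names, and the choice of batch gradient descent for fitting. A real credit model would use far more data and an established statistics library.

```python
import math

def sigmoid(z):
    """Map any real number to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, lr=0.1, epochs=5000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n = len(rows)
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log-loss w.r.t. the linear score
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def predict_default_probability(x, weights, bias):
    """Estimated probability that an applicant with features x defaults."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

# Hypothetical data: [debt_to_income_ratio, late_payments]; label 1 = defaulted.
applicants = [
    [0.10, 0], [0.15, 0], [0.20, 1], [0.25, 0],
    [0.45, 2], [0.50, 3], [0.55, 2], [0.60, 4],
]
defaulted = [0, 0, 0, 0, 1, 1, 1, 1]

weights, bias = fit_logistic(applicants, defaulted)
low_risk = predict_default_probability([0.12, 0], weights, bias)
high_risk = predict_default_probability([0.58, 3], weights, bias)
print(f"low-risk applicant:  {low_risk:.2f}")
print(f"high-risk applicant: {high_risk:.2f}")
```

    The output is always a probability between 0 and 1, which is what distinguishes this approach from linear regression; a lender would then choose a cutoff (0.5 is only one possible choice) to decide which applicants count as high risk.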
