In classification problems, the cost of a false positive is almost never the same as the cost of a false negative. If you are solving a business problem where Type 1 and Type 2 errors have different impacts, you can tune the classifier's probability threshold against a custom loss function by defining the cost of true positives, true negatives, false positives, and false negatives separately.

In PyCaret, this is done with optimize_threshold. It takes a trained model object (a classifier) and the loss function, expressed as the values of true positives, true negatives, false positives, and false negatives. The function returns an interactive plot showing the loss function (y-axis) as a function of the probability threshold (x-axis). A vertical line marks the best probability threshold for that specific classifier.

The threshold found by optimize_threshold can then be used in the predict_model function to generate labels with the custom probability threshold. Without a custom value, classifiers predict the positive class at a default threshold of 50%.
This function is only available in the pycaret.classification module.
# Importing dataset
from pycaret.datasets import get_data
credit = get_data('credit')

# Importing module and initializing setup
from pycaret.classification import *
clf1 = setup(data = credit, target = 'default')

# create a model
xgboost = create_model('xgboost')

# optimize threshold for trained model
optimize_threshold(xgboost, true_negative = 1500, false_negative = -5000)
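To make the idea behind the plot concrete, here is a minimal pure-Python sketch of a threshold sweep. It is not PyCaret's implementation: the helper names `total_value` and `best_threshold`, the toy labels and probabilities, and the 0.01 grid step are all assumptions made for illustration. It also assumes, as the signs in the example above suggest, that positive values are gains and negative values are costs, so the best threshold is the one with the highest total.

```python
# Illustrative sketch only; these helpers are hypothetical, not part of PyCaret.

def total_value(y_true, y_prob, threshold, tp=0, tn=0, fp=0, fn=0):
    """Sum the value of every prediction when labeling at `threshold`."""
    total = 0
    for actual, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and actual == 1:
            total += tp      # true positive
        elif predicted == 0 and actual == 0:
            total += tn      # true negative
        elif predicted == 1 and actual == 0:
            total += fp      # false positive
        else:
            total += fn      # false negative
    return total


def best_threshold(y_true, y_prob, grid=None, **values):
    """Return the (threshold, total value) pair with the highest total."""
    if grid is None:
        grid = [i / 100 for i in range(101)]  # assumed grid: 0.00 to 1.00
    scored = [(t, total_value(y_true, y_prob, t, **values)) for t in grid]
    return max(scored, key=lambda pair: pair[1])


# Toy data (made up): actual labels and predicted positive-class probabilities.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_prob = [0.10, 0.35, 0.40, 0.55, 0.60, 0.80, 0.20, 0.30]

# Same values as the example above: reward true negatives, penalize false negatives.
threshold, value = best_threshold(y_true, y_prob, tn=1500, fn=-5000)
print("best threshold:", threshold, "total value:", value)
print("value at 0.5:", total_value(y_true, y_prob, 0.5, tn=1500, fn=-5000))
```

On this toy data the default threshold of 0.5 produces a negative total, while a lower threshold avoids the expensive false negatives. This is the kind of trade-off the optimize_threshold plot is meant to reveal.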