Calibrate Model


When performing classification experiments you often want to predict not only the class labels but also the probability of each prediction. This probability gives you a measure of confidence in the prediction, yet some models produce poor estimates of the class probabilities. Well-calibrated classifiers are probabilistic classifiers whose probability output can be interpreted directly as a confidence level. Calibrating classification models in PyCaret is as simple as writing calibrate_model. The function takes a trained model object and the method of calibration through the method parameter. The method can be 'sigmoid', which corresponds to Platt's method, or 'isotonic', which is a non-parametric approach. It is not advised to use isotonic calibration with too few calibration samples (<< 1000) since it tends to overfit. This function returns a table with k-fold cross-validated scores of classification evaluation metrics (Accuracy, AUC, Recall, Precision, F1 and Kappa) along with the trained model object.
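As a minimal sketch of the method parameter (assuming a trained model object such as dt has already been created with create_model, as in the Example below):

# Platt scaling (default)
calibrated_dt_sigmoid = calibrate_model(dt, method = 'sigmoid')

# isotonic regression, a non-parametric approach; avoid with few calibration samples
calibrated_dt_isotonic = calibrate_model(dt, method = 'isotonic')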

The number of folds can be defined using the fold parameter within calibrate_model. By default, fold is set to 10. All metrics are rounded to 4 decimals by default but this can be changed using the round parameter within calibrate_model.
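For example, a short sketch assuming the same trained dt object, using 5 folds and rounding metrics to 2 decimals:

# 5-fold cross validation with metrics rounded to 2 decimals
calibrated_dt = calibrate_model(dt, fold = 5, round = 2)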

This function is only available in the pycaret.classification module.

Example

 

Code
# Importing dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# Importing module and initializing setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# create a model
dt = create_model('dt')

# calibrate a model
calibrated_dt = calibrate_model(dt)

 

Output

CalibratedClassifierCV(base_estimator=DecisionTreeClassifier(ccp_alpha=0.0,
                                                             class_weight=None,
                                                             criterion='gini',
                                                             max_depth=None,
                                                             max_features=None,
                                                             max_leaf_nodes=None,
                                                             min_impurity_decrease=0.0,
                                                             min_impurity_split=None,
                                                             min_samples_leaf=1,
                                                             min_samples_split=2,
                                                             min_weight_fraction_leaf=0.0,
                                                             presort='deprecated',
                                                             random_state=123,
                                                             splitter='best'),
                       cv=10, method='sigmoid')
