Finalize Model

Finalizing a model is the last step in a typical supervised experiment workflow. When an experiment is started in PyCaret using setup, a hold-out set is created that is not used in model training. By default, if no train_size parameter is passed to setup, the hold-out set contains a 30% sample of the dataset. All functions in PyCaret use the remaining 70% as the training set to create, tune, or ensemble models. The hold-out set therefore serves as the final check on a model and is used to diagnose overfitting or underfitting.

Once predictions have been generated on the hold-out set using predict_model and you have chosen a specific model to deploy, you will usually want to train it one final time on the entire dataset, including the hold-out set. This is done with finalize_model, which takes a trained model object and returns a model that has been trained on the entire dataset.

This function is only available in the pycaret.classification and pycaret.regression modules.
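The idea behind finalizing can be illustrated without PyCaret. The sketch below is a conceptual illustration only and is not how PyCaret implements finalize_model: the names split_rows, NearestCentroid1D, and accuracy are hypothetical helpers, and the data is synthetic. It fits a toy model on a 70% training split, scores it on the 30% hold-out set, and then refits a fresh model on all rows, which is the step that finalizing corresponds to.

```python
import random


def split_rows(rows, train_size=0.7, seed=123):
    """Shuffle rows and split them into (training, hold-out) lists."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_size)
    return shuffled[:cut], shuffled[cut:]


class NearestCentroid1D:
    """Toy classifier (hypothetical): predicts the label with the closest mean."""

    def fit(self, rows):
        # Store the mean feature value of each label seen in the rows.
        self.centroids_ = {}
        for label in sorted({y for _, y in rows}):
            values = [x for x, y in rows if y == label]
            self.centroids_[label] = sum(values) / len(values)
        self.n_rows_seen_ = len(rows)
        return self

    def predict(self, xs):
        return [
            min(self.centroids_, key=lambda label: abs(x - self.centroids_[label]))
            for x in xs
        ]


def accuracy(model, rows):
    """Fraction of rows whose label the model predicts correctly."""
    predictions = model.predict([x for x, _ in rows])
    return sum(p == y for p, (_, y) in zip(predictions, rows)) / len(rows)


# Synthetic data (assumption): 100 rows of (feature, label), two classes.
data_rng = random.Random(0)
dataset = [(data_rng.gauss(0.0, 1.0), 0) for _ in range(50)]
dataset += [(data_rng.gauss(3.0, 1.0), 1) for _ in range(50)]

# 1. Train on 70% of the rows and check the model on the 30% hold-out set.
train_rows, holdout_rows = split_rows(dataset, train_size=0.7)
model = NearestCentroid1D().fit(train_rows)
holdout_accuracy = accuracy(model, holdout_rows)

# 2. "Finalize": refit the same kind of model on every row, hold-out included.
final_model = NearestCentroid1D().fit(dataset)

print("rows used before finalizing:", model.n_rows_seen_)
print("rows used after finalizing:", final_model.n_rows_seen_)
print("hold-out accuracy:", holdout_accuracy)
```

Note that after the refit there is no unseen data left to score the final model on, so the hold-out score from step 1 is the last unbiased estimate available.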





# Importing dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# Importing module and initializing setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# create a model
rf = create_model('rf')

# finalize a model
final_rf = finalize_model(rf)


RandomForestClassifier(bootstrap=True, ccp_alpha=0.0, class_weight=None,
                       criterion='gini', max_depth=None, max_features='auto',
                       max_leaf_nodes=None, max_samples=None,
                       min_impurity_decrease=0.0, min_impurity_split=None,
                       min_samples_leaf=1, min_samples_split=2,
                       min_weight_fraction_leaf=0.0, n_estimators=10,
                       n_jobs=None, oob_score=False, random_state=123,
                       verbose=0, warm_start=False)
