Tune Model


Tuning the hyperparameters of a machine learning model in any module is as simple as writing tune_model. It tunes the hyperparameters of the model passed as an estimator using a random grid search over pre-defined grids that are fully customizable. Optimizing the hyperparameters of a model requires an objective function, which is linked to the target variable automatically in supervised experiments such as Classification or Regression. However, for unsupervised experiments such as Clustering, Anomaly Detection, and Natural Language Processing, PyCaret allows you to define a custom objective function by specifying a supervised target variable using the supervised_target parameter within tune_model (see examples below). For supervised learning, this function returns a table with k-fold cross-validated scores of common evaluation metrics along with the trained model object. For unsupervised learning, this function returns only the trained model object. The evaluation metrics used for supervised learning are:

  • Classification: Accuracy, AUC, Recall, Precision, F1, Kappa, MCC
  • Regression: MAE, MSE, RMSE, R2, RMSLE, MAPE

The number of folds can be defined using the fold parameter within the tune_model function. By default, fold is set to 10. All metrics are rounded to 4 decimals by default, which can be changed using the round parameter. tune_model in PyCaret is a randomized grid search over a pre-defined search space, so its results depend on the number of search iterations. By default, the function performs 10 random iterations over the search space, which can be changed using the n_iter parameter within tune_model. Increasing n_iter may increase the training time but often results in a more highly optimized model. The metric to be optimized can be defined using the optimize parameter. By default, Regression tasks optimize R2 and Classification tasks optimize Accuracy.
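The randomized-search behaviour described above can be sketched in plain Python. The grid and scoring function below are toy stand-ins for illustration, not PyCaret's actual search space or objective:

```python
import random

def random_grid_search(grid, n_iter, score_fn, seed=0):
    """Sample n_iter random combinations from the grid and keep the best."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        # draw one random value per hyperparameter
        candidate = {name: rng.choice(values) for name, values in grid.items()}
        score = score_fn(candidate)
        if score > best_score:
            best_params, best_score = candidate, score
    return best_params, best_score

# toy grid and scoring function, for illustration only
grid = {"max_depth": [2, 4, 6, 8], "min_samples_leaf": [2, 3, 4, 5, 6]}
score = lambda p: -abs(p["max_depth"] - 6)  # pretend depth 6 is best

best, _ = random_grid_search(grid, n_iter=10, score_fn=score)
print(best)
```

A larger n_iter samples more combinations and is therefore more likely to land on the best cell of the grid, at the cost of more training runs. This is exactly the trade-off the n_iter parameter of tune_model controls.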

 

See Examples


Classification Example

 

Code
# Importing dataset 
from pycaret.datasets import get_data 
diabetes = get_data('diabetes') 

# Importing module and initializing setup 
from pycaret.classification import * 
clf1 = setup(data = diabetes, target = 'Class variable')

# train a decision tree model
dt = create_model('dt')

# tune hyperparameters of decision tree
tuned_dt = tune_model(dt)

# tune hyperparameters with increased n_iter
tuned_dt = tune_model(dt, n_iter = 50)

# tune hyperparameters to optimize AUC
tuned_dt = tune_model(dt, optimize = 'AUC') #default is 'Accuracy'

# tune hyperparameters with custom_grid
import numpy as np

params = {"max_depth": np.random.randint(1, int(len(diabetes.columns) * .85), 20),
          "max_features": np.random.randint(1, len(diabetes.columns), 20),
          "min_samples_leaf": [2, 3, 4, 5, 6],
          "criterion": ["gini", "entropy"]
          }

tuned_dt_custom = tune_model(dt, custom_grid = params)

# tune multiple models dynamically
top3 = compare_models(n_select = 3)
tuned_top3 = [tune_model(i) for i in top3]
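The estimator returned by tune_model is a fitted scikit-learn-compatible object, so the hyperparameters it settled on can be read back with the standard get_params() method. A minimal sketch, shown on a plain scikit-learn tree as a stand-in for a PyCaret-tuned model (in a PyCaret session you would call tuned_dt.get_params() directly):

```python
from sklearn.tree import DecisionTreeClassifier

# stand-in for the estimator returned by tune_model
model = DecisionTreeClassifier(max_depth=4, min_samples_leaf=3)

# get_params() exposes every hyperparameter the search settled on
params = model.get_params()
print(params["max_depth"], params["min_samples_leaf"])
```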

 

Sample Output

Regression Example

 

Code
# Importing dataset
from pycaret.datasets import get_data 
boston = get_data('boston') 

# Importing module and initializing setup 
from pycaret.regression import * 
reg1 = setup(data = boston, target = 'medv')

# train a decision tree model
dt = create_model('dt')

# tune hyperparameters of decision tree
tuned_dt = tune_model(dt)

# tune hyperparameters with increased n_iter
tuned_dt = tune_model(dt, n_iter = 50)

# tune hyperparameters to optimize MAE
tuned_dt = tune_model(dt, optimize = 'MAE') #default is 'R2'

# tune hyperparameters with custom_grid
import numpy as np

params = {"max_depth": np.random.randint(1, int(len(boston.columns) * .85), 20),
          "max_features": np.random.randint(1, len(boston.columns), 20),
          "min_samples_leaf": [2, 3, 4, 5, 6],
          # regression tree criteria; use "mse" / "mae" on scikit-learn < 1.0
          "criterion": ["squared_error", "friedman_mse"]
          }

tuned_dt_custom = tune_model(dt, custom_grid = params)

# tune multiple models dynamically
top3 = compare_models(n_select = 3)
tuned_top3 = [tune_model(i) for i in top3]

 

Sample Output

Clustering Example

 

Code
# Importing dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# Importing module and initializing setup
from pycaret.clustering import *
clu1 = setup(data = diabetes)

# Tuning K-Modes Model
tuned_kmodes = tune_model('kmodes', supervised_target = 'Class variable')

 

Sample Output

Anomaly Detection Example

 

Code
# Importing dataset
from pycaret.datasets import get_data
boston = get_data('boston')

# Importing module and initializing setup
from pycaret.anomaly import *
ano1 = setup(data = boston)

# Tuning Isolation Forest Model
tuned_iforest = tune_model('iforest', supervised_target = 'medv')

 

Sample Output

Natural Language Processing Example

 

Code
# Importing dataset
from pycaret.datasets import get_data
kiva = get_data('kiva')

# Importing module and initializing setup
from pycaret.nlp import *
nlp1 = setup(data = kiva, target = 'en')

# Tuning LDA Model
tuned_lda = tune_model('lda', supervised_target = 'status')

 

Sample Output
