Tune Model


Tuning hyperparameters of a machine learning model in any module is as simple as writing tune_model. It takes only one mandatory parameter i.e. the model abbreviation as string. Since optimizing the hyperparameters of a model requires an objective function which is linked to target variable automatically in supervised experiments such as Classification or Regression. However for unsupervised experiments such as Clustering, Anomaly Detection and Natural Language Processing PyCaret allows you to define custom objective function by specifying supervised target variable using supervised_target parameter within tune_model (see examples below). For supervised learning, this function returns a table with k-fold cross validated scores of common evaluation metrics along with trained model object. For unsupervised learning, this function only returns trained model object. The evaluation metrics used for supervised learning are:

  • Classification: Accuracy, AUC, Recall, Precision, F1, Kappa
  • Regression: MAE, MSE, RMSE, R2, RMSLE, MAPE

The number of folds can be defined using fold parameter within tune_model function. By default, the fold is set to 10. All the metrics are rounded to 4 decimals by default by can be changed using round parameter. Tune model function in PyCaret is a randomized grid search of a pre-defined search space hence it relies on number of iterations of search space. By default, this function performs 10 random iteration over search space which can be changed using n_iter parameter within tune_model. Increasing the n_iter parameter may increase the training time but often results in a highly optimized model. Metric to be optimized can be defined using optimize parameter. By default, Regression tasks will optimize R2 and Classification tasks will optimize Accuracy. 

 

See Examples


Classification Example

 

Code
# Importing dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# Importing module and initializing setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')

# tuning LightGBM Model
tuned_lightgbm = tune_model('lightgbm')

 

Output

LGBMClassifier(boosting_type='gbdt', class_weight=None, colsample_bytree=1.0,
               importance_type='split', learning_rate=0.4, max_depth=10,
               min_child_samples=20, min_child_weight=0.001, min_split_gain=0.9,
               n_estimators=90, n_jobs=-1, num_leaves=10, objective=None,
               random_state=123, reg_alpha=0.9, reg_lambda=0.2, silent=True,
               subsample=1.0, subsample_for_bin=200000, subsample_freq=0)

Regression Example

 

Code
from pycaret.datasets import get_data
boston = get_data('boston')

# Importing module and initializing setup
from pycaret.regression import *
reg1 = setup(data = boston, target = 'medv')

# tuning Random Forest model
tuned_rf = tune_model('rf', n_iter = 50, optimize = 'mae')

 

Output

RandomForestRegressor(bootstrap=False, ccp_alpha=0.0, criterion='mse',
                      max_depth=100, max_features='sqrt', max_leaf_nodes=None,
                      max_samples=None, min_impurity_decrease=0.0,
                      min_impurity_split=None, min_samples_leaf=1,
                      min_samples_split=2, min_weight_fraction_leaf=0.0,
                      n_estimators=110, n_jobs=None, oob_score=False,
                      random_state=123, verbose=0, warm_start=False)

Clustering Example

 

Code
# Importing dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')

# Importing module and initializing setup
from pycaret.clustering import *
clu1 = setup(data = diabetes)

# Tuning K-Modes Model
tuned_kmodes = tune_model('kmodes', supervised_target = 'Class variable')

 

Output

KModes(cat_dissim=<function matching_dissim at 0x000001E0DC2BD288>, init='Cao',
       max_iter=100, n_clusters=25, n_init=1, n_jobs=1, random_state=123,
       verbose=0)

Anomaly Detection Example

 

Code
# Importing dataset
from pycaret.datasets import get_data
boston = get_data('boston')

# Importing module and initializing setup
from pycaret.anomaly import *
ano1 = setup(data = boston)

# Tuning Isolation Forest Model
tuned_iforest = tune_model('iforest', supervised_target = 'medv')

 

Output

IForest(behaviour='new', bootstrap=False, contamination=0.01,
    max_features=1.0, max_samples='auto', n_estimators=100, n_jobs=1,
    random_state=123, verbose=0)

Natural Language Processing Example

 

Code
# Importing dataset
from pycaret.datasets import get_data
kiva = get_data('kiva')

# Importing module and initializing setup
from pycaret.nlp import *
nlp1 = setup(data = kiva, target = 'en')

# Tuning LDA Model
tuned_lda = tune_model('lda', supervised_target = 'status')

 

Output

LdaModel(num_terms=3862, num_topics=4, decay=0.5, chunksize=100)

Try this next