2026-04-24
Session 23: God-class drain: `predict_model`
Engineering log for session 23.
Baseline: session 22 drained the 4 persistence verbs. Session 23 drains the 5th OOP verb, `predict_model`. The heart of the rewrite is a task-aware dispatch that handles classification, regression, clustering, and anomaly detection without ever touching `self._legacy`.
CHANGED: engine
CHANGED, BREAKING: `packages/engine/pycaret/core/experiment.py`, `Experiment.predict_model`. No longer delegates to `self._legacy.predict_model`. Rewritten as ~170 LoC of native dispatch. Signature slimmed from `(estimator, *args, **kwargs)` to `(estimator, data=None, *, raw_score=False, round=4, verbose=False)`. All 3.x-era params are gone:
- `probability_threshold`: removed. Callers thresholding on positive-class probability can do `out["prediction_label"] = out["prediction_score"] >= t`.
- `encoded_labels`: removed. Label encoding happens inside the preprocessor; for integer labels, `out["prediction_label"].map(class_to_int)` is a one-liner.
- `preprocess`: removed. In 4.0 the pipeline either preprocesses itself (Pipeline case) or we apply `self.preprocess_pipeline` automatically (bare estimator case, transitional).
- `ml_usecase`: removed. Comes from `self.task` directly now.
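Migration sketch for the two removed params that need caller-side code. The frame below is a hand-built stand-in for a binary-task `predict_model` result, and both `t` and `class_to_int` are assumed to be supplied by the caller; none of this is engine code.

```python
import pandas as pd

# Hypothetical stand-in for the frame returned on a binary classification task.
out = pd.DataFrame(
    {
        "prediction_label": ["no", "yes", "no", "yes"],
        "prediction_score": [0.10, 0.55, 0.40, 0.90],  # positive-class probability
    }
)

# Replacement for probability_threshold: re-label at a caller-chosen cut-off.
t = 0.6
thresholded = out["prediction_score"] >= t  # boolean Series, True = positive class

# Replacement for encoded_labels: map labels to integers after the fact.
class_to_int = {"no": 0, "yes": 1}  # assumed mapping, owned by the caller
encoded = out["prediction_label"].map(class_to_int)
```

The thresholded labels are booleans here; callers who want the original class names back can map them through their own lookup.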
CHANGED: Transitional bare-estimator accommodation. `CreateResult.pipeline` today is a bare sklearn estimator (e.g. `LogisticRegression`), not a Pipeline. Once `create_model`'s drain lands (session 24), it becomes a proper Pipeline with preprocessing baked in, and the transitional code path in `predict_model` collapses to a one-line `estimator.predict(X)`. Flagged in the docstring and guarded by `isinstance(estimator, sklearn.pipeline.Pipeline)`.

CHANGED: Metric computation inlined. Native `predict_model` uses the existing metric registry (`pycaret.containers.metrics.{classification,regression}.get_all_metric_containers` + `pycaret.utils.generic.calculate_metrics`) directly; it no longer depends on `self._legacy.pull()` to surface the holdout metrics DataFrame. Wraps the whole metric block in a broad try/except, returning `None` on any registry hiccup. Rationale: metrics are advisory; a predict must never fail because a metric choked.

CHANGED: Per-task output columns:
- Classification binary → `prediction_label` + `prediction_score` (positive-class probability).
- Classification multiclass, `raw_score=False` → `prediction_label` + `prediction_score` (winning-class probability).
- Classification multiclass, `raw_score=True` → `prediction_label` + `prediction_score_<class>` per class.
- Regression → `prediction_label` only.
- Clustering → `Cluster` column with `"Cluster {i}"` labels.
- Anomaly → `Anomaly` + `Anomaly_Score` (when the detector exposes `decision_function`).
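The classification rows of the per-task column contract can be sketched from a plain probability matrix. `classification_columns` is a hypothetical helper, not the engine's code. It assumes the label is the argmax class, that the positive class is the second entry of `classes`, and that binary `raw_score=True` also emits per-class columns (the log does not specify that case).

```python
import numpy as np
import pandas as pd


def classification_columns(proba, classes, raw_score=False):
    """Build prediction columns from an (n_rows, n_classes) probability matrix."""
    proba = np.asarray(proba, dtype=float)
    winners = proba.argmax(axis=1)
    out = pd.DataFrame({"prediction_label": [classes[i] for i in winners]})
    if raw_score:
        # One score column per class, named after the class.
        for j, cls in enumerate(classes):
            out[f"prediction_score_{cls}"] = proba[:, j]
    elif len(classes) == 2:
        # Binary: positive-class probability (assumed to be classes[1]).
        out["prediction_score"] = proba[:, 1]
    else:
        # Multiclass: probability of the winning class.
        out["prediction_score"] = proba.max(axis=1)
    return out


binary = classification_columns([[0.8, 0.2], [0.3, 0.7]], ["no", "yes"])
multi = classification_columns([[0.1, 0.6, 0.3]], ["a", "b", "c"])
raw = classification_columns([[0.1, 0.6, 0.3]], ["a", "b", "c"], raw_score=True)
```

With `raw_score=True` the per-class columns of each row sum to roughly 1, which is the property the fast multiclass test checks.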
ADDED: tests
ADDED: `packages/engine/tests/test_session23_predict.py`, 12 new tests, split by speed:
- Fast (7 tests): fabricate a tiny `StandardScaler + {LogReg,LinReg}` pipeline + a fit-sentinel Experiment; exercise raw predict paths. Confirms: non-estimator rejection (`TypeError` on dict), `NotFittedError` without fit, metrics absent when data lacks target, metrics present + model name set when data has target, regression has no score column, multiclass `prediction_score` is winning-class prob, multiclass `raw_score` sums to ~1 per row.
- Slow (5 tests, `@slow` marker): full engine E2E on `juice` / `boston`. Covers binary classification output columns, classification `raw_score`, regression output + metrics, event stream captures `MODEL_PREDICTED` with `n_rows` + `duration_ms`, and the drain-lock test (`test_predict_model_does_not_call_legacy_predict_model`).
ADDED: Drain-lock test pattern. Monkeypatches `exp._legacy.predict_model` with a raise-on-call function, then calls `exp.predict_model(pipeline)` and asserts it succeeds. Any future refactor that accidentally re-delegates will fail on the Ubuntu + Windows matrix. Same shape as session 22's `test_save_model_does_not_touch_legacy`.
INTERNAL
INTERNAL: Why bare estimators are still accepted (temporarily). The pyramid of the 10-verb drain is `save_model` → `predict_model` → `create_model` → ... . Sessions 22–23 drain verbs that consume the output of `create_model`. `create_model` still returns a bare estimator today, so strictly rejecting that in `predict_model` would red-light the slow E2E suite (and would also break every notebook in the wild). Session 24 drains `create_model`, replacing the returned bare estimator with a Pipeline. At that point the transitional branch in `predict_model` (plus the `self.preprocess_pipeline` transform call and the `estimator_is_pipeline` check) can be deleted in a follow-up cleanup commit.

INTERNAL: Task dispatch via `self.task`, not `isinstance`. The legacy code used `self._ml_usecase` (a `MLUsecase` enum on the god-class). The native code reads `self.task` (a `pycaret.core.tasks.TaskType` enum on the Experiment). Cleaner because `TaskType` already lives on the 4.0 surface; no import needed from the internal module.

INTERNAL: Metric registry called with empty `globals_dict`. `get_all_metric_containers(globals_dict, raise_errors=False)`: we pass `{}` for `globals_dict` because the default-behavior metrics don't need the legacy experiment's variables. If any metric registers via `globals_dict` reads, `raise_errors=False` means it gets silently skipped rather than crashing the predict. This is consistent with the legacy behavior (which also had try/except around `calculate_metrics` calls for robustness).
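The transitional branch described in the first note can be sketched with plain scikit-learn. `native_predict` and its `preprocess_pipeline` argument are hypothetical names standing in for the engine's internals; only the `isinstance(estimator, Pipeline)` guard is taken from the log.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import StandardScaler


def native_predict(estimator, X, preprocess_pipeline=None):
    """Predict with a Pipeline directly, or preprocess first for a bare estimator."""
    if isinstance(estimator, Pipeline):
        # Target state: preprocessing is baked in, so this is a one-line predict.
        return estimator.predict(X)
    # Transitional branch: apply the separately fitted preprocessing first.
    if preprocess_pipeline is not None:
        X = preprocess_pipeline.transform(X)
    return estimator.predict(X)


X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])

# Bare estimator fitted on pre-scaled data (today's shape of the create_model output).
scaler = StandardScaler().fit(X)
bare = LogisticRegression().fit(scaler.transform(X), y)

# Proper Pipeline with preprocessing baked in (the shape expected after session 24).
piped = make_pipeline(StandardScaler(), LogisticRegression()).fit(X, y)

bare_pred = native_predict(bare, X, preprocess_pipeline=scaler)
piped_pred = native_predict(piped, X)
```

Both paths see identically scaled inputs, so they should agree row for row; once only Pipelines arrive, everything below the guard becomes dead code and can go.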
Session 23 delta summary
| Metric | Session 22 end | Session 23 end |
|---|---|---|
| OOP verbs still on `self._legacy` | 6 | 5 |
| Engine tests (fast + slow) | 35 | 51 |
| Combined tests | 181 | 197 |