2026-04-22

Session 2: Phase 4 (Engine Architecture) kickoff

Engineering log for session 2.

Baseline: end of session 1. Environment: unchanged (Python 3.13.13, sklearn 1.7.2, NumPy 2.3.5, pandas 2.x).

Theme: the user called out that the 3.x OOP-on-top-of-functional layering was "a piece of hack shit." Session 2 rebuilds the engine core as a proper sklearn-composable object graph, with pycaret.core.Experiment as a real BaseEstimator subclass. The legacy code paths stay intact underneath the new class (via delegation), so the notebook golden path never breaks during the migration.
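The delegation idea can be sketched in a few lines of plain Python. This is a minimal illustration, not the PyCaret source: LegacyExperiment, the field names on CompareResult, and the event payload are all hypothetical stand-ins. Only the shape (delegate, emit an event, wrap in a frozen dataclass) comes from the log.

```python
from dataclasses import dataclass
from typing import Any


class LegacyExperiment:
    """Hypothetical stand-in for the 3.x experiment object."""

    def compare_models(self) -> list[str]:
        # The legacy verb returns a loosely typed value (here, a ranked list of ids).
        return ["lr", "rf", "knn"]


@dataclass(frozen=True)
class CompareResult:
    """Typed return shape; the field names are illustrative assumptions."""

    best: str
    ranked: tuple[str, ...]
    events: tuple[dict[str, Any], ...] = ()


class Experiment:
    """New-style facade that keeps the legacy object alive underneath."""

    def __init__(self) -> None:
        self._legacy = LegacyExperiment()

    def compare_models(self) -> CompareResult:
        ranked = self._legacy.compare_models()  # 1. delegate to the legacy verb
        event = {"kind": "model.compared", "n_models": len(ranked)}  # 2. structured event
        # 3. wrap the legacy return value in a typed, immutable result
        return CompareResult(best=ranked[0], ranked=tuple(ranked), events=(event,))


result = Experiment().compare_models()
print(result.best, len(result.ranked))
```

Because callers only ever see CompareResult, the body of compare_models can later be replaced with a native implementation without changing the public contract.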

ADDED

  • ADDED pycaret.core package. New engine primitives:
    • pycaret/core/tasks.py: TaskType str-enum (CLASSIFICATION, REGRESSION, CLUSTERING, ANOMALY, TIME_SERIES) with is_supervised / is_classification / is_regression helpers.
    • pycaret/core/errors.py: PyCaretError hierarchy (ConfigurationError extends ValueError, NotFittedError extends RuntimeError, UnknownModelError / UnknownMetricError extend KeyError) so UI/agent callers can catch engine errors distinctly from upstream ones.
    • pycaret/core/results.py: frozen dataclasses CompareResult, CreateResult, TuneResult, EnsembleResult, BlendResult, StackResult, CalibrateResult, FinalizeResult, PredictResult, which form the typed return shape of every verb. Each carries the fitted pipeline, a metrics DataFrame, and an event trace.
    • pycaret/core/state.py: current_experiment(), set_current_experiment(), reset_current_experiment(), require_current_experiment(), backed by contextvars.ContextVar. Thread- and async-safe replacement for the 3.x module-level global.
    • pycaret/core/experiment.py: Experiment(BaseEstimator) base class. Implements get_params, set_params, __sklearn_tags__, __sklearn_is_fitted__, fit(X, y=None, **setup_kwargs). Verbs (compare_models, create_model, tune_model, ensemble_model, blend_models, stack_models, calibrate_model, finalize_model, predict_model, plot_model, interpret_model, evaluate_model, automl) delegate to a legacy _SupervisedExperiment held as self._legacy during the transition; each returns a typed result dataclass.
  • ADDED pycaret.logging package. Replaces the 3.x tracker-adapter concept with a lean structured event stream designed for React UI / LLM agent consumption:
    • pycaret/logging/events.py: Event frozen dataclass and EventKind str-enum with 22 canonical kinds (experiment.started, model.created, model.compared, model.tuned, etc.). Event.to_dict() produces a JSON-serializable dict.
    • pycaret/logging/base.py: BaseLogger hook interface with log() / emit() + subscribe(callback) for UI fan-out. Also provides NullLogger (the silent default) and the 3.x no-op shim methods (log_experiment, log_model, log_model_comparison, log_plot, log_params, log_metrics, log_artifact, log_hpram_grid, log_sklearn_pipeline, init_logger, init_experiment, finish_experiment, set_tags, .loggers property), so that legacy god-class calls routed through a PyCaret 4.0 BaseLogger instance continue to work.
    • pycaret/logging/memory.py: MemoryLogger, a thread-safe in-memory buffer with optional JSONL file teeing (flushed after every write so a UI can tail it). Exposes an events property, an as_jsonl() method, and clear().
  • ADDED pycaret.api package: the agent + UI introspection surface. All functions return JSON-serializable dataclasses:
    • pycaret/api/cards.py: ParameterCard, ModelCard, MetricCard dataclasses + ParameterKind str-enum (BOOL, INT, FLOAT, STRING, ENUM, LIST, COLUMN, COLUMNS, MODEL_ID, METRIC_ID, UNKNOWN), which serve as widget hints for a React form.
    • pycaret/api/schemas.py: SetupParamSchema dataclass grouping ParameterCards.
    • pycaret/api/describe.py: list_models(task) (19 classification cards, 26 regression cards curated from the legacy containers), describe_model(task, id), list_metrics(task), describe_setup_params(task) (13 common params organised into groups: Data / Experiment / Cross-Validation / Preprocessing / Compute / Logging), list_available_models(experiment) (runtime-aware: flags is_available=False when a model's package isn't installed).
  • ADDED pycaret.tasks package: task-specific experiment subclasses.
    • pycaret/tasks/classification.py: ClassificationExperiment(Experiment) pre-configures task=CLASSIFICATION, sets estimator_type="classifier" in __sklearn_tags__, and explicitly declares all 15 init parameters on the concrete class (rather than via **kwargs) so that sklearn's get_params() introspection surfaces every configured knob.
  • ADDED End-to-end proof that the new stack works. On the juice dataset, ClassificationExperiment(target="Purchase").fit(data), followed by compare_models() and then predict_model(result.best): fitted in 1.3s, compared 3 models in 8.3s, emitted 5 typed events through the logger, and returned CompareResult / PredictResult dataclasses. Captured in the new-architecture test suite.
  • ADDED tests/test_core_architecture.py, with 17 fast unit tests (0.2s) covering every new primitive: TaskType enum, error hierarchy, frozen result dataclasses, event JSON round-trip, MemoryLogger (captures / subscribers / file teeing), BaseLogger no-op compat methods, ModelCard/MetricCard/ParameterSchema introspection, describe_model raising UnknownModelError for bad ids, ClassificationExperiment being sklearn-cloneable and declaring the classifier tag, and ContextVar state. All 17 pass.
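To make the event-stream idea concrete, here is a minimal stdlib-only sketch of a frozen Event plus a thread-safe in-memory logger with subscriber fan-out. It is an assumption-laden illustration rather than the actual pycaret.logging code: the payload field, the three enum members shown, and the internal locking strategy are guesses; only the names Event, EventKind, MemoryLogger, to_dict(), subscribe(), emit(), events, and as_jsonl() come from the list above.

```python
import json
import threading
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Callable


class EventKind(str, Enum):
    # Three of the kinds named in the log; the full set is not reproduced here.
    EXPERIMENT_STARTED = "experiment.started"
    MODEL_CREATED = "model.created"
    MODEL_COMPARED = "model.compared"


@dataclass(frozen=True)
class Event:
    kind: EventKind
    payload: dict[str, Any] = field(default_factory=dict)  # hypothetical field

    def to_dict(self) -> dict[str, Any]:
        # Plain str/dict values only, so json.dumps can serialize the result.
        return {"kind": self.kind.value, "payload": dict(self.payload)}


class MemoryLogger:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._events: list[Event] = []
        self._subscribers: list[Callable[[Event], None]] = []

    def subscribe(self, callback: Callable[[Event], None]) -> None:
        with self._lock:
            self._subscribers.append(callback)

    def emit(self, event: Event) -> None:
        with self._lock:
            self._events.append(event)
            subscribers = list(self._subscribers)
        # Fan out outside the lock so a slow subscriber cannot block writers.
        for callback in subscribers:
            callback(event)

    @property
    def events(self) -> tuple[Event, ...]:
        with self._lock:
            return tuple(self._events)

    def as_jsonl(self) -> str:
        return "\n".join(json.dumps(e.to_dict()) for e in self.events)


logger = MemoryLogger()
seen: list[Event] = []
logger.subscribe(seen.append)
logger.emit(Event(EventKind.MODEL_COMPARED, {"n_models": 3}))
print(logger.as_jsonl())
```

A UI or agent would typically register its own callback via subscribe() and read as_jsonl() (or tail the teed file) for replay.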

CHANGED

  • CHANGED pycaret/loggers/base_logger.py is now a thin re-export shim over pycaret.logging.base.BaseLogger. The full BaseLogger lives in pycaret/logging/base.py. User subclasses of pycaret.loggers.base_logger.BaseLogger (a 3.x import path) still work unchanged.
  • CHANGED pycaret/loggers/__init__.py re-exports only BaseLogger from the new location. It previously exported 5 symbols (all removed in session 1).

DOCS

  • DOCS docs/revamp/ARCHITECTURE.md: new design doc explaining the 4.0 engine architecture. It covers why the 3.x layering was broken, the 8 core design principles (including sklearn-canonical BaseEstimator, typed results, build-on-not-replace Pipeline, ColumnTransformer-based preprocessor, canonical search CV, event-stream logger, no prints/interactive), the package layout, the full Experiment interface contract, the task-subclass pattern, result/event contracts, the multi-session migration plan, and explicit non-goals.
  • DOCS Release-notes section updated to reflect Phase 4 kickoff and the new package boundaries.

INTERNAL

  • INTERNAL Transition pattern documented. During the multi-session migration, Experiment._legacy holds an instance of the legacy _SupervisedExperiment (task-specific subclass picked by _build_legacy_experiment). Every verb on the new Experiment calls through to self._legacy.<verb>(...), wraps the legacy return in a typed result dataclass, and emits structured events. Future sessions replace the delegation bodies one verb at a time without breaking the public API.
  • INTERNAL sklearn compatibility validated. ClassificationExperiment(...).get_params() returns all 15 init params; sklearn.base.clone(exp) preserves configuration and resets fitted state; __sklearn_tags__() surfaces estimator_type="classifier".
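The sklearn-compatibility checks above rely on standard scikit-learn conventions, which a small stand-in class can demonstrate. The sketch below assumes scikit-learn is installed and uses a hypothetical SketchExperiment with three illustrative parameters (the real class declares 15); it omits __sklearn_tags__ to stay independent of the sklearn version.

```python
from sklearn.base import BaseEstimator, clone
from sklearn.exceptions import NotFittedError
from sklearn.utils.validation import check_is_fitted


class SketchExperiment(BaseEstimator):
    """Hypothetical stand-in; parameter names are illustrative only."""

    def __init__(self, target=None, fold=10, session_id=None):
        # sklearn convention: every __init__ argument is declared explicitly
        # and stored unmodified under its own name, so get_params() can find it.
        self.target = target
        self.fold = fold
        self.session_id = session_id

    def fit(self, X, y=None):
        self.fitted_ = True  # trailing underscore marks fitted state
        return self

    def __sklearn_is_fitted__(self):
        # Hook consulted by check_is_fitted().
        return getattr(self, "fitted_", False)


exp = SketchExperiment(target="Purchase", fold=5).fit([[0], [1]])
params = exp.get_params()
copy = clone(exp)  # same configuration, no fitted state

check_is_fitted(exp)  # passes silently
try:
    check_is_fitted(copy)
    clone_is_fitted = True
except NotFittedError:
    clone_is_fitted = False
print(params, clone_is_fitted)
```

Declaring parameters explicitly rather than through **kwargs is what lets get_params() and clone() work, since BaseEstimator discovers parameters by inspecting the __init__ signature.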

TESTS

  • TESTS 17 new unit tests added under tests/test_core_architecture.py. Fast (< 0.5s total); designed to run on every CI matrix entry.
  • TESTS No regressions on the legacy subset. pytest tests/test_models.py tests/test_datasets.py tests/test_core_architecture.py: 23/23 pass.
  • TESTS End-to-end proof via the new stack. The full fit → compare_models → predict_model golden path runs green on Python 3.13 + sklearn 1.7.2 + NumPy 2.3.5 using pycaret.tasks.ClassificationExperiment.