2026-05-16
Session 57: Phase 0: publishable foundation (secrets lockdown + docker compose polish + README + vision doc)
Engineering log for session 57.
Maintainer goal: get the repo into a state where git clone && docker compose up works for a brand-new contributor, with categorical guarantees that no secrets can leak via git. Phase 0 of the long roadmap toward a Databricks/Snowflake-style self-hosted ML platform.
ADDED: Secrets-leak prevention (categorical, layered)
ADDED: `.gitignore` broadened to cover all local DB, credential, and env patterns, not just the specific filenames previously listed. New entries: `*.db`, `*.db-journal`, `*.db-wal`, `*.db-shm`, `*.sqlite*` (anywhere in tree), `.env` + `.env.*` (with `.env.example` allow-listed), `*.pem` / `*.key` / `credentials.json` / `service-account*.json` / `aws-credentials*`, plus the `/data/` docker-compose volume mount point. Previously only `/pycaret.db` and `/services/api/pycaret.db` were covered, which left `pycaret-dev.db`, `smoke.db`, and `test_phase_smoke.db` exposed to a careless `git add -A`.

ADDED: `scripts/check-secrets.sh`, a pre-push gate (works manually too). Scans for Anthropic / OpenAI / Stripe / Slack / GitHub / AWS / Google API key shapes, Fernet ciphertext blobs (≥40-char base64 after `ENC:v1:`), and PEM private key blocks. Whole-file allow-list at `scripts/.secrets-allowlist`; single-line `# pragma: allow-secret`. Hook installation: `cp scripts/check-secrets.sh .git/hooks/pre-push && chmod +x .git/hooks/pre-push`. Validated clean on the current 527-file tree.

ADDED: `.env.example` at repo root, a documented template for every overridable env var. Commented-out by default so contributors know what's available without polluting their actual `.env`.
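The `.gitignore` gap described above can be illustrated with a small sketch. It only approximates gitignore matching with `fnmatch` (root-anchored entries versus basename globs) and ignores negation rules such as the `.env.example` allow-list. The helper name `ignored` and the two pattern lists are hypothetical and hold only a subset of the real entries.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Entries the log says were covered before this session (exact paths only).
OLD_PATTERNS = ["/pycaret.db", "/services/api/pycaret.db"]
# A subset of the new basename globs listed in the log.
NEW_PATTERNS = ["*.db", "*.db-journal", "*.db-wal", "*.db-shm", "*.sqlite*"]


def ignored(path: str, patterns: list[str]) -> bool:
    """Rough gitignore approximation (assumption, not git's real matcher):
    a leading '/' anchors the pattern to the repo root, anything else is
    matched against the file's basename at any depth."""
    p = PurePosixPath(path)
    for pattern in patterns:
        if pattern.startswith("/"):
            if fnmatch(str(p), pattern.lstrip("/")):
                return True
        elif fnmatch(p.name, pattern):
            return True
    return False


# The three stray databases named in the log entry.
for name in ["pycaret-dev.db", "smoke.db", "test_phase_smoke.db"]:
    print(name, ignored(name, OLD_PATTERNS), ignored(name, NEW_PATTERNS))
```

Each of the three files comes out as not ignored under the old list and ignored under the new one, which is the exposure the broadened patterns close. `git check-ignore -v <path>` remains the authoritative check against the real file.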
ADDED: docker compose up from repo root
ADDED: `compose.yml` at repo root. Single-file 2-service stack (api + web), with a named volume `pycaret-data` for SQLite, artifacts, and Fernet key persistence. Every env var defaults to a sane local-dev value via `${VAR:-default}` substitution, and all are overridable via the root `.env`. Healthchecks wired; `web` waits for `api: service_healthy` before starting.

ADDED: `infra/docker/docker-entrypoint.sh` for the api container. On first run it generates a Fernet key and persists it to `/data/.secrets/fernet.key` (chmod 600). On subsequent runs it reads the key from the same file. The Fernet key thus survives container restarts inside the data volume, fixing the silent "secrets become unreadable after restart" trap we hit live in session 55. The script hands off to uvicorn via `exec` so `docker stop` signals reach the server cleanly.

CHANGED: `infra/docker/Dockerfile.api` now copies the entrypoint script and sets it as `ENTRYPOINT`. The actual server command moves to `CMD` (uvicorn invocation unchanged).

REMOVED: `infra/docker/docker-compose.yml` deleted. Having two sources of truth was confusing; the root `compose.yml` is canonical now. The prod variant at `infra/docker/docker-compose.prod.yml` stays (it's a different document: Postgres + S3 + replicas).
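The generate-once, reuse-afterwards behaviour of the entrypoint can be sketched as follows. The real entrypoint is a shell script, so this Python version illustrates the logic only. The function name `load_or_create_key` is hypothetical, and a temporary directory stands in for the `/data` volume. The key is built with the standard library on the assumption that a Fernet key is 32 random bytes in URL-safe base64.

```python
import base64
import os
import stat
import tempfile
from pathlib import Path


def load_or_create_key(path: Path) -> bytes:
    """Return the key stored at `path`, generating and persisting it on first use."""
    if path.exists():
        # Subsequent runs: reuse the persisted key so old ciphertext stays readable.
        return path.read_bytes().strip()
    path.parent.mkdir(parents=True, exist_ok=True)
    # 32 random bytes, URL-safe base64 encoded (44 ASCII characters).
    key = base64.urlsafe_b64encode(os.urandom(32))
    # O_EXCL refuses to overwrite an existing file; 0o600 is owner read/write only.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(key)
    return key


# Demo: a throwaway directory stands in for the /data volume.
demo_path = Path(tempfile.mkdtemp()) / ".secrets" / "fernet.key"
first = load_or_create_key(demo_path)   # "first container start"
second = load_or_create_key(demo_path)  # "restart"
print(first == second)  # True
print(oct(stat.S_IMODE(demo_path.stat().st_mode)))
```

Because the second call reads the file instead of generating a new key, values encrypted before a restart remain decryptable afterwards, which is the session 55 failure mode this entry addresses.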
ADDED: Public-facing docs
ADDED: `docs/revamp/PLATFORM_ARCHITECTURE.md`, the vision document. Captures the pluggable backends foundation pattern: every external dep (storage / DB / secrets / auth / queue / compute / notifier) sits behind a `Protocol` with local + cloud impls; the choice is config, never `if AWS:` branches. Includes: the 7-slot matrix with current state, the proposed `Backends` container dataclass, AWS Terraform target architecture diagram, 5-phase rollout, and an explicit no-list. It is the anchor for every future session: "before you hardcode an external dep, read this."

CHANGED: `README.md` rewritten as the publishable face of the repo. Five-minute `docker compose up` quickstart, golden-path tour (Train → Register → Deploy → Predict), local-dev-without-docker instructions, security section explaining `check-secrets.sh`, troubleshooting table, contributor flow. The architecture sections were softened to honestly describe what `docker compose up` actually runs today (a single uvicorn process holding API + scheduler + compute + SQLite file inside the api container, the same shape as Plausible / Vaultwarden / n8n self-hosters) versus the target enterprise-scale split (separate api / worker / runtime / RDS / S3 / SQS) that lives in `PLATFORM_ARCHITECTURE.md` as Phase 1-3 work still ahead. Don't over-promise; ship what we have, grow in the open.
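To make the pluggable-backends pattern concrete, here is a minimal sketch of one slot (storage) behind a `Protocol`, with the implementation chosen from config. Every name in it (`Storage`, `InMemoryStorage`, `STORAGE_IMPLS`, `build_backends`, `save_model`) is hypothetical. The actual `Backends` dataclass proposed in the vision document may look quite different.

```python
from dataclasses import dataclass
from typing import Protocol


class Storage(Protocol):
    """One of the seven slots; the other six would follow the same shape."""

    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...


class InMemoryStorage:
    """Hypothetical local implementation; a cloud one would expose the same methods."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]


@dataclass(frozen=True)
class Backends:
    """Container handed to application code (illustrative, storage slot only)."""

    storage: Storage


# Config value -> implementation; adding a cloud backend means adding an entry here.
STORAGE_IMPLS = {"memory": InMemoryStorage}


def build_backends(config: dict[str, str]) -> Backends:
    return Backends(storage=STORAGE_IMPLS[config.get("storage", "memory")]())


def save_model(backends: Backends, name: str, blob: bytes) -> None:
    # Application code depends only on the Protocol, never on a provider name.
    backends.storage.put(f"models/{name}", blob)


backends = build_backends({"storage": "memory"})
save_model(backends, "demo", b"weights")
print(backends.storage.get("models/demo"))  # b'weights'
```

The point of the shape is that `save_model` never branches on a provider: swapping the backend is a config change plus one registry entry.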
ADDED: .gitattributes for cross-platform line endings
ADDED: `.gitattributes` at repo root. Forces LF line endings on `*.sh`, `Dockerfile*`, `*.yml`, `*.py`, etc. in the index, regardless of the contributor's OS. Prevents the classic Windows-cloned-repo failure where git auto-converts `docker-entrypoint.sh` to CRLF and the Linux container then errors with `/bin/sh^M: bad interpreter`. Validated locally: `file scripts/check-secrets.sh` reports `Unicode text, UTF-8 text executable` (no CRLF marker).
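The one-off `file` check above could be widened into a scan of a whole working tree. The sketch below is an assumption about how such a check might look and is not part of the repo. The helper names `has_crlf` and `crlf_offenders` are hypothetical, and the default patterns mirror only the ones named in the entry.

```python
import tempfile
from pathlib import Path


def has_crlf(path: Path) -> bool:
    """True if the file contains at least one Windows-style line ending."""
    return b"\r\n" in path.read_bytes()


def crlf_offenders(root: Path, patterns=("*.sh", "Dockerfile*", "*.yml", "*.py")) -> list[Path]:
    """Files under `root` that match one of the patterns and contain CRLF."""
    found = {
        path
        for pattern in patterns
        for path in root.rglob(pattern)
        if path.is_file() and has_crlf(path)
    }
    return sorted(found)


# Demo on a throwaway directory: one clean script, one CRLF-damaged script.
demo = Path(tempfile.mkdtemp())
(demo / "ok.sh").write_bytes(b"#!/bin/sh\necho ok\n")
(demo / "bad.sh").write_bytes(b"#!/bin/sh\r\necho broken\r\n")
print([p.name for p in crlf_offenders(demo)])  # ['bad.sh']
```

A check like this inspects the working tree, whereas `.gitattributes` governs what lands in the index, so the two are complementary.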
TESTS: Validation done before claiming done
TESTS: `bash scripts/check-secrets.sh` is clean across all 527 tracked files (the initial run flagged 4 false positives on doc-mentions of the literal string `ENC:v1:`, fixed by requiring ≥40 chars of base64 after the prefix in the pattern).

TESTS: `docker compose config` validates without error. The CORS env var renders as a properly-quoted JSON string after fixing the `${...:-default}` quoting trap.

TESTS: Not validated yet: an actual `docker compose up --build` from clean. The maintainer will run that as the smoke test in their own environment per the README quickstart.
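The false-positive fix in the first entry can be illustrated with a small sketch. The regular expressions here are assumptions written for illustration and are not the patterns used in `scripts/check-secrets.sh`. Only the ≥40-character threshold after `ENC:v1:`, the PEM private key check, and the `# pragma: allow-secret` escape hatch come from the log.

```python
import re

# Assumed pattern: the prefix must be followed by at least 40 base64-ish characters.
FERNET_BLOB = re.compile(r"ENC:v1:[A-Za-z0-9+/=_-]{40,}")
# Assumed pattern: any PEM private key header.
PEM_BLOCK = re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----")
PRAGMA = "# pragma: allow-secret"


def flagged_lines(text: str) -> list[int]:
    """Return 1-based numbers of lines that look like they contain a secret."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if PRAGMA in line:
            continue  # explicit single-line opt-out
        if FERNET_BLOB.search(line) or PEM_BLOCK.search(line):
            hits.append(number)
    return hits


sample = "\n".join([
    "Values are stored with the ENC:v1: prefix.",             # doc mention only
    "token = ENC:v1:" + "A" * 40,                             # ciphertext-shaped
    "fixture = ENC:v1:" + "B" * 48 + "  # pragma: allow-secret",
])
print(flagged_lines(sample))  # [2]
```

A bare mention of the prefix in documentation no longer matches because nothing ciphertext-shaped follows it, while a real blob still does.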