HermodLabs — Home Page

About the Role

Our platform is built on a hard rule: don’t learn on lies. We co-time streams, gate windows for validity, cancel nuisances, and only then let ML learn patterns in the residual. ML is not a standalone sandbox—it’s part of a contract-driven system that must be auditable end-to-end.

The ML Platform / MLOps Engineer makes this real by building the infrastructure that turns ML into a reliable product: reproducible pipelines, strict lineage, model versioning, safe deployments, and monitoring that tells us when a model should abstain, retrain, or be rolled back.

What You’ll Own

Reproducible ML pipelines: deterministic training + evaluation workflows that can be rerun from a receipt.
Dataset lineage + artifacts: dataset versioning/hashing, feature snapshots, artifact storage tying models to exact inputs.
Model registry + release process: versioning, promotion (dev → stage → prod), canary/rollback, and a clear answer to “which model is serving where?”
Inference plumbing: batch + near-real-time scoring, attaching receipts (model version, data version, validity gates) to predictions.
ML observability: monitoring for data drift, model drift, performance decay, calibration degradation, validity-uptime regressions.
Governance guardrails: enforce “train only on valid windows,” require evaluation receipts, block un-audited models from production.
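
To make "attaching receipts to predictions" concrete, here is a minimal sketch of what a scored prediction carrying its receipt could look like. The class names, field names, and the abstain-on-failed-gate behavior are illustrative assumptions, not our actual schema.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Sequence


@dataclass(frozen=True)
class PredictionReceipt:
    """Hypothetical receipt: what produced this prediction, on which data."""
    model_version: str
    data_version: str
    validity_gates: Dict[str, bool]  # gate name -> did the window pass?


@dataclass(frozen=True)
class ScoredPrediction:
    value: Optional[float]  # None when the scorer abstained
    abstained: bool
    receipt: PredictionReceipt


def score_with_receipt(
    predict: Callable[[Sequence[float]], float],
    features: Sequence[float],
    receipt: PredictionReceipt,
) -> ScoredPrediction:
    """Score only if every validity gate passed; otherwise abstain.

    Missing gate metadata (an empty dict) is treated as a failure, on the
    assumption that inference without validity metadata is not allowed.
    """
    gates = receipt.validity_gates
    if not gates or not all(gates.values()):
        return ScoredPrediction(value=None, abstained=True, receipt=receipt)
    return ScoredPrediction(
        value=float(predict(features)), abstained=False, receipt=receipt
    )
```

Either way the receipt travels with the result, so an abstention is as auditable as a prediction.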

What You’ll Do

Build a standard ML workflow: residual datasets → validity filters/weights → train + evaluate → package → register + deploy → monitor + retrain/rollback.
Implement receipt generation for ML: dataset snapshot hashes, preprocessing code version, hyperparameters, seeds, gate definitions, eval suite version.
Create CI/CD for ML: automated backtests, metric gates, reproducibility checks, scheduled retraining jobs (when appropriate).
Build drift + health monitoring: feature drift per site/zone, validity uptime changes, alert-quality proxies, confidence calibration checks.
Support portal integration: prediction APIs, model metadata endpoints (“why trust this?”), explanations + evidence artifacts.
Work with scientific ML and backend/data teams to keep schemas evolution-safe and pipelines robust under change.
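
As a rough illustration of receipt generation, the sketch below hashes a dataset snapshot and derives a deterministic receipt ID from the fields listed above. The function names, the JSON layout, and the choice of SHA-256 are assumptions made for the example; the real format is part of what this role will design.

```python
import hashlib
import json
from typing import Any, Dict


def sha256_hex(data: bytes) -> str:
    """Hex digest of raw bytes (used for both snapshots and receipts)."""
    return hashlib.sha256(data).hexdigest()


def build_receipt(
    dataset_bytes: bytes,
    code_version: str,
    hyperparameters: Dict[str, Any],
    seed: int,
    gate_definitions: Dict[str, Any],
    eval_suite_version: str,
) -> Dict[str, Any]:
    """Assemble a training receipt with a content-derived ID (hypothetical layout)."""
    body = {
        "dataset_sha256": sha256_hex(dataset_bytes),
        "preprocessing_code_version": code_version,
        "hyperparameters": hyperparameters,
        "seed": seed,
        "gate_definitions": gate_definitions,
        "eval_suite_version": eval_suite_version,
    }
    # Canonical JSON (sorted keys, fixed separators) so that the same
    # inputs always serialize, and therefore hash, identically.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return {**body, "receipt_id": sha256_hex(canonical.encode("utf-8"))}
```

Because the ID is derived from the content, two runs with identical inputs share a receipt ID, and any change to the data, seed, or gate definitions produces a different one.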

Concrete Deliverables

A working ML pipeline skeleton: one workflow from dataset → model → evaluation report → registered artifact.
A model registry + promotion flow with staged rollouts and rollback support.
An inference scorer/service that writes predictions + receipts to the backend store.
A monitoring dashboard: data drift, model drift, calibration health, retraining triggers, validity-uptime regressions.
A policy gate that enforces: no training/inference without validity metadata, no deploy without an evaluation receipt.
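
The policy gate in the last deliverable could start as something as small as the sketch below. The stage names, metadata keys, and the choice of PermissionError are hypothetical; the point is only that each rule is an explicit, testable check rather than a convention.

```python
from typing import Dict, List, Optional

# Hypothetical mapping from pipeline stage to the metadata it requires.
REQUIRED_METADATA: Dict[str, List[str]] = {
    "train": ["validity_metadata"],
    "inference": ["validity_metadata"],
    "deploy": ["evaluation_receipt_id"],
}


def policy_violations(stage: str, metadata: Dict[str, Optional[str]]) -> List[str]:
    """Return a human-readable message for every missing requirement."""
    if stage not in REQUIRED_METADATA:
        raise ValueError(f"unknown stage: {stage}")
    return [
        f"{stage}: missing {key}"
        for key in REQUIRED_METADATA[stage]
        if not metadata.get(key)  # absent, None, or empty all count as missing
    ]


def enforce(stage: str, metadata: Dict[str, Optional[str]]) -> None:
    """Block the stage by raising if any policy requirement is unmet."""
    violations = policy_violations(stage, metadata)
    if violations:
        raise PermissionError("; ".join(violations))
```

A CI job or deployment hook would call `enforce("deploy", ...)` before promotion, so an un-audited model fails loudly instead of shipping quietly.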

Required Qualifications

Strong experience in MLOps / ML platform engineering: training pipelines, model packaging, deployment, monitoring.
Experience with experiment tracking and artifact management (MLflow, W&B, SageMaker/Vertex, or self-hosted equivalents).
Solid backend/data engineering skills: APIs, queues, storage patterns, schema evolution, observability.
Comfort with cloud infrastructure and CI/CD: containers, orchestration, secrets management, reliable automation.
Ability to reason about reproducibility and leakage risks in time-series ML.

Preferred Qualifications

Experience with time-series ML in production (drift is the default state).
Familiarity with governed ML: model cards, evaluation gates, audit trails, compliance-friendly logs.
Experience with multi-tenant SaaS constraints and cost control (per-customer models vs global models).
Comfort with probabilistic outputs and calibration monitoring.
DSP/measurement intuition, which helps with validity-uptime and abstention semantics (not required).

How You’ll Be Measured (First 60–90 Days)

A training run is fully reproducible from a receipt (another engineer can rerun and get the same artifact).
The first pilot ML feature can be deployed with versioned artifacts, attached prediction receipts, rollback capability, and monitoring.
Drift/health dashboards exist and catch at least one meaningful issue (data shift, validity collapse, calibration drift).
ML releases become safer and faster because pipeline gates prevent silent regressions.

Working Style

You treat reproducibility as a product feature.
You prefer boring, deterministic pipelines over clever notebooks.
You build guardrails that let the team move fast without breaking trust.

Title & Level

ML Platform / MLOps Engineer (Receipts + Model Governance). Senior IC; can scale to Staff, owning the ML platform architecture. Partners with Scientific ML, backend/data, validation, and product/UI.

Apply

Send a short note and your resume.