About the Role
We’re building structured ML on top of a contract-driven signal stack: co-timing → validity gating → nuisance
cancellation → residual dynamics learning. This role lives in the middle: make the models train well.
You own the optimization and training craft needed to turn promising dynamics models into stable, high-performing
systems: hyperparameter search, training diagnostics, ablations, compute efficiency, and evidence-based answers
to “why didn’t it converge?”
This is not “try 1,000 random configs.” It’s disciplined experimentation with controlled comparisons and tight reporting.
What You’ll Own
- Hyperparameter optimization: systematic searches (Bayes opt / bandits / PBT where appropriate) with reproducible configs.
- Training stability: diagnose divergence, pathological gradients, stiffness, numerical instability, and non-identifiability.
- Ablation discipline: prove which architectural/feature choices matter via controlled ablations.
- Compute efficiency: profiling, batching, mixed precision/compile modes; keep training and inference costs bounded.
- Model selection criteria: ship-ready standards beyond loss curves, including calibration, robustness across regimes, abstention behavior, and failure modes.
What You’ll Do
- Build stable training recipes for structured dynamics models (and strong baselines) on residual datasets.
- Develop training diagnostics: gradient norms, loss decomposition, sensitivity to window length/sampling rate/validity weighting.
- Run hyperparameter studies with receipts (seeds, dataset hashes, gate definitions, code versions) and interpret results.
- Stress-test generalization under regime shifts and across sites/zones; handle validity collapse scenarios where usable data shrinks.
- Collaborate with Scientific ML, Applied Statistics, and MLOps on targets, evaluation, and pipeline integration.
Concrete Deliverables
- A tuning and training framework (v1) integrated with receipts (configs, seeds, datasets, metrics).
- Stable training baselines for residual dynamics (AR/VAR/state-space) plus at least one structured model that converges reliably.
- An ablation report template and first ablation suite (which state features matter, which regularizers help, what’s brittle).
- A compute profile + optimization plan with bottlenecks identified and mitigations implemented.
- A model selection rubric tied to pilot value (lead time, false alarms, calibration, robustness thresholds).
Required Qualifications
- Strong experience training ML models with nontrivial optimization challenges (time-series, dynamical systems, or physics-informed models).
- Demonstrated skill in hyperparameter optimization and experiment design (interpreting sweeps, not just running them).
- Strong Python + PyTorch/JAX (or equivalent) proficiency and ability to write clean, testable training code.
- Practical numerical instincts: stability, step sizes, stiffness, normalization, failure-mode debugging.
Preferred Qualifications
- Experience with Neural ODEs / continuous-depth models, stiff solvers, adjoint methods, or related numerical methods.
- Familiarity with structured dynamics inductive biases (energy/Lagrangian/Hamiltonian styles) where practical.
- Experience operating tuning infrastructure at scale (distributed training, scheduling, spot instances).
- Experience with robustness evaluation and uncertainty calibration under nonstationarity.
How You’ll Be Measured (First 60–90 Days)
- A structured model (or closest practical equivalent) trains reliably on at least one pilot dataset and beats baseline on a meaningful metric.
- Hyperparameter studies become reproducible and interpretable (not “we tried a bunch of stuff”).
- Training failures are diagnosable: you can explain why a run diverged and what fixes it.
- Compute costs come under control (clear profiles, faster iterations, fewer wasted sweeps).
Working Style
- You prefer controlled experiments over “try everything.”
- You treat convergence as an engineering problem with instrumentation and receipts.
- You like turning training folklore into repeatable playbooks.
Title & Level
ML Research Engineer (Optimization & Training). Senior IC; can scale to Staff if owning the experimentation/tuning platform.
Partners with Scientific ML, Applied Statistics, and MLOps.
Apply
Send a short note and your resume.