Tailored résumé

For Stripe — Senior Platform Engineer

Generated for linear.app/careers/staff-eng-platform · this résumé won't show up for any other role

Daniel Bodnar

Building reliable platforms for high-throughput services.
San Francisco, CA
dan@bodnar.sh
roles.sh/u/dan
github.com/danielbodnar

Summary

Staff-level platform engineer with a decade of building high-throughput Go services and the Kubernetes substrate they run on. At HashiCorp, owned the multi-region K8s platform for Terraform Cloud (99.97% SLO, 5 regions). At Cloudflare, took Workers to GA at 300M req/s on the edge and built the bot-management control plane that classifies ~2T requests/day. I like infrastructure that's boring on purpose.

Experience

Staff Platform Engineer

HashiCorp · Terraform Cloud
Mar 2022 — Present
San Francisco
  • Owned the multi-region Kubernetes platform backing Terraform Cloud — 14 clusters across 5 AWS regions, sustained 99.97% SLO over 11 trailing quarters.
  • Designed and shipped the internal platform layer (Go + custom controllers) that 80+ application teams build against, reducing average onboarding time from 3 weeks to 4 days.
  • Led the migration off Consul-as-source-of-truth to a custom CRD-based service catalog; cut cross-cluster lookup p99 from 240ms → 18ms.
  • Started chaos-friday, the team's chaos-engineering practice; MTTR dropped 38% over the year we ran it.

Senior Site Reliability Engineer

Cloudflare · Workers & Bot Management
Jun 2019 — Mar 2022
San Francisco
  • Took Cloudflare Workers from late beta to GA on the SRE side; sustained 300M req/s of edge compute across 270 PoPs with sub-millisecond cold starts.
  • Built the bot-management control plane (Rust + V8 isolates) that classifies ~2T requests/day; tuned the model-serving path to keep p99 under 8ms at edge.
  • On-call for 4 incidents with potential customer impact; 3 were resolved without external-facing impact thanks to fast-path circuit breakers we'd added the quarter prior.
  • Authored polite-bot, a public spec for HTTP bot identification; adopted by 6 companies and now on the IETF RFC track.

Linux Solutions Engineer

Red Hat · OpenShift Field
Aug 2017 — Jun 2019
Remote
  • Field engineer for OpenShift enterprise rollouts — embedded with 8 Fortune-100 financial and government customers; learned what "production" actually means.
  • Wrote the platform-hardening playbook still used by the field team (Ansible + CIS benchmarks), reducing average customer-side install time from 6 weeks to 9 days.

DevOps Engineer

Stitch Fix · Data Platform
Feb 2015 — Aug 2017
San Francisco
  • First infrastructure hire on the data platform team. Built and operated Spark/Airflow infra for ~120 analysts and ML engineers.

Skills

Languages
Go, Rust, Python, TypeScript, Bash
Orchestration
Kubernetes (CKAD, CKS), Nomad, custom CRD controllers
Cloud
AWS (EKS, IAM, VPC, KMS), GCP, bare-metal Linux at scale
Infra-as-Code
Terraform, Pulumi, Crossplane, Ansible
Observability
Prometheus, Grafana, OpenTelemetry, Honeycomb, Tempo
Distributed systems
Consensus, leader election, leaderless replication, queues, exactly-once semantics

Education & Notes

  • B.S. Computer Science · University of Waterloo · 2014
  • Open-source: maintain kontain (4.2k stars), publish linuxlife.sh (18k subscribers).
  • Speaker at SREcon '22, KubeCon EU '23 ("The platform that fits in your head").