
DORA Metrics in Practice: Instrument Your Pipeline and Actually Move the Numbers 


The short answer: Most teams track DORA metrics wrong because they report numbers without instrumenting the right data sources. This guide covers exactly what to instrument in your CI/CD pipeline, how to pull MTTR from PagerDuty, and includes a Grafana dashboard JSON you can deploy today. This is a Platform Engineering implementation guide, not another definition post.

Why Your DORA Numbers Are Lying to You 

If your DORA metrics live in a spreadsheet or someone manually updates them in a weekly standup, they’re already wrong. Real DORA measurement requires automated instrumentation at four specific points. Here’s what actually matters and where to wire it in. 

The Four Metrics and What to Actually Instrument 

1. Deployment Frequency 

What most teams do: Count merged PRs. What you should do: Count successful production deployments. 

In GitHub Actions, emit a deployment event only when your production job completes with success status: 

yaml 
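- name: Track Deployment
  if: success()
  run: |
    # Emit one event per successful production deployment.
    # The payload is double-quoted so the shell expands $(date ...) inline.
    curl -X POST "$METRICS_ENDPOINT" \
      -H "Content-Type: application/json" \
      -d "{\"event\":\"deployment\",\"status\":\"success\",\"timestamp\":\"$(date -u +%Y-%m-%dT%H:%M:%SZ)\"}"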

Push this to Prometheus using a Pushgateway. Label by service and environment, not just team, so you can drill down in Grafana.
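One way to get a deployment event into Pushgateway, which takes the Prometheus text exposition format with grouping labels encoded in the URL path, is sketched below using only the Python standard library. The metric name `deployment_last_success_timestamp_seconds`, the job name, and the gateway address are illustrative assumptions, not part of any standard.

```python
import time
import urllib.request
from urllib.parse import quote

# Hypothetical metric name chosen for this sketch.
METRIC = "deployment_last_success_timestamp_seconds"


def pushgateway_url(base_url, job, service, environment):
    """Build the grouping-key URL: /metrics/job/<job>/<label>/<value>/..."""
    parts = ["metrics", "job", job, "service", service, "environment", environment]
    return base_url.rstrip("/") + "/" + "/".join(quote(p, safe="") for p in parts)


def exposition_body(timestamp):
    """Render one gauge sample in Prometheus text exposition format."""
    return (
        f"# TYPE {METRIC} gauge\n"
        f"{METRIC} {int(timestamp)}\n"  # the body must end with a newline
    )


def push_deployment(base_url, job, service, environment, timestamp=None):
    """PUT the sample to Pushgateway, replacing this group's previous value."""
    ts = time.time() if timestamp is None else timestamp
    request = urllib.request.Request(
        pushgateway_url(base_url, job, service, environment),
        data=exposition_body(ts).encode("utf-8"),
        method="PUT",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


if __name__ == "__main__":
    # Assumed gateway address; replace with your own.
    print(pushgateway_url("http://pushgateway:9091", "deployments", "checkout", "production"))
    print(exposition_body(time.time()), end="")
```

The sketch pushes a gauge holding the last successful deploy time rather than a counter, because Pushgateway keeps only the most recently pushed value per group. Counting how often that gauge changes over a time range is one way to approximate deployment frequency in a dashboard query.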

2. Lead Time for Changes 

What to measure: Time from first commit on a branch to that commit running in production. 

  • Pull commit.author.date for the first commit on the branch from the GitHub API when the PR is opened
  • Record deployment.completed_at from your pipeline 
  • Delta = Lead Time 

Store both timestamps in a time-series database. A common mistake is measuring from PR merge, which hides slow review cycles, exactly the bottleneck you need to see. 
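The delta itself is simple arithmetic once both timestamps are available. A minimal sketch, assuming both values arrive as ISO 8601 strings in UTC; the function names are hypothetical, and summarizing with the median is a choice made here to keep one slow change from skewing the figure.

```python
from datetime import datetime, timezone
from statistics import median


def parse_utc(timestamp):
    """Parse an ISO 8601 string such as 2024-03-01T09:00:00Z into an aware datetime."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00")).astimezone(timezone.utc)


def lead_time_hours(first_commit_at, deployed_at):
    """Hours between the first commit on the branch and the production deploy."""
    delta = parse_utc(deployed_at) - parse_utc(first_commit_at)
    if delta.total_seconds() < 0:
        raise ValueError("deployment precedes first commit; check the timestamps")
    return delta.total_seconds() / 3600


def median_lead_time_hours(changes):
    """Median lead time over (first_commit_at, deployed_at) pairs."""
    return median(lead_time_hours(commit, deploy) for commit, deploy in changes)


if __name__ == "__main__":
    changes = [
        ("2024-03-01T09:00:00Z", "2024-03-02T15:30:00Z"),
        ("2024-03-04T10:00:00Z", "2024-03-04T12:00:00Z"),
        ("2024-03-05T08:00:00Z", "2024-03-08T08:00:00Z"),
    ]
    print(median_lead_time_hours(changes))
```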

3. Change Failure Rate 

Formula: Failed deployments ÷ Total deployments 

Flag a deployment as failed when: 

  • A rollback is triggered within 1 hour of deploy 
  • A PagerDuty incident is created and linked to the deployment window 

Link PagerDuty incidents to deployments using deployment markers. If an incident opens within your deployment window (configurable, start with 60 minutes), mark that deployment as a failure automatically. 

python 

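from datetime import timedelta

def is_change_failure(deploy_time, incidents):
    """Return True if any incident opened within 60 minutes after the deploy."""
    # created_at values must be datetime objects comparable with deploy_time
    window = timedelta(minutes=60)
    return any(
        deploy_time <= i['created_at'] <= deploy_time + window
        for i in incidents
    )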
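To turn per-deployment flags into the rate from the formula, the same window check can be applied across every deployment in a period. A rough sketch under the assumption that deployments and incidents are already available as lists of datetime values; `change_failure_rate` is a hypothetical helper, and it covers only the incident condition, not rollbacks.

```python
from datetime import datetime, timedelta


def change_failure_rate(deploy_times, incident_times, window_minutes=60):
    """Fraction of deployments followed by an incident within the window."""
    if not deploy_times:
        return 0.0  # no deployments in the period, so nothing can have failed
    window = timedelta(minutes=window_minutes)
    failed = sum(
        1
        for deployed_at in deploy_times
        if any(deployed_at <= opened_at <= deployed_at + window for opened_at in incident_times)
    )
    return failed / len(deploy_times)


if __name__ == "__main__":
    deploys = [
        datetime(2024, 1, 1, 10, 0),
        datetime(2024, 1, 1, 14, 0),
        datetime(2024, 1, 2, 9, 0),
        datetime(2024, 1, 3, 16, 0),
    ]
    # One incident, opened 20 minutes after the second deploy.
    incidents = [datetime(2024, 1, 1, 14, 20)]
    print(change_failure_rate(deploys, incidents))
```

A deployment counts once no matter how many incidents fall in its window, which keeps the rate between 0 and 1.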

4. MTTR — Pull This From PagerDuty, Not From Memory 

MTTR (Mean Time to Restore) is where most teams have the worst data quality. The fix: use PagerDuty’s API directly. 

Steps to automate MTTR tracking: 

  1. Connect PagerDuty to your metrics pipeline via webhook or scheduled API pull 
  2. For each resolved incident, calculate: resolved_at - created_at
  3. Filter to P1/P2 incidents only (noise from P3/P4 distorts your elite vs high performer classification)
  4. Push to Prometheus with service labels

bash

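# Pull resolved incidents from PagerDuty API
# -g stops curl from treating the [] in the query string as a glob pattern
curl -g -H "Authorization: Token token=$PD_API_KEY" \
  -H "Accept: application/vnd.pagerduty+json;version=2" \
  "https://api.pagerduty.com/incidents?statuses[]=resolved&since=2024-01-01"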

Map the created_at to resolved_at gap per incident. Average this weekly, not monthly, because monthly averages hide regression patterns.
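The weekly averaging can be sketched as follows. This assumes each incident is a dict carrying `created_at` and `resolved_at` as ISO 8601 strings, matching the field names used in this guide; check them against the payload your incident tool actually returns. `weekly_mttr_minutes` is a hypothetical helper that groups incidents by the ISO week in which they were resolved.

```python
from collections import defaultdict
from datetime import datetime


def parse_iso(timestamp):
    """Parse an ISO 8601 string such as 2024-01-02T10:00:00Z."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def weekly_mttr_minutes(incidents):
    """Mean minutes to restore, keyed by (ISO year, ISO week) of resolution."""
    durations = defaultdict(list)
    for incident in incidents:
        created = parse_iso(incident["created_at"])
        resolved = parse_iso(incident["resolved_at"])
        iso = resolved.isocalendar()
        durations[(iso[0], iso[1])].append((resolved - created).total_seconds() / 60)
    return {week: sum(values) / len(values) for week, values in sorted(durations.items())}


if __name__ == "__main__":
    incidents = [
        {"created_at": "2024-01-02T10:00:00Z", "resolved_at": "2024-01-02T10:30:00Z"},
        {"created_at": "2024-01-03T08:00:00Z", "resolved_at": "2024-01-03T09:30:00Z"},
        {"created_at": "2024-01-09T12:00:00Z", "resolved_at": "2024-01-09T12:20:00Z"},
    ]
    for week, minutes in weekly_mttr_minutes(incidents).items():
        print(week, minutes)
```

Apply the severity and production filters before calling this, so low-priority incidents never enter the average.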

What to Fix First (Priority Order) 

  1. Instrument deployment events: everything downstream depends on this being accurate
  2. Wire PagerDuty MTTR: highest ROI for leadership visibility
  3. Add lead time tracking: exposes review bottlenecks most teams ignore
  4. Calculate CFR last: needs both deployment and incident data to be clean first

Common Mistakes That Kill Your Data Quality 

  • Counting PR merges as deployments: only production deployments count
  • Including all PagerDuty incidents in MTTR: filter to production, filter to severity
  • Measuring monthly averages: use weekly; monthly hides regressions
  • No deployment markers in your APM/incident tools: without these, you can’t link incidents to specific deploys
  • Manual data entry anywhere in the chain: automate or the data becomes political, not factual

The Outcome You’re Actually After 

DORA metrics are not a reporting exercise. They’re a feedback loop. When your pipeline emits deployment events automatically, when PagerDuty MTTR flows into Grafana without human intervention, and when your Grafana dashboard shows real-time state, you stop debating whether you’re improving and start seeing exactly where the constraint is. 

That’s the difference between tracking DORA and using DORA. 

Need help instrumenting your CI/CD pipeline and building a Platform Engineering practice that actually moves these numbers? See how 200OK Solutions approaches Platform Engineering → 

FAQ 

Q. How often should I review DORA metrics?  

A. Weekly at the team level, monthly at the leadership level. Weekly cadence surfaces regressions before they compound. 

Q. Which DORA metric should I fix first?  

A. Deployment frequency. It’s the leading indicator: low deployment frequency almost always causes poor scores across the other three.

Q. Can I track DORA metrics without PagerDuty?  

A. Yes. Any incident management tool with an API works (Opsgenie, VictorOps, even Slack-based on-call workflows). The logic is the same: capture incident_start and incident_resolved timestamps automatically. 

Q. What’s a realistic MTTR target for a team starting out?  

A. Under 24 hours is “high performer” by DORA standards. Start there before chasing the elite threshold of under 1 hour. 

Q. Does DORA apply to non-SaaS products?  

A. Yes, with adjustments. Deployment frequency maps to release cadence. The instrumentation approach is the same; the thresholds for “elite” may differ based on your deployment model. 


Piyush Solanki

PHP Tech Lead & Backend Architect


Piyush Solanki is a seasoned PHP Tech Lead with 10+ years of experience architecting and delivering scalable web and mobile backend solutions for global brands and fast-growing SMEs.

He specializes in PHP, MySQL, CodeIgniter, WordPress, and custom API development, helping businesses modernize legacy systems and launch secure, high-performance digital products.

He collaborates closely with mobile teams building Android & iOS apps, developing RESTful APIs, cloud integrations, and secure payment systems. With extensive experience in the UK market and across multiple sectors, Piyush Solanki is passionate about helping SMEs scale technology teams and accelerate innovation through backend excellence.
