MSP Service Level Management: Define, Measure, and Enforce SLAs
Your MSP says they provide "24/7 support." What does that actually mean? Does it mean someone answers the phone 24/7, or that issues are resolved 24/7? Does it mean response within 15 minutes, or within 15 hours? Without clear definitions, "24/7 support" is a marketing claim, not a commitment.
Service level management is the discipline of defining what you will receive, measuring whether you are receiving it, and enforcing the agreement when you are not. It transforms vague promises into measurable commitments with real consequences.
The Anatomy of an SLA
Priority Levels
SLAs are structured around priority levels that define the urgency and impact of issues:
| Priority | Description | Example |
|---|---|---|
| P1 - Critical | Business-critical system down, no workaround | Server failure affecting all users |
| P2 - High | Major system impaired, limited workaround | Email service degraded |
| P3 - Medium | System impaired, workaround available | Printer not working |
| P4 - Low | Minor issue, no immediate impact | Software update request |
Important: Priority definitions must be agreed between you and the MSP. Do not accept the MSP's default definitions — they may not match your business needs.
Response and Resolution Times
For each priority level, define two timeframes:
Response time: Time from ticket creation to first meaningful action by the MSP. This does not mean resolution — it means someone has acknowledged the issue, assessed its impact, and begun working on it.
Resolution time: Time from ticket creation to the issue being fully resolved and the user confirming satisfaction.
Typical Australian MSP benchmarks:
| Priority | Response Time | Resolution Time |
|---|---|---|
| P1 | 15 minutes | 4 hours |
| P2 | 30 minutes | 8 hours |
| P3 | 4 hours | 24 hours |
| P4 | 8 hours | 72 hours |
Note: These are benchmarks, not standards. Negotiate based on your business needs. A trading floor has different requirements than an accounting firm.
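The benchmark table can be turned into a simple per-ticket check. The following is a minimal sketch, not a real ticketing-system integration: the `Ticket` class, its field names, and `check_ticket` are hypothetical, and it assumes elapsed time is counted around the clock rather than in business hours. Substitute your negotiated targets for the benchmark values.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# (response target, resolution target) per priority, copied from the
# benchmark table above. Replace with the values in your own contract.
SLA_TARGETS = {
    "P1": (timedelta(minutes=15), timedelta(hours=4)),
    "P2": (timedelta(minutes=30), timedelta(hours=8)),
    "P3": (timedelta(hours=4), timedelta(hours=24)),
    "P4": (timedelta(hours=8), timedelta(hours=72)),
}


@dataclass
class Ticket:
    """Hypothetical ticket record; field names are illustrative only."""
    priority: str
    created: datetime
    first_response: datetime
    resolved: datetime


def check_ticket(ticket: Ticket) -> dict:
    """Report whether a ticket met its response and resolution targets.

    Both clocks start at ticket creation, as defined above.
    Assumption: wall-clock (24/7) time, with no business-hours calendar.
    """
    response_target, resolution_target = SLA_TARGETS[ticket.priority]
    return {
        "response_met": ticket.first_response - ticket.created <= response_target,
        "resolution_met": ticket.resolved - ticket.created <= resolution_target,
    }


created = datetime(2024, 3, 4, 9, 0)
ticket = Ticket(
    priority="P1",
    created=created,
    first_response=created + timedelta(minutes=12),
    resolved=created + timedelta(hours=5),
)
result = check_ticket(ticket)
print(result)  # {'response_met': True, 'resolution_met': False}
```

In this example the P1 ticket was acknowledged within 15 minutes but took 5 hours to resolve, so it meets the response target and breaches the resolution target.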
Uptime Guarantees
Uptime SLAs define the availability of critical systems:
- 99.9% uptime = 8.76 hours downtime per year = 43.8 minutes per month
- 99.95% uptime = 4.38 hours downtime per year = 21.9 minutes per month
- 99.99% uptime = 52.6 minutes downtime per year = 4.38 minutes per month
Measurement methodology matters:
- What constitutes "downtime"?
- Is planned maintenance excluded?
- How is uptime measured (synthetic monitoring, real user monitoring)?
- What time zone is used for measurement?
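The downtime figures above follow from simple arithmetic, and the same arithmetic shows why the planned-maintenance question matters. The sketch below assumes a 365-day year and an average month of 730 hours. The function names and the outage figures are hypothetical, and the way planned maintenance is excluded here (removed from both the downtime and the measured period) is only one possible convention. Your contract should state which one applies.

```python
HOURS_PER_YEAR = 365 * 24              # 8,760 hours, ignoring leap years
HOURS_PER_MONTH = HOURS_PER_YEAR / 12  # 730 hours in an average month


def downtime_budget_minutes(uptime_percent: float, period_hours: float) -> float:
    """Minutes of downtime allowed in a period at a given uptime target."""
    return (1 - uptime_percent / 100) * period_hours * 60


def measured_uptime_percent(period_minutes: float, unplanned_down: float,
                            planned_down: float, exclude_planned: bool) -> float:
    """Uptime under one possible convention for planned maintenance.

    If excluded, maintenance windows are removed from both the downtime
    and the measured period. Otherwise they count as ordinary downtime.
    """
    if exclude_planned:
        period = period_minutes - planned_down
        down = unplanned_down
    else:
        period = period_minutes
        down = unplanned_down + planned_down
    return 100 * (period - down) / period


for target in (99.9, 99.95, 99.99):
    yearly = downtime_budget_minutes(target, HOURS_PER_YEAR)
    monthly = downtime_budget_minutes(target, HOURS_PER_MONTH)
    print(f"{target}% uptime: {yearly:.1f} min/year, {monthly:.2f} min/month")

# Hypothetical month: 30 minutes of unplanned outage, 120 minutes of maintenance.
MONTH_MINUTES = HOURS_PER_MONTH * 60
with_exclusion = measured_uptime_percent(MONTH_MINUTES, 30, 120, exclude_planned=True)
without_exclusion = measured_uptime_percent(MONTH_MINUTES, 30, 120, exclude_planned=False)
print(f"Maintenance excluded: {with_exclusion:.3f}%")
print(f"Maintenance counted:  {without_exclusion:.3f}%")
```

With these illustrative numbers, the same month passes a 99.9% target when maintenance is excluded and fails it when maintenance is counted, which is why the methodology needs to be written into the agreement.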
Service Credits
Service credits are financial penalties for SLA breaches:
Structure:
- Percentage of monthly fees credited per breach
- Escalating credits for repeated breaches
- Cap on total credits (typically 50-100% of monthly fees)
Example structure:
- P1 breach: 10% of monthly fees per incident
- P2 breach: 5% of monthly fees per incident
- P3 breach: 2% of monthly fees per incident
- Uptime below 99.9%: 5% of monthly fees per 0.1% below target
Important: Service credits should be meaningful enough to incentivise performance. For the breaches that matter most to your business, credits of 1-2% of monthly fees rarely create accountability; credits of 5-10% generally do.
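To see how the example structure plays out on an invoice, the calculation can be sketched as follows. The rates come from the example structure above. The 50% cap is an assumption taken from the lower end of the typical range, and counting a partial 0.1% shortfall as a full step is also an assumption, since the example does not say how partial steps are treated. The function name and the monthly fee are hypothetical.

```python
import math

# Per-incident credit rates from the example structure above.
CREDIT_PER_BREACH = {"P1": 0.10, "P2": 0.05, "P3": 0.02}
UPTIME_TARGET = 99.9
UPTIME_CREDIT_PER_STEP = 0.05  # 5% per 0.1 percentage point below target
CREDIT_CAP = 0.50              # assumption: cap at 50% of monthly fees


def service_credit(monthly_fee: float, breaches: dict, uptime_percent: float) -> float:
    """Total credit owed for one month, limited by the cap.

    breaches maps a priority ("P1", "P2", ...) to its breach count.
    Assumption: a partial 0.1% uptime shortfall counts as a full step.
    """
    rate = sum(CREDIT_PER_BREACH.get(p, 0.0) * n for p, n in breaches.items())
    shortfall = UPTIME_TARGET - uptime_percent
    if shortfall > 0:
        # round() guards against floating-point noise before taking the ceiling
        steps = math.ceil(round(shortfall / 0.1, 6))
        rate += steps * UPTIME_CREDIT_PER_STEP
    return monthly_fee * min(rate, CREDIT_CAP)


# Hypothetical month: one P1 breach, two P2 breaches, 99.75% measured uptime.
credit = service_credit(10_000, {"P1": 1, "P2": 2}, 99.75)
print(f"Credit owed: ${credit:,.2f}")

# Six P1 breaches would total 60%, so the cap limits the credit to 50%.
capped = service_credit(10_000, {"P1": 6}, 99.95)
print(f"Capped credit: ${capped:,.2f}")
```

Escalating credits for repeated breaches are left out of this sketch because the example structure does not specify how the escalation works.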
Measuring SLA Performance
Data Collection
Your MSP should collect data through:
- Ticketing system — automated timestamping of response and resolution
- Monitoring tools — uptime and availability measurement
- Customer surveys — satisfaction measurement post-ticket
- Reporting tools — automated SLA calculation and reporting
Monthly Reporting
Your MSP should provide monthly SLA reports including:
- SLA compliance summary — percentage of tickets meeting each SLA
- Breach details — specific incidents that breached SLAs
- Trend analysis — performance over time
- Root cause analysis — why SLAs were breached
- Remediation actions — what is being done to improve
What to Look For in Reports
Positive indicators:
- Consistent SLA compliance (>95%)
- Transparent breach reporting
- Trending improvement over time
- Proactive root cause analysis
Red flags:
- SLA compliance reported as "overall" without per-priority breakdown
- Breaches explained away without remediation
- Inconsistent measurement methodology
- Reporting that is difficult to understand or verify
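The first red flag is easiest to understand with numbers. The sketch below uses invented ticket data to show how a high volume of low-priority tickets can produce a healthy "overall" figure while P1 performance is poor. The `compliance_by_priority` function and the `(priority, met_sla)` input format are hypothetical, not the output of any particular reporting tool.

```python
from collections import defaultdict


def compliance_by_priority(tickets):
    """Percentage of tickets that met their SLA, grouped by priority.

    tickets is an iterable of (priority, met_sla) pairs.
    """
    totals = defaultdict(int)
    met = defaultdict(int)
    for priority, met_sla in tickets:
        totals[priority] += 1
        met[priority] += int(met_sla)
    return {p: 100 * met[p] / totals[p] for p in sorted(totals)}


# Invented month: 4 P1 tickets (2 breached) and 96 P4 tickets (all met).
tickets = [("P1", True)] * 2 + [("P1", False)] * 2 + [("P4", True)] * 96

overall = 100 * sum(ok for _, ok in tickets) / len(tickets)
per_priority = compliance_by_priority(tickets)

print(f"Overall compliance: {overall}%")  # Overall compliance: 98.0%
print(per_priority)                       # {'P1': 50.0, 'P4': 100.0}
```

An overall figure of 98% looks comfortably above the 95% mark, yet half of the business-critical incidents breached their SLA. Asking for the per-priority breakdown exposes this.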
Enforcing SLAs
The Performance Conversation
When SLAs are breached, follow this framework:
- Review the data. What SLAs were breached, how often, and what was the impact?
- Understand the cause. What does the MSP say caused the breaches?
- Evaluate the response. What has the MSP done to prevent recurrence?
- Apply consequences. Invoke service credits as contractually agreed.
- Set improvement targets. Define expected performance for the next period.
- Document everything. Keep written records of breaches, discussions, and commitments.
When SLAs Are Consistently Missed
If the MSP consistently fails to meet SLAs:
- Month 1-2: Discuss trends, request root cause analysis and remediation plan
- Month 3-4: Formal escalation, invoke service credits, require executive involvement
- Month 5-6: Contract review, consider whether the relationship is viable
- Month 7+: Begin evaluating alternative providers
Escalation Procedures
Your SLA should define escalation paths:
| Escalation Level | Trigger | Who Is Notified | Timeframe |
|---|---|---|---|
| Level 1 | SLA approaching breach | Operations manager | Before breach occurs |
| Level 2 | SLA breached | Account manager + operations director | Within 1 hour of breach |
| Level 3 | Multiple breaches | MSP executive + your leadership | Within 24 hours |
| Level 4 | Systemic failure | Contract owner and legal counsel (contract review) | Within 7 days |
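An escalation table like this only works if its triggers are defined precisely enough to automate. The sketch below shows one way to map ticket state to an escalation level. The thresholds are assumptions, because the table does not define "approaching breach" or "multiple breaches": here they are taken to mean 80% of the SLA target elapsed and three or more breaches in the reporting period. "Systemic failure" is treated as a manual judgement passed in as a flag.

```python
from typing import Optional

WARN_FRACTION = 0.8        # assumption: "approaching breach" = 80% of target elapsed
MULTIPLE_BREACH_COUNT = 3  # assumption: "multiple breaches" = 3 or more in the period


def escalation_level(elapsed_fraction: float, breaches_in_period: int,
                     systemic: bool = False) -> Optional[int]:
    """Return the highest escalation level triggered, or None.

    elapsed_fraction is time elapsed divided by the SLA target
    (1.0 or more means the ticket has breached).
    breaches_in_period counts breaches recorded so far in the period.
    systemic is a human judgement, not something this sketch infers.
    """
    if systemic:
        return 4
    if breaches_in_period >= MULTIPLE_BREACH_COUNT:
        return 3
    if elapsed_fraction >= 1.0:
        return 2
    if elapsed_fraction >= WARN_FRACTION:
        return 1
    return None


print(escalation_level(0.5, 0))   # None
print(escalation_level(0.85, 0))  # 1
print(escalation_level(1.2, 1))   # 2
print(escalation_level(1.2, 3))   # 3
```

Whatever thresholds you agree on, writing them into the SLA removes any argument later about whether an escalation should have been triggered.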
Building Effective SLAs
Common SLA Mistakes
Vague language. "Reasonable efforts," "best endeavours," and "commercially reasonable" have no measurable standard. Avoid them.
Misaligned priorities. If P1 is defined as "all users affected," you will have many P1 tickets that are actually P2. Define priorities based on business impact, not technical scope.
Ignoring measurement methodology. If the MSP measures uptime differently than agreed, SLA data is meaningless.
No consequences. SLAs without service credits are aspirations, not commitments.
One-way SLAs. SLAs should also cover your obligations (providing accurate information, approving changes in a timely manner, etc.).
The SLA Review Process
Review SLAs annually:
- Assess performance — is the MSP meeting current SLAs?
- Evaluate relevance — do current SLAs reflect your business needs?
- Tighten targets — as the MSP understands your environment better, SLAs should improve
- Adjust priorities — as your business evolves, priority definitions should change
- Update consequences — ensure service credits remain meaningful
Related Guides
- MSP Service Delivery Metrics — What to measure beyond SLAs
- MSP Contract Negotiation Tips — Negotiating SLA terms
- MSP Account Management Best Practices — Governance and reviews
- MSP Quality Management System — Quality frameworks for service delivery
- MSP Health Score — Overall MSP performance assessment