Ewake x Booksy case study

Case study · Dec 11, 2025 · 6 min read

Context

Booksy is a global, cloud-based marketplace that connects professionals and clients in the beauty, health, and wellness sectors. Originating in Poland, the company has expanded internationally, providing a robust platform that helps entrepreneurs manage appointments, streamline operations, and grow their businesses. With a focus on reliability, scalability, and user experience, Booksy enables service providers to focus on their craft while empowering clients to easily access and book trusted professionals. The company’s mission is to create technology that supports sustainable growth across local and global communities.




A Shared Vision for Reliability

In February 2025, Ewake’s founding team reached out to Booksy’s SRE leadership with a shared vision: to reduce the operational burden on Site Reliability Engineering (SRE) teams and improve overall service resilience. Booksy, a technology-driven company with a large, global engineering organization of over a dozen teams, operates a complex, high-traffic platform handling billions of monthly requests.

At the time, Booksy maintained a strong DevOps culture and an internal group of SRE Evangelists dedicated to democratizing reliability practices among developers. However, as the platform scaled, so did operational challenges, from increasing system noise to the growing effort required to manage incidents effectively. During the initial discussions, the team reported a high volume of concurrent alerts, which reflected both the number of signals the platform generated and the toll on SREs.

Booksy’s near-term priorities included reducing alert fatigue, improving incident response efficiency, and ultimately lowering Mean Time to Recovery (MTTR). A long-term goal was to further increase DevOps maturity across teams while maintaining their strong alignment with their observability stack.

Collaboration duration: 6 months

Investigations completed: 1,439

Alerts handled: 748


Implementation & Collaboration

Building on the initial discussions, Ewake was integrated into Booksy’s existing observability and operations ecosystem. The goal was to fit seamlessly within their established workflows — leveraging their observability platform as the primary signal source, their alerting system for escalation, and their central collaboration platform as the collaboration layer. The integration also connected to their CI/CD deployment tool and feature flag management system—all without introducing friction or additional overhead for engineering teams.



To support a smooth engineering handoff and ensure Ewake fit naturally into Booksy’s operating model, the teams collaborated in two-week iteration cycles with regular syncs to capture feedback and surface emerging pain points. This rhythm enabled iterative changes to the automation logic, alert-analysis workflows, and integration touchpoints. The collaboration remained highly cross-functional, involving SRE leadership, SRE Evangelists, platform engineers, and developers to validate both technical fit and day-to-day usability. As part of the evolving tools and integration workflow, the Ewake team first integrated directly into Booksy’s alert-response flow, then—based on insights gathered—extended the integration into the canonical SRE channel to help developers quickly access reliability context and answers to SRE-related questions.


Results & Impact

Use Case 1: SRE Support

Satisfaction ratio: 64%

Number of investigations: 691

Scope: SRE team

Adoption: Ewake is globally available to the engineering teams, helping them investigate production issues faster and answering their day-to-day questions.

Average response time: <80 seconds


Direct developer support emerged as one of the most impactful applications.


“The second use case provided the most measurable relief for our SRE team. Previously, our SREs spent a significant portion of their day answering repetitive developer questions. Ewake, acting as a support layer, was highly effective at addressing these common software engineer queries and accessing our internal knowledge base, often citing or linking to the precise documentation needed. By taking over 700 of these interactions, Ewake dramatically reduced our operational toil, freeing our SREs to focus on strategic reliability projects instead of being constantly diverted”.

Booksy SRE Team



This use case, the second to be introduced, emerged naturally during the collaboration. In one of the early discovery sessions, Booksy’s teams identified a recurring pain point: SREs were stretched thin, constantly switching contexts between handling incidents and answering day-to-day developer questions. From debugging pipeline failures to explaining deployment issues, the operational toil was high, and it often diverted focus from more strategic reliability work.


To address this, Ewake was invited to join the public SRE communication channels, becoming a direct support layer for developers. Whenever a question or issue arose — “Why did my pipeline fail?” or “What’s causing this alert?” — Ewake would step in, analyze the context, and provide data-driven answers or guidance, freeing up valuable SRE time.


Use Case 2: Agentic Alert Response

Satisfaction ratio: 52%

Number of investigations: 748

Scope: Faulty deployment alerts

Adoption: Ewake automatically posts a reply to the Slack alert message received by the engineering team. The reply contains the initial triage information gathered by the AI agents, aiming to reduce the investigation toil for the developers.

Average response time: <120 seconds


“When it came to automated alert response, the value was in the filtration of noise. Ewake delivered initial, low-level triage for hundreds of alerts, instantly correlating deployment data and bringing context together. However, this was not a ready-to-implement solution; for our senior SREs, this output was often too basic for critical incidents. Still, by handling the initial context-gathering for all deployment-related alerts, Ewake proved effective at significantly reducing alert fatigue across the team”.

Booksy SRE Team


Booksy’s goal was to have Ewake automatically respond to faulty deployment alerts — cases where a specific monitor indicated that a new release had triggered a failure. When such an alert fired, Ewake would instantly launch an investigation, gathering context from the observability tool, correlating with recent deployment activity, and pinpointing the likely cause of the issue.
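The correlation step described above can be illustrated with a minimal sketch. It is not Ewake's implementation: the `Alert` and `Deployment` types, the `likely_culprit` helper, and the 30-minute look-back window are all hypothetical, chosen only to show how an alert might be matched to the most recent release of the same service.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Deployment:
    # Hypothetical record of a release, e.g. pulled from a CI/CD tool.
    service: str
    version: str
    deployed_at: datetime


@dataclass
class Alert:
    # Hypothetical record of a fired monitor from an observability platform.
    service: str
    monitor: str
    fired_at: datetime


def likely_culprit(
    alert: Alert,
    deployments: List[Deployment],
    window: timedelta = timedelta(minutes=30),  # assumed look-back window
) -> Optional[Deployment]:
    """Return the latest deployment of the alerting service that happened
    before the alert fired and within the look-back window, or None."""
    candidates = [
        d
        for d in deployments
        if d.service == alert.service
        and timedelta(0) <= alert.fired_at - d.deployed_at <= window
    ]
    return max(candidates, key=lambda d: d.deployed_at, default=None)
```

A real investigation would weigh more signals than timing alone (error rates, logs, feature flag changes); the sketch only captures the "recent deployment of the same service" heuristic.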

By turning reactive firefighting into proactive diagnosis, Ewake helped the Booksy team start cutting through the noise — transforming alert floods into focused, data-backed investigations.


Lessons Learned

During the six-month Proof of Value, Ewake was deployed across Booksy’s operational workflows, automating alert investigations, supporting developers, and expanding SRE bandwidth.

By the end of the engagement, Ewake achieved an overall satisfaction rate of roughly 60%, meaning about six out of ten responses were viewed as useful or actionable by Booksy engineers.
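As a rough cross-check, the per-use-case figures reported earlier can be combined into a single number. The sketch below assumes the overall rate is an investigation-weighted average of the two use cases; the article does not state how the overall figure was computed, so this is only one plausible reading.

```python
# Figures taken from the two use-case summaries in this article:
# (number of investigations, satisfaction ratio)
use_cases = {
    "SRE Support": (691, 0.64),
    "Agentic Alert Response": (748, 0.52),
}

# Total investigations across both use cases.
total = sum(count for count, _ in use_cases.values())

# Assumption: overall satisfaction is weighted by investigation count.
weighted = sum(count * ratio for count, ratio in use_cases.values()) / total

print(f"Total investigations: {total}")
print(f"Weighted satisfaction: {weighted:.1%}")
```

Under that assumption the total matches the 1,439 investigations reported above, and the weighted rate comes out slightly below 60%, consistent with the rounded overall figure.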

This confirmed:

  • The product’s fit within complex, high-scale environments

  • Its ability to reduce operational toil

  • Its value in spreading reliability best practices

  • Its potential to serve as a real reliability multiplier across teams

Together, Booksy and Ewake demonstrated how AI-driven reliability assistance can convert operational data into immediate human value, enabling teams to work smarter, respond faster, and build more resilient systems.