Wednesday 4 March 2026
How Sawt Helped Floward Deliver 2 Million Flowers in a Single Day
On February 14th - the highest-pressure day in Floward's history - Sawt's system handled thousands of calls, 91.5% of them without any human intervention. Equivalent in output to a team of 120 agents. Customer satisfaction exceeded 93%.
"During our most demanding season, our partnership with Sawt allowed us to handle an unprecedented volume of orders with full operational stability - while maintaining the quality of experience Floward is known for."
- Mohammed Al-Arifi, COO, Floward
Floward is the leading flowers and gifts platform in the Middle East. Founded in 2017, the company set out to make sending a gift as enjoyable as receiving one. Today it operates across three continents.
What appears to the recipient as a well-arranged gift at their door is the product of a dense network of coordinated operations, all working toward one commitment: every order arrives on time, at the quality it promises.
On a normal day, this runs smoothly. On February 14th, the load is an order of magnitude higher.
The Day Everything Scales
Every year, the same problem. Demand multiplies in a single day with no tolerance for delays. Quality can't slip. Costs can't balloon. And at the center of the operational puzzle is a detail that's easy to miss: the person placing the order is almost never the person receiving it. No order can move forward until the recipient's delivery address is confirmed.
The recipient gets a notification asking for their location. If they don't share it before the cutoff window closes, the order is cancelled. That affects three people: the sender who planned the surprise, the recipient who expected it, and Floward, which absorbs the cost of an unfulfilled order.
Sawt: An Operational Platform, Not a Bot
Floward decided to stop solving the same problem every year. They partnered with Sawt, an AI platform for customer communication management, to run a full slice of their operations through a system that understands their business and acts on that understanding. The brief wasn't to build a bot that makes calls. It was to deploy something that operates with the same judgment a well-trained team would.
The voice agent that speaks with customers is one output of that platform. It runs inside it. It is not the platform itself.
Building the Floward Agent: From Operational Understanding to First Call
Before the agent made a single call, the Sawt team spent time inside Floward's operations. How does an order get created? When does the recipient notification go out? What happens when someone doesn't respond? What situations come up daily that the customer service team has learned to handle?
That understanding gets translated inside the platform into a structure called Journeys. Each Journey is a complete operational scenario with five components: a Procedure that defines exactly what the agent does at each step, Policies that set hard limits on what it must never do, Data preloaded before the call begins, Tools it can use mid-conversation (like sending a WhatsApp location link or querying the order system), and Handoffs that govern when and how the agent escalates to a human.
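To make that structure concrete, here is a minimal sketch of what a Journey could look like expressed as configuration. It is written in Python for illustration; the class and field names mirror the five components described above but are hypothetical, not Sawt's actual SDK.

```python
from dataclasses import dataclass

# Hypothetical sketch: these names mirror the five Journey components
# described in the text; they are not Sawt's actual SDK.

@dataclass
class Journey:
    name: str
    procedure: list[str]       # exactly what the agent does at each step
    policies: list[str]        # hard limits it must never cross
    preloaded_data: list[str]  # data loaded before the call begins
    tools: list[str]           # actions available mid-conversation
    handoff: str               # when and how to escalate to a human

address_confirmation = Journey(
    name="confirm_delivery_address",
    procedure=[
        "greet the recipient and explain why Floward is calling",
        "ask for the delivery address, offering a WhatsApp location link",
        "read the confirmed address back before closing",
    ],
    policies=[
        "never disclose the sender of a surprise gift",
        "never promise delivery outside the order's cutoff window",
    ],
    preloaded_data=["order_id", "recipient_name", "cutoff_time"],
    tools=["send_whatsapp_location_link", "query_order_system"],
    handoff="transfer to a human if the recipient insists after two attempts",
)
```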
Above the Journeys sits a layer called Shared Instructions: stacked system-level directives that shape the agent's behavior across all scenarios. For Floward, that meant three layers, each scoped to a different level of specificity:
Layer 1 - Universal rules: be concise, be clear, do not interrupt
Layer 2 - Regional rules: Saudi Arabic dialect, local terminology, regional conversational conventions
Layer 3 - Floward-specific rules: warm tone, no sender disclosure for surprise gifts
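One plausible reading of the stack: each layer is appended to the agent's system instructions, broadest first, so the more specific layers refine the general ones. A minimal sketch under that assumption (the actual mechanics are Sawt-internal):

```python
# Assumed mechanics: layers are stacked broadest-first so that more
# specific layers can refine or override the general ones.

LAYER_1_UNIVERSAL = [
    "Be concise.",
    "Be clear.",
    "Do not interrupt the caller.",
]
LAYER_2_REGIONAL = [
    "Speak Saudi Arabic dialect with local terminology.",
    "Follow regional conversational conventions.",
]
LAYER_3_FLOWARD = [
    "Keep a warm tone.",
    "Never disclose the sender of a surprise gift.",
]

def build_shared_instructions(*layers: list[str]) -> str:
    """Concatenate instruction layers into one system directive."""
    return "\n".join(rule for layer in layers for rule in layer)

system_instructions = build_shared_instructions(
    LAYER_1_UNIVERSAL, LAYER_2_REGIONAL, LAYER_3_FLOWARD
)
```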
Before each call begins, the agent's context is preloaded from three sources: the Order DB, the CRM, and Genesys. The agent starts every call knowing who it's calling and why. It doesn't ask questions it can already answer. It goes straight to the one thing it needs: the address.
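A sketch of that preloading step, with stubs standing in for the real Order DB, CRM, and Genesys integrations (their actual APIs are not described in this piece, so every function and field name here is an assumption):

```python
# Stubs stand in for the real systems; all field names are assumptions.

def fetch_order(order_id: str) -> dict:
    """Stub for the Order DB lookup."""
    return {"recipient_id": "r-1001", "cutoff_time": "2026-02-14T18:00"}

def fetch_customer(recipient_id: str) -> dict:
    """Stub for the CRM lookup."""
    return {"name": "Sara", "phone": "+9665XXXXXXX"}

def fetch_routing(order_id: str) -> dict:
    """Stub for the Genesys routing lookup."""
    return {"queue": "floward-peak", "callback": "+9665XXXXXXX"}

def preload_agent_context(order_id: str) -> dict:
    """Assemble everything the agent should know before dialing,
    so the only open question left on the call is the address."""
    order = fetch_order(order_id)
    customer = fetch_customer(order["recipient_id"])
    routing = fetch_routing(order_id)
    return {
        "order_id": order_id,
        "recipient_name": customer["name"],
        "cutoff_time": order["cutoff_time"],
        "routing": routing,
    }
```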
Going from operational understanding to a live, running agent took weeks. The platform is built so that deep knowledge of how a business works translates directly into how the agent behaves.
How the Agent Learned to Pronounce the Neighborhood Correctly Before Ever Calling a Customer
Sawt's platform has a simulation engine built in. Before the agent talks to a real customer, it runs through dozens of test scenarios. The process starts with Testing Goals: what should the agent do when someone refuses to share their location? When they ask who sent the gift? When they push for a human? Each scenario is defined as a goal the agent either passes or fails.
Scenarios can be written manually or generated automatically. The platform's Generate with AI feature analyzes past conversation data and produces a list of scenarios covering common patterns and edge cases. The system then runs hundreds of synthetic conversations and scores each one against a Pass Rate per goal.
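As a sketch of how such pass/fail goals might be scored, consider the following. The goal wording comes from the scenarios above; run_one_simulation is a hypothetical stand-in for the platform's synthetic-conversation engine.

```python
import random

TESTING_GOALS = [
    "recipient refuses to share their location",
    "recipient asks who sent the gift",
    "recipient pushes for a human agent",
]

def run_one_simulation(goal: str) -> bool:
    """Hypothetical stand-in for one synthetic conversation; the real
    engine plays out the scenario and judges whether the goal was met."""
    return random.random() > 0.1  # placeholder pass/fail outcome

def pass_rate(goal: str, runs: int = 100) -> float:
    """Fraction of synthetic conversations that meet the goal."""
    return sum(run_one_simulation(goal) for _ in range(runs)) / runs

for goal in TESTING_GOALS:
    print(f"{goal}: {pass_rate(goal):.0%}")
```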
During Floward's simulation, a problem surfaced that would have taken hundreds of live calls to notice: the agent mispronounced several Saudi neighborhood names. On a call whose entire purpose is confirming a delivery address, that is the kind of error that costs trust and misroutes gifts. It was caught and corrected before a single real customer heard it.
The loop does not stop at launch.
In production, improvement is ongoing. Each call adds data. Each unexpected situation becomes a training scenario. By February 14th, the agent had already seen and trained on most of what it would encounter. Call 4,125 on Valentine's Day ran on everything the system had learned since call one.
10 Out of Thousands: How That Number Holds
Of the thousands of calls on February 14th, 10 required a human. That is not a rounding error. It is the result of an observability system that processes every call without exception, through a pipeline that starts the moment a conversation ends.
The system runs on two parallel tracks. The first measures platform-level quality. The second is built on Floward's own business logic and measures how closely the agent adheres to it. Together they produce a weighted score for every call:
Non-deterministic (AI-evaluated):
- LLM Hallucination
- WER (word error rate)
- Tool Call Accuracy
- Transition Accuracy
- Sentiment

Deterministic (direct measurement):
- e2e Latency (P95)
- Transfer Success
- Interruptions
- Tool Latency
- Overlapping Speech
For Floward, the team weighted each metric according to what their operations actually depend on. Correct pronunciation of neighborhood names accounts for 30% of the score, because a wrong district name means a gift arrives at the wrong address. Every call is evaluated as a weighted composite:
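A minimal sketch of that weighted composite. Only the 30% weight on neighborhood-name pronunciation is stated in this piece; the other metrics and weights below are illustrative placeholders chosen to sum to 1.0.

```python
# Only the pronunciation weight (0.30) comes from the text; the rest
# are assumed placeholders. Each metric is normalized to [0, 1].

WEIGHTS = {
    "neighborhood_pronunciation": 0.30,  # stated in the text
    "tool_call_accuracy": 0.25,          # assumed
    "llm_hallucination_free": 0.25,      # assumed (1 - hallucination rate)
    "e2e_latency_p95": 0.20,             # assumed (normalized)
}

def composite_score(metrics: dict[str, float]) -> float:
    """Weighted sum of normalized per-call metric scores."""
    assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9
    return sum(weight * metrics[name] for name, weight in WEIGHTS.items())

call_metrics = {
    "neighborhood_pronunciation": 1.00,
    "tool_call_accuracy": 1.00,
    "llm_hallucination_free": 0.95,
    "e2e_latency_p95": 0.80,
}
print(f"composite: {composite_score(call_metrics):.2f}")  # composite: 0.95
```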
Every call passes through full post-call analysis. Performance is scored, failures are logged, and any call that deviates from the expected path is transferred immediately to a human agent with the full context attached. Those calls then go back into the simulation engine as new training scenarios.
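Sketching that pipeline end to end: the threshold and function boundaries below are assumptions, and where the real system transfers a deviating call live, mid-conversation, this sketch compresses everything into one post-call function for illustration.

```python
ESCALATION_THRESHOLD = 0.85  # assumed cutoff for a "deviating" call

def log_result(call_id: str, score: float) -> None:
    print(f"[log] {call_id} scored {score:.2f}")

def transfer_to_human(call_id: str, transcript: str) -> None:
    print(f"[handoff] {call_id} escalated with full context attached")

def queue_as_training_scenario(transcript: str) -> None:
    print("[simulation] transcript queued as a new training scenario")

def post_call_pipeline(call_id: str, score: float, transcript: str) -> None:
    """Runs on every call, without exception, the moment it ends."""
    log_result(call_id, score)
    if score < ESCALATION_THRESHOLD:
        transfer_to_human(call_id, transcript)
        queue_as_training_scenario(transcript)

post_call_pipeline("call-4125", score=0.95, transcript="...")
```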
The numbers tell the story: thousands of calls in a single day, 91.5% handled without human intervention, customer satisfaction above 93%, output equivalent to a 120-agent team.
One use case. Demonstrated value. Built to expand.
What began as address confirmation became a reliable operational capability - available year-round, at any volume. Peak season no longer requires temporary fixes or headcount that needs to be rebuilt each year. Floward's team could focus on what automation doesn't touch: the parts of gifting that require human judgment and care.
At Sawt, we build a platform for AI-managed customer communication. We start by understanding how your operations actually work, demonstrate measurable impact within weeks, and build from there.
Not a one-time result. A repeatable model, across seasons, channels, and scale.