CK Flows

SaaS — Lead Routing Hardening

Refactor of a monolithic record-triggered Flow into modular subflows with defensive fault paths and event-driven retries. Result: −92% Flow errors and +18% speed-to-lead.

SaaS · Record-Triggered, Subflow · 2025-09-24 · 2 min read

TL;DR

  • Split monolithic record-triggered Flow into intent-specific subflows (enrichment, scoring, assignment).
  • Added defensive fault paths + platform-event retry pattern; centralized logging for observability.
  • Bulk-safe updates + strict entry criteria eliminated recursion and reduced governor risk.

Context

Mid-market B2B SaaS with spiky inbound volume across regions. Existing routing relied on one large record-triggered Flow and ad-hoc decision logic maintained by multiple admins.

Problem

Under peak load the Flow faulted and left records partially assigned, causing first-response SLA misses, duplicate owner handoffs, and inconsistent task creation.

Intervention

Architecture — Decomposed routing into subflows: ① enrichment, ② score & segment, ③ owner assignment, ④ post-assign tasks/SLAs.

Reliability — Added fault paths that raise a platform event with context; retry subflow consumes events and replays safe operations.
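The mechanics, sketched in Python rather than Flow/platform-event syntax (the event shape, field names, and retry cap below are illustrative, not the client's actual schema): the fault path publishes a small retry event carrying the record ID, the failed step, and an attempt count; a consumer replays the step, or parks the record once the cap is hit.

    # Illustrative sketch of the fault-path -> retry-event pattern (plain Python,
    # not Flow or Apex; RoutingRetryEvent and its fields are hypothetical).
    from dataclasses import dataclass

    MAX_ATTEMPTS = 3  # illustrative retry cap

    @dataclass
    class RoutingRetryEvent:
        record_id: str      # Lead being routed
        failed_step: str    # e.g. "enrichment", "assignment"
        error_message: str  # fault context, also written to the error log
        attempt: int        # retries so far for this step

    def on_fault(record_id, step, error, attempt, publish):
        """Fault path: raise a retry event instead of leaving the record half-routed."""
        publish(RoutingRetryEvent(record_id, step, str(error), attempt + 1))

    def on_retry_event(event, replay_step, dead_letter):
        """Retry consumer: replay the failed step until the cap, then escalate."""
        if event.attempt > MAX_ATTEMPTS:
            dead_letter(event)  # surfaces on the ops dashboard for manual routing
        else:
            replay_step(event.failed_step, event.record_id)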

Bulk safety — Consolidated DML to a single commit subflow using collections; guarded entry/exit criteria to avoid recursion.
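In Flow terms that means appending changed records to a collection inside the loop and issuing a single Update Records element at the end. A rough Python equivalent of the commit pattern (field and function names are placeholders):

    # "Collect, then commit once": one bulk write per transaction instead of a
    # write inside the loop. OwnerId and assign_owner are placeholders.
    def route_leads(leads, assign_owner):
        to_commit = []                           # collection variable, in Flow terms
        for lead in leads:                       # triggered batch (up to 200 records)
            owner = assign_owner(lead)
            if owner and lead.get("OwnerId") != owner:
                to_commit.append({**lead, "OwnerId": owner})
        return to_commit                         # handed to one bulk update at the end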

Observability — Error log custom object tracks Flow, element, exception, record id, and attempt count; daily dashboard and alerts.
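A minimal sketch of the log record shape in Python (the fields mirror the description above but are not the client's actual custom-object API names):

    # One error-log entry: flow, element, exception, record ID, attempt count.
    from dataclasses import dataclass, field
    from datetime import datetime, timezone

    @dataclass
    class FlowErrorLog:
        flow_name: str     # e.g. "Lead_Routing_Assignment"
        element: str       # Flow element that faulted
        exception: str     # fault message captured on the fault path
        record_id: str     # Lead (or related) record ID
        attempt: int       # retry attempt count at the time of failure
        logged_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))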

Change safety — Added unit test template for subflow decisions (sample inputs/expected outputs) to catch regression in sandboxes.
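The template is table-driven: sample inputs in, expected owner/queue out. Sketched in Python with made-up regions, scores, and queue names:

    # Decision-subflow test template: each row is a sample input and the queue
    # the decision should produce. Segments and queue names are invented here.
    CASES = [
        # (region, score, segment)        expected queue
        (("EMEA", 85, "enterprise"),      "Q_EMEA_Enterprise"),
        (("NA",   40, "smb"),             "Q_NA_SMB"),
        (("APAC", 95, "enterprise"),      "Q_APAC_Enterprise"),
    ]

    def test_assignment_decision(decide):
        """Run every sample through the decision logic and flag regressions."""
        for inputs, expected in CASES:
            actual = decide(*inputs)
            assert actual == expected, f"{inputs}: expected {expected}, got {actual}"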

Outcomes

Window: 90 days pre vs 90 days post go-live
Industry: SaaS
Clouds: Sales Cloud
Flow Types: Record-Triggered, Subflow

  • −92% Flow error rate
  • +18% Speed-to-lead (median)
  • −88% Assignment faults per 1k leads (measured from error log records / 1,000 leads)

Errors were counted via the platform event + error-log object; speed-to-lead was measured as Lead.CreatedDate → first owner activity. Data excludes weekends/holidays per the client's reporting convention.

Timeline

1 week design + 1 week build + 1 week bake-in with monitoring.

Stack

Sales Cloud, Platform Events, Custom Error Log object (reports/dashboards).

Artifacts

  • Before/after routing Flow diagram
  • Retry pattern sequence diagram (platform events)
  • Error trend chart (90d window)
  • Decision table (segment → owner/subqueue)

FAQ

How did you ensure bulk safety and avoid recursion?

All writes were consolidated into a commit subflow operating on collections. Entry conditions prevent re-entry on the same context, and updates are batched.

What happens when an external dependency fails (e.g., enrichment)?

A fault path raises a platform event with the record id and failure context. The retry consumer evaluates idempotency and replays only safe steps.
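The idempotency check is the important part; roughly (a Python sketch: "already applied" here just means the step's effect is already visible on the record, and the step names are illustrative):

    # Idempotency gate in the retry consumer: replay a step only if it is marked
    # replay-safe and its effect is not already present.
    SAFE_TO_REPLAY = {"enrichment", "scoring", "assignment"}

    def replay_if_safe(step, lead, run_step):
        if step not in SAFE_TO_REPLAY:
            return "skipped: step has external side effects"
        if step == "assignment" and lead.get("OwnerId"):
            return "skipped: already assigned"   # effect already applied
        run_step(step, lead)
        return "replayed"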

How are failures monitored day-to-day?

Error log records are summarized on a dashboard by flow, element, and hour. Thresholds trigger alerts to an ops channel and a weekly email digest.
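The aggregation behind those thresholds is a simple group-and-count; a Python sketch (the 25-per-hour threshold is an arbitrary illustration, not the client's setting):

    # Count error-log rows by (flow, element, hour) and flag buckets that
    # cross the alert threshold.
    from collections import Counter

    ALERT_THRESHOLD = 25  # illustrative only

    def error_buckets(logs):
        return Counter(
            (log["flow_name"], log["element"], log["logged_at"].strftime("%Y-%m-%d %H:00"))
            for log in logs
        )

    def breaches(logs):
        """Buckets that should page the ops channel."""
        return [(key, n) for key, n in error_buckets(logs).items() if n >= ALERT_THRESHOLD]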

What changed for admins?

Admins update a small decision subflow or a decision table instead of the monolith; unit test templates catch regressions in sandboxes before deploy.
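Conceptually, that decision table is a plain lookup the admins own. A Python sketch with hypothetical segments and queue names:

    # Segment -> owner/subqueue decision table. Admins edit this mapping (or its
    # Flow/custom-metadata equivalent) rather than the routing monolith.
    DECISION_TABLE = {
        ("EMEA", "enterprise"): "Q_EMEA_Enterprise",
        ("EMEA", "smb"):        "Q_EMEA_SMB",
        ("NA",   "enterprise"): "Q_NA_Enterprise",
        ("NA",   "smb"):        "Q_NA_SMB",
    }
    FALLBACK_QUEUE = "Q_Routing_Review"  # unmatched leads go to a human-reviewed queue

    def assign_queue(region, segment):
        return DECISION_TABLE.get((region, segment), FALLBACK_QUEUE)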