Case Study · 01

DotShop.ai: Multi-Region Ecommerce Platform

Architected and shipped a curated fashion, beauty, and jewelry storefront from zero to production. Next.js frontend, REST backend, multi-region AWS infrastructure, and real-time Shopify sync feeding a custom order layer.

Role: Senior Full-Stack Engineer & System Architect
Focus areas: Product architecture, payments, infra, Shopify sync
Engagement: 8 months · architecture & build
Stack: Next.js · Node · Postgres · AWS · Shopify API · Stripe
01

Context

DotShop.ai launched as a curated multi-category storefront pulling product data from a Shopify catalog while owning its own customer experience, checkout flow, and order pipeline.

The team needed a platform that read like a premium DTC product on the front end but operated like infrastructure on the back end: predictable latency across regions, idempotent payment handling, and a Shopify sync that could keep inventory honest under spiky traffic.

No platform was in place when the engagement started: only product designs, a Shopify storefront holding the catalog, and a payment partner already selected.

02

Problem

A premium storefront with three product categories, two regions, one Shopify source-of-truth, and a checkout that can't drop orders or oversell.

The core problem wasn't visual — it was reliability under coordination. Shopify owned the catalog and inventory. The platform owned the customer, the checkout, and the order. Any drift between the two would surface as oversold SKUs, missing orders, or refunds that didn't reconcile.

On top of that: a payment integration with regional rules, two AWS regions to keep latency low, and a small team that needed the system to be operable, not just buildable.

Why it needed to be done

Three failure modes were not acceptable at launch.

Risk surface

This wasn't a prototype. The platform was the storefront of record on launch day, with marketing spend already committed.

Overselling SKUs during traffic spikes

Inventory drift between Shopify and the order layer would result in confirmed orders for stock that didn't exist — manual refunds, support load, and lost trust.

Dropped or duplicated payments

A non-idempotent payment path would silently double-charge or lose authorizations under retry, with no audit trail to reconcile against.

Cross-region latency on checkout

A single-region deployment would push checkout latency past 800ms for half the customer base, measurably hurting conversion on the most important page of the funnel.

Solution

What was built and how it fits together.

01 Next.js storefront
App Router with ISR for catalog pages, edge-rendered cart, and a tightly scoped client bundle. Product data is hydrated from the platform API, not directly from Shopify, so the front end sees a stable, regional contract.
02 REST backend in Node
A typed REST surface for the storefront and admin: catalog read, cart, checkout, orders, customers. Every write goes through an idempotency-keyed handler with a retry-safe outbox for downstream effects.
03 Postgres as source of order truth
Orders, payments, and inventory snapshots live in Postgres with strict invariants. Shopify is read through for catalog data and receives order pushes for fulfilment, but the order ledger never depends on Shopify being available.
04 Real-time Shopify sync
A two-way sync worker: webhook ingestion for catalog and inventory updates, and a queued push for orders. Sync lag is treated as a first-class metric, surfaced in the admin, with alerts firing once it exceeds a 5s budget.
05 Payments and fintech integration
Stripe handles authorization and capture. A thin payment service in front owns idempotency keys, status reconciliation, and the audit trail. Refunds and disputes write to the same ledger, never reconstructed from webhooks alone.
06 Multi-region AWS infrastructure
Two regions behind CloudFront, ALB-fronted ECS services, a primary Postgres in one region with read replicas in the other, and S3/CloudFront for static assets and media. Terraform-defined, with one-command region failover.
Key technical work

The pieces of the build that mattered most.

01

Idempotent order and payment pipeline

Every write to the order ledger carries an idempotency key, with a transactional outbox to drive Shopify and Stripe side-effects. Retries, partial failures, and webhook re-delivery all converge to the same state.

Idempotency keys · Outbox pattern · Postgres tx
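
The idempotency-key and outbox combination described above can be sketched in a few lines. This is a minimal in-memory illustration, not the platform's actual code: `OrderLedger`, `placeOrder`, and `drainOutbox` are hypothetical names, and the two in-memory structures stand in for Postgres tables whose writes would share one transaction.

```typescript
// In-memory stand-ins for an orders table and an outbox table (assumption).
// In Postgres, the writes inside placeOrder would share one transaction.
type Order = { id: string; idempotencyKey: string; sku: string; quantity: number };
type OutboxMessage = { orderId: string; target: "stripe" | "shopify"; delivered: boolean };

class OrderLedger {
  private ordersByKey: Record<string, Order> = Object.create(null);
  private nextOrderNumber = 1;
  readonly outbox: OutboxMessage[] = [];

  // Creates the order once per idempotency key. A retry with the same key
  // returns the stored order and enqueues no new side-effects.
  placeOrder(idempotencyKey: string, sku: string, quantity: number): Order {
    const existing: Order | undefined = this.ordersByKey[idempotencyKey];
    if (existing !== undefined) {
      return existing;
    }
    const order: Order = {
      id: `order-${this.nextOrderNumber}`,
      idempotencyKey,
      sku,
      quantity,
    };
    this.nextOrderNumber += 1;
    // Hypothetical transaction boundary: order row and outbox rows commit together.
    this.ordersByKey[idempotencyKey] = order;
    this.outbox.push({ orderId: order.id, target: "stripe", delivered: false });
    this.outbox.push({ orderId: order.id, target: "shopify", delivered: false });
    return order;
  }

  // A worker sends undelivered messages and marks them afterwards, so delivery
  // is at-least-once and receivers are expected to dedupe by orderId.
  drainOutbox(send: (message: OutboxMessage) => void): number {
    let sentCount = 0;
    for (const message of this.outbox) {
      if (message.delivered) {
        continue;
      }
      send(message);
      message.delivered = true;
      sentCount += 1;
    }
    return sentCount;
  }
}
```

Because the side-effects are recorded alongside the order rather than performed inline, a crash between the write and the send leaves work in the outbox instead of losing it.
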
02

Two-way Shopify sync worker

Catalog and inventory flow inbound via webhooks into a normalized projection; orders flow outbound through a durable queue with per-SKU ordering. Sync lag is measured per direction and surfaced in admin.

Webhooks · SQS · CDC projection
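
One way to picture the inbound half of the sync is a projection that ignores stale or re-delivered inventory events and records how late each one arrived. The sketch below is an assumption-laden illustration: the `InventoryEvent` shape is hypothetical and is not Shopify's webhook payload, and only the 5s budget comes from the case study.

```typescript
// Hypothetical, simplified event shape (not Shopify's actual webhook payload).
type InventoryEvent = { sku: string; available: number; updatedAtMs: number };
type ProjectionRow = { available: number; updatedAtMs: number };
type ApplyResult = { applied: boolean; lagMs: number; overBudget: boolean };

const SYNC_LAG_BUDGET_MS = 5000; // the 5s alerting budget named in the text

class InventoryProjection {
  private rows: Record<string, ProjectionRow> = Object.create(null);

  // Applies an event only if it is newer than what the projection holds,
  // so webhook re-delivery and out-of-order arrival converge to one state.
  apply(event: InventoryEvent, nowMs: number): ApplyResult {
    const lagMs = nowMs - event.updatedAtMs;
    const overBudget = lagMs > SYNC_LAG_BUDGET_MS;
    const current: ProjectionRow | undefined = this.rows[event.sku];
    if (current !== undefined && current.updatedAtMs >= event.updatedAtMs) {
      return { applied: false, lagMs, overBudget };
    }
    this.rows[event.sku] = { available: event.available, updatedAtMs: event.updatedAtMs };
    return { applied: true, lagMs, overBudget };
  }

  availableFor(sku: string): number | undefined {
    const row: ProjectionRow | undefined = this.rows[sku];
    return row === undefined ? undefined : row.available;
  }
}
```

The `lagMs` value returned on every call is the kind of per-direction measurement that could feed an admin dashboard and an alert.
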
03

Multi-region infrastructure as code

Terraform modules for VPC, ECS services, Postgres and replicas, and CloudFront, parameterized by region. New environments stand up in a single CI pipeline; failover is a parameter flip, not a runbook.

Terraform · ECS · RDS multi-AZ
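
The "failover is a parameter flip" idea can be illustrated outside Terraform. The TypeScript sketch below is not the project's Terraform modules; it only shows, under assumed names (`planRegions`, `RegionPlan`, and placeholder region labels), how a single primary-region parameter could determine each region's database role.

```typescript
type RegionPlan = {
  region: string;
  databaseRole: "primary" | "replica";
  acceptsWrites: boolean;
};

// Derives every region's role from one parameter. Changing primaryRegion
// is the only input needed to describe the post-failover topology.
function planRegions(regions: string[], primaryRegion: string): RegionPlan[] {
  if (regions.indexOf(primaryRegion) === -1) {
    throw new Error(`unknown primary region: ${primaryRegion}`);
  }
  return regions.map((region): RegionPlan => {
    const isPrimary = region === primaryRegion;
    return {
      region,
      databaseRole: isPrimary ? "primary" : "replica",
      acceptsWrites: isPrimary,
    };
  });
}
```

Calling `planRegions(["region-a", "region-b"], "region-b")` instead of passing `"region-a"` swaps the roles without any other change to the inputs.
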
04

Edge-rendered storefront with ISR

Product and category pages are statically rendered with on-demand revalidation triggered by sync events. Cart and checkout run on the edge against the regional API for sub-100ms first-byte from either region.

Next.js App Router · ISR · Edge runtime
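
On-demand revalidation needs a mapping from sync events to the pages they invalidate. The sketch below shows one possible mapping as a pure function; the event shapes and the `/products/...` and `/categories/...` routes are assumptions for illustration. In a Next.js App Router route handler, each returned path could then be passed to `revalidatePath` from `next/cache`.

```typescript
// Hypothetical sync events emitted by the Shopify sync worker.
type SyncEvent =
  | { kind: "product.updated"; handle: string; categories: string[] }
  | { kind: "inventory.updated"; handle: string };

// Returns the storefront paths whose static output is stale after the event.
// Route shapes are assumed for illustration.
function pathsToRevalidate(event: SyncEvent): string[] {
  const paths: string[] = [`/products/${event.handle}`];
  if (event.kind === "product.updated") {
    // A product change can alter category listings as well.
    for (const category of event.categories) {
      paths.push(`/categories/${category}`);
    }
  }
  return paths;
}
```

Keeping the mapping pure makes it easy to test separately from the framework call that performs the revalidation.
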
05

Payment audit and reconciliation

A dedicated payment service owns idempotency, status reconciliation against Stripe, and the audit trail that finance reads. Disputes and refunds write to the same ledger, never reconstructed from webhook history.

Stripe · Ledger · Reconciliation
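
Status reconciliation amounts to comparing the ledger's view of each payment with the provider's and reporting every disagreement. The sketch below assumes both sides have already been loaded into plain records; the `PaymentStatus` values are simplified for illustration and are not Stripe's actual status names.

```typescript
// Simplified statuses for illustration (not Stripe's real status values).
type PaymentStatus = "authorized" | "captured" | "refunded" | "failed";
type PaymentRecord = { paymentId: string; status: PaymentStatus; amountCents: number };
type Discrepancy = {
  paymentId: string;
  reason: "missing_at_provider" | "missing_in_ledger" | "status_mismatch" | "amount_mismatch";
};

// Compares ledger entries against provider records and lists every mismatch.
function reconcile(ledger: PaymentRecord[], provider: PaymentRecord[]): Discrepancy[] {
  const providerById: Record<string, PaymentRecord> = Object.create(null);
  for (const record of provider) {
    providerById[record.paymentId] = record;
  }
  const seenInLedger: Record<string, boolean> = Object.create(null);
  const discrepancies: Discrepancy[] = [];
  for (const entry of ledger) {
    seenInLedger[entry.paymentId] = true;
    const remote: PaymentRecord | undefined = providerById[entry.paymentId];
    if (remote === undefined) {
      discrepancies.push({ paymentId: entry.paymentId, reason: "missing_at_provider" });
    } else if (remote.status !== entry.status) {
      discrepancies.push({ paymentId: entry.paymentId, reason: "status_mismatch" });
    } else if (remote.amountCents !== entry.amountCents) {
      discrepancies.push({ paymentId: entry.paymentId, reason: "amount_mismatch" });
    }
  }
  // Payments the provider knows about that never reached the ledger.
  for (const record of provider) {
    if (seenInLedger[record.paymentId] !== true) {
      discrepancies.push({ paymentId: record.paymentId, reason: "missing_in_ledger" });
    }
  }
  return discrepancies;
}
```

An empty result means the two sides agree; anything else is a concrete list that finance or on-call can act on.
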
06

Observability and on-call hygiene

Structured logs, RED metrics per service, sync-lag dashboards, and alerts that fire on business invariants, not just infra. Runbooks for the three top failure modes were written before launch, not after.

OpenTelemetry · Grafana · Runbooks
Business impact

What came out of it.

Checkout p95: < 200ms (placeholder). Edge-rendered checkout from either region, well under the 800ms threshold the team set as the failure bar.
Inventory drift: ~ 0% (placeholder). Zero overselling incidents in the first 90 days of operation, across two regions and three category surges.
Shopify sync lag: < 1.5s (placeholder). Median catalog and inventory propagation, with alerting above 5s. Surfaced in the admin so operators can see it.
Time to launch: 8mo (placeholder). From empty repo to production cutover, including infrastructure, integrations, admin tooling, and on-call setup.

Values marked placeholder are representative — replace with measured numbers from the live system once available.

Final result

A production storefront the team operates, not one they fight.

DotShop.ai shipped on schedule, in two regions, with a payment surface that finance can audit and a Shopify sync the support team can actually see. The platform has stayed in production with no architectural rewrites, and the runbooks written before launch have held up under the only two incidents that mattered.

Shipped on schedule in two regions
Idempotent payment and order ledger
Real-time Shopify sync with alerting
Terraform infrastructure with one-flag failover
Runbooks and on-call hygiene from day one
Next engagement

Have a similar system to build or optimize?

Whether it's a platform from scratch, a Shopify or payments migration, or untangling a multi-region setup, send a few sentences and you'll hear back directly within one business day.

Book a call · bilalasharf@gmail.com