Tryl: AI Fashion Try-On

Full-stack monorepo for an AI-powered virtual fashion try-on product. A Chrome extension detects clothing items on any shopping page, a FastAPI backend manages fitting profiles and job dispatch, and an async Redis worker queue processes try-on image generation end-to-end.

Role

Full Stack Engineer (solo)

Tech Stack
React, TypeScript, FastAPI, Python, PostgreSQL, Redis, Chrome Extension (MV3), pnpm workspaces

The Challenge

AI try-on generation takes 5–30 seconds per image, which is too slow for a synchronous API response. At the same time, the product needs to work across arbitrary shopping pages (Zara, H&M, ASOS) where product images live behind inconsistent DOM structures. The core engineering challenge was twofold: build an async job pipeline (create → queue → process → archive) that decouples the user's request from the slow generation step, and have the Chrome extension detect products across heterogeneous storefronts without brittle site-specific CSS selectors.

Architecture & Deep Dive

System Architecture

Monorepo: Chrome Ext + Web → FastAPI → Redis → Worker → S3 + PostgreSQL

[Diagram nodes: User, Chrome Extension (MV3), React Web App, FastAPI Backend, JWT Auth, PostgreSQL (profiles), Redis Job Queue, Try-On Worker, S3 Image Archive]

0. Monorepo Structure (pnpm workspaces)

tryl/
  ├─ apps/
  │    ├─ web/          React + TypeScript — auth, fitting profiles, archive
  │    ├─ extension/    Chrome MV3 — product detection on shopping pages
  │    ├─ api/          FastAPI — REST API for profiles, jobs, archive
  │    └─ worker/       Python — async try-on job processor
  ├─ packages/
  │    ├─ shared-types/ TypeScript types shared by web + extension
  │    └─ config/       Shared ESLint + TS config
  └─ pnpm-workspace.yaml

1. Try-On Job Pipeline — create → queue → process → archive

POST /api/tryon/jobs  { profile_id, product_image_url, product_metadata }
  │
  ├─ Validate profile exists + user owns it
  ├─ Resolve product: download image → store to S3-compatible store
  ├─ INSERT job { status: "queued", profile_id, product_image_key }
  │
  └─ redis.lpush("tryon:queue", job_id)   ← enqueue
       │
       ▼  [Worker process — separate container]
  redis.brpop("tryon:queue")              ← blocking pop
       │
       ├─ UPDATE job { status: "processing" }
       ├─ Fetch fitting profile image + product image
       ├─ Call try-on AI model API (async httpx)
       │    └─ Poll / stream until result image ready
       ├─ Store result image
       ├─ UPDATE job { status: "completed", result_image_key }
       │
       └─ On any error:
            UPDATE job { status: "failed", error_message }

GET /api/tryon/jobs/{job_id}   → { status, result_image_url? }
GET /api/tryon/archive         → paginated completed jobs
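
The flow above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: `InMemoryQueue` is a hypothetical stand-in for the Redis list (a real `BRPOP` blocks until an item arrives and, in redis-py, returns a key/value pair), the `jobs` dict stands in for the PostgreSQL table, and `generate` is a placeholder for the try-on model call.

```python
import uuid
from collections import deque
from dataclasses import dataclass
from typing import Callable, Dict, Optional

QUEUE_KEY = "tryon:queue"


class InMemoryQueue:
    """Hypothetical stand-in for a Redis list: push at the head, pop at the tail."""

    def __init__(self) -> None:
        self._items = deque()

    def lpush(self, key: str, value: str) -> None:
        self._items.appendleft(value)

    def brpop(self, key: str) -> Optional[str]:
        # Simplification: real BRPOP blocks; this returns None when the queue is empty.
        return self._items.pop() if self._items else None


@dataclass
class Job:
    id: str
    profile_id: str
    product_image_key: str
    status: str = "queued"
    result_image_key: Optional[str] = None
    error_message: Optional[str] = None


def create_job(jobs: Dict[str, Job], queue: InMemoryQueue,
               profile_id: str, product_image_key: str) -> str:
    """API side: persist the job as 'queued', then enqueue its id."""
    job = Job(id=str(uuid.uuid4()), profile_id=profile_id,
              product_image_key=product_image_key)
    jobs[job.id] = job
    queue.lpush(QUEUE_KEY, job.id)
    return job.id


def process_next(jobs: Dict[str, Job], queue: InMemoryQueue,
                 generate: Callable[[Job], str]) -> Optional[str]:
    """Worker side: pop one job id, run generation, record the outcome."""
    job_id = queue.brpop(QUEUE_KEY)
    if job_id is None:
        return None
    job = jobs[job_id]
    job.status = "processing"
    try:
        job.result_image_key = generate(job)
        job.status = "completed"
    except Exception as exc:  # any error marks the job as failed
        job.status = "failed"
        job.error_message = str(exc)
    return job_id
```

Because the API pushes at the head and the worker pops from the tail, the oldest job is always dequeued first, and the API can return the `job_id` immediately while generation runs elsewhere.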

2. Chrome Extension — product detection across storefronts

Content script injected on shopping page load
  │
  ├─ DOM scan: heuristic selectors for product images
  │    ├─ meta[property="og:image"]           ← most reliable
  │    ├─ [data-testid*="product"] img         ← framework apps (Next.js)
  │    ├─ .pdp-image, .product-image img       ← legacy CSS patterns
  │    └─ largest visible <img> (fallback)
  │
  ├─ Inject "Try with Tryl" button adjacent to detected image
  │
  └─ On button click:
       ├─ Send product_image_url + page_url to background service worker
       ├─ background → POST /api/tryon/jobs (with auth cookie / token)
       ├─ Receive job_id → open popup with job status polling
       │    └─ GET /api/tryon/jobs/{job_id} every 3 s
       └─ On "completed" → display result image in popup overlay
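
The layered priority in the DOM scan can be expressed as a small ranking function. The sketch below is written in Python purely to illustrate the ordering logic; the extension's content script would run against the live DOM in TypeScript. The `Candidate` type, the source labels, and the area tie-breaker are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional

# Most trusted source first, mirroring the order of the DOM scan.
SOURCE_PRIORITY = ["og:image", "data-testid", "legacy-class", "largest-visible"]


@dataclass
class Candidate:
    source: str    # which heuristic found the image (one of SOURCE_PRIORITY)
    url: str
    area: int = 0  # rendered width * height in px, used only to break ties


def pick_product_image(candidates: List[Candidate]) -> Optional[Candidate]:
    """Return the candidate from the most trusted source, or None if nothing matched."""
    known = [c for c in candidates if c.source in SOURCE_PRIORITY and c.url]
    if not known:
        return None
    # Lower priority index wins; within one source, prefer the larger image.
    return min(known, key=lambda c: (SOURCE_PRIORITY.index(c.source), -c.area))
```

A `None` result corresponds to the case where no heuristic matched and the extension has to ask the user for an image URL.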

Technical Trade-offs

  • Redis BRPOP (blocking pop) over polling: The worker blocks on the queue with no CPU burn between jobs. A single brpop call replaces a sleep-poll loop, and pairing it with LPUSH on the producer side dequeues jobs in FIFO order.
  • Status field (queued/processing/completed/failed) over separate tables: Job lifecycle is tracked in a single status column with an updated_at timestamp. Simpler to query, index, and reason about than event-sourcing for an MVP.
  • og:image first, heuristic DOM selectors as fallback: Site-specific scrapers break on every storefront redesign. Prioritizing og:image (set by site owners for link sharing) gives a stable, high-quality image with minimal fragility; generic selectors and the largest visible <img> cover pages that lack it.
  • Chrome MV3 (Manifest V3): MV3 service workers replace persistent background pages, which Chrome is retiring. The trade-off is that service workers can be suspended between events, requiring the extension to re-establish connections per user action.
  • pnpm workspaces over separate repos: Shared TypeScript types (shared-types) flow directly to both web and extension without a publish step. The trade-off is a slightly more complex CI matrix (build order matters).
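
To make the single-status-column choice concrete, here is a minimal sketch of the job lifecycle as an enum with an explicit transition table. The four status values come from the pipeline above; the transition rules themselves (including processing → queued for the watchdog re-queue) are an assumption about how the states connect, not a confirmed part of the codebase.

```python
from enum import Enum


class JobStatus(str, Enum):
    QUEUED = "queued"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumed lifecycle: terminal states have no outgoing transitions.
ALLOWED_TRANSITIONS = {
    JobStatus.QUEUED: {JobStatus.PROCESSING},
    JobStatus.PROCESSING: {
        JobStatus.COMPLETED,
        JobStatus.FAILED,
        JobStatus.QUEUED,  # watchdog re-queue of a stuck job
    },
    JobStatus.COMPLETED: set(),
    JobStatus.FAILED: set(),
}


def transition(current: JobStatus, new: JobStatus) -> JobStatus:
    """Return the new status, or raise if the move is not part of the lifecycle."""
    if new not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {new.value}")
    return new
```

Keeping the rules in one table means an out-of-order update (for example, a late worker writing "processing" over a completed job) fails loudly instead of silently corrupting the row.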

Reliability & Validation

Validation

Manual end-to-end testing across 3 shopping storefronts (Zara, ASOS, H&M). The job pipeline was tested with a simulated slow worker (10 s artificial delay) to verify status polling behavior.

Edge cases validated
  • Worker crash mid-job — job stays in "processing"; a watchdog timer re-queues jobs stuck for >2 min
  • Extension on SPA (React/Next.js storefront) — MutationObserver detects route changes and re-runs product detection on navigation
  • User submits duplicate job — idempotency check on (profile_id, product_image_url) returns existing job_id
  • No og:image and no matching selector — extension shows "Could not detect product image" with manual URL input fallback
  • Auth token expired while polling — 401 triggers silent refresh; poll continues after new token issued
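
Two of the edge cases above, the watchdog re-queue and the duplicate-job check, can be sketched as follows. This is an illustration under assumptions: jobs are plain dicts rather than database rows, `enqueue` is any callable that pushes a job id back onto the queue, and the function names are hypothetical. Whether failed jobs should count as duplicates is not stated in the write-up, so this version matches any existing job.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Dict, List, Optional

STUCK_AFTER = timedelta(minutes=2)  # threshold taken from the edge case above


def requeue_stuck_jobs(jobs: List[Dict], enqueue: Callable[[str], None],
                       now: Optional[datetime] = None) -> List[str]:
    """Re-queue jobs that have sat in 'processing' longer than STUCK_AFTER."""
    now = now or datetime.now(timezone.utc)
    requeued = []
    for job in jobs:
        if job["status"] == "processing" and now - job["updated_at"] > STUCK_AFTER:
            job["status"] = "queued"
            job["updated_at"] = now
            enqueue(job["id"])
            requeued.append(job["id"])
    return requeued


def find_existing_job(jobs: List[Dict], profile_id: str,
                      product_image_url: str) -> Optional[str]:
    """Idempotency check: return the id of a job with the same key, if any."""
    for job in jobs:
        if (job.get("profile_id") == profile_id
                and job.get("product_image_url") == product_image_url):
            return job["id"]
    return None
```

Running the watchdog on a timer that is shorter than the threshold bounds how long a crashed worker can leave a job stranded.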

Error Handling Strategy

  • The worker wraps the entire job in try/except; any unhandled error sets status = "failed" and stores the exception message in the error_message column.
  • FastAPI dependency injection validates auth + profile ownership before any job creation — 403 returned immediately on mismatch.
  • Chrome extension background service worker catches fetch failures and surfaces them in the popup as user-readable messages (not raw status codes).

Impact & Collaboration

  • Full monorepo MVP shipped with 4 apps (web, extension, API, worker) sharing types and config via pnpm workspaces.
  • Async job pipeline decouples slow AI generation from user-facing request — job status is available immediately, result arrives when ready.
  • Chrome extension works across heterogeneous storefronts (Zara, ASOS, H&M) using a layered fallback selector strategy.
  • Redis BRPOP queue ensures FIFO job ordering with zero CPU burn between jobs — horizontally scalable by adding worker containers.