Tech Stack

The technology choices for every layer of the project, with the reasoning behind each. The stack is deliberately conservative: well-understood tools, minimal moving parts, no premature complexity.

This document covers what's used, why, and what the upgrade paths look like. For repository layout, see REPOSITORIES.md. For deployment topology, see OPERATIONS.md.

Stack philosophy

Five principles guide every technology choice.

Boring beats novel. Postgres beats the latest database. Cron beats a workflow engine. Cloudflare beats DIY edge. Boring tools have fewer surprises and longer support windows.
Open beats proprietary. Where there's a viable open-source alternative, use it. Where there isn't (e.g., Cloudflare's edge), pick the vendor whose pricing and trajectory align with infrastructure use, not consumer use.
Operating cost matters. Every line item must be defensible at year-1 budget and reasonable at year-5 budget. Solutions that work at scale but cost six figures at year 1 are wrong for this project's curve.
No vendor lock-in for the core data. Postgres, R2 (which is S3-compatible), and standard formats. The dataset can be migrated to a different infrastructure provider without rewriting application logic.
Standards over frameworks. OpenAPI, JSON Schema, JWT, RFC 9457 (the successor to RFC 7807), CC-BY 4.0. These outlast specific implementations.

Front-end stack (consumer site + contributor PWA)

Framework: Next.js 16 (App Router)

The reference site uses Next.js. The choice and the specific configuration:

App Router. Server-first rendering. Streaming where useful. Route groups for organizing locales and verticals.
React 19.2. Concurrent rendering, transitions, the use() hook for async data, partial pre-rendering.
TypeScript strict mode. Every file. No any. No type assertions without justification.
Static at build time where possible. Airline pages, airport pages, scenario pages, knowledge pages — all statically generated with revalidation (ISR) when the underlying data changes.
Server Components for content; Client Components for interaction. Default to server.

Why Next.js, not Astro / Remix / SvelteKit / Nuxt:

Maturity of the ecosystem (third-party libraries, community)
ISR (Incremental Static Regeneration) maps cleanly to the data-then-revalidate pattern
Vercel's hosting is the simplest deploy for Next.js; the project can also self-host on Cloudflare or elsewhere without changing application code, only the deployment configuration

Why not just SSG with no server components:

ISR + on-demand revalidation lets the project update specific pages within seconds of a data change, without a full rebuild. At scale this matters.
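
As a rough illustration of the idea, a data change can be mapped to the small set of page paths it affects, so that only those pages are regenerated. The sketch below is not the project's code: the `DataChange` shape, the locale list, the route patterns, and `pathsToRevalidate` are all hypothetical. In a Next.js app, each returned path would typically be passed to `revalidatePath` from `next/cache`.

```typescript
// Hypothetical change event emitted when an edit is approved.
type DataChange =
  | { entity: "airline"; iataCode: string }
  | { entity: "airport"; iataCode: string };

// Locales assumed for illustration only.
const LOCALES = ["en", "de"] as const;

// Map one change to the page paths that need regenerating.
// The route patterns are assumptions, not the site's actual routes.
function pathsToRevalidate(change: DataChange): string[] {
  const segment = change.entity === "airline" ? "airlines" : "airports";
  const code = change.iataCode.toLowerCase();
  return LOCALES.map((locale) => `/${locale}/${segment}/${code}`);
}

console.log(pathsToRevalidate({ entity: "airline", iataCode: "LH" }));
// [ '/en/airlines/lh', '/de/airlines/lh' ]
```

Keeping the mapping a plain function makes it easy to unit-test which pages a given edit touches, independently of the framework call that triggers regeneration.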

Styling: Tailwind CSS v4 + design tokens

Utility-first, with custom design tokens for the editorial-utilitarian aesthetic.
CSS variables for theming (light/dark, contrast modes).
No CSS-in-JS runtime (server-rendered class strings are faster).
shadcn/ui as the component base, restyled to the design system.

See DESIGN-SYSTEM.md for the design language.

State management

Server Components for data fetching where possible (no client-side state needed).
React Context for theme and language preference (read once, persisted in localStorage).
TanStack Query (React Query) for client-side data that needs caching and revalidation (the bag-fit tool, the compensation tool, search).
No Redux. No Zustand. No MobX.

Forms

React Hook Form + Zod for validation.
Server actions for submission.
The contributor app uses Conform for richer form workflows (multi-step, draft saving).

Internationalization

next-intl for translation handling.
Translations stored in packages/i18n/messages/{locale}.json.
Crowdin or Weblate for the translation workflow.
12 languages at launch, expanding through community contribution.

Performance budget

Enforced via Lighthouse CI on every PR:

LCP under 1.5s on Slow 3G mobile
TBT under 100ms
CLS under 0.05
TTI under 1s on a 3-year-old Android
JS bundle under 100kB gzipped for any page

PRs that regress these thresholds are blocked.
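
The gate can be pictured as a simple comparison of measured values against the thresholds listed above. This is a minimal sketch of that logic, not Lighthouse CI configuration; the metric key names and `budgetViolations` are invented for the example.

```typescript
// Thresholds copied from the budget list; units are ms, unitless (CLS), and kB.
const BUDGET = {
  lcpMs: 1500,
  tbtMs: 100,
  cls: 0.05,
  ttiMs: 1000,
  jsGzipKb: 100,
} as const;

type Metric = keyof typeof BUDGET;
type Measurements = Record<Metric, number>;

// Return the metrics at or over budget ("under X" is read as strictly less than X).
function budgetViolations(measured: Measurements): Metric[] {
  return (Object.keys(BUDGET) as Metric[]).filter(
    (metric) => measured[metric] >= BUDGET[metric],
  );
}

const violations = budgetViolations({
  lcpMs: 1420,
  tbtMs: 130,
  cls: 0.02,
  ttiMs: 950,
  jsGzipKb: 100,
});
console.log(violations); // [ 'tbtMs', 'jsGzipKb' ]
```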

Build and dev tooling

pnpm for package management (monorepo-friendly, fast)
Turborepo for the monorepo build orchestration (incremental builds, remote cache)
Vitest for unit tests
Playwright for end-to-end tests
Biome for linting and formatting (replaces ESLint + Prettier with a single tool, faster)
TypeScript as the type system

API and back-end stack

API server

Hono for the HTTP framework. Lightweight, fast, runs everywhere (Cloudflare Workers, Node, Bun, Deno). The choice trades the React community's familiarity for a server-first design suited to the API workload.
TypeScript end-to-end. Shared types between API and consumers via the flighthelp/schema package.
OpenAPI 3.1 generated from the route definitions. Single source of truth.
GraphQL via GraphQL Yoga (the graphql-yoga package). Reuses the same resolvers.

Why not Express, Fastify, or Bun's native HTTP:

Hono is faster than Express and competitive with Fastify under typical workloads, and it is designed to run on edge runtimes as well as Node.
Bun's native HTTP locks to Bun; Hono is portable.

Database

PostgreSQL 17 on Neon (or Supabase, or self-hosted). Branching, point-in-time recovery, no infrastructure management.
Drizzle ORM for type-safe queries. Generates types from the schema; types align with flighthelp/schema.
Read replicas as load grows. Neon makes this nearly free.

Why Postgres, not PlanetScale / MongoDB / CockroachDB / DynamoDB:

The dataset is relational. Airlines have fare classes, which have baggage rules. Querying across these is natural in SQL.
Postgres's JSONB handles the localizable strings, hours objects, and source arrays without a separate document store.
Postgres has full-text search as a fallback if Meilisearch is down.
The eventual scale (low billions of API calls/month, low millions of writes/month) is well within Postgres territory.

Search

Meilisearch for full-text search across all entities.
Updated incrementally on every approved edit.
Hosted on Meilisearch Cloud initially; self-hosted if cost justifies.
Sub-50ms p99 latency at expected scale.

Why Meilisearch, not Elasticsearch / Algolia / Postgres FTS:

Algolia is excellent but priced for SaaS, not infrastructure
Elasticsearch is operationally heavy; the project is too small to dedicate ops time
Meilisearch hits the right tradeoff: fast, simple, open-source, reasonable hosted pricing

Object storage

Cloudflare R2 for photos, bulk dataset snapshots, and any other binary content.
S3-compatible API. Zero egress fees are the key economic advantage.
Self-managed backup to Backblaze B2 as a second copy.

Edge and CDN

Cloudflare for the CDN, DDoS protection, WAF, and edge caching.
Cloudflare Workers for edge logic (rate limiting, lightweight auth checks, redirect logic).
The CDN dramatically reduces origin load for the API and the consumer site.

Background jobs

Cron-based jobs for scheduled work (scrapers, daily snapshots, reputation recompute).
Inngest for event-driven jobs (post-edit webhooks, badge evaluations).
Fly.io for long-running containers (scrapers, batch jobs).

Why not Temporal, Sidekiq, BullMQ, AWS Step Functions:

The scale and complexity don't justify it. Cron + Inngest covers >95% of the workload.

Email

Postmark or Amazon SES for transactional email (auth magic links, edit notifications).
Listmonk (self-hosted) for newsletter.

Authentication

Auth.js (NextAuth v5) in the contributor app.
Magic-link only. No passwords. No social logins required.
JWT sessions, short-lived.
The public consumer site has no authentication (no signup needed for read-only access).
API uses Bearer tokens, generated through the contributor app or a separate developer dashboard.

Monitoring and observability

Sentry for error tracking.
Plausible (self-hosted) for site analytics — no third-party tracking.
Grafana Cloud (or self-hosted) for metrics and dashboards.
OpenTelemetry for tracing.
Better Stack (or Cloudflare Status) for the public status page.

Hosting and deployment

Vercel for the front-end apps (or Cloudflare Pages as alternative).
Fly.io for the API server and the scraper runners.
Neon for Postgres.
Cloudflare R2 for storage.

This stack runs the entire project at year 1 for ~$1500–$2500/month and scales to ~$8000–$15000/month at year 5.

Self-hosting fallback: Every component above has a self-hosted alternative documented. If any vendor goes hostile (pricing, terms, acquisition), the project can migrate to self-hosted infrastructure within 60 days for each component.

Engine and library SDKs

Canonical implementation: TypeScript

The TypeScript implementation in flighthelp/rules-engine is the canonical one. Other languages are transpiled or hand-maintained in parallel.

TypeScript with strict null checks.
Zero dependencies on the runtime side. No lodash. No dayjs. The engine must be embeddable in any environment, including AI inference contexts with restrictive sandboxes.
Pure functions throughout. Same input always produces same output.
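
To make the "pure, zero-dependency" constraint concrete, here is a sketch of what a rule function in that style might look like. The types, the `checkCabinBag` name, and the example limits are all hypothetical; the real shapes would come from the schema package.

```typescript
// Hypothetical shapes for illustration.
interface Dimensions { lengthCm: number; widthCm: number; heightCm: number }
interface CabinBagRule { maxDimensions: Dimensions; maxWeightKg: number }
interface Bag { dimensions: Dimensions; weightKg: number }
interface FitResult { fits: boolean; reasons: string[] }

// Sort the three sides (largest first) so bag orientation does not matter.
function sortedSides(d: Dimensions): number[] {
  return [d.lengthCm, d.widthCm, d.heightCm].sort((a, b) => b - a);
}

// Pure: no clock, no I/O, no mutation of inputs, no third-party imports.
function checkCabinBag(bag: Bag, rule: CabinBagRule): FitResult {
  const reasons: string[] = [];
  const bagSides = sortedSides(bag.dimensions);
  const maxSides = sortedSides(rule.maxDimensions);
  if (bagSides.some((side, i) => side > maxSides[i])) reasons.push("dimensions");
  if (bag.weightKg > rule.maxWeightKg) reasons.push("weight");
  return { fits: reasons.length === 0, reasons };
}

// Made-up limits, not any airline's actual policy.
const exampleRule: CabinBagRule = {
  maxDimensions: { lengthCm: 55, widthCm: 40, heightCm: 23 },
  maxWeightKg: 8,
};

console.log(
  checkCabinBag(
    { dimensions: { lengthCm: 23, widthCm: 55, heightCm: 40 }, weightKg: 7 },
    exampleRule,
  ),
); // { fits: true, reasons: [] }
```

Because the function depends only on its arguments, the same input and expected output can be expressed as language-neutral data and replayed against any SDK.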

Cross-language SDKs

Python: Pydantic v2 models, generated from the canonical TypeScript types. Hand-translated rule logic with shared fixture tests.
PHP: Maintained by hand from the TypeScript reference. Strong type hints.
Go: Generated types from the JSON Schema; hand-written rule logic.
Rust: Generated structs; hand-written rule logic.
Java: Generated records; hand-written rule logic.
Swift: Generated Codable structs; hand-written rule logic.

Each SDK runs the shared fixture tests on every release. Cross-language agreement on every test case is enforced by CI.
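
One plausible shape for shared fixtures is a JSON list of named input/expected pairs that every SDK loads and replays. The sketch below assumes that format; the `Fixture` interface, `failingFixtures`, and the toy weight rule are made up for illustration and are not the project's actual fixture schema.

```typescript
// Hypothetical fixture shape: language-neutral JSON that every SDK can load.
interface Fixture<I, O> { name: string; input: I; expected: O }

// Run every fixture through an implementation and collect the names that disagree.
// Comparison is by JSON text, which is enough for primitives and for objects
// serialized with the same key order.
function failingFixtures<I, O>(
  fixtures: Fixture<I, O>[],
  implementation: (input: I) => O,
): string[] {
  return fixtures
    .filter((f) => JSON.stringify(implementation(f.input)) !== JSON.stringify(f.expected))
    .map((f) => f.name);
}

// A toy rule checked against two fixtures.
type WeightInput = { weightKg: number; limitKg: number };
const fixturesJson = `[
  {"name": "under limit", "input": {"weightKg": 7, "limitKg": 8}, "expected": true},
  {"name": "over limit",  "input": {"weightKg": 9, "limitKg": 8}, "expected": false}
]`;
const fixtures = JSON.parse(fixturesJson) as Fixture<WeightInput, boolean>[];
const withinLimit = (i: WeightInput): boolean => i.weightKg <= i.limitKg;

console.log(failingFixtures(fixtures, withinLimit)); // []
```

A CI job in each SDK repository could fail the release whenever this list is non-empty.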

Scraper infrastructure

Runtime: Playwright (for JS-rendered pages) + fetch (for static pages)

Playwright runs in containers on Fly.io.
fetch + cheerio for static-page parsing where Playwright is overkill.
Per-origin rate limiting: typically 1 request per minute.
User-Agent identifies the project with a link to ROBOTS-POLICY.md.
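
Per-origin rate limiting can be sketched as a map from origin to the earliest time the next request is allowed. This is an illustrative, in-memory version with an injected clock; `OriginRateLimiter` and the example hostnames are hypothetical, and a real deployment spread across several containers would need shared state.

```typescript
// Tracks, per origin, the earliest time the next request may be sent.
class OriginRateLimiter {
  private readonly minIntervalMs: number;
  private readonly nextAllowedAt = new Map<string, number>();

  constructor(minIntervalMs: number) {
    this.minIntervalMs = minIntervalMs;
  }

  // Returns how long to wait (ms) before requesting `url`, and reserves that slot.
  reserve(url: string, nowMs: number): number {
    const origin = new URL(url).origin;
    const allowedAt = Math.max(this.nextAllowedAt.get(origin) ?? 0, nowMs);
    this.nextAllowedAt.set(origin, allowedAt + this.minIntervalMs);
    return allowedAt - nowMs;
  }
}

const limiter = new OriginRateLimiter(60_000); // 1 request per minute
console.log(limiter.reserve("https://airline.example/baggage", 0)); // 0
console.log(limiter.reserve("https://airline.example/fees", 1_000)); // 59000
console.log(limiter.reserve("https://other.example/baggage", 1_000)); // 0
```

Passing the current time in as an argument, rather than reading the clock inside the class, keeps the limiter deterministic and easy to test.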

Storage and diffing

Each scraper produces a structured output that's diffed against the previous run.
Diffs above noise threshold (configurable per field type) generate a moderation queue item.
Raw scraper output (HTML, JSON) is archived for 90 days for forensic purposes.
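
The diffing step might look roughly like the following: compare two runs field by field and report only changes that exceed a noise threshold for the field's type. The field types, threshold values, and function names here are assumptions made for the sketch.

```typescript
type FieldType = "money" | "weight" | "text";
interface Field { type: FieldType; value: number | string }
type Snapshot = Record<string, Field>;

// Hypothetical relative-change thresholds per field type (applied to numeric values).
const NOISE: Record<FieldType, number> = { money: 0.01, weight: 0, text: 0 };

function isSignificant(before: Field | undefined, after: Field | undefined): boolean {
  if (!before || !after) return true; // field appeared or disappeared
  if (typeof before.value === "number" && typeof after.value === "number") {
    const scale = Math.max(Math.abs(before.value), Math.abs(after.value));
    if (scale === 0) return false; // both values are zero
    return Math.abs(after.value - before.value) / scale > NOISE[after.type];
  }
  return before.value !== after.value; // non-numeric values: any change counts
}

// Names of fields whose change should become a moderation queue item.
function significantChanges(previous: Snapshot, current: Snapshot): string[] {
  const names = Array.from(new Set(Object.keys(previous).concat(Object.keys(current))));
  return names.filter((name) => isSignificant(previous[name], current[name])).sort();
}

const previousRun: Snapshot = {
  fee: { type: "money", value: 50 },
  maxKg: { type: "weight", value: 8 },
  note: { type: "text", value: "One personal item allowed" },
};
const currentRun: Snapshot = {
  fee: { type: "money", value: 50.2 }, // 0.4% change: below the money threshold
  maxKg: { type: "weight", value: 10 },
  note: { type: "text", value: "One personal item allowed" },
};
console.log(significantChanges(previousRun, currentRun)); // [ 'maxKg' ]
```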

Failure handling

A failing scraper retries with exponential backoff up to 3 attempts.
After 3 attempts, the scraper is marked unhealthy and a community-fix issue is auto-created.
Unhealthy scrapers don't block other scrapers; the system degrades gracefully.
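
A minimal sketch of the retry policy described above, assuming a 1-second base delay that doubles between attempts (the base delay is an assumption; only the 3-attempt limit comes from the text). `runWithRetries` and the `Outcome` type are hypothetical names. A failure is returned as a value rather than thrown, so one unhealthy scraper cannot take down the caller.

```typescript
// Delay before each retry: base, 2x base, 4x base, ...
function backoffDelaysMs(maxAttempts: number, baseMs: number): number[] {
  return Array.from({ length: Math.max(maxAttempts - 1, 0) }, (_, i) => baseMs * 2 ** i);
}

type Outcome<T> =
  | { status: "ok"; value: T }
  | { status: "unhealthy"; lastError: unknown };

async function runWithRetries<T>(
  task: () => Promise<T>,
  maxAttempts = 3,
  baseMs = 1000,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<Outcome<T>> {
  const delays = backoffDelaysMs(maxAttempts, baseMs);
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return { status: "ok", value: await task() };
    } catch (error) {
      lastError = error;
      const delay = delays[attempt];
      if (delay !== undefined) await sleep(delay); // no sleep after the final attempt
    }
  }
  return { status: "unhealthy", lastError };
}

console.log(backoffDelaysMs(3, 1000)); // [ 1000, 2000 ]
```

The caller would then mark the scraper unhealthy and open the community-fix issue when the outcome's status is `"unhealthy"`.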

Security

Application security

All input validated against schemas at the API boundary.
Output encoded for the rendering context (HTML, JSON, CSV).
CSRF tokens on all state-changing endpoints in the contributor app.
Content Security Policy headers on the consumer site.
Subresource Integrity for any third-party scripts (there are very few).
HTTP Strict Transport Security with preload.
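
For illustration, the two header-based controls above could be assembled like this. The CSP directives shown are placeholders rather than the site's actual policy, and `serializeCsp` is a hypothetical helper. The HSTS value reflects the preload list's requirements: a max-age of at least one year, plus `includeSubDomains` and `preload`.

```typescript
// Serialize a directive map into a Content-Security-Policy header value.
function serializeCsp(directives: Record<string, string[]>): string {
  return Object.entries(directives)
    .map(([name, sources]) => `${name} ${sources.join(" ")}`)
    .join("; ");
}

// Placeholder policy for illustration only.
const exampleCsp = serializeCsp({
  "default-src": ["'self'"],
  "object-src": ["'none'"],
  "base-uri": ["'self'"],
});

// Two years (63072000 seconds) is a common max-age choice for preloaded sites.
const securityHeaders: Record<string, string> = {
  "Content-Security-Policy": exampleCsp,
  "Strict-Transport-Security": "max-age=63072000; includeSubDomains; preload",
};

console.log(securityHeaders["Content-Security-Policy"]);
// default-src 'self'; object-src 'none'; base-uri 'self'
```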

Infrastructure security

Secrets stored in Vault or 1Password Teams (depending on team-size phase). No secrets or environment files committed to git.
Database access restricted to the API service IPs. Direct admin access requires VPN + 2FA + audit logging.
API key management through the contributor app; keys are stored only as hashes, and at auth time the presented key is hashed and compared against the stored hash.
Backup encryption at rest with keys held separately from the data.
Quarterly security review by a paid contractor.
Bug bounty program through HackerOne or similar (starting in year 2 or later).
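
The hashed-at-rest API key scheme can be sketched with Node's built-in `node:crypto` module. The choice of SHA-256 is an assumption (the text only says keys are hashed); a fast hash is generally considered acceptable for long, randomly generated keys, unlike user-chosen passwords. The function names are hypothetical.

```typescript
import { createHash, randomBytes, timingSafeEqual } from "node:crypto";

// Hash a key for storage or comparison (SHA-256 is assumed here).
function hashApiKey(key: string): Buffer {
  return createHash("sha256").update(key, "utf8").digest();
}

// Issue a key: show the plaintext to the user once, persist only the hash.
function issueApiKey(): { plaintext: string; storedHash: Buffer } {
  const plaintext = randomBytes(32).toString("base64url");
  return { plaintext, storedHash: hashApiKey(plaintext) };
}

// Verify by hashing the presented key and comparing in constant time.
function verifyApiKey(presented: string, storedHash: Buffer): boolean {
  const candidate = hashApiKey(presented);
  return candidate.length === storedHash.length && timingSafeEqual(candidate, storedHash);
}

const issued = issueApiKey();
console.log(verifyApiKey(issued.plaintext, issued.storedHash)); // true
console.log(verifyApiKey("not-the-key", issued.storedHash)); // false
```

With this layout, a leaked database dump exposes hashes only, and the plaintext key never needs to be stored server-side after issuance.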

Privacy security

Personal data minimization (see LEGAL.md).
No third-party analytics that could leak contributor data.
EXIF stripping on every uploaded photo.
Geolocation rounded to airport-level precision.

Standards followed

The project follows open standards rather than inventing its own:

JSON Schema 2020-12 for data definitions
OpenAPI 3.1 for API specification
RFC 9457 (which obsoletes RFC 7807) for HTTP problem details (error responses)
RFC 3339 / ISO 8601 for timestamps
ISO 4217 for currency codes
ISO 3166-1 for country codes
ISO 639-1 / IETF BCP 47 for language tags
IANA timezone database for time zones
JWT (RFC 7519) for authentication tokens
HTTP Message Signatures (RFC 9421) for webhook verification (alongside HMAC)
Webhook standards following the emerging webhook-best-practices conventions
CC-BY 4.0, MIT, AGPL-3.0, CC0 for licensing
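
As one concrete example of what following these standards looks like, an error response in the problem-details format (RFC 7807 and its successor RFC 9457) is a JSON object served as `application/problem+json`. The sketch below assumes a generic response shape; the `problemResponse` helper, the type URI, and the `retryAfterSeconds` extension member are hypothetical.

```typescript
// Core members defined by the problem-details format.
interface ProblemDetails {
  type: string; // URI identifying the problem type
  title: string; // short, human-readable summary
  status: number; // HTTP status code
  detail?: string; // explanation specific to this occurrence
  instance?: string; // URI identifying this occurrence
  [extension: string]: unknown; // extension members are permitted
}

interface HttpResponse {
  status: number;
  headers: Record<string, string>;
  body: string;
}

function problemResponse(problem: ProblemDetails): HttpResponse {
  return {
    status: problem.status,
    headers: { "Content-Type": "application/problem+json" },
    body: JSON.stringify(problem),
  };
}

const response = problemResponse({
  type: "https://example.com/problems/rate-limited", // placeholder URI
  title: "Too Many Requests",
  status: 429,
  detail: "Request quota exceeded for this API key.",
  retryAfterSeconds: 60, // hypothetical extension member
});
console.log(response.status, response.headers["Content-Type"]);
// 429 application/problem+json
```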

Upgrade strategy

The stack will need upgrades. The strategy:

Conservative on the database. Postgres major-version upgrades happen during scheduled maintenance windows with full rollback procedures.
Aggressive on the front-end framework. Next.js and React versions track current releases within one quarter, after community testing.
Patient on the engine. SDK version stability matters more than chasing language features.
Tested on dependencies. Every dependency upgrade runs the full test suite. Major updates that break tests don't ship.

When a major dependency goes hostile (license change, vendor sells out, abandonment), the project commits to migrating to a vetted alternative within 60 days of detection. This commitment is in the engineering operations playbook.