System Design: URL Shortener Service
1. Business Requirements
Functional Requirements
- Users can shorten long URLs to short, unique aliases.
- Redirect to the original URL when a short URL is accessed.
- Track click analytics (e.g., timestamp, referrer, IP, device info).
- User authentication via social login (Google, Facebook, GitHub, etc.).
- Authenticated users can view/manage their created URLs and analytics.
- Support for custom aliases (if available).
- API access for programmatic URL shortening.
Non-Functional Requirements
- 99.9% availability (highly available, fault-tolerant).
- Low latency for redirection (<50ms p99 in-region).
- Scalability to handle millions of URLs and high QPS (queries per second).
- Strong security (rate limiting, input validation, HTTPS, auth).
- Data consistency: strong consistency for URL creation (a short code must never map to two URLs); eventual consistency is acceptable for analytics.
- GDPR compliance (user data privacy, deletion on request).
- Observability (logging, monitoring, alerting).
Out of Scope
- Deep link preview/expansion.
- Malware/phishing detection.
- Monetization (ads, paid plans).
- Browser extensions or mobile apps.
2. Estimation & Back-of-the-Envelope Calculations
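- Expected QPS: 2,000 QPS (peak), 500 QPS (avg). These read as capacity targets: the yearly volumes listed next average only about 3.5 requests/s (roughly 0.3 creations/s plus 3.2 redirects/s).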
- URL Creation: 10M new URLs/year
- Redirection: 10x more frequent than creation (100M/year)
- Analytics Storage: 100M events/year, ~500 bytes/event → ~50GB/year
- Short URL Length: 7 chars (base62: 62^7 ≈ 3.5T possible)
- DB Size: 10M URLs × 200 bytes/row ≈ 2GB/year
- Availability: 99.9% ≈ 8.76 h of allowed downtime/year
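The estimates above can be reproduced with a few lines of arithmetic. The following is only an illustrative sketch: the variable names are made up for this example, and the last value (the chance that a freshly generated random code is already taken after one year of growth) is an extra figure derived from the same inputs, not one stated in the list.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000

# Inputs taken from the estimates above
urls_per_year = 10_000_000
redirects_per_year = 10 * urls_per_year   # 100M
event_bytes = 500
row_bytes = 200

# Average rates implied by the yearly volumes
# (far below the 500 / 2,000 QPS figures, which therefore act as headroom)
avg_create_qps = urls_per_year / SECONDS_PER_YEAR
avg_redirect_qps = redirects_per_year / SECONDS_PER_YEAR

# Storage growth per year, in decimal gigabytes
analytics_gb_per_year = redirects_per_year * event_bytes / 1e9   # 50.0
db_gb_per_year = urls_per_year * row_bytes / 1e9                 # 2.0

# Size of the 7-character base62 keyspace
keyspace = 62 ** 7   # 3,521,614,606,208

# Downtime budget at 99.9% availability
downtime_hours = (1 - 0.999) * 365 * 24

# Chance that one new random code collides once 10M codes exist
collision_chance = urls_per_year / keyspace

print(f"create: {avg_create_qps:.2f}/s, redirect: {avg_redirect_qps:.2f}/s")
print(f"analytics: {analytics_gb_per_year} GB/yr, db: {db_gb_per_year} GB/yr")
print(f"keyspace: {keyspace:,}, downtime: {downtime_hours:.2f} h/yr")
print(f"collision chance per insert: {collision_chance:.2e}")
```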
3. High-Level Design
Key Design Decisions
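- URL Generation: Generate random base62 strings and check for collisions on insert. Alternatively, derive codes from unique IDs (e.g., Hashids-encoded counters or k-sortable IDs), which removes the need for a collision check.
- Database Choice: Use a globally distributed, highly available database: a NoSQL store (e.g., Amazon DynamoDB or Azure Cosmos DB) or a distributed SQL database (e.g., Google Cloud Spanner). Open-source options: CockroachDB (distributed SQL) or FoundationDB (key-value). For analytics: an append-only log feeding a data warehouse (e.g., BigQuery, Snowflake).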
- Authentication: OAuth2 with major social providers.
- Caching: Use Redis/Memcached for hot URLs.
- CDN: For static assets and edge redirection.
Component Diagram
mermaid
flowchart TD
User[User / Client] -->|Shorten URL| API[API Gateway]
API --> Auth[Social Auth Service]
API --> App[URL Shortener Service]
App --> DB[(NoSQL DB)]
App --> Cache[Redis Cache]
App --> Analytics[Analytics/Event Log]
User -->|Redirect| CDN[CDN/Edge]
CDN --> App
App --> DB
App --> Analytics
Data Flow Diagram
mermaid
sequenceDiagram
participant U as User
participant API as API Gateway
participant S as Shortener Service
participant DB as NoSQL DB
participant C as Cache
participant A as Analytics
participant CDN as CDN/Edge
U->>API: POST /shorten (with long URL)
API->>S: Validate & Auth
S->>DB: Store mapping
S->>A: Log event
S->>API: Return short URL
U->>CDN: GET /{short}
CDN->>C: Lookup short URL
alt Cache miss
CDN->>S: Lookup short URL
S->>DB: Fetch mapping
S->>C: Update cache
end
S->>A: Log redirect event
CDN->>U: HTTP 302 Redirect
4. Detailed / Conceptual Design
URL Generation
- Use cryptographically secure random generator for 7-char base62 string.
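- Check for collision in DB. The probability is very low: with 10M codes stored in a 62^7 ≈ 3.5T keyspace, a new random code collides roughly 3 times per million inserts.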
- Optionally, allow custom aliases (validate for uniqueness).
- For sharding, use prefix or hash-based partitioning.
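The generation steps above can be sketched as follows. This is a minimal illustration, not a definitive implementation: `create_short_code`, `random_code`, and `max_attempts` are hypothetical names, and a plain `dict` stands in for the database. A real store would need an atomic insert-if-absent (conditional write) rather than the check-then-insert shown here, which is racy under concurrency.

```python
import secrets
import string
from typing import Dict, Optional

ALPHABET = string.digits + string.ascii_letters  # 62 characters
CODE_LENGTH = 7


def random_code(length: int = CODE_LENGTH) -> str:
    """Return a random base62 string from a cryptographically secure source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def create_short_code(
    store: Dict[str, str],
    long_url: str,
    custom_alias: Optional[str] = None,
    max_attempts: int = 5,  # assumed retry budget, not from the design
) -> str:
    """Store long_url under a custom alias or a fresh random code."""
    if custom_alias is not None:
        # Custom aliases must be unique; reject instead of overwriting.
        if custom_alias in store:
            raise ValueError("alias already taken")
        store[custom_alias] = long_url
        return custom_alias

    # Random codes: retry on the (rare) collision.
    for _ in range(max_attempts):
        code = random_code()
        if code not in store:
            store[code] = long_url
            return code
    raise RuntimeError("could not find a free code")


if __name__ == "__main__":
    db: Dict[str, str] = {}
    print(create_short_code(db, "https://example.com/some/long/path"))
    print(create_short_code(db, "https://example.com/docs", custom_alias="docs"))
```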
Database
- Primary DB: Globally distributed database (e.g., DynamoDB or Cosmos DB; Spanner if a distributed SQL store is preferred).
- Partition key: short URL code
- Attributes: original URL, user ID, created_at, custom alias, etc.
- TTL for expired URLs (optional)
- Analytics: Write events to append-only log (Kafka, Kinesis, or direct to warehouse).
- Cache: Redis/Memcached for hot short codes.
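To make the record layout concrete, here is a rough sketch of one mapping row and of hash-based partitioning on the short code. The field names, `UrlRecord`, and `shard_for` are assumptions for illustration; managed stores such as DynamoDB hash the partition key internally, so an explicit shard function like this would only matter for a self-managed sharded setup.

```python
import hashlib
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UrlRecord:
    short_code: str                     # partition key
    original_url: str
    user_id: Optional[str]              # None for anonymous creations
    created_at: float                   # Unix timestamp
    is_custom_alias: bool = False
    expires_at: Optional[float] = None  # optional TTL

    def is_expired(self, now: Optional[float] = None) -> bool:
        """True once the optional TTL has passed."""
        if self.expires_at is None:
            return False
        current = time.time() if now is None else now
        return current >= self.expires_at


def shard_for(short_code: str, num_shards: int) -> int:
    """Map a short code to a shard with a stable hash.

    Python's built-in hash() is randomized per process for strings,
    so a fixed digest is used instead.
    """
    digest = hashlib.sha256(short_code.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


if __name__ == "__main__":
    rec = UrlRecord("aB3xY9z", "https://example.com", "user-1", time.time())
    print(rec.is_expired(), shard_for(rec.short_code, 16))
```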
Authentication
- Use OAuth2 with social providers (Google, Facebook, GitHub).
- Store user profile, session, and mapping to URLs.
- Fallback if social auth is unavailable (e.g., due to provider restrictions or a missing API subscription): offer email/password authentication:
- Allow users to register and log in using email/password with secure password storage (bcrypt/scrypt/argon2).
- Implement email verification and password reset flows.
- Ensure the fallback method meets the same security and privacy standards as social auth.
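For the email/password fallback, secure password storage might look like the sketch below. bcrypt and argon2 require third-party packages, so this example uses scrypt from Python's standard `hashlib` (available when Python is built against OpenSSL 1.1 or later). The cost parameters and function names are assumptions chosen for illustration and should be tuned for the target hardware.

```python
import hashlib
import hmac
import secrets
from typing import Tuple

# Assumed cost parameters; n=2**14, r=8 needs about 16 MiB per hash.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32}


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest); store both alongside the user record."""
    salt = secrets.token_bytes(16)  # fresh random salt per password
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest and compare in constant time."""
    candidate = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(candidate, expected)


if __name__ == "__main__":
    salt, digest = hash_password("correct horse battery staple")
    print(verify_password("correct horse battery staple", salt, digest))
    print(verify_password("wrong guess", salt, digest))
```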
API Gateway
- Rate limiting, input validation, HTTPS enforcement.
- Routes: /shorten, /{short}, /user/urls, /analytics, /login
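Input validation for the /shorten route could start with a check like the one below. It is a minimal sketch under stated assumptions: `is_valid_long_url` and the 2,048-character limit are hypothetical, and a production validator would likely apply further rules.

```python
from urllib.parse import urlsplit

MAX_URL_LENGTH = 2048  # assumed limit, not specified in the design


def is_valid_long_url(url: str) -> bool:
    """Accept only absolute http(s) URLs with a host and a sane length."""
    if not url or len(url) > MAX_URL_LENGTH:
        return False
    try:
        parts = urlsplit(url)
        host = parts.hostname
    except ValueError:
        # urlsplit raises ValueError on some malformed inputs (e.g. bad IPv6).
        return False
    # Rejecting other schemes blocks inputs such as javascript: or data: URLs.
    return parts.scheme in ("http", "https") and bool(host)


if __name__ == "__main__":
    for candidate in ("https://example.com/a?b=1", "javascript:alert(1)", "http://"):
        print(candidate, is_valid_long_url(candidate))
```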
Redirection
- Use CDN/edge for low-latency global redirects.
- Fallback to origin if not cached.
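The lookup path (cache first, origin on a miss) follows the cache-aside pattern. The sketch below uses plain dictionaries in place of Redis and the database; `resolve` and its return shape are assumptions for illustration.

```python
from typing import Dict, Optional, Tuple


def resolve(
    short_code: str,
    cache: Dict[str, str],
    db: Dict[str, str],
) -> Tuple[int, Optional[str]]:
    """Return (HTTP status, redirect target) for a short code."""
    target = cache.get(short_code)
    if target is None:
        # Cache miss: fall back to the origin database.
        target = db.get(short_code)
        if target is None:
            return 404, None
        # Populate the cache so the next lookup is served without the DB.
        cache[short_code] = target
    return 302, target


if __name__ == "__main__":
    db = {"abc1234": "https://example.com/landing"}
    cache: Dict[str, str] = {}
    print(resolve("abc1234", cache, db))  # miss, then cached
    print(resolve("abc1234", cache, db))  # served from cache
    print(resolve("missing", cache, db))
```

A 302 keeps each click flowing through the service, which suits the analytics requirement; a 301 may be cached by browsers and hide repeat visits.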
Analytics
- Log each redirect event (timestamp, IP, referrer, user agent).
- Batch process logs for reporting.
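One way to keep analytics writes off the redirect hot path is to buffer events and hand them to the log in batches. The sketch below is an assumption-laden illustration: `EventBatcher`, the `sink` callback, and the batch size are hypothetical, and the sink could be a Kafka or Kinesis producer in practice.

```python
import time
from typing import Callable, List, Optional


class EventBatcher:
    """Buffer redirect events and flush them to a sink in batches."""

    def __init__(self, sink: Callable[[List[dict]], None], max_batch: int = 100):
        self._sink = sink
        self._max_batch = max_batch
        self._buffer: List[dict] = []

    def record(
        self,
        short_code: str,
        ip: str,
        referrer: Optional[str],
        user_agent: str,
        timestamp: Optional[float] = None,
    ) -> None:
        self._buffer.append(
            {
                "short_code": short_code,
                "timestamp": time.time() if timestamp is None else timestamp,
                "ip": ip,
                "referrer": referrer,
                "user_agent": user_agent,
            }
        )
        # Flush as soon as the batch is full.
        if len(self._buffer) >= self._max_batch:
            self.flush()

    def flush(self) -> None:
        """Send any buffered events; call this on a timer and at shutdown."""
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._sink(batch)


if __name__ == "__main__":
    batcher = EventBatcher(sink=lambda b: print(f"flushed {len(b)} events"), max_batch=2)
    for i in range(5):
        batcher.record("abc1234", "203.0.113.7", None, "curl/8.0")
    batcher.flush()
```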
Security
- Input validation, XSS/CSRF protection, HTTPS everywhere.
- Rate limiting per IP/user.
- Secure storage of secrets (OAuth, DB creds).
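Rate limiting per IP or user is commonly done with a token bucket. The design does not name an algorithm, so the following is one possible sketch; `TokenBucketLimiter` and its parameters are hypothetical, and state is kept in process memory, whereas a multi-instance deployment would need shared state (for example in Redis).

```python
import time
from typing import Dict, Optional, Tuple


class TokenBucketLimiter:
    """Allow `burst` requests at once, refilled at `rate_per_sec` per key."""

    def __init__(self, rate_per_sec: float, burst: int):
        self.rate = rate_per_sec
        self.burst = burst
        # key (IP or user ID) -> (tokens remaining, time of last update)
        self._state: Dict[str, Tuple[float, float]] = {}

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        tokens, last = self._state.get(key, (float(self.burst), now))
        # Refill according to elapsed time, capped at the burst size.
        tokens = min(float(self.burst), tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self._state[key] = (tokens - 1.0, now)
            return True
        self._state[key] = (tokens, now)
        return False


if __name__ == "__main__":
    limiter = TokenBucketLimiter(rate_per_sec=1.0, burst=2)
    print([limiter.allow("203.0.113.7", now=0.0) for _ in range(3)])
```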
Observability
- Centralized logging, metrics, distributed tracing.
- Alerting on error rates, latency, availability.
5. Bottlenecks & Refinement
Potential Bottlenecks
- DB Hotspots: Popular short codes may create partition hotspots. Mitigate primarily with caching, since sharding alone cannot spread the load for a single hot key.
- Cache Eviction: High churn may cause cache misses. Use LRU and tune size.
- Analytics Write Load: High QPS may overload analytics pipeline. Use batching and backpressure.
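- OAuth Rate Limits: Social auth providers may throttle requests. Retry with backoff and offer the fallback login method.
- CDN Propagation: Cache invalidation at the edge can lag, so updated or deleted links may keep redirecting for a short time. Use short TTLs on cached redirects.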
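The LRU policy mentioned for cache eviction can be illustrated with a small in-process cache. This is a sketch of the eviction behavior only; `LRUCache` is a hypothetical class, and in practice Redis would provide LRU eviction through its `maxmemory` policies rather than application code.

```python
from collections import OrderedDict
from typing import Optional


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._items: "OrderedDict[str, str]" = OrderedDict()

    def get(self, key: str) -> Optional[str]:
        if key not in self._items:
            return None
        # Mark as most recently used.
        self._items.move_to_end(key)
        return self._items[key]

    def put(self, key: str, value: str) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            # Drop the entry at the front, i.e. the least recently used one.
            self._items.popitem(last=False)


if __name__ == "__main__":
    cache = LRUCache(capacity=2)
    cache.put("a", "https://example.com/a")
    cache.put("b", "https://example.com/b")
    cache.get("a")                            # "a" is now most recent
    cache.put("c", "https://example.com/c")   # evicts "b"
    print(cache.get("a"), cache.get("b"), cache.get("c"))
```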
Refinement Strategies
- Use multi-region DB replication for failover.
- Pre-warm cache for trending URLs.
- Use async/batch processing for analytics.
- Monitor and autoscale API and DB layers.
- Regularly review security and compliance.