File Storage Service System Design

1. Business Requirements

Functional Requirements

  • User registration and authentication (users, admins)
  • Upload, download, and delete files (documents, images, videos, etc.)
  • Organize files into folders/directories
  • File sharing (public/private links, user-to-user sharing)
  • Versioning and history for files
  • Search and filter files (by name, type, date, etc.)
  • Alerts/notifications (upload success/failure, quota limits, shared file access)
  • Mobile-ready responsive UI and API
  • Role-based access control
  • API for programmatic file operations

Non-Functional Requirements

  • 99.9% availability (max ~8.76 hours downtime/year)
  • Scalability to handle large files and high concurrency
  • Secure data storage and access control (encryption at rest and in transit)
  • Fast response times (<300 ms for most API requests, excluding file transfer time)
  • Audit logging and monitoring
  • Backup and disaster recovery
  • GDPR/data privacy compliance
  • Mobile responsiveness

Out of Scope

  • Built-in document editing/collaboration (e.g., Google Docs)
  • In-app media playback/preview (unless specified)
  • Integration with third-party storage providers

2. Estimation & Back-of-the-Envelope Calculations

  • Users: 50,000
  • Files: 10M (average file size: 2 MB)
  • Daily transactions: ~100,000 (uploads, downloads, deletions, alerts)
  • Peak concurrent users: ~2,000
  • Data size:
    • Files: 10M × 2 MB = 20 TB
    • Metadata: 10M × 0.5 KB ≈ 5 GB
    • User data: 50,000 × 2 KB ≈ 100 MB
    • Audit logs: 100M entries × 0.2 KB ≈ 20 GB
    • Total DB size: ~25 GB (metadata + user data + audit logs; excluding backups and file storage)
  • Availability:
    • 99.9% = 8.76 hours/year downtime max
    • Use managed DB, multi-AZ deployment, health checks, auto-scaling
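
The arithmetic above can be reproduced with a short script. This is only a sketch: it assumes decimal units (1 KB = 1,000 bytes), takes the audit-log count of 100M entries as given, and adds one derived figure that is not in the list, the average transaction rate implied by ~100,000 transactions per day.

```python
# Back-of-the-envelope sizing using the figures listed above.
# Assumption: decimal units (1 KB = 1,000 bytes).
KB, MB, GB, TB = 10**3, 10**6, 10**9, 10**12

users = 50_000
files = 10_000_000
audit_entries = 100_000_000      # count taken from the estimate above
daily_transactions = 100_000

file_bytes = files * 2 * MB              # average file size: 2 MB
metadata_bytes = files * KB // 2         # 0.5 KB of metadata per file
user_bytes = users * 2 * KB              # 2 KB per user
audit_bytes = audit_entries * KB // 5    # 0.2 KB per audit entry
db_bytes = metadata_bytes + user_bytes + audit_bytes

avg_tps = daily_transactions / 86_400    # 86,400 seconds per day
downtime_hours = (1 - 0.999) * 365 * 24  # yearly budget at 99.9% availability

print(f"file storage  : {file_bytes / TB:.1f} TB")
print(f"metadata      : {metadata_bytes / GB:.1f} GB")
print(f"user data     : {user_bytes / MB:.0f} MB")
print(f"audit logs    : {audit_bytes / GB:.1f} GB")
print(f"DB total      : {db_bytes / GB:.1f} GB")
print(f"avg load      : {avg_tps:.2f} transactions/s")
print(f"downtime/year : {downtime_hours:.2f} hours")
```

The average load works out to roughly one transaction per second, so sizing is more likely to be driven by the ~2,000 peak concurrent users and by file transfer bandwidth than by the daily average.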

3. High Level Design (Mermaid Diagrams)

Component Diagram

mermaid
flowchart LR
  User["User (Web/Mobile)"]
  LB[Load Balancer]
  App[Application Server]
  DB[(Metadata DB)]
  Storage["Object Storage (Files)"]
  Cache["Cache (Redis)"]
  Alert[Alert/Notification Service]

  User --> LB --> App
  App --> DB
  App --> Storage
  App --> Cache
  App --> Alert

Data Flow Diagram

mermaid
sequenceDiagram
  participant U as User
  participant A as App Server
  participant D as Metadata DB
  participant S as Object Storage
  participant C as Cache
  participant L as Alert Service

  U->>A: Upload File
  A->>S: Store File
  S-->>A: Success/Fail
  A->>D: Create Metadata Record
  D-->>A: Success/Fail
  A->>L: Send Upload Alert
  A-->>U: Response

Key Design Decisions

  • Database: Relational DB (e.g., PostgreSQL) for metadata, strong consistency
  • Object Storage: For files (e.g., AWS S3, Azure Blob, MinIO)
  • Cache: Redis for fast lookups (file metadata, sessions)
  • Deployment: Cloud-based, multi-AZ, managed services for high availability
  • Alerting/Notifications: Email/SMS/push via third-party service (e.g., Twilio, Firebase)
  • API: REST/GraphQL for file operations
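
One common way to combine the cache and the metadata DB is the cache-aside pattern: read from the cache first, fall back to the DB on a miss, then populate the cache. The sketch below is illustrative only. `TTLCache` is a hypothetical in-memory stand-in for Redis, `fake_db_lookup` stands in for a PostgreSQL query, and the `file:<id>` key format and 60-second TTL are assumptions, not part of the design above.

```python
import time
from typing import Callable, Optional


class TTLCache:
    """Hypothetical in-memory stand-in for Redis: get/set with per-key expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data = {}  # key -> (expires_at, value)

    def get(self, key: str) -> Optional[dict]:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key: str, value: dict, ttl_seconds: float) -> None:
        self._data[key] = (self._clock() + ttl_seconds, value)

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


def get_file_metadata(
    file_id: str,
    cache: TTLCache,
    load_from_db: Callable[[str], Optional[dict]],
    ttl_seconds: float = 60.0,  # assumed TTL
) -> Optional[dict]:
    """Cache-aside read: try the cache, fall back to the DB, then fill the cache."""
    key = f"file:{file_id}"  # assumed key format
    cached = cache.get(key)
    if cached is not None:
        return cached
    row = load_from_db(file_id)
    if row is not None:
        cache.set(key, row, ttl_seconds)
    return row


# Usage: the second read is served from the cache, so the DB is hit once.
db_calls = []


def fake_db_lookup(file_id: str) -> Optional[dict]:
    db_calls.append(file_id)
    return {"id": file_id, "name": "report.pdf", "size": 2_000_000}


cache = TTLCache()
first = get_file_metadata("f1", cache, fake_db_lookup)
second = get_file_metadata("f1", cache, fake_db_lookup)
```

Writes that change metadata (rename, delete, new version) would also need to call `cache.delete` for the affected key, otherwise readers may see stale data until the TTL expires.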

4. Conceptual Design

Entities

  • User: id, name, email, password_hash, role, registration_date, status
  • File: id, user_id, name, path, size, type, status, created_at, updated_at, version
  • Folder: id, user_id, name, parent_id, created_at
  • Share: id, file_id, shared_with_user_id, link, permissions, expires_at
  • Alert: id, user_id, file_id, type (upload/quota/share), message, created_at, status
  • AuditLog: id, user_id, action, entity, entity_id, timestamp
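
As an illustration, two of these entities could be written as Python dataclasses. The field names come from the list above; the field types, the defaults, and the `is_active` helper are assumptions made for this sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class File:
    id: str
    user_id: str
    name: str
    path: str          # assumed to hold the object-storage key
    size: int          # assumed to be in bytes
    type: str
    status: str
    created_at: datetime
    updated_at: datetime
    version: int = 1   # assumed starting version


@dataclass
class Share:
    id: str
    file_id: str
    permissions: str
    shared_with_user_id: Optional[str] = None  # set for user-to-user sharing
    link: Optional[str] = None                 # set for link sharing
    expires_at: Optional[datetime] = None      # assumption: None means no expiry

    def is_active(self, now: datetime) -> bool:
        """Hypothetical helper: a share is usable until its expiry time."""
        return self.expires_at is None or now < self.expires_at


# Usage: a link share that expires after seven days.
created = datetime(2024, 1, 1, tzinfo=timezone.utc)
share = Share(
    id="s1",
    file_id="f1",
    permissions="read",
    link="example-token",
    expires_at=created + timedelta(days=7),
)
still_valid = share.is_active(created + timedelta(days=1))
expired = not share.is_active(created + timedelta(days=8))
```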

Key Flows

  • File Upload:
    1. User uploads file
    2. App stores file in object storage
    3. App creates metadata record in DB
    4. App sends an upload alert (success or failure) to the user
  • File Sharing:
    1. User shares file (generates link or assigns user)
    2. App creates share record and sends alert
  • Alerts:
    • System triggers alerts for upload success/failure, quota, sharing
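
The upload and sharing flows above can be sketched with in-memory stand-ins. Everything named here (`ObjectStore`, `MetadataDB`, `upload_file`, `share_file`, the alert tuples) is hypothetical. The cleanup of the stored object when the metadata write fails is an assumption added for the sketch; the flow above does not say how that case is handled.

```python
import secrets
import uuid
from typing import Optional


class ObjectStore:
    """In-memory stand-in for object storage."""

    def __init__(self) -> None:
        self.blobs = {}

    def put(self, key: str, data: bytes) -> None:
        self.blobs[key] = data

    def delete(self, key: str) -> None:
        self.blobs.pop(key, None)


class MetadataDB:
    """In-memory stand-in for the metadata DB; fail_writes simulates an outage."""

    def __init__(self, fail_writes: bool = False) -> None:
        self.files = {}
        self.shares = {}
        self.fail_writes = fail_writes

    def insert_file(self, record: dict) -> None:
        if self.fail_writes:
            raise RuntimeError("metadata write failed")
        self.files[record["id"]] = record


def upload_file(user_id: str, name: str, data: bytes,
                store: ObjectStore, db: MetadataDB, alerts: list) -> str:
    file_id = str(uuid.uuid4())
    path = f"{user_id}/{file_id}"   # assumed object key layout
    store.put(path, data)           # step 2: store file in object storage
    try:
        db.insert_file({            # step 3: create metadata record
            "id": file_id, "user_id": user_id, "name": name,
            "path": path, "size": len(data), "version": 1,
        })
    except RuntimeError:
        store.delete(path)          # assumption: remove the orphaned object
        alerts.append((user_id, "upload_failed", name))
        raise
    alerts.append((user_id, "upload_succeeded", name))  # step 4: alert user
    return file_id


def share_file(file_id: str, owner_id: str, db: MetadataDB, alerts: list,
               shared_with_user_id: Optional[str] = None) -> dict:
    """Create a share record: a random link, or a direct share to one user."""
    share = {
        "id": str(uuid.uuid4()),
        "file_id": file_id,
        "shared_with_user_id": shared_with_user_id,
        "link": None if shared_with_user_id else secrets.token_urlsafe(32),
    }
    db.shares[share["id"]] = share
    recipient = shared_with_user_id or owner_id  # assumed alert recipient
    alerts.append((recipient, "share_created", file_id))
    return share
```

Using an unguessable token from `secrets.token_urlsafe` for the link keeps public links from being enumerable; the token length of 32 bytes is an arbitrary choice for this sketch.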

Security

  • Role-based access control (RBAC)
  • Input validation, rate limiting
  • Encryption in transit (HTTPS/TLS) and at rest
  • Regular backups and audit logs
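
A minimal sketch of a role-based access check follows. The two roles (`user`, `admin`) come from the functional requirements; the permission names and the rule that regular users may delete only their own files are assumptions for illustration.

```python
# Assumed permission sets per role; the names are illustrative only.
ROLE_PERMISSIONS = {
    "user": {"file:upload", "file:download", "file:share", "file:delete"},
    "admin": {
        "file:upload", "file:download", "file:share", "file:delete",
        "file:delete_any", "audit:read",
    },
}


def has_permission(role: str, permission: str) -> bool:
    """Unknown roles get no permissions (deny by default)."""
    return permission in ROLE_PERMISSIONS.get(role, set())


def can_delete_file(role: str, actor_id: str, owner_id: str) -> bool:
    """Admins may delete any file; users may delete only files they own."""
    if has_permission(role, "file:delete_any"):
        return True
    return has_permission(role, "file:delete") and actor_id == owner_id


# Usage
owner_ok = can_delete_file("user", actor_id="u1", owner_id="u1")
other_denied = not can_delete_file("user", actor_id="u2", owner_id="u1")
admin_ok = can_delete_file("admin", actor_id="a1", owner_id="u1")
```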

5. Bottlenecks and Refinement

Potential Bottlenecks

  • Object storage throughput:
    • Use scalable, distributed storage and CDN for downloads
  • Metadata DB contention:
    • Use read replicas, caching, and DB connection pooling
  • Alert delivery:
    • Use async queues for notifications
  • Large file uploads/downloads:
    • Use chunked uploads/downloads and resumable transfers
  • Single zone or region failure:
    • Deploy across multiple availability zones; add a second region if region-level outages must be tolerated
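
To make the chunked, resumable transfer idea concrete, here is a small in-memory sketch. `UploadSession` is a hypothetical server-side object, and the 8-byte chunk size is only for demonstration; real services typically use chunks of several megabytes. The client resumes after an interruption by asking which chunk indexes are still missing.

```python
def split_chunks(data: bytes, chunk_size: int) -> list:
    """Split a payload into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


class UploadSession:
    """Hypothetical server-side state for one resumable upload."""

    def __init__(self, total_chunks: int) -> None:
        self.total_chunks = total_chunks
        self.received = {}  # chunk index -> bytes

    def put_chunk(self, index: int, chunk: bytes) -> None:
        if not 0 <= index < self.total_chunks:
            raise IndexError(f"chunk index {index} out of range")
        self.received[index] = chunk  # re-sending a chunk is harmless

    def missing(self) -> list:
        """Indexes the client still has to send; used to resume."""
        return [i for i in range(self.total_chunks) if i not in self.received]

    def assemble(self) -> bytes:
        if self.missing():
            raise ValueError("upload incomplete")
        return b"".join(self.received[i] for i in range(self.total_chunks))


# Usage: send one chunk, simulate a dropped connection, then resume.
data = b"hello resumable upload"
chunks = split_chunks(data, chunk_size=8)
session = UploadSession(total_chunks=len(chunks))

session.put_chunk(0, chunks[0])
missing_after_drop = session.missing()  # what the client must still send

for index in missing_after_drop:
    session.put_chunk(index, chunks[index])

assembled = session.assemble()
```

A production version would likely also verify a checksum per chunk and expire abandoned sessions; both are left out here to keep the sketch short.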

Refinement

  • Monitor system metrics and auto-scale app servers
  • Regularly test failover and backup restores
  • Optimize queries and indexes for frequent operations
  • Consider sharding if user/file volume grows significantly
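
If sharding is adopted, one simple option is to route each user's metadata to a shard by hashing the user id. The sketch below assumes `user_id` as the shard key and a fixed shard count; neither is specified in this design.

```python
import hashlib


def shard_for(user_id: str, num_shards: int) -> int:
    """Map a user id to a shard index using a stable hash.

    hashlib is used instead of the built-in hash(), which is randomized
    per process for strings and so would not route consistently.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Usage: the same user always lands on the same shard.
shard_a = shard_for("user-123", num_shards=4)
shard_b = shard_for("user-123", num_shards=4)
```

Plain modulo routing remaps most keys whenever the shard count changes, so a scheme such as consistent hashing or a lookup table may be preferable if shards are expected to be added over time.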

This design provides a scalable, highly available, and mobile-ready file storage service with robust alerts, security, and operational best practices.