
Python Backend Project

Initial Setup

TODO: Once the code is pushed to Git, I'll download it on an isolated system, set it up, and write more accurate instructions.

This is a FastAPI-based Python backend project with MongoDB integration and authentication features.

Recommended VSCode Extensions

  • Python - Official Python extension with IntelliSense, linting, debugging, and more
  • Python Debugger - Debug Python code with full debugging capabilities
  • Ruff - Fast Python linter and formatter written in Rust
  • MyPy Type Checker - Static type checking for Python using MyPy

Project Structure

.
├── app/                    # Main application directory
│   ├── auth/              # Authentication related components
│   ├── compute_tasks/     # Async task offloading infrastructure
│   ├── models/            # Database models
│   ├── schemas/           # Pydantic schemas
│   ├── shared/            # Shared utilities and constants
│   ├── third_party_services/   # External services (openai, b2, s3)
│   ├── main.py            # Application entry point
│   ├── middlewares/       # Custom FastAPI middlewares
│   ├── error_handler.py   # Centralized exception handling
│   └── mongodb.py         # MongoDB connection and utilities
├── compute_deploy/        # Container offloading infrastructure
│   ├── Dockerfile         # Generalized multi-target Dockerfile
│   ├── handlers/          # Generic Lambda/Batch entry points
│   ├── configs/           # Per-container config (config.json + requirements.txt)
│   └── terraform/         # Infrastructure as Code (auto-reads configs/)
├── examples/              # API example curl scripts
├── justfile               # Just command runner configuration
├── pyproject.toml         # Python project configuration
└── uv.lock                # UV package manager lock file

Code Conventions and Guidelines

To maintain consistency and code quality, please adhere to the following conventions when contributing to the project.

General Principles

  • Code Style: We use Ruff for linting and formatting, which follows a style similar to Black. Run ruff check (and ruff format) and ensure your code is formatted before committing.
  • Type Hinting: All functions and methods must have type hints for arguments and return values.
  • Asynchronicity: Use async/await for all I/O-bound operations, including database calls and external API requests.
  • Configuration: Store hardcoded constants in app/shared/shared_constants.py. Use environment variables for secrets and environment-specific settings.
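The type hinting and asynchronicity principles above can be sketched together in a few lines. This is a hedged illustration with placeholder names; the awaited sleep stands in for a real database or HTTP call:

```python
import asyncio

# Illustrative only: a fully type-hinted async function for an I/O-bound lookup.
# In real code the await would be a MongoDB query or external API request.
async def get_user_name(user_id: str) -> str:
    await asyncio.sleep(0)  # stands in for an awaited I/O call
    return f"user-{user_id}"

async def main() -> None:
    name = await get_user_name("42")
    print(name)  # prints user-42

asyncio.run(main())
```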

Naming Conventions

  • Files: Use snake_case (e.g., user_service.py).
  • Classes: Use PascalCase (e.g., UserService, UserSchema).
  • Functions/Methods: Use snake_case (e.g., get_user_by_id).
  • Variables: Use snake_case.
  • Pydantic Schema Fields: Use snake_case for all field names to align with Python conventions.
  • For DB schema fields, camelCase is preferred, to stay consistent with the Node.js schemas for MongoDB.
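The snake_case-in-Python, camelCase-in-MongoDB convention implies a key translation at the DB boundary. A minimal sketch of such a helper (hypothetical names; the real code may instead use Pydantic field aliases):

```python
# Hypothetical helper: convert Python-side snake_case field names into the
# camelCase keys used in the MongoDB documents shared with the Node.js services.
def to_camel(snake: str) -> str:
    head, *rest = snake.split("_")
    return head + "".join(part.title() for part in rest)

def to_db_document(fields: dict) -> dict:
    return {to_camel(key): value for key, value in fields.items()}

print(to_db_document({"user_id": "abc", "created_at": "2024-01-01"}))
# {'userId': 'abc', 'createdAt': '2024-01-01'}
```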

Architecture & File Structure

The project follows a feature-based architecture with a clear separation of concerns.

  • app/main.py: The main application entry point. It should only contain FastAPI app initialization, middleware, lifespan events, and inclusion of feature routers.
  • Feature Modules (e.g., app/feature_name/): Each major feature should reside in its own directory within app/. For example, a new "products" feature would be in app/products/.
  • _router.py: The API layer. It defines API endpoints using fastapi.APIRouter. Its role is to handle HTTP requests, validate inputs using Pydantic schemas, and call the appropriate service layer method. It should not contain any business logic.
  • _service.py: The business logic layer. It's encapsulated in a service class (e.g., ProductService) with static methods. This layer coordinates application logic, calling model/database methods and helper functions.
  • _schemas.py: Contains Pydantic schemas for request and response models specific to the feature.
  • _helper.py: Optional file for utility functions that are only used by this specific feature.
  • app/models/: The data access layer. Models interact with the database. A model file (e.g., users_model.py) should contain a class with static methods for CRUD operations on a specific database collection. These methods should return Pydantic schema instances, not raw database documents.
  • app/schemas/: Contains core Pydantic schemas that are shared across multiple features (e.g., users_schema.py).
  • app/shared/: For code that is used across the entire application, such as custom error classes (shared_errors.py) or shared constants (shared_constants.py).

Creating a New Feature

To add a new feature, such as "Products":

  1. Create a feature directory: app/products/
  2. Create the router: app/products/products_router.py. Define your APIRouter here.
  3. Create the service: app/products/products_service.py. Create a ProductService class to handle business logic.
  4. Create schemas:
    • For request/response models specific to the products API, create app/products/products_schemas.py.
    • If you need a core "Product" schema that might be reused elsewhere, add it to a new file app/schemas/products_schema.py.
  5. Create the model: app/models/products_model.py. Create a ProductsModel class to handle database operations for products.
  6. Include the router: Import and include the products_router in app/main.py.

Key Components

Authentication (app/auth/)

  • Complete authentication system with helper functions, routes, and services
  • Handles user authentication and authorization
  • Includes schema definitions for request/response validation

Models (app/models/)

  • Database models for MongoDB
  • Currently includes user model for authentication

Schemas (app/schemas/)

  • Pydantic schemas for data validation
  • Defines the structure of request/response data

Shared (app/shared/)

  • Common utilities and constants used across the application
  • Custom error definitions for consistent error handling

Tone Mapping (app/tone_mapping/)

  • Ports the adaptive LUT grading pipeline from image-tools
  • Exposes /tone-mapping FastAPI routes for auto-correct, manual adjustments, preset apply, and cache management
  • Provides async wrappers around the original synchronous implementation so endpoints remain non-blocking
  • Reuses MongoDB configs via app/mongodb.py and centralizes AWS S3 access in app/third_party_services/s3_service.py
  • pipeline.py: The entire ToneMappingService class from image-tools is synchronous and depends on many modules. To keep backend endpoints async without rewriting the whole implementation, the original service lives in pipeline.py (renamed ToneMappingPipeline) and is wrapped with lightweight async adapters in tone_mapping_service.py, preventing blocking requests while preserving legacy logic.
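The adapter pattern described above (a synchronous pipeline wrapped in lightweight async methods so the event loop is never blocked) can be sketched like this. The pipeline body here is a stub standing in for the real image processing:

```python
import asyncio

# Stub standing in for the synchronous legacy pipeline from image-tools.
class ToneMappingPipeline:
    def auto_correct(self, image: bytes) -> bytes:
        return image[::-1]  # placeholder for heavy synchronous processing

# Thin async adapter, as in tone_mapping_service.py: the blocking call is
# pushed onto a worker thread so the FastAPI endpoint stays non-blocking.
class ToneMappingService:
    _pipeline = ToneMappingPipeline()

    @staticmethod
    async def auto_correct(image: bytes) -> bytes:
        return await asyncio.to_thread(ToneMappingService._pipeline.auto_correct, image)

print(asyncio.run(ToneMappingService.auto_correct(b"abc")))  # prints b'cba'
```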

Core Files

  • main.py: Application entry point and FastAPI setup
  • mongodb.py: MongoDB connection and database utilities

Middlewares (app/middlewares/)

  • Contains custom FastAPI middleware, such as authentication and request processing logic.

Error Handler (app/error_handler.py)

  • Centralized error and exception handling for FastAPI, including validation and custom errors.

API Examples

The examples/ directory contains curl commands for testing all API endpoints. Each subdirectory corresponds to a router with executable shell scripts demonstrating complete request examples. Run any example with ./examples/router_name/endpoint.sh or copy the curl commands directly.

Development Setup

The project uses:

  • FastAPI for the web framework
  • MongoDB for the database
  • UV for package management
  • MyPy for type checking
  • Just for command running

Configuration Files

  • pyproject.toml: Project metadata and dependencies
  • mypy.ini: Type checking configuration
  • justfile: Command runner configuration
  • .python-version: Python version specification
  • newrelic.ini: New Relic APM agent configuration

Monitoring with New Relic

New Relic APM is integrated using the Python agent and newrelic.ini.

Running Locally

Note: Work in a virtual environment to keep packages separate from your global environment; uv handles this automatically.

pip install uv   # or: brew install uv
uv sync
uv run just dev

Installing packages

uv add <package>

Docker Deployment

Run the application in a container:

docker build -t python-backend .
docker run -p 8000:8000 -e PORT=8000 --env-file .env python-backend

Heavy Processing Task Offloading

For CPU-intensive or long-running tasks, we support offloading to AWS Lambda (fast tasks ~1-5 min) or AWS Batch (longer tasks like video rendering). This keeps the main FastAPI server responsive.

PLEASE reach out to niklesh@ or one of the devops team members to deploy your code to a container.

Architecture

flowchart TB
    subgraph client [Client]
        C[Frontend/API Consumer]
    end

    subgraph api [FastAPI Backend]
        R[Router]
        CTS[ComputeTasksService]
        TR[TaskRegistry]
        DB[(MongoDB)]
    end

    subgraph lambda_path [Lambda - Fast Tasks]
        L[tone-mapping container]
    end

    subgraph batch_path [Batch - Long Tasks]
        SQS[SQS Queue]
        EB[EventBridge Pipes]
        B[minimatics container]
    end

    C -->|"POST with use_container=true"| R
    R --> CTS
    CTS -->|"Which target?"| TR
    CTS -->|Create job| DB

    TR -->|LAMBDA| L
    L -->|Update status| DB

    TR -->|BATCH| SQS
    SQS --> EB
    EB -->|Trigger| B
    B -->|Update status| DB

    C -->|"GET /compute-tasks/{id}/status"| R
    R -->|Read| DB

When to Use Container Offloading

Use Case                       Target         Example
Image processing (1-5 min)     Lambda         Tone mapping auto-correct
Video rendering (5-60 min)     Batch          Minimatics video export
GPU-required tasks             Batch          ML inference
Tasks that block event loop    Lambda/Batch   Heavy numpy/opencv operations

Don't offload simple CRUD operations, quick database queries, or tasks under 30 seconds.

How It Works

  1. Client sends request with use_container=true
  2. Backend stores payload in MongoDB, creates job with QUEUED status
  3. Backend dispatches to Lambda (sync invoke) or SQS (for Batch)
  4. Container fetches payload from MongoDB, processes, updates status
  5. Client either waits for result or polls /compute-tasks/{jobId}/status
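Step 5's polling path can be sketched as a small client-side loop. This is a hedged illustration: `fetch_status` is any callable returning the job's status string (e.g. an HTTP GET to /compute-tasks/{jobId}/status), and the terminal status COMPLETED is an assumption; the doc only names QUEUED, PROCESSING, and FAILED:

```python
import time

# Illustrative polling loop: keeps asking for the job status until it reaches
# a terminal state or the attempt budget runs out.
def poll_until_done(fetch_status, interval_seconds: float = 0.0, max_attempts: int = 50) -> str:
    for _ in range(max_attempts):
        status = fetch_status()
        if status in ("COMPLETED", "FAILED"):  # terminal states (assumed names)
            return status
        time.sleep(interval_seconds)
    raise TimeoutError("job did not finish in time")
```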

Project Structure

compute_deploy/
├── Dockerfile                      # Generalized multi-target Dockerfile
├── handlers/                       # Generic handlers (no per-task code needed)
│   ├── lambda_handler.py
│   └── batch_handler.py
├── configs/
│   ├── base_requirements.txt       # Shared Python deps
│   ├── tone_mapping/
│   │   ├── config.json             # Resource config (memory, timeout, etc.)
│   │   └── requirements.txt        # Task-specific Python deps
│   └── minimatics/
│       ├── config.json
│       └── requirements.txt
└── terraform/
    ├── main.tf                     # Auto-discovers configs/ via fileset()
    ├── environments/
    │   ├── dev.tfvars
    │   └── prod.tfvars
    └── modules/

Adding a New Compute Task

  1. Register task type in app/compute_tasks/compute_tasks_schema.py:

    class ComputeTaskType(str, Enum):
        MY_NEW_TASK = "my_new_task"
    

  2. Configure handler in app/compute_tasks/task_registry.py:

    ComputeTaskType.MY_NEW_TASK: TaskConfig(
        target=ComputeTarget.LAMBDA,  # or BATCH
        resource_env_var="COMPUTE_TASKS_LAMBDA_MY_TASK",
        schema="app.my_feature.my_schemas.MyRequest",
        service="app.my_feature.my_service.MyService.do_work",
    ),
    

  3. Add mixin to request schema:

    class MyRequest(ComputeTaskMixin):
        # your fields...
    

  4. Update router with conditional dispatch:

    async def my_endpoint(request: MyRequest):
        if request.use_container:
            return await ComputeTasksService.dispatch(
                ComputeTaskType.MY_NEW_TASK,
                request.model_dump(exclude={"use_container", "wait_for_result", "wait_timeout_seconds"}),
                wait_for_result=request.wait_for_result,
                timeout=request.wait_timeout_seconds,
            )
        return await MyService.existing_method(request)
    

  5. Create container config at compute_deploy/configs/my_task/:

config.json (Lambda):

{
  "enabled": true,
  "target": "lambda",
  "memory_size": 1024,
  "timeout": 120,
  "ecr_image": "my-task"
}

config.json (Batch):

{
  "enabled": true,
  "target": "batch",
  "vcpu": 2,
  "memory": 4096,
  "timeout": 1800,
  "max_vcpus": 8,
  "ecr_image": "my-task"
}

requirements.txt:

-r ../base_requirements.txt
# task-specific deps

  6. Deploy:
    cd compute_deploy/terraform
    terraform apply -var-file=environments/dev.tfvars

Set "enabled": false in config.json to skip deployment while keeping the config.

Infrastructure Deployment

First-time setup:

cd compute_deploy/terraform

# Put Secrets Manager ARNs in environments/dev.tfvars (and prod.tfvars for prod)
# - mongodb_uri_secret_arn
# - b2_access_key_secret_arn
# - b2_secret_key_secret_arn

terraform init
terraform apply -var-file=environments/dev.tfvars

Secrets Manager setup (required):

Create these three secrets in AWS Secrets Manager before running terraform apply:

  • mongodb_uri_secret_arn -> secret value should be the full MongoDB URI string
  • b2_access_key_secret_arn -> secret value should be the B2 access key string
  • b2_secret_key_secret_arn -> secret value should be the B2 secret key string

Recommended naming pattern:

  • compute/dev/mongodb_uri_studio
  • compute/dev/b2_access_key
  • compute/dev/b2_secret_key

Then paste the generated secret ARNs into:

  • compute_deploy/terraform/environments/dev.tfvars (for dev)
  • compute_deploy/terraform/environments/prod.tfvars (for prod)

Important notes:

  • Do not put raw secret values in Terraform .tfvars files.
  • b2_bucket_name and b2_region are non-sensitive and stay in env .tfvars.
  • Lambda code reads MONGODB_URI_STUDIO_SECRET_ARN; Batch jobs use ECS secrets injection at runtime.

Terraform will:

  • Create ECR repositories
  • Create CodeBuild projects to build Docker images in AWS (no local Docker needed)
  • Trigger builds when config/requirements/Dockerfile changes
  • Create Lambda functions, SQS queues, Batch environments, and EventBridge pipes

Force rebuild for Python code changes:

Terraform only auto-triggers builds when config.json, requirements.txt, or Dockerfile change. For Python code changes in app/, taint the CodeBuild trigger:

cd compute_deploy/terraform

# Lambda tasks (e.g., tone_mapping)
terraform taint 'module.codebuild_lambda["tone_mapping"].null_resource.trigger_build'

# Batch tasks (e.g., minimatics)
terraform taint 'module.codebuild_batch["minimatics"].null_resource.trigger_build'

# Then apply
terraform apply -var-file=environments/dev.tfvars

Note: Adding app/ hash to triggers would auto-rebuild on any code change but causes unnecessary rebuilds (e.g., auth changes rebuilding compute images). Consider task-specific hashes in main.tf if manual tainting becomes tedious.

After deployment, set these env vars in FastAPI backend:

COMPUTE_TASKS_LAMBDA_TONE_MAPPING=<lambda function name from terraform output>
COMPUTE_TASKS_SQS_MINIMATICS=<sqs queue url from terraform output>

Stale Job Cleanup

A scheduled Lambda runs every xx minutes (via CloudWatch Events) to mark stale jobs as FAILED. This catches cases where the container fails to start (e.g., missing dependencies, ECR pull errors) - situations where the code never runs to update MongoDB.

How it works:

  • CloudWatch Events triggers the cleanup Lambda on a schedule
  • Jobs stuck in QUEUED or PROCESSING beyond the timeout are marked FAILED
  • The error message indicates a possible container failure
  • Uses a separate Lambda with a pymongo layer (not a container image)

Configuration (in compute_deploy/terraform/modules/cleanup_lambda/variables.tf):

  • stale_timeout_minutes: Minutes before a job is considered stale
  • schedule_minutes: How often cleanup runs
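The staleness check the cleanup Lambda performs can be sketched as follows. This is an illustrative predicate only; the real handler would apply it as a pymongo update, and the `updated_at` field name is an assumption:

```python
from datetime import datetime, timedelta, timezone

# Illustrative staleness predicate: a job is stale if it is still QUEUED or
# PROCESSING and has not been updated within stale_timeout_minutes.
def is_stale(job: dict, stale_timeout_minutes: int, now: datetime) -> bool:
    cutoff = now - timedelta(minutes=stale_timeout_minutes)
    return job["status"] in ("QUEUED", "PROCESSING") and job["updated_at"] < cutoff
```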

Lambda Layer Requirement: The cleanup Lambda requires a pymongo Lambda layer. Create one if needed:

mkdir -p python/lib/python3.11/site-packages
pip install pymongo -t python/lib/python3.11/site-packages
zip -r layer.zip python

Upload the layer and update the layer ARN in compute_deploy/terraform/main.tf.

Testing

# Lambda task (tone mapping)
curl -X POST http://localhost:8000/tone-mapping/auto-correct \
  -H "Content-Type: application/json" \
  -H "X-API-KEY: your-api-key" \
  -d '{"image": "<base64>", "use_container": true, "wait_for_result": true}'

# Batch task (minimatics) - async
curl -X POST http://localhost:8000/minimatics/exportVideo \
  -H "Content-Type: application/json" \
  -H "X-API-KEY: your-api-key" \
  -d '{"projectId": "xxx", "use_container": true}'

# Poll for status
curl http://localhost:8000/compute-tasks/{job_id}/status \
  -H "X-API-KEY: your-api-key"