Python Backend Project¶
Initial Setup¶
TODO: Once the code is pushed to Git, I'll download it on an isolated system, set it up, and write more accurate instructions.
This is a FastAPI-based Python backend project with MongoDB integration and authentication features.
Recommended VSCode Extensions¶
- Python - Official Python extension with IntelliSense, linting, debugging, and more
- Python Debugger - Debug Python code with full debugging capabilities
- Ruff - Fast Python linter and formatter written in Rust
- MyPy Type Checker - Static type checking for Python using MyPy
Project Structure¶
.
├── app/ # Main application directory
│ ├── auth/ # Authentication related components
│ ├── compute_tasks/ # Async task offloading infrastructure
│ ├── models/ # Database models
│ ├── schemas/ # Pydantic schemas
│ ├── shared/ # Shared utilities and constants
│ ├── third_party_services/ # External services (openai, b2, s3)
│ ├── main.py # Application entry point
│ ├── middlewares/ # Custom FastAPI middlewares
│ ├── error_handler.py # Centralized exception handling
│ └── mongodb.py # MongoDB connection and utilities
├── compute_deploy/ # Container offloading infrastructure
│ ├── Dockerfile # Generalized multi-target Dockerfile
│ ├── handlers/ # Generic Lambda/Batch entry points
│ ├── configs/ # Per-container config (config.json + requirements.txt)
│ └── terraform/ # Infrastructure as Code (auto-reads configs/)
├── examples/ # API example curl scripts
├── justfile # Just command runner configuration
├── pyproject.toml # Python project configuration
└── uv.lock # UV package manager lock file
Code Conventions and Guidelines¶
To maintain consistency and code quality, please adhere to the following conventions when contributing to the project.
General Principles¶
- Code Style: We use Ruff for linting and formatting, which follows a style similar to Black. Please ensure your code is formatted before committing. You can run ruff check.
- Type Hinting: All functions and methods must have type hints for arguments and return values.
- Asynchronicity: Use async/await for all I/O-bound operations, including database calls and external API requests.
- Configuration: Store hardcoded constants in app/shared/shared_constants.py. Use environment variables for secrets and environment-specific settings.
Naming Conventions¶
- Files: Use snake_case (e.g., user_service.py).
- Classes: Use PascalCase (e.g., UserService, UserSchema).
- Functions/Methods: Use snake_case (e.g., get_user_by_id).
- Variables: Use snake_case.
- Pydantic Schema Fields: Use snake_case for all field names to align with Python conventions.
- DB Schema Fields: Use camelCase to stay consistent with the Node.js schemas for MongoDB.
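One common way to bridge the snake_case/camelCase split is Pydantic field aliases. A minimal sketch, assuming hypothetical field names (whether the project wires aliases exactly this way is an assumption):

# Illustrative only: field names are made up; aliases map to camelCase DB fields.
from pydantic import BaseModel, Field

class UserSchema(BaseModel):
    first_name: str = Field(alias="firstName")   # snake_case in Python code
    created_at: str = Field(alias="createdAt")   # camelCase in the MongoDB document

    # Accept either the field name or the alias when constructing the model.
    model_config = {"populate_by_name": True}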
Architecture & File Structure¶
The project follows a feature-based architecture with a clear separation of concerns.
- app/main.py: The main application entry point. It should only contain FastAPI app initialization, middleware, lifespan events, and inclusion of feature routers.
- Feature Modules (e.g., app/feature_name/): Each major feature should reside in its own directory within app/. For example, a new "products" feature would live in app/products/.
  - _router.py: The API layer. It defines API endpoints using fastapi.APIRouter. Its role is to handle HTTP requests, validate inputs using Pydantic schemas, and call the appropriate service-layer method. It should not contain any business logic.
  - _service.py: The business logic layer. It is encapsulated in a service class (e.g., ProductService) with static methods. This layer coordinates application logic, calling model/database methods and helper functions.
  - _schemas.py: Contains Pydantic schemas for request and response models specific to the feature.
  - _helper.py: Optional file for utility functions that are only used by this specific feature.
- app/models/: The data access layer. Models interact with the database. A model file (e.g., users_model.py) should contain a class with static methods for CRUD operations on a specific database collection. These methods should return Pydantic schema instances, not raw database documents (see the sketch after this list).
- app/schemas/: Contains core Pydantic schemas that are shared across multiple features (e.g., users_schema.py).
- app/shared/: For code that is used across the entire application, such as custom error classes (shared_errors.py) or shared constants (shared_constants.py).
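A minimal sketch of a model-layer method under these rules; the db handle, collection name, and schema import are assumptions for illustration, not the project's actual code:

from app.mongodb import db  # assumed async (Motor-style) database handle
from app.schemas.users_schema import UserSchema  # assumed shared core schema

class UsersModel:
    @staticmethod
    async def get_by_id(user_id: str) -> UserSchema | None:
        # Fetch the raw document, then hand back a Pydantic instance,
        # never the raw dict, per the convention above.
        doc = await db.users.find_one({"_id": user_id})
        return UserSchema(**doc) if doc else None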
Creating a New Feature¶
To add a new feature, such as "Products":
- Create a feature directory: app/products/
- Create the router: app/products/products_router.py. Define your APIRouter here (see the sketch after this list).
- Create the service: app/products/products_service.py. Create a ProductService class to handle business logic.
- Create schemas:
  - For request/response models specific to the products API, create app/products/products_schemas.py.
  - If you need a core "Product" schema that might be reused elsewhere, add it to a new file app/schemas/products_schema.py.
- Create the model: app/models/products_model.py. Create a ProductsModel class to handle database operations for products.
- Include the router: Import and include the products_router in app/main.py.
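A minimal sketch of the router step; the endpoint path, schema names, and service method are hypothetical:

# app/products/products_router.py (illustrative)
from fastapi import APIRouter

from app.products.products_schemas import ProductCreateRequest, ProductResponse
from app.products.products_service import ProductService

router = APIRouter(prefix="/products", tags=["products"])

@router.post("/", response_model=ProductResponse)
async def create_product(request: ProductCreateRequest) -> ProductResponse:
    # No business logic here: validate input, delegate to the service layer.
    return await ProductService.create_product(request)

In app/main.py this would then be registered with app.include_router(products_router.router).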
Key Components¶
Authentication (app/auth/)¶
- Complete authentication system with helper functions, routes, and services
- Handles user authentication and authorization
- Includes schema definitions for request/response validation
Models (app/models/)¶
- Database models for MongoDB
- Currently includes user model for authentication
Schemas (app/schemas/)¶
- Pydantic schemas for data validation
- Defines the structure of request/response data
Shared (app/shared/)¶
- Common utilities and constants used across the application
- Custom error definitions for consistent error handling
Tone Mapping (app/tone_mapping/)¶
- Ports the adaptive LUT grading pipeline from image-tools
- Exposes /tone-mapping FastAPI routes for auto-correct, manual adjustments, preset apply, and cache management
- Provides async wrappers around the original synchronous implementation so endpoints remain non-blocking (see the sketch below)
- Reuses MongoDB configs via app/mongodb.py and centralizes AWS S3 access in app/third_party_services/s3_service.py
- pipeline.py: The entire ToneMappingService class from image-tools is synchronous and depends on many modules. To keep backend endpoints async without rewriting the whole implementation, the original service lives in pipeline.py (renamed ToneMappingPipeline) and is wrapped with lightweight async adapters in tone_mapping_service.py, preventing blocked requests while preserving the legacy logic.
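A sketch of that adapter pattern, assuming a hypothetical method name (the actual wrapper code in tone_mapping_service.py may differ):

import asyncio

from app.tone_mapping.pipeline import ToneMappingPipeline  # synchronous legacy code

class ToneMappingService:
    @staticmethod
    async def auto_correct(image_bytes: bytes) -> bytes:
        pipeline = ToneMappingPipeline()
        # Run the blocking pipeline in a worker thread so the event loop stays free.
        return await asyncio.to_thread(pipeline.auto_correct, image_bytes)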
Core Files¶
- main.py: Application entry point and FastAPI setup
- mongodb.py: MongoDB connection and database utilities
Middlewares (app/middlewares/)¶
- Contains custom FastAPI middleware, such as authentication and request processing logic.
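For reference, the standard FastAPI pattern for a custom HTTP middleware looks like this (a generic timing example, not the project's actual middleware):

import time

from fastapi import FastAPI, Request

app = FastAPI()

@app.middleware("http")
async def add_process_time_header(request: Request, call_next):
    start = time.perf_counter()
    response = await call_next(request)
    # Attach how long the request took; the header name is illustrative.
    response.headers["X-Process-Time"] = f"{time.perf_counter() - start:.4f}"
    return response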
Error Handler (app/error_handler.py)¶
- Centralized error and exception handling for FastAPI, including validation and custom errors.
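Centralized handling typically registers handlers on the app; a minimal sketch assuming a hypothetical AppError class (the real error classes live in app/shared/shared_errors.py):

from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse

app = FastAPI()

class AppError(Exception):  # stand-in for a custom error from shared_errors.py
    def __init__(self, message: str, status_code: int = 400) -> None:
        self.message = message
        self.status_code = status_code

@app.exception_handler(AppError)
async def app_error_handler(request: Request, exc: AppError) -> JSONResponse:
    # Convert custom errors into a consistent JSON error shape.
    return JSONResponse(status_code=exc.status_code, content={"error": exc.message})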
API Examples¶
The examples/ directory contains curl commands for testing all API endpoints. Each subdirectory corresponds to a router with executable shell scripts demonstrating complete request examples. Run any example with ./examples/router_name/endpoint.sh or copy the curl commands directly.
Development Setup¶
The project uses:
- FastAPI for the web framework
- MongoDB for the database
- UV for package management
- MyPy for type checking
- Just for command running
Configuration Files¶
- pyproject.toml: Project metadata and dependencies
- mypy.ini: Type checking configuration
- justfile: Command runner configuration
- .python-version: Python version specification
- newrelic.ini: New Relic APM agent configuration
Monitoring with New Relic¶
New Relic APM is integrated using the Python agent and newrelic.ini.
Running Locally¶
Note: Work in a virtual environment to keep project packages separate from your global environment; uv handles this automatically.
pip install uv  # or: brew install uv
uv sync
uv run just dev
Installing packages¶
uv add <package>
Docker Deployment¶
Run the application in a container:
docker build -t python-backend .
docker run -p 8000:8000 -e PORT=8000 --env-file .env python-backend
Heavy Processing Task Offloading¶
For CPU-intensive or long-running tasks, we support offloading to AWS Lambda (fast tasks ~1-5 min) or AWS Batch (longer tasks like video rendering). This keeps the main FastAPI server responsive.
Please reach out to niklesh@ or one of the DevOps team members to deploy your code to a container.
Architecture¶
flowchart TB
subgraph client [Client]
C[Frontend/API Consumer]
end
subgraph api [FastAPI Backend]
R[Router]
CTS[ComputeTasksService]
TR[TaskRegistry]
DB[(MongoDB)]
end
subgraph lambda_path [Lambda - Fast Tasks]
L[tone-mapping container]
end
subgraph batch_path [Batch - Long Tasks]
SQS[SQS Queue]
EB[EventBridge Pipes]
B[minimatics container]
end
C -->|"POST with use_container=true"| R
R --> CTS
CTS -->|"Which target?"| TR
CTS -->|Create job| DB
TR -->|LAMBDA| L
L -->|Update status| DB
TR -->|BATCH| SQS
SQS --> EB
EB -->|Trigger| B
B -->|Update status| DB
C -->|"GET /compute-tasks/{id}/status"| R
R -->|Read| DB
When to Use Container Offloading¶
| Use Case | Target | Example |
|---|---|---|
| Image processing (1-5 min) | Lambda | Tone mapping auto-correct |
| Video rendering (5-60 min) | Batch | Minimatics video export |
| GPU-required tasks | Batch | ML inference |
| Tasks that block event loop | Lambda/Batch | Heavy numpy/opencv operations |
Don't offload simple CRUD operations, quick database queries, or tasks under 30 seconds.
How It Works¶
- Client sends request with use_container=true
- Backend stores payload in MongoDB, creates job with QUEUED status
- Backend dispatches to Lambda (sync invoke) or SQS (for Batch)
- Container fetches payload from MongoDB, processes, updates status
- Client either waits for result or polls /compute-tasks/{jobId}/status
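A sketch of the client-side polling loop, assuming the status endpoint returns JSON with a status field and that the terminal states are COMPLETED/FAILED (both the field and state names are assumptions):

import time

import requests

def wait_for_job(job_id: str, api_key: str, timeout_s: float = 300.0) -> dict:
    url = f"http://localhost:8000/compute-tasks/{job_id}/status"
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        job = requests.get(url, headers={"X-API-KEY": api_key}).json()
        if job.get("status") in ("COMPLETED", "FAILED"):  # assumed terminal states
            return job
        time.sleep(2)  # back off between polls
    raise TimeoutError(f"Job {job_id} did not finish within {timeout_s}s")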
Project Structure¶
compute_deploy/
├── Dockerfile # Generalized multi-target Dockerfile
├── handlers/ # Generic handlers (no per-task code needed)
│ ├── lambda_handler.py
│ └── batch_handler.py
├── configs/
│ ├── base_requirements.txt # Shared Python deps
│ ├── tone_mapping/
│ │ ├── config.json # Resource config (memory, timeout, etc.)
│ │ └── requirements.txt # Task-specific Python deps
│ └── minimatics/
│ ├── config.json
│ └── requirements.txt
└── terraform/
├── main.tf # Auto-discovers configs/ via fileset()
├── environments/
│ ├── dev.tfvars
│ └── prod.tfvars
└── modules/
Adding a New Compute Task¶
- Register task type in app/compute_tasks/compute_tasks_schema.py:

class ComputeTaskType(str, Enum):
    MY_NEW_TASK = "my_new_task"

- Configure handler in app/compute_tasks/task_registry.py:

ComputeTaskType.MY_NEW_TASK: TaskConfig(
    target=ComputeTarget.LAMBDA,  # or BATCH
    resource_env_var="COMPUTE_TASKS_LAMBDA_MY_TASK",
    schema="app.my_feature.my_schemas.MyRequest",
    service="app.my_feature.my_service.MyService.do_work",
),

- Add mixin to request schema:

class MyRequest(ComputeTaskMixin):
    ...  # your fields

- Update router with conditional dispatch:

async def my_endpoint(request: MyRequest):
    if request.use_container:
        return await ComputeTasksService.dispatch(
            ComputeTaskType.MY_NEW_TASK,
            request.model_dump(
                exclude={"use_container", "wait_for_result", "wait_timeout_seconds"}
            ),
            wait_for_result=request.wait_for_result,
            timeout=request.wait_timeout_seconds,
        )
    return await MyService.existing_method(request)

- Create container config at compute_deploy/configs/my_task/:
config.json (Lambda):
{
"enabled": true,
"target": "lambda",
"memory_size": 1024,
"timeout": 120,
"ecr_image": "my-task"
}
config.json (Batch):
{
"enabled": true,
"target": "batch",
"vcpu": 2,
"memory": 4096,
"timeout": 1800,
"max_vcpus": 8,
"ecr_image": "my-task"
}
requirements.txt:
-r ../base_requirements.txt
# task-specific deps
- Deploy:

cd compute_deploy/terraform
terraform apply -var-file=environments/dev.tfvars
Set "enabled": false in config.json to skip deployment while keeping the config.
Infrastructure Deployment¶
First-time setup:
cd compute_deploy/terraform
# Put Secrets Manager ARNs in environments/dev.tfvars (and prod.tfvars for prod)
# - mongodb_uri_secret_arn
# - b2_access_key_secret_arn
# - b2_secret_key_secret_arn
terraform init
terraform apply -var-file=environments/dev.tfvars
Secrets Manager setup (required):
Create these 3 secrets in AWS Secrets Manager before running terraform apply:
- mongodb_uri_secret_arn -> secret value should be the full MongoDB URI string
- b2_access_key_secret_arn -> secret value should be the B2 access key string
- b2_secret_key_secret_arn -> secret value should be the B2 secret key string
Recommended naming pattern:
- compute/dev/mongodb_uri_studio
- compute/dev/b2_access_key
- compute/dev/b2_secret_key
Then paste the generated secret ARNs into:
- compute_deploy/terraform/environments/dev.tfvars (for dev)
- compute_deploy/terraform/environments/prod.tfvars (for prod)
Important notes:
- Do not put raw secret values in Terraform .tfvars files.
- b2_bucket_name and b2_region are non-sensitive and stay in env .tfvars.
- Lambda code reads MONGODB_URI_STUDIO_SECRET_ARN; Batch jobs use ECS secrets injection at runtime.
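As a sketch of the Lambda side, resolving the URI from that ARN might look like this (boto3's Secrets Manager API is standard; the helper itself is hypothetical):

import os

import boto3

def get_mongodb_uri() -> str:
    # The env var holds the secret's ARN, not the URI itself.
    secret_arn = os.environ["MONGODB_URI_STUDIO_SECRET_ARN"]
    client = boto3.client("secretsmanager")
    # Per the setup notes above, SecretString is the full MongoDB URI.
    return client.get_secret_value(SecretId=secret_arn)["SecretString"]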
Terraform will:
- Create ECR repositories
- Create CodeBuild projects to build Docker images in AWS (no local Docker needed)
- Trigger builds when config/requirements/Dockerfile changes
- Create Lambda functions, SQS queues, Batch environments, EventBridge pipes
Force rebuild for Python code changes:
Terraform only auto-triggers builds when config.json, requirements.txt, or Dockerfile change. For Python code changes in app/, taint the CodeBuild trigger:
cd compute_deploy/terraform
# Lambda tasks (e.g., tone_mapping)
terraform taint 'module.codebuild_lambda["tone_mapping"].null_resource.trigger_build'
# Batch tasks (e.g., minimatics)
terraform taint 'module.codebuild_batch["minimatics"].null_resource.trigger_build'
# Then apply
terraform apply -var-file=environments/dev.tfvars
Note: Adding an app/ hash to the triggers would auto-rebuild on any code change, but it causes unnecessary rebuilds (e.g., auth changes rebuilding compute images). Consider task-specific hashes in main.tf if manual tainting becomes tedious.
After deployment, set these env vars in FastAPI backend:
COMPUTE_TASKS_LAMBDA_TONE_MAPPING=<lambda function name from terraform output>
COMPUTE_TASKS_SQS_MINIMATICS=<sqs queue url from terraform output>
Stale Job Cleanup¶
A scheduled Lambda runs every xx minutes (via CloudWatch Events) to mark stale jobs as FAILED. This catches cases where the container fails to start (e.g., missing dependencies, ECR pull errors) - situations where the code never runs to update MongoDB.
How it works:
- CloudWatch Events triggers the cleanup Lambda on a schedule
- Jobs stuck in QUEUED or PROCESSING beyond the timeout are marked FAILED
- Error message indicates possible container failure
- Uses a separate Lambda with pymongo layer (not a container image)
Configuration (in compute_deploy/terraform/modules/cleanup_lambda/variables.tf):
- stale_timeout_minutes: Minutes before a job is considered stale
- schedule_minutes: How often cleanup runs
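A sketch of what that cleanup might look like with pymongo; the collection name, field names, and env vars are assumptions based on this README, not the deployed handler:

import os
from datetime import datetime, timedelta, timezone

from pymongo import MongoClient

def handler(event, context):
    stale_minutes = int(os.environ.get("STALE_TIMEOUT_MINUTES", "60"))  # assumed env var
    cutoff = datetime.now(timezone.utc) - timedelta(minutes=stale_minutes)
    db = MongoClient(os.environ["MONGODB_URI"]).get_default_database()
    # Mark jobs stuck in non-terminal states past the cutoff as FAILED.
    result = db.compute_tasks.update_many(
        {"status": {"$in": ["QUEUED", "PROCESSING"]}, "updatedAt": {"$lt": cutoff}},
        {"$set": {
            "status": "FAILED",
            "error": "Stale job: container may have failed to start",
        }},
    )
    return {"marked_failed": result.modified_count}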
Lambda Layer Requirement: The cleanup Lambda requires a pymongo Lambda layer. Create one if needed:
mkdir -p python/lib/python3.11/site-packages
pip install pymongo -t python/lib/python3.11/site-packages
zip -r layer.zip python
Then upload the layer and reference its ARN in compute_deploy/terraform/main.tf.
Testing¶
# Lambda task (tone mapping)
curl -X POST http://localhost:8000/tone-mapping/auto-correct \
-H "Content-Type: application/json" \
-H "X-API-KEY: your-api-key" \
-d '{"image": "<base64>", "use_container": true, "wait_for_result": true}'
# Batch task (minimatics) - async
curl -X POST http://localhost:8000/minimatics/exportVideo \
-H "Content-Type: application/json" \
-H "X-API-KEY: your-api-key" \
-d '{"projectId": "xxx", "use_container": true}'
# Poll for status
curl http://localhost:8000/compute-tasks/{job_id}/status \
-H "X-API-KEY: your-api-key"