Python Backend Project¶
Initial Setup¶
TODO: Once the code is pushed to Git, I'll download it on an isolated system, set it up, and write more accurate instructions.
This is a FastAPI-based Python backend project with MongoDB integration and authentication features.
Recommended VSCode Extensions¶
- Python - Official Python extension with IntelliSense, linting, debugging, and more
- Python Debugger - Debug Python code with full debugging capabilities
- Ruff - Fast Python linter and formatter written in Rust
- MyPy Type Checker - Static type checking for Python using MyPy
Project Structure¶
.
├── app/ # Main application directory
│ ├── auth/ # Authentication related components
│ ├── compute_tasks/ # Async task offloading infrastructure
│ ├── models/ # Database models
│ ├── schemas/ # Pydantic schemas
│ ├── shared/ # Shared utilities and constants
│ ├── third_party_services/ # External services (openai, b2, s3)
│ ├── main.py # Application entry point
│ ├── middlewares/ # Custom FastAPI middlewares
│ ├── error_handler.py # Centralized exception handling
│ └── mongodb.py # MongoDB connection and utilities
├── compute_deploy/ # Container offloading infrastructure
│ ├── Dockerfile # Generalized multi-target Dockerfile
│ ├── handlers/ # Generic Lambda/Batch entry points
│ ├── configs/ # Per-container config (config.json + requirements.txt)
│ └── terraform/ # Infrastructure as Code (auto-reads configs/)
├── examples/ # API example curl scripts
├── justfile # Just command runner configuration
├── pyproject.toml # Python project configuration
└── uv.lock # UV package manager lock file
Code Conventions and Guidelines¶
To maintain consistency and code quality, please adhere to the following conventions when contributing to the project.
General Principles¶
- Code Style: We use Ruff for linting and formatting, which follows a style similar to Black. Please ensure your code is formatted before committing. You can run ruff check.
- Type Hinting: All functions and methods must have type hints for arguments and return values.
- Asynchronicity: Use async/await for all I/O-bound operations, including database calls and external API requests.
- Configuration: Store hardcoded constants in app/shared/shared_constants.py. Use environment variables for secrets and environment-specific settings.
Naming Conventions¶
- Files: Use snake_case (e.g., user_service.py).
- Classes: Use PascalCase (e.g., UserService, UserSchema).
- Functions/Methods: Use snake_case (e.g., get_user_by_id).
- Variables: Use snake_case.
- Pydantic Schema Fields: Use snake_case for all field names to align with Python conventions.
- DB Schema Fields: Use camelCase to stay consistent with the Node.js schemas for MongoDB.
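One common way to bridge the snake_case/camelCase split is Pydantic field aliases. A minimal sketch, assuming hypothetical field names (whether the project wires aliases exactly this way is an assumption):

# Illustrative only: field names are made up; aliases map to camelCase DB fields.
from pydantic import BaseModel, Field

class UserSchema(BaseModel):
    first_name: str = Field(alias="firstName")   # snake_case in Python code
    created_at: str = Field(alias="createdAt")   # camelCase in the MongoDB document

    # Accept either the field name or the alias when constructing the model.
    model_config = {"populate_by_name": True}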
Architecture & File Structure¶
The project follows a feature-based architecture with a clear separation of concerns.
- app/main.py: The main application entry point. It should only contain FastAPI app initialization, middleware, lifespan events, and inclusion of feature routers.
- Feature Modules (e.g., app/feature_name/): Each major feature should reside in its own directory within app/. For example, a new "products" feature would live in app/products/.
  - _router.py: The API layer. It defines API endpoints using fastapi.APIRouter. Its role is to handle HTTP requests, validate inputs using Pydantic schemas, and call the appropriate service-layer method. It should not contain any business logic.
  - _service.py: The business logic layer. It is encapsulated in a service class (e.g., ProductService) with static methods. This layer coordinates application logic, calling model/database methods and helper functions.
  - _schemas.py: Contains Pydantic schemas for request and response models specific to the feature.
  - _helper.py: Optional file for utility functions that are only used by this specific feature.
- app/models/: The data access layer. Models interact with the database. A model file (e.g., users_model.py) should contain a class with static methods for CRUD operations on a specific database collection. These methods should return Pydantic schema instances, not raw database documents (see the sketch after this list).
- app/schemas/: Contains core Pydantic schemas that are shared across multiple features (e.g., users_schema.py).
- app/shared/: For code that is used across the entire application, such as custom error classes (shared_errors.py) or shared constants (shared_constants.py).
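A minimal sketch of a model-layer method under these rules; the db handle, collection name, and schema import are assumptions for illustration, not the project's actual code:

from app.mongodb import db  # assumed async (Motor-style) database handle
from app.schemas.users_schema import UserSchema  # assumed shared core schema

class UsersModel:
    @staticmethod
    async def get_by_id(user_id: str) -> UserSchema | None:
        # Fetch the raw document, then hand back a Pydantic instance,
        # never the raw dict, per the convention above.
        doc = await db.users.find_one({"_id": user_id})
        return UserSchema(**doc) if doc else None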
Creating a New Feature¶
To add a new feature, such as "Products":
- Create a feature directory: app/products/
- Create the router: app/products/products_router.py. Define your APIRouter here (see the sketch after this list).
- Create the service: app/products/products_service.py. Create a ProductService class to handle business logic.
- Create schemas:
  - For request/response models specific to the products API, create app/products/products_schemas.py.
  - If you need a core "Product" schema that might be reused elsewhere, add it to a new file app/schemas/products_schema.py.
- Create the model: app/models/products_model.py. Create a ProductsModel class to handle database operations for products.
- Include the router: Import and include the products_router in app/main.py.
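A minimal sketch of the router step; the endpoint path, schema names, and service method are hypothetical:

# app/products/products_router.py (illustrative)
from fastapi import APIRouter

from app.products.products_schemas import ProductCreateRequest, ProductResponse
from app.products.products_service import ProductService

router = APIRouter(prefix="/products", tags=["products"])

@router.post("/", response_model=ProductResponse)
async def create_product(request: ProductCreateRequest) -> ProductResponse:
    # No business logic here: validate input, delegate to the service layer.
    return await ProductService.create_product(request)

In app/main.py this would then be registered with app.include_router(products_router.router).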
Key Components¶
Authentication (app/auth/)¶
- Complete authentication system with helper functions, routes, and services
- Handles user authentication and authorization
- Includes schema definitions for request/response validation
Models (app/models/)¶
- Database models for MongoDB
- Currently includes user model for authentication
Schemas (app/schemas/)¶
- Pydantic schemas for data validation
- Defines the structure of request/response data
Shared (app/shared/)¶
- Common utilities and constants used across the application
- Custom error definitions for consistent error handling
Tone Mapping (app/tone_mapping/)¶
- Ports the adaptive LUT grading pipeline from image-tools
- Exposes /tone-mapping FastAPI routes for auto-correct, manual adjustments, preset apply, and cache management
- Provides async wrappers around the original synchronous implementation so endpoints remain non-blocking (see the sketch below)
- Reuses MongoDB configs via app/mongodb.py and centralizes AWS S3 access in app/third_party_services/s3_service.py
- pipeline.py: The entire ToneMappingService class from image-tools is synchronous and depends on many modules. To keep backend endpoints async without rewriting the whole implementation, the original service lives in pipeline.py (renamed ToneMappingPipeline) and is wrapped with lightweight async adapters in tone_mapping_service.py, preventing blocked requests while preserving the legacy logic.
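A sketch of that adapter pattern, assuming a hypothetical method name (the actual wrapper code in tone_mapping_service.py may differ):

import asyncio

from app.tone_mapping.pipeline import ToneMappingPipeline  # synchronous legacy code

class ToneMappingService:
    @staticmethod
    async def auto_correct(image_bytes: bytes) -> bytes:
        pipeline = ToneMappingPipeline()
        # Run the blocking pipeline in a worker thread so the event loop stays free.
        return await asyncio.to_thread(pipeline.auto_correct, image_bytes)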
Core Files¶
- main.py: Application entry point and FastAPI setup
- mongodb.py: MongoDB connection and database utilities
Middlewares (app/middlewares/)¶
- Contains custom FastAPI middleware, such as authentication and request processing logic.
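For reference, the standard FastAPI pattern for a custom HTTP middleware looks like this (a generic timing example, not the project's actual middleware):

import time

from fastapi import FastAPI, Request

app = FastAPI()

@app.middleware("http")
async def add_process_time_header(request: Request, call_next):
    start = time.perf_counter()
    response = await call_next(request)
    # Attach how long the request took; the header name is illustrative.
    response.headers["X-Process-Time"] = f"{time.perf_counter() - start:.4f}"
    return response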
Error Handler (app/error_handler.py)¶
- Centralized error and exception handling for FastAPI, including validation and custom errors.
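Centralized handling typically registers handlers on the app; a minimal sketch assuming a hypothetical AppError class (the real error classes live in app/shared/shared_errors.py):

from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse

app = FastAPI()

class AppError(Exception):  # stand-in for a custom error from shared_errors.py
    def __init__(self, message: str, status_code: int = 400) -> None:
        self.message = message
        self.status_code = status_code

@app.exception_handler(AppError)
async def app_error_handler(request: Request, exc: AppError) -> JSONResponse:
    # Convert custom errors into a consistent JSON error shape.
    return JSONResponse(status_code=exc.status_code, content={"error": exc.message})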
API Examples¶
The examples/ directory contains curl commands for testing all API endpoints. Each subdirectory corresponds to a router with executable shell scripts demonstrating complete request examples. Run any example with ./examples/router_name/endpoint.sh or copy the curl commands directly.
Development Setup¶
The project uses:
- FastAPI for the web framework
- MongoDB for the database
- UV for package management
- MyPy for type checking
- Just for command running
Configuration Files¶
- pyproject.toml: Project metadata and dependencies
- mypy.ini: Type checking configuration
- justfile: Command runner configuration
- .python-version: Python version specification
- newrelic.ini: New Relic APM agent configuration
Monitoring with New Relic¶
New Relic APM is integrated using the Python agent and newrelic.ini.
Running Locally¶
Note: Work in a virtual environment to keep project packages separate from your global environment; uv handles this automatically.
pip install uv  # or: brew install uv
uv sync
uv run just dev
Installing packages¶
uv add <package>
Docker Deployment¶
Run the application in a container:
docker build -t python-backend .
docker run -p 8000:8000 -e PORT=8000 --env-file .env python-backend
Heavy Processing Task Offloading¶
For CPU-intensive or long-running tasks, we support offloading to AWS Lambda (fast tasks ~1-5 min) or AWS Batch (longer tasks like video rendering). This keeps the main FastAPI server responsive.
Please reach out to niklesh@ or one of the DevOps team members to deploy your code to a container.
Architecture¶
flowchart TB
subgraph client [Client]
C[Frontend/API Consumer]
end
subgraph api [FastAPI Backend]
R[Router]
CTS[ComputeTasksService]
TR[TaskRegistry]
DB[(MongoDB)]
end
subgraph lambda_path [Lambda - Fast Tasks]
L[tone-mapping container]
end
subgraph batch_path [Batch - Long Tasks]
SQS[SQS Queue]
EB[EventBridge Pipes]
B[minimatics container]
end
C -->|"POST with use_container=true"| R
R --> CTS
CTS -->|"Which target?"| TR
CTS -->|Create job| DB
TR -->|LAMBDA| L
L -->|Update status| DB
TR -->|BATCH| SQS
SQS --> EB
EB -->|Trigger| B
B -->|Update status| DB
C -->|"GET /compute-tasks/{id}/status"| R
R -->|Read| DB
When to Use Container Offloading¶
| Use Case | Target | Example |
|---|---|---|
| Image processing (1-5 min) | Lambda | Tone mapping auto-correct |
| Video rendering (5-60 min) | Batch | Minimatics video export |
| GPU-required tasks | Batch | ML inference |
| Tasks that block event loop | Lambda/Batch | Heavy numpy/opencv operations |
Don't offload simple CRUD operations, quick database queries, or tasks under 30 seconds.
How It Works¶
- Client sends request with use_container=true
- Backend stores payload in MongoDB, creates job with QUEUED status
- Backend dispatches to Lambda (sync invoke) or SQS (for Batch)
- Container fetches payload from MongoDB, processes, updates status
- Client either waits for result or polls /compute-tasks/{jobId}/status
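A sketch of the client-side polling loop, assuming the status endpoint returns JSON with a status field and that the terminal states are COMPLETED/FAILED (both the field and state names are assumptions):

import time

import requests

def wait_for_job(job_id: str, api_key: str, timeout_s: float = 300.0) -> dict:
    url = f"http://localhost:8000/compute-tasks/{job_id}/status"
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        job = requests.get(url, headers={"X-API-KEY": api_key}).json()
        if job.get("status") in ("COMPLETED", "FAILED"):  # assumed terminal states
            return job
        time.sleep(2)  # back off between polls
    raise TimeoutError(f"Job {job_id} did not finish within {timeout_s}s")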
Project Structure¶
compute_deploy/
├── Dockerfile # Generalized multi-target Dockerfile
├── handlers/ # Generic handlers (no per-task code needed)
│ ├── lambda_handler.py
│ └── batch_handler.py
├── configs/
│ ├── base_requirements.txt # Shared Python deps
│ ├── tone_mapping/
│ │ ├── config.json # Resource config (memory, timeout, etc.)
│ │ └── requirements.txt # Task-specific Python deps
│ └── minimatics/
│ ├── config.json
│ └── requirements.txt
└── terraform/
├── main.tf # Auto-discovers configs/ via fileset()
├── environments/
│ ├── dev.tfvars
│ └── prod.tfvars
└── modules/
Adding a New Compute Task¶
- Register task type in app/compute_tasks/compute_tasks_schema.py:

class ComputeTaskType(str, Enum):
    MY_NEW_TASK = "my_new_task"

- Configure handler in app/compute_tasks/task_registry.py:

ComputeTaskType.MY_NEW_TASK: TaskConfig(
    target=ComputeTarget.LAMBDA,  # or BATCH
    resource_env_var="COMPUTE_TASKS_LAMBDA_MY_TASK",
    schema="app.my_feature.my_schemas.MyRequest",
    service="app.my_feature.my_service.MyService.do_work",
),

- Add mixin to request schema:

class MyRequest(ComputeTaskMixin):
    ...  # your fields

- Update router with conditional dispatch:

async def my_endpoint(request: MyRequest):
    if request.use_container:
        return await ComputeTasksService.dispatch(
            ComputeTaskType.MY_NEW_TASK,
            request.model_dump(
                exclude={"use_container", "wait_for_result", "wait_timeout_seconds"}
            ),
            wait_for_result=request.wait_for_result,
            timeout=request.wait_timeout_seconds,
        )
    return await MyService.existing_method(request)

- Create container config at compute_deploy/configs/my_task/:
config.json (Lambda):
{
"enabled": true,
"target": "lambda",
"memory_size": 1024,
"timeout": 120,
"ecr_image": "my-task"
}
config.json (Batch):
{
"enabled": true,
"target": "batch",
"vcpu": 2,
"memory": 4096,
"timeout": 1800,
"max_vcpus": 8,
"ecr_image": "my-task"
}
requirements.txt:
-r ../base_requirements.txt
# task-specific deps
- Deploy:

cd compute_deploy/terraform
terraform apply -var-file=environments/dev.tfvars
Set "enabled": false in config.json to skip deployment while keeping the config.
Infrastructure Deployment¶
First-time setup:
cd compute_deploy/terraform
# Put Secrets Manager ARNs in environments/dev.tfvars (and prod.tfvars for prod)
# - mongodb_uri_secret_arn
# - b2_access_key_secret_arn
# - b2_secret_key_secret_arn
terraform init
terraform apply -var-file=environments/dev.tfvars
Secrets Manager setup (required):
Create these 3 secrets in AWS Secrets Manager before running terraform apply:
- mongodb_uri_secret_arn -> secret value should be the full MongoDB URI string
- b2_access_key_secret_arn -> secret value should be the B2 access key string
- b2_secret_key_secret_arn -> secret value should be the B2 secret key string
Recommended naming pattern:
- compute/dev/mongodb_uri_studio
- compute/dev/b2_access_key
- compute/dev/b2_secret_key
Then paste the generated secret ARNs into:
- compute_deploy/terraform/environments/dev.tfvars (for dev)
- compute_deploy/terraform/environments/prod.tfvars (for prod)
Important notes:
- Do not put raw secret values in Terraform .tfvars files.
- b2_bucket_name and b2_region are non-sensitive and stay in env .tfvars.
- Lambda code reads MONGODB_URI_STUDIO_SECRET_ARN; Batch jobs use ECS secrets injection at runtime.
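As a sketch of the Lambda side, resolving the URI from that ARN might look like this (boto3's Secrets Manager API is standard; the helper itself is hypothetical):

import os

import boto3

def get_mongodb_uri() -> str:
    # The env var holds the secret's ARN, not the URI itself.
    secret_arn = os.environ["MONGODB_URI_STUDIO_SECRET_ARN"]
    client = boto3.client("secretsmanager")
    # Per the setup notes above, SecretString is the full MongoDB URI.
    return client.get_secret_value(SecretId=secret_arn)["SecretString"]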
Terraform will:
- Create ECR repositories
- Create CodeBuild projects to build Docker images in AWS (no local Docker needed)
- Trigger builds when config/requirements/Dockerfile changes
- Create Lambda functions, SQS queues, Batch environments, EventBridge pipes
Force rebuild for Python code changes:
Terraform only auto-triggers builds when config.json, requirements.txt, or Dockerfile change. For Python code changes in app/, taint the CodeBuild trigger:
cd compute_deploy/terraform
# Lambda tasks (e.g., tone_mapping)
terraform taint 'module.codebuild_lambda["tone_mapping"].null_resource.trigger_build'
# Batch tasks (e.g., minimatics)
terraform taint 'module.codebuild_batch["minimatics"].null_resource.trigger_build'
# Then apply
terraform apply -var-file=environments/dev.tfvars
Note: Adding an app/ hash to the triggers would auto-rebuild on any code change, but it causes unnecessary rebuilds (e.g., auth changes rebuilding compute images). Consider task-specific hashes in main.tf if manual tainting becomes tedious.
After deployment, set these env vars in FastAPI backend:
COMPUTE_TASKS_LAMBDA_TONE_MAPPING=<lambda function name from terraform output>
COMPUTE_TASKS_SQS_MINIMATICS=<sqs queue url from terraform output>
Stale Job Cleanup¶
A scheduled Lambda runs every xx minutes (via CloudWatch Events) to mark stale jobs as FAILED. This catches cases where the container fails to start (e.g., missing dependencies, ECR pull errors) - situations where the code never runs to update MongoDB.
How it works:
- CloudWatch Events triggers the cleanup Lambda on a schedule
- Jobs stuck in QUEUED or PROCESSING beyond the timeout are marked FAILED
- Error message indicates possible container failure
- Uses a separate Lambda with pymongo layer (not a container image)
Configuration (in compute_deploy/terraform/modules/cleanup_lambda/variables.tf):
- stale_timeout_minutes: Minutes before a job is considered stale
- schedule_minutes: How often cleanup runs
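A sketch of what that cleanup might look like with pymongo; the collection name, field names, and env vars are assumptions based on this README, not the deployed handler:

import os
from datetime import datetime, timedelta, timezone

from pymongo import MongoClient

def handler(event, context):
    stale_minutes = int(os.environ.get("STALE_TIMEOUT_MINUTES", "60"))  # assumed env var
    cutoff = datetime.now(timezone.utc) - timedelta(minutes=stale_minutes)
    db = MongoClient(os.environ["MONGODB_URI"]).get_default_database()
    # Mark jobs stuck in non-terminal states past the cutoff as FAILED.
    result = db.compute_tasks.update_many(
        {"status": {"$in": ["QUEUED", "PROCESSING"]}, "updatedAt": {"$lt": cutoff}},
        {"$set": {
            "status": "FAILED",
            "error": "Stale job: container may have failed to start",
        }},
    )
    return {"marked_failed": result.modified_count}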
Lambda Layer Requirement: The cleanup Lambda requires a pymongo Lambda layer. Create one if needed:
mkdir -p python/lib/python3.11/site-packages
pip install pymongo -t python/lib/python3.11/site-packages
zip -r layer.zip python
Then upload the layer and reference its ARN in compute_deploy/terraform/main.tf.
Testing¶
# Lambda task (tone mapping)
curl -X POST http://localhost:8000/tone-mapping/auto-correct \
-H "Content-Type: application/json" \
-H "X-API-KEY: your-api-key" \
-d '{"image": "<base64>", "use_container": true, "wait_for_result": true}'
# Batch task (minimatics) - async
curl -X POST http://localhost:8000/minimatics/exportVideo \
-H "Content-Type: application/json" \
-H "X-API-KEY: your-api-key" \
-d '{"projectId": "xxx", "use_container": true}'
# Poll for status
curl http://localhost:8000/compute-tasks/{job_id}/status \
-H "X-API-KEY: your-api-key"