Kitsu AWS Infrastructure

Terraform configuration to deploy Kitsu (production management for animation/VFX studios) on AWS.

Architecture

                    Internet
                        │
                ┌───────▼───────┐
                │  Application  │
                │ Load Balancer │
                │  (HTTPS:443)  │
                └───────┬───────┘
                        │
                ┌───────▼───────┐
                │ EC2 (public)  │
                │  t3.medium    │
                │ ┌───────────┐ │
                │ │  Docker   │ │
                │ │ cgwire/   │ │
                │ │ cgwire    │ │
                │ └─────┬─────┘ │
                │       │       │
                │ ┌─────▼─────┐ │
                │ │EBS Volume │ │
                │ │ (50GB)    │ │
                │ └───────────┘ │
                └───────┬───────┘
                        │
                ┌───────▼───────┐
                │   S3 Bucket   │
                │  (Previews)   │
                └───────────────┘
| Component | AWS Service | Details |
|-----------|-------------|---------|
| Compute | EC2 t3.medium | Ubuntu 22.04, public subnet (SG-protected) |
| Container | Docker | cgwire/cgwire all-in-one image |
| Storage | EBS gp3 | 50GB data volume (Docker data-root) |
| Previews | S3 | Zou preview files (FS_BACKEND=s3) |
| Load Balancer | ALB | HTTPS termination, HTTP→HTTPS redirect |
| Certificate | ACM | DNS validation (manual) |
| Backups | AWS Backup | Daily snapshots, 7-day retention |
| Access | SSM Session Manager | No SSH required |
| Monitoring | New Relic | Infrastructure agent + log forwarding |

Prerequisites

  • AWS CLI configured with appropriate credentials
  • Terraform >= 1.5.0
  • Existing VPC with at least 2 public subnets in different AZs
  • Domain name with DNS access (for ACM validation)

Deployment

1. Create secrets in SSM Parameter Store

aws ssm put-parameter \
  --name "/kitsu/secret_key" \
  --value "$(openssl rand -hex 32)" \
  --type SecureString

aws ssm put-parameter \
  --name "/kitsu/newrelic_ingest_license_key" \
  --value "YOUR_NEWRELIC_INGEST_LICENSE_KEY" \
  --type SecureString

The New Relic ingest license key can be found in New Relic under Account settings > API keys > INGEST - LICENSE.
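On the instance side, user_data.sh can read these parameters back at boot. A minimal sketch (the parameter names match step 1; everything else is illustrative, and the instance's IAM role must allow `ssm:GetParameter` plus KMS decryption for SecureStrings):

```shell
# Illustrative sketch: read the SecureString parameters created above.
# --with-decryption is required for SecureString values.
SECRET_KEY=$(aws ssm get-parameter \
  --name "/kitsu/secret_key" \
  --with-decryption \
  --query 'Parameter.Value' \
  --output text)

NEWRELIC_LICENSE_KEY=$(aws ssm get-parameter \
  --name "/kitsu/newrelic_ingest_license_key" \
  --with-decryption \
  --query 'Parameter.Value' \
  --output text)
```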

2. Configure variables

cp terraform.tfvars.example terraform.tfvars
# Edit terraform.tfvars with your values

Required variables:

  • vpc_id - Your VPC ID
  • public_subnet_ids - At least 2 public subnets in different AZs for ALB
  • ec2_subnet_id - Subnet for EC2 (must share an AZ with one of the ALB subnets)
  • domain_name - Domain for HTTPS (e.g., kitsu.example.com)
  • s3_bucket_prefix - Prefix for S3 preview buckets (e.g., kitsu-ACCOUNT_ID-)
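A hypothetical terraform.tfvars with these variables filled in (all IDs and names are placeholders):

```hcl
vpc_id            = "vpc-0123456789abcdef0"
public_subnet_ids = ["subnet-aaa", "subnet-bbb"]  # two different AZs (ALB requirement)
ec2_subnet_id     = "subnet-bbb"                  # shares an AZ with an ALB subnet
domain_name       = "kitsu.example.com"
s3_bucket_prefix  = "kitsu-123456789012-"         # e.g. kitsu-ACCOUNT_ID-
```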

3. Deploy infrastructure

terraform init
terraform apply

The first apply will create the ACM certificate but the HTTPS listener will fail until you validate it.

4. Validate ACM certificate

Get the DNS validation record:

terraform state show aws_acm_certificate.kitsu

Add a CNAME record to your DNS provider:

  • Name: _xxxxx.kitsu (the part before your domain)
  • Value: _xxxxx.acm-validations.aws.
  • Proxy: Disabled (if using Cloudflare)

Check validation status:

aws acm describe-certificate \
  --certificate-arn "$(terraform output -raw acm_certificate_arn)" \
  --query 'Certificate.Status' \
  --output text

Once status is ISSUED, re-run:

terraform apply

5. Configure DNS

Point your domain to the ALB:

terraform output alb_dns_name

Add a CNAME record: kitsu.example.com → kitsu-alb-xxxxx.us-east-1.elb.amazonaws.com
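Once the record has propagated, you can sanity-check resolution and the HTTP→HTTPS redirect (hostnames are placeholders):

```shell
# Should resolve to the ALB's DNS name / IPs
dig +short kitsu.example.com

# The ALB should answer plain HTTP with a redirect to HTTPS
curl -sI http://kitsu.example.com | head -n 3
```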

6. Access Kitsu

  1. Navigate to https://kitsu.example.com
  2. Log in with the default credentials: admin@example.com / mysecretpassword
  3. Change the admin password immediately

Files

| File | Purpose |
|------|---------|
| terraform/main.tf | Provider, backend, data sources |
| terraform/variables.tf | Input variable definitions |
| terraform/outputs.tf | Output values |
| terraform/ec2.tf | EC2 instance and EBS volumes |
| terraform/s3.tf | S3 buckets for preview storage |
| terraform/alb.tf | Load balancer, listeners, target group |
| terraform/security_groups.tf | Network security rules |
| terraform/iam.tf | IAM role for EC2 (SSM + secrets + S3) |
| terraform/acm.tf | ACM certificate |
| terraform/backup.tf | AWS Backup vault and plan |
| terraform/user_data.sh | EC2 bootstrap script |
| terraform/terraform.tfvars | Production variable values |
| terraform/staging.tfvars | Staging variable values |
| scripts/deploy-config-update.sh | Live config update (no instance replace) |
| scripts/migrate-previews-to-s3.sh | Migrate previews from EBS to S3 |

Operations

Connect to EC2

aws ssm start-session --target "$(terraform output -raw ec2_instance_id)"
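Session Manager can also forward a local port to the instance, which is handy for inspecting the container directly without opening any security group rules. A sketch (port numbers are arbitrary):

```shell
# Forward local port 8080 to port 80 on the instance
aws ssm start-session \
  --target "$(terraform output -raw ec2_instance_id)" \
  --document-name "AWS-StartPortForwardingSession" \
  --parameters '{"portNumber":["80"],"localPortNumber":["8080"]}'
```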

View container logs

# After connecting via SSM
sudo docker logs kitsu

# Application-level logs (also shipped to New Relic)
sudo cat /var/log/kitsu/zou/gunicorn_error.log
sudo cat /var/log/kitsu/nginx/error.log
sudo cat /var/log/kitsu/postgresql/postgresql-*.log

Restart Kitsu

# After connecting via SSM
sudo systemctl restart kitsu-compose

Deploy config changes without replacing the instance

Terraform's user_data.sh only runs on instance creation. To update configuration (Nginx, PostgreSQL, New Relic, docker-compose) on a running instance without downtime:

# Option 1: Copy script via SSM and run it
INSTANCE_ID=$(aws ec2 describe-instances \
  --filters "Name=tag:Project,Values=kitsu" "Name=instance-state-name,Values=running" \
  --query 'Reservations[].Instances[].InstanceId' --output text)

aws ssm send-command \
  --instance-ids "$INSTANCE_ID" \
  --document-name "AWS-RunShellScript" \
  --parameters "commands=[\"$(cat infra/scripts/deploy-config-update.sh)\"]" \
  --output text --query 'Command.CommandId'

# Option 2: SSM in and run manually
aws ssm start-session --target "$INSTANCE_ID"
# Then: sudo bash  (paste script contents or upload the file)

The deploy script is at scripts/deploy-config-update.sh. Update it when you change configuration, and keep it in sync with terraform/user_data.sh.
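Inlining the script with cat (Option 1) breaks if the script contains double quotes or other shell metacharacters. A quote-safe variant, assuming jq ≥ 1.6 is available locally, is to embed the script in a JSON parameters file and pass it with file://:

```shell
# Embed the script verbatim as a JSON string, then hand the file to the CLI
jq -n --rawfile script scripts/deploy-config-update.sh \
  '{commands: [$script]}' > /tmp/kitsu-params.json

aws ssm send-command \
  --instance-ids "$INSTANCE_ID" \
  --document-name "AWS-RunShellScript" \
  --parameters file:///tmp/kitsu-params.json
```

Note that send-command payloads have a size limit, so a very large script may need to be fetched from S3 on the instance instead.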

Data locations on EC2

  • /opt/kitsu/docker-compose.yml - Docker Compose configuration
  • /opt/kitsu/.env - Environment variables (SECRET_KEY, NEWRELIC_LICENSE_KEY)
  • /opt/kitsu/nginx/ - Nginx config overrides
  • /opt/kitsu/postgresql/ - PostgreSQL config overrides
  • /opt/kitsu/newrelic/ - New Relic log forwarding config
  • /opt/kitsu-data/docker/ - Docker data-root (images, named volumes)
  • /opt/kitsu-data/docker/volumes/ - PostgreSQL data (named volumes)
  • /var/log/kitsu/ - Application logs (Zou, Nginx, PostgreSQL) mounted from container

Data Persistence

  • Docker's data-root is on the EBS volume (/opt/kitsu-data/docker)
  • PostgreSQL uses Docker named volumes, stored under the data-root
  • Preview files are stored in S3 (FS_BACKEND=s3), not on the EBS volume
  • EBS data volume is separate from the EC2 instance
  • Volume persists if instance is terminated (standalone aws_ebs_volume resource)
  • Daily backups at 3 AM UTC via AWS Backup
  • 7-day retention (configurable via backup_retention_days)

Cost Estimate (us-east-1)

| Resource | Monthly |
|----------|---------|
| EC2 t3.medium | ~$30 |
| EBS 70GB gp3 | ~$6 |
| ALB | ~$22 |
| S3 | ~$1 |
| AWS Backup | ~$4 |
| Total | ~$63 |
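All figures are rough list prices; the line items do sum to the stated total:

```shell
# ~$30 EC2 + ~$6 EBS + ~$22 ALB + ~$1 S3 + ~$4 Backup
echo $((30 + 6 + 22 + 1 + 4))   # → 63
```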

Staging Environment

A separate staging environment runs alongside prod using the same Terraform config with a different tfvars and state file. Staging shares the VPC, subnets, and SSM parameters with prod but has its own EC2, ALB, S3 bucket, and other resources (namespaced via project_name = "kitsu-staging").

Deploy staging

cd infra/terraform
terraform plan -var-file=staging.tfvars -state=staging.tfstate
terraform apply -var-file=staging.tfvars -state=staging.tfstate

After the first apply, validate the ACM certificate for kitsu-staging.scenarix.ai (same process as prod — add the CNAME record, wait for ISSUED, re-apply).

Key differences from prod

| Setting | Prod | Staging |
|---------|------|---------|
| project_name | kitsu | kitsu-staging |
| domain_name | kitsu.scenarix.ai | kitsu-staging.scenarix.ai |
| instance_type | t3.medium | t3.small |
| root_volume_size_gb | 20 | 10 |
| data_volume_size_gb | 50 | 10 |
| s3_bucket_prefix | kitsu-408921634255- | kitsu-staging- |

Destroy staging

terraform destroy -var-file=staging.tfvars -state=staging.tfstate

Notes

Why EC2 instead of Fargate?

The cgwire/cgwire image is an all-in-one container (PostgreSQL, Redis, Nginx, Zou) that requires persistent disk storage. EC2 + EBS is simpler and cheaper for this use case than Fargate + EFS.

SECRET_KEY configuration

The cgwire image runs its services under supervisord, which overrides DB_PASSWORD and DB_USERNAME for the Zou process, but SECRET_KEY is inherited from the container environment. SECRET_KEY is used for auth token encryption.
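Putting this together, the /opt/kitsu/.env consumed by the container might look like the following sketch (values are placeholders; the real ones come from SSM Parameter Store, and whether FS_BACKEND lives here or in docker-compose.yml is an assumption):

```
# /opt/kitsu/.env (placeholder values)
SECRET_KEY=<value of /kitsu/secret_key>
NEWRELIC_LICENSE_KEY=<value of /kitsu/newrelic_ingest_license_key>
FS_BACKEND=s3
```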

Availability Zone requirements

Important: The EC2 subnet must be in the same AZ as one of the ALB subnets; otherwise the ALB cannot route traffic to the instance (target health shows Target.NotInUse).

The EBS data volume is created in the same AZ as the EC2 instance. Both are in a public subnet protected by security groups (only port 80 from ALB SG allowed inbound).

When selecting subnets:

  • public_subnet_ids - At least 2 subnets in different AZs (ALB requirement)
  • ec2_subnet_id - Must be in the same AZ as one of the public subnets

Example valid configuration:

public_subnet_ids = ["subnet-aaa", "subnet-bbb"]  # us-east-1a, us-east-1c
ec2_subnet_id     = "subnet-ccc"                  # us-east-1c, same AZ as subnet-bbb
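If you want Terraform to catch an AZ mismatch at plan time rather than at target registration, a sketch using data sources and a lifecycle precondition (the resource names here are hypothetical, not from this repo; requires Terraform ≥ 1.4 for terraform_data, which the >= 1.5.0 prerequisite covers):

```hcl
data "aws_subnet" "ec2" {
  id = var.ec2_subnet_id
}

data "aws_subnet" "public" {
  for_each = toset(var.public_subnet_ids)
  id       = each.value
}

# Fails the plan if the EC2 subnet's AZ is not among the ALB subnets' AZs
resource "terraform_data" "az_check" {
  lifecycle {
    precondition {
      condition = contains(
        [for s in data.aws_subnet.public : s.availability_zone],
        data.aws_subnet.ec2.availability_zone,
      )
      error_message = "ec2_subnet_id must share an AZ with one of public_subnet_ids."
    }
  }
}
```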

Monitoring

Error logs from Zou, Nginx, and PostgreSQL are shipped to New Relic via the infrastructure agent running as a Docker sidecar.

Logs available in New Relic (query with NRQL):

-- All Kitsu errors
SELECT * FROM Log WHERE service = 'kitsu'

-- Zou API errors only
SELECT * FROM Log WHERE service = 'kitsu' AND component = 'zou-api'

-- PostgreSQL errors
SELECT * FROM Log WHERE service = 'kitsu' AND component = 'postgresql'

Host metrics (CPU, memory, disk) are also reported by the infrastructure agent.

Future improvements

  • Separate containers with RDS/ElastiCache for larger scale
  • New Relic alerting policies for error rate / CPU spikes