Scaling Your Services
This tutorial will show you how to scale your services with Defang.
There are two primary ways to scale a service. The first is to increase the resources allocated to the service, for example by giving it more CPUs or memory. The second is to deploy multiple instances of the service, which is called scaling with replicas. Defang makes it easy to do both.
The Compose Specification, which is used by Defang, includes a deploy section that lets you configure how a service is deployed. This includes the service's resource requirements and the number of replicas of the service that should be deployed.
Scaling Resource Reservations
In order to scale a service's resource reservations, you will need to update the deploy section associated with your service in your application's compose.yaml file.
Use the resources section to specify the resource reservation requirements. These are the minimum resources that must be available for the platform to provision your service. You may end up with more resources than you requested, but you will never be allocated fewer.
For example, if my app needs 2 CPUs and 512MB of memory, I would update the compose.yaml file like this:
services:
  my_service:
    image: my_app:latest
    deploy:
      resources:
        reservations:
          cpus: "2"
          memory: "512M"
The minimum resources which can be reserved:
| Resource | Minimum |
|---|---|
| CPUs | 0.5 |
| Memory | 512M |
Note that the memory field must be specified as a "byte value string" using the {amount}{byte unit} format. The supported units are b (bytes), k or kb (kilobytes), m or mb (megabytes), and g or gb (gigabytes).
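As a small illustration of the byte value format, the sketch below reserves the minimum CPU allocation listed in the table together with one gigabyte of memory written with the g unit. The service and image names are the same placeholders used in the earlier example, and the values are illustrative only.

```yaml
services:
  my_service:
    image: my_app:latest
    deploy:
      resources:
        reservations:
          cpus: "0.5"   # the minimum CPU reservation listed in the table
          memory: "1g"  # {amount}{byte unit}: 1 gigabyte
```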
Scaling with Replicas
In order to scale a service's replica count, you will need to update the deploy section associated with your service in your application's compose.yaml file.
Use the replicas field to specify the number of containers that should be running at any given time.
For example, if I want to run 3 instances of my app, I would update the compose.yaml file like this:
services:
  my_service:
    image: my_app:latest
    deploy:
      replicas: 3
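Because resource reservations and the replica count both live under the same deploy section, the two approaches can be combined. The following sketch reuses the placeholder service from the examples above and asks for 3 replicas with a reservation of 1 CPU and 512M of memory; the specific values are illustrative, not recommendations.

```yaml
services:
  my_service:
    image: my_app:latest
    deploy:
      replicas: 3          # scale out: number of running containers
      resources:
        reservations:      # scale up: minimum resources to reserve
          cpus: "1"
          memory: "512M"
```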
Autoscaling Your Services
Autoscaling allows your services to automatically adjust the number of replicas based on CPU usage, helping you scale up during traffic spikes and scale down during quieter periods.
Note: Autoscaling is only available to Pro tier or higher users.
Enabling Autoscaling
To enable autoscaling for a service, add the x-defang-autoscaling: true extension under the service definition in your compose.yaml file and remove the replicas field from your deploy mapping, if present. Autoscaling is available in staging and production deployment modes only.
Example:
services:
  web:
    image: myorg/web:latest
    ports:
      - "80:80"
    x-defang-autoscaling: true
Once deployed, your service's CPU usage is monitored to gauge how much load it is handling. Sustained high load will result in more replicas being started.
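A service that previously set a fixed replica count might look like the sketch below after switching to autoscaling. It assumes that resource reservations can stay in the deploy mapping alongside the extension, since only the replicas field has to be removed; the reservation values are illustrative.

```yaml
services:
  web:
    image: myorg/web:latest
    ports:
      - "80:80"
    deploy:
      # replicas: 3   <- removed; replicas must not be defined with autoscaling
      resources:
        reservations:   # assumption: reservations may remain when autoscaling
          cpus: "0.5"
          memory: "512M"
    x-defang-autoscaling: true
```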
Requirements
- You must deploy with BYOC (Bring Your Own Cloud), that is, to your own cloud platform account.
- You must be on the Pro or higher plan to use autoscaling. (Defang plans)
- The replicas field must NOT be defined.
- Only staging and production deployment modes are supported. (Deployment modes)
- The service must be stateless or able to run in multiple instances. (Scaling)
Best Practices
- Design your services to be horizontally scalable. (12 Factor App)
- Use shared or external storage if your service writes data, e.g. managed Postgres or Redis services.
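One way to follow these practices is to keep state out of the container and pass the location of an external data store in through configuration, so that every replica behaves identically. In the sketch below, DATABASE_URL and the db.example.com host are hypothetical placeholders, not names defined by Defang; substitute whatever your application and managed database actually use.

```yaml
services:
  web:
    image: myorg/web:latest
    ports:
      - "80:80"
    environment:
      # Hypothetical variable and host: each replica connects to the same
      # external Postgres instance instead of writing to local disk.
      DATABASE_URL: "postgres://db.example.com:5432/app"
    x-defang-autoscaling: true
```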