Gateways
Gateways manage ingress traffic for running services, handle auto-scaling and rate limits, enable HTTPS, and allow you to configure a custom domain. They also support custom routers, such as the SGLang Model Gateway.
Apply a configuration
First, define a gateway configuration as a YAML file in your project folder.
The filename must end with .dstack.yml (e.g. .dstack.yml or gateway.dstack.yml).
type: gateway
# The name of the gateway
name: example-gateway
# Gateways are bound to a specific backend and region
backend: aws
region: eu-west-1
# This domain will be used to access the endpoint
domain: example.com
To create or update the gateway, run the dstack apply command:
$ dstack apply -f gateway.dstack.yml
The example-gateway doesn't exist. Create it? [y/n]: y
Provisioning...
---> 100%
 BACKEND  REGION     NAME             HOSTNAME  DOMAIN       DEFAULT  STATUS
 aws      eu-west-1  example-gateway            example.com  ✓        submitted
Configuration options
Domain
A gateway requires a domain to be specified in the configuration before creation. The domain is used to generate service endpoints (e.g. <run name>.<gateway domain>).
Once the gateway is created and assigned a hostname, configure your DNS by adding a wildcard record for *.<gateway domain> (e.g. *.example.com). The record should point to the gateway's hostname and should be of type A if the hostname is an IP address (most cases), or of type CNAME if the hostname is another domain (some private gateways and Kubernetes).
Backend
You can create gateways with the aws, azure, gcp, or kubernetes backends, but that does not limit where services run. A gateway can use one backend while services run on any other backend supported by dstack, including backends where gateways themselves cannot be created.
Kubernetes
Gateways in the kubernetes backend require an external load balancer. Managed Kubernetes solutions usually include one.
For self-hosted Kubernetes, you must provide a load balancer yourself.
Router
By default, the gateway uses its own load balancer to route traffic between replicas. However, you can delegate this responsibility to a specific router by setting the router property. Currently, the only supported external router is sglang.
SGLang
The sglang router delegates routing logic to the SGLang Model Gateway.
To enable it, set the type field under router to sglang:
type: gateway
name: sglang-gateway
backend: aws
region: eu-west-1
domain: example.com
router:
  type: sglang
  policy: cache_aware
If you configure the sglang router, services can run either standard SGLang workers or Prefill-Decode workers (aka PD disaggregation).
Note: to run services with PD disaggregation, the gateway must currently run in the same cluster as the service.
Policy
The policy property allows you to configure the routing policy:
- cache_aware: Default policy; combines cache locality with load balancing, falling back to shortest queue.
- power_of_two: Samples two workers and picks the lighter one.
- random: Uniform random selection.
- round_robin: Cycles through workers in order.
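As a minimal sketch of the policy property, the following reuses the SGLang gateway configuration shown earlier and changes only the policy value. The gateway name and domain are placeholders.

```yaml
type: gateway
name: sglang-gateway
backend: aws
region: eu-west-1
domain: example.com

router:
  type: sglang
  # One of: cache_aware (default), power_of_two, random, round_robin
  policy: power_of_two
```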
Certificate
By default, when you run a service with a gateway, dstack provisions an SSL certificate via Let's Encrypt for the configured domain. This automatically enables HTTPS for the service endpoint.
If you disable public IP (e.g. to make the gateway private) or if you simply don't need HTTPS, you can set certificate to null.
Note: by default, services set https to true, which requires a certificate. You can set https to auto to automatically detect whether the gateway supports HTTPS.
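For illustration, a service configuration that sets https to auto might look like the sketch below. The service name, port, and command are hypothetical, and the gateway property is assumed to accept a gateway name; check the service reference for the exact fields.

```yaml
type: service
# Hypothetical service name
name: example-service

# Assumption: selects the gateway by name
gateway: example-gateway

# Let dstack detect whether the gateway supports HTTPS
https: auto

# Illustrative workload: a static file server on port 8000
port: 8000
commands:
  - python -m http.server 8000
```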
Certificate types
dstack supports the following certificate types:
- lets-encrypt (default): Automatic certificates via Let's Encrypt. Requires a public IP.
- acm: Certificates managed by AWS Certificate Manager. AWS-only. TLS is terminated at the load balancer, not at the gateway.
- null: No certificate. Services will use HTTP.
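As a sketch of selecting a certificate type, the configuration below requests an ACM certificate. It assumes the certificate property takes a type field plus an arn field for ACM; the ARN shown is a placeholder, not a real certificate. Verify the exact fields against the gateway reference.

```yaml
type: gateway
name: example-gateway
backend: aws
region: eu-west-1
domain: example.com

certificate:
  type: acm
  # Placeholder ARN; assumed field name, replace with your own certificate
  arn: "arn:aws:acm:eu-west-1:123456789012:certificate/<certificate-id>"
```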
Public IP
If you don't need a public IP for the gateway, you can set public_ip to false (the default is true), making the gateway private.
Private gateways are currently supported in the aws and gcp backends.
type: gateway
name: private-gateway
backend: aws
region: eu-west-1
domain: example.com
public_ip: false
certificate: null
Instance type
By default, dstack provisions a small, low-cost instance for the gateway. If you expect to run high-traffic services, you can configure a larger instance type using the instance_type property.
type: gateway
name: example-gateway
backend: aws
region: eu-west-1
instance_type: t3.large
domain: example.com
Reference
For all gateway configuration options, refer to the reference.
Manage gateways
List gateways
The dstack gateway list command lists existing gateways and their status.
Delete a gateway
To delete a gateway, pass the gateway configuration to dstack delete:
$ dstack delete -f examples/inference/gateway.dstack.yml
Alternatively, you can delete a gateway by passing the gateway name to dstack gateway delete.
What's next?
- See services to learn how to run services