Gateways

Gateways manage ingress traffic for running services, handle auto-scaling and rate limits, enable HTTPS, and allow you to configure a custom domain. They also support custom routers, such as the SGLang Model Gateway.

Apply a configuration

First, define a gateway configuration as a YAML file in your project folder. The filename must end with .dstack.yml (e.g. .dstack.yml or gateway.dstack.yml).

type: gateway
# The name of the gateway
name: example-gateway

# Gateways are bound to a specific backend and region
backend: aws
region: eu-west-1

# This domain will be used to access the endpoint
domain: example.com

To create or update the gateway, simply call the dstack apply command:

$ dstack apply -f gateway.dstack.yml
The example-gateway doesn't exist. Create it? [y/n]: y

Provisioning...
---> 100%

 BACKEND  REGION     NAME             HOSTNAME  DOMAIN       DEFAULT  STATUS
 aws      eu-west-1  example-gateway            example.com          submitted

Configuration options

Domain

A gateway requires a domain to be specified in the configuration before creation. The domain is used to generate service endpoints (e.g. <run name>.<gateway domain>).

Once the gateway is created and assigned a hostname, configure your DNS by adding a wildcard record for *.<gateway domain> (e.g. *.example.com). The record should point to the gateway's hostname and should be of type A if the hostname is an IP address (most cases), or of type CNAME if the hostname is another domain (some private gateways and Kubernetes).
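The naming and record-type rules above can be sketched in a few lines of Python. This is an illustration only, not part of dstack: the function names are hypothetical, and the addresses in the usage note are placeholder values. Only IPv4 hostnames are treated as IP addresses here, since the text above mentions A records only.

```python
import ipaddress


def service_endpoint(run_name: str, gateway_domain: str) -> str:
    """Build the endpoint host name as <run name>.<gateway domain>."""
    return f"{run_name}.{gateway_domain}"


def wildcard_record(gateway_domain: str, gateway_hostname: str) -> tuple[str, str, str]:
    """Return (record name, record type, record value) for the wildcard DNS record.

    The type is "A" when the gateway hostname parses as an IPv4 address,
    and "CNAME" when it is another domain name.
    """
    try:
        ipaddress.IPv4Address(gateway_hostname)
        record_type = "A"
    except ValueError:
        # Not an IPv4 address, so treat the hostname as a domain name.
        record_type = "CNAME"
    return (f"*.{gateway_domain}", record_type, gateway_hostname)
```

For example, `wildcard_record("example.com", "203.0.113.10")` yields an A record for `*.example.com`, while a hostname such as `lb.example.net` yields a CNAME record.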

Backend

You can create gateways with the aws, azure, gcp, or kubernetes backends, but that does not limit where services run. A gateway can use one backend while services run on any other backend supported by dstack, including backends where gateways themselves cannot be created.

Kubernetes

Gateways in the kubernetes backend require an external load balancer. Managed Kubernetes solutions usually include a load balancer. For self-hosted Kubernetes, you must provide one yourself.

Router

By default, the gateway uses its own load balancer to route traffic between replicas. However, you can delegate this responsibility to a specific router by setting the router property. Currently, the only supported external router is sglang.

SGLang

The sglang router delegates routing logic to the SGLang Model Gateway.

To enable it, set the type field under router to sglang:

type: gateway
name: sglang-gateway

backend: aws
region: eu-west-1

domain: example.com

router:
  type: sglang
  policy: cache_aware

If you configure the sglang router, services can run either standard SGLang workers or Prefill-Decode workers (aka PD disaggregation).

Note that to run services with PD disaggregation, the gateway must currently run in the same cluster as the service.

Policy

The policy property allows you to configure the routing policy:

  • cache_aware — Default policy; combines cache locality with load balancing, falling back to shortest queue.
  • power_of_two — Samples two workers and picks the lighter one.
  • random — Uniform random selection.
  • round_robin — Cycles through workers in order.
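To make the simpler policies concrete, here is a minimal Python sketch of random, round_robin, and power_of_two selection. It is not the SGLang Model Gateway implementation: the function names, the worker identifiers, and the `load` mapping (pending requests per worker) are assumptions made for illustration, and cache_aware is left out because its cache-locality logic is not described here.

```python
import itertools
import random


def uniform_random(workers, rng=random):
    """random: pick any worker with equal probability."""
    return rng.choice(workers)


def round_robin(workers):
    """round_robin: return a picker that cycles through workers in order."""
    cycle = itertools.cycle(workers)
    return lambda: next(cycle)


def power_of_two(workers, load, rng=random):
    """power_of_two: sample two distinct workers and keep the less loaded one.

    `workers` must be a list with at least two entries; `load` maps each
    worker to its current load (an assumed metric, e.g. queued requests).
    """
    first, second = rng.sample(workers, 2)
    return first if load[first] <= load[second] else second
```

Sampling only two workers keeps each decision cheap while still steering traffic away from the busiest replicas, which is the usual motivation for this policy.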

Certificate

By default, when you run a service with a gateway, dstack provisions an SSL certificate via Let's Encrypt for the configured domain. This automatically enables HTTPS for the service endpoint.

If you disable public IP (e.g. to make the gateway private) or if you simply don't need HTTPS, you can set certificate to null.

Note that services set https to true by default, which requires a certificate. You can set https to auto to automatically detect whether the gateway supports HTTPS.

Certificate types

dstack supports the following certificate types:

  • lets-encrypt (default) — Automatic certificates via Let's Encrypt. Requires a public IP.
  • acm — Certificates managed by AWS Certificate Manager. AWS-only. TLS is terminated at the load balancer, not at the gateway.
  • null — No certificate. Services will use HTTP.
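As a rough illustration of how the certificate type affects the endpoint, the sketch below maps each type to the URL scheme a client would use. The function name is hypothetical, and the sketch deliberately ignores the service's own https property; it only encodes the rule that null means HTTP while the other two types mean HTTPS.

```python
from typing import Optional


def service_url(run_name: str, gateway_domain: str,
                certificate: Optional[str] = "lets-encrypt") -> str:
    """Return the service URL implied by the gateway's certificate type.

    `certificate` is "lets-encrypt", "acm", or None (written as null in YAML).
    """
    if certificate not in ("lets-encrypt", "acm", None):
        raise ValueError(f"unknown certificate type: {certificate!r}")
    # No certificate means plain HTTP; both certificate types serve HTTPS.
    scheme = "http" if certificate is None else "https"
    return f"{scheme}://{run_name}.{gateway_domain}"
```

With the default type, `service_url("my-service", "example.com")` gives an https URL; passing `None` gives the http equivalent.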

Public IP

If you don't need a public IP for the gateway, you can set public_ip to false (the default is true), making the gateway private.

Private gateways are currently supported in the aws and gcp backends.

type: gateway
name: private-gateway

backend: aws
region: eu-west-1
domain: example.com

public_ip: false
certificate: null

Instance type

By default, dstack provisions a small, low-cost instance for the gateway. If you expect to run high-traffic services, you can configure a larger instance type using the instance_type property.

type: gateway
name: example-gateway

backend: aws
region: eu-west-1

instance_type: t3.large

domain: example.com

Reference

For all gateway configuration options, refer to the reference.

Manage gateways

List gateways

The dstack gateway list command lists existing gateways and their status.

Delete a gateway

To delete a gateway, pass the gateway configuration to dstack delete:

$ dstack delete -f examples/inference/gateway.dstack.yml

Alternatively, you can delete a gateway by passing the gateway name to dstack gateway delete.

What's next?

  1. See services to learn how to run services