Update an existing endpoint.

Signature

VastAI.update_endpoint(
    id: int,
    min_load: Optional[float] = None,
    min_cold_load: Optional[float] = None,
    endpoint_state: Optional[str] = None,
    target_util: Optional[float] = None,
    cold_mult: Optional[float] = None,
    cold_workers: Optional[int] = None,
    max_workers: Optional[int] = None,
    endpoint_name: Optional[str] = None,
    max_queue_time: Optional[float] = None,
    target_queue_time: Optional[float] = None,
    inactivity_timeout: Optional[int] = None
) -> dict

Parameters

id
int
required
ID of the endpoint group to update
min_load
Optional[float]
minimum floor load in perf units/s (tokens/s for LLMs)
min_cold_load
Optional[float]
minimum floor load in perf units/s (tokens/s for LLMs), but allowing the load to be handled by cold workers
endpoint_state
Optional[str]
active, suspended, or stopped
target_util
Optional[float]
target capacity utilization (fraction, max 1.0, default 0.9)
cold_mult
Optional[float]
cold/stopped instance capacity target as multiple of hot capacity target (default 2.5)
cold_workers
Optional[int]
min number of workers to keep ‘cold’ when you have no load (default 5)
max_workers
Optional[int]
max number of workers your endpoint group can have (default 20)
endpoint_name
Optional[str]
deployment endpoint name (allows multiple worker groups to share the same deployment endpoint)
max_queue_time
Optional[float]
maximum seconds requests may be queued on each worker (default 30.0)
target_queue_time
Optional[float]
target seconds for the queue to be cleared (default 10.0)
inactivity_timeout
Optional[int]
seconds of no traffic before the endpoint can scale to zero active workers
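
Because every parameter except id is optional, it can help to assemble the keyword arguments first and pass only the fields you intend to change. The sketch below is not part of the SDK: build_update_kwargs is a hypothetical helper, and its checks are assumptions drawn from the descriptions above (the three listed endpoint_state values, target_util capped at 1.0). The lower bound on target_util and the cold_workers versus max_workers comparison are extra assumptions, not documented rules.

```python
from typing import Any, Dict, Optional

# Values listed for endpoint_state in the parameter descriptions above.
VALID_STATES = {"active", "suspended", "stopped"}


def build_update_kwargs(
    endpoint_id: int,
    endpoint_state: Optional[str] = None,
    target_util: Optional[float] = None,
    cold_workers: Optional[int] = None,
    max_workers: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: validate a few fields and drop the unset ones."""
    if endpoint_state is not None and endpoint_state not in VALID_STATES:
        raise ValueError(f"endpoint_state must be one of {sorted(VALID_STATES)}")
    # Documented maximum is 1.0; requiring a positive value is an assumption.
    if target_util is not None and not 0.0 < target_util <= 1.0:
        raise ValueError("target_util must be a fraction in (0, 1.0]")
    # Assumption: keeping more cold workers than the worker cap makes no sense.
    if (
        cold_workers is not None
        and max_workers is not None
        and cold_workers > max_workers
    ):
        raise ValueError("cold_workers should not exceed max_workers")

    candidates = {
        "endpoint_state": endpoint_state,
        "target_util": target_util,
        "cold_workers": cold_workers,
        "max_workers": max_workers,
    }
    kwargs: Dict[str, Any] = {"id": endpoint_id}
    # Keep only the fields the caller actually set.
    kwargs.update({k: v for k, v in candidates.items() if v is not None})
    return kwargs


# Only the fields that were set end up in the call arguments.
print(build_update_kwargs(12345, target_util=0.8, max_workers=10))
```

The resulting dictionary can then be unpacked into the call, for example client.update_endpoint(**kwargs).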

Returns

dict

Example

from vastai import VastAI

client = VastAI(api_key="YOUR_API_KEY")
result = client.update_endpoint(id=12345, max_workers=10)
print(result)