Terms and concepts
With W&B Launch, you enqueue jobs onto queues to create runs. Jobs are Python scripts instrumented with W&B. Queues hold a list of jobs to execute on a target resource. Agents pull jobs from queues and execute them on target resources. W&B tracks launch jobs similarly to how it tracks runs.
A launch job is a specific type of W&B Artifact that represents a task to complete. For example, common launch jobs include training a model or triggering a model evaluation. Job definitions include:
- Python code and other file assets, including at least one runnable entrypoint.
- Information about the input (config parameter) and output (metrics logged).
- Information about the environment. (for example,
There are three main kinds of job definitions:
| Job type | Definition | How to run this job type |
|---|---|---|
| Artifact-based (or code-based) jobs | Code and other assets are saved as a W&B artifact. | To run artifact-based jobs, Launch agent must be configured with a builder. |
| Git-based jobs | Code and other assets are cloned from a certain commit, branch, or tag in a git repository. | To run git-based jobs, Launch agent must be configured with a builder and git repository credentials. |
| Image-based jobs | Code and other assets are baked into a Docker image. | To run image-based jobs, Launch agent might need to be configured with image repository credentials. |
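The agent requirements for each job type can be summarized as a small lookup. This is an illustrative sketch only: `JobType`, `AgentSetup`, and `missing_requirements` are hypothetical names, not part of the W&B API. Because an agent only might need image repository credentials for image-based jobs, the sketch does not treat them as a hard requirement.

```python
from dataclasses import dataclass
from enum import Enum


class JobType(Enum):
    ARTIFACT = "artifact"
    GIT = "git"
    IMAGE = "image"


@dataclass(frozen=True)
class AgentSetup:
    """Hypothetical summary of what an agent has been configured with."""
    has_builder: bool = False
    has_git_credentials: bool = False


# Hard requirements per job type, as described for the three job types above.
REQUIRED = {
    JobType.ARTIFACT: ("has_builder",),
    JobType.GIT: ("has_builder", "has_git_credentials"),
    JobType.IMAGE: (),  # image repository credentials may be needed, but not always
}


def missing_requirements(job_type: JobType, setup: AgentSetup) -> list[str]:
    """Return the required settings the agent lacks for this job type."""
    return [name for name in REQUIRED[job_type] if not getattr(setup, name)]


# An agent with a builder but no git credentials cannot run git-based jobs yet.
print(missing_requirements(JobType.GIT, AgentSetup(has_builder=True)))
```

A check like this could run before an agent accepts a job, so that a misconfigured agent fails early instead of partway through a build.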
While Launch jobs can perform activities not related to model training (for example, deploying a model to a Triton inference server), all jobs must call wandb.init to complete successfully. This creates a run for tracking purposes in a W&B workspace.
Launch queues are ordered lists of jobs to execute on a specific target resource. Launch queues are first-in, first-out (FIFO). There is no practical limit to the number of queues you can have, but a good guideline is one queue per target resource. You can enqueue jobs with the W&B App UI, the W&B CLI, or the Python SDK. You can then configure one or more Launch agents to pull items from the queue and execute them on the queue's target resource.
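The first-in, first-out ordering can be illustrated with a plain Python `deque`. This is only a conceptual model, not how W&B stores queues, and the job names are made up for the example.

```python
from collections import deque

# A stand-in for a Launch queue: jobs leave in the order they arrived.
queue = deque()

# Enqueue three hypothetical job names.
for job_name in ["train-model", "evaluate-model", "deploy-model"]:
    queue.append(job_name)

# An agent takes the oldest job first.
first = queue.popleft()
second = queue.popleft()

print(first)        # train-model
print(second)       # evaluate-model
print(list(queue))  # ['deploy-model']
```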
The compute environment that a Launch queue is configured to execute jobs on is called the target resource.
W&B Launch supports the following target resources:
Each target resource accepts a different set of configuration parameters called resource configurations. Resource configurations take on default values defined by each Launch queue, but can be overridden independently by each job. See the documentation for each target resource for more details.
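The relationship between queue defaults and per-job overrides can be pictured as a dictionary merge. The sketch below assumes a shallow merge and uses hypothetical keys (`gpus`, `memory`). The real parameters depend on the target resource, and the merge W&B actually performs may differ.

```python
def resolve_resource_config(queue_defaults: dict, job_overrides: dict) -> dict:
    """Start from the queue's defaults and let the job's values win (shallow merge)."""
    return {**queue_defaults, **job_overrides}


# Hypothetical defaults defined on the queue.
queue_defaults = {"gpus": "all", "memory": "8g"}

# One job asks for more memory; everything else falls back to the queue default.
job_overrides = {"memory": "32g"}

resolved = resolve_resource_config(queue_defaults, job_overrides)
print(resolved)  # {'gpus': 'all', 'memory': '32g'}
```

Because the function builds a new dictionary, the queue's defaults are left untouched, so each job's overrides stay independent of every other job's.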
Launch agents are lightweight, persistent programs that periodically check Launch queues for jobs to execute. When a Launch agent receives a job, it first builds or pulls the image from the job definition, then runs it on the target resource.
One agent may poll multiple queues. However, the agent must be configured to support the target resource behind every queue it polls.
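The agent behavior described above (poll each queue, build or pull an image, then run the job) can be sketched as a single polling pass. Everything here is a simplified model: `Job`, `build_or_pull`, `poll_once`, the queue names, and the image tags are hypothetical, and nothing is actually built or executed.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    name: str
    image: Optional[str] = None  # prebuilt image, if the job already has one


def build_or_pull(job: Job) -> str:
    """Return the image to run: reuse a prebuilt one, otherwise pretend to build one."""
    return job.image if job.image else f"built/{job.name}:latest"


def poll_once(queues: dict[str, deque]) -> list[str]:
    """Visit each queue once, take its oldest job (if any), and record a simulated run."""
    executed = []
    for queue_name, jobs in queues.items():
        if not jobs:
            continue  # nothing waiting on this queue
        job = jobs.popleft()  # FIFO: oldest job first
        image = build_or_pull(job)
        executed.append(f"{queue_name}: ran {job.name} with {image}")
    return executed


# One agent polling two hypothetical queues.
queues = {
    "queue-a": deque([Job("train"), Job("evaluate")]),
    "queue-b": deque([Job("serve", image="registry.example.com/serve:1.0")]),
}
log = poll_once(queues)
for line in log:
    print(line)
```

A persistent agent would repeat a pass like this on an interval and would need credentials for the target resource behind each queue it polls.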
Launch agent environment
The agent environment is the environment where a Launch agent runs and polls for jobs.
The agent's runtime environment is independent of a queue's target resource. In other words, agents can be deployed anywhere as long as they are configured sufficiently to access the required target resources.