Experiment Spec#

class beaker.ExperimentSpec(**data)[source]#

Bases: BaseModel

Experiments are the main unit of execution in Beaker.

An ExperimentSpec defines an Experiment.

Examples:

>>> spec = ExperimentSpec(
...     budget="ai2/allennlp",
...     tasks=[
...         TaskSpec(
...             name="hello",
...             image=ImageSource(docker="hello-world"),
...             context=TaskContext(cluster="ai2/cpu-only"),
...             result=ResultSpec(
...                 path="/unused"  # required even if the task produces no output.
...             ),
...         ),
...     ],
... )

model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}#: A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

budget: str#: The name of the budget account for your team. See https://beaker-docs.apps.allenai.org/concept/budgets.html for more details.

tasks: List[TaskSpec]#: Specifications for each process to run.

version: SpecVersion#: Must be ‘v2’ for now.

description: Optional[str]#: Long-form explanation for an experiment.

classmethod from_file(path)[source]#

Load an ExperimentSpec from a YAML file.

Return type:: ExperimentSpec

classmethod new(budget, task_name='main', description=None, cluster=None, beaker_image=None, docker_image=None, result_path='/unused', priority=None, **kwargs)[source]#

A convenience method for creating a new ExperimentSpec with a single task.

Parameters:

task_name (str, default: 'main') – The name of the task.
description (Optional[str], default: None) – A description of the experiment.
cluster (Union[str, List[str], None], default: None) –
The cluster or clusters where the experiment can run.

Tip

Omitting the cluster will allow your experiment to run on any on-premise cluster, but you can only do this with preemptible jobs.
beaker_image (Optional[str], default: None) –
The beaker image name in the image source.

Important

Mutually exclusive with docker_image.
docker_image (Optional[str], default: None) –
The docker image name in the image source.

Important

Mutually exclusive with beaker_image.
priority (Union[str, Priority, None], default: None) – The priority of the context.
kwargs – Additional kwargs are passed as-is to TaskSpec.

Return type:: ExperimentSpec

Examples:

Create a preemptible experiment that can run on any on-premise cluster:

>>> spec = ExperimentSpec.new(
...     "ai2/allennlp",
...     docker_image="hello-world",
...     priority=Priority.preemptible,
... )

to_file(path)[source]#

Write the experiment spec to a YAML file.

Return type:: None

with_task(task)[source]#

Return a new ExperimentSpec with an additional task.

Parameters:: task (TaskSpec) – The task to add.
Return type:: ExperimentSpec
Examples:

>>> spec = ExperimentSpec(budget="ai2/allennlp").with_task(
...     TaskSpec.new(
...         "hello-world",
...         docker_image="hello-world",
...     )
... )

with_description(description)[source]#

Return a new ExperimentSpec with a different description.

Parameters:: description (str) – The new description.
Return type:: ExperimentSpec
Examples:

>>> ExperimentSpec(budget="ai2/allennlp", description="Hello, World!").with_description(
...     "Hello, Mars!"
... ).description
'Hello, Mars!'
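
The "return a new spec" behaviour of the with_* methods can be pictured with a small standard-library sketch. MiniSpec is a hypothetical stand-in, not the Beaker class, and the sketch assumes the original object is left unchanged, as "return a new" suggests:

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class MiniSpec:
    """Hypothetical stand-in for a spec; not the Beaker class."""

    budget: str
    description: Optional[str] = None
    tasks: Tuple[str, ...] = ()

    def with_description(self, description: str) -> "MiniSpec":
        # Build a copy with one field changed; `self` is left untouched.
        return replace(self, description=description)

    def with_task(self, task: str) -> "MiniSpec":
        # Concatenation creates a new tuple, so the original tasks are preserved.
        return replace(self, tasks=self.tasks + (task,))


original = MiniSpec(budget="ai2/allennlp", description="Hello, World!")
updated = original.with_description("Hello, Mars!").with_task("hello")
```

Because each call returns a fresh object, calls can be chained while the starting spec stays available for reuse.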

validate()[source]#

class beaker.TaskSpec(**data)[source]#

Bases: BaseModel

A TaskSpec defines a Task within an ExperimentSpec.

Tasks are Beaker’s fundamental unit of work.

A Beaker experiment may contain multiple tasks. A task may also depend on the results of another task in its experiment, creating an execution graph.
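
To illustrate what an execution graph implies for ordering, here is a minimal sketch using only the Python standard library. The task names and the depends_on mapping are hypothetical, and this is not how Beaker schedules tasks internally; it only shows that a task consuming another task's results must run after it:

```python
from graphlib import TopologicalSorter

# Hypothetical tasks: each key depends on the results of the tasks in its set.
depends_on = {
    "preprocess": set(),
    "train": {"preprocess"},
    "evaluate": {"train"},
    "report": {"train", "evaluate"},
}


def execution_order(graph):
    """Return one valid order in which every task runs after its dependencies."""
    return list(TopologicalSorter(graph).static_order())


order = execution_order(depends_on)
```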

image: ImageSource#: A base image to run, usually built with Docker.

result: ResultSpec#: Where the task will place output files.

context: TaskContext#: Context describes how and where this task should run.

constraints: Optional[Constraints]#: Each task can have many constraints, and each constraint can have many values. Constraints are rules that change where a task is executed by influencing the scheduler’s placement of the workload.

Important

Because constraints depend on external configuration, a given constraint may be invalid or unavailable if a task is re-run at a future date.
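
One way to picture how constraints might narrow placement is the sketch below. The matching rule (a node is eligible when, for every constraint, its value is one of the listed values) is an assumption made for illustration, and the node records and the "ai2/gpu-cluster" name are hypothetical; Beaker's scheduler may behave differently:

```python
from typing import Dict, List


def eligible(node: Dict[str, str], constraints: Dict[str, List[str]]) -> bool:
    """True if the node satisfies every constraint (any listed value matches)."""
    # Assumed rule: values within one constraint are alternatives,
    # while separate constraints must all hold.
    return all(node.get(name) in allowed for name, allowed in constraints.items())


# Hypothetical nodes and constraint set.
nodes = [
    {"hostname": "node-a", "cluster": "ai2/cpu-cluster"},
    {"hostname": "node-b", "cluster": "ai2/gpu-cluster"},
]
constraints = {"cluster": ["ai2/cpu-cluster"]}
matches = [n["hostname"] for n in nodes if eligible(n, constraints)]
```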

name: Optional[str]#: Name is used for display and to refer to the task throughout the spec. It must be unique among all tasks within its experiment.

command: Optional[List[Union[str, int, float]]]#

Command is the full shell command to run as a sequence of separate arguments.

If omitted, the image’s default command is used, for example Docker’s ENTRYPOINT directive. If set, default commands such as Docker’s ENTRYPOINT and CMD directives are ignored.

Example: ["python", "-u", "main.py"]

arguments: Optional[List[Union[str, int, float]]]#

Arguments are appended to the command and replace default arguments such as Docker’s CMD directive.

If command is omitted, arguments are appended to the default command, Docker’s ENTRYPOINT directive.

Example: If command is ["python", "-u", "main.py"], specifying arguments ["--quiet", "some-arg"] will run the command python -u main.py --quiet some-arg.
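
The rules for command and arguments described above can be modelled in a few lines. This is a simplified sketch of the documented behaviour, not Beaker's implementation; resolve_argv and its entrypoint/cmd parameters (standing for the image's ENTRYPOINT and CMD) are hypothetical names:

```python
from typing import List, Optional, Sequence, Union

Arg = Union[str, int, float]


def resolve_argv(
    entrypoint: Sequence[str],
    cmd: Sequence[str],
    command: Optional[Sequence[Arg]] = None,
    arguments: Optional[Sequence[Arg]] = None,
) -> List[str]:
    """Model how `command` and `arguments` combine with the image defaults."""
    if command is not None:
        # An explicit command replaces both ENTRYPOINT and CMD.
        base, default_args = list(command), []
    else:
        # Otherwise the image's ENTRYPOINT runs, with CMD as default arguments.
        base, default_args = list(entrypoint), list(cmd)
    # Explicit arguments replace the default arguments.
    args = list(arguments) if arguments is not None else default_args
    return [str(part) for part in base + args]


argv = resolve_argv(
    entrypoint=["/bin/app"],
    cmd=["--help"],
    command=["python", "-u", "main.py"],
    arguments=["--quiet", "some-arg"],
)
```

Here argv holds the same parts as the example above: python -u main.py --quiet some-arg.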

env_vars: Optional[List[EnvVar]]#: Sequence of environment variables passed to the container.

datasets: Optional[List[DataMount]]#: External data sources mounted into the task as files.

model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}#: A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

resources: Optional[TaskResources]#: External hardware requirements, such as memory or GPU devices.

host_networking: bool#: Enables the task to use the host’s network.

replicas: Optional[int]#: The number of replica tasks to create based on this template.

leader_selection: bool#: Enables leader selection for the replicas and passes the leader’s hostname to the replicas.

propagate_failure: Optional[bool]#: Determines whether the whole experiment should fail if this task fails.

classmethod new(name, cluster=None, beaker_image=None, docker_image=None, result_path='/unused', priority=None, **kwargs)[source]#

A convenience method for quickly creating a new TaskSpec.

Parameters:

name (str) – The name of the task.
cluster (Union[str, List[str], None], default: None) –
The cluster or clusters where the experiment can run.

Tip

Omitting the cluster will allow your experiment to run on any on-premise cluster, but you can only do this with preemptible jobs.
beaker_image (Optional[str], default: None) –
The beaker image name in the image source.

Important

Mutually exclusive with docker_image.
docker_image (Optional[str], default: None) –
The docker image name in the image source.

Important

Mutually exclusive with beaker_image.
priority (Union[str, Priority, None], default: None) – The priority of the context.
kwargs – Additional kwargs are passed as-is to TaskSpec.

Return type:: TaskSpec

Examples:

>>> task_spec = TaskSpec.new(
...     "hello-world",
...     cluster="ai2/cpu-cluster",
...     docker_image="hello-world",
... )

with_image(**kwargs)[source]#

Return a new TaskSpec with the given image.

Parameters:: kwargs – Key-word arguments that are passed directly to ImageSource.
Return type:: TaskSpec
Examples:

>>> task_spec = TaskSpec.new(
...     "hello-world",
...     docker_image="hello-world",
... ).with_image(beaker="hello-world")
>>> assert task_spec.image.beaker == "hello-world"

with_result(**kwargs)[source]#

Return a new TaskSpec with the given result.

Parameters:: kwargs – Key-word arguments that are passed directly to ResultSpec.
Return type:: TaskSpec
Examples:

>>> task_spec = TaskSpec.new(
...     "hello-world",
...     docker_image="hello-world",
... ).with_result(path="/output")
>>> assert task_spec.result.path == "/output"

with_context(**kwargs)[source]#

Return a new TaskSpec with the given context.

Parameters:: kwargs – Key-word arguments that are passed directly to TaskContext.
Return type:: TaskSpec
Examples:

>>> task_spec = TaskSpec.new(
...     "hello-world",
...     docker_image="hello-world",
... ).with_context(cluster="ai2/general-cirrascale")
>>> assert task_spec.context.cluster == "ai2/general-cirrascale"

with_name(name)[source]#

Return a new TaskSpec with the given name.

Parameters:: name (str) – The new name.
Return type:: TaskSpec
Examples:

>>> task_spec = TaskSpec.new(
...     "hello-world",
...     docker_image="hello-world",
... ).with_name("Hi there!")
>>> assert task_spec.name == "Hi there!"

with_command(command)[source]#

Return a new TaskSpec with the given command.

Parameters:: command (List[str]) – The new command.
Return type:: TaskSpec
Examples:

>>> task_spec = TaskSpec.new(
...     "hello-world",
...     docker_image="hello-world",
... ).with_command(["echo"])
>>> assert task_spec.command == ["echo"]

with_arguments(arguments)[source]#

Return a new TaskSpec with the given arguments.

Parameters:: arguments (List[str]) – The new arguments.
Return type:: TaskSpec
Examples:

>>> task_spec = TaskSpec.new(
...     "hello-world",
...     docker_image="hello-world",
... ).with_arguments(["Hello", "World!"])
>>> assert task_spec.arguments == ["Hello", "World!"]

with_resources(**kwargs)[source]#

Return a new TaskSpec with the given resources.

Parameters:: kwargs – Key-word arguments are passed directly to TaskResources.
Return type:: TaskSpec
Examples:

>>> task_spec = TaskSpec.new(
...     "hello-world",
...     docker_image="hello-world",
... ).with_resources(gpu_count=2)
>>> assert task_spec.resources.gpu_count == 2

with_dataset(mount_path, **kwargs)[source]#

Return a new TaskSpec with an additional input dataset.

Parameters:

mount_path (str) – The mount_path of the DataMount.
kwargs – Additional kwargs are passed as-is to DataMount.new().

Return type:: TaskSpec

Examples:

>>> task_spec = TaskSpec.new(
...     "hello-world",
...     docker_image="hello-world",
... ).with_dataset("/data/foo", beaker="foo")
>>> assert task_spec.datasets

with_env_var(name, value=None, secret=None)[source]#

Return a new TaskSpec with an additional input env_var.

Parameters:

name (str) – The name of the EnvVar.
value (Optional[str], default: None) – The value of the EnvVar.
secret (Optional[str], default: None) – The secret of the EnvVar.

Return type:: TaskSpec

Examples:

>>> task_spec = TaskSpec.new(
...     "hello-world",
...     docker_image="hello-world",
...     env_vars=[EnvVar(name="bar", value="secret!")],
... ).with_env_var("baz", value="top, top secret")
>>> assert len(task_spec.env_vars) == 2

with_constraint(**kwargs)[source]#

Return a new TaskSpec with the given constraints.

Parameters:: kwargs (List[str]) – Each keyword is a constraint name, and its value is the list of values for that constraint.
Return type:: TaskSpec
Examples:

>>> task_spec = TaskSpec.new(
...     "hello-world",
...     docker_image="hello-world",
... ).with_constraint(cluster=['ai2/cpu-cluster'])
>>> assert task_spec.constraints['cluster'] == ['ai2/cpu-cluster']

class beaker.ImageSource(**data)[source]#

Bases: BaseModel

ImageSource describes where Beaker can find a task’s image. Beaker will automatically pull, or download, this image immediately before running the task.

Attention

One of either ‘beaker’ or ‘docker’ must be set, but not both.

beaker: Optional[str]#: The full name or ID of a Beaker image.

docker: Optional[str]#: The tag of a Docker image hosted on the Docker Hub or a private registry.

Note

If the tag is from a private registry, the cluster on which the task will run must be pre-configured to enable access.
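
The "one of, but not both" rule for beaker and docker can be expressed as a small check. MiniImageSource is a hypothetical stand-in written with the standard library; where and when Beaker itself enforces the rule is not stated here:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MiniImageSource:
    """Hypothetical stand-in that enforces the 'exactly one source' rule."""

    beaker: Optional[str] = None
    docker: Optional[str] = None

    def __post_init__(self) -> None:
        # Reject the cases where both fields are set or both are missing.
        if (self.beaker is None) == (self.docker is None):
            raise ValueError("Set exactly one of 'beaker' or 'docker'.")


def is_valid(**kwargs) -> bool:
    """Report whether the given fields form a valid image source."""
    try:
        MiniImageSource(**kwargs)
        return True
    except ValueError:
        return False
```

For instance, is_valid(docker="hello-world") passes, while is_valid() and is_valid(beaker="a", docker="b") are both rejected.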

model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}#: A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

class beaker.EnvVar(**data)[source]#

Bases: BaseModel

An EnvVar defines an environment variable within a task’s container.

Tip

If neither ‘value’ nor ‘secret’ is set, the value of the environment variable will default to “”.

name: str#: Name of the environment variable following Unix rules. Environment variable names are case sensitive and must be unique.

value: Optional[str]#: Literal value which can include spaces and special characters.

secret: Optional[str]#: Source the environment variable from a secret in the experiment’s workspace.
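
As a rough illustration of how value and secret relate, the sketch below resolves the string a container would see for one variable. resolve_env_var and the workspace_secrets dictionary are hypothetical, and rejecting the case where both fields are set is an assumption, since that case is not described here:

```python
from typing import Dict, Optional


def resolve_env_var(
    value: Optional[str],
    secret: Optional[str],
    workspace_secrets: Dict[str, str],
) -> str:
    """Return the string a container would see for one environment variable."""
    if value is not None and secret is not None:
        # Assumption: treat "both set" as an error rather than pick a winner.
        raise ValueError("Set at most one of 'value' or 'secret'.")
    if secret is not None:
        # Look the secret up by name in the (hypothetical) workspace store.
        return workspace_secrets[secret]
    if value is not None:
        return value
    # Neither field set: the variable defaults to the empty string.
    return ""
```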

model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}#: A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

class beaker.DataMount(**data)[source]#

Bases: BaseModel

Describes how to mount a dataset into a task. All datasets are mounted read-only.