
Task Scheduling & Workflows

Configure task dependencies, DAG-based workflows, jobs, and scheduling in Propeller

Overview

Propeller schedules and executes tasks across distributed edge devices. When tasks depend on each other, Propeller ensures they run in the correct order.

What is a DAG?

A DAG (Directed Acyclic Graph) defines which tasks must complete before others can start:

Basic DAG

  • Tasks are the individual steps in your workflow
  • Dependencies specify which tasks must finish first
  • Acyclic means the workflow only moves forward—no circular dependencies

Tasks without dependencies on each other run in parallel.

| Benefit | Description |
| --- | --- |
| Correct order | Tasks wait for their dependencies to complete |
| Parallel execution | Independent tasks run simultaneously |
| Error isolation | Failures affect only downstream tasks |
| Clear structure | The graph shows the entire workflow |
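
The dependency rules above can be sketched in a few lines of Python. This is an illustrative model, not Propeller's implementation; the task names are hypothetical:

```python
# A DAG as a mapping from task name to the names it depends on.
dag = {
    "extract": [],
    "transform": ["extract"],
    "load": ["transform"],
    "audit": ["extract"],  # independent of transform/load
}

def ready_tasks(dag, completed):
    """Tasks whose dependencies are all complete and that haven't run yet."""
    return sorted(
        t for t, deps in dag.items()
        if t not in completed and all(d in completed for d in deps)
    )

# With nothing complete, only the root task is ready.
print(ready_tasks(dag, set()))        # ['extract']
# Once extract completes, transform and audit can run in parallel.
print(ready_tasks(dag, {"extract"}))  # ['audit', 'transform']
```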

Architecture

Propeller DAG Architecture

Components

| Component | Description |
| --- | --- |
| Manager | Central coordinator that receives workflow definitions and orchestrates execution |
| Jobs | Groups of related tasks with a shared execution mode (parallel or sequential) |
| Workflows | DAG-based task definitions with explicit dependencies between tasks |
| Standalone Tasks | Individual tasks executed independently without belonging to a job or workflow |
| Scheduling | Determines task ordering: priority, cron schedules, or round-robin distribution |
| Tasks | The atomic unit of work—a WebAssembly module executed on a proplet |
| Proplets | Edge devices that receive and execute tasks assigned by the Manager |

Jobs

Jobs group related tasks under a configurable execution mode (parallel or sequential). Executing tasks with a common strategy simplifies management of multi-task workloads.

Creating a Job

Send a POST request to the /jobs endpoint with a name, execution mode, and list of tasks:

curl -X POST "http://localhost:7070/jobs" \
-H "Content-Type: application/json" \
-d '{
  "name": "data-pipeline",
  "execution_mode": "sequential",
  "tasks": [
    {
      "name": "extract",
      "image_url": "docker.io/myorg/extract:v1"
    },
    {
      "name": "transform",
      "image_url": "docker.io/myorg/transform:v1"
    },
    {
      "name": "load",
      "image_url": "docker.io/myorg/load:v1"
    }
  ]
}'

Your output should look like this:

{
  "job_id": "job-abc123",
  "tasks": [
    {"id": "task-001", "name": "extract", "state": 0},
    {"id": "task-002", "name": "transform", "state": 0},
    {"id": "task-003", "name": "load", "state": 0}
  ]
}
  • job_id: Unique identifier assigned to the job. Use this to start, stop, or query the job.
  • tasks: Each task receives its own id and starts in state: 0 (pending).

Execution Modes

| Mode | Behavior |
| --- | --- |
| parallel | All tasks start simultaneously on available proplets |
| sequential | Tasks run one at a time in order; first failure stops the job (fail-fast) |
| configurable | Uses DAG-based topological sort: starts tasks with no dependencies first, then starts dependent tasks as their prerequisites complete |
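
The fail-fast behavior of sequential mode can be sketched as follows. This is illustrative only; run_task is a hypothetical stand-in for dispatching a task to a proplet:

```python
def run_sequential(tasks, run_task):
    """Run tasks one at a time; stop at the first failure (fail-fast)."""
    results = {}
    for task in tasks:
        ok = run_task(task)
        results[task] = "completed" if ok else "failed"
        if not ok:
            break  # remaining tasks are never started
    return results

results = run_sequential(
    ["extract", "transform", "load"],
    run_task=lambda t: t != "transform",  # simulate a failure in transform
)
print(results)  # {'extract': 'completed', 'transform': 'failed'}
```

Note that load never runs: the job halts at the first failure, leaving later tasks pending.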

Starting a Job

Trigger execution of a job by sending a POST request to the start endpoint:

curl -X POST "http://localhost:7070/jobs/job-abc123/start"

Your output should look like this:

{
  "job_id": "job-abc123",
  "message": "job started"
}
  • job_id: The ID of the job that was started.
  • message: Confirms the job was successfully queued for execution.

Stopping a Job

Halt a running job and cancel any pending tasks:

curl -X POST "http://localhost:7070/jobs/job-abc123/stop"

Your output should look like this:

{
  "job_id": "job-abc123",
  "message": "job stopped"
}

Listing Jobs

Retrieve all jobs with pagination support:

curl "http://localhost:7070/jobs?offset=0&limit=100"

Your output should look like this:

{
  "offset": 0,
  "limit": 100,
  "total": 2,
  "jobs": [
    {
      "job_id": "38c745f1-b5c8-4e72-8c0a-f269ec0637f5",
      "name": "sequential-pipeline",
      "state": 3,
      "tasks": [...],
      "created_at": "2026-03-01T17:13:24.540925091Z"
    },
    {
      "job_id": "981a9c91-92e2-4518-a0b6-46b8b590d9bb",
      "name": "addition-pipeline",
      "state": 3,
      "tasks": [...],
      "created_at": "2026-03-01T17:05:45.493390418Z"
    }
  ]
}
  • offset / limit: Pagination parameters from the request.
  • total: Total number of jobs in the system.
  • jobs: Array of job summaries with nested task arrays.

Immutable Jobs

Jobs cannot be updated or deleted once created. To modify a job, create a new one with the desired configuration.

This immutable design provides several benefits:

| Benefit | Description |
| --- | --- |
| Auditability | Every job version is preserved for tracing exactly what ran and when |
| Reproducibility | Re-running a job ID always references the same definition |
| Concurrency safety | No race conditions between execution and configuration changes |
| Debugging | Failed jobs retain their original configuration for post-mortem analysis |

Workflows

Workflows provide DAG-based task orchestration with dependencies and conditional execution. Unlike jobs, workflows allow fine-grained control over task dependencies using depends_on arrays and conditional execution with run_if fields.

DAG Concepts

Workflows execute tasks based on dependency relationships and conditional logic. The following diagram shows a typical workflow with success and failure branches:

DAG Run-If Success/Failure Branching

Execution flow:

  1. Payment Task executes first
  2. Based on outcome, one of two paths runs:
    • On success: Send Confirmation task runs (run_if: success)
    • On failure: Send Failure Alert task runs (run_if: failure)

Only one downstream task executes—never both. This pattern is ideal for notifications, cleanup, or any action that depends on whether the upstream task succeeded or failed.

Task Dependencies

The following diagram shows how depends_on chains tasks together:

Task Dependencies

Execution flow:

  • Task A has no dependencies and runs first
  • Task B depends on Task A (depends_on: ["task-a"])
  • Task C depends on both Task A and Task B (depends_on: ["task-a", "task-b"])
  • Task C waits for all its dependencies to complete before starting

This pattern is common when a task needs results from multiple upstream tasks before it can proceed.

Fan-Out and Fan-In

DAG workflows support two common parallel execution patterns for distributing and consolidating work.

Fan-Out

A single parent task triggers multiple child tasks. All children start simultaneously when the parent completes. This pattern maximizes parallelism for independent work.

Fan-Out and Fan-In

Example (image processing pipeline):

  • Ingest task runs first—loads the source image
  • Ingest triggers three parallel tasks: Resize, Watermark, Compress
  • All three run simultaneously since they have no dependencies on each other

Fan-In

Multiple tasks converge into a single downstream task. The downstream task waits for all upstream tasks to complete. This pattern is used for aggregation, merging results, or synchronization points.

Example (image processing pipeline):

  • Merge & Upload depends on Resize, Watermark, and Compress
  • It waits for all three to complete before starting
  • Combines results and uploads the final processed image
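
Both patterns reduce to a depends_on mapping. Here is a minimal Python sketch of the image pipeline above (illustrative, not Propeller code; the task names follow the example):

```python
# Fan-out/fan-in pipeline expressed as task -> dependencies.
pipeline = {
    "ingest": [],
    "resize": ["ingest"],
    "watermark": ["ingest"],
    "compress": ["ingest"],
    "merge-upload": ["resize", "watermark", "compress"],
}

def fan_out(dag, task):
    """Tasks that become eligible when `task` completes (its direct dependents)."""
    return sorted(t for t, deps in dag.items() if task in deps)

# Ingest fans out to three parallel tasks.
print(fan_out(pipeline, "ingest"))  # ['compress', 'resize', 'watermark']
# Fan-in is just the dependency list of the converging task.
print(pipeline["merge-upload"])     # ['resize', 'watermark', 'compress']
```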

Creating a Workflow

When tasks have dependencies, specify explicit id values so depends_on can reference them:

curl -X POST "http://localhost:7070/workflows" \
-H "Content-Type: application/json" \
-d '{
  "name": "etl-workflow",
  "tasks": [
    {
      "id": "fetch-a",
      "name": "fetch-source-a",
      "image_url": "docker.io/myorg/fetch:v1",
      "env": {"SOURCE": "database-a"}
    },
    {
      "id": "fetch-b",
      "name": "fetch-source-b",
      "image_url": "docker.io/myorg/fetch:v1",
      "env": {"SOURCE": "database-b"}
    },
    {
      "id": "merge",
      "name": "merge-data",
      "image_url": "docker.io/myorg/merge:v1",
      "depends_on": ["fetch-a", "fetch-b"],
      "run_if": "success"
    },
    {
      "id": "report",
      "name": "generate-report",
      "image_url": "docker.io/myorg/report:v1",
      "depends_on": ["merge"],
      "run_if": "success"
    },
    {
      "id": "alert",
      "name": "send-alert",
      "image_url": "docker.io/myorg/alert:v1",
      "depends_on": ["merge"],
      "run_if": "failure"
    }
  ]
}'

Request fields:

| Field | Description |
| --- | --- |
| id | Explicit task identifier used by depends_on references. Required for tasks that others depend on. |
| depends_on | Array of task IDs that must complete before this task starts. fetch-a and fetch-b have no dependencies (root tasks), while merge depends on both. |
| run_if | Conditional execution—merge and report run on success; alert runs only if merge fails. |
| image_url | WASM module location. Multiple tasks can share the same image with different env values. |

Your output should look like this:

{
  "tasks": [
    {"id": "fetch-a", "name": "fetch-source-a", "workflow_id": "...", "state": 0},
    {"id": "fetch-b", "name": "fetch-source-b", "workflow_id": "...", "state": 0},
    {"id": "merge", "name": "merge-data", "workflow_id": "...", "depends_on": ["fetch-a", "fetch-b"], "run_if": "success", "state": 0},
    {"id": "report", "name": "generate-report", "workflow_id": "...", "depends_on": ["merge"], "run_if": "success", "state": 0},
    {"id": "alert", "name": "send-alert", "workflow_id": "...", "depends_on": ["merge"], "run_if": "failure", "state": 0}
  ]
}

Response fields:

| Field | Description |
| --- | --- |
| workflow_id | All tasks share the same auto-generated workflow ID, linking them as a single workflow. Use this to identify all tasks belonging to this workflow. |
| depends_on | Omitted from the response for root tasks (field is absent when empty). |
| run_if | Omitted from the response when not explicitly set; defaults to success behavior at runtime. |

Task and Dependency Fields

| Field | Type | Description |
| --- | --- | --- |
| id | string | Task identifier. If omitted, auto-generated as a UUID. Specify explicitly when other tasks need to reference it in depends_on. |
| depends_on | array of string | Task IDs that must complete before this task starts |
| run_if | string | Condition for execution: success (default) or failure |
| workflow_id | string (auto) | ID of the workflow this task belongs to |

The depends_on field references task IDs that must complete before this task runs. The manager validates:

  • All referenced task IDs exist in the workflow
  • No circular dependencies exist (DAG validation)

WASM Module Deployment

Tasks execute independently on proplets—possibly on different nodes—so each task needs access to its WASM module. There are two ways to provide it:

  1. Upload directly: Use PUT /tasks/{id}/upload with a multipart form to upload a .wasm file. The module is stored with the task record.
  2. Reference via URL: Set image_url when creating the task to reference a shared WASM file from a URL (e.g., OCI registry, HTTP server). The proplet fetches the module when execution begins.

Data Passing Between Tasks

When tasks depend on each other, downstream tasks often need access to results from upstream tasks. For example, a data transformation task needs the raw data extracted by its parent task, or an aggregation task needs outputs from multiple parallel processing tasks.

Propeller passes task results through the workflow coordinator. Each task receives the outputs from the tasks listed in its depends_on array—and only those outputs. This scoped data passing keeps workflows efficient and predictable.

Data Passing Between Tasks

Diagram explanation:

  • Task A produces results and completes
  • Task B receives Task A's results, processes them, and produces its own results
  • Task C receives results from both Task A and Task B
  • Each task receives only the outputs from tasks listed in its depends_on array
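
A minimal Python sketch of this scoping rule (illustrative; the task IDs and result payloads are hypothetical):

```python
# Outputs recorded by the coordinator as upstream tasks complete.
outputs = {
    "task-a": {"rows": 100},
    "task-b": {"rows_clean": 97},
}

def inputs_for(task, outputs):
    """Collect upstream results scoped to this task's depends_on list."""
    return {dep: outputs[dep] for dep in task.get("depends_on", [])}

task_c = {"id": "task-c", "depends_on": ["task-a", "task-b"]}
print(inputs_for(task_c, outputs))
# {'task-a': {'rows': 100}, 'task-b': {'rows_clean': 97}}

# A root task with no dependencies receives no upstream data.
print(inputs_for({"id": "root"}, outputs))  # {}
```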

run_if Values

Conditional execution lets you build workflows that respond to task outcomes. Common use cases include:

  • Success handlers: Send notifications, trigger deployments, or start downstream processing only when upstream tasks succeed
  • Failure handlers: Send alerts, run cleanup routines, or log diagnostics when tasks fail

Conditional Branching

Diagram explanation:

  • Parent Task completes with either success or failure
  • Child Task (run_if: success) executes only when all parents succeeded
  • Child Task (run_if: failure) executes only when at least one parent failed
  • When conditions aren't met, tasks are marked Skipped (terminal state)
  • Skipped tasks don't trigger their downstream dependencies

| Value | Behavior |
| --- | --- |
| success (default) | Task runs only if all dependencies completed successfully |
| failure | Task runs only if any dependency failed |
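
The two conditions can be sketched as follows (an illustrative model of the rules above, not Propeller's source):

```python
def should_run(run_if, dep_states):
    """Evaluate a task's run_if condition against its dependencies' states."""
    if run_if == "failure":
        # Runs only if at least one dependency failed.
        return any(s == "failed" for s in dep_states)
    # Default "success": runs only if every dependency completed.
    return all(s == "completed" for s in dep_states)

print(should_run("success", ["completed", "completed"]))  # True
print(should_run("success", ["completed", "failed"]))     # False -> task is Skipped
print(should_run("failure", ["completed", "failed"]))     # True
```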

Starting a Workflow

Start the root task (a task with no depends_on). The manager automatically schedules dependent tasks as their prerequisites complete:

curl -X POST "http://localhost:7070/tasks/{root_task_id}/start"

The workflow coordinator will:

  1. Identify tasks with no dependencies (roots) — these are started manually
  2. Monitor root task completion
  3. Start dependent tasks when their prerequisites are satisfied
  4. Evaluate run_if conditions before starting each task

For a complete working example with real WASM modules and actual API responses, see DAG Workflows and Jobs.

Workflow vs Job Comparison

| Aspect | Workflow | Job |
| --- | --- | --- |
| Endpoint | POST /workflows | POST /jobs |
| Execution order | Defined by depends_on arrays | Controlled by execution_mode |
| Dependencies | Explicit task-to-task | None—mode determines order |
| Use case | Complex DAGs with fan-out/fan-in | Simple parallel or sequential batches |
| Start command | POST /tasks/{task_id}/start (root task) | POST /jobs/{job_id}/start |

DAG Validation

The workflow coordinator validates DAG structure before execution:

Circular Dependency Detection

A circular dependency occurs when tasks form a loop—Task A depends on Task B, which depends on Task C, which depends back on Task A. This creates an impossible execution order where no task can start because each is waiting for another.

Circular dependencies are not allowed because:

  • Deadlock: Tasks wait forever since their prerequisites never complete
  • Invalid DAG: A graph with cycles is not acyclic, violating the DAG definition
  • Unpredictable behavior: The workflow has no valid starting point

The coordinator detects and rejects circular dependencies:

{
  "tasks": [
    {"name": "task-a", "depends_on": ["task-c"]},
    {"name": "task-b", "depends_on": ["task-a"]},
    {"name": "task-c", "depends_on": ["task-b"]}
  ]
}

Error response:

{
  "error": "DAG validation failed: circular dependency detected: cycle detected involving tasks task-a and task-b"
}

Dependency Existence Validation

All referenced task IDs must exist:

{
  "tasks": [
    {"name": "process", "depends_on": ["nonexistent-task"]}
  ]
}

Error response:

{
  "error": "dependency validation failed: dependency task not found: task process depends on nonexistent-task which does not exist"
}

Topological Sorting

The coordinator performs topological sorting to determine execution order:

  1. Tasks with no dependencies execute first
  2. Tasks are ordered so all dependencies run before the dependent
  3. Tasks at the same level can run in parallel
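
This ordering can be sketched with Kahn's algorithm, which detects cycles as a side effect (illustrative Python, not Propeller's implementation):

```python
from collections import deque

def topo_sort(deps):
    """deps maps task id -> list of task ids it depends on.

    Returns a valid execution order, or raises ValueError for a
    missing dependency or a cycle.
    """
    # Validate that every referenced dependency exists.
    for task, ds in deps.items():
        for d in ds:
            if d not in deps:
                raise ValueError(f"{task} depends on {d} which does not exist")
    indegree = {t: len(ds) for t, ds in deps.items()}
    dependents = {t: [] for t in deps}
    for t, ds in deps.items():
        for d in ds:
            dependents[d].append(t)
    # Start with the root tasks (no dependencies).
    queue = deque(sorted(t for t, n in indegree.items() if n == 0))
    order = []
    while queue:
        t = queue.popleft()
        order.append(t)
        for nxt in dependents[t]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    # If some tasks were never reachable, a cycle must exist.
    if len(order) != len(deps):
        raise ValueError("circular dependency detected")
    return order

print(topo_sort({"fetch-a": [], "fetch-b": [], "merge": ["fetch-a", "fetch-b"]}))
# ['fetch-a', 'fetch-b', 'merge']
```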

Task Scheduling

Priority-Based Scheduling

Priority scheduling controls task dispatch order based on priority values. When multiple tasks are pending, the Manager dispatches higher-priority tasks first.

How Priority Works

Each task has a priority field (0–100). When the Manager has multiple pending tasks, it dispatches the highest-priority task first. This ensures urgent work gets processed before background tasks, even if background tasks were created earlier.
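
A common way to model this is a max-priority queue. The sketch below is illustrative Python, not Propeller's scheduler; the tie-breaking counter (FIFO within a priority level) is an assumption:

```python
import heapq
import itertools

class PendingQueue:
    """Pending tasks ordered by descending priority, FIFO within a level."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()

    def add(self, name, priority=50):
        # Negate priority so heapq's min-heap pops the highest value first.
        heapq.heappush(self._heap, (-priority, next(self._counter), name))

    def next_task(self):
        return heapq.heappop(self._heap)[2]

q = PendingQueue()
q.add("batch-report", priority=20)
q.add("urgent-task", priority=90)
q.add("default-task")  # default priority 50
print(q.next_task())  # urgent-task
print(q.next_task())  # default-task
print(q.next_task())  # batch-report
```

The urgent task is dispatched first even though the batch report was created earlier.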

Priority Within Level

Setting Task Priority

curl -X POST "http://localhost:7070/tasks" \
-H "Content-Type: application/json" \
-d '{
  "name": "urgent-task",
  "image_url": "docker.io/myorg/process:v1",
  "priority": 90
}'

Priority Levels

| Priority | Range | Description |
| --- | --- | --- |
| Low | 0–30 | Background tasks, batch processing |
| Normal | 31–70 | Default priority (50) |
| High | 71–100 | Urgent tasks, real-time processing |

Priority vs Proplet Selection

Priority determines which task is dispatched next. Round Robin determines which proplet receives the task. A high-priority task will be dispatched before lower-priority tasks, but proplet selection still cycles evenly.

Round Robin Proplet Selection

The Manager cycles through alive proplets when assigning tasks:

Task 1 → Proplet A
Task 2 → Proplet B
Task 3 → Proplet C
Task 4 → Proplet A
Task 5 → Proplet B
...

Only proplets with recent heartbeats (within the liveliness threshold) are eligible.
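
A minimal Python sketch of this selection logic (illustrative; the proplet IDs and the 10-second threshold are assumptions, not Propeller defaults):

```python
import time

LIVELINESS_THRESHOLD = 10.0  # seconds; assumed value for illustration

def pick_proplet(heartbeats, last_index, now=None):
    """Return (proplet_id, new_index), cycling over alive proplets only."""
    now = time.time() if now is None else now
    # Filter out proplets whose last heartbeat is stale.
    alive = sorted(p for p, seen in heartbeats.items()
                   if now - seen <= LIVELINESS_THRESHOLD)
    if not alive:
        raise RuntimeError("no alive proplets")
    index = (last_index + 1) % len(alive)
    return alive[index], index

heartbeats = {"proplet-a": 100.0, "proplet-b": 99.0, "proplet-c": 50.0}
chosen, idx = pick_proplet(heartbeats, last_index=-1, now=101.0)
print(chosen)  # proplet-a  (proplet-c's heartbeat is stale, so it is skipped)
chosen, idx = pick_proplet(heartbeats, idx, now=101.0)
print(chosen)  # proplet-b
```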

Pinning to a Specific Proplet

Override scheduling by specifying a proplet:

curl -X POST "http://localhost:7070/tasks" \
-H "Content-Type: application/json" \
-d '{
  "name": "pinned-task",
  "image_url": "docker.io/myorg/process:v1",
  "proplet_id": "proplet-specific-001"
}'

Cron Scheduling

Cron scheduling enables time-based task execution with cron expressions. Tasks can run on fixed schedules—hourly, daily, weekly—or at specific times using standard cron syntax.

Creating a Scheduled Task

curl -X POST "http://localhost:7070/tasks" \
-H "Content-Type: application/json" \
-d '{
  "name": "hourly-cleanup",
  "image_url": "docker.io/myorg/cleanup:v1",
  "schedule": "0 * * * *",
  "timezone": "UTC",
  "is_recurring": true
}'

Cron Expression Format

┌───────────── minute (0-59)
│ ┌───────────── hour (0-23)
│ │ ┌───────────── day of month (1-31)
│ │ │ ┌───────────── month (1-12)
│ │ │ │ ┌───────────── day of week (0-6, Sun=0)
│ │ │ │ │
* * * * *

Common Cron Patterns

| Pattern | Description |
| --- | --- |
| 0 * * * * | Every hour at minute 0 |
| 0 0 * * * | Daily at midnight |
| 0 0 * * 0 | Weekly on Sunday at midnight |
| 0 0 1 * * | Monthly on the 1st at midnight |
| */15 * * * * | Every 15 minutes |
| 0 9-17 * * 1-5 | Every hour 9am-5pm, Mon-Fri |
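
A simple matcher for these patterns can be sketched in Python. This is illustrative only: it covers the syntax shown above ("*", "*/n", ranges, plain numbers, comma-separated lists) and treats Sunday as 0, but it is not Propeller's parser:

```python
from datetime import datetime

def field_matches(spec, value):
    """Match one cron field against a numeric value."""
    for part in spec.split(","):
        if part == "*":
            return True
        if part.startswith("*/"):
            if value % int(part[2:]) == 0:
                return True
        elif "-" in part:
            lo, hi = map(int, part.split("-"))
            if lo <= value <= hi:
                return True
        elif int(part) == value:
            return True
    return False

def cron_matches(expr, dt):
    """True when dt falls on the schedule described by expr."""
    minute, hour, dom, month, dow = expr.split()
    return (field_matches(minute, dt.minute)
            and field_matches(hour, dt.hour)
            and field_matches(dom, dt.day)
            and field_matches(month, dt.month)
            and field_matches(dow, dt.isoweekday() % 7))  # Sunday = 0

print(cron_matches("0 * * * *", datetime(2026, 2, 27, 11, 0)))       # True
print(cron_matches("0 9-17 * * 1-5", datetime(2026, 2, 27, 13, 0)))  # True (a Friday)
```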

Timezone Support

Specify timezone for cron evaluation:

{
  "schedule": "0 9 * * *",
  "timezone": "America/New_York"
}

The Manager evaluates the cron expression in the specified timezone and calculates next_run accordingly.

Viewing Next Run

curl http://localhost:7070/tasks/task-abc123 | jq '{next_run, is_recurring}'

Response:

{
  "next_run": "2026-02-27T11:00:00Z",
  "is_recurring": true
}

Stopping a Recurring Task

Stop a recurring task to halt future scheduled executions:

curl -X POST "http://localhost:7070/tasks/task-abc123/stop"

Your output should look like this:

{"stopped": true}

The response indicates the task was successfully stopped. The task's state changes to 6 (Interrupted).

Alternatively, update the task to disable recurrence while keeping the task definition:

curl -X PUT "http://localhost:7070/tasks/task-abc123" \
-H "Content-Type: application/json" \
-d '{"is_recurring": false}'

Your output should look like this:

{
  "id": "task-abc123",
  "name": "hourly-cleanup",
  "state": 0,
  "is_recurring": false,
  "schedule": "0 * * * *",
  "updated_at": "2026-03-01T18:25:26.423162005Z"
}

The response returns the full updated task. Key changes:

  • is_recurring: Now false—the task won't run on schedule
  • updated_at: Reflects when the change was applied
  • schedule: Retained for reference, but won't trigger execution

Task States

Tasks move through these states during execution:

| Value | Name | Description |
| --- | --- | --- |
| 0 | Pending | Task created, not yet started |
| 1 | Scheduled | Task assigned to a proplet |
| 2 | Running | Task is executing on a proplet |
| 3 | Completed | Task finished successfully |
| 4 | Failed | Task encountered an error |
| 5 | Skipped | Task skipped (e.g., run_if condition not met) |
| 6 | Interrupted | Task was stopped externally |

Task State Transitions

Interrupted Tasks

A task enters the Interrupted state (6) when it's stopped before natural completion. This can happen through:

  • Manual stop: Calling POST /tasks/{id}/stop on a running or pending task
  • Job stop: Calling POST /jobs/{id}/stop interrupts all tasks in the job
  • Workflow cancellation: Stopping a root task may interrupt dependent tasks
  • System shutdown: Proplet disconnection during execution

Interrupted tasks:

  • Don't trigger downstream dependencies in workflows
  • Can be restarted with POST /tasks/{id}/start
  • Retain their configuration for debugging or retry

State Transitions in Workflows

The diagram above shows the complete state machine. Key transitions:

  • Normal flow: Pending → Scheduled → Running → Completed
  • Error path: Running → Failed
  • Conditional skip: Pending → Skipped (when run_if condition not met)
  • Manual stop: Any active state → Interrupted
  • Recovery: Interrupted → Pending (via restart)
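
These transitions can be modeled as a small state machine. The sketch is illustrative Python; the allowed-transition sets are inferred from the transitions listed above, and the numeric codes match the Task States table:

```python
PENDING, SCHEDULED, RUNNING, COMPLETED, FAILED, SKIPPED, INTERRUPTED = range(7)

# Inferred allowed transitions; terminal states have no outgoing edges.
ALLOWED = {
    PENDING: {SCHEDULED, SKIPPED, INTERRUPTED},
    SCHEDULED: {RUNNING, INTERRUPTED},
    RUNNING: {COMPLETED, FAILED, INTERRUPTED},
    INTERRUPTED: {PENDING},  # recovery via POST /tasks/{id}/start
    COMPLETED: set(),
    FAILED: set(),
    SKIPPED: set(),
}

def transition(state, target):
    """Move to `target` if the transition is legal, else raise."""
    if target not in ALLOWED[state]:
        raise ValueError(f"invalid transition: {state} -> {target}")
    return target

state = PENDING
for nxt in (SCHEDULED, RUNNING, COMPLETED):
    state = transition(state, nxt)
print(state)  # 3 (Completed)
```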

Daemon Tasks

For long-running services that should run continuously:

curl -X POST "http://localhost:7070/tasks" \
-H "Content-Type: application/json" \
-d '{
  "name": "api-server",
  "image_url": "docker.io/myorg/api:v1",
  "daemon": true
}'

Your output should look like this:

{
  "id": "8c5f2631-1ac9-4fc1-968d-a7cbdb14d660",
  "name": "api-server",
  "kind": "standard",
  "state": 0,
  "image_url": "docker.io/myorg/api:v1",
  "daemon": true,
  "encrypted": false,
  "start_time": "0001-01-01T00:00:00Z",
  "finish_time": "0001-01-01T00:00:00Z",
  "created_at": "2026-03-01T18:39:34.055594605Z",
  "updated_at": "0001-01-01T00:00:00Z",
  "priority": 50
}

Daemon tasks:

  • Run until explicitly stopped
  • Automatically restart on the same proplet if they crash
  • Don't count toward workflow completion

For complete working examples including ETL pipelines and troubleshooting guides, see DAG Workflows and Jobs.
