Writing Your Own Scenarios¶
This tutorial walks through designing a custom ncsim scenario from scratch. By the end, you will have a complete YAML file ready to run.
Step 1: Define Your Network¶
Start by choosing a topology: how many nodes you need and how they connect.
Key decisions:
- Node count and compute capacities -- heterogeneous capacities create interesting scheduling trade-offs.
- Link topology -- fully connected, line, ring, mesh, star, or tree.
- Link properties -- bandwidth (MB/s) and latency (seconds).
- Positions -- required if you plan to use interference models. Coordinates are in meters.
Tip
Links in ncsim are directional. If you need bidirectional communication between two nodes, define two links (one in each direction).
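Because every link is one-directional, writing bidirectional topologies by hand doubles the bookkeeping. A small helper can generate both directions from one spec; this is a hypothetical convenience function (not part of ncsim), and the `l_<from><to>` id scheme is illustrative:

```python
# Hypothetical helper (not part of ncsim): expand one bidirectional link
# spec into the two directional entries ncsim expects.
def make_bidirectional(a, b, bandwidth, latency):
    return [
        {"id": f"l_{a}{b}", "from": a, "to": b,
         "bandwidth": bandwidth, "latency": latency},
        {"id": f"l_{b}{a}", "from": b, "to": a,
         "bandwidth": bandwidth, "latency": latency},
    ]

links = make_bidirectional("n0", "n1", bandwidth=100, latency=0.001)
print(links[0]["id"], links[1]["id"])  # l_n0n1 l_n1n0
```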
Example: 4-node mesh
network:
  nodes:
    - {id: n0, compute_capacity: 100, position: {x: 0, y: 0}}
    - {id: n1, compute_capacity: 150, position: {x: 20, y: 0}}
    - {id: n2, compute_capacity: 100, position: {x: 0, y: 20}}
    - {id: n3, compute_capacity: 200, position: {x: 20, y: 20}}
  links:
    # Horizontal links (bidirectional)
    - {id: l01, from: n0, to: n1, bandwidth: 100, latency: 0.001}
    - {id: l10, from: n1, to: n0, bandwidth: 100, latency: 0.001}
    - {id: l23, from: n2, to: n3, bandwidth: 100, latency: 0.001}
    - {id: l32, from: n3, to: n2, bandwidth: 100, latency: 0.001}
    # Vertical links (bidirectional)
    - {id: l02, from: n0, to: n2, bandwidth: 50, latency: 0.002}
    - {id: l20, from: n2, to: n0, bandwidth: 50, latency: 0.002}
    - {id: l13, from: n1, to: n3, bandwidth: 50, latency: 0.002}
    - {id: l31, from: n3, to: n1, bandwidth: 50, latency: 0.002}
This creates a 4-node mesh where horizontal links are faster (100 MB/s) than vertical links (50 MB/s), giving the scheduler something to optimize around.
Step 2: Design Your DAG¶
Define the computation tasks and the data dependencies between them.
Key decisions:
- Task compute costs -- in compute units. Runtime = compute_cost / node.compute_capacity.
- Edge data sizes -- in MB. Transfer time = data_size / effective_bandwidth + latency.
- DAG shape -- chain, fork-join, diamond, parallel, or a custom structure.
- Pinning -- optionally force tasks onto specific nodes with pinned_to.
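The two cost formulas above drive everything the scheduler does, so it is worth internalizing them. A minimal sketch (variable names are illustrative; the formulas are the ones stated above):

```python
# Sketch of the cost model described on this page.
def runtime(compute_cost, compute_capacity):
    # Runtime = compute_cost / node.compute_capacity
    return compute_cost / compute_capacity

def transfer_time(data_size_mb, effective_bandwidth_mbps, latency_s):
    # Transfer time = data_size / effective_bandwidth + latency
    return data_size_mb / effective_bandwidth_mbps + latency_s

# A task with cost 500 on a node with capacity 200:
print(runtime(500, 200))              # -> 2.5 seconds
# 20 MB over a 100 MB/s link with 1 ms latency:
print(transfer_time(20, 100, 0.001))  # -> 0.201 seconds
```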
Tip
If both the source and destination tasks of an edge run on the same node, no network transfer occurs -- the data dependency is resolved locally with zero transfer time.
Example: fork-join DAG
dags:
  - id: dag1
    inject_at: 0.0
    tasks:
      - {id: start, compute_cost: 50}
      - {id: work_a, compute_cost: 500}
      - {id: work_b, compute_cost: 300}
      - {id: work_c, compute_cost: 400}
      - {id: finish, compute_cost: 50}
    edges:
      - {from: start, to: work_a, data_size: 10}
      - {from: start, to: work_b, data_size: 10}
      - {from: start, to: work_c, data_size: 10}
      - {from: work_a, to: finish, data_size: 20}
      - {from: work_b, to: finish, data_size: 20}
      - {from: work_c, to: finish, data_size: 20}
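Under the runtime formula from the key decisions above, each worker's runtime depends on which node it lands on, which is exactly the trade-off a scheduler like HEFT exploits. A quick hand check using the capacities from Step 1 and the costs from this DAG:

```python
# Per-node runtime of each fork-join worker: runtime = cost / capacity.
# Capacities are from the Step 1 mesh; costs from the fork-join DAG.
capacities = {"n0": 100, "n1": 150, "n2": 100, "n3": 200}
costs = {"work_a": 500, "work_b": 300, "work_c": 400}

table = {task: {node: cost / cap for node, cap in capacities.items()}
         for task, cost in costs.items()}

print(table["work_a"]["n3"])  # work_a on the fastest node -> 2.5 s
print(table["work_a"]["n0"])  # work_a on a slow node      -> 5.0 s
```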
Step 3: Configure Settings¶
Choose the scheduler, routing algorithm, interference model, and random seed.
Scheduler options:
| Scheduler | Best for |
|---|---|
| heft | General use. Optimizes for earliest finish time on heterogeneous nodes. |
| cpop | Critical-path-aware. Good when one path through the DAG dominates. |
| round_robin | Baseline comparison. Simple round-robin assignment. |
Routing options:
| Routing | Best for |
|---|---|
| direct | Fully connected topologies (every node has a direct link to every other). |
| widest_path | Multi-hop topologies. Maximizes bottleneck bandwidth along the path. |
| shortest_path | Multi-hop topologies where latency matters more than throughput. |
Interference options:
| Interference | Best for |
|---|---|
| none | Wired networks or baseline comparison. |
| proximity | Simple wireless model. Nearby links share bandwidth as 1/k. |
| csma_clique | 802.11 static model. Uses conflict graph + clique-based fair share. |
| csma_bianchi | 802.11 dynamic model. SINR-aware rate + Bianchi MAC efficiency. |
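To build intuition for the proximity model's 1/k sharing, consider a link whose nominal bandwidth is split evenly among all links active within range. This sketch is a simplification for intuition only, not ncsim's internal implementation, and it assumes k counts the link itself among the active contenders:

```python
# Illustrative 1/k sharing: with k links active within range (including
# this one), each gets nominal / k. Simplified; not ncsim's internals.
def effective_bandwidth(nominal, active_links_in_range):
    k = max(1, active_links_in_range)
    return nominal / k

print(effective_bandwidth(100, 1))  # alone: 100.0 MB/s
print(effective_bandwidth(100, 4))  # four contenders: 25.0 MB/s each
```

Plugging the reduced value into the transfer-time formula from Step 2 (as the effective_bandwidth term) shows why interference can stretch the makespan even when compute is idle.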
Tip
Set seed to a fixed value for reproducible results. Different seeds may produce different scheduler tie-breaking or shadow fading.
Example configuration:
config:
  scheduler: heft
  routing: widest_path
  interference: none
  seed: 42
Step 4: Put It Together¶
Combine the network, DAG, and config sections from the previous steps into a complete scenario file:
scenario:
  name: "Custom 4-Node Mesh with Fork-Join"
network:
  nodes:
    - {id: n0, compute_capacity: 100, position: {x: 0, y: 0}}
    - {id: n1, compute_capacity: 150, position: {x: 20, y: 0}}
    - {id: n2, compute_capacity: 100, position: {x: 0, y: 20}}
    - {id: n3, compute_capacity: 200, position: {x: 20, y: 20}}
  links:
    # Horizontal (fast)
    - {id: l01, from: n0, to: n1, bandwidth: 100, latency: 0.001}
    - {id: l10, from: n1, to: n0, bandwidth: 100, latency: 0.001}
    - {id: l23, from: n2, to: n3, bandwidth: 100, latency: 0.001}
    - {id: l32, from: n3, to: n2, bandwidth: 100, latency: 0.001}
    # Vertical (slower)
    - {id: l02, from: n0, to: n2, bandwidth: 50, latency: 0.002}
    - {id: l20, from: n2, to: n0, bandwidth: 50, latency: 0.002}
    - {id: l13, from: n1, to: n3, bandwidth: 50, latency: 0.002}
    - {id: l31, from: n3, to: n1, bandwidth: 50, latency: 0.002}
dags:
  - id: dag1
    inject_at: 0.0
    tasks:
      - {id: start, compute_cost: 50}
      - {id: work_a, compute_cost: 500}
      - {id: work_b, compute_cost: 300}
      - {id: work_c, compute_cost: 400}
      - {id: finish, compute_cost: 50}
    edges:
      - {from: start, to: work_a, data_size: 10}
      - {from: start, to: work_b, data_size: 10}
      - {from: start, to: work_c, data_size: 10}
      - {from: work_a, to: finish, data_size: 20}
      - {from: work_b, to: finish, data_size: 20}
      - {from: work_c, to: finish, data_size: 20}
config:
  scheduler: heft
  routing: widest_path
  interference: none
  seed: 42
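Before running, it can save time to sanity-check the scenario's cross-references: every link endpoint should name a defined node, every edge should name defined tasks, and the DAG must be acyclic. This is a hypothetical pre-flight checker (ncsim may do its own validation); field names mirror the schema shown on this page:

```python
# Hypothetical pre-flight checks on a parsed scenario dict.
def check_scenario(scenario):
    nodes = {n["id"] for n in scenario["network"]["nodes"]}
    for link in scenario["network"]["links"]:
        assert link["from"] in nodes and link["to"] in nodes, link["id"]
    for dag in scenario["dags"]:
        tasks = {t["id"] for t in dag["tasks"]}
        deps = {t: [] for t in tasks}
        for e in dag["edges"]:
            assert e["from"] in tasks and e["to"] in tasks
            deps[e["to"]].append(e["from"])
        # Kahn-style pass: if no task is ever ready, there is a cycle.
        done, remaining = set(), dict(deps)
        while remaining:
            ready = [t for t, ds in remaining.items()
                     if all(d in done for d in ds)]
            assert ready, "cycle detected in DAG"
            for t in ready:
                done.add(t)
                del remaining[t]
    return True

tiny = {
    "network": {"nodes": [{"id": "n0"}, {"id": "n1"}],
                "links": [{"id": "l01", "from": "n0", "to": "n1"}]},
    "dags": [{"tasks": [{"id": "A"}, {"id": "B"}],
              "edges": [{"from": "A", "to": "B"}]}],
}
print(check_scenario(tiny))  # True
```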
Step 5: Run and Verify¶
Run the scenario¶
ncsim --scenario my_scenario.yaml --output results/baseline
Check the output¶
The output directory will contain:
- trace.jsonl -- event-by-event trace of the simulation
- metrics.json -- summary metrics including makespan, task counts, and transfer counts
- scenario.yaml -- copy of the input scenario for reproducibility
Compare with different settings¶
Use CLI overrides to compare schedulers and routing algorithms without modifying the YAML:
# Compare schedulers
ncsim --scenario my_scenario.yaml --output results/heft --scheduler heft
ncsim --scenario my_scenario.yaml --output results/cpop --scheduler cpop
ncsim --scenario my_scenario.yaml --output results/rr --scheduler round_robin
# Compare routing
ncsim --scenario my_scenario.yaml --output results/direct --routing direct
ncsim --scenario my_scenario.yaml --output results/widest --routing widest_path
# Compare interference models
ncsim --scenario my_scenario.yaml --output results/no_intf --interference none
ncsim --scenario my_scenario.yaml --output results/proximity --interference proximity
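After running several configurations, you can compare them by reading the makespan out of each run's metrics.json (the output section above says metrics.json includes makespan). A small hedged sketch; the `load_makespan` and `best_run` helpers and the example numbers are illustrative, not part of ncsim:

```python
import json
import pathlib

# Illustrative helper: read makespan from one run's metrics.json.
def load_makespan(outdir):
    with open(pathlib.Path(outdir) / "metrics.json") as f:
        return json.load(f)["makespan"]

# Pick the configuration with the smallest makespan.
def best_run(makespans):
    # makespans: dict of label -> makespan in seconds
    return min(makespans, key=makespans.get)

# Example numbers (hypothetical), e.g. gathered via load_makespan():
makespans = {"heft": 9.2, "cpop": 9.8, "round_robin": 12.5}
print(best_run(makespans))  # heft
```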
Tips¶
Start simple, add complexity gradually
Begin with 2-3 nodes and a simple chain DAG. Verify the output matches your hand calculations, then scale up the network and DAG complexity.
Use deterministic seeds
Always set seed in your YAML or via --seed on the CLI. This ensures identical results across runs, making debugging and comparison straightforward.
Isolate interference effects
Run the same scenario with interference: none and interference: proximity (or a WiFi model) to measure exactly how much interference impacts makespan.
Use pinned_to for controlled experiments
When investigating network behavior (routing, interference, bandwidth contention), pin tasks to specific nodes to remove scheduler variability from the equation. Once the network layer behaves as expected, remove the pins and let the scheduler optimize.
Bidirectional links require two entries
A link from n0 to n1 does not automatically create a link from n1 to n0. If your DAG requires data flow in both directions (or the scheduler needs to consider reverse paths), define both explicitly.
Common Patterns¶
Topology Patterns¶
Chain (line)¶
A simple linear topology where each node connects to its neighbor:
nodes:
  - {id: n0, compute_capacity: 100, position: {x: 0, y: 0}}
  - {id: n1, compute_capacity: 100, position: {x: 10, y: 0}}
  - {id: n2, compute_capacity: 100, position: {x: 20, y: 0}}
links:
  - {id: l01, from: n0, to: n1, bandwidth: 100, latency: 0.001}
  - {id: l12, from: n1, to: n2, bandwidth: 100, latency: 0.001}
Star¶
A central hub connected to all other nodes:
nodes:
  - {id: hub, compute_capacity: 200, position: {x: 10, y: 10}}
  - {id: edge0, compute_capacity: 50, position: {x: 0, y: 10}}
  - {id: edge1, compute_capacity: 50, position: {x: 20, y: 10}}
  - {id: edge2, compute_capacity: 50, position: {x: 10, y: 0}}
  - {id: edge3, compute_capacity: 50, position: {x: 10, y: 20}}
links:
  - {id: l_h0, from: hub, to: edge0, bandwidth: 100, latency: 0.001}
  - {id: l_0h, from: edge0, to: hub, bandwidth: 100, latency: 0.001}
  - {id: l_h1, from: hub, to: edge1, bandwidth: 100, latency: 0.001}
  - {id: l_1h, from: edge1, to: hub, bandwidth: 100, latency: 0.001}
  - {id: l_h2, from: hub, to: edge2, bandwidth: 100, latency: 0.001}
  - {id: l_2h, from: edge2, to: hub, bandwidth: 100, latency: 0.001}
  - {id: l_h3, from: hub, to: edge3, bandwidth: 100, latency: 0.001}
  - {id: l_3h, from: edge3, to: hub, bandwidth: 100, latency: 0.001}
Diamond¶
Two paths between source and destination, useful for routing comparisons:
nodes:
  - {id: src, compute_capacity: 100, position: {x: 0, y: 5}}
  - {id: top, compute_capacity: 100, position: {x: 5, y: 0}}
  - {id: bottom, compute_capacity: 100, position: {x: 5, y: 10}}
  - {id: dst, compute_capacity: 100, position: {x: 10, y: 5}}
links:
  - {id: l_st, from: src, to: top, bandwidth: 50, latency: 0.001}
  - {id: l_td, from: top, to: dst, bandwidth: 50, latency: 0.001}
  - {id: l_sb, from: src, to: bottom, bandwidth: 200, latency: 0.01}
  - {id: l_bd, from: bottom, to: dst, bandwidth: 200, latency: 0.01}
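This topology makes the routing trade-off concrete: the top path is low-latency but narrow, the bottom path is wide but slow to traverse. Applying the transfer-time formula from Step 2 per hop (a simplified hand calculation; ncsim's routing may account for more than this):

```python
# Transfer time along a multi-hop path, using the page's formula:
# the bottleneck bandwidth limits the whole path, latencies add up.
def path_time(data_mb, hops):
    # hops: list of (bandwidth MB/s, latency s) pairs
    bottleneck = min(bw for bw, _ in hops)
    return data_mb / bottleneck + sum(lat for _, lat in hops)

top = path_time(20, [(50, 0.001), (50, 0.001)])     # narrow, low-latency
bottom = path_time(20, [(200, 0.01), (200, 0.01)])  # wide, high-latency
print(top, bottom)  # 0.402 vs 0.12: the wide path wins for 20 MB
```

For a tiny transfer (say 0.1 MB) the comparison flips, since the fixed latencies dominate; that is the regime where shortest_path beats widest_path.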
Ring¶
Every node connects to the next, forming a loop:
nodes:
  - {id: n0, compute_capacity: 100, position: {x: 10, y: 0}}
  - {id: n1, compute_capacity: 100, position: {x: 20, y: 10}}
  - {id: n2, compute_capacity: 100, position: {x: 10, y: 20}}
  - {id: n3, compute_capacity: 100, position: {x: 0, y: 10}}
links:
  - {id: l01, from: n0, to: n1, bandwidth: 100, latency: 0.001}
  - {id: l12, from: n1, to: n2, bandwidth: 100, latency: 0.001}
  - {id: l23, from: n2, to: n3, bandwidth: 100, latency: 0.001}
  - {id: l30, from: n3, to: n0, bandwidth: 100, latency: 0.001}
DAG Patterns¶
Chain¶
Sequential tasks, each depending on the previous:
tasks:
  - {id: A, compute_cost: 100}
  - {id: B, compute_cost: 200}
  - {id: C, compute_cost: 100}
edges:
  - {from: A, to: B, data_size: 10}
  - {from: B, to: C, data_size: 10}
Fork-join (fan-out / fan-in)¶
One task distributes work to many, then collects results:
tasks:
  - {id: scatter, compute_cost: 50}
  - {id: w0, compute_cost: 500}
  - {id: w1, compute_cost: 500}
  - {id: w2, compute_cost: 500}
  - {id: gather, compute_cost: 50}
edges:
  - {from: scatter, to: w0, data_size: 10}
  - {from: scatter, to: w1, data_size: 10}
  - {from: scatter, to: w2, data_size: 10}
  - {from: w0, to: gather, data_size: 20}
  - {from: w1, to: gather, data_size: 20}
  - {from: w2, to: gather, data_size: 20}
Diamond¶
Two parallel paths converge at a single sink:
tasks:
  - {id: root, compute_cost: 50}
  - {id: left, compute_cost: 300}
  - {id: right, compute_cost: 200}
  - {id: sink, compute_cost: 50}
edges:
  - {from: root, to: left, data_size: 10}
  - {from: root, to: right, data_size: 10}
  - {from: left, to: sink, data_size: 15}
  - {from: right, to: sink, data_size: 15}
Independent parallel tasks¶
Multiple entry-point tasks with no dependencies (useful for throughput testing):
tasks:
  - {id: T0, compute_cost: 1000}
  - {id: T1, compute_cost: 1000}
  - {id: T2, compute_cost: 1000}
  - {id: T3, compute_cost: 1000}
edges: []
Multi-stage pipeline¶
A deeper DAG with sequential stages:
tasks:
  - {id: ingest, compute_cost: 100}
  - {id: parse, compute_cost: 200}
  - {id: transform, compute_cost: 500}
  - {id: validate, compute_cost: 100}
  - {id: store, compute_cost: 50}
edges:
  - {from: ingest, to: parse, data_size: 50}
  - {from: parse, to: transform, data_size: 30}
  - {from: transform, to: validate, data_size: 20}
  - {from: validate, to: store, data_size: 10}
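A chain like this has an easy hand-checkable lower bound: if all five tasks run on one node, every edge resolves locally in zero transfer time (see the tip in Step 2) and the makespan is just the sum of the runtimes. A quick back-of-envelope check, assuming a single node of capacity 100:

```python
# Makespan of the pipeline on one node of capacity 100: tasks run
# sequentially, and same-node edges transfer in zero time.
costs = [100, 200, 500, 100, 50]  # ingest, parse, transform, validate, store
capacity = 100
makespan = sum(cost / capacity for cost in costs)
print(makespan)  # -> 9.5 seconds
```

Comparing this figure against a simulated multi-node run shows whether spreading the pipeline across nodes actually pays for its transfer costs.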