Tutorial 2: Build a Custom Scenario

This tutorial walks you through creating a complete ncsim scenario from scratch -- a 4-node mesh network with a fork-join DAG -- and running it with different schedulers, routing algorithms, and interference models.


What You Will Learn

  • Design a network topology with heterogeneous compute capacities
  • Create a task dependency graph (DAG) with fork-join parallelism
  • Write a complete scenario YAML file
  • Run and analyze your custom scenario
  • Experiment with different schedulers, routing, and interference settings

Prerequisites

  • ncsim installed (Tutorial 1)
  • Familiarity with YAML syntax

The Scenario

We will build a 4-node mesh network arranged in a square. Each node has a different compute capacity, and the links between them have varying bandwidths. A fork-join DAG of 6 tasks will be scheduled across the network:

         T_src
       /  |  |  \
     W0  W1  W2  W3
       \  |  |  /
        T_sink

One source task fans out to four parallel workers, which all fan back into a single sink task.


Step 1: Define the Network

Our 4 nodes form a square with 20-meter sides. Each node has a different compute capacity to make scheduling decisions interesting:

nodes:
  - id: n0
    compute_capacity: 200
    position: {x: 0, y: 0}
  - id: n1
    compute_capacity: 100
    position: {x: 20, y: 0}
  - id: n2
    compute_capacity: 150
    position: {x: 0, y: 20}
  - id: n3
    compute_capacity: 80
    position: {x: 20, y: 20}

Compute Capacity

compute_capacity is in compute units per second (cu/s). A task with compute_cost: 500 running on a node with compute_capacity: 200 takes 500 / 200 = 2.5 seconds to complete.

The heterogeneous capacities create a tradeoff: n0 is the fastest node (200 cu/s), but the scheduler must balance load across all nodes when there are more tasks than n0 can handle sequentially.
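
As a quick sanity check on these numbers, the snippet below (plain Python, independent of ncsim) applies the formula above to each worker task on each node:

# Back-of-envelope runtimes: task runtime = compute_cost / compute_capacity.
# Uses the numbers from this scenario; a standalone check, not an ncsim API.
capacities = {"n0": 200, "n1": 100, "n2": 150, "n3": 80}   # cu/s
workers = {"W0": 500, "W1": 600, "W2": 400, "W3": 700}     # cu

for task, cost in workers.items():
    times = {node: round(cost / cap, 2) for node, cap in capacities.items()}
    print(task, times)
# W3 needs 3.5 s on n0 but 8.75 s on n3, which is why placement matters.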


Step 2: Define the Links

A full mesh of 4 nodes requires 6 bidirectional connections, which means 12 directional links. We give each direction its own link, and we vary bandwidths across node pairs so that some routes are faster than others:

links:
  # n0 <-> n1 (horizontal top)
  - {id: l01, from: n0, to: n1, bandwidth: 500, latency: 0.001}
  - {id: l10, from: n1, to: n0, bandwidth: 500, latency: 0.001}
  # n0 <-> n2 (vertical left)
  - {id: l02, from: n0, to: n2, bandwidth: 400, latency: 0.001}
  - {id: l20, from: n2, to: n0, bandwidth: 400, latency: 0.001}
  # n0 <-> n3 (diagonal)
  - {id: l03, from: n0, to: n3, bandwidth: 300, latency: 0.002}
  - {id: l30, from: n3, to: n0, bandwidth: 300, latency: 0.002}
  # n1 <-> n2 (diagonal)
  - {id: l12, from: n1, to: n2, bandwidth: 300, latency: 0.002}
  - {id: l21, from: n2, to: n1, bandwidth: 300, latency: 0.002}
  # n1 <-> n3 (vertical right)
  - {id: l13, from: n1, to: n3, bandwidth: 400, latency: 0.001}
  - {id: l31, from: n3, to: n1, bandwidth: 400, latency: 0.001}
  # n2 <-> n3 (horizontal bottom)
  - {id: l23, from: n2, to: n3, bandwidth: 500, latency: 0.001}
  - {id: l32, from: n3, to: n2, bandwidth: 500, latency: 0.001}

Bandwidth Units

Bandwidth is in MB/s (megabytes per second). Latency is in seconds. Transfer time = data_size / bandwidth + latency.
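
For example, one 20 MB fan-out transfer over the 500 MB/s top link takes 20 / 500 + 0.001 = 0.041 s. Here is a tiny helper (plain Python, not part of ncsim) for experimenting with the formula:

# Transfer time = data_size / bandwidth + latency (standalone check, not ncsim code).
def transfer_time(data_size_mb, bandwidth_mb_s, latency_s):
    return data_size_mb / bandwidth_mb_s + latency_s

print(transfer_time(20, 500, 0.001))   # fan-out edge over l01: 0.041 s
print(transfer_time(10, 300, 0.002))   # fan-in edge over a diagonal: ~0.0353 s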

The topology looks like this:

    n0 ---500--- n1
    |  \       / |
   400  300 300  400
    |  /       \ |
    n2 ---500--- n3

Side links (400--500 MB/s) are faster than the diagonal links (300 MB/s), and the diagonals also have higher latency, reflecting their longer physical distance.


Step 3: Design the DAG

The fork-join pattern is common in data-parallel workloads: a source task produces data, four workers process it in parallel, and a sink task aggregates the results.

tasks:
  - {id: T_src, compute_cost: 100}
  - {id: W0, compute_cost: 500}
  - {id: W1, compute_cost: 600}
  - {id: W2, compute_cost: 400}
  - {id: W3, compute_cost: 700}
  - {id: T_sink, compute_cost: 100}

The workers have different compute costs (400--700 cu), simulating uneven workloads. This makes scheduling non-trivial -- a good scheduler should assign heavier tasks to faster nodes.

edges:
  # Fan-out: T_src sends 20 MB to each worker
  - {from: T_src, to: W0, data_size: 20}
  - {from: T_src, to: W1, data_size: 20}
  - {from: T_src, to: W2, data_size: 20}
  - {from: T_src, to: W3, data_size: 20}
  # Fan-in: each worker sends 10 MB to T_sink
  - {from: W0, to: T_sink, data_size: 10}
  - {from: W1, to: T_sink, data_size: 10}
  - {from: W2, to: T_sink, data_size: 10}
  - {from: W3, to: T_sink, data_size: 10}

Data Size

data_size is in MB (megabytes). When two tasks are on different nodes, this amount of data must be transferred over the network. When both tasks are on the same node, the transfer is local (instant, no network cost).
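
Before running anything, it is worth estimating a rough lower bound on the makespan. The sketch below (plain Python using only the numbers and formulas on this page, not ncsim itself) ignores transfer times, so no schedule can finish faster than the larger of the two bounds it prints:

# Two quick makespan lower bounds for the fork-join DAG (standalone estimate, not ncsim code).
capacities = {"n0": 200, "n1": 100, "n2": 150, "n3": 80}                            # cu/s
costs = {"T_src": 100, "W0": 500, "W1": 600, "W2": 400, "W3": 700, "T_sink": 100}   # cu

fastest = max(capacities.values())                                  # 200 cu/s
heaviest_worker = max(costs[w] for w in ("W0", "W1", "W2", "W3"))   # 700 cu
# The chain T_src -> heaviest worker -> T_sink cannot be parallelized.
critical_path = (costs["T_src"] + heaviest_worker + costs["T_sink"]) / fastest
# Total work divided by total capacity is another bound (perfect load balancing).
work_bound = sum(costs.values()) / sum(capacities.values())

print(f"critical-path bound:     {critical_path:.2f} s")   # 4.50 s
print(f"work-conservation bound: {work_bound:.2f} s")      # about 4.53 s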


Step 4: Write the Complete YAML

Create the file scenarios/my_custom.yaml with the complete scenario:

# Custom fork-join scenario: 4-node mesh with heterogeneous compute
# T_src -> {W0, W1, W2, W3} -> T_sink

scenario:
  name: "Custom Fork-Join Mesh"

  network:
    nodes:
      - id: n0
        compute_capacity: 200
        position: {x: 0, y: 0}
      - id: n1
        compute_capacity: 100
        position: {x: 20, y: 0}
      - id: n2
        compute_capacity: 150
        position: {x: 0, y: 20}
      - id: n3
        compute_capacity: 80
        position: {x: 20, y: 20}
    links:
      # n0 <-> n1 (horizontal top)
      - {id: l01, from: n0, to: n1, bandwidth: 500, latency: 0.001}
      - {id: l10, from: n1, to: n0, bandwidth: 500, latency: 0.001}
      # n0 <-> n2 (vertical left)
      - {id: l02, from: n0, to: n2, bandwidth: 400, latency: 0.001}
      - {id: l20, from: n2, to: n0, bandwidth: 400, latency: 0.001}
      # n0 <-> n3 (diagonal)
      - {id: l03, from: n0, to: n3, bandwidth: 300, latency: 0.002}
      - {id: l30, from: n3, to: n0, bandwidth: 300, latency: 0.002}
      # n1 <-> n2 (diagonal)
      - {id: l12, from: n1, to: n2, bandwidth: 300, latency: 0.002}
      - {id: l21, from: n2, to: n1, bandwidth: 300, latency: 0.002}
      # n1 <-> n3 (vertical right)
      - {id: l13, from: n1, to: n3, bandwidth: 400, latency: 0.001}
      - {id: l31, from: n3, to: n1, bandwidth: 400, latency: 0.001}
      # n2 <-> n3 (horizontal bottom)
      - {id: l23, from: n2, to: n3, bandwidth: 500, latency: 0.001}
      - {id: l32, from: n3, to: n2, bandwidth: 500, latency: 0.001}

  dags:
    - id: dag_1
      inject_at: 0.0
      tasks:
        - {id: T_src, compute_cost: 100}
        - {id: W0, compute_cost: 500}
        - {id: W1, compute_cost: 600}
        - {id: W2, compute_cost: 400}
        - {id: W3, compute_cost: 700}
        - {id: T_sink, compute_cost: 100}
      edges:
        - {from: T_src, to: W0, data_size: 20}
        - {from: T_src, to: W1, data_size: 20}
        - {from: T_src, to: W2, data_size: 20}
        - {from: T_src, to: W3, data_size: 20}
        - {from: W0, to: T_sink, data_size: 10}
        - {from: W1, to: T_sink, data_size: 10}
        - {from: W2, to: T_sink, data_size: 10}
        - {from: W3, to: T_sink, data_size: 10}

  config:
    scheduler: heft
    seed: 42

Save the file, then run it once to confirm it loads and simulates without errors:

ncsim --scenario scenarios/my_custom.yaml --output results/tutorial2/heft
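
If you only want a quick syntax check before a full run, you can load the file with PyYAML (assuming it is installed in your environment; this is not an ncsim feature):

# Syntax check with PyYAML (assumes PyYAML is available; not an ncsim command).
import yaml

with open("scenarios/my_custom.yaml") as f:
    data = yaml.safe_load(f)

print("Parsed OK. Top-level keys:", list(data["scenario"].keys()))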

Step 5: Run with Different Schedulers

Run the scenario with each of the three available schedulers:

ncsim --scenario scenarios/my_custom.yaml \
      --output results/tutorial2/heft \
      --scheduler heft

ncsim --scenario scenarios/my_custom.yaml \
      --output results/tutorial2/cpop \
      --scheduler cpop

ncsim --scenario scenarios/my_custom.yaml \
      --output results/tutorial2/rr \
      --scheduler round_robin

Compare the makespans:

  • heft: assigns each task to the node that gives the earliest finish time. Tends to place heavy tasks on fast nodes and balances compute cost against transfer cost.
  • cpop: identifies the critical path and assigns critical-path tasks to the fastest node. Prioritizes the critical path; may leave non-critical tasks on slower nodes.
  • round_robin: assigns tasks to nodes in rotation (n0, n1, n2, n3, n0, ...). Ignores compute capacity and transfer cost; useful as a baseline.
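
To make the heft entry above concrete, here is a simplified sketch of the earliest-finish-time rule. It ignores transfer costs and queuing, so it illustrates the idea rather than reproducing ncsim's scheduler:

# Simplified earliest-finish-time (EFT) selection for one task.
# Ignores data-transfer time and per-node queues; an illustration only, not ncsim code.
def pick_node(compute_cost, capacities, ready_time):
    best_node, best_finish = None, float("inf")
    for node, cap in capacities.items():
        finish = ready_time[node] + compute_cost / cap
        if finish < best_finish:
            best_node, best_finish = node, finish
    return best_node, best_finish

capacities = {"n0": 200, "n1": 100, "n2": 150, "n3": 80}
ready = {node: 0.0 for node in capacities}
print(pick_node(700, capacities, ready))   # the heaviest worker W3 -> ('n0', 3.5)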

Examining the placement

Run with --verbose (or -v) to see which tasks are assigned to which nodes:

ncsim --scenario scenarios/my_custom.yaml \
      --output results/tutorial2/heft_verbose \
      --scheduler heft -v

Look for the "SAGA HEFT assignments:" line in the log output.

Use the Gantt chart to visualize how each scheduler distributes work:

python analyze_trace.py results/tutorial2/heft/trace.jsonl --gantt
python analyze_trace.py results/tutorial2/rr/trace.jsonl --gantt

Step 6: Try Different Routing

By default, ncsim uses direct routing -- data can only travel over a single explicit link between two nodes. With a full mesh, every node pair has a direct link, so all transfers are single-hop.

Try widest-path routing, which finds multi-hop paths that maximize bottleneck bandwidth:

ncsim --scenario scenarios/my_custom.yaml \
      --output results/tutorial2/widest \
      --routing widest_path

ncsim --scenario scenarios/my_custom.yaml \
      --output results/tutorial2/direct \
      --routing direct

When does routing matter?

In a full mesh, widest-path routing has the same result as direct routing because every node pair already has a direct link. Routing makes a bigger difference in linear or tree topologies where some node pairs lack direct connections. See the parallel_spread.yaml scenario for an example.
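
If you are curious what widest-path routing optimizes, the sketch below runs a standard maximum-bottleneck path search over this mesh's bandwidths. It illustrates the general algorithm, not ncsim's internal routing code:

# Widest-path (maximum-bottleneck) search over the mesh bandwidths from Step 2.
# A generic modified-Dijkstra sketch; not ncsim's routing implementation.
import heapq

bandwidth = {
    "n0": {"n1": 500, "n2": 400, "n3": 300},
    "n1": {"n0": 500, "n2": 300, "n3": 400},
    "n2": {"n0": 400, "n1": 300, "n3": 500},
    "n3": {"n0": 300, "n1": 400, "n2": 500},
}

def widest_path(src, dst):
    best = {src: float("inf")}
    heap = [(-best[src], src, [src])]          # max-heap on bottleneck bandwidth
    while heap:
        width, node, path = heapq.heappop(heap)
        width = -width
        if node == dst:
            return width, path
        for nxt, bw in bandwidth[node].items():
            w = min(width, bw)
            if w > best.get(nxt, 0):
                best[nxt] = w
                heapq.heappush(heap, (-w, nxt, path + [nxt]))
    return 0, []

print(widest_path("n0", "n1"))   # (500, ['n0', 'n1']) -- the direct link is already the widest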

To see routing effects, try the parallel spread scenario with both routing modes:

ncsim --scenario scenarios/parallel_spread.yaml \
      --output results/tutorial2/spread_direct \
      --routing direct

ncsim --scenario scenarios/parallel_spread.yaml \
      --output results/tutorial2/spread_widest \
      --routing widest_path

In the linear topology, widest-path routing enables HEFT to spread tasks across all 5 nodes (reaching n0 and n4 via multi-hop), while direct routing limits placement to 3 adjacent nodes.


Step 7: Add Interference

Interference models simulate the effect of shared wireless spectrum. When nearby links are active simultaneously, they reduce each other's effective bandwidth.

Proximity Interference

The simplest model: links whose midpoints are within a given radius share bandwidth equally.

# Default radius (15m)
ncsim --scenario scenarios/my_custom.yaml \
      --output results/tutorial2/prox_15 \
      --interference proximity

# Smaller radius (10m) -- less interference
ncsim --scenario scenarios/my_custom.yaml \
      --output results/tutorial2/prox_10 \
      --interference proximity --interference-radius 10

# Larger radius (30m) -- more interference
ncsim --scenario scenarios/my_custom.yaml \
      --output results/tutorial2/prox_30 \
      --interference proximity --interference-radius 30

# No interference
ncsim --scenario scenarios/my_custom.yaml \
      --output results/tutorial2/no_interf \
      --interference none

How proximity interference works

With radius R, if k active links have midpoints within R meters of each other, each link's effective bandwidth is scaled by 1/k (the k links share the spectrum equally). This is a simple model -- for a physically accurate WiFi model, see Tutorial 3.
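
You can reproduce the sharing factor by hand from the node positions in Step 1. The sketch below (plain Python mirroring the rule described above, not ncsim code) counts, for each active link, how many active links have midpoints within the radius:

# Proximity interference by hand: with radius R, k mutually close active links
# each get 1/k of their bandwidth. Mirrors the rule above; not ncsim code.
import math

positions = {"n0": (0, 0), "n1": (20, 0), "n2": (0, 20), "n3": (20, 20)}

def midpoint(a, b):
    (xa, ya), (xb, yb) = positions[a], positions[b]
    return ((xa + xb) / 2, (ya + yb) / 2)

def sharing_factors(active_links, radius):
    mids = {link: midpoint(*link) for link in active_links}
    # For each link, count active links (itself included) with midpoints within the radius.
    return {link: sum(1 for m2 in mids.values() if math.dist(m, m2) <= radius)
            for link, m in mids.items()}

# Example: suppose T_src runs on n0 and streams to workers on n1, n2, and n3 at once.
active = [("n0", "n1"), ("n0", "n2"), ("n0", "n3")]
for radius in (10, 15, 30):
    print(radius, sharing_factors(active, radius))   # each link gets 1/k of its bandwidth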

Compare the makespans at different radii:

  Interference   Radius   Effect
  none           --       All links get full bandwidth at all times
  proximity      10m      Only very close links interfere (side links, not diagonals)
  proximity      15m      Default; most adjacent links interfere
  proximity      30m      All links in the mesh interfere with each other

Larger radius means more contention, which increases transfer times and may change the optimal scheduler placement.


Summary

In this tutorial you built a complete ncsim scenario from scratch:

  1. Network: 4 nodes in a square with heterogeneous compute capacities (80--200 cu/s)
  2. Links: Full mesh with 12 directional links and varying bandwidths (300--500 MB/s)
  3. DAG: Fork-join pattern with 6 tasks and uneven worker compute costs (400--700 cu)
  4. Experiments: three schedulers (HEFT, CPOP, round-robin), two routing modes (direct, widest-path), and multiple interference settings

Key Takeaways

  Concept               Lesson
  Heterogeneous nodes   Create non-trivial scheduling decisions
  Fork-join DAGs        Expose parallelism that smart schedulers exploit
  Full mesh topology    Gives direct routing access to all node pairs
  Interference radius   Controls how aggressively links share bandwidth

Complete Configuration Reference

For the full YAML schema and all available options, see the YAML Reference.

What's Next

  • Tutorial 3: WiFi Experiment -- use the physically grounded CSMA/CA Bianchi WiFi model
  • Tutorial 4: Compare Schedulers -- systematic scheduler comparison across multiple scenarios