Tutorial 2: Build a Custom Scenario¶
This tutorial walks you through creating a complete ncsim scenario from scratch -- a 4-node mesh network with a fork-join DAG -- and running it with different schedulers, routing algorithms, and interference models.
What You Will Learn¶
- Design a network topology with heterogeneous compute capacities
- Create a task dependency graph (DAG) with fork-join parallelism
- Write a complete scenario YAML file
- Run and analyze your custom scenario
- Experiment with different schedulers, routing, and interference settings
Prerequisites¶
- ncsim installed (Tutorial 1)
- Familiarity with YAML syntax
The Scenario¶
We will build a 4-node mesh network arranged in a square. Each node has a different compute capacity, and the links between them have varying bandwidths. A fork-join DAG of 6 tasks will be scheduled across the network:
One source task fans out to four parallel workers, which all fan back into a single sink task.
Step 1: Define the Network¶
Our 4 nodes form a square with 20-meter sides. Each node has a different compute capacity to make scheduling decisions interesting:
nodes:
- id: n0
compute_capacity: 200
position: {x: 0, y: 0}
- id: n1
compute_capacity: 100
position: {x: 20, y: 0}
- id: n2
compute_capacity: 150
position: {x: 0, y: 20}
- id: n3
compute_capacity: 80
position: {x: 20, y: 20}
Compute Capacity
compute_capacity is in compute units per second (cu/s). A task with
compute_cost: 500 running on a node with compute_capacity: 200 takes
500 / 200 = 2.5 seconds to complete.
The heterogeneous capacities create a tradeoff: n0 is the fastest node (200 cu/s), but the scheduler must balance load across all nodes when there are more tasks than n0 can handle sequentially.
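To make the capacity tradeoff concrete, here is a back-of-envelope sketch in plain Python (not ncsim code) of the compute-time formula from the note above, applied to the nodes we just defined:

```python
# Compute time = compute_cost / compute_capacity (cu / (cu/s) = seconds).
# Capacities taken from the node definitions above.
capacities = {"n0": 200, "n1": 100, "n2": 150, "n3": 80}

def compute_time(compute_cost, node):
    """Seconds to run a task of `compute_cost` cu on `node`."""
    return compute_cost / capacities[node]

# The admonition's example: a 500 cu task on n0.
print(compute_time(500, "n0"))  # 2.5 s
# The same task on the slowest node takes 2.5x longer:
print(compute_time(500, "n3"))  # 6.25 s
```

The 2.5x spread between n0 and n3 is exactly what makes placement decisions matter in later steps.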
Step 2: Define the Links¶
A full mesh of 4 nodes requires 6 bidirectional connections, which means 12 directional links. We give each direction its own link entry, which allows bandwidths to be set asymmetrically; in this scenario both directions of each pair use the same values, but bandwidth varies across the mesh:
links:
# n0 <-> n1 (horizontal top)
- {id: l01, from: n0, to: n1, bandwidth: 500, latency: 0.001}
- {id: l10, from: n1, to: n0, bandwidth: 500, latency: 0.001}
# n0 <-> n2 (vertical left)
- {id: l02, from: n0, to: n2, bandwidth: 400, latency: 0.001}
- {id: l20, from: n2, to: n0, bandwidth: 400, latency: 0.001}
# n0 <-> n3 (diagonal)
- {id: l03, from: n0, to: n3, bandwidth: 300, latency: 0.002}
- {id: l30, from: n3, to: n0, bandwidth: 300, latency: 0.002}
# n1 <-> n2 (diagonal)
- {id: l12, from: n1, to: n2, bandwidth: 300, latency: 0.002}
- {id: l21, from: n2, to: n1, bandwidth: 300, latency: 0.002}
# n1 <-> n3 (vertical right)
- {id: l13, from: n1, to: n3, bandwidth: 400, latency: 0.001}
- {id: l31, from: n3, to: n1, bandwidth: 400, latency: 0.001}
# n2 <-> n3 (horizontal bottom)
- {id: l23, from: n2, to: n3, bandwidth: 500, latency: 0.001}
- {id: l32, from: n3, to: n2, bandwidth: 500, latency: 0.001}
Bandwidth Units
Bandwidth is in MB/s (megabytes per second). Latency is in seconds.
Transfer time = data_size / bandwidth + latency.
The topology looks like this (bandwidths in MB/s; the 300 MB/s diagonals are omitted from the sketch):

    n0 (200 cu/s) --500-- n1 (100 cu/s)
         |                     |
        400                   400
         |                     |
    n2 (150 cu/s) --500-- n3 (80 cu/s)

Horizontal links (500 MB/s) and vertical links (400 MB/s) are faster than the diagonals (300 MB/s, 2 ms latency), reflecting the longer physical distance of the diagonals.
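The transfer-time formula from the note above is easy to check by hand. A small Python sketch (illustrative only, not ncsim code), using the link values just defined:

```python
# Transfer time = data_size / bandwidth + latency (MB / (MB/s) + s).
def transfer_time(data_size_mb, bandwidth_mb_s, latency_s):
    """Seconds to move `data_size_mb` over a link."""
    return data_size_mb / bandwidth_mb_s + latency_s

# 20 MB over a horizontal link (500 MB/s, 1 ms) vs. a diagonal (300 MB/s, 2 ms):
print(transfer_time(20, 500, 0.001))  # 0.041 s
print(transfer_time(20, 300, 0.002))  # ~0.069 s
```

The diagonal takes roughly 70% longer for the same payload, which is why schedulers that account for transfer cost prefer to keep communicating tasks on directly and widely connected nodes.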
Step 3: Design the DAG¶
The fork-join pattern is common in data-parallel workloads: a source task produces data, four workers process it in parallel, and a sink task aggregates the results.
tasks:
- {id: T_src, compute_cost: 100}
- {id: W0, compute_cost: 500}
- {id: W1, compute_cost: 600}
- {id: W2, compute_cost: 400}
- {id: W3, compute_cost: 700}
- {id: T_sink, compute_cost: 100}
The workers have different compute costs (400--700 cu), simulating uneven workloads. This makes scheduling non-trivial -- a good scheduler should assign heavier tasks to faster nodes.
edges:
# Fan-out: T_src sends 20 MB to each worker
- {from: T_src, to: W0, data_size: 20}
- {from: T_src, to: W1, data_size: 20}
- {from: T_src, to: W2, data_size: 20}
- {from: T_src, to: W3, data_size: 20}
# Fan-in: each worker sends 10 MB to T_sink
- {from: W0, to: T_sink, data_size: 10}
- {from: W1, to: T_sink, data_size: 10}
- {from: W2, to: T_sink, data_size: 10}
- {from: W3, to: T_sink, data_size: 10}
Data Size
data_size is in MB (megabytes). When two tasks are on different nodes,
this amount of data must be transferred over the network. When both tasks
are on the same node, the transfer is local (instant, no network cost).
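Before running the simulator, it is worth estimating whether spreading the workers can beat running everything on n0. The sketch below is a rough hand calculation under stated assumptions (one hypothetical placement, and the slowest mesh link, 300 MB/s with 2 ms latency, as a pessimistic bound for every remote transfer); it is not how ncsim or any of its schedulers computes makespan:

```python
# Back-of-envelope: does spreading the workers beat running everything on n0?
# Units as above: cu, cu/s, MB, MB/s, seconds.
capacities = {"n0": 200, "n1": 100, "n2": 150, "n3": 80}
workers = {"W0": 500, "W1": 600, "W2": 400, "W3": 700}

# Baseline: all six tasks sequentially on n0, no network cost.
all_on_n0 = (100 + sum(workers.values()) + 100) / capacities["n0"]

# One hypothetical placement: heaviest workers on fastest nodes,
# T_src and T_sink on n0 alongside W3.
placement = {"W3": "n0", "W1": "n2", "W0": "n1", "W2": "n3"}
src = 100 / capacities["n0"]
sink = 100 / capacities["n0"]

def path_time(w):
    # Fan-out transfer + compute + fan-in transfer; transfers are free
    # when the worker shares n0 with T_src and T_sink.
    xfer_out = 0 if placement[w] == "n0" else 20 / 300 + 0.002
    xfer_in = 0 if placement[w] == "n0" else 10 / 300 + 0.002
    return xfer_out + workers[w] / capacities[placement[w]] + xfer_in

spread = src + max(path_time(w) for w in workers) + sink
print(all_on_n0)  # 12.0 s sequential on n0
print(spread)     # ~6.1 s with parallel workers
```

Even with pessimistic transfer costs, parallelizing roughly halves the makespan here, so a good scheduler should never serialize this DAG on one node.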
Step 4: Write the Complete YAML¶
Create the file scenarios/my_custom.yaml with the complete scenario:
# Custom fork-join scenario: 4-node mesh with heterogeneous compute
# T_src -> {W0, W1, W2, W3} -> T_sink
scenario:
name: "Custom Fork-Join Mesh"
network:
nodes:
- id: n0
compute_capacity: 200
position: {x: 0, y: 0}
- id: n1
compute_capacity: 100
position: {x: 20, y: 0}
- id: n2
compute_capacity: 150
position: {x: 0, y: 20}
- id: n3
compute_capacity: 80
position: {x: 20, y: 20}
links:
# n0 <-> n1 (horizontal top)
- {id: l01, from: n0, to: n1, bandwidth: 500, latency: 0.001}
- {id: l10, from: n1, to: n0, bandwidth: 500, latency: 0.001}
# n0 <-> n2 (vertical left)
- {id: l02, from: n0, to: n2, bandwidth: 400, latency: 0.001}
- {id: l20, from: n2, to: n0, bandwidth: 400, latency: 0.001}
# n0 <-> n3 (diagonal)
- {id: l03, from: n0, to: n3, bandwidth: 300, latency: 0.002}
- {id: l30, from: n3, to: n0, bandwidth: 300, latency: 0.002}
# n1 <-> n2 (diagonal)
- {id: l12, from: n1, to: n2, bandwidth: 300, latency: 0.002}
- {id: l21, from: n2, to: n1, bandwidth: 300, latency: 0.002}
# n1 <-> n3 (vertical right)
- {id: l13, from: n1, to: n3, bandwidth: 400, latency: 0.001}
- {id: l31, from: n3, to: n1, bandwidth: 400, latency: 0.001}
# n2 <-> n3 (horizontal bottom)
- {id: l23, from: n2, to: n3, bandwidth: 500, latency: 0.001}
- {id: l32, from: n3, to: n2, bandwidth: 500, latency: 0.001}
dags:
- id: dag_1
inject_at: 0.0
tasks:
- {id: T_src, compute_cost: 100}
- {id: W0, compute_cost: 500}
- {id: W1, compute_cost: 600}
- {id: W2, compute_cost: 400}
- {id: W3, compute_cost: 700}
- {id: T_sink, compute_cost: 100}
edges:
- {from: T_src, to: W0, data_size: 20}
- {from: T_src, to: W1, data_size: 20}
- {from: T_src, to: W2, data_size: 20}
- {from: T_src, to: W3, data_size: 20}
- {from: W0, to: T_sink, data_size: 10}
- {from: W1, to: T_sink, data_size: 10}
- {from: W2, to: T_sink, data_size: 10}
- {from: W3, to: T_sink, data_size: 10}
config:
scheduler: heft
seed: 42
Save this file; the next step runs it, which will also surface any parse errors.
Step 5: Run with Different Schedulers¶
Run the scenario with each of the three available schedulers:
ncsim --scenario scenarios/my_custom.yaml \
--output results/tutorial2/heft \
--scheduler heft
ncsim --scenario scenarios/my_custom.yaml \
--output results/tutorial2/cpop \
--scheduler cpop
ncsim --scenario scenarios/my_custom.yaml \
--output results/tutorial2/rr \
--scheduler round_robin
Compare the makespans:
| Scheduler | Strategy | Expected Behavior |
|---|---|---|
| heft | Assigns each task to the node that gives the earliest finish time | Tends to place heavy tasks on fast nodes; balances compute vs. transfer cost |
| cpop | Identifies the critical path and assigns critical tasks to the fastest node | Prioritizes the critical path; may leave non-critical tasks on slower nodes |
| round_robin | Assigns tasks to nodes in rotation (n0, n1, n2, n3, n0, ...) | Ignores compute capacity and transfer cost; useful as a baseline |
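The round-robin policy in the table is simple enough to sketch in a few lines of Python. This is an illustration of the rotation described above, not ncsim's actual scheduler code, and the task ordering is assumed to follow the YAML declaration order:

```python
from itertools import cycle

# Round-robin: assign tasks to nodes in fixed rotation,
# ignoring compute capacity and transfer cost entirely.
def round_robin(tasks, nodes):
    rotation = cycle(nodes)
    return {t: next(rotation) for t in tasks}

tasks = ["T_src", "W0", "W1", "W2", "W3", "T_sink"]
assignment = round_robin(tasks, ["n0", "n1", "n2", "n3"])
print(assignment)
# {'T_src': 'n0', 'W0': 'n1', 'W1': 'n2', 'W2': 'n3', 'W3': 'n0', 'T_sink': 'n1'}
```

Note that the rotation happens to put W1 (600 cu) on n2 and W2 (400 cu) on n3, the two slower nodes, purely by declaration order; HEFT and CPOP would weigh those costs instead.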
Examining the placement
Run with --verbose (or -v) to see which tasks are assigned to which nodes:
ncsim --scenario scenarios/my_custom.yaml \
--output results/tutorial2/heft_verbose \
--scheduler heft -v
Look for the SAGA HEFT assignments: line in the log output.
Use the Gantt chart to visualize how each scheduler distributes work:
python analyze_trace.py results/tutorial2/heft/trace.jsonl --gantt
python analyze_trace.py results/tutorial2/rr/trace.jsonl --gantt
Step 6: Try Different Routing¶
By default, ncsim uses direct routing -- data can only travel over a single explicit link between two nodes. With a full mesh, every node pair has a direct link, so all transfers are single-hop.
Try widest-path routing, which finds multi-hop paths that maximize bottleneck bandwidth:
ncsim --scenario scenarios/my_custom.yaml \
--output results/tutorial2/widest \
--routing widest_path
ncsim --scenario scenarios/my_custom.yaml \
--output results/tutorial2/direct \
--routing direct
When does routing matter?
In a full mesh, widest-path routing has the same result as direct routing
because every node pair already has a direct link. Routing makes a bigger
difference in linear or tree topologies where some node pairs lack
direct connections. See the parallel_spread.yaml scenario for an example.
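To make "maximize bottleneck bandwidth" concrete, here is a minimal widest-path sketch using a Dijkstra variant: instead of minimizing summed cost, it maximizes the minimum link bandwidth along the path. This is an assumption-laden illustration of the general algorithm, not ncsim's implementation:

```python
import heapq

def widest_path(links, src, dst):
    """links: {(u, v): bandwidth}. Returns (bottleneck_bw, path)."""
    adj = {}
    for (u, v), bw in links.items():
        adj.setdefault(u, []).append((v, bw))
    # Max-heap on the bottleneck bandwidth (negated for heapq's min-heap).
    heap = [(-float("inf"), src, [src])]
    settled = {}
    while heap:
        neg_bw, u, path = heapq.heappop(heap)
        bw = -neg_bw
        if u in settled:
            continue
        settled[u] = bw
        if u == dst:
            return bw, path
        for v, link_bw in adj.get(u, []):
            if v not in settled:
                heapq.heappush(heap, (-min(bw, link_bw), v, path + [v]))
    return 0, []

# A 3-node example where the direct link is narrow: going through n1
# gives a 400 MB/s bottleneck instead of 100 MB/s direct.
links = {("n0", "n1"): 500, ("n1", "n2"): 400, ("n0", "n2"): 100}
print(widest_path(links, "n0", "n2"))  # (400, ['n0', 'n1', 'n2'])
```

In our full mesh every direct link is already the widest option, which is why the two runs above should produce identical results.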
To see routing effects, try the parallel spread scenario with both routing modes:
ncsim --scenario scenarios/parallel_spread.yaml \
--output results/tutorial2/spread_direct \
--routing direct
ncsim --scenario scenarios/parallel_spread.yaml \
--output results/tutorial2/spread_widest \
--routing widest_path
In the linear topology, widest-path routing enables HEFT to spread tasks across all 5 nodes (reaching n0 and n4 via multi-hop), while direct routing limits placement to 3 adjacent nodes.
Step 7: Add Interference¶
Interference models simulate the effect of shared wireless spectrum. When nearby links are active simultaneously, they reduce each other's effective bandwidth.
Proximity Interference¶
The simplest model: links whose midpoints are within a given radius share bandwidth equally.
# Default radius (15m)
ncsim --scenario scenarios/my_custom.yaml \
--output results/tutorial2/prox_15 \
--interference proximity
# Smaller radius (10m) -- less interference
ncsim --scenario scenarios/my_custom.yaml \
--output results/tutorial2/prox_10 \
--interference proximity --interference-radius 10
# Larger radius (30m) -- more interference
ncsim --scenario scenarios/my_custom.yaml \
--output results/tutorial2/prox_30 \
--interference proximity --interference-radius 30
# No interference
ncsim --scenario scenarios/my_custom.yaml \
--output results/tutorial2/no_interf \
--interference none
How proximity interference works
With radius R, if k active links have midpoints within R meters of each other, each link's bandwidth is divided by k (i.e., multiplied by 1/k). This is a simple model -- for a physically accurate WiFi model, see Tutorial 3.
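The 1/k sharing rule can be sketched with the node positions from Step 1. This is a simplified pairwise illustration of the model described above (real contention sets may be computed differently in ncsim):

```python
from math import hypot

# Node positions from Step 1 (meters).
positions = {"n0": (0, 0), "n1": (20, 0), "n2": (0, 20), "n3": (20, 20)}

def midpoint(link):
    (x1, y1), (x2, y2) = positions[link[0]], positions[link[1]]
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def effective_bandwidth(link, bw, active_links, radius):
    """Bandwidth of `link` after 1/k sharing with nearby active links."""
    mx, my = midpoint(link)
    # k = this link plus every other active link whose midpoint is within R.
    k = 1 + sum(
        1 for other in active_links
        if other != link and hypot(midpoint(other)[0] - mx,
                                   midpoint(other)[1] - my) <= radius
    )
    return bw / k

active = [("n0", "n1"), ("n2", "n3")]  # top and bottom horizontal links
# Their midpoints (10, 0) and (10, 20) are exactly 20 m apart:
print(effective_bandwidth(("n0", "n1"), 500, active, 10))  # 500.0 -- no overlap
print(effective_bandwidth(("n0", "n1"), 500, active, 30))  # 250.0 -- k = 2
```

This is why the 10 m and 30 m runs above should produce noticeably different makespans: at 30 m the two 500 MB/s links behave like 250 MB/s links whenever they are active together.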
Compare the makespans at different radii:
| Interference | Radius | Effect |
|---|---|---|
| none | -- | All links get full bandwidth at all times |
| proximity | 10m | Only very close links interfere (side links, not diagonals) |
| proximity | 15m | Default; most adjacent links interfere |
| proximity | 30m | All links in the mesh interfere with each other |
Larger radius means more contention, which increases transfer times and may change the optimal scheduler placement.
Summary¶
In this tutorial you built a complete ncsim scenario from scratch:
- Network: 4 nodes in a square with heterogeneous compute capacities (80--200 cu/s)
- Links: Full mesh with 12 directional links and varying bandwidths (300--500 MB/s)
- DAG: Fork-join pattern with 6 tasks and uneven worker compute costs (400--700 cu)
- Experiments: three schedulers (HEFT, CPOP, round-robin), two routing modes (direct, widest-path), and multiple interference settings
Key Takeaways¶
| Concept | Lesson |
|---|---|
| Heterogeneous nodes | Create non-trivial scheduling decisions |
| Fork-join DAGs | Expose parallelism that smart schedulers exploit |
| Full mesh topology | Gives direct routing access to all node pairs |
| Interference radius | Controls how aggressively links share bandwidth |
Complete Configuration Reference¶
For the full YAML schema and all available options, see the YAML Reference.
What's Next¶
| Tutorial | Topic |
|---|---|
| Tutorial 3: WiFi Experiment | Use the physically-grounded CSMA/CA Bianchi WiFi model |
| Tutorial 4: Compare Schedulers | Systematic scheduler comparison across multiple scenarios |