# Output Files

Every ncsim run produces three files in the `--output` directory. Together, these files form a self-contained record from which any simulation run can be reproduced and analyzed.

```
output/
├── scenario.yaml   # Copy of the input scenario with all defaults filled in
├── trace.jsonl     # One JSON object per line for every discrete event
└── metrics.json    # Summary statistics for the run
```
## scenario.yaml

A verbatim copy of the input scenario YAML file, placed in the output directory for convenience. This makes every output folder self-contained: you can re-run the exact same simulation from the output directory alone.

```bash
# Re-run from a previous output folder
ncsim --scenario output/my_run/scenario.yaml --output output/my_run_v2
```

**Self-contained output folders**

Copying the scenario into the output directory means you never lose track of which configuration produced a given set of results, even if you later modify the original scenario file.
## trace.jsonl

The trace file records every discrete event that occurred during the simulation, one JSON object per line (JSON Lines format). Events are written in chronological order with monotonically increasing sequence numbers.
### Example Trace

```json
{"seq":0,"sim_time":0.0,"type":"sim_start","trace_version":"1.0","seed":42,"scenario":"demo_simple.yaml"}
{"seq":1,"sim_time":0.0,"type":"dag_inject","dag_id":"dag_1","task_ids":["T0","T1"]}
{"seq":2,"sim_time":0.0,"type":"task_scheduled","dag_id":"dag_1","task_id":"T0","node_id":"n0"}
{"seq":3,"sim_time":0.0,"type":"task_start","dag_id":"dag_1","task_id":"T0","node_id":"n0"}
{"seq":4,"sim_time":0.0,"type":"task_scheduled","dag_id":"dag_1","task_id":"T1","node_id":"n0"}
{"seq":5,"sim_time":1.0,"type":"task_complete","dag_id":"dag_1","task_id":"T0","node_id":"n0","duration":1.0}
{"seq":6,"sim_time":1.0,"type":"transfer_start","dag_id":"dag_1","from_task":"T0","to_task":"T1","link_id":"l01","data_size":50}
{"seq":7,"sim_time":1.501,"type":"transfer_complete","dag_id":"dag_1","from_task":"T0","to_task":"T1","link_id":"l01","duration":0.501}
{"seq":8,"sim_time":1.501,"type":"task_start","dag_id":"dag_1","task_id":"T1","node_id":"n0"}
{"seq":9,"sim_time":3.501,"type":"task_complete","dag_id":"dag_1","task_id":"T1","node_id":"n0","duration":2.0}
{"seq":10,"sim_time":3.501,"type":"sim_end","status":"completed","makespan":3.501,"total_events":10}
```
### Event Type Reference

| Event Type | Key Fields | Description |
|---|---|---|
| `sim_start` | `trace_version`, `seed`, `scenario`, `scenario_hash` | Simulation begins. Always the first event (`seq: 0`). |
| `dag_inject` | `dag_id`, `task_ids` | A DAG is injected into the simulation at the specified `sim_time`. |
| `task_scheduled` | `dag_id`, `task_id`, `node_id` | The scheduler assigns a task to a compute node. |
| `task_start` | `dag_id`, `task_id`, `node_id` | A task begins executing on its assigned node. |
| `task_complete` | `dag_id`, `task_id`, `node_id`, `duration` | A task finishes execution. `duration` is wall-clock compute time. |
| `transfer_start` | `dag_id`, `from_task`, `to_task`, `link_id`, `data_size` | A data transfer begins between two tasks. `data_size` is in MB. May include `route` for multi-hop paths. |
| `transfer_complete` | `dag_id`, `from_task`, `to_task`, `link_id`, `duration` | A data transfer finishes. `duration` is total transfer time in seconds. May include `route` for multi-hop paths. |
| `sim_end` | `status`, `makespan`, `total_events` | Simulation complete. Always the last event. |
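When exploring an unfamiliar trace, a quick tally of events by type gives a useful overview. A minimal sketch (this helper is not part of ncsim, and the path is a placeholder for your own run):

```python
import json
from collections import Counter

def count_event_types(path):
    """Tally trace events by their 'type' field."""
    counts = Counter()
    with open(path) as f:
        for line in f:
            if line.strip():  # skip blank lines
                counts[json.loads(line)["type"]] += 1
    return counts

# Example: count_event_types("output/my_run/trace.jsonl")
```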
### Common Fields

Every event includes these fields:

| Field | Type | Description |
|---|---|---|
| `seq` | int | Monotonically increasing sequence number, starting at 0 |
| `sim_time` | float | Simulation time in seconds when the event occurred |
| `type` | string | Event type identifier (see table above) |
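These invariants are easy to sanity-check before doing heavier analysis. A sketch (the validator below is illustrative, not an ncsim API):

```python
import json

def validate_trace(path):
    """Check that every event carries the common fields and seq increments by 1."""
    expected_seq = 0
    with open(path) as f:
        for line in f:
            if not line.strip():
                continue
            event = json.loads(line)
            # Every event must have the three common fields
            assert {"seq", "sim_time", "type"} <= event.keys(), "missing common field"
            assert event["seq"] == expected_seq, f"sequence gap at seq {expected_seq}"
            expected_seq += 1
    return expected_seq  # total number of events seen
```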
**Time precision**

All `sim_time` and `duration` values are rounded to microsecond precision (6 decimal places) to avoid floating-point drift across long simulations.
## metrics.json

A JSON file containing summary statistics for the simulation run.

### Example

```json
{
  "scenario": "demo_simple.yaml",
  "seed": 42,
  "makespan": 3.501,
  "total_tasks": 2,
  "total_transfers": 1,
  "total_events": 10,
  "status": "completed",
  "node_utilization": {
    "n0": 0.857,
    "n1": 0.0
  },
  "link_utilization": {
    "l01": 0.143
  }
}
```
### Field Reference

| Field | Type | Description |
|---|---|---|
| `scenario` | string | Name of the input scenario YAML file |
| `seed` | int | Random seed used for this run |
| `makespan` | float | Total simulation time from first event to last task completion (seconds) |
| `total_tasks` | int | Number of tasks across all DAGs |
| `total_transfers` | int | Number of data-dependency edges across all DAGs |
| `total_events` | int | Total number of discrete events in the trace |
| `status` | string | `"completed"` on success, `"error"` on failure |
| `node_utilization` | object | Per-node utilization ratio (0.0 to 1.0), computed as total busy time divided by makespan |
| `link_utilization` | object | Per-link utilization ratio (0.0 to 1.0), computed as total transfer time divided by makespan |
| `error_message` | string | Present only when `status` is `"error"`. Describes the failure. |
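The utilization figures can be re-derived from the trace. A sketch, assuming each node executes one task at a time (so per-node busy time is the sum of its `task_complete` durations); applied to the example trace above, it reproduces `n0: 0.857`:

```python
import json
from collections import defaultdict

def node_utilization(trace_path):
    """Recompute per-node utilization as busy time / makespan."""
    busy = defaultdict(float)
    makespan = 0.0
    with open(trace_path) as f:
        for line in f:
            if not line.strip():
                continue
            e = json.loads(line)
            if e["type"] == "task_complete":
                busy[e["node_id"]] += e["duration"]
            elif e["type"] == "sim_end":
                makespan = e["makespan"]
    if not makespan:
        return {}
    return {node: round(t / makespan, 3) for node, t in busy.items()}
```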
When the WiFi interference model (`csma_clique` or `csma_bianchi`) is active, additional fields are included:

| Field | Type | Description |
|---|---|---|
| `rf_config` | object | Full RF configuration used (tx power, frequency, path loss exponent, etc.) |
| `carrier_sensing_range_m` | float | Computed carrier sensing range in meters |
| `link_phy_rates_MBps` | object | Per-link PHY data rate in MB/s before contention adjustment |
| `max_clique_sizes` | object | Per-link maximum clique size from the conflict graph |
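Because these fields only appear in WiFi-model runs, post-processing scripts should guard for their absence. A minimal sketch (the helper name and summary shape are illustrative):

```python
def summarize_rf(metrics):
    """Return a short RF summary dict, or None for non-WiFi runs."""
    if "carrier_sensing_range_m" not in metrics:
        return None  # interference model was not active
    return {
        "sensing_range_m": metrics["carrier_sensing_range_m"],
        "max_clique": max(metrics.get("max_clique_sizes", {}).values(), default=0),
    }
```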
## Working with Output Files

### Loading trace.jsonl in Python

```python
import json

def load_trace(path):
    """Load all events from a trace file."""
    events = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line:  # tolerate blank or trailing lines
                events.append(json.loads(line))
    return events

events = load_trace("output/my_run/trace.jsonl")
print(f"Total events: {len(events)}")
print(f"Makespan: {events[-1]['makespan']}")
```
### Filtering events by type

```python
events = load_trace("output/my_run/trace.jsonl")

# Get all task completion events
completions = [e for e in events if e["type"] == "task_complete"]
for c in completions:
    print(f"  {c['task_id']} on {c['node_id']}: {c['duration']:.3f}s")
```
### Loading metrics.json in Python

```python
import json

with open("output/my_run/metrics.json") as f:
    metrics = json.load(f)

print(f"Makespan: {metrics['makespan']:.3f}s")
print(f"Status: {metrics['status']}")

# Print node utilization
for node, util in metrics["node_utilization"].items():
    print(f"  {node}: {util:.1%}")
```
### Comparing two runs

```python
import json

def load_metrics(path):
    with open(path) as f:
        return json.load(f)

m1 = load_metrics("output/heft_run/metrics.json")
m2 = load_metrics("output/cpop_run/metrics.json")

speedup = m1["makespan"] / m2["makespan"]
print(f"HEFT makespan: {m1['makespan']:.3f}s")
print(f"CPOP makespan: {m2['makespan']:.3f}s")
print(f"CPOP speedup: {speedup:.2f}x")
```
### Computing transfer overhead from trace

```python
events = load_trace("output/my_run/trace.jsonl")

total_compute = sum(
    e["duration"] for e in events if e["type"] == "task_complete"
)
total_transfer = sum(
    e["duration"] for e in events if e["type"] == "transfer_complete"
)
overhead = total_transfer / (total_compute + total_transfer) * 100

print(f"Compute time: {total_compute:.3f}s")
print(f"Transfer time: {total_transfer:.3f}s")
print(f"Transfer overhead: {overhead:.1f}%")
```
**Large trace files**

For scenarios with many DAGs or tasks, trace files can grow large. Consider streaming the JSONL file line by line rather than loading the entire file into memory.
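A generator keeps memory usage flat regardless of trace size. A minimal sketch (the helper name and path are placeholders):

```python
import json

def iter_trace(path):
    """Yield trace events one at a time without loading the whole file."""
    with open(path) as f:
        for line in f:
            if line.strip():  # skip blank lines
                yield json.loads(line)

# Example: count task completions without holding the trace in memory
# n_complete = sum(1 for e in iter_trace("output/my_run/trace.jsonl")
#                  if e["type"] == "task_complete")
```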