This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
- When asked to write a plan, ONLY write the plan file. Do not attempt to implement, ask to implement, or exit plan mode. Stop after writing the file.
This is a dual-visualization network traffic analysis system built with D3.js v7 for analyzing TCP packet data and attack patterns. It provides two complementary views:
- Network TimeArcs (`attack-network.html` → `attack-network.js`) - Arc-based visualization of attack events over time with force-directed IP positioning. Default mode: Force layout network view (2D force-directed network graph); Timearcs Time Line View (arc-based timeline) available via radio toggle
- TCP Connection Analysis (`tcp-analysis.html` → `tcp-analysis.js`) - Detailed packet-level visualization with three named UI components:
  - Packet View (`#chart-column`) - main visualization area: stacked circles, arcs, time axis
  - Overview Bar chart (`#overview-container`) - stacked flow bars at bottom, brush-navigable time range
  - Control Panel (`#control-panel`) - floating draggable panel: IP selection, legends, view controls
This is a static HTML/JavaScript application. Serve the directory with any HTTP server:
# Python 3
python -m http.server 8000
# Node.js (npx)
npx serve .
# Then open:
# http://localhost:8000/attack-network.html (TimeArcs view)
# http://localhost:8000/tcp-analysis.html (TCP Analysis view)

The `index.html` redirects to `attack-network.html` by default.
┌─────────────────────────────────────────────────────────┐
│ Main Visualizations │
│ attack-network.js (~2000 LOC) - Arc network view │
│ tcp-analysis.js (~4600 LOC) - Packet analysis view │
└──────────────────────────┬──────────────────────────────┘
│
┌──────────────────────────┴──────────────────────────────┐
│ Supporting Modules │
│ control-panel.js - Control Panel UI (drag/collapse) │
│ sidebar.js - IP/flow selection UI │
│ legends.js - Legend rendering │
│ overview_chart.js - Overview Bar chart + brush nav │
│ folder_integration.js (~1300 LOC) - Folder data coord │
│ folder_loader.js - Chunked folder data loading │
│ viewer_loader.js - Viewer initialization utilities │
└──────────────────────────┬──────────────────────────────┘
│
┌──────────────────────────┴──────────────────────────────┐
│ /src Modular System (ES6 modules) │
│ │
│ rendering/ bars.js, circles.js, arcPath.js, rows.js │
│ arcInteractions.js, highlightUtils.js │
│ tooltip.js, svgSetup.js, initialRender.js │
│ scales/ scaleFactory.js, distortion.js (fisheye) │
│ bifocal.js (focus+context layout math) │
│ layout/ forceSimulation.js, force_network.js │
│ timearcs_layout.js (~1600 LOC, class) │
│ interaction/ zoom.js, dragReorder.js, resize.js │
│ data/ binning.js (visible packets, bar width calc) │
│ csvParser.js, flowReconstruction.js │
│ resolution-manager.js, data-source.js │
│ component-loader.js, csv-resolution-manager.js│
│ aggregation.js, flow-loader.js │
│ flow-list-loader.js (lazy CSV loading) │
│ adaptive-overview-loader.js │
│ packet-filter.js, flow-data-handler.js │
│ tcp/ flags.js (TCP flag classification) │
│ groundTruth/ groundTruth.js (attack event loading) │
│ mappings/ decoders.js, loaders.js │
│ workers/ packetWorkerManager.js │
│ plugins/ d3-fisheye.js │
│ search/ pattern-language.js (DSL tokenizer/parser/ │
│ compiler/matcher), pattern-presets.js, │
│ pattern-search-engine.js, │
│ flow-abstractor.js, search-results.js, │
│ pattern-ast-to-blocks.js, blocks-to-dsl.js │
│ ui/ legend.js, bifocal-handles.js │
│ loading-indicator.js │
│ pattern-search-panel.js, │
│ pattern-builder-popup.js/.css │
│ utils/ formatters.js, helpers.js │
│ config/ constants.js │
└──────────────────────────────────────────────────────────┘
- CSV Input → `csvParser.js` stream parsing OR folder-based chunked loading
- Packet Objects → flow reconstruction, force layout positioning
- Ground Truth → `groundTruth.js` loads attack event annotations from CSV
- Pre-binned Data → Multi-resolution CSV files loaded by `csv-resolution-manager.js` (hours/minutes/seconds/100ms/10ms/1ms/raw)
- Resolution Management → `resolution-manager.js` handles zoom-level data with LRU caching
- Rendering → stacked bars by flag type, arcs between IPs (`initialRender.js` prepares data, `bars.js`/`circles.js` render)
Flow Data for Overview Bar chart (`tcp-analysis.js:1703-1757`):
- When IPs are selected, `updateIPFilter()` is called (async function)
- Uses adaptive multi-resolution loader (`flow_bins_index.json`) for efficient Overview Bar chart rendering
- Falls back to `flow_bins.json` or chunk loading if multi-resolution not available
- For v3 format (`chunked_flows_by_ip_pair`), filters chunks by IP pair first for efficiency
- Passes filtered/aggregated data to `overview_chart.js` for categorization and binning
`packet_worker.js` handles packet filtering off the main thread:
- Receives packets via `init` message
- Filters by connection keys or IPs
- Returns `Uint8Array` visibility mask
- Managed by `src/workers/packetWorkerManager.js`
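
The visibility-mask idea above can be illustrated with a plain function. This is a minimal sketch, not the contents of `packet_worker.js`: the name `buildVisibilityMask`, the rule "visible when both endpoints are selected", and the use of the `src_ip`/`dst_ip` fields are assumptions made for illustration.

```javascript
// Hypothetical helper: returns a Uint8Array holding 1 for visible packets and 0 for hidden ones.
// Assumption: a packet is visible when both of its endpoints are in the selected IP set.
function buildVisibilityMask(packets, selectedIps) {
  const selected = new Set(selectedIps);
  const mask = new Uint8Array(packets.length); // typed arrays start zero-filled
  for (let i = 0; i < packets.length; i++) {
    const p = packets[i];
    if (selected.has(p.src_ip) && selected.has(p.dst_ip)) mask[i] = 1;
  }
  return mask;
}

// Made-up sample data for the sketch.
const demoPackets = [
  { src_ip: '10.0.0.1', dst_ip: '10.0.0.2' },
  { src_ip: '10.0.0.1', dst_ip: '10.0.0.3' },
  { src_ip: '10.0.0.2', dst_ip: '10.0.0.1' },
];
const demoMask = buildVisibilityMask(demoPackets, ['10.0.0.1', '10.0.0.2']);
console.log(Array.from(demoMask));
```

A byte-per-packet mask keeps the reply small and lets the main thread test visibility by index instead of re-running the filter.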
- `config.js` - Centralized settings (GLOBAL_BIN_COUNT, batch sizes)
- `src/config/constants.js` - Colors, sizes, TCP states
- `full_ip_map.json` - IP address → descriptive name
- `attack_group_mapping.json` - Attack type → category
- `attack_group_color_mapping.json` - Category → color
- `event_type_mapping.json` - Event → color
- `flag_colors.json`, `flow_colors.json` - Visual styling
TimeArcs CSV: timestamp, length, src_ip, dst_ip, protocol, count
TCP Analysis CSV: timestamp, src_ip, dst_ip, src_port, dst_port, flags, length, ...
Folder-based data (v3 format - chunked_flows_by_ip_pair):
packets_data/attack_flows_day1to5/
├── manifest.json # Dataset metadata (version 3.0, format, totals, time range)
├── flows/
│ ├── pairs_meta.json # IP pair index with per-pair chunk metadata
│ └── by_pair/ # Flows organized by IP pair (efficient filtering)
│ ├── 172-28-4-7__19-202-221-71/
│ │ ├── chunk_00000.json
│ │ ├── chunk_00001.json
│ │ └── ...
│ └── ... # (574 IP pairs, 1318 total chunks)
├── indices/
│ ├── bins.json # Time bins with total packet counts
│ ├── flow_bins.json # Pre-aggregated flows by IP pair (single resolution)
│ ├── flow_bins_index.json # Multi-resolution index for adaptive loading
│ ├── flow_bins_1s.json # 1-second resolution bins (for zoomed views)
│ ├── flow_bins_1min.json # 1-minute resolution bins
│ ├── flow_bins_10min.json # 10-minute resolution bins
│ ├── flow_bins_hour.json # Hourly resolution bins
│ └── flow_list/ # Flow summaries for flow list popup (lazy-loaded CSVs)
│ ├── index.json # IP pair index with file references
│ └── *.csv # Per-IP-pair CSV files (574 files, ~525MB total)
└── ips/
├── ip_stats.json # Per-IP packet/byte counts
├── flag_stats.json # Global TCP flag distribution
└── unique_ips.json # List of all IPs in dataset
Legacy v2 format (chunked_flows) also supported:
packets_data/attack_flows_day1to5_v2/
├── manifest.json # version 2.2, format: chunked_flows
├── flows/
│ ├── chunks_meta.json # Flat chunk index
│ ├── chunk_00000.json # ~300 flows per chunk
│ └── ...
└── ...
The code auto-detects format from manifest.json and loads appropriately.
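
As a rough illustration of format auto-detection, the sketch below maps a parsed `manifest.json` object to a loader choice using the two `format` values named above. The function name `detectFolderFormat` and the returned labels are hypothetical; the real detection logic may inspect other fields as well.

```javascript
// Hypothetical sketch: choose a loading strategy from a parsed manifest object.
// Only the two format strings mentioned in this document are recognized.
function detectFolderFormat(manifest) {
  const format = manifest ? manifest.format : undefined;
  if (format === 'chunked_flows_by_ip_pair') return 'v3';
  if (format === 'chunked_flows') return 'v2';
  return 'unknown'; // caller decides how to handle anything else
}

console.log(detectFolderFormat({ version: '3.0', format: 'chunked_flows_by_ip_pair' }));
console.log(detectFolderFormat({ version: '2.2', format: 'chunked_flows' }));
```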
- `attack-network.js` (~2000 LOC) - Arc network view orchestrator (data loading, mode switching, UI wiring)
- `tcp-analysis.js` (~4600 LOC) - Detailed packet analysis with stacked bars
Both compose modules from /src and maintain extensive internal state (IP positions, selections, zoom state).
src/layout/timearcs_layout.js (~1600 LOC) encapsulates the timearcs arc visualization, extracted from attack-network.js. Mirrors the ForceNetworkLayout pattern: one class with constructor options, separate setData() and render() calls, and pull-based context retrieval via callbacks.
Key responsibilities:
- Force simulation setup (component separation, hub attraction, y-constraints)
- Arc rendering with gradient coloring by attack type
- IP label layout and hover highlighting
- Bifocal (focus+context) lens distortion
- Drag-to-brush selection system
- Animated transitions between layout modes
Loading Bar (`tcp-analysis.html`, `tcp-analysis.css`):
- A progress bar is shown in `tcp-analysis.html` while data is loading on page open
- Disappears once initial render completes
The overview_chart.js module (~1100 LOC) provides:
- Stacked bar overview of invalid flows by reason (the Overview Bar chart)
- Brush-based time range selection synced with Packet View zoom
- Legend integration: clicking a legend item hides/shows bars of that category and recomputes bar heights based on the remaining visible data's max
Legend filter behavior (`overview_chart.js`):
- Module-level `overviewHiddenReasons` (Set) and `overviewHiddenCloseTypes` (Set) track legend-toggled visibility; these persist across chart recreations (e.g. IP filter changes)
- Clicking a legend item toggles the appropriate set, then calls `recomputeOverviewBars()`
- `recomputeOverviewBars()` dispatches to `_recomputeFlows()` or `_recomputeAdaptive()` based on which rendering path is active
- Two-pass recompute: (1) compute new `sharedMax` from visible-only categories, (2) restack y-positions per bin from scratch. The second pass is required because bars are stacked, so hiding one bar shifts its neighbors
- `_applyPositions()` applies results: `display:none` for hidden segments, animated 200ms y/height transitions for visible ones
- Both the overview-local sets AND the main-app filter sets (`hiddenInvalidReasonsRef`/`hiddenCloseTypesRef`) are considered; `updateOverviewInvalidVisibility()` now calls `recomputeOverviewBars()` instead of CSS-only show/hide
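
The two-pass recompute described above can be sketched as follows. This is an illustration under assumptions, not the code in `overview_chart.js`: `restackBins`, the shape of `bins` (one object of per-category counts per bin), `order`, and `chartHeight` are all hypothetical.

```javascript
// Sketch of a two-pass restack. `hidden` is a Set of category names toggled off in the legend.
function restackBins(bins, order, hidden, chartHeight) {
  // Pass 1: the tallest stack of visible categories defines the shared scale.
  let sharedMax = 0;
  for (const bin of bins) {
    let total = 0;
    for (const cat of order) {
      if (!hidden.has(cat)) total += bin[cat] || 0;
    }
    sharedMax = Math.max(sharedMax, total);
  }

  // Pass 2: rebuild every stack from the baseline using only visible categories.
  return bins.map((bin) => {
    let cursor = chartHeight; // SVG y grows downward, so stacking starts at the bottom edge
    const segments = [];
    for (const cat of order) {
      if (hidden.has(cat)) {
        segments.push({ cat, hidden: true });
        continue;
      }
      const height = sharedMax > 0 ? ((bin[cat] || 0) / sharedMax) * chartHeight : 0;
      cursor -= height;
      segments.push({ cat, hidden: false, y: cursor, height });
    }
    return segments;
  });
}

// Made-up data: hiding "a" rescales "b" to the full chart height.
const demoBins = [{ a: 2, b: 2 }, { a: 1, b: 0 }];
const allVisible = restackBins(demoBins, ['a', 'b'], new Set(), 100);
const aHidden = restackBins(demoBins, ['a', 'b'], new Set(['a']), 100);
console.log(allVisible[0], aHidden[0]);
```

Restacking from scratch avoids having to work out which neighbors moved: every visible segment gets a fresh `y` and `height`, and hidden ones are only flagged.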
Current Implementation (Multi-resolution adaptive loading):
- `tcp-analysis.js` initializes `AdaptiveOverviewLoader` from `flow_bins_index.json`
- Loader selects appropriate resolution based on visible time range (hour → 10min → 1min → 1s, per the thresholds in the index)
- Filters pre-aggregated flow bins by selected IP pairs
- Creates synthetic flows from bin data for Overview Bar chart
- Fallback chain: adaptive loader → `flow_bins.json` → chunk loading
Multi-resolution index (flow_bins_index.json):
{
"resolutions": {
"1s": { "file": "flow_bins_1s.json", "bin_width_us": 1000000, "use_when_range_minutes_lte": 10 },
"1min": { "file": "flow_bins_1min.json", "bin_width_us": 60000000, "use_when_range_minutes_lte": 120 },
"10min": { "file": "flow_bins_10min.json", "bin_width_us": 600000000, "use_when_range_minutes_lte": 7200 },
"hour": { "file": "flow_bins_hour.json", "bin_width_us": 3600000000, "use_when_range_minutes_gt": 7200 }
}
}

`flow_bins.json` Structure (per resolution):
[
{
"bin": 0,
"start": 1257254652674641,
"end": 1257258647167936,
"flows_by_ip_pair": {
"172.28.1.134<->152.162.178.254": {
"graceful": 1,
"abortive": 5,
"invalid": {
"rst_during_handshake": 290,
"invalid_ack": 2,
"incomplete_no_synack": 1
},
"ongoing": 10
}
}
}
]

Benefits:
- Adaptive resolution: Coarse bins for overview, fine bins when zoomed
- Instant loading: Small files vs. thousands of chunk files
- Efficient filtering: Pre-aggregated by IP pair
- Reduced memory: No need to load full flow objects for overview
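
The sketch below shows one way the index thresholds and the per-pair bin structure could be consumed. It is a hedged illustration, not the `AdaptiveOverviewLoader` implementation: `pickResolution` and `totalsForPairs` are hypothetical names, and collapsing all invalid reasons into one total is a simplification.

```javascript
// Pick the finest resolution whose "lte" limit still covers the visible range;
// fall back to the entry with a "gt" rule when the range exceeds every limit.
function pickResolution(index, rangeMinutes) {
  let best = null;
  let fallback = null;
  for (const [name, res] of Object.entries(index.resolutions)) {
    if ('use_when_range_minutes_lte' in res) {
      const limit = res.use_when_range_minutes_lte;
      if (rangeMinutes <= limit && (best === null || limit < best.limit)) best = { name, limit };
    } else if ('use_when_range_minutes_gt' in res && rangeMinutes > res.use_when_range_minutes_gt) {
      fallback = name;
    }
  }
  return best ? best.name : fallback;
}

// Sum the pre-aggregated counts of one bin for the selected IP pairs only.
function totalsForPairs(bin, selectedPairs) {
  const totals = { graceful: 0, abortive: 0, ongoing: 0, invalid: 0 };
  for (const pair of selectedPairs) {
    const entry = bin.flows_by_ip_pair[pair];
    if (!entry) continue;
    totals.graceful += entry.graceful || 0;
    totals.abortive += entry.abortive || 0;
    totals.ongoing += entry.ongoing || 0;
    for (const n of Object.values(entry.invalid || {})) totals.invalid += n;
  }
  return totals;
}

// Index and bin copied from the examples in this document.
const demoIndex = {
  resolutions: {
    '1s': { file: 'flow_bins_1s.json', bin_width_us: 1000000, use_when_range_minutes_lte: 10 },
    '1min': { file: 'flow_bins_1min.json', bin_width_us: 60000000, use_when_range_minutes_lte: 120 },
    '10min': { file: 'flow_bins_10min.json', bin_width_us: 600000000, use_when_range_minutes_lte: 7200 },
    hour: { file: 'flow_bins_hour.json', bin_width_us: 3600000000, use_when_range_minutes_gt: 7200 },
  },
};
const demoBin = {
  flows_by_ip_pair: {
    '172.28.1.134<->152.162.178.254': {
      graceful: 1,
      abortive: 5,
      invalid: { rst_during_handshake: 290, invalid_ack: 2, incomplete_no_synack: 1 },
      ongoing: 10,
    },
  },
};
console.log(pickResolution(demoIndex, 60), totalsForPairs(demoBin, ['172.28.1.134<->152.162.178.254']));
```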
For deployments where chunk files are too large (e.g., GitHub Pages), generate per-IP-pair CSV files. Two scripts are available:
# Summary only (no packet data) - smaller files
python packets_data/generate_flow_list.py --input-dir packets_data/attack_flows_day1to5
# With embedded packet data (fp column) - enables "View Packets" button
python packets_data/generate_flow_data.py --input-dir packets_data/attack_flows_day1to5

Output Structure:
indices/flow_list/
├── index.json # IP pair index (87KB)
├── 172-28-4-7__192-168-1-1.csv # Flows for this IP pair
├── 172-28-4-7__10-0-0-1.csv # Another IP pair
└── ... # 574 files total (~525MB)
index.json Structure:
{
"version": "1.1",
"format": "flow_list_csv",
"columns": ["d", "st", "et", "p", "sp", "dp", "ct", "fp"],
"total_flows": 5482939,
"total_pairs": 574,
"unique_ips": 294,
"time_range": { "start": 1257254652674641, "end": 1257654102004202 },
"pairs": [
{ "pair": "172.28.4.7<->192.168.1.1", "file": "172-28-4-7__192-168-1-1.csv", "count": 1523 }
]
}

CSV Format (with embedded packet data):
start_time,src_port,dst_port,close_type,packets
1257254652674641,54321,80,invalid_ack,"0:2:1,159:18:0,578:16:1"

- `start_time`: Start time (absolute microseconds)
- `close_type`: Close type (graceful/abortive/ongoing) or invalid reason (invalid_ack, rst_during_handshake, etc.)

The `packets` column contains embedded packet data: `delta_ts:flags:dir,...`
- `delta_ts`: Microseconds relative to flow start time
- `flags`: TCP flags (numeric value)
- `dir`: Direction (1 = ip1→ip2, 0 = ip2→ip1) based on alphabetical IP order from filename
- Initiator = first packet's dir (dir=1 means ip1 initiated, dir=0 means ip2 initiated)
- Packet count = number of comma-separated entries
- Duration = last packet's delta_ts (end_time = start_time + last delta_ts)
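
The derivations listed above (initiator, packet count, duration, end time) can be worked through on the sample row. This is a minimal sketch; `parseEmbeddedPackets` and the returned field names are hypothetical and do not describe the `FlowListLoader` API.

```javascript
// Parse a "delta_ts:flags:dir,..." string and derive the per-flow values described above.
function parseEmbeddedPackets(startTime, packetsField) {
  const packets = packetsField.split(',').map((entry) => {
    const [deltaTs, flags, dir] = entry.split(':').map(Number);
    return { deltaTs, flags, dir };
  });
  const last = packets[packets.length - 1];
  return {
    packets,
    packetCount: packets.length,                      // number of comma-separated entries
    initiator: packets[0].dir === 1 ? 'ip1' : 'ip2',  // first packet's direction
    duration: last.deltaTs,                           // microseconds from flow start
    endTime: startTime + last.deltaTs,                // absolute microseconds
  };
}

const demoFlow = parseEmbeddedPackets(1257254652674641, '0:2:1,159:18:0,578:16:1');
console.log(demoFlow.packetCount, demoFlow.initiator, demoFlow.duration, demoFlow.endTime);
// prints: 3 ip1 578 1257254652675219
```

Microsecond timestamps of this magnitude stay below `Number.MAX_SAFE_INTEGER`, so plain number arithmetic is exact here.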
Lazy Loading Behavior:
- On page load: Only `index.json` is fetched (~87KB)
- On IP selection: No CSV files loaded yet; UI shows "Flow List Available"
- On Overview Bar chart click: Only relevant IP pair CSVs are fetched for the clicked time range
- Loaded CSVs are cached in memory for subsequent requests

Key Files:
- `src/data/flow-list-loader.js` - FlowListLoader class for parsing/caching CSVs with `fp` column support
- `src/data/flow-loader.js` - Decision tree that defers loading when FlowListLoader available
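
The lazy, cached loading behavior described above can be sketched with a small class. `LazyPairCache` and the injected `fetchText` callback are hypothetical and are not the `FlowListLoader` interface; injecting the fetch function only keeps the sketch runnable without a server.

```javascript
// Hypothetical sketch of fetch-on-demand with an in-memory cache keyed by file name.
class LazyPairCache {
  constructor(fetchText) {
    this.fetchText = fetchText; // async (file) => string
    this.cache = new Map();     // file name -> Promise<string>
  }

  // Returns the cached promise when the file was already requested.
  load(file) {
    if (!this.cache.has(file)) this.cache.set(file, this.fetchText(file));
    return this.cache.get(file);
  }
}

// Stand-in fetcher that counts how often it is called.
let demoFetchCalls = 0;
const demoCache = new LazyPairCache(async (file) => {
  demoFetchCalls += 1;
  return `contents of ${file}`;
});
const firstRequest = demoCache.load('172-28-4-7__192-168-1-1.csv');
const secondRequest = demoCache.load('172-28-4-7__192-168-1-1.csv');
console.log(demoFetchCalls);
```

Caching the promise rather than the resolved text means two clicks that arrive while a download is still in flight share one request.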
When flow_list CSVs are present:
- Flow list popup works without loading chunk files
- If `fp` column present: "View Packets" visualizes embedded packet data (no chunk files needed)
- If `fp` column absent: "View Packets" and "Export CSV" buttons are disabled
- Overview Bar chart still uses adaptive flow_bins for visualization
- CSV format is ~45% smaller than JSON; all files under GitHub's 100MB limit
The csv-resolution-manager.js handles zoom-level dependent packet data loading with 7 resolution levels:
| Resolution | Bin Size | Auto Threshold |
|---|---|---|
| hours | 1 hour | > 2 days |
| minutes | 1 minute | > 1 hour |
| seconds | 1 second | > 1 minute |
| 100ms | 100ms | > 10 seconds |
| 10ms | 10ms | > 1 second |
| 1ms | 1ms | > 100ms |
| raw | individual packets | ≤ 100ms |
Coarse resolutions (hours, minutes, seconds) use single-file data.csv files loaded at initialization. Fine resolutions (100ms, 10ms, 1ms, raw) use chunked files loaded on-demand with LRU caching.
Resolution Selection (getResolutionForVisibleRange() in tcp-analysis.js):
- Auto mode: Threshold-based — picks the coarsest level whose threshold the visible range exceeds
- Manual override: Dropdown selects an explicit resolution level. The selected level is used directly. When the user zooms in past the next finer level's threshold, it auto-refines one step at a time (e.g., minutes → seconds → 100ms → ...). Never goes coarser than the selected level.
- The dropdown labels use "+" suffix (e.g., "Minutes+") to indicate zoom-to-finer behavior
- A current resolution indicator badge next to the dropdown shows the active resolution (blue = auto, orange = manual override)
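
The auto-mode thresholds from the table above can be expressed as an ordered lookup. This sketch covers auto mode only; `autoResolution`, the constant name, and the choice of milliseconds as the unit are assumptions and do not reproduce `getResolutionForVisibleRange()`.

```javascript
// Thresholds from the table, coarsest first, expressed in milliseconds (unit is an assumption).
const AUTO_THRESHOLDS_MS = [
  ['hours', 2 * 24 * 60 * 60 * 1000], // > 2 days
  ['minutes', 60 * 60 * 1000],        // > 1 hour
  ['seconds', 60 * 1000],             // > 1 minute
  ['100ms', 10 * 1000],               // > 10 seconds
  ['10ms', 1000],                     // > 1 second
  ['1ms', 100],                       // > 100ms
];

// Returns the coarsest level whose threshold the visible range exceeds, else "raw".
function autoResolution(visibleRangeMs) {
  for (const [name, threshold] of AUTO_THRESHOLDS_MS) {
    if (visibleRangeMs > threshold) return name;
  }
  return 'raw';
}

console.log(autoResolution(2 * 60 * 60 * 1000)); // two hours visible
console.log(autoResolution(50));                 // 50 ms visible
```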
Generating multi-resolution flow bins:
# Generate all resolutions from existing v3 data
python packets_data/generate_flow_bins_v3.py --input-dir packets_data/attack_flows_day1to5

`src/groundTruth/groundTruth.js` loads attack event annotations from `GroundTruth_UTC_naive.csv`:
- Parses event types, source/destination IPs, port ranges, time windows
- Converts timestamps to microseconds for alignment with packet data
- Filters events by selected IPs for contextual display
The visualization sizes rows dynamically and offsets each IP pair within its row, which prevents overlap when multiple IP pairs share the same source IP row:
Per-IP Dynamic Row Heights (`src/layout/ipPositioning.js`):
- `computeIPPairCounts()` counts unique destination IPs per source IP
- Base row height: `max(ROW_GAP, pairCount * (SUB_ROW_HEIGHT + SUB_ROW_GAP))` (i.e. `pairCount * 32px`)
- When Separate Flags is on, heights are expanded post-binning by `computeFlagSeparationHeights()` / `computeSubRowLayout()` in `tcp-analysis.js` to fit actual circle stacking
- `ipRowHeights` Map stored in state for rendering access
- Cumulative positioning: each IP's y = previous IP's y + previous IP's row height
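
A small worked example of the base row height and cumulative positioning rules above. The constant values are assumptions, chosen only so that `SUB_ROW_HEIGHT + SUB_ROW_GAP` equals the 32px stated in the text, and `countPairsPerSource` / `layoutRows` are hypothetical names rather than the functions in `ipPositioning.js`.

```javascript
// Assumed values for illustration; only their sum (32px) comes from the text above.
const SUB_ROW_HEIGHT = 28;
const SUB_ROW_GAP = 4;
const ROW_GAP = 32;

// Count unique destination IPs per source IP.
function countPairsPerSource(packets) {
  const partners = new Map(); // src_ip -> Set of dst_ip
  for (const p of packets) {
    if (!partners.has(p.src_ip)) partners.set(p.src_ip, new Set());
    partners.get(p.src_ip).add(p.dst_ip);
  }
  return new Map([...partners].map(([ip, set]) => [ip, set.size]));
}

// Each row starts where the previous row ended.
function layoutRows(ipOrder, pairCounts, topY = 0) {
  const heights = new Map();
  const positions = new Map();
  let y = topY;
  for (const ip of ipOrder) {
    const count = pairCounts.get(ip) || 1;
    const height = Math.max(ROW_GAP, count * (SUB_ROW_HEIGHT + SUB_ROW_GAP));
    heights.set(ip, height);
    positions.set(ip, y);
    y += height;
  }
  return { heights, positions };
}

const demoRowPackets = [
  { src_ip: '10.0.0.1', dst_ip: '10.0.0.2' },
  { src_ip: '10.0.0.1', dst_ip: '10.0.0.3' },
  { src_ip: '10.0.0.1', dst_ip: '10.0.0.2' },
  { src_ip: '10.0.0.2', dst_ip: '10.0.0.1' },
];
const demoCounts = countPairsPerSource(demoRowPackets);
const demoLayout = layoutRows(['10.0.0.1', '10.0.0.2'], demoCounts);
console.log(demoLayout.heights, demoLayout.positions);
```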
IP Pair Vertical Offsets (`src/rendering/bars.js`, `src/rendering/circles.js`):
- Pairs within a row are ordered by time of first packet (earliest first)
- Each pair gets a sub-row offset: `pairIndex * (subRowHeight + SUB_ROW_GAP)`
- First pair (index 0) aligns with the IP label baseline
- Subsequent pairs grow downward within the row's allocated height
- `makeIpPairKey(srcIp, dstIp)` creates canonical keys (alphabetically sorted)
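
The ordering and offset rules above can be sketched as follows. `pairKeySketch` is a stand-in for `makeIpPairKey()`: the `|` separator is an assumption, as are the `timestamp` field and the function name `subRowOffsetsForRow`.

```javascript
// Direction-independent key: the lexicographically smaller IP string goes first.
// The "|" separator is an assumption, not the real key format.
function pairKeySketch(ipA, ipB) {
  return ipA < ipB ? `${ipA}|${ipB}` : `${ipB}|${ipA}`;
}

// Order the pairs of one row by their earliest packet, then assign uniform offsets.
function subRowOffsetsForRow(rowPackets, subRowHeight, subRowGap) {
  const firstSeen = new Map(); // pair key -> earliest timestamp
  for (const p of rowPackets) {
    const key = pairKeySketch(p.src_ip, p.dst_ip);
    const prev = firstSeen.get(key);
    if (prev === undefined || p.timestamp < prev) firstSeen.set(key, p.timestamp);
  }
  const ordered = [...firstSeen.entries()].sort((a, b) => a[1] - b[1]);
  const offsets = new Map();
  ordered.forEach(([key], index) => offsets.set(key, index * (subRowHeight + subRowGap)));
  return offsets;
}

const demoOffsets = subRowOffsetsForRow(
  [
    { src_ip: '10.0.0.1', dst_ip: '10.0.0.3', timestamp: 5 },
    { src_ip: '10.0.0.1', dst_ip: '10.0.0.2', timestamp: 2 },
    { src_ip: '10.0.0.3', dst_ip: '10.0.0.1', timestamp: 1 },
  ],
  28, // assumed sub-row height
  4   // assumed sub-row gap
);
console.log(demoOffsets);
```

Because the key ignores direction, the reply packet at timestamp 1 counts toward the same pair as the request at timestamp 5, which places that pair first.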
Sub-Row Target IP Labels (`src/rendering/circles.js`):
- When an IP row is expanded and has multiple sub-rows, each sub-row displays a small italic label showing the target (partner) IP address
- Labels are rendered as `.sub-row-ip-label` text elements inside the circle layer, positioned just left of the leftmost circle in each sub-row (`x = firstCircle.cx - radius - 4px`)
- Y position uses the stable sub-row center (from `ipPairOrderByRow`), unaffected by flag separation
- Labels are only shown for multi-pair rows; single-pair IPs and collapsed rows are skipped
- Target IP is extracted from the canonical pair key by comparing against the row's `src_ip`
- Labels are cleared and re-created on every `renderCircles()` call (zoom, pan, filter changes)
- Styled: 9px monospace, italic, `#888` fill, `pointer-events: none`
Sub-Row Ghost Arcs (`src/rendering/circles.js`, `src/rendering/svgSetup.js`):
- Persistent ghost arcs show IP pair connections at the sub-row level
- Rendered as low-opacity arcs connecting each IP pair's sub-row position
- Toggled via a control in the Control Panel
- `svgSetup.js` handles SVG layer setup and hover area sizing; hover hit areas are limited to the IP label width to prevent overlap with chart content
Row Hover Highlighting:
- Hovering an IP row uses grey shades (not blue/yellow)
- Highlights all bins in the row that belong to the hovered IP pair, not just the first matching bin
IP Label Hover Styling (consistent across all views):
- Hovered IP: bold, black (`#000`)
- Connected IPs: font-weight 500, black (`#000`)
- Non-connected IPs: faded to opacity 0.25
- Applied in `timearcs_layout.js`, `rows.js` (TCP Analysis), and `tcp-analysis.css`
Circle Hover Callbacks (`circles.js`):
- `onCircleHighlight(srcIp, dstIps)` - called on circle mouseover; highlights source/destination IP rows and labels
- `onCircleClearHighlight()` - called on circle mouseout; clears all highlights
- TCP Analysis wires these via `renderCirclesWithOptions()` to apply `.highlighted`, `.connected`, `.faded` CSS classes
Arc Path Connections (`src/rendering/arcPath.js`):
- `arcPathGenerator()` accepts optional `srcY` and `dstY` for offset positions
- Hover handlers calculate both source and destination offsets using `calculateYPosWithOffset()`
- Arcs connect circle-to-circle at their actual offset positions, not baselines
Row Collapse Behavior:
- All IP rows with multiple pairs start collapsed by default (`defaultCollapseApplied` flag)
- `state.layout.collapsedIPs` Set tracks which IPs have their sub-rows merged
- Click individual IP labels to expand/collapse; per-IP toggle buttons are SVG circles at a fixed left-aligned column (`toggleX = -168` in `svgSetup.js`)
- "Expand All"/"Collapse All" is a sticky pill-shaped HTML button (`#expand-all-btn`) at the top of `#chart-container` with `position: sticky; top: 8px`. Visually distinct from per-IP circles (pill shape with text label, dynamic width: 96px/106px). Created in `createOrUpdateExpandAllBtn()` in `tcp-analysis.js`
- Collapsed rows merge all pair bins at same (time, yPos, flagType) into single circles
Key Data Structures:
// ipPairOrderByRow: Map<yPos, { order: Map<ipPairKey, index>, count: number }>
// ipRowHeights: Map<ip, heightInPixels>
// ipPairCounts: Map<ip, numberOfUniquePairs>
// collapsedIPs: Set<ip> - IPs whose sub-rows are collapsed
// subRowHeights: Map<"ip|pairKey", number> - per-sub-row effective height (null when separateFlags off)
// subRowOffsets: Map<"ip|pairKey", number> - per-sub-row cumulative Y offset from baseY (null when separateFlags off)
// state.search.newlyAddedIPs: Set<ip> - IPs added by "Select IPs" action (gold-highlighted until cleared)

IMPORTANT: `ipPairOrderByRow` must be updated in-place.
renderIPRowLabels() in svgSetup.js captures ipPairOrderByRow in mouseover closures. If you replace the Map with a new object (state.layout.ipPairOrderByRow = newMap), those closures become stale and lookups fail (causing full-row highlight instead of sub-row highlight). Always update in-place:
const newOrder = computeIPPairOrderByRow(packets, ipPositions);
state.layout.ipPairOrderByRow.clear();
for (const [k, v] of newOrder) state.layout.ipPairOrderByRow.set(k, v);

This pattern is used at 4 sites: resolution change, drag-reorder, collapse/expand, and flag separation adjustment.
Separate Flags (`#separateFlags` checkbox, `state.ui.separateFlags`):
- Prevents overlapping circles of different flag types at the same time bin
- Groups co-located circles by `(binCenter, yPosWithOffset)`, sorts by TCP lifecycle phase order (`FLAG_PHASE_ORDER`: SYN → SYN+ACK → ACK → PSH → FIN → RST → OTHER), then packs them sequentially (each circle touching its neighbors) so the total vertical span = sum of all diameters
- Adaptive per-sub-row heights (post-binning pipeline):
  - `computeFlagSeparationHeights(binnedPackets, rScale)` in `tcp-analysis.js` groups binned packets by `(src_ip, ipPairKey, timeKey)`, computes sum-of-diameters per group, and keeps the max per sub-row; returns `Map<"ip|pairKey", maxHeight>`
  - `computeSubRowLayout(perSubRowHeight, ipPairOrderByRow, ...)` converts per-sub-row heights into cumulative Y offsets with variable stride: `center[i] = center[i-1] + h[i-1]/2 + SUB_ROW_GAP + h[i]/2`. Also computes updated IP row heights (sum of all sub-row heights + gaps). Returns `{ subRowOffsets, subRowHeights, ipRowHeightUpdates }`
  - Results stored in `state.layout.subRowHeights` and `state.layout.subRowOffsets`; reset to `null` when Separate Flags is off. All stride calculations (rendering, hover detection, box selection) use offset lookups with fallback to uniform `pairIndex * (SUB_ROW_HEIGHT + SUB_ROW_GAP)`
- For collapsed rows, the span is clamped to available row height and falls back to even spacing
- Implemented in `src/rendering/circles.js:174-224` (circle packing), `tcp-analysis.js:409-507` (height computation)
- Named "Separate Flags" in the UI (not "Stacked Circles"; avoid that term)
- Default: off (`separateFlags: false` in `tcp-analysis.js:252`)
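
The variable-stride formula and the touching-circle packing described above can be checked with a short sketch. `subRowCenters` and `packCircles` are hypothetical helpers, and placing the first center at 0 is an assumption; the real code in `computeSubRowLayout()` and `circles.js` may differ in detail.

```javascript
// center[i] = center[i-1] + h[i-1]/2 + gap + h[i]/2, with the first center assumed to be 0.
function subRowCenters(heights, gap) {
  const centers = [];
  for (let i = 0; i < heights.length; i++) {
    centers.push(i === 0 ? 0 : centers[i - 1] + heights[i - 1] / 2 + gap + heights[i] / 2);
  }
  return centers;
}

// Pack circles (already sorted by phase order) so neighbors touch and the
// stack is centered on `centerY`; total span equals the sum of diameters.
function packCircles(radii, centerY) {
  const span = radii.reduce((sum, r) => sum + 2 * r, 0);
  let cursor = centerY - span / 2;
  return radii.map((r) => {
    const cy = cursor + r;
    cursor += 2 * r;
    return cy;
  });
}

console.log(subRowCenters([20, 40, 10], 4)); // prints [ 0, 34, 63 ]
console.log(packCircles([5, 10], 0));        // prints [ -10, 5 ]
```

In the second call the two centers are 15px apart, which equals the sum of the radii, so the circles touch without overlapping.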
Three types of arc links connect circles in the Packet View:
- Hover S-curves (temporary): drawn on circle mouseover in `circles.js`. Replaces the old arc-to-destination with an S-curve that shows flow direction. The dummy node endpoint is synthesized by projecting forward from the hovered circle by one bin width (`d.bin_end - d.bin_start` converted to pixels, min 20px). The S-curve bends down to the destination IP's sub-row y-position, then ends at the dummy node x. Includes a midpoint polygon arrowhead. Removed on mouseout.
  - Event handlers (`handleMouseover`, `handleMousemove`, `handleMouseout`) are defined as named closures inside `renderCircles` and bound in both the `enter` and `update` join paths, so all circles (newly entered and persisted across zoom/pan) always use the current render's `xScale` and `calculateYPosWithOffset`.
  - Collapsed circles (`ipPairKey === '__collapsed__'`) have ambiguous `dst_ip` and skip S-curve drawing; tooltip and IP highlight still fire normally.
- Sub-row arcs (`#showSubRowArcs` toggle, `state.ui.showSubRowArcs`): persistent low-opacity arcs connecting IP pair sub-rows. Drawn by `drawSubRowArcs()` in `tcp-analysis.js`. Toggled via Control Panel checkbox.
- TCP Flow arcs (`#showTcpFlows` toggle, `state.ui.showTcpFlows`): persistent arcs for selected TCP flows, drawn by `drawSelectedFlowArcs()` in `tcp-analysis.js:1654-1758`. Grouped by time bucket, src/dst IP pair, and flag type. Phase filters (establishment/data transfer/closing) control which flows are shown.

Note: Auto-enabling links at raw zoom level (as in the original requirements) is not yet implemented.
- TimeArcs (`src/layout/timearcs_layout.js`): Complex multi-force simulation with component separation, hub attraction, y-constraints
- Force Network (`src/layout/force_network.js`): 2D force layout used as the default view mode in TimeArcs. Aggregates arc data by IP pair + attack type, renders with D3 force simulation. Supports `precalculate()` for pre-computing positions (used during animated transitions) and `staticStart` rendering. On data load, the timearcs render completes first, then auto-transitions to force layout via `transitionToForceLayout()`
- BarDiagram: Uses vertical IP order from TimeArcs directly (no separate force layout)
Network Mode Toggle (`attack-network.html`):
- Radio buttons switch between "Timearcs Time Line View" (arc timeline) and "Force layout network view" (2D network graph)
- Default: Force layout network view (`layoutMode = 'force_layout'`, `labelMode = 'force_layout'`)
- Force layout uses `attack_group` for coloring; Timearcs uses `attack` (finer-grained)
Drag-to-brush selection allows users to select arcs/nodes for analysis and export to tcp-analysis:
- Persistent selections (`persistentSelections[]`): Stored at module level as data objects with `{id, timeBounds, ips, arcs, timeRange}`. Survive resize/filter re-renders.
- `multiSelectionsGroup`: SVG `<g>` holding selection visuals. Must be reset to `null` in `render()` cleanup (after `svg.selectAll('*').remove()`) so `setupDragToBrush()` creates a fresh DOM group. Forgetting this causes new selections to append to a detached element.
- `computeSelectionBounds()`: Recomputes selection rectangle pixel bounds from stored IP names using current scales/node positions (not stale pixel values). Shared by `createPersistentSelectionVisual` and `updatePersistentSelectionVisuals`.
- `redrawAllPersistentSelections()` / `redrawPersistentSelectionsFn`: Clears and re-creates all selection DOM elements. Called after positions finalize (timearcs animation end, force layout setup, component layout change) and from the force layout resize handler.
- Resize behavior: Timearcs mode calls `render()`, which preserves `persistentSelections` and redraws after animation. Force layout mode bypasses `render()` and calls `redrawPersistentSelectionsFn` directly.
Box selection allows users to select packets on circle rows for raw CSV export:
- Enable: `#enableBoxSelection` checkbox in Control Panel (`state.ui.enableBoxSelection`)
- Interaction: When enabled, click-drag horizontally across any IP row (like text selection; no modifier key needed). Box height auto-snaps to the detected row. Normal pan/drag is disabled while box selection mode is on; wheel zoom still works.
- Collapsed mode: Box covers entire IP row; paired boxes drawn on all partner IP rows via `allPairs`
- Expanded mode: Box targets a specific sub-row; paired box drawn on partner IP's matching sub-row
- Multiple selections supported; all use dark grey (`BOX_SELECTION_COLOR = '#555'`)
- Persistent selections (`boxSelections[]`): Stored as data coordinates (time range + IP names), recomputed to pixels on zoom/resize via `redrawAllBoxSelections()`
- Export: `exportBoxSelectionCSV()` (async) loads raw packets via `fetchChunksForRange(start, end, 'raw')`, which fetches actual individual packets with microsecond timestamps, ports, and flags regardless of current zoom resolution. Falls back to `state.data.full` if raw resolution unavailable.
SVG layering (inside `mainGroup`, appended by `setupBoxSelectionDrag()`):
- Overlay rect (`.box-select-overlay`): `pointer-events: all` when enabled, captures drag events
- Selections group (`.box-selections-group`): `pointer-events: none` on `<g>` so rects pass through to overlay; `foreignObject` buttons override with `pointer-events: all` for clickability
Zoom integration:
- `src/interaction/zoom.js`: Zoom filter blocks drag-pan when `isBoxSelectionActive()` returns true
- `src/interaction/timearcsZoomHandler.js`: Calls `redrawAllBoxSelections()` after zoom render
- Drag-reorder handler also calls `redrawAllBoxSelections()`
Key functions (all in `tcp-analysis.js`):
- `setupBoxSelectionDrag()` - creates overlay + selections group, wires drag handlers
- `detectIPRowFromY(y)` - finds IP row/sub-row for a Y coordinate
- `computeSubRowBounds(ip, pairKey)` - returns `{boxY, boxH}` for source or destination
- `finalizeBoxSelection(start, end, rowInfo)` - converts pixel coords to selection data object
- `createBoxSelectionVisual(selection)` - draws source rect, partner rects, label, Export/Remove buttons
- `redrawAllBoxSelections()` - clears and recreates all visuals from stored data
- `exportBoxSelectionCSV(selection)` - async; loads raw packets, filters by IP pair, downloads CSV
The Pattern Search feature allows searching for TCP packet patterns across flows using a custom DSL.
Architecture:
- `src/search/pattern-language.js` - Tokenizer, parser, compiler, and matcher for the DSL
- `src/search/pattern-presets.js` - Built-in preset patterns (e.g., "Full Graceful Close", "SYN Flood")
- `src/search/pattern-search-engine.js` - Orchestrates search across flow data from `FlowListLoader`
- `src/search/flow-abstractor.js` - Converts embedded packet arrays to abstract event sequences for matching
- `src/search/search-results.js` - Stores match results with per-IP-pair match counts
- `src/ui/pattern-search-panel.js` - Search panel UI in the Control Panel
- `src/ui/pattern-builder-popup.js` - Visual pattern builder popup (block-based)
DSL Grammar (simplified):
pattern := sequence ('|' sequence)* # disjunction (OR)
sequence := element ('->' element)* # strict adjacency
element := '!'? atom quantifier? # negation + quantifier
atom := event_name constraint? | '(' pattern ')' | '.' | '*' | '$' | '^'
quantifier := '+' | '?' | '{' min (',' max?)? '}'
constraint := '[' key op value (',' key op value)* ']'
Key semantics:
- `->` means strict adjacency: no events allowed between elements unless explicitly matched
- `.` and `*` are both wildcard atoms (match any single event)
- `ACK{0,}` = zero or more ACKs (greedy with backtracking)
- `!SYN_ACK` = negative lookahead (does NOT consume an event)
- `$` = end-of-sequence anchor (matches only at end; does NOT consume an event)
- `^` = start-of-sequence anchor (matches only at position 0; does NOT consume an event)
- Both `^` and `$` are zero-width assertions; they bypass the quantifier loop in `compileElement()` since repeating a position check is meaningless
- Constraints: `[dir=out]`, `[dt>1s]`, `[dir=in,dt<50ms]`
DSL Implementation Details (`src/search/pattern-language.js`):
- Tokenizer: Token types include `IDENT`, `ARROW`, `PIPE`, `LBRACE`, `RBRACE`, `DOT`, `STAR`, `BANG`, `DOLLAR`, `CARET`, `LBRACKET`, `RBRACKET`, `LPAREN`, `RPAREN`, `PLUS`, `QUESTION`, `COMMA`, `NUMBER`, `EOF`
- `^` (StartAnchor): Compiled to `(seq, start) => ({ matched: start === 0, endIndex: start })`. Zero-width; only succeeds at position 0. `matchPattern()` scanner still iterates all start positions, but `^` ensures only position 0 can match.
- `$` (EndAnchor): Compiled to `(seq, start) => ({ matched: start === seq.length, endIndex: start })`. Only succeeds at end of sequence.
- `compileElement()` bypass: Both `StartAnchor` and `EndAnchor` skip the quantifier loop: `if (atom.type === 'EndAnchor' || atom.type === 'StartAnchor') return atomMatcher;`
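
To see how zero-width anchors interact with a scanner that tries every start position, here is a small combinator sketch that reuses the `(seq, start) => ({ matched, endIndex })` shape quoted above. Everything else is an assumption for illustration: events are plain strings, `atom`, `sequence`, and `firstMatch` are hypothetical helpers, and the returned `end` is exclusive. Quantifiers, negation, and constraints are left out.

```javascript
// Anchors as given in the text: zero-width checks that never advance the position.
const startAnchor = (seq, start) => ({ matched: start === 0, endIndex: start });
const endAnchor = (seq, start) => ({ matched: start === seq.length, endIndex: start });

// Hypothetical atom matcher: consumes exactly one event with the given name.
const atom = (name) => (seq, start) =>
  start < seq.length && seq[start] === name
    ? { matched: true, endIndex: start + 1 }
    : { matched: false, endIndex: start };

// Strict adjacency: each matcher must succeed exactly where the previous one ended.
const sequence = (...matchers) => (seq, start) => {
  let pos = start;
  for (const m of matchers) {
    const result = m(seq, pos);
    if (!result.matched) return { matched: false, endIndex: start };
    pos = result.endIndex;
  }
  return { matched: true, endIndex: pos };
};

// Try every start position (including seq.length) and report the first match only.
function firstMatch(matcher, seq) {
  for (let start = 0; start <= seq.length; start++) {
    const result = matcher(seq, start);
    if (result.matched) return { start, end: result.endIndex };
  }
  return null;
}

const demoSeq = ['SYN', 'ACK', 'SYN', 'ACK'];
console.log(firstMatch(sequence(atom('SYN'), atom('ACK')), demoSeq));            // first pair
console.log(firstMatch(sequence(atom('SYN'), atom('ACK'), endAnchor), demoSeq)); // pair at the end
console.log(firstMatch(sequence(startAnchor, atom('ACK')), demoSeq));            // no match
```

The third pattern fails even though `ACK` occurs twice, because `^` rejects every start position other than 0 and the event at position 0 is `SYN`.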
Abstraction Levels:
- Level 1 (Packet): Each packet → `{ flagType, dir, deltaTime }`. Matches against individual packets.
- Level 2 (Phase): Groups consecutive packets by TCP phase → `{ phase, packetCount, duration }`.
- Level 3 (Outcome): Single event per flow → `{ outcome }` (e.g., `COMPLETE_GRACEFUL`).
Flag Classification Pipeline (Level 1):
- `flow-list-loader.js` parses `packets` column → `{ flags: number (bitmask), _fromInitiator: boolean, timestamp }`
- `flow-abstractor.js:abstractToLevel1()` calls `classifyFlags(flags)` → display name (e.g., `'RST+ACK'`)
- `FLAG_TO_DSL` map converts to DSL token name (e.g., `'RST+ACK'` → `'RST_ACK'`)
- Key bitmask → DSL mappings: `0x02` → `SYN`, `0x12` → `SYN_ACK`, `0x10` → `ACK`, `0x18` → `PSH_ACK`, `0x04` → `RST`, `0x14` → `RST_ACK`, `0x11` → `FIN_ACK`, `0x19` → `ACK_FIN_PSH`
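
The bitmask mappings listed above can be applied directly to the numeric `flags` values from the embedded packet data. This sketch collapses the two-step pipeline (`classifyFlags()` then `FLAG_TO_DSL`) into one lookup for brevity; `BITMASK_TO_DSL`, `toDslTokens`, and the `'OTHER'` placeholder for unlisted values are hypothetical.

```javascript
// Bitmask -> DSL token, copied from the mappings listed above.
const BITMASK_TO_DSL = new Map([
  [0x02, 'SYN'],
  [0x12, 'SYN_ACK'],
  [0x10, 'ACK'],
  [0x18, 'PSH_ACK'],
  [0x04, 'RST'],
  [0x14, 'RST_ACK'],
  [0x11, 'FIN_ACK'],
  [0x19, 'ACK_FIN_PSH'],
]);

// Unlisted bitmasks map to 'OTHER' here; that placeholder name is an assumption.
function toDslTokens(flagValues) {
  return flagValues.map((flags) => BITMASK_TO_DSL.get(flags) || 'OTHER');
}

// Flags from the sample CSV row "0:2:1,159:18:0,578:16:1".
console.log(toDslTokens([2, 18, 16])); // prints [ 'SYN', 'SYN_ACK', 'ACK' ]
```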
Match behavior (`matchPattern()` in `pattern-language.js:584-618`):
- Scans all starting positions in the abstracted sequence
- Reports first match only per flow (`break` at line 603)
- Match region `{start, end}` points to positions in the abstracted sequence
"Select IPs" behavior (selectMatchedIPsInSidebar() in tcp-analysis.js):
- Additive only: checks matched IPs that aren't already checked; never unchecks existing IPs
- Tracks newly added IPs in `state.search.newlyAddedIPs` (Set)
- Newly added IP labels get the `.newly-added` CSS class (gold `#f1c40f`, bold), applied via `applySearchHighlightClasses()`
- Newly added multi-pair IPs are auto-collapsed (added to `state.layout.collapsedIPs`)
- Golden highlights persist until: `clearPatternSearch()`, new search started, or manual IP checkbox change
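The additive rule can be illustrated with plain Sets. This is a sketch under assumptions, not the body of `selectMatchedIPsInSidebar()`: the function name `addMatchedIPs` and the use of a Set for the checked IPs are hypothetical, and DOM updates are left out.

```javascript
// Add matched IPs to the checked set without ever removing anything.
// Returns only the IPs that were not checked before (candidates for .newly-added).
function addMatchedIPs(checkedIPs, matchedIPs) {
  const newlyAdded = new Set();
  for (const ip of matchedIPs) {
    if (!checkedIPs.has(ip)) {
      checkedIPs.add(ip);
      newlyAdded.add(ip);
    }
  }
  return newlyAdded;
}

const checked = new Set(['10.0.0.1']);
const added = addMatchedIPs(checked, ['10.0.0.1', '10.0.0.2']);
console.log([...added]);   // [ '10.0.0.2' ]
console.log(checked.size); // 2
```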
Important — .node-label elements live in svg, not mainGroup:
IP labels (.node-label) are appended by svgSetup.js as children of .node groups directly under the <svg> element. The mainGroup is a separate clipped <g> for chart marks. When selecting .node-label elements, use d3.select(svg.node()).selectAll('.node-label') or traverse up via mainGroup.node().closest('svg').
Pattern Builder Popup (src/ui/pattern-builder-popup.js, pattern-builder-popup.css):
- Visual block-based pattern builder with drag-and-drop colored pills
- Flag palette: colored buttons for each TCP flag type (SYN, SYN+ACK, ACK, PSH+ACK, ACK+FIN+PSH, FIN+ACK, RST, RST+ACK)
- Symbols row (`.pb-symbols`): separate row below the flag palette with:
  - `^` button: adds start anchor block
  - `$` button: adds end anchor block
  - `·` (wildcard): adds a `.` wildcard block (moved here from flag palette)
  - `( | )` button: opens the group creator dialog
- Group creator (`_openGroupCreator()`): modal dialog for building `(A | B)` alternation groups:
  - 2-4 alternative rows, each with a mini flag palette and removable pills
  - "Add Alternative" button (if < 4 rows), "Create" and "Cancel" buttons
  - Produces a `GROUP` block with `alternatives: PatternBlock[][]`
- Block-to-DSL conversion (`src/search/blocks-to-dsl.js`): Serializes `PatternBlock[]` to a DSL string. Anchors (`^`, `$`) emit raw characters. Groups emit `(alt1 | alt2)` syntax.
- AST-to-block conversion (`src/search/pattern-ast-to-blocks.js`): Converts parsed AST back to visual blocks for preset loading. Handles `StartAnchor`, `EndAnchor`, `Group`, `Wildcard`, negation, quantifiers, and constraints.
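A rough sketch of the block-to-DSL direction, for orientation only. Only the `GROUP` type and its `alternatives: PatternBlock[][]` field come from the description above; the other block types (`FLAG`, `WILDCARD`, `START_ANCHOR`, `END_ANCHOR`) and the `token` and `quantifier` fields are hypothetical, and negation and constraints are not handled.

```javascript
// Serialize one hypothetical PatternBlock to its DSL fragment.
function blockToDsl(block) {
  const q = block.quantifier ?? '';
  switch (block.type) {
    case 'START_ANCHOR':
      return '^';
    case 'END_ANCHOR':
      return '$';
    case 'WILDCARD':
      return '.' + q;
    case 'FLAG':
      return block.token + q;
    case 'GROUP':
      // Each alternative is itself a PatternBlock[]; join alternatives with " | ".
      return '(' + block.alternatives.map((alt) => blocksToDslSketch(alt)).join(' | ') + ')' + q;
    default:
      throw new Error(`Unknown block type: ${block.type}`);
  }
}

// Serialize a block list; elements are chained with the DSL arrow.
function blocksToDslSketch(blocks) {
  return blocks.map(blockToDsl).join(' -> ');
}

const abortiveBlocks = [
  { type: 'FLAG', token: 'SYN' },
  { type: 'FLAG', token: 'SYN_ACK' },
  { type: 'FLAG', token: 'ACK' },
  { type: 'WILDCARD', quantifier: '{1,}' },
  {
    type: 'GROUP',
    alternatives: [[{ type: 'FLAG', token: 'RST' }], [{ type: 'FLAG', token: 'RST_ACK' }]],
  },
  { type: 'END_ANCHOR' },
];
console.log(blocksToDslSketch(abortiveBlocks));
// SYN -> SYN_ACK -> ACK -> .{1,} -> (RST | RST_ACK) -> $
```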
Current Level 1 Presets (src/search/pattern-presets.js):
| Preset | Pattern | Notes |
|---|---|---|
| Full Graceful Close | `SYN -> SYN_ACK -> ACK -> .{1,} -> (FIN_ACK \| ACK_FIN_PSH) -> ACK{1,} -> $` | Complete TCP lifecycle. 99+% match rate for graceful flows. |
| Full Abortive Connection | `SYN -> SYN_ACK -> ACK -> .{1,} -> (RST \| RST_ACK) -> $` | Handshake + data + RST. Matches both RST (0x04) and RST_ACK (0x14). |
| SYN Retransmit (no SYN+ACK) | `SYN -> SYN` | Consecutive SYNs without server response. |
| RST During Handshake | `SYN -> SYN_ACK -> RST` | RST after SYN_ACK but before ACK. |
| SYN Rejected (RST+ACK) | `SYN -> RST_ACK` | Server immediately rejects with RST+ACK. |
| SYN+ACK Retransmit | `SYN_ACK -> SYN_ACK` | Repeated SYN_ACK (server retry). |
| SYN Flood (5+ SYNs) | `SYN{5,}` | 5 or more consecutive SYN packets. |
| RST Flood (3+ consecutive) | `RST{3,}` | 3 or more consecutive RST packets. |
Full Graceful Close pattern details:
The `.{1,}` wildcard bridges the variable-length data phase; greedy backtracking finds the last FIN-bearing packet before the final ACKs. The `(FIN_ACK | ACK_FIN_PSH)` disjunction covers both standard FIN (95.1%) and piggybacked FIN-on-data (4.9%). `ACK{1,}` handles one or more trailing acknowledgments. The `$` end anchor ensures the graceful close is the last thing in the flow. Without it, abortive flows containing a mid-flow `FIN_ACK` → `ACK` before a trailing RST would produce false positives.
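The effect of the `$` anchor can be seen with an analogy in plain JavaScript `RegExp`. This is not the DSL engine: each flow is encoded as space-terminated tokens, the regex is a hand translation of the preset, and both sample flows are made up for illustration.

```javascript
// Encode a token list as "SYN SYN_ACK ACK " (every token followed by one space).
const encode = (tokens) => tokens.map((t) => t + ' ').join('');

// Hand-translated body of the Full Graceful Close preset.
// (?<!\S) keeps the match from starting in the middle of a token.
const BODY = '(?<!\\S)SYN SYN_ACK ACK (?:\\S+ )+(?:FIN_ACK|ACK_FIN_PSH) (?:ACK )+';
const withEndAnchor = new RegExp(BODY + '$');
const withoutEndAnchor = new RegExp(BODY);

const gracefulFlow = encode(['SYN', 'SYN_ACK', 'ACK', 'PSH_ACK', 'ACK', 'FIN_ACK', 'ACK']);
const abortiveFlow = encode([
  'SYN', 'SYN_ACK', 'ACK', 'PSH_ACK', 'ACK', 'FIN_ACK', 'ACK', 'PSH_ACK', 'RST',
]);

console.log(withEndAnchor.test(gracefulFlow));    // true
console.log(withoutEndAnchor.test(abortiveFlow)); // true  (mid-flow FIN_ACK, ACK matches)
console.log(withEndAnchor.test(abortiveFlow));    // false (trailing RST blocks the anchor)
```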
Full Abortive Connection pattern details:
Uses (RST | RST_ACK) because classifyFlags() in flags.js maps 0x04 → RST and 0x14 → RST_ACK as distinct DSL tokens, while the Python flow detector (tcp_data_loader_streaming_by_ip_pair.py) treats both as closeType = 'abortive'. Without the disjunction, flows ending with RST+ACK would be missed.
Level 1 pattern search results will NOT match overview chart legend counts exactly. This is a systemic architectural difference, not a bug.
Root cause — two different classification systems:
- Overview chart uses pre-aggregated flow bins (`flow_bins_*.json`) generated by `generate_flow_bins_v3.py`, which reads `closeType` and `invalidReason` from the Python flow detector's state-based classification. The Python detector (`tcp_data_loader_streaming_by_ip_pair.py:322-330`) classifies by TCP state machine:
  - Any RST after handshake → `closeType = 'abortive'` (regardless of exact flag combo)
  - Any RST during establishing → `invalidReason = 'rst_during_handshake'` (regardless of packet order)

- Pattern search uses Level 1 DSL matching against the actual packet sequence from flow_list CSVs via `flow-abstractor.js`. It requires exact packet-level adjacency: `SYN -> SYN_ACK -> RST` means precisely those three flags in that order with nothing between them.
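The difference between the two systems can be illustrated with two deliberately simplified stand-ins. Neither function reproduces the Python detector or the DSL engine; `classifyByState` and `containsRun` are hypothetical, and "handshake complete" is approximated as having seen SYN, SYN_ACK, and ACK in any order.

```javascript
const isRst = (t) => t === 'RST' || t === 'RST_ACK';

// State-style classification: order-insensitive, any RST variant counts.
function classifyByState(tokens) {
  const seen = new Set();
  for (const t of tokens) {
    if (isRst(t)) {
      const established = seen.has('SYN') && seen.has('SYN_ACK') && seen.has('ACK');
      return established ? 'abortive' : 'rst_during_handshake';
    }
    seen.add(t);
  }
  return 'other';
}

// Adjacency-style matching: the exact run must appear contiguously.
function containsRun(tokens, run) {
  for (let i = 0; i + run.length <= tokens.length; i++) {
    if (run.every((t, j) => tokens[i + j] === t)) return true;
  }
  return false;
}

const reversedHandshake = ['SYN_ACK', 'SYN', 'RST'];
console.log(classifyByState(reversedHandshake));                       // rst_during_handshake
console.log(containsRun(reversedHandshake, ['SYN', 'SYN_ACK', 'RST'])); // false
```

The reversed-handshake flow lands in the state-based category but is missed by the strict `SYN -> SYN_ACK -> RST` run, which is the kind of gap quantified in the table that follows.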
Specific mismatches verified by diagnostic scripts (scripts/diagnose_*.js):
| Overview Category | Overview Count (all IPs) | Pattern Match | Gap | Main Causes |
|---|---|---|---|---|
| `abortive` | 29,788 | 29,618 | 170 | Reversed handshake (SYN_ACK before SYN): 145; RST not last: 5; missing SYN: 6 |
| `rst_during_handshake` | 451,671 | 405,309 | 46,362 | `SYN -> RST_ACK` (45,007) counted as rst_during_hs by Python but matched by separate "SYN Rejected" preset |
Key flag classification detail (src/tcp/flags.js:classifyFlags()):
RST(flags0x04) andRST+ACK(flags0x14) are distinct DSL tokens (RSTvsRST_ACK)- The Python detector does not distinguish them — both trigger the same state transition
- Patterns must use
(RST | RST_ACK)to match both variants
Common packet sequence anomalies that cause mismatches:
- Reversed handshake: `SYN_ACK -> SYN -> ACK` instead of `SYN -> SYN_ACK -> ACK` (capture timing/reordering). ~85% of abortive mismatches.
- SYN retransmits: `SYN -> SYN_ACK -> SYN -> SYN_ACK -> ACK` (duplicate handshake packets break strict adjacency)
- RST not terminal: `... -> RST -> FIN_ACK` (trailing packet after RST breaks the `$` anchor)
- Broader Python categories: `rst_during_handshake` includes both `SYN -> SYN_ACK -> RST` and `SYN -> RST_ACK`; these are separate presets in Level 1
Implication for future work:
- Level 3 (Outcome) presets would match overview counts 1:1 since they use the same `closeType`/`invalidReason` fields via `flowToOutcome()` in `flags.js`
- Level 1 presets are for precise packet-sequence analysis, not for reproducing overview totals
- To match overview exactly at Level 1, composite patterns would be needed: e.g., `(SYN -> SYN_ACK -> RST) | (SYN -> RST_ACK) | (SYN -> RST) | (SYN_ACK -> SYN -> RST)`
src/rendering/highlightUtils.js provides shared hover highlight functions used by both timearcs (arcInteractions.js) and force layout (force_network.js):
- `highlightHoveredLink()` / `unhighlightLinks()`: dim all links, highlight hovered
- `getLinkHighlightInfo()`: compute active IPs and attack color from link datum (handles both timearcs arc shape and force link shape)
- `highlightEndpointLabels()` / `unhighlightEndpointLabels()`: bold, enlarge, color active IP labels; dim others
- `ipFromDatum()` (internal): normalizes datum to IP string (timearcs labels bind raw strings, force layout nodes bind `{id, degree}` objects)
- `showArcArrowhead(container, pathElement, datum, color)`: renders a filled polygon arrowhead on a hovered timearcs arc, positioned at the arc midpoint
- `showLineArrowhead(container, datum, color, targetRadius, strokeWidth)`: renders a directional arrowhead on a hovered force-layout link; returns base position so the line can be trimmed to not overlap the arrow
- `removeArrowheads(container)`: cleans up arrowhead overlays on mouseout
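The datum normalization described for `ipFromDatum()` might look roughly like this. It is a guess at the shape of the logic, not the internal function itself; the name `ipFromDatumSketch` and the `null` fallback are assumptions.

```javascript
// Accept either a raw IP string (timearcs labels) or a node object
// such as { id, degree } (force layout) and return the IP string.
function ipFromDatumSketch(d) {
  if (typeof d === 'string') return d;
  if (d && typeof d === 'object' && d.id != null) return String(d.id);
  return null; // assumed fallback for unexpected datum shapes
}

console.log(ipFromDatumSketch('10.0.0.1'));                  // 10.0.0.1
console.log(ipFromDatumSketch({ id: '10.0.0.2', degree: 3 })); // 10.0.0.2
```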
Directional arrows appear on mouseover only (not permanently, to avoid clutter). Both timearcs arcs (arcInteractions.js:74-75) and force layout links (force_network.js:438-454) call these. In the TCP Analysis Packet View, circles.js draws a midpoint polygon arrowhead on the hover S-curve (no SVG <marker> — arrowhead is computed from the Bezier tangent angle at the curve midpoint).
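Computing an arrowhead from a Bezier tangent can be sketched as below. This assumes the S-curve is a single cubic Bezier with control points `p0` to `p3`, which the description does not confirm; `cubicMidpoint` and `arrowheadPoints` are hypothetical helpers, not functions from `circles.js`.

```javascript
// Point and tangent angle of a cubic Bezier at t = 0.5.
// B(0.5)  = (p0 + 3*p1 + 3*p2 + p3) / 8
// B'(0.5) = 0.75*(p1 - p0) + 1.5*(p2 - p1) + 0.75*(p3 - p2)
function cubicMidpoint(p0, p1, p2, p3) {
  const x = (p0.x + 3 * p1.x + 3 * p2.x + p3.x) / 8;
  const y = (p0.y + 3 * p1.y + 3 * p2.y + p3.y) / 8;
  const dx = 0.75 * (p1.x - p0.x) + 1.5 * (p2.x - p1.x) + 0.75 * (p3.x - p2.x);
  const dy = 0.75 * (p1.y - p0.y) + 1.5 * (p2.y - p1.y) + 0.75 * (p3.y - p2.y);
  return { x, y, angle: Math.atan2(dy, dx) };
}

// Triangle whose tip points along the tangent; the base is centered on the midpoint.
// Returns a string usable as the "points" attribute of an SVG <polygon>.
function arrowheadPoints({ x, y, angle }, size = 6) {
  const cos = Math.cos(angle);
  const sin = Math.sin(angle);
  const half = size / 2;
  const tip = [x + size * cos, y + size * sin];
  const baseA = [x - half * sin, y + half * cos];
  const baseB = [x + half * sin, y - half * cos];
  return [tip, baseA, baseB].map(([px, py]) => `${px},${py}`).join(' ');
}

const mid = cubicMidpoint({ x: 0, y: 0 }, { x: 10, y: 0 }, { x: 0, y: 10 }, { x: 10, y: 10 });
console.log(mid.x, mid.y);          // 5 5
console.log(arrowheadPoints(mid));  // polygon points for a tip pointing along +y
```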
The Control Panel (control-panel.js) is a position: fixed aside with drag-to-move and click-to-collapse behavior:
- Drag handle: Title bar at top; click to collapse/expand, drag to reposition
- Zoom controls bar: Positioned above the Control Panel via `position: absolute; bottom: 100%`. Contains resolution dropdown, current resolution indicator badge, and zoom +/- buttons. Stays visible when panel is collapsed. Moves with the panel on drag.
- Controls body: Scrollable area with IP selection, TCP flags, legends, flow visualization options
- Panel uses `overflow: visible` so the zoom bar (absolutely positioned above) is not clipped
The fisheye lens effect (src/plugins/d3-fisheye.js, wrapped by src/scales/distortion.js) provides overview+detail zooming. Controlled by the "Lensing" toggle and zoom slider in the UI.
- Binning: Reduces millions of packets to thousands of bins
- Web Worker: Packet filtering runs off main thread
- Layer caching: Full-domain layer pre-rendered
- Batch processing: Flow reconstruction and list rendering use configurable batch sizes
- LRU Cache: `resolution-manager.js` caches loaded detail chunks with automatic eviction
- Multi-resolution loading: Zoom-level dependent data loading (overview → detail)
- IP-pair organization (v3): Chunks organized by IP pair enable efficient filtering—only load chunks for selected IP pairs instead of scanning all chunks
- Adaptive overview resolution: Coarse bins for full view, fine bins when zoomed (Overview Bar chart)
- Lazy flow list loading: CSV files only loaded when user clicks Overview Bar chart bars
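As an illustration of the LRU idea in the list above, here is a generic Map-based cache. It is not the implementation in `resolution-manager.js`; the class name `LruCacheSketch`, its capacity parameter, and the chunk keys in the usage lines are hypothetical.

```javascript
// A Map iterates in insertion order, so the first key is the least recently used
// as long as every access re-inserts the entry.
class LruCacheSketch {
  constructor(capacity) {
    this.capacity = capacity;
    this.map = new Map();
  }

  has(key) {
    return this.map.has(key);
  }

  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    this.map.delete(key); // move to the most-recently-used position
    this.map.set(key, value);
    return value;
  }

  set(key, value) {
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.capacity) {
      const oldestKey = this.map.keys().next().value;
      this.map.delete(oldestKey); // evict the least recently used entry
    }
  }
}

const chunkCache = new LruCacheSketch(2);
chunkCache.set('chunk-a', [1, 2]);
chunkCache.set('chunk-b', [3, 4]);
chunkCache.get('chunk-a');         // touch chunk-a so chunk-b becomes the oldest
chunkCache.set('chunk-c', [5, 6]); // evicts chunk-b
console.log(chunkCache.has('chunk-b')); // false
```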
Main files import heavily from /src:
- Rendering: `bars.js`, `circles.js`, `arcPath.js`, `rows.js`, `tooltip.js`, `arcInteractions.js`, `highlightUtils.js`, `svgSetup.js`
- Data: `binning.js` (visible packets, bar width), `flowReconstruction.js`, `csvParser.js`, `aggregation.js`, `resolution-manager.js`, `csv-resolution-manager.js`, `data-source.js`, `component-loader.js`, `initialRender.js`
- Layout: `forceSimulation.js`, `force_network.js`, `timearcs_layout.js`
- Interaction: `zoom.js`, `arcInteractions.js`, `dragReorder.js`, `resize.js`
- Scales: `scaleFactory.js`, `distortion.js`, `bifocal.js`
- Ground Truth: `groundTruth.js`
- Utils: `formatters.js` (byte/timestamp formatting), `helpers.js`
- UI: `legend.js`, `bifocal-handles.js`, `loading-indicator.js`, `pattern-search-panel.js`, `pattern-builder-popup.js`
- Search: `pattern-language.js` (DSL), `pattern-search-engine.js`, `pattern-presets.js`, `flow-abstractor.js`, `search-results.js`
- Config: `constants.js` (colors, sizes, debug flags)
The timearcs_source/ directory contains the original TimeArcs implementation for political blog analysis (unrelated to the network traffic visualization).