--- layout: default title: "Viz and Debug" parent: "Utility Function" nav_order: 2 --- # Visualization and Debugging Similar to LLM wrappers, we **don't** provide built-in visualization and debugging. Here, we recommend some *minimal* (and incomplete) implementations These examples can serve as a starting point for your own tooling. ## 1. Visualization with Mermaid This code recursively traverses the nested graph, assigns unique IDs to each node, and treats Flow nodes as subgraphs to generate Mermaid syntax for a hierarchical visualization. {% raw %} ```python def build_mermaid(start): ids, visited, lines = {}, set(), ["graph LR"] ctr = 1 def get_id(n): nonlocal ctr return ids[n] if n in ids else (ids.setdefault(n, f"N{ctr}"), (ctr := ctr + 1))[0] def link(a, b): lines.append(f" {a} --> {b}") def walk(node, parent=None): if node in visited: return parent and link(parent, get_id(node)) visited.add(node) if isinstance(node, Flow): node.start_node and parent and link(parent, get_id(node.start_node)) lines.append(f"\n subgraph sub_flow_{get_id(node)}[{type(node).__name__}]") node.start_node and walk(node.start_node) for nxt in node.successors.values(): node.start_node and walk(nxt, get_id(node.start_node)) or (parent and link(parent, get_id(nxt))) or walk(nxt) lines.append(" end\n") else: lines.append(f" {(nid := get_id(node))}['{type(node).__name__}']") parent and link(parent, nid) [walk(nxt, nid) for nxt in node.successors.values()] walk(start) return "\n".join(lines) ``` {% endraw %} For example, suppose we have a complex Flow for data science: ```python class DataPrepBatchNode(BatchNode): def prep(self,shared): return [] class ValidateDataNode(Node): pass class FeatureExtractionNode(Node): pass class TrainModelNode(Node): pass class EvaluateModelNode(Node): pass class ModelFlow(Flow): pass class DataScienceFlow(Flow):pass feature_node = FeatureExtractionNode() train_node = TrainModelNode() evaluate_node = EvaluateModelNode() feature_node >> train_node >> evaluate_node model_flow = ModelFlow(start=feature_node) data_prep_node = DataPrepBatchNode() validate_node = ValidateDataNode() data_prep_node >> validate_node >> model_flow data_science_flow = DataScienceFlow(start=data_prep_node) result = build_mermaid(start=data_science_flow) ``` The code generates a Mermaid diagram: ```mermaid graph LR subgraph sub_flow_N1[DataScienceFlow] N2['DataPrepBatchNode'] N3['ValidateDataNode'] N2 --> N3 N3 --> N4 subgraph sub_flow_N5[ModelFlow] N4['FeatureExtractionNode'] N6['TrainModelNode'] N4 --> N6 N7['EvaluateModelNode'] N6 --> N7 end end ``` ## 2. Interactive D3.js Visualization For more complex flows, a static diagram may not be sufficient. We provide a D3.js-based interactive visualization that allows for dragging nodes, showing group boundaries for flows, and connecting flows at their boundaries. ### Converting Flow to JSON First, we convert the PocketFlow graph to JSON format suitable for D3.js: ```python def flow_to_json(start): """Convert a flow to JSON format suitable for D3.js visualization. This function walks through the flow graph and builds a structure with: - nodes: All non-Flow nodes with their group memberships - links: Connections between nodes within the same group - group_links: Connections between different groups (for inter-flow connections) - flows: Flow information for group labeling """ nodes = [] links = [] group_links = [] # For connections between groups (Flow to Flow) ids = {} node_types = {} flow_nodes = {} # Keep track of flow nodes ctr = 1 def get_id(n): nonlocal ctr if n not in ids: ids[n] = ctr node_types[ctr] = type(n).__name__ if isinstance(n, Flow): flow_nodes[ctr] = n # Store flow reference ctr += 1 return ids[n] def walk(node, parent=None, group=None, parent_group=None, action=None): # Traverse the flow graph recursively # ...implementation details... # Start the traversal walk(start) # Post-processing: Generate group links based on node connections between different groups node_groups = {n["id"]: n["group"] for n in nodes} filtered_links = [] for link in links: source_id = link["source"] target_id = link["target"] source_group = node_groups.get(source_id, 0) target_group = node_groups.get(target_id, 0) # If source and target are in different groups and both groups are valid if source_group != target_group and source_group > 0 and target_group > 0: # Add to group links if not already there # Skip adding this link to filtered_links - we don't want direct node connections across groups else: # Keep links within the same group filtered_links.append(link) return { "nodes": nodes, "links": filtered_links, # Use filtered links instead of all links "group_links": group_links, "flows": {str(k): v.__class__.__name__ for k, v in flow_nodes.items()}, } ``` ### Creating the Visualization Then, we generate an HTML file with D3.js visualization: ```python def create_d3_visualization(json_data, output_dir="./viz", filename="flow_viz"): """Create a D3.js visualization from JSON data. This generates an HTML file with an interactive visualization where: - Nodes are represented as circles - Flows are shown as dashed rectangles (groups) - Inter-group connections are shown as dashed lines connecting at group boundaries - Node and group labels are displayed - Nodes can be dragged to reorganize the layout """ # Create output directory os.makedirs(output_dir, exist_ok=True) # Save JSON data to file json_path = os.path.join(output_dir, f"{filename}.json") with open(json_path, "w") as f: json.dump(json_data, f, indent=2) # Generate HTML with D3.js visualization # ...HTML template with D3.js code... # Write HTML to file html_path = os.path.join(output_dir, f"{filename}.html") with open(html_path, "w") as f: f.write(html_content) print(f"Visualization created at {html_path}") return html_path ``` ### Convenience Function A convenience function to visualize flows: ```python def visualize_flow(flow, flow_name): """Helper function to visualize a flow with both mermaid and D3.js""" print(f"\n--- {flow_name} Mermaid Diagram ---") print(build_mermaid(start=flow)) print(f"\n--- {flow_name} D3.js Visualization ---") json_data = flow_to_json(flow) create_d3_visualization( json_data, filename=f"{flow_name.lower().replace(' ', '_')}" ) ``` ### Usage Example ```python from visualize import visualize_flow # Create a complex flow with nested subflows # ...flow definition... # Generate visualization visualize_flow(data_science_flow, "Data Science Flow") ``` This generates: 1. A mermaid diagram in the console 2. A JSON file with the flow structure 3. An HTML file with the interactive D3.js visualization The D3.js visualization offers several advantages: - **Interactivity**: Nodes can be dragged to reorganize the layout - **Group visualization**: Flows are shown as groups with their own boundaries - **Inter-group connections**: Links between groups connect at boundaries for cleaner visualization - **Action labels**: Edge labels show transition actions ## 3. Call Stack Debugging It would be useful to print the Node call stacks for debugging. This can be achieved by inspecting the runtime call stack: ```python import inspect def get_node_call_stack(): stack = inspect.stack() node_names = [] seen_ids = set() for frame_info in stack[1:]: local_vars = frame_info.frame.f_locals if 'self' in local_vars: caller_self = local_vars['self'] if isinstance(caller_self, BaseNode) and id(caller_self) not in seen_ids: seen_ids.add(id(caller_self)) node_names.append(type(caller_self).__name__) return node_names ``` For example, suppose we have a complex Flow for data science: ```python class DataPrepBatchNode(BatchNode): def prep(self, shared): return [] class ValidateDataNode(Node): pass class FeatureExtractionNode(Node): pass class TrainModelNode(Node): pass class EvaluateModelNode(Node): def prep(self, shared): stack = get_node_call_stack() print("Call stack:", stack) class ModelFlow(Flow): pass class DataScienceFlow(Flow):pass feature_node = FeatureExtractionNode() train_node = TrainModelNode() evaluate_node = EvaluateModelNode() feature_node >> train_node >> evaluate_node model_flow = ModelFlow(start=feature_node) data_prep_node = DataPrepBatchNode() validate_node = ValidateDataNode() data_prep_node >> validate_node >> model_flow data_science_flow = DataScienceFlow(start=data_prep_node) data_science_flow.run({}) ``` The output would be: `Call stack: ['EvaluateModelNode', 'ModelFlow', 'DataScienceFlow']`