Loom and Graphs in Clojure

Post on 11-May-2015

description

Graph algorithms are cool and fascinating. We'll look at a graph algorithms and visualization library, Loom, which is written in Clojure. We will discuss the graph API, look at implementation of the algorithms and learn about the integration of Loom with Titanium, which allows us to run the algorithms on and visualize data in graph databases.

Transcript of Loom and Graphs in Clojure

Loom and Graphs in Clojure

github.com/aysylu/loom

Aysylu Greenberg @aysylu22; http://aysy.lu

LispNYC, August 13th 2013

Overview

•  Loom's Graph API

•  Graph Algorithms in Loom

•  Titanium Loom

•  Static Single Assignment (SSA) Loom

Loom's Graph API

•  Graph, Digraph, Weighted Graph
•  FlyGraph
   o  read-only, ad-hoc
   o  edges from nodes + successors
   o  nodes and edges from successors + start
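The two FlyGraph bullets can be illustrated without Loom itself. The sketch below, in plain Clojure, derives an edge list from a node collection plus a successors function, and discovers the node set from a successors function plus a start node. The names edges-from and nodes-from are hypothetical helpers invented for this illustration; they are not Loom's API.

```clojure
;; Hypothetical helpers (not Loom's API): derive graph structure on the fly.

(defn edges-from
  "Edges as [n1 n2] pairs, computed from a node collection and a successors fn."
  [nodes successors]
  (for [n nodes
        s (successors n)]
    [n s]))

(defn nodes-from
  "Set of nodes reachable from start, discovered by following successors."
  [successors start]
  (loop [seen #{start}
         frontier [start]]
    (if (empty? frontier)
      seen
      (let [fresh (->> frontier
                       (mapcat successors)
                       (remove seen)
                       distinct)]
        (recur (into seen fresh) (vec fresh))))))

;; Any function of one node works as a successors fn; a map is the simplest one.
(def succ {:a [:b :c], :b [:d], :c [:d], :d []})

(edges-from [:a :b :c :d] succ)
(nodes-from succ :a)
```

Nothing is stored beyond the successors function, which is what makes such a graph read-only and ad hoc.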

Loom's Graph API

•  Uses Clojure protocols (clojure.org/protocols)
   o  specification only, no implementation
   o  single type can implement multiple protocols
   o  interfaces: design-time choice of the type author; protocols: can be added to a type at runtime
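The protocol properties listed above can be seen in a few lines of plain Clojure. This is a minimal sketch; Shape, Named and Circle are made-up names for illustration and have nothing to do with Loom.

```clojure
;; Illustrative names only: Shape, Named and Circle are not from Loom.

;; A protocol is a specification: method names, arities and docstrings.
(defprotocol Shape
  (area [this] "Return the area of the shape."))

(defprotocol Named
  (label [this] "Return a human-readable name."))

;; A single type implementing multiple protocols.
(defrecord Circle [r]
  Shape
  (area [_] (* Math/PI r r))
  Named
  (label [_] "circle"))

;; A protocol added to an existing type (java.lang.String) after the fact,
;; without touching that type's definition.
(extend-type String
  Named
  (label [s] (str "string:" s)))

(label (->Circle 1.0))
(label "abc")
(satisfies? Shape "abc")
```

A Java interface would have had to be named by the author of String; the protocol is attached from the outside.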

Loom's Graph API

(defprotocol Graph
  (add-nodes* [g nodes] "Add nodes to graph g. See add-nodes")
  (add-edges* [g edges] "Add edges to graph g. See add-edges")
  (remove-nodes* [g nodes] "Remove nodes from graph g. See remove-nodes")
  (remove-edges* [g edges] "Removes edges from graph g. See remove-edges")
  (remove-all [g] "Removes all nodes and edges from graph g")
  (nodes [g] "Return a collection of the nodes in graph g")
  (edges [g] "Edges in g. May return each edge twice in an undirected graph")
  (has-node? [g node] "Return true when node is in g")
  (has-edge? [g n1 n2] "Return true when edge [n1 n2] is in g")
  (successors [g] [g node] "Return direct successors of node, or (partial successors g)")
  (out-degree [g node] "Return the number of direct successors of node"))
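To show how a type might satisfy such a protocol, here is a small sketch over an adjacency map. MiniGraph is a cut-down, hypothetical stand-in that copies four method signatures from the slide so the example runs without Loom; AdjGraph is likewise an invented name.

```clojure
;; MiniGraph is a reduced, hypothetical copy of the Graph protocol above.
(defprotocol MiniGraph
  (nodes [g] "Return a collection of the nodes in graph g")
  (has-node? [g node] "Return true when node is in g")
  (successors [g] [g node] "Return direct successors of node, or (partial successors g)")
  (out-degree [g node] "Return the number of direct successors of node"))

;; An adjacency-map graph: {node #{successor ...}}.
(defrecord AdjGraph [adj]
  MiniGraph
  (nodes [_] (set (keys adj)))
  (has-node? [_ node] (contains? adj node))
  ;; One-argument arity returns a function of a node, as the docstring says.
  (successors [g] (partial successors g))
  (successors [_ node] (get adj node #{}))
  (out-degree [g node] (count (successors g node))))

(def g (->AdjGraph {:a #{:b :c}, :b #{:c}, :c #{}}))

(successors g :a)
(map (successors g) [:a :b :c])
(out-degree g :a)
```

The one-argument arity is convenient when an algorithm only wants a successors function and should not care what kind of graph produced it.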

Loom's Graph API

(defprotocol Digraph
  (predecessors [g] [g node] "Return direct predecessors of node, or (partial predecessors g)")
  (in-degree [g node] "Return the number of direct predecessors to node")
  (transpose [g] "Return a graph with all edges reversed"))

Loom's Graph API

(defprotocol WeightedGraph
  (weight [g] [g n1 n2] "Return weight of edge [n1 n2] or (partial weight g)"))

Graph Algorithms in Loom

•  DFS/BFS (+ bidirectional)
•  Topological Sort
•  Single Source Shortest Path (Dijkstra, Bellman-Ford)
•  Strongly Connected Components (Kosaraju)
•  Density (edges/nodes)
•  Loner nodes
•  2 coloring
•  Max-Flow (Edmonds-Karp)
•  alg-generic requires only successors + start (where appropriate)
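The last bullet, algorithms that need only a successors function and a start node, can be sketched with a breadth-first traversal in plain Clojure. This is an illustration of the idea, not the code in Loom's alg-generic namespace; bfs-order is a hypothetical name.

```clojure
;; Hypothetical sketch, not Loom's implementation.
(defn bfs-order
  "Nodes in breadth-first order from start, given only a successors fn."
  [successors start]
  (loop [queue (conj clojure.lang.PersistentQueue/EMPTY start)
         seen #{start}
         order []]
    (if (empty? queue)
      order
      (let [node (peek queue)
            ;; Successors that have not been queued before.
            fresh (remove seen (distinct (successors node)))]
        (recur (into (pop queue) fresh)
               (into seen fresh)
               (conj order node))))))

;; A map works as a successors fn...
(bfs-order {:a [:b :c], :b [:d], :c [:d], :d [:a]} :a)

;; ...and so does an ordinary function: an implicit binary tree on 1..7.
(bfs-order (fn [n] (filter #(<= % 7) [(* 2 n) (inc (* 2 n))])) 1)
```

In the second call no graph data structure exists at all; the traversal only ever asks for the successors of the node it is visiting.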

Graph Algorithms: Bellman-Ford

[Figure: example weighted graph with nodes A, B, C, D, E and edge weights 3, 4, 5, 2, -8]

Graph Algorithms: Bellman-Ford

CLRS Introduction to Algorithms

Graph Algorithms: Bellman-Ford

(defn- init-estimates
  "Initializes path cost estimates and paths from source to all vertices, for
  Bellman-Ford algorithm"
  [graph start]
  (let [nodes (disj (nodes graph) start)
        path-costs {start 0}
        paths {start nil}
        infinities (repeat Double/POSITIVE_INFINITY)
        nils (repeat nil)
        init-costs (interleave nodes infinities)
        init-paths (interleave nodes nils)]
    [(apply assoc path-costs init-costs)
     (apply assoc paths init-paths)]))

Graph Algorithms: Bellman-Ford

(defn- can-relax-edge?
  "Test for whether we can improve the shortest path to v found so far by
  going through u."
  [[u v :as edge] weight costs]
  (let [vd (get costs v)
        ud (get costs u)
        sum (+ ud weight)]
    (> vd sum)))

Graph Algorithms: Bellman-Ford

(defn- relax-edge
  "If there's a shorter path from s to v via u, update our map of
  estimated path costs and map of paths from source to vertex v"
  [[u v :as edge] weight [costs paths :as estimates]]
  (let [ud (get costs u)
        sum (+ ud weight)]
    (if (can-relax-edge? edge weight costs)
      [(assoc costs v sum) (assoc paths v u)]
      estimates)))

Graph Algorithms: Bellman-Ford

(defn- relax-edges
  "Performs edge relaxation on all edges in weighted directed graph"
  [g start estimates]
  (->> (edges g)
       (reduce (fn [estimates [u v :as edge]]
                 (relax-edge edge (wt g u v) estimates))
               estimates)))

Graph Algorithms: Bellman-Ford

(defn bellman-ford
  "Given a weighted, directed graph G = (V, E) with source start,
  the Bellman-Ford algorithm produces map of single source shortest
  paths and their costs if no negative-weight cycle that is reachable
  from the source exists, and false otherwise, indicating that no
  solution exists."
  [g start]
  (let [initial-estimates (init-estimates g start)
        ;relax-edges is calculated for all edges V-1 times
        [costs paths] (reduce (fn [estimates _]
                                (relax-edges g start estimates))
                              initial-estimates
                              (-> g nodes count dec range))
        edges (edges g)]
    (if (some (fn [[u v :as edge]]
                (can-relax-edge? edge (wt g u v) costs))
              edges)
      false
      [costs
       (->> (keys paths)
            ;remove vertices that are unreachable from source
            (remove #(= Double/POSITIVE_INFINITY (get costs %)))
            (reduce
              (fn [final-paths v]
                (assoc final-paths v
                       ; follows the parent pointers
                       ; to construct path from source to node v
                       (loop [node v
                              path ()]
                         (if node
                           (recur (get paths node) (cons node path))
                           path))))
              {}))])))

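For readers who want to run the relaxation idea without Loom, here is a compact sketch that keeps only the cost map and drops path reconstruction. It assumes the graph is a plain map from [u v] edge pairs to weights, standing in for (edges g) and the weight lookup; bellman-ford-costs is a hypothetical name, and the example graphs are made up for this sketch rather than taken from the slides.

```clojure
;; Assumption: a graph is a plain map {[u v] weight}; this is not Loom's code.
(defn bellman-ford-costs
  "Shortest-path costs from start, or false when a negative-weight cycle
  is reachable from start."
  [edge-weights start]
  (let [nodes (conj (set (mapcat key edge-weights)) start)
        ;; Every node starts at infinity except the source.
        init  (assoc (zipmap nodes (repeat Double/POSITIVE_INFINITY)) start 0)
        ;; Relax one edge: improve the estimate for v by going through u.
        relax (fn [costs [[u v] w]]
                (let [candidate (+ (costs u) w)]
                  (if (< candidate (costs v))
                    (assoc costs v candidate)
                    costs)))
        pass  (fn [costs] (reduce relax costs edge-weights))
        ;; |V| - 1 passes over all edges.
        costs (nth (iterate pass init) (dec (count nodes)))]
    ;; If one more pass still improves something, a negative cycle is reachable.
    (if (= costs (pass costs))
      costs
      false)))

;; Negative edge, no negative cycle.
(bellman-ford-costs {[:s :a] 4, [:s :b] 5, [:b :a] -3, [:a :t] 2} :s)

;; Cycle a -> b -> a has total weight -1.
(bellman-ford-costs {[:s :a] 1, [:a :b] -2, [:b :a] 1} :s)
```

The final comparison plays the role of the can-relax-edge? check over all edges in the function above.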

Titanium Loom

•  Titanium by Clojurewerkz (titanium.clojurewerkz.org)
•  Clojure graph library built on top of Aurelius Titan (thinkaurelius.github.com/titan)
•  Various storage backends: Cassandra, HBase, BerkeleyDB Java Edition
•  No graph visualization

Titanium Loom

(let [in-mem-graph (tg/open {"storage.backend" "inmemory"})]
  (tg/transact!
    (let [a (nodes/create! {:name "Node A"})
          b (nodes/create! {:name "Node B"})
          c (nodes/create! {:name "Node C"})
          e1 (edges/connect! a "edge A->B" b)
          e2 (edges/connect! b "edge B->C" c)
          e3 (edges/connect! c "edge C->A" a)
          graph (titanium->loom in-mem-graph)]
      (view graph))))

Titanium Loom

(defn titanium->loom
  "Converts titanium graph into Loom representation"
  ([titanium-graph & {:keys [node-fn edge-fn weight-fn]
                      :or {node-fn (nodes/get-all-vertices)
                           edge-fn (map (juxt edges/tail-vertex edges/head-vertex)
                                        (edges/get-all-edges))
                           weight-fn (constantly 1)}}]
   (let [nodes-set (set node-fn)
         edges-set (set edge-fn)]
     (reify
       Graph
       (nodes [_] nodes-set)
       (edges [_] edges-set)
       (has-node? [g node] (contains? (nodes g) node))
       (has-edge? [g n1 n2] (contains? (edges g) [n1 n2]))
       (successors [g] (partial successors g))
       (successors [g node]
         (filter (nodes g) (seq (nodes/connected-out-vertices node))))
       (out-degree [g node] (count (successors g node)))
       Digraph
       (predecessors [g] (partial predecessors g))
       (predecessors [g node]
         (filter (nodes g) (seq (nodes/connected-in-vertices node))))
       (in-degree [g node] (count (predecessors g node)))
       WeightedGraph
       (weight [g] (partial weight g))
       (weight [g n1 n2] (weight-fn n1 n2))))))

SSA Loom

•  Static Single Assignment (SSA) form produced by core.async
•  Generated by parse-to-state-machine function

SSA Loom

(parse-to-state-machine
  '[(if (> (+ x 1 2 y) 0)
      (+ x 1)
      (+ x 2))])

[inst_4938
 {:current-block 76,
  :start-block 73,
  :block-catches {76 nil, 75 nil, 74 nil, 73 nil},
  :blocks
  {76 [{:value :clojure.core.async.impl.ioc-macros/value, :id inst_4937}
       {:value inst_4937, :id inst_4938}],
   75 [{:refs [clojure.core/+ x 2], :id inst_4935}
       {:value inst_4935, :block 76, :id inst_4936}],
   74 [{:refs [clojure.core/+ x 1], :id inst_4933}
       {:value inst_4933, :block 76, :id inst_4934}],
   73 [{:refs [clojure.core/+ x 1 2 y], :id inst_4930}
       {:refs [clojure.core/> inst_4930 0], :id inst_4931}
       {:test inst_4931, :then-block 74, :else-block 75, :id inst_4932}]}}]

SSA Loom

(def ssa
  (->> (parse-to-state-machine
         '[(if (> (+ x 1 2 y) 0)
             (+ x 1)
             (+ x 2))])
       second
       :blocks))

SSA Loom

{76 [{:value :clojure.core.async.impl.ioc-macros/value, :id inst_4937}
     {:value inst_4937, :id inst_4938}],
 75 [{:refs [clojure.core/+ x 2], :id inst_4935}
     {:value inst_4935, :block 76, :id inst_4936}],
 74 [{:refs [clojure.core/+ x 1], :id inst_4933}
     {:value inst_4933, :block 76, :id inst_4934}],
 73 [{:refs [clojure.core/+ x 1 2 y], :id inst_4930}
     {:refs [clojure.core/> inst_4930 0], :id inst_4931}
     {:test inst_4931, :then-block 74, :else-block 75, :id inst_4932}]}

SSA Loom

(view (ssa->loom ssa ssa-nodes-fn ssa-edges-fn))

SSA Loom

(defn ssa->loom
  "Converts the SSA form generated by core.async into Loom representation."
  ([ssa node-fn edge-fn]
   (let [nodes (delay (node-fn ssa))
         edges (delay (edge-fn ssa))]
     {:graph (reify
               Graph
               (nodes [g] @nodes)
               (edges [g] @edges)
               (has-node? [g node] (contains? @nodes node))
               (has-edge? [g n1 n2] (contains? @edges [n1 n2]))
               (successors [g] (partial successors g))
               (successors [g node]
                 (->> @edges
                      (filter (fn [[n1 n2]] (= n1 node)))
                      (map second)))
               (out-degree [g node] (count (successors g node)))
               Digraph
               (predecessors [g] (partial predecessors g))
               (predecessors [g node]
                 (->> @edges
                      (filter (fn [[n1 n2]] (= n2 node)))
                      (map first)))
               (in-degree [g node] (count (predecessors g node))))
      :data ssa})))
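The slides do not show ssa-nodes-fn or ssa-edges-fn. As a guess at what such functions might look like, the sketch below treats every block id as a node and adds an edge for each jump target found in a block's instructions. It assumes instructions behave like the printed maps above, with targets under :block, :then-block and :else-block; block-nodes and block-edges are hypothetical names.

```clojure
;; The block map from the slides, quoted so the instruction ids stay symbols.
(def blocks
  '{76 [{:value :clojure.core.async.impl.ioc-macros/value, :id inst_4937}
        {:value inst_4937, :id inst_4938}]
    75 [{:refs [clojure.core/+ x 2], :id inst_4935}
        {:value inst_4935, :block 76, :id inst_4936}]
    74 [{:refs [clojure.core/+ x 1], :id inst_4933}
        {:value inst_4933, :block 76, :id inst_4934}]
    73 [{:refs [clojure.core/+ x 1 2 y], :id inst_4930}
        {:refs [clojure.core/> inst_4930 0], :id inst_4931}
        {:test inst_4931, :then-block 74, :else-block 75, :id inst_4932}]})

(defn block-nodes
  "Hypothetical node-fn: every block id is a node."
  [blocks]
  (set (keys blocks)))

(defn block-edges
  "Hypothetical edge-fn: an edge [from to] for each jump target
  (:block, :then-block, :else-block) mentioned in a block's instructions."
  [blocks]
  (set (for [[from instrs] blocks
             instr instrs
             k [:block :then-block :else-block]
             :let [to (get instr k)]
             :when to]
         [from to])))

(block-nodes blocks)
(block-edges blocks)
```

For the if expression above this yields a diamond: block 73 branches to 74 and 75, and both jump to 76.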

SSA Loom: Dataflow Analysis

•  For each basic block, solve system of equations until reaching fixed point:

•  Use worklist approach

SSA Loom: Dataflow Analysis

(defn dataflow-analysis
  "Performs dataflow analysis. Nodes have value nil initially."
  [& {:keys [start graph join transfer]}]
  (let [start (cond
                (set? start) start
                (coll? start) (set start)
                :else #{start})]
    (loop [out-values {}
           [node & worklist] (into clojure.lang.PersistentQueue/EMPTY start)]
      (let [in-value (join (mapv out-values (predecessors graph node)))
            out (transfer node in-value)
            update? (not= out (get out-values node))
            out-values (if update? (assoc out-values node out) out-values)
            worklist (if update? (into worklist (successors graph node)) worklist)]
        (if (seq worklist)
          (recur out-values worklist)
          out-values)))))
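The worklist approach can be tried without Loom on the diamond-shaped control-flow graph from the earlier example. The sketch below is a simplified variant, not the function above: it takes plain predecessor and successor maps, skips predecessors that have no value yet, and uses a sentinel so that a first visit always counts as a change. The analysis itself (which blocks may execute before or at a given block) is a made-up example, and solve-dataflow is a hypothetical name.

```clojure
(require '[clojure.set :as set])

;; Simplified, hypothetical variant of the worklist solver.
(defn solve-dataflow
  "Iterates in = join(out of predecessors), out = transfer(node, in)
  over a worklist until no out value changes."
  [{:keys [start preds succs join transfer]}]
  (loop [out-values {}
         worklist (conj clojure.lang.PersistentQueue/EMPTY start)]
    (if (empty? worklist)
      out-values
      (let [node     (peek worklist)
            in-value (join (keep out-values (preds node)))
            out      (transfer node in-value)]
        (if (= out (get out-values node ::unvisited))
          ;; Nothing changed here: successors need no re-evaluation.
          (recur out-values (pop worklist))
          ;; Value changed: record it and revisit the successors.
          (recur (assoc out-values node out)
                 (into (pop worklist) (succs node))))))))

;; Diamond CFG: 73 branches to 74 and 75, which both jump to 76.
(def cfg-succs {73 [74 75], 74 [76], 75 [76], 76 []})
(def cfg-preds {73 [], 74 [73], 75 [73], 76 [74 75]})

(def may-precede
  (solve-dataflow
    {:start    73
     :preds    cfg-preds
     :succs    cfg-succs
     :join     #(apply set/union #{} %)
     :transfer (fn [node in] (conj in node))}))
```

Swapping the join for set intersection turns the same loop into a "must" analysis, which is the shape the availability example uses.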

SSA Loom: Global Availability

(defn global-cse
  [ssa]
  (let [{graph :graph node-data :data} (ssa->loom (:blocks ssa) ssa-nodes-fn ssa-edges-fn)
        start (:start-block ssa)]
    (letfn [(pure? [instr] (contains? instr :refs))
            (global-cse-join [values]
              (if (seq values)
                (apply set/intersection values)
                #{}))
            (global-cse-transfer [node in-value]
              (into in-value (map :refs (filter pure? (node-data node)))))]
      (dataflow-analysis :start start
                         :graph graph
                         :join global-cse-join
                         :transfer global-cse-transfer))))

SSA Loom: Dataflow Analysis

• Reaching definitions
• Liveness analysis (dead code elimination)
• Available expressions
• Constant propagation
• Other Applications:
  o Erdős number
  o Spread of information in systems (e.g. taint)

My Experience

•  Intuitive way to implement algorithms functionally

• Some mental overhead of transforming data structures

Open Questions

• How general should a graph API be?
• How feature-rich should a graph API be?