Introduction to Computational Analysis

Notebook Creator: Roy Hyunjin Han

NetworkX is for network relationship analysis

NetworkX is a package for creating, analyzing, and visualizing networks, which are also called graphs. A graph is a general mathematical structure made of nodes and the edges that connect one node to another. The strength of NetworkX lies in its simple interface and its collection of graph algorithms.

In [ ]:
import networkx as nx

Make a graph.

In [ ]:
# Make a graph where edges do not have direction.
graph = nx.Graph()
In [ ]:
# Make a graph where edges have direction.
directed_graph = nx.DiGraph()
In [ ]:
# Make a graph where two nodes can have multiple connecting edges without direction.
multi_graph = nx.MultiGraph()
In [ ]:
# Make a graph where two nodes can have multiple connecting edges with direction.
# For example, this kind of graph can represent cities that are connected by more than one highway.
multi_directed_graph = nx.MultiDiGraph()
In [ ]:
# Generate a random graph.
degree = 2
node_count = 10
random_graph = nx.random_regular_graph(degree, node_count)
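
The graph classes above differ in how they store edges. The following is a minimal sketch of that difference; the node names 'A' and 'B' and the variable names are illustrative and not part of the original notebook.

```python
import networkx as nx

# The same edge is listed twice to compare how each class stores it.
edges = [('A', 'B'), ('A', 'B')]

simple = nx.Graph()
simple.add_edges_from(edges)         # a repeated edge is stored only once

multi = nx.MultiGraph()
multi.add_edges_from(edges)          # parallel edges are kept separately

directed = nx.DiGraph()
directed.add_edge('A', 'B')          # the edge points from A to B only

print(simple.number_of_edges())      # 1
print(multi.number_of_edges())       # 2
print(directed.has_edge('A', 'B'))   # True
print(directed.has_edge('B', 'A'))   # False

# In a random regular graph, every node has the requested degree.
random_graph = nx.random_regular_graph(2, 10)
degrees = set(dict(random_graph.degree()).values())
print(degrees)                       # {2}
```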

Add nodes, see nodes, draw nodes.

In [ ]:
graph = nx.Graph()
# Add single nodes (the node name here is illustrative)
graph.add_node('onion')
# Add multiple nodes
graph.add_nodes_from(['carrot', 'zucchini'])
# List nodes
list(graph.nodes())
In [ ]:
# Print the first three letters of each node.
# In NetworkX 2.0 and later, nodes() returns a view that iterates without creating
# an intermediate list, so the older nodes_iter() iterator is no longer available.
[str(node)[:3] for node in graph.nodes()]
In [ ]:
%matplotlib inline
nx.draw(graph, with_labels=True)
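
Nodes behave like members of a set: adding the same node twice leaves the graph unchanged. Here is a small sketch of that behavior, reusing the node names from the cell above.

```python
import networkx as nx

graph = nx.Graph()
graph.add_nodes_from(['carrot', 'zucchini'])
graph.add_node('carrot')           # adding an existing node again changes nothing

print(graph.number_of_nodes())     # 2
print('carrot' in graph)           # True
print(sorted(graph.nodes()))       # ['carrot', 'zucchini']
```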

Add edges, see edges, draw edges.

In [ ]:
graph = nx.Graph()
# Add single edges
graph.add_edge('City Hall', 'Union Square')
graph.add_edge('Union Square', 'Grand Central')
graph.add_edge('Grand Central', 'Times Square')
# Add multiple edges
graph.add_edges_from([
    ('Penn Station', 'Times Square'),
    ('Times Square', '72nd Street and Broadway'),
    ('72nd Street and Broadway', '96th Street and Broadway'),
])
# List nodes
list(graph.nodes())
In [ ]:
# List edges
list(graph.edges())
In [ ]:
# List edges as pairs of nodes.
# In NetworkX 2.0 and later, edges() returns a view that iterates without creating
# an intermediate list, so the older edges_iter() iterator is no longer available.
[(node1, node2) for node1, node2 in graph.edges()]
In [ ]:
%matplotlib inline
nx.draw(graph, with_labels=True)
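
Once edges exist, the graph can be queried for the neighbors of a node. The sketch below rebuilds part of the subway graph from the cell above; because the graph is undirected, an edge can be looked up in either direction.

```python
import networkx as nx

graph = nx.Graph()
graph.add_edges_from([
    ('City Hall', 'Union Square'),
    ('Union Square', 'Grand Central'),
    ('Grand Central', 'Times Square'),
    ('Penn Station', 'Times Square'),
])

# Stations directly connected to Times Square
neighbors = sorted(graph.neighbors('Times Square'))
print(neighbors)                                     # ['Grand Central', 'Penn Station']
# Number of edges touching Times Square
print(graph.degree('Times Square'))                  # 2
# Edges without direction can be looked up from either end
print(graph.has_edge('Union Square', 'City Hall'))   # True
```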

Add attributes.

In [ ]:
# Add graph attributes
graph = nx.Graph(name='phone records')
graph.graph['start_index'] = 500
graph.graph['end_index'] = 10000
In [ ]:
# Add node attributes
graph.add_node('18008662453', name='TMobile customer service')
graph.add_node('18883336651', name='AT&T customer service')
graph.add_node('18009220204', name='Verizon customer service')
graph.add_node('18668667509', name='Sprint customer service')
# Update an attribute of an existing node (graph.nodes replaces the older graph.node)
graph.nodes['18008662453']['name'] = 'T-Mobile customer service'
In [ ]:
# Look at nodes and their attributes
list(graph.nodes(data=True))
In [ ]:
# Add attributes to an edge
graph.add_edge('18883336651', '18009220204', weight=0.5, confidence=0.9)
graph.add_edge('18009220204', '18668667509', weight=0.3, confidence=0.4)
# Update an attribute of an existing edge (graph.edge was removed in NetworkX 2.0)
graph['18883336651']['18009220204']['weight'] = 1.0
In [ ]:
# Look at edges and their attributes
list(graph.edges(data=True))
In [ ]:
%matplotlib inline
nx.draw(graph, with_labels=True)
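
Attributes are stored as dictionaries, so they can be read back in a loop or collected by key. This sketch uses a two-node subset of the phone records graph; the variable names `names` and `weights` are illustrative.

```python
import networkx as nx

graph = nx.Graph(name='phone records')
graph.add_node('18883336651', name='AT&T customer service')
graph.add_node('18009220204', name='Verizon customer service')
graph.add_edge('18883336651', '18009220204', weight=0.5, confidence=0.9)

# Each node comes with its attribute dictionary
for node, attributes in graph.nodes(data=True):
    print(node, attributes['name'])

# Each edge comes with its two end nodes and its attribute dictionary
for node1, node2, attributes in graph.edges(data=True):
    print(node1, node2, attributes['weight'], attributes['confidence'])

# Collect one attribute across all nodes or all edges
names = nx.get_node_attributes(graph, 'name')
weights = nx.get_edge_attributes(graph, 'weight')
print(graph.graph['name'])    # phone records
```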

Analyze a graph with an algorithm.

In [ ]:
graph = nx.Graph()
graph.add_edge('City Hall', 'Union Square', duration=10)
graph.add_edge('Union Square', 'Grand Central', duration=10)
graph.add_edge('Grand Central', 'Times Square', duration=5)
graph.add_edge('Penn Station', 'Times Square', duration=5)
graph.add_edge('Times Square', '72nd Street and Broadway', duration=10)
graph.add_edge('72nd Street and Broadway', '96th Street and Broadway', duration=10)
graph.add_edge('Grand Central', '86th Street and Lexington', duration=10)
# What is the fastest way to get from City Hall to 86th Street and Lexington?
print(nx.dijkstra_path(graph, 'City Hall', '86th Street and Lexington', weight='duration'))
# How long will it take?
print('%s minutes' % nx.dijkstra_path_length(graph, 'City Hall', '86th Street and Lexington', weight='duration'))
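
The reported travel time can be checked by hand: it should equal the sum of the durations along the returned path. Below is a minimal sketch of that check on a subset of the edges above; the variable names `path` and `total` are illustrative.

```python
import networkx as nx

graph = nx.Graph()
graph.add_edge('City Hall', 'Union Square', duration=10)
graph.add_edge('Union Square', 'Grand Central', duration=10)
graph.add_edge('Grand Central', 'Times Square', duration=5)
graph.add_edge('Grand Central', '86th Street and Lexington', duration=10)

path = nx.dijkstra_path(
    graph, 'City Hall', '86th Street and Lexington', weight='duration')
# Pair each stop with the next stop and add up the duration of each hop
total = sum(graph[a][b]['duration'] for a, b in zip(path, path[1:]))
print(path)
print('%s minutes' % total)   # 30 minutes
```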

Save and load graphs.

In [ ]:
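import pickle

path = '/tmp/graph.pkl'
# Save (write_gpickle and read_gpickle were removed in NetworkX 3.0; pickle does the same job)
with open(path, 'wb') as graph_file:
    pickle.dump(graph, graph_file)
# Load
with open(path, 'rb') as graph_file:
    graph = pickle.load(graph_file)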
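
A quick way to confirm that nothing is lost on disk is a round trip: save the graph, load it back, and compare. The sketch below assumes the standard `pickle` and `tempfile` modules rather than a fixed path; the graph contents and the name `loaded` are illustrative.

```python
import os
import pickle
import tempfile

import networkx as nx

graph = nx.Graph(name='subway')
graph.add_edge('City Hall', 'Union Square', duration=10)

# Write to a fresh temporary folder instead of a hard-coded path
path = os.path.join(tempfile.mkdtemp(), 'graph.pkl')
with open(path, 'wb') as graph_file:
    pickle.dump(graph, graph_file)
with open(path, 'rb') as graph_file:
    loaded = pickle.load(graph_file)

# Nodes, edges, and attributes should all survive the round trip
print(sorted(loaded.nodes()))                            # ['City Hall', 'Union Square']
print(loaded['City Hall']['Union Square']['duration'])   # 10
print(loaded.graph['name'])                              # subway
```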