Graphs are great, but sometimes you want to do analysis on some data. Excel and similar tools can be very powerful - many of my "customers" (in the XP sense of the term) in recent years have been Excel super-power-mega users.
I've often seen developers put too much effort into formatting output for their customer. It can be quicker for the developers and more useful for the customer to provide a CSV file and let the customer decide how they want to see the data (of course, suggest this to the customer and do it only if they agree, rather than unilaterally deciding that's what you're going to do). This can also apply to data used only by developers.
Here's a little Python program which takes the input to Graphviz used in my previous article, and produces a CSV file containing the number of "incoming" and "outgoing" dependencies for each node, as shown by the graph in that article.
The code was saved as "graph2csv.py" and executed by running "python graph2csv.py < graph.txt > graph.csv", and here's the raw output
and here it is in Excel: ![]()
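import sys

incoming = {}  # node -> list of nodes that depend on it
outgoing = {}  # node -> list of nodes it depends on
nodes = set()

# Drop the first two and last two lines, which are not edge lines.
lines = sys.stdin.readlines()[2:-2]
for line in lines:
    source, destination = line.split(" -> ")
    # Drop the last two characters (presumably ";" and the newline).
    destination = destination[:-2]
    incoming.setdefault(destination, []).append(source)
    outgoing.setdefault(source, []).append(destination)
    nodes.add(source)
    nodes.add(destination)

def num(dictionary, node):
    # Number of entries recorded for node, as a string ("0" if none).
    if node not in dictionary:
        return "0"
    return str(len(dictionary[node]))

print("incoming,node name,outgoing")
for node in nodes:
    print(num(incoming, node) + "," + node + "," + num(outgoing, node))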
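To make the slicing in the script a bit more concrete, here is a small sketch of the same parsing step run on a made-up input. The `SAMPLE` text and the `parse_edges` helper are hypothetical and not taken from the previous article; the real graph.txt may well have different header and footer lines, so treat this only as an illustration of what `[2:-2]` and `[:-2]` assume.

```python
# Hypothetical input: two non-edge lines at the top, edge lines ending
# in ";", and two non-edge lines at the bottom ("}" and a blank line).
SAMPLE = """digraph G {
node [shape=box];
a -> b;
a -> c;
b -> c;
}

"""


def parse_edges(text):
    """Return (source, destination) pairs, mirroring the script's slicing."""
    # splitlines(True) keeps the line endings, like readlines() does.
    lines = text.splitlines(True)[2:-2]
    edges = []
    for line in lines:
        source, destination = line.split(" -> ")
        # Drop the trailing ";" and newline from the destination.
        edges.append((source, destination[:-2]))
    return edges


print(parse_edges(SAMPLE))
# prints [('a', 'b'), ('a', 'c'), ('b', 'c')]
```

If the input had only one trailing line, or edge lines without a semicolon, the slices would cut off real data, so it's worth checking them against the actual file.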
Posted by ivan at February 9, 2006 10:35 PM
Python has a module for reading and writing CSV in its standard library. It's most useful for writing CSV files that can contain arbitrary strings in the table cells because it escapes the commas for you. It can also produce and consume the different dialects of CSV escaping understood by various tools.
Posted by: Nat at February 10, 2006 6:43 PM
Cool! Thanks Nat. I'll have a look, and might possibly post an updated version of the code, but there again, I might just have another cup of tea.
Posted by: ivan at February 10, 2006 9:18 PM
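For anyone curious what Nat's suggestion might look like, here is a minimal sketch using the standard library's `csv` module, written for Python 3. It is not an updated version of the script above: the `write_counts` helper and the sample rows are hypothetical, chosen only to show the writer quoting a node name that contains a comma.

```python
import csv
import io


def write_counts(rows, out):
    """Write (incoming, node name, outgoing) rows as CSV to a file-like object."""
    # lineterminator="\n" overrides the module's default of "\r\n".
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["incoming", "node name", "outgoing"])
    for incoming, name, outgoing in rows:
        writer.writerow([incoming, name, outgoing])


# Hypothetical data; the second node name contains a comma.
rows = [(0, "parser", 2), (1, "lexer, v2", 0)]
buffer = io.StringIO()
write_counts(rows, buffer)
print(buffer.getvalue())
# prints:
# incoming,node name,outgoing
# 0,parser,2
# 1,"lexer, v2",0
```

With plain string concatenation, that second row would come out as four columns instead of three. Passing `sys.stdout` instead of the `StringIO` buffer would keep the same "redirect to graph.csv" usage as the original.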