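I had seen before that both networkx and igraph say they support GML as an input file format. Today I wanted to load a file into Nepidemix for analysis and found that its loader only accepts GML and gpickle; nothing else is supported. So I looked up what GML actually is.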
GML (Graph Modelling Language):
There are many different programs that work with graphs but almost all of them use their own file format. As a consequence, exchanging graphs between different programs is almost impossible. Simple tasks like exchange of data, externally reproducible results or a common benchmark suite are much harder than necessary.
Therefore, we have developed a new file format for the Graphlet system: GML. GML supports attaching arbitrary information to graphs, nodes and edges, and is therefore able to emulate almost every other format.
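(Link to the original text.) From this we can see that GML is a unified standard for network data, playing much the same role as standard weights and measures. GML supports attaching arbitrary data to graphs, nodes, and edges, so it can emulate almost any other format.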
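To make the "arbitrary data on graphs, nodes and edges" point concrete, here is a minimal sketch in plain Python that writes a tiny graph out as GML text. The helper names to_gml and _fmt and the attributes directed, kind and weight are my own illustrations, not part of any library, and the sketch only handles numbers and simple strings (no escaping of quotes).

```python
def _fmt(value):
    """Format a Python value as a GML value: numbers bare, strings quoted."""
    if isinstance(value, bool):
        return str(int(value))  # no boolean type assumed here; write 0 or 1
    if isinstance(value, (int, float)):
        return str(value)
    return '"%s"' % value


def to_gml(nodes, edges, graph_attrs=None):
    """Hypothetical helper that builds GML text.

    nodes: dict mapping an integer id to a dict of node attributes
    edges: list of (source_id, target_id, attrs_dict) tuples
    graph_attrs: optional dict of graph-level attributes
    """
    lines = ["graph ["]
    for key, value in (graph_attrs or {}).items():
        lines.append("  %s %s" % (key, _fmt(value)))
    for node_id, attrs in nodes.items():
        lines.append("  node [")
        lines.append("    id %d" % node_id)
        for key, value in attrs.items():
            lines.append("    %s %s" % (key, _fmt(value)))
        lines.append("  ]")
    for source, target, attrs in edges:
        lines.append("  edge [")
        lines.append("    source %d" % source)
        lines.append("    target %d" % target)
        for key, value in attrs.items():
            lines.append("    %s %s" % (key, _fmt(value)))
        lines.append("  ]")
    lines.append("]")
    return "\n".join(lines)


gml_text = to_gml(
    nodes={0: {"label": "1", "kind": "item"}, 1: {"label": "3", "kind": "item"}},
    edges=[(0, 1, {"weight": 1.5})],
    graph_attrs={"directed": 0},
)
print(gml_text)
```

Every block is just a list of key/value pairs, which is why extra attributes can be attached at the graph, node, or edge level without changing the overall structure.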
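networkx provides functions that read GML data, or that write an ordinary network you have already loaded out to a GML file. (Click through for the link.)
read_gml(path[, encoding, relabel]): read a graph from a GML file
write_gml(G, path): write graph G to a GML file
parse_gml(lines[, relabel]): parse a GML graph from a string
generate_gml(G): generate a single entry of the graph G in GML format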
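As a rough illustration of the two in-memory functions, the sketch below round-trips a small graph through GML text. It assumes a reasonably recent networkx (2.x or later), where parse_gml accepts a single string; the node names and the kind and weight attributes are made up for the example.

```python
import networkx as nx

# Small graph with extra attributes (attribute names are illustrative only).
G = nx.Graph()
G.add_node("a", kind="user")
G.add_node("b", kind="user")
G.add_edge("a", "b", weight=3)

# generate_gml yields the GML text one line at a time.
gml_text = "\n".join(nx.generate_gml(G))
print(gml_text)

# parse_gml rebuilds a graph from the text; nodes are named by their label.
H = nx.parse_gml(gml_text)
print(sorted(H.nodes()))
print(H["a"]["b"]["weight"])
```

read_gml and write_gml do the same thing but go through a file path instead of a string.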
An example:
My raw data looks like this, stored in a file named GML.txt:
a 1
a 2
b 1
c 2
d 3
b 3
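This forms a bipartite network. I want to extract the network over the second column, that is, its projection, and then save that projected network as GML data.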
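Before the networkx version, here is a hand-rolled sketch of what "projection onto the second column" means, using only the standard library. The variable names are my own; the rows are the six lines of data shown above.

```python
from collections import defaultdict
from itertools import combinations

# The six rows of the data: first column is one node set, second column the other.
rows = [("a", "1"), ("a", "2"), ("b", "1"), ("c", "2"), ("d", "3"), ("b", "3")]

# Group the second-column nodes by the first-column node they are attached to.
neighbours = defaultdict(set)
for left, right in rows:
    neighbours[left].add(right)

# Two second-column nodes are linked in the projection
# when they share at least one first-column neighbour.
projected_edges = set()
for rights in neighbours.values():
    for u, v in combinations(sorted(rights), 2):
        projected_edges.add((u, v))

print(sorted(projected_edges))
```

For this data the projection has two edges: 1-2 (both attached to a) and 1-3 (both attached to b). Nodes 2 and 3 share no neighbour, so they stay unconnected.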
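Code:
import networkx as nx
from networkx.algorithms import bipartite

G = nx.Graph()
# Each line of GML.txt holds one edge: "<first-column node> <second-column node>".
with open('GML.txt') as f:
    for i, line in enumerate(f, start=1):
        print(i)  # progress: number of lines read so far
        cell = line.split()
        if len(cell) < 2:
            continue  # skip blank or malformed lines
        G.add_edge(cell[0], cell[1])

# Split the nodes into the two bipartite sets (this requires a connected graph).
NSet = bipartite.sets(G)
# Project onto the second set, which for this data holds the second-column nodes.
User = bipartite.projected_graph(G, NSet[1])
nx.write_gml(User, 'Project_Gml')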
Result: the projected GML file looks like this:
graph [
  node [
    id 0
    label "1"
  ]
  node [
    id 1
    label "3"
  ]
  node [
    id 2
    label "2"
  ]
  edge [
    source 0
    target 1
  ]
  edge [
    source 0
    target 2
  ]
]