Neo4j Performance Test



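Note: every test here is driven by Cypher statements, so both queries and updates are issued as Cypher. Each query fetches a single relationship (edge) and each update modifies a single relationship. Data sizes are stated as numbers of relationships. The nodes were extracted from those relationships and then given random properties.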



Test environment

Three machines, each with roughly the same CPU and memory.



CPU

[root@f1 import]# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                40
On-line CPU(s) list:   0-39
Thread(s) per core:    2
Core(s) per socket:    10
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 85
Model name:            Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz
Stepping:              4
CPU MHz:               2499.921
CPU max MHz:           3000.0000
CPU min MHz:           800.0000
BogoMIPS:              4400.00
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              1024K
L3 cache:              14080K
NUMA node0 CPU(s):     0-9,20-29
NUMA node1 CPU(s):     10-19,30-39
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb cat_l3 cdp_l3 intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts hwp hwp_act_window hwp_epp hwp_pkg_req



Memory

[root@f1 import]# free -m
              total        used        free      shared  buff/cache   available
Mem:         257419        5205        8221         287      243993      250926
Swap:        131071           0      131071



Disk

[root@f3 ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb1        12T  609M   12T   1% /app



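Test data

Call-record data produced by a random generator.



Test method

The tests use the Neo4j-OGM framework. Each iteration looks up one relationship, modifies it based on its current values, and saves it. Lookups run one at a time, while saves are batched at 100 records per transaction.

A timestamp is taken before and after every query and every save, and the differences are accumulated. The snippet below shows how a save is timed.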

// Time one batched save and add the elapsed milliseconds to the running total.
long updateStart = System.currentTimeMillis();
session.save(updateEntities);
updateCost += System.currentTimeMillis() - updateStart;
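
The whole measurement loop described above might look roughly like the following sketch. It is not the author's test harness: `lookup` and `saveBatch` are hypothetical stand-ins for the Neo4j-OGM query and `session.save` calls, so that the example runs without a database, and the class name and counters are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Sketch of the measurement loop: one lookup per key, saves flushed in batches.
// `lookup` and `saveBatch` are hypothetical stand-ins for the Neo4j-OGM calls.
class BatchTimingSketch {
    static final int BATCH_SIZE = 100; // 100 records per transaction, as in the text

    long searchCost = 0;  // accumulated query time in ms
    long updateCost = 0;  // accumulated save time in ms
    int searched = 0;     // number of lookups performed
    int batchesSaved = 0; // number of batch saves performed

    <T> void run(List<String> keys, Function<String, T> lookup, Consumer<List<T>> saveBatch) {
        List<T> pending = new ArrayList<>();
        for (String key : keys) {
            long searchStart = System.currentTimeMillis();
            T entity = lookup.apply(key);
            searchCost += System.currentTimeMillis() - searchStart;
            searched++;

            pending.add(entity); // the real test would modify the entity here
            if (pending.size() == BATCH_SIZE) {
                flush(pending, saveBatch);
            }
        }
        if (!pending.isEmpty()) {
            flush(pending, saveBatch); // save the final partial batch
        }
    }

    private <T> void flush(List<T> pending, Consumer<List<T>> saveBatch) {
        long updateStart = System.currentTimeMillis();
        saveBatch.accept(new ArrayList<>(pending));
        updateCost += System.currentTimeMillis() - updateStart;
        batchesSaved++;
        pending.clear();
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            keys.add("phone-" + i);
        }
        BatchTimingSketch timer = new BatchTimingSketch();
        timer.run(keys, key -> key.toUpperCase(), batch -> { /* no-op save */ });
        System.out.printf("search %d times cost %dms and update cost %dms%n",
                timer.searched, timer.searchCost, timer.updateCost);
    }
}
```

Keeping the two accumulators separate is what allows the query cost and the save cost to be reported independently in the summary lines that follow.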



Test with 1 million relationships

Import summary

IMPORT DONE in 7s 464ms.
Imported:
  1967732 nodes
  1000000 relationships
  7870928 properties
Peak memory usage: 1.02 GB

Without an index

neo4j> match (a:phone{phone:"132xxxxxx8"})-[b]-(c) return a,b,c;

| (:phone {province: 24, phone: "132xxxxxx8", city: 26, net_time: "2017-03-12"}) | [:call {duration: 62, shortest_duration: 31, times: 2, fast_time: "2019-04-04 11:24:05", last_time: "2019-04-04 11:24:05", fast_duration: 31, longest_time: "2019-04-04 11:24:05", longest_duration: 31, last_duration: 31, shortest_time: "2019-04-04 11:24:05"}] | (:phone {province: 1, phone: "1306xxxxxx6", city: 46, net_time: "2017-03-19"}) |

1 row available after 40 ms, consumed after another 560 ms




No index, no warm-up

search 1600 times cost 850943 ms and update cost 1418 ms
This works out to 531.839375 ms per query and 0.88625 ms per save.
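
The per-operation figures in this article are simply the accumulated cost divided by the number of operations. The sketch below parses one summary line and reproduces that arithmetic. The class name `LogAverages` and the regular expression are assumptions made for illustration; they are not part of the original test code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: derive ms-per-operation figures from one summary line of the test log.
class LogAverages {
    // Tolerates both "1600 times cost 850943 ms" and "2500times cost 1310413ms".
    private static final Pattern LINE = Pattern.compile(
            "search\\s*(\\d+)\\s*times cost\\s*(\\d+)\\s*ms and update cost\\s*(\\d+)\\s*ms");

    /** Returns {msPerQuery, msPerUpdate} for one summary line. */
    static double[] averages(String logLine) {
        Matcher m = LINE.matcher(logLine);
        if (!m.find()) {
            throw new IllegalArgumentException("unrecognised line: " + logLine);
        }
        double count = Long.parseLong(m.group(1));
        if (count == 0) {
            throw new IllegalArgumentException("operation count is zero");
        }
        double searchMs = Long.parseLong(m.group(2));
        double updateMs = Long.parseLong(m.group(3));
        return new double[] {searchMs / count, updateMs / count};
    }

    public static void main(String[] args) {
        double[] avg = averages("search 1600 times cost 850943 ms and update cost 1418 ms");
        System.out.printf("%.6f ms per query, %.5f ms per save%n", avg[0], avg[1]);
    }
}
```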



No index, with warm-up

search 2500times cost 1310413ms and update cost 1497ms
This works out to 524.1652 ms per query and 0.5988 ms per save.



Creating the index

neo4j> create index on :phone(phone);
0 rows available after 17 ms, consumed after another 0 ms
Added 1 indexes
neo4j> call db.indexes;
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| description              | indexName | tokenNames | properties | state    | type                  | progress | provider                              | id | failureMessage |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "INDEX ON :phone(phone)" | "index_1" | ["phone"]  | ["phone"]  | "ONLINE" | "node_label_property" | 100.0    | {version: "1.0", key: "native-btree"} | 1  | ""             |

1 row available after 19 ms, consumed after another 6 ms



Checking that the index is used

neo4j> explain match p=(:phone{phone:"13xxxxxxxx8"})-[*..3]-() return p;
# irrelevant output omitted
| +NodeIndexSeek        |              1 | anon[9]                        | anon[9].phone ASC | :phone(phone)                                                                                         

0 rows available after 104 ms, consumed after another 0 ms



With index, no warm-up

search 3400 times cost 4578ms and update cost 3157ms

This works out to 1.3464705 ms per query and 0.928529 ms per save.

Querying friends up to three hops away

match p=(:phone{phone:"13xxxxxxxx8"})-[*..3]-() return p;

| (:phone {province: 24, phone: "13xxxxxxxx8", city: 26, net_time: "2017-03-12"})-[:call {duration: 124, shortest_duration: 31, times: 4, fast_time: "2019-04-04 11:24:05", fast_duration: 31, last_time: "2019-04-04 11:24:05", longest_time: "2019-04-04 11:24:05", longest_duration: 31, last_duration: 31, shortest_time: "2019-04-04 11:24:05"}]->(:phone {province: 1, phone: "13xxxxxxxx6", city: 46, net_time: "2017-03-19"}) |

1 row available after 55 ms, consumed after another 7 ms
# The generated data contains no three-hop friends, but the query is still fast.



10 million relationships

IMPORT DONE in 38s 437ms. 
Imported:
  18260878 nodes
  10000000 relationships
  18260878 properties
Peak memory usage: 1.21 GB



No index, no warm-up

search 600 times cost 2512645ms and update cost 811ms
This works out to 4,187.741666 ms per query and 1.3516 ms per update.



No index, with warm-up

search 300 times cost 1674214ms and update cost 194ms
This works out to 5,580.71333 ms per query and 0.64666 ms per update.



With index, no warm-up

search 5800 times cost 5430ms and update cost 5361ms
This works out to 0.9362 ms per query and 0.92431 ms per update.



With index, with warm-up

search 5800 times cost 4575ms and update cost 5553ms
This works out to 0.788793 ms per query and 0.95741 ms per update.



Data size

[root@f1 import]# du -sh ../data
3.7G	../data



100 million relationships

Import log

IMPORT DONE in 2m 52s 428ms. 
Imported:
  136004833 nodes
  100000000 relationships
  544019332 properties
Peak memory usage: 2.52 GB



Data size

[root@f1 import]# du -sh ../data
31G	../data



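No index, no warm-up

Queries are already very slow at this scale, so the timings were taken by hand in the console and the batch test without an index was skipped.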

neo4j> match p=(:phone{phone:"13xxxxxxxx8"})-[]-() return p;

| (:phone {province: 21, phone: "13xxxxxxxx8", city: 18, net_time: "2017-02-26"})-[:call {duration: 31, shortest_duration: 31, times: 1, fast_time: "2019-04-04 11:24:05", fast_duration: 31, last_time: "2019-04-04 11:24:05", longest_time: "2019-04-04 11:24:05", longest_duration: 31, last_duration: 31, shortest_time: "2019-04-04 11:24:05"}]->(:phone {province: 3, phone: "13xxxxxxxx6", city: 59, net_time: "2017-02-21"}) |

1 row available after 253 ms, consumed after another 77926 ms


neo4j> match p=(:phone{phone:"13xxxxxxxx9"})-[]-() return p;

| (:phone {province: 8, phone: "13xxxxxxxx9", city: 31, net_time: "2017-02-10"})-[:call {duration: 35, shortest_duration: 6, times: 2, fast_time: "2019-04-13 14:43:17", fast_duration: 6, last_time: "2019-04-17 06:29:57", longest_time: "2019-04-17 06:29:57", longest_duration: 29, last_duration: 29, shortest_time: "2019-04-13 14:43:17"}]->(:phone {province: 22, phone: "16xxxxxxxx8", city: 59, net_time: "2017-03-04"}) |

1 row available after 17 ms, consumed after another 69744 ms



No index, with warm-up

neo4j> match p=(:phone{phone:"13xxxxxxxx8"})-[]-() return p;

| (:phone {province: 21, phone: "13xxxxxxxx8", city: 18, net_time: "2017-02-26"})-[:call {duration: 31, shortest_duration: 31, times: 1, fast_time: "2019-04-04 11:24:05", fast_duration: 31, last_time: "2019-04-04 11:24:05", longest_time: "2019-04-04 11:24:05", longest_duration: 31, last_duration: 31, shortest_time: "2019-04-04 11:24:05"}]->(:phone {province: 3, phone: "13xxxxxxxx6", city: 59, net_time: "2017-02-21"}) |

1 row available after 2 ms, consumed after another 66127 ms

neo4j> match p=(:phone{phone:"13xxxxxxxx9"})-[]-() return p;

| (:phone {province: 8, phone: "13xxxxxxxx9", city: 31, net_time: "2017-02-10"})-[:call {duration: 35, shortest_duration: 6, times: 2, fast_time: "2019-04-13 14:43:17", fast_duration: 6, last_time: "2019-04-17 06:29:57", longest_time: "2019-04-17 06:29:57", longest_duration: 29, last_duration: 29, shortest_time: "2019-04-13 14:43:17"}]->(:phone {province: 22, phone: "16xxxxxxxx8", city: 59, net_time: "2017-03-04"}) |

1 row available after 2 ms, consumed after another 70924 ms





With index, no warm-up

search 16600 times cost 12452ms and update cost 31918ms
This works out to an average of 0.75012 ms per query and 1.9227710 ms per update.



With index, with warm-up

search 15700 times cost 9833ms and update cost 29212ms
This works out to an average of 0.6263 ms per query and 1.86063 ms per update.



Data size after creating the index

[root@f1 import]# du -sh ../data/
37G	../data/

The data directory has grown by roughly 6 GB (from 31 GB to 37 GB).



1 billion relationships

Import summary

IMPORT DONE in 19m 57s 543ms. 
Imported:
  586648101 nodes
  1000000000 relationships
  2346592404 properties
Peak memory usage: 7.56 GB



Data size

[root@f1 ~]# du -sh /app/neo4j/data
238G	/app/neo4j/data



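No index

The query ran for more than 7 minutes without returning a result, so the no-index case was not tested any further.

It simply takes too long, so queries without an index are not attempted from here on.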



With index, no warm-up

search 1900 times cost 634324ms and update cost 2011ms
This works out to 333.854736 ms per query and about 1.0584 ms per update (2011 ms / 1900).



With index, with warm-up

search 1900 times cost 2798ms and update cost 1228ms
This works out to 1.4726315 ms per query and 0.64631578 ms per update.



Data size

[root@f2 ~]# du -sh /app/neo4j/data
266G	/app/neo4j/data

The index therefore takes up roughly 28 GB (266 GB minus 238 GB).



Configuration used

dbms.memory.heap.initial_size=150g
dbms.memory.heap.max_size=150g
dbms.memory.pagecache.size=50g



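Attempt 1: leave the configuration file unchanged and use the default dynamic settings

With index, no warm-up

search 900 times cost 107085ms and update cost 950ms
This works out to 118.983 ms per query and 1.0555 ms per update.

With index, with warm-up

search 900 times cost 1909ms and update cost 390ms
This works out to 2.1211111 ms per query and 0.433333 ms per update.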



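Attempt 2: 100 GB heap, 100 GB page cache

With index, no warm-up

search 1900 times cost 168320ms and update cost 902ms
This works out to 88.589473 ms per query and 0.47473 ms per update.

With index, with warm-up

search 1900 times cost 2643ms and update cost 1085ms
This works out to 1.391052631 ms per query and 0.57105263 ms per update.

The query and update speed was then measured again on a batch of 100,000 records.

search 100000 times cost 3495354ms and update cost 2339074ms
This works out to 34.95354 ms per query and 23.39074 ms per update.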

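[Figure: query time (blue) and save time (orange) plotted against the number of records. The x-axis is the number of records, in units of 100.]

[Figure: query and save time plotted against the number of records, with warm-up.]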



Results comparison

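Read speed (ms per query)
Relationships | No index, no warm-up | No index, with warm-up | With index, no warm-up | With index, with warm-up
1 million     | 531.839375           | 524.1652               | 1.3464705              | 1.24
10 million    | 4,187.741666         | 5,580.71333            | 0.9362                 | 0.788793
100 million   | 77926                | 70924                  | 0.75012                | 0.6263
1 billion     | over 7 minutes       | over 7 minutes         | 333.854736             | 1.4726315

Write speed (ms per update)
Relationships | No index, no warm-up | No index, with warm-up | With index, no warm-up | With index, with warm-up
1 million     | 0.88625              | 0.5988                 | 0.928529               | 0.926
10 million    | 1.3516               | 0.64666                | 0.92431                | 0.95741
100 million   | -                    | -                      | 1.9227710              | 1.86063
1 billion     | -                    | -                      | 1.0584                 | 0.64631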
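
To put the read figures in perspective, the ratio between two per-query times can be computed directly. The sketch below does that for a few pairs of values copied from the results above. The class name `IndexSpeedup` and the choice of pairs are illustrative assumptions, not part of the original test.

```java
// Sketch: express the effect of the index or of warm-up as a ratio of per-query times.
class IndexSpeedup {
    /** Ratio of two per-operation times; a value above 1 means `afterMs` is the faster case. */
    static double speedup(double beforeMs, double afterMs) {
        if (afterMs <= 0) {
            throw new IllegalArgumentException("afterMs must be positive");
        }
        return beforeMs / afterMs;
    }

    public static void main(String[] args) {
        // Per-query times in ms, copied from the read-speed results in this article.
        System.out.printf("1 million, no index vs index (no warm-up): %.1fx%n",
                speedup(531.839375, 1.3464705));
        System.out.printf("10 million, no index vs index (no warm-up): %.1fx%n",
                speedup(4187.741666, 0.9362));
        System.out.printf("1 billion, indexed, no warm-up vs warm-up: %.1fx%n",
                speedup(333.854736, 1.4726315));
    }
}
```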



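Copyright notice: this is an original article by sinat_35045195, published under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.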