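Getting Started with HBase
Preface
It has been two years since the big data competition. HBase was not covered back then and I never looked into it afterwards, so now that HBase problems have come up at work I am lost. Time to catch up over the weekend.
1. Installation and Startup
The versions I used: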
hadoop-2.7.3
zookeeper-3.4.10
hbase-1.2.4
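The competition did come with ready-made scripts, but those were for a multi-node setup. For simply learning HBase a single node is enough, so I went looking for blog posts online.
For configuration, see this post:
The startup commands in that post are slightly off, though; I got things running with the commands from this one:
Once everything from those posts is running, run
<your HBase directory>/bin/start-hbase.sh
and then
<your HBase directory>/bin/hbase shell
If you land in the shell, installation and startup succeeded.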
2. Basic Commands
Environment information and permissions
The following commands show the HBase version, the cluster status, and the current user:
hbase(main):001:0> version
1.2.4, r67592f3d062743907f8c5ae00dbbe1ae4f69e5af, Tue Oct 25 18:10:20 CDT 2016
hbase(main):002:0> status
1 active master, 0 backup masters, 1 servers, 0 dead, 2.0000 average load
hbase(main):003:0> whoami
root (auth:SIMPLE)
groups: root
hbase(main):003:0>
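The grant command assigns permissions. Security features are not enabled on my setup, so the command fails, but the shell prints its usage text: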
hbase(main):005:0> grant 'root', 'RWXCA'
ERROR: DISABLED: Security features are not available
Here is some help for this command:
Grant users specific rights.
Syntax : grant <user>, <permissions> [, <@namespace> [, <table> [, <column family> [, <column qualifier>]]]
permissions is either zero or more letters from the set "RWXCA".
READ('R'), WRITE('W'), EXEC('X'), CREATE('C'), ADMIN('A')
Note: Groups and users are granted access in the same way, but groups are prefixed with an '@'
character. In the same way, tables and namespaces are specified, but namespaces are
prefixed with an '@' character.
For example:
hbase> grant 'bobsmith', 'RWXCA'
hbase> grant '@admins', 'RWXCA'
hbase> grant 'bobsmith', 'RWXCA', '@ns1'
hbase> grant 'bobsmith', 'RW', 't1', 'f1', 'col1'
hbase> grant 'bobsmith', 'RW', 'ns1:t1', 'f1', 'col1'
hbase(main):006:0>
(ADMIN is the administrative permission.)
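To make the permission letters from the help text easier to read, here is a small plain-Python sketch that expands a string such as 'RW' into permission names. It is only an illustration of the letter mapping quoted above; the function name expand_permissions is my own and nothing here calls HBase.

```python
# Letter-to-name mapping copied from the grant help text above.
PERMISSIONS = {"R": "READ", "W": "WRITE", "X": "EXEC", "C": "CREATE", "A": "ADMIN"}


def expand_permissions(letters):
    """Return permission names for a string such as 'RW', in RWXCA order.

    Hypothetical helper for illustration; not part of the HBase shell.
    """
    unknown = set(letters) - set(PERMISSIONS)
    if unknown:
        raise ValueError(f"unknown permission letters: {sorted(unknown)}")
    return [name for letter, name in PERMISSIONS.items() if letter in letters]


print(expand_permissions("RWXCA"))
print(expand_permissions("RW"))
```

An empty string is allowed, which matches the help text's "zero or more letters".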
Creating, deleting, and listing tables
- Create: the built-in help is quite good:
Here is some help for this command:
Creates a table. Pass a table name, and a set of column family
specifications (at least one), and, optionally, table configuration.
Column specification can be a simple string (name), or a dictionary
(dictionaries are described below in main help output), necessarily
including NAME attribute.
Examples:
Create a table with namespace=ns1 and table qualifier=t1
hbase> create 'ns1:t1', {NAME => 'f1', VERSIONS => 5}
Create a table with namespace=default and table qualifier=t1
hbase> create 't1', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'}
hbase> # The above in shorthand would be the following:
hbase> create 't1', 'f1', 'f2', 'f3'
hbase> create 't1', {NAME => 'f1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true}
hbase> create 't1', {NAME => 'f1', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}}
Table configuration options can be put at the end.
Examples:
hbase> create 'ns1:t1', 'f1', SPLITS => ['10', '20', '30', '40']
hbase> create 't1', 'f1', SPLITS => ['10', '20', '30', '40']
hbase> create 't1', 'f1', SPLITS_FILE => 'splits.txt', OWNER => 'johndoe'
hbase> create 't1', {NAME => 'f1', VERSIONS => 5}, METADATA => { 'mykey' => 'myvalue' }
hbase> # Optionally pre-split the table into NUMREGIONS, using
hbase> # SPLITALGO ("HexStringSplit", "UniformSplit" or classname)
hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'}
hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit', REGION_REPLICATION => 2, CONFIGURATION => {'hbase.hregion.scan.loadColumnFamiliesOnDemand' => 'true'}}
hbase> create 't1', {NAME => 'f1', DFS_REPLICATION => 1}
You can also keep around a reference to the created table:
hbase> t1 = create 't1', 'f1'
Which gives you a reference to the table named 't1', on which you can then
call methods.
hbase(main):002:0>
The following command creates a table named 'lol' with two column families, 'name' and 'technique':
hbase(main):007:0> create 'lol',{NAME=>'name'},{NAME=>'technique'}
0 row(s) in 1.4040 seconds
=> Hbase::Table - lol
hbase(main):008:0>
- Delete:
To delete a table, disable it first and then drop it:
hbase(main):003:0> disable 'lol'
0 row(s) in 3.2680 seconds
hbase(main):004:0> drop 'lol'
0 row(s) in 1.2960 seconds
hbase(main):005:0>
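- Rename a table:
The shell has no direct rename command. After creating the table, take a temporary snapshot, clone the snapshot under the new table name, delete the temporary snapshot, and finally check the properties of the cloned table. The original table is left in place (it still appears in the list output further down), so drop it separately if you no longer need it: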
hbase(main):005:0> create 'lol',{NAME=>'name'},{NAME=>'technique'}
0 row(s) in 1.2550 seconds
=> Hbase::Table - lol
hbase(main):006:0> snapshot 'lol','lol_tmp'
0 row(s) in 0.3550 seconds
hbase(main):007:0> clone_snapshot 'lol_tmp','leagueoflengend'
0 row(s) in 0.5940 seconds
hbase(main):008:0> delete
delete delete_all_snapshot delete_snapshot
deleteall
hbase(main):008:0> delete_snapshot 'lol_tmp'
0 row(s) in 0.0640 seconds
hbase(main):010:0> desc 'leagueoflengend'
Table leagueoflengend is ENABLED
leagueoflengend
COLUMN FAMILIES DESCRIPTION
{NAME => 'name', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KE
EP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', CO
MPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65
536', REPLICATION_SCOPE => '0'}
{NAME => 'technique', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false
', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER
', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE =
> '65536', REPLICATION_SCOPE => '0'}
2 row(s) in 0.0700 seconds
hbase(main):011:0>
- List all tables with list:
hbase(main):012:0> list
TABLE
leagueoflengend
lol
2 row(s) in 0.0170 seconds
=> ["leagueoflengend", "lol"]
hbase(main):013:0>
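Inserting, deleting, updating, and reading rows
- Insert
We insert the data for "Mechanical Enemy" Rambo into table 'lol'. The row with row key '3' then holds Rambo's information (name Rambo, title Mechanical Enemy, Q skill "grilled at high temperature"). The 'tech' column family used by the third put is the one added with alter in the column family section below: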
hbase(main):064:0> put 'lol','3','name:name','Rambo'
0 row(s) in 0.0120 seconds
hbase(main):066:0> put 'lol','3','name:title','Mechanical Enemy'
0 row(s) in 0.0260 seconds
hbase(main):085:0> put 'lol','3','tech:q','grilled at high temperature'
0 row(s) in 0.0410 seconds
hbase(main):067:0> scan 'lol'
ROW COLUMN+CELL
1 column=name:fname, timestamp=1652003547121, value=Yone
1 column=name:tech, timestamp=1652059717967, value=Staggered jade cut
1 column=name:title, timestamp=1652003832088, value=Demon Sword Soul
1 column=tech:q, timestamp=1652061296629, value=Staggered jade cut
2 column=name:name, timestamp=1652059786181, value=Yasuo
2 column=name:tech, timestamp=1652059643106, value=Chopping Steel Flash
2 column=name:title, timestamp=1652059364721, value=Wind Swordsman
3 column=name:name, timestamp=1652059766646, value=Rambo
3 column=name:title, timestamp=1652059808472, value=Mechanical Enemy
3 column=tech:q, timestamp=1652061445488, value=grilled at high temperature
3 row(s) in 0.0510 seconds
hbase(main):068:0>
- Delete
deleteall removes all data in a row:
hbase(main):198:0> deleteall 'lol','2'
0 row(s) in 0.0060 seconds
hbase(main):199:0>
delete removes the cell at the given row, column, and timestamp:
hbase(main):199:0> scan 'lol'
ROW COLUMN+CELL
1 column=name:cn-name, timestamp=1652065791331, value=\xE6\xB0\xB8\xE6\x81\xA9
1 column=name:fname, timestamp=1652003547121, value=Yone
1 column=name:tech, timestamp=1652059717967, value=Staggered jade cut
1 column=name:title, timestamp=1652003832088, value=Demon Sword Soul
1 column=tech:q, timestamp=1652061296629, value=Staggered jade cut
1 column=tech:w, timestamp=1652065118775, value=Spirit Cleave
2 column=name:cn-name, timestamp=1652065856945, value=\xE4\xBA\x9A\xE7\xB4\xA2
2 column=name:name, timestamp=1652059786181, value=Yasuo
2 column=name:tech, timestamp=1652059643106, value=Chopping Steel Flash
2 column=name:title, timestamp=1652059364721, value=Wind Swordsman
3 column=name:cn-name, timestamp=1652065881601, value=\xE5\x85\xB0\xE5\x8D\x9A
3 column=name:name, timestamp=1652059766646, value=Rambo
3 column=name:title, timestamp=1652059808472, value=Mechanical Enemy
3 column=tech:q, timestamp=1652061445488, value=grilled at high temperature
3 row(s) in 0.0410 seconds
hbase(main):200:0> delete 'lol','1','name:tech',1652059717967
0 row(s) in 0.0090 seconds
hbase(main):201:0>
- Update
Just put to the same row and column again; the new value takes the place of the old one:
put 'lol','3','name:name','Rambo_changed'
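The reason a second put works as an update is that HBase does not edit a cell in place: each put writes a new version with its own timestamp, and a read returns the newest version. The sketch below models that idea in plain Python. The class VersionedCells and the second timestamp are hypothetical and only illustrate the behavior; this is not the HBase client API.

```python
from collections import defaultdict


class VersionedCells:
    """Toy model of timestamped cell versions (not an HBase API)."""

    def __init__(self):
        # (row, column) -> {timestamp: value}
        self._cells = defaultdict(dict)

    def put(self, row, column, value, timestamp):
        # A put never overwrites in place; it adds a version at `timestamp`.
        self._cells[(row, column)][timestamp] = value

    def get(self, row, column):
        # A read returns the value with the newest timestamp, if any.
        versions = self._cells.get((row, column))
        if not versions:
            return None
        return versions[max(versions)]


table = VersionedCells()
table.put("3", "name:name", "Rambo", timestamp=1652059766646)
# Hypothetical later timestamp for the second put.
table.put("3", "name:name", "Rambo_changed", timestamp=1652059766647)
print(table.get("3", "name:name"))
```

With VERSIONS => '1', as in the desc output in this article, only the newest version is returned by reads.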
Adding, deleting, and changing column families
- Add
Add a column family:
hbase(main):079:0> alter 'lol',NAME=>'tech'
Updating all regions with the new schema...
1/1 regions updated.
Done.
0 row(s) in 1.9890 seconds
hbase(main):080:0> desc 'lol'
Table lol is ENABLED
lol
COLUMN FAMILIES DESCRIPTION
{NAME => 'country', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE',
MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'name', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MI
N_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'tech', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MI
N_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
3 row(s) in 0.0170 seconds
hbase(main):081:0>
- Delete
Delete a column family:
hbase(main):088:0> alter 'lol',NAME=>'country',METHOD=>'delete'
Updating all regions with the new schema...
1/1 regions updated.
Done.
0 row(s) in 2.2430 seconds
hbase(main):089:0> desc 'lol'
Table lol is ENABLED
lol
COLUMN FAMILIES DESCRIPTION
{NAME => 'name', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MI
N_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'tech', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MI
N_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
2 row(s) in 0.0170 seconds
hbase(main):090:0>
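- Change
To replace a column family, add the new one and then delete the old one. Keep in mind that deleting a column family also deletes the data stored in it.
Queries
Filters match against a comparator. The two used most in this article are substring (fuzzy match) and binary (exact match), similar to match and term in Elasticsearch:
substring: fuzzy match. For example (I had already added a tech column to the name column family of table lol):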
hbase(main):148:0> scan 'lol',FILTER=>"QualifierFilter (=,'substring:te')"
ROW COLUMN+CELL
1 column=name:tech, timestamp=1652059717967, value=Staggered jade cut
2 column=name:tech, timestamp=1652059643106, value=Chopping Steel Flash
2 row(s) in 0.0220 seconds
hbase(main):149:0> scan 'lol',FILTER=>"QualifierFilter (=,'substring:tech')"
ROW COLUMN+CELL
1 column=name:tech, timestamp=1652059717967, value=Staggered jade cut
2 column=name:tech, timestamp=1652059643106, value=Chopping Steel Flash
2 row(s) in 0.0070 seconds
hbase(main):150:0>
binary: exact match. For example:
hbase(main):146:0> scan 'lol',FILTER=>"QualifierFilter (=,'binary:te')"
ROW COLUMN+CELL
0 row(s) in 0.0280 seconds
hbase(main):147:0> scan 'lol',FILTER=>"QualifierFilter (=,'binary:tech')"
ROW COLUMN+CELL
1 column=name:tech, timestamp=1652059717967, value=Staggered jade cut
2 column=name:tech, timestamp=1652059643106, value=Chopping Steel Flash
2 row(s) in 0.0160 seconds
hbase(main):148:0>
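The difference between the comparators can be sketched in plain Python. This is a simplified model, not HBase code: strings stand in for byte arrays, the function names are my own, and binaryprefix is included because this article uses it later with RowFilter. As far as I know the substring comparator ignores case, which the model mirrors with lower().

```python
def matches(comparator, pattern, qualifier):
    """Simplified model of the '=' test for three comparator types."""
    if comparator == "binary":
        # Exact match on the whole value.
        return qualifier == pattern
    if comparator == "binaryprefix":
        # Match on the leading bytes only.
        return qualifier.startswith(pattern)
    if comparator == "substring":
        # Match anywhere in the value, ignoring case.
        return pattern.lower() in qualifier.lower()
    raise ValueError(f"unsupported comparator: {comparator}")


# Qualifiers present in the 'name' column family in the transcripts above.
QUALIFIERS = ["fname", "name", "tech", "title"]


def select(comparator, pattern):
    return [q for q in QUALIFIERS if matches(comparator, pattern, q)]


print(select("substring", "te"))
print(select("binary", "te"))
print(select("binary", "tech"))
print(select("binaryprefix", "t"))
```

The first three calls correspond to the substring:te, binary:te, and binary:tech scans shown above.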
- get
get reads the data of a single row. The built-in help:
Here is some help for this command:
Get row or cell contents; pass table name, row, and optionally
a dictionary of column(s), timestamp, timerange and versions. Examples:
hbase> get 'ns1:t1', 'r1'
hbase> get 't1', 'r1'
hbase> get 't1', 'r1', {TIMERANGE => [ts1, ts2]}
hbase> get 't1', 'r1', {COLUMN => 'c1'}
hbase> get 't1', 'r1', {COLUMN => ['c1', 'c2', 'c3']}
hbase> get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1}
hbase> get 't1', 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4}
hbase> get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1, VERSIONS => 4}
hbase> get 't1', 'r1', {FILTER => "ValueFilter(=, 'binary:abc')"}
hbase> get 't1', 'r1', 'c1'
hbase> get 't1', 'r1', 'c1', 'c2'
hbase> get 't1', 'r1', ['c1', 'c2']
hbase> get 't1', 'r1', {COLUMN => 'c1', ATTRIBUTES => {'mykey'=>'myvalue'}}
hbase> get 't1', 'r1', {COLUMN => 'c1', AUTHORIZATIONS => ['PRIVATE','SECRET']}
hbase> get 't1', 'r1', {CONSISTENCY => 'TIMELINE'}
hbase> get 't1', 'r1', {CONSISTENCY => 'TIMELINE', REGION_REPLICA_ID => 1}
Besides the default 'toStringBinary' format, 'get' also supports custom formatting by
column. A user can define a FORMATTER by adding it to the column name in the get
specification. The FORMATTER can be stipulated:
1. either as a org.apache.hadoop.hbase.util.Bytes method name (e.g, toInt, toString)
2. or as a custom class followed by method name: e.g. 'c(MyFormatterClass).format'.
Example formatting cf:qualifier1 and cf:qualifier2 both as Integers:
hbase> get 't1', 'r1' {COLUMN => ['cf:qualifier1:toInt',
'cf:qualifier2:c(org.apache.hadoop.hbase.util.Bytes).toInt'] }
Note that you can specify a FORMATTER by column only (cf:qualifier). You cannot specify
a FORMATTER for all columns of a column family.
The same commands also can be run on a reference to a table (obtained via get_table or
create_table). Suppose you had a reference t to table 't1', the corresponding commands
would be:
hbase> t.get 'r1'
hbase> t.get 'r1', {TIMERANGE => [ts1, ts2]}
hbase> t.get 'r1', {COLUMN => 'c1'}
hbase> t.get 'r1', {COLUMN => ['c1', 'c2', 'c3']}
hbase> t.get 'r1', {COLUMN => 'c1', TIMESTAMP => ts1}
hbase> t.get 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4}
hbase> t.get 'r1', {COLUMN => 'c1', TIMESTAMP => ts1, VERSIONS => 4}
hbase> t.get 'r1', {FILTER => "ValueFilter(=, 'binary:abc')"}
hbase> t.get 'r1', 'c1'
hbase> t.get 'r1', 'c1', 'c2'
hbase> t.get 'r1', ['c1', 'c2']
hbase> t.get 'r1', {CONSISTENCY => 'TIMELINE'}
hbase> t.get 'r1', {CONSISTENCY => 'TIMELINE', REGION_REPLICA_ID => 1}
The following commands read row key '1', first for column family 'tech' only, then for column families 'name' and 'tech' together:
hbase(main):092:0> get 'lol','1',{COLUMN=>'tech'}
COLUMN CELL
tech:q timestamp=1652061296629, value=Staggered jade cut
1 row(s) in 0.0080 seconds
hbase(main):093:0> get 'lol','1',{COLUMN=>['name','tech']}
COLUMN CELL
name:fname timestamp=1652003547121, value=Yone
name:tech timestamp=1652059717967, value=Staggered jade cut
name:title timestamp=1652003832088, value=Demon Sword Soul
tech:q timestamp=1652061296629, value=Staggered jade cut
4 row(s) in 0.0080 seconds
hbase(main):094:0>
To look up a specific value, for example the cell whose value is 'Yone':
hbase(main):094:0> get 'lol','1',{FILTER=>"ValueFilter (=,'binary:Yone')"}
COLUMN CELL
name:fname timestamp=1652003547121, value=Yone
1 row(s) in 0.0090 seconds
hbase(main):095:0>
- scan
The built-in help is thorough; reading it covers most needs:
Here is some help for this command:
Scan a table; pass table name and optionally a dictionary of scanner
specifications. Scanner specifications may include one or more of:
TIMERANGE, FILTER, LIMIT, STARTROW, STOPROW, ROWPREFIXFILTER, TIMESTAMP,
MAXLENGTH or COLUMNS, CACHE or RAW, VERSIONS, ALL_METRICS or METRICS
If no columns are specified, all columns will be scanned.
To scan all members of a column family, leave the qualifier empty as in
'col_family'.
The filter can be specified in two ways:
1. Using a filterString - more information on this is available in the
Filter Language document attached to the HBASE-4176 JIRA
2. Using the entire package name of the filter.
If you wish to see metrics regarding the execution of the scan, the
ALL_METRICS boolean should be set to true. Alternatively, if you would
prefer to see only a subset of the metrics, the METRICS array can be
defined to include the names of only the metrics you care about.
Some examples:
hbase> scan 'hbase:meta'
hbase> scan 'hbase:meta', {COLUMNS => 'info:regioninfo'}
hbase> scan 'ns1:t1', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => 'xyz'}
hbase> scan 't1', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => 'xyz'}
hbase> scan 't1', {COLUMNS => 'c1', TIMERANGE => [1303668804, 1303668904]}
hbase> scan 't1', {REVERSED => true}
hbase> scan 't1', {ALL_METRICS => true}
hbase> scan 't1', {METRICS => ['RPC_RETRIES', 'ROWS_FILTERED']}
hbase> scan 't1', {ROWPREFIXFILTER => 'row2', FILTER => "
(QualifierFilter (>=, 'binary:xyz')) AND (TimestampsFilter ( 123, 456))"}
hbase> scan 't1', {FILTER =>
org.apache.hadoop.hbase.filter.ColumnPaginationFilter.new(1, 0)}
hbase> scan 't1', {CONSISTENCY => 'TIMELINE'}
For setting the Operation Attributes
hbase> scan 't1', { COLUMNS => ['c1', 'c2'], ATTRIBUTES => {'mykey' => 'myvalue'}}
hbase> scan 't1', { COLUMNS => ['c1', 'c2'], AUTHORIZATIONS => ['PRIVATE','SECRET']}
For experts, there is an additional option -- CACHE_BLOCKS -- which
switches block caching for the scanner on (true) or off (false). By
default it is enabled. Examples:
hbase> scan 't1', {COLUMNS => ['c1', 'c2'], CACHE_BLOCKS => false}
Also for experts, there is an advanced option -- RAW -- which instructs the
scanner to return all cells (including delete markers and uncollected deleted
cells). This option cannot be combined with requesting specific COLUMNS.
Disabled by default. Example:
hbase> scan 't1', {RAW => true, VERSIONS => 10}
Besides the default 'toStringBinary' format, 'scan' supports custom formatting
by column. A user can define a FORMATTER by adding it to the column name in
the scan specification. The FORMATTER can be stipulated:
1. either as a org.apache.hadoop.hbase.util.Bytes method name (e.g, toInt, toString)
2. or as a custom class followed by method name: e.g. 'c(MyFormatterClass).format'.
Example formatting cf:qualifier1 and cf:qualifier2 both as Integers:
hbase> scan 't1', {COLUMNS => ['cf:qualifier1:toInt',
'cf:qualifier2:c(org.apache.hadoop.hbase.util.Bytes).toInt'] }
Note that you can specify a FORMATTER by column only (cf:qualifier). You cannot
specify a FORMATTER for all columns of a column family.
Scan can also be used directly from a table, by first getting a reference to a
table, like such:
hbase> t = get_table 't'
hbase> t.scan
Note in the above situation, you can still provide all the filtering, columns,
options, etc as described above.
For example, to view the first two rows of table 'lol' (without any restriction, scan returns the whole table):
hbase(main):068:0> scan 'lol',{LIMIT=>2}
ROW COLUMN+CELL
1 column=name:fname, timestamp=1652003547121, value=Yone
1 column=name:tech, timestamp=1652059717967, value=Staggered jade cut
1 column=name:title, timestamp=1652003832088, value=Demon Sword Soul
2 column=name:name, timestamp=1652059786181, value=Yasuo
2 column=name:tech, timestamp=1652059643106, value=Chopping Steel Flash
2 column=name:title, timestamp=1652059364721, value=Wind Swordsman
2 row(s) in 0.0330 seconds
hbase(main):069:0>
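To restrict the row key range:
Note that the range is half-open: [STARTROW, ENDROW). The help text above names the stop option STOPROW; the ENDROW spelling used here was accepted in this session.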
hbase(main):069:0> scan 'lol',{STARTROW=>'2',ENDROW=>'3'}
ROW COLUMN+CELL
2 column=name:name, timestamp=1652059786181, value=Yasuo
2 column=name:tech, timestamp=1652059643106, value=Chopping Steel Flash
2 column=name:title, timestamp=1652059364721, value=Wind Swordsman
1 row(s) in 0.0340 seconds
hbase(main):070:0> scan 'lol',{STARTROW=>'2',ENDROW=>'4'}
ROW COLUMN+CELL
2 column=name:name, timestamp=1652059786181, value=Yasuo
2 column=name:tech, timestamp=1652059643106, value=Chopping Steel Flash
2 column=name:title, timestamp=1652059364721, value=Wind Swordsman
3 column=name:name, timestamp=1652059766646, value=Rambo
3 column=name:title, timestamp=1652059808472, value=Mechanical Enemy
2 row(s) in 0.0090 seconds
hbase(main):071:0>
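The half-open range can be modeled in a few lines of Python. This is only a sketch of the selection rule, with plain strings standing in for byte-array row keys; the scan function here is my own and does not talk to HBase. It also shows a consequence of byte-wise ordering: row keys sort as strings, so '10' comes before '2'.

```python
def scan(row_keys, start_row=None, stop_row=None):
    """Return the keys in [start_row, stop_row), in sorted order."""
    selected = []
    for key in sorted(row_keys):
        if start_row is not None and key < start_row:
            continue  # before the inclusive start
        if stop_row is not None and key >= stop_row:
            break  # the stop row itself is excluded
        selected.append(key)
    return selected


rows = ["1", "2", "3"]
print(scan(rows, "2", "3"))
print(scan(rows, "2", "4"))
# Keys are compared as strings, not as numbers.
print(scan(["1", "2", "10"]))
```

The first two calls mirror the two scans above: stopping at '3' returns only row 2, stopping at '4' returns rows 2 and 3.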
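To restrict by timestamp. Note that TimestampsFilter keeps only cells whose timestamp equals one of the listed values; it is not a range. For a range, use the TIMERANGE option listed in the help text above: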
hbase(main):074:0> scan 'lol',{FILTER=>"(TimestampsFilter (1652003547121,1652059643106))"}
ROW COLUMN+CELL
1 column=name:fname, timestamp=1652003547121, value=Yone
2 column=name:tech, timestamp=1652059643106, value=Chopping Steel Flash
2 row(s) in 0.0340 seconds
hbase(main):075:0>
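Since an exact-timestamp match and a time range are easy to confuse, here is a plain-Python sketch of both, using four cells copied from the transcripts. The function names are my own, and the half-open [min, max) behavior of the range is my understanding of how TIMERANGE works rather than something shown in this article.

```python
# (row, column, timestamp, value), copied from the scan output above.
CELLS = [
    ("1", "name:fname", 1652003547121, "Yone"),
    ("1", "name:title", 1652003832088, "Demon Sword Soul"),
    ("2", "name:title", 1652059364721, "Wind Swordsman"),
    ("2", "name:tech", 1652059643106, "Chopping Steel Flash"),
]


def timestamps_filter(cells, *timestamps):
    """Keep cells whose timestamp is exactly one of the given values."""
    wanted = set(timestamps)
    return [cell for cell in cells if cell[2] in wanted]


def time_range(cells, min_ts, max_ts):
    """Keep cells with min_ts <= timestamp < max_ts (assumed half-open)."""
    return [cell for cell in cells if min_ts <= cell[2] < max_ts]


exact = timestamps_filter(CELLS, 1652003547121, 1652059643106)
ranged = time_range(CELLS, 1652003547121, 1652059643106)
print([cell[3] for cell in exact])
print([cell[3] for cell in ranged])
```

The exact match returns the same two cells as the scan above, while the range returns every cell from the first timestamp up to, but not including, the second.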
Advanced queries
- List the filters HBase supports:
hbase(main):047:0> show_filters
DependentColumnFilter
KeyOnlyFilter
ColumnCountGetFilter
SingleColumnValueFilter
PrefixFilter
SingleColumnValueExcludeFilter
FirstKeyOnlyFilter
ColumnRangeFilter
TimestampsFilter
FamilyFilter
QualifierFilter
ColumnPrefixFilter
RowFilter
MultipleColumnPrefixFilter
InclusiveStopFilter
PageFilter
ValueFilter
ColumnPaginationFilter
hbase(main):048:0>
- Row key filters
RowFilter: filters on the row key. For example, the following command returns the rows whose row key starts with '1':
hbase(main):051:0> scan 'lol',FILTER=>"RowFilter(=,'binaryprefix:1')"
ROW COLUMN+CELL
1 column=name:fname, timestamp=1652003547121, value=Yone
1 column=name:title, timestamp=1652003832088, value=Demon Sword Soul
1 row(s) in 0.0240 seconds
hbase(main):052:0>
PrefixFilter: filters on a row key prefix. The command above can also be written as:
hbase(main):056:0> scan 'lol',FILTER=>"PrefixFilter('1')"
ROW COLUMN+CELL
1 column=name:fname, timestamp=1652003547121, value=Yone
1 column=name:title, timestamp=1652003832088, value=Demon Sword Soul
1 row(s) in 0.0260 seconds
hbase(main):057:0>
FirstKeyOnlyFilter: returns only the first cell of each logical row. It is handy for a quick look at a table and makes row counting cheaper:
hbase(main):098:0> scan 'lol',{FILTER=>"FirstKeyOnlyFilter()"}
ROW COLUMN+CELL
1 column=name:fname, timestamp=1652003547121, value=Yone
2 column=name:name, timestamp=1652059786181, value=Yasuo
3 column=name:name, timestamp=1652059766646, value=Rambo
3 row(s) in 0.0130 seconds
hbase(main):099:0>
We can also use count directly to get the number of rows:
hbase(main):194:0> scan 'lol'
ROW COLUMN+CELL
1 column=name:cn-name, timestamp=1652065791331, value=\xE6\xB0\xB8\xE6\x81\xA9
1 column=name:fname, timestamp=1652003547121, value=Yone
1 column=name:tech, timestamp=1652059717967, value=Staggered jade cut
1 column=name:title, timestamp=1652003832088, value=Demon Sword Soul
1 column=tech:q, timestamp=1652061296629, value=Staggered jade cut
1 column=tech:w, timestamp=1652065118775, value=Spirit Cleave
2 column=name:cn-name, timestamp=1652065856945, value=\xE4\xBA\x9A\xE7\xB4\xA2
2 column=name:name, timestamp=1652059786181, value=Yasuo
2 column=name:tech, timestamp=1652059643106, value=Chopping Steel Flash
2 column=name:title, timestamp=1652059364721, value=Wind Swordsman
3 column=name:cn-name, timestamp=1652065881601, value=\xE5\x85\xB0\xE5\x8D\x9A
3 column=name:name, timestamp=1652059766646, value=Rambo
3 column=name:title, timestamp=1652059808472, value=Mechanical Enemy
3 column=tech:q, timestamp=1652061445488, value=grilled at high temperature
4 column=name:name, timestamp=1652085807213, value=Foyego
4 row(s) in 0.0120 seconds
hbase(main):195:0> count 'lol'
4 row(s) in 0.0070 seconds
=> 4
hbase(main):196:0>
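Why does keeping only the first cell of each row help with counting? A row count only needs one cell per row, so the remaining cells can be skipped. The sketch below models this in plain Python on an abbreviated copy of the scan output above; the helper names are my own and this is not how HBase implements count internally.

```python
from itertools import groupby

# (row, column) pairs in scan order, abbreviated from the output above.
CELLS = [
    ("1", "name:cn-name"), ("1", "name:fname"), ("1", "tech:q"),
    ("2", "name:cn-name"), ("2", "name:name"),
    ("3", "name:cn-name"), ("3", "name:name"),
    ("4", "name:name"),
]


def first_key_only(cells):
    """Keep the first cell of each row; cells must be in row-key order."""
    return [next(group) for _, group in groupby(cells, key=lambda cell: cell[0])]


def count_rows(cells):
    """One surviving cell per row means the row count is the cell count."""
    return len(first_key_only(cells))


print(first_key_only(CELLS))
print(count_rows(CELLS))
```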
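- Column family and column filters
FamilyFilter: filters on the column family name. For example, to find data whose column family name contains 'te':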
hbase(main):101:0> scan 'lol',FILTER=>"FamilyFilter (=,'substring:te')"
ROW COLUMN+CELL
1 column=tech:q, timestamp=1652061296629, value=Staggered jade cut
3 column=tech:q, timestamp=1652061445488, value=grilled at high temperature
2 row(s) in 0.0270 seconds
hbase(main):102:0>
QualifierFilter: filters on the column name (qualifier). For example, to find data in columns whose name contains 'tech':
hbase(main):104:0> scan 'lol',FILTER=>"QualifierFilter (=,'substring:tech')"
ROW COLUMN+CELL
1 column=name:tech, timestamp=1652059717967, value=Staggered jade cut
2 column=name:tech, timestamp=1652059643106, value=Chopping Steel Flash
2 row(s) in 0.0080 seconds
hbase(main):105:0>
ColumnPrefixFilter: filters on a column name prefix. For example, to find data in columns starting with 'f':
hbase(main):106:0> scan 'lol',FILTER=>"ColumnPrefixFilter('f')"
ROW COLUMN+CELL
1 column=name:fname, timestamp=1652003547121, value=Yone
1 row(s) in 0.0060 seconds
hbase(main):107:0>
MultipleColumnPrefixFilter: filters on several column name prefixes. For example:
hbase(main):107:0> scan 'lol',FILTER=>"MultipleColumnPrefixFilter('na','f')"
ROW COLUMN+CELL
1 column=name:fname, timestamp=1652003547121, value=Yone
2 column=name:name, timestamp=1652059786181, value=Yasuo
3 column=name:name, timestamp=1652059766646, value=Rambo
3 row(s) in 0.0190 seconds
hbase(main):108:0>
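The two prefix filters apply the same test to the column qualifier, with one prefix or with several. A plain-Python sketch under that reading, using the cells of the 'name' column family from the transcripts (the function name is my own, and strings stand in for byte arrays):

```python
# (row, family, qualifier), taken from the scan output earlier in this article.
CELLS = [
    ("1", "name", "fname"), ("1", "name", "tech"), ("1", "name", "title"),
    ("2", "name", "name"), ("2", "name", "tech"), ("2", "name", "title"),
    ("3", "name", "name"), ("3", "name", "title"),
]


def multiple_column_prefix(cells, *prefixes):
    """Keep cells whose qualifier starts with any of the given prefixes."""
    # str.startswith accepts a tuple and returns True if any prefix matches.
    return [cell for cell in cells if cell[2].startswith(prefixes)]


print(multiple_column_prefix(CELLS, "f"))
print(multiple_column_prefix(CELLS, "na", "f"))
```

A single prefix behaves like ColumnPrefixFilter('f'), and the two-prefix call selects the same three cells as the MultipleColumnPrefixFilter scan above.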
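ColumnRangeFilter: filters columns by a range of column names (qualifiers; the column family is not considered). The two booleans say whether the lower and the upper bound are inclusive; with true and false as below, the range is half-open like [STARTROW, ENDROW):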
hbase(main):110:0> scan 'lol'
ROW COLUMN+CELL
1 column=name:fname, timestamp=1652003547121, value=Yone
1 column=name:tech, timestamp=1652059717967, value=Staggered jade cut
1 column=name:title, timestamp=1652003832088, value=Demon Sword Soul
1 column=tech:q, timestamp=1652061296629, value=Staggered jade cut
1 column=tech:w, timestamp=1652065118775, value=Spirit Cleave
2 column=name:name, timestamp=1652059786181, value=Yasuo
2 column=name:tech, timestamp=1652059643106, value=Chopping Steel Flash
2 column=name:title, timestamp=1652059364721, value=Wind Swordsman
3 column=name:name, timestamp=1652059766646, value=Rambo
3 column=name:title, timestamp=1652059808472, value=Mechanical Enemy
3 column=tech:q, timestamp=1652061445488, value=grilled at high temperature
3 row(s) in 0.0120 seconds
hbase(main):111:0> scan 'lol',FILTER=>"ColumnRangeFilter ('na',true,'te',false)"
ROW COLUMN+CELL
1 column=tech:q, timestamp=1652061296629, value=Staggered jade cut
2 column=name:name, timestamp=1652059786181, value=Yasuo
3 column=name:name, timestamp=1652059766646, value=Rambo
3 column=tech:q, timestamp=1652061445488, value=grilled at high temperature
3 row(s) in 0.0530 seconds
hbase(main):112:0> scan 'lol',FILTER=>"ColumnRangeFilter ('na',true,'wa',false)"
ROW COLUMN+CELL
1 column=name:tech, timestamp=1652059717967, value=Staggered jade cut
1 column=name:title, timestamp=1652003832088, value=Demon Sword Soul
1 column=tech:q, timestamp=1652061296629, value=Staggered jade cut
1 column=tech:w, timestamp=1652065118775, value=Spirit Cleave
2 column=name:name, timestamp=1652059786181, value=Yasuo
2 column=name:tech, timestamp=1652059643106, value=Chopping Steel Flash
2 column=name:title, timestamp=1652059364721, value=Wind Swordsman
3 column=name:name, timestamp=1652059766646, value=Rambo
3 column=name:title, timestamp=1652059808472, value=Mechanical Enemy
3 column=tech:q, timestamp=1652061445488, value=grilled at high temperature
3 row(s) in 0.0330 seconds
hbase(main):113:0>
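The results above are easier to follow with the comparison spelled out. The sketch below is a plain-Python model of the range test on the qualifiers of row 1 (strings stand in for byte arrays and the function name is my own). It shows why 'tech' falls outside ['na', 'te'): 'te' is a prefix of 'tech', so 'tech' sorts after 'te'.

```python
def column_range(qualifiers, min_col, min_inclusive, max_col, max_inclusive):
    """Keep qualifiers inside the range, honoring the two inclusive flags."""
    def in_range(qualifier):
        above = qualifier >= min_col if min_inclusive else qualifier > min_col
        below = qualifier <= max_col if max_inclusive else qualifier < max_col
        return above and below
    return [q for q in qualifiers if in_range(q)]


# Qualifiers of row 1 across both column families (name:* and tech:*).
ROW_1 = ["fname", "tech", "title", "q", "w"]

print(column_range(ROW_1, "na", True, "te", False))
print(column_range(ROW_1, "na", True, "wa", False))
# The flags only matter when a qualifier equals a bound.
print(column_range(["na", "te"], "na", True, "te", False))
print(column_range(["na", "te"], "na", True, "te", True))
```

The first two calls reproduce the row 1 results of the two scans above.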
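- Value filters
ValueFilter: filters on the cell value. Beforehand I inserted each champion's Chinese name into table lol. HBase stores values as raw bytes, and by default the shell prints bytes outside printable ASCII as \x hex escapes, so the UTF-8 encoded Chinese shows up like this: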
hbase(main):117:0> scan 'lol'
ROW COLUMN+CELL
1 column=name:cn-name, timestamp=1652065791331, value=\xE6\xB0\xB8\xE6\x81\xA9
1 column=name:fname, timestamp=1652003547121, value=Yone
1 column=name:tech, timestamp=1652059717967, value=Staggered jade cut
1 column=name:title, timestamp=1652003832088, value=Demon Sword Soul
1 column=tech:q, timestamp=1652061296629, value=Staggered jade cut
1 column=tech:w, timestamp=1652065118775, value=Spirit Cleave
2 column=name:cn-name, timestamp=1652065856945, value=\xE4\xBA\x9A\xE7\xB4\xA2
2 column=name:name, timestamp=1652059786181, value=Yasuo
2 column=name:tech, timestamp=1652059643106, value=Chopping Steel Flash
2 column=name:title, timestamp=1652059364721, value=Wind Swordsman
3 column=name:cn-name, timestamp=1652065881601, value=\xE5\x85\xB0\xE5\x8D\x9A
3 column=name:name, timestamp=1652059766646, value=Rambo
3 column=name:title, timestamp=1652059808472, value=Mechanical Enemy
3 column=tech:q, timestamp=1652061445488, value=grilled at high temperature
3 row(s) in 0.0230 seconds
hbase(main):118:0> scan 'lol',FILTER=>"ValueFilter (=,'substring:永恩')"
ROW COLUMN+CELL
1 column=name:cn-name, timestamp=1652065791331, value=\xE6\xB0\xB8\xE6\x81\xA9
1 row(s) in 0.0290 seconds
hbase(main):145:0> scan 'lol',FILTER=>"ValueFilter (=,'substring:ne')"
ROW COLUMN+CELL
1 column=name:fname, timestamp=1652003547121, value=Yone
3 column=name:title, timestamp=1652059808472, value=Mechanical Enemy
2 row(s) in 0.0300 seconds
hbase(main):119:0>
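Tip: displaying Chinese in the HBase shell. Appending the toString formatter to the column name makes the shell decode the bytes as a string: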
hbase(main):144:0> scan 'lol',{COLUMNS => 'name:cn-name:toString'}
ROW COLUMN+CELL
1 column=name:cn-name, timestamp=1652065791331, value=永恩
2 column=name:cn-name, timestamp=1652065856945, value=亚索
3 column=name:cn-name, timestamp=1652065881601, value=兰博
3 row(s) in 0.0210 seconds
hbase(main):145:0>
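The \x escapes are simply the UTF-8 bytes of the Chinese text. The Python sketch below reproduces the escaped form and then decodes it back. The function to_string_binary is my own approximation of the shell's default 'toStringBinary' display; in particular, the exact set of punctuation left unescaped is an assumption.

```python
# Punctuation assumed to be printed as-is (approximation, see above).
PRINTABLE_PUNCTUATION = " `~!@#$%^&*()-_=+[]{}|;:'\",.<>/?"


def to_string_binary(data):
    """Render bytes: ASCII letters, digits, and punctuation as-is, the rest as \\xHH."""
    out = []
    for byte in data:
        ch = chr(byte)
        if (ch.isascii() and ch.isalnum()) or ch in PRINTABLE_PUNCTUATION:
            out.append(ch)
        else:
            out.append(f"\\x{byte:02X}")
    return "".join(out)


raw = "永恩".encode("utf-8")      # the bytes stored in the cell
print(to_string_binary(raw))      # escaped form, as in the default scan output
print(raw.decode("utf-8"))        # decoded form, as with the toString formatter
print(to_string_binary(b"Yone"))  # plain ASCII is shown unchanged
```

This also explains why ASCII values such as Yone display normally while the Chinese names do not.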
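Conclusion
This article only covers very basic syntax. HBase has many more features not shown here, such as importing data, which I look forward to learning next.
Thanks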