Miscellaneous functions
-
Calling a Java method: java_method
Syntax: java_method(class, method[, arg1[, arg2…]])
Return type: varies
Description: Calls a Java method to process data. java_method is a synonym for reflect.
hive> select java_method("java.net.URLEncoder", "encode", 'http://www.baidu.com', "UTF-8") from iteblog;
OK
http%3A%2F%2Fwww.baidu.com
-- This query calls the encode method of java.net.URLEncoder with the arguments 'http://www.baidu.com' and "UTF-8".
-
Calling a Java method: reflect
Syntax: reflect(class, method[, arg1[, arg2…]])
Return type: varies
Description: Calls a Java method to process data.
hive> select reflect("java.net.URLDecoder", "decode", 'http%3A%2F%2Fwww.baidu.com', "UTF-8") from iteblog;
OK
http://www.baidu.com
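For reference, the two queries above dispatch to ordinary JDK methods. The following is a minimal sketch of the same calls made directly in Java; the class name UrlCodecSketch and its helper methods are hypothetical and only wrap the JDK calls named in the queries.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical wrapper class; only URLEncoder.encode and URLDecoder.decode
// come from the queries in this section.
public class UrlCodecSketch {

    // Equivalent of java_method("java.net.URLEncoder", "encode", s, "UTF-8").
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    // Equivalent of reflect("java.net.URLDecoder", "decode", s, "UTF-8").
    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String encoded = encode("http://www.baidu.com");
        System.out.println(encoded);          // http%3A%2F%2Fwww.baidu.com
        System.out.println(decode(encoded));  // http://www.baidu.com
    }
}
```

Checking the method in plain Java first is a cheap way to confirm the argument types and order before passing the class and method names to Hive as strings.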
-
Hash value: hash
Syntax: hash(a1[, a2…])
Return type: int
Description: Returns a hash value of the arguments (a single string in the example below).
hive> select hash('www.baidu.com') from iteblog;
OK
270263191
XPath functions for parsing XML
XPath reference:
http://baike.baidu.com/view/307399.htm
Official Hive documentation for the XPath functions:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+XPathUDF
-
xpath
Syntax: xpath(string xmlstr, string xpath_expression)
Return type: array<string>
Description: Returns an array of the values in the XML string that match the XPath expression.
// Get the text of the a/b nodes in the XML string
hive> select xpath('<a><b>b1</b><b>b2</b><c>c1</c></a>', 'a/b/text()') from iteblog;
OK
["b1","b2"]
// Get the values of all attributes named id in the XML string
hive> select xpath('<a><b id="foo">b1</b><b id="bar">b2</b></a>', '//@id') from iteblog;
OK
["foo","bar"]
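The node-set behaviour above can be reproduced with the JDK's own javax.xml.xpath package. This is a sketch under the assumption that a plain JDK XPath evaluation is a fair stand-in for what the query does; it is not presented as Hive's implementation, and the class XPathArraySketch is hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper that mimics xpath(xmlstr, xpath_expression).
public class XPathArraySketch {

    static List<String> xpath(String xml, String expr) {
        try {
            XPath xp = XPathFactory.newInstance().newXPath();
            // Ask for a node set so that every match is returned, not just the first.
            NodeList nodes = (NodeList) xp.evaluate(
                    expr, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                // getNodeValue() yields the text of text nodes and the value of
                // attribute nodes; for element nodes it is null, which is why the
                // first expression below ends in text().
                out.add(nodes.item(i).getNodeValue());
            }
            return out;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(xpath("<a><b>b1</b><b>b2</b><c>c1</c></a>", "a/b/text()"));
        System.out.println(xpath("<a><b id=\"foo\">b1</b><b id=\"bar\">b2</b></a>", "//@id"));
    }
}
```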
-
xpath_string
Syntax: xpath_string(string xmlstr, string xpath_expression)
Return type: string
Description: By default, returns the value of the first node in the XML string that matches the XPath expression.
hive> SELECT xpath_string('<a><b>b1</b><b>b2</b></a>', '//b') FROM iteblog;
OK
b1
// Select which matching node to return
hive> SELECT xpath_string('<a><b>b1</b><b>b2</b></a>', '//b[2]') FROM iteblog;
OK
b2
-
xpath_boolean
Syntax: xpath_boolean(string xmlstr, string xpath_expression)
Return type: boolean
Description: Returns true if the XPath expression matches a node in the XML string or evaluates to true, and false otherwise.
hive> SELECT xpath_boolean('<a><b>b</b></a>', 'a/b') FROM iteblog;
OK
true
hive> SELECT xpath_boolean('<a><b>10</b></a>', 'a/b < 10') FROM iteblog;
OK
false
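In standard XPath, the string and boolean results above correspond to requesting a different return type for the same kind of expression. The sketch below shows this with the JDK's javax.xml.xpath API; the class XPathScalarSketch and its method names are hypothetical, and the code illustrates XPath semantics rather than Hive internals.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helpers that mimic xpath_string and xpath_boolean.
public class XPathScalarSketch {

    private static Object eval(String xml, String expr, QName type) {
        try {
            return XPathFactory.newInstance().newXPath()
                    .evaluate(expr, new InputSource(new StringReader(xml)), type);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // STRING conversion of a node set takes the first node in document order.
    static String xpathString(String xml, String expr) {
        return (String) eval(xml, expr, XPathConstants.STRING);
    }

    // BOOLEAN conversion: a non-empty node set is true; comparisons are evaluated.
    static boolean xpathBoolean(String xml, String expr) {
        return (Boolean) eval(xml, expr, XPathConstants.BOOLEAN);
    }

    public static void main(String[] args) {
        System.out.println(xpathString("<a><b>b1</b><b>b2</b></a>", "//b"));     // b1
        System.out.println(xpathString("<a><b>b1</b><b>b2</b></a>", "//b[2]"));  // b2
        System.out.println(xpathBoolean("<a><b>b</b></a>", "a/b"));              // true
        System.out.println(xpathBoolean("<a><b>10</b></a>", "a/b < 10"));        // false
    }
}
```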
-
xpath_short, xpath_int, xpath_long
Syntax: xpath_short(string xmlstr, string xpath_expression)
xpath_int(string xmlstr, string xpath_expression)
xpath_long(string xmlstr, string xpath_expression)
Return type: integer (short, int, or long, depending on the function)
Description: Returns the integer value obtained by evaluating the XPath expression against the XML string. Returns 0 if nothing matches or if the matched value is not numeric; a fractional result is truncated.
hive> SELECT xpath_int('<a>this is not a number</a>', 'a') FROM iteblog;
OK
0
hive> SELECT xpath_int('<a><b class="odd">1</b><b class="even">2</b><b class="odd">4</b><c>8</c></a>', 'sum(a/*)') FROM iteblog;
OK
15
hive> select xpath_long('<a><b>10.5</b><c>11.2</c></a>', 'sum(a/*)') from iteblog;
OK
21
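The 0 and 21 results above are consistent with evaluating the expression as an XPath number (a double) and then narrowing it to an integer. The sketch below demonstrates that reading with the JDK's javax.xml.xpath API; it is an assumption about the mechanism, not a copy of Hive's code, and the class XPathNumberSketch is hypothetical.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helpers that mimic xpath_double, xpath_int and xpath_long.
public class XPathNumberSketch {

    // Evaluate the expression as an XPath number; non-numeric text becomes NaN.
    static double xpathDouble(String xml, String expr) {
        try {
            return (Double) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, new InputSource(new StringReader(xml)), XPathConstants.NUMBER);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Narrowing a double to int drops the fraction and maps NaN to 0 in Java.
    static int xpathInt(String xml, String expr) {
        return (int) xpathDouble(xml, expr);
    }

    static long xpathLong(String xml, String expr) {
        return (long) xpathDouble(xml, expr);
    }

    public static void main(String[] args) {
        System.out.println(xpathInt("<a>this is not a number</a>", "a"));             // 0
        System.out.println(xpathInt("<a><b>1</b><b>2</b><b>4</b><c>8</c></a>", "sum(a/*)")); // 15
        System.out.println(xpathLong("<a><b>10.5</b><c>11.2</c></a>", "sum(a/*)"));   // 21
    }
}
```

Under this reading, a query that needs the fractional part should use one of the floating-point variants instead of the integer ones.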
-
xpath_float, xpath_double, xpath_number
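Syntax: xpath_float(string xmlstr, string xpath_expression)
xpath_double(string xmlstr, string xpath_expression)
xpath_number(string xmlstr, string xpath_expression)
Return type: floating-point number (float for xpath_float; double for xpath_double and xpath_number)
Description: Returns the floating-point value obtained by evaluating the XPath expression against the XML string, or 0 if nothing matches.
hive> select xpath_double('<a><b>10.5</b><c>11.2</c></a>', 'sum(a/*)') from iteblog;
OK
21.7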