Hive: resolving a hierarchy path (merging multiple rows into one record) with concat_ws and collect_set

Post author: xfxia
Post published: August 26, 2023
Post category: Other

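Goal: turn a comma-separated list of level ids into a readable hierarchy path, for example: Complaint ticket > Communication quality > Unable to use data services > Cannot get online / connection drops > Fixed network > Broadband fault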

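-- split path into one row per id, look up each id's name, then join the names with '>'
select a.row_id, concat_ws('>', collect_set(b.code_name)) code_name
from (select row_id, id_list
      from open_038_dim.dim_ivr_path LATERAL VIEW explode(split(path, ',')) aa as id_list) a
-- self-join: each exploded id is matched to the row that carries its code_name
left join open_038_dim.dim_ivr_path b on a.id_list = b.row_id
group by a.row_id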

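Here collect_set gathers the code_name values of each group into an array and removes duplicates; concat_ws then joins that array with '>'. Note that collect_set does not guarantee the order of its elements.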
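Since the element order is not guaranteed, the levels of the assembled path may not follow the order of the ids in path. One possible order-preserving variant is sketched below. It assumes posexplode is available (Hive 0.13 or later), that a path holds fewer than 100000 ids, and that no code_name contains '>' followed by five digits and a colon. The pos alias and the zero-padded prefix are illustrative devices, not part of the original query.

```sql
-- Sketch: keep the levels in the same order as the ids listed in path.
-- posexplode returns each id together with its position (pos) in the list.
select a.row_id,
       regexp_replace(
         concat_ws('>',
           sort_array(
             -- prefix each name with its zero-padded position so that
             -- sorting the strings restores the original path order
             collect_list(concat(lpad(cast(a.pos as string), 5, '0'), ':', b.code_name)))),
         '(^|>)[0-9]{5}:', '$1') code_name   -- strip the position prefixes again
from (select row_id, pos, id_list
      from open_038_dim.dim_ivr_path
      LATERAL VIEW posexplode(split(path, ',')) aa as pos, id_list) a
left join open_038_dim.dim_ivr_path b on a.id_list = b.row_id
group by a.row_id
```

Unlike collect_set, collect_list keeps duplicates. That makes no difference here, because the position prefix already makes every element distinct.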

Key points:

I. The concat() function joins one or more strings

CONCAT(str1,str2,…) returns the string produced by concatenating its arguments. If any argument is NULL, the result is NULL.

select concat('11','22','33');    result: 112233
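The NULL rule is easy to overlook when one of the inputs is a nullable column. A small illustration with arbitrary literals, using the standard coalesce function to substitute an empty string:

```sql
-- one NULL argument makes the whole result NULL
select concat('11', NULL, '33');                 -- NULL
-- replacing NULL with '' keeps the remaining parts
select concat('11', coalesce(NULL, ''), '33');   -- 1133
```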

II. CONCAT_WS(separator,str1,str2,…)

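CONCAT_WS() (concatenate with separator) is a special form of CONCAT(). The first argument is the separator for the remaining arguments and is placed between each pair of strings being joined. The separator can be a string literal or any other string expression. In Hive, the strings may also be passed as a single array<string>, which is how the output of collect_set is joined in the query above.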

select concat_ws(',','11','22','33');    result: 11,22,33
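A brief illustration of two behaviours that matter when concat_ws is combined with aggregates. The literals are arbitrary examples; the first statement uses Hive's array constructor, the second shows the NULL handling documented for MySQL:

```sql
-- Hive: concat_ws also accepts an array<string> after the separator
select concat_ws(',', array('11','22','33'));   -- 11,22,33

-- MySQL: NULL arguments after the separator are skipped,
-- unlike concat(), where a single NULL makes the result NULL
select concat_ws(',', '11', NULL, '33');        -- 11,33
```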

III. group_concat(): grouped concatenation (a MySQL function)

group_concat([DISTINCT] column_to_concatenate [ORDER BY sort_column ASC/DESC] [SEPARATOR 'separator'])

Apply group_concat() to the following data (table aa):

| id | name |
| 1 | 10 |
| 1 | 20 |
| 1 | 20 |
| 2 | 20 |
| 3 | 200 |
| 3 | 500 |

1. select id,group_concat(name) from aa group by id;

| 1 | 10,20,20 |
| 2 | 20 |
| 3 | 200,500 |

2. select id,group_concat(name separator ';') from aa group by id;

| 1 | 10;20;20 |
| 2 | 20 |
| 3 | 200;500 |

3. select id,group_concat(name order by name desc) from aa group by id;

| 1 | 20,20,10 |
| 2 | 20 |
| 3 | 500,200 |

4. select id,group_concat(distinct name) from aa group by id;

| 1 | 10,20 |
| 2 | 20 |
| 3 | 200,500 |
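Hive has no built-in group_concat(); the usual substitute is concat_ws over collect_list or collect_set, as in the query at the top of this post. A rough sketch against the same table aa follows. It assumes name is stored as a numeric column, hence the cast to string, since concat_ws expects strings. The order of the values inside each group is not guaranteed.

```sql
-- roughly group_concat(name): duplicates are kept
select id, concat_ws(',', collect_list(cast(name as string))) from aa group by id;

-- roughly group_concat(distinct name): duplicates are removed
select id, concat_ws(',', collect_set(cast(name as string))) from aa group by id;

-- roughly group_concat(name separator ';'): only the separator changes
select id, concat_ws(';', collect_list(cast(name as string))) from aa group by id;
```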

Original article: https://blog.csdn.net/LH0912666/article/details/81024275
