一、题目
Product 表:
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| product_id | int |
| product_name | varchar |
+---------------+---------+
product_id 是这张表的主键。
product_name 是产品的名称。
Sales 表:
+---------------------+---------+
| Column Name | Type |
+---------------------+---------+
| product_id | int |
| period_start | date |
| period_end | date |
| average_daily_sales | int |
+---------------------+---------+
product_id 是这张表的主键。
period_start 和 period_end 是该产品销售期的起始日期和结束日期,且这两个日期包含在销售期内。
average_daily_sales 列存储销售期内该产品的日平均销售额。
编写一段 SQL 查询每个产品每年的总销售额,并包含 product_id, product_name 以及 report_year 等信息。
销售年份的日期介于 2018 年到 2020 年之间。你返回的结果需要按 product_id 和 report_year
排序
。
查询结果格式如下例所示。
示例 1:
输入:
Product table:
+------------+--------------+
| product_id | product_name |
+------------+--------------+
| 1 | LC Phone |
| 2 | LC T-Shirt |
| 3 | LC Keychain |
+------------+--------------+
Sales table:
+------------+--------------+-------------+---------------------+
| product_id | period_start | period_end | average_daily_sales |
+------------+--------------+-------------+---------------------+
| 1 | 2019-01-25 | 2019-02-28 | 100 |
| 2 | 2018-12-01 | 2020-01-01 | 10 |
| 3 | 2019-12-01 | 2020-01-31 | 1 |
+------------+--------------+-------------+---------------------+
输出:
+------------+--------------+-------------+--------------+
| product_id | product_name | report_year | total_amount |
+------------+--------------+-------------+--------------+
| 1 | LC Phone | 2019 | 3500 |
| 2 | LC T-Shirt | 2018 | 310 |
| 2 | LC T-Shirt | 2019 | 3650 |
| 2 | LC T-Shirt | 2020 | 10 |
| 3 | LC Keychain | 2019 | 31 |
| 3 | LC Keychain | 2020 | 31 |
+------------+--------------+-------------+--------------+
解释:
LC Phone 在 2019-01-25 至 2019-02-28 期间销售,该产品销售时间总计35天。销售总额 35*100 = 3500。
LC T-shirt 在 2018-12-01 至 2020-01-01 期间销售,该产品在2018年、2019年、2020年的销售时间分别是31天、365天、1天,2018年、2019年、2020年的销售总额分别是31*10=310、365*10=3650、1*10=10。
LC Keychain 在 2019-12-01 至 2020-01-31 期间销售,该产品在2019年、2020年的销售时间分别是:31天、31天,2019年、2020年的销售总额分别是31*1=31、31*1=31。
二、解决
1、分类讨论汇总
思路:
笨办法,具体略。
代码-版本1:
select
t.product_id
, p.product_name,report_year
, sum(average_daily_sales*(total_days+1)) as total_amount
from product p
join
(
select
product_id
, average_daily_sales
, '2018' as report_year
, case
when period_start<='2018-01-01' and period_end>'2018-12-31'
then datediff('2018-12-31','2018-01-01')
when period_start<='2018-01-01' and period_end>='2018-01-01'and period_end<='2018-12-31'
then datediff(period_end,'2018-01-01')
when period_start>'2018-01-01' and period_start<='2018-12-31'and period_end>'2018-12-31'
then datediff('2018-12-31',period_start)
when period_start>'2018-01-01' and period_start<='2018-12-31' and period_end<'2018-12-31'
then datediff(period_end,period_start)
end as total_days
from sales
union all
select
product_id
, average_daily_sales
, '2019' as report_year
, case
when period_start<='2019-01-01' and period_end>'2019-12-31'
then datediff('2019-12-31','2019-01-01')
when period_start<='2019-01-01' and period_end>='2019-01-01'and period_end<='2019-12-31'
then datediff(period_end,'2019-01-01')
when period_start>'2019-01-01' and period_start<='2019-12-31'and period_end>'2019-12-31'
then datediff('2019-12-31',period_start)
when period_start>'2019-01-01' and period_start<='2019-12-31' and period_end<'2019-12-31'
then datediff(period_end,period_start) end as total_days
from sales
union all
select
product_id
, average_daily_sales
, '2020' as report_year
, case
when period_start<='2020-01-01' and period_end>'2020-12-31'
then datediff('2020-12-31','2020-01-01')
when period_start<='2020-01-01' and period_end>='2020-01-01'and period_end<='2020-12-31'
then datediff(period_end,'2020-01-01')
when period_start>'2020-01-01' and period_start<='2020-12-31'and period_end>'2020-12-31'
then datediff('2020-12-31',period_start)
when period_start>'2020-01-01' and period_start<='2020-12-31' and period_end<'2020-12-31'
then datediff(period_end,period_start)
end as total_days
from sales
) t
on t.product_id = p.product_id
where total_days is not null
group by 1, 3
order by 1, 3
代码-版本2:
select
t.product_id
, product_name
, report_year
, sum(total_amount) as total_amount
from
(
select
product_id
, "2020" as report_year
,
( datediff
(
if(period_end < "2021-01-01", period_end, date("2020-12-31"))
, if(period_start > "2020-01-01", period_start, date("2020-01-01"))
) + 1
)*average_daily_sales as total_amount
from Sales
having total_amount > 0
union all
select
product_id
, "2019" as report_year
,
( datediff
(
if(period_end < "2020-01-01", period_end, date("2019-12-31"))
, if(period_start > "2019-01-01", period_start, date("2019-01-01"))
) + 1
)*average_daily_sales as total_amount
from Sales
having total_amount > 0
union all
select
product_id
, "2018" as report_year
,
( datediff
(
if(period_end < "2019-01-01", period_end, date("2018-12-31"))
, if(period_start > "2018-01-01", period_start, date("2018-01-01"))
) + 1
)*average_daily_sales as total_amount
from Sales
having total_amount > 0
) t
left join product p on p.product_id = t.product_id
group by product_id, report_year
order by product_id, report_year
2、通用版:分离年份再汇总
思路-版本1:
select
n
from y
# Result:
{"headers": ["n"], "values": [
[2018],
[2019],
[2020],
[2021]
]}
代码-版本1:
with recursive y(n) as (
select
min(year(period_start))
from sales
union all
select n+1
from y
where n <= (select max(year(period_end)) from sales)
)
select
distinct cast(p.product_id as char) product_id
, p.product_name as product_name
, cast(n as char) as report_year
,
if (n = year(period_start) and n = year(period_end)
, (datediff(period_end, period_start) + 1)
,
if (n = year(period_start)
, (datediff(concat(n, '-12-31'), period_start) + 1)
,
if (n = year(period_end)
, (datediff(period_end, concat(n,'-01-01')) + 1)
, 365
)
)
) * average_daily_sales as total_amount
from product p
inner join y
left join sales s on p.product_id = s.product_id
where year(period_start) <= n and year(period_end) >= n
order by product_id, report_year
思路-版本2:
select
s.product_id
, year(s.period_start)
, year(s.period_end)
, y.year report_year
, s.average_daily_sales
* ( if(year(s.period_end) > y.year, y.days_of_year, dayofyear(s.period_end))
- if(year(s.period_start) < y.year, 1, dayofyear(s.period_start)) + 1
) total_amount
from Sales s
inner join (
select '2018' year, 365 days_of_year
union all select '2019' year, 365 days_of_year
union all select '2020' year, 366 days_of_year
) y on year(s.period_start) <= y.year and year(s.period_end) >= y.year
order by s.product_id, y.year;
# Result:
{"headers": [
"product_id",
"year(s.period_start)",
"year(s.period_end)",
"report_year", "total_amount"],
"values": [
["1", 2019, 2019, "2019", 3500],
["2", 2018, 2020, "2018", 310],
["2", 2018, 2020, "2019", 3650],
["2", 2018, 2020, "2020", 10],
["3", 2019, 2020, "2019", 31],
["3", 2019, 2020, "2020", 31]
]}
if(year(s.period_end) > y.year, y.days_of_year, dayofyear(s.period_end))
- if(year(s.period_start) < y.year, 1, dayofyear(s.period_start)) + 1
(1)if(year(s.period_end) > y.year, y.days_of_year, dayofyear(s.period_end)) 解释
若 year(s.period_end) > y.year 成立 --> y.year不是结束年,则 y.days_of_year
否则(y.year是结束年) dayofyear(s.period_end)
(2)if(year(s.period_start) < y.year, 1, dayofyear(s.period_start))解释
若 year(s.period_start) < y.year 成立 --> y.year不是起始年,则 1
否则(y.year是起始年) dayofyear(s.period_start)
上面语句可能有四种情况:
(1)y.year是结束年,也是起始年
res = dayofyear(s.period_end) - dayofyear(s.period_start) + 1
(2)y.year是结束年,不是起始年
res = dayofyear(s.period_end) - 1 + 1
(3)y.year不是结束年,是起始年
res = y.days_of_year - dayofyear(s.period_start) + 1
(4)y.year不是结束年,不是起始年
res = dayofyear(s.period_end) - 1 + 1
代码-版本2:
select
s.product_id
, p.product_name
, y.year report_year
, s.average_daily_sales
* ( if(year(s.period_end) > y.year, y.days_of_year, dayofyear(s.period_end))
- if(year(s.period_start) < y.year, 1, dayofyear(s.period_start)) + 1
) total_amount
from Sales s
inner join (
select '2018' year, 365 days_of_year
union all select '2019' year, 365 days_of_year
union all select '2020' year, 366 days_of_year
) y on year(s.period_start) <= y.year and year(s.period_end) >= y.year
inner join Product p on p.product_id = s.product_id
order by s.product_id, y.year;
2、按天展开,分组累加
思路:
- 递归生成将最大天数差的自然数;
- 将这些时间段安天展开但是只去天的年份;
- 然后按照product_id,report_year分组求字和。
select
s.product_id
, DATE_ADD(s.period_start, INTERVAL t.n DAY) as Date
from Sales s
join t on datediff(s.period_end, s.period_start)>=t.n
order by 1, 2
# Result:
{"headers":
["product_id", "Date"], "values": [
["1", "2019-01-25"], ["1", "2019-01-26"], ["1", "2019-01-27"], ["1", "2019-01-28"],
["1", "2019-01-29"], ["1", "2019-01-30"], ["1", "2019-01-31"], ["1", "2019-02-01"],
["1", "2019-02-02"], ["1", "2019-02-03"], ["1", "2019-02-04"], ["1", "2019-02-05"],
["1", "2019-02-06"], ["1", "2019-02-07"], ["1", "2019-02-08"], ["1", "2019-02-09"],
["1", "2019-02-10"], ["1", "2019-02-11"], ["1", "2019-02-12"], ["1", "2019-02-13"],
["1", "2019-02-14"], ["1", "2019-02-15"], ["1", "2019-02-16"], ["1", "2019-02-17"],
["1", "2019-02-18"], ["1", "2019-02-19"], ["1", "2019-02-20"], ["1", "2019-02-21"],
["1", "2019-02-22"], ["1", "2019-02-23"], ["1", "2019-02-24"], ["1", "2019-02-25"],
["1", "2019-02-26"], ["1", "2019-02-27"], ["1", "2019-02-28"], ["2", "2018-12-01"],
["2", "2018-12-02"], ["2", "2018-12-03"], ["2", "2018-12-04"], ["2", "2018-12-05"],
["2", "2018-12-06"], ["2", "2018-12-07"], ["2", "2018-12-08"], ["2", "2018-12-09"],
["2", "2018-12-10"], ["2", "2...
代码:
with recursive t(n) as (
select 0
union all
select n+1 from t where n<(select max(datediff(period_end, period_start)) from Sales)
)
select
s.product_id,
p.product_name,
DATE_FORMAT(DATE_ADD(s.period_start, INTERVAL t.n DAY), '%Y') as report_year,
sum(s.average_daily_sales) as total_amount
from Sales s
join t on datediff(s.period_end, s.period_start)>=t.n
join Product p on s.product_id=p.product_id
group by 1, 2, 3
order by 1, 3
三、参考
1、
9行代码,非union方法
2、
with recursive,按天展开,按年group
3、
1384. 按年度列出销售总额
4、
过期少女 ⚡、jzTD
版权声明:本文为HeavenDan原创文章,遵循 CC 4.0 BY-SA 版权协议,转载请附上原文出处链接和本声明。