SQL基础之聚合和排序


本文主要是阅读《SQL 基础教程》这本书总结而来,有需求的话可以阅读原著。

SQL基础之查询语句


1. 对表进行聚合查询

1.1 聚合函数

  • 所谓的聚合函数,就是将多行汇总为一行
  • count ==> 计算表中记录行数
  • sum ==> 任意列中数据的合计值
  • avg ==> 任意列中数据的平均值
  • max ==> 任意列中数据的最大值
  • min ==> 任意列中数据的最小值

1.2 count 函数

  • 对于count函数来说,参数列表不同计算的结果也会发生变化
  • count(*)会得到包含null的数据行数
  • count(<列名>)会得到null之外的数据行数
# 新的表格数据
mysql> select * from Shohin;
+-----------+------------+---------------+--------------+--------------+------------+
| shohin_id | shohin_mei | shohin_bunrui | hanbai_tanka | shiire_tanka | torokubi   |
+-----------+------------+---------------+--------------+--------------+------------+
| 0001      | T恤衫    | 衣服        |         1000 |          500 | 2009-09-20 |
| 0002      | 打孔器  | 办公用品  |          500 |          320 | 2009-09-11 |
| 0003      | 运动T恤 | 衣服        |         4000 |         2800 | NULL       |
| 0004      | 菜刀     | 厨房用具  |         3000 |         2800 | 2009-09-20 |
| 0005      | 高压锅  | 厨房用具  |         6800 |         5000 | 2009-09-15 |
| 0006      | 叉子     | 厨房用具  |          500 |         NULL | 2009-09-20 |
| 0007      | 切菜板  | 厨房用具  |          880 |          790 | 2008-09-20 |
| 0008      | 圆珠笔  | 办公用品  |          100 |         NULL | 2009-11-11 |
+-----------+------------+---------------+--------------+--------------+------------+
8 rows in set (0.00 sec)
# 计算全部数据的行数
mysql> select count(*) from Shohin;
+----------+
| count(*) |
+----------+
|        8 |
+----------+
1 row in set (0.01 sec)

# 计算null之外的数据行数
mysql> select count(shiire_tanka) from Shohin;
+---------------------+
| count(shiire_tanka) |
+---------------------+
|                   6 |
+---------------------+
1 row in set (0.00 sec)

1.3 sum 函数

  • 聚合函数会将null排除在外,只有count(*)函数列外
# 计算销售单价的合计值
mysql> select sum(hanbai_tanka) from Shohin;
+-------------------+
| sum(hanbai_tanka) |
+-------------------+
|             16780 |
+-------------------+
1 row in set (0.00 sec)

# 计算销售单价和进货单价的合计值
mysql> select sum(hanbai_tanka), sum(shiire_tanka) from Shohin;
+-------------------+-------------------+
| sum(hanbai_tanka) | sum(shiire_tanka) |
+-------------------+-------------------+
|             16780 |             12210 |
+-------------------+-------------------+
1 row in set (0.00 sec)

1.4 avg 函数

# 计算销售单价的平均值
mysql> select avg(hanbai_tanka) from Shohin;
+-------------------+
| avg(hanbai_tanka) |
+-------------------+
|         2097.5000 |
+-------------------+
1 row in set (0.00 sec)

# 计算销售单价和进货单价的平均值
mysql> select avg(hanbai_tanka), avg(shiire_tanka) from Shohin;
+-------------------+-------------------+
| avg(hanbai_tanka) | avg(shiire_tanka) |
+-------------------+-------------------+
|         2097.5000 |         2035.0000 |
+-------------------+-------------------+
1 row in set (0.00 sec)

1.5 max 函数和 min 函数

  • max函数和min函数几乎适用于所有数据类型的列
  • sum函数和avg函数只适应与数值类型的列
# 数值值类型
mysql> select max(hanbai_tanka), min(shiire_tanka) from Shohin;
+-------------------+-------------------+
| max(hanbai_tanka) | min(shiire_tanka) |
+-------------------+-------------------+
|              6800 |               320 |
+-------------------+-------------------+
1 row in set (0.00 sec)

# 日期类型
mysql> select max(torokubi), min(torokubi) from Shohin;
+---------------+---------------+
| max(torokubi) | min(torokubi) |
+---------------+---------------+
| 2009-11-11    | 2008-09-20    |
+---------------+---------------+
1 row in set (0.00 sec)

1.6 聚合函数去重

  • 这里同样需要使用到,distinct关键字
# shohin_bunrui列的总行数
mysql> select shohin_bunrui from Shohin;
+---------------+
| shohin_bunrui |
+---------------+
| 衣服        |
| 办公用品  |
| 衣服        |
| 厨房用具  |
| 厨房用具  |
| 厨房用具  |
| 厨房用具  |
| 办公用品  |
+---------------+
8 rows in set (0.00 sec)

# shohin_bunrui列的去重行数
mysql> select distinct shohin_bunrui from Shohin;
+---------------+
| shohin_bunrui |
+---------------+
| 衣服        |
| 办公用品  |
| 厨房用具  |
+---------------+
3 rows in set (0.00 sec)

# shohin_bunrui列的总行数的统计
# 几乎没有什么明显的效果
mysql> select distinct count(shohin_bunrui) from Shohin;
+----------------------+
| count(shohin_bunrui) |
+----------------------+
|                    8 |
+----------------------+
1 row in set (0.00 sec)

# shohin_bunrui列的去重行数
mysql> select count(distinct shohin_bunrui) from Shohin;
+-------------------------------+
| count(distinct shohin_bunrui) |
+-------------------------------+
|                             3 |
+-------------------------------+
1 row in set (0.00 sec)

2. 对表进行分组

2.1 group by 子句

  • 注意事项

    • 不使用group by子句,是将表中的所以数据作为一组来对待处理
    • 使用group by子句,会将表中的数据分为多个组进行处理
    • 暂定的子句执行顺序
      • select ==> from ==> where ==> group by
  • 语法格式

mysql> select <列名1>, <列名2>, ...
mysql> from <表名>
mysql> group by <列名1>, <列名2>, ...;
  • 示例演示
# 按照商品类型进行数据行数统计
mysql> select shohin_bunrui, count(*) from Shohin group by shohin_bunrui;
+---------------+----------+
| shohin_bunrui | count(*) |
+---------------+----------+
| 衣服        |        2 |
| 办公用品  |        2 |
| 厨房用具  |        4 |
+---------------+----------+
3 rows in set (0.00 sec)

# 错误使用输出
mysql> select shohin_bunrui, count(*) from Shohin;
+---------------+----------+
| shohin_bunrui | count(*) |
+---------------+----------+
| 衣服        |        8 |
+---------------+----------+
1 row in set (0.00 sec)

2.2 聚合键中包含 null 的情况

# 按照进货单价进行数据统计
mysql> select shiire_tanka, count(*) from Shohin group by shiire_tanka;
+--------------+----------+
| shiire_tanka | count(*) |
+--------------+----------+
|         NULL |        2 |
|          320 |        1 |
|          500 |        1 |
|          790 |        1 |
|         2800 |        2 |
|         5000 |        1 |
+--------------+----------+
6 rows in set (0.00 sec)

2.3 where 子句 group by 执行结果

  • 注意事项

    • group bywhere并用的时候select语句的执行顺序
      • from==>where ==> group by ==> select
  • 语法格式

mysql> select <列名1>, <列名2>, ...
mysql> from <表名>
mysql> where
mysql> group by <列名1>, <列名2>, ...;
  • 示例演示
# 好好体会吧
mysql> select shiire_tanka, count(*)
    -> from Shohin
    -> where shohin_bunrui = '衣服'
    -> group by shiire_tanka;
+--------------+----------+
| shiire_tanka | count(*) |
+--------------+----------+
|          500 |        1 |
|         2800 |        1 |
+--------------+----------+
2 rows in set (0.00 sec)

2.4 聚合函数和 group by 子句常见问题

错误 1——在select子句中书写了多余的列

  • 使用group by子句时,select子句中不能出现聚合键之外的列名
mysql> select * from Shohin;
+-----------+------------+---------------+--------------+--------------+------------+
| shohin_id | shohin_mei | shohin_bunrui | hanbai_tanka | shiire_tanka | torokubi   |
+-----------+------------+---------------+--------------+--------------+------------+
| 0001      | T恤衫    | 衣服        |         1000 |          500 | 2009-09-20 |
| 0002      | 打孔器  | 办公用品  |          500 |          320 | 2009-09-11 |
| 0003      | 运动T恤 | 衣服        |         4000 |         2800 | NULL       |
| 0004      | 菜刀     | 厨房用具  |         3000 |         2800 | 2009-09-20 |
| 0005      | 高压锅  | 厨房用具  |         6800 |         5000 | 2009-09-15 |
| 0006      | 叉子     | 厨房用具  |          500 |         NULL | 2009-09-20 |
| 0007      | 切菜板  | 厨房用具  |          880 |          790 | 2008-09-20 |
| 0008      | 圆珠笔  | 办公用品  |          100 |         NULL | 2009-11-11 |
+-----------+------------+---------------+--------------+--------------+------------+
8 rows in set (0.00 sec)

# 我们会发现,执行语句和我们想要得到的结果不同,这是因为聚合建和商品名并不一定是一对一的
# 使用shiire_tanka作为分割时,count(*)只会输出一行,但是满足shiire_tanka的shohin_mei却有很多,错误就出现了
mysql> select shohin_mei, shiire_tanka, count(*)
    -> from Shohin
    -> group by shiire_tanka;
+------------+--------------+----------+
| shohin_mei | shiire_tanka | count(*) |
+------------+--------------+----------+
| 叉子     |         NULL |        2 |
| 打孔器  |          320 |        1 |
| T恤衫    |          500 |        1 |
| 切菜板  |          790 |        1 |
| 运动T恤 |         2800 |        2 |
| 高压锅  |         5000 |        1 |
+------------+--------------+----------+
6 rows in set (0.00 sec)

错误 2——在group by子句中写了列的别名

  • group by子句中不能使用 select 子句中定义的别名
# group by子句中使用列的别名会引发错误
# 在PostgreSQL和mysql中可以使正常使用
mysql> select shohin_bunrui as sb, count(*)
    -> from Shohin
    -> group by sb;
+--------------+----------+
| sb           | count(*) |
+--------------+----------+
| 衣服       |        2 |
| 办公用品 |        2 |
| 厨房用具 |        4 |
+--------------+----------+
3 rows in set (0.00 sec)

mysql> select shohin_bunrui as sb, count(*)
    -> from Shohin
    -> group by shohin_bunrui;
+--------------+----------+
| sb           | count(*) |
+--------------+----------+
| 衣服       |        2 |
| 办公用品 |        2 |
| 厨房用具 |        4 |
+--------------+----------+
3 rows in set (0.00 sec)

错误 3——group by子句的输出结果能够排序吗?

  • 答案就是,不可用,为随机的无序输出

错误 4——在where子句中使用聚合函数

  • 只有select子句和having子句以及order by子句中能够使用聚合函数
mysql> select shohin_bunrui, count(*)
    -> from Shohin
    -> group by shohin_bunrui;
+---------------+----------+
| shohin_bunrui | count(*) |
+---------------+----------+
| 衣服        |        2 |
| 办公用品  |        2 |
| 厨房用具  |        4 |
+---------------+----------+
3 rows in set (0.00 sec)

# 筛选出count(*)为2的进行输出,在where中使用聚合函数会报错
# 使用count等函数对表中的数据进行聚合操作的时候,为其指定条件的不是where子句,而是用having子句
mysql> select shohin_bunrui, count(*)
    -> from Shohin
    -> where count(*) = 2
    -> group by shohin_bunrui;
ERROR 1111 (HY000): Invalid use of group function

3. 对聚合结果指定条件

重点要点

  • 使用count等函数对表中的数据进行聚合操作的时候,为其指定条件的不是where子句,而是用having子句
  • 聚合函数可以在select子句、having子句和order by子句中使用
  • having子句要写在group by子句之后
  • where子句用来指定数据行的条件,having子句用来指定分组的条件

语法格式

mysql> select <列名1>, <列名2>, ...
mysql> from <表名>
mysql> group by <列名1>, <列名2>, ...
mysql> having <分组结果对应条件>

执行顺序

  • select ==> from ==> where ==> group by ==> having

3.1 haiving 字句

# 之前筛选count(*)=2的正确方式
mysql> select shohin_bunrui, count(*)
    -> from Shohin
    -> group by shohin_bunrui
    -> having count(*) = 2;
+---------------+----------+
| shohin_bunrui | count(*) |
+---------------+----------+
| 衣服        |        2 |
| 办公用品  |        2 |
+---------------+----------+
2 rows in set (0.00 sec)
# 不适用having子句
mysql> select shohin_bunrui, avg(hanbai_tanka)
    -> from Shohin
    -> group by shohin_bunrui;
+---------------+-------------------+
| shohin_bunrui | avg(hanbai_tanka) |
+---------------+-------------------+
| 衣服        |         2500.0000 |
| 办公用品  |          300.0000 |
| 厨房用具  |         2795.0000 |
+---------------+-------------------+
3 rows in set (0.00 sec)

# 使用having子句
mysql> select shohin_bunrui, avg(hanbai_tanka)
    -> from Shohin
    -> group by shohin_bunrui
    -> having avg(hanbai_tanka) >= 2500;
+---------------+-------------------+
| shohin_bunrui | avg(hanbai_tanka) |
+---------------+-------------------+
| 衣服        |         2500.0000 |
| 厨房用具  |         2795.0000 |
+---------------+-------------------+
2 rows in set (0.00 sec)

3.2 having 子句的构造要素

  • having子句中只能够使用这三种要素
    • 常数
    • 聚合函数
    • group by子句中指定的列名(即为聚合键)
# 这条语句中avg为聚合函数,2500为常数
mysql> select shohin_bunrui, avg(hanbai_tanka)
    -> from Shohin
    -> group by shohin_bunrui
    -> having avg(hanbai_tanka) >= 2500;

3.3 更适合写在 where 子句中的条件

  • 相对于 having 子句,更适合写在 where 子句中的条件
  • 通常情况下,为了得到相同的结果,讲条件写在where子句中要比写在having子句中的处理速度更快,返回结果所需时间更短
# 以下这两个查询语句中分别使用having子句和where子句,但是表达的结果是相同的
# having shohin_bunrui = '衣服'就是之前所说的having子句可以使用`group by`子句中指定的列名

mysql> select shohin_bunrui, count(*)
    -> from Shohin
    -> group by shohin_bunrui
    -> having shohin_bunrui = '衣服';
+---------------+----------+
| shohin_bunrui | count(*) |
+---------------+----------+
| 衣服        |        2 |
+---------------+----------+
1 row in set (0.00 sec)

mysql> select shohin_bunrui, count(*)
    -> from Shohin
    -> where shohin_bunrui = '衣服'
    -> group by shohin_bunrui
    -> ;
+---------------+----------+
| shohin_bunrui | count(*) |
+---------------+----------+
| 衣服        |        2 |
+---------------+----------+
1 row in set (0.00 sec)

4. 对查询结果进行排序

  • order by子句对查询结果进行排序
  • order by子句中的列名后面使用关键字asc可以进行升序排序,使用desc关键字进行降序排序
  • order by子句中可以指定多个排序键
  • 排序键包含null时,会在开头或者结尾进行汇总
  • order by子句中可以使用select子句中定义的列的别名
  • order by子句中可以使用select子句中未出现的列或者聚合函数
  • order by子句中不能使用列的编号

4.1 order by 子句

  • 使用格式
mysql> select <列名1>, <列名2>, ...
     > from <表名>
     > order by <排序基准列1>, <排序基准列2>, ...;

执行顺序

  • from ==> where ==> group by ==> having ==> select ==> order by

实例演示

mysql> select shohin_id, shohin_mei, hanbai_tanka, shiire_tanka
    -> from Shohin
    -> order by hanbai_tanka;
+-----------+------------+--------------+--------------+
| shohin_id | shohin_mei | hanbai_tanka | shiire_tanka |
+-----------+------------+--------------+--------------+
| 0008      | 圆珠笔  |          100 |         NULL |
| 0002      | 打孔器  |          500 |          320 |
| 0006      | 叉子     |          500 |         NULL |
| 0007      | 切菜板  |          880 |          790 |
| 0001      | T恤衫    |         1000 |          500 |
| 0004      | 菜刀     |         3000 |         2800 |
| 0003      | 运动T恤 |         4000 |         2800 |
| 0005      | 高压锅  |         6800 |         5000 |
+-----------+------------+--------------+--------------+
8 rows in set (0.00 sec)

4.2 指定升序和降序

  • 未指定order by子句的排序方式,默认使用升序进行排序
# 默认升序排序
mysql> select shohin_id, shohin_mei, hanbai_tanka, shiire_tanka
    -> from Shohin
    -> order by hanbai_tanka asc;
+-----------+------------+--------------+--------------+
| shohin_id | shohin_mei | hanbai_tanka | shiire_tanka |
+-----------+------------+--------------+--------------+
| 0008      | 圆珠笔  |          100 |         NULL |
| 0002      | 打孔器  |          500 |          320 |
| 0006      | 叉子     |          500 |         NULL |
| 0007      | 切菜板  |          880 |          790 |
| 0001      | T恤衫    |         1000 |          500 |
| 0004      | 菜刀     |         3000 |         2800 |
| 0003      | 运动T恤 |         4000 |         2800 |
| 0005      | 高压锅  |         6800 |         5000 |
+-----------+------------+--------------+--------------+
8 rows in set (0.00 sec)

# 降序排序
mysql> select shohin_id, shohin_mei, hanbai_tanka, shiire_tanka
    -> from Shohin
    -> order by hanbai_tanka desc;
+-----------+------------+--------------+--------------+
| shohin_id | shohin_mei | hanbai_tanka | shiire_tanka |
+-----------+------------+--------------+--------------+
| 0005      | 高压锅  |         6800 |         5000 |
| 0003      | 运动T恤 |         4000 |         2800 |
| 0004      | 菜刀     |         3000 |         2800 |
| 0001      | T恤衫    |         1000 |          500 |
| 0007      | 切菜板  |          880 |          790 |
| 0002      | 打孔器  |          500 |          320 |
| 0006      | 叉子     |          500 |         NULL |
| 0008      | 圆珠笔  |          100 |         NULL |
+-----------+------------+--------------+--------------+
8 rows in set (0.00 sec)

4.3 指定多个排序键

  • 根据指定的排序键顺序排序
# 先按照shohin_tanka排序,在同为500的时候,再按照shohin_id进行排序
mysql> select shohin_id, shohin_mei, hanbai_tanka, shiire_tanka
    -> from Shohin
    -> order by hanbai_tanka, shohin_id;
+-----------+------------+--------------+--------------+
| shohin_id | shohin_mei | hanbai_tanka | shiire_tanka |
+-----------+------------+--------------+--------------+
| 0008      | 圆珠笔  |          100 |         NULL |
| 0002      | 打孔器  |          500 |          320 |
| 0006      | 叉子     |          500 |         NULL |
| 0007      | 切菜板  |          880 |          790 |
| 0001      | T恤衫    |         1000 |          500 |
| 0004      | 菜刀     |         3000 |         2800 |
| 0003      | 运动T恤 |         4000 |         2800 |
| 0005      | 高压锅  |         6800 |         5000 |
+-----------+------------+--------------+--------------+
8 rows in set (0.00 sec)

4.4 null 的顺序

  • 排序键包含null时,会在开头或者结尾进行汇总
# null在开头进行排序
mysql> select shohin_id, shohin_mei, hanbai_tanka, shiire_tanka
    -> from Shohin
    -> order by shiire_tanka;
+-----------+------------+--------------+--------------+
| shohin_id | shohin_mei | hanbai_tanka | shiire_tanka |
+-----------+------------+--------------+--------------+
| 0006      | 叉子     |          500 |         NULL |
| 0008      | 圆珠笔  |          100 |         NULL |
| 0002      | 打孔器  |          500 |          320 |
| 0001      | T恤衫    |         1000 |          500 |
| 0007      | 切菜板  |          880 |          790 |
| 0003      | 运动T恤 |         4000 |         2800 |
| 0004      | 菜刀     |         3000 |         2800 |
| 0005      | 高压锅  |         6800 |         5000 |
+-----------+------------+--------------+--------------+
8 rows in set (0.00 sec)

4.5 在排序键中使用别名

mysql> select shohin_id as id, shohin_mei, hanbai_tanka as ht, shiire_tanka
    -> from Shohin
    -> order by ht, id;
+------+------------+------+--------------+
| id   | shohin_mei | ht   | shiire_tanka |
+------+------------+------+--------------+
| 0008 | 圆珠笔  |  100 |         NULL |
| 0002 | 打孔器  |  500 |          320 |
| 0006 | 叉子     |  500 |         NULL |
| 0007 | 切菜板  |  880 |          790 |
| 0001 | T恤衫    | 1000 |          500 |
| 0004 | 菜刀     | 3000 |         2800 |
| 0003 | 运动T恤 | 4000 |         2800 |
| 0005 | 高压锅  | 6800 |         5000 |
+------+------------+------+--------------+
8 rows in set (0.00 sec)

4.6 order by 子句中可以使用的列

mysql> select shohin_mei, hanbai_tanka, shiire_tanka
    -> from Shohin
    -> order by shohin_id;
+------------+--------------+--------------+
| shohin_mei | hanbai_tanka | shiire_tanka |
+------------+--------------+--------------+
| T恤衫    |         1000 |          500 |
| 打孔器  |          500 |          320 |
| 运动T恤 |         4000 |         2800 |
| 菜刀     |         3000 |         2800 |
| 高压锅  |         6800 |         5000 |
| 叉子     |          500 |         NULL |
| 切菜板  |          880 |          790 |
| 圆珠笔  |          100 |         NULL |
+------------+--------------+--------------+
8 rows in set (0.00 sec)

4.7 不要使用列编号

  • 列编号是指 select 子句中的列按照从左到右的顺序进行排列时对应的编号
  • 代码阅读起来比较难
  • 该排序功能将会被删除
# 使用列名
mysql> select shohin_id, shohin_mei, hanbai_tanka, shiire_tanka
    -> from Shohin
    -> order by hanbai_tanka desc, shohin_id;
+-----------+------------+--------------+--------------+
| shohin_id | shohin_mei | hanbai_tanka | shiire_tanka |
+-----------+------------+--------------+--------------+
| 0005      | 高压锅  |         6800 |         5000 |
| 0003      | 运动T恤 |         4000 |         2800 |
| 0004      | 菜刀     |         3000 |         2800 |
| 0001      | T恤衫    |         1000 |          500 |
| 0007      | 切菜板  |          880 |          790 |
| 0002      | 打孔器  |          500 |          320 |
| 0006      | 叉子     |          500 |         NULL |
| 0008      | 圆珠笔  |          100 |         NULL |
+-----------+------------+--------------+--------------+
8 rows in set (0.00 sec)

# 使用列表号
mysql> select shohin_id, shohin_mei, hanbai_tanka, shiire_tanka
    -> from Shohin
    -> order by 3 desc, 1;
+-----------+------------+--------------+--------------+
| shohin_id | shohin_mei | hanbai_tanka | shiire_tanka |
+-----------+------------+--------------+--------------+
| 0005      | 高压锅  |         6800 |         5000 |
| 0003      | 运动T恤 |         4000 |         2800 |
| 0004      | 菜刀     |         3000 |         2800 |
| 0001      | T恤衫    |         1000 |          500 |
| 0007      | 切菜板  |          880 |          790 |
| 0002      | 打孔器  |          500 |          320 |
| 0006      | 叉子     |          500 |         NULL |
| 0008      | 圆珠笔  |          100 |         NULL |
+-----------+------------+--------------+--------------+
8 rows in set (0.00 sec)

文章作者: Escape
版权声明: 本博客所有文章除特別声明外,均采用 CC BY 4.0 许可协议。转载请注明来源 Escape !
  目录