insert into on duplicate key update – 返回主键问题

问题 – InsertOrUpdate 会修改内存主键

在项目中，需要批量插入销售记录数据进入 record 表。这张表有自己的唯一键 sale_no 作为唯一标识号。我们需要同步 Mysql 表与 ES 索引，设计成同时双写 mysql 和 es。代码如下

双写逻辑

SQL

insert into record
(column1, column2, ...)
values
(value1, value2, ...)
...
on duplicate key update
column1 = values(values1)
...

private void saveRecords(List<RecordPO> RecordPOList) {
	// 1. 这里使用 insert into on duplicate key update SQL 语句批量插入
	markableRecordPOUdfMapper.batchInsertReplace(markableRecordPOList);
	// 2. 批量更新 ES，转成 ES 类型 PO 再 Bulk
	RecordSearch.bulkIndex(RecordIndexConverter.toIndexObjects(markableRecordPOByUniqueSerialNo));
	// ......
    }

可是出现一个奇怪的现象，每次更新数据表后，es 都会比 mysql 多几个数据。

测试验证

最后排查结果是：batchInsertReplace 方法返回的主键是有问题的，不是真实数据库存储的主键，而是 Mybatis 自己生成的。下面是实验和问题详解。

测试验证

自己造出实验数据，一遍insert一边update，我们预期的是 insert 的PO 赋值出插入数据库的自增主键，update 的 PO 赋值更新的主键。

结果出人意料，插入数据的PO获得主键 115066，UPDATE 的 PO 获得主键居然是 115067（数据库里真实的主键应该是113242，所以只能更新那条记录，却根本没获取到要更新记录的主键）。

所以这个 PO 回填的主键是假的，只是 Mybatis 自增主键 + 1

解决方案

强制走从库反查数据库再读取一遍主键

为什么要反查，因为批量插入的健只有在mysql中才能获取到
为什么强制走从库？因为刚插入新数据，会有主从延迟

markableRecordPOUdfMapper.batchInsertReplace(markableRecordPOList);
// 用唯一索引强制走从库读取
List<MarkableRecordPO> markableRecordPOByUniqueSerialNo = getRecordPOBySaleSerialNoFromMaster(uniqueSerialNoList);
RecordSearch.bulkIndex(RecordIndexConverter.toIndexObjects(markableRecordPOByUniqueSerialNo));

为什么会出现这个问题，解答要追溯至 Mybatis 自增主键的原理了。

Mybatis 获取主键的原理

useGenerateKey

在这里插入图片描述

MyBatis 中 excutor包中的 keyGenerator 负责生产自增主键与回填自增主键。MyBatis 提供了三种实现类，Jdbc3KeyGenerate / SelectKeyGenerator / NoKeyGenerator，在实际使用中，只能生效一种。配置 useGenerateKeys即可开启

<setting_name="useGenerateKeys" value="true">

Jdbc3KeyGenerate 类是给自带主键自增功能的数据库准备的，比如 MySql；SelectKeyGenerator 是给不具备主键自增功能的数据库准备的，比如 Oracle；NoKeyGenerator就是不开启自增主键功能。这篇wiki主要讲Jdbc3KeyGenerate。

Jdbc3KeyGenerate

Jdbc3KeyGenerate，工作原理：Java PO 插入数据库后，将数据库自增产生的 id 读取出来，然后回写给 PO 对象本身。是利用MySql插入数据后反查主键得到自增健，回填入PO。

Jdbc3KeyGenerate

@Override
public void processAfter(Executor executor, MappedStatement ms, Statement stmt, Object parameter) {
  processBatch(ms, stmt, getParameters(parameter));
}

public void processBatch(MappedStatement ms, Statement stmt, Collection<Object> parameters) {
  ResultSet rs = null;
  try {
    // 通过调用 mysql.jdbc 获取自增主键结果 rs
    rs = stmt.getGeneratedKeys();
    // 后续都是分配 rs 回填给 PO 的操作
    if (keyProperties != null && rsmd.getColumnCount() >= keyProperties.length) {
      for (Object parameter : parameters) {
        // ......
        populateKeys(rs, metaParam, keyProperties, typeHandlers);
      }
    }
  } catch (Exception e) {
    throw new ExecutorException("Error getting generated key or setting result to parameter object. Cause: " + e, e);
  } finally {
    // ..........
  }
}

追朔到 jdbc 的 getGeneratedKeys 方法里

在 mysql.jdbc 的 StatementImpl 类中

public java.sql.ResultSet getGeneratedKeys() throws SQLException {
  synchronized (checkClosed().getConnectionMutex()) {
    // 。。。。。。
    // 这里就是我们 insert on duplicate key 获取自增主键的逻辑了
    if (this.batchedGeneratedKeys == null) {
      if (this.lastQueryIsOnDupKeyUpdate) {
        return this.generatedKeysResults = getGeneratedKeysInternal(1);
      }
      return this.generatedKeysResults = getGeneratedKeysInternal();
    }
    // 正常获取自增主键的逻辑，会去反查
    Field[] fields = new Field[1];
    fields[0] = new Field("", "GENERATED_KEY", Types.BIGINT, 20);
    fields[0].setConnection(this.connection);
    this.generatedKeysResults = com.mysql.jdbc.ResultSetImpl.getInstance(this.currentCatalog, fields, new RowDataStatic(this.batchedGeneratedKeys),
                                                                         this.connection, this, false);
    return this.generatedKeysResults;
  }
}

秘密就隐藏在 getGeneratedKeysInternal 方法里。

代码块

protected ResultSetInternalMethods getGeneratedKeysInternal() throws SQLException {
    long numKeys = getLargeUpdateCount();
    return getGeneratedKeysInternal(numKeys);
}

protected ResultSetInternalMethods getGeneratedKeysInternal(long numKeys) throws SQLException {
    synchronized (checkClosed().getConnectionMutex()) {
        // ..........
        // 获取最后一个插入的主键
        long beginAt = getLastInsertID();
        // ..........
        if (this.results != null) {
            // ......
            if ((beginAt != 0 /* BIGINT UNSIGNED can wrap the protocol representation */) && (numKeys > 0)) {
                for (int i = 0; i < numKeys; i++) {
                    // .......
                    rowSet.add(new ByteArrayRow(row, getExceptionInterceptor()));
                    // 居然是 自己 ++ 
                    beginAt += this.connection.getAutoIncrementIncrement();
                }
            }
        }
        com.mysql.jdbc.ResultSetImpl gkRs = com.mysql.jdbc.ResultSetImpl.getInstance(this.currentCatalog, fields, new RowDataStatic(rowSet),
                this.connection, this, false);
        return gkRs;
    }
}

结论

如果是 insertOrUpdate (insert on duplicate key)，生成自增主键逻辑是：先查出最后一次插入的主键，记录为 beginAt，之后计算插入行数，逐行 +1 回填自增主键。所以 insertOrUpdate 开启自增主键机制时，会存在Mybatis改变内存里已有主键信息风险，而且这种改变很可能是错误的。

最终结论

INSERT … ON DUPLICATE KEY UPDATE这种方式在更新数据时，会将auto_increment的值加1，这样持续更新并插入时就会导致数据库自增主键id不连续，并且增长很快。

MyBatis中

INSERT … ON DUPLICATE KEY UPDATE 在同时新插入 INSERT 和 UPDATE 更新时，回填在JVM内存中改变 PO 的主键 id 值是获取最后一个插入键再自增长的。这是 MyBatis 与 jdbc 的问题。

这种主键回填方式并不会影响 MySQL 数据库，只影响内存中的 PO。只要注意使用时，不要用PO的回填主键去做业务逻辑操作，再反查一次即可。

原文链接：https://blog.csdn.net/weixin_43864838/article/details/111665276