Writing to ClickHouse with the Flink SQL Connector: Problems and Solutions in Detail



While using Flink SQL (the JDBC connector) to write real-time data into ClickHouse, I checked the official Flink documentation and found that flink-connector-jdbc only supports MySQL, PostgreSQL, and Derby. Left with no ready-made option, I dug into the available material and extended the connector source code to solve it.

Note: since version 1.11.0, flink-connector-jdbc has supported a ClickHouse sink through the DataStream API.
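
For reference, that DataStream route looks roughly like the sketch below: the generic JdbcSink from flink-connector-jdbc paired with the ClickHouse JDBC driver. The table, column, and host are hypothetical placeholders, not from the original setup.

import org.apache.flink.connector.jdbc.JdbcConnectionOptions;
import org.apache.flink.connector.jdbc.JdbcSink;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ClickHouseJdbcSinkSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements("Alice", "Bob")
                .addSink(JdbcSink.sink(
                        // Table and column are placeholders
                        "INSERT INTO test_table (name) VALUES (?)",
                        (statement, name) -> statement.setString(1, name),
                        new JdbcConnectionOptions.JdbcConnectionOptionsBuilder()
                                .withUrl("jdbc:clickhouse://localhost:8123/default") // placeholder host
                                .withDriverName("ru.yandex.clickhouse.ClickHouseDriver")
                                .build()));

        env.execute("ClickHouse JdbcSink sketch");
    }
}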



1. Solutions



1.1 Extending flink-connector-jdbc to support ClickHouse



Download the Flink source code from GitHub:

https://github.com/apache/flink.git


The full Flink source tree is difficult to build successfully in one pass, so here are two quick workarounds for reference:

  • Add the Aliyun repository mirrors to your local Maven settings.xml (as the first entries inside the <mirrors> tag):
<mirror>
    <id>nexus-aliyun</id>
    <mirrorOf>*,!jeecg,!jeecg-snapshots,!mapr-releases</mirrorOf>
    <name>Nexus aliyun</name>
    <url>http://maven.aliyun.com/nexus/content/groups/public</url>
</mirror>

<mirror>
    <id>mapr-public</id>
    <mirrorOf>mapr-releases</mirrorOf>
    <name>mapr-releases</name>
    <url>https://maven.aliyun.com/repository/mapr-public</url>
</mirror>
  • Change the Flink version in the flink-connector-jdbc submodule to a released stable version (e.g., 1.12.0); you can then package just this submodule, as sketched below.
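
For the second option, the edit is in the submodule's pom.xml (flink-connectors/flink-connector-jdbc/pom.xml in the source tree). A sketch, assuming the 1.12-era module layout: point the parent at a released version so Maven can resolve it from a remote repository instead of building the whole tree:

<parent>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connectors</artifactId>
    <!-- was a -SNAPSHOT version on the development branch; use a released one -->
    <version>1.12.0</version>
    <relativePath>..</relativePath>
</parent>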



Following the way flink-connector-jdbc supports MySQL, add the ClickHouse-related code. Three changes are needed: add a ClickHouseDialect, add a ClickHouseRowConverter, and modify JdbcDialects to register the new ClickHouseDialect(). Without further ado, the code:

  • org.apache.flink.connector.jdbc.dialect.ClickHouseDialect
package org.apache.flink.connector.jdbc.dialect;

import org.apache.flink.connector.jdbc.internal.converter.ClickHouseRowConverter;
import org.apache.flink.connector.jdbc.internal.converter.JdbcRowConverter;
import org.apache.flink.table.types.logical.LogicalTypeRoot;
import org.apache.flink.table.types.logical.RowType;

import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** JDBC dialect for ClickHouse. */
public class ClickHouseDialect extends AbstractDialect {

    private static final long serialVersionUID = 1L;

    private static final String SQL_DEFAULT_PLACEHOLDER = " :";
    private static final int MAX_TIMESTAMP_PRECISION = 6;
    private static final int MIN_TIMESTAMP_PRECISION = 1;
    private static final int MAX_DECIMAL_PRECISION = 65;
    private static final int MIN_DECIMAL_PRECISION = 1;

    @Override
    public boolean canHandle(String url) {
        return url.startsWith("jdbc:clickhouse:");
    }

    @Override
    public JdbcRowConverter getRowConverter(RowType rowType) {
        return new ClickHouseRowConverter(rowType);
    }

    @Override
    public Optional<String> defaultDriverName() {
        return Optional.of("ru.yandex.clickhouse.ClickHouseDriver");
    }

    @Override
    public String quoteIdentifier(String identifier) {
        return identifier;
    }

    @Override
    public int maxDecimalPrecision() {
        return MAX_DECIMAL_PRECISION;
    }

    @Override
    public int minDecimalPrecision() {
        return MIN_DECIMAL_PRECISION;
    }

    @Override
    public int maxTimestampPrecision() {
        return MAX_TIMESTAMP_PRECISION;
    }

    @Override
    public int minTimestampPrecision() {
        return MIN_TIMESTAMP_PRECISION;
    }

    @Override
    public List<LogicalTypeRoot> unsupportedTypes() {
        return Arrays.asList(
                LogicalTypeRoot.BINARY,
                LogicalTypeRoot.TIMESTAMP_WITH_LOCAL_TIME_ZONE,
                LogicalTypeRoot.TIMESTAMP_WITH_TIME_ZONE,
                LogicalTypeRoot.INTERVAL_YEAR_MONTH,
                LogicalTypeRoot.INTERVAL_DAY_TIME,
                LogicalTypeRoot.ARRAY,
                LogicalTypeRoot.MULTISET,
                LogicalTypeRoot.MAP,
                LogicalTypeRoot.ROW,
                LogicalTypeRoot.DISTINCT_TYPE,
                LogicalTypeRoot.STRUCTURED_TYPE,
                LogicalTypeRoot.NULL,
                LogicalTypeRoot.RAW,
                LogicalTypeRoot.SYMBOL,
                LogicalTypeRoot.UNRESOLVED);
    }

    @Override
    public String dialectName() {
        return "ClickHouse";
    }
}

  • org.apache.flink.connector.jdbc.internal.converter.ClickHouseRowConverter
package org.apache.flink.connector.jdbc.internal.converter;

import org.apache.flink.table.types.logical.RowType;

/**
 * Runtime converter responsible for converting between JDBC objects and Flink internal objects
 * for ClickHouse.
 */
public class ClickHouseRowConverter extends AbstractJdbcRowConverter {

    private static final long serialVersionUID = 1L;

    @Override
    public String converterName() {
        return "ClickHouse";
    }

    public ClickHouseRowConverter(RowType rowType) {
        super(rowType);
    }
}

  • org.apache.flink.connector.jdbc.dialect.JdbcDialects

public final class JdbcDialects {

    private static final List<JdbcDialect> DIALECTS = Arrays.asList(
            new DerbyDialect(),
            new MySQLDialect(),
            new PostgresDialect(),
            new ClickHouseDialect(), // the ClickHouseDialect defined above
            new OracleDialect()
    );
    // ... (rest of the class unchanged)
}
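
To sanity-check the registration after rebuilding, you can ask JdbcDialects to resolve a ClickHouse URL. A minimal sketch (class name and host are hypothetical):

import java.util.Optional;

import org.apache.flink.connector.jdbc.dialect.JdbcDialect;
import org.apache.flink.connector.jdbc.dialect.JdbcDialects;

public class DialectCheck {
    public static void main(String[] args) {
        // JdbcDialects.get(url) returns the first dialect whose canHandle(url) is true
        Optional<JdbcDialect> dialect = JdbcDialects.get("jdbc:clickhouse://localhost:8123/default");
        // Prints "ClickHouse" if the new dialect is on the classpath and registered
        System.out.println(dialect.map(JdbcDialect::dialectName).orElse("no dialect found"));
    }
}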

Other JDBC-compatible data sources can be added the same way, e.g., the OracleDialect() in my list above.



1.2 Using flink-connector-clickhouse

This approach is described in detail in the Alibaba Cloud document "使用flink-connector-clickhouse写入ClickHouse" (Writing to ClickHouse with flink-connector-clickhouse); refer to it for the specifics.

Note: this connector is supported starting with Flink 1.12; it throws errors on Flink 1.11. Its Maven dependency can also be problematic to download and install; you can instead download the jar directly and add it to the project's Libraries. The related code is also available on GitHub; I won't paste a link here, so please search for it yourself.



2. Examples of Reading and Writing ClickHouse with Flink SQL



2.1 Using flink-connector-jdbc

CREATE TABLE test_rightTable (
  id INT,
  name STRING,
  gender STRING,
  age INT,
  address STRING,
  phone INT
) WITH (
  'connector' = 'jdbc',
  'url' = 'jdbc:clickhouse://*****:8123/default',
  'username' = 'default',
  'password' = '******',
  'table-name' = 'test_rightTable'
);
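
With the table registered, reading and writing go through ordinary SELECT and INSERT statements; a minimal sketch (the upstream table some_source_table is hypothetical):

-- Write: sink rows from an upstream table into ClickHouse
INSERT INTO test_rightTable
SELECT id, name, gender, age, address, phone
FROM some_source_table;

-- Read: query the same table as a source
SELECT id, name, age FROM test_rightTable WHERE age > 18;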



2.2 Using flink-connector-clickhouse

CREATE TABLE test_sink_clickhouse_table (
   logtime STRING,
   col1 STRING,
   col2 STRING
 ) WITH (
   'connector' = 'clickhouse',
   'url' = 'clickhouse://****:8124',  -- ClickHouse host:port
   'table-name' = 'test_sink_clickhouse_table'
);
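
Embedding this DDL in a program takes only a few lines with the Table API; a minimal sketch (the upstream table source_table is hypothetical, and the host:port stays masked as in the text):

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

public class ClickHouseSinkJob {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

        // Register the sink table (DDL from section 2.2)
        tEnv.executeSql(
                "CREATE TABLE test_sink_clickhouse_table ("
                        + "  logtime STRING,"
                        + "  col1 STRING,"
                        + "  col2 STRING"
                        + ") WITH ("
                        + "  'connector' = 'clickhouse',"
                        + "  'url' = 'clickhouse://****:8124',"
                        + "  'table-name' = 'test_sink_clickhouse_table'"
                        + ")");

        // Write from a hypothetical upstream table with matching columns;
        // executeSql submits the INSERT job asynchronously
        tEnv.executeSql(
                "INSERT INTO test_sink_clickhouse_table "
                        + "SELECT logtime, col1, col2 FROM source_table");
    }
}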

For a complete SQL program, see "Flink 1.11 SQL 快速上手" (Flink 1.11 SQL Quick Start), which includes a demo and a detailed walkthrough of the setup; I have verified it works.



3. Other Issues and Notes

① Extracting data from ClickHouse

flink-connector-clickhouse only supports loading (sink), not extraction (source); the extended flink-connector-jdbc supports both.

② Specifying the database name

Putting the database in the url, i.e. clickhouse://****:8124/${database-name}, does not fail outright, but the database specified this way does not take effect. If your database is not default, you will get an error saying the table does not exist under default; the database has to be specified with the separate database-name option instead.
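
A sketch of a working WITH clause (host and database name are placeholders):

CREATE TABLE test_sink_clickhouse_table (
  logtime STRING,
  col1 STRING,
  col2 STRING
) WITH (
  'connector' = 'clickhouse',
  'url' = 'clickhouse://****:8124',            -- no database in the URL
  'database-name' = 'my_database',             -- specified here instead
  'table-name' = 'test_sink_clickhouse_table'
);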



Copyright notice: this is an original article by weixin_44056920, licensed under CC 4.0 BY-SA. Please include a link to the original source and this notice when reposting.