oracle 数据导入odps,SQL_ODPS-MaxCompute-odpscmd-tunnel-数据上传下载

  • Post author:
  • Post category:其他


1. rd行分割 -fd列分割 -charset编码

–建临时表

odps@ work_test_1>create table if not exists t_rd_fd(id int,name string);

–源数据data_t_rd_fd.txt

–666$$张三||555$$李四||

— -rd行分割 -fd列分割 -charset编码

odps@ work_test_1>tunnel upload D:\#ODPS_D2\odps_data\data_t_rd_fd.txt work_test_1.t_rd_fd -rd “||” -fd “$$” -charset gbk;

Upload session: 20200220231800a0dbdb0b12e98660

Start upload:D:\#ODPS_D2\odps_data\data_t_rd_fd.txt

Using || to split records

Upload in strict schema mode: true

Total bytes:26 Split input to 1 blocks

0000-00-00 00:00:00 scan block: ‘1’

0000-00-00 00:00:00 scan block complete, block id: 1

0000-00-00 00:00:00 upload block: ‘1’

0000-00-00 00:00:00 upload block complete, block id: 1

OK

–检查一下

odps@ work_test_1>select * from work_test_1.t_rd_fd;

+————+————+

| id | name |

+————+————+

| 666 | 瀵姳绗? |

| 555 | 閺夊骸娲? |

+————+————+

2.脏数据-dbr true -s only ;tunnel show bad 0000;

–源数据data_t_rd_fd.txt

–666,张三

–555,李四

–1.34,王五

–AA

— -dbr true -s only;审查一下有哪些脏数据 仅读

odps@ work_test_1>tunnel upload D:\#ODPS_D2\odps_data\data_t_rd_fd.txt work_test_1.t_rd_fd -dbr true -s only;

Upload session: 000000000009151647df0b12dd8e9e –注意这个session

Start upload:D:\#ODPS_D2\odps_data\data_t_rd_fd.txt

Using \r\n to split records

Upload in strict schema mode: true

Total bytes:39 Split input to 1 blocks

0000-00-00 00:00:00 scan block: ‘1’

0000-00-00 00:00:00 ERROR: format error – :1, BIGINT:’1.34′ For input string: “1.34”content: 1.34,鐜嬩簲

offset: 29

0000-00-00 00:00:00 ERROR: column mismatch, expected 2 columns, 1 columns found, please check data or delimiter

content: AA

offset: 34

0000-00-00 00:00:00 scan block complete, block id: 1 [bad 2]

OK

odps@ work_test_1>

–tunnel show bad 具体session;查看一下具体是哪些脏数据

–通过这种方法可以快速找到错误信息对源文件修正

odps@ work_test_1>tunnel show bad 0000000000329151647df0b12dd8e9e;

1.34,鐜嬩簲

AA

odps@ work_test_1>

–然后可以去源文件修正或删除,再上传