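OceanBase's table API gives applications an access interface to ObHBase, so the reader for the OceanBase table API is similar to HBase Reader in both structure and configuration. The obhbasereader plugin supports two read modes, SQL and HBase API, which differ as follows:
The SQL mode can only fetch the latest or the oldest version of the data, whereas the HBase API mode can fetch multiple versions.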
{
  "job": {
    "setting": {
      "speed": {
        "channel": 3,
        "byte": 104857600
      },
      "errorLimit": {
        "record": 10
      }
    },
    "content": [
      {
        "reader": {
          "name": "obhbasereader",
          "parameter": {
            "username": "username",
            "password": "password",
            "encoding": "utf8",
            "column": [
              {
                "name": "f1:column1_1",
                "type": "string"
              },
              {
                "name": "f1:column2_2",
                "type": "string"
              },
              {
                "name": "f1:column1_1",
                "type": "string"
              },
              {
                "name": "f1:column2_2",
                "type": "string"
              }
            ],
            "range": [
              {
                "startRowkey": "aaa",
                "endRowkey": "ccc",
                "isBinaryRowkey": false
              },
              {
                "startRowkey": "eee",
                "endRowkey": "zzz",
                "isBinaryRowkey": false
              }
            ],
            "mode": "normal",
            "readByPartition": "true",
            "scanCacheSize": "",
            "readerHint": "",
            "readBatchSize": "1000",
            "connection": [
              {
                "table": [
                  "htable1",
                  "htable2"
                ],
                "jdbcUrl": [
                  "||_dsc_ob10_dsc_||cluster:tenant||_dsc_ob10_dsc_||jdbc:mysql://ip:port/dbName1"
                ],
                "username": "username",
                "password": "password"
              },
              {
                "table": [
                  "htable1",
                  "htable2"
                ],
                "jdbcUrl": [
                  "jdbc:mysql://ip:port/database"
                ]
              }
            ]
          }
        },
        "writer": {
          "name": "txtfilewriter",
          "parameter": {
            "path": "/Users/xujing/datax/txtfile",
            "charset": "UTF-8",
            "fieldDelimiter": ",",
            "fileName": "hbase",
            "nullFormat": "null",
            "writeMode": "truncate"
          }
        }
      }
    ]
  }
}
jdbcUrl
Description: the JDBC URL used to connect to OceanBase. Two formats are supported.
Required: yes
Default: none
table
readByPartition
partitionName
readBatchSize
fetchSize
scanCacheSize
readerHint
column
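Column pruning is supported: you can export only a subset of the columns.
Column reordering is supported: columns do not have to be exported in the order of the table schema. The wildcard * is also supported; check the column information carefully before using it.
Required: yes when reading in SQL mode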
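As a rough illustration of column pruning and reordering, the sketch below builds a `column` list in Python and prints it as JSON. It is only a way to generate the configuration fragment, not part of the plugin. The helper `build_columns`, the constant `SCHEMA_ORDER`, and the column `f1:column3_3` are hypothetical names introduced for this example.

```python
import json

# Hypothetical schema order, for illustration only.
# "f1:column3_3" does not appear in the sample job; it is an assumed extra column.
SCHEMA_ORDER = ["f1:column1_1", "f1:column2_2", "f1:column3_3"]


def build_columns(names, col_type="string"):
    """Return a column list in exactly the order the names are given."""
    return [{"name": name, "type": col_type} for name in names]


# Pruning: only two of the three schema columns are exported.
# Reordering: they are listed in a different order from the schema.
columns = build_columns(["f1:column2_2", "f1:column1_1"])

# Print the fragment that would go under "parameter" in the reader config.
print(json.dumps({"column": columns}, indent=2))
```

The exported field order follows the order of the list, so it is worth checking the list against the real table schema before running a job.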
range
username
mode
version
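Notes:
Note: if partitionName is configured, there is no need to configure readByPartition. Even if readByPartition is set, it is ignored and only the data in the specified partition is read.
Note: if readByPartition is configured, the job is split by partition only and is no longer split by the K value. For a non-partitioned table, the whole table is treated as a single task and is not split further.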
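The precedence described in these notes can be modeled with a small Python sketch. This is a reading of the notes, not the plugin's actual implementation: the function `plan_split` and the strategy labels are hypothetical, and the partition names `p0` and `p1` in the usage lines are made-up values. The final fallback to K-value splitting is an assumption inferred from the wording of the second note.

```python
def plan_split(parameter, table_partitions):
    """Return (strategy, partitions) following the notes on partitionName/readByPartition.

    parameter: dict of reader parameters (hypothetical subset of the job config).
    table_partitions: list of partition names; empty for a non-partitioned table.
    """
    partition_name = parameter.get("partitionName")
    # The sample job passes readByPartition as the string "true", so compare as text.
    read_by_partition = str(parameter.get("readByPartition", "false")).lower() == "true"

    if partition_name:
        # partitionName wins: readByPartition is ignored, only this partition is read.
        return ("single-partition", [partition_name])
    if read_by_partition:
        if table_partitions:
            # One task per partition, no further K-value splitting.
            return ("by-partition", list(table_partitions))
        # Non-partitioned table: the whole table is a single task.
        return ("whole-table", [])
    # Assumed default when neither option is set.
    return ("by-k-value", [])


print(plan_split({"partitionName": "p0", "readByPartition": "true"}, ["p0", "p1"]))
print(plan_split({"readByPartition": "true"}, ["p0", "p1"]))
print(plan_split({"readByPartition": "true"}, []))
```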