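The GDBReader plugin reads data from a GDB instance. It connects to a remote GDB instance through the Gremlin Client, generates query DSL from the labels given in the configuration, traverses vertex or edge data (including property data), and writes the data into Records for the Writer to consume.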
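GDBReader connects to the GDB instance with the Gremlin Client and splits the work into Tasks by label, each Task fetching vertex or edge data. Within a single Task, it traverses the ids of the vertices or edges for its label, splits them into ranges, and issues multiple requests to query the vertices or edges together with their properties. Finally, it converts the vertex or edge data into records of the configured format and sends them to the downstream writer plugin.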
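The per-Task flow described above can be sketched roughly as follows. This is a minimal illustration in Python, not the plugin's actual implementation. `chunk`, `plan_requests`, and the in-memory id list are hypothetical stand-ins for the id traversal and the Gremlin requests. It also assumes that `rangeSplitSize` bounds the size of each id range and `fetchBatchSize` bounds the number of ids per request.

```python
from typing import Iterator, List, Sequence


def chunk(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each holding at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def plan_requests(ids: Sequence[str], range_split_size: int,
                  fetch_batch_size: int) -> List[List[str]]:
    """Split ids into ranges, then split every range into fetch batches."""
    requests: List[List[str]] = []
    for id_range in chunk(ids, range_split_size):
        for batch in chunk(id_range, fetch_batch_size):
            requests.append(batch)
    return requests


# Hypothetical ids for one label; a real Task would obtain them from GDB.
ids = [f"v{i}" for i in range(2500)]
requests = plan_requests(ids, range_split_size=1000, fetch_batch_size=100)
print(len(requests))  # number of batched requests planned for this label
```

Each inner list stands for one request that would fetch the vertices or edges (and their properties) for those ids.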
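GDBReader splits the job into multiple concurrent Tasks by label, and fetches the data of a single label in asynchronous batches to speed up reading. If the configured list of labels is empty, all labels are queried from GDB before the job starts, and the Tasks are then split accordingly.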
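The label-based split, including the fallback for an empty label list, might look like the sketch below. `split_tasks` and the `query_all_labels` callback are hypothetical names introduced for illustration. The real plugin is part of DataX and queries GDB itself.

```python
from typing import Callable, Dict, List


def split_tasks(parameter: Dict,
                query_all_labels: Callable[[], List[str]]) -> List[Dict]:
    """Return one Task configuration per label.

    If no labels are configured, fall back to the labels reported by
    `query_all_labels` (a stand-in for a query against GDB).
    """
    labels = parameter.get("labels") or query_all_labels()
    tasks = []
    for label in labels:
        task = dict(parameter)      # shallow copy of the shared settings
        task["labels"] = [label]    # each Task handles exactly one label
        tasks.append(task)
    return tasks


# Configured labels are used as-is.
configured = split_tasks({"labelType": "VERTEX", "labels": ["label1", "label2"]},
                         query_all_labels=lambda: [])

# An empty list triggers the fallback (stubbed here with fixed labels).
discovered = split_tasks({"labelType": "VERTEX", "labels": []},
                         query_all_labels=lambda: ["person", "software"])
```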
Vertices and edges differ in GDB, so reading requires separate configurations for vertices and for edges.
{
"job": {
"setting": {
"speed": {
"channel": 1
},
"errorLimit": {
"record": 1
}
},
"content": [
{
"reader": {
"name": "gdbreader",
"parameter": {
"host": "10.218.145.24",
"port": 8182,
"username": "***",
"password": "***",
"fetchBatchSize": 100,
"rangeSplitSize": 1000,
"labelType": "VERTEX",
"labels": ["label1", "label2"],
"column": [
{
"name": "id",
"type": "string",
"columnType": "primaryKey"
},
{
"name": "label",
"type": "string",
"columnType": "primaryLabel"
},
{
"name": "age",
"type": "int",
"columnType": "vertexProperty"
}
]
}
},
"writer": {
"name": "streamwriter",
"parameter": {
"print": true
}
}
}
]
}
}
{
"job": {
"setting": {
"speed": {
"channel": 1
},
"errorLimit": {
"record": 1
}
},
"content": [
{
"reader": {
"name": "gdbreader",
"parameter": {
"host": "10.218.145.24",
"port": 8182,
"username": "***",
"password": "***",
"fetchBatchSize": 100,
"rangeSplitSize": 1000,
"labelType": "EDGE",
"labels": ["label1", "label2"],
"column": [
{
"name": "id",
"type": "string",
"columnType": "primaryKey"
},
{
"name": "label",
"type": "string",
"columnType": "primaryLabel"
},
{
"name": "srcId",
"type": "string",
"columnType": "srcPrimaryKey"
},
{
"name": "srcLabel",
"type": "string",
"columnType": "srcPrimaryLabel"
},
{
"name": "dstId",
"type": "string",
"columnType": "dstPrimaryKey"
},
{
"name": "dstLabel",
"type": "string",
"columnType": "dstPrimaryLabel"
},
{
"name": "name",
"type": "string",
"columnType": "edgeProperty"
},
{
"name": "weight",
"type": "double",
"columnType": "edgeProperty"
}
]
}
},
"writer": {
"name": "streamwriter",
"parameter": {
"print": true
}
}
}
]
}
}
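Because vertex and edge jobs use different column types, a small pre-flight check can catch a mixed-up configuration before the job runs. The sketch below is an assumption, not part of GDBReader. The two sets contain only type names that appear in this document and may be incomplete, and `mismatched_columns` is a hypothetical helper.

```python
from typing import Dict, List

# Column types taken from the names used in this document (possibly incomplete).
VERTEX_ONLY = {"vertexProperty", "vertexJsonProperty"}
EDGE_ONLY = {"edgeProperty", "edgeJsonProperty", "srcPrimaryKey", "srcPrimaryLabel"}


def mismatched_columns(parameter: Dict) -> List[str]:
    """Return names of columns whose columnType does not fit the labelType."""
    label_type = parameter["labelType"]
    if label_type == "VERTEX":
        forbidden = EDGE_ONLY
    elif label_type == "EDGE":
        forbidden = VERTEX_ONLY
    else:
        raise ValueError(f"unexpected labelType: {label_type!r}")
    return [col["name"] for col in parameter.get("column", [])
            if col["columnType"] in forbidden]


vertex_parameter = {
    "labelType": "VERTEX",
    "column": [
        {"name": "id", "type": "string", "columnType": "primaryKey"},
        {"name": "age", "type": "int", "columnType": "vertexProperty"},
        {"name": "weight", "type": "double", "columnType": "edgeProperty"},
    ],
}
print(mismatched_columns(vertex_parameter))  # lists the edge-only column
```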
host
port
username
password
fetchBatchSize
rangeSplitSize
labels
labelType
column
column -> name
column -> type
column -> columnType
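Example of the vertexJsonProperty format. An additional `c` field distinguishes SET properties; however, a SET property that holds only a single value is marked as a normal property.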
{"properties":[
{"k":"name","t":"string","v":"Jack","c":"set"},
{"k":"name","t":"string","v":"Luck","c":"set"},
{"k":"age","t":"int","v":"20","c":"single"}
]}
Example of the edgeJsonProperty format. Edges do not support multi-valued properties.
```
{"properties":[
{"k":"created_at","t":"long","v":"153498653"},
{"k":"weight","t":"double","v":"3.14"}
]}
```
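A downstream consumer could decode these JSON property strings along the lines of the sketch below. It is an illustration only. `parse_json_properties` is a hypothetical helper, and the type mapping covers just the `t` values shown in the examples (`string`, `int`, `long`, `double`); any other type is left as the raw string.

```python
import json
from typing import Any, Dict

# Converters for the "t" values seen in the examples; others stay as strings.
CONVERTERS = {"string": str, "int": int, "long": int, "double": float}


def parse_json_properties(raw: str) -> Dict[str, Any]:
    """Decode a vertexJsonProperty / edgeJsonProperty string into a dict.

    Properties marked "c": "set" are collected into a list per key;
    all other properties map to a single value.
    """
    result: Dict[str, Any] = {}
    for prop in json.loads(raw)["properties"]:
        convert = CONVERTERS.get(prop["t"], str)
        value = convert(prop["v"])
        if prop.get("c") == "set":
            result.setdefault(prop["k"], []).append(value)
        else:
            result[prop["k"]] = value
    return result


vertex_raw = ('{"properties":['
              '{"k":"name","t":"string","v":"Jack","c":"set"},'
              '{"k":"name","t":"string","v":"Luck","c":"set"},'
              '{"k":"age","t":"int","v":"20","c":"single"}]}')
vertex_props = parse_json_properties(vertex_raw)
# vertex_props == {"name": ["Jack", "Luck"], "age": 20}
```

Edge strings carry no `c` field, so every edge property decodes to a single value.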
(TODO)
None
None