Join on foreign key in Kafka stream
Let's say that I have three Kafka topics filled with events representing business events occurring in different aggregates (an event sourcing application). These events allow me to build aggregates with the following attributes:
- users: userId, name
- modules of an application: moduleId, name
- grants of users for modules of an application: grantId, userId, moduleId, scope
Now I want to create a stream of all grants with the names of users and products (instead of their ids).
I thought of doing it this way:
- create a KTable for users by grouping events by userId. The KTable has userId as its key. This works.
- create a KTable for products by grouping events by productId. The KTable has productId as its key. This works.
- create a stream from the stream of grants and join it with the two KTables.
This does not work. The problem is that joins seem to be possible only on primary keys. But the key of the stream is a technical identifier of the grant, and the keys of the users and products tables are not (they know nothing about grants).
So how should I proceed?
apache-kafka apache-kafka-streams
Is the product table the same as the module table?
– Nishu Tayal
Nov 12 '18 at 11:55
No, they are two different tables. But they are used in the same way in this case (just as reference tables to get information about users and modules).
– gentiane
Nov 12 '18 at 13:01
So which key in the Grants table do you want to use to refer to the product ID?
– Nishu Tayal
Nov 12 '18 at 13:06
In the Grants stream I want to use the userId field to join to the Users table, and the moduleId field to join to the Modules table. The key of the Grants stream is grantId.
– gentiane
Nov 12 '18 at 13:11
please check the answer, and see if it helps you
– Nishu Tayal
Nov 12 '18 at 18:45
asked Nov 12 '18 at 11:04
gentiane
1 Answer
There is no direct support for foreign-key joins in Kafka Streams at the moment.
There is an open JIRA issue for it: https://issues.apache.org/jira/browse/KAFKA-3705
For now, you can work around the problem with a KStream-KTable join.
First, aggregate the user stream and the module stream into their respective KTables, keeping either a collection of events or the latest event per key.
KTable<String, Object> userTable = userStream.groupBy(<UserId>).aggregate(<... build collection/latest event>);
KTable<String, Object> moduleTable = moduleStream.groupBy(<ModuleId>).aggregate(<... build collection/latest event>);
Now select moduleId as the key of the grants stream.
KStream<String, Object> grantRekeyedStream = grantStream.selectKey(<moduleId>);
This changes the key to moduleId, so you can now perform a stream-table join with moduleTable. For each grant record on the left side, the join looks up the module with the same key on the right side. The resulting stream carries grant and module data together, with moduleId as the key.
KStream<String, Object> grantModuleStream = grantRekeyedStream.join(moduleTable, <ValueJoiner combining grant and module>);
The next step is to join with userTable, so you need to rekey grantModuleStream again, this time by userId.
KStream<String, Object> grantModuleRekeyedStream = grantModuleStream.selectKey(<Select UserId>);
Now grantModuleRekeyedStream can be joined with userTable using another KStream-KTable join.
KStream<String, Object> grantModuleUserStream = grantModuleRekeyedStream.join(userTable, <ValueJoiner adding user details>);
The resulting stream has userId as its key, and each record carries a grant together with its module and user details.
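To make the semantics of the two rekey-and-join steps concrete, here is a minimal plain-Java model of them. It is not Kafka Streams code: the two maps stand in for the user and module KTables, and the class name GrantEnrichment, the Grant type, and the output format are all hypothetical names chosen for this illustration. It assumes inner-join behaviour, where a grant without a matching module or user produces no output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the workaround; the maps play the role of the KTables.
class GrantEnrichment {

    // Hypothetical value type; the fields are the grant attributes from the question.
    static final class Grant {
        final String grantId;
        final String userId;
        final String moduleId;
        final String scope;

        Grant(String grantId, String userId, String moduleId, String scope) {
            this.grantId = grantId;
            this.userId = userId;
            this.moduleId = moduleId;
            this.scope = scope;
        }
    }

    // Enriches each grant with the user and module names, dropping unmatched grants.
    static List<String> enrich(List<Grant> grants,
                               Map<String, String> userNames,
                               Map<String, String> moduleNames) {
        List<String> enriched = new ArrayList<>();
        for (Grant grant : grants) {
            // Step 1: "rekey" by moduleId and look the key up in the module table.
            String moduleName = moduleNames.get(grant.moduleId);
            if (moduleName == null) {
                continue; // inner join: no matching module, no output record
            }
            // Step 2: "rekey" by userId and look the key up in the user table.
            String userName = userNames.get(grant.userId);
            if (userName == null) {
                continue; // inner join: no matching user, no output record
            }
            enriched.add(grant.grantId + ": " + userName + " -> " + moduleName
                    + " (" + grant.scope + ")");
        }
        return enriched;
    }

    public static void main(String[] args) {
        Map<String, String> users = new HashMap<>();
        users.put("u1", "Alice");
        Map<String, String> modules = new HashMap<>();
        modules.put("m1", "Billing");

        List<Grant> grants = Arrays.asList(
                new Grant("g1", "u1", "m1", "read"),
                new Grant("g2", "u2", "m1", "write")); // u2 is unknown, so g2 is dropped

        for (String line : enrich(grants, users, modules)) {
            System.out.println(line);
        }
    }
}
```

One consequence of using stream-table joins is worth keeping in mind: only records arriving on the stream side trigger output, so a later change to a user's or module's name does not update grants that were already emitted.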
answered Nov 12 '18 at 13:26
Nishu Tayal