Join on foreign key in Kafka stream


























Let's say that I have three Kafka topics filled with events representing business events occurring in different aggregates (an event-sourcing application). These events are used to build aggregates with the following attributes:




  • users : userId, name

  • modules of an application : moduleId, name

  • grants of users for modules of application : grantId, userId, moduleId, scope


Now I want to create a stream of all grants with the names of users and products (instead of their ids).
I thought of doing it this way:




  1. create a KTable for users by grouping events by userId. The KTable has userId as key. This works.

  2. create a KTable for products by grouping events by productId. The KTable has productId as key. This works too.

  3. create a stream from the stream of Grants and join it with the two KTables.
    This does not work. The problem is that joins seem to be possible only on primary keys. But the key of the stream is a technical identifier of the Grant, whereas the keys of the users and products tables are not (they know nothing about Grants).
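
To make the key mismatch concrete, here is a minimal sketch in plain Java (no Kafka involved, Java 16+ for records). The class name, the record types and the sample data are hypothetical; the map only stands in for the users KTable.

```java
import java.util.Map;

final class KeyMismatchSketch {
    // Hypothetical value types mirroring the attributes listed above.
    record User(String userId, String name) {}
    record Grant(String grantId, String userId, String moduleId, String scope) {}

    // Stand-in for the users KTable: the latest User per userId.
    static final Map<String, User> USERS = Map.of("u1", new User("u1", "Alice"));

    // A key-based join can only look up the record key, which is the grantId here.
    static User lookupByRecordKey(Grant grant) {
        return USERS.get(grant.grantId());
    }

    // What is needed instead: a lookup by a field inside the value (the foreign key).
    static User lookupByForeignKey(Grant grant) {
        return USERS.get(grant.userId());
    }

    public static void main(String[] args) {
        Grant grant = new Grant("g1", "u1", "m1", "read");
        System.out.println(lookupByRecordKey(grant));   // no user is stored under a grantId
        System.out.println(lookupByForeignKey(grant));  // finds the user via grant.userId()
    }
}
```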


So how should I proceed?
































  • Is product table same as module table?
    – Nishu Tayal
    Nov 12 '18 at 11:55










  • No, they are two different tables. But they are used in the same way in this case (just as reference tables to get information about users and modules).
    – gentiane
    Nov 12 '18 at 13:01












  • So which key do you want to use in the Grants table to refer to the product ID?
    – Nishu Tayal
    Nov 12 '18 at 13:06










  • In Grants stream I want to use the userId field to join to the Users table, and the moduleId field to join to the Modules table. The key of Grants stream is grantId.
    – gentiane
    Nov 12 '18 at 13:11












  • Please check the answer and see if it helps you.
    – Nishu Tayal
    Nov 12 '18 at 18:45























apache-kafka apache-kafka-streams
















asked Nov 12 '18 at 11:04









gentiane
























1 Answer
































At the moment, Kafka Streams has no direct support for foreign-key joins.

There is an open proposal for this, tracked in https://issues.apache.org/jira/browse/KAFKA-3705.



For now, you can work around the problem with a KStream-KTable join.



First, aggregate the user stream and the module stream into their respective KTables, each holding either the collection of events or the latest event per key.



KTable<String, Object> userTable = userStream.groupBy(<UserId>).aggregate(<... build collection/latest event>);
KTable<String, Object> moduleTable = moduleStream.groupBy(<ModuleId>).aggregate(<... build collection/latest event>);


Now select the moduleId as the key of the Grants stream.



KStream<String,Object> grantRekeyedStream = grantStream.selectKey(<moduleId>);


This changes the key to moduleId, so you can now perform a stream-table join with moduleTable. Because the key changed, Kafka Streams repartitions the stream before the join. For each record on the left (stream) side, the join looks up the current value stored under the same key on the right (table) side. The resulting stream combines Grant and Module data, with moduleId as key.



KStream<String, Object> grantModuleStream = grantRekeyedStream.join(moduleTable, <ValueJoiner combining grant and module>);


The next step is to join with userTable. To do that, you need to rekey grantModuleStream again, this time by userId.



KStream<String, Object> grantModuleRekeyedStream = grantModuleStream.selectKey(<Select UserId>);


Now grantModuleRekeyedStream can be joined with userTable using another KStream-KTable join.



KStream<String, Object> grantModuleUserStream = grantModuleRekeyedStream.join(userTable, <ValueJoiner combining grant/module and user>);


The resulting stream has userId as key, and each record contains one grant together with its module and user details. If you need grantId as the key again, apply selectKey once more.
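
To show what the two rekey-and-join steps compute, here is a minimal sketch in plain Java (Java 16+ for records). It deliberately does not use the Kafka Streams API: a List stands in for the grants stream, Maps stand in for the two KTables, and every class, record and method name is hypothetical. It models only the inner-join lookup semantics, not repartitioning, serdes or table updates over time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Function;

final class RekeyJoinSketch {
    // Hypothetical value types based on the attributes in the question.
    record User(String userId, String name) {}
    record AppModule(String moduleId, String name) {}
    record Grant(String grantId, String userId, String moduleId, String scope) {}
    record GrantWithModule(Grant grant, String moduleName) {}
    record EnrichedGrant(String grantId, String userName, String moduleName, String scope) {}

    // One inner stream-table join step: derive the new key (the foreign key) from
    // each value, look up the table's current value for that key, and drop the
    // record when the table has no entry.
    static <V, T, R> List<R> rekeyAndJoin(List<V> stream,
                                          Function<V, String> foreignKey,
                                          Map<String, T> table,
                                          BiFunction<V, T, R> joiner) {
        List<R> out = new ArrayList<>();
        for (V value : stream) {
            T match = table.get(foreignKey.apply(value));
            if (match != null) {
                out.add(joiner.apply(value, match));
            }
        }
        return out;
    }

    static List<EnrichedGrant> enrich(List<Grant> grants,
                                      Map<String, User> users,
                                      Map<String, AppModule> modules) {
        // Step 1: key by moduleId and join with the modules table.
        List<GrantWithModule> withModule = rekeyAndJoin(
                grants, Grant::moduleId, modules,
                (g, m) -> new GrantWithModule(g, m.name()));
        // Step 2: key by userId and join with the users table.
        return rekeyAndJoin(
                withModule, gm -> gm.grant().userId(), users,
                (gm, u) -> new EnrichedGrant(
                        gm.grant().grantId(), u.name(), gm.moduleName(), gm.grant().scope()));
    }

    public static void main(String[] args) {
        Map<String, User> users = Map.of("u1", new User("u1", "Alice"));
        Map<String, AppModule> modules = Map.of("m1", new AppModule("m1", "Billing"));
        List<Grant> grants = List.of(
                new Grant("g1", "u1", "m1", "read"),
                new Grant("g2", "u1", "m9", "write")); // m9 is unknown, so this grant is dropped
        enrich(grants, users, modules).forEach(System.out::println);
    }
}
```

The sketch produces one enriched record per grant, which is also what the chained joins emit. With an inner join, a grant whose module or user is not yet in the table produces no output; a left join would keep such grants with the missing side set to null.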
















        answered Nov 12 '18 at 13:26









        Nishu Tayal





























