Create a list of values for each key in pandas?












I have a CSV file that looks something like this, which I have loaded into a DataFrame:



keycode, warehouse_id
1, WH1
1, WH2
1, WH1


I want to map that to an output that looks like this:



keycode, warehouse_ids
1, [WH1, WH2]


I am not sure where to start with this in pandas. I tried using a pivot table, but I can't choose the right aggregate function.



Thanks in advance.










      python pandas






      asked Nov 13 '18 at 4:55









      tmcnicol
























          3 Answers
              Use groupby + unique (this assumes the columns are named 'keycode' and 'warehouse_id', with no stray comma or space in the names):

              df1 = df.groupby('keycode')['warehouse_id'].unique().reset_index()

              print(df1)
              keycode warehouse_id
              0 1 [WH1, WH2]


              Explanation:

              Calling groupby and then applying an operation to a single column, as below, produces a Series whose index is the groupby key. reset_index turns that index back into a regular column, which gives a DataFrame:

              print(df.groupby('keycode')['warehouse_id'].unique())
              keycode
              1 [WH1, WH2]
              Name: warehouse_id, dtype: object

              print(type(df.groupby('keycode')['warehouse_id'].unique()))
              <class 'pandas.core.series.Series'>

              print(df.groupby('keycode')['warehouse_id'].unique().reset_index())
              keycode warehouse_id
              0 1 [WH1, WH2]
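
For anyone who wants to try the groupby + unique approach end to end, here is a minimal sketch. It assumes the file is read with `skipinitialspace=True` so the column names carry no leading space; the inline `csv_text` string is only an illustrative stand-in for the real CSV file, and the final conversion to plain lists is optional.

```python
import io

import pandas as pd

# Illustrative stand-in for the CSV file from the question (an assumption,
# not the asker's actual file).
csv_text = "keycode, warehouse_id\n1, WH1\n1, WH2\n1, WH1\n"

# skipinitialspace=True drops the blank after each comma, so the columns
# are named 'keycode' and 'warehouse_id'.
df = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

# groupby + unique returns a Series indexed by keycode; each value holds the
# distinct warehouse ids of that group as an array.
unique_ids = df.groupby("keycode")["warehouse_id"].unique()

# reset_index moves keycode out of the index into an ordinary column.
result = unique_ids.reset_index()

# Optional: turn each array into a plain Python list.
result["warehouse_id"] = result["warehouse_id"].apply(list)
print(result)
```

If the file is read without `skipinitialspace=True`, the second column would likely be named `' warehouse_id'` (with a leading space), and the lookup by `'warehouse_id'` would fail.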






              edited Nov 13 '18 at 5:20

























              answered Nov 13 '18 at 4:58









              Sandeep Kadapa








              • Nice answer (-:

                – piRSquared
                Nov 13 '18 at 5:12











              • Forgive my ignorance, what is the reset_index doing for me here?

                – tmcnicol
                Nov 13 '18 at 5:15











              • @tmcnicol Check the explanation.

                – Sandeep Kadapa
                Nov 13 '18 at 5:22











              • @sandeep-kadapa Thanks so much, I would upvote again if I could.

                – tmcnicol
                Nov 13 '18 at 5:24











              • @tmcnicol Glad to help.

                – Sandeep Kadapa
                Nov 13 '18 at 5:24














                  The pandas 'groupby' method is meant for this kind of task.

                  You can just do:

                  df.groupby('keycode')['warehouse_id'].apply(list)

                  assuming 'df' is the name of your DataFrame.
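
One thing worth checking with `apply(list)` is that it collects every value per key, repeats included. A small sketch with sample data mirroring the question (the `all_ids` name and the `warehouse_ids` column label are just illustrative choices):

```python
import pandas as pd

# Sample data mirroring the question.
df = pd.DataFrame({"keycode": [1, 1, 1], "warehouse_id": ["WH1", "WH2", "WH1"]})

# apply(list) gathers all values of each group, duplicates included.
all_ids = df.groupby("keycode")["warehouse_id"].apply(list)
print(all_ids.loc[1])  # ['WH1', 'WH2', 'WH1']

# reset_index(name=...) turns the Series into a DataFrame and names the
# list column in the same step.
as_frame = all_ids.reset_index(name="warehouse_ids")
print(as_frame)
```

So this form suits cases where repeated values should be kept; for the deduplicated output in the question, the values need to be made unique before or while building the list.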







                  answered Nov 13 '18 at 5:02









                  gauravtolani























                          list(set(iterable))



                          df.groupby('keycode').warehouse_id.apply(lambda x: [*{*x}]).reset_index()

                          keycode warehouse_id
                          0 1 [WH2, WH1]




                          drop_duplicates



                          df.drop_duplicates().groupby('keycode').warehouse_id.apply(list).reset_index()

                          keycode warehouse_id
                          0 1 [WH1, WH2]
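
Because a Python set has no guaranteed iteration order, the set-based version can return the ids in a different order from run to run (as in the `[WH2, WH1]` output above). If the order matters, two possible variants are sketched below; the sample data and variable names are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Illustrative data with two keys.
df = pd.DataFrame(
    {"keycode": [1, 1, 1, 2], "warehouse_id": ["WH2", "WH1", "WH2", "WH3"]}
)

grouped = df.groupby("keycode")["warehouse_id"]

# Variant 1: sort the distinct values, so the result does not depend on
# set iteration order.
sorted_ids = grouped.apply(lambda x: sorted(set(x))).reset_index()

# Variant 2: keep first-seen order; dict keys preserve insertion order
# (Python 3.7+).
first_seen_ids = grouped.apply(lambda x: list(dict.fromkeys(x))).reset_index()

print(sorted_ids)
print(first_seen_ids)
```

The `drop_duplicates` version above also keeps first-seen order, but it drops rows that are duplicated across all columns of the frame, so it fits best when the frame holds only the key and value columns.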






                          answered Nov 13 '18 at 5:10









                          piRSquared





























