Pandas select on multiple columns then replace

I am trying to select on multiple columns and then replace values in pandas.

df:

a  b  c  d  e
0  1  1  0  none
0  0  0  1  none
1  0  0  0  none
0  0  0  0  none

Select where any or all of a, b, c, d are non-zero:

i, j = np.where(df)
s = pd.Series(dict(zip(zip(i, j), df.columns[j]))).reset_index(-1, drop=True)

s:

0    b
0    c
1    d
2    a

Now I want to replace the values in column e with the series:

df['e'] = s.values

so that e looks like:

e:

b, c
d
a
none

The problem is that the length of the series differs from the number of rows in the dataframe.

Any idea how I can do this?

Your comment code worked perfectly. I couldn't get the 'duplicate' answer to work, so from that perspective it isn't a 100% duplicate.
– proximacentauri
Nov 11 at 4:50

python pandas

asked Nov 11 at 4:40 by proximacentauri (edited Nov 11 at 4:41)

2 Answers

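accepted

Use DataFrame.dot to multiply the 0/1 values by the column names (each with ', ' appended), strip the trailing separator with rstrip, and finally use numpy.where to replace empty strings with None. This requires df to contain only the numeric columns a, b, c, d when dot is called:
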
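# Multiply each 0/1 flag by "<column name>, " and concatenate along the row.
e = df.dot(df.columns + ', ').str.rstrip(', ')
# Rows with no 1 give an empty string, which becomes None.
df['e'] = np.where(e.astype(bool), e, None)
print(df)
   a  b  c  d     e
0  0  1  1  0  b, c
1  0  0  0  1     d
2  1  0  0  0     a
3  0  0  0  0  None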
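
To see why a dot product can produce labels, it may help to spell out the arithmetic for a single row without pandas. The following is only an illustrative sketch: the helper name label_row is made up for this example, and it mimics the idea rather than reproducing what DataFrame.dot does internally. It relies on two Python facts: an integer times a string repeats the string (0 gives '' and 1 gives the string itself), and adding strings concatenates them.

```python
def label_row(flags, names, sep=", "):
    """Join the names whose flag is 1, or return None if there are none.

    Hypothetical helper for illustration only; it is not part of pandas.
    """
    # flag * (name + sep) is "" for 0 and "name, " for 1; joining the pieces
    # plays the role of the sum in a dot product.
    joined = "".join(flag * (name + sep) for flag, name in zip(flags, names))
    # rstrip removes any trailing ',' or ' ' characters, not an exact suffix.
    joined = joined.rstrip(sep)
    return joined or None


names = ["a", "b", "c", "d"]
rows = [[0, 1, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 0, 0, 0]]
labels = [label_row(row, names) for row in rows]
print(labels)  # ['b, c', 'd', 'a', None]
```

A flag larger than 1 would repeat the name, so this trick only fits 0/1 data.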





You can locate the 1s and use their positions as a boolean index into the dataframe's columns:

# The outer parentheses let the chained expression span two lines.
df['e'] = ((df == 1).apply(lambda x: df.columns[x], axis=1)
           .str.join(",").replace('', 'none'))
#    a  b  c  d     e
# 0  0  1  1  0   b,c
# 1  0  0  0  1     d
# 2  1  0  0  0     a
# 3  0  0  0  0  none
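
Another option is to stay with the np.where output from the question and group the column names by row, letting pandas align on the index so that the length mismatch no longer matters. This is a sketch under a few assumptions: the frame is rebuilt here from the question's sample data, flag_cols is a name introduced for the example, and rows without any non-zero value end up as a missing value rather than the string 'none'.

```python
import numpy as np
import pandas as pd

# Sample data from the question; column e starts out empty.
df = pd.DataFrame({
    "a": [0, 0, 1, 0],
    "b": [1, 0, 0, 0],
    "c": [1, 0, 0, 0],
    "d": [0, 1, 0, 0],
    "e": [None, None, None, None],
})
flag_cols = ["a", "b", "c", "d"]  # assumed indicator columns

# Row and column positions of every non-zero cell, in row-major order.
i, j = np.where(df[flag_cols].to_numpy() != 0)

# One entry per non-zero cell, indexed by the row label it came from.
names = pd.Series(np.array(flag_cols)[j], index=df.index[i])

# Collapse to one string per row; rows with no hit are simply absent.
joined = names.groupby(level=0).agg(", ".join)

# Reindexing restores the full row index, so the assignment lines up.
df["e"] = joined.reindex(df.index)
print(df)
```

If the literal string 'none' is wanted for empty rows, a fillna('none') on the reindexed series would supply it.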





Accepted answer (DataFrame.dot): answered Nov 11 at 4:49 by jezrael

Second answer (boolean indexing): answered Nov 11 at 4:55 by DYZ (edited Nov 11 at 4:55)