Pandas pivot table using custom conditions on the dataframe











I want to make a pivot table based on custom conditions in the dataframe:



The dataframe looks like this:



>>> import pandas as pd
>>> df = pd.DataFrame({"Area": ["A", "A", "B", "A", "C", "A", "D", "A"],
...                    "City": ["X", "Y", "Z", "P", "Q", "R", "S", "X"],
...                    "Condition": ["Good", "Bad", "Good", "Good", "Good", "Bad", "Good", "Good"],
...                    "Population": [100, 150, 50, 200, 170, 390, 80, 100],
...                    "Pincode": ["X1", "Y1", "Z1", "P1", "Q1", "R1", "S1", "X2"]})
>>> df
  Area City Condition  Population Pincode
0    A    X      Good         100      X1
1    A    Y       Bad         150      Y1
2    B    Z      Good          50      Z1
3    A    P      Good         200      P1
4    C    Q      Good         170      Q1
5    A    R       Bad         390      R1
6    D    S      Good          80      S1
7    A    X      Good         100      X2


Now I want to pivot the dataframe df so that, for each area, I can see the number of unique cities, the number of unique "Good" cities, and the total population of the area.



I expect an output like this:



Area  city_count  good_city_count  Population
A              4                2         940
B              1                1          50
C              1                1         170
D              1                1          80
All            7                5        1240


I can pass a dictionary to the aggfunc parameter, but this doesn't give me a separate count of the good cities.



>>> city_count = df.pivot_table(index=["Area"],
...                             values=["City", "Population"],
...                             aggfunc={"City": lambda x: len(x.unique()),
...                                      "Population": "sum"},
...                             margins=True)
>>> city_count.reset_index()
  Area  City  Population
0    A     4         940
1    B     1          50
2    C     1         170
3    D     1          80
4  All     7        1240


I could merge two separate pivot tables, one with the city counts and the other with the population, but this does not scale to a large dataset with a big aggfunc dictionary.
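
For reference, the merge-based workaround mentioned above might look roughly like the sketch below. The names totals, good, and merged are illustrative and do not appear in the original post; the data is the df from the question.

```python
import pandas as pd

df = pd.DataFrame({
    "Area": ["A", "A", "B", "A", "C", "A", "D", "A"],
    "City": ["X", "Y", "Z", "P", "Q", "R", "S", "X"],
    "Condition": ["Good", "Bad", "Good", "Good", "Good", "Bad", "Good", "Good"],
    "Population": [100, 150, 50, 200, 170, 390, 80, 100],
    "Pincode": ["X1", "Y1", "Z1", "P1", "Q1", "R1", "S1", "X2"],
})

# Pivot 1: unique cities and total population per area.
totals = df.pivot_table(
    index="Area",
    values=["City", "Population"],
    aggfunc={"City": "nunique", "Population": "sum"},
).rename(columns={"City": "city_count"})

# Pivot 2: unique cities among the "Good" rows only.
good = (
    df[df["Condition"] == "Good"]
    .pivot_table(index="Area", values="City", aggfunc="nunique")
    .rename(columns={"City": "good_city_count"})
)

# Join on the shared Area index; an area without any good city gets 0.
merged = totals.join(good, how="left")
merged["good_city_count"] = merged["good_city_count"].fillna(0).astype(int)
print(merged)
```

Every additional conditional column needs one more filtered pivot and one more join, which is the scaling problem described above.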










      python pandas pivot-table






asked Nov 10 at 13:36 by Pratiek Malhotra, edited Nov 10 at 21:19




2 Answers

















Add the columns parameter together with fill_value; nunique can also be used as the aggregate function:



city_count = df.pivot_table(index="Area",
                            values="City",
                            columns="Condition",
                            aggfunc=lambda x: x.nunique(),
                            margins=True,
                            fill_value=0)
print(city_count)
Condition  Bad  Good  All
Area
A            2     2    4
B            0     1    1
C            0     1    1
D            0     1    1
All          2     5    7


Finally, if needed, convert the index to a column and change the column names:



# The axis must be passed by keyword in recent pandas versions.
city_count = city_count.add_suffix('_count').reset_index().rename_axis(None, axis=1)
print(city_count)
  Area  Bad_count  Good_count  All_count
0    A          2           2          4
1    B          0           1          1
2    C          0           1          1
3    D          0           1          1
4  All          2           5          7


EDIT:



import numpy as np
import pandas as pd

d = {'City': 'nunique', 'Population': 'sum', 'good_city_count': 'nunique'}
d1 = {'City': 'city_count'}

# Keep the city name only on "Good" rows; nunique ignores the NaN values.
mask = df["Condition"] == 'Good'
df = (df.assign(good_city_count=lambda x: np.where(mask, x['City'], np.nan))
        .groupby('Area')
        .agg(d)
        .rename(columns=d1))

# DataFrame.append was removed in pandas 2.0, so the totals row is added with pd.concat.
df = pd.concat([df, df.sum().rename('All').to_frame().T]).rename_axis('Area').reset_index()

print(df)
  Area  city_count  Population  good_city_count
0    A           4         940                2
1    B           1          50                1
2    C           1         170                1
3    D           1          80                1
4  All           7        1240                5
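
A possible alternative to the dictionary-plus-rename pattern is pandas' named aggregation (available since pandas 0.25), which names the output columns directly. This is only a sketch and not part of the original answer: summarize_areas and the helper column good_city are hypothetical names, and Series.where is swapped in for np.where.

```python
import pandas as pd


def summarize_areas(df: pd.DataFrame) -> pd.DataFrame:
    """Per-area unique city count, unique "Good" city count, and population."""
    # Helper column: the city name on "Good" rows, NaN elsewhere.
    # nunique skips NaN, so only good cities are counted.
    good_city = df["City"].where(df["Condition"] == "Good")

    summary = (
        df.assign(good_city=good_city)
        .groupby("Area")
        .agg(
            city_count=("City", "nunique"),
            good_city_count=("good_city", "nunique"),
            Population=("Population", "sum"),
        )
    )

    # Totals row: the sum of the per-area values, as in the expected output.
    totals = summary.sum().rename("All").to_frame().T
    return pd.concat([summary, totals]).rename_axis("Area").reset_index()


df = pd.DataFrame({
    "Area": ["A", "A", "B", "A", "C", "A", "D", "A"],
    "City": ["X", "Y", "Z", "P", "Q", "R", "S", "X"],
    "Condition": ["Good", "Bad", "Good", "Good", "Good", "Bad", "Good", "Good"],
    "Population": [100, 150, 50, 200, 170, 390, 80, 100],
})
result = summarize_areas(df)
print(result)
```

Because the function works on a copy created by assign, the caller's df is left unchanged, and further conditional counts can be added as extra helper columns and extra keyword arguments to agg.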





• This wouldn't give me the total_count of cities against the good_count
  – Pratiek Malhotra, Nov 10 at 14:18

• @PratiekMalhotra - sorry, you are right. Rolled back to the previous answer.
  – jezrael, Nov 10 at 14:20

• @PratiekMalhotra - If your question was answered, please accept the most helpful answer by clicking the grey check to the left of that answer to toggle it green. It helps the community. Thanks.
  – jezrael, Nov 10 at 14:24

• @jezrael, in dataframes where I need to aggregate values from multiple columns, I cannot use the "columns" parameter, as it would aggregate all the values based on that.
  – Pratiek Malhotra, Nov 10 at 14:31

• @PratiekMalhotra - Can you create a minimal, complete, and verifiable example? I am not sure I understand. Thank you.
  – jezrael, Nov 10 at 14:36


















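Another method without using pivot_table. Use np.where with groupby+agg:



import numpy as np

# Replace Condition with the city name on "Good" rows and NaN elsewhere,
# so that nunique counts only the distinct good cities.
df['Condition'] = np.where(df['Condition'] == 'Good', df['City'], np.nan)
# The chained call is wrapped in parentheses so it can span several lines.
df = (df.groupby('Area')
        .agg({'City': 'nunique', 'Condition': 'nunique', 'Population': 'sum'})
        .rename(columns={'City': 'city_count', 'Condition': 'good_city_count'}))
df.loc['All', :] = df.sum()  # totals row; this may upcast the columns to float
df = df.astype(int).reset_index()

print(df)
  Area  city_count  good_city_count  Population
0    A           4                2         940
1    B           1                1          50
2    C           1                1         170
3    D           1                1          80
4  All           7                5        1240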





• When using this, the good city count will not be unique as it will be summing all the duplicate city appearances for Area as well
  – Pratiek Malhotra, Nov 10 at 20:48

• @PratiekMalhotra Check the update.
  – Sandeep Kadapa, Nov 11 at 2:46
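
The distinction raised in the first comment, counting "Good" rows versus counting distinct "Good" cities, can be seen on the question's data, where city X appears twice in area A. A small sketch (the names is_good, good_rows, and good_unique are illustrative only):

```python
import pandas as pd

df = pd.DataFrame({
    "Area": ["A", "A", "B", "A", "C", "A", "D", "A"],
    "City": ["X", "Y", "Z", "P", "Q", "R", "S", "X"],
    "Condition": ["Good", "Bad", "Good", "Good", "Good", "Bad", "Good", "Good"],
})

is_good = df["Condition"] == "Good"

# Counting rows: every "Good" row is counted, including the repeated city X.
good_rows = is_good.groupby(df["Area"]).sum()

# Counting distinct cities: duplicates within an area collapse to one.
good_unique = df[is_good].groupby("Area")["City"].nunique()

print(pd.DataFrame({"good_rows": good_rows, "good_unique": good_unique}))
```

For area A the row count is 3 while the distinct count is 2, which is why the answers rely on nunique rather than on a sum over a boolean mask.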











First answer: answered Nov 10 at 13:40 by jezrael, edited Nov 10 at 22:25








Second answer: answered Nov 10 at 13:48 by Sandeep Kadapa, edited Nov 11 at 2:44











