Rename result columns from Pandas aggregation (“FutureWarning: using a dict with renaming is deprecated”)

I'm trying to do some aggregations on a pandas data frame. Here is a sample code:

import pandas as pd



df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1"],

                  "Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0]})



df.groupby(["User"]).agg({"Amount": {"Sum": "sum", "Count": "count"}})



Out[1]: 

      Amount      

         Sum Count

User              

user1   18.0     2

user2   20.5     3

user3   10.5     1

Which generates the following warning:

FutureWarning: using a dict with renaming is deprecated and will be
removed in a future version return super(DataFrameGroupBy,
self).aggregate(arg, *args, **kwargs)

How can I avoid this?

edited May 8 '18 at 23:31

smci

14.7k672104

asked Jun 19 '17 at 16:28

Victor Mayrink

3331314

4

I'd love to know why this is being depreciated (I'm sure there is a good reason). Does anyone have a link to a discussion on it?

– Stephen McAteer
Nov 24 '17 at 5:31

To focus on the keywords of the solution rather than just the existing warning, I retitled "rename result columns from aggregation" and tagged. Now people might even find this question :) ahead of (say) the not-so-canonicalNaming returned columns in Pandas aggregate function?

– smci
May 8 '18 at 23:32

2

Hopefully this will be addressed in github.com/pandas-dev/pandas/issues/18366

– Nickolay
May 28 '18 at 8:06

How would this work if I don't do a "groupby" but I'm doing "pivot" instead?

– avloss
Jun 11 '18 at 12:57

add a comment |

I'm trying to do some aggregations on a pandas data frame. Here is a sample code:

import pandas as pd



df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1"],

                  "Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0]})



df.groupby(["User"]).agg({"Amount": {"Sum": "sum", "Count": "count"}})



Out[1]: 

      Amount      

         Sum Count

User              

user1   18.0     2

user2   20.5     3

user3   10.5     1

Which generates the following warning:

FutureWarning: using a dict with renaming is deprecated and will be
removed in a future version return super(DataFrameGroupBy,
self).aggregate(arg, *args, **kwargs)

How can I avoid this?

edited May 8 '18 at 23:31

smci

14.7k672104

asked Jun 19 '17 at 16:28

Victor Mayrink

3331314

4

I'd love to know why this is being depreciated (I'm sure there is a good reason). Does anyone have a link to a discussion on it?

– Stephen McAteer
Nov 24 '17 at 5:31

To focus on the keywords of the solution rather than just the existing warning, I retitled "rename result columns from aggregation" and tagged. Now people might even find this question :) ahead of (say) the not-so-canonicalNaming returned columns in Pandas aggregate function?

– smci
May 8 '18 at 23:32

2

Hopefully this will be addressed in github.com/pandas-dev/pandas/issues/18366

– Nickolay
May 28 '18 at 8:06

How would this work if I don't do a "groupby" but I'm doing "pivot" instead?

– avloss
Jun 11 '18 at 12:57

add a comment |

I'm trying to do some aggregations on a pandas data frame. Here is a sample code:

import pandas as pd



df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1"],

                  "Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0]})



df.groupby(["User"]).agg({"Amount": {"Sum": "sum", "Count": "count"}})



Out[1]: 

      Amount      

         Sum Count

User              

user1   18.0     2

user2   20.5     3

user3   10.5     1

Which generates the following warning:

FutureWarning: using a dict with renaming is deprecated and will be
removed in a future version return super(DataFrameGroupBy,
self).aggregate(arg, *args, **kwargs)

How can I avoid this?

edited May 8 '18 at 23:31

smci

14.7k672104

asked Jun 19 '17 at 16:28

Victor Mayrink

3331314

I'm trying to do some aggregations on a pandas data frame. Here is a sample code:

import pandas as pd



df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1"],

                  "Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0]})



df.groupby(["User"]).agg({"Amount": {"Sum": "sum", "Count": "count"}})



Out[1]: 

      Amount      

         Sum Count

User              

user1   18.0     2

user2   20.5     3

user3   10.5     1

Which generates the following warning:

FutureWarning: using a dict with renaming is deprecated and will be
removed in a future version return super(DataFrameGroupBy,
self).aggregate(arg, *args, **kwargs)

How can I avoid this?

python pandas aggregate rename

edited May 8 '18 at 23:31

smci

14.7k672104

asked Jun 19 '17 at 16:28

Victor Mayrink

3331314

edited May 8 '18 at 23:31

smci

14.7k672104

asked Jun 19 '17 at 16:28

Victor Mayrink

3331314

edited May 8 '18 at 23:31

smci

14.7k672104

edited May 8 '18 at 23:31

smci

14.7k672104

edited May 8 '18 at 23:31

smci

14.7k672104

asked Jun 19 '17 at 16:28

Victor Mayrink

3331314

asked Jun 19 '17 at 16:28

Victor Mayrink

3331314

asked Jun 19 '17 at 16:28

Victor Mayrink

3331314

4

I'd love to know why this is being depreciated (I'm sure there is a good reason). Does anyone have a link to a discussion on it?

– Stephen McAteer
Nov 24 '17 at 5:31

To focus on the keywords of the solution rather than just the existing warning, I retitled "rename result columns from aggregation" and tagged. Now people might even find this question :) ahead of (say) the not-so-canonicalNaming returned columns in Pandas aggregate function?

– smci
May 8 '18 at 23:32

2

Hopefully this will be addressed in github.com/pandas-dev/pandas/issues/18366

– Nickolay
May 28 '18 at 8:06

How would this work if I don't do a "groupby" but I'm doing "pivot" instead?

– avloss
Jun 11 '18 at 12:57

add a comment |

4

I'd love to know why this is being depreciated (I'm sure there is a good reason). Does anyone have a link to a discussion on it?

– Stephen McAteer
Nov 24 '17 at 5:31

To focus on the keywords of the solution rather than just the existing warning, I retitled "rename result columns from aggregation" and tagged. Now people might even find this question :) ahead of (say) the not-so-canonicalNaming returned columns in Pandas aggregate function?

– smci
May 8 '18 at 23:32

2

Hopefully this will be addressed in github.com/pandas-dev/pandas/issues/18366

– Nickolay
May 28 '18 at 8:06

How would this work if I don't do a "groupby" but I'm doing "pivot" instead?

– avloss
Jun 11 '18 at 12:57

I'd love to know why this is being depreciated (I'm sure there is a good reason). Does anyone have a link to a discussion on it?

– Stephen McAteer
Nov 24 '17 at 5:31

To focus on the keywords of the solution rather than just the existing warning, I retitled "rename result columns from aggregation" and tagged. Now people might even find this question :) ahead of (say) the not-so-canonicalNaming returned columns in Pandas aggregate function?

– smci
May 8 '18 at 23:32

Hopefully this will be addressed in github.com/pandas-dev/pandas/issues/18366

– Nickolay
May 28 '18 at 8:06

How would this work if I don't do a "groupby" but I'm doing "pivot" instead?

– avloss
Jun 11 '18 at 12:57

add a comment |

3 Answers
3

active

oldest

votes

Use groupby `apply` and return a Series to rename columns

Use the groupby apply method to perform an aggregation that

Renames the columns

Allows for spaces in the names

Allows you to order the returned columns in any way you choose

Allows for interactions between columns

Returns a single level index and NOT a MultiIndex

To do this:

create a custom function that you pass to apply

This custom function is passed each group as a DataFrame

Return a Series

The index of the Series will be the new columns

Create fake data

df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1", "user3"],

                  "Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0, 9],

                  'Score': [9, 1, 8, 7, 7, 6, 9]})

enter image description here

create custom function that returns a Series

The variable x inside of my_agg is a DataFrame

def my_agg(x):

    names = {

        'Amount mean': x['Amount'].mean(),

        'Amount std':  x['Amount'].std(),

        'Amount range': x['Amount'].max() - x['Amount'].min(),

        'Score Max':  x['Score'].max(),

        'Score Sum': x['Score'].sum(),

        'Amount Score Sum': (x['Amount'] * x['Score']).sum()}



    return pd.Series(names, index=['Amount range', 'Amount std', 'Amount mean',

                                   'Score Sum', 'Score Max', 'Amount Score Sum'])

Pass this custom function to the groupby apply method

df.groupby('User').apply(my_agg)

enter image description here

The big downside is that this function will be much slower than agg for the cythonized aggregations

Using a dictionary with groupby `agg` method

Using a dictionary of dictionaries was removed because of its complexity and somewhat ambiguous nature. There is an ongoing discussion on how to improve this functionality in the future on github Here, you can directly access the aggregating column after the groupby call. Simply pass a list of all the aggregating functions you wish to apply.

df.groupby('User')['Amount'].agg(['sum', 'count'])

Output

       sum  count

User              

user1  18.0      2

user2  20.5      3

user3  10.5      1

It is still possible to use a dictionary to explicitly denote different aggregations for different columns, like here if there was another numeric column named Other.

df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1"],

              "Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0],

              'Other': [1,2,3,4,5,6]})



df.groupby('User').agg({'Amount' : ['sum', 'count'], 'Other':['max', 'std']})

Output

      Amount       Other          

         sum count   max       std

User                              

user1   18.0     2     6  3.535534

user2   20.5     3     5  1.527525

user3   10.5     1     4       NaN

edited Dec 7 '17 at 14:41

answered Jun 19 '17 at 16:33

Ted Petrou

21.7k76864

13

But suppose you do want the renaming aspect as well, to have different column names than the default after aggregation. Is there some syntax to still get this functionality?

– ErnestScribbler
Sep 25 '17 at 14:13

2

Had same question about naming, since I am using the same col twice (one min and one max) and need a way to refer to them uniquely when I put results back in objects.

– mgmonteleone
Oct 9 '17 at 22:40

If you want to rename the columns you will have to do it manually. Either use a list to replace all of them df.columns = ['your', 'new', 'col', 'names'] or use the rename method which will be a bit difficult since you have this results in a MultiIndex.

– Ted Petrou
Oct 9 '17 at 22:44

@ErnestScribbler I have updated this answer to show a fairly simple way of performing an aggregation, renaming and returning a single-level index.

– Ted Petrou
Dec 7 '17 at 14:45

2

How would the apply method work with 'first' and 'last'?

– Gregory Saxton
Feb 16 '18 at 19:14

|
show 1 more comment

If you replace the internal dictionary with a list of tuples it gets rid of the warning message

import pandas as pd



df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1"],

                  "Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0]})



df.groupby(["User"]).agg({"Amount": [("Sum", "sum"), ("Count", "count")]})

answered Jun 5 '18 at 9:44

Jacob Stevenson

1,753178

Do you know if this will also (presumably) work in the future release or is also just to workaround the warning message?

– Peanut
Jun 6 '18 at 12:42

@Peanut, I do not know. But if the warning message is not there, then presumably (as you say) it will continue to be supported.

– Jacob Stevenson
Jun 6 '18 at 16:30

3

This is an undocumented and accidental feature and I would highly suggest no one using this syntax as it may not work in the future.

– Ted Petrou
Jun 13 '18 at 12:27

Thanks for the information @TedPetrou. And thanks for the link to the discussion in your answer. It sounds like a difficult question to find the right syntax.

– Jacob Stevenson
Jun 13 '18 at 16:35

add a comment |

This is what I did:

Create a fake dataset:

import pandas as pd

df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1", "user3"],

                  "Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0, 9],

                  'Score': [9, 1, 8, 7, 7, 6, 9]})

df

O/P:

    Amount  Score   User

0   10.0    9   user1

1   5.0 1   user2

2   8.0 8   user2

3   10.5    7   user3

4   7.5 7   user2

5   8.0 6   user1

6   9.0 9   user3

I first made the User the index, and then a groupby:

ans = df.set_index('User').groupby(level=0)['Amount'].agg([('Sum','sum'),('Count','count')])

ans

Solution:

    Sum Count

User        

user1   18.0    2

user2   20.5    3

user3   19.5    2

answered Nov 12 '18 at 14:46

Jyothsna Harithsa

445

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f44635626%2frename-result-columns-from-pandas-aggregation-futurewarning-using-a-dict-with%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

3 Answers
3

active

oldest

votes

3 Answers
3

active

oldest

votes

Use groupby `apply` and return a Series to rename columns

Use the groupby apply method to perform an aggregation that

Renames the columns

Allows for spaces in the names

Allows you to order the returned columns in any way you choose

Allows for interactions between columns

Returns a single level index and NOT a MultiIndex

To do this:

create a custom function that you pass to apply

This custom function is passed each group as a DataFrame

Return a Series

The index of the Series will be the new columns

Create fake data

df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1", "user3"],

                  "Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0, 9],

                  'Score': [9, 1, 8, 7, 7, 6, 9]})

enter image description here

create custom function that returns a Series

The variable x inside of my_agg is a DataFrame

def my_agg(x):

    names = {

        'Amount mean': x['Amount'].mean(),

        'Amount std':  x['Amount'].std(),

        'Amount range': x['Amount'].max() - x['Amount'].min(),

        'Score Max':  x['Score'].max(),

        'Score Sum': x['Score'].sum(),

        'Amount Score Sum': (x['Amount'] * x['Score']).sum()}



    return pd.Series(names, index=['Amount range', 'Amount std', 'Amount mean',

                                   'Score Sum', 'Score Max', 'Amount Score Sum'])

Pass this custom function to the groupby apply method

df.groupby('User').apply(my_agg)

enter image description here

The big downside is that this function will be much slower than agg for the cythonized aggregations

Using a dictionary with groupby `agg` method

df.groupby('User')['Amount'].agg(['sum', 'count'])

Output

       sum  count

User              

user1  18.0      2

user2  20.5      3

user3  10.5      1

It is still possible to use a dictionary to explicitly denote different aggregations for different columns, like here if there was another numeric column named Other.

df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1"],

              "Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0],

              'Other': [1,2,3,4,5,6]})



df.groupby('User').agg({'Amount' : ['sum', 'count'], 'Other':['max', 'std']})

Output

      Amount       Other          

         sum count   max       std

User                              

user1   18.0     2     6  3.535534

user2   20.5     3     5  1.527525

user3   10.5     1     4       NaN

edited Dec 7 '17 at 14:41

answered Jun 19 '17 at 16:33

Ted Petrou

21.7k76864

13

But suppose you do want the renaming aspect as well, to have different column names than the default after aggregation. Is there some syntax to still get this functionality?

– ErnestScribbler
Sep 25 '17 at 14:13

2

Had same question about naming, since I am using the same col twice (one min and one max) and need a way to refer to them uniquely when I put results back in objects.

– mgmonteleone
Oct 9 '17 at 22:40

If you want to rename the columns you will have to do it manually. Either use a list to replace all of them df.columns = ['your', 'new', 'col', 'names'] or use the rename method which will be a bit difficult since you have this results in a MultiIndex.

– Ted Petrou
Oct 9 '17 at 22:44

@ErnestScribbler I have updated this answer to show a fairly simple way of performing an aggregation, renaming and returning a single-level index.

– Ted Petrou
Dec 7 '17 at 14:45

2

How would the apply method work with 'first' and 'last'?

– Gregory Saxton
Feb 16 '18 at 19:14

|
show 1 more comment

Use groupby `apply` and return a Series to rename columns

Use the groupby apply method to perform an aggregation that

Renames the columns

Allows for spaces in the names

Allows you to order the returned columns in any way you choose

Allows for interactions between columns

Returns a single level index and NOT a MultiIndex

To do this:

create a custom function that you pass to apply

This custom function is passed each group as a DataFrame

Return a Series

The index of the Series will be the new columns

Create fake data

df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1", "user3"],

                  "Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0, 9],

                  'Score': [9, 1, 8, 7, 7, 6, 9]})

enter image description here

create custom function that returns a Series

The variable x inside of my_agg is a DataFrame

def my_agg(x):

    names = {

        'Amount mean': x['Amount'].mean(),

        'Amount std':  x['Amount'].std(),

        'Amount range': x['Amount'].max() - x['Amount'].min(),

        'Score Max':  x['Score'].max(),

        'Score Sum': x['Score'].sum(),

        'Amount Score Sum': (x['Amount'] * x['Score']).sum()}



    return pd.Series(names, index=['Amount range', 'Amount std', 'Amount mean',

                                   'Score Sum', 'Score Max', 'Amount Score Sum'])

Pass this custom function to the groupby apply method

df.groupby('User').apply(my_agg)

enter image description here

The big downside is that this function will be much slower than agg for the cythonized aggregations

Using a dictionary with groupby `agg` method

df.groupby('User')['Amount'].agg(['sum', 'count'])

Output

       sum  count

User              

user1  18.0      2

user2  20.5      3

user3  10.5      1

It is still possible to use a dictionary to explicitly denote different aggregations for different columns, like here if there was another numeric column named Other.

df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1"],

              "Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0],

              'Other': [1,2,3,4,5,6]})



df.groupby('User').agg({'Amount' : ['sum', 'count'], 'Other':['max', 'std']})

Output

      Amount       Other          

         sum count   max       std

User                              

user1   18.0     2     6  3.535534

user2   20.5     3     5  1.527525

user3   10.5     1     4       NaN

edited Dec 7 '17 at 14:41

answered Jun 19 '17 at 16:33

Ted Petrou

21.7k76864

13

But suppose you do want the renaming aspect as well, to have different column names than the default after aggregation. Is there some syntax to still get this functionality?

– ErnestScribbler
Sep 25 '17 at 14:13

2

Had same question about naming, since I am using the same col twice (one min and one max) and need a way to refer to them uniquely when I put results back in objects.

– mgmonteleone
Oct 9 '17 at 22:40

If you want to rename the columns you will have to do it manually. Either use a list to replace all of them df.columns = ['your', 'new', 'col', 'names'] or use the rename method which will be a bit difficult since you have this results in a MultiIndex.

– Ted Petrou
Oct 9 '17 at 22:44

@ErnestScribbler I have updated this answer to show a fairly simple way of performing an aggregation, renaming and returning a single-level index.

– Ted Petrou
Dec 7 '17 at 14:45

2

How would the apply method work with 'first' and 'last'?

– Gregory Saxton
Feb 16 '18 at 19:14

|
show 1 more comment

Use groupby `apply` and return a Series to rename columns

Use the groupby apply method to perform an aggregation that

Renames the columns

Allows for spaces in the names

Allows you to order the returned columns in any way you choose

Allows for interactions between columns

Returns a single level index and NOT a MultiIndex

To do this:

create a custom function that you pass to apply

This custom function is passed each group as a DataFrame

Return a Series

The index of the Series will be the new columns

Create fake data

df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1", "user3"],

                  "Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0, 9],

                  'Score': [9, 1, 8, 7, 7, 6, 9]})

enter image description here

create custom function that returns a Series

The variable x inside of my_agg is a DataFrame

def my_agg(x):

    names = {

        'Amount mean': x['Amount'].mean(),

        'Amount std':  x['Amount'].std(),

        'Amount range': x['Amount'].max() - x['Amount'].min(),

        'Score Max':  x['Score'].max(),

        'Score Sum': x['Score'].sum(),

        'Amount Score Sum': (x['Amount'] * x['Score']).sum()}



    return pd.Series(names, index=['Amount range', 'Amount std', 'Amount mean',

                                   'Score Sum', 'Score Max', 'Amount Score Sum'])

Pass this custom function to the groupby apply method

df.groupby('User').apply(my_agg)

enter image description here

The big downside is that this function will be much slower than agg for the cythonized aggregations

Using a dictionary with groupby `agg` method

df.groupby('User')['Amount'].agg(['sum', 'count'])

Output

       sum  count

User              

user1  18.0      2

user2  20.5      3

user3  10.5      1

It is still possible to use a dictionary to explicitly denote different aggregations for different columns, like here if there was another numeric column named Other.

df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1"],

              "Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0],

              'Other': [1,2,3,4,5,6]})



df.groupby('User').agg({'Amount' : ['sum', 'count'], 'Other':['max', 'std']})

Output

      Amount       Other          

         sum count   max       std

User                              

user1   18.0     2     6  3.535534

user2   20.5     3     5  1.527525

user3   10.5     1     4       NaN

edited Dec 7 '17 at 14:41

answered Jun 19 '17 at 16:33

Ted Petrou

21.7k76864

Use groupby `apply` and return a Series to rename columns

Use the groupby apply method to perform an aggregation that

Renames the columns

Allows for spaces in the names

Allows you to order the returned columns in any way you choose

Allows for interactions between columns

Returns a single level index and NOT a MultiIndex

To do this:

create a custom function that you pass to apply

This custom function is passed each group as a DataFrame

Return a Series

The index of the Series will be the new columns

Create fake data

df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1", "user3"],

                  "Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0, 9],

                  'Score': [9, 1, 8, 7, 7, 6, 9]})

enter image description here

create custom function that returns a Series

The variable x inside of my_agg is a DataFrame

def my_agg(x):

    names = {

        'Amount mean': x['Amount'].mean(),

        'Amount std':  x['Amount'].std(),

        'Amount range': x['Amount'].max() - x['Amount'].min(),

        'Score Max':  x['Score'].max(),

        'Score Sum': x['Score'].sum(),

        'Amount Score Sum': (x['Amount'] * x['Score']).sum()}



    return pd.Series(names, index=['Amount range', 'Amount std', 'Amount mean',

                                   'Score Sum', 'Score Max', 'Amount Score Sum'])

Pass this custom function to the groupby apply method

df.groupby('User').apply(my_agg)

enter image description here

The big downside is that this function will be much slower than agg for the cythonized aggregations

Using a dictionary with groupby `agg` method

df.groupby('User')['Amount'].agg(['sum', 'count'])

Output

       sum  count

User              

user1  18.0      2

user2  20.5      3

user3  10.5      1

It is still possible to use a dictionary to explicitly denote different aggregations for different columns, like here if there was another numeric column named Other.

df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1"],

              "Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0],

              'Other': [1,2,3,4,5,6]})



df.groupby('User').agg({'Amount' : ['sum', 'count'], 'Other':['max', 'std']})

Output

      Amount       Other          

         sum count   max       std

User                              

user1   18.0     2     6  3.535534

user2   20.5     3     5  1.527525

user3   10.5     1     4       NaN

edited Dec 7 '17 at 14:41

answered Jun 19 '17 at 16:33

Ted Petrou

21.7k76864

edited Dec 7 '17 at 14:41

answered Jun 19 '17 at 16:33

Ted Petrou

21.7k76864

answered Jun 19 '17 at 16:33

Ted Petrou

21.7k76864

answered Jun 19 '17 at 16:33

Ted Petrou

21.7k76864

13

But suppose you do want the renaming aspect as well, to have different column names than the default after aggregation. Is there some syntax to still get this functionality?

– ErnestScribbler
Sep 25 '17 at 14:13

2

Had same question about naming, since I am using the same col twice (one min and one max) and need a way to refer to them uniquely when I put results back in objects.

– mgmonteleone
Oct 9 '17 at 22:40

If you want to rename the columns you will have to do it manually. Either use a list to replace all of them df.columns = ['your', 'new', 'col', 'names'] or use the rename method which will be a bit difficult since you have this results in a MultiIndex.

– Ted Petrou
Oct 9 '17 at 22:44

@ErnestScribbler I have updated this answer to show a fairly simple way of performing an aggregation, renaming and returning a single-level index.

– Ted Petrou
Dec 7 '17 at 14:45

2

How would the apply method work with 'first' and 'last'?

– Gregory Saxton
Feb 16 '18 at 19:14

|
show 1 more comment

13

But suppose you do want the renaming aspect as well, to have different column names than the default after aggregation. Is there some syntax to still get this functionality?

– ErnestScribbler
Sep 25 '17 at 14:13

2

Had same question about naming, since I am using the same col twice (one min and one max) and need a way to refer to them uniquely when I put results back in objects.

– mgmonteleone
Oct 9 '17 at 22:40

If you want to rename the columns you will have to do it manually. Either use a list to replace all of them df.columns = ['your', 'new', 'col', 'names'] or use the rename method which will be a bit difficult since you have this results in a MultiIndex.

– Ted Petrou
Oct 9 '17 at 22:44

@ErnestScribbler I have updated this answer to show a fairly simple way of performing an aggregation, renaming and returning a single-level index.

– Ted Petrou
Dec 7 '17 at 14:45

2

How would the apply method work with 'first' and 'last'?

– Gregory Saxton
Feb 16 '18 at 19:14

But suppose you do want the renaming aspect as well, to have different column names than the default after aggregation. Is there some syntax to still get this functionality?

– ErnestScribbler
Sep 25 '17 at 14:13

Had same question about naming, since I am using the same col twice (one min and one max) and need a way to refer to them uniquely when I put results back in objects.

– mgmonteleone
Oct 9 '17 at 22:40

If you want to rename the columns you will have to do it manually. Either use a list to replace all of them df.columns = ['your', 'new', 'col', 'names'] or use the rename method which will be a bit difficult since you have this results in a MultiIndex.

– Ted Petrou
Oct 9 '17 at 22:44

@ErnestScribbler I have updated this answer to show a fairly simple way of performing an aggregation, renaming and returning a single-level index.

– Ted Petrou
Dec 7 '17 at 14:45

How would the apply method work with 'first' and 'last'?

– Gregory Saxton
Feb 16 '18 at 19:14

|
show 1 more comment

If you replace the internal dictionary with a list of tuples it gets rid of the warning message

import pandas as pd



df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1"],

                  "Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0]})



df.groupby(["User"]).agg({"Amount": [("Sum", "sum"), ("Count", "count")]})

answered Jun 5 '18 at 9:44

Jacob Stevenson

1,753178

Do you know if this will also (presumably) work in the future release or is also just to workaround the warning message?

– Peanut
Jun 6 '18 at 12:42

@Peanut, I do not know. But if the warning message is not there, then presumably (as you say) it will continue to be supported.

– Jacob Stevenson
Jun 6 '18 at 16:30

3

This is an undocumented and accidental feature and I would highly suggest no one using this syntax as it may not work in the future.

– Ted Petrou
Jun 13 '18 at 12:27

Thanks for the information @TedPetrou. And thanks for the link to the discussion in your answer. It sounds like a difficult question to find the right syntax.

– Jacob Stevenson
Jun 13 '18 at 16:35

add a comment |

If you replace the internal dictionary with a list of tuples it gets rid of the warning message

import pandas as pd



df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1"],

                  "Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0]})



df.groupby(["User"]).agg({"Amount": [("Sum", "sum"), ("Count", "count")]})

answered Jun 5 '18 at 9:44

Jacob Stevenson

1,753178

Do you know if this will also (presumably) work in the future release or is also just to workaround the warning message?

– Peanut
Jun 6 '18 at 12:42

@Peanut, I do not know. But if the warning message is not there, then presumably (as you say) it will continue to be supported.

– Jacob Stevenson
Jun 6 '18 at 16:30

3

This is an undocumented and accidental feature and I would highly suggest no one using this syntax as it may not work in the future.

– Ted Petrou
Jun 13 '18 at 12:27

Thanks for the information @TedPetrou. And thanks for the link to the discussion in your answer. It sounds like a difficult question to find the right syntax.

– Jacob Stevenson
Jun 13 '18 at 16:35

add a comment |

If you replace the internal dictionary with a list of tuples it gets rid of the warning message

import pandas as pd



df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1"],

                  "Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0]})



df.groupby(["User"]).agg({"Amount": [("Sum", "sum"), ("Count", "count")]})

answered Jun 5 '18 at 9:44

Jacob Stevenson

1,753178

If you replace the internal dictionary with a list of tuples it gets rid of the warning message

import pandas as pd



df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1"],

                  "Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0]})



df.groupby(["User"]).agg({"Amount": [("Sum", "sum"), ("Count", "count")]})

answered Jun 5 '18 at 9:44

Jacob Stevenson

1,753178

answered Jun 5 '18 at 9:44

Jacob Stevenson

1,753178

answered Jun 5 '18 at 9:44

Jacob Stevenson

1,753178

answered Jun 5 '18 at 9:44

Jacob Stevenson

1,753178

Do you know if this will also (presumably) work in the future release or is also just to workaround the warning message?

– Peanut
Jun 6 '18 at 12:42

@Peanut, I do not know. But if the warning message is not there, then presumably (as you say) it will continue to be supported.

– Jacob Stevenson
Jun 6 '18 at 16:30

3

This is an undocumented and accidental feature and I would highly suggest no one using this syntax as it may not work in the future.

– Ted Petrou
Jun 13 '18 at 12:27

Thanks for the information @TedPetrou. And thanks for the link to the discussion in your answer. It sounds like a difficult question to find the right syntax.

– Jacob Stevenson
Jun 13 '18 at 16:35

add a comment |

Do you know if this will also (presumably) work in the future release or is also just to workaround the warning message?

– Peanut
Jun 6 '18 at 12:42

@Peanut, I do not know. But if the warning message is not there, then presumably (as you say) it will continue to be supported.

– Jacob Stevenson
Jun 6 '18 at 16:30

3

This is an undocumented and accidental feature and I would highly suggest no one using this syntax as it may not work in the future.

– Ted Petrou
Jun 13 '18 at 12:27

Thanks for the information @TedPetrou. And thanks for the link to the discussion in your answer. It sounds like a difficult question to find the right syntax.

– Jacob Stevenson
Jun 13 '18 at 16:35

Do you know if this will also (presumably) work in the future release or is also just to workaround the warning message?

– Peanut
Jun 6 '18 at 12:42

@Peanut, I do not know. But if the warning message is not there, then presumably (as you say) it will continue to be supported.

– Jacob Stevenson
Jun 6 '18 at 16:30

This is an undocumented and accidental feature and I would highly suggest no one using this syntax as it may not work in the future.

– Ted Petrou
Jun 13 '18 at 12:27

Thanks for the information @TedPetrou. And thanks for the link to the discussion in your answer. It sounds like a difficult question to find the right syntax.

– Jacob Stevenson
Jun 13 '18 at 16:35

add a comment |

This is what I did:

Create a fake dataset:

import pandas as pd

df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1", "user3"],

                  "Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0, 9],

                  'Score': [9, 1, 8, 7, 7, 6, 9]})

df

O/P:

    Amount  Score   User

0   10.0    9   user1

1   5.0 1   user2

2   8.0 8   user2

3   10.5    7   user3

4   7.5 7   user2

5   8.0 6   user1

6   9.0 9   user3

I first made the User the index, and then a groupby:

ans = df.set_index('User').groupby(level=0)['Amount'].agg([('Sum','sum'),('Count','count')])

ans

Solution:

    Sum Count

User        

user1   18.0    2

user2   20.5    3

user3   19.5    2

answered Nov 12 '18 at 14:46

Jyothsna Harithsa

445

add a comment |

This is what I did:

Create a fake dataset:

import pandas as pd

df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1", "user3"],

                  "Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0, 9],

                  'Score': [9, 1, 8, 7, 7, 6, 9]})

df

O/P:

    Amount  Score   User

0   10.0    9   user1

1   5.0 1   user2

2   8.0 8   user2

3   10.5    7   user3

4   7.5 7   user2

5   8.0 6   user1

6   9.0 9   user3

I first made the User the index, and then a groupby:

ans = df.set_index('User').groupby(level=0)['Amount'].agg([('Sum','sum'),('Count','count')])

ans

Solution:

    Sum Count

User        

user1   18.0    2

user2   20.5    3

user3   19.5    2

answered Nov 12 '18 at 14:46

Jyothsna Harithsa

445

add a comment |

This is what I did:

Create a fake dataset:

import pandas as pd

df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1", "user3"],

                  "Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0, 9],

                  'Score': [9, 1, 8, 7, 7, 6, 9]})

df

O/P:

    Amount  Score   User

0   10.0    9   user1

1   5.0 1   user2

2   8.0 8   user2

3   10.5    7   user3

4   7.5 7   user2

5   8.0 6   user1

6   9.0 9   user3

I first made the User the index, and then a groupby:

ans = df.set_index('User').groupby(level=0)['Amount'].agg([('Sum','sum'),('Count','count')])

ans

Solution:

    Sum Count

User        

user1   18.0    2

user2   20.5    3

user3   19.5    2

answered Nov 12 '18 at 14:46

Jyothsna Harithsa

445

This is what I did:

Create a fake dataset:

import pandas as pd

df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1", "user3"],

                  "Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0, 9],

                  'Score': [9, 1, 8, 7, 7, 6, 9]})

df

O/P:

    Amount  Score   User

0   10.0    9   user1

1   5.0 1   user2

2   8.0 8   user2

3   10.5    7   user3

4   7.5 7   user2

5   8.0 6   user1

6   9.0 9   user3

I first made the User the index, and then a groupby:

ans = df.set_index('User').groupby(level=0)['Amount'].agg([('Sum','sum'),('Count','count')])

ans

Solution:

    Sum Count

User        

user1   18.0    2

user2   20.5    3

user3   19.5    2

answered Nov 12 '18 at 14:46

Jyothsna Harithsa

445

answered Nov 12 '18 at 14:46

Jyothsna Harithsa

445

answered Nov 12 '18 at 14:46

Jyothsna Harithsa

445

answered Nov 12 '18 at 14:46

Jyothsna Harithsa

445

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Stack Overflow!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Nrthugu

Rename result columns from Pandas aggregation (“FutureWarning: using a dict with renaming is deprecated”)

3 Answers
3

Use groupby `apply` and return a Series to rename columns

Using a dictionary with groupby `agg` method

Your Answer

Post as a guest

3 Answers
3

3 Answers
3

Use groupby `apply` and return a Series to rename columns

Using a dictionary with groupby `agg` method

Use groupby `apply` and return a Series to rename columns

Using a dictionary with groupby `agg` method

Use groupby `apply` and return a Series to rename columns

Using a dictionary with groupby `agg` method

Use groupby `apply` and return a Series to rename columns

Using a dictionary with groupby `agg` method

Post as a guest

Popular posts from this blog

Coverage of Google Street View

Full-time equivalent

Surfing

Rename result columns from Pandas aggregation (“FutureWarning: using a dict with renaming is deprecated”)

3 Answers 3

Use groupby apply and return a Series to rename columns

Using a dictionary with groupby agg method

Your Answer

Sign up or log in

Post as a guest

Post as a guest

3 Answers 3

3 Answers 3

Use groupby apply and return a Series to rename columns

Using a dictionary with groupby agg method

Use groupby apply and return a Series to rename columns

Using a dictionary with groupby agg method

Use groupby apply and return a Series to rename columns

Using a dictionary with groupby agg method

Use groupby apply and return a Series to rename columns

Using a dictionary with groupby agg method

Sign up or log in

Post as a guest

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Popular posts from this blog

Coverage of Google Street View

Full-time equivalent

Surfing

3 Answers
3

Use groupby `apply` and return a Series to rename columns

Using a dictionary with groupby `agg` method

3 Answers
3

3 Answers
3

Use groupby `apply` and return a Series to rename columns

Using a dictionary with groupby `agg` method

Use groupby `apply` and return a Series to rename columns

Using a dictionary with groupby `agg` method

Use groupby `apply` and return a Series to rename columns

Using a dictionary with groupby `agg` method

Use groupby `apply` and return a Series to rename columns

Using a dictionary with groupby `agg` method