Rename result columns from Pandas aggregation (“FutureWarning: using a dict with renaming is deprecated”)
I'm trying to do some aggregations on a pandas data frame. Here is a sample code:
import pandas as pd
df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1"],
"Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0]})
df.groupby(["User"]).agg({"Amount": {"Sum": "sum", "Count": "count"}})
Out[1]:
Amount
Sum Count
User
user1 18.0 2
user2 20.5 3
user3 10.5 1
Which generates the following warning:
FutureWarning: using a dict with renaming is deprecated and will be
removed in a future version return super(DataFrameGroupBy,
self).aggregate(arg, *args, **kwargs)
How can I avoid this?
python pandas aggregate rename
add a comment |
I'm trying to do some aggregations on a pandas data frame. Here is a sample code:
import pandas as pd
df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1"],
"Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0]})
df.groupby(["User"]).agg({"Amount": {"Sum": "sum", "Count": "count"}})
Out[1]:
Amount
Sum Count
User
user1 18.0 2
user2 20.5 3
user3 10.5 1
Which generates the following warning:
FutureWarning: using a dict with renaming is deprecated and will be
removed in a future version return super(DataFrameGroupBy,
self).aggregate(arg, *args, **kwargs)
How can I avoid this?
python pandas aggregate rename
4
I'd love to know why this is being depreciated (I'm sure there is a good reason). Does anyone have a link to a discussion on it?
– Stephen McAteer
Nov 24 '17 at 5:31
To focus on the keywords of the solution rather than just the existing warning, I retitled "rename result columns from aggregation" and tagged. Now people might even find this question :) ahead of (say) the not-so-canonicalNaming returned columns in Pandas aggregate function?
– smci
May 8 '18 at 23:32
2
Hopefully this will be addressed in github.com/pandas-dev/pandas/issues/18366
– Nickolay
May 28 '18 at 8:06
How would this work if I don't do a "groupby" but I'm doing "pivot" instead?
– avloss
Jun 11 '18 at 12:57
add a comment |
I'm trying to do some aggregations on a pandas data frame. Here is a sample code:
import pandas as pd
df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1"],
"Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0]})
df.groupby(["User"]).agg({"Amount": {"Sum": "sum", "Count": "count"}})
Out[1]:
Amount
Sum Count
User
user1 18.0 2
user2 20.5 3
user3 10.5 1
Which generates the following warning:
FutureWarning: using a dict with renaming is deprecated and will be
removed in a future version return super(DataFrameGroupBy,
self).aggregate(arg, *args, **kwargs)
How can I avoid this?
python pandas aggregate rename
I'm trying to do some aggregations on a pandas data frame. Here is a sample code:
import pandas as pd
df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1"],
"Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0]})
df.groupby(["User"]).agg({"Amount": {"Sum": "sum", "Count": "count"}})
Out[1]:
Amount
Sum Count
User
user1 18.0 2
user2 20.5 3
user3 10.5 1
Which generates the following warning:
FutureWarning: using a dict with renaming is deprecated and will be
removed in a future version return super(DataFrameGroupBy,
self).aggregate(arg, *args, **kwargs)
How can I avoid this?
python pandas aggregate rename
python pandas aggregate rename
edited May 8 '18 at 23:31
smci
14.7k672104
14.7k672104
asked Jun 19 '17 at 16:28
Victor MayrinkVictor Mayrink
3331314
3331314
4
I'd love to know why this is being depreciated (I'm sure there is a good reason). Does anyone have a link to a discussion on it?
– Stephen McAteer
Nov 24 '17 at 5:31
To focus on the keywords of the solution rather than just the existing warning, I retitled "rename result columns from aggregation" and tagged. Now people might even find this question :) ahead of (say) the not-so-canonicalNaming returned columns in Pandas aggregate function?
– smci
May 8 '18 at 23:32
2
Hopefully this will be addressed in github.com/pandas-dev/pandas/issues/18366
– Nickolay
May 28 '18 at 8:06
How would this work if I don't do a "groupby" but I'm doing "pivot" instead?
– avloss
Jun 11 '18 at 12:57
add a comment |
4
I'd love to know why this is being depreciated (I'm sure there is a good reason). Does anyone have a link to a discussion on it?
– Stephen McAteer
Nov 24 '17 at 5:31
To focus on the keywords of the solution rather than just the existing warning, I retitled "rename result columns from aggregation" and tagged. Now people might even find this question :) ahead of (say) the not-so-canonicalNaming returned columns in Pandas aggregate function?
– smci
May 8 '18 at 23:32
2
Hopefully this will be addressed in github.com/pandas-dev/pandas/issues/18366
– Nickolay
May 28 '18 at 8:06
How would this work if I don't do a "groupby" but I'm doing "pivot" instead?
– avloss
Jun 11 '18 at 12:57
4
4
I'd love to know why this is being depreciated (I'm sure there is a good reason). Does anyone have a link to a discussion on it?
– Stephen McAteer
Nov 24 '17 at 5:31
I'd love to know why this is being depreciated (I'm sure there is a good reason). Does anyone have a link to a discussion on it?
– Stephen McAteer
Nov 24 '17 at 5:31
To focus on the keywords of the solution rather than just the existing warning, I retitled "rename result columns from aggregation" and tagged. Now people might even find this question :) ahead of (say) the not-so-canonicalNaming returned columns in Pandas aggregate function?
– smci
May 8 '18 at 23:32
To focus on the keywords of the solution rather than just the existing warning, I retitled "rename result columns from aggregation" and tagged. Now people might even find this question :) ahead of (say) the not-so-canonicalNaming returned columns in Pandas aggregate function?
– smci
May 8 '18 at 23:32
2
2
Hopefully this will be addressed in github.com/pandas-dev/pandas/issues/18366
– Nickolay
May 28 '18 at 8:06
Hopefully this will be addressed in github.com/pandas-dev/pandas/issues/18366
– Nickolay
May 28 '18 at 8:06
How would this work if I don't do a "groupby" but I'm doing "pivot" instead?
– avloss
Jun 11 '18 at 12:57
How would this work if I don't do a "groupby" but I'm doing "pivot" instead?
– avloss
Jun 11 '18 at 12:57
add a comment |
3 Answers
3
active
oldest
votes
Use groupby apply and return a Series to rename columns
Use the groupby apply method to perform an aggregation that
- Renames the columns
- Allows for spaces in the names
- Allows you to order the returned columns in any way you choose
- Allows for interactions between columns
- Returns a single level index and NOT a MultiIndex
To do this:
- create a custom function that you pass to
apply
- This custom function is passed each group as a DataFrame
- Return a Series
- The index of the Series will be the new columns
Create fake data
df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1", "user3"],
"Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0, 9],
'Score': [9, 1, 8, 7, 7, 6, 9]})

create custom function that returns a Series
The variable x inside of my_agg is a DataFrame
def my_agg(x):
names = {
'Amount mean': x['Amount'].mean(),
'Amount std': x['Amount'].std(),
'Amount range': x['Amount'].max() - x['Amount'].min(),
'Score Max': x['Score'].max(),
'Score Sum': x['Score'].sum(),
'Amount Score Sum': (x['Amount'] * x['Score']).sum()}
return pd.Series(names, index=['Amount range', 'Amount std', 'Amount mean',
'Score Sum', 'Score Max', 'Amount Score Sum'])
Pass this custom function to the groupby apply method
df.groupby('User').apply(my_agg)

The big downside is that this function will be much slower than agg for the cythonized aggregations
Using a dictionary with groupby agg method
Using a dictionary of dictionaries was removed because of its complexity and somewhat ambiguous nature. There is an ongoing discussion on how to improve this functionality in the future on github Here, you can directly access the aggregating column after the groupby call. Simply pass a list of all the aggregating functions you wish to apply.
df.groupby('User')['Amount'].agg(['sum', 'count'])
Output
sum count
User
user1 18.0 2
user2 20.5 3
user3 10.5 1
It is still possible to use a dictionary to explicitly denote different aggregations for different columns, like here if there was another numeric column named Other.
df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1"],
"Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0],
'Other': [1,2,3,4,5,6]})
df.groupby('User').agg({'Amount' : ['sum', 'count'], 'Other':['max', 'std']})
Output
Amount Other
sum count max std
User
user1 18.0 2 6 3.535534
user2 20.5 3 5 1.527525
user3 10.5 1 4 NaN
13
But suppose you do want the renaming aspect as well, to have different column names than the default after aggregation. Is there some syntax to still get this functionality?
– ErnestScribbler
Sep 25 '17 at 14:13
2
Had same question about naming, since I am using the same col twice (one min and one max) and need a way to refer to them uniquely when I put results back in objects.
– mgmonteleone
Oct 9 '17 at 22:40
If you want to rename the columns you will have to do it manually. Either use a list to replace all of themdf.columns = ['your', 'new', 'col', 'names']or use therenamemethod which will be a bit difficult since you have this results in a MultiIndex.
– Ted Petrou
Oct 9 '17 at 22:44
@ErnestScribbler I have updated this answer to show a fairly simple way of performing an aggregation, renaming and returning a single-level index.
– Ted Petrou
Dec 7 '17 at 14:45
2
How would the apply method work with 'first' and 'last'?
– Gregory Saxton
Feb 16 '18 at 19:14
|
show 1 more comment
If you replace the internal dictionary with a list of tuples it gets rid of the warning message
import pandas as pd
df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1"],
"Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0]})
df.groupby(["User"]).agg({"Amount": [("Sum", "sum"), ("Count", "count")]})
Do you know if this will also (presumably) work in the future release or is also just to workaround the warning message?
– Peanut
Jun 6 '18 at 12:42
@Peanut, I do not know. But if the warning message is not there, then presumably (as you say) it will continue to be supported.
– Jacob Stevenson
Jun 6 '18 at 16:30
3
This is an undocumented and accidental feature and I would highly suggest no one using this syntax as it may not work in the future.
– Ted Petrou
Jun 13 '18 at 12:27
Thanks for the information @TedPetrou. And thanks for the link to the discussion in your answer. It sounds like a difficult question to find the right syntax.
– Jacob Stevenson
Jun 13 '18 at 16:35
add a comment |
This is what I did:
Create a fake dataset:
import pandas as pd
df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1", "user3"],
"Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0, 9],
'Score': [9, 1, 8, 7, 7, 6, 9]})
df
O/P:
Amount Score User
0 10.0 9 user1
1 5.0 1 user2
2 8.0 8 user2
3 10.5 7 user3
4 7.5 7 user2
5 8.0 6 user1
6 9.0 9 user3
I first made the User the index, and then a groupby:
ans = df.set_index('User').groupby(level=0)['Amount'].agg([('Sum','sum'),('Count','count')])
ans
Solution:
Sum Count
User
user1 18.0 2
user2 20.5 3
user3 19.5 2
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f44635626%2frename-result-columns-from-pandas-aggregation-futurewarning-using-a-dict-with%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
Use groupby apply and return a Series to rename columns
Use the groupby apply method to perform an aggregation that
- Renames the columns
- Allows for spaces in the names
- Allows you to order the returned columns in any way you choose
- Allows for interactions between columns
- Returns a single level index and NOT a MultiIndex
To do this:
- create a custom function that you pass to
apply
- This custom function is passed each group as a DataFrame
- Return a Series
- The index of the Series will be the new columns
Create fake data
df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1", "user3"],
"Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0, 9],
'Score': [9, 1, 8, 7, 7, 6, 9]})

create custom function that returns a Series
The variable x inside of my_agg is a DataFrame
def my_agg(x):
names = {
'Amount mean': x['Amount'].mean(),
'Amount std': x['Amount'].std(),
'Amount range': x['Amount'].max() - x['Amount'].min(),
'Score Max': x['Score'].max(),
'Score Sum': x['Score'].sum(),
'Amount Score Sum': (x['Amount'] * x['Score']).sum()}
return pd.Series(names, index=['Amount range', 'Amount std', 'Amount mean',
'Score Sum', 'Score Max', 'Amount Score Sum'])
Pass this custom function to the groupby apply method
df.groupby('User').apply(my_agg)

The big downside is that this function will be much slower than agg for the cythonized aggregations
Using a dictionary with groupby agg method
Using a dictionary of dictionaries was removed because of its complexity and somewhat ambiguous nature. There is an ongoing discussion on how to improve this functionality in the future on github Here, you can directly access the aggregating column after the groupby call. Simply pass a list of all the aggregating functions you wish to apply.
df.groupby('User')['Amount'].agg(['sum', 'count'])
Output
sum count
User
user1 18.0 2
user2 20.5 3
user3 10.5 1
It is still possible to use a dictionary to explicitly denote different aggregations for different columns, like here if there was another numeric column named Other.
df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1"],
"Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0],
'Other': [1,2,3,4,5,6]})
df.groupby('User').agg({'Amount' : ['sum', 'count'], 'Other':['max', 'std']})
Output
Amount Other
sum count max std
User
user1 18.0 2 6 3.535534
user2 20.5 3 5 1.527525
user3 10.5 1 4 NaN
13
But suppose you do want the renaming aspect as well, to have different column names than the default after aggregation. Is there some syntax to still get this functionality?
– ErnestScribbler
Sep 25 '17 at 14:13
2
Had same question about naming, since I am using the same col twice (one min and one max) and need a way to refer to them uniquely when I put results back in objects.
– mgmonteleone
Oct 9 '17 at 22:40
If you want to rename the columns you will have to do it manually. Either use a list to replace all of themdf.columns = ['your', 'new', 'col', 'names']or use therenamemethod which will be a bit difficult since you have this results in a MultiIndex.
– Ted Petrou
Oct 9 '17 at 22:44
@ErnestScribbler I have updated this answer to show a fairly simple way of performing an aggregation, renaming and returning a single-level index.
– Ted Petrou
Dec 7 '17 at 14:45
2
How would the apply method work with 'first' and 'last'?
– Gregory Saxton
Feb 16 '18 at 19:14
|
show 1 more comment
Use groupby apply and return a Series to rename columns
Use the groupby apply method to perform an aggregation that
- Renames the columns
- Allows for spaces in the names
- Allows you to order the returned columns in any way you choose
- Allows for interactions between columns
- Returns a single level index and NOT a MultiIndex
To do this:
- create a custom function that you pass to
apply
- This custom function is passed each group as a DataFrame
- Return a Series
- The index of the Series will be the new columns
Create fake data
df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1", "user3"],
"Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0, 9],
'Score': [9, 1, 8, 7, 7, 6, 9]})

create custom function that returns a Series
The variable x inside of my_agg is a DataFrame
def my_agg(x):
names = {
'Amount mean': x['Amount'].mean(),
'Amount std': x['Amount'].std(),
'Amount range': x['Amount'].max() - x['Amount'].min(),
'Score Max': x['Score'].max(),
'Score Sum': x['Score'].sum(),
'Amount Score Sum': (x['Amount'] * x['Score']).sum()}
return pd.Series(names, index=['Amount range', 'Amount std', 'Amount mean',
'Score Sum', 'Score Max', 'Amount Score Sum'])
Pass this custom function to the groupby apply method
df.groupby('User').apply(my_agg)

The big downside is that this function will be much slower than agg for the cythonized aggregations
Using a dictionary with groupby agg method
Using a dictionary of dictionaries was removed because of its complexity and somewhat ambiguous nature. There is an ongoing discussion on how to improve this functionality in the future on github Here, you can directly access the aggregating column after the groupby call. Simply pass a list of all the aggregating functions you wish to apply.
df.groupby('User')['Amount'].agg(['sum', 'count'])
Output
sum count
User
user1 18.0 2
user2 20.5 3
user3 10.5 1
It is still possible to use a dictionary to explicitly denote different aggregations for different columns, like here if there was another numeric column named Other.
df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1"],
"Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0],
'Other': [1,2,3,4,5,6]})
df.groupby('User').agg({'Amount' : ['sum', 'count'], 'Other':['max', 'std']})
Output
Amount Other
sum count max std
User
user1 18.0 2 6 3.535534
user2 20.5 3 5 1.527525
user3 10.5 1 4 NaN
13
But suppose you do want the renaming aspect as well, to have different column names than the default after aggregation. Is there some syntax to still get this functionality?
– ErnestScribbler
Sep 25 '17 at 14:13
2
Had same question about naming, since I am using the same col twice (one min and one max) and need a way to refer to them uniquely when I put results back in objects.
– mgmonteleone
Oct 9 '17 at 22:40
If you want to rename the columns you will have to do it manually. Either use a list to replace all of themdf.columns = ['your', 'new', 'col', 'names']or use therenamemethod which will be a bit difficult since you have this results in a MultiIndex.
– Ted Petrou
Oct 9 '17 at 22:44
@ErnestScribbler I have updated this answer to show a fairly simple way of performing an aggregation, renaming and returning a single-level index.
– Ted Petrou
Dec 7 '17 at 14:45
2
How would the apply method work with 'first' and 'last'?
– Gregory Saxton
Feb 16 '18 at 19:14
|
show 1 more comment
Use groupby apply and return a Series to rename columns
Use the groupby apply method to perform an aggregation that
- Renames the columns
- Allows for spaces in the names
- Allows you to order the returned columns in any way you choose
- Allows for interactions between columns
- Returns a single level index and NOT a MultiIndex
To do this:
- create a custom function that you pass to
apply
- This custom function is passed each group as a DataFrame
- Return a Series
- The index of the Series will be the new columns
Create fake data
df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1", "user3"],
"Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0, 9],
'Score': [9, 1, 8, 7, 7, 6, 9]})

create custom function that returns a Series
The variable x inside of my_agg is a DataFrame
def my_agg(x):
names = {
'Amount mean': x['Amount'].mean(),
'Amount std': x['Amount'].std(),
'Amount range': x['Amount'].max() - x['Amount'].min(),
'Score Max': x['Score'].max(),
'Score Sum': x['Score'].sum(),
'Amount Score Sum': (x['Amount'] * x['Score']).sum()}
return pd.Series(names, index=['Amount range', 'Amount std', 'Amount mean',
'Score Sum', 'Score Max', 'Amount Score Sum'])
Pass this custom function to the groupby apply method
df.groupby('User').apply(my_agg)

The big downside is that this function will be much slower than agg for the cythonized aggregations
Using a dictionary with groupby agg method
Using a dictionary of dictionaries was removed because of its complexity and somewhat ambiguous nature. There is an ongoing discussion on how to improve this functionality in the future on github Here, you can directly access the aggregating column after the groupby call. Simply pass a list of all the aggregating functions you wish to apply.
df.groupby('User')['Amount'].agg(['sum', 'count'])
Output
sum count
User
user1 18.0 2
user2 20.5 3
user3 10.5 1
It is still possible to use a dictionary to explicitly denote different aggregations for different columns, like here if there was another numeric column named Other.
df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1"],
"Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0],
'Other': [1,2,3,4,5,6]})
df.groupby('User').agg({'Amount' : ['sum', 'count'], 'Other':['max', 'std']})
Output
Amount Other
sum count max std
User
user1 18.0 2 6 3.535534
user2 20.5 3 5 1.527525
user3 10.5 1 4 NaN
Use groupby apply and return a Series to rename columns
Use the groupby apply method to perform an aggregation that
- Renames the columns
- Allows for spaces in the names
- Allows you to order the returned columns in any way you choose
- Allows for interactions between columns
- Returns a single level index and NOT a MultiIndex
To do this:
- create a custom function that you pass to
apply
- This custom function is passed each group as a DataFrame
- Return a Series
- The index of the Series will be the new columns
Create fake data
df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1", "user3"],
"Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0, 9],
'Score': [9, 1, 8, 7, 7, 6, 9]})

create custom function that returns a Series
The variable x inside of my_agg is a DataFrame
def my_agg(x):
names = {
'Amount mean': x['Amount'].mean(),
'Amount std': x['Amount'].std(),
'Amount range': x['Amount'].max() - x['Amount'].min(),
'Score Max': x['Score'].max(),
'Score Sum': x['Score'].sum(),
'Amount Score Sum': (x['Amount'] * x['Score']).sum()}
return pd.Series(names, index=['Amount range', 'Amount std', 'Amount mean',
'Score Sum', 'Score Max', 'Amount Score Sum'])
Pass this custom function to the groupby apply method
df.groupby('User').apply(my_agg)

The big downside is that this function will be much slower than agg for the cythonized aggregations
Using a dictionary with groupby agg method
Using a dictionary of dictionaries was removed because of its complexity and somewhat ambiguous nature. There is an ongoing discussion on how to improve this functionality in the future on github Here, you can directly access the aggregating column after the groupby call. Simply pass a list of all the aggregating functions you wish to apply.
df.groupby('User')['Amount'].agg(['sum', 'count'])
Output
sum count
User
user1 18.0 2
user2 20.5 3
user3 10.5 1
It is still possible to use a dictionary to explicitly denote different aggregations for different columns, like here if there was another numeric column named Other.
df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1"],
"Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0],
'Other': [1,2,3,4,5,6]})
df.groupby('User').agg({'Amount' : ['sum', 'count'], 'Other':['max', 'std']})
Output
Amount Other
sum count max std
User
user1 18.0 2 6 3.535534
user2 20.5 3 5 1.527525
user3 10.5 1 4 NaN
edited Dec 7 '17 at 14:41
answered Jun 19 '17 at 16:33
Ted PetrouTed Petrou
21.7k76864
21.7k76864
13
But suppose you do want the renaming aspect as well, to have different column names than the default after aggregation. Is there some syntax to still get this functionality?
– ErnestScribbler
Sep 25 '17 at 14:13
2
Had same question about naming, since I am using the same col twice (one min and one max) and need a way to refer to them uniquely when I put results back in objects.
– mgmonteleone
Oct 9 '17 at 22:40
If you want to rename the columns you will have to do it manually. Either use a list to replace all of themdf.columns = ['your', 'new', 'col', 'names']or use therenamemethod which will be a bit difficult since you have this results in a MultiIndex.
– Ted Petrou
Oct 9 '17 at 22:44
@ErnestScribbler I have updated this answer to show a fairly simple way of performing an aggregation, renaming and returning a single-level index.
– Ted Petrou
Dec 7 '17 at 14:45
2
How would the apply method work with 'first' and 'last'?
– Gregory Saxton
Feb 16 '18 at 19:14
|
show 1 more comment
13
But suppose you do want the renaming aspect as well, to have different column names than the default after aggregation. Is there some syntax to still get this functionality?
– ErnestScribbler
Sep 25 '17 at 14:13
2
Had same question about naming, since I am using the same col twice (one min and one max) and need a way to refer to them uniquely when I put results back in objects.
– mgmonteleone
Oct 9 '17 at 22:40
If you want to rename the columns you will have to do it manually. Either use a list to replace all of themdf.columns = ['your', 'new', 'col', 'names']or use therenamemethod which will be a bit difficult since you have this results in a MultiIndex.
– Ted Petrou
Oct 9 '17 at 22:44
@ErnestScribbler I have updated this answer to show a fairly simple way of performing an aggregation, renaming and returning a single-level index.
– Ted Petrou
Dec 7 '17 at 14:45
2
How would the apply method work with 'first' and 'last'?
– Gregory Saxton
Feb 16 '18 at 19:14
13
13
But suppose you do want the renaming aspect as well, to have different column names than the default after aggregation. Is there some syntax to still get this functionality?
– ErnestScribbler
Sep 25 '17 at 14:13
But suppose you do want the renaming aspect as well, to have different column names than the default after aggregation. Is there some syntax to still get this functionality?
– ErnestScribbler
Sep 25 '17 at 14:13
2
2
Had same question about naming, since I am using the same col twice (one min and one max) and need a way to refer to them uniquely when I put results back in objects.
– mgmonteleone
Oct 9 '17 at 22:40
Had same question about naming, since I am using the same col twice (one min and one max) and need a way to refer to them uniquely when I put results back in objects.
– mgmonteleone
Oct 9 '17 at 22:40
If you want to rename the columns you will have to do it manually. Either use a list to replace all of them
df.columns = ['your', 'new', 'col', 'names'] or use the rename method which will be a bit difficult since you have this results in a MultiIndex.– Ted Petrou
Oct 9 '17 at 22:44
If you want to rename the columns you will have to do it manually. Either use a list to replace all of them
df.columns = ['your', 'new', 'col', 'names'] or use the rename method which will be a bit difficult since you have this results in a MultiIndex.– Ted Petrou
Oct 9 '17 at 22:44
@ErnestScribbler I have updated this answer to show a fairly simple way of performing an aggregation, renaming and returning a single-level index.
– Ted Petrou
Dec 7 '17 at 14:45
@ErnestScribbler I have updated this answer to show a fairly simple way of performing an aggregation, renaming and returning a single-level index.
– Ted Petrou
Dec 7 '17 at 14:45
2
2
How would the apply method work with 'first' and 'last'?
– Gregory Saxton
Feb 16 '18 at 19:14
How would the apply method work with 'first' and 'last'?
– Gregory Saxton
Feb 16 '18 at 19:14
|
show 1 more comment
If you replace the internal dictionary with a list of tuples it gets rid of the warning message
import pandas as pd
df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1"],
"Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0]})
df.groupby(["User"]).agg({"Amount": [("Sum", "sum"), ("Count", "count")]})
Do you know if this will also (presumably) work in the future release or is also just to workaround the warning message?
– Peanut
Jun 6 '18 at 12:42
@Peanut, I do not know. But if the warning message is not there, then presumably (as you say) it will continue to be supported.
– Jacob Stevenson
Jun 6 '18 at 16:30
3
This is an undocumented and accidental feature and I would highly suggest no one using this syntax as it may not work in the future.
– Ted Petrou
Jun 13 '18 at 12:27
Thanks for the information @TedPetrou. And thanks for the link to the discussion in your answer. It sounds like a difficult question to find the right syntax.
– Jacob Stevenson
Jun 13 '18 at 16:35
add a comment |
If you replace the internal dictionary with a list of tuples it gets rid of the warning message
import pandas as pd
df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1"],
"Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0]})
df.groupby(["User"]).agg({"Amount": [("Sum", "sum"), ("Count", "count")]})
Do you know if this will also (presumably) work in the future release or is also just to workaround the warning message?
– Peanut
Jun 6 '18 at 12:42
@Peanut, I do not know. But if the warning message is not there, then presumably (as you say) it will continue to be supported.
– Jacob Stevenson
Jun 6 '18 at 16:30
3
This is an undocumented and accidental feature and I would highly suggest no one using this syntax as it may not work in the future.
– Ted Petrou
Jun 13 '18 at 12:27
Thanks for the information @TedPetrou. And thanks for the link to the discussion in your answer. It sounds like a difficult question to find the right syntax.
– Jacob Stevenson
Jun 13 '18 at 16:35
add a comment |
If you replace the internal dictionary with a list of tuples it gets rid of the warning message
import pandas as pd
df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1"],
"Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0]})
df.groupby(["User"]).agg({"Amount": [("Sum", "sum"), ("Count", "count")]})
If you replace the internal dictionary with a list of tuples it gets rid of the warning message
import pandas as pd
df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1"],
"Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0]})
df.groupby(["User"]).agg({"Amount": [("Sum", "sum"), ("Count", "count")]})
answered Jun 5 '18 at 9:44
Jacob StevensonJacob Stevenson
1,753178
1,753178
Do you know if this will also (presumably) work in the future release or is also just to workaround the warning message?
– Peanut
Jun 6 '18 at 12:42
@Peanut, I do not know. But if the warning message is not there, then presumably (as you say) it will continue to be supported.
– Jacob Stevenson
Jun 6 '18 at 16:30
3
This is an undocumented and accidental feature and I would highly suggest no one using this syntax as it may not work in the future.
– Ted Petrou
Jun 13 '18 at 12:27
Thanks for the information @TedPetrou. And thanks for the link to the discussion in your answer. It sounds like a difficult question to find the right syntax.
– Jacob Stevenson
Jun 13 '18 at 16:35
add a comment |
Do you know if this will also (presumably) work in the future release or is also just to workaround the warning message?
– Peanut
Jun 6 '18 at 12:42
@Peanut, I do not know. But if the warning message is not there, then presumably (as you say) it will continue to be supported.
– Jacob Stevenson
Jun 6 '18 at 16:30
3
This is an undocumented and accidental feature and I would highly suggest no one using this syntax as it may not work in the future.
– Ted Petrou
Jun 13 '18 at 12:27
Thanks for the information @TedPetrou. And thanks for the link to the discussion in your answer. It sounds like a difficult question to find the right syntax.
– Jacob Stevenson
Jun 13 '18 at 16:35
Do you know if this will also (presumably) work in the future release or is also just to workaround the warning message?
– Peanut
Jun 6 '18 at 12:42
Do you know if this will also (presumably) work in the future release or is also just to workaround the warning message?
– Peanut
Jun 6 '18 at 12:42
@Peanut, I do not know. But if the warning message is not there, then presumably (as you say) it will continue to be supported.
– Jacob Stevenson
Jun 6 '18 at 16:30
@Peanut, I do not know. But if the warning message is not there, then presumably (as you say) it will continue to be supported.
– Jacob Stevenson
Jun 6 '18 at 16:30
3
3
This is an undocumented and accidental feature and I would highly suggest no one using this syntax as it may not work in the future.
– Ted Petrou
Jun 13 '18 at 12:27
This is an undocumented and accidental feature and I would highly suggest no one using this syntax as it may not work in the future.
– Ted Petrou
Jun 13 '18 at 12:27
Thanks for the information @TedPetrou. And thanks for the link to the discussion in your answer. It sounds like a difficult question to find the right syntax.
– Jacob Stevenson
Jun 13 '18 at 16:35
Thanks for the information @TedPetrou. And thanks for the link to the discussion in your answer. It sounds like a difficult question to find the right syntax.
– Jacob Stevenson
Jun 13 '18 at 16:35
add a comment |
This is what I did:
Create a fake dataset:
import pandas as pd
df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1", "user3"],
"Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0, 9],
'Score': [9, 1, 8, 7, 7, 6, 9]})
df
O/P:
Amount Score User
0 10.0 9 user1
1 5.0 1 user2
2 8.0 8 user2
3 10.5 7 user3
4 7.5 7 user2
5 8.0 6 user1
6 9.0 9 user3
I first made the User the index, and then a groupby:
ans = df.set_index('User').groupby(level=0)['Amount'].agg([('Sum','sum'),('Count','count')])
ans
Solution:
Sum Count
User
user1 18.0 2
user2 20.5 3
user3 19.5 2
add a comment |
This is what I did:
Create a fake dataset:
import pandas as pd
df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1", "user3"],
"Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0, 9],
'Score': [9, 1, 8, 7, 7, 6, 9]})
df
O/P:
Amount Score User
0 10.0 9 user1
1 5.0 1 user2
2 8.0 8 user2
3 10.5 7 user3
4 7.5 7 user2
5 8.0 6 user1
6 9.0 9 user3
I first made the User the index, and then a groupby:
ans = df.set_index('User').groupby(level=0)['Amount'].agg([('Sum','sum'),('Count','count')])
ans
Solution:
Sum Count
User
user1 18.0 2
user2 20.5 3
user3 19.5 2
add a comment |
This is what I did:
Create a fake dataset:
import pandas as pd
df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1", "user3"],
"Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0, 9],
'Score': [9, 1, 8, 7, 7, 6, 9]})
df
O/P:
Amount Score User
0 10.0 9 user1
1 5.0 1 user2
2 8.0 8 user2
3 10.5 7 user3
4 7.5 7 user2
5 8.0 6 user1
6 9.0 9 user3
I first made the User the index, and then a groupby:
ans = df.set_index('User').groupby(level=0)['Amount'].agg([('Sum','sum'),('Count','count')])
ans
Solution:
Sum Count
User
user1 18.0 2
user2 20.5 3
user3 19.5 2
This is what I did:
Create a fake dataset:
import pandas as pd
df = pd.DataFrame({"User": ["user1", "user2", "user2", "user3", "user2", "user1", "user3"],
"Amount": [10.0, 5.0, 8.0, 10.5, 7.5, 8.0, 9],
'Score': [9, 1, 8, 7, 7, 6, 9]})
df
O/P:
Amount Score User
0 10.0 9 user1
1 5.0 1 user2
2 8.0 8 user2
3 10.5 7 user3
4 7.5 7 user2
5 8.0 6 user1
6 9.0 9 user3
I first made the User the index, and then a groupby:
ans = df.set_index('User').groupby(level=0)['Amount'].agg([('Sum','sum'),('Count','count')])
ans
Solution:
Sum Count
User
user1 18.0 2
user2 20.5 3
user3 19.5 2
answered Nov 12 '18 at 14:46
Jyothsna HarithsaJyothsna Harithsa
445
445
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f44635626%2frename-result-columns-from-pandas-aggregation-futurewarning-using-a-dict-with%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
4
I'd love to know why this is being depreciated (I'm sure there is a good reason). Does anyone have a link to a discussion on it?
– Stephen McAteer
Nov 24 '17 at 5:31
To focus on the keywords of the solution rather than just the existing warning, I retitled "rename result columns from aggregation" and tagged. Now people might even find this question :) ahead of (say) the not-so-canonicalNaming returned columns in Pandas aggregate function?
– smci
May 8 '18 at 23:32
2
Hopefully this will be addressed in github.com/pandas-dev/pandas/issues/18366
– Nickolay
May 28 '18 at 8:06
How would this work if I don't do a "groupby" but I'm doing "pivot" instead?
– avloss
Jun 11 '18 at 12:57