How to do group by in two levels on python pandas an count values?
I have the following dataframe
import pandas as pd
import numpy as np
df = pd.DataFrame()
df['Name'] = ['AK', 'Ram', 'Ram', 'Singh', 'Murugan', 'Kishore', 'AK']
df['Email'] = ['AK@gmail.com', 'a@djgbj.com', 'a@djgbj.com', '3454@ghhg.io', 'dgg@qw.cc', 'dgdg@dg.com', 'AK@gmail.com']
df['Cat'] = ['ab1', 'ab2', 'ab1', 'ab2', 'ab1', 'ab2', 'ab1']
df['Id'] = ['abc1', 'abc2', 'abc3', 'abc4', 'abc5', 'abc6', 'abc7']
For the following code
dfs=df.groupby(['Email', 'Cat'])['Email'].count().reset_index(name='Number')
It gives:
Email Cat Number
0 3454@ghhg.io ab2 1
1 AK@gmail.com ab1 2
2 a@djgbj.com ab1 1
3 a@djgbj.com ab2 1
4 dgdg@dg.com ab2 1
5 dgg@qw.cc ab1 1
How to group by on dfs to get the following output?
Cat Number Count
ab1 1 3
ab1 2 1
ab2 1 3
python pandas
add a comment |
I have the following dataframe
import pandas as pd
import numpy as np
df = pd.DataFrame()
df['Name'] = ['AK', 'Ram', 'Ram', 'Singh', 'Murugan', 'Kishore', 'AK']
df['Email'] = ['AK@gmail.com', 'a@djgbj.com', 'a@djgbj.com', '3454@ghhg.io', 'dgg@qw.cc', 'dgdg@dg.com', 'AK@gmail.com']
df['Cat'] = ['ab1', 'ab2', 'ab1', 'ab2', 'ab1', 'ab2', 'ab1']
df['Id'] = ['abc1', 'abc2', 'abc3', 'abc4', 'abc5', 'abc6', 'abc7']
For the following code
dfs=df.groupby(['Email', 'Cat'])['Email'].count().reset_index(name='Number')
It gives:
Email Cat Number
0 3454@ghhg.io ab2 1
1 AK@gmail.com ab1 2
2 a@djgbj.com ab1 1
3 a@djgbj.com ab2 1
4 dgdg@dg.com ab2 1
5 dgg@qw.cc ab1 1
How to group by on dfs to get the following output?
Cat Number Count
ab1 1 3
ab1 2 1
ab2 1 3
python pandas
Multiple solutions, e.g. see Pandas groupby.size vs series.value_counts vs collections.Counter with multiple series
– jpp
Nov 12 '18 at 11:12
Its should be simple with groupby and count.
– pygo
Nov 12 '18 at 11:32
I think you wantc = df.groupby(['Email', 'Cat'])['Email'].cumcount(); df.groupby(['Cat', c]).Cat.count()if not then your question has been phrased poorly.
– coldspeed
Nov 12 '18 at 17:07
add a comment |
I have the following dataframe
import pandas as pd
import numpy as np
df = pd.DataFrame()
df['Name'] = ['AK', 'Ram', 'Ram', 'Singh', 'Murugan', 'Kishore', 'AK']
df['Email'] = ['AK@gmail.com', 'a@djgbj.com', 'a@djgbj.com', '3454@ghhg.io', 'dgg@qw.cc', 'dgdg@dg.com', 'AK@gmail.com']
df['Cat'] = ['ab1', 'ab2', 'ab1', 'ab2', 'ab1', 'ab2', 'ab1']
df['Id'] = ['abc1', 'abc2', 'abc3', 'abc4', 'abc5', 'abc6', 'abc7']
For the following code
dfs=df.groupby(['Email', 'Cat'])['Email'].count().reset_index(name='Number')
It gives:
Email Cat Number
0 3454@ghhg.io ab2 1
1 AK@gmail.com ab1 2
2 a@djgbj.com ab1 1
3 a@djgbj.com ab2 1
4 dgdg@dg.com ab2 1
5 dgg@qw.cc ab1 1
How to group by on dfs to get the following output?
Cat Number Count
ab1 1 3
ab1 2 1
ab2 1 3
python pandas
I have the following dataframe
import pandas as pd
import numpy as np
df = pd.DataFrame()
df['Name'] = ['AK', 'Ram', 'Ram', 'Singh', 'Murugan', 'Kishore', 'AK']
df['Email'] = ['AK@gmail.com', 'a@djgbj.com', 'a@djgbj.com', '3454@ghhg.io', 'dgg@qw.cc', 'dgdg@dg.com', 'AK@gmail.com']
df['Cat'] = ['ab1', 'ab2', 'ab1', 'ab2', 'ab1', 'ab2', 'ab1']
df['Id'] = ['abc1', 'abc2', 'abc3', 'abc4', 'abc5', 'abc6', 'abc7']
For the following code
dfs=df.groupby(['Email', 'Cat'])['Email'].count().reset_index(name='Number')
It gives:
Email Cat Number
0 3454@ghhg.io ab2 1
1 AK@gmail.com ab1 2
2 a@djgbj.com ab1 1
3 a@djgbj.com ab2 1
4 dgdg@dg.com ab2 1
5 dgg@qw.cc ab1 1
How to group by on dfs to get the following output?
Cat Number Count
ab1 1 3
ab1 2 1
ab2 1 3
python pandas
python pandas
edited Nov 12 '18 at 12:45
Ali AzG
581515
581515
asked Nov 12 '18 at 11:08
pandapanda
967
967
Multiple solutions, e.g. see Pandas groupby.size vs series.value_counts vs collections.Counter with multiple series
– jpp
Nov 12 '18 at 11:12
Its should be simple with groupby and count.
– pygo
Nov 12 '18 at 11:32
I think you wantc = df.groupby(['Email', 'Cat'])['Email'].cumcount(); df.groupby(['Cat', c]).Cat.count()if not then your question has been phrased poorly.
– coldspeed
Nov 12 '18 at 17:07
add a comment |
Multiple solutions, e.g. see Pandas groupby.size vs series.value_counts vs collections.Counter with multiple series
– jpp
Nov 12 '18 at 11:12
Its should be simple with groupby and count.
– pygo
Nov 12 '18 at 11:32
I think you wantc = df.groupby(['Email', 'Cat'])['Email'].cumcount(); df.groupby(['Cat', c]).Cat.count()if not then your question has been phrased poorly.
– coldspeed
Nov 12 '18 at 17:07
Multiple solutions, e.g. see Pandas groupby.size vs series.value_counts vs collections.Counter with multiple series
– jpp
Nov 12 '18 at 11:12
Multiple solutions, e.g. see Pandas groupby.size vs series.value_counts vs collections.Counter with multiple series
– jpp
Nov 12 '18 at 11:12
Its should be simple with groupby and count.
– pygo
Nov 12 '18 at 11:32
Its should be simple with groupby and count.
– pygo
Nov 12 '18 at 11:32
I think you want
c = df.groupby(['Email', 'Cat'])['Email'].cumcount(); df.groupby(['Cat', c]).Cat.count() if not then your question has been phrased poorly.– coldspeed
Nov 12 '18 at 17:07
I think you want
c = df.groupby(['Email', 'Cat'])['Email'].cumcount(); df.groupby(['Cat', c]).Cat.count() if not then your question has been phrased poorly.– coldspeed
Nov 12 '18 at 17:07
add a comment |
2 Answers
2
active
oldest
votes
Use groupby+size and reset_index:
df1 = dfs.groupby(['Cat','Number']).size().reset_index(name='Count')
Or:
df1 = dfs.groupby(['Cat','Number'])['Email'].value_counts().reset_index(name='Count')
print(df1)
Cat Number Count
0 ab1 1 2
1 ab1 2 1
2 ab2 1 3
Isn't your answer incorrect?
– coldspeed
Nov 12 '18 at 11:14
The weird thing is that the solution is basically the same as part of OP's code, just done once more.
– John Zwinck
Nov 12 '18 at 11:19
@coldspeed Not sure, but fromdfsI did what I can, need to check with OP to clear the confusion.
– Sandeep Kadapa
Nov 12 '18 at 11:28
@JohnZwinck Yes, but that is how OP's output.
– Sandeep Kadapa
Nov 12 '18 at 11:30
@SandeepKadapa Not sure what the confusion is, they've indicated their expected output at the end of the post.
– coldspeed
Nov 12 '18 at 12:04
|
show 1 more comment
Simply:
dfs.groupby(['Cat', 'Number']).count()
reproduced the below, which works..
>>> dfs.groupby(['Cat', 'Number']).count()
Email
Cat Number
ab1 1 2
2 1
ab2 1 3
OR
>>> dfs.groupby(['Cat', 'Number'])['Email'].count()
Cat Number
ab1 1 2
2 1
ab2 1 3
Name: Email, dtype: int64
This won't produce OP's expected output.
– coldspeed
Nov 12 '18 at 11:16
@coldspeed, it works correctly , see the POST , OP mentioneddfs=df.groupby(['Email', 'Cat'])['Email'].count().reset_index(name='Number')
– pygo
Nov 12 '18 at 11:24
Regardless of what they mentioned, it isn't their "expected output".
– coldspeed
Nov 12 '18 at 11:53
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53260893%2fhow-to-do-group-by-in-two-levels-on-python-pandas-an-count-values%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
Use groupby+size and reset_index:
df1 = dfs.groupby(['Cat','Number']).size().reset_index(name='Count')
Or:
df1 = dfs.groupby(['Cat','Number'])['Email'].value_counts().reset_index(name='Count')
print(df1)
Cat Number Count
0 ab1 1 2
1 ab1 2 1
2 ab2 1 3
Isn't your answer incorrect?
– coldspeed
Nov 12 '18 at 11:14
The weird thing is that the solution is basically the same as part of OP's code, just done once more.
– John Zwinck
Nov 12 '18 at 11:19
@coldspeed Not sure, but fromdfsI did what I can, need to check with OP to clear the confusion.
– Sandeep Kadapa
Nov 12 '18 at 11:28
@JohnZwinck Yes, but that is how OP's output.
– Sandeep Kadapa
Nov 12 '18 at 11:30
@SandeepKadapa Not sure what the confusion is, they've indicated their expected output at the end of the post.
– coldspeed
Nov 12 '18 at 12:04
|
show 1 more comment
Use groupby+size and reset_index:
df1 = dfs.groupby(['Cat','Number']).size().reset_index(name='Count')
Or:
df1 = dfs.groupby(['Cat','Number'])['Email'].value_counts().reset_index(name='Count')
print(df1)
Cat Number Count
0 ab1 1 2
1 ab1 2 1
2 ab2 1 3
Isn't your answer incorrect?
– coldspeed
Nov 12 '18 at 11:14
The weird thing is that the solution is basically the same as part of OP's code, just done once more.
– John Zwinck
Nov 12 '18 at 11:19
@coldspeed Not sure, but fromdfsI did what I can, need to check with OP to clear the confusion.
– Sandeep Kadapa
Nov 12 '18 at 11:28
@JohnZwinck Yes, but that is how OP's output.
– Sandeep Kadapa
Nov 12 '18 at 11:30
@SandeepKadapa Not sure what the confusion is, they've indicated their expected output at the end of the post.
– coldspeed
Nov 12 '18 at 12:04
|
show 1 more comment
Use groupby+size and reset_index:
df1 = dfs.groupby(['Cat','Number']).size().reset_index(name='Count')
Or:
df1 = dfs.groupby(['Cat','Number'])['Email'].value_counts().reset_index(name='Count')
print(df1)
Cat Number Count
0 ab1 1 2
1 ab1 2 1
2 ab2 1 3
Use groupby+size and reset_index:
df1 = dfs.groupby(['Cat','Number']).size().reset_index(name='Count')
Or:
df1 = dfs.groupby(['Cat','Number'])['Email'].value_counts().reset_index(name='Count')
print(df1)
Cat Number Count
0 ab1 1 2
1 ab1 2 1
2 ab2 1 3
edited Nov 12 '18 at 11:16
answered Nov 12 '18 at 11:12
Sandeep KadapaSandeep Kadapa
6,092429
6,092429
Isn't your answer incorrect?
– coldspeed
Nov 12 '18 at 11:14
The weird thing is that the solution is basically the same as part of OP's code, just done once more.
– John Zwinck
Nov 12 '18 at 11:19
@coldspeed Not sure, but fromdfsI did what I can, need to check with OP to clear the confusion.
– Sandeep Kadapa
Nov 12 '18 at 11:28
@JohnZwinck Yes, but that is how OP's output.
– Sandeep Kadapa
Nov 12 '18 at 11:30
@SandeepKadapa Not sure what the confusion is, they've indicated their expected output at the end of the post.
– coldspeed
Nov 12 '18 at 12:04
|
show 1 more comment
Isn't your answer incorrect?
– coldspeed
Nov 12 '18 at 11:14
The weird thing is that the solution is basically the same as part of OP's code, just done once more.
– John Zwinck
Nov 12 '18 at 11:19
@coldspeed Not sure, but fromdfsI did what I can, need to check with OP to clear the confusion.
– Sandeep Kadapa
Nov 12 '18 at 11:28
@JohnZwinck Yes, but that is how OP's output.
– Sandeep Kadapa
Nov 12 '18 at 11:30
@SandeepKadapa Not sure what the confusion is, they've indicated their expected output at the end of the post.
– coldspeed
Nov 12 '18 at 12:04
Isn't your answer incorrect?
– coldspeed
Nov 12 '18 at 11:14
Isn't your answer incorrect?
– coldspeed
Nov 12 '18 at 11:14
The weird thing is that the solution is basically the same as part of OP's code, just done once more.
– John Zwinck
Nov 12 '18 at 11:19
The weird thing is that the solution is basically the same as part of OP's code, just done once more.
– John Zwinck
Nov 12 '18 at 11:19
@coldspeed Not sure, but from
dfs I did what I can, need to check with OP to clear the confusion.– Sandeep Kadapa
Nov 12 '18 at 11:28
@coldspeed Not sure, but from
dfs I did what I can, need to check with OP to clear the confusion.– Sandeep Kadapa
Nov 12 '18 at 11:28
@JohnZwinck Yes, but that is how OP's output.
– Sandeep Kadapa
Nov 12 '18 at 11:30
@JohnZwinck Yes, but that is how OP's output.
– Sandeep Kadapa
Nov 12 '18 at 11:30
@SandeepKadapa Not sure what the confusion is, they've indicated their expected output at the end of the post.
– coldspeed
Nov 12 '18 at 12:04
@SandeepKadapa Not sure what the confusion is, they've indicated their expected output at the end of the post.
– coldspeed
Nov 12 '18 at 12:04
|
show 1 more comment
Simply:
dfs.groupby(['Cat', 'Number']).count()
reproduced the below, which works..
>>> dfs.groupby(['Cat', 'Number']).count()
Email
Cat Number
ab1 1 2
2 1
ab2 1 3
OR
>>> dfs.groupby(['Cat', 'Number'])['Email'].count()
Cat Number
ab1 1 2
2 1
ab2 1 3
Name: Email, dtype: int64
This won't produce OP's expected output.
– coldspeed
Nov 12 '18 at 11:16
@coldspeed, it works correctly , see the POST , OP mentioneddfs=df.groupby(['Email', 'Cat'])['Email'].count().reset_index(name='Number')
– pygo
Nov 12 '18 at 11:24
Regardless of what they mentioned, it isn't their "expected output".
– coldspeed
Nov 12 '18 at 11:53
add a comment |
Simply:
dfs.groupby(['Cat', 'Number']).count()
reproduced the below, which works..
>>> dfs.groupby(['Cat', 'Number']).count()
Email
Cat Number
ab1 1 2
2 1
ab2 1 3
OR
>>> dfs.groupby(['Cat', 'Number'])['Email'].count()
Cat Number
ab1 1 2
2 1
ab2 1 3
Name: Email, dtype: int64
This won't produce OP's expected output.
– coldspeed
Nov 12 '18 at 11:16
@coldspeed, it works correctly , see the POST , OP mentioneddfs=df.groupby(['Email', 'Cat'])['Email'].count().reset_index(name='Number')
– pygo
Nov 12 '18 at 11:24
Regardless of what they mentioned, it isn't their "expected output".
– coldspeed
Nov 12 '18 at 11:53
add a comment |
Simply:
dfs.groupby(['Cat', 'Number']).count()
reproduced the below, which works..
>>> dfs.groupby(['Cat', 'Number']).count()
Email
Cat Number
ab1 1 2
2 1
ab2 1 3
OR
>>> dfs.groupby(['Cat', 'Number'])['Email'].count()
Cat Number
ab1 1 2
2 1
ab2 1 3
Name: Email, dtype: int64
Simply:
dfs.groupby(['Cat', 'Number']).count()
reproduced the below, which works..
>>> dfs.groupby(['Cat', 'Number']).count()
Email
Cat Number
ab1 1 2
2 1
ab2 1 3
OR
>>> dfs.groupby(['Cat', 'Number'])['Email'].count()
Cat Number
ab1 1 2
2 1
ab2 1 3
Name: Email, dtype: int64
edited Nov 12 '18 at 11:23
answered Nov 12 '18 at 11:12
pygopygo
2,4081618
2,4081618
This won't produce OP's expected output.
– coldspeed
Nov 12 '18 at 11:16
@coldspeed, it works correctly , see the POST , OP mentioneddfs=df.groupby(['Email', 'Cat'])['Email'].count().reset_index(name='Number')
– pygo
Nov 12 '18 at 11:24
Regardless of what they mentioned, it isn't their "expected output".
– coldspeed
Nov 12 '18 at 11:53
add a comment |
This won't produce OP's expected output.
– coldspeed
Nov 12 '18 at 11:16
@coldspeed, it works correctly , see the POST , OP mentioneddfs=df.groupby(['Email', 'Cat'])['Email'].count().reset_index(name='Number')
– pygo
Nov 12 '18 at 11:24
Regardless of what they mentioned, it isn't their "expected output".
– coldspeed
Nov 12 '18 at 11:53
This won't produce OP's expected output.
– coldspeed
Nov 12 '18 at 11:16
This won't produce OP's expected output.
– coldspeed
Nov 12 '18 at 11:16
@coldspeed, it works correctly , see the POST , OP mentioned
dfs=df.groupby(['Email', 'Cat'])['Email'].count().reset_index(name='Number')– pygo
Nov 12 '18 at 11:24
@coldspeed, it works correctly , see the POST , OP mentioned
dfs=df.groupby(['Email', 'Cat'])['Email'].count().reset_index(name='Number')– pygo
Nov 12 '18 at 11:24
Regardless of what they mentioned, it isn't their "expected output".
– coldspeed
Nov 12 '18 at 11:53
Regardless of what they mentioned, it isn't their "expected output".
– coldspeed
Nov 12 '18 at 11:53
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53260893%2fhow-to-do-group-by-in-two-levels-on-python-pandas-an-count-values%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Multiple solutions, e.g. see Pandas groupby.size vs series.value_counts vs collections.Counter with multiple series
– jpp
Nov 12 '18 at 11:12
Its should be simple with groupby and count.
– pygo
Nov 12 '18 at 11:32
I think you want
c = df.groupby(['Email', 'Cat'])['Email'].cumcount(); df.groupby(['Cat', c]).Cat.count()if not then your question has been phrased poorly.– coldspeed
Nov 12 '18 at 17:07