How to extract the characters from a string that are inside parentheses?











Picture of the DataFrame:





I have one column named contracting and another named contractor inside a DataFrame.



I need to split a column, for example contractor, into two new columns: one containing the fiscal number that is inside the parentheses, and another containing everything else (the description).



Example:



Contractor: Meo(504615947)


I need it to become:

Contractor_Name: Meo and Contractor_Number: 504615947


I tried to do this:



proc_2013[['contractor_description', 'contractor_NIF']] = pd.DataFrame(proc_2013['contractor'].str.split('(', n=1).tolist())

proc_2013['contractor_NIF'] = proc_2013.contractor_NIF.str.extract(r'(\d+)')


Problem 1:



A name description can also appear inside parentheses, followed by the number that I am trying to extract.



Problem 2:



Sometimes, if the contractor is from a foreign country, the fiscal number starts with a letter (it is not digits only, as I first assumed in my second line of code).



All Fiscal Numbers have 9 digits.







python pandas














edited Nov 11 at 16:59 by Akash Ranjan
asked Nov 11 at 15:12 by jess








  • Please give a proper Minimal, Complete, and Verifiable example, in text form.
    – jonrsharpe
    Nov 11 at 15:16










  • Please don't post images of your code, see meta.stackoverflow.com/questions/374700/…
    – quant
    Nov 11 at 15:29










  • Sorry, it's my first post. I will keep your suggestions in mind next time.
    – jess
    Nov 11 at 16:16






























2 Answers
accepted






As far as I understand your question, this is a possible solution:

# Name: the text before the first "("
df['contractor_name'] = list(map(lambda x: x.split('(')[0], df['contractor']))
# Number: the 9 characters before the final ")"
df['contractor_number'] = list(map(lambda x: x.split('(')[-1][-10:-1], df['contractor']))


Hope this helps.
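To see how this slicing behaves on the two inputs mentioned in this thread, here is a minimal pure-Python sketch. The helper name split_by_slice is hypothetical, and the sketch assumes every string ends with a 9-character fiscal number followed by a closing parenthesis. It also swaps in rsplit for the name, which is a variation on the code above, so that an earlier "(ADSE)" stays in the name.

```python
def split_by_slice(text):
    """Split 'Name (number)' into (name, number) by slicing from the end."""
    # Name: everything before the LAST "(", so earlier parentheses are kept.
    name = text.rsplit('(', 1)[0].strip()
    # Number: the 9 characters just before the final ")".
    number = text.split('(')[-1][-10:-1]
    return name, number


samples = [
    "Meo(504615947)",
    "Trabalhadores em Funções Públicas (ADSE) (600000303)",
]
results = [split_by_slice(s) for s in samples]
for name, number in results:
    print(name, '|', number)
```

If the column is called contractor, as in the question, the same helper could presumably be applied row by row with map, in the same way as the lambda expressions above.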














edited Nov 11 at 16:51
answered Nov 11 at 15:45 by Akash Ranjan












  • Thanks for the help Akash! But I still have a problem. I have the following string: "Trabalhadores em Funções Públicas (ADSE) (600000303)". My goal is to split this into two new columns: Contractor_Name: "Trabalhadores em Funções Públicas (ADSE) " and Contractor_Number: 600000303. Using the solution you provided, I obtain Contractor_Number: "ADSE)". Can you still help me, please?
    – jess
    Nov 11 at 16:26












  • I have updated the code to give you the required contractor_number. Please confirm whether you need (ADSE) in your contractor_name as well.
    – Akash Ranjan
    Nov 11 at 16:36










  • It's working now exactly as I wanted. Thank you so so much Akash for your great help!!!!
    – jess
    Nov 11 at 16:44










  • No, thanks. Didn't know that. Keep coding :)
    – Akash Ranjan
    Nov 11 at 17:09










  • It seems to have worked now. I was able to click on the arrow to vote +1. Thanks again! :)
    – jess
    Nov 11 at 17:12


















You could change \d to \w to match any alphanumeric character, like:

proc_2013['contractor_NIF'] = proc_2013.contractor_NIF.str.extract(r'(\w+)')
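Building on the \w idea, one way to cope with an extra parenthesised name such as "(ADSE)" is to anchor the pattern at the end of the string, so that only the last group in parentheses is taken as the fiscal number. The following is a rough pure-Python sketch rather than a definitive solution: the helper name split_by_regex, the group names and the sample "Acme Ltd (A12345678)" are all hypothetical, and it assumes the fiscal number is always the final parenthesised token.

```python
import re

# Lazy name, then the LAST "(...)" group anchored at the end of the string.
# \w+ accepts letters as well as digits, e.g. a foreign number with a letter.
PATTERN = re.compile(r'^(?P<name>.*?)\s*\((?P<nif>\w+)\)\s*$')


def split_by_regex(text):
    """Return (name, fiscal_number); fiscal_number is None when nothing matches."""
    match = PATTERN.match(text)
    if match is None:
        return text, None
    return match.group('name'), match.group('nif')


for sample in [
    "Meo(504615947)",
    "Trabalhadores em Funções Públicas (ADSE) (600000303)",
    "Acme Ltd (A12345678)",  # hypothetical foreign contractor
]:
    print(split_by_regex(sample))
```

With pandas, the same pattern could presumably be passed to Series.str.extract on the original contractor column, which returns one column per capture group.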














edited Nov 11 at 15:36
answered Nov 11 at 15:31 by Franco Piccolo












  • Thanks a lot Franco! Your comment helped me to solve my second problem. I still need to understand what to do when I have something like: Trabalhadores em Funções Públicas (ADSE) (600000303). I am getting ADSE as the fiscal number instead of 600000303.
    – jess
    Nov 11 at 15:41












  • Welcome! It would be good if you clarified, with examples, which inputs you are having trouble with.
    – Franco Piccolo
    Nov 11 at 15:46










  • Akash already helped me to solve the problem. Thanks again Franco for the help and sorry if I wasn't very clear!
    – jess
    Nov 11 at 16:45

















