regular expression of python











I am struggling to write a regular expression in Python.
For instance, I have the following case working:

"GET /images/launch-logo.gif HTTP/1.0" 220 1839

is matched by

"(\S+) (\S+)\s*(\S*)" (\d{3}) (\S+)

However, I still need to handle all of the following cases as well:

  1. "GET /history/history.html hqpao/hqpao_home.html HTTP/1.0" 200 1502

  2. "GET /shuttle/missions/missions.html Shuttle Launches from Kennedy Space Center HTTP/1.0"200 8677

  3. "GET /finger @net.com HTTP/1.0"404 -
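As a quick check of where things stand, here is a minimal sketch that runs the expression against the four sample lines. It assumes the escapes are `\S`, `\s` and `\d` (written as a raw string) and that each log entry is a single line; the names `ORIGINAL` and `LINES` are only illustrative.

```python
import re

# Pattern from the question, written as a raw string with its backslashes.
ORIGINAL = re.compile(r'"(\S+) (\S+)\s*(\S*)" (\d{3}) (\S+)')

# Sample lines from the question, each assumed to be a single line.
LINES = [
    '"GET /images/launch-logo.gif HTTP/1.0" 220 1839',
    '"GET /history/history.html hqpao/hqpao_home.html HTTP/1.0" 200 1502',
    '"GET /shuttle/missions/missions.html Shuttle Launches from Kennedy Space Center HTTP/1.0"200 8677',
    '"GET /finger @net.com HTTP/1.0"404 -',
]

for line in LINES:
    match = ORIGINAL.fullmatch(line)
    # Print the captured groups, or None when the line is not matched.
    print(match.groups() if match else None)
```

Under these assumptions only the first line is matched. The other three fail because of the extra text between the path and the protocol, and cases 2 and 3 also lack the space after the closing quote.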


Obviously I should change the bold part of the expression

"(\S+) (\S+)\s*(\S*)" (\d{3}) (\S+)

But how should I change it? One approach I have in mind is to change the bold part to

[\s |(\s*)(\S+) |(\S+)(12) |(\S+)]

where the 2nd, 3rd and 4th alternatives are the extra cases (1), (2) and (3) that I need to handle.

But my expression does not work. What do I misunderstand about regular expressions, given that I am simply handling it case by case?
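One likely source of the confusion is that square brackets do not group alternatives in Python's `re` module. They define a character class that matches exactly one character. The sketch below illustrates the difference with a toy pattern and then with the bracketed attempt itself (again assuming `\s` and `\S` for the escapes; the variable names are illustrative).

```python
import re

# Square brackets build a character class: exactly one character from the set.
# Inside the brackets, "|", "(", ")", "+" and "*" are ordinary characters.
char_class = re.compile(r'[ab|cd]')
print(bool(char_class.fullmatch('ab')))  # False: "ab" is two characters
print(bool(char_class.fullmatch('|')))   # True: "|" is a member of the set

# Choosing between whole sub-patterns needs a group with "|" instead.
group = re.compile(r'(?:ab|cd)')
print(bool(group.fullmatch('ab')))       # True
print(bool(group.fullmatch('|')))        # False

# The bracketed attempt is therefore a single-character class. Because it
# contains both \s and \S, it accepts any one character and nothing longer.
attempt = re.compile(r'[\s |(\s*)(\S+) |(\S+)(12) |(\S+)]')
print(bool(attempt.fullmatch('x')))      # True
print(bool(attempt.fullmatch('xy')))     # False
```

To handle the cases one by one, the alternatives would go in a group such as `(?:...|...)` rather than in brackets.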










python regex






edited Nov 11 at 11:31 by das-g

asked Nov 11 at 10:20 by Ricky Ng












  • Are the beginnings (1), (2) and (3) part of what you want to match, or is that a numbered list of strings to match?
    – das-g
    Nov 11 at 10:27

  • My regular expression has to cover all of (1), (2), (3) and the very first case.
    – Ricky Ng
    Nov 11 at 10:28

  • Yes, but is the actual string to match in the (1) case (1)"GET /history/history.html hqpao/hqpao_home.html HTTP/1.0" 200 1502, or is it just "GET /history/history.html hqpao/hqpao_home.html HTTP/1.0" 200 1502 and the (1) is just there to number it in your post here?
    – das-g
    Nov 11 at 10:32

  • Oh! It is just for numbering and is not part of the requirement.
    – Ricky Ng
    Nov 11 at 11:29


















2 Answers
This might be a bit messy, but it works:

"(\S+) (\S+[\s\w.@]*)\s*(\S*)"\s?(\d{3})\s(\S+)*

You can play with it on Regexr. Regexr Shared Link
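A minimal sketch for trying this pattern from Python rather than Regexr, assuming the escapes are `\S`, `\s`, `\w` and `\d` and that each sample is a single line (`PATTERN` and `SAMPLES` are illustrative names):

```python
import re

# Pattern from the answer above, with backslashes assumed for \S, \s, \w and \d.
PATTERN = re.compile(r'"(\S+) (\S+[\s\w.@]*)\s*(\S*)"\s?(\d{3})\s(\S+)*')

# The four sample lines from the question, each assumed to be a single line.
SAMPLES = [
    '"GET /images/launch-logo.gif HTTP/1.0" 220 1839',
    '"GET /history/history.html hqpao/hqpao_home.html HTTP/1.0" 200 1502',
    '"GET /shuttle/missions/missions.html Shuttle Launches from Kennedy Space Center HTTP/1.0"200 8677',
    '"GET /finger @net.com HTTP/1.0"404 -',
]

for line in SAMPLES:
    match = PATTERN.search(line)
    # Show whether the line matched and, if so, what each group captured.
    print(bool(match), match.groups() if match else None)
```

Read this way, the pattern is worth checking line by line. The class `[\s\w.@]` contains no `/`, so case 1, whose extra text is `hqpao/hqpao_home.html`, does not appear to match. Where the pattern does match, the split between groups is loose: for the first sample, group 2 ends up as `/images/launch-logo.gif HTTP` and group 3 as `/1.0`.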







answered Nov 11 at 10:35 by Dani G
























You may use

^"([^\s"]+)\s+([^\s"]+)(?:\s+([^"]+?))?\s+([A-Z]+/\d[\d.]*)"\s*(\d{3})\s*(\S+)$

See the regex demo

Details

• ^ - start of the string (or of a line with re.M, which you need if you are reading the whole file into a variable with f.read())
• " - a double quotation mark
• ([^\s"]+) - Group 1: one or more chars other than whitespace and a double quotation mark
• \s+ - 1+ whitespaces
• ([^\s"]+) - Group 2: one or more chars other than whitespace and a double quotation mark
• (?:\s+([^"]+?))? - an optional non-capturing group matching
  • \s+ - 1+ whitespaces
  • ([^"]+?) - Group 3: any 1 or more chars other than ", as few as possible
• \s+ - 1+ whitespaces
• ([A-Z]+/\d[\d.]*) - Group 4: 1+ uppercase letters, / and then 1 digit followed by any 0+ digits or . chars
• " - a double quotation mark
• \s* - 0+ whitespaces
• (\d{3}) - Group 5: three digits
• \s* - 0+ whitespaces
• (\S+) - Group 6: 1 or more non-whitespace chars
• $ - end of the string (or of a line with re.M).
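The breakdown above can be tried out with a short sketch like the following. It assumes the escapes are `\s`, `\d` and `\S`, that each log entry is a single line, and that lines are matched one at a time, so `re.M` is not needed. The helper `parse` and the tuple layout are illustrative, not part of the answer.

```python
import re

# Pattern from the answer above, written as a raw string.
PATTERN = re.compile(
    r'^"([^\s"]+)\s+([^\s"]+)(?:\s+([^"]+?))?\s+([A-Z]+/\d[\d.]*)"\s*(\d{3})\s*(\S+)$'
)

# The four sample lines from the question, each assumed to be a single line.
SAMPLES = [
    '"GET /images/launch-logo.gif HTTP/1.0" 220 1839',
    '"GET /history/history.html hqpao/hqpao_home.html HTTP/1.0" 200 1502',
    '"GET /shuttle/missions/missions.html Shuttle Launches from Kennedy Space Center HTTP/1.0"200 8677',
    '"GET /finger @net.com HTTP/1.0"404 -',
]


def parse(line):
    """Return (method, path, extra, protocol, status, size), or None if no match."""
    match = PATTERN.match(line)
    return match.groups() if match else None


for line in SAMPLES:
    print(parse(line))
```

Under these assumptions all four samples match. The optional Group 3 is `None` for the first line and holds the extra text (`hqpao/hqpao_home.html`, `Shuttle Launches from Kennedy Space Center`, `@net.com`) for the other three.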







answered Nov 11 at 11:22 by Wiktor Stribiżew





























