Using explicitly numbered repetition instead of question mark, star and plus












I've seen regex patterns that use explicitly numbered repetition instead of ?, * and +, i.e.:



Explicit            Shorthand
(something){0,1}    (something)?
(something){1}      (something)
(something){0,}     (something)*
(something){1,}     (something)+


The questions are:




  • Are these two forms identical? What if you add possessive/reluctant modifiers?

  • If they are identical, which one is more idiomatic? More readable? Simply "better"?










Tags: regex, readability, repetition






edited Jun 13 '10 at 23:49 by Alan Moore
asked Jun 13 '10 at 14:39 by polygenelubricants
























          4 Answers
To my knowledge they are identical. I think there may be a few engines out there that don't support the numbered syntax, but I'm not sure which. I vaguely recall a question on SO a few days ago where explicit notation wouldn't work in Notepad++.

The only time I would use explicitly numbered repetition is when the repetition is greater than 1:

  • Exactly two: {2}
  • Two or more: {2,}
  • Two to four: {2,4}

I tend to prefer these especially when the repeated pattern is more than a few characters. If you have to match 3 digits, some people like to write \d\d\d, but I would rather write \d{3} since it emphasizes the number of repetitions involved. Furthermore, if that number ever needs to change down the road, I only need to change {3} to {n} rather than re-parse the regex in my head or worry about messing it up; it requires less mental effort.

If that criterion isn't met, I prefer the shorthand. Using the "explicit" notation quickly clutters up the pattern and makes it hard to read. I've worked on a project where some developers didn't know regex too well (it's not exactly everyone's favorite topic) and I saw a lot of {1} and {0,1} occurrences. A few people would ask me to review their patterns, and that's when I would suggest changing those occurrences to shorthand notation to save space and, IMO, improve readability.
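The "identical" claim can be spot-checked in a given engine. Below is a minimal sketch, assuming Python's built-in `re` module; the sample strings and the helper names `spans` and `forms_agree` are made up for illustration. It compares each explicit form from the question's table with its shorthand, including the reluctant variants. Possessive forms are left out because Python only accepts them from version 3.11 on. Agreement on a handful of samples is evidence, not proof, and other engines may behave differently.

```python
import re

# (explicit, shorthand) pairs from the question's table, followed by
# their reluctant (lazy) variants. "" stands for "no quantifier at all".
QUANTIFIER_PAIRS = [
    ("{0,1}", "?"),
    ("{1}", ""),
    ("{0,}", "*"),
    ("{1,}", "+"),
    ("{0,1}?", "??"),
    ("{0,}?", "*?"),
    ("{1,}?", "+?"),
]

# Arbitrary sample inputs chosen for this sketch.
SAMPLES = ["", "b", "ab", "aab", "aaab", "xaaby"]


def spans(pattern, text):
    """Return the (start, end) span of every match of pattern in text."""
    return [m.span() for m in re.finditer(pattern, text)]


def forms_agree(explicit, shorthand):
    """True if both spellings produce the same matches on every sample."""
    return all(
        spans("(a)" + explicit + "b", s) == spans("(a)" + shorthand + "b", s)
        for s in SAMPLES
    )


results = {pair: forms_agree(*pair) for pair in QUANTIFIER_PAIRS}
for (explicit, shorthand), same in results.items():
    print(f"{explicit!r:10} vs {shorthand!r:6} same matches: {same}")

# The repetition-count case from the answer: \d{3} versus \d\d\d.
digits_same = spans(r"\d{3}", "12345 678") == spans(r"\d\d\d", "12345 678")
print("\\d{3} vs \\d\\d\\d same matches:", digits_same)
```

Comparing match spans rather than just "did it match" also catches differences in how much text each spelling consumes.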







edited Jun 13 '10 at 15:32
answered Jun 13 '10 at 15:27 by Ahmad Mageed













                • +1, I too think shorthand is better, but I'm also in love with nested ternaries, and I've been virtually yelled at for doing that. I can see that some people may think {0,1} "shows intent more clearly" than ?, hence the Q.

                  – polygenelubricants
                  Jun 13 '10 at 15:47





















I can see how, if you have a regex that does a lot of bounded repetition, you might want to use the {n,m} form consistently for readability's sake. For example:

    /^
      abc{2,5}
      xyz{0,1}
      foo{3,12}
      bar{1,}
    $/x

But I can't recall ever seeing such a case in real life. When I see {0,1}, {0,} or {1,} being used in a question, it's virtually always being done out of ignorance. And in the process of answering such a question, we should also suggest that they use ?, * or + instead.

And of course, {1} is pure clutter. Some people seem to have a vague notion that it means "one and only one"--after all, it must mean something, right? Why would such a pathologically terse language support a construct that takes up a whole three characters and does nothing at all? Its only legitimate use that I know of is to isolate a backreference that's followed by a literal digit (e.g. \1{1}0), but there are other ways to do that.
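The backreference case can be illustrated with a short sketch, assuming Python's built-in `re` module. How `\10` is read differs between engines; in Python it is parsed as a reference to group 10, so a pattern with a single group fails to compile. The non-capturing group shown here is just one of the "other ways", and the helper name `compiles` is made up for illustration.

```python
import re

# (a) captures one 'a', \1 repeats it, and a literal '0' must follow.
# The {1} only serves to end the backreference before the digit.
isolated = re.compile(r"(a)\1{1}0")

# One alternative: wrap the backreference in a non-capturing group.
grouped = re.compile(r"(a)(?:\1)0")


def compiles(pattern):
    """Report whether Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True


print(bool(isolated.fullmatch("aa0")))  # backreference, then literal 0
print(bool(grouped.fullmatch("aa0")))   # same match without {1}
print(compiles(r"(a)\10"))              # \10 is read as group 10 here
```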







edited Nov 23 '15 at 21:49
answered Jun 13 '10 at 22:16 by Alan Moore























                            • They're all identical unless you're using an exceptional regex engine. However, not all regex engines support numbered repetition, ? or +.


                            • If all of them are available, I'd use characters rather than numbers, simply because it's more intuitive for me.








answered Jun 13 '10 at 15:29 by tiftik























They're equivalent (and you'll find out whether they're available by testing in your context).

The problem I'd anticipate arises when you are not the only person who will ever need to work with your code. Regexes are difficult enough for most people. Anytime someone uses an unusual syntax, the question arises: "Why didn't they do it the standard way? What were they thinking that I'm missing?"
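"Testing your context" could look something like the sketch below, assuming Python's built-in `re` module. The probe table and the helper name `is_supported` are made up for illustration. Each probe checks behaviour rather than just compilation, since an engine that treats braces as literal characters would still compile the pattern. The possessive forms are expected to work only on Python 3.11 or later.

```python
import re

# label -> (pattern, text the pattern should fully match if the
# quantifier is supported with its usual meaning)
PROBES = {
    "?": ("ab?c", "ac"),
    "{0,1}": ("ab{0,1}c", "ac"),
    "*": ("ab*c", "abbc"),
    "{0,}": ("ab{0,}c", "abbc"),
    "+": ("ab+c", "abbc"),
    "{1,}": ("ab{1,}c", "abbc"),
    "+? (reluctant)": ("ab+?c", "abbc"),
    "{1,}? (reluctant)": ("ab{1,}?c", "abbc"),
    "++ (possessive)": ("ab++c", "abbc"),
    "{1,}+ (possessive)": ("ab{1,}+c", "abbc"),
}


def is_supported(pattern, text):
    """True if the pattern compiles and fully matches the probe text."""
    try:
        return re.fullmatch(pattern, text) is not None
    except re.error:
        return False


support = {label: is_supported(*probe) for label, probe in PROBES.items()}
for label, ok in support.items():
    print(f"{label:20} {'supported' if ok else 'not supported'}")
```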







answered Jun 13 '10 at 16:10 by dkretz





























