Skip to main content
Engineering LibreTexts

5.5: Implementation Notes by Vassili Bykov

  • Page ID
    43816
  • \( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \) \( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)\(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\) \(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\)\(\newcommand{\AA}{\unicode[.8,0]{x212B}}\)

    What to look at first. In 90% of the cases, the method String»matchesRegex: is all you need to access the package.

    RxParser accepts a string or a stream of characters with a regular expression, and produces a syntax tree corresponding to the expression. The tree is made of instances of Rxs* classes.

    RxMatcher accepts a syntax tree of a regular expression built by the parser and compiles it into a matcher: a structure made of instances of Rxm* classes. The RxMatcher instance can test whether a string or a positionable stream of characters matches the original regular expression, or it can search a string or a stream for substrings matching the expression. After a match is found, the matcher can report a specific string that matched the whole expression, or any parenthesized subexpression of it. All other classes support the same functionality and are used by RxParser, RxMatcher, or both.

    Caveats. The matcher is similar in spirit, but not in design to Henry Spencer’s original regular expression implementation in C. The focus is on simplicity, not on efficiency. The matcher passes H. Spencer’s test suite (see “test suite” protocol), with quite a few extra tests added, so chances are good there are not too many bugs.

    Acknowledgments. Since the first release of the matcher, thanks to the input from several fellow Smalltalkers, I became convinced a native Smalltalk regular expression matcher was worth the effort to keep it alive. For the advice and encouragement that made this release possible, I want to thank: Felix Hack, Eliot Miranda, Robb Shecter, David N. Smith, Francis Wolinski and anyone whom I haven’t yet met or heard from, but who agrees this has not been a complete waste of time.


    This page titled 5.5: Implementation Notes by Vassili Bykov is shared under a CC BY-SA 3.0 license and was authored, remixed, and/or curated by Alexandre Bergel, Damien Cassou, Stéphane Ducasse, Jannik Laval (Square Bracket Associates) via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.