Class Rouge::RegexLexer
In: lib/rouge/regex_lexer.rb
Parent: Lexer

@abstract A stateful lexer that uses sets of regular expressions to tokenize a string. Most lexers are subclasses of RegexLexer.

Methods

append   delegate   get_state   get_state   goto   group   groups   in_state?   pop!   prepend   push   recurse   replace_state   reset!   reset_stack   stack   start   start_procs   state   state   state?   state_definitions   states   step   stream_tokens   token  

Classes and Modules

Class Rouge::RegexLexer::Rule
Class Rouge::RegexLexer::State
Class Rouge::RegexLexer::StateDSL

Constants

MAX_NULL_SCANS = 5   The number of successive scans permitted without consuming the input stream. If this is exceeded, the match fails.
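To illustrate what this limit guards against, here is a minimal sketch that uses only Ruby's standard `StringScanner`. It is not Rouge's implementation: `NULL_SCAN_LIMIT` and `scan_with_null_limit` are hypothetical names, and the sketch assumes that exceeding the limit simply ends the scan.

```ruby
require 'strscan'

NULL_SCAN_LIMIT = 5 # hypothetical stand-in for MAX_NULL_SCANS

# Repeatedly applies +pattern+ at the current scanner position. Zero-width
# matches are counted; once more than NULL_SCAN_LIMIT occur in a row, the
# scan is treated as a failure so the loop cannot spin forever.
def scan_with_null_limit(scanner, pattern)
  matches = []
  null_scans = 0
  until scanner.eos?
    matched = scanner.scan(pattern)
    break if matched.nil? # the pattern does not match here at all

    if matched.empty?
      null_scans += 1
      break if null_scans > NULL_SCAN_LIMIT
    else
      null_scans = 0 # real progress resets the counter
    end
    matches << matched
  end
  matches
end
```

With the input `"aaab"` and the pattern `/a*/`, the first scan consumes `"aaa"`; every later scan matches the empty string in front of `"b"`, so the loop stops after five such null scans instead of running indefinitely.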

Public Class methods

Specify an action to be run at the start of every fresh lex.

@example

  start { puts "I'm lexing a new string!" }

The routines to run at the beginning of a fresh lex. @see start

Define a new state for this lexer with the given name. The block will be evaluated in the context of a {StateDSL}.

The states hash for this lexer. @see state
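The relationship between `state`, `states`, and the DSL block can be sketched with plain Ruby. Everything below is hypothetical (`ToyStateDSL`, `ToyLexer`, and the `rule` signature are illustration names, not Rouge's classes), and the sketch evaluates the block immediately, which is a simplification that may differ from what Rouge does.

```ruby
# Hypothetical stand-in for a state DSL: collects [pattern, token] rules.
class ToyStateDSL
  attr_reader :rules

  def initialize
    @rules = []
  end

  def rule(pattern, token)
    @rules << [pattern, token]
  end
end

class ToyLexer
  # Class-level hash mapping a state name to its list of rules.
  def self.states
    @states ||= {}
  end

  # Defines a state. The block runs with the DSL object as `self`,
  # which is why `rule` can be called without a receiver inside it.
  def self.state(name, &block)
    dsl = ToyStateDSL.new
    dsl.instance_eval(&block)
    states[name.to_sym] = dsl.rules
  end

  state :root do
    rule(/\d+/, :Number)
    rule(/\s+/, :Text)
  end
end
```

After the class body has run, `ToyLexer.states[:root]` holds the two rules in definition order.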

Public Instance methods

Delegate the lex to another lexer. The lex method will be called with `:continue` set to true, so that reset! will not be called. In this way, a single lexer can be repeatedly delegated to while maintaining its own internal state stack.

@param [#lex] lexer

  The lexer or lexer class to delegate to

@param [String] text

  The text to delegate.  This defaults to the last matched string.
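Why skipping the reset matters can be shown with a small self-contained model. `ToyInnerLexer`, `toy_delegate`, and the `continue:` keyword argument are assumptions made for this sketch; they only mimic the idea described above and are not Rouge's API.

```ruby
# Hypothetical inner lexer that remembers whether it is inside a
# double-quoted string between calls to #lex.
class ToyInnerLexer
  attr_reader :stack

  def initialize
    reset!
  end

  def reset!
    @stack = [:root]
  end

  # Returns [token, value] pairs. The stack is reset first unless
  # +continue+ is true.
  def lex(text, continue: false)
    reset! unless continue
    text.each_char.map do |ch|
      if ch == '"'
        @stack.last == :string ? @stack.pop : @stack.push(:string)
        [:Str, ch]
      else
        [@stack.last == :string ? :Str : :Text, ch]
      end
    end
  end
end

# Hypothetical delegation helper: hands a chunk of text to the inner
# lexer without resetting it.
def toy_delegate(lexer, text)
  lexer.lex(text, continue: true)
end
```

Delegating `'a "b'` and then `'c" d'` to the same `ToyInnerLexer` instance lets the second chunk start inside the string opened by the first, because the inner stack survives between the two calls.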

Replace the head of the stack with the given state.

@deprecated

Yield a token with the next matched group. Subsequent calls to this method will yield subsequent groups.

Yield tokens corresponding to the matched groups of the current match.
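The pairing of token types with capture groups can be sketched with Ruby's built-in `MatchData`. `tokens_for_groups` is a hypothetical helper, token types are plain symbols here, and skipping groups that did not take part in the match is a choice made for this sketch rather than documented behavior.

```ruby
# Pairs the n-th token type with the n-th capture group of +match+,
# skipping groups that did not participate in the match.
def tokens_for_groups(match, *token_types)
  pairs = token_types.each_with_index.map do |type, i|
    text = match[i + 1] # group 0 is the whole match, so offset by one
    text.nil? ? nil : [type, text]
  end
  pairs.compact
end

m = /(def)(\s+)(\w+)/.match("def  foo")
pairs = tokens_for_groups(m, :Keyword, :Text, :Name)
```

Here `pairs` is `[[:Keyword, "def"], [:Text, "  "], [:Name, "foo"]]`: one token per group, in group order.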

Check if `state_name` is in the state stack.

Pop the state stack. If a number is passed in, it will be popped that number of times.

Push a state onto the stack. If no state name is given and you've passed a block, a state will be dynamically created using the {StateDSL}.

Reset this lexer to its initial state. This runs all of the start_procs.
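How registered start procs and a reset could fit together is sketched below. `ToyStartLexer` and its `string_depth` attribute are hypothetical, and running each proc with `instance_eval` is an assumption of the sketch, chosen so that a proc can set instance variables on the lexer.

```ruby
class ToyStartLexer
  attr_reader :stack, :string_depth

  # Class-level list of blocks to run at the start of each fresh lex.
  def self.start_procs
    @start_procs ||= []
  end

  # Registers a block to run on every reset.
  def self.start(&block)
    start_procs << block
  end

  start { @string_depth = 0 }

  # Restores the initial stack, then runs every registered start proc
  # in the context of this lexer instance.
  def reset!
    @stack = [:root]
    self.class.start_procs.each { |pr| instance_eval(&pr) }
  end
end
```

A new `ToyStartLexer` has no `string_depth` until `reset!` is called; afterwards it is `0` and the stack is `[:root]`.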

Reset the stack back to `[:root]`.

The state stack. This is initially the single state `[:root]`. It is an error for this stack to be empty. @see state

The current state, i.e. the one on top of the state stack.

NB: if the state stack is empty, this will throw an error rather than returning nil.

Check if `state_name` is the state on top of the state stack.
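The stack operations described in this section (replacing the head, pushing, popping, resetting, and the two membership checks) can be modeled in a few lines. `ToyStateStack` is a hypothetical class in which states are plain symbols; it borrows the method names from the method list above, but its behavior is a sketch and not Rouge's code.

```ruby
# Hypothetical model of a lexer state stack.
class ToyStateStack
  attr_reader :stack

  def initialize
    reset_stack
  end

  # Back to the initial single-state stack.
  def reset_stack
    @stack = [:root]
  end

  def push(name)
    @stack.push(name.to_sym)
  end

  # Pops +times+ states off the stack.
  def pop!(times = 1)
    raise 'empty stack!' if @stack.empty?

    @stack.pop(times)
    nil
  end

  # Replaces the state on top of the stack.
  def goto(name)
    raise 'empty stack!' if @stack.empty?

    @stack[-1] = name.to_sym
  end

  # The state on top of the stack; raises rather than returning nil.
  def state
    @stack.last || raise('empty stack!')
  end

  # True if +name+ appears anywhere in the stack.
  def in_state?(name)
    @stack.include?(name.to_sym)
  end

  # True only if +name+ is the state on top of the stack.
  def state?(name)
    name.to_sym == state
  end
end
```

After `push(:string)` and `push(:escape)`, `in_state?(:string)` is true while `state?(:string)` is false, since `:escape` is on top; `goto(:interp)` then swaps only that top entry.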

Runs one step of the lex. Rules in the current state are tried until one matches, at which point its callback is called.

@return true if a rule was tried successfully

@return false otherwise.

This implements the lexer protocol, by yielding [token, value] pairs.

The process for lexing works as follows, until the stream is empty:

  1. We look at the state on top of the stack (which by default is `:root`).
  2. Each rule in that state is tried until one is successful. If one is found, that rule's callback is evaluated, which may yield tokens and manipulate the state stack. Otherwise, one character is consumed with an `'Error'` token, and we continue at (1.).

@see step (where (2.) is implemented)
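The two-step process above can be sketched as a small self-contained loop built on Ruby's standard `StringScanner`. This is a simplified model under stated assumptions, not Rouge's implementation: `TOY_STATES`, `toy_step`, and `toy_stream_tokens` are hypothetical names, tokens are plain symbols, and callbacks are lambdas that receive the matched text, the state stack, and the output array.

```ruby
require 'strscan'

# Hypothetical rule table: state name => list of [pattern, callback].
TOY_STATES = {
  root: [
    [/\d+/, ->(m, _stack, out) { out << [:Number, m] }],
    [/\s+/, ->(m, _stack, out) { out << [:Text, m] }],
    [/"/,   ->(m, stack, out) { out << [:Str, m]; stack.push(:string) }]
  ],
  string: [
    [/[^"]+/, ->(m, _stack, out) { out << [:Str, m] }],
    [/"/,     ->(m, stack, out) { out << [:Str, m]; stack.pop }]
  ]
}.freeze

# One step: try each rule of the state on top of the stack.
# Returns true if a rule matched, false otherwise.
def toy_step(scanner, stack, out)
  TOY_STATES.fetch(stack.last).each do |pattern, callback|
    matched = scanner.scan(pattern)
    next if matched.nil?

    callback.call(matched, stack, out)
    return true
  end
  false
end

# Main loop: run steps until the input is consumed. When no rule
# matches, consume one character as an :Error token and carry on.
def toy_stream_tokens(text)
  scanner = StringScanner.new(text)
  stack = [:root]
  out = []
  until scanner.eos?
    out << [:Error, scanner.getch] unless toy_step(scanner, stack, out)
  end
  out
end
```

Calling `toy_stream_tokens('12 "ab" ?')` produces a `:Number`, a `:Text`, three `:Str` tokens for the quoted string (the opening quote pushes `:string`, the closing quote pops it), another `:Text`, and finally an `:Error` token for `?`, which no rule in the `:root` state matches.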

Yield a token.

@param tok

  the token type

@param val

  (optional) the string value to yield.  If absent, this defaults
  to the entire last match.
