re2-0.2: Bindings to the re2 regular expression library

LicenseMIT
Safe HaskellNone
LanguageHaskell98

Regex.RE2

Contents

Description

re2 is a regular expression library offering predictable run-time and memory consumption. This package is a binding to re2.

Supported expression syntax is documented at http://code.google.com/p/re2/wiki/Syntax.

$ ghci -XOverloadedStrings
ghci> import Regex.RE2

ghci> find "\\w+" "hello world"
Just (Match [Just "hello"])

ghci> find "\\w+$" "hello world"
Just (Match [Just "world"])

ghci> find "^\\w+$" "hello world"
Nothing

Synopsis

Compiling patterns

data Pattern #

A pattern is a compiled regular expression plus its compilation options.

Patterns can be created by calling compile explicitly:

import Data.ByteString.Char8 (pack)

p :: Pattern
p = case compile (pack "^hello world$") of
        Right ok -> ok
        Left err -> error ("compilation error: " ++ errorMessage err)
 

Or by using the IsString instance:

import Data.String (fromString)

p :: Pattern
p = fromString "^hello world$"

Or by using the OverloadedStrings language extension:

{-# LANGUAGE OverloadedStrings #-}

p :: Pattern
p = "^hello world$"

compileWith :: Options -> ByteString -> Either Error Pattern #

Compile a regular expression with the given options. If compilation fails, the error can be inspected with errorMessage and errorCode.

Use optionEncoding to select whether the input bytes should be interpreted as UTF-8 or Latin1. The default is UTF8.

Pattern properties

patternInput :: Pattern -> ByteString #

The regular expression originally provided to compileWith.

patternOptions :: Pattern -> Options #

The options originally provided to compileWith.

patternGroups :: Pattern -> Vector (Maybe ByteString) #

The capturing groups defined within the pattern. Groups are listed from left to right, and are Nothing if the group is unnamed.

ghci> patternGroups "(\\d+)|(?P<word>\\w+)"
fromList [Nothing,Just "word"]

Options

data Options #

Options controlling how to compile a regular expression. The fields in this value may be set using record syntax:

compileNoCase :: B.ByteString -> Either Error Pattern
compileNoCase = compileWith (defaultOptions { optionCaseSensitive = False })
 

Instances

defaultOptions :: Options #

defaultOptions = Options
        { optionEncoding = EncodingUtf8
        , optionPosixSyntax = False
        , optionLongestMatch = False
        , optionMaxMemory = 8388608  -- 8 << 20
        , optionLiteral = False
        , optionNeverNewline = False
        , optionDotNewline = False
        , optionNeverCapture = False
        , optionCaseSensitive = True
        , optionPerlClasses = False
        , optionWordBoundary = False
        , optionOneLine = False
        }
 

optionPerlClasses :: Options -> Bool #

Only checked in posix mode

optionWordBoundary :: Options -> Bool #

Only checked in posix mode

optionOneLine :: Options -> Bool #

Only checked in posix mode

Compilation errors

data Error #

Instances

Eq Error # 

Methods

(==) :: Error -> Error -> Bool #

(/=) :: Error -> Error -> Bool #

Show Error # 

Methods

showsPrec :: Int -> Error -> ShowS #

show :: Error -> String #

showList :: [Error] -> ShowS #

Matching

data Match #

A successful match of the pattern against some input. Capturing groups may be retrieved with matchGroup or matchGroups.

Instances

Eq Match # 

Methods

(==) :: Match -> Match -> Bool #

(/=) :: Match -> Match -> Bool #

Show Match # 

Methods

showsPrec :: Int -> Match -> ShowS #

show :: Match -> String #

showList :: [Match] -> ShowS #

matchGroup :: Match -> Int -> Maybe ByteString #

The capturing group with the given index, or Nothing if the group was not set in this match.

The entire match is group 0.

matchGroups :: Match -> Vector (Maybe ByteString) #

All of the groups in the pattern, with each group being Nothing if it was not set in this match. Groups are returned in the same order as patternGroups.

The entire match is group 0.

data Anchor #

Constructors

AnchorStart 
AnchorBoth 

Instances

Eq Anchor # 

Methods

(==) :: Anchor -> Anchor -> Bool #

(/=) :: Anchor -> Anchor -> Bool #

Show Anchor # 

match #

Arguments

:: Pattern 
-> ByteString 
-> Int

Start position

-> Int

End position

-> Maybe Anchor 
-> Int

How many match groups to populate

-> Maybe Match 

The most general matching function. Attempt to match the pattern to the input within the given constraints.

If the number of match groups to populate is 0, matching can be performed more efficiently.

Searching

find :: Pattern -> ByteString -> Maybe Match #

Attempt to find the pattern somewhere within the input.

Replacing

replace #

Arguments

:: Pattern 
-> ByteString

Input

-> ByteString

Replacement template

-> (ByteString, Bool) 

Replace the first occurance of the pattern with the given replacement template. If the template contains backslash escapes such as \1, the capture group with the given index will be inserted in their place.

Returns the new bytes, and True if a replacement occured.

replaceAll #

Arguments

:: Pattern 
-> ByteString

Input

-> ByteString

Replacement template

-> (ByteString, Int) 

Replace every occurance of the pattern with the given replacement template. If the template contains backslash escapes such as \1, the capture group with the given index will be inserted in their place.

Returns the new bytes, and how many replacements occured.

extract #

Arguments

:: Pattern 
-> ByteString

Input

-> ByteString

Extraction template

-> Maybe ByteString 

Attempt to find the pattern somewhere within the input, and extract it using the given template. If the template contains backslash escapes such as \1, the capture group with the given index will be inserted in their place.

Returns Nothing if the pattern was not found in the input.

Utility functions

quoteMeta :: ByteString -> ByteString #

Escapes bytes such that the output is a regular expression which will exactly match the input.