haskell - Indices of all matches of a regex -


I am trying to match all occurrences of a regex and get the indices as a result. The example from Real World Haskell says I can do

    string =~ regex :: [(Int, Int)]

However, this is broken, since the regex library has been updated since the publication of RWH. (See the questions "All matches of regex in Haskell" and '"=~" raise "No instance for (RegexContext Regex [Char] [String])"'.) What is the correct way to do this?

Update:

I found matchAll, which might give me what I want. I have no idea how to use it, though.

The key to using matchAll is the type annotation :: Regex when creating the regex:

    import Text.Regex
    import Text.Regex.Base

    re = makeRegex "[^aeiou]" :: Regex
    test = matchAll re "the quick brown fox"

This returns a list of arrays. To get a list of (offset, length) pairs, access the first element of each array:

    import Data.Array ((!))

    matches = map (!0) $ matchAll re "the quick brown fox"
    -- [(0,1),(1,1),(3,1),(4,1),(7,1),(8,1),(9,1),(10,1),(11,1),(13,1),(14,1),(15,1),(16,1),(18,1)]
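If you also want the matched text, the (offset, length) pairs can be turned back into substrings with take and drop. The following is only a sketch: slice, sentence, pairs and matchedText are names made up for illustration, and the pair list is copied from the output above rather than computed, so the snippet needs nothing beyond the Prelude.

```haskell
-- Hypothetical helper: cut out the substring described by an (offset, length) pair.
slice :: (Int, Int) -> String -> String
slice (off, len) = take len . drop off

sentence :: String
sentence = "the quick brown fox"

-- The pairs reported above for "[^aeiou]" on the sentence.
pairs :: [(Int, Int)]
pairs =
  [ (0,1),(1,1),(3,1),(4,1),(7,1),(8,1),(9,1)
  , (10,1),(11,1),(13,1),(14,1),(15,1),(16,1),(18,1) ]

-- One single-character string per match.
matchedText :: [String]
matchedText = [slice p sentence | p <- pairs]
-- concat matchedText == "th qck brwn fx"
```

Note that take and drop walk the list from the start each time, so for long inputs with many matches a different string type would probably be a better fit.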

To use the =~ operator, things may have changed since RWH. You should use the predefined types MatchOffset and MatchLength and the special type constructor AllMatches:

    import Text.Regex.Posix

    re = "[^aeiou]"
    text = "the quick brown fox"

    test1 = text =~ re :: Bool
      -- True

    test2 = text =~ re :: String
      -- "t"

    test3 = text =~ re :: (MatchOffset, MatchLength)
      -- (0,1)

    test4 = text =~ re :: AllMatches [] (MatchOffset, MatchLength)
      -- (not showable)

    test4' = getAllMatches $ (text =~ re :: AllMatches [] (MatchOffset, MatchLength))
      -- [(0,1),(1,1),(3,1),(4,1),(7,1),(8,1),(9,1),(10,1),(11,1),(13,1),(14,1),(15,1),(16,1),(18,1)]

See the docs for Text.Regex.Base.Context for more details on the contexts available.
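To answer the original question directly, the AllMatches context can be wrapped in a small function. This is a minimal sketch, assuming the regex-posix package is installed; allMatchIndices and allMatchOffsets are hypothetical names, not something the library provides.

```haskell
import Text.Regex.Posix

-- Hypothetical helper (not part of regex-posix): offset and length of
-- every match of a POSIX regex in the input string.
allMatchIndices :: String -> String -> [(MatchOffset, MatchLength)]
allMatchIndices pat input =
  getAllMatches (input =~ pat :: AllMatches [] (MatchOffset, MatchLength))

-- Same idea, keeping only the starting offsets.
allMatchOffsets :: String -> String -> [MatchOffset]
allMatchOffsets pat input = map fst (allMatchIndices pat input)
```

Loaded into GHCi, allMatchOffsets "o" "the quick brown fox" should give [12,17]. The type annotation stays inside the helper, so callers do not have to spell out the AllMatches context themselves.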

Update: I believe the type constructor AllMatches was introduced to resolve the ambiguity that arises when a regex has subexpressions, e.g.:

    import Text.Regex.Posix

    str = "axx ayy"
    pat = "a(.)([^a])"

    test1 = getAllMatches (str =~ pat :: AllMatches [] (MatchOffset, MatchLength))
      -- [(0,3),(4,3)]
      -- returns the locations of "axx" and "ayy" but no subexpression info

    test2 = str =~ pat :: MatchArray
      -- array (0,2) [(0,(0,3)),(1,(1,1)),(2,(2,1))]
      -- returns only the first match, "axx", with the locations of its subexpressions

Both are essentially lists of offset-length pairs, but they mean different things.
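If you need the subexpression locations for every match rather than just the first one, matchAll already returns one array per match. The sketch below, again assuming regex-posix, flattens each array into a list; allSubmatchIndices is a made-up name for illustration.

```haskell
import Data.Array (elems)
import Text.Regex.Posix

-- Hypothetical helper: for each match, the (offset, length) of the whole
-- match followed by the (offset, length) of each subexpression.
allSubmatchIndices :: String -> String -> [[(MatchOffset, MatchLength)]]
allSubmatchIndices pat input = map elems (matchAll re input)
  where
    -- The :: Regex annotation selects the POSIX backend.
    re = makeRegex pat :: Regex
```

For the example above, allSubmatchIndices "a(.)([^a])" "axx ayy" should give [[(0,3),(1,1),(2,1)],[(4,3),(5,1),(6,1)]]: the head of each inner list is what the AllMatches context reports, and the rest is the per-match subexpression info.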

