Haskell - Indices of all matches of a regex
I am trying to match all occurrences of a regex and get the indices as the result. The example from Real World Haskell says I can do

    string =~ regex :: [(Int, Int)]

However, this is broken, since the regex library has been updated since the publication of RWH. (See "All matches of regex in Haskell" and "=~ raise No instance for (RegexContext Regex [Char] [String])".) What is the correct way to do this?
Update:
I found matchAll, which might give me what I want. I have no idea how to use it, though.
The key to using matchAll is to use the type annotation :: Regex when creating regexes:
    import Text.Regex
    import Text.Regex.Base

    re = makeRegex "[^aeiou]" :: Regex
    test = matchAll re "the quick brown fox"
This returns a list of arrays. To get a list of (offset, length) pairs, access the first element (index 0) of each array:
    import Data.Array ((!))

    matches = map (!0) $ matchAll re "the quick brown fox"
    -- [(0,1),(1,1),(3,1),(4,1),(7,1),(8,1),(9,1),(10,1),(11,1),(13,1),(14,1),(15,1),(16,1),(18,1)]
To use the =~ operator, things may have changed since RWH. You should use the predefined types MatchOffset and MatchLength, and the special type constructor AllMatches:
    import Text.Regex.Posix

    re = "[^aeiou]"
    text = "the quick brown fox"

    test1 = text =~ re :: Bool
    -- True

    test2 = text =~ re :: String
    -- "t"

    test3 = text =~ re :: (MatchOffset, MatchLength)
    -- (0,1)

    test4 = text =~ re :: AllMatches [] (MatchOffset, MatchLength)
    -- (not showable)

    test4' = getAllMatches $ (text =~ re :: AllMatches [] (MatchOffset, MatchLength))
    -- [(0,1),(1,1),(3,1),(4,1),(7,1),(8,1),(9,1),(10,1),(11,1),(13,1),(14,1),(15,1),(16,1),(18,1)]
See the docs for Text.Regex.Base.Context for more details on which contexts are available.
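Since the offsets index into the original string, they can be used to slice out the matched text. The following is a minimal sketch, not part of the library: matchesWithText is a hypothetical helper name, and the pattern "[a-z]+" is chosen only for illustration. It assumes the regex-posix package and plain String input.

```haskell
import Text.Regex.Posix ((=~), AllMatches, getAllMatches, MatchOffset, MatchLength)

-- | Hypothetical helper: every match as (offset, length, matched text).
-- The text is recovered by slicing the input with the reported offsets.
matchesWithText :: String -> String -> [(MatchOffset, MatchLength, String)]
matchesWithText pat input =
  [ (off, len, take len (drop off input))
  | (off, len) <- getAllMatches
      (input =~ pat :: AllMatches [] (MatchOffset, MatchLength))
  ]

main :: IO ()
main = mapM_ print (matchesWithText "[a-z]+" "the quick brown fox")
-- (0,3,"the")
-- (4,5,"quick")
-- (10,5,"brown")
-- (16,3,"fox")
```

Slicing with take and drop is linear in the offset, which is fine for short strings; for large inputs a ByteString or Text source would probably be a better fit.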
Update: I believe the type constructor AllMatches was introduced to resolve the ambiguity that arises when a regex has subexpressions, e.g.:
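    foo = "axx ayy" =~ "a(.)([^a])"
    -- Note: outside GHCi, foo needs a polymorphic type signature (or
    -- NoMonomorphismRestriction) to be used at two result types.

    test1 = getAllMatches $ (foo :: AllMatches [] (MatchOffset, MatchLength))
    -- [(0,3),(4,3)]
    -- returns the locations of "axx" and "ayy" but no subexpression info

    test2 = foo :: MatchArray
    -- array (0,2) [(0,(0,3)),(1,(1,1)),(2,(2,1))]
    -- returns only the match with "axx"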
Both are essentially lists of offset-length pairs, but they mean different things.
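If you want both things at once, that is, every match together with its subexpression offsets, one option is to go back to matchAll, which returns one MatchArray per match. A minimal sketch, assuming the regex-posix package; allGroups is a hypothetical helper name, not a library function:

```haskell
import Data.Array (elems)
import Text.Regex.Base (makeRegex, matchAll, MatchOffset, MatchLength)
import Text.Regex.Posix (Regex)

-- | Hypothetical helper: for each match, the (offset, length) of the whole
-- match (first element) followed by one pair per parenthesised subexpression.
allGroups :: String -> String -> [[(MatchOffset, MatchLength)]]
allGroups pat input = map elems (matchAll (makeRegex pat :: Regex) input)

main :: IO ()
main = print (allGroups "a(.)([^a])" "axx ayy")
-- [[(0,3),(1,1),(2,1)],[(4,3),(5,1),(6,1)]]
```

The first pair in each inner list is what getAllMatches reports; the remaining pairs correspond to the entries at index 1 and above of each MatchArray.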