javascript - How to make a capture group "absorb" whitespace before/after it without capturing it?
I have a regex, found here. Try it out with the strings below. The problem I'm facing is that there's whitespace at the beginning of each captured group after the 1st one. I need the whitespace to be matched, but I don't need it captured.
Regex:
^(\/[a-zA-Z0-9]+)?(\s~[a-zA-Z]+)?([\w\s'()-]+)?((?:\s~[a-zA-Z]+){0,2})?$
Viewing it at the link above makes it simpler to comprehend.
These are the strings you can paste into the test string area, one by one:
/test ~example matches ~extra ~space
this has ~space ~matched
/like wise this
/and ~this
Take a look at the match groups area and notice that, after the 1st group, the one preceding whitespace between groups is captured.
What I want is this:
For the 1st and 2nd capture groups, I want them to detect the succeeding space and absorb it but not capture it, so that the 3rd capture group won't detect and capture the space. For the 4th capture group, I want it to detect the preceding space and absorb it but not capture it.
What I mean by absorb is that the space gets "removed", in the sense that the 3rd capture group won't realize it's there.
How can I do this?
Thanks.
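To make the problem concrete, here is a small sketch that runs the regex above against the first test string and prints each group with its whitespace visible. The variable names are only illustrative, and the character classes are assumed to be a-zA-Z.

```javascript
// The regex from the question (character classes assumed to be a-zA-Z).
const original = /^(\/[a-zA-Z0-9]+)?(\s~[a-zA-Z]+)?([\w\s'()-]+)?((?:\s~[a-zA-Z]+){0,2})?$/;

const problemMatch = original.exec("/test ~example matches ~extra ~space");

// JSON.stringify adds quotes, so leading spaces are easy to spot.
problemMatch.slice(1).forEach((group, i) => {
  console.log(`group ${i + 1}: ${JSON.stringify(group)}`);
});
// group 1: "/test"
// group 2: " ~example"
// group 3: " matches"
// group 4: " ~extra ~space"
```

Every group after the 1st starts with the space that should have been matched but not captured.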
This is the regex I came up with:
^(\/[a-zA-Z0-9]+)?(?:\s)?(~[a-zA-Z]+)?(?:\s)?([\w\'()\-\s]+)?(?:\s(~[a-zA-Z]+))?(?:\s(~[a-zA-Z]+))?$
Elaborating on the regex in 2 parts, per your requirements:
For the 1st and 2nd capture groups, I want them to detect the succeeding space and absorb it but not capture it, so that the 3rd capture group won't detect and capture the space.
Your regex for the 1st and 2nd groups:
(\/[a-zA-Z0-9]+)?(\s~[a-zA-Z]+)?
So, after each of the first and second capturing groups, I've added a non-capturing (?:\s)?. This consumes the space outside of any capturing group, so the 3rd capturing group does not absorb the preceding space. The regex becomes:
(\/[a-zA-Z0-9]+)?(?:\s)?(~[a-zA-Z]+)?(?:\s)?
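As a quick check of this first half, the sketch below anchors it at the start of the string and runs it on part of the first test string. The names head and headMatch are just for illustration, and the character classes are assumed to be a-zA-Z.

```javascript
// First half of the regex: an optional non-capturing space follows groups 1 and 2.
const head = /^(\/[a-zA-Z0-9]+)?(?:\s)?(~[a-zA-Z]+)?(?:\s)?/;

const headMatch = head.exec("/test ~example matches");

// The overall match includes both spaces, so they are consumed...
console.log(JSON.stringify(headMatch[0])); // "/test ~example "
// ...but neither capture group contains them.
console.log(JSON.stringify(headMatch[1])); // "/test"
console.log(JSON.stringify(headMatch[2])); // "~example"
```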
For the 4th capture group, I want it to detect the preceding space and absorb it but not capture it.
Your regex:
((?:\s~[a-zA-Z]+){0,2})?
Here, the obvious solution is to capture the text part (~[a-zA-Z]+) and leave the \s part uncaptured, like this:
(?:(?:\s(~[a-zA-Z]+)){0,2})?
        ^^^^^^^^^^^^ capturing this.
But this is a repeated capturing group, so each iteration's capture overwrites the previous one. Basically, a repeated capturing group keeps only its last iteration. So if you wanted to match
" ~space ~matched"
it would capture only the last one, "~matched".
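This behaviour is easy to confirm in JavaScript. A minimal sketch, using the repeated form shown above (variable names are illustrative, character classes assumed to be a-zA-Z):

```javascript
// Repeated capturing group: the single group 1 is reused on every iteration.
const repeated = /(?:(?:\s(~[a-zA-Z]+)){0,2})?/;

const repeatedMatch = repeated.exec(" ~space ~matched");

// Both tags are matched...
console.log(JSON.stringify(repeatedMatch[0])); // " ~space ~matched"
// ...but only the capture from the last iteration survives.
console.log(JSON.stringify(repeatedMatch[1])); // "~matched"
```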
So one solution, since you are checking {0,2}, is to explicitly check 2 times, like this:
(?:\s(~[a-zA-Z]+))?(?:\s(~[a-zA-Z]+))?
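A short sketch of the unrolled version, to show that each tag now lands in its own group and that a missing second tag simply comes back as undefined (names are illustrative, character classes assumed to be a-zA-Z):

```javascript
// Two explicit optional groups instead of one group repeated {0,2} times.
const unrolled = /(?:\s(~[a-zA-Z]+))?(?:\s(~[a-zA-Z]+))?/;

const twoTags = unrolled.exec(" ~space ~matched");
console.log(JSON.stringify(twoTags[1])); // "~space"
console.log(JSON.stringify(twoTags[2])); // "~matched"

const oneTag = unrolled.exec(" ~space");
console.log(JSON.stringify(oneTag[1])); // "~space"
console.log(oneTag[2]);                 // undefined
```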
But if the {0,2} requirement later changes, the best solution is to capture the trailing part together with its preceding spaces and then split the captured group on spaces separately.
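A minimal sketch of that capture-then-split approach follows. Note the assumptions: tailRegex and parseTail are hypothetical names, the * quantifier stands in for whatever the changed requirement would be, and the character classes are assumed to be a-zA-Z.

```javascript
// Hypothetical variant: group 4 captures the whole tail, spaces included,
// with no upper bound on the number of ~tags.
const tailRegex = /^(\/[a-zA-Z0-9]+)?(?:\s)?(~[a-zA-Z]+)?(?:\s)?([\w'()\-\s]+)?((?:\s~[a-zA-Z]+)*)$/;

// Hypothetical helper: returns the trailing ~tags as an array, or null if
// the input does not match at all.
function parseTail(input) {
  const m = tailRegex.exec(input);
  if (m === null) return null;
  const tail = m[4] || "";
  // Splitting on whitespace leaves an empty first element because the tail
  // starts with a space, so filter empty strings out.
  return tail.split(/\s+/).filter((tag) => tag.length > 0);
}

console.log(parseTail("/test ~example matches ~extra ~space ~more"));
// [ '~extra', '~space', '~more' ]
console.log(parseTail("/and ~this"));
// []
```

The regex stays the same no matter how many tags are allowed; only the quantifier on group 4 would need to change.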
-> Output when I run the regex on the given strings in JavaScript:
["/test ~example matches ~extra ~space", "/test", "~example", "matches", "~extra", "~space", index: 0, input: "/test ~example matches ~extra ~space"]
["this has ~space ~matched", undefined, undefined, "this has", "~space", "~matched", index: 0, input: "this has ~space ~matched"]
["/like wise this", "/like", undefined, "wise this", undefined, undefined, index: 0, input: "/like wise this"]
["/and ~this", "/and", "~this", undefined, undefined, undefined, index: 0, input: "/and ~this"]
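For anyone who wants to reproduce this, a sketch along these lines should do it. The variable names are illustrative, the character classes are assumed to be a-zA-Z, and the exact console formatting of the match arrays depends on the environment.

```javascript
// The full regex from this answer.
const full = /^(\/[a-zA-Z0-9]+)?(?:\s)?(~[a-zA-Z]+)?(?:\s)?([\w\'()\-\s]+)?(?:\s(~[a-zA-Z]+))?(?:\s(~[a-zA-Z]+))?$/;

// The four test strings from the question.
const samples = [
  "/test ~example matches ~extra ~space",
  "this has ~space ~matched",
  "/like wise this",
  "/and ~this",
];

// exec returns an array: full match first, then groups 1 to 5.
const results = samples.map((sample) => full.exec(sample));

results.forEach((result) => console.log(result));
```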
Hope this helped.