Tuesday, October 14, 2014

Regular expression tips: matching repeated characters

Regular expressions have powerful support for submatches (subexpressions). Here are some useful patterns built on them.

If your user has a sticky keyboard, they may enter something like ‘gooooooooooooooooooooooooooooood’ or ‘happppppppppppy’. How do you remove the duplicated o or p?

You can use (.)\1{1,} to match them. The first group, (.), matches any single character. \1 is a backreference: it matches the same text the first group just captured, a bit like a variable. {1,} means the backreference has to repeat one or more times.

image
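
As a rough sketch of the pattern above in code, here is how it could be used to collapse each run of a repeated character. Python's re module is an assumption (the post does not name a regex engine), and the names REPEATED_CHAR and collapse_repeated_chars are made up for illustration.

```python
import re

# (.)\1{1,} : one captured character, then one or more copies of that same character
REPEATED_CHAR = re.compile(r"(.)\1{1,}")

def collapse_repeated_chars(text):
    """Replace every run of a repeated character with a single copy of it."""
    return REPEATED_CHAR.sub(r"\1", text)

print(collapse_repeated_chars("gooooooood"))  # god
print(collapse_repeated_chars("happppppy"))   # hapy
```

Note that the run is collapsed to one character, so a legitimate double letter is lost too: you get "god" rather than "good". Deciding which doubles to keep would need something extra, such as a dictionary check.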

But if your user has a sticky paste key, the input may show a whole word repeated many times, such as ‘good’ pasted five times in a row.

We can try (.+)\1{1,}. Notice that it matches only four of the five ‘good’s. Why?

image

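Because (.+) is greedy, it captures the longest text that is immediately repeated. Here that is two ‘good’s (‘goodgood’) as the variable, which then occurs a second time, so the match covers four ‘good’s and the fifth is left over. If we have an even number of ‘good’s, it will match them all.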

image
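
The greedy behaviour can be checked with a small sketch. Again this assumes Python's re module, and GREEDY and greedy_match are illustrative names only.

```python
import re

# (.+)\1{1,} : the greedy group grabs the longest text that is immediately repeated
GREEDY = re.compile(r"(.+)\1{1,}")

def greedy_match(text):
    """Return (whole match, captured group) for the first match, or None."""
    m = GREEDY.search(text)
    return (m.group(0), m.group(1)) if m else None

five = "good" * 5
six = "good" * 6

print(greedy_match(five))  # ('goodgoodgoodgood', 'goodgood')
print(greedy_match(six))   # ('goodgoodgoodgoodgoodgood', 'goodgoodgood')
```

With five copies the group settles on 'goodgood', which repeats once and leaves the last 'good' unmatched. With six copies the group is 'goodgoodgood' and the whole input is covered.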

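What if we want to match all of them, that is, a certain word repeated one or more times? We can put a ? after the + inside the subexpression, giving (.+?)\1{1,}, which makes the group non-greedy. The group then captures the shortest text that is immediately repeated, a single word, and the backreference repeats it one or more times.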

image
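
A minimal sketch of the non-greedy version, under the same assumption of Python's re module. LAZY, lazy_match and collapse_repeated_words are hypothetical names chosen for this example.

```python
import re

# (.+?)\1{1,} : the lazy group takes the shortest text that is immediately repeated
LAZY = re.compile(r"(.+?)\1{1,}")

def lazy_match(text):
    """Return (whole match, captured group) for the first match, or None."""
    m = LAZY.search(text)
    return (m.group(0), m.group(1)) if m else None

def collapse_repeated_words(text):
    """Replace each repeated run with a single copy of the captured text."""
    return LAZY.sub(r"\1", text)

five = "good" * 5

print(lazy_match(five))               # ('goodgoodgoodgoodgood', 'good')
print(collapse_repeated_words(five))  # good
```

The group is now just 'good', and the backreference repeats it four more times, so all five copies fall inside one match. One thing to watch: on a single 'good' with nothing pasted after it, the same pattern matches the 'oo' inside the word, so it is best applied where whole-word repetition is actually expected.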
