Tried solving today's using only regex, and it turns out that it's not easy to match a string containing any character consecutively repeated exactly 2 times (but no more).

For example, I want to match "abccd", but not "abcccd".

It's simple for a single character, e.g. the "c" from the example above:

(?<!c)cc(?!c)

This uses negative lookbehind and lookahead to make sure the two c's are not preceded or followed by any other c's.
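
To see the single-character case in action, here is a minimal sketch using Python's `re` module. The thread does not say which regex engine was used, so Python is an assumption, and `has_double_c` is a hypothetical helper name.

```python
import re

# Exactly two consecutive "c" characters that are not part of a longer run.
DOUBLE_C = re.compile(r"(?<!c)cc(?!c)")

def has_double_c(text: str) -> bool:
    """Return True if text contains "cc" not touching another "c"."""
    return DOUBLE_C.search(text) is not None

print(has_double_c("abccd"))   # True
print(has_double_c("abcccd"))  # False
```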

But if you want to make it generic, e.g. match any letter, you might try using backreferences:

(?<!\1)(\w)\1(?!\1)

This produces the error: "lookbehind assertion is not fixed length" since the regex engine cannot know how big the backref is.

Another attempt was using the \K delimiter, which allows for variable-length lookbehinds by matching anything before it and discarding it. However, there is no negative variant of \K, so it's of no use here.


Finally, I tried matching the character in the lookbehind, and while this kind of thing works for a positive lookbehind, e.g. this matches any letter preceded by itself:

(?<=(.))\1

It does not work for negative lookbehind. This does not match anything:

(?<!(.))\1\1
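
The difference between the two can be checked with a short sketch, again assuming Python's `re` module rather than whatever engine was used in the thread. A negative lookbehind only succeeds when its inner pattern fails, so the group inside it is never set by the time `\1` is reached, and a backreference to an unset group cannot match.

```python
import re

# Positive lookbehind: the group captures the previous character,
# and \1 then requires the current character to be the same.
positive = re.compile(r"(?<=(.))\1")

# Negative lookbehind: it succeeds only where (.) fails (the start of the
# string), so group 1 is unset whenever \1 is tried.
negative = re.compile(r"(?<!(.))\1\1")

m = positive.search("abccd")
print(m.group(0), m.start())      # c 3
print(negative.search("abccd"))   # None
```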

So, an hour later I'm no wiser. :) Sounds like it should be pretty easy, and I'm possibly missing something obvious. If anyone knows the solution let me know.

This is trivially solved in code, and I've done so, but I like messing around with regexes and this would make me happy.
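
The code version is not shown in the thread. One possible sketch in Python groups consecutive equal characters and checks for a run of length two. The function name `has_exact_double` is hypothetical.

```python
from itertools import groupby

def has_exact_double(text: str) -> bool:
    """Return True if some character occurs in a run of exactly two."""
    # groupby yields one group per run of consecutive equal characters.
    return any(len(list(run)) == 2 for _, run in groupby(text))

print(has_exact_double("abccd"))   # True
print(has_exact_double("abcccd"))  # False
```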

@ihabunek this would have been my first thought

(?<=(.))[\1]{2}[^\1]

@hirojin I did think of something similar, but [\1] matches the character with ASCII code 1 (base 8), and not the backref. E.g. [\60] would match 0.
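
A quick check of how a backslash followed by digits is read inside square brackets, assuming Python's `re` module (other engines may differ in detail). There it is parsed as an octal character code, never as a backreference.

```python
import re

# Inside a character class, \1 is the character with code 1 (octal).
print(re.fullmatch(r"[\1]", "\x01") is not None)  # True

# Octal 60 is decimal 48, which is the digit "0".
print(re.fullmatch(r"[\60]", "0") is not None)    # True

# So (.)[\1] does not mean "a character followed by itself".
print(re.search(r"(.)[\1]", "cc"))                # None
```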

As far as I can tell, it's not possible to use back-references in square brackets.
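
One workaround that sidesteps the lookbehind entirely is to consume the preceding character instead of looking back at it: either the pair starts the string, or it follows a character that is different from the one after it. This is only a sketch, assuming Python's `re` module and single-line input. The names `EXACT_DOUBLE` and `find_exact_double` are hypothetical.

```python
import re

# (?:^|(.)(?!\1))  start of string, or a character not followed by itself
# (.)\2            some character, repeated once
# (?!\2)           and not repeated a third time
EXACT_DOUBLE = re.compile(r"(?:^|(.)(?!\1))(.)\2(?!\2)")

def find_exact_double(text: str):
    """Return the first character that appears in a run of exactly two, else None."""
    m = EXACT_DOUBLE.search(text)
    return m.group(2) if m else None

print(find_exact_double("abccd"))   # c
print(find_exact_double("abcccd"))  # None
```

Note that the overall match also includes the preceding character when there is one, so the doubled character has to be read from group 2 rather than from the whole match.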
