May 17 2008

How to write a fast catch-all RegExp

Tag: Fixing
Andrea Ercolino @ 18:55:36

In How to write a safe catch-all RegExp I suggested using (?:\w|\W)* to match any character in a regular expression. That expression is certainly correct and safe, and the same holds for its siblings (?:\s|\S)* and (?:\d|\D)*.

If you want to match a large text, though, these expressions are not the best choice. I’ve prepared a simple test page where GeSHi’s engine file, which is almost 120 KB, is matched against the regular expression you enter.

With these expressions, Firefox 2 performs quite well, about 100 ms on my PC, but Internet Explorer 7 takes more than 7 minutes!!!

The best catch-all regular expression is [\w\W]*, which takes about 50 ms in FF2 and 0 ms in IE7!!! (Yes, zero milliseconds.)
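
If you want to repeat the comparison without the test page, something like the following rough sketch should do. It is not the code behind the test page: timeMatch and makeSampleText are hypothetical helpers made up for this example, and the sample text is a generated stand-in of about 120 KB rather than the actual GeSHi file. Timings depend heavily on the engine, so the script only prints what it measures.

```javascript
// Hypothetical helper: runs `pattern` once against `text` and reports
// how many characters were matched and how long the match took.
function timeMatch(pattern, text) {
  const start = Date.now();
  const match = pattern.exec(text);
  const elapsedMs = Date.now() - start;
  return {
    source: pattern.source,
    matchedLength: match ? match[0].length : -1,
    elapsedMs: elapsedMs
  };
}

// Hypothetical stand-in for a large source file: repeats one line of
// filler (including newlines) until the requested size is reached.
function makeSampleText(sizeInChars) {
  const line = "function demo($a, $b) { return $a + $b; } // filler\n";
  const copies = Math.ceil(sizeInChars / line.length);
  return line.repeat(copies).slice(0, sizeInChars);
}

// About 120 KB, roughly the size mentioned in the post.
const sampleText = makeSampleText(120 * 1024);

// The three alternation-based expressions, then the character-class one.
const candidates = [
  /(?:\w|\W)*/,
  /(?:\s|\S)*/,
  /(?:\d|\D)*/,
  /[\w\W]*/
];

const results = candidates.map(function (pattern) {
  return timeMatch(pattern, sampleText);
});

results.forEach(function (r) {
  console.log(r.source + " matched " + r.matchedLength +
              " chars in " + r.elapsedMs + " ms");
});
```

All four expressions should match the whole text, newlines included; only the elapsed time should differ. The alternation forms make the engine choose between two branches for every single character, while the character class consumes each character in one step, which is a plausible reason for the gap.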