Broke my brain with a simple regex

Today I spent WAY too much time debugging why I was getting an empty string from a string replace operation meant to pull spaces, dashes, and underscores out of a string.

string.replace(/[ -_]/g, ' ')

I even escaped the space: string.replace(/[\ -_]/g, ' ')

Still got nuttin.

Finally I escaped everything and it worked: string.replace(/[\ \-\_]/g, ' ')

But why wasn’t it working without escaping?

The dash has a special meaning in brackets

Then I realized… [ -_] wasn’t matching just those three characters. It was matching every character in the ASCII table between space and underscore. See, space is character 32 and underscore is character 95. I’m used to using A-Z and 0-9 in brackets, but it didn’t even occur to me that a dash between a space and an underscore would define a range too.

That range captures all 26 upper-case letters in the English alphabet, all ten digits, and a couple dozen symbols to boot. Since my string was made up of uppercase letters, numbers, and those three symbols, it was just clobbering the whole string. Either escaping the hyphen or putting it first (so it no longer read as x to y) gave me the result I wanted.
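Here is a rough sketch of the fixes side by side. The sample string is made up for illustration, and I am assuming an empty-string replacement since the goal was to pull those characters out. Escaping the hyphen, putting it first, or putting it last should all keep it from being read as a range.

```javascript
// Hypothetical input: uppercase letters, digits, and the three separators.
const fixSample = 'AB-12_CD 34';

const escapedHyphen = /[ \-_]/g; // hyphen escaped, so it is a literal
const hyphenFirst = /[- _]/g;    // hyphen first, nothing to range from
const hyphenLast = /[ _-]/g;     // hyphen last, nothing to range to

// Replacement is assumed to be '' (remove the characters entirely).
const viaEscaped = fixSample.replace(escapedHyphen, '');
const viaFirst = fixSample.replace(hyphenFirst, '');
const viaLast = fixSample.replace(hyphenLast, '');

console.log(viaEscaped); // AB12CD34
console.log(viaFirst);   // AB12CD34
console.log(viaLast);    // AB12CD34
```

Any of the three works. Putting the hyphen first or last saves a backslash, while escaping it is harder to break if someone reorders the class later.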

Strictly speaking, JavaScript regex ranges compare character codes (UTF-16 code units) rather than ASCII as such, but for these characters the values are the same, so the explanation holds: the range from 32 to 95 includes every possible character in my string.
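A quick way to check that explanation is to test every character code from 0 to 127 against the class and see which ones match. This is just a sketch: the sample string is hypothetical, and the replacement is assumed to be an empty string.

```javascript
// The class from the post, without the g flag so test() keeps no state.
const rangeClass = /[ -_]/;

console.log(' '.charCodeAt(0)); // 32
console.log('_'.charCodeAt(0)); // 95

// Collect every code from 0 to 127 whose character matches the class.
const matchedCodes = [];
for (let code = 0; code < 128; code++) {
  if (rangeClass.test(String.fromCharCode(code))) {
    matchedCodes.push(code);
  }
}

const firstCode = matchedCodes[0];
const lastCode = matchedCodes[matchedCodes.length - 1];
console.log(firstCode, lastCode, matchedCodes.length); // 32 95 64

// Hypothetical input made only of uppercase letters, digits, and separators.
const rangeSample = 'AB-12_CD 34';
const clobbered = rangeSample.replace(/[ -_]/g, '');
console.log(JSON.stringify(clobbered)); // ""
```

The matches run from 32 through 95 with no gaps, which covers A-Z (65 to 90) and 0-9 (48 to 57). Lowercase letters start at 97, so they would have survived.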

Between that and using “tag” as the name of a variable for assignment, but using “tags” everywhere else, I got to spend an hour or so debugging my own brain farts this afternoon. UGGH. It’s going to be REALLY tight on the project I wanted to complete by tomorrow. Might need another day or two. And my manager (me) is not the kind of guy who’ll take that delay well.

How’s your Sunday?
