Matching UTF-8 encoded characters with regular expressions

in Programming

Working with accented characters (or any Unicode, non-Latin character for that matter) often poses problems when trying to match them with regular expression functions such as preg_match or preg_replace in PHP. The \w expression is meant to match any word character, but by default it won’t match é or ï in a Unicode (UTF-8) encoded string. […]
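
The excerpt is cut off before the post's own fix, so what follows is only a minimal sketch of one common approach, not necessarily the author's. It assumes the string is valid UTF-8 and uses PCRE's u modifier together with the Unicode property classes \p{L} (letters) and \p{N} (numbers). The helper name isUnicodeWord is hypothetical and chosen for illustration.

```php
<?php
// Hypothetical helper (not from the original post): returns true when $s
// consists only of Unicode letters, Unicode digits, or underscores.
// Assumes $s is valid UTF-8; with the u modifier, preg_match() returns
// false on invalid UTF-8, which this function also reports as false.
function isUnicodeWord(string $s): bool
{
    // u modifier: treat pattern and subject as UTF-8 instead of raw bytes.
    // \p{L} = any Unicode letter, \p{N} = any Unicode number.
    return preg_match('/^[\p{L}\p{N}_]+$/u', $s) === 1;
}

$word = "café";

// Byte-oriented match for comparison: without the u modifier, PCRE sees
// "é" as two separate bytes, and whether they count as word characters
// depends on the locale.
var_dump(preg_match('/^\w+$/', $word));

// UTF-8 aware match using the helper above.
var_dump(isUnicodeWord($word));
```

The same u modifier applies to preg_replace; for example, a pattern such as '/[^\p{L}\p{N}]+/u' could be used to collapse runs of non-letter, non-digit characters without splitting multibyte characters.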