Regex for ASCII Characters

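ASCII (American Standard Code for Information Interchange) is a character encoding standard used widely in computers. Standard ASCII defines 128 characters (hex codes 00 to 7F); 8-bit extended sets add another 128, for 256 codes in total (00 to FF). Regex makes it easy to match any specific ASCII character or a range of characters.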

A regular expression that matches a single ASCII character uses the escape sequence \x00, where 00 can be replaced by any two-digit hexadecimal character code from 00 to FF. A range of ASCII characters can be matched by placing two such codes, separated by a hyphen, inside square brackets.

/[\x00-\xFF]/

The expression lists two ASCII characters in hexadecimal code (00 and FF), enclosed in square brackets [] and separated by a hyphen -. The square brackets and hyphen together indicate that we’ll accept any single character in this range. Each hexadecimal code is introduced by the escape sequence \x, a backslash followed by x.
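As a minimal sketch of how this range behaves in practice, the snippet below assumes Python's built-in re module, which accepts the same \x escapes. The helper name in_full_range is hypothetical and exists only for illustration.

```python
import re

# \xHH names a single character by its two-digit hexadecimal code.
assert re.fullmatch(r"\x41", "A")  # 0x41 is the letter A

# Two codes inside square brackets, joined by a hyphen, form a range.
full_range = re.compile(r"[\x00-\xFF]")


def in_full_range(ch: str) -> bool:
    """Return True if the single character ch has a code from 00 to FF."""
    return full_range.fullmatch(ch) is not None


print(in_full_range("A"))       # True: code 0x41
print(in_full_range("\x00"))    # True: NULL, code 0x00
print(in_full_range("\xff"))    # True: code 0xFF
print(in_full_range("\u0100"))  # False: code 0x100 lies just past the range
```

Python compares Unicode code points here, so the range covers code points 0 through 255 regardless of how the text was originally encoded.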

The expression above will match all characters from NULL (hex code 00) to ÿ (hex code FF, decimal 255), as shown in this article or this list. These characters are divided into three groups:

  • 33 control characters (hex code 00 to 1F as well as 7F)
  • 95 printable characters (hex code 20 to 7E)
  • 128 extended characters (hex codes 80 to FF)
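The group sizes in this list can be checked by counting matches over all 256 codes. This is a sketch assuming Python's re module; the names control, printable, extended, and count_matches are illustrative.

```python
import re

# One character class per group from the list above.
control = re.compile(r"[\x00-\x1F\x7F]")
printable = re.compile(r"[\x20-\x7E]")
extended = re.compile(r"[\x80-\xFF]")

# Every single-character string with a code from 00 to FF.
all_chars = [chr(code) for code in range(0x100)]


def count_matches(pattern: re.Pattern) -> int:
    """Count how many of the 256 characters the pattern accepts."""
    return sum(1 for ch in all_chars if pattern.fullmatch(ch))


counts = {
    "control": count_matches(control),
    "printable": count_matches(printable),
    "extended": count_matches(extended),
}
print(counts)  # {'control': 33, 'printable': 95, 'extended': 128}
```

The three counts add up to 256, so the groups cover the full 00 to FF range without overlap.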

Note that the first 32 characters (00 to 1F) as well as 7F are control characters and can often be omitted. Doing so requires two character ranges that skip these characters:

/[\x20-\x7E\x80-\xFF]/
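One possible use of this two-range expression is filtering control characters out of a string. The sketch below assumes Python's re module; strip_control is a hypothetical helper, not part of any library.

```python
import re

# The two ranges skip 00-1F and 7F, the control characters.
no_control = re.compile(r"[\x20-\x7E\x80-\xFF]")


def strip_control(text: str) -> str:
    """Keep only characters matched by the two ranges.

    Note that this also drops any character with a code above FF.
    """
    return "".join(no_control.findall(text))


sample = "tab\there\x7f, caf\xe9"
print(strip_control(sample))  # tabhere, café
```

The tab (hex code 09) and the delete character (hex code 7F) are removed, while é (hex code E9) survives because it falls in the second range.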

Regex for Printable ASCII Characters

To match only printable ASCII characters, one can restrict the character range to characters from 20 to 7E:

/[\x20-\x7E]/

As seen in this post, a shorthand method to write this is:

/[ -~]/

This expression works by including all characters from the space (hex code 20) to the tilde ~ (hex code 7E). The character with hex code 7F is the “delete” control character, which does not show up in print.
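To illustrate that the hexadecimal form and the shorthand accept the same characters, here is a small sketch assuming Python's re module. The function is_printable_ascii and the + quantifier used to test whole strings are additions for illustration, not part of the expressions above.

```python
import re

hex_form = re.compile(r"[\x20-\x7E]")
shorthand = re.compile(r"[ -~]")


def is_printable_ascii(text: str) -> bool:
    """True if text is non-empty and every character lies between space and tilde."""
    return re.fullmatch(r"[ -~]+", text) is not None


# Compare both spellings on every code from 00 to FF.
same = all(
    bool(hex_form.fullmatch(chr(c))) == bool(shorthand.fullmatch(chr(c)))
    for c in range(0x100)
)
print(same)                                 # True
print(is_printable_ascii("Hello, world!"))  # True
print(is_printable_ascii("line\nbreak"))    # False: newline is a control character
print(is_printable_ascii("del\x7f"))        # False: 7F is the delete character
```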

Regex for Extended ASCII Characters

A regular expression for the extended set of ASCII characters is written by specifying the following character range:

/[\x80-\xFF]/

This will match any character from hex code 80 to hex code FF. In the Windows-1252 code page, these run from € (80) to ÿ (FF).
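Which characters sit at codes 80 to FF depends on the encoding in use. The sketch below assumes Python, where a text pattern compares Unicode code points and a bytes pattern compares raw byte values; the cp1252 codec is Python's name for Windows-1252.

```python
import re

# Text pattern: \x80-\xFF means Unicode code points U+0080 to U+00FF.
extended = re.compile(r"[\x80-\xFF]")

text = "na\xefve caf\xe9 costs 5\u20ac"  # "naïve café costs 5€"
found = extended.findall(text)
print(found)  # ['ï', 'é']  (€ is U+20AC in Unicode, outside the range)

# Bytes pattern: the same range applied to Windows-1252 encoded bytes.
extended_bytes = re.compile(rb"[\x80-\xFF]")
encoded = "5\u20ac".encode("cp1252")
found_bytes = extended_bytes.findall(encoded)
print(found_bytes)  # [b'\x80']  (€ is byte 80 in Windows-1252)
```

The same euro sign is missed in the first case and matched in the second, so it is worth checking whether a given regex engine works on code points or on encoded bytes.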

Regex for Non-ASCII Characters

To accept any character except ASCII characters, we use a negated character range, written with a caret symbol ^ immediately after the opening square bracket:

/[^\x00-\xFF]/
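A short sketch of the negated range, assuming Python's re module. The second pattern, with an upper bound of 7F, is an assumption added for comparison: it treats only the 128 standard ASCII codes as ASCII and is not taken from the expression above.

```python
import re

# Everything outside 00-FF, as in the expression above.
beyond_ff = re.compile(r"[^\x00-\xFF]")

# Assumption: a stricter variant that treats only 00-7F as ASCII.
beyond_7f = re.compile(r"[^\x00-\x7F]")

text = "caf\xe9 \u20ac5 \u4f60\u597d"  # "café €5 你好"
print(beyond_ff.findall(text))  # ['€', '你', '好']
print(beyond_7f.findall(text))  # ['é', '€', '你', '好']
```

The only difference between the two results is é (hex code E9), which belongs to the extended set and is therefore excluded by the first pattern but matched by the second.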

Sources

The regular expressions on this page were adapted from a solution posted here.

Benjamin

Founder, owner, and sole content creator on RegexLand. Enjoys programming, blogging, and teaching others how to do the same.
