UTF-8 Special Char Finder

This tool was created to find hidden UTF-8 characters in your text and was inspired by the question of what LLMs are storing within your text. The tool uses JavaScript and does not send any information to our server, so it runs locally on your machine and your text stays with you.

In addition to showing non-printable Unicode characters, the tool lets you remove special characters and copy the cleaned-up text directly from the site. It's a one-stop shop for cleaning LLM output.


Input Text

Paste your text into this field.




Plain Text

This is the plain text without all the stuff you don't like, cleaned according to the options below. You can decode this text as well, just to make sure that it is really the way you want it.

Change emdash to normal dash ( — > - )
Remove special UTF-8 chars (ZWJ)
Remove emojis
Remove tabulators
Replace tabulator with space
Replace special spaces with a default space (U+2002 ... U+200A)
Remove narrow no-break spaces (U+202F)
Replace non-breaking spaces with a default space (U+00A0)
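
The options above can be sketched as a handful of regular-expression replacements. This is only a minimal illustration, not the code this site runs: the function name cleanText and all option names are made up for the example, and the exact set of zero-width characters removed alongside ZWJ (U+200D) is an assumption.

```javascript
// Minimal sketch of the cleaning options. All names here are hypothetical.
function cleanText(input, options = {}) {
  let out = input;

  // Em dash (U+2014) becomes a plain hyphen-minus.
  if (options.replaceEmDash) out = out.replace(/\u2014/g, "-");

  // Assumption: besides ZWJ (U+200D) we also drop ZWSP, ZWNJ,
  // word joiner (U+2060) and the BOM / zero-width no-break space (U+FEFF).
  if (options.removeZeroWidth) {
    out = out.replace(/[\u200B-\u200D\u2060\uFEFF]/g, "");
  }

  // Emoji: pictographic code points plus the emoji variation selector (U+FE0F).
  // Note that Extended_Pictographic also matches symbols such as (c) U+00A9.
  if (options.removeEmojis) {
    out = out.replace(/\p{Extended_Pictographic}|\uFE0F/gu, "");
  }

  // Tabs: either delete them or turn each one into a single space.
  if (options.removeTabs) out = out.replace(/\t/g, "");
  if (options.replaceTabs) out = out.replace(/\t/g, " ");

  // Typographic spaces U+2002 ... U+200A become a normal space.
  if (options.replaceSpecialSpaces) out = out.replace(/[\u2002-\u200A]/g, " ");

  // Narrow no-break space (U+202F) is removed entirely.
  if (options.removeNarrowSpaces) out = out.replace(/\u202F/g, "");

  // Non-breaking space (U+00A0) becomes a normal space.
  if (options.replaceNbsp) out = out.replace(/\u00A0/g, " ");

  return out;
}
```

Each option is independent, so a call such as cleanText(text, { replaceEmDash: true, replaceNbsp: true }) only touches em dashes and non-breaking spaces and leaves everything else as it was.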

Decoded Text

Here you see all the embedded codes, emojis and special characters.
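
To give an idea of how such a decoded view can be produced, here is a small sketch that walks through the text one code point at a time and replaces everything outside printable ASCII with a visible marker. The function name decodeText and the [U+XXXX] marker format are assumptions made for this example; the output of the tool itself may look different.

```javascript
// Minimal sketch: make every non-ASCII or non-printable character visible.
// decodeText and the [U+XXXX] notation are hypothetical, for illustration only.
function decodeText(input) {
  let out = "";
  // for...of iterates by code point, so emoji outside the BMP stay in one piece.
  for (const ch of input) {
    const cp = ch.codePointAt(0);
    // Keep line feeds and printable ASCII (U+0020 ... U+007E) as they are.
    const isPlain = cp === 0x0a || (cp >= 0x20 && cp <= 0x7e);
    if (isPlain) {
      out += ch;
    } else {
      // Everything else is shown as its code point, padded to at least 4 hex digits.
      out += "[U+" + cp.toString(16).toUpperCase().padStart(4, "0") + "]";
    }
  }
  return out;
}
```

With this sketch, a zero width joiner hidden between two letters shows up as a[U+200D]b, and a tab is shown as [U+0009].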



Links

The general idea came from the SoSciSurvey UTF-8 decoder, which is available on GitHub as well.