Boneyard Tools

Unicode Character Counter

Paste any text to see its size in Unicode code points, UTF-16 code units and UTF-8 bytes, along with counts of grapheme clusters, words and lines. This is useful when emoji, accents or non-Latin scripts make a simple length count misleading.

How to count Unicode characters

  1. Type or paste your text, including any emoji or accented characters, into the box.
  2. Read the live grid of code points, UTF-16 units and UTF-8 bytes.
  3. Check graphemes, words and lines to see how the text is perceived and split.

Examples

An emoji with more than one length

👍
1 code point, 2 UTF-16 units, 4 UTF-8 bytes, 1 grapheme

Accented text

café
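4 code points, 4 UTF-16 units, 5 UTF-8 bytes, 4 graphemes
With é stored as e plus the combining accent U+0301: 5 code points, 5 UTF-16 units, 6 UTF-8 bytes, still 4 graphemes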

Frequently asked questions

What is the difference between code points and UTF-16 units?

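A code point is the number Unicode assigns to a character, written like U+1F44D. UTF-16 code units are the 16-bit values that JavaScript strings are made of, and they are what a string's length property counts. Code points above U+FFFF (most emoji and the mathematical alphanumeric symbols, for example) take two units, called a surrogate pair. So 👍 is one code point but two UTF-16 units.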
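The difference can be sketched in a few lines of TypeScript. This is a minimal illustration, not this tool's actual source; the function name countCodePoints is made up for the example. It walks the UTF-16 units and counts each valid surrogate pair once.

```typescript
// Hypothetical helper: count code points by walking UTF-16 code units
// and treating each valid surrogate pair as a single code point.
function countCodePoints(text: string): number {
  let count = 0;
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    // A high surrogate (0xD800-0xDBFF) followed by a low surrogate
    // (0xDC00-0xDFFF) encodes one code point above U+FFFF.
    if (unit >= 0xd800 && unit <= 0xdbff && i + 1 < text.length) {
      const next = text.charCodeAt(i + 1);
      if (next >= 0xdc00 && next <= 0xdfff) {
        i++; // skip the low half of the pair
      }
    }
    count++; // a lone surrogate still counts as one code point here
  }
  return count;
}

const thumbsUp = "\uD83D\uDC4D"; // U+1F44D written as its surrogate pair
console.log(thumbsUp.length);           // UTF-16 units: 2
console.log(countCodePoints(thumbsUp)); // code points: 1
```

In a modern JavaScript engine, Array.from(text).length should give the same code point count, because string iteration steps by code point.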

Why do UTF-8 bytes differ from the character count?

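UTF-8 uses one byte for ASCII, two bytes for code points up to U+07FF (most accented Latin letters, plus Greek, Cyrillic, Hebrew and Arabic), three bytes for the rest of the range up to U+FFFF (including most CJK characters) and four bytes for everything above U+FFFF, which covers most emoji. Files, databases and HTTP payloads are measured in bytes, so their size often exceeds the visible character count.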
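The byte rule can be written out directly. The sketch below is an illustration under stated assumptions, not this tool's implementation; utf8ByteLength is a hypothetical name. It counts a lone surrogate as three bytes, on the assumption that an encoder would substitute U+FFFD for it.

```typescript
// Hypothetical helper: compute the UTF-8 size of a string from its
// UTF-16 code units, following the 1/2/3/4-byte ranges.
function utf8ByteLength(text: string): number {
  let bytes = 0;
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    if (unit < 0x80) {
      bytes += 1; // ASCII, U+0000 to U+007F
    } else if (unit < 0x800) {
      bytes += 2; // U+0080 to U+07FF
    } else if (
      unit >= 0xd800 && unit <= 0xdbff &&
      i + 1 < text.length &&
      text.charCodeAt(i + 1) >= 0xdc00 &&
      text.charCodeAt(i + 1) <= 0xdfff
    ) {
      bytes += 4; // surrogate pair: one code point above U+FFFF
      i++;        // skip the low half of the pair
    } else {
      // Rest of U+0800 to U+FFFF. A lone surrogate also lands here and is
      // counted as 3 bytes (assumption: it would be replaced by U+FFFD).
      bytes += 3;
    }
  }
  return bytes;
}

console.log(utf8ByteLength("caf\u00E9"));    // 5
console.log(utf8ByteLength("\uD83D\uDC4D")); // 4
```

Where the TextEncoder API is available, new TextEncoder().encode(text).length should report the same number without the manual loop.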

What is a grapheme and why might it differ from code points?

A grapheme (more precisely, a grapheme cluster) is one user-perceived character. Some symbols combine several code points, such as a flag, a skin-toned emoji or a family emoji joined with zero-width joiners. Each of those reads as one grapheme but contains several code points. The grapheme count uses Intl.Segmenter when your browser supports it.
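A grapheme count along these lines might look like the sketch below. It is an assumption-laden illustration rather than this tool's code: countGraphemes is a hypothetical name, returning null when Intl.Segmenter is missing is a choice made for the example (the tool's own fallback is not described here), and the cast through any is only there because older TypeScript library settings may not declare Intl.Segmenter.

```typescript
// Hypothetical helper: count grapheme clusters with Intl.Segmenter.
// Returns null when the runtime has no Intl.Segmenter (assumed fallback).
function countGraphemes(text: string): number | null {
  // Cast to any so this compiles even if the TypeScript lib setting
  // does not include the Intl.Segmenter type declarations.
  const SegmenterCtor = (Intl as any).Segmenter;
  if (typeof SegmenterCtor !== "function") {
    return null;
  }
  const segmenter = new SegmenterCtor(undefined, { granularity: "grapheme" });
  // segment() returns an iterable with one entry per grapheme cluster.
  return Array.from(segmenter.segment(text)).length;
}

// Man + ZWJ + woman + ZWJ + girl: 5 code points, 8 UTF-16 units.
const family = "\uD83D\uDC68\u200D\uD83D\uDC69\u200D\uD83D\uDC67";
console.log(family.length);          // 8
console.log(countGraphemes(family)); // expected 1 where Intl.Segmenter exists
```

The exact cluster boundaries come from the Unicode data shipped with the browser or runtime, so very new emoji sequences may be counted differently on older engines.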

Which count should I use for a database column or API limit?

Check what the system itself counts. Many databases and byte-limited fields count UTF-8 bytes; JavaScript string length and many older APIs count UTF-16 units; human-facing limits usually mean graphemes or code points. This tool shows all of them so you can match the right one.

How are words and lines counted?

Words are runs of non-whitespace characters separated by spaces, tabs or line breaks. Lines are split at every line break, whether CRLF, a lone CR or a lone LF, so a paragraph with no breaks counts as a single line.
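These two rules can be approximated with regular expressions. The sketch below is an illustration only; countWords and countLines are hypothetical names. It assumes that CRLF counts as one break and that JavaScript's \S class is close enough to "non-whitespace", although \S also treats other Unicode spaces, such as the no-break space, as separators. Under this sketch an empty string has one line, which may differ from what the tool reports.

```typescript
// Hypothetical helper: count runs of non-whitespace characters.
function countWords(text: string): number {
  const runs = text.match(/\S+/g); // null when there are no runs at all
  return runs === null ? 0 : runs.length;
}

// Hypothetical helper: split on CRLF first so the pair is one break,
// then on a lone CR or a lone LF.
function countLines(text: string): number {
  return text.split(/\r\n|\r|\n/).length;
}

console.log(countWords("one two\tthree\nfour")); // 4
console.log(countLines("a\r\nb\rc\nd"));         // 4
console.log(countLines("one paragraph, no breaks")); // 1
```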

Is my text private?

Yes. All counting runs entirely in your browser. Nothing you type is uploaded, logged or stored.
