Question 1

Is my file uploaded anywhere?

Accepted Answer

No. The file is read and analyzed entirely in your browser using JavaScript. Nothing is sent to a server, so even confidential text stays on your device.

Question 2

How can it tell UTF-8 from Latin-1 without a BOM?

Accepted Answer

It scans the bytes for the bit patterns UTF-8 requires: a lead byte of the form 110xxxxx, 1110xxxx, or 11110xxx must be followed by exactly one, two, or three continuation bytes respectively, each in the range 0x80 to 0xBF. If every byte at or above 0x80 fits a valid UTF-8 sequence, the file is reported as UTF-8. If those rules are broken but the bytes are otherwise printable, the file is almost certainly in a single-byte code page such as Windows-1252 or ISO-8859-1.
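The structural check described above can be sketched in a few lines of JavaScript. This is a minimal illustration, not the tool's actual code: the function name looksLikeUtf8 is hypothetical, and the sketch only verifies lead and continuation byte patterns. A stricter validator would also reject overlong forms, surrogate code points, and values above U+10FFFF.

```javascript
// Hypothetical helper: true when every byte >= 0x80 is part of a
// structurally valid UTF-8 sequence (lead byte + continuation bytes).
function looksLikeUtf8(bytes) {
  let i = 0;
  while (i < bytes.length) {
    const b = bytes[i];
    let needed;
    if (b < 0x80) needed = 0;                  // 0xxxxxxx: plain ASCII
    else if ((b & 0xe0) === 0xc0) needed = 1;  // 110xxxxx: one continuation byte
    else if ((b & 0xf0) === 0xe0) needed = 2;  // 1110xxxx: two continuation bytes
    else if ((b & 0xf8) === 0xf0) needed = 3;  // 11110xxx: three continuation bytes
    else return false;                         // stray continuation or invalid lead byte
    for (let k = 1; k <= needed; k++) {
      const c = bytes[i + k];
      // A continuation byte must exist and match 10xxxxxx (0x80 to 0xBF).
      if (c === undefined || (c & 0xc0) !== 0x80) return false;
    }
    i += needed + 1;
  }
  return true;
}
```

For example, the bytes 63 61 66 C3 A9 ("café" in UTF-8) pass this check, while 63 61 66 E9 (the same word in Latin-1) fail because E9 looks like a three-byte lead with nothing after it.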

Question 3

What is a byte-order mark (BOM)?

Accepted Answer

A BOM is a short signature at the very start of a file that identifies its encoding. UTF-8 with a BOM begins with EF BB BF, UTF-16 LE with FF FE, and UTF-16 BE with FE FF. UTF-32 uses four bytes: FF FE 00 00 for little-endian and 00 00 FE FF for big-endian. When a BOM is present the encoding is effectively certain, which is why those results show high confidence.
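A BOM check amounts to comparing the first few bytes against known signatures. The sketch below assumes the standard signatures, including the four-byte UTF-32 forms FF FE 00 00 (little-endian) and 00 00 FE FF (big-endian). The function name detectBom and the returned labels are illustrative only.

```javascript
// Hypothetical helper: returns the encoding named by a leading BOM, or null.
function detectBom(bytes) {
  const startsWith = (sig) => sig.every((v, i) => bytes[i] === v);
  // UTF-32 LE must be tested before UTF-16 LE, because FF FE 00 00
  // also begins with the UTF-16 LE signature FF FE.
  if (startsWith([0xff, 0xfe, 0x00, 0x00])) return "UTF-32 LE";
  if (startsWith([0x00, 0x00, 0xfe, 0xff])) return "UTF-32 BE";
  if (startsWith([0xef, 0xbb, 0xbf])) return "UTF-8";
  if (startsWith([0xff, 0xfe])) return "UTF-16 LE";
  if (startsWith([0xfe, 0xff])) return "UTF-16 BE";
  return null; // no BOM: fall back to content-based detection
}
```

The order of the checks matters: testing the longer UTF-32 signature first keeps a UTF-32 LE file from being misreported as UTF-16 LE.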

Question 4

Why does it sometimes report Latin-1 with only medium confidence?

Accepted Answer

Windows-1252 and ISO-8859-1 (Latin-1) assign the same characters to every byte except those in the range 0x80 to 0x9F, and neither carries a marker. A file with no bytes in that range decodes identically under both, so the two cannot be told apart by content alone. The tool reports the family and lowers the confidence to signal that the exact code page is a guess.
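One common heuristic, which may or may not be what this tool does, looks at the range 0x80 to 0x9F. Windows-1252 places printable characters such as curly quotes and the euro sign there, while ISO-8859-1 reserves it for rarely used control codes. A byte in that range is therefore a hint toward Windows-1252, and its absence leaves the question open. The function name and return labels below are hypothetical.

```javascript
// Hypothetical helper: a hint, not a proof, about which single-byte
// code page a non-UTF-8 file is using.
function guessSingleByteCodePage(bytes) {
  for (const b of bytes) {
    // 0x80 to 0x9F: printable in Windows-1252, C1 controls in ISO-8859-1.
    if (b >= 0x80 && b <= 0x9f) return "Windows-1252";
  }
  // Without such bytes, both code pages decode the file identically.
  return "Windows-1252 or ISO-8859-1";
}
```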

Question 5

What does it mean when a file is flagged as binary?

Accepted Answer

A file is treated as binary when it contains a NUL (0x00) byte or an unusually high share of control bytes. Tabs, line feeds, and carriage returns are normal in text and never count against it. Binary files have no meaningful text encoding, so the result is simply Binary.
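The rule above can be sketched as follows. The 10% threshold for "an unusually high share" is an assumption made for illustration, since the actual cutoff is not stated here, and the function name looksBinary is hypothetical.

```javascript
// Hypothetical helper: true when the bytes look like binary data.
// maxControlRatio is an assumed threshold, not the tool's documented value.
function looksBinary(bytes, maxControlRatio = 0.1) {
  if (bytes.length === 0) return false; // an empty file is not flagged
  let control = 0;
  for (const b of bytes) {
    if (b === 0x00) return true; // a single NUL byte is enough
    // Tab (09), line feed (0A), and carriage return (0D) are normal in text.
    const isTextWhitespace = b === 0x09 || b === 0x0a || b === 0x0d;
    if (b < 0x20 && !isTextWhitespace) control++;
  }
  return control / bytes.length > maxControlRatio;
}
```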

Question 6

How does it detect the line ending?

Accepted Answer

It counts the newline styles in the bytes: CRLF (the bytes 0D 0A, common on Windows), a lone LF (0A, common on macOS and Linux), and a lone CR (0D, used by classic Mac OS). If only one style appears, it is reported by name; if several appear, the result is mixed; if there are no line breaks, it is none.
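A counting pass of this kind might look like the sketch below. It assumes a single-byte or UTF-8 encoding, where 0D and 0A always mean CR and LF; UTF-16 or UTF-32 text would need to be scanned by code unit instead. The function name detectLineEnding is illustrative.

```javascript
// Hypothetical helper: returns "CRLF", "LF", "CR", "mixed", or "none".
function detectLineEnding(bytes) {
  let crlf = 0, lf = 0, cr = 0;
  for (let i = 0; i < bytes.length; i++) {
    if (bytes[i] === 0x0d) {
      if (bytes[i + 1] === 0x0a) {
        crlf++;
        i++; // skip the LF so the pair is counted once
      } else {
        cr++; // lone CR
      }
    } else if (bytes[i] === 0x0a) {
      lf++; // lone LF
    }
  }
  const seen = [["CRLF", crlf], ["LF", lf], ["CR", cr]].filter(([, n]) => n > 0);
  if (seen.length === 0) return "none";
  if (seen.length > 1) return "mixed";
  return seen[0][0];
}
```

Consuming the LF of each CRLF pair is the key detail: without it, every Windows line break would also be counted as a lone LF and the file would be reported as mixed.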

Text File Encoding Detector

