MIME

Multipurpose Internet Mail Extensions. The format that gives email rich content: text + HTML alternatives, attachments, non-ASCII characters, multilingual headers. Defined in RFCs 2045–2049 (1996); RFC 5322 covers the basic message format MIME extends.

Without MIME, email would be plain ASCII text only. With it, you get inboxes that look like inboxes.

The basic structure

An email is a list of headers, a blank line, then a body. The body is either a single piece of content or a tree of parts ("multipart"), recursively.

From: alice@example.com
To: bob@example.com
Subject: hello
Date: Fri, 1 May 2026 09:00:00 +0000
Message-ID: <abc123@example.com>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="boundary42"

--boundary42
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

Hello, Bob!

--boundary42
Content-Type: text/html; charset=utf-8
Content-Transfer-Encoding: quoted-printable

<p>Hello, =E2=9C=A8 Bob!</p>

--boundary42--

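Two parts (text and HTML alternatives) sit under a multipart/alternative container. Boundary lines delimit the parts: each is the boundary string prefixed with --, and the final one also ends with --. Each part has its own headers describing its type and how its bytes are encoded.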
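As a minimal sketch, the sample message can be parsed with Python's standard email package. The RAW string is an abridged copy of the example (Date and Message-ID are left out for brevity), and the variable names are illustrative only.

```python
from email import message_from_string, policy

# Abridged copy of the sample message from this section.
RAW = """\
From: alice@example.com
To: bob@example.com
Subject: hello
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="boundary42"

--boundary42
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

Hello, Bob!

--boundary42
Content-Type: text/html; charset=utf-8
Content-Transfer-Encoding: quoted-printable

<p>Hello, =E2=9C=A8 Bob!</p>

--boundary42--
"""

msg = message_from_string(RAW, policy=policy.default)

# walk() visits the container first, then each part depth-first.
structure = [part.get_content_type() for part in msg.walk()]

# get_body() picks the preferred alternative; get_content() undoes the
# transfer encoding and decodes the charset of that leaf part.
html_part = msg.get_body(preferencelist=("html",))
html_text = html_part.get_content()

print(structure)
print(html_text)
```

The policy.default argument selects the modern EmailMessage API; without it the parser returns the older Message class, which lacks get_body() and get_content().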

Content types

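Top-level MIME types (RFC 2046):

text: human-readable text (text/plain, text/html)
image, audio, video: media (image/png, audio/mpeg, video/mp4)
application: arbitrary binary or application-specific data (application/pdf, application/octet-stream)
multipart: a container of other parts (multipart/mixed, multipart/alternative, multipart/related)
message: an encapsulated message (message/rfc822)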

Inboxes typically nest: multipart/mixed → multipart/alternative (text + HTML) + attachments.
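The typical nesting can be reproduced with Python's EmailMessage builder. This is a sketch: the subject, the filename, the placeholder attachment bytes, and the outline() helper are all made up for illustration.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "alice@example.com"
msg["To"] = "bob@example.com"
msg["Subject"] = "report"

# Start as text/plain, then add an HTML alternative: the message becomes
# multipart/alternative with two children.
msg.set_content("Hello, Bob!")
msg.add_alternative("<p>Hello, Bob!</p>", subtype="html")

# Adding an attachment wraps the alternative container in multipart/mixed.
msg.add_attachment(
    b"%PDF-1.4 placeholder bytes",  # hypothetical payload, not a real PDF
    maintype="application",
    subtype="pdf",
    filename="report.pdf",
)

def outline(part, depth=0):
    """Return one indented line per MIME part, depth-first."""
    lines = ["  " * depth + part.get_content_type()]
    for child in part.iter_parts():
        lines.extend(outline(child, depth + 1))
    return lines

tree = outline(msg)
print("\n".join(tree))
```

The printed outline has multipart/mixed at the root, with multipart/alternative (holding text/plain and text/html) and application/pdf beneath it.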

Content-Transfer-Encoding

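Email infrastructure historically handled only 7-bit ASCII. MIME ships binary content by encoding it:

7bit: ASCII only, short lines; no encoding applied (the default)
8bit: short lines that may contain bytes above 127; no encoding applied
binary: arbitrary bytes with no line-length limit; no encoding applied
quoted-printable: mostly-ASCII text, with other bytes escaped as =XX
base64: arbitrary bytes as ASCII, at roughly 33% size overhead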

A modern parser handles all of them transparently.
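To see what a parser undoes behind the scenes, here is a small sketch using Python's quopri and base64 modules directly. The quoted-printable input is the HTML line from the sample message; the base64 input is an arbitrary test pattern chosen for illustration.

```python
import base64
import quopri

# quoted-printable: ASCII stays readable, other bytes become =XX escapes.
qp_wire = b"<p>Hello, =E2=9C=A8 Bob!</p>"
qp_text = quopri.decodestring(qp_wire).decode("utf-8")

# base64: any byte sequence becomes ASCII, 4 output characters per 3 bytes.
raw = bytes(range(256))             # every possible byte value
b64_wire = base64.encodebytes(raw)  # wrapped into 76-character lines
b64_back = base64.decodebytes(b64_wire)

print(qp_text)
print(len(raw), "bytes ->", len(b64_wire), "bytes on the wire")
```

In practice these calls are rarely needed: a message parser reads Content-Transfer-Encoding and applies the matching decoder for each part.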

Header encoding (RFC 2047)

Email headers must be ASCII at the wire level. Non-ASCII text in headers (UTF-8 subjects, names with accents) is encoded with the encoded-word syntax:

Subject: =?UTF-8?Q?Hello_=E2=9C=A8_Bob!?=
From: =?UTF-8?B?44GT44KT44Gr44Gh44Gv?= <user@example.com>

Format: =?<charset>?<encoding>?<encoded-text>?= where encoding is Q (a quoted-printable variant in which _ stands for a space) or B (base64).

Decoding is per-fragment; one header may have multiple encoded-words mixed with ASCII.
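A sketch of decoding the two example headers with Python's standard library, assuming the header values have already been split from their field names. The variable names are illustrative.

```python
from email import policy
from email.header import decode_header, make_header

subject_wire = "=?UTF-8?Q?Hello_=E2=9C=A8_Bob!?="
from_wire = "=?UTF-8?B?44GT44KT44Gr44Gh44Gv?= <user@example.com>"

# Low-level view: decode_header() splits a value into (bytes, charset)
# fragments; make_header() joins them back into a readable string.
fragments = decode_header(subject_wire)
subject = str(make_header(fragments))

# Higher-level view: the default policy decodes encoded-words while it
# parses the address structure of the header.
sender = policy.default.header_factory("From", from_wire).addresses[0]

print(fragments)
print(subject)
print(sender.display_name, sender.addr_spec)
```

The second approach is usually preferable for address headers, because the display name is decoded separately from the address itself.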

Address parsing

From, To, Cc, Bcc, Reply-To use a structured format. Examples:

To: alice@example.com
To: Alice Aardvark <alice@example.com>
To: "Aardvark, Alice" <alice@example.com>, bob@example.com, "Bob B" <bob@example.com>
Cc: alice@example.com, group:bob@example.com,carol@example.com;

Display names may be quoted, and must be when they contain special characters such as commas. Multiple addresses are comma-separated. Group syntax (name:addr1,addr2;) is rare but spec-compliant.

Parsing is non-trivial because the RFC 5322 grammar is rich: quoted strings, comments, groups, and obsolete forms all interact. Use a library; don't roll your own.
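As one example of leaning on a library, Python's standard email package can parse the address lists shown in this section. The parse_addresses() helper below is a hypothetical convenience wrapper, not part of the library.

```python
from email import policy

def parse_addresses(name, value):
    """Parse an address header value with the stdlib's RFC 5322 parser."""
    return policy.default.header_factory(name, value)

to = parse_addresses(
    "To",
    '"Aardvark, Alice" <alice@example.com>, bob@example.com, '
    '"Bob B" <bob@example.com>',
)
cc = parse_addresses(
    "Cc", "alice@example.com, group:bob@example.com,carol@example.com;"
)

# The comma inside the quoted display name does not split the address.
to_pairs = [(a.display_name, a.addr_spec) for a in to.addresses]

# .addresses flattens groups; .groups keeps the grouping.
cc_specs = [a.addr_spec for a in cc.addresses]
cc_group_sizes = [len(g.addresses) for g in cc.groups]

print(to_pairs)
print(cc_specs, cc_group_sizes)
```

A naive split on commas would break the first To entry in two, which is the kind of bug a real parser avoids.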

Date format

Date: Fri, 01 May 2026 09:00:00 +0000
Date: 1 May 2026 09:00:00 GMT
Date: Fri, 01 May 26 09:00 +0100

RFC 5322 §3.3 defines the syntax. Edge cases include obsolete two-digit years (26 instead of 2026), a missing day-of-week, named timezones (PST and EST are ambiguous), and plain lies (servers with wrong clocks).

For threading and sorting, mail clients usually fall back to delivery time (e.g., Gmail's internalDate) when parsing fails or the date is implausible.
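The fallback can be sketched as follows, using Python's email.utils for parsing. The effective_date() function, its delivered_at and max_skew parameters, and the two-day tolerance are assumptions made for illustration; real clients choose their own plausibility rules.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

def effective_date(header_value, delivered_at, max_skew=timedelta(days=2)):
    """Return the Date header as a datetime, or delivered_at if the header
    is unparseable or further than max_skew from the delivery time."""
    try:
        parsed = parsedate_to_datetime(header_value)
    except (TypeError, ValueError):  # raised for unparseable values
        return delivered_at
    if parsed.tzinfo is None:        # "-0000" yields a naive datetime
        parsed = parsed.replace(tzinfo=timezone.utc)
    if abs(parsed - delivered_at) > max_skew:
        return delivered_at          # implausible: trust the server clock
    return parsed

delivered = datetime(2026, 5, 1, 9, 0, 5, tzinfo=timezone.utc)

ok = effective_date("1 May 2026 09:00:00 GMT", delivered)
old_style = effective_date("Fri, 01 May 26 09:00 +0100", delivered)
garbage = effective_date("sometime last week", delivered)
wrong_clock = effective_date("Thu, 01 Jan 1970 00:00:00 +0000", delivered)

print(ok, old_style, garbage, wrong_clock, sep="\n")
```

The first two values come from the header; the last two fall back to the delivery time.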

What mxr does

Per mxr's crates/mail-parse/src/lib.rs:

Common pitfalls

See also