Unix mbox format

This document describes the format traditionally used by Unix hosts to store mail messages locally. mbox files typically reside in the system's mail spool, under various names in users' Mail directories, and under the name mbox in users' home directories.

An mbox is a text file containing an arbitrary number of e-mail messages. Each message consists of a postmark, followed by an e-mail message formatted according to RFC 822. The file format is line-oriented. Lines are separated by line feed characters (ASCII 10).
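The line-oriented layout can be illustrated with a minimal Python sketch that splits mbox text into messages. The function name split_mbox is hypothetical, and the sketch assumes that every line starting with "From " is a postmark, which only holds if such lines in message bodies have been quoted.

```python
def split_mbox(text):
    """Split mbox text into (postmark, message) pairs.

    Assumption: any line starting with "From " begins a new message.
    """
    messages = []
    postmark = None
    lines = []
    for line in text.split("\n"):  # lines are separated by LF (ASCII 10)
        if line.startswith("From "):
            if postmark is not None:
                messages.append((postmark, "\n".join(lines)))
            postmark = line
            lines = []
        elif postmark is not None:
            lines.append(line)
        # Lines before the first postmark are ignored in this sketch.
    if postmark is not None:
        messages.append((postmark, "\n".join(lines)))
    return messages
```

Each returned pair holds the raw postmark line and the text of the RFC 822 message that follows it.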

A postmark line consists of the four characters "From", followed by a space character, followed by the message's envelope sender address, followed by whitespace, and followed by a time stamp. The sender address is expected to be an addr-spec as defined in appendix D of RFC 822.

The date is expected to be formatted according to the following syntax (represented in the augmented Backus-Naur formalism used by RFC 822):

mbox-date = weekday month day time [ timezone ] year
weekday   = "Mon" / "Tue" / "Wed" / "Thu" / "Fri"
          / "Sat" / "Sun"
month     = "Jan" / "Feb" / "Mar" / "Apr" / "May"
          / "Jun" / "Jul" / "Aug" / "Sep"
          / "Oct" / "Nov" / "Dec"
day       = 1*2DIGIT
time      = 1*2DIGIT ":" 1*2DIGIT [ ":" 1*2DIGIT ]
timezone  = ( "+" / "-" ) 4DIGIT
year      = ( 4DIGIT / 2DIGIT )

For compatibility with legacy software, two-digit years greater than or equal to 70 should be interpreted as the years 1970 through 1999, while two-digit years less than 70 should be interpreted as the years 2000 through 2069.
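The two-digit year rule can be written as a small Python sketch. The function name expand_year is hypothetical; it takes the year field as text so that "00" and "2000" can be told apart.

```python
def expand_year(year_text):
    """Return the full year for a 4-digit or 2-digit year field."""
    if not year_text.isdigit() or len(year_text) not in (2, 4):
        raise ValueError("year must be 2 or 4 digits: %r" % year_text)
    value = int(year_text)
    if len(year_text) == 4:
        return value
    # Two-digit years: 70..99 map to 1970..1999, 00..69 map to 2000..2069.
    return 1900 + value if value >= 70 else 2000 + value
```

For example, expand_year("70") gives 1970 and expand_year("69") gives 2069.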

Software reading files in this format should also be prepared to accept non-numeric timezone information such as "CET DST" for Central European Time, daylight saving time.

Example:

From noone@example.org Fri Jun 23 02:56:55 2000
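A postmark line such as the one above could be recognized with a regular expression. The following Python sketch is one possible reading of the grammar, not a reference parser: the names POSTMARK_RE and parse_postmark are hypothetical, the fields are assumed to be separated by whitespace, the sender is simplified to a run of non-whitespace characters, and non-numeric timezones are assumed to be one or more alphabetic words.

```python
import re

# One possible regular expression for the postmark line (an assumption,
# not part of the format description itself).
POSTMARK_RE = re.compile(
    r"^From (?P<sender>\S+)\s+"
    r"(?P<weekday>Mon|Tue|Wed|Thu|Fri|Sat|Sun)\s+"
    r"(?P<month>Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s+"
    r"(?P<day>\d{1,2})\s+"
    r"(?P<time>\d{1,2}:\d{1,2}(?::\d{1,2})?)\s+"
    # Optional timezone: numeric ("+0200") or words such as "CET DST".
    r"(?:(?P<timezone>[+-]\d{4}|[A-Za-z]+(?:\s+[A-Za-z]+)*)\s+)?"
    r"(?P<year>\d{4}|\d{2})$"
)


def parse_postmark(line):
    """Return a dict of postmark fields, or None if the line does not match."""
    match = POSTMARK_RE.match(line)
    return match.groupdict() if match else None
```

Applied to the example line, parse_postmark returns the sender "noone@example.org" and the year "2000", with the timezone field set to None.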

To avoid misinterpreting lines in message bodies that begin with the four characters "From", followed by a space character, as postmarks, the character ">" is commonly prepended to such lines.
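A minimal Python sketch of this quoting convention follows. The function name quote_from_lines is hypothetical, and the sketch only covers the writing side; it does not attempt to undo the quoting when a message is read back.

```python
def quote_from_lines(body):
    """Prepend ">" to every body line that starts with "From "."""
    return "\n".join(
        ">" + line if line.startswith("From ") else line
        for line in body.split("\n")
    )
```

Only lines that begin with "From" followed by a space are changed; a line such as "Fromage" is left alone.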

