PEG email_storage

Author: Marc Schiereck
Last-Modified: 2003-03-31
Revision: 1.1
Date-Created: 2002-10-28
Status: incomplete

Description of a concept for storing mails in Storm.

Issues

Rationale

We need a way to store emails in Storm, especially for the planned email client, so that we can transclude and link them. One way to do this is to put all headers of a mail in one block (for a multipart message this includes the sub-headers of each part) and to put each body in a block of its own, as noted under Issues and described in detail below.

The reason for putting the bodies in separate blocks is that their contents then have xanalogical IDs that do not depend on the mail header. If one receives several mails with the same body but different headers, the body is stored only once, and each header block points to the body's content using a single xanalogical ID.
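The deduplication argument can be illustrated with a small content-addressed store. This is only a sketch: the BlockPool class and the SHA-1 based ID are assumptions made for illustration, not Storm's actual ID scheme or API.

```python
import hashlib


class BlockPool:
    """Hypothetical in-memory stand-in for a Storm pool (not the real API)."""

    def __init__(self):
        self.blocks = {}

    def add(self, data: bytes) -> str:
        # Assumption: the ID is derived from the content alone, so
        # identical content always maps to the same ID.
        block_id = hashlib.sha1(data).hexdigest().upper()
        self.blocks[block_id] = data
        return block_id


pool = BlockPool()
body = b"Sample message.\n"

# Two mails with different headers but the same body content.
id_a = pool.add(body)
id_b = pool.add(body)
```

Both calls yield the same ID, and the pool holds the content only once.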

It should also be possible to fetch emails on different computers and to synchronize the Storm pool.

Description

Non-multipart messages

When a non-multipart message is saved, its header and body are separated and each is stored in its own block. To preserve the connection between header and body, the content type of the email is changed to message/external-body. In addition, a new access-type called x-storm is introduced. It has a parameter, block, which holds the ID of the block containing the body.

The Content-Type of the header block is message/rfc822. After the block's header, the following is inserted:

Content-Type: message/external-body;
              access-type="x-storm";
              block="<ID>"

where <ID> is the ID of the body's block. The original mail header follows, with two fields added: Content-ID: xxx and Content-Transfer-Encoding: binary.
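Reading the reference back could look like the following sketch. It uses Python's standard email package to parse the inserted Content-Type field. The helper name body_block_id and the ID value EXAMPLE-ID are hypothetical placeholders.

```python
from email import message_from_string


def body_block_id(external_body_header):
    """Return the 'block' parameter of an x-storm external-body header,
    or None if the header does not reference a Storm block."""
    msg = message_from_string(external_body_header)
    if msg.get_content_type() != "message/external-body":
        return None
    if msg.get_param("access-type") != "x-storm":
        return None
    return msg.get_param("block")


header = (
    "Content-Type: message/external-body;\n"
    '              access-type="x-storm";\n'
    '              block="EXAMPLE-ID"\n'
    "\n"
)
print(body_block_id(header))
```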

The body-block's header has the following fields, and only these:

Content-Type: <content-type of the mail>
Message-ID: <message-ID of the mail, if existent>
Content-Transfer-Encoding: <...>

An example:

E-Mail:

From: Marc Schiereck <kanonensau@gmx.net>
To: gzz developers list <gzz-dev@mail.freesoftware.fsf.org>
Subject: Sample message
Message-ID: <id@foo.org>
Content-Type: text/plain

Sample message.

Header block:

Content-Type: message/rfc822

Content-Type: message/external-body;
              access-type="x-storm";
              block="<ID>"

From: Marc Schiereck <kanonensau@gmx.net>
To: gzz developers list <gzz-dev@mail.freesoftware.fsf.org>
Subject: Sample message 
Message-ID: <id@foo.org>
Content-ID: storm:block:01A1F452AE2B3441AB2234B1A2378B24DFA12B3212
Content-Type: text/plain    
Content-Transfer-Encoding: binary

Body Block:

Content-Type: text/plain
Message-ID: <id@foo.org>

Sample message.
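The split shown in this example can be sketched in a few lines of Python. The function names are hypothetical, and several points are assumptions rather than part of the proposal: line endings are plain "\n", folded header lines are ignored, a missing Content-Type falls back to text/plain, the added fields are appended after the original header, and Content-ID is taken to be storm:block: followed by the body block's ID, as the example suggests. The body block follows the example and carries only Content-Type and Message-ID.

```python
def split_headers(mail):
    """Return (list of header lines, body) for a raw mail string."""
    head, _, body = mail.partition("\n\n")
    return head.split("\n"), body


def get_field(header_lines, name):
    """Return the value of the first field called `name`, or None."""
    prefix = name.lower() + ":"
    for line in header_lines:
        if line.lower().startswith(prefix):
            return line[len(prefix):].strip()
    return None


def make_body_block(mail):
    """Build the body block: Content-Type, optional Message-ID, body."""
    header_lines, body = split_headers(mail)
    content_type = get_field(header_lines, "Content-Type") or "text/plain"
    fields = ["Content-Type: %s" % content_type]
    message_id = get_field(header_lines, "Message-ID")
    if message_id is not None:
        fields.append("Message-ID: %s" % message_id)
    return "\n".join(fields) + "\n\n" + body


def make_header_block(mail, body_id):
    """Build the header block that points at the body block `body_id`."""
    header_lines, _ = split_headers(mail)
    lines = [
        "Content-Type: message/rfc822",
        "",
        "Content-Type: message/external-body;",
        '              access-type="x-storm";',
        '              block="%s"' % body_id,
        "",
    ]
    lines += header_lines
    lines += [
        # Assumption: Content-ID names the body block, as in the example.
        "Content-ID: storm:block:%s" % body_id,
        "Content-Transfer-Encoding: binary",
    ]
    return "\n".join(lines) + "\n"


mail = (
    "From: Marc Schiereck <kanonensau@gmx.net>\n"
    "Subject: Sample message\n"
    "Message-ID: <id@foo.org>\n"
    "Content-Type: text/plain\n"
    "\n"
    "Sample message.\n"
)
body_block = make_body_block(mail)
# In practice the ID would come from storing body_block; a placeholder is used.
header_block = make_header_block(mail, "EXAMPLE-ID")
```

The body block is built first because the header block can only be written once the body block's ID is known.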

Multipart messages

Storing multipart messages works similarly to storing non-multipart messages. There is one block for the headers and one block for each part's body.

The Content-Type of the header block is message/rfc822. The header of the message itself stays untouched; only the headers of the individual parts are changed. Each is replaced by:

Content-Type: message/external-body;
              access-type="x-storm";
              block="<block-ID>"

<replaced header>
Content-ID: xxx
Content-Transfer-Encoding: binary

where <block-ID> is the ID of the body's block.

The body blocks have the same format as the body blocks of non-multipart messages.

An example:

E-Mail:

From: Marc Schiereck <kanonensau@gmx.net>
To: gzz developers list <gzz-dev@mail.freesoftware.fsf.org>
Subject: Sample message 
Content-type: multipart/mixed;
              boundary="boundary" 
Message-ID: <id1@foo.org>

--boundary
Content-Type: text/plain

part1 
--boundary
Content-Type: text/plain

part2
--boundary--

Header block:

Content-Type: message/rfc822

From: Marc Schiereck <kanonensau@gmx.net>
To: gzz developers list <gzz-dev@mail.freesoftware.fsf.org>
Subject: Sample message 
Content-type: multipart/mixed;
              boundary="boundary" 
Message-ID: <id1@foo.org>

--boundary
Content-Type: message/external-body;
              access-type="x-storm";
              block="<ID-part1>"

Content-Type: text/plain
Content-ID: storm:block:01A1F452AE2B3441AB2234B1A2378B24DFA12B3213
Content-Transfer-Encoding: binary
 
--boundary
Content-Type: message/external-body;
              access-type="x-storm";
              block="<ID-part2>"

Content-Type: text/plain
Content-ID: storm:block:01A1F452AE2B3441AB2234B1A2378B24DFA12B3214
Content-Transfer-Encoding: binary

--boundary--

Body block 1:

Content-Type: text/plain
Message-ID: <id1@foo.org>

part1

Body block 2:

Content-Type: text/plain
Message-ID: <id1@foo.org>

part2
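The multipart case can be sketched along the same lines, using Python's standard email package to find the boundary and the parts. The function names and the placeholder IDs are hypothetical. The sketch assumes "\n" line endings, handles only top-level parts (no nested multiparts), and takes Content-ID to be storm:block: followed by the part's block ID, as the example suggests.

```python
from email import message_from_string


def part_body_blocks(mail):
    """Return one body block (a string) per top-level part."""
    msg = message_from_string(mail)
    if not msg.is_multipart():
        raise ValueError("not a multipart message")
    blocks = []
    for part in msg.get_payload():
        fields = ["Content-Type: %s" % part.get("Content-Type", "text/plain")]
        if msg["Message-ID"] is not None:
            # As in the example, each body block carries the mail's Message-ID.
            fields.append("Message-ID: %s" % msg["Message-ID"])
        blocks.append("\n".join(fields) + "\n\n" + part.get_payload())
    return blocks


def multipart_header_block(mail, block_ids):
    """Build the header block; block_ids holds one ID per part, in order."""
    msg = message_from_string(mail)
    boundary = msg.get_boundary()
    top_headers = mail.partition("\n\n")[0]  # message header, kept verbatim
    lines = ["Content-Type: message/rfc822", "", top_headers, ""]
    for part, block_id in zip(msg.get_payload(), block_ids):
        lines += [
            "--" + boundary,
            "Content-Type: message/external-body;",
            '              access-type="x-storm";',
            '              block="%s"' % block_id,
            "",
        ]
        # The replaced part header follows the external-body reference.
        lines += ["%s: %s" % (name, value) for name, value in part.items()]
        lines += [
            "Content-ID: storm:block:%s" % block_id,
            "Content-Transfer-Encoding: binary",
            "",
        ]
    lines.append("--" + boundary + "--")
    return "\n".join(lines) + "\n"


mail = (
    "From: Marc Schiereck <kanonensau@gmx.net>\n"
    "Subject: Sample message\n"
    "Content-Type: multipart/mixed;\n"
    '              boundary="boundary"\n'
    "Message-ID: <id1@foo.org>\n"
    "\n"
    "--boundary\n"
    "Content-Type: text/plain\n"
    "\n"
    "part1\n"
    "--boundary\n"
    "Content-Type: text/plain\n"
    "\n"
    "part2\n"
    "--boundary--\n"
)
body_blocks = part_body_blocks(mail)
# The IDs would come from storing the body blocks; placeholders are used here.
header_block = multipart_header_block(mail, ["ID-part1", "ID-part2"])
```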

Implementation

This functionality will be implemented in a Jython module, gzz.modules.email.converter.

To store a single mail the following function will be defined:

storeMail(mail, mediaserver)

mail is a string holding the complete mail, header and body; the two are separated inside the function.
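A possible shape for storeMail, restricted to non-multipart mails, is sketched below in plain Python 3 rather than Jython. The mediaserver interface is not described in this document, so FakeMediaserver and its add_block method are purely hypothetical stand-ins, as is the SHA-1 based block ID. The sketch also assumes "\n" line endings and ignores folded header lines.

```python
import hashlib


class FakeMediaserver:
    """Hypothetical stand-in; the real mediaserver API is not shown here."""

    def __init__(self):
        self.blocks = {}

    def add_block(self, data):
        # Assumption: the block ID is a hash of the block's bytes.
        block_id = hashlib.sha1(data.encode("utf-8")).hexdigest().upper()
        self.blocks[block_id] = data
        return block_id


def storeMail(mail, mediaserver):
    """Store a non-multipart mail; return the ID of its header block."""
    head, _, body = mail.partition("\n\n")

    # 1. Store the body first: the header block must name the body's ID.
    body_fields = [
        line for line in head.split("\n")
        if line.lower().startswith(("content-type:", "message-id:"))
    ]
    body_id = mediaserver.add_block("\n".join(body_fields) + "\n\n" + body)

    # 2. Build the header block around the original header and store it.
    header_block = (
        "Content-Type: message/rfc822\n\n"
        "Content-Type: message/external-body;\n"
        '              access-type="x-storm";\n'
        '              block="%s"\n\n' % body_id
        + head + "\n"
        + "Content-ID: storm:block:%s\n" % body_id
        + "Content-Transfer-Encoding: binary\n"
    )
    return mediaserver.add_block(header_block)


mail = (
    "From: Marc Schiereck <kanonensau@gmx.net>\n"
    "Subject: Sample message\n"
    "Message-ID: <id@foo.org>\n"
    "Content-Type: text/plain\n"
    "\n"
    "Sample message.\n"
)
server = FakeMediaserver()
header_id = storeMail(mail, server)
```

After the call, the stand-in server holds two blocks: the body block and the header block that refers to it.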