The proprietary "SmartMail" IMAP server likes to send no untagged SEARCH
response when the set of matching email messages is empty.
This was brought up as sf.net support request #3213272.
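A minimal sketch of the tolerant handling, assuming the result is read
through Python's imaplib, which hands back [None] as data when the server
sent no untagged response. The helper name message_numbers is hypothetical
and not taken from archivemail.

```python
def message_numbers(search_data):
    """Return the message numbers from the data part of a SEARCH result.

    Hypothetical helper: a missing untagged SEARCH response ([None]) is
    treated the same as an empty one ([b'']).
    """
    if not search_data or search_data[0] is None:
        return []
    return search_data[0].split()
```

With this, a server that stays silent about an empty result set and one
that sends an empty "* SEARCH" line both yield an empty list.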
This fixes test suite failures with Python 2.7.
Starting with Python 2.7, gzip.GzipFile subclasses io.IOBase.
The seek() method of io.IOBase differs from file.seek() and the old
gzip.GzipFile.seek() in that it returns the new file position, not None.
And in Python 2.7, gzip.GzipFile.tell() is inherited from
io.IOBase.tell(), which is implemented using its seek() method.
FixedGzipFile subclasses gzip.GzipFile and overrides seek(); therefore,
this method needs to be adapted for this change in the interface.
Closes: #3314293.
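To illustrate the interface contract, here is a small sketch written for
Python 3. The class SeekableGzipFile and the function demo() are
hypothetical; this is not the code of FixedGzipFile, it only shows that an
overriding seek() has to hand the new position back to its caller.

```python
import gzip
import io


class SeekableGzipFile(gzip.GzipFile):
    """Hypothetical subclass, not archivemail's FixedGzipFile."""

    def seek(self, offset, whence=io.SEEK_SET):
        # io.IOBase.seek() must return the new absolute position.
        # Returning None, as file.seek() did, would break callers,
        # including any tell() that is built on top of seek().
        new_position = super().seek(offset, whence)
        return new_position


def demo():
    raw = io.BytesIO()
    with gzip.GzipFile(fileobj=raw, mode="wb") as f:
        f.write(b"hello world")
    raw.seek(0)
    with SeekableGzipFile(fileobj=raw, mode="rb") as f:
        position = f.seek(6)
        return position, f.tell(), f.read()
```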
* docbook2{man,html} used to generate temporary files; the new XML tool
xsltproc does not, so we can drop the corresponding cleanup rule.
* The `bdist_rpm' rule for building rpm packages was broken for a long
time, and therefore commented out. The distutils bug that broke the
rule is now fixed, but I'm removing the rule nevertheless because it's
useless.
* The `upload' rule no longer works; drop it.
* Update .PHONY
The LIST reply handling in imap_find_mailboxes() was buggy in several
ways. It expected a non-NIL hierarchy delimiter, and it assumed that the
mailbox name in a LIST reply is always quoted; but it can be an astring, a
quoted string, or a literal (see RFC 3501 for the definitions of these
tokens). These variants should now all be interpreted correctly.
In addition, quoted mailbox names are now handled more strictly to the
letter of the RFC; this only affects mailbox names containing a " or a
backslash, though.
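A rough sketch of such a parser, assuming the LIST reply has already been
decoded to a str and arrived on a single line. The names LIST_RE, unquote
and parse_list_reply are made up for this example and are not the ones used
in imap_find_mailboxes(). Literals are not covered here; imaplib delivers
those as a separate tuple element.

```python
import re

# flags, then hierarchy delimiter (NIL or a quoted char), then mailbox name
LIST_RE = re.compile(
    r'^\((?P<flags>[^)]*)\) '
    r'(?P<delim>NIL|"(?:\\.|[^"\\])")'
    r' (?P<name>.+)$'
)


def unquote(token):
    """Undo IMAP quoting; only \\" and \\\\ are escapes in a quoted string."""
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        return re.sub(r'\\(["\\])', r'\1', token[1:-1])
    return token  # an atom is taken as it is


def parse_list_reply(line):
    """Return (flags, delimiter or None, mailbox name) for one LIST reply."""
    match = LIST_RE.match(line)
    if match is None:
        raise ValueError("unparsable LIST reply: %r" % line)
    delim = match.group('delim')
    delim = None if delim == 'NIL' else unquote(delim)
    return match.group('flags').split(), delim, unquote(match.group('name'))
```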
The only non-obvious code change required for this is due to the fact that
computing the archive names has to move into the format-specific archiving
functions, because they can no longer be derived from the mailbox name
beforehand.
In particular:
* we no longer use shutil.copy{,2} to write back a changed mbox
* having temporary mbox files in the same directory as the originals doesn't
make sense anymore since we no longer commit them with rename(2)
* the --archive-name option is now implemented
Document the --archive-name option, and explain the basic idea of deriving the
archive filename from the mailbox earlier and more prominently. Also document
how archivemail tries not to create hidden archive files, and remove some
obsolete notes.
When archiving a mailbox with leading dots in the name and no archive name
prefix specified, strip the dots off the archive name. This is targeting
Maildir++ subfolders.
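As a sketch of the idea (default_archive_base is a hypothetical helper, not
a function in archivemail):

```python
import os


def default_archive_base(mailbox_path):
    """Hypothetical helper: derive an archive base name that is not hidden."""
    name = os.path.basename(mailbox_path.rstrip("/"))
    # A Maildir++ subfolder such as ".lists.python" would otherwise
    # produce a hidden archive file.
    return name.lstrip(".")
```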
Technically, this works just like the --suffix option. This commit also
updates the manpage accordingly.
Currently, the prefix is not checked for slashes, so it could contain path
components. (The same applies to the suffix, btw.) Since the expanded
string is prepended to the archive base name, this can be used to dynamically
configure the archive directory, depending on the archive cutoff date. I'm
not sure if this can be considered a reasonable feature, though.
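The effect can be sketched as follows. The function expand_archive_name and
its parameters are assumptions made for this example, and it uses UTC only
to keep the result reproducible.

```python
import time


def expand_archive_name(base, cutoff_time, prefix="", suffix="_archive"):
    """Hypothetical helper: expand strftime directives in prefix and suffix
    against the archive cutoff date and wrap them around the base name."""
    when = time.gmtime(cutoff_time)  # UTC keeps the example deterministic
    return time.strftime(prefix, when) + base + time.strftime(suffix, when)
```

With a prefix of "%Y/" and a cutoff date in 1970, the archive for "inbox"
would end up as "1970/inbox_archive", that is, in a per-year directory.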
IMAP servers (Dovecot and UW-IMAP at least) may store mailbox meta data for
mboxes in a pseudo message. Such messages are now detected and never archived.
This commit includes a test case in the test suite.
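One possible way to spot such a message, shown as a sketch only. It assumes
that the pseudo message carries an X-IMAP or X-IMAPbase header; the names
PSEUDO_HEADERS and looks_like_pseudo_message are hypothetical, and the
actual check in archivemail may use other criteria.

```python
import email

# Assumption: the servers mark their bookkeeping message with one of these.
PSEUDO_HEADERS = ("X-IMAP", "X-IMAPbase")


def looks_like_pseudo_message(msg):
    """Return True if msg (an email.message.Message) carries such a header."""
    return any(header in msg for header in PSEUDO_HEADERS)
```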
If we don't have sufficient permissions to create a dotlock for an mbox file,
record that, and don't try to remove the dotlock when unlocking the mbox
later.
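In outline, this could look like the sketch below. The class DotlockingMbox
and its attribute has_dotlock are hypothetical names chosen for the example.

```python
import errno
import os


class DotlockingMbox:
    """Hypothetical sketch of an mbox that remembers if it owns a dotlock."""

    def __init__(self, path):
        self.lock_path = path + ".lock"
        self.has_dotlock = False

    def dotlock(self):
        try:
            # O_EXCL makes the call fail if the lock file already exists.
            fd = os.open(self.lock_path,
                         os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o444)
        except OSError as exc:
            if exc.errno == errno.EACCES:
                # No permission to create files here: record that we
                # hold no dotlock and carry on without one.
                self.has_dotlock = False
                return
            raise
        os.close(fd)
        self.has_dotlock = True

    def dotlock_unlock(self):
        # Only remove a lock file that we created ourselves.
        if self.has_dotlock:
            os.unlink(self.lock_path)
            self.has_dotlock = False
```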
Better not to write "soon there will be... <foo>" and not to be specific about
available versions. Writing it more generically means less maintenance. :)
On Unix, most scripts don't come with a file extension; it's not needed, and
we distribute the script as "archivemail" anyway. And most importantly, I
like it better without the extension. :)
With a little trick we can still load the script as a module from the test
suite.
Notable items that are now resolved or implemented:
* archives are now locked
* the mbox classes have been refactored to a cleaner design
* we moved from flock locking to fcntl
* the setuid() feature is long gone
* symlink attacks for tempfiles are not possible (that is really an
ancient TODO item from the original author)
* the test suite now has a lot of maildir test cases
I just discovered that archivemail's MH support is broken with respect to
message flags, and in my opinion it doesn't make much sense to test
known-broken functionality.
In fact there may well be zero archivemail users with MH mailboxes; MH is
basically an obsolete format, and any archivemail user with MH mailboxes would
probably have complained about lost message flags.
This code is complex, too complex actually. Rename some methods and
variables, rework some code and add some explanatory comments in order to
make it at least a bit easier to understand.
This should minimize the risk of data loss. Flushing a locked mbox file
before unlocking it also ensures that there's no window when another
process could lock the mbox after us, but still see the old content.
The mbox locking methods move into a new class LockableMboxMixin, and the
Mbox and ArchiveMbox classes become subclasses of LockableMboxMixin.
class StaleFiles is updated to handle multiple dotlock files.
In particular:
* If writing the archived messages to the final archive fails, try to
restore the archive and abort (by not handling the exception). This is
possible since we first save the archive, and only then the modified
mailbox, so we don't corrupt the original mbox in this case.
* If writing a modified mbox file fails, save the temporary copy.
The RetainMbox and ArchiveMbox classes are now gone, mainly because their
finalise() methods were messing with the archived mbox and the archive,
respectively, which was not good OO design.
The core functionality of the finalise() methods of both removed classes
is moved to the objects that are manipulated: the Mbox class representing
the mbox that is being archived gains a new method overwrite_with(), and
there is a new class ArchiveMbox that represents the actual archive, which
has an append() method (yes, unfortunately the new class has the same name
as the removed class).
The RetainMbox instance is replaced with a TempMbox, and the ArchiveMbox
instance either with a TempMbox, or a CompressedTempMbox if archive
compression is enabled.
Finally, a compressed TempMbox is now implemented as a subclass of
TempMbox, named CompressedMbox.
Cooperation with the StaleFiles class moves into the TempMbox class.
This means slightly less detailed verbose cleanup reporting, oh well.
We used to create a dotlock file first and then lock with fcntl; swap that
order, since locking first with fcntl seems to be more common.
This patch also adds general mbox lock/unlock methods, which call the
dotlock and fcntl-lock methods, and moves the retry logic there.
When the dotlock and fcntl methods fail to acquire a lock, they now raise
a custom exception "LockUnavailable", which gets caught in the general
lock() method. That way, if we succeed in acquiring one lock but fail to
acquire the other, we can release our locks at the upper level and retry.
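The control flow described here can be sketched like this. Only the
exception name LockUnavailable comes from this patch; the class
LockableFile, its method names and the attempts/delay parameters are
assumptions made for the example, which needs a Unix system for fcntl.

```python
import fcntl
import os
import time


class LockUnavailable(Exception):
    """Raised when a lock is currently held by someone else."""


class LockableFile:
    """Hypothetical sketch; not archivemail's actual locking code."""

    def __init__(self, path, attempts=5, delay=1.0):
        self.path = path
        self.lock_path = path + ".lock"
        self.fileobj = open(path, "r+b")
        self.attempts = attempts
        self.delay = delay

    def _fcntl_lock(self):
        try:
            fcntl.lockf(self.fileobj, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            raise LockUnavailable("fcntl lock is held by another process")

    def _dotlock(self):
        try:
            fd = os.open(self.lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            raise LockUnavailable("dotlock file already exists")
        os.close(fd)

    def lock(self):
        for attempt in range(self.attempts):
            try:
                self._fcntl_lock()       # fcntl lock first ...
                try:
                    self._dotlock()      # ... then the dotlock
                except LockUnavailable:
                    # Got one lock but not the other: release and retry.
                    fcntl.lockf(self.fileobj, fcntl.LOCK_UN)
                    raise
                return
            except LockUnavailable:
                if attempt + 1 < self.attempts:
                    time.sleep(self.delay)
        raise LockUnavailable("gave up locking %s" % self.path)

    def unlock(self):
        os.unlink(self.lock_path)
        fcntl.lockf(self.fileobj, fcntl.LOCK_UN)
```

Keeping the retry loop in the general lock() method means the two low-level
methods never have to know about each other.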
These helper methods provide success verification after test archiving runs, and
test case setup. This is a tradeoff: because these methods need to support all
scenarios in one place, they introduce some new complexity - but they replace a
lot of tedious, very similar, but still not entirely identical code all over the
place.
TestArchiveMboxPreserveStatus actually doesn't test that the message
status is preserved, but that the --preserve-unread option works.
Rename it to TestArchiveMboxPreserveUnread.
The test suite used to run a lot of triple tests, by first calling
archivemail.archive() directly, and then running the entire archivemail
script twice, once with long and once with short options. But we already
test option processing separately, and beyond that, archivemail.main()
essentially just calls archive() for each mailbox in turn. So we just drop
all runs of the entire archivemail script from the test suite, giving it a
huge speed boost (on my old iBook, running the test suite drops from 73 to
5 seconds).
os.utime() uses the utimes(2) system call to set file timestamps. utimes(2)
has a microsecond resolution, but stat(2) may return timestamps with
nanosecond resolution. So, the check that we have properly reset the mbox
file timestamp must allow a minor deviation.
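Such a check might look like the following sketch. The helper name
timestamps_match and the tolerance of one millisecond are assumptions for
this example, not values taken from the test suite.

```python
import os


def timestamps_match(path, atime, mtime, tolerance=1e-3):
    """Hypothetical check: compare the timestamps of path with the expected
    values, allowing for the resolution lost when setting them."""
    st = os.stat(path)
    return (abs(st.st_atime - atime) <= tolerance and
            abs(st.st_mtime - mtime) <= tolerance)
```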
* Make the finalise() methods spot if they have anything to do
* We used to create the temporary mbox files on demand in the message
processing loop, if we needed to write to them. Now we create them
beforehand, but only if they might be needed (e.g. we don't create an
archive if options.delete_old_mail is set).
* The above combined makes the final committing of the changes simpler (a
*lot* simpler for mboxes), and we can dump the Mbox.leave_empty() method.
When committing a changed mbox, don't use os.rename(), and don't open/close
the mbox file to truncate it to zero length. Locking was pretty much broken
before -- at least in theory a quite severe bug.
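The general technique of rewriting a file through the descriptor that is
already open (and locked) can be sketched as below. The function
overwrite_in_place is a hypothetical helper for illustration and not the
method used in archivemail.

```python
import os
import shutil


def overwrite_in_place(target, source):
    """Hypothetical helper: replace the content of the open (and locked)
    file object target with the content of source, without renaming or
    reopening target, so that locks held on it stay valid."""
    source.seek(0)
    target.seek(0)
    shutil.copyfileobj(source, target)
    target.truncate()          # cut off leftovers of a longer old content
    target.flush()
    os.fsync(target.fileno())  # push the data to disk before unlocking
```

A rename would swap in a new inode that nobody holds a lock on, and
reopening the file to truncate it would give up the lock in between.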
* Remove code duplication: restore the mbox timestamps once and for all when
we're done
* Don't bother restoring the file mode when finishing, since this is handled in
RetainMbox.finalise() (and needs to be)
* Therefore, rename Mbox.reset_stat() to reset_timestamps()
This is now before we do the sanity checking, so in verbose mode, we don't error
out before having said that we now turn attention to the current mailbox.
This should also protect people relying on the old setuid feature.
If the mailbox is local, by checking the ownership we necessarily check for
existence.
I don't think anybody wants to archive folders in shared or public IMAP
namespaces, so we don't bother checking all possible namespaces. The code was
ugly anyway.
archivemail development has moved to git. This patch updates the project
webpage, removes the subversion $Id$ keyword that was stored in
archivemail.__svn_id__, and updates the Makefile.
* Automatically add NAMESPACE prefix to the mailbox path if necessary.
* Explicitly check for guessed mailbox names with LIST instead of just trying
to SELECT them.
* Updated documentation about NAMESPACE handling.
* look for the timestamp of the latest 'Received' header before resorting to
'Date' or 'Resent-Date'.
* let 'Resent-date' header take precedence over 'Date'.
Document these changes in manpage and changelog.
Closes: #1481316, #1764855, Debian bug #272666.
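The resulting header precedence, as a sketch under the assumption that the
topmost 'Received' header is the latest one and that its timestamp follows
the last semicolon. The names _parse and delivery_time are hypothetical.

```python
import email
from email.utils import mktime_tz, parsedate_tz


def _parse(date_string):
    """Return seconds since the epoch, or None if unparsable."""
    parsed = parsedate_tz(date_string) if date_string else None
    return mktime_tz(parsed) if parsed else None


def delivery_time(msg):
    """Hypothetical helper: pick the timestamp of an email.message.Message."""
    received = msg.get_all("Received", [])
    if received:
        # Each MTA prepends its header, so the first one is the latest.
        stamp = _parse(received[0].rpartition(";")[2].strip())
        if stamp is not None:
            return stamp
    # 'Resent-Date' takes precedence over 'Date'.
    for header in ("Resent-Date", "Date"):
        stamp = _parse(msg.get(header))
        if stamp is not None:
            return stamp
    return None
```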
* implement --all (?)
* implement --include-draft (?)
* consider using target directories for temporary files, this might spare us
one copy if they reside on other filesystems than /tmp
* fallback if an IMAP server doesn't implement SEARCH (?)
Some notable changes:
* don't make so much noise about archivemail being a python program;
* add little box with current version information;
* partly reworded for a more friendly, inviting tone (hopefully);
* removed some superfluous links to trivial information like the python.org
website;
* link changelog and TODO file to HEAD in the svn browser instead of using
(obsolete) copies;
* warmly encourage svn access;
* drop dead link to article about archivemail.
* mark message 'old' iff it's not \Recent (drop requirement that it's
unread; this probably confused mutt's message status flags in the index
with mbox status flags).
* a message not \Seen and not \Recent was marked as 'N', but there is no
such thing as an mbox status flag 'N'.
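The corrected mapping, as a small sketch. The function mbox_status is a
hypothetical name, and mapping \Seen to 'R' follows the usual mbox
convention rather than anything stated above.

```python
def mbox_status(imap_flags):
    """Hypothetical helper: map IMAP flags to an mbox Status header value."""
    status = ""
    if "\\Seen" in imap_flags:
        status += "R"   # read (assumed from the common mbox convention)
    if "\\Recent" not in imap_flags:
        status += "O"   # old, regardless of whether it has been read
    return status
```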
* Move paragraph about archiving IMAP folders before the option list
* Added subsection about IMAP URL handling
* Added IMAP example command line
user_error() and unexpected_error(). If archivemail is used as a module, let
the functions raise the corresponding exceptions rather than writing to stderr
and calling sys.exit().
deleted, so it's idempotent and e.g. doesn't stomp over someone else's files if
invoked twice and running as root. Currently I don't see how this could happen,
but it will with a per-testcase cleanup.
Should just serve as a last security fallback, since we operate in a safe
temporary directory and everything should be okay anyway, but that may be less
obvious. :-)
Derive all testcases that create temporary files from the new class
TestCaseInTempdir, which provides standard fixtures to set up a secure temporary
root directory for tempfiles and cleaning up afterwards. This also simplifies
the code.
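A base class of this kind could look roughly like the sketch below. Only
the name TestCaseInTempdir is taken from this commit; the attribute
tempdir, the directory prefix and ExampleTest are assumptions, and the real
fixtures in the test suite may differ.

```python
import os
import shutil
import tempfile
import unittest


class TestCaseInTempdir(unittest.TestCase):
    """Sketch of a base class providing a private scratch directory."""

    def setUp(self):
        # mkdtemp() creates a directory that only its owner can access.
        self.tempdir = tempfile.mkdtemp(prefix="archivemail-test-")

    def tearDown(self):
        shutil.rmtree(self.tempdir, ignore_errors=True)


class ExampleTest(TestCaseInTempdir):
    def test_scratch_file(self):
        path = os.path.join(self.tempdir, "mbox")
        with open(path, "w") as f:
            f.write("From nobody\n")
        self.assertTrue(os.path.exists(path))
```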
This addresses Debian bug #385253, and reading the BTS log, it seems this issue
was assigned CVE-2006-4245, although I cannot find any further reference to that
CVE. Note that the bug was initially reported to affect archivemail itself,
too. This is not correct. There *are* race conditions with archivemail, but
they were not the subject of that report, and are not that critical.
Also bumped python dependency to version 2.3 since we use tempfile.mkstemp() and
other recent stuff.
date directives in the suffix to the current date, but rev. 94 changed that to
the archive cutoff date. Based on analysis by Peter Poeml. Thanks, Peter.
Since we already run docbook2man, we build-depend on that package anyway, and
the current, hand-crafted jade command fails on Debian systems (wrong path to
stylesheet).
Locale's appropriate date and time representation.</para></listitem>
<listitem><para><option>%d</option>
Day of the month as a decimal number [01,31].</para></listitem>
<listitem><para><option>%H</option>
Hour (24-hour clock) as a decimal number [00,23].</para></listitem>
<listitem><para><option>%I</option>
Hour (12-hour clock) as a decimal number [01,12].</para></listitem>
<listitem><para><option>%j</option>
Day of the year as a decimal number [001,366].</para></listitem>
<listitem><para><option>%m</option>
Month as a decimal number [01,12].</para></listitem>
<listitem><para><option>%M</option>
Minute as a decimal number [00,59].</para></listitem>
<listitem><para><option>%p</option>
Locale's equivalent of either AM or PM.</para></listitem>
<listitem><para><option>%S</option>
Second as a decimal number [00,61].</para></listitem>
<listitem><para><option>%U</option>
Week number of the year (Sunday as the first day of the week) as a decimal number [00,53]. All days in a new year preceding the first Sunday are considered to be in week 0.</para></listitem>
<listitem><para><option>%w</option>
Weekday as a decimal number [0(Sunday),6].</para></listitem>
<listitem><para><option>%W</option>
Week number of the year (Monday as the first day of the week) as a decimal number [00,53]. All days in a new year preceding the first Monday are considered to be in week 0.</para></listitem>
<listitem><para><option>%x</option>
Locale's appropriate date representation.</para></listitem>
<listitem><para><option>%X</option>
Locale's appropriate time representation.</para></listitem>
<listitem><para><option>%y</option>
Year without century as a decimal number [00,99].</para></listitem>
<listitem><para><option>%Y</option>
Year with century as a decimal number.</para></listitem>
<listitem><para><option>%Z</option>
Time zone name (or no characters if no time zone exists).</para></listitem>