I just discovered that archivemails MH support is broken with respect to
message flags, and in my opinion it doesn't make much sense to test
known-broken functionality.
In fact there may well be zero archivemail users with MH mailboxes; MH is
basically an obsolete format, and any archivemail user with MH mailboxes would
probably have complained about lost message flags.
This code is complex, too complex actually. Rename some methods and
variables, rework some code and and add some explaining comments in order to
make it it least a bit easier to understand.
This should minimize the risk of data loss. Flushing a locked mbox file
before unlocking it also ensures that there's no window when another
process could lock the mbox after us, but still see the old content.
The mbox locking methods move into a new class LockableMboxMixin, and the
Mbox and ArchiveMbox classes become subclasses of LockableMboxMixin.
class StaleFiles is updated to handle multiple dotlock files.
In particular:
* If writing the archived messages to the final archive fails, try to
restore the archive and abort (by not handling the exception). This is
possible since we first save the archive, and only then the modified
mailbox, so we don't corrupt the original mbox in this case.
* If writing a modified mbox file fails, save the temporary copy.
The RetainMbox and ArchiveMbox classes are now gone, mainly because their
finalise() methods were messing with the archived mbox and the archive,
respectively, which was not good OO design.
The core functionality of the finalise() methods of both removed classes
is moved to the objects that are manipulated: the Mbox class representing
the mbox that is being archived gains a new method overwrite_with(), and
there is a new class ArchiveMbox that represents the actual archive, which
has an append() method (yes, unfortunately the new class has the same name
like the removed class).
The RetainMbox instance is replaced with a TempMbox, and the ArchiveMbox
instance either with a TempMbox, or a CompressedTempMbox if archive
compression is enabled.
Finally, a compressed TempMbox is now a implemented as a subclass of
TempMbox, named CompressedMbox.
Cooperation with the StaleFiles class moves into the TempMbox class.
This means slightly less detailed verbose cleanup reporting, oh well.
We used to create a dotlock file first and then lock with fcntl; swap that
order, since locking first with fcntl seems to be more common.
This patch also adds general mbox lock/unlock methods, which call the
dotlock and fcntl-lock methods, and moves the retry logic there.
When the dotlock and fcntl methods fail to acquire a lock, they now raise
a custom exception "LockUnavailable", which gets caught in the general
lock() method. That way, if we succeed to acquire one lock but fail to
acquire the other, we can release our locks at the upper level and retry.
These helper methods provide success verification after test archiving runs, and
test case setup. This is a tradeoff: because these methods need to support all
scenarios in one place, they introduce some new complexity - but they replace a
lot of tedious, very similar, but still not entirely identical code all over the
place.
TestArchiveMboxPreserveStatus actually doesn't test that the message
status is preserved, but that the --preserve-unread option works.
Rename it to TestArchiveMboxPreserveUnread.
The test suite used to run a lot of triple tests, by first calling
archivemail.archive() directly, and then running the entire archivemail
script twice, once with long and once with short options. But we already
test option processing seperately, and beyond that, archivemail.main()
essentially just calls archive() for each mailbox in turn. So we just drop
all runs of the entire archivemail script from the test suite, giving it a
huge speed boost (on my old iBook, running the test suite drops from 73 to
5 seconds).
os.utime() uses the utimes(2) system call to set file timestamps. utimes(2)
has a microsecond resolution, but stat(2) may return timestamps with
nanosecond resolution. So, the check that we have properly reset the mbox
file timestamp must allow a minor deviation.
* Make the finalise() methods spot if they have anything to do
* We used to create the temporary mbox files on demand in the message
processing loop, if we needed to write to them. Now we create them
beforehand, but only if they might be needed (e.g. we don't create an
archive if options.delete_old_mail is set).
* The above combined makes the final committing of the changes simpler (a
*lot* simpler for mboxes), and we can dump the Mbox.leave_empty() method.
When committing a changed mbox, don't use os.rename(), and don't open/close
the mbox file to truncate it to zero length. Locking was pretty much broken
before -- at least in theory a quite severe bug.
* Remove code duplication: restore the mbox timestamps once and for all when
we're done
* Don't bother restoring the file mode when finishing, since this is handled in
RetainMbox.finalise() (and need be)
* Therefore, rename Mbox.reset_stat() to reset_timestamps()
This is now before we do the sanity checking, so in verbose mode, we don't error
out before having said that we now turn attention to the current mailbox.