Training spam with doveadm

Posted Aug 13, 2023

By Steven Haigh


A while ago, I posted about training the SpamAssassin Bayes filter with Proxmox Mail Gateway (PMG). That’s really easy when you’re using Maildir, as each email message is its own file.

With Maildir, we can simply cat each file in a folder and ignore the fact that it is part of an IMAP mailbox. However, what happens if you use something other than Maildir, like one of the newer mailbox formats? The same approach no longer works, as each email is probably not a file of its own anymore.

For example, dbox is Dovecot’s own high-performance mailbox format. It comes in two flavours: sdbox (one message per file) and mdbox (multiple messages per file).

If we use mdbox, we can no longer treat each file as a single message, nor can we tell which folder is which from the on-disk layout. So we have to get smarter.

Using doveadm, we can search for messages in a mailbox, fetch each one, and feed it into PMG through our previously configured script, just as before. The main advantage is that this works with any mail storage backend.

This simple bash script goes through every user’s Spam or INBOX/Spam folder, fetches each message, feeds it into the learning system, and then removes it from the user’s mailbox.

        
      
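	#!/bin/bash
	# Hostname of the Proxmox Mail Gateway that receives the spam reports.
	MAILFILTER=my.pmg.install.example.com
	shopt -s nullglob

	# With -A, each result line is: username, mailbox GUID, message UID.
	doveadm search -A mailbox Spam OR mailbox INBOX/Spam | while read -r user guid uid; do
		tmpfile="/tmp/spam.$guid.$uid"
		# Fetch the full message; tail skips the first line of doveadm's output.
		doveadm fetch -u "$user" text mailbox-guid "$guid" uid "$uid" | tail -n +2 > "$tmpfile"
		# ssh reads the message from the temp file, not from the loop's input.
		if ! ssh root@"$MAILFILTER" report < "$tmpfile"; then
			echo "Error running sa-learn. Aborting." >&2
			rm -f "$tmpfile"
			exit 1
		fi
		rm -f "$tmpfile"
		# Only expunge once the message has been reported successfully.
		doveadm expunge -u "$user" mailbox-guid "$guid" uid "$uid"
	done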
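
Because the script expunges every message it reports, it can be worth previewing what the search matches before running it for real. The following is a minimal sketch, not part of the original setup: `count_spam_per_user` is a hypothetical helper name, and it assumes each result line has the `username guid uid` shape that the loop above reads.

```shell
#!/bin/bash
# Hypothetical dry-run helper: reads "username guid uid" lines on stdin
# and prints one "username count" line per user, sorted by username.
count_spam_per_user() {
	awk '{ count[$1]++ } END { for (u in count) print u, count[u] }' | sort
}

# Usage (nothing is fetched or expunged):
#   doveadm search -A mailbox Spam OR mailbox INBOX/Spam | count_spam_per_user
```

If the counts look far larger than expected, check the folder names in the search query before letting the real script delete anything.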

Use it with the scripts and general configuration from the previous article, and it should work with every mail storage format supported by Dovecot.

Cron it to run every 5 minutes or so, and you’re done! Nice and easy.
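
As a rough sketch of that cron setup, the helper below prints an `/etc/cron.d` style entry that runs the script every 5 minutes. The function name `cron_entry`, the script path, and the lock file location are all assumptions for illustration. Wrapping the command in `flock -n` is an optional extra so that a slow run is skipped rather than overlapped by the next one.

```shell
#!/bin/bash
# Hypothetical helper: print a cron.d entry for the training script.
# The default script path and the lock file path are assumptions.
cron_entry() {
	local script=${1:-/usr/local/sbin/train-spam.sh}
	# flock -n exits immediately if the previous run still holds the lock.
	printf '*/5 * * * * root flock -n /run/lock/train-spam.lock %s\n' "$script"
}

# Usage (as root):
#   cron_entry > /etc/cron.d/train-spam
```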

This post is licensed under CC BY 4.0 by the author.
