Integrate DSPAM into postfix + dovecot + any mail client

Recently I noticed that my SpamAssassin setup had stopped working correctly for some reason. At first I didn't see it, then I didn't care much as Thunderbird is still 99% right, but when using Roundcube while "Thunderbird at home" is shut down, it became more and more annoying. I rechecked my setup twice, started all over, re-trained it for almost a week, all to no avail.

So I looked around for alternatives. DSPAM. There is nothing else, really. To say one thing upfront: it works from the start, even while still in its training phase.

The benefits:

  • Mail client agnostic for retraining. I use dovecot2-antispam for that.
  • Instant retrain. No cronjobs.
  • Low system resource consumption
  • Near-to-perfect detection rates
  • Better insight through dspam_stats and syslog features 
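The detection-rate claim can be checked later via dspam_stats, which prints per-user counters. A small sketch of turning such counters into an overall rate with awk; the numbers and the one-line format here are made up for illustration, not real dspam output:

```shell
# Hypothetical dspam_stats-style counters (values and layout are assumptions);
# awk computes the share of correctly classified messages.
echo 'TP:   812 TN:  4310 FP:     3 FN:    27' | awk '{
  tp = $2; tn = $4; fp = $6; fn = $8
  printf "detection rate: %.1f%%\n", 100 * (tp + tn) / (tp + tn + fp + fn)
}'
# -> detection rate: 99.4%
```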

After half a day of struggling, here is my setup. Big thanks to – it just needed some minor adjustments on a recent Arch Linux box.

[toc:number Contents]

Install required packages

This pulls in dovecot2-antispam-hg from the AUR. Be aware that dovecot-antispam-git does NOT work with dovecot 2.x!

yaourt -S dspam dovecot2-antispam-hg

Setup dspam

Setup the database

As user postgres:

createuser -P -S -R -E -D dspam
createdb -O dspam dspam
psql -U dspam -d dspam < /usr/share/dspam/pgsql/pgsql_objects.sql
psql -U dspam -d dspam < /usr/share/dspam/pgsql/virtual_users.sql

Configure dspam

Edit /etc/dspam/dspam.conf. The important stuff is:

StorageDriver /usr/lib/dspam/
TrustedDeliveryAgent "/usr/sbin/sendmail"
UntrustedDeliveryAgent "/usr/lib/dovecot/deliver -d %u"
TrainingMode toe
Tokenizer osb
Preference "trainingMode=TOE"           # { TOE | TUM | TEFT | NOTRAIN } -> default:teft
Preference "spamAction=deliver"         # { quarantine | tag | deliver } -> default:quarantine
Preference "statisticalSedation=5"      # { 0 - 10 } -> default:0
Preference "enableBNR=on"               # { on | off } -> default:off
Preference "signatureLocation=headers" # { message | headers } -> default:message 
PgSQLServer /run/postgresql/
PgSQLUser dspam
PgSQLPass secret
PgSQLDb dspam
IgnoreHeader DKIM-Signature
IgnoreHeader X-Bogosity
IgnoreHeader X-Spam-Checker-Version
IgnoreHeader X-Spam-Flag
IgnoreHeader X-Spam-Level
IgnoreHeader X-Spam-Status
IgnoreHeader X-GMX-Antispam
IgnoreHeader X-GMX-Antivirus
IgnoreHeader X-UI-Filterresults
ParseToHeaders on
ChangeModeOnParse off
ChangeUserOnParse full
ServerPID /run/dspam/
ServerDomainSocketPath "/run/dspam/dspam.sock"
ClientHost /run/dspam/dspam.sock

I didn't touch the rest.
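With signatureLocation=headers, dspam stores its retraining token in a mail header instead of appending it to the body, and the dovecot antispam plugin later reads exactly that header. A quick sketch of what a delivered mail should contain (the header values below are invented placeholders):

```shell
# Sample message headers; the X-DSPAM-* values are made-up placeholders.
msg='From: someone@example.org
X-DSPAM-Result: Innocent
X-DSPAM-Signature: 4f2c9a7e1b2c3d4e
Subject: hello'

# The signature header must be present, otherwise retraining via the
# antispam plugin fails (antispam_signature_missing = error).
echo "$msg" | grep '^X-DSPAM-Signature:'
# -> X-DSPAM-Signature: 4f2c9a7e1b2c3d4e
```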

Now enable and start the dspam daemon:
systemctl enable dspam
systemctl start dspam

Setup postfix

  • Edit /etc/postfix/ and add the required transports.
dspam     unix  -       n       n       -       10      pipe
  flags=Ru user=dspam argv=/usr/bin/dspam --deliver=innocent,spam --user $recipient -i -f $sender -- $recipient

dovecot   unix  -       n       n       -       -       pipe
  flags=DRhu user=mail:mail argv=/usr/lib/dovecot/deliver -f ${sender} -d ${recipient}

Note that the ArchLinux package for dspam in [community] installs dspam with dspam:dspam permissions on all folders involved, so user=dspam matters!

  • Create /etc/postfix/dspam_filter_access with following filter definition
/./   FILTER dspam:unix:/run/dspam/dspam.sock
  • Edit /etc/postfix/ and configure postfix to use the dovecot mail transport and require the above filter rule for any smtp client.
# edit this one; it was "dovecot-spamassassin" here before
virtual_transport = dovecot

# new settings for dspam
dspam_destination_recipient_limit = 1
smtpd_client_restrictions =
   check_client_access pcre:/etc/postfix/dspam_filter_access
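The pcre key /./ in dspam_filter_access matches any non-empty client name, so effectively every smtp client gets the dspam FILTER applied. A sketch of the same regex logic using grep instead of postfix's pcre map (the client name is an example):

```shell
# /./ means "any string containing at least one character" -- i.e. everything.
for client in mail.example.org; do
  if printf '%s' "$client" | grep -qE '.'; then
    echo "$client -> FILTER dspam:unix:/run/dspam/dspam.sock"
  fi
done
```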

Setup dovecot

  • Edit /etc/dovecot/dovecot.conf and enable the antispam plugin
mail_plugins = $mail_plugins imap_quota antispam
  • Edit /etc/dovecot/conf.d/90-plugin.conf
plugin {
   # Antispam (DSPAM)
   antispam_backend = dspam
   antispam_allow_append_to_spam = YES
   antispam_spam = Spam;Junk
   antispam_trash = trash;Trash;Gelöschte*;Papierkorb
   antispam_signature = X-DSPAM-Signature
   antispam_signature_missing = error
   antispam_dspam_binary = /usr/bin/dspamc
   antispam_dspam_args = --user;%Lu;--deliver=spam,innocent;--source=error;--signature=%%s
   antispam_dspam_spam = --class=spam
   antispam_dspam_notspam = --class=innocent
   antispam_dspam_result_header = X-DSPAM-Result
}

Getting metrics from Graphite into Nagios and Centreon

Getting metrics from logs and various other sources into Graphite is quite simple. The most interesting metrics represent critical performance data, and the pro-active monitoring approach, meaning a person sitting there watching the dashboard, isn’t suited to our needs. We use Nagios with Centreon as our monitoring platform, and we want to alert on some of the metrics collected in Graphite. Also, since version 2.4 Centreon supports custom dashboard views, and although this might sound like duplication, we wanted to get the metrics graphically integrated into the Centreon interface, as RRD graphs that is.

Looking around I found the check_graphite plugin by obfuscurity and enhanced it considerably to support multiple metrics in one call, performance data with customizable metric shortnames, and retry calls in case there were no datapoints in the given duration. It’s called check_graphite_multi, available from my nagios-scripts perfdata branch on github, and is especially useful if you’d like to get multiple metrics of the same type into one RRD graph in Centreon, PNP4Nagios or the like. Our use case is a graph with JVM heap generation usage and garbage collector statistics. We alert on a full old generation and high GC durations.

Here are some short usage notes:

--metrics|-m accepts a string of metrics, separated by a pipe |

--metrics "scale(weblogic.server01.jvm.gc.oldgen_used_before_gc)|scale(weblogic.server01.jvm.gc.oldgen_used_after_gc)"

--shortname|-s accepts a comma-separated list of aliases for the output of status and performance data

--shortname "jvm_eden_before_gc,jvm_eden_after_gc"

If no --shortname is specified for a given metric, it defaults to the full metric name.

--warn|-w also accepts a comma-separated list

--warn "100,150"

At least one value is required. If only one value is given for multiple metrics, that value counts for all of them.

--critical|-c works the same as --warn
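The "one value counts for all" broadcast can be sketched in shell; the metric names and thresholds below are invented, and this is only an illustration of the resolution logic, not the plugin's actual code:

```shell
# One -w value is reused for every metric; several values pair up positionally.
metrics="jvm_eden jvm_oldgen"
warns="100"                      # try "100,150" for per-metric thresholds
IFS=','; set -- $warns; unset IFS
for m in $metrics; do
  [ $# -gt 1 ] && { w=$1; shift; } || w=$1
  echo "$m: warn=$w"
done
# -> jvm_eden: warn=100
# -> jvm_oldgen: warn=100
```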

Also note:

When specifying multiple metrics, make sure to keep the order for all parameters, like

-m "metric1|metric2" -s "alias1,alias2" -w "warn1,warn2" -c "crit1,crit2"

If at least one of the metrics returns a CRITICAL state, the plugin exits with a CRITICAL return code. Ditto for WARNING.

By default, if the metric has no datapoints in the given --duration timeframe, the plugin retries with 10 times the given duration. This is mostly cosmetic to prevent holes in RRD graphs, and I might make it configurable in the future. Unfortunately Graphite’s render API has no option to just return the last datapoint, so this is a hack to work around that.
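The retry behaviour boils down to: query the normal window, and if it comes back empty, query again with ten times the duration. A sketch of that control flow, where fetch_datapoints is a stand-in stub for the actual Graphite render call, not part of the plugin:

```shell
# Sketch of the retry logic; fetch_datapoints is a hypothetical stub.
duration=5
fetch_datapoints() {
  # pretend only the wider window contains a datapoint
  [ "$1" -ge 50 ] && echo "42.0" || true
}
points=$(fetch_datapoints "$duration")
if [ -z "$points" ]; then
  points=$(fetch_datapoints $((duration * 10)))   # retry with 10x window
fi
echo "datapoints: $points"
# -> datapoints: 42.0
```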

Monitoring and graphing Weblogic performance using Graphite and metrics-sampler

My current project is to take our Weblogic monitoring setup from parsing GC logs in Splunk up to the next level. For other metrics we already use Graphite. Graphite is an awesome app for graphing any sort of metrics; you just need to get them in there somehow. Some days ago I stumbled over an outstanding piece of software written by Dimo Velev: metrics-sampler.

It's able to read metrics from various inputs, e.g. the JMX tree of the Weblogic Runtime MBean Server. Exactly this use case is well documented in the configuration examples. The syntax is very well thought out; it's not worth documenting it here or posting examples. Just read the included docs and examples, it's really straightforward.

Setup prerequisites:

For talking to Weblogic via JMX, you need to generate a wlfullclient.jar and put it in the metrics-sampler/lib directory. See for details on how to generate it.

By default, the Weblogic Runtime MBean Server does not expose the domains of the JVM's Platform MBean Server, so any java.lang/* stuff can't be queried there. You either need to enable the JVM's remote management (which doesn't allow specifying a listen address for the MBean server and instead listens on all interfaces by default), or you configure the Weblogic Runtime MBean Server to act as the Platform MBean Server. The second option allows querying both the com.bea and the JVM's java.lang domains – so every metric you can get – all from one source.

In our case it also was very impractical to install metrics-sampler directly on the production Weblogic cluster machines. Instead, I configured a dedicated box with an SSH tunnel to the IP of every Weblogic node. As JMX requires the remote and "local" port to be the same, the tunnels were created using a separate loopback IP for every node on the local metrics-sampler box: server01 was tunneled to the first loopback alias, server02 to the next, and so on.
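The tunnel setup could look roughly like this; the hostnames, the JMX port, the loopback aliases and the jump user are all assumptions, and the commands are echoed rather than executed so the mapping is visible:

```shell
# Sketch: one loopback alias per node, identical local and remote port
# (JMX requires them to match). All names/ports are hypothetical.
port=7001
i=2
for node in server01 server02 server03; do
  echo "ssh -f -N -L 127.0.0.$i:$port:$node:$port jumpuser@$node"
  i=$((i + 1))
done
```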

Once this was set up, all that remained was to edit the provided config.xml.template in metrics-sampler to match our DataSource names and add some nodes to get samples from – and it was ready to go. This is really some well done JUST WORKS[tm] sort of software. I love it. Next step: mashing up dashboards in Graphite.

Clap clap clap!

Indexing and searching Weblogic logs using Logstash and Graylog2

Update 2013/10: we decided to replace Graylog2 with Kibana3 completely. The article below is just for reference, the logstash config is outdated since logstash 1.2 and the setup as described below is suboptimal anyway. I’ll post a new article shortly.

Update 2014/02: Finally, the new guide is here: Indexing and searching Weblogic logs using Logstash, Elasticsearch and Kibana.


Recently we decided to get rid of our Splunk “Free” log indexing solution as the 500MB limit is too limited for our 20+ Weblogic environments (we couldn’t even index all production apps with that) and $boss said “he won’t pay over 20000€ for a 2GB enterprise license, that’s just ridiculous”. So I went out on the interwebs to look for alternatives, and stumbled over Graylog2 and Logstash. This stuff seemed to have potential, so I started playing around with it.

Logstash is a log pipeline that features various input methods, filters and output plugins. The basic process is to throw logs at it, parse the message for the correct date, split the message into fields if desired, and forward the result to some indexer and search it using some frontend. Logstash scales vertically, and for larger volumes it’s recommended to split the tasks of logshipping and log parsing to dedicated logstash instances. To avoid losing logs when something goes down and to keep maintenance downtimes low, it’s also recommended to put a message queue between the shipper(s) and the parser(s).

Redis fits exactly into that picture and acts as a key-value message queue in the pipeline. Logstash has a hard-coded queue size of 20 events per configured input. If the queue fills up, the input gets blocked. Using a dedicated message queue instead is a good thing to have.

Graylog2 consists of a server and a web interface. The server stores the logs in Elasticsearch, the frontend lets you search the indexes.

So, our whole pipeline looks like this:

logfiles → logstash shipper → redis → logstash indexer cluster → gelf → graylog2-server → elasticsearch cluster → graylog2-web-interface

Logstash is able to output to Elasticsearch directly, and there is the great Kibana frontend for it which is in many ways superior to graylog2-web-interface, but for reasons I explain at the end of the post we chose Graylog2 for weblogic logs.


The first step was to get graylog2, elasticsearch and mongodb up and running. We use RHEL 6, so this howto worked almost out of the box. I changed the following:

  • latest stable elasticsearch-0.19.10
  • latest stable mongodb 2.2.0
  • default RHEL6 ruby 1.8.7 (so I left out any rvm stuff in that howto, and edited the provided scripts removing any rvm commands)

Prepare access to the logfiles for logstash

Next was to get logstash to index the logfiles correctly.

We decided to use SSHFS for mounting the logfile folders of all Weblogic instances onto a single box, and run the logstash shipper on that one using file input and output to redis. The reason for using SSHFS instead of installing logstash directly on the Weblogic machines and using, for example, log4j appenders to logstash log4j inputs was mainly that our Weblogics are managed by a bank’s data centre, so getting new software installed requires a lot of work. The SSH access was already in place.

We have weblogic server logs (usually the weblogic.log), and each application generates a log4j-style logfile.

... and so on

This is the configuration file for the file-to-redis shipper. The only filter in place is the multiline filter, so that multiline messages get stored in redis as a single event already.

input {
  # server logs
  file {
    type => "weblogic"
    path => [ "/data/logfiles/*/*/weblogic.log" ]
  }
  # application logs
  file {
    type => "application"
    path => [ "/data/logfiles/*/*/planethome.log",
              "/data/logfiles/*/*/planetphone.log" ]
  }
}
filter {
  # weblogic server log events always start with ####
  multiline {
    type => "weblogic"
    pattern => "^####"
    negate => true
    what => "previous"
  }
  # application logs use log4j syntax and start with the year. So below will work until 31.12.2099
  multiline {
    type => "application"
    pattern => "^20"
    negate => true
    what => "previous"
  }
}
output {
  redis {
    host => "phewu01"
    data_type => "list"
    key => "logstash-%{@type}"
  }
}
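The multiline grouping can be illustrated outside logstash: with negate => true and what => "previous", any line NOT starting with #### gets glued onto the previous event. A minimal awk equivalent of that logic (the sample log lines are invented):

```shell
# Lines not matching ^#### are appended to the previous event.
printf '%s\n' \
  '####<01.02.2013 10:00:00 MEZ> <Error> something broke' \
  'java.lang.NullPointerException' \
  '    at com.example.Foo.bar(Foo.java:42)' \
  '####<01.02.2013 10:00:01 MEZ> <Info> next event' |
awk '/^####/ { if (ev != "") print "EVENT: " ev; ev = $0; next }
     { ev = ev " | " $0 }
     END { if (ev != "") print "EVENT: " ev }'
# -> two EVENT lines: the stack trace is folded into the first event
```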

And this is the config for the logstash parsers. Here the main work happens: logs get parsed, fields get extracted. This is CPU intensive, so depending on the amount of messages, you can simply add more instances with the same config.

input {
  redis {
    type => "weblogic"
    host => "phewu01"
    data_type => "list"
    key => "logstash-weblogic"
    message_format => "json_event"
  }
  redis {
    type => "application"
    host => "phewu01"
    data_type => "list"
    key => "logstash-application"
    message_format => "json_event"
  }
}

filter {
  # weblogic server logs
  grok {
    # extract server environment (prod, uat, dev etc..) from logfile path
    type => "weblogic"
    patterns_dir => "./patterns"
    match => ["@source_path", "%{PH_ENV:environment}"]
  }
  grok {
    type => "weblogic"
    pattern => ["####<%{DATA:wls_timestamp}> <%{WORD:severity}> <%{DATA:wls_topic}> <%{HOST:hostname}> <(%{WORD:server})?> %{GREEDYDATA:logmessage}"]
    add_field => ["application", "server"]
  }
  date {
    type => "weblogic"
    # joda-time doesn't know about localized CEST/CET (MESZ in German),
    # so use 2 patterns to match the date
    wls_timestamp => ["dd.MM.yyyy HH:mm 'Uhr' 'MESZ'", "dd.MM.yyyy HH:mm 'Uhr' 'MEZ'"]
  }
  mutate {
    type => "weblogic"
    # set the "Host" in graylog to the environment the logs come from (prod, uat, etc..)
    replace => ["@source_host", "%{environment}"]
  }

  # application logs
  # match and pattern inside one single grok{} doesn't work
  # also using a hash in match didn't work as expected if the field is the same,
  # so split this into single grok{} directives
  grok {
    type => "application"
    patterns_dir => "./patterns"
    match => ["@source_path", "%{PH_ENV:environment}"]
  }
  grok {
    # extract app name from logfile name
    type => "application"
    patterns_dir => "./patterns"
    match => ["@source_path", "%{PH_APPS:application}"]
  }
  grok {
    # extract node name from logfile path
    type => "application"
    patterns_dir => "./patterns"
    match => ["@source_path", "%{PH_SERVERS:server}"]
  }
  grok {
    type => "application"
    pattern => "%{DATESTAMP:timestamp} %{DATA:some_id} %{WORD:severity} %{GREEDYDATA:logmessage}"
  }
  date {
    type => "application"
    timestamp => ["yyyy-MM-dd HH:mm:ss,SSS"]
  }
  mutate {
    type => "application"
    replace => ["@source_host", "%{environment}"]
  }
}

output {
  gelf {
    host => "localhost"
    facility => "%{@type}"
  }
}

In ./patterns there is a file containing 3 lines for the PH_ENV etc grok patterns to match.
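The contents of that pattern file aren't shown above; a hypothetical version, with environment, app and server names invented to match the examples, could be created like this:

```shell
# Hypothetical grok pattern file -- all names here are assumptions,
# adjust them to your own paths and environments.
mkdir -p ./patterns
cat > ./patterns/extra <<'EOF'
PH_ENV (prod|uat|dev)
PH_APPS (planethome|planetphone)
PH_SERVERS (server[0-9]{2})
EOF
grep -c '^PH_' ./patterns/extra   # sanity check: all three patterns present
# -> 3
```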

I had several issues initially:

  • rsyncing to a jumpbox and mounting it on the logserver via sshfs would cause changes to go missing on some, but not all, files
  • chaining rsyncs would cause some messages to be indexed with partial @message, as some files are huge and slow to transfer.
    • This was solved by mounting all logfile folders on the different environments directly with SSHFS, where possible.
    • The remaining rsync’ed files are rsync’ed with “--append --inplace” rsync parameters
  • indexing files would always start at position 0 of the file, over and over again
    • Only happened for rsync; using “--append --inplace” fixes this

Meanwhile I also took a look at Kibana, another great frontend to Elasticsearch. For the weblogic logs I’ll keep Graylog2, as it allows saving predefined streams and provides that easy-to-use quickfilter, which eases log crawling for our developers (they only want to search for a string in a timeframe – they also never made use of the power of Splunk). Also, Kibana doesn’t yet provide a way to view long stack traces in an acceptable fashion (cut the message at n characters and provide a “more” link, something like that). But I added Apache logs in logstash, and those are routed to a logstash elasticsearch output with Kibana as the web UI. I’d really like to see some sort of merge between Kibana and Graylog2, with saved searches added to the mix – that would make a really competitive Splunk alternative.

List open/active sessions

This lists open/active sessions.

By username:

select username
,      count(status) as Sessions
from v$session 
where username is not null --and status = 'ACTIVE' 
group by username
order by Sessions desc;

By OS user:

select osuser
,      count(status) as Sessions
from v$session 
where username is not null --and status = 'ACTIVE' 
group by osuser
order by Sessions desc;

List waiting users due to row locks

This lists users who are queued up due to other sessions holding locks on objects.

SELECT substr(s1.username,1,12)    "WAITING User",
       substr(s1.osuser,1,8)            "OS User",
       substr(to_char(w.session_id),1,5)    "Sid",
       P1.spid                              "PID",
       substr(s2.username,1,12)    "HOLDING User",
       substr(s2.osuser,1,8)            "OS User",
       substr(to_char(h.session_id),1,5)    "Sid",
       P2.spid                              "PID"
FROM   sys.v_$process P1,   sys.v_$process P2,
       sys.v_$session S1,   sys.v_$session S2,
       dba_locks w,     dba_locks h
WHERE  w.mode_held        = 'None'
AND    h.mode_held       != 'None'
AND    h.mode_requested  = 'None'
AND    w.mode_requested  != 'None'
AND    w.lock_type (+)    = h.lock_type
AND    w.lock_id1  (+)    = h.lock_id1
AND    w.lock_id2  (+)    = h.lock_id2
AND    w.session_id       = S1.sid  (+)
AND    h.session_id       = S2.sid  (+)
AND    w.session_id       != h.session_id
AND    S1.paddr           = P1.addr (+)
AND    S2.paddr           = P2.addr (+)

Running Owncloud WebDAV with Nginx

Update: since Owncloud 5.0 the config below no longer works. A slightly more complex config is needed. See this gist:

Old info for pre 5.0 below.

Here is how I got OwnCloud's WebDAV feature to work in Nginx.

I use Nginx with the dav-ext-module, which provides support for the OPTIONS and PROPFIND methods, but it works with the plain http_dav_module, too. You do not need dav-ext-module, but if you're going to use it, you have to be very careful not to set dav_ext_methods in the root context, otherwise the whole site's folder structure can be browsed via WebDAV. It's best to set the dav handler only on remote.php.

On my server, Owncloud is accessed at /owncloud along with Drupal7 in the root-context.

Note that the dav handler location has to be set before the \.php handler, because with Nginx the first ~ match wins.

server {
        ##common server settings

        root /srv/http;
        index index.php;

        #required for owncloud
        client_max_body_size 8M;
        create_full_put_path on;
        dav_access user:rw group:rw all:r;

        ##common rules here

        # Owncloud WebDAV
        location ~ /owncloud/remote.php/ {
                dav_methods PUT DELETE MKCOL COPY MOVE;
                dav_ext_methods PROPFIND OPTIONS;
                fastcgi_split_path_info ^(.+\.php)(/.+)$;
                fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
                include fastcgi_params;
                fastcgi_pass unix:/var/run/php-fpm/php-fpm.sock;
        }

        location / {
                try_files $uri $uri/ @rewrite;
                expires max;
        }

        location @rewrite {
                #some rules here for legacy stuff
                # Drupal7
                rewrite ^ /index.php last;
        }

        # PHP handler
        location ~ \.php$ {
                fastcgi_pass   unix:/var/run/php-fpm/php-fpm.sock;
                fastcgi_param  SCRIPT_FILENAME  $document_root$fastcgi_script_name;
                include        fastcgi_params;
                fastcgi_intercept_errors on;
        }

        ##other common rules
}

Using OpenSUSE’s snapper on ArchLinux to manage btrfs snapshots

Today I created a PKGBUILD for OpenSUSE's snapper utility which allows creating and managing BTRFS snapshots. You can find it on AUR here.

Snapper is a quite handy tool to manage BTRFS snapshots. On OpenSUSE it comes with a YaST2 plugin, so it even has a GUI. On Arch you can still get the command line version though.


1. First create a .snapshots subvolume in the root of the subvolume you want to be snappered. E.g. for a /mnt/rootfs/arch-root subvolume:

btrfs subvolume create /mnt/rootfs/arch-root/.snapshots

2. Create a config based on the provided template

cd /etc/snapper
cp config-templates/default configs/root

3. Edit /etc/snapper/configs/root

# subvolume to snapshot

4. Edit /etc/conf.d/snapper


Et voilà. That's it.

# snapper list-configs
Config | Subvolume            
root   | /mnt/rootfs/arch-root
home   | /mnt/rootfs/home

By default, snapper will take a snapshot every hour. To disable this, edit /etc/snapper/configs/ and set


As snapshots cost almost no space at all, I'll keep it enabled. After some hours:

# snapper list
Type   | #  | Pre # | Date                        | Cleanup  | Description | Userdata
single | 0  |       |                             |          | current     |         
single | 1  |       | Mi 01 Feb 2012 00:01:01 CET | timeline | timeline    |         
single | 2  |       | Mi 01 Feb 2012 01:01:01 CET | timeline | timeline    |         
single | 3  |       | Mi 01 Feb 2012 02:01:01 CET | timeline | timeline    |         
single | 4  |       | Mi 01 Feb 2012 03:01:01 CET | timeline | timeline    |         
single | 5  |       | Mi 01 Feb 2012 04:01:01 CET | timeline | timeline    |         
single | 6  |       | Mi 01 Feb 2012 05:01:01 CET | timeline | timeline    |         
single | 7  |       | Mi 01 Feb 2012 06:01:01 CET | timeline | timeline    |         
single | 8  |       | Mi 01 Feb 2012 07:01:01 CET | timeline | timeline    |         
single | 9  |       | Mi 01 Feb 2012 08:01:01 CET | timeline | timeline    |         
single | 10 |       | Mi 01 Feb 2012 09:01:01 CET | timeline | timeline    |         
single | 11 |       | Mi 01 Feb 2012 10:01:01 CET | timeline | timeline    |         
single | 12 |       | Mi 01 Feb 2012 11:01:01 CET | timeline | timeline    |         
single | 13 |       | Mi 01 Feb 2012 12:01:01 CET | timeline | timeline    |         

Basic Usage:

snapper will pick the "root" config by default. To see the "home" config, use:

snapper -c home list

You can compare snapshots with

snapper diff 12..13

By default, it will show changes for every file that changed. Also, if you use snapper to revert changes, all files get reverted. This is not desirable for system files like /etc/mtab, logfiles and other dynamic files. This is where the filters feature comes in. To exclude everything in /var/log/, create the file /etc/snapper/filters/logfiles.txt
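A hypothetical version of such a filter file, one wildcard pattern per line; the exact excludes are an assumption, and it is written to /tmp here so the sketch runs unprivileged (the real file belongs in /etc/snapper/filters/):

```shell
# Sketch of a snapper filter file -- one glob per line (content assumed).
mkdir -p /tmp/snapper-filters
cat > /tmp/snapper-filters/logfiles.txt <<'EOF'
/var/log/*
EOF
cat /tmp/snapper-filters/logfiles.txt
# -> /var/log/*
```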


I use these excludes for now:



Snapper includes manpages. See

man snapper
man snapper-configs

Hints (updated 2013-10-27)

After a year of snapper usage here are some hints that might help others, too.

If you use snapper's timeline feature with the default values for daily/monthly/yearly snapshots, you may notice a serious slowdown of your filesystem. On my SSD I sometimes saw performance drop to 5MB/s, with complete stalls. This is due to having way too many snapshots spread across a large time frame.
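One way out is pruning old timeline snapshots. A sketch that only prints the snapper delete commands for everything tagged with the timeline cleanup algorithm, parsing snapper-list-style rows; the sample rows and the column layout are assumptions based on the listing above:

```shell
# Parse snapper-list-like output and print (not run) delete commands
# for timeline snapshots. Column 5 holds the cleanup algorithm.
printf '%s\n' \
  'single | 0  |       |                             |          | current  |' \
  'single | 1  |       | Mi 01 Feb 2012 00:01:01 CET | timeline | timeline |' \
  'single | 2  |       | Mi 01 Feb 2012 01:01:01 CET | timeline | timeline |' |
awk -F'|' '$5 ~ /timeline/ { gsub(/ /, "", $2); print "snapper delete " $2 }'
# -> snapper delete 1
# -> snapper delete 2
```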

Trying Gnome3.. ah wait.. it still can’t handle separate X screens

From time to time I like to take a look at new things, like e.g. Gnome3 and its gnome-shell.

My setup uses the proprietary nvidia drivers and two separate X screens, one for the LCD, one for the Samsung TV over HDMI (XBMC, that is). This works just fine in KDE4, Gnome2 and anything else I've tried. But Gnome3 just blatantly fails to start with this setup. Last time I tried it was Gnome 3.0.0 or something, but meanwhile there is Gnome 3.2.2 and they still haven't managed, or refused, to fix this issue.

Well, I'm not alone:

..and here is the bug report: