Message Sniffer.TechnicalDetails.Customization

From ARM-KB

This page is no longer maintained and may contain information that is out of date. We have left this page in place to provide a historical reference and to provide assistance to folks who may not yet have upgraded from Version 2 to Version 3. EVERYONE should upgrade to the latest version if they have not done so already.

For the latest information covered on this page, please see the following pages on our web site: http://www.armresearch.com/support/qa/customize/index.jsp


A registered rulebase can be customized in almost any way. This includes white listing, black listing, rule strength tuning, etc. If you have any customizations you want done to your rulebase please contact us at support@armresearch.com.

I am not familiar with adjusting the "weights". Do you have any information that I could refer to that would help me to better understand and configure these?

The weighting system is largely Declude specific. It is possible to apply separate weights to the result codes from SNF, but it is not usually necessary.

We recommend that you ask for suggestions and examples on the Declude and Sniffer mail lists with regard to configuring your Declude weights. There are many helpful folks there with a wide variety of approaches that might fit your needs or at least send you in the right direction.


I have a list of domains that I want to white rule. How do I get these set up?

If you would like white-rules added to your rulebase then please send a list to support@armresearch.com and we will add them for you. Keep in mind that we will be coding for a pattern match in the headers and body of the message. It is not usually wise to whitelist your domains, since spammers frequently forge From addresses using the domain they are targeting in order to exploit these kinds of rules.

We will work with you to develop a good strategy. If you need to develop an extensive rulebase then we can contract with you for that work. Support included with the subscription is intended to cover adjustments for specific false positives or specific spam issues not covered by the core rule base.


What features would you recommend I configure with the CFG file, besides the persistent mode?

You place the .cfg file in the same directory where the sniffer executable and rulebase files are found.

You may want to configure your log file format - that can save you some space.

You may at some point have a RulePanic (those are rare but they do happen). In that case you would be able to make a RulePanic entry in the .cfg file to take care of it until the rulebase could be adjusted.

There are a number of timing parameters that you might change if you are sure about what you are doing - but generally they should be left commented out so that the defaults will be used.

Upcoming versions will make further use of the .cfg file, so having it in place as a matter of practice will make those transitions much easier.


What does the .cfg file do and what are my options with it?

The .cfg file is included in the Message Sniffer distribution. You do need to rename the .cfg with your license id to use it (Example: if your license id is abcdefg, rename your .cfg to abcdefg.cfg).
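
A minimal sketch of that step, assuming the distributed file is copied rather than renamed so the original stays available. The function name install_cfg and the template path are hypothetical; the only detail taken from this page is the <licenseid>.cfg naming and the fact that the file lives next to the executable and rulebase.

```python
from pathlib import Path
import shutil


def install_cfg(template: Path, install_dir: Path, license_id: str) -> Path:
    """Copy the distributed .cfg to <license_id>.cfg inside install_dir.

    template    -- path to the .cfg shipped in the distribution (hypothetical)
    install_dir -- directory holding the sniffer executable and rulebase
    """
    target = install_dir / f"{license_id}.cfg"
    # Never overwrite an existing config: it may already contain local edits.
    if not target.exists():
        shutil.copyfile(template, target)
    return target
```

Copying instead of renaming is a design choice for this sketch only; a plain rename works just as well.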

If you haven't been using it then you'll want to start slow. Most of the things that can be adjusted in the .cfg file are best left alone unless you have a specific need to change them. All of the options are explained in the .cfg file itself - if you have additional questions after reading the descriptions there then please do ask. ;-)

Most likely you won't need any of the options in the .cfg file. That said, in version 2-3.1 the things you can do in the .cfg file include altering timing parameters, adjusting the log file format, producing x-headers in an .xhdr file, or enabling the diagnostic mode.


How can I use the X-SNF header? Is there a cheat sheet someplace for making that happen, if possible, in a Declude / Imail environment?

In the distribution the option is described in the .cfg file. However, in the Declude environment I don't know of any easy way to make use of it. What would be best is if Declude could be persuaded to pick up the .xhdr file SNF produces and add it to the headers it is already adding to the message. This way, the message would only need to be altered once (less I/O) for all of the headers.

MDaemon systems using the plugin have the SNF headers by default.

Most *nix systems also use the .xhdr option and then allow the programs that follow to respond to the headers planted by SNF.

A number of custom-built systems are also using it.


Is it a good idea to hold messages based on not passing the Sniffer test?

Every system is different and has different policies. Some folks prefer to only warn on all spam, while others may delete messages based on any test that fails (including Message Sniffer). The best answers are usually somewhere in between and will depend on your customer base, the resources, software and tools you have at your disposal, and your policies.

Best practice is to combine a number of tests with Message Sniffer in a weighted scheme where each result code can be given an individual weight. For example, the gray hosting group (60) should be weighted low while the porn/adult group (54) should be weighted high.

Under these conditions it is usually ok to hold when a message fails SNIFFER and one other test. Perhaps some rule groups might be weighted high enough to cause a hold, and others reduced in weight to require more than one other test failure.
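
The weighted scheme can be sketched roughly as follows. This is an illustration only, not Declude configuration syntax: every weight, the hold threshold, and the names of the non-SNF tests are made-up values, and treating result code 0 as "no SNF rule matched" is an assumption. Only the result codes 60 (gray hosting) and 54 (porn/adult) come from the text above.

```python
# Hypothetical numbers: none of these are SNF or Declude defaults.
HOLD_WEIGHT = 10

# Weight per SNF result code (rule group): 60 = gray hosting, 54 = porn/adult.
SNF_WEIGHTS = {60: 3, 54: 10}
SNF_DEFAULT_WEIGHT = 7  # used for any other nonzero SNF result code

# Weights for other tests; the test names are invented for illustration.
OTHER_WEIGHTS = {"ip_blacklist": 4, "bad_headers": 3}


def score(snf_result, failed_tests):
    """Return the total weight for one message."""
    total = 0
    if snf_result != 0:  # assumption: 0 means no SNF rule matched
        total += SNF_WEIGHTS.get(snf_result, SNF_DEFAULT_WEIGHT)
    total += sum(OTHER_WEIGHTS[name] for name in failed_tests)
    return total


def should_hold(snf_result, failed_tests):
    """Hold the message once the combined weight reaches the threshold."""
    return score(snf_result, failed_tests) >= HOLD_WEIGHT
```

With these example numbers a hit in group 54 holds on its own, a typical SNF hit holds once one other test also fails, and a hit in group 60 needs two other failures.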

In any case, if you have false positives, please be sure to submit them to us so that we can adjust your rule base.


I want to tune my rule strength. What setting do you recommend?

There is a standard procedure defined for this that works as part of the overall system design:

Currently the default rule strength is set to 1.0. Generally it works to go "half way there" (0.5) and see how your system responds. If the extra load is undetectable then we would probably recommend going straight from there to the most sensitive setting (0.1).

If going to 0.5 creates a significant increase in system load then we would need to decide whether this was acceptable. If it was then we would leave the setting there, and if not then we would go half way back, to 0.75. If going to 0.5 creates a moderate increase but we feel there is still room then we would continue on: again halve the distance and try 0.25.

This is in essence a "binary search" for the best setting. The goal is to get to 0.1 if possible since this includes all rules that have any reported activity in the sample window (45 days currently).
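
As a rough sketch of that search, the function below suggests the next setting to try. The load labels ("undetectable", "moderate", "acceptable", "too_high") and the function name are hypothetical, and "half way back" is taken here to mean the midpoint between the previous and the current setting, which is an assumption. The actual change is still made by sending a note to support.

```python
MOST_SENSITIVE = 0.1  # lowest threshold described above


def next_strength(previous, current, load):
    """Suggest the next rule-strength threshold to try.

    previous -- the setting in use before the last change
    current  -- the setting that was just tried
    load     -- observed effect of the change (hypothetical labels)
    """
    if load == "undetectable":
        return MOST_SENSITIVE  # no visible cost: go straight to 0.1
    if load == "acceptable":
        return current  # significant but tolerable: stay put
    if load == "moderate":
        return max(MOST_SENSITIVE, current / 2)  # halve the distance again
    if load == "too_high":
        return (previous + current) / 2  # assumption: midpoint = half way back
    raise ValueError(f"unknown load label: {load!r}")
```

Starting from the default of 1.0 with a first step to 0.5, a "moderate" result suggests 0.25 next, and repeated halving stops at 0.1.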

This mechanism is a low level part of the collaborative mechanisms in Message Sniffer... any system detecting activity on a given rule wakes up that rule in other systems automatically. As more of the sensitive systems report, the strength of the rule grows until systems with moderate and nominal sensitivity begin to also report the activity. The mix of sensitivities in the systems helps to balance the size of the active rulebase against the available computing capacity of all of the nodes.

One final piece:

A few special nodes process spamtraps and submitted spam against the entire historical rulebase corpus (0.0) so that even completely inactive rules can be awakened for analysis on the collaborative network.

Send us a note at support@armresearch.com to let us know what setting you would like to try. Be sure to include your license ID in your message. We will be happy to work with you to tune your rule strength to the optimal setting for your system.


What about obfuscation techniques?

Some (most) obfuscation techniques are encoded in our rulebase, and more are created as we identify them. It is a very dynamic process partly driven by AI and partly by human intervention. We don't have any direct method for listing our rules - and that would also be against security policy.

The pattern matching engine in Message Sniffer is very good - but it does have some limitations - no doubt there are some obfuscation tests that can be done better in a different environment.


What is a site specific customization? How is it helpful?

Site-specific customization is part of the design strategy of our system. The basic concept is to develop an aggregate rule base that the majority of systems can agree upon and then to specialize derivative subsections and ultimately individual system policies as required.

The vast majority of our subscribers tend to agree on which messages to reject. The best strategy in this case is to block the general forms of those messages and then unblock the specific exceptions for each system. The end result of this design philosophy is that the number of specific rules required to satisfy the subscriber base is manageable without reducing the system's performance as measured from the perspective of each individual site; provided, of course, that the individual sites are properly customized.

For more information on customizing your rule base, please see False Positives help, or send us an email at support@armresearch.com.


What are my options for customizing my rulebase?

A registered rulebase can be customized in almost any way. This includes white listing, black listing, rule strength tuning, etc. If you have any customizations you want done to your rulebase please contact us at support@armresearch.com.

The first way we would customize your rulebase is if you were to submit a false positive report. We would be able to remove or adjust for any rules that cause you false positives.

We can also add any rules that you request directly.

When you submit log files, those rules that are capturing spam on your system will gain higher strength values and will remain active in the rulebase. This is particularly important if you submit spam to us.

You can submit spam that gets through to our spam@ address. While this will not be applied specifically to your rulebase as a general rule, we will code rules for those messages to ensure that they will be filtered - that is, unless we cannot safely add those rules to the core rulebase.

If you have a chronic spam that continues to get through even though you submit your spam to spam@armresearch.com, then you can contact us about it on our support@armresearch.com address and we will work with you to create a custom rule for your system, or to debug the reasons why there are no rules in the core rulebase.

Another way to customize your rulebase is to adjust the rule strength (how sensitive your rulebase is). If you would like this done, please send a note to the support@ address.

At any time you can request the addition of any rules you specify for your rulebase.


Are there any suggestions you have for increasing the level of spam that is caught?

There are a number of things you might do depending upon your system's policies and tolerance for false positives.

I see that we have already lowered the minimum rule-strength threshold on your rulebase so you are already using rules that have marginal hit rates in addition to normal rules.

Of course, we are constantly working to improve the capture rate of SNF in any case.

1. Be sure you are updating your rulebase each time you receive an update notification. This can/should be done automatically.

2. Be sure you are sending us your log files - this will help to ensure that rules that are effective on your system remain active in the rulebase.

3. Be sure we are getting copies of spam that does get through on your system. The best way to do this is to set up a mailbox on your system that our bots can pull from using POP3 and to forward any missed spam to that mailbox.

4. If there are character sets or other common characteristics of troublesome spam on your system and you also do not have any legitimate mail that contains those character sets then let us know and we will create local black rules for your system to block these.

5. SNF on its own is very good, but not perfect ;-) Be sure you are also making the best use of other tests that are available (most of them free).

--- Filtering spam is a tough business... To start with, the spammers these days are testing all of their messages against existing filters to be sure they will get through before they are sent, and then they send those messages in huge spikes through distributed bot networks to maximize the damage they can do before filters adapt.

If you have any special insights to the spam that is still leaking in your system - in particular, if there is anything that you can filter out because you would never expect any legitimate messages with a particular characteristic - then let us know and we will work with you to develop additional black rules specifically for your system.


Can I use Message Sniffer as the only test that Declude uses to weight my incoming mail?

Some folks do use SNF as the only test, though we do not recommend that. As a best practice, it is always better to use a number of tests that leverage different techniques in order to reduce errors and improve capture rates. That said, as I have indicated, there are folks who effectively use SNF as their only test and they are very happy with those results.


What other tests do you recommend we use with Message Sniffer?

I recommend keeping some of the better blacklists handy. INVURIBL comes highly recommended in combination with SNF. I think that if you follow this kind of recipe and weight the SNF test at 70-80% of your hold weight (perhaps allowing some SNF result codes to hold on their own) then you will have a system that is very low maintenance and provides very good results.

This recommendation is based on the aggregate anecdotal evidence reported by users on our list, IMail and Declude lists, and some of our own experimentation.
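
The 70-80% figure is simple arithmetic. A small worked example, assuming a hold weight of 10 (a made-up value; use whatever your system holds at):

```python
def snf_weight_range(hold_weight):
    """Return the (low, high) SNF weight at 70% and 80% of the hold weight."""
    return hold_weight * 70 / 100, hold_weight * 80 / 100


low, high = snf_weight_range(10)
print(low, high)  # prints: 7.0 8.0
```

At a weight of 7 or 8 against a hold weight of 10, an SNF hit alone does not hold a message, but any second test worth 3 or more pushes it over.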


How do I turn on the .xhdr option?

In your snf <licenseid>.cfg file, un-comment the .xhdr option line:

#### X-Headers
# XHeader File Output - When set to On the engine will create a new file with
# each message scanned with the name scanfilename.xhdr that contains x-header
# information that should be added to the message.

XHeaderData: X-SortMonster-MessageSniffer-Rules
XHeaderFinal: X-SortMonster-MessageSniffer-Result

A quick, ok-but-not-very-correct way to add these headers to the message is simply to pre-pend them (copy msgfile.xhdr + msgfile to newmsgfile, delete msgfile, rename newmsgfile msgfile). Technically, X-headers can go anywhere in the header section. By convention they go just before the body. A utility would seek out the first empty line in the message (\r\n\r\n) and emit the contents of the .xhdr file there.
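
A minimal sketch of such a utility, working on the raw bytes of the message. It assumes the message uses CRLF line endings and that the .xhdr file contains only complete header lines; the function name merge_xhdr is hypothetical and this is not code from the SNF distribution.

```python
def merge_xhdr(message: bytes, xhdr: bytes) -> bytes:
    """Insert the .xhdr header lines at the end of the header section."""
    # Make sure the added headers end with a line break of their own.
    if xhdr and not xhdr.endswith(b"\r\n"):
        xhdr += b"\r\n"
    end_of_headers = message.find(b"\r\n\r\n")
    if end_of_headers == -1:
        # No empty line found: fall back to the quick pre-pend approach.
        return xhdr + message
    # Keep the CRLF that terminates the last existing header line.
    split = end_of_headers + 2
    return message[:split] + xhdr + message[split:]
```

A caller would read msgfile and msgfile.xhdr as bytes, write the merged result to a new file, and then swap it into place, as in the copy/delete/rename sequence described above.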

Oddly enough we occasionally get support questions about "what are all these .xhdr files building up in my spool directory" when folks accidentally turn on this option without knowing what it will do ;-)


I would like to stop foreign-language spam. How can I do that?

Both of these suggestions can cause false positives - however you might try:

  1. Use country of origin blacklists to eliminate messages that are not "local".
  2. Add rules to block character sets that are not "local".
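
To illustrate the second idea only (this is not how SNF rules are written; those are coded into your rulebase on request), here is a rough sketch that lists the declared character sets in a message that are not on a local allow list. The contents of LOCAL_CHARSETS and the function name are assumptions; adjust the set to whatever your legitimate mail actually uses.

```python
from email import message_from_bytes

# Hypothetical allow list of character sets considered "local".
LOCAL_CHARSETS = {"us-ascii", "utf-8", "iso-8859-1", "windows-1252"}


def non_local_charsets(raw: bytes) -> set:
    """Return declared MIME charsets that are not in LOCAL_CHARSETS."""
    msg = message_from_bytes(raw)
    found = set()
    for part in msg.walk():
        charset = part.get_content_charset()  # lowercased, or None if absent
        if charset and charset not in LOCAL_CHARSETS:
            found.add(charset)
    return found
```

This only looks at the charset declared on each MIME part. It does not inspect encoded words in headers, and, as noted above, blocking on it can cause false positives.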
