Message Sniffer.Introduction.SnifferBasics

This page is no longer maintained and may contain information that is out of date. We have left this page in place as a historical reference and to assist those who have not yet upgraded from Version 2 to Version 3. EVERYONE should upgrade to the latest version if they have not done so already.

For the latest information covered on this page, please see the following pages on our web site: http://www.armresearch.com/products/sniffer.jsp


What is Message Sniffer?

Message Sniffer is a powerful, multi-platform spam filter and email content scanner which provides sophisticated message filtering and classification services to ISPs, educational institutions, small businesses and corporate IT departments. The software installs on your email server and uses MicroNeil's advanced pattern recognition technologies to simultaneously apply thousands of heuristic algorithms to your email messages while using relatively little computing power. The efficiency, speed and depth of this technology make it practical to filter incoming and outgoing email messages based on content - in real time, even on very busy server systems.

The Message Sniffer service includes a subscription to our advanced spam filtering database which is maintained and updated around the clock by examining and categorizing unsolicited commercial Email messages, pornography advertisements, and other unwanted content. This content is submitted by our users or captured by our research teams using anonymous "spam trap" accounts.

Message Sniffer improves security and privacy while keeping you firmly in control! The Message Sniffer utility does not actually filter messages itself. Rather, it integrates with your email system and identifies pattern matches which indicate that messages are likely to contain unwanted content. Since Message Sniffer rule bases are customizable, it is possible to develop sophisticated, specialized applications based on message content. Once such messages are identified, the email administrator can decide to filter them out or to direct them through the email system for some other purpose, such as support call management or corporate security and privacy requirements.


Version 2-3 Features (2004-05-08):

Persistent Instance Mode - It is now possible to start an instance of the Message Sniffer scan utility in a "persistent" mode. When started in this mode the instance will go directly to a server mode and will remain there indefinitely unless otherwise configured. With this "persistent server" in place, all other instances will immediately elect a client mode and wait for the persistent server to scan their messages. This enforces a client-server model and virtually eliminates rulebase load events.

A persistent server instance uses the following command line:

snflicid.exe authenticationxx persistent

The persistent instance can be run as a service on Win32 systems using a number of third party tools such as RunSvcExe, Fire Daemon, or even the Win2K Toolkit. On Unix-type systems a simple script called from /etc/rc.local is usually sufficient.

Under normal circumstances the persistent server instance will check for a new rulebase file and/or .cfg file changes once every 10 minutes. This interval can be adjusted in the .cfg file, and if needed a "reload" message can be sent to the persistent server to force an immediate check.

To send the "reload" message use the following command line:

snflicid.exe reload

The client instance sending the message will not return until it sees that the message was accepted - or 30 seconds have elapsed. In addition to the reload message, a persistent instance will also accept a stop message or a rotate message (see below - On-Demand Log Rotation).

The stop message causes the persistent instance to exit gracefully.

To send the "stop" message use the following command line:

snflicid.exe stop
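
The control messages above are easy to script. The following is a minimal Python sketch, not part of the Message Sniffer distribution: the helper names control_argv and send_control are hypothetical, and the 35 second timeout is an assumption chosen to sit just above the 30 second wait described above. The meaning of the client's exit code is not documented on this page, so the sketch returns it without interpreting it.

```python
import subprocess

# Control messages named in the text above. Anything else is rejected.
CONTROL_MESSAGES = ("reload", "stop", "rotate")


def control_argv(executable, message):
    """Build the argument list for sending one control message."""
    if message not in CONTROL_MESSAGES:
        raise ValueError(f"unknown control message: {message!r}")
    return [executable, message]


def send_control(executable, message, timeout=35.0):
    """Run a client instance that delivers `message` and return its exit code.

    The client is described as returning once the message is accepted or
    after 30 seconds, so the timeout is only a safety margin (an
    assumption, not a documented value).
    """
    completed = subprocess.run(control_argv(executable, message), timeout=timeout)
    return completed.returncode
```

For example, send_control("snflicid.exe", "reload") would ask a running persistent server to reload its rulebase and configuration.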

Rulebase Digest Checking - Rulebase files now include a Mangler digest of the entire rulebase file. This new format is backward compatible with all 2.x versions of Message Sniffer. Included in the new distribution is version 2 of the snf2check.exe utility. This new version now checks the digest to verify the integrity of the entire rulebase file. The new snf2check.exe utility (found in the distribution file) is a drop-in replacement for the previous version and will return 0 if the rulebase file is correct. The only outward difference is that the new version will emit an ERROR_BAD_MATRIX result if any corruption is detected - even if the rulebase file's security segments authenticate properly. If the security segments of the rulebase fail to authenticate then the utility will respond just as the previous version did where generally the ERROR_RULE_AUTH result will be returned.
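
An update script can use the checker's result to decide whether a downloaded rulebase is safe to install. The sketch below is illustrative only: install_if_valid is a hypothetical helper, and because the command-line arguments of snf2check.exe are not shown on this page, the check itself is passed in as a callable that returns the checker's exit code.

```python
import os
import shutil


def install_if_valid(candidate, live, check):
    """Install `candidate` over `live` only if `check(candidate)` returns 0.

    `check` is any callable that returns the checker's exit code. How the
    real snf2check.exe is invoked is not shown here, so the caller
    supplies that part.
    """
    code = check(candidate)
    if code != 0:
        # A non-zero result (for example ERROR_BAD_MATRIX or
        # ERROR_RULE_AUTH) means the live rulebase is left untouched.
        return False
    staged = live + ".tmp"
    shutil.copyfile(candidate, staged)
    # Swap the staged copy into place in a single rename step.
    os.replace(staged, live)
    return True
```

Copying to a staging name first means a scanner never sees a half-written rulebase file, even if the copy is interrupted.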

On-Demand Log Rotation - The persistent server instance (if any) will accept the "rotate" command. When the rotate message is received the persistent server will gracefully rename the current log file to snflicid.log.yyyymmddhhmmss and will continue logging to the snflicid.log file as usual.

To send the "rotate" message use the following command line:

snflicid.exe rotate
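
Scripts that archive or upload rotated logs need to recognize the timestamp suffix. As a small sketch (rotated_log_name is a hypothetical helper, and whether the engine stamps local time or UTC is not stated on this page), the yyyymmddhhmmss naming pattern can be reproduced like this:

```python
from datetime import datetime


def rotated_log_name(log_name, when):
    """Return the archive name in the form <log_name>.yyyymmddhhmmss."""
    return f"{log_name}.{when.strftime('%Y%m%d%H%M%S')}"


# A rotation at 13:05:09 on 2004-05-08:
print(rotated_log_name("snflicid.log", datetime(2004, 5, 8, 13, 5, 9)))
# prints "snflicid.log.20040508130509"
```

Because every field is fixed width and ordered from year down to second, sorting rotated file names alphabetically also sorts them chronologically.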

Variable Log Detail - Version 2-3 can utilize an optional .cfg file to adjust tuning parameters in the persistent server instance and to support other features such as a variable log file format. There are three possible options for the log file format that can be set in the .cfg file. The log file format can be adjusted to show full details of every pattern match, to remove duplicate matches (each rule shown only once), or to show only the final result for each message. The .cfg file in the distribution contains detailed instructions.

Rule-Panic Entries - The Message Sniffer rulebase is constantly and rapidly evolving to keep up with new spam. As a result there are rare occasions where new rules might emerge that cause false positives on a given system. Normally an administrator would be able to roll back to an earlier rulebase file until the problematic rules could be adjusted; however, this is sometimes difficult or unwieldy. Starting with version 2-3, up to ten "RulePanic:" entries can be made in the .cfg file. This allows the system administrator to immediately remove problematic rules without recovering an earlier rulebase from backups.

Version 2-2 Features (2004-01-02):

Cellular Peer-Server Technology - In addition to minor refinements in the program, this version introduces a new technology that dramatically increases the number of messages that can be scanned on a given hardware platform.

This new "cellular peer-server" technology allows Message Sniffer to share and conserve vital server resources when multiple messages are being scanned. Our cellular peer-server technology works by allowing multiple instances of Message Sniffer to organize themselves in much the same way insects work together in bee hives and ant colonies.

When more than one message is being scanned an organized group of Message Sniffer instances will emerge. The group members will organize themselves so that a small number of instances will load the rulebase and scan messages while the other instances quietly wait for their messages to be processed by the "emergent server."

Cellular peer-server technology provides many of the benefits of a client-server architecture while retaining the flexibility of a command-line utility and avoiding the complexity of running "yet another network service" on a busy Email server. In fact the new version of Message Sniffer is a direct drop-in replacement for the previous utility and requires no changes to existing scripts or configuration files.

The speed and efficiency of this new version of Message Sniffer allow Email system administrators to retain the benefits of content based message classification even in the face of the recent, dramatic increases in spam volumes. Message Sniffer also has the advantage that it does not depend on any DNS based blocking lists. Systems that are heavily dependent on DNSBLs can take several seconds to classify a message while waiting for responses from external databases that may well be under DDoS attack. In contrast, Message Sniffer can consistently classify messages with its local rulebase in a fraction of a second - even while applying more than 30,000 active heuristics!


Version 2 Features (2002-12-04):

Compound Rule Coding - Messages are now scanned for multiple rule matches. This provides comprehensive content matching information in the log files and allows for additional features such as White-Rules and Above-Band rules (See Below).

Specific Symbol Encoding - Rule groups are now assigned specific symbol values. These symbols are returned by the Message Sniffer utility to facilitate more specific actions in the host system. This will allow system administrators to generate special rule groups for specific actions. For example, general spam might return a value of 63, technical support email might return a code of 2, and messages for users who do not wish to employ any filtering might return a code of 10. Where multiple rules match from multiple groups the lowest symbol value is returned. This means that rule groups can be prioritized by choosing appropriate symbol values. (See White-Rules below)

White-Rules - White-Rules can now be encoded with a symbol of 0. These rules produce a "White" result in the log file and produce a 0 result from the Message Sniffer engine. This is the same code that is returned for a "Clean" result, so a message which matches any white-rule will be passed as if it were "Clean". As a result of this feature, our standard rule base now includes a number of "standing white rules" to mitigate false positives from email systems that tend to append advertising content, for example eBay bid notifications and Yahoo Groups. (NOTE: The standing white-rule group is in its early development and will take some time to stabilize. Be sure to help us with this work by forwarding false positives from "well known" legitimate systems and be sure to include all header information.)

Above-Band-Rules - Rule groups can be coded with symbols in the range 65-255. Rules coded in this range are ignored by the current Message Sniffer engine but will appear in the log files if matches occur. This allows rule bases to include special rules for capturing important statistics or triggering special functions. (Advanced tests scheduled for future development will use above-band rules to parse important message elements.) We can also use above-band rules to test experimental rule-sets, collect important metrics for tuning and refining our rulebases, and support advanced analysis functions in future - such as automated dynamic white-rules and network distribution metrics.
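
The way the result code is chosen can be summarized in a few lines of Python. This is only a sketch of the behavior described in this section, not the engine's actual code: final_result is a hypothetical name, and treating every symbol below 65 as in-band is an assumption, since the text only gives 65-255 as the above-band range.

```python
ABOVE_BAND_MIN = 65  # symbols 65-255 are logged but ignored by the engine
CLEAN = 0            # also the value produced by a white-rule match


def final_result(matched_symbols):
    """Return the result code for the symbols matched in one message."""
    for symbol in matched_symbols:
        if not 0 <= symbol <= 255:
            raise ValueError(f"symbol out of range: {symbol}")
    # Above-band matches never influence the result.
    in_band = [s for s in matched_symbols if s < ABOVE_BAND_MIN]
    # The lowest in-band symbol wins; with no in-band match the message is clean.
    return min(in_band, default=CLEAN)


print(final_result([63, 2, 200]))  # prints 2: lowest in-band symbol wins
print(final_result([63, 0]))       # prints 0: a white-rule overrides spam
print(final_result([200]))         # prints 0: only above-band matches
```

Because the lowest value wins and white-rules use 0, no separate override logic is needed: a white-rule match always takes priority over any other rule group.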

Improved Rule Base Format - The rule base compiler engine has been updated to improve rule-folding. This allows more rules to be packed into a single rule base file for improved performance with larger rule sets. The new rule base files can be as small as 50% of a comparable version 1 rule base file. This means faster load times, lower system loads, and improved performance.

Customizable Per License - Version 2 implements unique license IDs and authentication codes. This allows our system to interpret license specific statistics from submitted log files and provides a facility for license specific rule-base customization. Customizations can be implemented upon request for the time being and soon will be possible directly through our online rule-base editing system. Customizations can include a combination of special white-rules, special black-rules, special symbol groups, above-band rules, and specific rule blocking. In future we will begin to offer optional rule base packages which have special features - for example, filtering out content with foreign characters, or aggressive rule sets based on objectionable words.

Rule Base Authentication - Version 2 includes a rule base authentication tool (snf2check.exe) which checks rule base files for corruption and incomplete downloads. This utility can be used to make automated update scripts that are more robust so that invalid rule base files are not automatically installed.

Additional Coding Symbols - Additional coding symbols have been added to the engine to allow for more complex pattern coding and better header / line specific rule isolation.

Fixed Run Loop Interference Bug - Fixed a bug in the message sniffer engine and compiler where similar rules with run-loop patterns coded in identical locations would interfere under special circumstances. The new rule engine prevents any rule interference and provides better isolation for all run-loop sequences.


How can I purchase Message Sniffer?

You can purchase a Message Sniffer subscription directly from the ARM Research web site. For only $495/year (annual subscription) or $45/mo (monthly subscription) you will receive frequent rule base and software updates via Email, personalized support for managing and customizing your rule base(s) to fit the needs of your system, and access to any additional services and capabilities that are developed while your subscription is active.

Educational institutions get a 10% discount! Send an email to our billing department for details when you are ready to purchase.


How does Message Sniffer work?

There are approximately 120,000 heuristic pattern definitions (rules) in our Message Sniffer database. Currently about 40,000 are active. The system is looking for email addresses, phrases, obfuscation techniques, links, IP sources and other patterns which have been derived from spam arriving through our submission addresses and spam traps. Patterns may be found in headers, attachments, message bodies or may even be combined across these boundaries as needed.

The current version of Message Sniffer is based on a white/black rule set, so there is no weighting involved. However, version 3 (under development) will include "Bayesian hinting", which could arguably be considered a weighting system. In addition, there are heuristics which white-rule certain messages, in particular those which might also attach advertising content or utilize "gray" bulk mailers. The combination of black and white rules helps to keep spam captures high while keeping false positives low.

The rule base is continuously tuned and adapted to the current needs of the subscriber base. In general, everyone starts with the core rule base; over time, however, each individual license is tuned to meet the needs of that particular user through the addition of local white rules, local black rules, and local blocking rules.
