<size-trigger/>
The size-trigger section controls an alternative condensation mechanism for the GBUdb database. When enabled, the size-trigger mechanism will condense the database after the specified amount of RAM (in megabytes) is used by the active dataset. This mechanism is turned on by default.
<size-trigger on-off='on' megabytes='150'/>
The on-off='on' attribute enables and the on-off='off' attribute disables this mechanism.
The megabytes='150' attribute defines the amount of RAM used by the GBUdb dataset that will trigger condensation. The default value is 150 MBytes. This value was derived after more than a year of testing and represents the typical upper limit of RAM used by GBUdb in large filtering systems.
This mechanism is useful in situations where RAM is strictly limited, such as in an OEM application or filtering appliance. Normally, the amount of RAM used by GBUdb is stabilized by condensing the dataset once per day; this is the default configuration. Because of the way the condensation process works, it is reasonably safe to restrict RAM usage by forcing earlier condensation events, though there will be some effect on GBUdb accuracy as the dataset is restricted.
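As a minimal sketch of the RAM-limited case described above, the element below lowers the trigger so that condensation is forced earlier than it would be at the default. The value 64 is purely illustrative and is not a recommended setting; pick a value that fits your own RAM budget and accept the accuracy trade-off noted above.

```xml
<!-- Illustrative only: 64 is an assumed value for a RAM-constrained appliance, not a recommendation -->
<size-trigger on-off='on' megabytes='64'/>
```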
As an aid for developers and administrators concerned about RAM usage by GBUdb, here is an example chart based on "typical" telemetry. The amount of RAM used by GBUdb without a size restriction depends primarily on the number of messages processed per day; however, the relationship is by no means linear.
There seems to be a low end where small systems will typically use the base amount of RAM (between 8 MBytes and 16 MBytes) and a high end where larger systems will typically climb toward, but not typically beyond, 150 MBytes. Between these two extremes, the RAM footprint for GBUdb seems to vary wildly depending on the mix of messages seen by that system.
More diverse systems or systems that see heavy spam traffic will typically see larger numbers of IPs and larger GBUdb RAM usage. Systems like this are grouped in the top of the chart.
Systems that see a narrow range of traffic (such as systems serving few domains) will typically see a smaller number of active IP records and a smaller GBUdb memory footprint. This is also true of systems that perform heavy filtering before calling SNF as a last resort, thus exposing SNF to less spam than other implementations. Systems like this are grouped in the bottom of the chart.
Note: During some operations the GBUdb dataset is copied into a separate working buffer so that the active dataset can continue operating without interference. As a result, the amount of RAM used by GBUdb will periodically double for short periods of time. For example, a dataset that has grown to the default 150 MBytes may briefly occupy roughly 300 MBytes. If you are designing a system that will have severe RAM constraints, then you should plan ahead for these periodic operations.
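One way to plan for the periodic doubling is to work backward from a RAM budget. The sketch below assumes a hypothetical system that can spare about 96 MBytes for GBUdb and assumes the dataset may be close to the trigger size when it is copied, so the trigger is set to roughly half the budget. Both numbers are assumptions for illustration, and the dataset may exceed the trigger slightly before condensation completes, so leave some headroom.

```xml
<!-- Assumed budget: about 96 MBytes for GBUdb. Trigger at about half (48) to leave room for the temporary working copy. -->
<size-trigger on-off='on' megabytes='48'/>
```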
Please email support@armresearch.com with any questions.