About incremental processing of logs

Incremental processing of logs enables a large log file to be split into multiple smaller files and processed across several runs, without any loss of data.

Why do you need incremental processing?

Log files grow very large over time. Even a relatively small Web site with several thousand visitors a month will accumulate approximately 25 megabytes of log files. Processing such large log files in a single run can therefore be time- and resource-intensive.

Incremental processing speeds up the handling of log data by breaking large files into smaller pieces, without losing any information. Webalizer achieves this by saving the necessary state information to a disk file (by default webalizer.current, located in the application's output directory) and restoring it at the start of the next run, so that data is preserved between runs and log processing resumes without any loss of detail.
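Incremental mode is typically enabled through directives in the Webalizer configuration file (webalizer.conf). The fragment below is a minimal sketch; the Incremental and IncrementalName directives are standard Webalizer options, while the output directory shown is only an example path:

```
# Enable incremental processing; state is saved between runs
Incremental     yes

# Name of the state file (kept in the output directory by default)
IncrementalName webalizer.current

# Directory where reports and the state file are written (example path)
OutputDir       /var/www/usage
```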

For example, when you run Webalizer, it stores the processing state in this disk file, including the timestamp of the last log record processed. The next time you run Webalizer, it reads the saved state, compares timestamps, and generates reports only for records logged after that timestamp.
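The save-and-resume mechanism described above can be illustrated with a short sketch. This is not Webalizer's actual code; the state-file name and the record format are simplified stand-ins for webalizer.current and real log entries:

```python
# Illustrative sketch of incremental processing: a state file records the
# timestamp of the last processed entry, and later runs skip older entries.
import os
import tempfile

# Stand-in for webalizer.current; a fresh temp directory keeps the demo clean.
STATE_FILE = os.path.join(tempfile.mkdtemp(), "state.current")

def load_last_timestamp():
    """Return the timestamp saved by the previous run, or 0 on a first run."""
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            return int(f.read().strip())
    return 0

def process_log(records):
    """Process only records newer than the saved timestamp, then save state."""
    last = load_last_timestamp()
    new_records = [(ts, line) for ts, line in records if ts > last]
    if new_records:
        with open(STATE_FILE, "w") as f:
            f.write(str(new_records[-1][0]))
    return new_records

# The first run processes everything; the second run, given the same file
# plus a new entry, picks up only the entry logged after the saved timestamp.
run1 = process_log([(100, "GET /"), (200, "GET /about")])
run2 = process_log([(100, "GET /"), (200, "GET /about"), (300, "GET /contact")])
```

Because the state survives on disk between runs, the log file can be rotated or truncated after each run with no loss of report data.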

Caveats of incremental processing

Incremental processing requires you to adhere to the following guidelines.

Do not change Webalizer configuration options between runs, as this can corrupt the saved data that encapsulates the state of the previous run.

If you need to change configuration options, do so at the end of the month, after the previous month's logs have been processed normally and before you begin processing the current month's logs.