Syslog is the primary source of information about intrusion-related activity on a Unix system. Searching for known messages and patterns in syslog data is easy to do, and many tools are available for doing so. However, information and patterns that are not already "known" -- those that have not been seen or derived already, may provide even more information about attacks and intrusions. Data mining techniques can help us discover and analyze that information, but, the general lack of structure in syslog data makes it impossible to apply these techniques directly to the data.
To address the problem, we are researching methods of generating patterns from an archive of system logs which can uniquely identify syslog messages by the variant and invariant elements of the messages. Once syslog messages can be uniquely identified, data mining techniques for use in intrusion detection or forensic analysis will be far more useful.
Speaker: Abe Singer · University of California at San Diego