Just comments on a former coworker’s paper that he’s writing on tuning ArcSight. It’s a bit spewy and unedited (and will go to the other blog as a less stream-of-consciousness bit when I start it shortly), but I thought I’d pass the time until I write another art entry (photography is fun!) with it anyway:

What seems to be missing is commentary on the how and why of acting on the information that goes through the ESM – beyond just how the tools to perform those actions work.

By way of example, look at these specific quotes:
1. Normalization also includes translating the severity scales used by the different devices into ArcSight’s “Agent Severity” scale.

2. ArcSight connectors also assign each event to a set of categories (that is, it assigns a category tuple) using six fields derived from the fields included in the events collected by the connectors. These categories are designed to group like events from unlike devices, from two different IDSs for example, say, from ISS and Cisco.
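
As a rough picture of what those two quotes describe, here is a minimal Python sketch. The device names, severity tables and the six category values are all made up for illustration; this is not ArcSight's actual mapping or API, just the shape of the idea: different native severity scales collapse onto one scale, and like events from unlike devices end up with the same category tuple.

```python
from dataclasses import dataclass

# Hypothetical per-device severity scales mapped onto one shared scale.
SEVERITY_MAP = {
    "ids_a": {1: "Low", 2: "Medium", 3: "High"},
    "ids_b": {"info": "Low", "warn": "Medium", "crit": "Very-High"},
}


@dataclass(frozen=True)
class NormalizedEvent:
    device: str
    severity: str
    category: tuple  # six illustrative category slots


def normalize(device, raw_severity, category):
    """Translate a device-native severity; anything unmapped becomes 'Unknown'."""
    severity = SEVERITY_MAP.get(device, {}).get(raw_severity, "Unknown")
    return NormalizedEvent(device, severity, category)


# The same kind of event seen by two different IDSs gets the same tuple.
scan = ("host", "access", "attempt", "scan", "network-ids", "suspicious")
a = normalize("ids_a", 3, scan)
b = normalize("ids_b", "crit", scan)
```

Note that the two devices disagree on severity even though they agree on category, and that an unmapped device silently falls through to "Unknown". Both are decisions made on your behalf before any correlation rule sees the event.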

Why does ArcSight do this? What does it mean to my correlation rules? Can I, algorithmically ahead of time, guarantee that the system will “think” about every event I want it to? With almost every single correlation methodology I’ve seen – especially including ArcSight’s default methodology – the answer is a resounding “NO”. This means that you (formally) have no idea where your bits are at any point, whether they’ve been aggregated, why or why not, what transformations or decisions ArcSight has made about them, etc.

This methodology failure means that, without the original raw events and significant manual labor, you cannot go back and do formal analysis on an incident that has passed through ArcSight, except by sheer luck (and that’s not formal).

Read that statement again; it’s important!

Basically, tuning the correlation engine (ArcSight) should never be approached from an “I need to get rid of stuff” – pure data reduction – standpoint. You will, probably, ultimately achieve reduction, but that’s an effect of the effort, not its actual goal. What you are doing, rather, is defining your environment (in a very literal sense).

These definitions (filters in ArcSight) then allow you to programmatically create an ontology within your system which defines your information classes, what their properties are, and how they relate to each other. That ontology exists as a combination of your basic filters and your core rules.
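
To make "filters as an ontology" a little more concrete, here is a small Python sketch. It is only a model of the idea, not how ArcSight stores filters; `EventClass`, the `device_type` and `zone` fields, and the two classes are names I invented for the example. Each class is a named predicate, and a subclass only matches if its parent does.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Event = Dict[str, str]


@dataclass
class EventClass:
    name: str
    predicate: Callable[[Event], bool]
    parent: Optional["EventClass"] = None

    def matches(self, event: Event) -> bool:
        # A subclass matches only if every ancestor matches too.
        if self.parent is not None and not self.parent.matches(event):
            return False
        return self.predicate(event)


firewall = EventClass("Firewall", lambda e: e.get("device_type") == "firewall")
perimeter = EventClass(
    "Perimeter Firewall",
    lambda e: e.get("zone") == "perimeter",
    parent=firewall,
)

ONTOLOGY: List[EventClass] = [firewall, perimeter]


def classify(event: Event) -> List[str]:
    """Return the name of every class in the ontology that the event belongs to."""
    return [c.name for c in ONTOLOGY if c.matches(event)]
```

An event for which `classify` returns an empty list is one your environment definition does not cover yet, and that is worth knowing before any rule runs.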

Once you know what your classes are, you can then write rules to define what kind of transformation (comparison, aggregation, filter, pass to another rule, send to active channel) ArcSight performs on your events.
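
Sketched in Python, with the caveat that the class names, the `Action` enum and the routing table are all hypothetical, the point is that each class maps to exactly one explicit transformation and every decision gets written down. The audit trail is the part that lets you say where your bits went.

```python
from collections import Counter
from enum import Enum


class Action(Enum):
    DROP = "drop"
    AGGREGATE = "aggregate"
    FORWARD = "forward"   # pass to another rule
    DISPLAY = "display"   # send to an analyst-facing channel


# Hypothetical routing table: event class -> transformation.
RULES = {
    "Perimeter Firewall": Action.AGGREGATE,
    "IDS Alert": Action.DISPLAY,
    "Heartbeat": Action.DROP,
}


def apply_rules(events):
    """Return (aggregate counts, displayed events, audit trail)."""
    counts, shown, audit = Counter(), [], []
    for ev in events:
        # Unknown classes are forwarded rather than silently dropped.
        action = RULES.get(ev["class"], Action.FORWARD)
        audit.append((ev["id"], ev["class"], action.value))
        if action is Action.AGGREGATE:
            counts[ev["class"]] += 1
        elif action is Action.DISPLAY:
            shown.append(ev)
    return counts, shown, audit
```

Defaulting unknown classes to "forward" instead of "drop" is a design choice in this sketch: nothing disappears without a rule that says so.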

Once these basic rules are written, you can then write higher level rules to express your intentions logically: “Show me when any perimeter firewall exceeds its normal state by a factor that’s unusual across the enterprise firewalls”.

In that statement, you have to have “Firewalls” defined, what a Perimeter Firewall is, what your enterprise is, what kind of traffic values and ranges firewalls can expect, what your average enterprise data rate is for firewalls, and a host of other things. Unless you have formally created these things in ArcSight’s rule/filter system and can reuse them, you can’t hope to create a scalable correlation engine – you’ll lose track of what the system is doing and will have to spend time and effort manually retracing how ArcSight got from point A to B, and you lose the precision/accuracy of machine correlation in favor of manual correlation under pressure.
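
Here is one way that firewall rule could look once those definitions exist, as a Python sketch rather than an ArcSight rule. Everything in it is an assumption on my part: the `perimeter`, `baseline` and `current` fields, the firewall names, and the choice of "unusual" as more than `k` standard deviations above the ratio seen on the other enterprise firewalls. Baselines are assumed to be positive.

```python
from statistics import mean, pstdev


def unusual_perimeter_firewalls(firewalls, k=3.0):
    """Flag perimeter firewalls whose current/baseline ratio stands out
    against the same ratio on every other enterprise firewall."""
    ratios = {name: fw["current"] / fw["baseline"] for name, fw in firewalls.items()}
    flagged = []
    for name, fw in firewalls.items():
        if not fw["perimeter"]:
            continue
        # Compare against peers only, so an outlier cannot mask itself.
        peers = [r for n, r in ratios.items() if n != name]
        if len(peers) < 2:
            continue
        if ratios[name] > mean(peers) + k * pstdev(peers):
            flagged.append(name)
    return flagged


enterprise = {
    "fw-edge-1": {"perimeter": True, "baseline": 100, "current": 500},
    "fw-edge-2": {"perimeter": True, "baseline": 200, "current": 220},
    "fw-core-1": {"perimeter": False, "baseline": 400, "current": 400},
    "fw-core-2": {"perimeter": False, "baseline": 300, "current": 270},
}
print(unusual_perimeter_firewalls(enterprise))  # ['fw-edge-1']
```

The rule itself is five lines; everything it depends on (what a firewall is, which ones are perimeter, what "normal" means) lives in the data. That is the reuse argument in miniature.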

Once all of that is in place, you can create rule classes: groups of rules that organize and group events, rules that compare them to each other to say something smart about them, and then rules that either present the new events to analysts, send them back for additional correlation, or drop them completely.
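
Those three rule classes can be sketched as a tiny pipeline. Again this is Python standing in for the idea, not ArcSight syntax; the `src` and `device` fields and the thresholds are invented for the example.

```python
from collections import defaultdict


def group_by_source(events):
    """Grouping rule: bucket events by source address."""
    groups = defaultdict(list)
    for ev in events:
        groups[ev["src"]].append(ev)
    return dict(groups)


def compare(src, evs):
    """Comparison rule: summarize how many devices saw this source."""
    devices = sorted({ev["device"] for ev in evs})
    return {"src": src, "devices": devices, "count": len(evs)}


def dispose(finding):
    """Disposition rule: present, send back for more correlation, or drop."""
    if len(finding["devices"]) >= 2:
        return "present"
    if finding["count"] >= 3:
        return "recorrelate"
    return "drop"


def run(events):
    """Chain the three rule classes and return a disposition per source."""
    groups = group_by_source(events)
    return {src: dispose(compare(src, evs)) for src, evs in groups.items()}
```

Because each stage is a separate function with a defined input and output, you can always answer "why did this event end up there" by replaying one stage at a time.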

I hope I’m making some sense here :)

I would highly suggest checking out this URL: http://en.wikipedia.org/wiki/Ontology_(computer_science)



Ontologies are excruciatingly important to understand if you’re doing ESM correlation (not that they’re commonly understood, but trust me on this).

Enterprise Service Buses (in Service Oriented Architectures) have a lot of the same requirements and features as ArcSight/ESMs and are a good model to look at for what ArcSight’s role is in the context of security devices.