
Earlier this week, I started back up at TSA supporting their private sector critical infrastructure responsibilities under HSPD-7 and the NIPP.  Being new (well, new again), I just had to get on some of my recurring soap boxes.  One of them was our doomed-to-failure security approaches.  (Nice to start off on an optimistic foot, yeah?)  Pretty soon, the conversation narrowed down to the role of CERTs and incident response. In the middle of trying to explain how sending a bunch of guys in trenches to combat an enemy who could nuke from thousands of miles away was a waste of time, I had a revelation: The “bad guys”, with the complete cooperation of the “good guys”, are creating a denial of service condition across the country and planet: a Responder Denial of Service – or, an “RDoS”.

What exactly is an RDoS? It works a lot like a SYN flood, which spins up a whole lot of blank connection attempts to a server. The server must receive these connections, wait a while to see if valid data arrives, then close them. The thing is, because the sender knows the connections are blank (and can use things like botnets to generate them), it can produce a lot more connection attempts than the server can handle. Eventually, the server gets so busy that it fails to respond to real connections.
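To make the mechanics concrete, here is a toy model of that saturation. It is only a sketch under simplifying assumptions (a fixed-size backlog, a fixed timeout, blank attempts arriving before real ones in each tick), not a model of any real TCP stack, and the function and parameter names are made up for illustration.

```python
from collections import deque

def simulate_backlog(ticks, backlog_size, timeout, blank_per_tick, real_per_tick):
    """Toy model of a server's backlog of half-open connections.

    Blank attempts hold a slot until `timeout` ticks have passed.
    Real connections complete immediately but need a free slot to get in.
    Returns (real_served, real_dropped).
    """
    pending = deque()  # expiry tick of each blank entry, oldest first
    served = dropped = 0
    for now in range(ticks):
        # Close blank connections whose wait time is over.
        while pending and pending[0] <= now:
            pending.popleft()
        # Blank attempts take whatever slots are free.
        for _ in range(blank_per_tick):
            if len(pending) < backlog_size:
                pending.append(now + timeout)
        # Real connections are dropped when no slot is free.
        for _ in range(real_per_tick):
            if len(pending) < backlog_size:
                served += 1
            else:
                dropped += 1
    return served, dropped
```

With a backlog of 100, a timeout of 5 ticks, and 10 real connections per tick, `simulate_backlog(10, 100, 5, 0, 10)` serves all 100 real connections. Add 50 blank attempts per tick and only the first tick's 10 get through; the other 90 are dropped.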

Now, think of how we handle “security”.  We religiously and studiously avoid building hardened, defensible systems from the ground up and rely on fixes, patches, and incident responders to cope with the eventual problems later (hoping all the while – in vain – that the attacks never come).

What we end up with, by and large, are systems that are so poorly constructed that it takes a large amount of effort to detect, confirm, respond to, and recover from attacks.  Further, while attackers can fairly easily attack multiple systems simultaneously, we require dedicated defenders/responders for much smaller groups of systems (or even individual systems).  This leaves us with an “RDoS”. Our security philosophies leave so much open that we can never, ever resource our defenses at an adequate level. Everyone is occupied. Just ask your incident response vendors, teams, and CERTs (over beers, of course) about their available resources vs the demand for their services, vs the large iceberg of incidents under the water that aren’t even talked about yet.

As I’ve said before: Good guys – you, we – have failed and will continue to fail if we keep going down this same road.  We can’t win until we change strategies completely. We need to embrace our failure and build systems which are defensible from the inside, which are measurably effective against operational/business objectives, and which assume, from the get-go, that sections and components have been, are, and will continue to be compromised. This tacking on of perimeters, giving lip service to change control, and our complete inability to integrate cyber into our ORM and our ORM into our business decision making are a waste of time and resources. We’d be better off spending the money and time elsewhere if we’re going to keep doing security as badly as we do it now.

If anyone disagrees with this post, I’d LOVE to hear a rational argument as to why. (Really!)

(UPDATE: 08/06/10)

I really think some of Bruce Potter’s remarks at Shmoocon in 2009 are pertinent here:

People are getting owned a lot.
Trends

  • Increased success in getting past our defenses
  • Increasingly malicious motivations. The bad guys aren’t after web defacements
  • In spite of the above, we haven’t changed our methods. It’s a lot of the same
  • Spear phishing and drive-bys are unabated.

What we have is a Maginot line…in depth
Of 66 million websites indexed by Google, 5 percent had drivebys.
These sites with drivebys weren’t just the risky underbelly of the web. It was every category of website. I don’t think that is surprising to anyone who has paid attention to security.
These findings were published last year in USENIX.
The malicious content on these sites was then scanned using three top Antivirus vendors. The best detection rate among these three vendors was only 75%. The worst was 30%. These are untargeted attacks. Imagine the ability of an attack targeted at your organization to cut through your antivirus defenses.
So what do you do?

NAC? Most people don’t have that deployed even if they’ve bought it.
Firewall Internally?
Token authentication?
Change jobs?

Al McDougall from Evolutionary Security Management made the following point in response to my last post, and I thought it was useful to repeat it here:

“End result, the system view is lost because everybody works within their part of the behemoth but forgets about the mission.”

He’s right, of course. Furthermore: “Mission oriented” sounds “fuzzy” and people tend to blow it off, but it is not – it’s quite important.  In western culture, we seem to need to rush to go solve problems, without really ever trying to understand the nature of what we’re solving. This leads to all sorts of mayhem and things going wrong. We look back and can’t figure out why our solutions aren’t working or why they’re causing all these weird other problems.

What we need to do, instead, is spend our time grokking the problems we’re wrestling with until we understand their deeper natures.  If we learn to ask sufficiently detailed questions, correct, elegant answers will present themselves.  This, in many respects, is the essence of SABSA and Enterprise Architecture (although, especially in the case of the latter, an essence that is often missed).

In the case of cyber security, we absolutely blow past figuring out and AGREEING ON the nature of the problem and rush straight to the “solving” phase with perfectly predictable results.

My compatriots at TSA are asking me to, before I depart for INL,  transition my approach to the role of the SSA in the NIPP framework, but it really isn’t detailed or special. Fundamentally it is this: Figure out ahead of time what you’re asking and why. What is the mission being supported by cyber systems? What do you need to know to make sure those cyber systems continue to enable that mission? Start from the mission and work down. You’ll get there.

Hmm. Start somewhere and finish? That sounds like “Alice in Wonderland” – “start at the beginning and, when you get to the end, stop” – but it also sounds like a “process”. A “process” is what the NIPP lacks, yes? More to come…

Starting September 14th, I will no longer be contracting to TSA (via KCG, who have been wonderful). Instead, I will be working for Idaho National Laboratory (INL) onsite at DHS as a liaison between the smart people exploring the vulnerabilities of our nation’s critical infrastructure and the smart people at DHS CSSP doing the many things that they do.

Before I head out, though, I’d like to comment a little bit on an issue I’ve dealt with at TSA that I think also extrapolates to national cyber security efforts and is in no way unique to a single agency, or even the government. The issue is the label “cyber security”.  At TSA, as at DHS, as within the media, as within popular culture, there is confusion as to what “cyber security” means – even at a very high level. The term gets bandied about so loosely that it means everything and nothing. Still, people are making policy based on it without any definition.  The amorphous nature of the conversation is going to kick us in the pants sooner rather than later. Can we please nail it down more specifically when we discuss “cyber security”?

Below, find some areas of confusion that I’ve personally run into:

1. The internet, government networks, SCADA/ICS: This one is simple. When we talk about cyber security, we really need to preface our statements with which of these areas we’re discussing. They’re NOT THE SAME and the strategies, ownership, etc. needed to deal with them are NOT THE SAME either. Over and over again a lack of explicit distinction here burns us.

2. “IT Security” and Technology vs Strategy: Often, in my role, we were lumped in with what IT Security does: “Isn’t that the same thing, only with more computers?” was a popular sentiment.  There is the concept that these efforts are technical in nature and that they look a lot like FISMA shops: Assess, Remediate, Certify, etc.  against some standard or set of standards.  Nothing could be further from the truth.  “Cyber security” issues are of a strategic business and programmatic nature. We know how to fix computers; we don’t know how to define what security means to our businesses or how computers affect our operations, and we don’t know our risk appetites. In other words, “cyber security” is an executive (CEO, CFO, COO, CTO, CIO) issue, not one for technologists.

3. Computers vs Infrastructure vs Business Assets: We don’t care in most sectors if our computers work. Really, we don’t. What we care about is that our energy grid keeps pumping out power, our chemicals get mixed right, our cars are manufactured correctly, our financial transactions are accurate, our goods get delivered on time, etc.  These are the “assets” we are protecting. We are not protecting the internet, we are not protecting government computer systems. We are protecting the national operational interests of the United States.

4. Think globally, act locally: We’re so used to thinking about single companies and single systems within those companies that we forget that everything we do contributes to larger goals. Our enterprise systems work together to achieve business goals which must be protected. Our business goals within critical infrastructure sectors, in aggregate, also work together to support national goals. For instance, the thousands of independent companies in “the transportation sectors” all combine to “move people and goods throughout the US and the world on time, to the correct destination, in acceptable condition”.   Many decision makers believe that it’s ok to ignore this larger context and focus on single system security or, at best, enterprise security. This is dangerous. Since these systems are interdependent whether we acknowledge it or not, they can be used to exploit each other and damage our soft assets (goals) if we don’t regularly take a look at and secure the larger picture.

(Needs Editing, but I’m a bit stuck…so reviews and comments welcome.)

Years ago – in Mad Magazine – I saw an illustration of Alfred E. Neuman sitting in a tire swing. At first glance everything seemed normal, but after a second of reflection I noticed that it was Alfred himself holding up the tire swing while sitting in it – a situation which obviously does not (heh) fly.

This is the image that came to mind when, after my last post on Data Visualization’s Lessons for Enterprise Security, I was asked questions like:

  • What does Operational Risk Management have to do with IDS monitoring directly?
  • What about better and more transparent auditing? Wouldn’t that help?
  • I don’t see how SABSA really applies to MSSPs or IDS monitoring

My short answer is that no matter how you make your security tire swing or what you do while you’re sitting in it, as a security practice you have to be bolted to something independent that holds it all up. That’s “The Business” in case it’s not clear.

I’d like to address the IDS example, in particular, because I think it is very illustrative of the connection between detailed technical and high level business realities.  Please keep in mind that this is only a snapshot of the direct implications to a very small section of a much larger, very holistic process.  There are many secondary dependencies and repercussions which I do not address here (like tactical technical responses, incorporating lessons learned, strategic business decision making, etc.)

So, first of all, at a process level, IDS monitoring is pretty simple:

Get data -> Evaluate nature of data -> Evaluate implications of activity represented by data -> Respond to and/or continue getting data

No matter what environment you’re in, if you’re looking at IDS data (or doing any other monitoring, really), you do these four things. If you look a little closer, though, you begin to see that they are (or should be) repeated iteratively. This is because there are really multiple levels – or layers – at which security data can be evaluated (which, incidentally, looks a lot like any other protocol stack). Let’s say, conceptually, that there are five of them:

  1. Universal Technical Standards: This layer would consist of measuring activity against RFCs, Protocol Standards, etc. Things that -should- work the same everywhere.
  2. Environmental Configuration: Here, traffic is evaluated against local configurations that might change from network to network. This includes the configuration of OS’s, Web Servers, Infrastructure Devices, etc.
  3. Data and Information Control: What happens to data riding on your network and IT obviously falls in the area of concern for IDS analysis.
  4. Timing and Behavioral Thresholds: Are things happening more frequently than normal? Less frequently? Uptime? User logins? Memory Usage? etc.
  5. Business Rules: Is the IT actually doing something that directly affects the business? Are manufacturing robots shutting down? Are internal company secrets being sent to competitors? Are you spamming the military?

So what is the intersection between these layers and enterprise business architecture or operational risk management? It looks, initially, like the only direct overlap is in layer 5, right? Not true.

First, each of these layers requires some level of the business context provided by business security architectures to even be effectively evaluated.

For example:

  • To evaluate security data against potential technical standards, analysts need to know what technologies are in place and deployed and in what manner. Exceptions and outliers are especially important.
  • From an environmental perspective, analysts would be well served by knowing the security policies that the configurations and environment are supporting, e.g., what actions the configurations are trying to prevent or support (in terms of the other layers).
  • The need to know what data belongs to what data policies and what those policies say is also fundamental.  Data policies are tied to conceptual business architecture, which is tied to contextual business assets and requirements.
  • System behavior is evaluated in part by knowing things like business schedules and processes. Is payroll being run every 4th Thursday? Are people going to be logging in from all over the world, or just certain locations? Should lab systems pull data from production systems?
  • Knowing what business functions are important to keep running, to what thresholds, and how IT systems support those is crucial when trying to understand the big picture and put “events” in terms of “incidents”.  Additionally, it should be kept in mind that things like “reputation” and “customer satisfaction” are also considered business assets.  Organizations have a need to protect those as well.
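Here is a rough sketch of that dependency. Everything in it is hypothetical (the event fields, the context keys, the thresholds); the only point is that every one of the five checks takes a `context` argument that has to come from the business, not from the sensor.

```python
from enum import IntEnum

class Layer(IntEnum):
    TECHNICAL_STANDARDS = 1
    ENVIRONMENTAL_CONFIGURATION = 2
    DATA_AND_INFORMATION_CONTROL = 3
    TIMING_AND_BEHAVIOR = 4
    BUSINESS_RULES = 5

def flagged_layers(event, context):
    """Return the layers at which `event` is an exception, given business context."""
    checks = {
        # Which technologies are actually deployed here?
        Layer.TECHNICAL_STANDARDS: event["protocol"] not in context["deployed_protocols"],
        # What does local policy say the configuration should allow?
        Layer.ENVIRONMENTAL_CONFIGURATION: event["dst_port"] not in context["allowed_ports"],
        # Which data falls under a restrictive data policy?
        Layer.DATA_AND_INFORMATION_CONTROL: event["data_label"] in context["restricted_labels"],
        # When is this kind of activity expected?
        Layer.TIMING_AND_BEHAVIOR: event["hour"] not in context["business_hours"],
        # Which systems keep the business running?
        Layer.BUSINESS_RULES: event["target"] in context["critical_systems"],
    }
    return [layer for layer, is_exception in checks.items() if is_exception]

# Hypothetical context that only the business can supply.
context = {
    "deployed_protocols": {"tcp", "udp"},
    "allowed_ports": {80, 443},
    "restricted_labels": {"payroll"},
    "business_hours": range(8, 18),
    "critical_systems": {"assembly-controller"},
}
event = {"protocol": "tcp", "dst_port": 443, "data_label": "payroll",
         "hour": 3, "target": "file-server"}
```

For this event, `flagged_layers(event, context)` flags the data and timing layers: payroll data moving at 3 a.m. Without the context dictionary, none of the five checks could be written at all.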

Secondly, and maybe more importantly, if you actually look at the process flow (below) you find that the analysis process always rolls up to an evaluation at layer 5 (the business rules) of the analysis stack.

From a process flow perspective, there are absolutely no analysis scenarios that terminate before completing a layer 5 business analysis (at the bottom of the image).


[Image: IDS analysis process flow across the five layers (idsentseccolorlines)]

How does this work?

Analysis begins at one of these five layers – which one is first doesn’t really matter (they are often, in fact, done in parallel). Data is received and is evaluated against the criteria at the layer in question. If there are no exceptions, the same raw data is evaluated against the next layer in the chain. If an exception is found at any one of these layers, the impacts of that exception are then evaluated at all layers. So, for instance, if an analyst notices that there are “funny packets” that aren’t normal TCP/IP traffic while evaluating against “Technical Standards”, he or she then looks to see what the potential technical, environmental, data, behavioral, and business implications are of that traffic. For each of those, the analyst follows the process as if he or she had just received new raw data.

This continues to happen until the original data has been run up the entire stack and a final business impact has been determined. Sometimes the path there is short because the answers are known or obvious, or because complete data is unavailable to make a determination. Other times the entire process is followed at a very detailed level. Regardless, the logical process holds true in all cases and there is either a potential business impact or there isn’t.
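The flow described above can be sketched in a few lines. This is a simplified reading of the process, not a real IDS: the layer names, the two callback functions, and the sample data are all assumptions made up for illustration, and the layers are walked in order rather than in parallel.

```python
LAYERS = ("technical", "environment", "data", "behavior", "business")

def has_business_impact(item, find_exception, implications_of, _seen=None):
    """Run one piece of raw data up the stack.

    find_exception(layer, item) -> True if item violates that layer's criteria.
    implications_of(layer, item) -> new items implied at that layer by an exception.
    """
    seen = _seen if _seen is not None else set()
    if item in seen:  # already analyzed; avoids looping forever
        return False
    seen.add(item)
    for layer in LAYERS:
        if not find_exception(layer, item):
            continue  # no exception: same raw data moves to the next layer
        if layer == "business":
            return True  # the stack always ends with the business question
        # Exception found: evaluate its implications at ALL layers,
        # treating each implication as if it were new raw data.
        for implied_layer in LAYERS:
            for derived in implications_of(implied_layer, item):
                if has_business_impact(derived, find_exception, implications_of, seen):
                    return True
    return False

# Hypothetical knowledge an analyst would need from the business.
exceptions = {("technical", "funny packets"),
              ("business", "robot controller offline")}
implies = {("environment", "funny packets"): ["robot controller offline"]}

def find_exception(layer, item):
    return (layer, item) in exceptions

def implications_of(layer, item):
    return implies.get((layer, item), [])
```

Here `has_business_impact("funny packets", find_exception, implications_of)` comes back `True` only because someone recorded that those packets can take a robot controller offline and that this matters to the business. Empty out `exceptions` and `implies` and every input comes back `False`, no matter how odd the traffic looks.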

Read that again: There is either a potential business impact or there isn’t. Without context, IDS monitoring can never be a security function.

The value of IDS monitoring never gets realized if exception events are not tied to business operating requirements and risk appetite (which only business stakeholders can determine). If that linkage is not formally made or that appetite not assessed, IDS monitoring fails. None of the five analysis layers are inherently worth evaluating if a business context for them does not exist and most can’t even be evaluated at all without that context.

What provides this context? Business Security Architecture and Risk Management.

What’s interesting, though, is that these things don’t work when isolated to security, as the original blog post (and others) pointed out. If you limit the scope of your activities to “security”, you end up with the tire swing with no tree. You have to account for and model your entire business formally to achieve security. What this says is that business security architectures are, at a very real level, just business architectures. There is no material difference between the two.

But why would you need a full fledged business-wide process to get this information to you (or your analysts)? Because it’s really hard and expensive to do without the practice and culture in place enterprise-wide. You might brute force it and get your answers once without it, but trying to keep that information up to date would be completely futile.


In closing, I’d like to reiterate that I’ve only discussed the impacts of business security architecture and operational risk management on technical security operations (looking up). Of at least as much importance is their role in aiding executive or management decision makers in correctly assessing and responding to risk. This is accomplished by providing a very clear line of sight from the trenches to business assets and risk appetite (looking down).

If you’ve read some of my recent posts here, you’ll have seen that I’m back working on creating data visualization pieces as art.  In the process of making these, I was reminded again of the relationship between art and security and its practical implications for enterprise security efforts, which literally dictate success or failure. Bear with me as I walk through the art piece first and then arrive at the security observations :)

First, to work, art has to have a solid concept. You might accidentally create a piece that’s appealing on some level if you just throw paint at canvas, but you probably won’t repeat that success often and observers will understand this.

Taking that into the realm of data visualization, you can make all the pretty graphs you like, but unless you do some leg-work ahead of time and massage the data into shape, they’ll be of little use and may only accidentally be visually appealing in a way that lets you intuitively grok the data.   (I think this is philosophically similar to some of what Tufte teaches, but I don’t remember for sure.)

For example, if I wanted to (as I did) visually represent the stimulus bill in a meaningful way on screen at once, I could really just use a microscopic font…or turn the whole thing into a jpg and resize it to fit on screen. But what would that accomplish? It would just be mush.  We wouldn’t have identified or accounted for inherent structural properties that we needed to keep to preserve order. We also wouldn’t have separated the wheat from the chaff – useless information would hide useful information. And we wouldn’t have manually added linkages between data points that would help us draw meaningful conclusions visually to account for a loss of resolution in individual words.

What would work, instead, is to turn (as I did) the Stimulus Bill into columns of useful information. You could convert the free-form English structure of the Bill into a tabular format and add metadata about the text that you want to see in the visuals.  You could add line numbers and positions in sentences, group words by sections of the document, add word counts, etc. All this would show up visually and present a much more useful visualization that would also, because of the new more conscious conceptual structure, be more appealing to look at.
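As a minimal sketch of that kind of conversion (not the code I actually used for the piece), the snippet below turns free-form text into rows of word-level metadata. The `SEC.` heading convention, the sample text, and all the names are assumptions for illustration.

```python
import re
from collections import Counter

def tabulate(text):
    """Turn free-form text into one row of metadata per word."""
    rows = []
    section = None
    sentence_no = 0
    position = 0  # position of the word within its sentence
    for line_no, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        # Assumption: section headings start with "SEC." and hold no body text.
        if stripped.upper().startswith("SEC."):
            section = stripped
            continue
        for token in re.findall(r"[A-Za-z']+|[.!?]", stripped):
            if token in {".", "!", "?"}:
                sentence_no += 1  # sentence ended; restart word positions
                position = 0
                continue
            position += 1
            rows.append({"section": section, "line": line_no,
                         "sentence": sentence_no, "position": position,
                         "word": token.lower()})
    return rows

SAMPLE = ("SEC. 1. FUNDS.\n"
          "Funds are provided. Funds shall be spent wisely.\n"
          "SEC. 2. REPORTS.\n"
          "Reports are due.\n")

rows = tabulate(SAMPLE)
# Word counts per section, ready to drive size or color in a visual.
word_counts = Counter((row["section"], row["word"]) for row in rows)
```

Each row now carries the structure (section, line, sentence, position) that would otherwise be lost when the text is shrunk to fit on a screen.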

So what does this have to do with security? Everything.

Recently, much has been made of the new SANS CAG control list. Basically, this is a list of “best practice” security measures and controls that, if properly done, will make the most impact in securing organizations. Where’s the problem? The problem is that none of these are new (except WiFi). They’ve all been around longer than I’ve worked in the field (7ish years) and probably much longer than that. Everyone who works in security knows them.  Most CTOs, CIOs, and CISOs will probably be familiar with them. And yet, they’re either not implemented or, more often, they just don’t work.


If these really are best practices (and they are), and yet they’re not working, where’s the disconnect? I think it’s lack of structure. Most organizations do not operate their businesses in a manner that can be secured. There are inherent structural flaws (as in, there isn’t any structure) in the enterprises themselves that conflict with and outright prevent security from happening – just like in art and visualizations.  No matter how much effort or money you throw at the problem, cyber/IT/technical security controls will get you nowhere quickly (if anywhere ever) without a properly run and organized business. What failed cyber or IT security really is, ultimately, is a symptom of failed Operational Risk Management.

If you can’t track assets, if you haven’t identified your key data, if you don’t have clear and measurable business objectives for IT and cyber systems, if you don’t have a clear line of sight from the risk of technical failure to business impact, your security controls -will- fail.

Why? Because an organization run without these things will consistently make poor decisions based on incorrect, out of date, or conflicting information. In other words, you have to build break points into the business to be able to check, measure, and change the organization at key junctures in order to make good risk-based decisions.  “Risk-based decision making” gets bandied about like “moving forward” and “synergies” – but it’s not an empty phrase and it has real, concrete impacts and prerequisites.

Let’s look at a best-case scenario where everyone wants to do the right thing, but there isn’t an enterprise or business architecture in place. Everyone goes through an evaluation of need and risk, picks the right controls, and puts them in place. Hunky dory, yeah? Well, what happens when a new line of business is added? Nothing to do with security, right? What if the new line is taking critical data that wasn’t exposed by the other systems and making it public inadvertently? Would you know that? If you need to patch critical systems quickly to fix a flaw, would you know which ones kept your business running? Would you have documented in an easily accessible manner the fact that your manufacturing systems depended on a feature that the new patch – which works just fine on desktops – disables? Etc. Not to mention that your IDSs depend on this info, as do your firewalls, your SEMs, everything.  There is relatively little happening on your network that is inherently bad outside of a business context. There are many more (and probably better) examples…but there are two take-home points:

  1. Everyone with the authority to make changes to your business needs to be aware of the secondary dependencies of those decisions and how they intersect with security and inform others of changes they make
  2. If you try and do this without managed processes and without maintaining and continuously updating the information about the business in an architecture, you’ll fail. It’s too hard, too expensive, and takes too long to keep doing it from scratch. It’ll never be accurate, timely, relevant, etc.
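To illustrate the patching example, here is a hedged sketch of the sort of question a maintained architecture record lets you answer quickly. The `System` record, its fields, and the inventory entries are all hypothetical; a real inventory would hold far more.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class System:
    name: str
    business_function: str
    business_critical: bool
    required_features: frozenset = frozenset()

def plan_patch(systems, features_disabled_by_patch):
    """Split systems into 'patch now' and 'needs review before patching'."""
    disabled = frozenset(features_disabled_by_patch)
    patch_now, needs_review = [], []
    for system in systems:
        # A conflict is a feature the business depends on that the patch removes.
        conflicts = sorted(system.required_features & disabled)
        if conflicts:
            needs_review.append((system.name, system.business_function, conflicts))
        else:
            patch_now.append(system.name)
    # Patch the systems that keep the business running first.
    critical = {s.name for s in systems if s.business_critical}
    patch_now.sort(key=lambda name: name not in critical)
    return patch_now, needs_review

# Hypothetical inventory kept up to date by the business, not by security alone.
inventory = [
    System("desktop-pool", "office work", False),
    System("order-db", "order processing", True, frozenset({"tls"})),
    System("assembly-controller", "manufacturing", True, frozenset({"legacy_rpc"})),
]
patch_now, needs_review = plan_patch(inventory, {"legacy_rpc"})
```

The code is trivial. The hard and expensive part is keeping `inventory` accurate, which is exactly the second take-home point above.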

Business leadership at all levels and in many (most?) organizations is simply making bad decisions that affect security.  It’s not that we don’t know, as security professionals, the right things to do. It’s that we can’t express it in terms of business risk and the business leaders typically don’t seem to have the structure built in to effect positive change throughout the organization. Build some good, clean structure with visible break points at critical junctures in your business flow and then security will start to become cheaper, easier, and more effective.

Yesterday, I threw down my soap box into another discussion of ways to rearchitect the internet – specifically the pieces supporting critical infrastructure.  It was, as usual, about technical solutions to large scale, enterprise security problems.  It was a bit of a stretch for me to bring this up in that particular thread, but I think it’s important to beat the drums on this subject wherever possible:

The “security” problems we’re having nationally and globally aren’t technical.  They’re not even security problems, really; they’re failures of management. In fact, they’re very similar failures to those leading up to and causing the current economic mess.  Any technical discussion is really putting the cart before the horse.

For example, I was recently on a con-call where a bunch of people at a large enterprise were trying to track down (to keep it generic) “Secure Devices” they’d purchased. Absolutely no one knew where they all were, who owned them, how many there were, whether they worked or not, how they were configured, etc. Some groups knew theirs, others didn’t. In some cases, there was duplication of effort. In others, worse still, there was conflict of effort. How can this environment possibly result in “security”?

This kind of management mess, not technical problems, is the primary contributor to the failure of cyber security – CIKR or otherwise.

Why do I believe this? I started out doing network security analysis. I was really good at it, but couldn’t do it nearly well enough because the tools seemed to suck.  So, I started designing better tools to do things in ways that had never been done before. But then I found that even with better tools, I still couldn’t provide a good basis for analysis because I didn’t know anything about the organization I was “securing”. Once I figured that out I went to try and get the business leaders to provide that information to their security team and I found that the information had never been collected and no one seemed to see the value in doing so.  That’s how I ended up (in short) with the perspective I have today.  It’s based in a sequence of layered steps that I know are solid – I only wish I could do a better job of communicating the dependencies here.

The conceptual failure seems to be the belief that technical risk remediation is a sane strategic end-goal.  It’s not. There will always be technical vulnerabilities and failures of design – that’s a given. You can fix these individually, but that’s a tactic not a strategy.  There is no end game or any way to get ahead of the curve.

Instead, we lack and should pursue national business, social, and government consensus on solid plans to:

  • Assess current environments and keep those assessments up to date,
  • Do interdependency analysis,
  • Plot those against business risk (individual organizations, nationally, etc.)
  • Measure performance and success in terms of business needs supported

Not to mention consensus on “communication” (which is probably even more important) like: who should be at the table for these things, how communication happens and with whom, etc. You get the idea.
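As a small illustration of the interdependency analysis item in that list, the sketch below walks a dependency map to find everything affected when one component fails. The map and all of its names are invented; in practice the map itself is the assessment that has to be gathered and kept up to date.

```python
from collections import deque

def impacted_by(depends_on, failed):
    """Return everything that directly or indirectly depends on `failed`."""
    # Invert the map: for each component, who depends on it?
    dependents = {}
    for node, requirements in depends_on.items():
        for requirement in requirements:
            dependents.setdefault(requirement, set()).add(node)
    # Breadth-first walk outward from the failed component.
    seen, queue = set(), deque([failed])
    while queue:
        current = queue.popleft()
        for nxt in dependents.get(current, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# Hypothetical map: business functions and systems -> what they rely on.
depends_on = {
    "freight scheduling": {"dispatch server"},
    "dispatch server": {"regional network"},
    "payroll": {"hr database"},
}
```

Here `impacted_by(depends_on, "regional network")` returns the dispatch server and freight scheduling, which puts a technical failure in terms of the business need it threatens.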

These are all deficits that are completely independent of the technical architecture of our infrastructure.  Filling them would get us a long way down the road to solving our security problems in our current environments.

We have a habit, in the cyber world, of consistently making changes without sober scientific evaluations of cause+effect and it bites back every time.  And, until we start getting better at the above named activities, we can’t do that evaluation in any way that will guarantee successful solutions. (I recognize that there are many, many good initiatives going on in these areas…but so far, they still seem disjointed and lacking enough universal consensus to solve the problem.)

Maybe some of these technical suggestions for rearchitecting the internet will work. Who knows? We don’t even have consensus on where, why, or how our current technology fails or where it succeeds.  How can we claim to know what will fix it? Technical solutions to security problems without business context will only ever, at best, be Hail Marys and misguided hope.

Now to get a little more ranty (smile):

I really fear what is happening: calls for large scale, quick change without even the most fundamental management practices (e.g., business architecture) in place.

What is going to happen is we’re going to invest a lot of time, money, and effort in technical re-engineering and we’re STILL going to get trampled on by malicious actors…except we’ll be billions of dollars more in the hole. I think that merits being called out as often as possible.  What do you think?

The government and large enterprises get compromised constantly and -at will-.  The whole mess from top to bottom, public and private, is absolutely fubar’d. This is public knowledge – it ends up on CNN regularly. Yet our management processes are SO bad that even ending up on mainstream news does not force real change. Failing FISMA does not force real change.  There is NO visibility from cyber technology to management to business leaders to business risk. There are exceptions, but this is the rule. So you don’t have the visibility to make the needed changes.  Not only that, but without the data gathered by these management processes, security controls cannot ever be effectively placed, configured, or run.  We will lose, no matter what technology we put in place, without these management practices. There is no question.

Technical solutions may work, but that’s like putting a finger in the dam. Unless there is a framework to consistently identify and correlate environment, requirements, risk, technology, and operational processes, controls will eventually fail because the enterprise (national, private, whatever) cannot respond to evolving threats. Spend the money up front to put in strong management practices, though, and the rest will follow.

Even then, we can’t possibly identify all the inter-dependencies and requirements needed to make large changes move without going through exactly the kind of process and management methodology I’m referring to anyway.  Just to put the cart before the horse requires the horse be in the front. (Does that even make sense? heh.)

Contact Me

sintixerr@gmail.com
