Saturday, December 13, 2008

(In)Security Response: Room for Improvement

I gotta confess.  I’m a bit depressed at the moment.

No, strike that.  Depressed is a word too strong and connotative.  Maybe melancholy?

Hmmmm.  Not quite there.

Let’s just describe it as reflectively-frustrated.

That will do.

See, I’ve decided that our security responsiveness is kinda “weak”.  And I’m feeling the pull of duty to do my part to kick-it up a notch…and the extra work that will bring on.  And maybe some resistance as well if things are ever implemented.

Background

I don’t see myself as a John Wayne or Walker, Texas Ranger figure.  Sure, I did want to pursue a career in law enforcement as a young man and through college.  Even applied at the F.B.I. at one point and talked to a Houston P.D. recruiter.

I think that came from two sources: a deep sense of respect for my late maternal grandfather, who was a commended F.B.I. Special Agent (old-school Fed), and a deep curiosity about figuring out things that I don’t currently understand.

My career choices haven’t led me down that path.  However, that curiosity has led me down deeper into the realms of computer forensics and incident response awareness.

Computer systems fail for numerous reasons and I’ve always enjoyed working on them without feeling intimidated in the least.  That led to side-duties in my earlier jobs as the local site’s first contact for PC problems.  That led me to become pretty darn good on my own at troubleshooting local systems.  That was noticed (my offices rarely called into the Help Desk) and I was successfully recruited and joined the IT department. My familiarity with the desktop OS’s led me to pretty quickly detect malicious software without needing to use the traditional “AV” scan tool, and I could remove most infections by hand.

Dealing with malware regularly as part of my job and as the go-to guy led to a deeper and constant review of malware write-ups and analysis by others, as well as of additional tools used to detect and monitor system processes and activity.  Some of the very best tools and techniques overlap with the computer forensics field.  So I began adding just such websites and blogs to my RSS feed list, always on the lookout to learn more to sharpen my skills in core OS support.

Evolution

Funny thing happens when you do that.  You might grow in unintended ways.

Although the majority of my job duties as a SME (subject matter expert) now entail project management and knowledge-base/process documentation and development, I continue to stay actively engaged in the field and topics of OS workings and malware/virus response. I love the challenge it brings.

All those readings and the knowledge gleaned from real experts in the forensics and incident response professions (of which I am not one) have rubbed off.

I have become deeply sensitive to these things, and to the standards which we need to not only aspire to, but master and apply.

And in my role, I have a duty and level of organizational influence to try to do something about it for improvement.

And we probably have a very long climb ahead.

The Peaks

Way up in our organization we have a CSO (chief security officer) who has been doing a great job in bringing security awareness and application into our organization.  We are now working on encrypting all hard-drives org-wide, have a great security policy document on the intranet somewhere, use email encryption, set password policy, and clearly have focused on software solutions for a majority of security weaknesses.

Way over elsewhere we have a crack team of network professionals who do magical things.  They actively monitor and filter the network and are very responsive during high-impact virus/worm/trojan outbreaks in our system, blocking infected systems from the network until cleaned.

Finally, we have a very clever desktop and server support group.  They work hard and long to ensure desktop images are patched and up to date.  They coordinate and monitor reports to find local workstations that don’t have current anti-virus defs loaded, as well as systems that have reported in with AV activity.

So here’s the problem.

Our local group of technicians and analysts is tasked with working with these groups and fixing the problems found.  And the vast majority of work in the incident-response plan is sending a technician out to the location, running various cleaning tools (AV/AM) to disinfect the system, and ensuring it is fully patched and its AV DAT files are current. Period.

That’s the bulk of our local incident-response plan and procedure.

And I’m now painfully aware that isn’t sufficient.

  • No attempt to first isolate the system and capture an image of it for review.
  • No attempt to determine the date and duration of initial compromise.
  • No attempt to log and capture the malware/virus/trojan/etc.
  • No attempt to determine what (if any) information on the local system might have been compromised or lost.
  • No attempt to analyze the source and vector of the “attack” infection.

None of the standard incident-response actions.
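
To make the "log and capture" item a bit more concrete, here is a minimal sketch in Python of what recording a captured sample might look like.  It is only an illustration and not anything we actually run: the function names (sha256_of_file, log_evidence), the one-JSON-object-per-line log format, and the field names are all my own assumptions.  The idea is simply to hash the captured file and write down who logged it and when, so the sample can be identified again later.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large samples do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def log_evidence(sample_path, log_path, technician, notes=""):
    """Append one JSON line describing a captured file to an evidence log.

    The field names here are hypothetical; a real form would follow
    whatever the organization's incident-response policy requires.
    """
    sample = Path(sample_path)
    entry = {
        "recorded_utc": datetime.now(timezone.utc).isoformat(),
        "technician": technician,
        "file_name": sample.name,
        "size_bytes": sample.stat().st_size,
        "sha256": sha256_of_file(sample),
        "notes": notes,
    }
    # Append-only: earlier entries are never rewritten.
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")
    return entry
```

A technician might call log_evidence("suspect.bin", "evidence.jsonl", "tech1", "found in user temp folder") against a copy of the file, where the file names and note are placeholders.  It doesn't replace imaging the system first, but even this much would leave a record that the current "clean and move on" approach does not.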

Usually only if something really “icky” is found, or IT is independently notified by our inspector general’s division, or a special request for review comes in does our IT team scramble the jets and actively do an “incident” response.  But even then, I sometimes wonder if our response process would meet professional forensic response guidelines.

On most days and in most cases it’s just explore, poke around, “clean”, and if it is really yucky, just off-load the user’s data, wipe the system, reimage it, and put the data back.

Scary, isn’t it?

How much information is lost?  How much “damage” occurs?  What knowledge is lost by the “cleaning and inspection” process performed on the system by our technicians?

How do we find a balance between getting the end-user back up and running quickly for production work versus performing a thorough incident response to assess what (if any) information leak or compromise has occurred?

Meditations

I know from experience that at the root this is a “cultural” issue in our organization.

Our local staff are low in number and we have a ton of work to do.

They haven’t been trained in incident response methodology.

We don’t have (at least at the local level) any process, procedures, or clear expectations for incident response.  In fact, we really haven’t even clearly defined the scope and impact of what constitutes an “incident”.  Clearly, based on our responses, infection of a system with a virus/trojan/worm/rootkit/malware is defined as a removal task, not a potential system compromise requiring incident response.

I, D-Man, Mr. No, and the other senior members of the IT team do care and are sensitive to these matters and want to vastly improve what we do in this area.

We are blessed to have a manager whom we report directly to who is also very sensitive and responsive about these issues.

We just need to do our homework, create an incident response structure and plan that fits our environment, do training, and then foster an ongoing and enhanced sense of incident response and awareness.

Right now I’m culling, printing, and using my “free-time” at work to study up materials, incident response forms, policies and structure from the following sources:

Incident Response Resources – U.S. Security Awareness

Best Practices Guide (BPGL) – FIRST Forum of Incident Response and Security Teams

What got me thinking…

Not too long ago we had an incident where an automatic tripwire alerted me to someone with a Chinese IP address attempting to log onto various network devices.  Even though it was the weekend, I alerted D-Man as well as the network gurus.  It appeared no harm was done, and (apparently) this happens all the time and isn’t that big of a concern.  Based on my own analysis of the event and the sphere of control I have, I proposed making some password and ID changes to the specific devices.  That was acknowledged but changes have yet to be implemented.

I read NASA’s Wayne Hale’s blog post Real Engineers, about the way organizations look at the value of people based not on the roles people play, but on what they can “really” do.

I earned an undergraduate degree in engineering from a prestigious and notoriously competitive university.  After that I went on to do engineering research and complete a graduate degree in engineering from another major university with a reputation for excellence in engineering; along the way I wrote and defended a thesis and authored several papers which were published in professional engineering journals.

When I came to work for NASA, I was fortunate to get a job in the operations area:  mission control.  A thorough understanding of engineering principles and practices was mandatory for my job.

So I was floored just a few months later when I first heard it:  "you are not a real engineer". I was just "an ops guy".

In the NASA pantheon of heros, the highest accolade any employee can be granted is that they are a "real engineer".  Not even astronauts rate higher.  The heart of the organization worships at the altar of engineering:  accomplishment, precision, efficiency.  What does it take to be a "real engineer"?

It’s a great read and while I was originally analyzing it in light of the “forensic examiners now need P.I. certifications” debate going on across states, it struck me that this might apply to our IT culture as well.

Maybe since we don’t see or interact with any “real” security incident responders, we don’t see the importance or value of our role on the front lines in this battle.  Are we just the grunts or infantrymen who go in and take out the enemy pill-box and continue to advance?  It’s the job of military intelligence to collect the trends and larger picture. Clean and move on.

I think that is a dated and dangerous stance if true, particularly on the front lines.  Our technicians play a keystone role in incident response.  Only it looks like very few have realized it yet, and certainly none have drafted a plan for their role in it.

Consider the following recent posts from the computer incident response blogs of Hogfly and Keydet86 (who just happen to be two of the very best of many great incident response blog authors) on the dynamic tug between first responders and incident responders:

I promise, it will make your head spin!

Wish us luck. 

Kicking up this potential ant-pile at work seems like the only responsible thing to do.

I’m in no way saying there isn’t any security awareness at our shop or in our organization at large, or that our technicians are the problem, or that any of the groups or individuals charged with securing and responding to incidents in our system aren’t doing their jobs.  We do have clear policies and our staff work extra hard at doing what they are assigned to do.  I just wonder if it is currently enough (on multiple levels of application) given today’s IT environment and regulatory demands.

I think we need to do more, and particularly at our field-level.  It’s the Sherpas on whom those who climb the highest peaks depend.

And BTW I’m open to suggestions from the professionals on how and where to start this process building and implementation.

Cheers!

--Claus V.

2 comments:

Anonymous said...

Why would you wipe out the data to fix software problems? You should try the Reimage.com technology, they have solved that problem.

Claus said...

@ anonymous - re: "...you should try the Reimage.com technology, they have solved that problem."

Umm. Not really.

Per their website FAQ: "Reimage deactivates viruses and makes sure they will not be executed again after reboot. Reimage also repairs the damage they may cause. However, viruses may remain on the PC, so it is highly recommended to run a good AntiVirus application after the repair in order to clean the hard disk."

See, it's not a software problem we are fixing; it is a system integrity problem, and that is something that Reimage.com just can't deliver. No offense, as I'm sure it is useful for its particular customer base and intended usage.

The fact that it leaves "bits" of the problem behind makes it a non-starter solution.

The whole purpose of our incident response (malware infection) remediation work is to ensure 100% that the compromised system is returned to a secure state. It's not that the system is "broken" in most cases or software doesn't work. It does.

The only way to be sure that there are no remnants of a root-kit, virus, trojan or other malware, despite the best and most attentive cleaning efforts or products, would be to zero-out the entire drive including the MBR and its partition table, reformat, then apply a fresh and pristine corporate image that we have scratch-built and pre-patched ourselves in a protected environment. Anything else leaves bits that might just come back to bite us.

Not even Windows SteadyState or Faronics - Deep Freeze would meet our standards in this particular case.

Thanks for the tip anyway...others might find it useful.

Cheers.

--Claus V.