Wednesday, April 22, 2009

Weird but Fixed

Most of the desktop support stuff that crosses my desk turns out to be pretty standard fare.

However, from time to time I get my hands on an odd duckling.

Last week the group members were taking turns trying to address a stubborn Windows 2000 system.

Yeah…W2K…I know….

This particular system is an older Dell Latitude laptop with a special purpose.  It doesn’t connect to our network and is used to run a portable ID-making setup.

Needless to say it is mission-critical hardware.

The user reported that they were suddenly unable to log onto the system.

The Problem

It would boot, take forever to reach the standard Windows 2000 login window (msgina), and toss the following error without giving anyone a chance to actually attempt a logon:

System Cannot Log You on Because Domain <Computername> Is Not Available

Where <Computername> held the device-specific name of this system.

The team determined it had various issues, which they addressed and repaired: bad sectors, a blank/reset of the administrator password, etc.

Alas, they still couldn’t get past the primary logon barrier.

The general consensus was that it was still some kind of password error, but the only solution appeared to be a complete wipe/rebuild.  They had managed to off-load the key data from the system…just in case.

So it was placed on my desk for a solutioning attempt.

The Solution

My goal was to get it operational again without having to reinstall all the applications and database information currently on it.

Normally, when I have a “GINA” issue on our W2K/XP systems, I just swap GINA files, either by replacing the msgina with the Novell NetWare nwgina file or vice versa.  That usually gets me around those issues so I can at least reach the desktop and do software/configuration cleanup.
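For anyone who hasn’t poked at this before, here is a minimal sketch of how to tell which GINA a W2K/XP system is set to load. It assumes the choice is made through the GinaDLL value under the Winlogon registry key, with msgina.dll used when that value is absent. The function names are hypothetical, and the live registry read only works on Windows; this is an illustration, not the exact procedure I follow.

```python
# Sketch only: report which GINA DLL Winlogon is configured to load.
# Assumption: the choice lives in the GinaDLL value under the Winlogon key,
# and msgina.dll is the default when that value is missing or empty.
DEFAULT_GINA = "msgina.dll"
WINLOGON_KEY = r"SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon"


def effective_gina(winlogon_values):
    """Return the GINA DLL name given a dict of Winlogon registry values."""
    configured = str(winlogon_values.get("GinaDLL", "")).strip()
    return configured if configured else DEFAULT_GINA


def read_winlogon_values():
    """Read the live Winlogon values into a dict (Windows only)."""
    import winreg  # imported here so the rest of the sketch runs anywhere

    values = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, WINLOGON_KEY) as key:
        index = 0
        while True:
            try:
                name, data, _kind = winreg.EnumValue(key, index)
            except OSError:  # raised once there are no more values
                break
            values[name] = data
            index += 1
    return values


if __name__ == "__main__":
    # Example with made-up values: a box with the Novell client installed.
    print(effective_gina({"GinaDLL": "nwgina.dll"}))
    # A box with no override falls back to the Microsoft GINA.
    print(effective_gina({}))
```

Keeping the decision logic separate from the registry read means it can be checked against values pulled from an offline hive as well as a live system.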

In this case though, since the system never connects to our network, my go-to files weren’t installed.  Nothing could be done to get past the error and reach an account profile.  It even occurred in Safe Mode.

I could use my Windows PE disks to poke around on the drive contents, but nothing seemed amiss.

I hit the Google and found others (way back when W2K was heavily deployed) who ran into the issue, and though reports of success were low, the suggestions helped point me in the right direction.

I went back into the CD archives and extracted my old Windows 2000 Setup disk with the SP4 slipstream.

I popped it in and ran a Windows “Fast Repair” on the operating system.  I ran through the options, let it scan the system, rebooted, and repeated…with it doing the actual file replacements on the second go-round.

Rebooted again.

The Windows GINA popped up almost immediately and I was able to log on to the desktop with no errors.  Applications and data all present and accounted for.

Victory!

Cleanup…

I knew this action had rolled back all the security patches and updates and left it at a fresh SP4 state.

To avoid having to connect it to our network, I ran the most recent Heise Offline Update tool on my own system, selected the options needed to build a Windows 2000 update CD, and let it pull down the updates and create the CD ISO file.  I then mounted the ISO as a virtual drive and copied the folder structure to my USB stick.
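The copy step is nothing fancy. As a rough sketch, assuming the ISO is mounted at one path and the USB stick at another (both paths below are hypothetical, as is the function name), the folder structure can be carried over intact like this:

```python
import shutil
from pathlib import Path


def copy_update_tree(iso_root, usb_dir):
    """Copy everything under iso_root into usb_dir, keeping the folder layout.

    Returns the relative paths of all files found under usb_dir afterwards.
    """
    iso_root, usb_dir = Path(iso_root), Path(usb_dir)
    # dirs_exist_ok (Python 3.8+) lets the copy land in an existing folder.
    shutil.copytree(iso_root, usb_dir, dirs_exist_ok=True)
    return sorted(
        p.relative_to(usb_dir).as_posix()
        for p in usb_dir.rglob("*")
        if p.is_file()
    )


if __name__ == "__main__":
    # Hypothetical drive letters: G: is the mounted ISO, E: is the USB stick.
    for name in copy_update_tree("G:/", "E:/w2k-updates"):
        print(name)
```

The returned file list makes it easy to eyeball that nothing was skipped before walking the stick over to the offline machine.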

Then I popped the USB stick into the system and ran the “offline” updater tool.

Three more reboots/reruns later and the system was as patched as it could be.

(I chose to do this instead of burning the ISO to a hard-CD as it seemed wasteful for a one-shot system fix.)

Now that everything was repaired and stable, I tweaked the login controls to force the OS to require a CTRL-ALT-DEL key-press before starting the login procedure, and set the option requiring the user to enter an ID/password to log into the system.  (I know, I was as surprised as anyone that this hadn’t been done by whoever set up such a critical system.)

Since everyone had also been diagnosing that the drive was failing (because they had earlier found file-system errors), I took a moment to look at that.
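These two settings can also be expressed at the registry level. The sketch below assumes they correspond to the DisableCAD and AutoAdminLogon values under the Winlogon key; it only builds the text of a .reg file so the result can be reviewed before anything is imported. The function name is hypothetical, and this is one possible approach rather than a record of the exact clicks I made.

```python
# Sketch only: build the text of a .reg file that turns the CTRL-ALT-DEL
# requirement on and automatic logon off.
# Assumption: both settings are controlled by values under the Winlogon key.
WINLOGON = r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon"


def build_login_reg():
    """Return .reg file text requiring CTRL-ALT-DEL and a typed ID/password."""
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{WINLOGON}]",
        '"DisableCAD"=dword:00000000',  # 0 = CTRL-ALT-DEL is required
        '"AutoAdminLogon"="0"',         # "0" = no automatic logon
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    print(build_login_reg())
```

Generating the file rather than writing to the registry directly keeps the change easy to inspect and easy to carry to an offline machine on a USB stick.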

The disk-health parameters reported by the SMART system on the drive all came back nominal.
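For context on what “nominal” means here: each SMART attribute reports a normalized value alongside a vendor-set threshold, and an attribute counts as failing once its value drops to or below that threshold. A minimal sketch of that check follows. The attribute values are made up for illustration and are not the readings from this drive, and the names SmartAttr and failing_attributes are hypothetical.

```python
from typing import NamedTuple


class SmartAttr(NamedTuple):
    name: str
    value: int      # normalized current value (higher is healthier)
    threshold: int  # vendor failure threshold (0 = informational only)


def failing_attributes(attrs):
    """Return names of attributes whose value is at or below its threshold."""
    return [a.name for a in attrs if a.threshold > 0 and a.value <= a.threshold]


if __name__ == "__main__":
    # Made-up sample readings, purely illustrative.
    sample = [
        SmartAttr("Reallocated_Sector_Ct", 100, 36),
        SmartAttr("Spin_Retry_Count", 100, 97),
        SmartAttr("Power_On_Hours", 72, 0),
    ]
    bad = failing_attributes(sample)
    print("nominal" if not bad else "failing: " + ", ".join(bad))
```

Attributes with a zero threshold are skipped because they are informational and never trip a failure on their own.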

I also ran a sector-error scan and none were found.

While these are no guarantees of future performance, it seems that things are currently well enough to send it back to the user.

Once assured that all was back to normal, I shut it down, rebooted, and captured an image of the system, just in case.

Based on what I observed, it appears that for some reason some key networking files on the OS had become corrupted/scrambled. 

The system-repair set things in order again.

In the meantime, we are requisitioning a new laptop to replace this old one.  This way we can get an updated OS (XP Pro) on it, get it locked down and secured, whole-disk-encrypted, and be more confident that hardware issues won’t come back to bite us again with it.

--Claus V.
