Two days clueless about a malfunctioning Apache
Posted on 16 May 2009
Though I mentioned in the introduction that I would refrain from regular blog posts on this website and only show high-quality postings (none available at the moment of this writing, but just hold your breath a while longer, they’re on their way), I have to write about the small events that ruled my life for these last two days. In depth, of course.
The few of you who know their way to my blog by now may have noticed an interruption in the service: host unreachable, is what your browser said. “Failed to start Apache Service” is all I got, and these lousy few lines to keep me company in my thoughts:
Starting the Apache2.2 service
The Apache2.2 service is running.
] Apache/2.2.11 (Win32) configured -- resuming normal operations
[Fri May 15 12:01:22 2009] [notice] Server built: Dec 10 2008 00:10:06
[Fri May 15 12:01:22 2009] [crit] (22)Invalid argument: Parent: Failed to create the child process.
[Fri May 15 12:01:22 2009] [crit] (OS 6)The handle is invalid. : master_main: create child process failed. Exiting.
[Fri May 15 12:01:22 2009] [notice] Parent: Forcing termination of child process 36
Now that was nasty! This happened right in the middle of the day, and neither Google, newsgroups nor forums had any serious clue about this. The results I found were old, not related to my situation (MaxRequestsPerChild had nothing to do with it) or simply dead ends. So what do you do in such a case? There are basically three levels of investigation to resolve it, and if you encounter any seemingly unresolvable problem, I suggest you follow this path:
- Re-install, re-configure, re-start etc.
Roll back your server to the last-known-good configuration (and that is not the same as what Windows thinks is your last-known-good config, because Windows simply takes the config of the last successful boot) and roll back your Apache to a basic install, until you have a working version again. Windows System Restore can be your friend here, but not always.
- Run a system health check.
This is a very important part and often overlooked. Just running a virus scanner or two will not help. Go deep, use professional tools and make sure that your system is sound. Disable anything not absolutely necessary.
- Debug the sources.
If one and two don’t help, your options will grow dim very quickly. If possible, obtain the sources, set up a debug session and/or do a local build of the Apache source. Try to set a breakpoint just before the error arises. Consult experts in the field to help you set up this environment. The prospect of tracking down a possible bug or security hole is usually met with a lot of eagerness by the developers. Make sure to keep them posted on your progress, even if you don’t succeed.
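Whichever level you are at, it helps to reduce the error log to the lines that matter before you start changing things. What follows is only a minimal sketch in Python, assuming the plain Apache 2.2 error log layout shown earlier in this post ([timestamp] [level] message). The names parse_line, at_least and SAMPLE are my own, hypothetical ones and not part of Apache or any library.

```python
import re

# Assumed layout of an Apache 2.2 error log line: [timestamp] [level] message
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\] \[(?P<level>\w+)\] (?P<msg>.*)$")

# Apache LogLevel names, most severe first
LEVELS = ["emerg", "alert", "crit", "error", "warn", "notice", "info", "debug"]

# Sample lines taken from the log in this post
SAMPLE = [
    "[Fri May 15 12:01:22 2009] [notice] Server built: Dec 10 2008 00:10:06",
    "[Fri May 15 12:01:22 2009] [crit] (22)Invalid argument: Parent: Failed to create the child process.",
    "[Fri May 15 12:01:22 2009] [crit] (OS 6)The handle is invalid. : master_main: create child process failed. Exiting.",
    "[Fri May 15 12:01:22 2009] [notice] Parent: Forcing termination of child process 36",
]


def parse_line(line):
    """Return (timestamp, level, message) as strings, or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return match.group("ts"), match.group("level"), match.group("msg")


def at_least(lines, threshold="crit"):
    """Keep only the parsed entries that are at least as severe as threshold."""
    limit = LEVELS.index(threshold)
    hits = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None or parsed[1] not in LEVELS:
            continue  # skip lines that do not follow the assumed layout
        if LEVELS.index(parsed[1]) <= limit:
            hits.append(parsed)
    return hits


if __name__ == "__main__":
    for ts, level, msg in at_least(SAMPLE, "crit"):
        print(ts, level, msg)
```

To use it on a real log, pass the open file instead of SAMPLE, for example at_least(open(path), "error"), where path points to wherever your installation keeps its error log.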
I made the mistake of doing the things mentioned above in the wrong order. Day 1 found me trying to create a new configuration (step 1) and with everything back to default, it still didn’t run. I knew from earlier installs that Apache, on my machine, “just works”. It didn’t, and for the life of me, I didn’t have a clue what change to my system had caused this crash.
Restarting, re-installing, head-scratching: a couple of hours later, still no results. I must admit that I never really gave much attention to the “Restore Point” feature. I found it more a nuisance than a benefit, but this time I decided to really use it and go back to my previous situation. Already excited about having a working Apache again, I decided to choose the restore point just before the last Windows Update. Because we all know that Windows Updates are evil, right? Imagine my disappointment and surprise when it didn’t help.
Right here we’ll have to stop this story for a moment and do a tiny bit of reflection with the famous Frederick P. Brooks (Mythical Man Month) in mind. In one of his articles he quotes a software programmer whose part is broken. The programmer has tried everything and is convinced that “Unix Pipe is broken”. He spends two weeks finding out why Unix Pipe is broken and uses kernel-level debugging and all his knowledge and tools, only to find out that the behavior of Pipe seemed broken because his data contained an illegal character. Two weeks wasted. Moral: there’s a much larger chance that your home-grown software contains a mistake than the time-tested, used-by-millions software of the operating system kernel.
My situation wasn’t much different when I blamed the operating system for disrupting my Apache service. If you don’t see what’s up and you can’t fix it, it must be somewhere in the mysteries and realms of Redmond’s software, right? Wrong! Even though any software has its flaws, an operating system breaking the default behavior of programs that aren’t doing anything special is very unlikely. Apache uses well-proven techniques to interoperate with Microsoft Windows, and the likelihood of such a vital part of the operating system being broken is just, well, small.
Where was I? Blaming the operating system, that’s where I was. I followed that same idea a bit further. The coffee, beer and time of night must have clouded my judgment. I went back even further in the history of my poor system and restored the oldest Restore Point I had, but at the end of the day (actually: night) I found myself still in exactly the same position. A broken Apache on a perfectly working Windows. Or, what was actually broken?
Solution
If you’ve read this far, you’re probably interested in the solution. You either had the problem yourself or you’ve just become curious.
Answer: do not follow my path. In many cases, sticking to the 1-2-3 above is easy and will save you a lot of headaches, but don’t introduce step 4 after step 2: blaming the OS or Apache. And don’t go to step 3 until you have absolutely, positively sorted out all other issues.
In my situation, after hours and hours of debugging (with the added benefit of understanding the internal processes of Apache much better now), I solved the problem by running ComboFix. It is a very powerful tool and should only be used as a last resort: if something is really, really broken and all other means have failed. It runs rootkit checks and fixes registry errors. It backs up all changes, and removes those backups as soon as you run ComboFix /u, which you shouldn’t do until a specialist helps you. ComboFix specialists have had months of training, and even if you consider yourself a power user or a seasoned hacker, these people can still look further and they know how to interpret these logs and backups.
Before resorting to that, you should:
- already have checked your system for viruses, using your own scanner, but also using Kaspersky and/or Comodo, which are free and find much more than the consumer products from Symantec and, God forbid, McAfee.
- already have run the SuperAntiSpyWare and MalwareBytes Anti Malware products. These products are free for home use and together find more garbage on your system than any other commercial product. They are not bloatware like their commercial counterparts are.
- already have monitored your firewall logs or, even better, your external firewall, modem or connection box for unexpected visitors.
If all turns to nothing, consult a ComboFix expert. Some are around at Experts Exchange, others are available through numerous blogs. Follow their instructions carefully and if you are as lucky as I am, you should be fine quickly thereafter.
Cause
Usually I get right to the bottom of something. I spent several days solving this issue and when everything was sorted after running ComboFix, I decided to leave it at that. The logs didn’t reveal enough information anymore because I ran the tool twice. Because no serious threat, rootkit or virus was found on my system, my biggest suspect is a new driver from Microsoft for my nVidia Quadro FX 1700. Although uninstalling it and going back to a previous restore point didn’t solve the problem, it is still possible that the driver installed something that was missed by the restore point process.
I made a very careful new attempt with that same driver a few weeks later: first a full image-to-image backup of my system, then installation of the new driver. No new disaster struck, which left me in the dark, without any new clues.
If you have any clues after reading this story or if you have had the same problems and found a definite cause + solution, I’d really appreciate you sharing your thoughts here (if you don’t want to do it publicly, you can also contact me directly at abel.blog@xs4all.nl).
– Abel –
1 Response to Two days clueless about a malfunctioning Apache
We had this exact same error, and we noticed a couple of weird things. First, running “httpd -e warn start” would make it work. I don’t know why adding the -e warn made a difference, but it did. Eventually, we ended up disabling the exif module in the php.ini file, and that fixed it. Again, no idea why. Hopefully this helps someone.