I've been dealing with a vendor at work for the last several months, and the more I deal with them, the more amazed I am that they are in business. Not just in business - these dingleberries have managed to get a multimillion-dollar contract for their incredibly lousy software. I mean, we'd be better off without it. It's almost like a virus... no, a virus is successful... this is more like an albatross around our necks. It's crazy.
Anyway... the latest is that they are shipping this software component for Linux and Solaris, and it comes with basic start/stop scripts - pretty routine stuff. The problem is in those start and stop scripts. I suppose it's in their code as well, but I'll give them a pass on that because they wrote the start and stop scripts to keep the software out of a 'bad' state.
Here's what's happening: you start this component and for some reason it gets to thinking that it's stopped and so it tries to restart itself. The problem is that it hasn't really stopped. It's still running. Given that it's listening on a socket, when it 'restarts' it sees that it can't get the socket and not only fails on the restart, it takes the running instance with it as well.
This last little bit can be avoided easily enough by making the app check whether anything is already listening on the socket before trying to set up the listener, but that's something not everyone would think to do, and it's a little more than I'd expect from most vendors. But the script deciding that the app has died when it hasn't is a seriously lame-brained mistake.
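For what it's worth, that kind of check is only a few lines. Here's a minimal sketch in Python, not anything from the vendor's code; the function name `port_in_use` and the default timeout are made up for illustration, and it assumes a plain TCP listener.

```python
import socket


def port_in_use(host, port, timeout=1.0):
    """Return True if something accepts TCP connections on (host, port).

    Hypothetical helper for illustration only: it simply tries to connect
    and reports whether the connection succeeded.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise,
        # instead of raising for ordinary connection failures.
        return probe.connect_ex((host, port)) == 0
```

A check like this is inherently a little racy (something could grab the port between the check and the bind), so the app should still handle a failed bind by backing off quietly rather than tearing anything down.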
I went into the script, spent about 15 minutes clearing out a bunch of the junk and putting in a simple restart sequence that I've used in a lot of Linux services I've built, and started that puppy up. Previously, I couldn't get the component to stay up for 8 hours, and now we're well past three times that.
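This isn't the script itself, just a rough Python sketch of the idea behind a sane restart check: only start a new instance when the old one is really gone. The pidfile convention and the helper names (`read_pid`, `is_running`, `should_start`) are assumptions for illustration, not the vendor's layout.

```python
import os


def read_pid(pidfile):
    """Return the PID stored in pidfile, or None if it is missing or garbage."""
    try:
        with open(pidfile) as fh:
            return int(fh.read().strip())
    except (OSError, ValueError):
        return None


def is_running(pid):
    """Return True if a process with this PID currently exists."""
    if pid is None or pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 checks for existence without sending anything
    except ProcessLookupError:
        return False     # no such process
    except PermissionError:
        return True      # it exists, but belongs to another user
    return True


def should_start(pidfile):
    """Start a new instance only when the recorded one is actually dead."""
    return not is_running(read_pid(pidfile))
```

The point is that "I think it stopped" is never enough on its own; the script asks the operating system before it launches a second copy.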
Their basic components are decent - not worth the money, but they aren't horrible, either. But the scripts are crap. And let's not even get into the tech support that's halfway around the globe. After three days and countless explanations of the problem from me to them, the best their tech support could say was "You must have started two - stop them all and start just one and the problem will go away."
Now, if I were a junior doofus just finishing the "How to Program in 24 Minutes" book, then I could see that being good advice. But since my first email pointed out that I had done just that, it was not the most helpful of advice. Then they had their guru who's in town have a look at it, and he took the changes I made to the script and said "Well... I don't really know the Linux scripts, but the guys back in Greater Gufendorf will have a look."
Yeah, right. I've got stability, and as long as I'm forced to use this garbage, at least I can make it decent enough to use. Holy cow! What a bunch of goobers.