General geekyness

Posted on October 19, 2003

Late last week I had to reboot my server because of a breakdown in Portugal Telecom’s ADSL service, which also affected the other operators running their offers on PT’s infrastructure. (They do that because the telecom market liberalization in Portugal is a joke, the regulatory entity is an even bigger joke, and the conditions for the unbundling of the local loop were not even a joke but just plain sad until very recently. But I digress; I don’t want to get into this rant right now.)

Actually, I didn’t have to reboot, but I was in a hurry and made a bad call… OK, so I screwed up, but hey, “it’s my server and I’ll reboot if I want to, reboot if I want to…”, etc., to the sound of a very well-known tune. :)
Anyway, the point is that I did reboot, and since I had a pretty decent uptime until then, some strange things happened during the weekend that I didn’t have the time to look into until now.

Rebooting a system that has been running for a long time and is being changed constantly is always a scary thing to do, and I should know better by now than to leave changes that need to be taken into account at reboot time half-done (“I’ll add it to the startup sequence later, I gotta go now”). I don’t do this kind of stupid thing with the servers at my job, but hey, at home it is a whole different ballpark, right? I mean, it’s not even the same sport! :)

Well, anyway, I’ve added temperature monitoring to my server (which, incidentally, is both the home desktop machine and the network server where this site and some others are hosted, and yes, this is why they were unavailable this last Friday). I had to tweak the kernel modules somewhat to get the sensors to report the right temperature instead of all those other values that would either turn my CPU into a blazing inferno or give it the potential to go superconductive. Well, after rebooting the PC, the graph I got for the temperature was this:

[This image has been lost in the ether. All the bits have been duly recycled.]

Notice the time at which I remembered to load the modules. Notice also that although I got the power source temperature (blue) OK, the CPU (green) stayed at 0°C (actually it was at -180°C, but mrtg doesn’t show it). Sloppy, sloppy! The reason is that not only must I load the modules, I have to load them with the correct parameters so that they know what kind of sensors they are actually dealing with.

And I had done it all: I had discovered all the correct values for the required parameters after painstakingly trying lots of different combinations (this is an area where documentation is scarce at best, especially where the hardware is concerned). But did I put them in the startup scripts where I should have? No! I left that for “some other day” because I was, quite frankly, tired of dealing with the stupid sensors and their modules. And did I even write a simple script that loaded everything neatly into place, just to remember the correct parameters? Well, darn it, I just completely forgot.
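For what it is worth, that kind of “simple script” can be very small. The sketch below is only an illustration of the idea: it keeps the module list and parameters in one place and prints a shell script of `modprobe` calls that could be run from the startup sequence. The module names and the `sensor_type` parameter are made-up placeholders, not the ones this machine actually needs.

```python
import shlex

# Hypothetical module names and parameters, for illustration only.
# Replace them with whatever the sensor chips on the machine really need.
SENSOR_MODULES = [
    ("i2c-example-bus", {}),
    ("example-sensor-chip", {"sensor_type": "2"}),
]


def modprobe_commands(modules):
    """Return one modprobe argument list per module, in load order."""
    commands = []
    for name, params in modules:
        args = ["modprobe", name]
        # modprobe takes module parameters as key=value arguments.
        args += [f"{key}={value}" for key, value in params.items()]
        commands.append(args)
    return commands


def render_script(modules):
    """Render the commands as a small shell script, without running them."""
    lines = [
        "#!/bin/sh",
        "# Load the sensor modules with the parameters that were found to work.",
    ]
    lines += [shlex.join(command) for command in modprobe_commands(modules)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_script(SENSOR_MODULES), end="")
```

Printing the script instead of calling `modprobe` directly is a deliberate choice here: the output can be reviewed, saved next to the other startup scripts, and it doubles as a written record of the parameters.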

I bet if I were a lawyer or a baker or an accountant or something else, my own personal computer and webserver would get a lot better treatment…
Oh well, life goes on. The regular programme will continue shortly, and this entry will be here forever to remind me that I should be more thorough with my server’s configuration. And I will be, honest. Until next time, that is… :)