[NTLUG:Discuss] "You Don't Exist: Go Away!"

MadHat madhat at unspecific.com
Fri Nov 3 07:14:56 CST 2000


Christopher Browne wrote:
> 
> On Thu, 02 Nov 2000 10:25:05 CST, the world broke into rejoicing as
> MadHat <madhat at unspecific.com>  said:
> > MadHat wrote:
> > >
> > > Christopher Browne wrote:
> > > >
> > > > I am having an unfortunate situation where a machine periodically gets
> > > > somewhat 'wedged up' such that:
> > > >   a) Port services that check for user IDs die;
> > > >   b) Permissions on files apparently "disappear";
> > > >   c) Pretty much anything that checks IDs against /etc/passwd gets
> > > >      hosed.
> > > >
> > > > This does _not_ appear to be the result of a hack; it seems moderately
> > > > "time based," probably relating to some resource filling up thereby
> > > > making {utmp|PAM} throw up.
> > > >
> > > > Other interesting facts:
> > > > - It seems to happen _around_ once a day.  But not greatly predictable.
> > > >    Oct 27, 03:38
> > > >    Oct 27, 21:32
> > > >    Oct 30, 08:02
> > > >    Oct 31, about 1:56am
> > > >    Nov 1, between 9:00 and 9:02 pm.
> > > >
> > > > - I don't need to reboot to get everything to "reset;" if I drop to
> > > >   runlevel 1 via "init 1," and then head back to "init 3", this seems
> > > >   to suffice to clear things up.
> > > >
> > > > - Debian Unstable Pretty Much Up To Date.
> > > >   Linux knuth 2.2.14 #5 Sat May 6 07:29:45 CDT 2000 i586 unknown
> > > >
> > > > The two things I've seen looking on Google that match the symptoms are:
> > > >
> > > > a) "Oops.  You deleted /etc/passwd."
> > > >
> > > >    Not the case.
> > > >
> > > > b) Something vague involving utmp being "somehow messed up."
> > > >
> > > > Anyone run into this sort of thing before?
> > >
> > > kind of...  My problem was bad nodes on the drive, but it took a fsck to
> > > fix.  The drive was going bad and was losing data on the section of the
> > > disk that held the /etc.
> > >
> > > Because an 'init 1' & 'init 3' seem to be the problem, that does point
> > > more towards the software not hardware...  What kernel are you running?
> >
> > This should have read: because the init 1 and init 3 seem to _FIX_ the
> > problem, that doesn't point towards hardware, but more towards software.
> >
> > Need more caffeine.
> 
> Need more blood with your caffeine level?  I follow that...
> 
> It certainly seems to be a software issue, and the fact that changing
> runlevels "fixes" it seems suggestive that the problem is not with the kernel.
>  (2.2.14, as mentioned up there somewhere...)

D'OH!!!  sorry...  I was hoping it was something newer, so we could
blame that. [[|:^)

Switching runlevels with init doesn't remount any file systems, correct?
It just starts and stops daemons, right?  What daemons are you running?
Is there something there that might be causing a problem?  Anything
running from cron that might be causing it?
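
If it helps narrow down which daemon or cron job is involved, here is a
rough sketch (untested on your box, and assuming Python is installed) of
a check you could run every minute or so to log the moment UID lookups
stop working.  The names check_uid and log_line are just made up for
this example; pwd.getpwuid is the standard library call, which raises
KeyError when no entry is found.

```python
import os
import pwd
import time


def check_uid(uid, lookup=pwd.getpwuid):
    """Return (True, username) if uid resolves, else (False, error text)."""
    try:
        return True, lookup(uid).pw_name
    except (KeyError, OSError) as exc:
        # KeyError: no passwd entry found for this uid.
        # OSError: the lookup itself failed (assumed possible when
        # the system is short on some resource).
        return False, str(exc)


def log_line(uid, ok, detail, now=None):
    """Format one timestamped log line for a check_uid result."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    status = "ok" if ok else "LOOKUP FAILED"
    return f"{stamp} uid={uid} {status}: {detail}"


if __name__ == "__main__":
    uid = os.getuid()
    ok, detail = check_uid(uid)
    print(log_line(uid, ok, detail))
```

Appending the output to a file would give a timestamp for the first
failure to compare against the cron schedule and the daemon logs.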

Sorry, just thinking out loud.

-- 
MadHat at unspecific.com
                                   "The 3 great virtues of a programmer:
                                      Laziness, Impatience, and Hubris."
                                                 --Larry Wall


