[NTLUG:Discuss] Software RAID woes
Scott Ripley
sripley at cst.net
Sat Jan 22 10:03:24 CST 2000
Howdy'all,
First off, I have an Abit BP6 mainboard with dual Celeron 366MHz chips
(usually oc'd to 550MHz, but I turned it back down until I get
everything working again). The BP6 has two extra IDE (ATA/66) controllers
off of an HPT366 chip. I have been running the 2.3.x kernels compiled for
SMP, and everything has been fine since I dumped Mandrake and the pgcc
compiler for Slackware and gcc.
Recently, I purchased two IBM 13.5G 7200RPM drives with the intent of
implementing software RAID0 over the disks. This is my problem.
The directions for setting up RAID seem pretty simple. I have recompiled
the kernel with RAID support. I then used mdcreate to make an /etc/mdtab
file, and ran mdadd and mdrun. The entries show up as active in
/proc/mdstat.
twin:~# cat /proc/mdstat
Personalities : [1 linear] [2 raid0] [3 raid1]
read_ahead 128 sectors
md0 : active raid0 hde1 hdg1 131984 blocks 8k chunks
md1 : active raid0 hde2 hdg2 26388432 blocks 8k chunks
md2 : inactive
md3 : inactive
twin:~#
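For anyone who wants to check the state of the arrays without eyeballing the output, here is a rough Python sketch that parses the /proc/mdstat text shown above. It is not part of mdtools; the names parse_mdstat and MDSTAT_SAMPLE are made up for illustration, the sample is simply the output pasted above, and the size conversion assumes mdstat counts 1 KiB blocks.

```python
import re

# Sample copied from the /proc/mdstat output above (mdtools-era format).
MDSTAT_SAMPLE = """\
Personalities : [1 linear] [2 raid0] [3 raid1]
read_ahead 128 sectors
md0 : active raid0 hde1 hdg1 131984 blocks 8k chunks
md1 : active raid0 hde2 hdg2 26388432 blocks 8k chunks
md2 : inactive
md3 : inactive
"""


def parse_mdstat(text):
    """Return a dict of md device name -> parsed fields (hypothetical helper)."""
    arrays = {}
    for line in text.splitlines():
        match = re.match(r"^(md\d+) : (\w+)(.*)$", line)
        if not match:
            continue  # skip the Personalities and read_ahead lines
        name, state, rest = match.groups()
        entry = {"state": state}
        if state == "active":
            # Layout: <level> <member>... <count> blocks <chunk> chunks
            fields = rest.split()
            blocks_at = fields.index("blocks")
            entry["level"] = fields[0]
            entry["members"] = fields[1:blocks_at - 1]
            entry["blocks"] = int(fields[blocks_at - 1])
            entry["chunk"] = fields[blocks_at + 1]
        arrays[name] = entry
    return arrays


if __name__ == "__main__":
    for name, info in parse_mdstat(MDSTAT_SAMPLE).items():
        if info["state"] == "active":
            # Assumption: one mdstat block is 1 KiB.
            gib = info["blocks"] / (1024 * 1024)
            print(name, info["level"], info["members"], f"{gib:.2f} GiB")
        else:
            print(name, info["state"])
```

On a live system the same function could be fed the contents of /proc/mdstat instead of the pasted sample, though other kernel versions may print a different layout.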
Once I try to access a /dev/mdX device to make a filesystem, mke2fs
freezes and can't be killed. I get kernel oopses when I try to remove
the entries from the kernel. The other filesystems will not unmount and
have to be fsck'd on reboot, which is the only way I have found to
get the system back to a working state. Also, if I try to do anything
intensive (a compile) after doing mdadd/mdrun, I get more kernel oopses
and it fails. This happens with both SMP and UP kernels.
Also, I might add that the disks do work fine independently. I can
create and remove filesystems and copy data to/from both /dev/hde and
/dev/hdg before I run the mdadd/mdrun commands. I have not tried to
access the disks directly after running the commands (it seems like a bad
idea to me).
I read through many mail archives in the past 24 hrs, and saw kernel
patches for RAID on 2.2.x kernels. I have seen nothing about patching
2.3.x kernels to get them to work.
The only thing I noticed out of the ordinary on the system is that the
ide2 and ide3 controllers are using the same "IRQ?". I'm not sure if
that is the right term, as it is set to 18.
twin:~$ cat /proc/interrupts
           CPU0       CPU1
  0:      40712      26622   IO-APIC-edge   timer
  1:          2          1   IO-APIC-edge   keyboard
  2:          0          0         XT-PIC   cascade
  9:          0          0   IO-APIC-edge   acpi
 12:          8          4   IO-APIC-edge   PS/2 Mouse
 13:          1          0         XT-PIC   fpu
 14:       3750       1987   IO-APIC-edge   ide0
 15:         16          1   IO-APIC-edge   ide1
 16:         68         59  IO-APIC-level   eth0
 18:          7          7  IO-APIC-level   ide2, ide3
NMI:      67245      67245
LOC:      67254      67252
ERR:          0
twin:~$
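To make the sharing easier to spot, here is a small Python sketch that scans /proc/interrupts text for IRQ lines with more than one device attached. The names shared_irqs and INTERRUPTS_SAMPLE are made up for illustration, and the sample is just the output pasted above; it assumes devices on a shared line are comma-separated, as they are here.

```python
# Sample copied from the /proc/interrupts output above.
INTERRUPTS_SAMPLE = """\
CPU0 CPU1
0: 40712 26622 IO-APIC-edge timer
1: 2 1 IO-APIC-edge keyboard
2: 0 0 XT-PIC cascade
9: 0 0 IO-APIC-edge acpi
12: 8 4 IO-APIC-edge PS/2 Mouse
13: 1 0 XT-PIC fpu
14: 3750 1987 IO-APIC-edge ide0
15: 16 1 IO-APIC-edge ide1
16: 68 59 IO-APIC-level eth0
18: 7 7 IO-APIC-level ide2, ide3
NMI: 67245 67245
LOC: 67254 67252
ERR: 0
"""


def shared_irqs(text):
    """Return {irq: {"trigger": ..., "devices": [...]}} for shared lines only."""
    lines = text.strip().splitlines()
    ncpu = len(lines[0].split())  # header row has one column per CPU
    shared = {}
    for line in lines[1:]:
        key, _, rest = line.partition(":")
        key = key.strip()
        if not key.isdigit():
            continue  # skip the NMI, LOC and ERR summary rows
        fields = rest.split()
        # After the per-CPU counts comes the controller type, then the devices.
        trigger = fields[ncpu]
        devices = [d.strip() for d in " ".join(fields[ncpu + 1:]).split(",")]
        if len(devices) > 1:
            shared[int(key)] = {"trigger": trigger, "devices": devices}
    return shared


if __name__ == "__main__":
    for irq, info in shared_irqs(INTERRUPTS_SAMPLE).items():
        print(f"IRQ {irq} ({info['trigger']}): {', '.join(info['devices'])}")
```

With the sample above, the only shared line it reports is IRQ 18, carrying ide2 and ide3.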
If someone could enlighten me on the ways of SMP IO-APICs (and all the
extra IRQs), that would be great too. I'm an old-school XT/AT guy, and
anything above 15 doesn't quite make sense to me. :) I'm sure they could
share the interrupt under normal circumstances, but would "simultaneous"
access be a problem?
Also, the software package I am using is mdtools V0.41. I have seen
mention of a renamed raidtools V0.90, but I have had no luck in finding
it. This too could be a problem.
I have been looking at this all night, so if any of this doesn't make
sense, I could just be tired.
TIA,
Scott
_____________________________________ ___
/_____ ______________/ _________ \/ /\ Scott Ripley
\____/ /\___________/ /________/ / / / sripley at cyberstation.net
/ /__ ________/ / _________/ /___________
/ / /_/ / __/ /\ \ O| _ \ / ___/ ___ \
/ / __ / __/ / /\ \| | _/ / ___/ __/\
/___/__/ /__/____/___/ / \__\_|_|________/__/\__\_\/
\___\__\/\__\____\___\/ \__\_\_\_______\__\/\__\