------------------------------------------------------------------------
NOV-HDW4.DOC -- 19980311 -- Email thread on NetWare file server hardware
------------------------------------------------------------------------

Feel free to add or edit this document and then email it back to
faq@jelyon.com

Date: Sat, 18 Jan 1997 10:00:37 -0600
From: Joe Doupnik
Subject: That 64MB barrier, again

There has been a rash of msgs on servers seeing only 64MB while more
SIMMs are actually installed. This is a short note on possible reasons why.

NetWare does not blindly test memory when it starts. To do so is
dangerous to the health of the system. It calls on the system BIOS to
report its memory capacity. The normal call is Int 15h, function 88h, which
returns in 16 bit register AX the number of 1KB blocks of memory above
1MB. Simple binary math says that can count only to 64MB.

The system's CMOS memory may hold more information from the original
Power On Self Test (POST) cold boot probes, if we could get a decent answer.
EISA bus machines use this CMOS technique, and NetWare obeys the findings.
For EISA bus (and perhaps others) the memory capacity must be set into CMOS
by running a configuration utility shipped with the motherboard; automatic
detection is most often wrong.

Cold boot tests are not what callers see. HIMEM.SYS can get memory size
information by probing, and it may (not always) make it available to the
Bios for NW to see. DPMI memory management code is able to do this, but we
aren't running that stuff on server machines.

Recent patches to NW should help. Here is a snippet from file
updates\nwos\nw410\410pt6.exe:

LOADER.EXE
==========
SYMPTOM:
3) ADDED support for using the new BIOS call int 15 sub function
E8 for memory detection on PCI machines and other ISA machines
that have more that 64 Meg of memory.

If your machine's Bios does not support the call it may not be able
to reveal more than 64MB to NetWare. A Bios upgrade from the maker might
help.
Joe D.
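
The "simple binary math" behind the 64MB ceiling can be checked with a few lines of Python. This is only an arithmetic sketch: the function name and its parameters are made up for illustration, and nothing here calls a real BIOS. It assumes, as the note above says, a 16 bit count of 1KB blocks above the first 1MB.

```python
KB = 1024
MB = 1024 * KB

def reportable_ceiling(register_bits=16, block_bytes=KB, base_bytes=1 * MB):
    """Return (bytes above the base, total bytes) for the largest value
    a register of the given width can report in fixed-size blocks."""
    max_blocks = (1 << register_bits) - 1   # largest count a 16 bit AX can hold
    extended = max_blocks * block_bytes     # memory above the 1MB line
    return extended, base_bytes + extended

extended, total = reportable_ceiling()
print(extended // KB, "KB reportable above 1MB")
print(round(total / MB, 3), "MB visible in total")
```

The count tops out one block short of 64MB above the 1MB line, so any SIMMs beyond that stay invisible to a caller that relies on this call alone.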
------------------------------

Date: Sun, 9 Feb 1997 17:19:54 -0600
From: Joe Doupnik
Subject: Re: Spurs from T-pieces

>Sorry to continue this thread after a break but I wish to clarify
>something.
>
>I have checked my sources and we can agree it is a digital signal.

Mother nature has not the slightest notion of "analog" versus
"digital" outside of quantum mechanics and the two terms have not a
thing to do with transmission lines. They are human terms, not physical
phenomena, for us. Best to select another set of sources, or be more
discriminating in the question (always a problem for those old guys in
Greece consulting oracles).

>It is not RF, it is electricity albeit in bursts.

Please, this is without meaning. The signals obey Maxwell's equations
(as do all electrical phenomena) and they propagate in free space. They
are radio waves, when the terms are used accurately. "Radio" of course first
meant comms for people talk, the ordinary stuff, but it has a much broader
meaning to those dealing with electrical waves: electrical waves traveling
through free space.

>The outer sheath reduces attenuation for the signal allowing greater
>distances to be achieved than UTP for example which relies on the twists
>in the copper wiring (conductor) to reduce the attenuation.

False. Shielding is not attenuation. Coax is self shielding because
the magnetic field, as seen at a distance, from one conductor is exactly the
opposite of that from the other conductor, by co-axial symmetry. Their
effects thus cancel and yield no leakage. Coax has losses from resistance
in the conductors (skin effect is important) and dielectric losses in the
insulation separating conductors. Twisted pair has losses from these two
plus from fields induced in neighboring objects.

Twisting open wires is but an approximation to self-shielding and it
fails badly when observed close to the wires. But it's CHEAP.
The first guess is that its attenuation is lower than coax, if the
surrounding medium is air or other non-conductor. Its loss is higher than
coax if the surrounding medium is conductive and lossy. Twisted pair leaks
like a sieve, which is the point here. Ever hear of the lan wiring spec
named NEXT? Near End CrossTalk: leakage from transmit pair to receive pair.

>A few points about RF: it is not absorbed by a resistor as electrical
>energy is. If you had a cable without

Quite wrong. Ohm's law is obeyed in detail by any and every
electrical signal, period.

>a terminator on the end it would not be reflected back because of an
>open circuit it would flow out the end of the cable, an antenna after
>all, is a wave guide with holes in it. Co-axial cable therefore, is not
>being used as a waveguide but as a conductor of electrical energy.

Coax is a transmission line. Waveguides are transmission lines with
an outer but no inner conductor. Twisted pair conductors are another
transmission line. Traveling waves, and that is what we are dealing with,
are sensitive to changes in impedance of the medium and those changes
produce reflections. Missing terminators are vast changes in impedance and
lead to 100% reflection. So do short circuits. So do dings in the wire from
running chairs over it, or tying it in knots.

>I agree that changing the impedance of the cable will alter its ability
>to carry the signal but that signal is still an electrical one.

Hmmm, one wonders what that sentence is supposed to convey.

>The amount of data carried in a signal does not change its physical
>properties thus an electrical signal is still an electrical signal even
>if it is carrying data at 10 or 100Mbps.

Ditto.

>Imagine a $40 NIC that can not only be switched from using a 10baseT
>connector to BNC but its TX/RX being capable of switching from producing
>electrical digital signals to analog RF energy - not for $40 i'm afraid.

Ditto squared.
May I suggest you look at common Ethernet boards and notice many have a BNC connector, a 10BaseT connector, and often a transceiver (AUI) connector as well. All for US$40. I can appreciate trying to fit vague but heartfelt descriptions of phenomena into familiar patterns. But that does not imply the phenomena works that way. Your feeling about what's RF is more like tuning in a radio station because that's the vague but h.f. analogy that you are familiar with: a carrier signal with imposed modulation. Alas, the real world isn't that simple. One of the triumphs of 19th Century physics was the culmination of a couple hundred years work, thought, and experiment into a unified electrical field theory by Clerk Maxwell. It is well worth reading about in detail at a non-specialist's level if you can find some good history of science books in the library. The subject is one of action at a distance via charged bodies, and the results continue to boggle the mind when given deep consideration. An introductory Physics text book is also a place to become familiar with electrical theory. One does not have to be an Electrical Engineer or Physicist to read and understand the basics of this material; just be attentive, discriminating, and keep an open mind. Joe D. ------------------------------ Date: Tue, 18 Feb 1997 00:29:05 +0100 From: "Arthur B." To: Subject: Re: device deactivated >I have a problem with a Western Digital AC31600 EIDE hard drive on my >home netware 4.1 file server. Last week after I downed it in the usual >manner after being up for 30 days, it was brought back up and to my >horror these messages greeted me: > > *****Reading in FAT > *****Verifying Directory FAT chain > *****Scanning the Directory Most probably you do have a harddisk failure. If you didn't backup your harddisk lately then the following might help: Find an identical harddisk and unscrew the printboard that is part of it. Replace the faulty one with it. Try the (formerly?) bad hard disk again. 
I've found this to help in many home situations. The reason this works is because most of the time harddisc failures of this sort are a result of a bad resistor and/or IC on the printboard. Which means the actual data on the harddisk isn't damaged one bit. You still need to replace the harddisk... ------------------------------ Date: Tue, 25 Feb 1997 07:32:56 -0500 From: Dennis Large To: netw4-l@ecnet.net Subject: Re: Comments Please -our NW411 design >We will be in the process of upgrading all of our 3.12 servers to >Netware 4.11 in the next two weeks. I have a couple of questions and >also would like comments. > >First: >NDS tree struct. Interesting layout. I'd review things just to make sure that access to apps, printers, shared data other than their one class, etc weren't a major pain to manage. What you show seems workable though. >Tree = devry > o=yyc [stands for Calgary in airport terms like LAX etc] > ou=staff > ou=faculty > ou=admin ... etc (happy with the staff one) > ou=student (this may need some work) > ou=CIS > ou=bisop > ou=eet An ou for each course > ou=et > ou=cab > ou=disable (disabled accounts ie. dropouts) > >Under each course ou is another series of ou ie: >The date ou is the start date of that particular course section. > ou=cis > ou=1195 > cn=students accts. > ou=0396 > cn=students accounts. > ou=eet > ... etc > >Each course here is about three years in length. It is our intention >to put accounts in under the dates where the students enrolled >(three starting dates per year - spring, summer & winter). >This start date would then remain with the student until they >graduate. After graduation we could then delete the entire ou. > >I'm fairly comfortable with this scenario but if anyone has other >NDS design ideas feel free to comment. Where in the tree were the servers going? Sounds like you're planning for them to be fairly high up in the tree, which is appropriate in my mind. 
If you're planning on bringing in existing accounts from a 3.x, you'll want to look at the DSMIGRATE with 4.11. Should simply things a bunch. May not apply though if you're planning an inplace upgrade, at least on the first server. >Other questions. >on our new server I have about 13 gigs available (in a RAID 5 array) >and would like to have the following struct. >sys vol: 700 megs - queues go here no quotas set up (or a big quota) >user vol: 5 gigs - quota is 2.5 megs per user >apps vol: the rest of the space - quota = 0 megs Sounds like a tad overkill for standard office apps, but not if courseware is to be included. >The users volume will hold the the home directories in this format >/root > /home > /bisop > /home dirs > /cis > /home dirs > etc. >I have heard that there may be a CPU utilization problem on large >block sizes with many tiny files and suballocation turned on. >Most of the home dirs will contain smaller files (like mail) and each >user is limited to 2.5 megs of space in their home dir (1700 users * >2.5 megs). > >Should I go with an 8, 16 or 32k block size in the users volume? Normally I stick with 64k, unless you know, as you do, something about the expected files to be stored. Having not had any problems with suballoc, I'd stay with 16k if not 32k. >We have an older server (90MHz pentium, 116 Megs ram, and 8 gigs of >usable space in a RAID 5 array). > >Anyone have any suggestions as to how I can set these two servers up >so I can kind of balance the load between both? (new server is a >P. Pro 200, 160 meg memory, 13 gig hd space). > >I have thought of moving some of the home dirs to the old server. >Will this affect the Mercury /Pegasus mail we now use? (will some >have to get new e-mail addresses? I don't have a mail hub set up >now/yet so mail addresses are based on which server their on. > >I will be installing apps on both servers. Our plan is to use NAL to >load balance the applications. Excellent idea. 
Haven't worked that much with pmail, but if you can manage to keep all the users on one server, maintenance and mgmt is going to be a little simpler. >Have I overlooked anything? Probably but I'll send this out anyway. RAID's great, but it still needs a backup system. And don't scrimp. ------------------------------ Date: Wed, 5 Mar 1997 08:38:49 -0700 From: George Taylor Subject: Re: Fiber-optic info >We had planned to run a thin coax "backbone" between our computer room >and a hub located in the aforementioned areas (staff and patron areas). >A person in the county's ISS department has decided that, regardless of >the situation, he wants to run fiber (another bright person carrying an >ego-flag at taxpayer expense). I've worked with fiber as a cabling tech. >many years ago. But, I have not worked with it recently. I was looking >for a good source of info on the WWW, and didn't find one. Does anybody >know of a good site for info on fiber and it's use with ethernet? Charles Spugeon at Utexas has a pretty nice Quick Reference for Ethernet out on the Web. Look at: http://wwwhost.ots.utexas.edu/ethernet/10quickref The section you'll be most interested in is chapter 6-1 You may also want to look at a couple commercial cable sites. SIECOR comes to mind. --------- Date: Wed, 5 Mar 1997 16:23:24 -0500 From: Jason Lester Subject: Re: Fiber-optic info There are several vendors sites that provide some info. Try: www.panduit.com (cabling vendor) www.fisfiber.com (tools and supplies) www.siemon.com (cabling vendor) www.fons.com (cabling vendor) An even better solution is to order some catalogs from one of the companies above. Many of them have excellent documentation. If you have any specific questions, send me an e-mail, I'm pretty familiar with Ethernet over fiber. 
------------------------------ Date: Tue, 4 Mar 1997 13:33:03 -0500 From: Bud Durland To: "'netw4-l@ecnet.net'" Subject: RE: 3.12 lost hardware interrupts >I keep getting an error on one of my 3.12 servers stating: >error 1.1.140 Primary interrupt controller detected a lost hardware >interrupt. I know I can turn off the error, but I would like to be >able to troubleshoot it while the server is up, and if that is impossible, >then what would be a good procedure some time when I can take it down. >I have been warned by HP to not use IRQ's 2,9, and 15. Novell has an extensive technote on this on their web site (sorry, I don't have the URL). Basically, though, this message is practically inescapable if you are using anything on IRQ 15, and is common if you are using IRQ 9. If you are sure there are no IRQ conflicts, this message should be harmless. ------------------------------ Date: Thu, 6 Mar 1997 08:48:35 -0500 From: "Eric Horde" To: Subject: Addition to section P.2 of the main NOVELL.FAQ file Regarding the message: "Primary Interrupt Controller Detected A Lost Hardware Interrupt" You will see this error when the NOS can't respond to the card bus in time to answer the interrupt being generated. This is most common when you mix and match PCI with EISA and ISA. The PCI bus will get priority over the EISA and the ISA. If the Network cards in a mixed bus system are other than PCI you may have a nightmare on your hands. ------------------------------ Date: Thu, 13 Mar 1997 10:15:14 -0600 From: "Mike Avery" To: netw4-l@ecnet.net Subject: Re: Viewing Interrupts >Does anybody know of a way to check on which interrupts are assigned >to what, while the server is up and running. I am still having lost >hardware interrupt problems and need to know on the fly if interrupt >15 or 7 are being used by something other than NetWare. Sometimes it's worth tracking down every problem and "making it right". Sometimes it's not. 
The lost hardware interrupt problem seems to occur mostly on some
mask sets of the Intel 486 processor. A friend spent about 6 months
tracking it down, and in the end weaseled his way into the Novell
Developers program. Once he was in touch with Novell developers,
they told him to turn the message off. Some CPU's and some
motherboards have the problem, and there ain't nothin' you can do
about it. (I paraphrased what they told him.)

Intel, of course, insists that their CPU's are fine.

In the interim, I suggest turning off the message with a set command.

Another SET command I find helpful to turn OFF is the "Beep on
errors". Running a speaker takes a remarkable amount of CPU time,
and while the CPU is making the speaker go "beep", nothing else is
happening. If you get a problem that causes a stream of errors, it's
REAL hard to deal with it as all the CPU time is being poured into
the speaker.

---------

Date: Thu, 13 Mar 1997 20:13:51 -0600
From: Darwin Collins
To: netw4-l@ecnet.net
Subject: Re: Viewing Interrupts

>The lost hardware interrupt problem seems to occur mostly on some
>mask sets of the Intel 486 processor. A friend spent about 6 months
>tracking it down, and in the end weaseled his way into the Novell
>Developers program. Once he was in touch with Novell developers,
>they told him to turn the message off. Some CPU's and some
>motherboards have the problem, and there ain't nothin' you can do
>about it.

Old memories. If it is the same 'lost interrupt' that we ran into.
It was based on an early 486 mask type. Basically, Intel knew of the
problem, and passed it on to the motherboard makers, which should have
put a workaround in place. Some motherboard makers didn't put it in
place. I remember the AMI Enterprise motherboard didn't. The Mylex did
(around 1992).

I remember one weekend, we 'manually' upgraded some motherboards with
processors that an Intel rep gave us. It did fix the problem, but we
managed to break some diodes on one of them.
(oh, the days before ZIF sockets) ------------------------------ Date: Tue, 18 Mar 1997 13:36:50 +0100 From: Hans Nellissen Subject: Re: Certified Lists >Does Novell put out a Certified Device list for CDROMs? I've been able >to find bits and pieces, but a complete list would be better. Take a look at: http://labs.novell.com/ ------------------------------ Date: Sun, 30 Mar 1997 15:51:35 -0600 From: Darwin Collins To: netw4-l@ecnet.net Subject: Re: Server with 2 network cards I went to a 'developing for multi-processor' class last week. Basically, with current technology (Netware 4.x and NT), it is better to use single processor instead of multi-processor, unless you have an application that is written for it. Whats strange is that ZiffDavis benchmarks are written better for MP than most applications. (except: Oracle) This item, makes me wonder... basically, NW 4.1 beat NT 4.0 on all cases including multiprocessor. So... could the 'benchmark program' just be too good? With MP, you have to 'program in' spinlocks and other 'locks', for Cache contention, Bus contention, and even Memory contention. Each processor must also 'snoop' into another processors 'cache' to ensure that it will not be accessing the same memory areas. The 'Pro' s can handle multi-processing better, and don't take as much of a hit when there is a cache miss (or cache snoop). (The 'penalty' was anywhere from 10 to 100 times slower) One item, that was kinda interesting, is that the instructor suggested that we do use the MMX processors for the 'non-MMX', because they have more cache ram. (the graphics code doesn't help any) With File/Print services, its mostly 'IO bound', so the 'work' is piped thru 'Processor 0' (first). With the next version of Netware (Moab), it will be 'yet' more interesting, since the MP code was totally redone. 
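
For readers who have not met the term, here is a rough sketch of what "programming in" a spinlock means. It is written in Python purely to show the idea and has nothing to do with how NetWare or NT implement their locks; the SpinLock class and run_workers helper are hypothetical names, and a standard-library lock polled without blocking stands in for the atomic test-and-set instruction a real kernel would use.

```python
import threading

class SpinLock:
    """Busy-wait lock: a caller keeps retrying until it wins the flag."""

    def __init__(self):
        # Stand-in for an atomic test-and-set memory word.
        self._flag = threading.Lock()

    def acquire(self):
        # Spin: retry the non-blocking test-and-set until it succeeds.
        while not self._flag.acquire(blocking=False):
            pass

    def release(self):
        self._flag.release()

def run_workers(n_threads=4, n_increments=2000):
    """Have several threads bump one shared counter under the spinlock."""
    lock = SpinLock()
    total = {"count": 0}

    def worker():
        for _ in range(n_increments):
            lock.acquire()
            try:
                total["count"] += 1   # the protected shared update
            finally:
                lock.release()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total["count"]

print(run_workers())
```

Every pass through the spin loop is processor time spent waiting rather than working, which is one way to picture the contention penalties described above.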
--------- Date: Sun, 30 Mar 97 22:44:14 -0800 From: Randy Grein To: Subject: Re: Server with 2 network cards >I went to a 'developing for multi-processor' class last week. > >Basically, with current technology (Netware 4.x and NT), it is better >to use single processor instead of multi-processor, unless you have >an application that is written for it. Whats strange is that ZiffDavis >benchmarks are written better for MP than most applications. (except: >Oracle) This item, makes me wonder... basically, NW 4.1 beat NT 4.0 on >all cases including multiprocessor. So... could the 'benchmark >program' just be too good? I don't think so. Oracle has been touting the Netware Oracle server as being a superior performance configuration in comparison to NT, and several other applications appear to perform similarly on Netware. It really looks like NT simply is not as efficient with CPU cycles. Given MS's insistence on hardware abstraction, I'm not too suprised. >With the next version of Netware (Moab), it will be 'yet' more >interesting, since the MP code was totally redone. I sat in a session for the new scheduler. It looks pretty interesting, and should address a number of problems regarding performance, memory protection and backwards compatibility. ------------------------------ Date: Fri, 9 May 1997 15:51:18 -0600 From: Joe Doupnik Subject: Re: Intel Venus MB and Adaptec 2940uw Controller >I had an Intel Pentium Pro 200 venus motherboard with an adaptec 2940uw >controller and used the controller for a user volume (sys was IDE). it >worked great. I moved the controller to another Intel Pentium Pro 200 venus >motherboard and when I install NW4.11, it dismounts the SYS when copying the >netwareip software and I can't remount the sys volume. I get alot of write >errors on the FAT table from the system console. I read here about the >1540, but not the 2940uw. I haved tried it in a middle PCI slot and an end >one (no diff). I have tried completely different hard drives, too. 
Another >weird problem is when installing NW4.11, It says that my DOS partition is >only 3mb do I want to continue (even though it's actually 50mb). I have >downloaded the latest drivers from novell and adaptec for this and that >doesn't work either. The standard SCSI suggestions apply: detune SCSI rates, ensure termination really is proper and good, ensure cables really are good, turn on parity, temporarily turn off sync negotiation as a test, don't mix disk and CD-ROM (or other SCSI weak-sister) units on the same bus, and so on. Once the real cause has been discovered one can restore many settings. Also, look hard and long at the motherboard configuration and Bios setup parameters. Be conservative. Watch those memory SIMMs because flakey ones can wreck havoc. >One thing I just thought of is I'm using Micropolis 4G drive for the sys vol >(previously using a seagate 9G for users and 3G seagate IDE for sys). Do >you think this would matter? Does anyone have good/bad information about >this combination? I've talked to Micropolis tech support and now they want >me to find out what bios version my drives have (just bought the drives). >If this integration is going to be such a pain, I'm wondering whether I >should return the Micropolis for Seagate drives. I've never had a problem >with those. You should talk again with Micropolis about firmware upgrades, and about the mandatory turning off of disk caching (if any). High tech does not necessarily mean plug and play, most often just the opposite. Joe D. ------------------------------ From: Peter McCoy To: "'floyd@direct.ca'" Subject: Novell SFT-III Date: Thu, 22 May 1997 15:18:10 -0500 In section J.13.2 SFT-III and RAID 5, you are correct about the different implementations of RAID in terms of speed. However, this is only the case if you are implementing RAID via software. Using hardware, RAID 0 (striping) is the fastest possible RAID configuration. 
If you have a 5 channel controller and one hard disc on each channel,
RAID 0 divides each piece of information in 5 sections and writes a
different piece to each disc. Since each disk can sustain roughly 20 MB/s,
the discs as a whole can sustain 100 MB/s. Now, the bottleneck resides in
the Host channel capable of only 40 MB/s (Ultra Wide SCSI spec.).

RAID levels 3 (sometimes called level 4) and 5 are the next fastest.
RAID 3 optimizes for servers containing large files and RAID 5 optimizes
for servers containing small files.

Disc mirroring is the slowest since 2 discs allow for only 2 channels.

When most people think of RAID they think of fault tolerance. This is
only half of the benefit. Since hard discs are a common bottleneck, RAID
boosts the speed necessary for demanding server applications.

------------------------------

Date: Mon, 2 Jun 1997 02:28:11 +0200
From: "Arthur B."
Subject: Re: Need help with error message on 3.12 server

>Has anyone come across this error message on a NW 3.12 server:
>
>1.1.141 Secondary interrupt controller detected a lost hardware interrupt
>
>The server locks up about 3 minutes after this message appears.
>
>The server is an IBM PC server 300 running NW 3.12 with the 312PTA patch.
>It has one SCSI drive and one IDE drive both connected to internal
>controllers.

Most probably it boils down to rethinking which card has which IRQ.
Each motherboard with 16 IRQ's (0 - 15) has two controllers.
One handles 0-7, the other 8-15. IRQ2/9 is the 'bridge' between them.

*IRQ list in order of highest priority*
IRQ0   timer
IRQ1   keyboard
IRQ2   I/O channel
IRQ8   real-time clock
IRQ9   points to IRQ2
IRQ10  reserved (default PnP IRQ)
IRQ11  reserved
IRQ12  reserved (PS/2 mouse)
IRQ13  co-processor
IRQ14  first HDU controller
IRQ15  second HDU controller
IRQ3   COM2, COM4
IRQ4   COM1, COM3
IRQ5   LPT2
IRQ6   FDU
IRQ7   LPT1

As you can see, if you have a NIC on IRQ7 it would have the lowest
priority within the system.
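
The ordering in that list follows from the way the second controller hangs off input 2 of the first. A small Python sketch reproduces it; the function name and parameters are made up for illustration, and it assumes the controllers run in their usual fixed-priority mode, where the lowest-numbered input on each chip wins. IRQ2 itself is left out of the result because it serves as the bridge input rather than a free line.

```python
def cascaded_priority(cascade_input=2, lines_per_chip=8):
    """IRQ numbers from highest to lowest priority for two cascaded
    interrupt controllers (fixed-priority mode assumed)."""
    order = []
    for line in range(lines_per_chip):
        if line == cascade_input:
            # The second controller's lines (8-15) are all serviced at
            # the priority of the input they are cascaded through.
            order.extend(range(lines_per_chip, 2 * lines_per_chip))
        else:
            order.append(line)
    return order

print(cascaded_priority())
```

With the defaults this returns 0, 1, then 8 through 15, then 3 through 7, matching the list above with IRQ7 at the bottom.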
Look into the FAQ that comes with this list (actually you must download it yourself). It has extra info about this subject if I remember correctly. --------- Date: Mon, 2 Jun 1997 19:35:17 -0600 From: Joe Doupnik Subject: Re: Need help with error message on 3.12 server >>Has anyone come across this error message on a NW 3.12 server: >> >>1.1.141 Secondary interrupt controller detected a lost hardware interrupt >> >>The server locks up about 3 minutes after this message appears. >I saw a memo on the ADAPTEC site that made mention >that IRQ 15 is used by Netware internally for housekeeping. >I wish I could remember where I saw it, I believe it was in >the FAQ's. I'll bet money a certain JoeD has a clue. -------- Yes, Novell has notes on this. The idea, as I recall, is if a device requests an interrupt but clears its IRQ wire too quickly the interrupt controller chip has not latched the event and is unable to decide who caused the interrupt. By default such an event is assigned to the highest numbered one on the chip (which means IRQ 7 or IRQ 15). There's more to the story but this is a rough outline (working from vague memory). Muddying the waters is emulation of 8259 interrupt controller chips in the big square bus adapter chips; emulation has always been an act of approximation. [I was digging into Intel spec sheets on two this morning.] In practice noisy buses, flakey boards, failing power supplies, failed fans, poor bus timing, and acts of nature all conspire to generate this lost interrupt condition. By placing a real board on IRQ 7/15 the system has to work harder to discover the problem, hence Novell's advice to leave them unused. I often put boards on IRQ 7/15 (having no other choice) and difficulties have not arisen. But then the error message had not appeared either. By now the incidence of such events is small, as board quality has improved. The lockup cited above suggests real hardware problems, anywhere in the machine. 
The FAQ has two files holding "lost interrupt" (using grep on the collection), one in nov-hdw4.doc to the point and other brief one in novell.faq. Somewhere deep on CD-ROM I have the full doc from Novell on the matter, but finding it is asking too much right now. Joe D. ------------------------------ Date: Mon, 2 Jun 1997 11:00:08 -0600 From: Joe Doupnik Subject: Re: HAM vs DSK drivers? >Does anyone have anything to say about the newer .HAM drivers vs the >.DSK drivers. I've got a support person who says that he's moving >things across to the new HAMs, but I haven't heard much about them. >Comments? -------- In principle, as they say, the .HAM stuff should be "better" because it is newer. But that remains to be seen. Arcada/Seagate's BackupExec docs say stay with the .DSK material for best stability at this time. Both seem to work ok here, and the .HAM variety seems smarter about recognizing new devices on the SCSI bus without rebooting the server (such as plugging in a CD-ROM player). Choose whichever is the most robust in your situation. Joe D. ------------------------------ Date: Thu, 12 Jun 1997 10:52:27 -0600 From: Joe Doupnik Subject: Re: Mirrored partitions becoming un-mirrored >We are running 2 Netware 3.12 servers with the following setup: > >A Seagate SCSI hard drive and a Quantum SCSI hard drive running off of >the servers on-board SCSI controller, and identical Seagate and Quantum >drives running off an Adaptec 2940UW PCI controller. This way we have >duplexed the Seagate drives to mirror each other, and the Quantums >mirror each other. > >Every 6 or 7 days we get the following message: > >6/10/97 11:25:10 am Severity = 4. >1.1.65 Device # 2 SEAGATE ST32550W 0009 (5D010000) deactivated >by driver due to device failure > >6/10/97 11:25:10 am Severity = 4. >1.1.85 Volume SYS still operational despite drive deactivation. > >6/10/97 11:25:10 am Severity = 4. 
>1.1.72 The mirrored partitions on this system are not all synchronized > >This error message only shows up on one of the servers. And this time it >was hard drive device #2, which is the second Seagate. Other times it >has been device #0, which is the first Seagate hard drive. ---------- In my experience it means you can be losing a disk drive. A common error with these large drives is not providing adequate cooling. Once overheated the bearings wear out, and intermittent failures occur. Left hot, the drives become unstable too. Listen to the drives as you power on the server; bearing noise is easily noted then. As usual, a careful look at the SCSI bus is worth while, just in case that is the root difficulty. Joe D. ------------------------------ Date: Fri, 13 Jun 1997 09:14:28 +0800 From: Brett Looney Subject: Re: Mirrored partitions becoming un-mirrored Following on from the "max_tags" stuff: We always seem to run into that problem with Adaptecs and mixed drive setups. By mixed drives I mean different brands, or running tape/cdroms off the same bus as the hard disks. To fix it, we do this: load aicwhatever slot=xx tag_disable=ffffffff This turns off tag queueing, which probably results in some sort of performance hit (which I've never noticed) but the problem you describe goes away... ------------------------------ Date: Fri, 20 Jun 1997 01:57:26 +0200 From: "Arthur B." Subject: Re: Servers for Netware >What do you like in the way of file servers? > >We're looking at servers with some level of redundancy, hot-swap drive >capability, support for at least two Pentium Pro or Pentium II CPUs, >support for at least 256KB of RAM, and reasonable expansion capacity to >meet now unknown future needs with mininimal upgrade problems. Initial >application is basic file server functionality under Intranetware, but the >machines will also become the key elements in an intranet implementation. > >We're inclined to look at Compaq, Dell, and HP for starters. 
Will consider >others, including Micron, ALR, etc, if those are considered to be in the >same class in terms of quality, performance, and reliability. Price >counts, but within a reasonable variance need not be the determining >factor if other technical issues warrant a higher price. Are there >others that should be considered? Which would you consider, rule out, >or most likely buy? Why? I'm just one person but I like not A-brand servers but A-components servers. Meaning that I prefer a server "home-build" (or customer designed build by others) for this gives you more performance, 100% A-brand components, everything you want and for less money if you know what you're doing. Sure, A-brands are "tested & approved". But only for a specific configuration which you may want to alter (if not now then later) eg. driver updates. Whereas A-brand components are "tested & approved" also but on more then one system. Sure, A-brands have greater support. But for some reason or another I'm the one that replaces broken hardware instead of waiting for the service man to arrive. If the hardware breaks that is. Software driver updates? Well, I'm getting those from the one that build that specific component. Helpdesk calls? If needed then rarely to the one that build the server. Sure, A-brands are completly tuned and configured for the task. But if that is a big issue what do you do when you need to put in some new cards? Re-tune the system? Yourself? Sure, A-brands have proven themselves. So did the A-brand components. In fact, you may find A-brand components inside A-brand computers. Sure, A-brands have warranty. So do A-brand components. Sure, A-brands deliver hardware burn-in tested. I would like to see this for myself thank you. Sure, some A-brands come with installed software. Are you sure you don't want to do that yourself? Did I forget something? "Quality, performance, and reliability". 
Is that determined by the brand label on the outside of the box or by
the components that make up your system?

Another point to watch for. Almost everyone knows that server downtime
equals lots of lost dollars for the company. But so does bad performance
of a server. In other words, if the average response time of a server is
improved such that all users in the entire company get one more minute
of work time instead of wait time each day... how much is that over a
year?

Not that I encourage people to simply order in a bunch of components and
make use of a nearby screwdriver if they're not sure what they are up
against. But there may be a local store nearby (what would their
response time be in case of hardware troubles?) that is willing to do
the assembly for you and put a thing or two in writing beforehand.

The message is that one should not just blindly go for A-brands without
looking around first. Even if you still only want A-brands you may want
to add some extra piece of hardware to clear up some potential
bottlenecks. And looking around may change your current wish-list of
hardware specs for the better.

---------

Date: Fri, 20 Jun 1997 11:55:47 -0600
From: Joe Doupnik
Subject: Re: Servers for Netware

>Another angle on this topic.
>More RAM can mean better performance and is needed when using large
>disc farms. You are lucky if you can put a system together that can
>handle more than 128 MB of RAM by just purchasing "A-Brand" Components.
>When purchasing a system from a Compaq, HP etc, you buy the ability to
>add more RAM (at a significant cost mind you)
---------
It sort of comes down to "trust the vendor" or "trust your own
evaluations". Both choices encounter engineering difficulties depending
on the mix of components.
Personally I don't trust the vendors to design things properly, so I
evaluate big names the same as brand X material. I've had my share of
troubles with big name systems; oh boy have there been troubles designed
in.
But then the no-name stuff is equally culpable.
I buy based on suitability to purpose, technical support when needed
most, price, and not least acceptance by the paying customers. The first
criterion means I dig in deeply, with hands-on examples. The last means
I let the end users have a good look at components they will touch in
practice, and I listen to their comments. Servers I choose shrewdly,
with a pretty good batting average so far.
In the end I trust only what I can test, and even then surprises occur.
Joe D.

------------------------------

Date: Tue, 15 Jul 1997 08:02:36 -0600
From: Joe Doupnik
Subject: Re: 10Base-2 coax.

>I haven't the slightest idea what you mean by "dropcables" - if .5 m is
>permitted between T-connectors, then .5 m is permitted between NICs as well.
>Where does the "2.5 m between NICs" fit in to this?
----------
2.5 m separation between taps is for thick wire coax. 0.5 m is for thin
wire coax. The physics is to prevent accumulation of lumped capacitance
of the taps and thus excessive reflections. The above rules are loose,
where greater distance is better. A side effect is that taps are spaced
non-harmonically when computed in wavelengths of the major signal
(20MHz) and thus we break up resonance conditions. 20MHz includes the
clock transition of Manchester encoding (10M data bits per sec, ditto
clock transitions).
Joe D.

---------

Date: Tue, 15 Jul 1997 23:16:05 +0200
From: "Arthur B."
Subject: Re: 10Base-2 coax.

>I haven't the slightest idea what you mean by "dropcables" - if .5 m is
>permitted between T-connectors, then .5 m is permitted between NICs as well.
>Where does the "2.5 m between NICs" fit in to this?

Does 'thinktap' ring a bell? It's the only other name I know.

Normally you would take a T-connector and bring that to your NIC's BNC,
thus having two cables between your computer and the wall. If someone
were to disconnect the coax at the wrong side of the T-connector you
would find out very soon. A dropcable solves this.
The T-connector is 'replaced' by a wall outlet with a little microswitch
inside. It still stays a T-connector of course, but it gets some added
features. Into this outlet you plug a dropcable (which must have a
minimum length of 1 meter, hence the minimum distance of 2.5m between
NICs, because the minimum distance between T-connectors is still 0.5
meters) and connect it to the NIC's BNC. The dropcable looks like a
single wire but is in fact two. This is why you must subtract twice the
physical length of all dropcables you have from the maximum allowed
total coax length of 185 meters.

However, even if you unplug the dropcable at the wall outlet side (which
is equal to disconnecting the coax at the wrong side of a normal
T-connector) the little microswitch will ensure that no harm is done to
the rest of the network.

Personally I like this system.

Note: when looking for wall outlets for dropcables you may find wall
outlets that have two connectors in them. At first it would seem that
this breaks the rule of 0.5 meter between taps (each of the connectors
is in fact a tap, because each of them holds one dropcable). However,
inside the wall outlet there is 0.5 meter of wire put in somehow. So no
worry. You do, however, need to count each 'double fitted' wall outlet
as two taps when it comes to the maximum allowed number of taps within
one segment.

Some other physical limitations of a 10Base2 network:
* Minimum distance between T-connectors (taps): 0.5 meters (about 1.6
  feet). In case of dropcables this rule still applies, but also the
  minimum length between NIC BNCs must be 2.5 meters (about 8.2 feet).
* Maximum segment length: 185 meters or 607 feet (each side should be
  terminated, one of them also grounded). In case of dropcables the
  summed physical length of all dropcables must be doubled and then
  subtracted from 185.
* Maximum network length: 925 meters or 3035 feet. Subtract dropcables
  here too.
* Maximum number of allowed segments: 5, but a maximum of 3 segments
  may be populated. The other two must be so called 'link segments' or
  unpopulated segments.
* Maximum number of taps per segment: 30 (a repeater counts as 1 also).
* Maximum allowed number of repeaters between sending & receiving
  nodes: 4 (five segments in a row are joined by four repeaters).

------------------------------

Date: Sun, 10 Aug 1997 07:08:59 GMT
From: "Eric E. Allen"
Subject: Re: Raid Arrays

>As I said duplexing made up my mind (and my environment has a large
>number of small writes). However, there is a second factor. If I do
>duplexed Raid 1 I can use high quality "standard" components. I can,
>for instance, buy high quality disks at a reasonable price. If I do
>raid 5, I normally end up with proprietary components that are very high
>priced, e.g. the same drive in a plastic tray with a different plug and
>a fancy label costs me 1.5-2.5x as much. This makes those ratios even
>worse. Duplexing and standard drives, gives Raid 1 almost a 3 or 4 to
>1 performance cost advantage.
>John H. Lederer

A lot of good info. However:

Netware mirroring is a great way to go until you reach about 9GB and
then have a drive failure. I have seen it take on average 3-6 hours to
remirror the drives. (The time depends upon CPU usage, drive types, and
such.) RAID 5 arrays take on average about 30-90 minutes (depending on
drive size) to rebuild the array.

The cost advantage of RAID 1 declines as the server capacity increases
to 30GB or more. Plus, at this high a capacity the performance advantage
of RAID 1 is lost when you add cache to the RAID 5 array.

As far as the 1.5-2.5x price you pay for drives goes, this is easily
explained:

1. Drive Firmware: Most of the time the firmware found in a drive sold
   with a sub-system is not the same version that you will find out on
   the open market. Though the drive will work outside of its "Plastic
   Tray" when connected to a standard SCSI controller.
   However, for performance reasons the firmware has to be developed
   and tested just like any other software that is written.
2. The drives that are purchased from a storage company have been
   quality checked and tested more rigorously than those purchased from
   your local computer store. They have actually tested the drive
   instead of just handing you a box with a label.
3. Most store bought drives, even though they are the same exact drive
   (model, manufacture) as the one from the storage company has a better
   warranty.
4. As far as the expensive side: Shop around.

---------

Date: Mon, 11 Aug 1997 09:04:43 -0600
From: Joe Doupnik
Subject: Re: Raid Arrays

>One disadvantage that I would see for controller cache/write
>verification is that it would not meet the requirements of some for "end
>to end" verification. The controller would verify that it wrote to disk
>what it got from the server -- but cannot verify that what it got from
>the server is what the server sent. That is perhaps a reasonable risk,
>since the likelihood of a bad cable/connection/whatever would seem low.
>John Lederer
------
End to end verification is the very heart of security. That is why
Novell added read-back from disk as a normal operation.
The problem with caches on the controller is that when the system is
working fine the cache makes little difference, but when the system is
not working fine all heck breaks loose without letting us know. And the
system is not working fine in a measurably significant portion of the
servers, depending on when one looks. A simple abrupt power loss is a
nice way of creating a not-fine situation.
Joe D.

------------------------------

Date: Fri, 15 Aug 1997 22:34:51 +0200
From: "Arthur B."
Subject: Re: Upgrading A File Server's Motherboard

Check out:
"Tom's Hardware Guide": http://sysdoc.pair.com/

What's very important is to choose the right BIOS.

And don't buy cheap!!! Or you'll pay later on (just my own experience).
------------------------------

Date: Thu, 30 Oct 1997 11:35:10 +0200
From: Mike Glassman - Admin
Subject: Mirrored Disks failure

So long as your system is still up and running (that is, your primary
mirrored disk fails but your server is still up), you will not notice
anything at all except for a message at your server console stating that
one of the disks has failed.

When you reboot your system, though, you may have problems.

What you should do is ensure that both disks have identical boot
partitions, so all it would take is a cable shift for the mirrored disk
that is still operating to become your primary, and your old/replaced
disk will then be your mirror.

Works like a charm.

---------

Date: Thu, 30 Oct 1997 09:35:23 -0600
From: Joe Doupnik
Subject: Re: Another question about mirroring -Reply

>>>If you have two drives mirrored and one fails, does the other
>>>automatically take over or is there some intervention that needs to be
>>>done to make the backup drive active? I'm specifically talking about
>>>mirroring under NW 3.12.
>>--------
>> Open server box, reach around and unplug the power
>>connector to a drive, observe results. Restore power to drive,
>>observe results. Notice that users are unaware of your
>>actions, and you need do nothing more than this.
>> Joe D.
>
>Ouch !!!!
>I cringe at the thought of doing this, even on a test server !!
>Couldn't that severely hurt drive/box/etc... Joe?

No, it does not severely hurt the box, nor anything else. The drive
power is a Molex connector: +5V, ground, ground, +12V. I have done this
safely for many many years, particularly hot swapping SCSI drives. The
latter drill is: unplug drive power, detach/attach SCSI cable, reapply
drive power.
David Hanson's large spark was simply a drive drawing huge amounts of
startup power which killed his power supply. That used to be the case
with old 5.25" boat anchor drives and flimsy power supplies, but those
days are long gone.
If you feel uneasy orienting a Molex power connector then clearly hire a
technician to do that tiny chore. I'm an EE and do know what is going on
here. The rest of you can simply cycle power to the server between
setups.
My point, lest it become lost, is to run experiments locally to fully
satisfy your doubts. Production equipment needs to be known-good, so
find out (which means do not ask someone remote for their emotional
feelings regarding your server).
Joe D.

---------

Date: Thu, 30 Oct 1997 11:01:45 -0600
From: Joe Doupnik
Subject: Re: Another question about mirroring

>It would be easier if you unplug the bus.......not the power
---------
No, please. There are too many contacts on a SCSI connector, and when
they are active the bus is clobbered by partial connections. Use the
power plug and things are fine.
Joe D.

---------

Date: Fri, 31 Oct 1997 15:44:22 -0500
From: "Brien K. Meehan"
Subject: Hot Swappable Drives (Was: Question about mirroring)

>Those units are designed to be able to unplug one and plug in the new one
>while the server is running full bore. Do those units have special power
>connectors, do they use specially designed hard drives? I always thought
>those units used plain old scsi drives just like the ones we are using in
>our servers so it seems that if you can unplug them in a hot swappable
>raid device you should be able to do the same in a regular server
>(obviously you need to be watchful of ESD when reaching into the server)

Generally, they are regular drives in a special "drive tray."

I'm most familiar with Compaq, which has regular Seagate/OEM drives in a
special tray, which plugs in to a special backplane in a "drive cage,"
the same way an adapter card plugs into the computer's bus. Power is
provided through the card connector, along with the SCSI bus
connections.
The backplane (I'm misusing that term, but can't think of a better one)
is smart enough not to toast the drive when you plug it in, and assigns
it a SCSI ID on the Proliants.

>Which begs another question: If you have two drives mirrored and one
>fails, can you swap the failed one while the server is up and running?

If you have the controller mirroring the drive, absolutely. Generally,
if you have a hot-swap drive system, you have a controller that is
capable of mirroring the drives.

If you have Netware mirroring the drives, probably. You can generally
take out the failed drive, put in a new one, scan for new devices, and
set up the new drive.

I've had an incident in particular where a drive in a RAID 5 array
failed. The .DSK driver reported the drive failed. I brought my boss to
the console, showed him the message, pulled out the failed drive, and
handed it to him while it was still spinning. I put a new drive in, and
the driver reported that it was rebuilding the array using the new
drive. Easy as pie!

---------

Date: Fri, 31 Oct 1997 15:07:13 -0600
From: Joe Doupnik
Subject: Re: Another question about mirroring

>I have to opt for the cringing on this affair. Are you nuts, even though
>you have made us all aware of the laws and principles of electricity
>regarding this procedure, what about ESD, along with the risk, however slim
>on performing such a procedure. These are your servers, the life blood of
>your network. To do anything as stupid as to power down a drive in mid
>operation, just to prove a point. Wouldn't it be easier to say, I have had
>good success with mirroring and duplexing, and it does indeed work?
>Seth Tilis
---------
Ok, that is a safe haven.
I am certainly neither nuts nor careless about such matters. I know what
I'm doing and why, and the consequences, and surprisingly I'm a more
conservative manager/engineer than most people realize. By simply
offering the advice I am fully aware that it is reaching a few thousand
sites.
No proving of points is involved; I leave that to the Me-Tarzan types.
Btw, there is no ESD involved when handling components with normal care.
But maybe what's normal care for me is not the same as for everyone.
Now you know something useful that you may not have known before. When
the occasion arises to use that information, no matter how uncomfortable
you may be with it, you do have the option of going ahead.
Joe D.

------------------------------

Date: Sat, 1 Nov 1997 12:34:05 -0500
From: Bud Durland
Subject: Re: Problem with two PCI network boards on 4.11 server

>I have the following problem with two PCI network boards
>(manufactured by SVEC) on NetWare 4.11 server:
>Every time I try to load the lan driver for the board I get
>following message:
>
>Multiple boards found. Please specify the board to load the driver
>on: 1. [or 2]
>
>I have tried to add additional parameters such as EtherId= or
>Slot=, but it didn't help.
>
>There were no problems at all when configuring NetWare 4.10 server.
>It looks like the OS could not recognize two PCI boards.
>When one ISA and PCI board is installed everything is OK.

Whenever you load a driver, whether it's for a NIC, disk controller, fax
board, or whatever, Netware needs to know where to find the board. With
ISA cards, this is done with the PORT=??? and INT=??? parameters. On
EISA and PCI machines, this is done with the SLOT=??? parameter. In the
case of multiple similar boards, you will have to load the driver once
for each board.

Some drivers are smart enough that if there is only one board in the
machine, they load on their own. If you have more than one board, the
driver needs to know which one you want to load the driver for.
Normally, if you load the driver without the SLOT parameter, the driver
will stop and prompt you with the valid choices. Remember the choices
given, and you can modify your AUTOEXEC.NCF to automatically supply the
information on the LOAD command.
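
[Editor's illustration of the "load the driver once for each board"
point above. This is only a minimal Python sketch that builds the LOAD
lines one might place in AUTOEXEC.NCF. The driver name PCIDRV and the
slot numbers 3 and 4 are hypothetical placeholders, and the NAME=
parameter is an assumption added here so each board gets its own label;
use the slot values your own driver reports when it prompts.]

```python
def load_lines(driver, slots, prefix="NIC"):
    """Return one LOAD command per board, each pinned to its slot.

    driver -- LAN driver name (hypothetical placeholder in the example)
    slots  -- slot numbers as reported by the driver's own prompt
    prefix -- label prefix for the assumed NAME= parameter
    """
    lines = []
    for index, slot in enumerate(slots, start=1):
        # One LOAD per physical board; SLOT= tells the driver which one.
        lines.append(f"LOAD {driver} SLOT={slot} NAME={prefix}{index}")
    return lines


if __name__ == "__main__":
    # Hypothetical example: two identical PCI boards in slots 3 and 4.
    for line in load_lines("PCIDRV", [3, 4]):
        print(line)
```

[With two boards this yields two LOAD lines that differ only in SLOT=
and NAME=, which is the shape the advice above describes.]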
---------

Date: Sun, 2 Nov 1997 07:49:15 +0200
From: Mike Glassman - Admin
Subject: Problem with two PCI network boards on 4.11 server

This is a well known issue on IW servers during an upgrade process.

What you need to do is rather simple actually, and although it doesn't
change much as far as the way things are right now, it does fix this
problem.

Go to the start of the FIRST load line of your PCI NIC and press F5.
Mark the line to the end and then copy it as your second NIC, changing
only those parameters that are suitable for it. Delete your previous
second NIC's line. Do the same for your second Bind line (copy from the
first and change parameters) and the problem will disappear.

This happens even if your parameters are similar after the change; I
have no idea why, though.

------------------------------

Date: Mon, 3 Nov 1997 16:24:14 +0000
From: Steve Kurz
Subject: Re: IPX Segment size

>What is the recommended maximum number of workstation nodes on an
>IPX segment? Between the two FORE systems PowerHubs we have about
>700 users (each PowerHub assigned one IPX segment, i.e. 01026E04
>and 01026E05) , which seems to me to be too many.

It Depends.

Most of the hubs have a limit on the number of MAC addresses that they
can handle. The smaller hubs used to have a limit of 1024 MAC addresses
per bridge table, but some of the newer ones show a limit of 8192. The
PowerHub 8000 fits in this class. That's the theoretical end of things.

The practical, real-world limit would depend on your physical and
logical plant, the server configuration and the applications pushing
those electrons across the copper.

In other words...It Depends.

------------------------------

Date: Tue, 11 Nov 1997 08:20:10 +0100
From: Joop van der Velden
Subject: Re: Mirror Disks of different size

>We need to upgrade our SYS volume from a 1 Gig drive to a 4.5 Gig
>drive. We currently have the 1 Gig drive mirrored to another 1 Gig.
>Is there any way I can pull out one of the drives and install the 4.5
>Gig and then mirror the data over from the 1 Gig to the 4.5 Gig.
>Then pull out the other 1 Gig and mirror the data to the other 4.5
>Gig. Thanks in advance.

Yes, this is possible without rebuilding the server from backup. I did
(almost) exactly the same job a few weeks ago on a 3.12.

The trick is that you have to resize the netware partition after the
remirroring to the new larger disk. This can be done with the resize.exe
program that comes with the SNAPBACK program from Columbia Data
Products. You can download it from http://www.cdp.com.

The steps you have to take (roughly; you must know what you are doing):

1 - remove the mirror 1G disk and install the new 4.5G disk
2 - install DOS partition on the mirror disk and make it equal to the
    first disk
3 - start your server. It complains about mirror inconsistencies. Don't
    worry. Break mirroring with INSTALL.
4 - install 1G netware partition of equal data size to original netware
    partition on other disk.
5 - start mirroring again and wait until ready
6 - remove the other 1G disk and make the 4.5G bootable. (partition
    active)
7 - start to see if your server comes up ok
8 - You now still have a 1G netware partition and 3.5G unused space !
    Now what ? You can only have one netware partition on a disk !
9 - Run RESIZE.EXE from the SNAPBACK software. It's a DOS program that
    requires ASPI connectivity to your SCSI drives.
10 - resize the netware partition from 1G -> 4.5G
11 - start your server again and load INSTALL, VOLUMES, insert new
     segment to existing volume. Now you see 3.5G available ! Add it to
     your SYS volume as an extra segment, or make a new volume, whatever
     you want.
12 - restart server to see if it's ok. Do you have enough memory to cope
     with the extra diskspace btw ?
13 - install the other new 4.5G disk. Install DOS part. and copy it from
     the other disk. Start netware, INSTALL, make netware partition and
     start mirroring

Done.
This is a relatively easy upgrade thanks to RESIZE.EXE. The official
Novell way makes you completely rebuild the server.

---------

Date: Tue, 11 Nov 1997 06:03:52 -0500
From: Jerry Shenk
Subject: Re: Mirror Disks of different size

Wow!!! That really works? I just did this the other day except for the
part about reclaiming the unused space...only 1 gig in my case. Wow!!! I
gotta try this!!

------------------------------

Date: Thu, 13 Nov 1997 09:35:12 -0600
From: Joe Doupnik
Subject: Re: Netware server and MMX/TX Chip set

>We had a server that rebooted and corrupted files often (at least twice
>a week). Tried SP2, SP3, and not using compression, finally gave up
>and replaced with an older system. The only difference between
>these systems was the chipset and processor. It's up and working
>so this is just a curiosity.
>
>SystemA ( did not work )
>TX chipset
>200MHz MMX P5
>
>SystemB ( did work )
>HX chipset ( I think )
>75MHz P5
>
>Both systems used the same 96MB EDO memory, same Adaptec
>UltraWide Controller, and same Seagate drive.
>
>This has been several weeks, and I did not know of SP4.
---------
Intel PCI TX chipsets support memory caches only up to memory sizes of
64MB. Memory above that amount is not cached. TX chipsets are for el
cheapo Windows desktop boxes. Intel PCI HX chipsets support much much
more cached memory. Alas, Intel has stopped producing HX chipsets. TX
and HX are for Pentium processors, not PPro's or PII's.
Intel has rarely made a decent PCI chipset, and others are further
behind. So shop very carefully. 440FX chipsets for the PPro/PII do cache
large memory. It's all in the number of wires made available, and one
would think even Intel could get this right by now.
For more details see the various motherboard NEWS groups.
Intel's web site has details of the chipsets but it takes an expert to
spot the design limitations (no, they don't come right out and say 64MB
max for caching, or whether one can cache more if only another tag RAM
chip is installed). Yes, it's frustrating, to all of us. No, I am not an
expert on this stuff, just a defensive sysmgr.
Joe D.

---------

Date: Thu, 13 Nov 1997 10:25:44 -0700
From: Tim Madden
Subject: Re: Netware server and MMX/TX Chip set

To add to JoeD's comments, check out Tom's Hardware Guide at
http://sysdoc.pair.com/. It is frequently cited by many on both this
list and an NT list to which I belong. He discusses things like which
chipsets have which limitations and he seems to do an extensive job of
testing and reporting on many pieces of HW.

------------------------------

Date: Mon, 17 Nov 1997 13:07:11 -0500
From: "Brien K. Meehan"
Subject: Re: Iomega Zip Drivers

>I wish to use an Iomega Zip Drive from the server. Is this possible?
>If so where can I obtain drivers or nlms and further information.

Strangely enough, it is possible, if you have the SCSI flavor of Zip
drive. Details abound at www.iomega.com.

Use the regular SCSI drivers for your adapter. You should be able to see
the SCSI drive as a disk device. Create a Netware partition on the Zip
disk. Define a volume on it. Mount it. Cool!

You'll find this very un-useful for transportability, because you can't
do much with a Netware partition on a Zip disk on your DOS-like machine.
But you're not asking if it's useful, you're asking if it's possible.

---------

Date: Mon, 17 Nov 1997 18:33:12 +0000
From: Richard Letts
Subject: Re: Iomega Zip Drivers

Uses:
(1) Have a SYS: <100 MB, mirror SYS: onto the zip drive. Periodically
    eject the zip disk and store it in a fire safe.
    N.B. useful for NetWare 3.x, very dangerous for NetWare 4.x since an
    old backup of the NDS is dangerous.

    Explanation: when objects are deleted from the NDS all of the
    replicas of the object agree the object is deleted.
    If you bring back an old copy of the NDS then the object will
    re-appear in your tree. Timestamps will be wrong, schema extensions
    won't match.... etc... etc...

(2) When setting up a classroom of servers set one machine up, mirror
    SYS: onto the zip drive, eject, and re-install on other machines.

Aside: I wonder if Linux or UnixWare can read/write netware partitions.

------------------------------

Date: Wed, 26 Nov 1997 09:49:11 +0200
From: Mike Glassman - Admin
Subject: Re: IRQ 15 and Adaptec 2940UW

>I was reading through the readme file that came with the Adaptec
>2940UW in my new server and it says that
>
>"Novell recommends not to put the HBA on IRQ 15"

The problem with IRQ 15 has nothing to do with SCSI boards or any boards
for that matter.

This interrupt is used by Netware as part of the OS. You can see this if
you go to the Novell support site and do a query on this issue.

Do NOT use IRQ 15 on any board, it's as simple as that. If you do, you
will experience problems such as the one you describe, and in some cases
as bad as AbEnds.

No NIC or adapter has to sit on IRQ 15, so if you can, change it.

---------

Date: Wed, 26 Nov 1997 13:53:58 -0500
From: Scott Wiersum
Subject: IRQ 15 and Adaptec 2940UW (2)

>I was reading through the readme file that came with the Adaptec
>2940UW in my new server and it says that
>
>"Novell recommends not to put the HBA on IRQ 15"

IRQ 15 is used by many operating systems to collect lost interrupts
(apparently including NetWare, but I'm not sure about DOS). When the
processor gets too busy to service every interrupt, it uses IRQ 15 as an
"extra chance" to service any interrupt it may have missed.

I believe IRQs 11 and 12 are the standard ones for SCSI cards.

---------

Date: Tue, 2 Dec 1997 19:04:38 -0600
From: Ian Huggins
Subject: Re: IRQ 15 and Adaptec 2940UW

>The problem with IRQ15 has nothing to do with SCSI boards or any
>boards for that matter.
>
>This interrupt is used by Netware as part of the OS.
>You can see this if you go to the Novell support site and do a query
>on this issue.

This is incorrect; IRQ 15 is not used by Netware as part of the OS. The
problem with lost interrupts happens at a level lower than the OS, BIOS
and below: the Intel 8259A / 82489DX et seq. PIC finds that the
peripheral device has dropped its assertion of the IRQ line before the
CPU returns the INTA acknowledgement signal.

Ref. http://support.novell.com/search/kb_index.htm
Do a search on "lost interrupt" and read "Issues with interrupt 15".

On EISA bus systems you will probably have to add a fake board to one of
your free EISA slots using your EISA Config Utility, allocate the board
IRQ 15, then lock the board to prevent the ECU from dynamically
allocating IRQ 15 to another device. I did this on my EISA systems by
virtually adding the plainest board I could find, a 'generic isa
adapter'.

------------------------------

Date: Fri, 5 Dec 1997 13:59:03 -0600
From: Joe Doupnik
Subject: Re: Netware, IRQ 15 and Plug&Play M/B

>A few days ago, Joe D. shared with us the inadvisability of using IRQ 15
>with a H/W device because Netware uses it for internal (software) purposes.
>My question is "What about those Plug&Play motherboards and BIOS's that
>automatically assign IRQ's, including IRQ 15, to hardware? Some BIOSes let
>one reserve IRQ's for 'legacy' hardware, but some don't make the choice
>available for IRQ 15. I have an Intel Tucson M/B that pre-assigns IRQ15 to
>the PCI bridge, or IDE controller. There is no way to reserve IRQ 15 and
>prevent its use by hardware. Am I to conclude that such motherboards
>should not be used with Netware?
-------------
Your conclusion would be excellent. Often Win/DOS desktop caliber
motherboards are not suitable for heavy duty server use, nor for some
protected mode operating systems. One needs to shop, and/or tame the
BIOS, and/or avoid desktop hardware such as IDE equipment.
Joe D.
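
[Editor's illustration. The lost-interrupt behaviour described earlier
in this thread (a device drops its IRQ line before the CPU's INTA
acknowledgement arrives) can be modelled in a few lines. This is a
simplified Python sketch, not NetWare or BIOS code. It assumes a
cascaded pair of 8259A-style controllers in their default priority
order, where a controller that finds no request still pending at
acknowledge time reports its lowest-priority input, IR7. The function
names are made up for illustration.]

```python
def input_reported_at_inta(asserted_inputs):
    """Return the IR input (0-7) one 8259A-style PIC reports at INTA time.

    asserted_inputs -- set of IR numbers still asserted when INTA arrives.
    """
    if asserted_inputs:
        # Default priority order: the lowest-numbered input wins.
        return min(asserted_inputs)
    # The request vanished before the acknowledge: the PIC still has to
    # answer, and it reports IR7 (a "spurious" or lost interrupt).
    return 7


def system_irq(pic, ir_input):
    """Map a PIC input to the PC IRQ number (slave inputs are IRQ 8-15)."""
    return ir_input if pic == "master" else ir_input + 8


if __name__ == "__main__":
    # A device on the slave PIC dropped its line too early.
    lost = system_irq("slave", input_reported_at_inta(set()))
    print("lost interrupt shows up as IRQ", lost)
```

[Under these assumptions a lost interrupt on the second controller is
delivered as IRQ 15, so a real board sitting on IRQ 15 has to share
that line with every lost interrupt.]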
------------------------------

Date: Wed, 10 Dec 1997 11:23:24 +0000
From: Phil Randal
Subject: Re: 5 PCI slots, sharing IRQ's

> First, the five PCI slots is a recent Intel "feature" which comes
>free with the bonus of using only four IRQs. Tyan has the problem, ASUS
>has the problem. It is not a good thing.
> The reason the system continues to function is the drivers are
>written to share interrupts. Alas, this is not foolproof. If the wrong
>driver is awakened it has to probe its board (delicate) and if not the
>cause then pull up the next interrupt service routine. We can easily
>see cpu time being devoted to sorting out matters.
> Honestly I have no faith in the entire system sharing interrupts
>under heavy load for long times. I have such a system (ASUS motherboard)
>and have had plenty of grief with a stuck/hung server. It gets worse.
>Some peripherals have multiple PCI devices, such as dual channel SCSI
>controllers, and they can cause more reuse of IRQ wires.
> My feelings on PCI bus designers are not suitable for public forums.
>Cheap and inadequate even in the design phase will be sufficient comment.
> Joe D.

I too have had grief with PCI, especially dual-ported ethernet cards
(etc) which use an on-card PCI bridge.

When Intel designed PCI bridging, they built in some very strange rules
about what 'devices' could share interrupts with others, but didn't
think of the obvious case of letting lan cards share interrupts with lan
cards, but not with SCSI controllers.

In the end, to get things reliable, I've had to resort to using a single
100Mb/s ethernet card into a switching hub rather than using dual-ported
ethernet cards.

Things can only get worse with the PC98 spec, which banishes ISA slots
from PCs. Time for a PCI 3 spec which allows sensible configuration of
devices and interrupts.
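
[Editor's illustration. The quoted description of shared PCI interrupts
(each driver probes its own board and, if it was not the cause, control
passes to the next service routine) can be sketched as below. This is a
toy Python model under stated assumptions, not real driver code: the
Board class, the board names and the dispatch function are all
hypothetical.]

```python
class Board:
    """A device sharing one IRQ line with other boards (toy model)."""

    def __init__(self, name):
        self.name = name
        self.pending = False  # True when this board raised the interrupt

    def service(self):
        """Probe the board; return True if it was the interrupt source."""
        if self.pending:
            self.pending = False
            return True
        return False


def dispatch(chain):
    """Walk the handlers sharing one IRQ, in registration order.

    Returns (name of the board serviced or None, number of probes made).
    """
    probes = 0
    for board in chain:
        probes += 1  # every probe costs CPU time, even on the wrong board
        if board.service():
            return board.name, probes
    return None, probes


if __name__ == "__main__":
    chain = [Board("scsi"), Board("lan0"), Board("lan1")]
    chain[2].pending = True  # the last board in the chain interrupts
    print(dispatch(chain))
```

[In this model the board at the end of the chain is only reached after
every board ahead of it has been probed, which is the CPU time "devoted
to sorting out matters" mentioned in the quote.]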
------------------------------

Date: Tue, 23 Dec 1997 10:07:30 +0200
From: Mike Glassman - Admin
Subject: Re: IBM Server

>>My company is looking to upgrade our current Novell server to a IBM
>>Server 330.
>>
>>Has anyone heard anything bad/good about this particular machine?
>
>Not that particular model but have 300, 310, & 315 and they are all
>just ticking along like fine watches. Got a bunch of management
>software with them if that's important to you.
>Bill Sneed

The 330 family is one of the better servers IBM has put out lately. The
reason I say lately is that, as far as their servers go (and our site is
totally IBM orientated: out of 950 PCs and servers, only 5 are not IBM
based), they change the numbers and versions as fast as they can, and
one never knows from one minute to the next what version will be out
next.

The thing that is ALWAYS a problem with IBM servers is that there NEVER
is a server that arrives containing the latest BIOS upgrades or EISA
upgrades etc. For every server that arrives, we have to download the
latest patches off the Internet and run them ourselves. IBM is famous
for this issue actually, and I would think puts out more patches,
flashes and upgrades than Novell and Microsoft together.

The server itself, the 330, is very fast, much more so than the 300,
310, 315, 320 or 325 versions. Its main problem is that it will not
allow you to use the EISA configuration program to change the amount of
memory you have onboard, even though it is an EISA based machine. This
of course causes problems under IW or Netware, which will not recognise
the memory over 32 MB unless you patch the server with the latest
Netware OS patches.

This is most definitely an IBM bug, and they have a patch out which will
fix the server.exe.....BUT.....the patch works on anything BUT SFT III,
so if you are using the 330 for SFT III, do not try the patch. Also,
this patch from IBM is not yet fully Netware approved.
As far as the 300 version that Bill talked about, it was and is the
worst of the IBM PC server family: slow, and it does not go well with
IW at all. Once again, this is a bug which IBM knows about and
acknowledges, but has not managed to date to fix, even though there
have been multiple BIOS upgrades to try and deal with this. Personally,
if you have a 300 version, change the board to something better.

The 310 family is cool, but if I wanted to use a regular PC as a
server, I'd do more than just turn it on its side and give it a wide
base and call it a Mini. The 310 server is fine for Novell, and works
great so long as you once again ensure the BIOS is upgraded with the
latest version. But....this server won't run WinNT version 4.

I guess what I'm saying here is that like all servers, IBM's are not
perfect, but they work so long as you do the necessary.

I like the 330 version best so far. We have it set up as SFT III,
5*4GB disks in an internal Raid-5 bay (21.6GB disk space total), 256MB
Ram, running 300 users 24/7 and an internal FTP site. Works like a
charm. The previous server for this was a PC 320 with 198MB Ram and
8GB disks. The difference in speed between the two servers is
astronomical. E.g., in the old server it took 15 minutes to mount just
8GB of disks; in the new server, the whole server load including
mounts and all NLM's takes about 6 minutes from start to end. To bring
up one side of the SFT in the old server could take up to 10 minutes
after a crash; in the new system it takes less than 2 minutes.

Not bad uh :)

------------------------------

Date: Tue, 23 Dec 1997 12:15:53 -0500
From: George Spack
Subject: Re: AHA1542b and IDE drives

>How to use an IDE drive as the "C" (boot) drive, using the IDE
>controller to access the floppy drives, and use an AHA1542b to access
>the SCSI drive?

The Adaptec SCSI controller probably has a floppy controller too.
You need to disable the Adaptec floppy port so it does not conflict
with the floppy controller on your motherboard (or on an older
IDE/Floppy/Serial/Parallel card). Also make sure the SCSI board does
not have any conflicting IRQ/Address settings.

As far as the boot options go, make sure that the SCSI controller and
disk are not set to boot. The IDE should then be the only boot device
found and boot up normally. I'm not sure what effect having the SCSI
BIOS enabled or disabled would have on Netware, since it loads its own
drivers.

I have set up a few systems like this in the past that worked well.
You may want to throw another IDE drive in there as a D: drive to make
a copy of the C: drive. If your C: drive dies and you have no backup,
it would not be a fun day in Netwareland.

---------

Date: Tue, 23 Dec 1997 19:19:40 -0500
From: Nathan Durland
Subject: Re: AHA1542b and IDE drives

>I can get the IDE drive and the SCSI drive to work together but I always
>get an error on startup...I press F1 and continue and all appears ok. An
>exception to this is that I can't access the floppy drives...doesn't matter
>if I plug them into the 1542 or the IDE controller.

I do this all the time -- it's arguably the best way to set up
mirrored drives.

>What should my BIOS settings be? I figure the computer has to have a drive
>type for the IDE drive and NONE for the second drive. What about the floppy
>drives? Do I need to disable the BIOS on the 1542 to make things work
>correctly?

I'll assume that you're using an older machine that has a multi-IO
card in it, which is usually IDE/Floppy/Parallel/Serial all in one
shot. Since you say you have a 1542, it has a floppy controller as
well. This is the first problem. You need to disable one or the other.
The one on the 1542 is probably easier (a single switch on the block,
if I remember correctly).

I'd recommend the following settings for the 1540.
Some are set with DIP switches, others are set in the controller's
BIOS:

  MEM   D0000
  IRQ   11
  PORT  330
  Translation for drives >1GB = OFF
  SYNC=
  PARITY=

In the computer's CMOS, you should have just the IDE drive defined;
SCSI drives are always "not installed".

------------------------------

Date: Wed, 7 Jan 1998 17:42:03 -0700
From: Joe Doupnik
Subject: Re: Use of Netware 3.12 Debugger

>>Could somebody please let me know if/where I can get hold of any
>>documentation on how to use the Netware 3.12 debugger to analyse
>>server crashes?
>>
>>Can I, for instance, use the debugger to find why a server crashes
>>out on an INT3 Breakpoint?
>
>Interrupt 3 is Com2:, isn't it? What hardware is using Interrupt 3
>if you don't have a second serial port?
>
>Randy Richardson
-----------
 No, that's not correct. INT's are software interrupts, dispatched
via a double word table residing at the very bottom of physical memory
for DOS, and are operating system animals. IRQ's are system bus wires
available to peripheral devices. Periphs have no knowledge of nor
access to software interrupts. The two are related by the way the 8259
interrupt controller chip pair translates IRQ requests into INT numbers
when talking to the cpu, and this process is not at all 1:1. For
example IRQ 3 maps to INT 0Bh. A basic PC hardware book will explain
these things.
 INT 3 is the debugger interrupt, a software affair.
 Joe D.

------------------------------

Date: Sat, 10 Jan 1998 01:11:18 +0000
From: Randy Richardson
Subject: Re: A word of caution - Re: strange 4.11 install problem with ni

A word of caution with using IRQ 3. Some motherboards that have two
integrated serial ports don't really disable the use of IRQ 3 when you
disable Com2: in the BIOS. I've found this to create situations of
instability especially in servers.

My solution to this problem is to avoid the possibility and just use
IRQ 10 or 11 instead.

Does your system have a sound card installed?
I've seen many Novell servers with sound cards installed (it's just
the way they came), which usually grab IRQ 5. I recommend you move the
sound card to a workstation instead so it will at least get some use.

------------------------------

Date: Sun, 11 Jan 1998 09:49:44 -0700
From: Joe Doupnik
Subject: Re: Proper Chipsets for Servers

>I was wondering if anyone had any insight regarding which Chipset would
>be most appropriate for a server. I don't have much knowledge regarding
>these issues. I have seen from time to time people saying that the TX
>and VX are not made for servers etc... Use the HX etc... What's the deal?
>Where can I find out more about this. Any help would be great.
--------
 The chipsets to which you refer are Intel PCI support chips. While
Intel's web site has full specs, only very careful reading will reveal
the interesting details. You will get more information by skimming the
NEWS groups on motherboards, such as the alt.periph.mainboard.* series,
after employing the usual strong nonsense filter appropriate to such
forums.
 In a nutshell, PCI chipsets for regular Pentium cpus have problems
caching memory above 64MB, except for the HX flavor and some older
PCI/EISA ones. But Intel has stopped making regular Pentium cpus in
favor of their latest incarnation as Pentium II's (basically the same
chip, spiffed up, and put in a new package), and still the Pentium Pros
(ditto). The PCI chipsets for the PII and PPRO are different because
the cpus have cache on-board and enough wires and cache memory and such
to cache more main memory. So far as I am aware (corrections welcomed)
the 64MB cache limit has vanished with these latter guys.
 A quick way of telling apart PCI chipsets for regular vs "new and
improved" Pentium cpus is the number, 430 for regular with external
cache, 440 for the newer integrated cache variety. Append a random
letter for the current attempt and add X at the end.
 While shopping may I emphasize that the motherboard's PCI Bios has
a great deal to do with usability as a server. One can't tell from the
outside but require PCI v2.1 compatibility as a start.
 Today we simply must try the entire system as a unit to discover
if it works properly, and that includes memory chips too (alas).
Goodness, this takes us back to the olden days of systems integration
houses. Alternatively, pay lots of good money to the major makers of
server class hardware such as Compaq and HP.
 Joe D.

------------------------------

Date: Sun, 11 Jan 1998 20:15:26 -0800
From: David Cronshaw
Subject: Re: Proper Chipsets for Servers

>I was wondering if anyone had any insight regarding which Chipset would
>be most appropriate for a server. I don't have much knowledge regarding
>these issues. I have seen from time to time people saying that the TX
>and VX are not made for servers etc... Use the HX etc... What's the deal?
>Where can I find out more about this. Any help would be great.

For my servers, and for any business computer where time is money, I
want a computer that is not only basically reliable, but when
something goes wrong with it I want to have to spend the least
possible amount of time diagnosing the problem. For these reasons, I
always insist on using parity checked RAM. Then, if I'm unlucky enough
to have a RAM problem, I will most likely get parity errors and
diagnosis will be easy, i.e. Mean Time To Repair (MTTR) minimized.
Without parity checked RAM, it can take many crashes and many hours of
head scratching to identify a flaky SIMM.

Given the above, the VX and TX chipsets are out - they don't support
parity checking on main memory. The HX chipset supports parity
checking and has been my choice for the past year and a half. However,
I'm told the HX chipset is no longer in production. The choice from
Intel now is the Pentium Pro/Pentium II chipset, the F?440 chipset.
The Pentium Pro was designed as a server CPU; it has better error
checking internally too - though I've not yet seen a problem
attributed to a CPU chip.

I've read that AMD has a substitute for the HX430 chipset which
supposedly supports ECC as well as Parity (so does the HX430, by spec,
but I've had very mixed results getting the HX chipset to work in ECC
mode. Apparently, the HX chipset is very fussy about the RAM timing
when it comes to ECC mode. Intel says the only RAM chips they have
verified ECC mode with are from Micron Technologies and they're hard
to come by).

---------

Date: Sun, 11 Jan 1998 22:39:43 -0500
From: Larry Hansford
Subject: Re: Proper Chipsets for Servers

Given that, what are the thoughts on EDO memory chips or SDRAM memory
chips? It is getting increasingly difficult to find True Parity chips
without ordering them. I currently have only True Parity chips in all
servers, but I'm curious about the future directions we will be able
to go.

------------------------------

Date: Tue, 13 Jan 1998 18:15:12 -0700
From: Joe Doupnik
Subject: On cpu's and caching, etc

 Recently there has been a flurry of messages concerning which CPU
to use in servers, and memory limitations associated with various PCI
chipsets. Today's Internet NEWS has this message in comp.sys.intel.
 In our situation it's not running slower that concerns us so much
as whether it runs with all installed memory. For NetWare servers a
sneaking suspicion is that "not cached" is close to not having the
memory available to NW. Confirmation or rejection of this suspicion
would be beneficial to the list.
 While on the topic of server memory, here's another gotcha waiting
for us. Memory quality varies a lot and lesser quality can cause a
server to crash/abend/halt/latch-up/go-west unexpectedly at odd times,
depending on whether the machine has memory parity and whether the
memory area holds sensitive information.
I have a server with such conditions, using ECC memory correction in
the 440FX (PPro) PCI chipset no less: abends with NMI errors about
every other day. Replacement top level quality memory is on order.
 Be sure to order the best quality memory available for your new
server.
 Joe D.
--------------
>>Is that just the HX2 that supports over 64MB ? I have a Gigabyte HX
>>(not HX2) and I'm not sure if it will cache above 64 MB.
>
>If I were writing for a PC tabloid, the teaser headline would read:
>
>    WILL YOUR PC RUN SLOWER IN 1998?
>
>No, this is not an obscure "Year 2000" problem two years early. It
>is a chipset problem that will affect most PCs that upgrade their
>RAM to go above 64 MBytes. When you add more RAM above that magic
>number, your PC will run slower, dramatically, visibly slower. It's
>already happened to me. My former Micron P5-133, based on the
>Micronics M54Hi (Intel 430FX chipset) motherboard, ran slower at 96
>MB than at 64.
>
>Why? - Because that extra RAM is not cached on most PCs (actually,
>       it is only cached by the on-chip L1 cache, but that is small,
>       and cache faults occur frequently). On Windows 95 in
>       particular, tasks are usually dispatched high, into that
>       UNcached RAM, and they take significant wait-state hits
>       whenever the CPU's limited 16K cache doesn't contain the
>       next instruction or data object required.
>
>During 1997, 32 MB was a typical RAM configuration for a new
>mid-range PC. Many of those users upgraded to 64 MB during the
>year. As we entered 1998, 64 MB was becoming a typical RAM config.
>During 1998, many users will attempt to install more than 64 MB, and
>they will not be happy when they see what ensues (I wasn't).
>
>With the spot price of DRAM having fallen below $1/MB, adding tons
>of RAM is tempting, particularly if you are running memory-intensive
>apps like image editors for recent "photo-quality" inkjet printers.
>Moving to 128MB can make a difference on these apps; however, don't
>go above 64MB unless you know your PC can properly support it.
>
>Who Is NOT Affected?
>
>* If you are running a Pentium-II or Pentium PRO, you may leave the
>  room. The P-II and P6 have their own on-board L2 cache, and have
>  enough tagram to cache at least 512 MB of main RAM (this 512 limit
>  will become the cache scandal of the year 2000, assuming society
>  hasn't collapsed :-)
>
>* If you are running DOS or Windows 3.1x, you may leave the room,
>  because you can't even address more than 64MB. If you install
>  more than 64MB, the OS won't even see it, and presumably will use
>  only the first (fully cached) 64MB.
>
>* If you have a machine with a non-Intel chipset (SIS, VIA), you may
>  or may not be affected. I don't have enough info on how these
>  chipsets have to be implemented for full RAM cachability.
>
>* If your PC has no L2 cache (as some stripped-down Pentium-IIs may
>  later on in 1998), or has fake cache (see final note), you may
>  leave the room. It won't run any slower with even more uncached
>  RAM.
>
>Who IS Affected?
>
>* Anyone running a Pentium (P5) or PentiumMMX (P55C) using the
>  common Intel-brand: "FX" (430FX), "TX" (430TX), "VX" (430VX) and
>  possibly other chipsets, which is the majority of home PCs.
>
>* Anyone running a P5 or P55C using an Intel-brand "HX" (430HX)
>  chipset that does not have a Second Cache Tag SRAM, which is most
>  home HX machines. Most vendors shipping HX boards save a few
>  cents by using an 8kx8 tag, which only maps 64 MB, when what you
>  want is a 16kx8 tag, which can map 512 MB. Can you add the tag
>  later? Often, no.
>
>But I Have 512K Cache!
>
>Doesn't matter - this is not about how much L2 Cache SRAM you have,
>but how much cache TAG SRAM you have, and whether or not your
>chipset supports using that tag to map more than 64 MB main DRAM
>into cache.
>
>But My Machine Can Support 128 MB Total RAM!
>
>Doesn't matter - this is not about how much RAM your machine can
>ACCEPT, or even ADDRESS, but how much it fully SUPPORTS, and
>frankly, the claim of "128MB support" on most PCs is at the very
>least misleading. However, since this issue doesn't seem to be
>widely known yet, I suspect that many PC clone vendors will not even
>be able to answer if you ask: "Is the max RAM config fully
>cachable?"
>
>What is Going On Here?
>
>It appears that Intel is silently creating a PC market segmentation.
>The new "TX" chipset only intro'd in 1997, and it has the 64MB cache
>barrier. The message seems to be:
>  - If you want more than 64 MB effective RAM, get a Pentium-II.
>
>What Was That Remark About Fake Cache?
>
>If you bought a PC in the last couple of years from a mom&pop
>cloner, small mail-order cloner, or some of the sleazier big
>cloners, there's a non-trivial chance that the L2 cache chips (if
>any) on your motherboard contain no active components, and your BIOS
>has been hacked to fraudulently report "256K" or "512K" cache
>installed at boot time. Do your chips have the faint text words
>"WRITE BACK" on them? Be afraid; be very afraid. Here are some
>cache test resources.
>
>These pages contain info and links to test routines.
>
>http://poseidon.csci.unt.edu/fakecache/vendors/
>
>http://sh1.ro.com/~andy/fake.html
>
>http://www.sysopt.com/cache.html
>
>http://www.simtel.net/pub/simtelnet/msdos/sysinfo/cachchk6.zip
>
>ftp://mina.sr.hp.com/mina/msdos/misc/cachchk6.zip
>
>You can also get tons of hits at
>http://www.dejanews.com/
>with the search string "fake cache" and "64 cache not" (going back
>well into 1996).
>
>Bob Niland
>rjn@frii.com

------------------------------

Date: Tue, 13 Jan 1998 21:32:31 -0700
From: Joe Doupnik
Subject: Re: Proper Chipsets for Servers

>>Given the above, the VX and TX chipsets are out - they don't support
>>parity checking on main memory.
>
>This statement is not true at all.
>I currently have 4 servers with 128 MB
>of Parity RAM on TX motherboards with the Parity check and ECC turned on.
>I have replaced 3 chips because of the parity checking. The TX chipset is
>the replacement for the HX (this from Intel directly).
---------
 Unless I have forgotten completely, the last time I looked at the
430TX PCI chipset specs on www.intel.com there was no mention of
supporting parity with ECC memory features. That makes me suspicious
that your servers are using a better chipset. Alas, finding out means
popping the cover and squinting at the chipset numbers, then cross
referencing them with the Intel scoop (easily located, "Search" for
"430TX", select "spec sheets" and the chip numbers will be shown). The
three PCI chips are clearly labeled as such on their covers in bright
white ink.
 Joe D.

------------------------------

Date: Thu, 22 Jan 1998 12:56:26 -0700
From: The Abundant One
Subject: motherboards and 64MB cache limits. (addendum)

Chipsets for the Pentium Processor Family
  430VX = 64MB
  430HX = 512MB
  430TX = 64MB
  430FX = 64MB
  430MX = 64MB

Chipsets for the Pentium II Processor Family
  440FX = 512MB
  440LX = 512MB

Chipsets for the Pentium Pro Processor Family
  440FX = 1024MB
  450GX = ???
  450KX = ???

The last two chipsets I couldn't find specific numbers for, but here
is a quote from one of Intel's web pages that pretty much answers all:

"As the L2 cache and tag RAM are internal to Pentium II processors and
Pentium Pro processors, the system cacheability is dependent on the
cacheability of the processor. The Pentium II processor can cache a
maximum of 512MB of memory while the Pentium Pro processor can cache
all addressable memory."
-http://support.intel.com/support/chipsets/440fx/index.htm

---------

Date: Fri, 23 Jan 1998 23:16:38 -0700
From: Joe Doupnik
Subject: Re: motherboards and 64MB cache limits.

>How is NetWare affected by this?
>
>Does it load top down in memory as do other OSs?
>
>Also, I have been to the Intel site and there is some confusing
>information. As stated below, Intel reports that Chipsets for the Ppro can
>cache 1024MB. But I thought L2 cache determined the cacheability and Intel
>reports that the Ppro chips come with 256k, 512k, or 1MB L2 cache on board.
>
>Am I missing something here?
------------
 Yes, how cache memory operates. Skip the technical details, fixate
on max memory as a start. That's the "span" (addressability) of the
cache. The amount of cache then influences the granularity
(distinguishable elements).
 You can also read about memory management strategies for different
versions of NetWare in Novell's Application Notes (see their web
server). That is very instructive reading. Thus you will find what's
loaded at the top of memory, such as volume structures.
 Joe D.

------------------------------

Date: Sun, 8 Feb 1998 09:34:45 +0100
From: Rainer Scheppelmann
Subject: Re: NLM that reports temperatures, voltages, and fan speeds

>Is there an NLM that will read the temperatures, voltages, and fan
>speeds from these new Pentium II motherboards? If this NLM writes
>the information to a log file, that would be even better. I'm using
>a motherboard with an Award BIOS.

You might want to look at Tobit's Safe-T:
http://na.tobit.com/~add-ons/index.htm

It doesn't connect to the motherboard's sensors but comes with its own
temperature sensors (4 to start with, connected via 8 bit card) and is
very expandable. The NLM writes a logfile and there are drivers for
other OSs, too.

------------------------------

Date: Tue, 10 Feb 1998 21:43:08 -0700
From: Joe Doupnik
Subject: Re: re-mirroring

>>The hard drives are 4gb each, hopefully it will be remirrored when I go
>>to work in the morning. Hopefully is a good server this is the first
>>problem since I brought it online 4 months ago. It's a Gateway 7000.
>
>Well if you want to see the message that is being displayed on boot up you
>should load server -ns and load and mount the drives manually. If you
>still can not get it to remirror you can break and recreate the mirror
>quite simply. I actually do it when the server crashes during production.
>The first thing I do is reboot the server with -ns and load the disk
>driver. I run VRepair and then break the mirror and reboot. This way the
>server comes right back up and the users can login right away without
>having to wait for the drives to remirror. Then later on I remirror
>them.
>
>Israel Forst
---------
 Yes. However, my preference is to let mirroring proceed in the
background. Users can still login etc because mirroring is a low
priority thread. To speed mirroring turn off read-after-write checking
on the drives.
 Breaking a mirror can be asking for trouble when trouble is
present; the other drive is your safety net. The odds can work the
other way by retaining one drive off line while fiddling with the
other. Your choice, which is why system managers are paid so highly
(sic)!
 In any case "Don't Panic" and do think of steps that can be taken
even in the heat of battle. The server -ns -na test is precisely one
such step that should have been available to you, and now it is.
 Joe D.

---------

Date: Tue, 10 Feb 1998 23:57:04 -0500
From: Israel Forst
Subject: Re: re-mirroring

Joe:

I have often found that if the sys volume attempts to mount while the
mirror is out of sync it will remirror before it allows the users to
log in. I assume it is remirroring as it remounts (just a guess). In
these cases it can take a while for 9 GB drives to remount, so after
running vrepair and making sure the night before's backup was good I
break it.

Of course if the nature of the server is not a critical one I would
not do it, but in most cases the client is losing money every minute
the server is down so I do my best to get it back up ASAP.
------------------------------

Date: Wed, 11 Mar 1998 13:59:35 +0200
From: Mike Glassman - Admin
Subject: Re: raid5 vs mirror vs duplex mirror and point of failure

>My understanding is with a mirrored-non-duplex system, if the
>controller fails, it can trash both drives

If a crash of both disks is defined as being unable to access the data
on both disks connected to one controller, then the answer to this is
yes. It sometimes happens that a controller crash will also cause disk
problems, but this is not something one can say always happens, as it
doesn't. And of course, since both disks are inaccessible, then yes,
you have disk failures, but not disk trashing.

>I don't know what happens in a mirror-duplex setting if one controller
>fails - will the mirrored drive become corrupted?

Again, this depends on the type of controller crash. In most cases, a
simple replacement of the failed controller with one of the same type
will bring the system up fine. In some cases, the disk will be
'trashed' too. In my opinion, it's always a good idea to keep a spare
controller for duplexed systems if possible.

In the case of a disk crash without a controller crash, place disk 2
as 1 and the new disk as 2 for full recovery.

>in a raid5 setting, it seems if the controller fails, the entire
>subsystem may fail as well (meaning corruption, not the obvious
>"replace the controller")

We had such an issue occur here 2 weeks ago. A Raid 5 IBM controller
failed totally, and in the process trashed 3 out of 6 disks in the
raid array. We replaced the controller, rebuilt the disk array after
replacing the 3 defective drives, and reinstalled Dos and Netware and
remirrored for full recovery. This was an SFT system so we were lucky;
had it been a standalone server, we'd have been in trouble, even with
backups.

The issue of same controller types is even more important under Raid
than in standard disk duplexing though, so keep that in mind.
In some of our sites, where we are not SFT'd and where there is only
one server, we are moving all standard mirrors to duplexed mirrors for
better redundancy. Cheap disks and controllers make this option easily
available, and a good standby answer.

Of course, for those of us with SFT III servers who are contemplating
the shift to Moab once it's released, to get the advantages of it and
all it will bring, the only option until Orion (clustering) will be to
use the Vinca Netware support instead of SFT III. Since this is fully
compatible with Moab already (works on the beta systems too like a
dream), it is a good tween-time solution.

------------------------------