    Windows Server 2003 Cluster Dead

    IT Discussion
    Tags: windows, windows server, windows server 2003, windows 2003, cluster, windows cluster, storageworks 500, storageworks 500 g2, das
    • scottalanmiller

      Dealing with a Windows Server 2003 two-node cluster with an HP StorageWorks 500 G2 DAS unit (SCSI attached). The cluster died this morning.

      Node 1 is up and running, but cannot start the cluster. Cluster Manager opens and doesn't even list the cluster. We assume that because of this, the DAS (RAID via SCSI) shared storage is not mounted, since the tool to mount it never fires.

      Node 2 is down. It gives no output on the console (blank screen) and cannot be pinged, so the assumption is that the hardware has died.

      In theory the cluster's purpose was to fail over. But now it appears that the cluster itself has caused the outage. Anyone know how to get this fixed and up and running with the remaining node?

      • scottalanmiller

        Looks like the power went out on Node 1 last night, which might have been a trigger.

        • scottalanmiller

          Yesterday morning, we got a SCSI error that the RAID didn't respond in time.

          • scottalanmiller

            The power down happened about 18 hours after the SCSI error.

            • scottalanmiller

              The last event log entry from Node 2 was 7 hours after the SCSI error and 11 hours before the Node 1 power cycle.

              • scottalanmiller

                13 hours AFTER the power cycle, the event log reports that the Quorum disk "Q" cannot be found.

                • scottalanmiller

                  Obviously, once the Q disk was missing, the node could not join the cluster.
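
The offsets reported so far can be lined up on a single timeline. The following is only a sketch of that arithmetic, taking the SCSI error as hour zero; the variable names are made up for illustration and no real timestamps from the event logs are used.

```python
from datetime import timedelta

# Offsets relative to the SCSI error, taken from the posts above.
scsi_error = timedelta(hours=0)
node2_last_log = scsi_error + timedelta(hours=7)          # 7 hours after the SCSI error
node1_power_cycle = scsi_error + timedelta(hours=18)      # 18 hours after the SCSI error
quorum_missing = node1_power_cycle + timedelta(hours=13)  # 13 hours after the power cycle


def hours(delta: timedelta) -> float:
    """Convert a timedelta to a number of hours."""
    return delta.total_seconds() / 3600


events = [
    ("SCSI error: RAID did not respond in time", scsi_error),
    ("Last event log entry from Node 2", node2_last_log),
    ("Node 1 power cycle", node1_power_cycle),
    ('Quorum disk "Q" reported missing', quorum_missing),
]

# Print the events in chronological order as offsets from the SCSI error.
for label, offset in sorted(events, key=lambda event: event[1]):
    print(f"T+{hours(offset):.0f}h  {label}")

# Gap between Node 2 going quiet and the Node 1 power cycle.
gap_node2_to_power_cycle = hours(node1_power_cycle - node2_last_log)
```

The 7 hour and 18 hour figures agree with the 11 hour gap mentioned above, and the quorum error lands roughly 31 hours after the original SCSI error.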

                  • scottalanmiller

                    Doing a controller power cycle now. Bringing down the physical cluster now, then the DAS. Then going to power on the DAS, give it time, and bring the nodes up. Expect very little, but it is a place to start.

                    • scottalanmiller

                      DAS is powering up, lights on it do not look good.

                      • scottalanmiller

                        10 drives in the array, believed to be RAID 10. 2 drives in RAID 1 as well.

                        • scottalanmiller

                          One drive in the large array is flashing orange, so it looks like one drive has failed.

                          All other drives are green.

                          • scottalanmiller

                            Bringing up Node 1 again now. With only one drive failed in the DAS unit, any RAID (other than RAID 0) should have survived.
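
The reasoning about a single failed drive can be written down as a small check. This is a rough sketch using the standard worst-case fault tolerance of common RAID levels. The function names are hypothetical, and RAID 10 is assumed to be built from two-way mirror pairs, which the thread does not confirm (the array is only "believed to be RAID 10").

```python
def guaranteed_fault_tolerance(level: str, drives: int) -> int:
    """Worst-case number of drive failures the array is guaranteed to survive."""
    if level == "RAID0":
        return 0           # striping only, no redundancy
    if level == "RAID1":
        return drives - 1  # every drive holds a full copy
    if level == "RAID5":
        return 1           # single parity
    if level == "RAID6":
        return 2           # double parity
    if level == "RAID10":
        return 1           # assumption: two-way mirrors; a second failure in the same pair is fatal
    raise ValueError(f"unknown RAID level: {level}")


def survives(level: str, drives: int, failed: int) -> bool:
    """True if the array is guaranteed to survive the given number of failed drives."""
    return failed <= guaranteed_fault_tolerance(level, drives)


# The large array from the thread: 10 drives, one flashing orange.
for level in ("RAID0", "RAID5", "RAID6", "RAID10"):
    print(level, survives(level, drives=10, failed=1))
```

Only RAID 0 fails the check with one drive lost, which matches the expectation above that the array should have survived.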

                            • scottalanmiller

                              Okay, that process brought things up. Not the cluster, but the disks are back. We can see the Quorum plus other disks now.
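
A quick way to confirm that the shared disks are visible again is to test each expected drive letter. The sketch below assumes a Python interpreter is available on the node, which nothing in the thread suggests. `missing_drives` is a hypothetical helper, and the only drive letter taken from the thread is `Q` for the quorum disk.

```python
import os
from typing import Callable, Iterable, List


def missing_drives(
    letters: Iterable[str],
    exists: Callable[[str], bool] = os.path.exists,
) -> List[str]:
    """Return the drive letters whose root path (for example Q:\\) is not present."""
    missing = []
    for letter in letters:
        root = f"{letter.upper()}:\\"
        if not exists(root):
            missing.append(letter.upper())
    return missing


# On the Windows node this reports whether the quorum disk is mounted.
print(missing_drives(["Q"]))
```

The `exists` parameter is there so the check can be exercised without the real hardware, by passing in a stand-in function.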

                              • scottalanmiller

                                Trying to bring up Node 2 now, but I'm not hopeful on that.

                                • scottalanmiller

                                  Node 1 is healthy, Node 2 is gone. Cluster won't come up, but the workloads did. So they are good for now.

                                  • Danp

                                    Do they have a plan to replace this outdated tech with something current?

                                    • scottalanmiller @Danp

                                      @Danp said in Windows Server 2003 Cluster Dead:

                                      Do they have a plan to replace this outdated tech with something current?

                                      Yes, there was a six-month plan in place already, but it just got moved to something like a six-day plan.

                                      • FATeknollogee @scottalanmiller

                                        @scottalanmiller said in Windows Server 2003 Cluster Dead:

                                        Yes, there was a six-month plan in place already, but it just got moved to something like a six-day plan.

                                        Love it when that happens!!

                                        • DustinB3403 @FATeknollogee

                                          @FATeknollogee said in Windows Server 2003 Cluster Dead:

                                          Love it when that happens!!

                                          The system is still f***** because they have to replace it today and they have to worry about good backups today.

                                          2003 is ancient

                                          • Obsolesce

                                            I'm guessing the thing hasn't been maintained at all. Maintenance would have brought this about sooner, but in a controlled manner.
