    Replacing the Dead IPOD, SAN Bit the Dust

    • JaredBuschJ
      JaredBusch
      last edited by

      Not knowing the workload of the two hosts, it is really a no-brainer to assume that neither system was pegged.

      I would bulk up storage for the better host and get things there until it cries.

      Then see how much more I got and add things to the second host if required.

      • DustinB3403D
        DustinB3403 @dafyre
        last edited by

        @dafyre said in Replacing the Dead IPOD, SAN Bit the Dust:

        Why not RLS a la StarWind?

        Because the hosts aren't uniform.

        • DustinB3403D
          DustinB3403 @JaredBusch
          last edited by

          @JaredBusch said in Replacing the Dead IPOD, SAN Bit the Dust:

          Not knowing the workload of the two hosts, it is really a no-brainer to assume that neither system was pegged.

          I would bulk up storage for the better host and get things there until it cries.

          Then see how much more I got and add things to the second host if required.

          Pretty much what I was going to say: add storage to the newer, more powerful unit (assuming the hosts weren't pegged) and import the data to local storage.

          Get a new backup device for new backups and go from there.

          • StrongBadS
            StrongBad @DustinB3403
            last edited by

            @DustinB3403 said in Replacing the Dead IPOD, SAN Bit the Dust:

            @dafyre said in Replacing the Dead IPOD, SAN Bit the Dust:

            Why not RLS a la StarWind?

            Because the hosts aren't uniform.

            Could make them uniform, of course.

            • JaredBuschJ
              JaredBusch
              last edited by

              The better choice for this would be to dump the existing infrastructure and just migrate it all to a @scale solution.

              Yeah it is more expensive than building up the existing hosts.

              But it is obvious the company has no idea what it is doing with this gear. So take that out of the equation by getting on a managed solution.

              • DustinB3403D
                DustinB3403 @StrongBad
                last edited by

                @StrongBad said in Replacing the Dead IPOD, SAN Bit the Dust:

                @DustinB3403 said in Replacing the Dead IPOD, SAN Bit the Dust:

                @dafyre said in Replacing the Dead IPOD, SAN Bit the Dust:

                Why not RLS a la StarWind?

                Because the hosts aren't uniform.

                Could make them uniform, of course.

                You could, but will the existing system withstand the time needed to get the hardware platform uniform and functional?

                Will both hosts support the same amount of storage, the same RAM, CPU, etc.?

                Is it cost-effective to go down that route, versus just getting a single stable server?

                • scottalanmillerS
                  scottalanmiller
                  last edited by

                  First question is: Is failover needed? Doing the process of "reading back" what was there in the past, there was an EQL SAN single point of failure without a failover device (dual controllers is not failover in any sense.) So historically they've been running without high availability. So the big question is... do they need it now? If high availability is needed now, why wasn't it needed in the past?

                  Two videos worth watching on this:

                  https://mangolassi.it/topic/11324/scott-alan-miller-smb-system-architectural-patterns


                  https://mangolassi.it/topic/25/mainframe-architectural-pattern-for-smb-it-scott-alan-miller-speaking-at-spicecorps-dfw-2012


                  • scottalanmillerS
                    scottalanmiller @JaredBusch
                    last edited by

                    @JaredBusch said in Replacing the Dead IPOD, SAN Bit the Dust:

                    The better choice for this would be to dump the existing infrastructure and just migrate it all to a @scale solution.

                    Yeah it is more expensive than building up the existing hosts.

                    But possibly NOT as expensive as replacing the SAN itself. So while it's not cheap compared to what they could do, it might be cheap compared to what they expected to do.

                    • JaredBuschJ
                      JaredBusch @scottalanmiller
                      last edited by

                      @scottalanmiller said in Replacing the Dead IPOD, SAN Bit the Dust:

                      @JaredBusch said in Replacing the Dead IPOD, SAN Bit the Dust:

                      The better choice for this would be to dump the existing infrastructure and just migrate it all to a @scale solution.

                      Yeah it is more expensive than building up the existing hosts.

                      But possibly NOT as expensive as replacing the SAN itself. So while it's not cheap compared to what they could do, it might be cheap compared to what they expected to do.

                      Correct, but I am assuming that someone here is telling them to shit on the SAN anyway...

                      • scottalanmillerS
                        scottalanmiller @JaredBusch
                        last edited by

                        @JaredBusch said in Replacing the Dead IPOD, SAN Bit the Dust:

                        @scottalanmiller said in Replacing the Dead IPOD, SAN Bit the Dust:

                        @JaredBusch said in Replacing the Dead IPOD, SAN Bit the Dust:

                        The better choice for this would be to dump the existing infrastructure and just migrate it all to a @scale solution.

                        Yeah it is more expensive than building up the existing hosts.

                        But possibly NOT as expensive as replacing the SAN itself. So while it's not cheap compared to what they could do, it might be cheap compared to what they expected to do.

                        Correct, but I am assuming that someone here is telling them to shit on the SAN anyway...

                        Probably. But maybe they don't realize how expensive and how bad an idea that is. The cost analysis should be crazy.

                        • dafyreD
                          dafyre
                          last edited by

                          Are they back in an operational state at the moment, or still waiting on used parts delivery?

                          • scottalanmillerS
                            scottalanmiller @dafyre
                            last edited by

                            Waiting on parts, but they will have those very soon.

                            • AconboyA
                              Aconboy @JaredBusch
                              last edited by

                              @JaredBusch Not that much more expensive and far more reliable for the job at hand

                              • wrx7mW
                                wrx7m @scottalanmiller
                                last edited by

                                @scottalanmiller said in Replacing the Dead IPOD, SAN Bit the Dust:

                                Waiting on parts, but they will have those very soon.

                                The parts mentioned are the SAN controllers?

                                • scottalanmillerS
                                  scottalanmiller @wrx7m
                                  last edited by

                                  @wrx7m said in Replacing the Dead IPOD, SAN Bit the Dust:

                                  @scottalanmiller said in Replacing the Dead IPOD, SAN Bit the Dust:

                                  Waiting on parts, but they will have those very soon.

                                  The parts mentioned are the SAN controllers?

                                  Yes, they need two new SAN controllers and one new backplane.

                                  • NerdyDadN
                                    NerdyDad
                                    last edited by

                                    Thanks @scottalanmiller for helping me out with this predicament.

                                    Current status of the SAN: firmware is as updated as it can go right now. I have 2 drives that are rebuilding from a RAID6 array. I have one more drive that is warning me about potential failure but not going to replace it until the other 2 are done rebuilding. The SAN is a Dell EqualLogic PS5000X. The controllers' firmware is one version behind the latest.

                                    The host is a Dell PowerEdge R610 with 86 GB of RAM and 16 vCPUs, running VMware ESXi 6.0. This host currently supports 3 VMs, totaling about 350 GB of production data. 2 of these VMs are on the local datastore of the host, but 1 VM is actually on the SAN, and that is the one we need. It totals 220 GB of data. There are no backups (my mistake).

                                    We've tried flip-flop failovers with the controllers, and each one only lasts us so long: long enough to boot the VM back up, but not long enough to actually back up the data. The backplane has been replaced. We've tried replacing the controllers, and all of the disks turned orange instead of green. We went back to the original controller and the array began to operate normally again.

                                    Dell support has advised us to allow the array to continue rebuilding, which was at 17%. Once done, I'm going to attempt to connect to it again and try to pull off the data. Support guy thought that we were overtaxing the SAN and basically freezing it up.

                                    Besides retiring the thing, are there any pointers that I should consider in order to ensure that the backup or migration is a success?
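For planning the copy, it may help to estimate how long a stable window is needed to pull the 220 GB VM off the SAN. The sketch below is only rough arithmetic: the throughput figures are hypothetical placeholders, not measurements from this SAN, and a degraded array that is still rebuilding may deliver far less.

```python
def copy_hours(size_gb: float, throughput_mb_s: float) -> float:
    """Hours needed to copy size_gb at a sustained throughput (1 GB = 1000 MB)."""
    if throughput_mb_s <= 0:
        raise ValueError("throughput must be positive")
    seconds = size_gb * 1000.0 / throughput_mb_s
    return seconds / 3600.0


SAN_VM_SIZE_GB = 220  # size of the VM on the SAN, from the post above

# Hypothetical sustained rates in MB/s; measure the real rate before relying on these.
for assumed_rate in (10, 50, 100):
    hours = copy_hours(SAN_VM_SIZE_GB, assumed_rate)
    print(f"{assumed_rate:>4} MB/s -> {hours:.1f} h")
```

If the estimate is longer than the array has been staying up between controller failovers, copying the most important data first is probably safer than attempting one large export.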

                                    • scottalanmillerS
                                      scottalanmiller
                                      last edited by

                                      I'd say that there are probably three key options for this as broad-stroke approaches, each valuable for its own reasons:

                                      • Mainframe: Just put disks in the local machines and do away with the clustering. The clustering added cost and risk without any actual benefits in the past. So why carry any of that forward? Just put disks into the local machines for the lowest cost, simplest solution. Points of failure are reduced, overall risk is reduced, bottlenecks are removed, and flexibility is increased, all for the lowest cost of investment. Costs nearly nothing, very effective, no downsides compared to the old solution. All positive movement.

                                      • Self Made Cluster: Replicated local disks and a hypervisor with high availability, like what is in place today. This is more costly and likely means some hardware upgrades to get the two hosts closer together, but with only two hosts it is very low cost and will provide dramatically more protection than the old approach.

                                      • Hyperconvergence: Do a full update moving to a totally hyperconverged product that provides complete support top to bottom. This is the most costly but replaces all hardware, gets inclusive support and requires the least internal IT effort.

                                      • scottalanmillerS
                                        scottalanmiller @NerdyDad
                                        last edited by

                                        @NerdyDad said in Replacing the Dead IPOD, SAN Bit the Dust:

                                        I have 2 drives that are rebuilding from a RAID6 array. I have one more drive that is warning me about potential failure but not going to replace it until the other 2 are done rebuilding.

                                        Oh no, that isn't good. Two lost controllers and two lost drives on RAID 6? What's the projected drive replacement time? A week at least, I would guess. It's almost better to not bother replacing the drives and just take a backup.
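To make the risk concrete: RAID 6 carries two drives' worth of parity, so it survives at most two concurrent drive failures. A minimal sketch of the remaining margin follows, treating drives that are still rebuilding as not yet providing redundancy. The function names are illustrative only.

```python
RAID6_PARITY_DRIVES = 2  # RAID 6 tolerates up to two concurrent drive failures


def remaining_fault_tolerance(drives_down: int, parity: int = RAID6_PARITY_DRIVES) -> int:
    """How many more drives can fail before data is lost (negative means lost)."""
    return parity - drives_down


def array_survives(drives_down: int, parity: int = RAID6_PARITY_DRIVES) -> bool:
    """True while the number of missing or rebuilding drives is within parity."""
    return remaining_fault_tolerance(drives_down, parity) >= 0


# Two drives rebuilding: the array is up but has no margin left.
print(remaining_fault_tolerance(2), array_survives(2))

# If the drive reporting a predicted failure dies before a rebuild finishes:
print(remaining_fault_tolerance(3), array_survives(3))
```

With two drives rebuilding, the margin is zero, so a failure of the third, already-warning drive during the rebuild would take the array with it.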

                                        • scottalanmillerS
                                          scottalanmiller @NerdyDad
                                          last edited by

                                          @NerdyDad said in Replacing the Dead IPOD, SAN Bit the Dust:

                                          Support guy thought that we were overtaxing the SAN and basically freezing it up.

                                          He is likely correct. That is generally expected with a RAID 6 rebuild, especially with two drives rebuilding at once.

                                          • NerdyDadN
                                            NerdyDad @scottalanmiller
                                            last edited by

                                            @scottalanmiller I likely won't put in my last spare drive unless I absolutely have to. My main end goal is to somehow migrate the data and retire the SAN. It went from 0-17% in about 3 hours. I'm going to let it continue and hopefully it will be done in the morning. I will check on it once I get back to the office.
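As a rough check on the "done in the morning" hope, the 0-17% in about 3 hours figure can be extrapolated linearly. This is only a sketch and assumes the rebuild rate stays constant, which is not guaranteed; rebuilds often slow down when the array is under load.

```python
def estimate_rebuild_hours(percent_done: float, hours_elapsed: float) -> tuple:
    """Linear extrapolation of a rebuild: returns (total_hours, remaining_hours)."""
    if not 0 < percent_done <= 100:
        raise ValueError("percent_done must be in (0, 100]")
    total = hours_elapsed * 100.0 / percent_done
    return total, total - hours_elapsed


# Figures from the post: 17% complete after about 3 hours.
total_hours, remaining_hours = estimate_rebuild_hours(17, 3)
print(f"estimated total: {total_hours:.1f} h, remaining: {remaining_hours:.1f} h")
```

At a constant rate that works out to roughly 17.6 hours in total, so about 14.6 hours more from the 17% mark.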
