    Xenvbd issues in Windows Event Viewer

    IT Discussion
    • momurda

      Hello,
      I've got two XenServer hosts, both running 6.2. On my first day at this new job (back at the end of Jan 2016), the CEO told me that occasionally things will just freeze and nothing works. Well, this didn't happen for a long time after I started. Until Friday, when I noticed every server on XenServer host 1 was unresponsive for a few minutes. Once I was able to access the VMs again, every single Linux VM running on the host had gone into read-only mode, and every Windows VM on the host had many warnings from source xenvbd, Event ID 129, and source disk, Event ID 153. Both of these warnings make me think the disk drivers on this Xen host are out of date or incorrect. I did some more searching through the event logs, and these warnings (a misconfiguration somewhere?) have been happening for years! Probably since day 1, when this company went to XenServer (3-4 years ago, probably).
      I am a bit new to XenServer and was wondering if any of you may know what is going on. I hope upgrading to 6.5 will fix this, as I plan on doing that at some point in the near future.

      • travisdh1

        Upgrading to 6.5 is good, and will probably help.

        Let's look at the storage subsystem first, though. It sounds like something is wrong on the storage front. Do you have any errors in the storage subsystem? If the storage is local, is an array rebuilding? To me, at least, this sounds like what happens when a local array rebuilds.

        You also probably want to check that each of the guests has the XenServer Tools installed, which is the package that has any additional drivers the guests need.

        • momurda

          I checked the storage array before posting, and there is nothing to indicate any errors there. I also checked the XenServer Tools install on the affected servers, and all of the servers that went unresponsive had the same warnings in Windows Event Viewer. The one server that does not have XenServer Tools has no errors or warnings during that time. The version of the xenvbd SCSI adapter is 7.0.0.120 on more than one of the servers with the problem.
          [Screenshot: 0_1458593852740_Capture.PNG]

          This is what I'm looking at on one server.

          • momurda

            So, I have been dealing with this for a while. I've got some more info and a possible solution, though I think I am still just shooting in the dark here.
            When I run:
            lsmod | grep 'iscsi'
            iscsi_tcp 18333 20
            libiscsi_tcp 21043 1 iscsi_tcp
            libiscsi 53218 4 bnx2i,ib_iser,iscsi_tcp,libiscsi_tcp
            scsi_transport_iscsi 77023 6 bnx2i,ib_iser,iscsi_tcp,libiscsi
            scsi_mod 209749 18 bnx2i,ib_iser,iscsi_tcp,libiscsi,scsi_transport_iscsi,sr_mod,sg,isci,libsas,scsi_transport_sas,libata,scsi_dh_rdac,scsi_dh_hp_sw,scsi_dh_emc,scsi_dh_alua,scsi_dh,megaraid_sas,sd_mo

            This seems to show the use of Broadcom drivers for my iSCSI connections, yet this host has Intel network interfaces. However, my other XenServer host (exact same hardware as the problem host) gives the same output for lsmod | grep 'iscsi' and never has this issue.
            If I use
            lsmod | grep bnx2i
            bnx2i 55493 0
            cnic 77183 1 bnx2i
            libiscsi 53218 4 bnx2i,ib_iser,iscsi_tcp,libiscsi_tcp
            scsi_transport_iscsi 77023 6 bnx2i,ib_iser,iscsi_tcp,libiscsi
            scsi_mod 209749 17 bnx2i,ib_iser,iscsi_tcp,libiscsi,scsi_transport_iscsi,sg,isci,libsas,scsi_transport_sas,libata,scsi_dh_rdac,scsi_dh_hp_sw,scsi_dh_emc,scsi_dh_alua,scsi_dh,megaraid_sas,sd_mod

            I'm not sure why either of my XenServer hosts (I took over this environment) is using iSCSI drivers for NICs they don't have installed. The Intel Gigabit driver is there, just not used for iSCSI as far as I can tell.

            [root@XS001 log]# lsmod | grep igb
            igb 180177 0
            [root@XS001 log]# modprobe -l | grep igb
            /lib/modules/3.10.0+2/extra/igb.ko
            [root@XS001 log]# ethtool -i eth0
            driver: igb
            version: 5.2.9.4
            firmware-version: 1.61, 0x8000090e
            bus-info: 0000:02:00.0
            supports-statistics: yes
            supports-test: yes
            supports-eeprom-access: yes
            supports-register-dump: yes
            supports-priv-flags: no
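
One way to read the lsmod listings above: the third column is a module's use count and the fourth lists the modules that depend on it. In the output quoted here, bnx2i shows a use count of 0 while iscsi_tcp shows 20, which may mean the Broadcom offload driver is loaded but idle and the iSCSI sessions run over the software initiator on whatever NIC carries the traffic. A minimal sketch for pulling that number out, assuming lsmod-formatted text on stdin; the function name `mod_use_count` is made up for illustration and is not a XenServer tool:

```shell
#!/bin/sh
# Hypothetical helper: print the use count (third lsmod column) for one
# module, reading lsmod-formatted text on stdin.
# Prints nothing and returns 1 if the module is not listed.
mod_use_count() {
    awk -v mod="$1" '
        $1 == mod { print $3; found = 1 }
        END { exit !found }
    '
}

# Usage on a live host (dom0):
#   lsmod | mod_use_count bnx2i
#   lsmod | mod_use_count iscsi_tcp
```

A use count of 0 alone does not prove the module is harmless, so treat the result as a hint about where to look next rather than a diagnosis.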

            If I am totally wrong about this, please tell me...

            • momurda

              Oh shoot, I forgot to say I have upgraded my XenServer host to 6.5 SP1 with all updates released through last week.

              • Dashrender @momurda

                @momurda said in Xenvbd issues in Windows Event Viewer:

                Oh shoot, I forgot to say I have upgraded my XenServer host to 6.5 SP1 with all updates released through last week.

                Are you still having the lockup issue since upgrading?
                I assume yes because of your previous post, but I wanted to be sure.

                • momurda

                  Yes. Randomly, about once a day, all VMs on this host become unresponsive for about 5 minutes. Linux VMs go into read-only mode, and Windows VMs log this xenvbd warning as well as some disk I/O operation warnings. The Windows VMs then go back to normal on their own; I just restart the Linux VMs to get them back to normal.
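
To see which filesystems a Linux guest has flipped to read-only after one of these stalls, one option is to scan the mount table. A minimal sketch, assuming input in the /proc/mounts layout (device, mount point, type, comma-separated options); `ro_mounts` is an illustrative name, not an existing command:

```shell
#!/bin/sh
# Hypothetical helper: print the mount point of every entry whose option
# list contains the exact option "ro". Reads /proc/mounts-style text on
# stdin. An option such as "errors=remount-ro" does not match, because
# each comma-separated option is compared as a whole.
ro_mounts() {
    awk '{
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "ro") { print $2; break }
    }'
}

# Usage inside a guest:
#   ro_mounts < /proc/mounts
```

Some mounts are read-only by design (an attached ISO, for example), so the interesting entries are the ones that are normally writable, such as the root filesystem.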

                  • momurda

                    My proposed solution comes from an old bug report on Citrix's Jira site:
                    https://bugs.xenserver.org/browse/XSO-241
                    It is not specific to my hardware, but the Citrix person there says:

                    I also notice the broadcom cnic and bnx2i driver's are loading due to you using iscsi on your host.
                    Can you try running the following commands in dom0 (to disable those modules) and reboot the server:
                    mv /lib/modules/3.10.0+2/extra/bnx2i.ko /lib/modules/3.10.0+2/extra/bnx2i.bak
                    mv /lib/modules/3.10.0+2/extra/cnic.ko /lib/modules/3.10.0+2/extra/cnic.bak
                    mv /lib/modules/3.10.0+2/kernel/drivers/net/ethernet/broadcom/cnic.ko /lib/modules/3.10.0+2/kernel/drivers/net/ethernet/broadcom/cnic.bak
                    depmod -a

                    This seems to just rename the modules to .bak files so that they won't be recognized on boot. I assume that igb would then be loaded for iSCSI communication, which would hopefully solve the issue. I'm not sure what risks there are; if this messes things up badly, I think I could just rename the files again and be back where I was.
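
The rename-and-rollback idea described above can be sketched as a pair of shell helpers. This is only an illustration under stated assumptions: the function names are hypothetical and do not come from the bug report, and `depmod -a` plus a reboot would still be needed after either step.

```shell
#!/bin/sh
# Hypothetical helpers for the rename approach quoted from the bug report.

# Rename <name>.ko to <name>.bak so the module is no longer picked up.
# Returns 1 without changing anything if the .ko file is missing.
disable_module_file() {
    ko="$1"
    [ -f "$ko" ] || { echo "skip: $ko not found" >&2; return 1; }
    mv -- "$ko" "${ko%.ko}.bak"
}

# Undo the rename: <name>.bak goes back to <name>.ko.
# Returns 1 without changing anything if the .bak file is missing.
restore_module_file() {
    ko="$1"
    bak="${ko%.ko}.bak"
    [ -f "$bak" ] || { echo "skip: $bak not found" >&2; return 1; }
    mv -- "$bak" "$ko"
}

# Usage in dom0, with a path taken from the quoted commands:
#   disable_module_file /lib/modules/3.10.0+2/extra/bnx2i.ko && depmod -a
#   restore_module_file /lib/modules/3.10.0+2/extra/bnx2i.ko && depmod -a
```

Keeping the rollback as its own function makes the "just rename the files again" fallback a single, repeatable step.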

                    • Dashrender

                      Where are you booting XS from? USB or SD card?

                      • momurda

                        Dashrender's post got me thinking, because I really didn't know the answer, having not set any of this up. I rebooted the server Friday about noon, and it refused to POST. I then remembered that this winter, about four days after I started here, there was a power outage at the office, and none of the servers were correctly hooked up to the UPSes to shut down gracefully.

                        OK, well, after some more testing Friday due to losing my pool master, I determined the BIOS on this motherboard was bad. I have replaced the motherboard, and now there seem to be no problems. I will keep tracking the Windows event log over the next week or so to see whether the errors are gone.
                        The failure to POST also happened when I upgraded the XenServer host to 6.5 SP1 (a few reboots, starting from 6.2). I realized that it kept getting stuck during POST most of the time, and I would have to hold down the power button until it turned off, then turn it back on, and normally it would then boot correctly. The errors I was getting were quite strange, and some online resolutions from Citrix or 'the internet' hinted at interrupt remapping being the issue for certain Intel chipsets. But I didn't have the affected chipset.

                        I'm still mystified about XenServer using the Broadcom network drivers with Intel NICs, though. I will dig into this more later on; I'm just glad to have two hosts again now instead of one.

                        • scottalanmiller

                          Awesome, good progress, at least.
