Skip to main content

VMWare ESXi - Hardware monitoring - vSphere host disconnected


Recently a client of mine experienced a sudden disconnect of one of their hosts from vSphere Client. A quick check quickly confirmed that the VMs were still running happily along, however the host was "tagged" as disconnected. Forcing a reconnect did not work, so it was time to get my hands dirty and start digging. 

I connected to the host in question with SSH and changed directory to the location of the syslog files (you do store your logs on a SAN disk right and not on a local drive?). After some quick searches I noticed this entry in the hostd.log file:


2012-12-24T19:01:01.309Z [FFFC2B90 info 'VmkVprobSource'] VmkVprobSource::Post event: (vim.event.EventEx) {
-->    dynamicType = <unset>,
-->    key = 1348822881,
-->    chainId = 761820257,
-->    createdTime = "1970-01-01T00:00:00Z",
-->    userName = "",
-->    datacenter = (vim.event.DatacenterEventArgument) null,
-->    computeResource = (vim.event.ComputeResourceEventArgument) null,
-->    host = (vim.event.HostEventArgument) {
-->       dynamicType = <unset>,
-->       name = "nameofesxhost.local",
-->       host = 'vim.HostSystem:ha-host',
-->    },
-->    vm = (vim.event.VmEventArgument) null,
-->    ds = (vim.event.DatastoreEventArgument) null,
-->    net = (vim.event.NetworkEventArgument) null,
-->    dvs = (vim.event.DvsEventArgument) null,
-->    fullFormattedMessage = <unset>,
-->    changeTag = <unset>,
-->    eventTypeId = "esx.problem.visorfs.inodetable.full",
-->    severity = <unset>,
-->    message = <unset>,
-->    arguments = (vmodl.KeyAnyValue) [
-->       (vmodl.KeyAnyValue) {
-->          dynamicType = <unset>,
-->          key = "1",
-->          value = "tmp:/auto-backup.17245931/etc/ssh/ssh_host_rsa_key",
-->       },
-->       (vmodl.KeyAnyValue) {
-->          dynamicType = <unset>,
-->          key = "2",
-->          value = "tar",
-->       }
-->    ],
-->    objectId = "ha-eventmgr",
-->    objectType = "vim.HostSystem",
-->    objectName = <unset>,
-->    fault = (vmodl.MethodFault) null,
--> }
2012-12-24T19:01:01.326Z [FFFC2B90 info 'ha-eventmgr'] Event 46 : The root filesystem's file table is full.  As a result, the file tmp:/auto-backup.17245931/etc/ssh/ssh_host_rsa_key could not be created by the application 'tar'.

This made me suspect that a volume was running out of space. Some quick commands later (vdf -h and stat -f /) showed that sufficient space and inodes was available. Google to the rescue as I search for "esx.problem.visorfs.inodetable.full" the following article from VMWare popped up:


Apparently the hardware monitoring service (sfcbd) is flooding the /var/run/sfcb directory with files (>5000) and creating the problem. Now you could just do a /etc/init.d/sfcbd-watchdog stop, delete the files in /var/run/sfcb and then run /etc/init.d/sfcbd-watchdog start, however I logged in on the console of the host in question and did a full restart management agent. That did the trick.

Comments

Popular posts from this blog

Serialize data with PowerShell

Currently I am working on a big new module. In this module, I need to persist data to disk and reprocess them at some point even if the module/PowerShell session was closed. I needed to serialize objects and save them to disk. It needed to be very efficient to be able to support a high volume of objects. Hence I decided to turn this serializer into a module called HashData.



Other Serializing methods

In PowerShell we have several possibilities to serialize objects. There are two cmdlets you can use which are built in:
Export-CliXmlConvertTo-JSON
Both are excellent options if you do not care about the size of the file. In my case I needed something lean and mean in terms of the size on disk for the serialized object. Lets do some tests to compare the different types:


(Hashdata.Object.ps1)

You might be curious why I do not use the Export-CliXML cmdlet and just use the [System.Management.Automation.PSSerializer]::Serialize static method. The static method will generate the same xml, however we …

Build your local powershell module repository - ProGet

So Windows Powershell Blog released a blog a couple of days ago (link). Not too long after, a discussion emerged about it being to complicated to setup. Even though the required software is open source (nugetgalleryserver), it looks like you need to have Visual Studio Installed to compile it. I looked into doing it without visual stuidio, however I have been unable to come up with a solution. I even tweeted about it since I am not an developer. Maybe someone how is familiar with “msbuild” could do a post on how to do it without VS.

Anyhow one of my twitter-friends (@sstranger) came to the rescue and pointed me in the direction of ProGet, hence the title of this post. ProGet comes in 2 different licensing modes
Free (reduced functionality)Enterprise (paid version with extra features)The good news is that the free version supports hosting a local PowershellGet repository which was my intention anyway. So off we go and create a Configration that can install ProGet for us. This is the conf…

Monitoring Orchestrator runbook events from Operations Manager

Today I will follow up on my colleague’s post Mr ITblog (Knut Huglen) about monitoring Orchestrator Runbook events.  He has build a nice double up SNMP loopback feature that does self monitoring in Orchestrator resulting in entries written to a special Windows Eventlog. Now we need to raise alerts in SCOM when one of his runbooks fails or sends a platform event, who knows there could be trouble lurking in his paradise.

We are not going to do anything fancy, however these are the steps we will be focusing on today:
Create a Management Pack for our customizations Create rules that collects the events from the orchestrator serverOff we go then and fire up the SCOM console and a powershell window. First we create a MP, I am going to use powershell to do this, however you may use the SCOM console as well (Administration – ManagementPacks – Action: Create Management Pack):



Import the Management Pack into SCOM and move on to the Authoring section in the SCOM console. Create a new rule:



Give the…