Skip to main content

VMWare ESXi - Hardware monitoring - vSphere host disconnected


Recently a client of mine experienced a sudden disconnect of one of their hosts from vSphere Client. A quick check quickly confirmed that the VMs were still running happily along, however the host was "tagged" as disconnected. Forcing a reconnect did not work, so it was time to get my hands dirty and start digging. 

I connected to the host in question with SSH and changed directory to the location of the syslog files (you do store your logs on a SAN disk right and not on a local drive?). After some quick searches I noticed this entry in the hostd.log file:


2012-12-24T19:01:01.309Z [FFFC2B90 info 'VmkVprobSource'] VmkVprobSource::Post event: (vim.event.EventEx) {
-->    dynamicType = <unset>,
-->    key = 1348822881,
-->    chainId = 761820257,
-->    createdTime = "1970-01-01T00:00:00Z",
-->    userName = "",
-->    datacenter = (vim.event.DatacenterEventArgument) null,
-->    computeResource = (vim.event.ComputeResourceEventArgument) null,
-->    host = (vim.event.HostEventArgument) {
-->       dynamicType = <unset>,
-->       name = "nameofesxhost.local",
-->       host = 'vim.HostSystem:ha-host',
-->    },
-->    vm = (vim.event.VmEventArgument) null,
-->    ds = (vim.event.DatastoreEventArgument) null,
-->    net = (vim.event.NetworkEventArgument) null,
-->    dvs = (vim.event.DvsEventArgument) null,
-->    fullFormattedMessage = <unset>,
-->    changeTag = <unset>,
-->    eventTypeId = "esx.problem.visorfs.inodetable.full",
-->    severity = <unset>,
-->    message = <unset>,
-->    arguments = (vmodl.KeyAnyValue) [
-->       (vmodl.KeyAnyValue) {
-->          dynamicType = <unset>,
-->          key = "1",
-->          value = "tmp:/auto-backup.17245931/etc/ssh/ssh_host_rsa_key",
-->       },
-->       (vmodl.KeyAnyValue) {
-->          dynamicType = <unset>,
-->          key = "2",
-->          value = "tar",
-->       }
-->    ],
-->    objectId = "ha-eventmgr",
-->    objectType = "vim.HostSystem",
-->    objectName = <unset>,
-->    fault = (vmodl.MethodFault) null,
--> }
2012-12-24T19:01:01.326Z [FFFC2B90 info 'ha-eventmgr'] Event 46 : The root filesystem's file table is full.  As a result, the file tmp:/auto-backup.17245931/etc/ssh/ssh_host_rsa_key could not be created by the application 'tar'.

This made me suspect that a volume was running out of space. Some quick commands later (vdf -h and stat -f /) showed that sufficient space and inodes was available. Google to the rescue as I search for "esx.problem.visorfs.inodetable.full" the following article from VMWare popped up:


Apparently the hardware monitoring service (sfcbd) is flooding the /var/run/sfcb directory with files (>5000) and creating the problem. Now you could just do a /etc/init.d/sfcbd-watchdog stop, delete the files in /var/run/sfcb and then run /etc/init.d/sfcbd-watchdog start, however I logged in on the console of the host in question and did a full restart management agent. That did the trick.

Comments

Popular posts from this blog

Monitoring Orchestrator runbook events from Operations Manager

Today I will follow up on my colleague’s post Mr ITblog (Knut Huglen) about monitoring Orchestrator Runbook events.  He has build a nice double up SNMP loopback feature that does self monitoring in Orchestrator resulting in entries written to a special Windows Eventlog. Now we need to raise alerts in SCOM when one of his runbooks fails or sends a platform event, who knows there could be trouble lurking in his paradise.

We are not going to do anything fancy, however these are the steps we will be focusing on today:
Create a Management Pack for our customizations Create rules that collects the events from the orchestrator serverOff we go then and fire up the SCOM console and a powershell window. First we create a MP, I am going to use powershell to do this, however you may use the SCOM console as well (Administration – ManagementPacks – Action: Create Management Pack):



Import the Management Pack into SCOM and move on to the Authoring section in the SCOM console. Create a new rule:



Give the…

Build your local powershell module repository - ProGet

So Windows Powershell Blog released a blog a couple of days ago (link). Not too long after, a discussion emerged about it being to complicated to setup. Even though the required software is open source (nugetgalleryserver), it looks like you need to have Visual Studio Installed to compile it. I looked into doing it without visual stuidio, however I have been unable to come up with a solution. I even tweeted about it since I am not an developer. Maybe someone how is familiar with “msbuild” could do a post on how to do it without VS.

Anyhow one of my twitter-friends (@sstranger) came to the rescue and pointed me in the direction of ProGet, hence the title of this post. ProGet comes in 2 different licensing modes
Free (reduced functionality)Enterprise (paid version with extra features)The good news is that the free version supports hosting a local PowershellGet repository which was my intention anyway. So off we go and create a Configration that can install ProGet for us. This is the conf…

Powershell – Log like you mean it

How do you do logging in powershell? Why should you do logging? What should you log? Where do you put your log? How do you remove your log? How do you search your log? All important questions and how you answer then depends upon what your background is like and the preferences you have. This will be a 2 part blog post and this is part 1.


Why should you log?

Well it is not mandatory, however I have 2 reasons:
Help with debugging a script/module/functionSelf documenting script/module/function
Firstly; Do you know any program that does not contain any bugs? Working with IT for the last 2 decades, I cannot name one. When you create scripts/modules/functions, you will create bugs, that is where they live and try to make your life a living mess.

Secondly: Adding a little extra information to your logging will make them self documenting. Do you like writing documentation? Well I normally am not fond of it and use logging while debugging to get two birds with one stone.


What should you log?

Anyt…