History "Invalid Host State 2"

Hi guys,
I recently discovered a strange behavior in IcingaDB-Web. When clicking on the Host History tab some servers show “Invalid host state 2” and no data.
This behavior started to show after upgrading to IcingaDB.
I already tried to create a new host config (with same host ID), reduce the config to the bare minimum but without success.
Changing the Host ID to something new worked. Changing it back results in the same error. That looks to me as if something got screwed up in the database during the migration.
Running a SQL query with one of the affected hostnames returns seemingly correct values. (see below) The host ID looks strange, but seems to work. (What data type is this?)

Looking at the php file in GitHub, mentioned in the Log, it seems that a state “2” is not expected:

But Host state “2” is DOWN according to this document:

Where does the host state come from?

Any ideas what is happening and if there is a way to get this fixed without having to rename all affected hosts?

Further information follows below.

best regards,
Andreas

Here is the Application Log record:

InvalidArgumentException in /usr/share/icingaweb2/modules/icingadb/library/Icingadb/Common/HostStates.php:71 with message: Invalid host state 2
#0 /usr/share/icingaweb2/modules/icingadb/library/Icingadb/Widget/ItemList/BaseHistoryListItem.php(250): Icinga\Module\Icingadb\Common\HostStates::text()
#1 /usr/share/icinga-php/ipl/vendor/ipl/web/src/Common/BaseListItem.php(121): Icinga\Module\Icingadb\Widget\ItemList\BaseHistoryListItem->assembleVisual()
#2 /usr/share/icinga-php/ipl/vendor/ipl/web/src/Common/BaseListItem.php(141): ipl\Web\Common\BaseListItem->createVisual()
#3 /usr/share/icinga-php/ipl/vendor/ipl/html/src/HtmlDocument.php(344): ipl\Web\Common\BaseListItem->assemble()
#4 /usr/share/icinga-php/ipl/vendor/ipl/html/src/HtmlDocument.php(566): ipl\Html\HtmlDocument->ensureAssembled()
#5 /usr/share/icinga-php/ipl/vendor/ipl/html/src/HtmlDocument.php(390): ipl\Html\HtmlDocument->render()
#6 /usr/share/icinga-php/ipl/vendor/ipl/html/src/BaseHtmlElement.php(297): ipl\Html\HtmlDocument->renderUnwrapped()
#7 /usr/share/icinga-php/ipl/vendor/ipl/html/src/BaseHtmlElement.php(365): ipl\Html\BaseHtmlElement->renderContent()
#8 /usr/share/icinga-php/ipl/vendor/ipl/html/src/HtmlDocument.php(568): ipl\Html\BaseHtmlElement->renderUnwrapped()
#9 /usr/share/icinga-php/ipl/vendor/ipl/html/src/HtmlDocument.php(390): ipl\Html\HtmlDocument->render()
#10 /usr/share/icinga-php/ipl/vendor/ipl/html/src/BaseHtmlElement.php(297): ipl\Html\HtmlDocument->renderUnwrapped()
#11 /usr/share/icinga-php/ipl/vendor/ipl/html/src/BaseHtmlElement.php(365): ipl\Html\BaseHtmlElement->renderContent()
#12 /usr/share/icinga-php/ipl/vendor/ipl/html/src/HtmlDocument.php(568): ipl\Html\BaseHtmlElement->renderUnwrapped()
#13 /usr/share/icinga-php/ipl/vendor/ipl/html/src/HtmlDocument.php(390): ipl\Html\HtmlDocument->render()
#14 /usr/share/icinga-php/ipl/vendor/ipl/html/src/HtmlDocument.php(568): ipl\Html\HtmlDocument->renderUnwrapped()
#15 /usr/share/icinga-php/ipl/vendor/ipl/web/src/Compat/ViewRenderer.php(62): ipl\Html\HtmlDocument->render()
#16 /usr/share/icinga-php/vendor/vendor/shardj/zf1-future/library/Zend/Controller/Action/Helper/ViewRenderer.php(970): ipl\Web\Compat\ViewRenderer->render()
#17 /usr/share/icinga-php/vendor/vendor/shardj/zf1-future/library/Zend/Controller/Action/HelperBroker.php(277): Zend_Controller_Action_Helper_ViewRenderer->postDispatch()
#18 /usr/share/icingaweb2/modules/icingadb/library/Icingadb/Web/Controller.php(484): Zend_Controller_Action_HelperBroker->notifyPostDispatch()
#19 /usr/share/php/Icinga/Web/Controller/Dispatcher.php(76): Icinga\Module\Icingadb\Web\Controller->dispatch()
#20 /usr/share/icinga-php/vendor/vendor/shardj/zf1-future/library/Zend/Controller/Front.php(954): Icinga\Web\Controller\Dispatcher->dispatch()
#21 /usr/share/php/Icinga/Application/Web.php(294): Zend_Controller_Front->dispatch()
#22 /usr/share/php/Icinga/Application/webrouter.php(105): Icinga\Application\Web->dispatch()
#23 /usr/share/icingaweb2/public/index.php(4): require_once(String)
#24 {main}

Test SQL Query:

select host.id,host.name,history.event_type from host inner join history on host.id=history.host_id where host.name="pfhn-vm-dc2.am.p-f.biz";

(skipped a lot of data here)

| <     !▒▒F▒▒▒p۸▒▒▒G▒▒            | pfhn-vm-dc2.am.p-f.biz | state_change   | 1714326007628 |
| <     !▒▒F▒▒▒p۸▒▒▒G▒▒            | pfhn-vm-dc2.am.p-f.biz | state_change   | 1714326122852 |
| <     !▒▒F▒▒▒p۸▒▒▒G▒▒            | pfhn-vm-dc2.am.p-f.biz | state_change   | 1714326238077 |
| <     !▒▒F▒▒▒p۸▒▒▒G▒▒            | pfhn-vm-dc2.am.p-f.biz | notification   | 1714326238147 |
| <     !▒▒F▒▒▒p۸▒▒▒G▒▒            | pfhn-vm-dc2.am.p-f.biz | notification   | 1714326238147 |
| <     !▒▒F▒▒▒p۸▒▒▒G▒▒            | pfhn-vm-dc2.am.p-f.biz | state_change   | 1716442960924 |

26866 rows in set

(I just made the query up to see if I get any reasonable results, no actual code from Icinga here.)
What would be the correct SQL query?

Give as much information as you can, e.g.

  • Icinga Web 2 version 2.12.2
  • Used modules and their versions (System - About)
  • businessprocess 2.5.1
  • dependency_plugin 0.0.0
  • doc 2.12.2
  • grafana 3.0.1
  • icingadb 1.1.3
  • map 2.0.0
  • migrate 2.12.2
  • reactbundle 0.7.0
  • x509 1.3.2
  • Icinga 2 version used (icinga2 --version) r2.14.5-1
  • PHP version used (php --version) 8.0.30
  • Server operating system and version RHEL 8.10

Thanks a lot for this detailed description and please excuse the issue you are facing.

Did you use the icingadb-migrate command to migrate your IDO history to Icinga DB?

Could you please execute the following query and post its output? Feel free to redact the output as you seem fit.

select * from host join state_history on host.id = state_history.host_id where object_type = 'host' and (soft_state = 2 or hard_state = 2);

(Edit: You maybe want to call mysql with the --binary-as-hex flag to display the binary IDs as hex instead of some character soup.)

If the data was migrated from IDO and you still have the IDO database available, please check if you can find a matching entry in the IDO’s icinga_statehistory table. The timestamp of Icinga DB’s state_history.event_time (type bigint, representing Unix timestamp) should match the IDO’s icinga_statehistory.state_time (type timestamp), if migrated.

Hello Alvar,

Did you use the icingadb-migrate command to migrate your IDO history to Icinga DB?

Yes, the data has been migrated with this command.

I exported the result of the Select statement to a CSV file and it has about 60MB and 112000 records. Even with the binary-as-hex switch the output looks largely like “character soup”.
Here is an excerpt with the last four records (slightly edited for readablility)

bg-fs1;10.0.198.249;hostalive;4;60;240;60;y;y;y;y;n;25;30;y;n;master;host;1702895439823;hard;2;1;1;0;1;PING CRITICAL - Packet loss = 100%;4;de-monitor3
jb-esxi01;192.168.84.253;hostalive;4;60;240;60;y;y;y;y;n;25;30;y;n;master;host;1697539602162;soft;2;0;0;1;1;PING CRITICAL - Packet loss = 100%;4;de-monitor3
hh-dc1;192.168.55.251;hostalive;4;60;240;60;y;y;y;y;n;25;30;y;n;master;host;1613743489865;soft;2;0;1;1;3;PING CRITICAL - Packet loss = 100%;4;de-monitor3
jb-cl1;192.168.84.242;hostalive;4;60;240;60;y;y;y;y;n;25;30;y;n;master;host;1689611763787;soft;2;0;1;1;2;PING CRITICAL - Packet loss = 100%;4;de-monitor3

I still have the IDO database, but I can´t get the two datasets to match.
In the IDO DB the field is type timestamp and in the IcingaDB it is bigint?
How do I have to convert the bigint to timestamp?
Using the statement from the link in your last post I tried this:

SELECT sh.statehistory_id, UNIX_TIMESTAMP(sh.state_time) AS state_time, sh.state_time_usec, sh.state, sh.state_type, sh.current_check_attempt, sh.max_check_attempts, sh.last_state, sh.last_hard_state, sh.output, sh.long_output, sh.check_source, o.objecttype_id, o.name1, COALESCE(o.name2, '') AS name2 FROM icinga_statehistory sh USE INDEX (PRIMARY) INNER JOIN icinga_objects o ON o.object_id=sh.object_id where state_time = 1689611763787;

Hi Andreas,

Thanks for your reply and the information.

If I am reading this correctly, the hard;2;1 and soft;2;0 parts should be state_type, soft_state and hard_state. So there seems to be a soft_state of 2 for a host check.

For reference, there are unfortunately two readings of host state: For one, there is the “normalized” version of 0 for UP and 1 for DOWN[1]. And then there is the “raw” version of 0 and 1 for UP, 2 and 3 for DOWN. The first can be found as a Host runtime attribute and the second is referenced in the check result state mapping.

Within the IDO, the normalized state value (0 or 1) should be stored. I have verified this both by reading some Icinga 2 code and within a local test setup.. However, it seems like your data differs.

This may be due to some code changes between Icinga 2 versions.

However, to verify this, could you please execute the following query on your IDO database and post its output? It should show all possible states from the icinga_statehistory table joined against icinga_objects, filtering for host objects (objecttype_id = 1).

select distinct state from icinga_statehistory inner join icinga_objects on icinga_objects.object_id = icinga_statehistory.object_id where objecttype_id = 1;

Your query was almost correct. You can use from_unixtime to get a datetime string from a Unix timestamp. Please note that you have to remove the last three digits from Icinga DB’s timestamp (or divide it by 1 000) since Icinga DB uses milliseconds while the function expects seconds.

Your query should look like. Please post the (redacted) output.

SELECT sh.statehistory_id, UNIX_TIMESTAMP(sh.state_time) AS state_time, sh.state_time_usec, sh.state, sh.state_type, sh.current_check_attempt, sh.max_check_attempts, sh.last_state, sh.last_hard_state, sh.output, sh.long_output, sh.check_source, o.objecttype_id, o.name1, COALESCE(o.name2, '') AS name2 FROM icinga_statehistory sh USE INDEX (PRIMARY) INNER JOIN icinga_objects o ON o.object_id=sh.object_id where state_time = FROM_UNIXTIME(1689611763);

Thanks again for your feedback. I am looking forward for your reply to verify the hypothesis of invalid data in the column. If so, I would create a patch for the icingadb-migrate command.


  1. And 99 for PENDING. But just pretend that this does not exists. ↩︎

Hello Alvar,

However, to verify this, could you please execute the following query on your IDO database and post its output? It should show all possible states from the icinga_statehistory table joined against icinga_objects , filtering for host objects (objecttype_id = 1 ).

Here is the result of the first query:

+-------+
| state |
+-------+
|     0 |
|     1 |
|     2 |
+-------+

And some examples for the second query:

"24653058";"1702895439";"823987";"2";"1";"1";"4";"1";"1";"PING CRITICAL - Packet loss = 100%";"";"de-monitor3";"1";"bg-fs1";""
"27540973";"1717552268";"345009";"2";"0";"2";"4";"1";"0";"PING CRITICAL - Packet loss = 100%";"";"de-monitor3";"1";"bu-cisco2960x-7";""
"27540974";"1717552268";"560440";"3";"0";"1";"4";"0";"0";"No answer from host 10.88.17.35:161";"";"de-monitor3";"2";"bu-fs1";"win-srv-Emp-Sync"
"22454049";"1689611763";"787553";"2";"0";"2";"4";"1";"0";"PING CRITICAL - Packet loss = 100%";"";"de-monitor3";"1";"jb-cl1";""

Is this sufficient or do you need more data?

regards,
Andreas

Thanks again for your feedback.

To ease the readability, I have put your CSV output into a table.

sh.statehistory_id UNIX_TIMESTAMP(sh.state_time) AS state_time sh.state_time_usec sh.state sh.state_type sh.current_check_attempt sh.max_check_attempts sh.last_state sh.last_hard_state sh.output sh.long_output sh.check_source o.objecttype_id o.name1 COALESCE(o.name2 ‘’) AS name2
“24653058” “1702895439” “823987” “2” “1” “1” “4” “1” “1” “PING CRITICAL - Packet loss = 100%” “” “de-monitor3” “1” “bg-fs1” “”
“27540973” “1717552268” “345009” “2” “0” “2” “4” “1” “0” “PING CRITICAL - Packet loss = 100%” “” “de-monitor3” “1” “bu-cisco2960x-7” “”
“27540974” “1717552268” “560440” “3” “0” “1” “4” “0” “0” “No answer from host 10.88.17.35:161” “” “de-monitor3” “2” “bu-fs1” “win-srv-Emp-Sync”
“22454049” “1689611763” “787553” “2” “0” “2” “4” “1” “0” “PING CRITICAL - Packet loss = 100%” “” “de-monitor3” “1” “jb-cl1” “”

Indeed, there are entries with sh.state = 2 while o.objecttype_id = 1. So the state is not the “normalized” one, but the “raw” one.

Again, I went to my test setup and verified that I have only 0 or 1 as a state value for hosts. However, as speculated before, this may be due to Icinga 2 changed through versions.

MariaDB [icinga2]> SELECT distinct sh.state FROM icinga_statehistory sh USE INDEX (PRIMARY) INNER JOIN icinga_objects o ON o.object_id=sh.object_id where o.objecttype_id = 1;
+-------+
| state |
+-------+
|     0 |
|     1 |
+-------+
2 rows in set (0.015 sec)

I am going to create a patch against the icingadb-migrate tool, capping the state for host events. With luck, I am able to do this tomorrow.

Since you do already have the invalid values within your Icinga DB relational database, you can clean them up as follows. Please proceed with caution and create a backup beforehand. If something goes wrong and you have no backup, you are on your own. (Please excuse the frightening disclaimer.)

update state_history set soft_state = 1 where id in (select state_history.id from host join state_history on host.id = state_history.host_id where object_type = 'host' and soft_state between 2 and 98);
update state_history set hard_state = 1 where id in (select state_history.id from host join state_history on host.id = state_history.host_id where object_type = 'host' and hard_state between 2 and 98);

Yes, our Icinga2 Installation is very old and was originally a Nagios, then Icinga1 and now Icinga2 with IcingaDB. It has seen some stuff. :slight_smile:

I will update the existing records and let you know about the results. But I will probably not be able to do so before next week.

Thanks for your support already. Much appreciated!

Hello Alvar,

So I just ran the two SQL Statements you provided to me. Seems that the problem was only caused by soft state records:

MariaDB [icingadb]> update state_history set soft_state = 1 where id in (select state_history.id from host join state_history on host.id = state_history.host_id where object_type = 'host' and soft_state between 2 and 98);
Query OK, 108216 rows affected (2 min 16.338 sec)
Rows matched: 108216  Changed: 108216  Warnings: 0

MariaDB [icingadb]> update state_history set hard_state = 1 where id in (select state_history.id from host join state_history on host.id = state_history.host_id where object_type = 'host' and hard_state between 2 and 98);
Query OK, 0 rows affected (2 min 3.710 sec)
Rows matched: 0  Changed: 0  Warnings: 0

So far the machines that showed the wrong behavior are looking good now.

Thanks for the support.

best regards,
Andreas

I am glad to hear this!

There is now https://github.com/Icinga/icingadb/pull/975, addressing this very issue in the icingadb-migrate utility for future users.

Thanks again for your detailed bug report, the provided information and your patience.

I am going to mark this thread as resolved. If you still experience any issues, feel free to continue here.