IcingaDB failing exactly 5 minutes after start

Hey,

I’m facing some weird IcingaDB behaviour after upgrade:

  • Icinga2 from 2.13 to 2.14
  • IcingaDB from v1.1.0 to v1.2.0

It is Debian 11 and to upgrade I’m using apt upgrade.

Everything seems to be running fine, but after 5 minutes (every time) icingadb is failing with the following error. I was trying to look to what might cause the issue, but haven’t found anything reasonable.

Aug 06 04:32:38 chc-prd-vici01 icingadb[611247]: history-sync: Synced 1 state history items
Aug 06 04:32:44 chc-prd-vici01 icingadb[611247]: runtime-updates: Upserted 1 ServiceState items
Aug 06 04:33:04 chc-prd-vici01 icingadb[611247]: config-sync: Aborted config sync after 5m0.177655907s
Aug 06 04:33:05 chc-prd-vici01 icingadb[611247]: Error 1048 (23000): Column 'flatvalue' cannot be null
                                                 can't perform "INSERT INTO \"customvar_flat\" (\"flatname_checksum\", \"flatvalue\", \"customvar_id\", \"environment_id\", \"id\", \"flatname\") VALUES (:flatname_checksum, :flatvalue, :customvar_id, :environment_id, :id, :flatname)"
                                                 github.com/icinga/icingadb/internal.CantPerformQuery
                                                         github.com/icinga/icingadb/internal/internal.go:30
                                                 github.com/icinga/icingadb/pkg/icingadb.(*DB).NamedBulkExec.func1.(*DB).NamedBulkExec.func1.1.2.1
                                                         github.com/icinga/icingadb/pkg/icingadb/db.go:412
                                                 github.com/icinga/icingadb/pkg/retry.WithBackoff
                                                         github.com/icinga/icingadb/pkg/retry/retry.go:60
                                                 github.com/icinga/icingadb/pkg/icingadb.(*DB).NamedBulkExec.func1.(*DB).NamedBulkExec.func1.1.2
                                                         github.com/icinga/icingadb/pkg/icingadb/db.go:407
                                                 golang.org/x/sync/errgroup.(*Group).Go.func1
                                                         golang.org/x/sync@v0.7.0/errgroup/errgroup.go:78
                                                 runtime.goexit
                                                         runtime/asm_amd64.s:1695
                                                 retry deadline exceeded
                                                 github.com/icinga/icingadb/pkg/retry.WithBackoff
                                                         github.com/icinga/icingadb/pkg/retry/retry.go:95
                                                 github.com/icinga/icingadb/pkg/icingadb.(*DB).NamedBulkExec.func1.(*DB).NamedBulkExec.func1.1.2
                                                         github.com/icinga/icingadb/pkg/icingadb/db.go:407
                                                 golang.org/x/sync/errgroup.(*Group).Go.func1
                                                         golang.org/x/sync@v0.7.0/errgroup/errgroup.go:78
                                                 runtime.goexit
                                                         runtime/asm_amd64.s:1695
Aug 06 04:33:05 chc-prd-vici01 systemd[1]: icingadb.service: Main process exited, code=exited, status=1/FAILURE
Aug 06 04:33:05 chc-prd-vici01 systemd[1]: icingadb.service: Failed with result 'exit-code'.
Aug 06 04:33:05 chc-prd-vici01 systemd[1]: icingadb.service: Consumed 4.839s CPU time.

This is how it looks like with icingadb debug logs enabled:

Aug 06 04:53:21 chc-prd-vici01 icingadb[612338]: database: Finished executing "INSERT INTO \"customvar_flat\" (\"environment_id\", \"id\", \"flatname\", \"flatname_checksum\", \"flatvalue\", \"customvar_id\") VALUES (:environment_id,:id,:flatname,:flatname_checksum,:flatvalue,:customvar_id) ON DUPLICATE KEY UPDATE \"environment_id\" = VALUES(\"environment_id\"),\"id\" = VALUES(\"id\"),\"flatname\" = VALUES(\"flatname\"),\"flatname_checksum\" = VALUES(\"flatname_checksum\"),\"flatvalue\" = VALUES(\"flatvalue\"),\"customvar_id\" = VALUES(\"customvar_id\")" with 0 rows in 4m59.707201867s
Aug 06 04:53:21 chc-prd-vici01 icingadb[612338]: database: Finished executing "INSERT INTO \"host_state\" (\"performance_data\", \"environment_id\", \"host_id\", \"hard_state\", \"next_check\", \"next_update\", \"output\", \"previous_hard_state\", \"is_problem\", \"soft_state\", \"execution_time\", \"is_reachable\", \"previous_soft_state\", \"is_flapping\", \"properties_checksum\", \"severity\", \"acknowledgement_comment_id\", \"long_output\", \"id\", \"check_attempt\", \"is_acknowledged\", \"state_type\", \"check_timeout\", \"scheduling_source\", \"last_state_change\", \"latency\", \"is_handled\", \"in_downtime\", \"check_commandline\", \"check_source\", \"last_comment_id\", \"normalized_performance_data\", \"last_update\") VALUES (:performance_data,:environment_id,:host_id,:hard_state,:next_check,:next_update,:output,:previous_hard_state,:is_problem,:soft_state,:execution_time,:is_reachable,:previous_soft_state,:is_flapping,:properties_checksum,:severity,:acknowledgement_comment_id,:long_output,:id,:check_attempt,:is_acknowledged,:state_type,:check_timeout,:scheduling_source,:last_state_change,:latency,:is_handled,:in_downtime,:check_commandline,:check_source,:last_comment_id,:normalized_performance_data,:last_update) ON DUPLICATE KEY UPDATE \"performance_data\" = VALUES(\"performance_data\"),\"environment_id\" = VALUES(\"environment_id\"),\"host_id\" = VALUES(\"host_id\"),\"hard_state\" = VALUES(\"hard_state\"),\"next_check\" = VALUES(\"next_check\"),\"next_update\" = VALUES(\"next_update\"),\"output\" = VALUES(\"output\"),\"previous_hard_state\" = VALUES(\"previous_hard_state\"),\"is_problem\" = VALUES(\"is_problem\"),\"soft_state\" = VALUES(\"soft_state\"),\"execution_time\" = VALUES(\"execution_time\"),\"is_reachable\" = VALUES(\"is_reachable\"),\"previous_soft_state\" = VALUES(\"previous_soft_state\"),\"is_flapping\" = VALUES(\"is_flapping\"),\"properties_checksum\" = VALUES(\"properties_checksum\"),\"severity\" = VALUES(\"severity\"),\"acknowledgement_comment_id\" = VALUES(\"acknowledgement_comment_id\"),\"long_output\" = VALUES(\"long_output\"),\"id\" = VALUES(\"id\"),\"check_attempt\" = VALUES(\"check_attempt\"),\"is_acknowledged\" = VALUES(\"is_acknowledged\"),\"state_type\" = VALUES(\"state_type\"),\"check_timeout\" = VALUES(\"check_timeout\"),\"scheduling_source\" = VALUES(\"scheduling_source\"),\"last_state_change\" = VALUES(\"last_state_change\"),\"latency\" = VALUES(\"latency\"),\"is_handled\" = VALUES(\"is_handled\"),\"in_downtime\" = VALUES(\"in_downtime\"),\"check_commandline\" = VALUES(\"check_commandline\"),\"check_source\" = VALUES(\"check_source\"),\"last_comment_id\" = VALUES(\"last_comment_id\"),\"normalized_performance_data\" = VALUES(\"normalized_performance_data\"),\"last_update\" = VALUES(\"last_update\")" with 0 rows in 4m59.707277104s
Aug 06 04:53:21 chc-prd-vici01 icingadb[612338]: database: Finished executing "DELETE FROM \"service_state\" WHERE id IN (?)" with 0 rows in 4m59.707266439s
Aug 06 04:53:21 chc-prd-vici01 icingadb[612338]: database: Finished executing "DELETE FROM \"host_state\" WHERE id IN (?)" with 0 rows in 4m59.707306438s
Aug 06 04:53:21 chc-prd-vici01 icingadb[612338]: database: Finished executing "INSERT INTO \"customvar\" (\"value\", \"environment_id\", \"name\", \"name_checksum\", \"id\") VALUES (:value,:environment_id,:name,:name_checksum,:id) ON DUPLICATE KEY UPDATE \"value\" = VALUES(\"value\"),\"environment_id\" = VALUES(\"environment_id\"),\"name\" = VALUES(\"name\"),\"name_checksum\" = VALUES(\"name_checksum\"),\"id\" = VALUES(\"id\")" with 0 rows in 4m59.707282198s
Aug 06 04:53:21 chc-prd-vici01 icingadb[612338]: database: Executed "INSERT INTO \"sla_history_state\" (\"hard_state\", \"object_type\", \"host_id\", \"service_id\", \"event_time\", \"environment_id\", \"endpoint_id\", \"id\", \"previous_hard_state\") VALUES (:hard_state,:object_type,:host_id,:service_id,:event_time,:environment_id,:endpoint_id,:id,:previous_hard_state) ON DUPLICATE KEY UPDATE \"id\" = VALUES(\"id\")" with 1 rows
Aug 06 04:53:22 chc-prd-vici01 icingadb[612338]: Error 1048 (23000): Column 'flatvalue' cannot be null
                                                 can't perform "INSERT INTO \"customvar_flat\" (\"id\", \"flatname\", \"flatname_checksum\", \"flatvalue\", \"customvar_id\", \"environment_id\") VALUES (:id, :flatname, :flatname_checksum, :flatvalue, :customvar_id, :environment_id)"
                                                 github.com/icinga/icingadb/internal.CantPerformQuery
                                                         github.com/icinga/icingadb/internal/internal.go:30
                                                 github.com/icinga/icingadb/pkg/icingadb.(*DB).NamedBulkExec.func1.(*DB).NamedBulkExec.func1.1.2.1
                                                         github.com/icinga/icingadb/pkg/icingadb/db.go:412
                                                 github.com/icinga/icingadb/pkg/retry.WithBackoff
                                                         github.com/icinga/icingadb/pkg/retry/retry.go:60
                                                 github.com/icinga/icingadb/pkg/icingadb.(*DB).NamedBulkExec.func1.(*DB).NamedBulkExec.func1.1.2
                                                         github.com/icinga/icingadb/pkg/icingadb/db.go:407
                                                 golang.org/x/sync/errgroup.(*Group).Go.func1
                                                         golang.org/x/sync@v0.7.0/errgroup/errgroup.go:78
                                                 runtime.goexit
                                                         runtime/asm_amd64.s:1695
                                                 retry deadline exceeded
                                                 github.com/icinga/icingadb/pkg/retry.WithBackoff
                                                         github.com/icinga/icingadb/pkg/retry/retry.go:95
                                                 github.com/icinga/icingadb/pkg/icingadb.(*DB).NamedBulkExec.func1.(*DB).NamedBulkExec.func1.1.2
                                                         github.com/icinga/icingadb/pkg/icingadb/db.go:407
                                                 golang.org/x/sync/errgroup.(*Group).Go.func1
                                                         golang.org/x/sync@v0.7.0/errgroup/errgroup.go:78
                                                 runtime.goexit
                                                         runtime/asm_amd64.s:1695
Aug 06 04:53:22 chc-prd-vici01 systemd[1]: icingadb.service: Main process exited, code=exited, status=1/FAILURE
Aug 06 04:53:22 chc-prd-vici01 systemd[1]: icingadb.service: Failed with result 'exit-code'.
Aug 06 04:53:22 chc-prd-vici01 systemd[1]: icingadb.service: Consumed 3.352s CPU time.

Can you please share the result of the following query on your database:

select * from icingadb_schema order by timestamp;

I already have an idea where the problem might come from:

When doing this upgrade, which of the schema upgrades did you apply? You have to apply all intermediate ones, in this case also 1.1.1.sql which happens to be where the not null constraint on flatvalue was removed.

1 Like

I have DEV and PROD icinga servers deployments. DEV is still in a broken state, so I run queries on two different databases, but they are almost identical deployments.

That’s the result before upgrade v1.1.0 (PROD icinga):

mysql> select * from icingadb_schema order by timestamp;
+----+---------+-------------------+
| id | version | timestamp         |
+----+---------+-------------------+
|  1 |       3 | 20230523062347000 |
+----+---------+-------------------+
1 row in set (0.00 sec)

and after the upgrade v1.2.0 (DEV icinga):

mysql> select * from icingadb_schema order by timestamp;
+----+---------+---------------+
| id | version | timestamp     |
+----+---------+---------------+
|  1 |       3 | 1674518617000 |
|  2 |       5 | 1722868599000 |
+----+---------+---------------+
2 rows in set (0.00 sec)

Of course, I haven’t applied intermediate schemas :slight_smile: I must have missed this in the manual. I just applied 1.2.0.sql