Author: @anon66228339
Original: https://monitoring-portal.org/t/retention-policies-and-continuous-queries-made-simple/1792
This is a small howto about downsampling data for InfluxDB.
Use this howto on your own risk!
And as always → create a backup first before you change anything!
Setup
InfluxDB name: icinga2
default retention policy: rp_1_year for new databases, or autogen for existing.
Terms
continuous query (CQ) An query that runs automatically and periodically within a database, normaly every 30 minutes if you didnt configure other times in your influxdb.ini
retention policy (RP) The part of InfluxDB’s data structure that describes for how long InfluxDB keeps data (duration).
Actual situation for most InfluxDB setups used with Icinga2
Most users setup InfluxDB like in this blog post, which works perfect.
Your Icinga2 server will send all the metrics (and meta data) to your InfluxDB, InfluxDB store all data inside the “DEFAULT” retention policy and keep it forever (or until you have enough and delete it). That is how InfluxDB works, you store data in a retention policy!
But what if you don’t want to store all the data indefinitely? You will need a retention policy (or more) and if you also want to downsample your data older then x hours|days|weeks|years you also need continuous queries Lets create a setup that will downsample all data older then 52 weeks (1 hour to one entry) and downsample again after 104 weeks (1 day to one entry) deleting data older then 3 years.
Explanation:
Icinga2 will send performance data to InfluxDB → InfluxDB will store all data inside the default retention policy.
The first continuous queries will pull data, that is older then 52 weeks, every hour from the default retention policy and write it into the retention policy “rp_2_years”.
The second continuous queries “cq_after_2_years” will pull data, that is older then 104 weeks, every day from the retention policy “rp_2_years” and write it into the retention policy “rp_3_years”.
The default retention policy will delete all data from the default policy that is older then 52 weeks and one day.
The retention policy “rp_2_years” will delete all data from the retention policy “rp_2_years” that is older then 104 weeks and 2 days.
The retention policy “rp_3_years” will delete all data from the retention policy “rp_3_years” that is older then 156 weeks.
Prepare a new InfluxDB database
- Log into influx cli
- Create database with non default retention policy named “one_year”
CREATE DATABASE "icinga2" WITH DURATION 52w1d REPLICATION 1 SHARD DURATION 168h NAME "rp_1_year"
- Change to your newly created database
use icinga2
- Create second retention policy “rp_2_years” and “rp_3_years”
CREATE RETENTION POLICY "rp_2_years" ON "icinga2" DURATION 104w2d REPLICATION 1
CREATE RETENTION POLICY "rp_3_years" ON "icinga2" DURATION 156w REPLICATION 1
- Check if all retention policies are there
> SHOW RETENTION POLICIES
name duration shardGroupDuration replicaN default
---- -------- ------------------ -------- -------
rp_1_year 8760h0m0s 168h0m0s 1 true
rp_2_years 17520h0m0s 168h0m0s 1 false
rp_3_years 26208h0m0s 168h0m0s 1 false
- Create the first continuous query that will downsample all metric data (meta data will not be taken) older 52 weeks and downsample 1 hour of data into 1 entry writing it into the retention policy named
rp_2_years
CREATE CONTINUOUS QUERY "cq_after_1_year" ON "icinga2" BEGIN SELECT mean(value) AS value,mean(crit) AS crit,mean(warn) AS warn INTO "icinga2"."rp_2_years".:MEASUREMENT FROM "icinga2"."rp_1_year"./.*/ WHERE time < now() -52w GROUP BY time(1h),* END
- For the next continuous query we want to use data from the retention policy “rp_2_years”
CREATE CONTINUOUS QUERY "cq_after_2_year" ON "icinga2" BEGIN SELECT mean(value) AS value,mean(crit) AS crit,mean(warn) AS warn INTO "icinga2"."rp_3_years".:MEASUREMENT FROM "icinga2"."rp_2_years"./.*/ WHERE time < now() -104w GROUP BY time(1d),* END
- Check if all continuous are set like we wanted it.
> SHOW CONTINUOUS QUERIES
name: _internal
name query
---- -----
name: icinga2
name query
---- -----
cq_after_1_year CREATE CONTINUOUS QUERY cq_after_1_year ON icinga2 BEGIN SELECT mean(value) AS value, mean(crit) AS crit, mean(warn) AS warn INTO icinga2.rp_2_years.:MEASUREMENT FROM icinga2.rp_1_year./.*/ WHERE time < now() - 52w GROUP BY time(1h), * END
cq_after_2_year CREATE CONTINUOUS QUERY cq_after_2_year ON icinga2 BEGIN SELECT mean(value) AS value, mean(crit) AS crit, mean(warn) AS warn INTO icinga2.rp_3_years.:MEASUREMENT FROM icinga2.rp_2_years./.*/ WHERE time < now() - 104w GROUP BY time(1d), * END
For existing databases with a retention policy “autogen”
Hint
Its not possible to rename existing retention policies, but we can alter them.
Do not drop the default retention policy, you will loose all your data!
- Log into influx cli
- Change to your existing database
use icinga2
- Create second retention policy “rp_2_years” and “rp_3_years”
CREATE RETENTION POLICY "rp_2_years" ON "icinga2" DURATION 104w2d REPLICATION 1
CREATE RETENTION POLICY "rp_3_years" ON "icinga2" DURATION 156w REPLICATION 1
- Check if all retention policies are there
> SHOW RETENTION POLICIES
name duration shardGroupDuration replicaN default
---- -------- ------------------ -------- -------
autogen 0s 168h0m0s 1 true
rp_2_years 17520h0m0s 168h0m0s 1 false
rp_3_years 26208h0m0s 168h0m0s 1 false
- Create the first continuous query that will downsample all metric data (meta data will not be taken) older 52 weeks and downsample 1 hour of data into 1 entry writing it into the retention policy named
rp_2_years
CREATE CONTINUOUS QUERY "cq_after_1_year" ON "icinga2" BEGIN SELECT mean(value) AS value,mean(crit) AS crit,mean(warn) AS warn INTO "icinga2"."rp_2_years".:MEASUREMENT FROM "icinga2"."autogen"./.*/ WHERE time < now() -52w GROUP BY time(1h),* END
- For the next continuous query we want to use data from the retention policy “rp_2_years”
CREATE CONTINUOUS QUERY "cq_after_2_year" ON "icinga2" BEGIN SELECT mean(value) AS value,mean(crit) AS crit,mean(warn) AS warn INTO "icinga2"."rp_3_years".:MEASUREMENT FROM "icinga2"."rp_2_years"./.*/ WHERE time < now() -104w GROUP BY time(1d),* END
- Check if all continuous are set like we wanted it.
> SHOW CONTINUOUS QUERIES
name: _internal
name query
---- -----
name: icinga2
name query
---- -----
cq_after_1_year CREATE CONTINUOUS QUERY cq_after_1_year ON icinga2 BEGIN SELECT mean(value) AS value, mean(crit) AS crit, mean(warn) AS warn INTO icinga2.rp_2_years.:MEASUREMENT FROM icinga2.autogen./.*/ WHERE time < now() - 52w GROUP BY time(1h), * END
cq_after_2_year CREATE CONTINUOUS QUERY cq_after_2_year ON icinga2 BEGIN SELECT mean(value) AS value, mean(crit) AS crit, mean(warn) AS warn INTO icinga2.rp_3_years.:MEASUREMENT FROM icinga2.rp_2_years./.*/ WHERE time < now() - 104w GROUP BY time(1d), * END
- Wait for at least 2 days, because second CQ runs only every day once, or more days depending on your amount of data, before you alter the default retention policy to delete all data older 52w and 1 day
ALTER RETENTION POLICY "autogen" ON "icinga2" DURATION 52w1d REPLICATION 1 SHARD DURATION 168h DEFAULT
- Check if the “autogen” policy has changed
> show retention policies
name duration shardGroupDuration replicaN default
---- -------- ------------------ -------- -------
autogen 8760h0m0s 168h0m0s 1 true
rp_2_years 17520h0m0s 168h0m0s 1 false
rp_3_years 26208h0m0s 168h0m0s 1 false
If you still have questions, just ask here or open a new topic.
Cheers,
Carsten