{"id":2379,"date":"2026-01-02T20:30:00","date_gmt":"2026-01-03T01:30:00","guid":{"rendered":"https:\/\/enterpriseadmins.org\/blog\/?p=2379"},"modified":"2026-01-02T13:32:43","modified_gmt":"2026-01-02T18:32:43","slug":"an-unexpected-benefit-of-application-aware-backups-finding-and-fixing-database-bloat","status":"publish","type":"post","link":"https:\/\/enterpriseadmins.org\/blog\/scripting\/an-unexpected-benefit-of-application-aware-backups-finding-and-fixing-database-bloat\/","title":{"rendered":"An Unexpected Benefit of Application-Aware Backups: Finding and Fixing Database Bloat"},"content":{"rendered":"\n<p>While working on my recent post about why crash-consistent VM backups aren\u2019t always enough, I ran into an unexpected but very useful side effect of adding application-aware database backups.<\/p>\n\n\n\n<p>Once I started creating regular database dumps for my phpIPAM instances, I noticed something that had been completely invisible when relying solely on full VM backups: the database backups themselves were wildly different sizes.<\/p>\n\n\n\n<p>That observation kicked off a short investigation that ultimately led to cleaning up unnecessary data, shrinking backups, and better understanding what was actually stored in the application.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">The Initial Observation: Backup Size Discrepancies<\/h2>\n\n\n\n<p>I run multiple phpIPAM instances in my lab. Functionally, they\u2019re similar and store roughly comparable types of data. When I began dumping their databases as part of a snapshot freeze workflow, I expected the backups to be in the same general size range.  They weren\u2019t.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>One instance produced a database dump of roughly 489 MB uncompressed (about 23 MB compressed)<\/li>\n\n\n\n<li>Another instance produced a dump of only 5 MB uncompressed (under 1 MB compressed)<\/li>\n<\/ul>\n\n\n\n<p>At the VM level, this difference was completely masked. A full-VM backup doesn\u2019t make it obvious whether one application\u2019s data is growing abnormally or not\u2014it all just looks like blocks on disk.<\/p>\n\n\n\n<p>The database-level backups, however, made the discrepancy impossible to ignore.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Why VM-Level Backups Hid the Problem<\/h2>\n\n\n\n<p>This is one of those cases where VM backups were doing their job perfectly\u2014and still hiding a problem.<\/p>\n\n\n\n<p>From the perspective of the hypervisor:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>The VM was healthy<\/li>\n\n\n\n<li>Snapshots completed successfully<\/li>\n\n\n\n<li>Backups restored without issue<\/li>\n<\/ul>\n\n\n\n<p>But VM backups don\u2019t provide visibility. They protect everything equally, whether the data is critical, redundant, or no longer useful.<\/p>\n\n\n\n<p>Application-aware backups, by contrast, force you to look directly at what\u2019s being protected. In this case, the size difference alone was enough to raise questions.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Digging into the phpIPAM Database<\/h2>\n\n\n\n<p>With the size discrepancy in hand, the next step was to look at the database itself.<\/p>\n\n\n\n<p>By inspecting table sizes and row counts, it quickly became clear that one instance was retaining a significant amount of historical or log-related data that the other was not.<\/p>\n\n\n\n<p>To connect to the database, which was running in a container, I ran:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>docker compose exec devipam-mariadb \/bin\/bash<\/code><\/pre>\n\n\n\n<p>Once I was inside the container, I connected to the database with<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>mariadb -u root -p<\/code><\/pre>\n\n\n\n<p>From here, ChatGPT helped me with some SQL queries.  The one to find the largest table was:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>SELECT\n     table_schema as `Database`, \n     table_name AS `Table`, \n     round(((data_length + index_length) \/ 1024 \/ 1024), 2) `Size in MB` \nFROM information_schema.TABLES \nORDER BY (data_length + index_length) DESC\nLIMIT 5;<\/code><\/pre>\n\n\n\n<p>This was pointing me at the <code>phpipam.logs<\/code> table, and to get a feel for some of the events it contained I ran:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>SELECT *\nFROM phpipam.logs\nLIMIT 5;<\/code><\/pre>\n\n\n\n<p>A few more investigative queries, grouping my username and command, led me to an existing phpIPAM issue:<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>phpIPAM GitHub Issue #3545 \u2013 Excessive database growth due to retained data<\/p>\n<\/blockquote>\n\n\n\n<p>The issue documents how certain tables can grow unbounded over time, particularly with historical scan and discovery data enabled.  This issue (<a href=\"https:\/\/github.com\/phpipam\/phpipam\/issues\/3545\">https:\/\/github.com\/phpipam\/phpipam\/issues\/3545<\/a>) even provided a sample query to aid with cleanup.  The issue showed creating this as a recurring job, but based on my data this issue was no longer occurring on a regular basis, it was an issue that happened in the past.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Cleaning Up the Data<\/h2>\n\n\n\n<p>Armed with that context, I ran a small number of targeted queries to understand and then remove old, unnecessary entries. The goal wasn\u2019t to blindly delete data, but to:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Identify logs events responsible for the majority of the growth<\/li>\n\n\n\n<li>Confirm the data was no longer operationally useful<\/li>\n\n\n\n<li>Reduce backup size without impacting functionality<\/li>\n<\/ul>\n\n\n\n<p>The following query tested the logic I was going to use for removals:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>SELECT\n    COUNT(*) AS rows_to_delete,\n    MIN(date) AS oldest,\n    MAX(date) AS newest\nFROM phpipam.logs\nWHERE (command = 'user login' or command like 'users object % edit' or details like '% in ipaddresses edited. hostname: %')\n  AND date &lt; NOW() - INTERVAL 60 DAY;<\/code><\/pre>\n\n\n\n<p>This showed about 2.8m rows, dating back nearly 3 years, that I thought would be safe to delete.   Changing the statement (replacing the SELECT with a DELETE) resulted in the final cleanup query:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>DELETE FROM phpipam.logs\nWHERE (command = 'user login' or command like 'users object % edit' or details like '% in ipaddresses edited. hostname: %')\n  AND date &lt; NOW() - INTERVAL 60 DAY;<\/code><\/pre>\n\n\n\n<p>This query took about 20 seconds to execute and deleted the expected 2.8m rows.  The functionality of phpIPAM is unchanged, but the backup related results were immediate.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Database sizes across instances were now much closer<\/li>\n\n\n\n<li>Compressed backup sizes dropped significantly<\/li>\n\n\n\n<li>Backup and restore operations became faster<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">The Secondary Win: Smaller, Faster Backups<\/h2>\n\n\n\n<p>Reducing database size isn\u2019t just about saving disk space. Smaller application backups mean:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Faster freeze-script execution<\/li>\n\n\n\n<li>Shorter snapshot windows<\/li>\n\n\n\n<li>Less data to validate during restores<\/li>\n\n\n\n<li>Lower risk during recovery<\/li>\n<\/ul>\n\n\n\n<p>In other words, improving the quality of the data improved the reliability of the backup process itself.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Lessons Learned<\/h2>\n\n\n\n<p>This entire chain of events started with a simple goal: making sure I had a known good copy of application data.  What I didn\u2019t expect was that application-aware backups would act as a diagnostic tool:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>They exposed abnormal data growth<\/li>\n\n\n\n<li>They encouraged closer inspection of the database<\/li>\n\n\n\n<li>They led to tangible improvements in backup efficiency<\/li>\n<\/ul>\n\n\n\n<p>It\u2019s a good reminder that backups aren\u2019t just about recovery&#8230; they\u2019re also a feedback mechanism. When you actually look at what you\u2019re backing up, problems that were previously hidden at the VM layer become much easier to spot.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Conclusion<\/h2>\n\n\n\n<p>Crash-consistent VM backups remain a solid foundation, especially in lab environments.  But once you add application-aware backups, you may gain another layer of visibility.<\/p>\n\n\n\n<p>In this case, that visibility surfaced unnecessary data growth in phpIPAM, reduced backup sizes, and improved overall reliability. That\u2019s a win well beyond the original goal of \u201cjust\u201d having a safer backup.<\/p>\n\n\n\n<p>If nothing else, this experience reinforced one idea: when you back up data at the application level, you\u2019re forced to understand the application better.<\/p>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>While working on my recent post about why crash-consistent VM backups aren\u2019t always enough, I ran into an unexpected but very useful side effect of adding application-aware database backups. Once I started creating regular database dumps for my phpIPAM instances, &hellip; <a href=\"https:\/\/enterpriseadmins.org\/blog\/scripting\/an-unexpected-benefit-of-application-aware-backups-finding-and-fixing-database-bloat\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":6,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[9,3],"tags":[],"class_list":["post-2379","post","type-post","status-publish","format-standard","hentry","category-lab-infrastructure","category-scripting"],"_links":{"self":[{"href":"https:\/\/enterpriseadmins.org\/blog\/wp-json\/wp\/v2\/posts\/2379","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/enterpriseadmins.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/enterpriseadmins.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/enterpriseadmins.org\/blog\/wp-json\/wp\/v2\/users\/6"}],"replies":[{"embeddable":true,"href":"https:\/\/enterpriseadmins.org\/blog\/wp-json\/wp\/v2\/comments?post=2379"}],"version-history":[{"count":2,"href":"https:\/\/enterpriseadmins.org\/blog\/wp-json\/wp\/v2\/posts\/2379\/revisions"}],"predecessor-version":[{"id":2390,"href":"https:\/\/enterpriseadmins.org\/blog\/wp-json\/wp\/v2\/posts\/2379\/revisions\/2390"}],"wp:attachment":[{"href":"https:\/\/enterpriseadmins.org\/blog\/wp-json\/wp\/v2\/media?parent=2379"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/enterpriseadmins.org\/blog\/wp-json\/wp\/v2\/categories?post=2379"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/enterpriseadmins.org\/blog\/wp-json\/wp\/v2\/tags?post=2379"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}