Database sizing and setup
This page describes more details about the requirements for your database.
Which database modifications should I consider when publishing massive numbers of publication requests?
It is not recommended to publish enormous amounts of publication requests unless necessary. Nonetheless, if you must publish such amounts, particularly with PostgreSQL, please fine-tune your autovacuum operations as advised below.
Adjust autovacuum operations in line with the example recommended settings below to prevent considerable growth in a PostgreSQL instance in the publication context described above.
autovacuum_vacuum_cost_delay=10ms shared_buffers=4GB autovacuum_work_mem=2GB maintenance_work_mem=8GB autovacuum_max_workers=6 autovacuum_naptime=15s autovacuum_vacuum_threshold=25 autovacuum_vacuum_scale_factor=0.1 autovacuum_analyze_threshold=10 autovacuum_analyze_scale_factor=0.05 autovacuum_vacuum_cost_delay=10ms autovacuum_vacuum_cost_limit=1000
|You should adapt these settings to the resources available in your environment.|
The volume of data required to run Magnolia is negligible, typically below 10 MB. The main driver of the total data volume is the customer who decides what content to store in Magnolia and in how many versions to store the content.
A reasonable operational size should be:
Double the amount of the data the customer has if the content is versioned.
Less than that if the content is not versioned.
More than that if there is a lot of content which is versioned frequently.
There are four tables and three indexes created for every JCR workspace. The total number of workspaces depends on the modules the customer decides to use and the number of content types created. One content type means one workspace. The default Magnolia bundle comes with approximately 20 workspaces, hence there are around 80 database tables in total.
This is configurable. By default, the database will only store text-based data up to the size of one kilobyte. However, this limit can be increased.
The datastore (the binaries) can be configured to be stored in the database as well. While this increases the load on the database, it will greatly simplify backup and restore strategies and, therefore, this is the recommended setup. With all the data in the database, there will be an additional table per workspace, hence not four but five tables per content type.
The data stored is structured. However, it is not directly open for manipulation at the database level. The data is optimized for storage through a hierarchical-to-relational data mapping to make it perform the best at:
Runtime, as hierarchical data.
Storage time, in a relational database storage.
You can run all of your instances against a single database instance and just use different table spaces or prefixes for tables used by different workspaces.
However, to improve disaster tolerance, we recommend running fully redundant configurations, where each Magnolia instance is backed by an independent database instance.
All of the above is configurable. Hybrid setups can also be created and used. In a hybrid setup, some Magnolia instances are sharing a single database while others are backed by their own database instances.