Huge Pages and Caches

Moderator: Reach-Native-gp

Post Reply
rahulburman
Posts: 16
Joined: Thu Oct 01, 2020 4:13 pm

Huge Pages and Caches

Post by rahulburman »

How are Huge Page and Cache settings defined?

amitgope
Posts: 21
Joined: Thu Oct 01, 2020 4:10 pm

Re: Huge Pages and Caches

Post by amitgope »

Expanded Tree Cache

At Group level for E nodes, the Expanded Tree Cache is used to hold frequently-accessed XML documents. This cache is used as workspace for holding the result set of a particular query. MarkLogic recommends that most customers set the Expanded Tree Cache size to 1/8th of the physical memory on the server.
If it were a D node, the size recommended is 1024 for this setting.

Compressed Tree Cache
The Compressed Tree Cache is used to hold recently-accessed XML content in a compressed form. Its purpose is to minimize random disk reads for frequently-accessed content. MarkLogic recommends that most customers set the Compressed Tree Cache size to 1/16th of the physical memory on the server.
E-node
Size is set as 128.

List Cache
The List Cache is used to hold recently-accessed index termlists. Its purpose is to minimize disk reads for frequently-accessed index terms, which are used for almost every MarkLogic XQuery. MarkLogic recommends that most customers set the List Cache size to 1/8th of the physical memory on the server.
E-node
Size is set as 128.


Group caches and Linux huge pages
07 April 2020 12:09 PM
SUMMARY

This article discusses the MarkLogic group-level caches and Linux Huge Page configurations.

Group Caches
MarkLogic utilizes caches to increase retrieval performance of frequently-accessed objects. In particular, MarkLogic caches:

1. Expanded trees (Expanded Tree Cache)

On any groups that have app servers configured for your application (E-nodes), the Expanded Tree Cache is used to hold frequently-accessed XML documents. This cache is used as workspace for holding the result set of a particular query. MarkLogic recommends that most customers set the Expanded Tree Cache size to 1/8th of the physical memory on the server.

For groups that only manage forest content and do not have app servers configured (D-nodes), the Expanded Tree Cache is used only during the process of reindexing content. The cache size should be set to 1024 for D-nodes.

2. Compressed trees (Compressed Tree Cache)

On any groups that do not manage forest content (E-nodes), the Compressed Tree Cache is unused, and should be set to 128.

For groups that manage forest content (D-nodes), the Compressed Tree Cache is used to hold recently-accessed XML content in a compressed form. Its purpose is to minimize random disk reads for frequently-accessed content. MarkLogic recommends that most customers set the Compressed Tree Cache size to 1/16th of the physical memory on the server.

3. Lists (List Cache)

On any groups that do not manage forest content (E-nodes), the List Cache is unused, and should be set to 128.

For groups that manage forest content (D-nodes), the List Cache is used to hold recently-accessed index termlists. Its purpose is to minimize disk reads for frequently-accessed index terms, which are used for almost every MarkLogic XQuery. MarkLogic recommends that most customers set the List Cache size to 1/8th of the physical memory on the server.

Rule of Thirds
By default, MarkLogic Server will allocate roughly one third of physical memory to the aforementioned caches, but the server will try to utilize as much memory as possible. The "Rule of Thirds" provides a conceptual explanation of how MarkLogic uses memory on a server:

One third of physical memory for MarkLogic group-level caches
One third of physical memory for in-memory content (range indexes and in-memory stands)
One third of physical memory for workspace, app server overhead, and Linux filesystem buffer
It is very common for Linux servers running MarkLogic to show high memory utilization. In fact, it is desirable to have MarkLogic utilize much of the memory on the server. However, the server should use very little swap, as that will have a severe negative impact on performance. Adhering to the Rule of Thirds should generally ensure that a server is properly sized, and any cases of memory-related performance degradations should be compared against this rule to identify improper sizing.


Huge Pages

Caches and in-memory stands look for large blocks of contiguous memory space, while range indexes, workspace memory, and the Linux filesystem buffer utilize smaller blocks of memory.
In order to efficiently allocate the large blocks of memory for the group-level caches and in-memory stands, MarkLogic recommends the usage of Linux Huge Pages. Instead of the kernel allocating 4k pages of memory, huge pages are 2048k in size and can be quickly allocated for larger blocks of memory. At a minimum, MarkLogic recommends allocating enough huge pages to cover the group-level caches (roughly one third of physical memory). The upper end of recommended huge pages includes both the caches and in-memory stands.

On Linux systems, MarkLogic recommends setting Linux Huge Pages to 3/8 the size of your physical memory, and should be configured to reserve the space at boot time. For details on setting up Huge Pages, see the following Red Hat Enterprise Linux (RHEL) KB:

Post Reply