Reindexing affecting performance

Moderator: Reach-Native-gp

rahulburman
Posts: 16
Joined: Thu Oct 01, 2020 4:13 pm

Reindexing affecting performance

Post by rahulburman »

How does reindexing work and does it impact the performance of the system?

amitgope
Posts: 21
Joined: Thu Oct 01, 2020 4:10 pm

Re: Reindexing affecting performance

Post by amitgope »

To improve the accuracy of unfiltered searches and to speed up facet searches, we add new indexes on elements, and that triggers reindexing. Reindexing also happens in the opposite case: when some indexes are no longer used and we remove them to reclaim disk space.

When a server configuration change triggers reindexing, the ML host takes the first 500 documents that need reindexing and reindexes them, then the next batch of 500 documents, and so on until all are finished. At any given time, the following copies can exist on disk:
1. the original document (original indexes)
2. the updated document (new indexes)
3. the merged document (if the stand is being merged)

This is why MarkLogic recommends 3 times the size of the data as the base disk space requirement.
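
To make the batching and the 3x figure concrete, here is a rough Python sketch. It is only an illustration of the arithmetic described above, not MarkLogic's actual implementation; the function names and the sample document URIs are made up for the example.

```python
from typing import Iterator, List, Sequence

BATCH_SIZE = 500  # batch size quoted in the post above


def batches(doc_ids: Sequence[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` document ids."""
    for start in range(0, len(doc_ids), size):
        yield list(doc_ids[start:start + size])


def recommended_disk_bytes(data_bytes: int, copies: int = 3) -> int:
    """Worst case from the post: original + updated + merged copy on disk."""
    return data_bytes * copies


# Hypothetical document URIs, only for illustration.
doc_ids = [f"/docs/{i}.xml" for i in range(1200)]
batch_sizes = [len(b) for b in batches(doc_ids)]
print(batch_sizes)  # [500, 500, 200]

# 100 GB of data -> 300 GB of recommended disk space.
print(recommended_disk_bytes(100 * 1024**3) // 1024**3)  # 300
```

So 1,200 documents that need reindexing would be picked up in three rounds, and 100 GB of content should be planned with about 300 GB of disk.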

Performance Impact
Reindexing is a resource-intensive operation, as it uses both CPU and disk bandwidth. The CPU will be busy parsing the content and generating index entries, while the disk will be reading fragments for reindexing, writing new stands, and running merges on these newly created stands. You can expect a significant performance impact in environments that are normally heavily utilized. You can decrease the impact of reindexing by using the reindexer throttle setting on the database configuration page. Reducing the value from 5 introduces a delay between the completion of one batch of 500 fragments and the query for the next 500 fragments. In general, it is best to create all indexes at once so that the same document is not updated multiple times.
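
As a rough way to reason about the throttle trade-off, here is a small Python model. Please treat it as a sketch only: the post does not say how long the delay is at each throttle level, so the linear `delay_step_seconds` is purely an assumption, and the function name and the 1 to 5 range check are mine, not part of any MarkLogic API.

```python
import math

BATCH_SIZE = 500   # fragments per batch, as quoted in the post
MAX_THROTTLE = 5   # setting at which the post implies no added delay


def estimated_reindex_seconds(total_fragments: int,
                              seconds_per_batch: float,
                              throttle: int = MAX_THROTTLE,
                              delay_step_seconds: float = 1.0) -> float:
    """Estimate total reindex time under a hypothetical linear delay model.

    ASSUMPTION: each step below 5 adds `delay_step_seconds` of pause
    between batches. The real delay values are not given in the post.
    """
    if not 1 <= throttle <= MAX_THROTTLE:
        raise ValueError("throttle must be between 1 and 5")
    n_batches = math.ceil(total_fragments / BATCH_SIZE)
    pauses = max(n_batches - 1, 0)  # pauses occur only between batches
    pause_seconds = (MAX_THROTTLE - throttle) * delay_step_seconds
    return n_batches * seconds_per_batch + pauses * pause_seconds


# 2,000 fragments = 4 batches at 2 seconds each.
print(estimated_reindex_seconds(2000, 2.0, throttle=5))  # 8.0
print(estimated_reindex_seconds(2000, 2.0, throttle=3))  # 14.0
```

The point of the model is only the shape of the trade-off: a lower throttle stretches the total reindex time but leaves gaps in which the CPU and disk can serve normal queries.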
