Apache Solr Parallel Indexing


This project is not covered by Drupal’s security advisory policy.

Apache Solr Parallel Indexing lets you set the number of CPUs to use, so that indexing makes full use of your system.

Improvement

During this test, entity_cache was enabled and pre-warmed. The Solr server was external and the website was really complex. Looks like a realistic test scenario to me!

Without Parallel Indexing

time drush --uri=https://mytestsite.com.dev solr-index --limit=1000
1000 items successfully processed. 972 documents successfully sent to Solr.                                                                                                                                                      [status]
real	1m21.416s
user	0m45.394s
sys	0m2.748s

Node types: complex ones, references and more.
Amount per run: 100
Calculation : 1k nodes in 1m 21s, with 100 nodes per run
Nodes per second : 12

With Parallel Indexing

time drush --uri=https://mytestsite.com.dev solr-index --limit=1000
1000 items successfully processed. 1008 documents successfully sent to Solr.                                                                                                                                                     [status]
real	0m31.858s
user	0m2.948s
sys	0m0.492s

Node types: complex ones, references and more.
Amount per run: 500
Settings: 8 CPUs at once
Calculation : 1k nodes in 31s
Nodes per second : 31
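
The nodes-per-second figures come from dividing the item count by the `real` (wall-clock) time that `time` reports. A minimal sketch of that arithmetic in shell, using awk for the floating-point division. The function name `nodes_per_second` is made up for illustration and is not part of the module or of drush:

```shell
#!/bin/sh
# Hypothetical helper: throughput = items / wall-clock seconds.
# $1 = number of items, $2 = minutes, $3 = seconds (from the "real" line of `time`).
nodes_per_second() {
  awk -v items="$1" -v min="$2" -v sec="$3" \
    'BEGIN { printf "%.1f\n", items / (min * 60 + sec) }'
}

nodes_per_second 1000 1 21.416   # run without parallel indexing
nodes_per_second 1000 0 31.858   # run with parallel indexing
```

Using the `real` times from the two runs above, this gives roughly 12.3 and 31.4 nodes per second, so the parallel run was about 2.6 times faster in wall-clock terms.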

Requires

Requires the httprl module, which is used to spawn the extra processes.

How to use?

  1. Enable the module.
  2. Go to admin/config/development/httprl and set it to -1.
  3. Go to admin/settings/apachesolr/settings.
  4. Click Advanced Settings.
  5. Set the number of nodes you want to index per run (I would set it to 200 to test).
  6. Set the number of CPUs you have (I would set it to 2 to test).
  7. Index, either:
    • using the batch button in the UI, or
    • using drush: drush --uri="https://mydrupal.dev" solr-index

    Note: the --uri option is very important here, because the module needs to know where to send its requests.
    Gradually raise the limits to find the maximum your system can handle.

    Make sure your PHP timeouts and memory limits can keep up with the indexing process.
    For testing, I recommend:

    ini_set('memory_limit', -1);
    ini_set('max_execution_time', 300);
    


Weekly installs: 21
Maintenance status: Actively maintained
Development status: Under active development

Development release: 7.x-1.x-dev, tar.gz (10.72 KB) | zip (11.44 KB), released 2013-10-18