These are my notes from trying out Manticore Search with a view to adopting it.
Manticore Search is an open-source search engine designed specifically for search, including full-text search, with a focus on low latency and high throughput. It was born in 2017 as a continuation of the famous Sphinx Search engine.
Things I like:
Other notes
Setting data_dir in the config switches Manticore to “RT mode”. Replication is available only in this mode.
Extract the zip, then edit the conf. file:
manticore.conf.in
# holy crap
common {
    plugin_dir = /usr/local/manticore/lib
}

searchd {
    listen = 127.0.0.1:9312
    listen = 127.0.0.1:9306:mysql
    # listen = 127.0.0.1:9308:http # http(s) port can be the same of binary protocol port (9312)
    log = E:/manticore36/log/searchd.log
    query_log = E:/manticore36/log/query.log
    pid_file = E:/manticore36/log/searchd.pid

    # PLAIN MODE is enabled by omitting "data_dir" (permits ALL index types, RT and Plain)
    # RT-MODE is required only if you need to enable REPLICATION
    # data_dir = E:/manticore36/data-rtmode # warning, data_dir *enables RT MODE* and does NOT allow index definitions at this config. (plain indexes)
    query_log_format = sphinxql
}
Let's install as a service:
E:\Manticore\bin\searchd --install --config E:\Manticore\manticore.conf.in --servicename Manticore
Manticore can be started and stopped from the Services Control Panel or manually from the command line:
sc.exe start Manticore
sc.exe stop Manticore
If you don't install Manticore as a Windows service, you can start it from the command line:
.\bin\searchd -c manticore.conf.in
To ensure a fast connection, use 127.0.0.1 and not localhost, which can be slow to resolve:
mysql -P9306 -h127.0.0.1
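To check whether name resolution really is the slow part on a given machine, a small sketch like the following can time both lookups. It only measures address resolution with Python's standard library, not an actual Manticore connection, and the helper name resolve_time is my own invention.

```python
import socket
import time


def resolve_time(host: str, port: int = 9306) -> float:
    """Return the seconds spent resolving host to socket addresses."""
    started = time.perf_counter()
    socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return time.perf_counter() - started


if __name__ == "__main__":
    # Port 9306 is the MySQL-protocol listener from the config in these notes.
    for host in ("127.0.0.1", "localhost"):
        try:
            print(f"{host}: {resolve_time(host) * 1000:.2f} ms")
        except socket.gaierror as exc:
            print(f"{host}: resolution failed ({exc})")
```

If the two timings are close, the slowdown (if any) probably comes from somewhere other than name resolution.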
Manticore configuration supports shebang syntax, meaning that the configuration can be written in a programming language and interpreted at loading, allowing dynamic settings.
For example, indexes can be generated by querying a database table, various settings can be modified depending on external factors, or external files can be included (which contain indexes and/or sources).
The configuration file is parsed by the declared interpreter and the output is used as the actual configuration. This happens each time the configuration is read (not only at searchd startup).
This facility is not available on the Windows platform.
In the following example, we use PHP to create multiple indexes with different names, and we also scan a specific folder for files containing extra index declarations.
manticore.conf.in
#!/usr/bin/php
...
<?php for ($i=1; $i<=6; $i++) { ?>
index test_<?=$i?> {
    type = rt
    path = /var/lib/manticore/data/test_<?=$i?>
    rt_field = subject
    ...
}
<?php } ?>
...
<?php
$confd_folder='/etc/manticore.conf.d/';
$files = scandir($confd_folder);
foreach($files as $file) {
    if(($file == '.') || ($file =='..')) {
    } else {
        $fp = new SplFileInfo($confd_folder.$file);
        if('conf' == $fp->getExtension()){
            include ($confd_folder.$file);
        }
    }
}
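Since the interpreter is whatever the shebang line names, the same idea should work with other languages too. Here is a minimal sketch of the index loop in Python, assuming python3 is available on the server; the function names render_index and render_config are mine, and the settings elided with "..." in the PHP version are simply left out, so the output is not a complete configuration.

```python
#!/usr/bin/env python3
# Sketch of a script-generated config: the text printed to stdout
# is what gets used as the actual configuration.


def render_index(i: int) -> str:
    """Return one RT index block; names and paths mirror the PHP example."""
    return (
        f"index test_{i} {{\n"
        f"    type = rt\n"
        f"    path = /var/lib/manticore/data/test_{i}\n"
        f"    rt_field = subject\n"
        f"}}\n"
    )


def render_config(count: int = 6) -> str:
    """Concatenate the blocks for test_1 .. test_<count>."""
    return "\n".join(render_index(i) for i in range(1, count + 1))


if __name__ == "__main__":
    print(render_config(), end="")
```

Running the script by hand is a quick way to inspect the generated text before pointing searchd at it.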
The configuration file supports comments, with the # character starting a comment. The comment character can appear at the start of a line or inline.
Take extra care when using # in character tokenization settings, because everything after it is ignored. To avoid this, write the character as its Unicode code point, U+23.
# can also be escaped using \. Escaping is required if # is present in the database credentials of a source declaration.
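To make the rule concrete, here is a toy model of it in Python. This is not Manticore's actual parser, just my reading of the behaviour described above: an unescaped # cuts the line, and a backslash followed by # yields a literal #. The function name strip_comment and the sample password are made up.

```python
def strip_comment(line: str) -> str:
    """Drop everything from the first unescaped '#'; unescape backslash-hash."""
    out = []
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line) and line[i + 1] == "#":
            out.append("#")  # escaped: keep a literal '#'
            i += 2
            continue
        if ch == "#":
            break  # unescaped: the rest of the line is a comment
        out.append(ch)
        i += 1
    return "".join(out).rstrip()


if __name__ == "__main__":
    print(strip_comment("sql_pass = my#secret"))    # sql_pass = my
    print(strip_comment(r"sql_pass = my\#secret"))  # sql_pass = my#secret
```

The first call shows how an unescaped # in a password silently truncates the value, which is why the escape matters for credentials.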
Nice usage of sql_query_pre, sql_query_range, sql_range_step, and sql_query_post_index. As seen here.
A table to keep some indexing information
CREATE TABLE `product_search_status` (
    `id` varchar(30) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL,
    `value` bigint(20) UNSIGNED NOT NULL,
    PRIMARY KEY (`id`) USING BTREE
) ENGINE = InnoDB;
Source configuration in Manticore
# we set unicode charset and wait_timeout to a high value to prevent connection timeout errors
sql_query_pre = SET NAMES utf8
sql_query_pre = SET SESSION wait_timeout=3600

# we store the index time for information
sql_query_pre = REPLACE INTO product_search_status (id, value) VALUES ('last_indexed_time', UNIX_TIMESTAMP())

# we set start-end document ids so that manticore will know where to start and stop indexing
sql_query_range = SELECT MIN(id), MAX(id) FROM product
sql_range_step = 10000

# this is the main query to create documents
sql_query = SELECT \
    id, \
    name AS name_ft, \
    categories AS categories_ft, \
    name \
    FROM product \
    WHERE id >= $start AND id <= $end

# we store the most recent document id for information
sql_query_post_index = REPLACE INTO product_search_status (id, value) VALUES ('last_indexed_id', $maxid)
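To picture what the range settings do, here is a small Python sketch of how I understand the stepping: the MIN/MAX pair from sql_query_range is cut into chunks of sql_range_step ids, and the main query runs once per chunk with $start and $end substituted. The inclusive boundaries are my assumption, modelled on the `id >= $start AND id <= $end` condition above; id_ranges is a made-up helper, not part of Manticore.

```python
from typing import Iterator, Tuple


def id_ranges(min_id: int, max_id: int, step: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) pairs covering min_id..max_id."""
    if step <= 0:
        raise ValueError("step must be positive")
    start = min_id
    while start <= max_id:
        end = min(start + step - 1, max_id)
        yield start, end
        start = end + 1


if __name__ == "__main__":
    # With ids 1..25000 and sql_range_step = 10000, the query runs three times.
    for start, end in id_ranges(1, 25000, 10000):
        print(f"WHERE id >= {start} AND id <= {end}")
```

Fetching in steps keeps each query's result set bounded, which is presumably why it pairs well with the raised wait_timeout on large tables.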