Batch mode and load balancing with Elasticsearch
The elasticsearch-http() destination automatically sends multiple log messages in a single HTTP request, increasing the rate of messages that your Elasticsearch deployment can consume. For details on adjusting and fine-tuning the batch mode of the elasticsearch-http() destination, see the following section.
Batch mode
The batch-bytes(), batch-lines(), and batch-timeout() options of the destination determine how many log messages syslog-ng OSE sends in a batch. The batch-lines() option sets the maximum number of messages syslog-ng OSE puts in a batch. The batch can also be limited by time and by size:
- syslog-ng OSE sends a batch every batch-timeout() milliseconds, even if the number of messages in the batch is less than batch-lines(). That way the destination receives every message in a timely manner even if suddenly there are no more messages.
- syslog-ng OSE sends the batch if the total size of the messages in the batch reaches batch-bytes() bytes.
To increase the performance of the destination, increase the number of worker threads for the destination using the workers() option, or adjust the batch-bytes(), batch-lines(), and batch-timeout() options.
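As a rough sketch of the tuning described above, the following destination raises the number of worker threads and the batch limits together. The destination name d_elasticsearch_tuned and every numeric value here are illustrative assumptions, not recommended settings; suitable values depend on your message rate and on what your Elasticsearch deployment can absorb.

```
destination d_elasticsearch_tuned {
    elasticsearch-http(
        url("http://your-server:9200/_bulk")
        index("<index-to-store-messages>")
        workers(4)            # assumed value: four worker threads send batches in parallel
        batch-lines(1000)     # assumed value: at most 1000 messages per batch
        batch-bytes(1024Kb)   # assumed value: send earlier if the batch reaches this size
        batch-timeout(5000)   # assumed value: send a partial batch after 5000 milliseconds
    );
};
```

Larger batches mean fewer HTTP requests but more delay before a message reaches the index, so it is usually safer to change one option at a time and measure the effect.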
Example: elasticsearch-http() batch mode
In the following example, syslog-ng OSE sends a batch when it contains 100 messages, when it reaches 512 kilobytes, or after 10 seconds (10000 milliseconds), whichever comes first.
destination d_elasticsearch-http {
    elasticsearch-http(
        url("http://your-server:9200/_bulk")
        index("<index-to-store-messages>")
        batch-lines(100)
        batch-bytes(512Kb)
        batch-timeout(10000)
    );
};
Load balancing between multiple Elasticsearch indexers
Starting with version 3.19, you can specify multiple URLs, for example,
url("site1" "site2") for high availability and load distribution.
syslog-ng OSE distributes worker threads evenly across the specified URLs
and automatically removes failed targets from rotation. Each message batch is sent
to only one URL. Failed targets are automatically retried after the time-reopen() interval.
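A minimal sketch of the retry behavior described above, assuming time-reopen() is set on the destination itself rather than globally. The destination name, the persist-name() value, and the 30-second interval are assumptions chosen for illustration; the server names are placeholders.

```
destination d_elasticsearch_failover {
    elasticsearch-http(
        url("http://your-elasticsearch-http-server1:9200/_bulk"
            "http://your-elasticsearch-http-server2:9200/_bulk")
        index("<index-to-store-messages>")
        workers(2)            # at least one worker thread per URL
        time-reopen(30)       # assumed value: retry a failed URL after 30 seconds
        persist-name("elasticsearch-http-failover")
    );
};
```

A shorter time-reopen() interval returns a recovered node to rotation sooner, at the cost of more connection attempts while the node is still down.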
Starting with syslog-ng OSE version 3.22, you can use any of the following formats to specify multiple URLs:
url("server1", "server2", "server3"); # comma-separated strings
url("server1" "server2" "server3"); # space-separated strings
url("server1 server2 server3"); # space-separated within a single string
NOTE: Because batches are processed in parallel across multiple targets, messages can arrive on the servers in a different order than syslog-ng OSE received them. Use load balancing only if your server uses the timestamps from the messages themselves rather than the arrival time.
NOTE: Set workers() to at least the number of URLs to ensure all targets are utilized.
For example, with three URLs, use workers(3) or greater.
CAUTION: Always set persist-name() when using multiple URLs to prevent data loss if the URL list changes.
For detailed information about the load balancing algorithm, failure handling, URL templating, and configuration best practices, see HTTP Load Balancer in Details.
Example: elasticsearch-http() load balancing
The following destination sends log messages to three different indexer nodes. With workers(3), each node is served by its own worker thread. A batch is sent when it contains 100 messages, when it reaches 512 kilobytes, or after 20 seconds (20000 milliseconds), whichever comes first.
destination d_elasticsearch-http {
    elasticsearch-http(
        url("http://your-elasticsearch-http-server1:9200/_bulk"
            "http://your-elasticsearch-http-server2:9200/_bulk"
            "http://your-elasticsearch-http-server3:9200/_bulk")
        index("<index-to-store-messages>")
        batch-lines(100)
        batch-bytes(512Kb)
        batch-timeout(20000)
        persist-name("elasticsearch-http-load-balance")
        workers(3)
    );
};