Settings¶
This page covers the various settings contained within the Redis Monitor. The sections are broken down by functional component.
Core¶
SLEEP_TIME
Default: 0.1
The number of seconds the main process will sleep between checking for new actions to take care of.
RETRY_FAILURES
Default: True
Retry an action if there was an unexpected failure while computing the result.
RETRY_FAILURES_MAX
Default: 3
The number of times to retry a failed action before giving up. Only applied when RETRY_FAILURES is enabled.
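How the two retry settings interact can be sketched as follows. This is a minimal illustration, not the Redis Monitor's actual implementation; `run_with_retries` and `flaky` are hypothetical names introduced only for this example.

```python
RETRY_FAILURES = True
RETRY_FAILURES_MAX = 3


def run_with_retries(action, retry=RETRY_FAILURES, max_retries=RETRY_FAILURES_MAX):
    """Call action(); on an unexpected exception, retry up to max_retries times.

    Returns (result, attempts). Re-raises the last exception when retrying
    is disabled or the retries are exhausted.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return action(), attempts
        except Exception:
            retries_used = attempts - 1
            if not retry or retries_used >= max_retries:
                raise


# Hypothetical action that fails twice, then succeeds on the third call.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "done"


result, attempts = run_with_retries(flaky)
```

With RETRY_FAILURES set to False, the first failure is raised immediately and RETRY_FAILURES_MAX has no effect.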
HEARTBEAT_TIMEOUT
Default: 120
The time, in seconds, that this Redis Monitor instance's statistics key lives in Redis. The key lets the instance identify itself to the rest of the cluster and is used for retrieving stats about the number of Redis Monitor instances currently running.
Note
For actions that take longer than the timeout, the key will expire and your stats may not be accurate until the main thread can send a heartbeat again.
Redis¶
REDIS_HOST
Default: 'localhost'
The Redis host.
REDIS_PORT
Default: 6379
The port to use when connecting to the REDIS_HOST.
REDIS_DB
Default: 0
The Redis database to use when connecting to the REDIS_HOST.
REDIS_PASSWORD
Default: None
The password to use when connecting to the REDIS_HOST.
REDIS_LOCK_EXPIRATION
Default: 6
The number of seconds a vacant worker lock will stay within Redis before becoming available to a new worker.
REDIS_SOCKET_TIMEOUT
Default: 10
The number of seconds to wait while establishing a TCP connection, or to wait for a response from an existing TCP connection before timing out.
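As a rough sketch, the connection settings above could be mapped onto the keyword arguments of a redis-py client. The helper name `redis_connection_kwargs` is hypothetical, and applying REDIS_SOCKET_TIMEOUT to both the connect and response timeouts is an assumption based on the description above, not a statement about how the Redis Monitor itself builds its connection.

```python
settings = {
    "REDIS_HOST": "localhost",
    "REDIS_PORT": 6379,
    "REDIS_DB": 0,
    "REDIS_PASSWORD": None,
    "REDIS_SOCKET_TIMEOUT": 10,
}


def redis_connection_kwargs(settings):
    """Translate the REDIS_* settings into redis-py keyword arguments."""
    timeout = settings["REDIS_SOCKET_TIMEOUT"]
    return {
        "host": settings["REDIS_HOST"],
        "port": settings["REDIS_PORT"],
        "db": settings["REDIS_DB"],
        "password": settings["REDIS_PASSWORD"],
        # Assumption: one timeout covers both connecting and waiting for replies.
        "socket_connect_timeout": timeout,
        "socket_timeout": timeout,
    }


kwargs = redis_connection_kwargs(settings)
# With redis-py installed, a client could then be created with:
#   import redis
#   conn = redis.Redis(**kwargs)
```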
Kafka¶
KAFKA_HOSTS
Default: 'localhost:9092'
The Kafka host. Multiple hosts may be given as a single comma-separated string, such as 'h1:9092,h2:9092'.
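To make the comma-separated format concrete, the sketch below splits such a string into individual brokers. `parse_kafka_hosts` is a hypothetical helper for illustration only; Kafka client libraries typically accept the raw string directly. It assumes every entry includes a port.

```python
def parse_kafka_hosts(hosts):
    """Split a comma-separated 'host:port' string into (host, port) tuples."""
    parsed = []
    for entry in hosts.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate stray commas
        host, _, port = entry.rpartition(":")
        parsed.append((host, int(port)))
    return parsed


brokers = parse_kafka_hosts("h1:9092,h2:9092")
```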
KAFKA_TOPIC_PREFIX
Default: 'demo'
The Kafka Topic prefix to use when generating the outbound Kafka topics.
KAFKA_CONN_TIMEOUT
Default: 5
How long to wait (in seconds) before timing out when trying to connect to the Kafka cluster.
KAFKA_APPID_TOPICS
Default: False
Flag to send data to both the firehose and Application ID specific Kafka topics. If set to True, results will be sent to both the demo.outbound_firehose and demo.outbound_<appid> Kafka topics, where <appid> is the Application ID used to submit the request. This is useful if you have many applications utilizing your cluster but would only like to listen to results for your specific application.
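The topic naming described for KAFKA_TOPIC_PREFIX and KAFKA_APPID_TOPICS can be illustrated with a short sketch. The function `outbound_topics` and the Application ID `testapp` are hypothetical; only the topic name patterns come from the description above.

```python
def outbound_topics(prefix, appid, appid_topics=False):
    """Return the outbound Kafka topics a result would be sent to."""
    topics = ["{}.outbound_firehose".format(prefix)]
    if appid_topics:
        # Application ID specific topic, only used when the flag is enabled.
        topics.append("{}.outbound_{}".format(prefix, appid))
    return topics


firehose_only = outbound_topics("demo", "testapp")
both = outbound_topics("demo", "testapp", appid_topics=True)
```

Here `firehose_only` holds just the firehose topic, while `both` also includes the topic for `testapp`.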
KAFKA_PRODUCER_BATCH_LINGER_MS
Default: 25
The time, in milliseconds, to wait while batching multiple requests into a single one sent to the Kafka cluster.
KAFKA_PRODUCER_BUFFER_BYTES
Default: 4 * 1024 * 1024
The size, in bytes, of the TCP send buffer when transmitting data to Kafka.
Zookeeper¶
ZOOKEEPER_ASSIGN_PATH
Default: /scrapy-cluster/crawler/
The location to store Scrapy Cluster domain specific configuration within Zookeeper. Should be the same as the crawler settings.
ZOOKEEPER_ID
Default: all
The file identifier to read crawler specific configuration from. This file is located within the ZOOKEEPER_ASSIGN_PATH folder above. Should be the same as the crawler settings.
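Taken together, the two settings above identify a single location within Zookeeper. A minimal sketch of how the full path could be composed, assuming the identifier is simply appended to the assign path:

```python
import posixpath

ZOOKEEPER_ASSIGN_PATH = "/scrapy-cluster/crawler/"
ZOOKEEPER_ID = "all"

# Zookeeper paths use forward slashes, so posixpath is used on every platform.
config_path = posixpath.join(ZOOKEEPER_ASSIGN_PATH, ZOOKEEPER_ID)
```

With the defaults this gives `/scrapy-cluster/crawler/all`.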
ZOOKEEPER_HOSTS
Default: localhost:2181
The Zookeeper host to connect to. Should be the same as the crawler settings.
Plugins¶
PLUGIN_DIR
Default: 'plugins/'
The folder containing all of the Redis Monitor plugins.
PLUGINS
Default:
{
    'plugins.info_monitor.InfoMonitor': 100,
    'plugins.stop_monitor.StopMonitor': 200,
    'plugins.expire_monitor.ExpireMonitor': 300,
    'plugins.stats_monitor.StatsMonitor': 400,
    'plugins.zookeeper_monitor.ZookeeperMonitor': 500,
}
The default plugins loaded for the Redis Monitor. The syntax for this dictionary of settings is '<folder>.<file>.<class_name>': <rank>, where lower ranked plugin APIs are validated first.
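The ranking and naming convention of the plugin dictionary can be sketched as below. The helpers `plugins_by_rank` and `split_plugin_path` are hypothetical and only illustrate how the keys and ranks are read; they are not the Redis Monitor's loader.

```python
PLUGINS = {
    'plugins.info_monitor.InfoMonitor': 100,
    'plugins.stop_monitor.StopMonitor': 200,
    'plugins.expire_monitor.ExpireMonitor': 300,
    'plugins.stats_monitor.StatsMonitor': 400,
    'plugins.zookeeper_monitor.ZookeeperMonitor': 500,
}


def plugins_by_rank(plugins):
    """Return plugin paths ordered from lowest rank to highest."""
    return [path for path, rank in sorted(plugins.items(), key=lambda item: item[1])]


def split_plugin_path(path):
    """Split '<folder>.<file>.<class_name>' into (module path, class name)."""
    module_path, _, class_name = path.rpartition(".")
    return module_path, class_name


ordered = plugins_by_rank(PLUGINS)
first_module, first_class = split_plugin_path(ordered[0])
```

With the defaults, the first plugin in `ordered` is the InfoMonitor, since it has the lowest rank.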
Logging¶
LOGGER_NAME
Default: 'redis-monitor'
The logger name.
LOG_DIR
Default: 'logs'
The directory to write logs into. Only applicable when LOG_STDOUT is set to False.
LOG_FILE
Default: 'redis_monitor.log'
The file to write the logs into. When this file rolls it will have .1 or .2 appended to the file name. Only applicable when LOG_STDOUT is set to False.
LOG_MAX_BYTES
Default: 10 * 1024 * 1024
The maximum number of bytes to keep in the file based log before it is rolled.
LOG_BACKUPS
Default: 5
The number of rolled file logs to keep before data is discarded. A setting of 5 here means that there will be one main log and five rolled logs on the system, generating six log files total.
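The file count described above can be checked with a small sketch that lists the names a rolling log would produce. `log_file_names` is a hypothetical helper; it only follows the `.1`, `.2` suffix convention described for LOG_FILE and does not touch the filesystem.

```python
import os


def log_file_names(log_dir, log_file, backups):
    """List the main log file followed by its rolled backups."""
    main = os.path.join(log_dir, log_file)
    rolled = ["{}.{}".format(main, i) for i in range(1, backups + 1)]
    return [main] + rolled


files = log_file_names("logs", "redis_monitor.log", 5)
```

With the defaults, `files` contains six entries: the main log plus backups ending in `.1` through `.5`. Combined with LOG_MAX_BYTES, this bounds the disk space used by logs to roughly six times the maximum file size.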
LOG_STDOUT
Default: True
Log to standard out. If set to False, logs are written to the file given by LOG_DIR/LOG_FILE.
LOG_JSON
Default: False
Log messages will be written in JSON instead of standard text messages.
LOG_LEVEL
Default: 'INFO'
The log level assigned to the logger. Messages at this level and higher are written.
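The "this level and higher" rule can be illustrated with the level ordering from Python's standard logging module. This assumes the level names used here match the standard library's names; `is_written` is a hypothetical helper, not part of the Log Factory.

```python
import logging


def is_written(message_level, configured_level="INFO"):
    """Return True if a message at message_level passes the configured level."""
    # Standard library levels are integers: DEBUG < INFO < WARNING < ERROR < CRITICAL.
    return getattr(logging, message_level) >= getattr(logging, configured_level)


debug_written = is_written("DEBUG")      # below INFO, so filtered out
warning_written = is_written("WARNING")  # above INFO, so written
```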
Note
More information about logging can be found in the utilities Log Factory documentation.
Stats¶
STATS_TOTAL
Default: True
Calculate total receive and fail stats for the Redis Monitor.
STATS_PLUGINS
Default: True
Calculate total receive and fail stats for each individual plugin within the Redis Monitor.
STATS_CYCLE
Default: 5
How often to check for expired keys and to roll the time window when doing stats collection.
STATS_DUMP
Default: 60
Dump stats to the logger every X seconds. If set to 0, statistics will not be dumped.
STATS_DUMP_CRAWL
Default: True
Dump statistics collected by the Scrapy Cluster Crawlers. The crawlers may be spread out across many machines, and the log dump of their statistics is consolidated and done in a single place where the Redis Monitor is installed. Will be dumped at the same interval the STATS_DUMP is set to.
STATS_DUMP_QUEUE
Default: True
Dump queue metrics about the real time backlog of the Scrapy Cluster Crawlers. This includes queue length and the total number of domains currently in the backlog. Will be dumped at the same interval the STATS_DUMP is set to.
STATS_TIMES
Default:
[
    'SECONDS_15_MINUTE',
    'SECONDS_1_HOUR',
    'SECONDS_6_HOUR',
    'SECONDS_12_HOUR',
    'SECONDS_1_DAY',
    'SECONDS_1_WEEK',
]
Rolling time window settings for statistics collection. The above settings indicate that stats will be collected for the past 15 minutes, the past hour, the past 6 hours, etc.
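For reference, the window lengths implied by these names work out as follows. The `WINDOW_SECONDS` dictionary is a hypothetical lookup built here from the names alone; the actual constants are defined by the stats collection utility and are not reproduced from it.

```python
# Assumption: each name encodes its own window length, e.g. 15 minutes.
WINDOW_SECONDS = {
    'SECONDS_15_MINUTE': 15 * 60,
    'SECONDS_1_HOUR': 60 * 60,
    'SECONDS_6_HOUR': 6 * 60 * 60,
    'SECONDS_12_HOUR': 12 * 60 * 60,
    'SECONDS_1_DAY': 24 * 60 * 60,
    'SECONDS_1_WEEK': 7 * 24 * 60 * 60,
}

STATS_TIMES = [
    'SECONDS_15_MINUTE',
    'SECONDS_1_HOUR',
    'SECONDS_6_HOUR',
    'SECONDS_12_HOUR',
    'SECONDS_1_DAY',
    'SECONDS_1_WEEK',
]

# Window length in seconds for every configured rolling window.
windows = [WINDOW_SECONDS[name] for name in STATS_TIMES]
```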
Note
For more information about stats collection, please see the Stats Collector documentation.