Cassandra container
===================

This repository contains Dockerfiles for Cassandra images for general usage and OpenShift.
Currently, only a CentOS-based image is available. It is published on
[Docker Hub](https://hub.docker.com/r/centos/cassandra-3-centos7/) as `centos/cassandra-3-centos7`.

Description
-----------

This container image provides a containerized packaging of the Cassandra daemon
and client application. The Cassandra server daemon accepts connections from clients
and provides access to content from Cassandra databases on behalf of the clients.
You can find more information on the Cassandra project on the project website
(https://cassandra.apache.org/).

Usage
-----

The example below assumes that you are using the `centos/cassandra-3-centos7` image.
To set only the mandatory environment variables and store the database
in the `/home/user/database` directory on the host filesystem, execute the following command:

```
$ docker run -d -e CASSANDRA_ADMIN_PASSWORD=<password> -v /home/user/database:/var/opt/rh/sclo-cassandra3/lib/cassandra:Z centos/cassandra-3-centos7
```

Environment variables and Volumes
---------------------------------

The image recognizes the following environment variables, which you can set during
initialization by passing `-e VAR=VALUE` to the `docker run` command.

|    Variable name          |    Description                |
| :------------------------ | ---------------------------   |
|  CASSANDRA_ADMIN_PASSWORD | Password for the admin user   |

The following environment variables influence the Cassandra configuration file. They are all optional.

|    Variable name                            |    Description                                                       |    Default     |
| :------------------------------------------ | -------------------------------------------------------------------- | -------------- |
|  CASSANDRA_CLUSTER_NAME                     | The name of the cluster.                                             | 'Test Cluster' |
|  CASSANDRA_DISK_OPTIMIZATION_STRATEGY       | The strategy for optimizing disk reads.                              | ssd            |
|  CASSANDRA_ENDPOINT_SNITCH                  | Cassandra uses the snitch to locate nodes and route requests.        | SimpleSnitch   |
|  CASSANDRA_NUM_TOKENS                       | Defines the number of tokens randomly assigned to this node.         | 256            |
|  CASSANDRA_RPC_ADDRESS                      | The listen address for client connections.                           | ' '            |
|  CASSANDRA_KEY_CACHE_SIZE_IN_MB             | Maximum size of the key cache in memory.                             | ' '            |
|  CASSANDRA_CONCURRENT_READS                 | Number of concurrent read operations. A value of 16 × number_of_drives allows operations to queue low enough in the stack so that the OS and drives can reorder them. | 32 |
|  CASSANDRA_CONCURRENT_WRITES                | Number of concurrent write operations. Writes in Cassandra are rarely I/O bound, so the ideal number depends on the number of CPU cores on the node. The recommended value is 8 × number_of_cpu_cores. | 32 |
|  CASSANDRA_MEMTABLE_ALLOCATION_TYPE         | The method Cassandra uses to allocate and manage memtable memory.    | 'heap_buffers' |
|  CASSANDRA_MEMTABLE_CLEANUP_THRESHOLD       | Ratio used for automatic memtable flush.                             | 0.5            |
|  CASSANDRA_MEMTABLE_FLUSH_WRITERS           | The number of memtable flush writer threads.                         | 1              |
|  CASSANDRA_CONCURRENT_COMPACTORS            | Number of compaction processes allowed to run simultaneously on a node. | ' '         |
|  CASSANDRA_COMPACTION_THROUGHPUT_MB_PER_SEC | Throttles compaction to the specified MB per second across the instance. | 16         |
|  CASSANDRA_COUNTER_CACHE_SIZE_IN_MB         | Maximum size of the counter cache in memory.                         | ' '            |
|  CASSANDRA_INTERNODE_COMPRESSION            | Controls whether traffic between nodes is compressed.                | all            |
|  CASSANDRA_GC_WARN_THRESHOLD_IN_MS          | Any GC pause longer than this interval is logged at the WARN level.  | 1000           |
|  CASSANDRA_AUTO_BOOTSTRAP                   | Causes new (non-seed) nodes to migrate the right data to themselves automatically. | true |

More details about each variable can be found at: http://docs.datastax.com/en/cassandra/3.0/cassandra/configuration/configCassandra_yaml.html
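
As a rough sketch of how the optional variables might be combined, the script below derives `CASSANDRA_CONCURRENT_WRITES` from the recommendation in the table (8 × number_of_cpu_cores) and prints a `docker run` command for review. The cluster name `demo-cluster` is a hypothetical example value. The script assumes `nproc` from GNU coreutils is available on the host.

```shell
#!/bin/sh
# The table above recommends 8 x number_of_cpu_cores for concurrent writes.
# nproc (GNU coreutils) prints the number of available processing units.
cores=$(nproc)
concurrent_writes=$((8 * cores))

# Hypothetical example value; pick a name that suits your deployment.
cluster_name="demo-cluster"

# Print the resulting command for review rather than running it directly.
echo "docker run -d" \
  "-e CASSANDRA_ADMIN_PASSWORD=<password>" \
  "-e CASSANDRA_CLUSTER_NAME=${cluster_name}" \
  "-e CASSANDRA_CONCURRENT_WRITES=${concurrent_writes}" \
  "centos/cassandra-3-centos7"
```

If the container is restricted to fewer cores than the host, the value should probably reflect the cores available to the container instead.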

You can also set the following mount points by passing the `-v /host:/container` flag to Docker.

|  Volume mount point                        | Description              |
| :----------------------------------------- | ------------------------ |
|  /var/opt/rh/sclo-cassandra3/lib/cassandra | Cassandra data directory |

**Notice: When mounting a directory from the host into the container, ensure that the mounted
directory has the appropriate permissions and that the owner and group of the directory
match the UID or name of the user running inside the container.**


Ports
-----

By default, Cassandra uses port 7000 for cluster communication (7001 if SSL is enabled), 9042 for native protocol clients,
and 7199 for JMX. The internode communication and native protocol ports are configurable in the Cassandra configuration
file (`cassandra.yaml`). The JMX port is configurable in `cassandra-env.sh` (through JVM options). All ports are TCP.
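
To reach these ports from outside the container, they can be published with Docker's `-p host:container` flag. The sketch below assumes the default port numbers listed above. `build_publish_flags` is a hypothetical helper, not part of the image, that maps each container port to the same port on the host and prints the resulting command for review.

```shell
#!/bin/sh
# Default ports named in this section:
# 9042 (native protocol), 7000 (internode), 7199 (JMX).

# Hypothetical helper: emits one "-p PORT:PORT" flag per argument.
build_publish_flags() {
  flags=""
  for port in "$@"; do
    # Add a separating space only when flags is already non-empty.
    flags="${flags:+$flags }-p ${port}:${port}"
  done
  printf '%s\n' "$flags"
}

publish_flags=$(build_publish_flags 9042 7000 7199)

# Print the resulting command for review rather than running it directly.
echo "docker run -d $publish_flags -e CASSANDRA_ADMIN_PASSWORD=<password> centos/cassandra-3-centos7"
```

Publishing only the ports you need (for many clients, 9042 alone) keeps the JMX and internode ports off the host network.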


Documentation
-------------

See http://cassandra.apache.org/doc/latest/


Requirements
------------

* Memory: For production, 32 GB to 512 GB; the minimum is 8 GB for Cassandra nodes. For development in
non-load-testing environments, no less than 4 GB.
* CPU: For production, 16-core CPU processors are the current price-performance sweet spot. For development in
non-load-testing environments, 2-core CPU processors are sufficient.
* Disk space: SSDs are recommended for Cassandra nodes. The size depends on the compaction strategy used. With SSDs,
you can use a maximum of 3 to 5 TB per node of disk space for uncompressed data.
* Network: Recommended bandwidth is 1000 Mb/s (gigabit) or greater.

More on hardware requirements at https://docs.datastax.com/en/landing_page/doc/landing_page/planning/planningHardware.html


Custom configuration file
-------------------------

You can use custom configuration files for the Cassandra server.

To use a custom configuration file in the container, mount it at `/etc/opt/rh/sclo-cassandra3/cassandra/cassandra.yaml`.
For example, to use a configuration file stored in the `/home/user` directory, pass this option to the `docker run` command:
`-v /home/user/cassandra.yaml:/etc/opt/rh/sclo-cassandra3/cassandra/cassandra.yaml:Z`.

To configure multiple JVM options, mount a `jvm.options` file into the container. For example, to use a
`jvm.options` file stored in the `/home/user` directory, pass this option to the
`docker run` command: `-v /home/user/jvm.options:/etc/opt/rh/sclo-cassandra3/cassandra/jvm.options:Z`.
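
Both mounts can be combined in one command. The sketch below checks that each file exists on the host before adding its `-v` flag, because Docker creates a directory at a missing host path given to `-v`, which would then be mounted in place of the file. `config_mount_flags` is a hypothetical helper and `/home/user` is the example host directory used above.

```shell
#!/bin/sh
# Container-side configuration directory, as given in this section.
conf_dir_in_container=/etc/opt/rh/sclo-cassandra3/cassandra

# Hypothetical helper: for each file name, emit a -v flag if the file exists
# in the host directory; otherwise print a warning to stderr and skip it.
config_mount_flags() {
  host_dir=$1
  shift
  for name in "$@"; do
    if [ -f "$host_dir/$name" ]; then
      printf '%s %s:%s:Z ' -v "$host_dir/$name" "$conf_dir_in_container/$name"
    else
      echo "warning: $host_dir/$name not found, skipping" >&2
    fi
  done
}

mount_flags=$(config_mount_flags /home/user cassandra.yaml jvm.options)

# Print the resulting command for review rather than running it directly.
echo "docker run -d ${mount_flags}-e CASSANDRA_ADMIN_PASSWORD=<password> centos/cassandra-3-centos7"
```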


Troubleshooting
---------------

The Cassandra daemon in the container logs to standard output, so the log is available in the container log. The log
can be examined by running:

    docker logs <container>
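
When the log is long, it may help to narrow it to warnings and errors. The sketch below assumes that log lines carry their level as plain text (for example `WARN`, the level mentioned above for long GC pauses). `filter_warnings` is a hypothetical helper, not something shipped with the image.

```shell
#!/bin/sh
# Hypothetical helper: keep only lines that mention the WARN or ERROR level.
filter_warnings() {
  grep -E 'WARN|ERROR'
}

# Usage (requires a running container; <container> is its name or ID).
# 2>&1 merges the container's stderr stream into the pipe as well:
#   docker logs <container> 2>&1 | filter_warnings
```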


See also
--------

The Dockerfile and other sources for this container image are available at https://github.com/sclorg/cassandra-container.
In that repository, the Dockerfile for CentOS is called `Dockerfile`, and the Dockerfile for RHEL (work in progress) is called `Dockerfile.rhel7`.