FileCodeBox

4.1.1 Enabling Spring Data Cassandra

To get started using Spring Data Cassandra, you’ll need to add the Spring Boot starter dependency for nonreactive Spring Data Cassandra. There are actually two separate Spring Data Cassandra starter dependencies to choose from: one for reactive data persistence and one for standard, nonreactive persistence.

We’ll talk more about writing reactive repositories later in chapter 15. For now, though, we’ll use the nonreactive starter in our build as shown here:

html

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-cassandra</artifactId>
</dependency>

This dependency is also available from the Initializr by checking the Cassandra check box.

It’s important to understand that this dependency is in lieu of the Spring Data JPA starter or Spring Data JDBC dependencies we used in the previous chapter. Instead of persisting Taco Cloud data to a relational database with JPA or JDBC, you’ll be using Spring Data to persist data to a Cassandra database. Therefore, you’ll want to remove the Spring Data JPA or Spring Data JDBC starter dependencies and any relational database dependencies (such as JDBC drivers or the H2 dependency) from the build.

The Spring Data Cassandra starter dependency brings a handful of dependencies to the project, specifically, the Spring Data Cassandra library. As a result of Spring Data Cassandra being in the runtime classpath, autoconfiguration for creating Cassandra repositories is triggered. This means you’re able to begin writing Cassandra repositories with minimal explicit configuration.

Cassandra operates as a cluster of nodes that together act as a complete database system. If you don’t already have a Cassandra cluster to work with, you can start a singlenode cluster for development purposes using Docker like this:

bash

$ docker network create cassandra-net
$ docker run --name my-cassandra \
      --network cassandra-net \
      -p 9042:9042 \
      -d cassandra:latest

This starts the single-node cluster and exposes the node’s port (9042) on the host machine so that your application can access it.

You’ll need to provide a small amount of configuration, though. At the very least, you’ll need to configure the name of a keyspace within which your repositories will operate. To do that, you’ll first need to create such a keyspace.

In Cassandra, a keyspace is a grouping of tables in a Cassandra node. It’s roughly analogous to how tables, views, and constraints are grouped in a relational database.

Although it’s possible to configure Spring Data Cassandra to create the keyspace automatically, it’s typically much easier to manually create it yourself (or to use an existing keyspace). Using the Cassandra CQL (Cassandra Query Language) shell, you can create a keyspace for the Taco Cloud application. You can start the CQL shell using Docker like this:

bash

$ docker run -it --network cassandra-net --rm cassandra cqlsh my-cassandra

If this command fails to start up the CQL shell with an error indicating “Unable to connect to any servers,” wait a minute or two and try again. You need to be sure that the Cassandra cluster is fully started before the CQL shell can connect to it.

When the shell is ready, use the create keyspace command like this:

bash

cqlsh> create keyspace tacocloud
    ... with replication={'class':'SimpleStrategy', 'replication_factor':1}
    ... and durable_writes=true;

Put simply, this will create a keyspace named tacocloud with simple replication and durable writes. By setting the replication factor to 1, you ask Cassandra to keep one copy of each row. The replication strategy determines how replication is handled. The SimpleStrategy replication strategy is fine for single data center use (and for demo code), but you might consider the NetworkTopologyStrategy if you have your Cassandra cluster spread across multiple data centers. I refer you to the Cassandra documentation for more details of how replication strategies work and alternative ways of creating keyspaces.

Now that you’ve created a keyspace, you need to configure the spring.data.cassandra.keyspace-name property to tell Spring Data Cassandra to use that keyspace, as shown next:

yaml

spring:
  data:
    cassandra:
      keyspace-name: taco_cloud
      schema-action: recreate
      local-datacenter: datacenter1

Here, you also set the spring.data.cassandra.schema-action to recreate. This setting is very useful for development purposes because it ensures that any tables and user-defined types will be dropped and recreated every time the application starts. The default value, none, takes no action against the schema and is useful in production settings where you’d rather not drop all tables whenever an application starts up.

Finally, the spring.data.cassandra.local-datacenter property identifies the name of the local data center for purposes of setting Cassandra’s load-balancing policy. In a single-node setup, "datacenter1" is the value to use. For more information on Cassandra load-balancing policies and how to set the local data center, see the DataStax Cassandra driver’s reference documentation http://mng.bz/XrQM.

These are the only properties you’ll need for working with a locally running Cassandra database. In addition to these two properties, however, you may wish to set others, depending on how you’ve configured your Cassandra cluster.

By default, Spring Data Cassandra assumes that Cassandra is running locally and listening on port 9042. If that’s not the case, as in a production setting, you may want to set the spring.data.cassandra.contact-points and spring.data.cassandra.port properties as follows:

yaml

spring:
  data:
    cassandra:
      keyspace-name: tacocloud
      local-datacenter: datacenter1
      contact-points:
      - casshost-1.tacocloud.com
      - casshost-2.tacocloud.com
      - casshost-3.tacocloud.com
      port: 9043

Notice that the spring.data.cassandra.contact-points property is where you identify the hostname(s) of Cassandra. A contact point is the host where a Cassandra node is running. By default, it’s set to localhost, but you can set it to a list of hostnames. It will try each contact point until it’s able to connect to one. This is to ensure that there’s no single point of failure in the Cassandra cluster and that the application will be able to connect with the cluster through one of the given contact points.

You may also need to specify a username and password for your Cassandra cluster. This can be done by setting the spring.data.cassandra.username and spring.data.cassandra.password properties, as shown next:

yaml

spring:
  data:
    cassandra:
      ...
      username: tacocloud
      password: s3cr3tP455w0rd

Now that Spring Data Cassandra is enabled and configured in your project, you’re almost ready to map your domain types to Cassandra tables and write repositories. But first, let’s step back and consider a few basic points of Cassandra data modeling.

4.1.1 Enabling Spring Data Cassandra ​

4.1.1 Enabling Spring Data Cassandra