Introduction to Cassandra Query Language (CQL) – Part 3 (June 27, 2016)

In this part, I will explain important CQL commands. In previous parts (part1 and part2) I wrote about basic CQL concepts and data types. If you have read previous parts, you know what a keyspace or table is and how to define them.

As I said in the first part, there are two categories of CQL commands:

  1. Data Definition commands: Used to define or manipulate data structure and schema of the data
  2. Data Manipulation commands: Used to read or write data.

The first group is explained in this post. I will explain the second group in a future post.

Data Definition commands

List of important DD commands:

  1. CREATE/ALTER/DROP KEYSPACE
  2. CREATE/ALTER/DROP TABLE
  3. USE
  4. DESCRIBE

You are familiar with “CREATE KEYSPACE” command which is used to create a new keyspace. Once a keyspace is created you can run “DROP KEYSPACE” to delete it.

CREATE KEYSPACE test WITH REPLICATION={'class':'SimpleStrategy', 'replication_factor':3};
DROP KEYSPACE test;
DROP KEYSPACE test2;  --this will give error because test2 does not exit
DROP KEYSPACE IF EXISTS test2; --Does nothing as test2 does not exist

The “CREATE KEYSPACE” command has some arguments which can be used to specify replication strategy. Replication strategy determines how and where replicas of the data will be stored when you are adding data to the tables inside the keyspace. In a future post, I will explain the different replication strategies that can be used in Cassandra. You can use “ALTER KEYSPACE” command to change replication strategy of a keyspace.

ALTER KEYSPACE "test" WITH REPLICATION =
  { 'class' : 'NetworkTopologyStrategy', 'datacenter1' : 3 };

Another option when creating/updating a keyspace is “DURABLE WRITES“. Normally Cassandra caches data in memory and flushed them to persistent storage periodically. If something wrong happens between two consecutive flush operations (e.g. machine shutdown), you will loose whatever data stored in the memory but not flushed to disk. If you enable “DURABLE WRITES” for a keyspace, Cassandra will use a special storage called “commit log” which will contain a copy of whatever there is in memory but not flushed to the disk. In the case of any data loss for memory-saved data, Cassandra will use the “commit log” to re-create the lost data and write them to the disk. Of course, this has it’s own load on the database. So you have to make a compromise. If data is not very important, you can disable this option but if not, you will have to enable it.

Here is an example of how to create a keyspace with durable writes disabled:

CREATE KEYSPACE test2
  WITH REPLICATION = 
  { 'class' : 'NetworkTopologyStrategy', 'datacenter1' : 3 } 
  AND DURABLE_WRITES = false;

These commands are used to create, edit or delete a table. You have seen “CREATE TABLE” command before:

-- create an empty table
CREATE TABLE test1 (
  field_name1 field_type1,
  field_name2 field_type2, 
...
  PRIMARY KEY (field_name1, field_name2, ...)
);

DROP TABLE test1;  -- delete the table and all of it's rows

To change a table you can use “ALTER TABLE” command. This command can be used to change the type of an existing column, add a new column or delete an existing column.

ALTER TABLE test1 ALTER field_name1 TYPE integer;

ALTER TABLE test1 ADD field_name3 integer;

ALTER TABLE test1 DROP field_name2;

You can also use “ALTER TABLE” command to edit table’s metadata. These metadata tell Cassandra how it should handle internal storage mechanisms for this table (e.g. caching parameters, compaction settings, …).

Other Commands

There are two other utility commands: USE and DESCRIBE.

You can use “USE x” to switch currently active keyspace.

You can use “DESCRIBE keyspace1” or “DESCRIBE table1” to see definition of a keyspace or a table. Note that to run the second version, you must have already switched to the keyspace which contains the table.