Apache Cassandra is a free and open-source, distributed, wide column store, NoSQL database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. Cassandra offers robust support for clusters spanning multiple data centers, with asynchronous masterless replication allowing low latency operations for all clients.
Released as open source by Facebook in 2008 and initially developed by Avinash Lakshman and Prashant Malik, Cassandra was first used to power Facebook’s inbox search feature. In this article, we are going to look at the support in Golang for the Cassandra database.
Why Cassandra?
- Open Source – Cassandra has been an Apache project since 2009. It is completely free and has built up a dedicated user community. It can also be integrated with other Apache open-source projects such as Hadoop (with the help of MapReduce), Apache Pig, and Apache Hive.
- Peer to Peer Architecture – There is no single point of failure in Cassandra, since it uses a P2P architecture instead of a master-slave architecture. Any number of servers/nodes can be added to any Cassandra cluster in any of the data centers.
- Scalability – A Cassandra cluster can easily be scaled up or down. Nodes can be added or removed without much disturbance: you don’t have to restart the cluster or change the application's queries while scaling. This is a large part of why Cassandra is known for sustaining high throughput on large clusters. As nodes are added, read and write throughput both increase, with no downtime or pause for the applications.
- Performance – Cassandra has demonstrated strong performance on large data sets.
- High-level data model – The data model is column-oriented. Cassandra stores columns sorted by column name, which makes slicing a range of columns very quick. Unlike traditional databases, where column names are only metadata, in Cassandra column names can also carry actual data.
- Consistency settings – Consistency in Cassandra is tunable per operation and falls into two broad types: eventual consistency and strong consistency. You can adopt either, based on your requirements. With eventual consistency, the client gets an acknowledgement as soon as the cluster accepts the write, and the replicas converge afterwards. With strong consistency, a write is acknowledged only once enough of the nodes holding that data have applied it that a later read is guaranteed to see it.
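To put numbers on the consistency settings, here is a minimal Go sketch of the usual rule of thumb: a read is guaranteed to overlap the latest acknowledged write when the number of replicas read plus the number of replicas written is greater than the replication factor. The function names quorum and stronglyConsistent are made up for this illustration and are not part of any Cassandra API.

```go
package main

import "fmt"

// quorum returns the number of replicas that form a majority
// for the given replication factor.
func quorum(replicationFactor int) int {
	return replicationFactor/2 + 1
}

// stronglyConsistent reports whether every read must touch at least one
// replica that holds the latest acknowledged write, i.e. whether
// readReplicas + writeReplicas > replicationFactor.
func stronglyConsistent(readReplicas, writeReplicas, replicationFactor int) bool {
	return readReplicas+writeReplicas > replicationFactor
}

func main() {
	rf := 3 // illustrative replication factor
	q := quorum(rf)
	fmt.Println("quorum for RF=3:", q)
	fmt.Println("QUORUM write + QUORUM read is strong:", stronglyConsistent(q, q, rf))
	fmt.Println("ONE write + ONE read is strong:", stronglyConsistent(1, 1, rf))
}
```

With a replication factor of 3, a quorum is 2 replicas, so QUORUM writes combined with QUORUM reads satisfy the rule, while ONE/ONE does not and gives eventual consistency.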
Golang Cassandra Support – CQL
CQL stands for Cassandra Query Language.
Cassandra ships with an interactive shell, cqlsh (the Cassandra Query Language shell), that lets users communicate with the database. From cqlsh you can run CQL statements to define a schema, insert data, and execute queries.
CQL Data Manipulation Commands
- INSERT − Adds columns for a row in a table.
- UPDATE − Updates a column of a row.
- DELETE − Deletes data from a table.
- BATCH − Executes multiple DML statements at once.
CQL Clauses
- SELECT − Reads data from a table.
- WHERE − Used along with SELECT to read only the rows that match a condition.
- ORDER BY − Used along with SELECT to return the matching rows in a specific order.
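To make the commands and clauses above concrete, here is a small sketch that holds one example statement of each kind as Go strings, in the parameterised form (with ? placeholders) that a driver would accept. The table example.tweet_by_timeline is hypothetical and exists only for this illustration; it has a clustering column because CQL only allows ORDER BY on clustering columns.

```go
package main

import "fmt"

// example pairs a CQL keyword with one illustrative statement.
type example struct {
	name string
	cql  string
}

// Assumption: a hypothetical table, used only for this illustration:
//   CREATE TABLE example.tweet_by_timeline (
//       timeline text, id timeuuid, text text,
//       PRIMARY KEY (timeline, id));
// The clustering column id is what makes ORDER BY possible.
var examples = []example{
	{"INSERT", `INSERT INTO example.tweet_by_timeline (timeline, id, text) VALUES (?, ?, ?)`},
	{"UPDATE", `UPDATE example.tweet_by_timeline SET text = ? WHERE timeline = ? AND id = ?`},
	{"DELETE", `DELETE FROM example.tweet_by_timeline WHERE timeline = ? AND id = ?`},
	{"BATCH", `BEGIN BATCH
  INSERT INTO example.tweet_by_timeline (timeline, id, text) VALUES (?, ?, ?);
  DELETE FROM example.tweet_by_timeline WHERE timeline = ? AND id = ?;
APPLY BATCH`},
	{"SELECT", `SELECT id, text FROM example.tweet_by_timeline WHERE timeline = ? ORDER BY id DESC`},
}

func main() {
	// Print each keyword next to its example statement.
	for _, e := range examples {
		fmt.Printf("%-6s %s\n", e.name, e.cql)
	}
}
```

The same statements can be typed into cqlsh with literal values in place of the ? placeholders and a trailing semicolon.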
gocql
gocql is a fast and robust Cassandra client for the Go programming language. It has been tested in production against many different versions of Cassandra.
Installation is simple with go get github.com/gocql/gocql
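The best part is that you don’t need a production cluster to try it out for yourself; a single machine will do, so let’s get to it. The steps below assume a Debian or Ubuntu system. Cassandra runs on the JVM, so we start by installing Java: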
sudo apt update
sudo apt install openjdk-8-jdk
Then we can verify the Java installation by running the following command:
java -version
Next, we install the apt-transport-https package, which lets apt access repositories over HTTPS, and import the Apache Cassandra repository keys:
sudo apt install apt-transport-https
wget -q -O - https://www.apache.org/dist/cassandra/KEYS | sudo apt-key add -
The last command should output OK, which means that the key has been imported successfully and packages from this repository will be considered trusted.
Next, add the Cassandra repository to the system by issuing:
sudo sh -c 'echo "deb http://www.apache.org/dist/cassandra/debian 311x main" > /etc/apt/sources.list.d/cassandra.list'
Once that is done, we are ready to install:
sudo apt update
sudo apt install cassandra
The Cassandra service starts automatically once the installation is complete. You can verify that Cassandra is running by typing:
nodetool status
Now we can open the Cassandra shell prompt using:
cqlsh
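If cqlsh cannot connect, a quick check from Go can tell you whether anything is listening on the CQL port at all. The sketch below only uses the standard library; it assumes Cassandra is running locally on its default native transport port, 9042, and the helper name portOpen is made up for this example.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// portOpen reports whether a TCP connection to addr succeeds within timeout.
func portOpen(addr string, timeout time.Duration) bool {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}

func main() {
	// Assumption: a local node on Cassandra's default CQL port (9042).
	addr := "127.0.0.1:9042"
	if portOpen(addr, 2*time.Second) {
		fmt.Println("CQL port reachable at", addr)
	} else {
		fmt.Println("nothing listening at", addr)
	}
}
```

This only confirms that the port accepts TCP connections; nodetool status remains the better check for the actual state of the node.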
Here with me till now? Good. A little over halfway through.
Now, in the cqlsh shell, create a keyspace, a table, and an index on the table's timeline column:
create keyspace example with replication = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
create table example.tweet(timeline text, id UUID, text text, PRIMARY KEY(id));
create index on example.tweet(timeline);
Next, open up VSCode (or any other editor you like) and let’s start with the imports:
package main

import (
	"fmt"
	"log"

	"github.com/gocql/gocql"
)
Then for the main function, we’ll use this, which is explained below:
func main() {
	// Configure the cluster. Replace these addresses with your own nodes;
	// for the single local node installed above, use "127.0.0.1".
	cluster := gocql.NewCluster("192.168.1.1", "192.168.1.2", "192.168.1.3")
	cluster.Keyspace = "example"
	cluster.Consistency = gocql.Quorum
	session, err := cluster.CreateSession()
	if err != nil {
		log.Fatal(err)
	}
	defer session.Close()

	// Insert a tweet with a time-based UUID as its id.
	if err := session.Query(`INSERT INTO tweet (timeline, id, text) VALUES (?, ?, ?)`,
		"me", gocql.TimeUUID(), "hello world").Exec(); err != nil {
		log.Fatal(err)
	}

	var id gocql.UUID
	var text string

	// Read a single tweet from the "me" timeline at consistency level ONE.
	if err := session.Query(`SELECT id, text FROM tweet WHERE timeline = ? LIMIT 1`,
		"me").Consistency(gocql.One).Scan(&id, &text); err != nil {
		log.Fatal(err)
	}
	fmt.Println("Tweet:", id, text)

	// Iterate over every tweet in the "me" timeline.
	iter := session.Query(`SELECT id, text FROM tweet WHERE timeline = ?`, "me").Iter()
	for iter.Scan(&id, &text) {
		fmt.Println("Tweet:", id, text)
	}
	if err := iter.Close(); err != nil {
		log.Fatal(err)
	}
}
The above function does four things:
- configures the cluster (node addresses, keyspace, and default consistency level) and opens a session to it.
- inserts a tweet with the timeline ‘me’, a time-based UUID as its id, and the text ‘hello world’.
- reads back a single record whose ‘timeline’ column matches the value ‘me’, using consistency level ONE for that query.
- iterates over all the tweets in that timeline and prints each one using fmt.Println.
References
- https://cassandra.apache.org/doc/latest/ – documentation is your best friend
- https://godoc.org/github.com/gocql/gocql
Until next time!