NoSQL databases are often described as the second generation of database systems, following the relational SQL databases that dominated the field for decades. Rather than storing data in relational tables, NoSQL databases use non-tabular data stores. They come in a wide range of types based on their data models; the major kinds are document stores, key-value stores, graph databases, and wide-column stores. Each offers a flexible schema and can scale out readily to handle large amounts of data and high user loads.
When people refer to “NoSQL databases,” they typically mean non-relational databases. Some read NoSQL as “non SQL,” while others read it as “not only SQL.” Either way, it is generally agreed that NoSQL databases can store data in a wider variety of formats than relational databases.
A common misconception is that NoSQL databases cannot store relationship data. In fact, they can store it effectively, just differently than relational databases do. Relationship data in a NoSQL database is often easier to retrieve than in a conventional SQL database because it does not have to be split across different tables: NoSQL data models let related data be nested together in a single data structure.
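To make the idea of nesting related data concrete, here is a minimal sketch in plain Python. The customer and order records, field names, and helper functions are all hypothetical and exist only for illustration; no real database is involved.

```python
# Relational style: the same information split across two "tables",
# linked by customer_id and reassembled with a join-like lookup.
customers = [{"customer_id": 1, "name": "Ada"}]
orders = [
    {"order_id": 101, "customer_id": 1, "item": "keyboard"},
    {"order_id": 102, "customer_id": 1, "item": "monitor"},
]


def orders_for_relational(customer_id):
    """Reassemble a customer's items by scanning the separate orders table."""
    return [o["item"] for o in orders if o["customer_id"] == customer_id]


# Document style: the related orders are nested inside the customer record.
customer_doc = {
    "customer_id": 1,
    "name": "Ada",
    "orders": [
        {"order_id": 101, "item": "keyboard"},
        {"order_id": 102, "item": "monitor"},
    ],
}


def orders_for_document(doc):
    """Read the nested items directly from the single document."""
    return [o["item"] for o in doc["orders"]]
```

Both functions return the same list of items, but the document version reads everything from one structure instead of matching rows across two.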
NoSQL databases also offer many distinct benefits in performance, scalability, data distribution, reliability, flexibility, and cost-effectiveness. This article will further discuss some of the data stores that come under the NoSQL databases.
Key-value store databases
A key-value store, also known as a key-value database, is built on an associative array (for example, a map or dictionary) in which each key is associated with a value in the collection. In this model, each key is an arbitrary string, such as a hash value, and the value is stored as a blob (binary large object). Because the store treats the value as opaque, it does not need to index the contents of the data separately, which helps performance.
A key-value store does not need a query language. Data is stored, updated, and retrieved with simple GET, PUT, and DELETE commands, and a value is fetched by requesting it directly by its key, whether it resides on disk or in memory.
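The GET, PUT, and DELETE operations described above can be sketched with a toy in-memory store. The class name `KeyValueStore` and its methods are hypothetical and do not correspond to the API of any particular product; the point is only that every operation is addressed by key and the value is an opaque blob.

```python
class KeyValueStore:
    """Toy in-memory key-value store; values are opaque bytes (blobs)."""

    def __init__(self):
        self._data = {}

    def put(self, key: str, value: bytes) -> None:
        """Store or overwrite the blob kept under the key."""
        self._data[key] = value

    def get(self, key: str):
        """Return the blob stored under the key, or None if it is absent."""
        return self._data.get(key)

    def delete(self, key: str) -> bool:
        """Remove the key; return True if it existed, False otherwise."""
        if key in self._data:
            del self._data[key]
            return True
        return False


store = KeyValueStore()
store.put("user:42:prefs", b'{"theme": "dark"}')
prefs = store.get("user:42:prefs")    # the blob comes back unchanged
store.delete("user:42:prefs")
missing = store.get("user:42:prefs")  # None once the key is deleted
```

The store never inspects the bytes it holds, which is why no query language is needed: the key is the only way in.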
Some of the significant databases using classical key-value store architecture are:
- Apache Cassandra
- Aerospike
- Couchbase Server
- Berkeley DB
- Redis
- Memcached
- Riak
Key-value databases also differ significantly from one another. For example, Memcached does not persist data, whereas Riak does. If Memcached is used to cache user preferences, all of the cached data is lost when the nodes go down, and it has to be refreshed from the source system.
With Riak, by contrast, we need not worry about losing data, but we do need to consider how the data is updated. It is therefore important not only to choose a key-value database based on your requirements, but also to determine which type of key-value database suits your purpose. Some considerations are listed below.
- Queries – Queries are performed based on the given key.
- Schema – Data is stored as key-value pairs; the value corresponding to a given key can be stored in BLOB format.
- Scaling up – The keyspace is sharded, which means that data for keys starting with A may go to one server, while data for keys starting with B may go to a different server. However, this exposes the database to potential data loss for the affected keys if that server fails.
- Replication – When data is written to multiple machines, for example two servers in a cluster, the value of the key “ABC” may differ between the two servers. Resolving such conflicts is complex and also creates problems during updates.
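The scaling and replication points in the list above can be illustrated with a small sketch. Two assumptions are worth flagging: the text's example routes keys by their first letter, while this sketch hashes the whole key instead, and the last-write-wins rule shown for conflicting replicas is only one simple policy, not a description of how any specific database resolves conflicts. All names (`shard_for`, `ShardedStore`, `resolve_last_write_wins`) are hypothetical.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy store that spreads keys over several in-memory shards."""

    def __init__(self, num_shards: int):
        self.shards = [{} for _ in range(num_shards)]

    def put(self, key: str, value) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str):
        return self.shards[shard_for(key, len(self.shards))].get(key)


def resolve_last_write_wins(replica_values):
    """Given (timestamp, value) pairs from replicas, keep the newest value."""
    return max(replica_values, key=lambda pair: pair[0])[1]


# Two replicas disagree about the value of the same key.
replica_a = (1700000000, "blue")
replica_b = (1700000005, "green")
winner = resolve_last_write_wins([replica_a, replica_b])  # newest write
```

Because each key lives on exactly one shard here, losing that shard loses the key, which is the risk the list describes. Last-write-wins is easy to implement but silently discards the older write, which hints at why conflict resolution is considered a complex issue.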
Document databases
Document databases are similar to key-value databases in that they also store data as key-value pairs. The significant difference is that a document database stores its values in structured formats such as XML, BSON, or JSON. Because the database understands the data format, it is easier to perform DBMS operations on the contents. Document databases can also store complex data: collections, trees, dictionaries, and so on.
However, document databases do not support relational data in the way relational databases do, so each document stands alone. A document refers to another document by merely storing the key that corresponds to it. Document databases also generally do not support joins, which avoids the problem of joining data that is sharded across various nodes. Some of the top document-based databases are:
- CouchDB
- MongoDB
- Terrastore
- RavenDB
- OrientDB
As with key-value stores, document databases look data up by key, and range queries based on the key are possible. Most of the document databases listed above also support single-document transactions. Document databases are schemaless, which means that they do not need a predefined schema to store data. Documents may differ from one another in the number of fields they contain, and some of these databases may accept data only in JSON format.
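As a rough illustration of these properties, the sketch below models a document collection as a plain Python dictionary of JSON-serializable documents. The keys, fields, and helper functions are hypothetical and are not the API of any of the databases listed above. It shows documents with different fields, one document referring to another by storing its key, and a range query over keys.

```python
import json

# A collection maps document keys to JSON-serializable documents.
# The two posts do not share the same set of fields (no fixed schema).
collection = {
    "author:1": {"name": "Ada", "languages": ["en", "fr"]},
    "post:10": {"title": "Intro to NoSQL", "author_key": "author:1", "tags": ["nosql"]},
    "post:11": {"title": "Untitled draft", "author_key": "author:1"},
}


def get_document(key):
    """Look a document up by its key; return None if it does not exist."""
    return collection.get(key)


def resolve_author(post_key):
    """Follow the stored author key by hand, since there is no join."""
    post = get_document(post_key)
    if post is None:
        return None
    return get_document(post["author_key"])


def keys_in_range(start, end):
    """Key range query: sorted keys with start <= key < end."""
    return sorted(k for k in collection if start <= k < end)


# Every document round-trips through JSON unchanged.
serialized = json.dumps(collection)
restored = json.loads(serialized)
```

Calling `resolve_author("post:10")` performs two key lookups in application code, which is the work a join would otherwise do inside a relational database.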
Column-family databases
The column-family architecture stores data in column families, which are groups of rows. Each row has various columns associated with it. A column family usually consists of a group of related data that is accessed together. Each column family can be thought of as a container of rows, much like a table in a relational database, where a key identifies each row and each row consists of several columns.
The rows do not all need to have the same columns, and columns can be added to any row at any time without having to add them to other rows. When a column's value is itself a map of columns, it is called a super column: it consists of a name and a value, where the value is the column map. A super column is thus a container of columns. Some of the top column-family databases are listed below.
- HBase
- Cassandra
- Amazon DynamoDB
- Hypertable
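To make the row, column, and super-column layout described above more concrete, here is a minimal sketch using nested Python dictionaries. It is only a mental model with hypothetical names and data; real column-family databases add on-disk storage, timestamps, and distribution on top of this shape.

```python
# A column family maps row keys to rows; each row maps column names to values.
# The two rows below do not have the same columns.
users = {
    "row-1": {"name": "Ada", "email": "ada@example.com"},
    "row-2": {"name": "Lin"},
}


def add_column(column_family, row_key, column, value):
    """Add a column to a single row without touching any other row."""
    column_family.setdefault(row_key, {})[column] = value


add_column(users, "row-2", "city", "Oslo")

# Super column: a named container whose value is itself a map of columns.
profiles = {
    "row-1": {
        "address": {"street": "1 Main St", "city": "London"},
        "contact": {"email": "ada@example.com"},
    }
}


def read_column(column_family, row_key, super_column, column):
    """Read one column from inside a super column of the given row."""
    return column_family[row_key][super_column][column]
```

After `add_column` runs, only `row-2` has a `city` column; `row-1` is unchanged, which mirrors the point that columns can be added per row.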
Because its data is spread across the cluster, Cassandra is fast and scalable, and it is particularly quick at write operations compared to other databases. Column-family databases offer the benefits of compression, aggregation queries, scalability, and fast loading and querying. Beyond the types covered here, there are many other NoSQL database structures, which are continually being improved to be more functional and user-friendly.