When it comes to storing and managing data and feeding it into advanced applications, NoSQL databases have become mainstream. NoSQL, meaning “not only SQL”, is a movement that encourages developers and business people to look beyond the classic relational approach to data persistence.
From SQL to NoSQL
The leap from SQL to NoSQL in database application development came largely with the emergence of Big Data. Traditional relational database management systems (RDBMSs), or SQL databases for short, are designed mainly for structured data with a predefined schema. Schema-less NoSQL databases, on the other hand, handle unstructured, semi-structured and other forms of data well. We can think of NoSQL databases as additive: we can add new kinds of relationships, new columns, new nodes, new labels and new subgraphs to an existing structure without disturbing existing queries and application functionality.
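To illustrate what "additive" means in practice, here is a minimal sketch in Python. It assumes a plain list of dictionaries as a stand-in for a schema-less collection; the names `customers`, `emails` and `loyalty_tier` are hypothetical and do not belong to any particular NoSQL product.

```python
# Hypothetical in-memory "collection": a list of dicts stands in for a schema-less store.
customers = [
    {"name": "Ana", "email": "ana@example.com"},
    {"name": "Dan", "email": "dan@example.com"},
]

def emails(collection):
    """An 'existing query': it relies only on the 'email' field."""
    return [doc["email"] for doc in collection]

before = emails(customers)

# Add a record that carries a new field. Older records are not migrated,
# and the existing query does not need to change.
customers.append(
    {"name": "Ioana", "email": "ioana@example.com", "loyalty_tier": "gold"}
)

after = emails(customers)
```

The query written before the new field existed returns the same results for the old records, which is the property the paragraph above describes.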
What is unstructured data
According to experts in the field, 80 to 90 percent of the data in any organization is unstructured, and the amount is growing significantly. So what is unstructured data, in familiar terms?
Unstructured data usually refers to text and multimedia content, such as e-mail messages, word processing documents, videos, photos, audio files, presentations, webpages and many other kinds of business documents. Even though this type of content may have some internal structure, it is still considered “unstructured” because the data doesn’t fit neatly into a database. By contrast, think of data such as phone numbers, zip codes or credit card numbers, which accept a certain number of digits and can be mapped into pre-designed fields.
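The contrast can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names and the digit counts in `FIELD_PATTERNS` are hypothetical rules chosen for the example, not a standard.

```python
import re

# Hypothetical rules: each structured field accepts a fixed number of digits.
FIELD_PATTERNS = {
    "zip_code": re.compile(r"\d{5}"),
    "card_number": re.compile(r"\d{16}"),
}

def fits_field(field, value):
    """Return True if the whole value matches the pre-designed field format."""
    return FIELD_PATTERNS[field].fullmatch(value) is not None

zip_ok = fits_field("zip_code", "90210")

# Free text, such as the body of an e-mail, has no fixed format to validate against.
email_body = "Hi team, the report is attached. Call me if anything is unclear."
body_fits = fits_field("zip_code", email_body)
```

Structured values pass such a check and map directly to a column, while free text does not fit any pre-designed field.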
NoSQL as a new paradigm
Before the rapid emergence of big data, most content in business documents was structured, so developers built data-processing applications on relational storage models. The few exceptions were handled by tools that structured the rebellious data and loaded it into an RDBMS. Back then, huge volumes of unstructured data (big data) were not yet a real concern.
NoSQL is a new way of approaching databases and their management systems. It provides a mechanism to store and retrieve data modeled in a non-relational way (without tabular relations), for example as documents, graphs or columns.
NoSQL database management systems
There are different brands of NoSQL database management systems available on the market, each suitable for specific use cases. The following proved to be most useful for projects our teams have developed recently:
MongoDB is a general-purpose, document-based, distributed database built for modern application developers and for the cloud era. It is used by millions of developers to power the world’s most innovative products and services.
Apache Cassandra manages massive amounts of data fast. It is the right choice when looking for scalability and high availability without compromising performance. It is suitable for applications that can’t afford to lose data, even when an entire data center goes down.
Apache CouchDB ensures seamless multi-master sync, using multiple formats and protocols to store, transfer and process data. It is suitable for a variety of projects and products that scale from globally distributed server clusters to mobile phones and web browsers.
Types of NoSQL DBMS and their usage in applications
The schema-less model that characterizes NoSQL DBMSs supports high performance, scalability, distributed storage and cloud deployment. Different types of applications (for instance, collaborative workspaces, e-commerce stores, telecommunication software, sports video games, etc.) are built with different types of NoSQL DBMS.
So, if there is no schema, what is there instead? NoSQL DBMSs are designed to have multiple operational models based on the data and the target functionality. There are four major types of NoSQL DBMS, and we will introduce them together with the types of applications they are used for.
Key-Value-Based Model. In this model, each value is stored under a unique key and retrieved by that key. The key-value model is suitable for storing basic information like user profiles, user sessions, shopping cart data, queuing and live information, etc.
Column-Based Model stores related data in a family of columns identified by a row key. It works like a table, except that new columns can be added to any row at any time, so the rows do not all need to have the same columns. The column-based model is suitable for log aggregation or blogging platforms.
Document-Based Model. Documents can be XML, JSON or another format, as long as they have a hierarchical and self-describing structure. It is suitable for Content Management Systems, web-based and real-time analytics, e-commerce applications, etc.
Graph-Based Model stores entities (nodes) with their relationships (edges). Visually, it looks like a network in which the nodes are connected according to their relationships. Graph databases are suitable for social networks, recommendation engines, geospatial data, etc.
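The four models above can be sketched with plain Python data structures. This is a simplified illustration, not how any specific product stores data; every name and value below (`sessions`, `access_log`, `article`, `edges`, `related`) is hypothetical.

```python
# Key-value model: a value is looked up by its key and nothing else.
sessions = {
    "session:42": {"user": "ana", "cart": ["book", "pen"]},
}
cart = sessions["session:42"]["cart"]

# Column-based model: a row key points to a column family,
# and each row may carry a different set of columns.
access_log = {
    "row-001": {"request": {"path": "/home", "status": 200}},
    "row-002": {"request": {"path": "/cart", "status": 500, "error": "timeout"}},
}
columns_per_row = {key: sorted(row["request"]) for key, row in access_log.items()}

# Document model: one hierarchical, self-describing record (JSON-like).
article = {
    "title": "From SQL to NoSQL",
    "author": {"name": "Ana", "role": "editor"},
    "tags": ["nosql", "databases"],
}
author_name = article["author"]["name"]

# Graph model: entities are nodes, relationships are labelled edges.
edges = [
    ("ana", "FOLLOWS", "dan"),
    ("dan", "FOLLOWS", "ioana"),
    ("ana", "LIKES", "article-1"),
]

def related(node, label):
    """Return the nodes reached from `node` over edges with the given label."""
    return [target for source, edge_label, target in edges
            if source == node and edge_label == label]

followed_by_ana = related("ana", "FOLLOWS")
```

Note how the second row of `access_log` has an `error` column that the first row lacks, and how the graph model makes the relationship itself (the edge label) a first-class piece of data.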
Having a cutting-edge data modelling technique is not in itself a strong reason to replace a well-established, familiar data platform; there must also be an immediate and considerable practical benefit.
Ropardo consulting teams will help you determine how well a given database fits your project challenges, and understand the benefits based on use cases and data patterns whose performance improves significantly when implemented in one of the NoSQL databases under consideration. For instance, one good reason for choosing a graph database is the better performance when dealing with connected data. In relational databases, the performance of join-intensive queries declines as the data set gets bigger. With a graph database, performance tends to remain relatively constant even as the data set grows, because queries are localized to a segment of the graph. As a result, the execution time of each query is proportional only to the size of the graph segment traversed to satisfy that query. Another reason is the model’s flexibility, which spares us from modelling our domain in exhaustive detail ahead of time.
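The idea that a graph query touches only the segment it traverses can be sketched as follows. This is a minimal illustration that assumes an adjacency list held in a Python dictionary; `within_hops` and the sample node names are hypothetical, and a real graph database uses its own storage and query engine.

```python
from collections import deque

def within_hops(graph, start, max_hops):
    """Breadth-first traversal: return the nodes reachable from `start`
    in at most `max_hops` steps, plus the number of nodes visited."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the requested depth
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops + 1))
    return seen - {start}, len(seen)

graph = {
    "ana": ["dan", "ioana"],
    "dan": ["mihai"],
    "ioana": [],
    "mihai": ["radu"],
    "radu": [],
}
small_result, small_visited = within_hops(graph, "ana", 2)

# Grow the data set with many nodes that are not connected to "ana".
for i in range(10_000):
    graph[f"user{i}"] = [f"user{i + 1}"]

large_result, large_visited = within_hops(graph, "ana", 2)
```

The graph becomes thousands of nodes larger, yet the traversal visits the same handful of nodes in both runs, because the work depends on the neighbourhood explored rather than on the total size of the data set.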
It is essential to remember that NoSQL databases (the name stands for “Not Only SQL”) are not meant to replace SQL databases, but to cover the gaps that the emergence of unstructured data has exposed in relational database management systems.