In the upcoming weeks, I will be including more posts about web development. A good transition topic is non-relational databases, since they’re used not only in analytics but also in web development. This gives us some background before looking at the benefits, characteristics, tips, and tricks of a NoSQL database such as MongoDB.
NoSQL databases are a response to the demands placed on modern services: being “always on”, staying fast no matter the number of users or their location, handling ever greater amounts of generated data, and, most importantly, supporting dynamic schemas (there is no explicit schema as in relational databases, but there is an implicit one).
Among its main characteristics are:
SQL is not the standard language: each NoSQL database can have its own language for carrying out queries. Not to worry, though. In some cases the developers have thought about the hassle of learning another query language and have either stuck to SQL or allowed extensions that make it possible to use that language.
The schema is either flexible or not predefined, which means you can add new data without having defined its structure beforehand.
It involves a compromise on guaranteeing all of the ACID (atomicity, consistency, isolation, durability) properties of database transactions in order to improve performance and increase availability. Instead, consistency is often described with the BASE model (Basically Available, Soft state, Eventual consistency) …and yes, they’re as loose as they sound. There is no strict consistency across replicas: a read may briefly return stale data, but eventually every replica will return the last updated value.
NoSQL databases are designed with a focus on growth, through methods such as data fragmentation (sharding) and keeping identical copies on multiple servers (replication). They are generally distributed, using models such as P2P or master-slave, and many are open source.
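To make the "eventual" part of BASE a bit more concrete, here is a minimal sketch in plain Python of one primary node that forwards writes to a replica with a delay. The class names (Replica, Primary), the pending queue, and the seat-booking key are all made up for illustration; real databases replicate over a network and handle failures, conflicts, and ordering in far more sophisticated ways.

```python
from collections import deque


class Replica:
    """A node holding its own copy of the data (hypothetical, in-memory)."""

    def __init__(self):
        self.data = {}

    def read(self, key):
        return self.data.get(key)


class Primary(Replica):
    """Accepts writes and forwards them to the replicas later."""

    def __init__(self, replicas):
        super().__init__()
        self.replicas = replicas
        self.pending = deque()  # writes not yet applied to the replicas

    def write(self, key, value):
        # The write is acknowledged as soon as the primary has stored it.
        self.data[key] = value
        self.pending.append((key, value))

    def replicate(self):
        # Apply the queued writes to every replica, in the order received.
        while self.pending:
            key, value = self.pending.popleft()
            for replica in self.replicas:
                replica.data[key] = value


replica = Replica()
primary = Primary([replica])

primary.write("seat_12A", "booked")
stale_read = replica.read("seat_12A")  # the replica has not caught up yet
primary.replicate()
fresh_read = replica.read("seat_12A")  # the replica has now converged

print(stale_read, fresh_read)
```

Between the write and the call to replicate(), a reader hitting the replica sees old data. That window is the price paid for not making the writer wait on every copy.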
When are NoSQL Databases ideal?
NoSQL is ideal when the data schema varies from source to source and it becomes too costly to standardize everything in a relational database. A good example would be an airfare comparison website or app, where the data obtained differs in both its sources and its formats.
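As a rough illustration of the airfare case, the sketch below keeps fare records from three imaginary sources as plain Python dictionaries, each with its own shape, and still queries across them. The source names, field names, airport codes, and prices are all invented, and a real document database would provide its own query language instead of these helper functions.

```python
import json

# Hypothetical fare records: every source sends a different set of fields.
fares = [
    {"source": "airline_a", "from": "MAD", "to": "LIS",
     "price": 89.0, "currency": "EUR"},
    {"source": "agency_b", "route": {"from": "MAD", "to": "LIS"},
     "price": 95.5, "baggage": {"included": True}},
    {"source": "airline_c", "from": "MAD", "to": "LIS",
     "price": 79.0, "stops": 1},
]


def route_of(doc):
    """Return (origin, destination) whichever shape the document has."""
    route = doc.get("route", doc)  # nested route if present, else top level
    return route.get("from"), route.get("to")


def cheapest(docs, origin, destination):
    """Return the lowest-priced document for a route, or None if no match."""
    matches = [d for d in docs if route_of(d) == (origin, destination)]
    return min(matches, key=lambda d: d["price"], default=None)


best = cheapest(fares, "MAD", "LIS")
print(json.dumps(best, indent=2))
```

No table had to be altered to accept the "baggage" or "stops" fields. The cost is that the reading code (route_of here) has to cope with the implicit schema mentioned earlier.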
Another scenario where NoSQL is ideal is when the data has many relationships and is produced in real time. Some of the strongest examples are Facebook and Instagram, where content uploads spike at certain peak hours, yet users should perceive no reduction in performance.
It is also ideal when the business is dealing with great volumes of data, yet performance, flexibility, and speed need to be ensured. An example would be a high-traffic eCommerce business such as Amazon, which deals with billions of requests (2.73B when I wrote this article and ran the statistics on similarweb.com) and serves around a billion pages each day. Aside from using various technologies, they even have their own proprietary database.
Also, in a business such as a SaaS where data needs to be processed in real time or as close to it as possible, a NoSQL database would be preferable.
More Comparisons with Relational Databases
Let’s face it, we’re all about comparing, so here are a few more comparisons.
Relational databases have more standards (ex. SQL as the query language) given the amount of time they’ve been around. This may change over time as the different NoSQL databases mature; a good example is the widespread use of the JSON format.
Unlike relational databases, there isn’t a single model for structure and relations. There are various types, such as:
Column-oriented database: where a key is used to identify multiple values. Example: HBase
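Key-value store database: each item is stored as a key and a value, where the key serves to identify the value. Example: Azure Table Storage (Cassandra is often listed here as well, though it is more commonly classed as a column-oriented store)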
Graph database: it stores and queries data as a graph of nodes and edges (lines connecting nodes). It is best used with data that has many relationships, since this structure makes such data easier to store and query. Example: Neo4j
Object-oriented database (OODB): data is represented in the form of objects and classes (collections of objects in this case). It provides features such as concurrency control, transactions, and recovery. Ex. Perst
Document database: data is stored as documents, typically in JSON or a JSON-like format. MongoDB, for example, uses BSON, a binary encoding of JSON that supports additional data types. Ideal for semi-structured data. Ex. MongoDB.
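To get a feel for how these models differ, here is a small sketch that lays out the same toy user data in four of the shapes above, using nothing but Python dictionaries and lists. The keys, column names, and the "FOLLOWS" relationship are invented for the example, and none of this reflects how any particular product stores data internally.

```python
import json

# Key-value: the store only knows the key; the value is an opaque blob.
key_value = {"user:1": '{"name": "Ana", "city": "Lima"}'}
ana = json.loads(key_value["user:1"])  # the application decodes the value

# Document: values are structured, so fields inside them can be queried.
documents = [
    {"_id": 1, "name": "Ana", "city": "Lima"},
    {"_id": 2, "name": "Ben", "city": "Quito", "languages": ["es", "en"]},
]
in_lima = [d["name"] for d in documents if d.get("city") == "Lima"]

# Column-oriented: a row key identifies a set of named columns,
# and different rows may hold different columns.
wide_column = {
    "user:1": {"profile:name": "Ana", "profile:city": "Lima"},
    "user:2": {"profile:name": "Ben"},
}

# Graph: nodes plus edges; relationships are first-class data.
nodes = {1: {"name": "Ana"}, 2: {"name": "Ben"}, 3: {"name": "Cy"}}
edges = [(1, "FOLLOWS", 2), (2, "FOLLOWS", 3)]


def followed_by(node_id):
    """Ids of the nodes that node_id follows directly."""
    return [dst for src, rel, dst in edges
            if src == node_id and rel == "FOLLOWS"]


def two_hops(node_id):
    """Ids reachable by following exactly two FOLLOWS edges."""
    return sorted({far for near in followed_by(node_id)
                   for far in followed_by(near)})


print(ana["city"], in_lima, two_hops(1))
```

The graph version makes "who do the people I follow follow?" a short traversal, while the key-value version cannot even filter by city without decoding every value. That trade-off is a large part of why there are so many types.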
The next topic will be MongoDB, which should provide greater insight, through examples, into the benefits of a NoSQL database and how it deals with data.