Introduction to Hashing

For validation purposes, passwords are stored in a database. It is bad enough when a user picks “asdf123” as a password; storing passwords as plain text makes things worse, because anyone who gains access to the database can read them all. To solve this, one applies a hashing function to the password before storing it. Hashing generates a new fixed-size value, called a digest, from the input according to a mathematical algorithm (e.g. MD5, SHA-1, SHA-256). Hashing works not only with strings but also with documents, images, media files, etc.

The same content run through the same hashing function always produces the same output, which allows one to store the hashed value in the database and compare it against the hash of whatever the user types at login. Another benefit of this property is version control and file integrity: if the sender publishes the digest of the original content, the recipient can hash the received file and compare the two digests. If they match, the file has almost certainly not changed.

In JavaScript, for example, a password can be hashed with the sha1 algorithm together with a salt. A salt is an additional protection: extra data, ideally random and different for every user, that is added to the password before it is hashed, so that two users with the same password end up with different hashes. One way to generate it is with uuidv1. Note that MD5 and SHA-1 are considered weak today and appear here only as familiar examples.
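As a minimal sketch of salted hashing in Node.js, the snippet below uses the built-in crypto module. The function names hashPassword and verifyPassword are made up for illustration, and crypto.randomUUID() (a version 4 UUID) stands in for uuidv1 so that no external package is needed. It only illustrates the idea; a real system would more likely use a deliberately slow function such as scrypt or bcrypt.

```javascript
const crypto = require("node:crypto");

// Hash a password together with a salt using SHA-1, as described in the text.
// If no salt is passed in, a fresh one is generated (assumption: a v4 UUID
// is used here in place of uuidv1).
function hashPassword(password, salt = crypto.randomUUID()) {
  const hash = crypto
    .createHash("sha1")
    .update(salt + password)
    .digest("hex");
  // Both values are stored: the salt is needed again to verify a login.
  return { salt, hash };
}

// Re-hash the candidate password with the stored salt and compare digests.
function verifyPassword(candidate, stored) {
  return hashPassword(candidate, stored.salt).hash === stored.hash;
}

const record = hashPassword("asdf123");
console.log(record.salt, record.hash);
console.log(verifyPassword("asdf123", record)); // true
console.log(verifyPassword("wrong", record)); // false
```

Because the salt differs on every call, hashing “asdf123” twice gives two different digests, which is exactly what makes precomputed lists of common hashes less useful.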

Hashing is not only useful for single elements but it can also work with key-value pairs through the use of hash tables.

Hash Tables

In a hash table, the key is passed to a hashing function to produce an index into an array, and the element is stored at that index. Accessing the element later is efficient because the same key always yields the same index, so the exact position is known without searching.
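The key-to-index step can be sketched as follows. The hash function here (sum of character codes modulo the table size) and the helper names put and get are hypothetical and chosen only because they are easy to follow; real hash tables use much better functions.

```javascript
// Hypothetical toy hash function: sum the character codes of the key
// and wrap the result into the range [0, size).
function hashKey(key, size) {
  let sum = 0;
  for (const ch of key) sum += ch.codePointAt(0);
  return sum % size;
}

const SIZE = 8;
const table = new Array(SIZE);

// Store the entry directly at the computed index (no collision handling yet).
function put(key, value) {
  table[hashKey(key, SIZE)] = { key, value };
}

// Jump straight to the computed index instead of scanning the array.
function get(key) {
  const entry = table[hashKey(key, SIZE)];
  return entry !== undefined && entry.key === key ? entry.value : undefined;
}

put("ana", "ana@example.com");
put("bob", "bob@example.com");
console.log(hashKey("ana", SIZE), get("ana"));
console.log(hashKey("bob", SIZE), get("bob"));

// Different keys can land on the same index: "ab" and "ba" both sum to 195.
console.log(hashKey("ab", SIZE) === hashKey("ba", SIZE)); // true
```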

Two or more keys can end up with the same hash value and therefore the same index in the hash table; this is known as a collision. Collisions cannot be prevented entirely, but they can be handled with either separate chaining or open addressing. In separate chaining, each cell of the table points to a linked list that holds every value associated with that index. In open addressing, if the position is already taken, the value is placed in another open position found by probing the table. The best practice is still to use a strong algorithm that provides a low chance of collision.

The benefit of the hash table, as hinted earlier, is that searching becomes faster: one doesn’t have to go through all the records until the desired one is found, because the hashed key gives the index directly. On average a lookup takes constant time, O(1), which is better than linear search, O(n), or binary search, O(log n).
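The two collision strategies can be sketched side by side. The class names ChainedTable and ProbingTable and the toy hashKey function are assumptions made for this illustration. For brevity the chained version uses a plain array per cell instead of a linked list, the open-addressing version uses linear probing (trying the next cell), and neither supports deletion or resizing.

```javascript
// Hypothetical toy hash, collision-prone on purpose ("ab" and "ba" collide).
function hashKey(key, size) {
  let sum = 0;
  for (const ch of key) sum += ch.codePointAt(0);
  return sum % size;
}

// Separate chaining: every cell holds a list of [key, value] entries.
class ChainedTable {
  constructor(size = 8) {
    this.buckets = Array.from({ length: size }, () => []);
  }
  set(key, value) {
    const bucket = this.buckets[hashKey(key, this.buckets.length)];
    const entry = bucket.find(([k]) => k === key);
    if (entry) entry[1] = value; // key already present: update in place
    else bucket.push([key, value]); // otherwise append to the chain
  }
  get(key) {
    const bucket = this.buckets[hashKey(key, this.buckets.length)];
    const entry = bucket.find(([k]) => k === key);
    return entry ? entry[1] : undefined;
  }
}

// Open addressing with linear probing: on a collision, try the next cell.
class ProbingTable {
  constructor(size = 8) {
    this.slots = new Array(size).fill(null);
  }
  set(key, value) {
    const size = this.slots.length;
    const start = hashKey(key, size);
    for (let i = 0; i < size; i++) {
      const idx = (start + i) % size;
      const slot = this.slots[idx];
      if (slot === null || slot[0] === key) {
        this.slots[idx] = [key, value];
        return;
      }
    }
    throw new Error("table is full");
  }
  get(key) {
    const size = this.slots.length;
    const start = hashKey(key, size);
    for (let i = 0; i < size; i++) {
      const slot = this.slots[(start + i) % size];
      if (slot === null) return undefined; // empty cell: key was never stored
      if (slot[0] === key) return slot[1];
    }
    return undefined;
  }
}

const chained = new ChainedTable();
const probing = new ProbingTable();
for (const t of [chained, probing]) {
  t.set("ab", 1);
  t.set("ba", 2); // same index as "ab", so it triggers the collision path
}
console.log(chained.get("ab"), chained.get("ba"));
console.log(probing.get("ab"), probing.get("ba"));
```

In both versions the colliding keys remain retrievable; the difference is only where the second entry ends up, in the same cell's list or in a neighbouring cell.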

Hashing vs Encryption

Since both hashing and encryption have to do with cybersecurity, it’s quite common to think they are interchangeable terms. They are not: hashing is a one-way function, while encryption is a two-way function. What is encrypted can be decrypted with the proper key (think of it like protecting a document placed in a vault). With hashing there is no practical way to recover the original value from the digest. When an attacker gets hold of hashed passwords, what they really do is take a collection of common passwords, apply the same hashing algorithm to each, and compare the results with the stolen hashes in the hope of finding a match.

Although hashing is used for security, it is also used to ensure the integrity of an element, since any change to the content changes its digest. Encryption, by contrast, is about keeping the element confidential while allowing it to be recovered unchanged. Hashing also provides benefits for searching given its indexing feature.
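The contrast can be sketched with Node’s built-in crypto module. The helper names (sha256, encrypt, decrypt), the sample password list, and the choice of SHA-256 and AES-256-GCM are assumptions for illustration only, not a recommendation for a production setup.

```javascript
const crypto = require("node:crypto");

// One-way: a digest can be recomputed from the input, but not reversed.
const sha256 = (text) =>
  crypto.createHash("sha256").update(text).digest("hex");

// Simulate the guessing attack described above: hash each common password
// and compare it with the stolen (unsalted) hash.
const stolenHash = sha256("asdf123");
const commonPasswords = ["password", "123456", "qwerty", "asdf123"];
const cracked = commonPasswords.find((guess) => sha256(guess) === stolenHash);
console.log(cracked); // "asdf123"

// Two-way: with the key, the original content can be recovered.
const key = crypto.randomBytes(32); // secret key, 256 bits

function encrypt(plaintext) {
  const iv = crypto.randomBytes(12); // fresh nonce for every message
  const cipher = crypto.createCipheriv("aes-256-gcm", key, iv);
  const data = Buffer.concat([
    cipher.update(plaintext, "utf8"),
    cipher.final(),
  ]);
  return { iv, data, tag: cipher.getAuthTag() };
}

function decrypt({ iv, data, tag }) {
  const decipher = crypto.createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(data), decipher.final()]).toString(
    "utf8"
  );
}

const sealed = encrypt("document in the vault");
console.log(decrypt(sealed)); // "document in the vault"
```

Note that the guessing attack never reverses the hash; it only works because the password was on the list, which is why weak passwords remain a problem even when they are hashed.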

Maps and Hashing

A Map object is useful because it holds key-value pairs and can be iterated over directly in a loop. In a Map, keys are kept in insertion order, they can be any value (including objects and functions) rather than just strings or symbols, and there are no keys in the Map by default: it contains only what has been explicitly placed into it. Maps also report their number of entries through the size property and perform better when key-value pairs are added and removed frequently.
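A short sketch of these Map properties follows; the variable names and sample data are made up for illustration.

```javascript
const visits = new Map();
const userKey = { id: 1 }; // an object used directly as a key

visits.set("home", 3);
visits.set(42, 1); // a number key stays a number
visits.set(userKey, 7);

// Keys come back in insertion order.
const order = [...visits.keys()];

// A Map can be iterated directly, yielding [key, value] pairs.
for (const [key, value] of visits) {
  console.log(key, value);
}

// Nothing is in a Map by default, not even names like "toString".
console.log(new Map().has("toString")); // false

// Removal and size are built in.
visits.delete(42);
console.log(visits.size); // 2
console.log(visits.get(userKey)); // 7
```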

Objects, on the other hand, inherit default keys from their prototype that can collide with one’s own keys, their keys are restricted to strings or symbols, the number of items has to be determined manually instead of being read from a size property, and they are not optimized for frequent addition and removal of key-value pairs.
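For comparison, the same points can be seen on a plain object. This is only a small sketch with made-up names.

```javascript
const counts = {};

// A fresh object already "has" inherited keys from its prototype.
console.log("toString" in counts); // true

// Non-string keys are converted to strings.
counts[42] = "a";
counts[{ id: 1 }] = "b";
const keys = Object.keys(counts);
console.log(keys); // [ '42', '[object Object]' ]

// There is no size property; the entries have to be counted manually.
const count = Object.keys(counts).length;
console.log(count); // 2
```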


Miguel Morales
