Blockchain Basics : What is Hashing?

Blockchain technology brings together a number of significant breakthroughs from several disciplines.
Both building a blockchain and using one rely on the cryptographic hash function.
It is impossible to envision a decentralized network that completely does away with hashing.

But what is hashing all about, and why do blockchains need it?
That is what we will be discussing here.
But what if you don't know anything about blockchains or cryptography?

Well, NO Problemo. Just stay with me and you will understand this.

First of all, what exactly does hashing mean?

Hashing : For everyone (Literally)

You might have seen something like this somewhere :
00000000000000000025e2ba026a8ad462b9a693d80fd0887def167f5f888a11

This is nothing but a code for something. It is a string made up of digits (0-9) and letters (a-f), which is how hexadecimal numbers are written.

While this string does not seem to have any meaning, it could be the hash of the password to someone's bank account holding millions of dollars, or of just the letter A.

This is the power of hashing.

You have a function or simply a black box.

You put in your name : Spongebob and you will get : f0e2e750791171b0391b682ec35835bd6a5c3f7c8d1d0191451ec77b4d75f240.

Magical right?
It's actually mathematical.

This comes from SHA-256, the hashing algorithm used by Bitcoin, but as a beginner you don't have to know the details.
Basically this is how hashing works. But there are a few essential rules that define a hashing function :

  • Hashing is a method for turning data into a short cryptographic fingerprint, called a hash.
  • It produces a fixed-length output from any input.
  • The same input always produces the same hash.
  • The input cannot be reconstructed from the hash.
  • Modern hash functions make it virtually impossible to produce the same output from two different inputs.
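
To see the black box in action, here is a minimal sketch using Python's standard hashlib module. The helper name sha256_hex and the choice of UTF-8 encoding are assumptions made for this illustration, not part of any blockchain library.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of `text` as a hexadecimal string."""
    # Hash functions work on bytes, so the text is encoded first
    # (UTF-8 is an assumption of this sketch).
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


first = sha256_hex("Spongebob")
second = sha256_hex("Spongebob")

print(first)
print(len(first))       # length of the hash in hexadecimal characters
print(first == second)  # same input, same hash
```

Running it twice, or on another machine, should give the same hash for the same input, which is what makes hashes useful as fingerprints.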


Properties of Hashing

1. Hashing produces outputs of fixed length

A hash function always turns its input into an output of the same fixed length, no matter what the input is.
Let's look at a few examples to see what that means.

Table: Hashing examples

The table above shows the hashes of three different inputs, created with the SHA-256 hash algorithm.
The hash is completely different in each of the three cases, yet it always has the same length.
SHA-256 creates 256-bit hashes, which are usually written as 64 hexadecimal characters made up of the digits 0–9 and the letters a–f.
The hash is always 64 characters long, regardless of the length of the input, which can be anything from a single word (hello) to an entire book (Charles Dickens' Bleak House).
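
If you want to check the fixed-length property yourself, the sketch below hashes three inputs of very different sizes with Python's hashlib. The sample inputs are made up for this illustration; the long one is only a stand-in for a whole book.

```python
import hashlib

# Three inputs of very different sizes (hypothetical sample data).
samples = {
    "one word": b"hello",
    "one sentence": b"Hashing turns any input into a fixed-length output.",
    "a long text": b"hello " * 100_000,  # stand-in for an entire book
}

# Hash every sample with SHA-256 and keep the hexadecimal form.
digests = {label: hashlib.sha256(data).hexdigest() for label, data in samples.items()}

for label, data in samples.items():
    digest = digests[label]
    print(f"{label}: {len(data)} bytes in, {len(digest)} hex characters out")
    print(f"  {digest}")
```

Each of the three hashes comes out 64 hexadecimal characters long, even though the inputs range from 5 bytes to 600,000 bytes.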

2. Small change equals Entirely different Hash

Change even a single letter of the input, and a hashing function generates an entirely different output, or hash as it is called. This is often called the avalanche effect.
But why is this done?
This is done to prevent attackers from predicting how the output changes when the input changes. If the output can be predicted, the hashing function is weak.

Let's see an example of how the output changes :

Figure: Changes in hash
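
The same experiment can be sketched in a few lines of Python. The helper names sha256_bytes and differing_bits are made up for this illustration; the code hashes "hello" and "Hello" and counts how many of the 256 output bits differ.

```python
import hashlib


def sha256_bytes(text: str) -> bytes:
    """Return the raw 32-byte SHA-256 digest of `text` (UTF-8 encoded)."""
    return hashlib.sha256(text.encode("utf-8")).digest()


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length digests differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


h1 = sha256_bytes("hello")
h2 = sha256_bytes("Hello")  # only the first letter changed

print(h1.hex())
print(h2.hex())
print(differing_bits(h1, h2), "of 256 bits differ")
```

For a well-designed hash function you would expect roughly half of the bits to flip on average, so the two hashes look unrelated even though the inputs are almost identical.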

3. Hashing functions are one-way functions

In the simplest terms, a one-way function means that :

  • Going from A to B is easy
  • Going from B to A is possible in theory, but not practical.

Hashing functions are made to be this way. This gives us the security that we need when using hashing. If it were easy to work back from a hash to its input, there would be no point in using a hash function for security purposes.

That said, hashing functions are also used where security is not the main concern, for example to produce the commit IDs you see on GitHub.
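
To get a feel for "possible, but not practical", here is a rough sketch in Python. Since a hash cannot be run backwards, the only general way from B to A is to guess inputs until one matches. The helper names sha256_hex and brute_force and the secret word "cat" are made up for this illustration, and the search is limited to lowercase letters on purpose.

```python
import hashlib
import itertools
import string


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of `text` as a hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def brute_force(target_hash: str, length: int):
    """Try every lowercase string of the given length until one matches."""
    candidates = itertools.product(string.ascii_lowercase, repeat=length)
    for attempt, letters in enumerate(candidates, start=1):
        candidate = "".join(letters)
        if sha256_hex(candidate) == target_hash:
            return candidate, attempt
    return None, 26 ** length


# Going from A to B: a single function call.
target = sha256_hex("cat")

# Going from B to A: nothing better than guessing.
found, attempts = brute_force(target, length=3)

print(found, "found after", attempts, "guesses")
print(26 ** 3, "possible 3-letter inputs")
print(26 ** 12, "possible 12-letter inputs")
```

A 3-letter word falls quickly because there are only 17,576 candidates. Every extra letter multiplies the work by 26, so guessing stops being practical long before you reach realistic inputs.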

4. Resilient to Collisions

When two separate inputs result in the same output from a hashing algorithm, this is known as a collision.
In theory this is unavoidable, as there is a finite number of distinct hashes but an infinite number of possible inputs.
In practice, the likelihood of a collision is extremely slim.
Thus, hashing is said to be collision-resistant but not collision-immune.

SHA-256, the hashing algorithm Bitcoin uses, generates hashes with a length of 256 bits (a 256-digit-long string of 1s and 0s).
This means that it can produce a total of 2^256 distinct hashes.
When there are more inputs than possible outputs, let's say 2^256 + 1, at least two of the inputs must have the same hash, which is a collision.
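
Nobody can try 2^256 + 1 inputs, but the same counting argument can be watched at a small scale. The sketch below uses a deliberately weak toy hash, made up for this illustration, that keeps only the first 16 bits of SHA-256. It has just 2^16 = 65,536 possible outputs, so a collision must show up within 65,537 inputs.

```python
import hashlib

# Number of distinct SHA-256 outputs (a 78-digit number).
print(2 ** 256)


def tiny_hash(data: bytes) -> bytes:
    """A deliberately weak 16-bit toy hash: the first 2 bytes of SHA-256."""
    return hashlib.sha256(data).digest()[:2]


def find_collision():
    """Hash "0", "1", "2", ... until two inputs share the same tiny hash."""
    seen = {}  # maps each tiny hash to the first input that produced it
    counter = 0
    while True:
        data = str(counter).encode("utf-8")
        h = tiny_hash(data)
        if h in seen:
            return seen[h], data, counter + 1
        seen[h] = data
        counter += 1


first_input, second_input, tried = find_collision()

print(first_input, "and", second_input, "share the tiny hash", tiny_hash(first_input).hex())
print(tried, "inputs tried; at most", 2 ** 16 + 1, "could ever be needed")
```

With the full 256 bits the same search is hopeless, which is why SHA-256 counts as collision-resistant even though collisions must exist.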

So, now you know why hashing is used, the fundamental concepts behind it, and the properties a hashing function needs for a good blockchain to be built on it.

We'll meet again in the next post, where we will discuss the SHA-256 algorithm in more detail and why it is used today.