CRC-32 hash tutorial

Put simple Tips and Tricks that are not entire Tutorials in this forum
User avatar
jeeswg
Posts: 5419
Joined: 19 Dec 2016, 01:58
Location: UK

CRC-32 hash tutorial

09 Aug 2017, 17:20

The mathematics of CRC-32:

Code: Select all

==================================================

CONTENTS

1. DECIMAL DIVISION
2. BINARY DIVISION
3. 'CRC DIVISION'
4. CRC-32 CALCULATION

==================================================

1. DECIMAL DIVISION

EXAMPLE 1:

An example of normal division:
1559/9 = 173 remainder 2

    173
   ____
9 |1559
   1557
   ----
      2

That is:
	 quotient
         ________
divisor |dividend

         remainder

EXAMPLE 2 INTRO:

An example of normal division:
11000010111/1001 = 10989021 remainder 90

Note: these numbers look like binary numbers,
but we will treat them as decimal numbers.

The basic principle is: subtract the greatest
possible multiple of 1001 each time.

E.g. 9900.
The biggest multiple of 1001 below 9900 is 1001*9=9009.
We note that: 9900-9009=891.
Thus we write '9' at the top i.e. the first '9' in '10989021'.
We write '891' below.

E.g. 1100.
The biggest multiple of 1001 below 1100 is 1001*1=1001.
We note that: 1100-1001=99.
Thus we write '1' at the top i.e. the first '1' in '10989021'.
We write '099' below.

EXAMPLE 2:

              10989021
      ________________
1001 |0000011000010111
      0000
      ----
       0000
       0000
       ----
        0001
        0000
        ----
         0011
         0000
         ----
          0110
          0000
          ----
           1100
           1001
           ----
            0990
            0000
            ----
             9900
             9009
             ----
              8911
              8008
              ----
               9030
               9009
               ----
                0211
                0000
                ----
                 2111
                 2002
                 ----
                  1091
                  1001
                  ----
                  0090

==================================================

2. BINARY DIVISION

EXAMPLE INTRO:

Hopefully looking at the decimal division above,
will have reminded you of how to do long division by hand,
and will make it easier to understand the calculation below,
where we use binary (base 2) instead of decimal (base 10).

We will use a leading '0b' to indicate binary numbers,
and use '**' to mean 'to the power of':
e.g. 0b10101 = 2**4 + 2**2 + 2**0 = 16 + 4 + 1 = 21

An example of binary division:
0b11000010111/0b1001 = 0b10101101 remainder 0b10

Which is equivalent to an example we used earlier:
1559/9 = 173 remainder 2

E.g. 0b1100.
The biggest multiple of 0b1001 below 0b1100 is 0b1001*0b1=0b1001.
We note that: 0b1100-0b1001=0b11.
Thus we write '1' at the top i.e. the first '1' in '0b11000010111'.
We write '011' below.

This is equivalent to decimal:
E.g. 12.
The biggest multiple of 9 below 12 is 9*1=9.
We note that: 12-9=3.
Thus we write '1' at the top i.e. the first '1' in '0b11000010111'.
We write '011' below, where 0b11 = 3.

One thing to be aware of is in decimal:
1000-1 = 999
In binary:
0b1000-0b1 = 0b111
Which helps when doing subtraction,
the equivalent of 9s in decimal, are 1s in binary.

EXAMPLE:

Example from:
crc_v3.txt
http://www.ross.net/crc/download/crc_v3.txt

          ...0000010101101 = 00AD =  173 = QUOTIENT
         ____-___-___-___-
9= 1001 ) 0000011000010111 = 0617 = 1559 = DIVIDEND
DIVISOR   0000.,,....,.,,,
          ----.,,....,.,,,
           0000,,....,.,,,
           0000,,....,.,,,
           ----,,....,.,,,
            0001,....,.,,,
            0000,....,.,,,
            ----,....,.,,,
             0011....,.,,,
             0000....,.,,,
             ----....,.,,,
              0110...,.,,,
              0000...,.,,,
              ----...,.,,,
               1100..,.,,,
               1001..,.,,,
               ====..,.,,,
                0110.,.,,,
                0000.,.,,,
                ----.,.,,,
                 1100,.,,,
                 1001,.,,,
                 ====,.,,,
                  0111.,,,
                  0000.,,,
                  ----.,,,
                   1110,,,
                   1001,,,
                   ====,,,
                    1011,,
                    1001,,
                    ====,,
                     0101,
                     0000,
                     ----
                      1011
                      1001
                      ====
                      0010 = 02 = 2 = REMAINDER

==================================================

3. 'CRC DIVISION'

EXCLUSIVE OR (XOR):

To perform an 'exclusive or' operation works like this:
e.g. where the numbers are binary, and '^' is the exclusive or operator:
       11001100
     ^ 10101010
     ----------
       01100110

Wherever the bits in a column are equal '0 0' or '1 1',
we write a 0 underneath.
Wherever the bits in a column are not equal '0 1' or '1 0',
we write a 1 underneath.

Note: for a normal 'or' operation, the result would be:
e.g. where the numbers are binary, and '|' is the or operator:
       11001100
     | 10101010
     ----------
       11101110

- For exclusive OR:
The first bit OR the second bit are equal to 1, but not both.
Equivalent to: exactly one bit is equal to 1.
- For OR:
The first bit OR the second bit OR both are equal to 1.
Equivalent to: at least one bit is equal to 1.

EXAMPLE INTRO:

An example of 'CRC division':
0b11010110110000 'divided by' 0b10011 gives 0b1100001010 remainder 0b1110.
Equivalent to: 13744 'divided by' 19 gives 778 remainder 14.

There is not necessarily a handy name for this.
I would call it 'CRC division'.

What we do is we perform binary division, but
we perform a different sort of subtraction:

Instead of subtracting the greatest possible multiple:
0b11010 - 0b10011 = 0b111
Equivalent to: 26 - 19 = 7

We instead do the following (where '^' is the exclusive or operator):
0b11010 ^ 0b10011 = 0b1001
Equivalent to: 26 ^ 19 = 9

Note also: the 'subtractions' will take place if and only if
the binary numbers are of the same length, and both start with '1'.

EXAMPLE:

Example by Tanenbaum81 from:
crc_v3.txt
http://www.ross.net/crc/download/crc_v3.txt

            1100001010
       _______________
10011 ) 11010110110000
        10011,,.,,....
        -----,,.,,....
         10011,.,,....
         10011,.,,....
         -----,.,,....
          00001.,,....
          00000.,,....
          -----.,,....
           00010,,....
           00000,,....
           -----,,....
            00101,....
            00000,....
            -----,....
             01011....
             00000....
             -----....
              10110...
              10011...
              -----...
               01010..
               00000..
               -----..
                10100.
                10011.
                -----.
                 01110
                 00000
                 -----
                  1110 = Remainder

EXAMPLE (REWRITTEN) INTRO:
A reminder of some key division-related terms, also shown earlier:
	 quotient
         ________
divisor |dividend

         remainder

In 'CRC division' we are not interested
in the quotient, only the remainder.
If we discard the quotient, then the procedure can be thought of as
as a binary number, with bits being shifted left,
and XORs taking place occasionally.

E.g. start with first (leftmost) bits of the dividend (0b11010110110000),
write them into a register:
[11010]110110000
If the first (leftmost) bit is 1,
perform a XOR on the number, using the divisor (0b10011).
[01001]110110000
Then move each digit left by 1, discarding the leftmost digit,
and add in the next digit from the dividend.
[10011]10110000
That is: we pop the first bit, and shift the other bits left by one bit.

This is example is shown in full below.

EXAMPLE (REWRITTEN):

all numbers are binary:
dividend: 0b11010110110000
divisor: 0b10011

[11010]110110000
starts with 1 so perform XOR: 11010 ^ 10011 = 01001:
[01001]110110000
pop the initial bit:
[10011]10110000
starts with 1 so perform XOR: 10011 ^ 10011 = 00000:
[00000]10110000
pop the initial bit:
[00001]0110000
pop the initial bit:
[00010]110000
pop the initial bit:
[00101]10000
pop the initial bit:
[01011]0000
pop the initial bit:
[10110]000
starts with 1 so perform XOR: 10110 ^ 10011 = 00101:
[00101]000
pop the initial bit:
[01010]00
pop the initial bit:
[10100]0
starts with 1 so perform XOR: 10100 ^ 10011 = 00111:
[00111]0
pop the initial bit:
[01110]

so the remainder is 0b1110

==================================================

4. CRC-32 CALCULATION

Here are some ASCII strings and their respective CRC-32 hashes:
0xE8B7BE43	a
0x9E83486D	ab
0x352441C2	abc
0x4C2750BD	abcdefghijklmnopqrstuvwxyz
0xCBF43926	123456789

There are 4294967296 = 0x100000000 possible CRC-32 hashes.
They are usually written as an 8-character hex number.

In the examples below we will calculate various CRC-32 hashes:
example 1: 'a'
example 2: 'ab' by using 'a'
example 3: 'abc' by using 'ab'
example 4: 'ab' in one go
example 5: 'abc' in one go

There are many different ways to calculate a CRC-32 hash,
I have tried to find the simplest way possible,
but it may be that an even simpler way is possible.

Note: the divisor is often referred to as the 'polynomial',
when performing 'CRC division'.

EXAMPLE 1:

calculate the CRC-32 hash for the ASCII string 'a':

inputs:
dividend: binary for 'a': 0b01100001 = 0x61 = 97
polynomial: 0b100000100110000010001110110110111 = 0x104C11DB7

0b01100001 = 0x61
reverse bits in each byte:
0b10000110 = 0x86
append 32 0 bits:
0b1000011000000000000000000000000000000000 = 0x8600000000
XOR the first 4 bytes with 0xFFFFFFFF:
0b0111100111111111111111111111111100000000 = 0x79FFFFFF00

'CRC division':
0111100111111111111111111111111100000000
 100000100110000010001110110110111
 ---------------------------------
  111000110011111011100010010010110
  100000100110000010001110110110111
  ---------------------------------
   110000101011110011011001001000010
   100000100110000010001110110110111
   ---------------------------------
    100000011011100010101111111101010
    100000100110000010001110110110111
    ---------------------------------
          111101100000100001001011101000

remainder: 0b00111101100000100001001011101000 = 0x3D8212E8
XOR the remainder with 0xFFFFFFFF:
0b11000010011111011110110100010111 = 0xC27DED17
reverse bits:
0b11101000101101111011111001000011 = 0xE8B7BE43

thus the CRC-32 hash for the ASCII string 'a' is 0xE8B7BE43

EXAMPLE 2:

calculate the CRC-32 hash for the ASCII string 'ab',
using the remainder from the previous example:

inputs:
dividend: binary for 'b': 0b01100010 = 0x62 = 98
polynomial: 0b100000100110000010001110110110111 = 0x104C11DB7
remainder for string 'a': 0x3D8212E8

0b01100010 = 0x62
reverse bits in each byte:
0b01000110 = 0x46
append 32 0 bits:
0b0100011000000000000000000000000000000000 = 0x4600000000
XOR the first 4 bytes with current remainder 0x3D8212E8:
0b0111101110000010000100101110100000000000 = 0x7B8212E800

'CRC division':
0111101110000010000100101110100000000000
 100000100110000010001110110110111
 ---------------------------------
  111010101100100101010110000101110
  100000100110000010001110110110111
  ---------------------------------
   110100010101001110110001100110010
   100000100110000010001110110110111
   ---------------------------------
    101001100110011001111110100001010
    100000100110000010001110110110111
    ---------------------------------
      100100000001101111000001011110100
      100000100110000010001110110110111
      ----------------------------------
         1001001111011010011111010000110

remainder: 0b01001001111011010011111010000110 = 0x49ED3E86
XOR the remainder with 0xFFFFFFFF:
0b10110110000100101100000101111001 = 0xB612C179
reverse bits:
0b10011110100000110100100001101101 = 0x9E83486D

thus the CRC-32 hash for the ASCII string 'ab' is 0x9E83486D

EXAMPLE 3:

calculate the CRC-32 hash for the ASCII string 'abc',
using the remainder from the previous example:

inputs:
dividend: binary for 'c': 0b01100011 = 0x63 = 99
polynomial: 0b100000100110000010001110110110111 = 0x104C11DB7
remainder for string 'ab': 0x49ED3E86

0b01100011 = 0x63
reverse bits in each byte:
0b11000110 = 0xC6
append 32 0 bits:
0b1100011000000000000000000000000000000000 = 0xC600000000
XOR the first 4 bytes with current remainder 0x49ED3E86:
0b1000111111101101001111101000011000000000 = 0x8FED3E8600

'CRC division':
1000111111101101001111101000011000000000
100011111110110100111110100001100
100000100110000010001110110110111
---------------------------------
    110110001101101100000101110110000
    100000100110000010001110110110111
    ---------------------------------
     101101010111011100010110000001110
     100000100110000010001110110110111
     ---------------------------------
       110111000101111001100011011100100
       100000100110000010001110110110111
       ---------------------------------
        10111100011111011101101101010011

remainder: 0b10111100011111011101101101010011 = 0xBC7DDB53
XOR the remainder with 0xFFFFFFFF:
0b01000011100000100010010010101100 = 0x438224AC
reverse bits:
0b00110101001001000100000111000010 = 0x352441C2

thus the CRC-32 hash for the ASCII string 'abc' is 0x352441C2

EXAMPLE 4:

calculate the CRC-32 hash for the ASCII string 'ab':

inputs:
dividend: binary for 'ab': 0b0110000101100010 = 0x6162
polynomial: 0b100000100110000010001110110110111 = 0x104C11DB7

0110000101100010
reverse bits in each byte:
1000011001000110
append 32 0 bits:
100001100100011000000000000000000000000000000000
XOR the first 4 bytes with 0xFFFFFFFF:
011110011011100111111111111111110000000000000000

'CRC division':
011110011011100111111111111111110000000000000000
 111100110111001111111111111111100
 100000100110000010001110110110111
 ---------------------------------
  111000100010011011100010010010110
  100000100110000010001110110110111
  ---------------------------------
   110000001000110011011001001000010
   100000100110000010001110110110111
   ---------------------------------
    100001011101100010101111111101010
    100000100110000010001110110110111
    ---------------------------------
         111101110000010000100101110100000
         100000100110000010001110110110111
         ---------------------------------
          111010101100100101010110000101110
          100000100110000010001110110110111
          ---------------------------------
           110100010101001110110001100110010
           100000100110000010001110110110111
           ---------------------------------
            101001100110011001111110100001010
            100000100110000010001110110110111
            ---------------------------------
              100100000001101111000001011110100
              100000100110000010001110110110111
              ---------------------------------
                 1001001111011010011111010000110

remainder: 0b01001001111011010011111010000110 = 0x49ED3E86
XOR the remainder with 0xFFFFFFFF:
0b10110110000100101100000101111001 = 0xB612C179
reverse bits:
0b10011110100000110100100001101101 = 0x9E83486D

thus the CRC-32 hash for the ASCII string 'ab' is 0x9E83486D

EXAMPLE 5:

calculate the CRC-32 hash for the ASCII string 'abc':

inputs:
dividend: binary for 'abc': 0b011000010110001001100011 = 0x616263
polynomial: 0b100000100110000010001110110110111 = 0x104C11DB7

011000010110001001100011
reverse bits in each byte:
100001100100011011000110
append 32 0 bits:
10000110010001101100011000000000000000000000000000000000
XOR the first 4 bytes with 0xFFFFFFFF:
01111001101110010011100111111111000000000000000000000000

'CRC division':
01111001101110010011100111111111000000000000000000000000
 100000100110000010001110110110111
 ---------------------------------
  111000100010010111111010010010110
  100000100110000010001110110110111
  ---------------------------------
   110000001000101011101001001000010
   100000100110000010001110110110111
   ---------------------------------
    100001011101010011001111111101010
    100000100110000010001110110110111
    ---------------------------------
         111101101000100000100101110100000
         100000100110000010001110110110111
         ---------------------------------
          111010011101000101010110000101110
          100000100110000010001110110110111
          ---------------------------------
           110101110110001110110001100110010
           100000100110000010001110110110111
           ---------------------------------
            101010100000011001111110100001010
            100000100110000010001110110110111
            ---------------------------------
              101000011001101111000001011110100
              100000100110000010001110110110111
              ---------------------------------
                100011111110110100111110100001100
                100000100110000010001110110110111
                ---------------------------------
                    110110001101101100000101110110000
                    100000100110000010001110110110111
                    ---------------------------------
                     101101010111011100010110000001110
                     100000100110000010001110110110111
                     ---------------------------------
                       110111000101111001100011011100100
                       100000100110000010001110110110111
                       ---------------------------------
                        10111100011111011101101101010011

remainder: 0b10111100011111011101101101010011 = 0xBC7DDB53
XOR the remainder with 0xFFFFFFFF:
0b01000011100000100010010010101100 = 0x438224AC
reverse bits:
0b00110101001001000100000111000010 = 0x352441C2

thus the CRC-32 hash for the ASCII string 'abc' is 0x352441C2

==================================================
==================================================

Some scripts/functions for 'CRC division' and calculating CRC-32 hashes:

Code: Select all

;==================================================

;'CRC DIVISION' SCRIPT

q:: ;examples of 'CRC division'
;0b11010110110000 'divided by' 0b10011 gives 0b1100001010 remainder 0b1110.
;Equivalent to: 13744 'divided by' 19 gives 778 remainder 14.
;example from:
;crc_v3.txt
;http://www.ross.net/crc/download/crc_v3.txt
vDividend := 13744 ;0b11010110110000 = 0x35B0
vDivisor := 19 ;0b10011 = 0x13

;0b1100001000000000 'divided by' 0b100011101 gives remainder 0b1111.
;Equivalent to: 49664 'divided by' 285 gives remainder 15.
;example from:
;Sunshine's Homepage - Understanding CRC
;http://www.sunshine2k.de/articles/coding/crc/understanding_crc.html
;vDividend := 49664 ;0b1100001000000000 = 0xC200
;vDivisor := 285 ;0b100011101 = 0x11D

vWidth := Floor(Log(vDivisor)/Log(2))
vCeil := 2**vWidth

vRem := 0
vQuotient := 0
vCount := Floor(Log(vDividend)/Log(2))
Loop, % vCount+1
{
	vRem := (vRem << 1) + !!(2**(vCount+1-A_Index) & vDividend)
	if (vRem & vCeil)
		vRem ^= vDivisor, vQuotient += 2**(vCount+1-A_Index)
}
MsgBox, % vQuotient " r " vRem "`r`n" Format("0x{:X} r 0x{:X}", vQuotient, vRem)
return

;==================================================

;CRC-32 CALCULATION FUNCTIONS (MANUAL / DLLCALL)

w:: ;get CRC-32 hash for ANSI text (ASCII text)
vText := "abcdefghijklmnopqrstuvwxyz" ;0x4C2750BD
;vText := "ABCDEFGHIJKLMNOPQRSTUVWXYZ" ;0xABF77822
;vText := "123456789" ;0xCBF43926

VarSetCapacity(vTemp, StrLen(vText)+1, 0)
StrPut(vText, &vTemp, "CP0")
;StrPut(vText, &vTemp, "CP1252")
MsgBox, % JEE_HashCrc32(&vTemp, StrLen(vText))
MsgBox, % JEE_HashCrc32Manual(&vTemp, StrLen(vText))
return

JEE_HashCrc32(vAddr, vSize)
{
	vHash := DllCall("ntdll\RtlComputeCrc32", UInt,0, Ptr,vAddr, Int,vSize, UInt)
	return Format("0x{:08X}", vHash)
}

JEE_HashCrc32Manual(vAddr, vSize)
{
	vHash := 0xFFFFFFFF
	Loop, % vSize
	{
		vByte := NumGet(vAddr+A_Index-1, 0, "UChar")
		vHash := vHash ^ vByte
		Loop, 8
			vHash := (vHash & 1) ? (vHash >> 1) ^ 0xEDB88320 : (vHash >> 1)
	}
	return Format("0x{:08X}", ~vHash)
}

;==================================================
==================================================

BEST LINKS:

crc_v3.txt
http://www.ross.net/crc/download/crc_v3.txt
get CRC-32 hash value, read file in chunks - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=5&t=31329
CRC-32 - Rosetta Code
https://rosettacode.org/wiki/CRC-32#AutoHotkey
CRC32 algorithm/implementation in C without a look up table and with a public license - Stack Overflow
https://stackoverflow.com/questions/210 ... -public-li
crc.c.txt
http://www.hackersdelight.org/hdcodetxt/crc.c.txt
How to calculate CRC32 by hand? – An Integrated World
https://www.anintegratedworld.com/how-t ... 2-by-hand/

==================================================

LINKS:

[example calculations]
crc_v3.txt
http://www.ross.net/crc/download/crc_v3.txt
Sunshine's Homepage - Understanding CRC
http://www.sunshine2k.de/articles/codin ... g_crc.html
How to calculate CRC32 by hand? – An Integrated World
https://www.anintegratedworld.com/how-t ... 2-by-hand/

[dll function]
Wine API: RtlComputeCrc32
https://source.winehq.org/WineAPI/RtlComputeCrc32.html

[algorithms / AutoHotkey code]
get CRC-32 hash value, read file in chunks - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=5&t=31329
CRC-32 - Rosetta Code
https://rosettacode.org/wiki/CRC-32#AutoHotkey
AHK_Scripts/CRC32.ahk at master · jNizM/AHK_Scripts · GitHub
https://github.com/jNizM/AHK_Scripts/bl ... C32.ahk#L2
CRC32 algorithm/implementation in C without a look up table and with a public license - Stack Overflow
https://stackoverflow.com/questions/210 ... -public-li
crc.c.txt
http://www.hackersdelight.org/hdcodetxt/crc.c.txt
[pdf appendix contains algorithms]
SAR-PR-2006-05.pdf
http://stigge.org/martin/pub/SAR-PR-2006-05.pdf
Calculating 32-bit CRCs (CRC-32)
http://mdfs.net/Info/Comp/Comms/CRC32.htm

[online solver]
On-line CRC calculation and free library
https://www.lammertbies.nl/comm/info/cr ... ation.html

[other hashes: information about other hashing algorithms e.g. MD5 and SHA-1]
MD5 - Scripts and Functions - AutoHotkey Community
https://autohotkey.com/board/topic/16409-md5/
SHA: Secure Hashing Algorithm - Computerphile - YouTube
https://www.youtube.com/watch?v=DMtFhACPnTY
EXTRA BITS - SHA1 Problems - Computerphile - YouTube
https://www.youtube.com/watch?v=f8ZP_1K2Y-U
Hashing Algorithms and Security - Computerphile - YouTube
https://www.youtube.com/watch?v=b4b8ktEV4Bg

[Stack Exchange discussions]
c - How is a CRC32 checksum calculated? - Stack Overflow
https://stackoverflow.com/questions/258 ... calculated
algorithm design - How do I calculate CRC32 mathematically? - Cryptography Stack Exchange
https://crypto.stackexchange.com/questi ... ematically
c# - Is there a built-in function to reverse bit order - Stack Overflow
https://stackoverflow.com/questions/358 ... -bit-order

==================================================

SOME OTHER 'HOW DOES IT WORK?' QUESTIONS:

How does a PC work?
Charles Petzold - Code

How do txt files work?
ANSI: 1 byte per character (0-255).
UTF-8: 1 byte per character (0-127, ASCII). 2 bytes per character (128-2047). 3 bytes per character (4096-65535).
UTF-16: 2 bytes per character (0-65535).
Note: where ANSI 128-255 ('extended ASCII') is often similar to, but not exactly the same as, Unicode 128-255, and where ANSI 128-255 varies depending on the codepage.

How do bmp files work?
graphics: create bmp files from scratch - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=6&t=34952

How do wav files work?
create wav files from scratch (draw a wav file) - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=6&t=34584

How do file hashes work?
CRC-32 hash tutorial - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=7&t=35671

Return to “Tips and Tricks”

Who is online

Users browsing this forum: No registered users and 3 guests