Engineering LibreTexts

12-F.22: Integrity Check & Checksums

    Integrity Checking

    File integrity monitoring (FIM) is an internal control or process that validates the integrity of operating system and application software files by comparing the current state of each file against a known, good baseline. The comparison typically involves calculating a cryptographic checksum of the file in its baseline state and comparing it with the checksum calculated from the file's current state. Other file attributes can also be used to monitor integrity.

    File integrity monitoring is generally automated by an application or process. Monitoring can be performed randomly, at a defined polling interval, or in real time.
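
The baseline-and-compare idea can be sketched with the GNU coreutils sha256sum tool. This is only a minimal illustration under stated assumptions, not a full FIM product: the file names (app.conf, baseline.sha256) and the temporary directory are hypothetical, and a real monitor would run the check on a schedule or in response to file-system events.

```shell
# Hypothetical monitored file inside a throwaway directory.
workdir=$(mktemp -d)
printf 'port=8080\n' > "$workdir/app.conf"

# 1. Record the known-good baseline ("<digest>  <file name>" lines).
sha256sum "$workdir/app.conf" > "$workdir/baseline.sha256"

# 2. Later: verify the current state against the baseline.
#    --status prints nothing; the exit code reports the result.
if sha256sum --check --status "$workdir/baseline.sha256"; then
    first_check="unchanged"
else
    first_check="modified"
fi

# 3. Simulate an unexpected change and verify again.
printf 'port=9090\n' > "$workdir/app.conf"
if sha256sum --check --status "$workdir/baseline.sha256"; then
    second_check="unchanged"
else
    second_check="modified"
fi

echo "first check:  $first_check"    # prints "first check:  unchanged"
echo "second check: $second_check"   # prints "second check: modified"
```

Note that the baseline file itself must be protected from tampering, otherwise an attacker who changes a monitored file can simply regenerate the baseline as well.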

    Changes to configurations, files and file attributes across the IT infrastructure are common, but hidden within a large volume of daily changes can be the few that impact file or configuration integrity. These changes can also reduce security posture and in some cases may be leading indicators of a breach in progress. Values monitored for unexpected changes to files or configuration items include:

    • Credentials
    • Privileges and security settings
    • Content
    • Core attributes and size
    • Hash values
    • Configuration values

    Hash Function

    A cryptographic hash function (CHF) is a mathematical algorithm that maps data of arbitrary size (often called the "message") to a bit array of a fixed size (the "hash value," "hash," or "message digest"). It is a one-way function, that is, a function which is practically infeasible to invert. Ideally, the only way to find a message that produces a given hash is to attempt a brute-force search of possible inputs to see if they produce a match, or use a rainbow table of matched hashes. Cryptographic hash functions are a basic tool of modern cryptography.

    The ideal cryptographic hash function has the following main properties:

    • It is deterministic, meaning that the same message always results in the same hash.
    • It is quick to compute the hash value for any given message.
    • It is infeasible to generate a message that yields a given hash value (i.e., to reverse the process that generated the given hash value).
    • It is infeasible to find two different messages with the same hash value.
    • A small change to a message should change the hash value so extensively that the new hash value appears uncorrelated with the old hash value (avalanche effect).
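
Two of these properties, determinism and the avalanche effect, can be observed directly from the command line. The sketch below assumes the GNU coreutils sha256sum tool is available; digest_of is a hypothetical helper name introduced only for this illustration.

```shell
# Hypothetical helper: print only the hex digest of a string.
digest_of() {
    printf '%s' "$1" | sha256sum | cut -d ' ' -f 1
}

h1=$(digest_of "abc")
h2=$(digest_of "abc")   # same message again
h3=$(digest_of "abd")   # last character changed

echo "abc -> $h1"
echo "abc -> $h2"       # identical to the line above (deterministic)
echo "abd -> $h3"       # differs from h1 in most positions (avalanche effect)
```

The digest printed for "abc" is the well-known SHA-256 test value ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad. All three digests are 64 hexadecimal characters long, regardless of the input length.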

    Cryptographic hash functions have many information-security applications, notably in digital signatures, message authentication codes (MACs), and other forms of authentication. They can also be used as ordinary hash functions, to index data in hash tables, for fingerprinting, to detect duplicate data or uniquely identify files, and as checksums to detect accidental data corruption. Indeed, in information-security contexts, cryptographic hash values are sometimes called (digital) fingerprints, checksums, or just hash values, even though all these terms stand for more general functions with rather different properties and purposes.

    The MD5 Checksum

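    The md5sum program is designed to verify data integrity using MD5 (Message Digest Algorithm 5). MD5 produces a 128-bit hash and, if used properly, can detect accidental changes to a file. Because practical collision attacks against MD5 are known, it should not be relied on to establish file authenticity against a deliberate attacker.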


    md5sum [OPTION]... [FILE]...

    Running md5sum on a file calculates the file's checksum and prints it, followed by the file name.

    pbmac@pbmac-server $ md5sum /home/pbmac/test/test.cpp
    c6779ec2960296ed9a04f08d67f64422  /home/pbmac/test/test.cpp
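
The printed line can also be saved and later fed back to md5sum with the -c (check) option, which recomputes the checksum and reports whether it still matches. The following is a small sketch under assumed names: sample.txt and sample.md5 are hypothetical files created in a temporary directory.

```shell
tmpdir=$(mktemp -d)
printf 'abc' > "$tmpdir/sample.txt"      # hypothetical test file

# md5sum prints "<digest>  <file name>"; keep only the digest part.
md5_line=$(md5sum "$tmpdir/sample.txt")
md5_digest=${md5_line%% *}
echo "$md5_digest"                       # prints 900150983cd24fb0d6963f7d28e17f72

# Save the checksum line, then verify the file against it.
md5sum "$tmpdir/sample.txt" > "$tmpdir/sample.md5"
md5sum -c "$tmpdir/sample.md5"           # reports "<file name>: OK" on a match
```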

    The sha256sum Command

    The sha256sum program is designed to verify data integrity using SHA-256 (a member of the SHA-2 family with a digest length of 256 bits). Used properly, SHA-256 hashes can confirm file integrity and, when the reference hash comes from a trusted source, authenticity. SHA-256 serves a similar purpose to MD5, an algorithm previously recommended by Ubuntu, but is less vulnerable to attack.

    Comparing hashes makes it possible to detect changes in files that would cause errors. The probability of such changes (errors) grows with the size of the file. It is a very good idea to run a SHA-256 hash comparison check when you have a file, such as an operating system install CD, that has to be 100% correct.
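
Such a comparison can be sketched as follows, assuming the GNU coreutils sha256sum tool. The file name image.iso is hypothetical and its content is a three-byte stand-in for a real download; the "published" value is simply the SHA-256 digest of that stand-in, where a real check would use the digest listed by the distributor.

```shell
tmpdir2=$(mktemp -d)
download="$tmpdir2/image.iso"        # hypothetical file name
printf 'abc' > "$download"           # stand-in for the downloaded data

# Digest as the distributor would publish it (here: SHA-256 of "abc").
published="ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"

# --check reads "<digest>  <file name>" lines, here from standard input.
if printf '%s  %s\n' "$published" "$download" | sha256sum --check --status; then
    verdict="match"
else
    verdict="MISMATCH"
fi
echo "$verdict"                      # prints "match"
```

Letting sha256sum do the comparison avoids the error-prone step of comparing two 64-character strings by eye.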

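    There are several variants of the sha command: sha1sum, sha256sum, sha384sum, and sha512sum. They are used in the same way but produce checksums of different lengths (160, 256, 384, and 512 bits respectively). Note that sha1sum implements the older SHA-1 algorithm, while the others implement members of the SHA-2 family.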
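
The difference in checksum length is easy to see by hashing the same input with each command. In this sketch, digest_bits is a hypothetical helper name; each hexadecimal character of the output encodes 4 bits.

```shell
# Hypothetical helper: digest length in bits produced by the given tool.
digest_bits() {
    hex=$(printf 'abc' | "$1" | cut -d ' ' -f 1)
    echo $(( ${#hex} * 4 ))
}

for tool in sha1sum sha256sum sha384sum sha512sum; do
    echo "$tool: $(digest_bits "$tool") bits"
done
# prints:
#   sha1sum: 160 bits
#   sha256sum: 256 bits
#   sha384sum: 384 bits
#   sha512sum: 512 bits
```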

    Adapted from:
    "md5sum Command in Linux with Examples" by msdeep14, Geeks for Geeks is licensed under CC BY-SA 4.0
    "HowToSHA256SUM" by Anthony Geoghegan, Ubuntu Community Wiki is licensed under CC BY-SA 3.0
