Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles
(untagged)

Hashcash or Proof of Work

0.00/5 (No votes)
24 Feb 2017 1  
Hashcash is a proof-of-work system used to limit email spam and denial-of-service attacks, and more recently has become known for its use in bitcoin (and other cryptocurrencies) as part of the mining algorithm.

Introduction

"Hashcash is a proof-of-work system used to limit email spam and denial-of-service attacks, and more recently has become known for its use in bitcoin (and other cryptocurrencies) as part of the mining algorithm. Hashcash was proposed in March 1997 by Adam Back." (wikipedia) You can read Adam Back's paper here.

The idea is that a message, like an email, "proves" that it is a legitimate message by including hashing some string in such a manner that it proves that a computer spent some time/energy on a particular algorithm -- in particular, computing a SHA-1 hash such that the first 20 bits of the hash are 0. Because this takes a certain amount of computational time to find such a qualifying hash through brute force, it costs the sender a small amount to find the hash, which is seen as prohibitive for spammers that send large number of emails. A hashcash can be viewed as "a white-listing hint to help hashcash users avoid losing email due to content based and blacklist based anti-spam devices." (hashcash.org)

This "proof of work" concept is primarily used nowadays as the bitcoin mining function. These "act as a vote in the blockchain evolution and validate the blockchain transaction log." Or, to put it another way: "Bitcoin uses Hashcash to provide security from malicious alterations of the Blockchain, by imposing a cost for alteration that a miner must hope to recoup through rewards given for cooperation... In Bitcoin, the difficulty of the Hashcash problem is varied over time depending on the recent history of solution times, targeting a ten minute solution on average." (The Book of Bitcoin)

Other Implementations

hashcash.org has a link to a C# implementation on SourceForge. However, in my testing of this algorithm, there are some bugs. A small bug is in the date stamp:

string stampDate = date.ToString("yymmdd");

Oops, that's year - minute - day format!

A more significant bug is that the resulting header frequently does not verify with:

SHA1CryptoServiceProvider sha = new SHA1CryptoServiceProvider();
byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(header));

It turns out that the resulting hash often has only the first 16 or 18 bits set to 0, and I believe this is the result of an algorithmic problem in how the base64 value is computed with regards to completing the octet.

Algorithm

A hashcash header has the following fields (wikipedia):

  • version: (currently 1)
  • bits: the number of leading bits that are 0
  • timestamp: a date/time stamp (time is optional)
  • resource: the data string being transmitted, for example, an IP address, email address, or other data
  • extension: ignored in version 1
  • random seed: base-64 encoded random set of characters
  • counter: base-64 encoded binary counter between 0 and 220, (1,048,576)

If you code this, there are a few questions that come up and a flaw in the algorithm.

  1. How many characters should the random seed be?
  2. When encoding the binary counter, should it be encoded big or little endian? Should leading zeros (big endian) or trailing zeros (little endian) be excluded when converting an integer (4 bytes) to a byte array?
  3. A more important issue is that many cases do not have a solution with a maximum counter value of 220. I have seen a counter value of 8,069,934 (0x7B232E) required to come to a solution.

My revised algorithm is:

  • The random seed is 8 characters.
  • The counter starts at int.MinValue() and increments until a solution is found.
  • The counter is converted to base64 from the 4 little endian bytes representing the integer.
  • If the counter reaches int.MaxValue(), an exception is thrown.

Implementation

I certainly don't suggest that this algorithm is written efficiently, but then again, since it was meant to consume CPU cycles, I'm not particularly concerned about that.

Verification

Let's look first at how the header is verified:

public class HashCash
{
  public static bool Verify(string header)
  {
    // We assume the bits that are going to be 0 are going to be between 10 and 99.
    int zbits = int.Parse(header.Substring(2, 2));
    int bytesToCheck = zbits / 8;
    int remainderBitsToCheck = zbits % 8;
    byte[] zArray = Enumerable.Repeat((byte)0x00, bytesToCheck).ToArray();
    byte remainderMask = (byte)(0xFF << (8 - remainderBitsToCheck));
    SHA1CryptoServiceProvider sha = new SHA1CryptoServiceProvider();
    byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(header));

    return hash.Take(bytesToCheck).SequenceEqual(zArray) && ((hash[bytesToCheck] & remainderMask) == 0);
  }
}

There are other ways to skin this cat, for example using a BitArray, but the above is the implementation that I chose.

We can verify that header example on the wikipedia page like this:

var check = HashCash.Verify("1:20:1303030600:adam@cypherspace.org::McMybZIhxKXu57jd:ckvi");
Console.WriteLine(check ? "Passed Verification" : "Failed Verification");

This passes. Because it passes, we can have a certain degree of trust that the message is real. Further validation can be done to improve the validity of the message:

  • The number of zero bits used to compute the hash.
  • The timestamp is within an acceptable range.
  • The random seed is unique (not re-used).

All of this helps to white-list the message.

Initialization

A few constructors offer some ways of initializing the header:

public HashCash(string resource, int zbits = 20)
{
  rand = GetRandomAlphaNumeric();
  this.msgDate = DateTime.Now;
  this.resource = resource;
  this.zbits = zbits;
  Initialize();
}

public HashCash(DateTime msgDate, string resource, int zbits = 20)
{
  rand = GetRandomAlphaNumeric();
  this.msgDate = msgDate;
  this.resource = resource;
  this.zbits = zbits;
  Initialize();
}

public HashCash(DateTime msgDate, string resource, string rand, int zbits = 20)
{
  this.rand = rand;
  this.msgDate = msgDate;
  this.resource = resource;
  this.zbits = zbits;
  Initialize();
}

If you don't provide the randomized seed, one is computed for you:

public string GetRandomAlphaNumeric(int len = 8)
{
  var chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

  return new String(chars.Select(c => chars[rnd.Next(chars.Length)]).Take(len).ToArray());
}

Internally, some values that are used all the time are computed:

private void Initialize()
{
  counter = 0;
  sha = new SHA1CryptoServiceProvider();
  bytesToCheck = zbits / 8;
  remainderBitsToCheck = zbits % 8;
  zArray = Enumerable.Repeat((byte)0x00, bytesToCheck).ToArray();
  remainderMask = (byte)(0xFF << (8 - remainderBitsToCheck));
}

Testing a Header

Once we've constructed the header, testing it involves verifying that the first n bits are 0:

private bool AcceptableHeader(string header)
{
  byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(header));

  return hash.Take(bytesToCheck).SequenceEqual(zArray) && ((hash[bytesToCheck] & remainderMask) == 0);
}

Computing the Header

This involves constructing the header and for each failure, incrementing the counter until the hashed header passes the bit test:

public string Compute()
{
  string[] headerParts = new string[]
  {
    "1",
    zbits.ToString(),
    msgDate.ToString("yyMMddhhmmss"),
    resource,
    "",
    Convert.ToBase64String(Encoding.UTF8.GetBytes(rand)),
    Convert.ToBase64String(BitConverter.GetBytes(counter))
  };

  string ret = String.Join(":", headerParts);
  counter = int.MinValue;
  Iterations = 0;

  while (!AcceptableHeader(ret))
  {
    headerParts[COUNTER_IDX] = Convert.ToBase64String(BitConverter.GetBytes(counter));
    ret = String.Join(":", headerParts);

    // Failed 
    if (counter == int.MaxValue)
    {
      throw new HashCashException("Failed to find solution.");
    }

    ++counter;
    ++Iterations;
  }

  return ret;
}

Testing

I put together a simple test that performs the "proof of work" 100 times:

static void TestHashCash()
{
  var check = HashCash.Verify("1:20:1303030600:adam@cypherspace.org::McMybZIhxKXu57jd:ckvi");
  Console.WriteLine(check ? "Passed Verification" : "Failed Verification");

  int totalTime = 0;

  for (int i = 0; i < iterations; i++)
  {
    try
    {
      HashCash hc = new HashCash("foo.bar@foobar.com");
      DateTime start = DateTime.Now;
      string header = hc.Compute();
      DateTime stop = DateTime.Now;
      bool ret = HashCash.Verify(header);

      if (!ret)
      {
        throw new HashCashException("Verification failed.");
      }

      int ms = (int)((stop - start).TotalMilliseconds);
      Console.WriteLine(i + "-> Time: " + ms + "ms Iterations = " + hc.Iterations);
      totalTime += ms;
    }
    catch (HashCashException ex)
    {
      Console.WriteLine(ex.Message);
      break;
    }
  }

  Console.WriteLine("Average time: " + (int)(totalTime / iterations) + "ms");
}

Example output (the last 19 iterations):

It certainly takes on average more than one second to compute an acceptable hash!

Conclusion

I find this to be a really interesting -- it's sort of the opposite of captcha. A hashcash verifies that the sender is a machine (no human could ever perform this computation) but that:

  1. The machine is not being used for spamming or other unsolicited messaging.
  2. The machine sending the message is authenticating the message header (this could be expanded to include the message body as well).
  3. An approach like this can be used as a throttle, or governor, to prevent even a legitimate program from overwhelming a server.
  4. This "proof of work" algorithm has been used to help prevent denial-of-service attacks.

NHashCash (the sourceforge link I posted earlier) is also included but the test for that has been commented out.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here