4-bit Encoder/Decoder

12 Feb 2019 (CPOL)
Code for a 4-bit encoder to store 15 different symbols with higher efficiency

Introduction

Converts a string of 8-bit characters into a string of 4-bit codes (at most 15 different characters allowed).

In other words: two 8-bit characters are packed into one 8-bit character.
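
The packing step can be sketched on its own as follows. This is a minimal illustration, not the article's class; the helper names pack_nibbles and unpack_nibbles are made up for this example. With the mapping table used in this article, '6' has index 7 and '7' has index 8, so the two characters "67" end up in the single byte 0x78.

```python
def pack_nibbles(high, low):
    """Pack two 4-bit values (0..15) into one byte value (0..255)."""
    if not (0 <= high <= 15 and 0 <= low <= 15):
        raise ValueError("both values must fit into 4 bits")
    return (high << 4) | low


def unpack_nibbles(byte):
    """Split one byte value back into its high and low 4-bit halves."""
    return (byte & 0xF0) >> 4, byte & 0x0F


packed = pack_nibbles(7, 8)
print(packed)                  # 120, i.e. 0x78
print(unpack_nibbles(packed))  # (7, 8)
```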

Through this conversion, strings can be stored in roughly half the size of a regular string (one 4-bit END marker is appended, and the result is padded to a whole byte). This can be useful for large amounts of data that use at most 15 different characters, such as phone numbers.
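
To make the size claim concrete, here is a small sketch of the arithmetic, assuming one END marker per string and padding to a whole byte as done in the code of this article. The function name encoded_length is hypothetical.

```python
def encoded_length(num_symbols):
    """Bytes needed for num_symbols symbols plus the END marker."""
    total_nibbles = num_symbols + 1      # one extra 4-bit value for END
    return (total_nibbles + 1) // 2      # round up to whole bytes


for n in (10, 11, 12):
    print(n, "->", encoded_length(n))
# 10 -> 6
# 11 -> 6
# 12 -> 7
```

So an 11-character phone number needs 6 bytes instead of 11, while very short strings gain less because of the END marker.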

Background

I was thinking that storing telephone numbers in a database as strings is a waste of memory. Storing them as integers is not possible either. My solution was to use an encoded string.

Using the Code

Below is the implementation of the class. At the bottom, there is a test() function that shows how to use the code.

To customize the symbols that can be represented/encoded, change Encode4Bits._mappingTable. Never use more than 15 custom values: the first entry is reserved for END, and a 4-bit index can address only 16 entries.
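
A custom table can be checked before use. The sketch below is an assumption about what a valid table looks like (END first, at most 16 entries, unique single-character symbols, '' for unused slots); the helper check_mapping_table and the example phone_table are hypothetical and not part of the class.

```python
def check_mapping_table(table):
    """Raise ValueError if table cannot serve as a 4-bit mapping table."""
    if len(table) > 16:
        raise ValueError("at most 16 entries fit (END + 15 symbols)")
    if not table or table[0] != '\0':
        raise ValueError("first entry must be the END marker")
    used = [s for s in table[1:] if s != '']   # '' marks an unused slot
    if any(len(s) != 1 for s in used):
        raise ValueError("every symbol must be a single character")
    if len(set(used)) != len(used):
        raise ValueError("symbols must be unique")


# Hypothetical table for phone numbers with '+', ' ' and brackets
phone_table = ['\0'] + list("0123456789") + ['-', '+', ' ', '(', ')']
check_mapping_table(phone_table)   # passes silently
```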

Python
class Encode4Bits:
    def __init__(self):
        # first element is always "END"; '' marks an unused slot
        self._mappingTable = ['\0',
                              '0','1','2','3','4','5','6','7','8','9',
                              '-','','','','']

    def _encodeCharacter(self,char):
        """@return index of element or None, if not exists"""
        for p in range(len(self._mappingTable)):
            if(char == self._mappingTable[p]):
                return p
        return None

    def encode(self, string):
        strLen = len(string)

        # ===== 1. map all chars to an index in our table =====
        mappingIndices = []
        for i in range(strLen):
            char = string[i]
            index = self._encodeCharacter(char)
            if(index is None):
                raise ValueError("Could not encode '" + char + "'.")
            mappingIndices.append(index)
        mappingIndices.append(0)
        
        # ===== 2. Make num values even =====
        # 4 bit => 2 chars in one byte. Therefore: need even num values
        if(len(mappingIndices) % 2 != 0):
            mappingIndices.append(0)

        # ===== 3. create string =====
        ret = ""
        for i in range(0, len(mappingIndices), 2):
            val1 = mappingIndices[i] << 4   # first symbol: high 4 bits
            val2 = mappingIndices[i+1]      # second symbol: low 4 bits
            ret += chr(val1 | val2)

        return ret

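    def decode(self, string):
        ret = ""
        for char in string:
            # high 4 bits hold the first symbol, low 4 bits the second
            index1 = (ord(char) & 0xF0) >> 4
            index2 = (ord(char) & 0x0F)
            for index in (index1, index2):
                if(index == 0):
                    return ret # reached "END": drop marker and padding
                ret += self._mappingTable[index]

        return ret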

def test():
    numberCompressor = Encode4Bits()
    encoded = numberCompressor.encode("067-845-512")
    decoded = numberCompressor.decode(encoded)
    print(len(decoded))
    print(len(encoded))


if __name__ == "__main__":
    test()
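
One caveat when running the class on Python 3: chr() returns a Unicode character, and characters above 127 take two bytes once the string is written out as UTF-8, so part of the saving can be lost. Below is a minimal sketch of a bytes-based variant, assuming Python 3 and the same symbol set as above. The names SYMBOLS, TO_INDEX, encode_to_bytes and decode_from_bytes are made up for this illustration and are not part of the class.

```python
SYMBOLS = "0123456789-"   # index 0 is reserved for END, so symbols start at 1
TO_INDEX = {ch: i + 1 for i, ch in enumerate(SYMBOLS)}


def encode_to_bytes(text):
    """Pack text into bytes, two symbols per byte, END-terminated."""
    try:
        indices = [TO_INDEX[ch] for ch in text]
    except KeyError as err:
        raise ValueError("Could not encode %r" % err.args[0]) from None
    indices.append(0)                      # END marker
    if len(indices) % 2 != 0:
        indices.append(0)                  # pad to a whole byte
    return bytes((indices[i] << 4) | indices[i + 1]
                 for i in range(0, len(indices), 2))


def decode_from_bytes(data):
    """Inverse of encode_to_bytes; stops at the first END marker."""
    out = []
    for byte in data:                      # iterating bytes yields ints
        for index in (byte >> 4, byte & 0x0F):
            if index == 0:
                return "".join(out)
            out.append(SYMBOLS[index - 1])
    return "".join(out)


packed = encode_to_bytes("067-845-512")
print(len(packed))                 # 6
print(decode_from_bytes(packed))   # 067-845-512
```

A bytes value of this kind can be stored in a binary database column as-is, which keeps the one-byte-per-two-symbols guarantee independent of any text encoding.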

History

  • 8th February, 2019: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)