Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / programming / compression

RLE: The Human Friendly Compression

5.00/5 (15 votes)
16 Apr 2022CPOL5 min read 26.4K   422  
RLE (Run Length Encoding): The Human Friendly Compression
In this article, we will take a look at RLE which is a lossless data compression algorithm.

Background

Every time one wants to save space for data storage, compression is the answer. RLE is a very simple compression algorithm.

The xPrompt Download and prg Files

xPrompt.zip contains the binary of the xHarbour Scripting Engine for Windows. This is the app needed to run the prg code.

Usage:

  • Copy prg file in the same directory as xPrompt
  • Launch xPrompt
  • Type "do FileName.prg" to run the code

xHarbour is freeware and can be downloaded at xHarbour.org[^].

xHarbour is part of the xBase family languages: xBase - Wikipedia[^].

Contents

  1. Introduction
  2. The RLE Algorithm
    1. Example
    2. RLE Translation table
    3. Encoding
    4. Decoding
  3. Conway's Game of Life
    1. Example
    2. RLE Translation table
    3. Encoding
    4. Decoding
  4. Sokoban: A Practical Case
    1. Example
    2. RLE Translation Table
    3. Encoding
    4. Decoding
  5. Points of Interest
  6. Links
  7. History

1. Introduction

RLE means 'Run Length Encoding', this compression method focuses on repeats (runs) of same value to achieve compression. This also means that it is efficient only on well suited datasets. As it is really simple, encoding and decoding can be done by hand.

2. The RLE Algorithm

RLE is a lossless data compression algorithm applied to text. The principle is to replace runs of same data by the count and the data value only once. The operation is interesting for runs of 3 or more repeats. Since we use digits to encode runs, when input include digits, a translation must be done to avoid conflict between data and counts.

RLE is interesting every time one gets large bitmaps with long runs of same data:

  • Faxes: RLE is in world wide use with faxes, because a fax is a drawing made of dots that are only black or white and with large runs of same color.
  • Games levels: Games with 2D map levels like Sokoban are good candidates because at any point in the game, only the current level is needed in uncompressed form.
  • Large dataset: Conway's Game of Life is a good example of large datsets with only two values, and dominated by dead cells.

Example

In this example, bitmap is the input, on the right is the RLE encoding for each line. Below is the full RLE. And last is the size of input and the size of RLE. The output is not optimized.

Bitmap Letter A
Bitmap                 RLE
...................... 22.$
...................... 22.$
.......xxxxxxx........ 7.7x8.$
.....xxxxxxxxxxx...... 5.11x6.$
....xxxxxxxxxxxxx..... 4.13x5.$
....xxxx......xxxx.... 4.4x6.4x4.$
...xxxx........xxx.... 3.4x8.3x4.$
...xxx.........xxx.... 3.3x9.3x4.$
...............xxx.... 15.3x4.$
..............xxxx.... 14.4x4.$
......xxxxxxxxxxxx.... 6.12x4.$
....xxxxxxxxxxxxxx.... 4.14x4.$
...xxxxxxxxx...xxx.... 3.9x3.3x4.$
..xxxxx........xxx.... 2.5x8.3x4.$
..xxx..........xxx.... 2.3x10.3x4.$
..xxx.........xxxx.... 2.3x9.4x4.$
..xxx.........xxxx.... 2.3x9.4x4.$
..xxxx......xxxxxx.... 2.4x6.6x4.$
...xxxxxxxxxxx.xxxxx.. 3.11x.5x2.$
....xxxxxxxxx...xxxx.. 4.9x3.4x2.$
.....xxxxxx.....xxxx.. 5.6x5.4x2.$
...................... 22.$
...................... 22.!
22.$22.$7.7x8.$5.11x6.$4.13x5.$4.4x6.4x4.$3.4x8.3x4.$3.3x9.3x4.$15.3x4.$14.4x
4.$6.12x4.$4.14x4.$3.9x3.3x4.$2.5x8.3x4.$2.3x10.3x4.$2.3x9.4x4.$2.3x9.4x4.$2.4x
6.6x4.$3.11x.5x2.$4.9x3.4x2.$5.6x5.4x2.$22.$22.!
Bitmap size= 506 RLE size= 204

RLE Translation table

Pattern Encoding Meaning
. (dot) . (dot) White point
x x Black point
  Digits Number of repeats
  $ End of Line
  ! End of Pattern

Encoding

In RLE.zip, program file is RLE_Enc_fax.prg, input pattern is in file Fax.txt, result is in file Fax.log.

dBase
* RLE encoding for Simple Bitmap
clear screen

/* Data file structure
pattern Name on first line
Pattern
empty line to indicate the end of pattern
*/

mdl= {}
* Read input file handle= fopen("Fax.txt")
tmp= freadln(handle, ,1024)
do while len(tmp) > 0
	tmp= left(tmp,len(tmp)-2) // remove carriage return
	aadd(mdl, tmp)
	tmp= freadln(handle, ,1024)
enddo
fclose(handle)

* process input file set printer to Fax.log
set printer on
title= .T.
for scan=1 to len(mdl)
	? mdl[scan] // output pattern name
	rle= ""
	bitmap= 0

	scan++
	do while len(mdl[scan]) != 0
		? mdl[scan] // output pattern line
		rle_line= ""
		bitmap += len(mdl[scan])
		last= mdl[scan,1]
		cnt= 1
		for scanc=2 to len(mdl[scan])
			if last= mdl[scan, scanc]
				cnt++
			else
				if cnt> 1
					rle_line += str(cnt,,, .t.)
				endif
				rle_line += last

				last= mdl[scan, scanc]
				cnt= 1
			endif
		next
		* if last != "."
			if cnt> 1
				rle_line += str(cnt,,, .t.)
			endif
			rle_line += last
		* endif
		scan++
		if len(mdl[scan]) = 0
			rle_line += "!"
		else
			rle_line += "$"
		endif
		?? " "
		?? rle_line // output line encoding
		rle += rle_line

	enddo
	? rle // output full RLE
	? "Bitmap size= "+str(bitmap,,, .t.)
	?? " RLE size= "+str(len(rle),,, .t.)
	?

next
set printer off
set printer to
return

Decoding

In RLE.zip, program file is RLE_Dec_fax.prg.

dBase
* RLE decoding for simple Bitmap
clear screen

pattern= "22.$22.$7.7x8.$5.11x6.$4.13x5.$4.4x6.4x4.$3.4x8.3x4.$3.3x9.3x
4.$15.3x4.$14.4x4.$6.12x4.$4.14x4.$3.9x3.3x4.$2.5x8.3x4.$2.3x10.3x
4.$2.3x9.4x4.$2.3x9.4x4.$2.4x6.6x4.$3.11x.5x2.$4.9x3.4x2.$5.6x5.4x2.$22.$22.!"

* process RLE pattern cnt=0
for scan=1 to len(pattern)
	char= pattern[scan]
	if isdigit(char)
		cnt= cnt*10+ val(char)
	else
		if char= "$" .or. char= "!"
			? // next line
		else
			if cnt= 0
				cnt=1
			endif
			for rep= 1 to cnt
				? char
			next
		endif
		cnt= 0
	endif
next
?

return

3. Conway's Game of Life

GoL patterns fit very well with RLE compression because datasets are large, a cell is either dead or alive, and dead cells dominate datasets.

Run Length Encoded - LifeWiki[^]

Hundred of RLE compressed patterns can be found at Category:Patterns - LifeWiki[^].

Example

2-engine Cordership
...................OO.................... 19b2O$
...................OOOO.................. 19b4O$
...................O.OO.................. 19bOb2O$
......................................... $
....................O.................... 20bO$
...................OO.................... 19b2O$
...................OOO................... 19b3O$
.....................O................... 21bO$
.................................OO...... 33b2O$
.................................OO...... 33b2O$
......................................... $
......................................... $
......................................... $
......................................... $
......................................... $
......................................... $
....................................O.... 36bO$
...................................OO.... 35b2O$
..................................O...O.. 34bO3bO$
...................................OO..O. 35b2O2bO$
........................................O 40bO$
.....................................O.O. 37bObO$
......................................O.. 38bO$
......................................O.. 38bO$
......................................OO. 38b2O$
......................................OO. 38b2O$
......................................... $
......................................... $
.............O..........O................ 13bO10bO$
............OOOOO.....O.OO...........O... 12b5O5bOb2O11bO$
...........O..........O...O.........O.... 11bO10bO3bO9bO$
............OO........OOO.O.........OO... 12b2O8b3ObO9b2O$
.............OO.........OO............O.. 13b2O9b2O12bO$
OO.............O.....................OOO. 2O13bO21b3O$
OO...................................OOO. 2O35b3O$
......................................... $
......................................... $
......................................... $
......................................... $
......................................... $
......................................... $
........OO............................... 8b2O$
........OO...........OO.................. 8b2O11b2O$
...................OO..O................. 19b2O2bO$
........................O...O............ 24bO3bO$
..................O.....O...O............ 18bO5bO3bO$
...................O..OO...O.O........... 19bO2b2O3bObO$
....................OOO.....O............ 20b3O5bO$
............................O............ 28bO!
19b2O$19b4O$19bOb2O$$20bO$19b2O$19b3O$21bO$33b2O$33b2O$$$$$$$36bO$35b2O$3
4bO3bO$35b2O2bO$40bO$37bObO$38bO$38bO$38b2O$38b2O$$$13bO10bO$12b5O5bOb2O1
1bO$11bO10bO3bO9bO$12b2O8b3ObO9b2O$13b2O9b2O12bO$2O13bO21b3O$2O35b3O$$$$$
$$8b2O$8b2O11b2O$19b2O2bO$24bO3bO$18bO5bO3bO$19bO2b2O3bObO$20b3O5bO$28bO!
Bitmap size= 2009 RLE size= 292

Result without end of line optimization. Pay attention to the difference in RLE size.

19b2O20b$19b4O18b$19bOb2O18b$41b$20bO20b$19b2O20b$19b3O19b$21bO19b$33b2O6b$
33b2O6b$41b$41b$41b$41b$41b$41b$36bO4b$35b2O4b$34bO3bO2b$35b2O2bOb$40bO$37b
ObOb$38bO2b$38bO2b$38b2Ob$38b2Ob$41b$41b$13bO10bO16b$12b5O5bOb2O11bO3b$11bO
10bO3bO9bO4b$12b2O8b3ObO9b2O3b$13b2O9b2O12bO2b$2O13bO21b3Ob$2O35b3Ob$41b$41
b$41b$41b$41b$41b$8b2O31b$8b2O11b2O18b$19b2O2bO17b$24bO3bO12b$18bO5bO3bO12b
$19bO2b2O3bObO11b$20b3O5bO12b$28bO12b!
Bitmap size= 2009 RLE size= 413

RLE Translation Table

Pattern Encoding Meaning
. (dot) b Dead cell
O o Alive cell
  Digits Number of repeats
  $ End of Line
  ! End of Pattern

Since playfield defaults to dead cells, dead cells ending a line can be removed.

Encoding

In RLE.zip, program file is RLE_Enc_gol.prg, input pattern is in file Gol.txt, result is in file Gol.log.

dBase
* RLE encoding for Conway's Game of Life
clear screen

/* Data file structure
GoL pattern Name on first line
Pattern
empty line to indicate the end of pattern
*/

mdl= {}
* Read input file handle= fopen("gol.txt")
tmp= freadln(handle, ,1024)
do while len(tmp) > 0
	tmp= left(tmp,len(tmp)-2) // remove carriage return
	aadd(mdl, tmp)
	tmp= freadln(handle, ,1024)
enddo
fclose(handle)

* process input file set printer to GoL.log
set printer on
title= .T.
for scan=1 to len(mdl)
	? mdl[scan] // pattern name
	rle= ""
	bitmap= 0

	scan++
	do while len(mdl[scan]) != 0
		? mdl[scan] // output pattern line
		rle_line= ""
		bitmap += len(mdl[scan])
		last= mdl[scan,1]
		cnt= 1
		for scanc=2 to len(mdl[scan])
			if last= mdl[scan, scanc]
				cnt++
			else
				if cnt> 1
					rle_line += str(cnt,,, .t.)
				endif
				if last= "."
					rle_line += "b" // translation
				else
					rle_line += last
				endif

				last= mdl[scan, scanc]
				cnt= 1
			endif
		next
		if last != "."  // EOL optimization
			if cnt> 1
				rle_line += str(cnt,,, .t.)
			endif
			if last= "."
				rle_line += "b" // translation
			else
				rle_line += last
			endif
		endif
		scan++
		if len(mdl[scan]) = 0
			rle_line += "!"
		else
			rle_line += "$"
		endif
		?? " "
		?? rle_line // output line encoding
		rle += rle_line

	enddo
	? rle // output full RLE
	? "Bitmap size= "+str(bitmap,,, .t.)
	?? " RLE size= "+str(len(rle),,, .t.)
	?

next
set printer off
set printer to
return

Decoding

In RLE.zip, program file is RLE_Dec_Gol.prg and Gol.cpp.

dBase
* RLE decoding for simple Bitmap
clear screen

pattern= "19b2O$19b4O$19bOb2O$$20bO$19b2O$19b3O$21bO$33b2O$33b2O$$$$$$$36bO
$35b2O$34bO3bO$35b2O2bO$40bO$37bObO$38bO$38bO$38b2O$38b2O$$$13bO10bO$12b5O5
bOb2O11bO$11bO10bO3bO9bO$12b2O8b3ObO9b2O$13b2O9b2O12bO$2O13bO21b3O$2O35b3O$
$$$$$$8b2O$8b2O11b2O$19b2O2bO$24bO3bO$18bO5bO3bO$19bO2b2O3bObO$20b3O5bO$28bO!"

* process RLE pattern cnt= 0
for scan=1 to len(pattern)
	char= pattern[scan]
	if isdigit(char)
		cnt= cnt*10+ val(char)
	else
		if char= "$" .or. char= "!"
			? // next line
		else
			if cnt= 0
				cnt=1
			endif
			if char= "b"
				char= "."
			endif
			for rep= 1 to cnt
				?? char
			next
		endif
		cnt= 0
	endif
next
?

return
BASIC
' Sample program for HP Prime calculator
RLEMap (px, py, RLE)
' Game of Life RLE decoding
BEGIN
  LOCAL Row, Col, Ptr, Cnt, Cod;
  Row:= py; Col:= px+2; Cnt:= 0;
  FOR Ptr FROM 1 TO DIM(RLE) DO
    Cod:= MID(RLE, Ptr, 1);
    CASE
      IF INSTRING("0123456789", Cod) <> 0 THEN Cnt:= Cnt*10+EXPR(Cod); END;
      IF Cod == "$" THEN Row:= Row+MAX(1, Cnt); Col:= px+2; Cnt:= 0; END;
      IF INSTRING("b.", Cod) <> 0 THEN Col:= Col+MAX(1, Cnt); Cnt:= 0; END;
      IF INSTRING("oA", Cod) <> 0 THEN
        FOR Col FROM Col TO Col+MAX(1, Cnt)-1 DO
          Lignes(IP(Row/60)+1, Col):= BITOR(Lignes(IP(Row/60)+1, Col), 
          BITSL(#2,(Row MOD 60) ));
        END;
        Cnt:= 0;
      END;
    END;
  END;
  LDisp();
END;
C++
// A RLE decoder for GoL in C.
void RLEMap(int px, int py, string RLE) {

	unsigned int Row, Col, Ptr, Cnt;
	char Cod;
	Row = py + 1;
	Col = px + 1 + 2;
	Cnt = 0;
	for (Ptr = 0; Ptr < RLE.length(); Ptr++) {
		Cod = RLE[Ptr];
		switch (Cod) {
		case '0':
		case '1':
		case '2':
		case '3':
		case '4':
		case '5':
		case '6':
		case '7':
		case '8':
		case '9':
			Cnt = Cnt * 10 + (Cod - '0');
			break;
		case '$':
			if (Cnt == 0)
				Cnt = 1;
			Row = Row + Cnt;
			Col = px + 1 + 2;
			Cnt = 0;
			break;
		case 'b':
		case '.':
			if (Cnt == 0)
				Cnt = 1;
			Col = Col + Cnt;
			Cnt = 0;
			break;
		case 'o':
		case 'O':
			if (Cnt == 0)
				Cnt = 1;
			for (; Cnt > 0; Cnt--) {
				CellSet(Row, Col);
				Col++;
			}
			Cnt = 0;
		}
	}
}

4. Sokoban: A Practical Case

Back in 2013-2014, I was involved in 'HP Prime Graphing Calculator' beta testing. A short time after the calculator came out, a member of a HP fans forum published a Sakoban game for the calculator.

In addition to add the possibility to use the keyboard (original version is touch screen only), I compressed all 40 levels using RLE.

The original source code SokobanBeta1.03.prime size was 112k (UTF16). My version SokobanBeta1.5.prime size was 52k (UTF16). Roughly half the size.

Example

Level 1-1 as matrix                          RLE
[[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]   $                   
,[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]   $                   
,[0,0,1,1,1,1,0,0,0,0,1,1,1,0,0,0,0,0,0,0]   2b4w4b3w$           
,[0,0,1,3,0,1,1,1,0,0,1,4,1,0,0,0,0,0,0,0]   2bwdb3w2bwsw$       
,[0,0,1,3,0,0,0,1,1,1,1,0,1,1,1,1,0,0,0,0]   2bwd3b4wb4w$        
,[0,0,1,3,0,0,0,0,0,1,1,0,1,0,0,1,0,0,0,0]   2bwd5b2wbw2bw$      
,[0,0,1,1,1,0,1,0,0,0,0,0,0,0,2,1,0,0,0,0]   2b3wbw7bcw$         
,[0,0,0,0,1,0,0,0,1,0,1,1,1,1,0,1,1,0,0,0]   4bw3bwb4wb2w$       
,[0,0,0,1,1,0,1,0,1,0,0,0,0,0,0,0,1,0,0,0]   3b2wbwbw7bw$        
,[0,0,0,1,0,2,1,0,1,0,1,0,2,0,0,0,1,0,0,0]   3bwbcwbwbwbc3bw$    
,[0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0]   3bw10b3w$           
,[0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0]   3b12w$              
,[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]   $                   
,[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]   $                   
,[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]]  !                   
$$2b4w4b3w$2bwdb3w2bwsw$2bwd3b4wb4w$2bwd5b2wbw2bw$2b3wbw7bcw$4bw3bwb4wb2w$
3b2wbwbw7bw$3bwbcwbwbwbc3bw$3bw10b3w$3b12w$$$!
Matrix Size= 20* 15= 300  RLE Size= 120
Matrix Size in source code= 630

RLE Translation Table

Pattern Encoding Meaning
0 b Background
1 w Wall
2 c Cube (Parcel)
3 d Destination (Target)
4 s Sokoban
  Digits Number of repeats
  $ End of Line
  ! End of Pattern

Encoding

This function encodes a Sokoban level. Language is hpppc: Hewlett-Packard Prime Programming Language, a basic.

BASIC
RLEEnc (Mt) // RLE encoder
BEGIN
  LOCAL Row, Col, Cod, Cnt, Tmp, Sz;
  Tmp:= ""; Sz:=SIZE(Mt);
  FOR Row FROM 1 TO Sz(1) DO
    Cnt:= 1; Cod:= Mt(Row,1);
    FOR Col FROM 2 TO Sz(2) DO
      IF Cod == Mt(Row,Col) THEN
        Cnt:= Cnt+1;
      ELSE
        IF Cnt <> 1 THEN Tmp:= Tmp+ STRING(Cnt,2,0); END;
        Tmp:= Tmp+ MID("bwcds",Cod+1,1);
        Cnt:= 1; Cod:= Mt(Row,Col);
      END;
    END;
    IF Cod <> 0 THEN
      IF Cnt <> 1 THEN Tmp:= Tmp+ STRING(Cnt,2,0); END;
      Tmp:= Tmp+ MID("bwcds",Cod+1,1);
    END;
    Tmp:= Tmp+ "$";
  END;
  RETURN Tmp;
END;

As this is calculator programming, there is no easy way to do terminal output for encoding. I needed a complementary function to get around this problem.

Encoding function is internal to the program (it is invisible outside of the progam. So I created an exported function which is callable from user calculator interface.

The interest is that the result in user interface can be copy/pasted outside of the emulator and thus used as an easy way to paste the result in source code.

BASIC
// Apply RLE compression to all levels
// remove drawscreen() before converting
EXPORT Sok_RLE()
BEGIN
  LOCAL Lst:={};
  FOR NAJ FROM 10 TO 49 DO
    loadsubnivel();
    IF TYPE(matriz) == 2 THEN
      Lst:= CONCAT(Lst, {{NAJ, matriz}});
    ELSE
      Lst:= CONCAT(Lst, {{NAJ, RLEEnc (matriz)}});
    END;
  END;
  RETURN Lst;
END;

Decoding

Language is hpppc: Hewlett-Packard Prime Programming Language, a basic.

BASIC
RLEDec (Rows, Cols, RLE) // RLE decoder
BEGIN
  LOCAL Row, Col, Ptr, Cnt, Cod, Tmp;
  Row:= 1; Col:= 1; Cnt:= 0;
  Tmp:= MAKEMAT(0,Rows,Cols);
  FOR Ptr FROM 1 TO DIM(RLE) DO
    CASE
      IF INSTRING("0123456789", MID(RLE, Ptr, 1)) <> 0 
      THEN Cnt:= Cnt*10+EXPR(MID(RLE, Ptr, 1)); END;
      IF MID(RLE, Ptr, 1) == "$" THEN Row:= Row+MAX(1, Cnt); Col:= 1; Cnt:= 0; END;
      DEFAULT
      Cod:= INSTRING("bwcds", MID(RLE, Ptr, 1));
      FOR Col FROM Col TO Col+MAX(1, Cnt)-1 DO Tmp(Row,Col):=Cod- 1; END; Cnt:= 0;
    END;
  END;
  RETURN Tmp;
END;

5. Points of Interest

  • An easy to handle data compression algorithm

7. History

  • 21st March, 2021: First draft
  • 6th September, 2021: First version
  • 8th September, 2021: Second version, added Conway's Game of life
  • 11th September, 2021: Third version, added Sokoban example
  • 2nd January, 2022: Added reference .sok files
  • 30th January, 2022: Fixed typo
  • 16th January, 2022: Added a C decoder for Game of Life, download updated

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)