In this article, we will take a look at RLE which is a lossless data compression algorithm.
Background
Every time one wants to save space for data storage, compression is the answer. RLE is a very simple compression algorithm.
The xPrompt Download and prg Files
xPrompt.zip contains the binary of the xHarbour Scripting Engine for Windows. This is the app needed to run the prg code.
Usage:
- Copy prg file in the same directory as xPrompt
- Launch xPrompt
- Type "do FileName.prg" to run the code
xHarbour is freeware and can be downloaded at xHarbour.org[^].
xHarbour is part of the xBase family languages: xBase - Wikipedia[^].
Contents
- Introduction
- The RLE Algorithm
- Example
- RLE Translation table
- Encoding
- Decoding
- Conway's Game of Life
- Example
- RLE Translation table
- Encoding
- Decoding
- Sokoban: A Practical Case
- Example
- RLE Translation Table
- Encoding
- Decoding
- Points of Interest
- Links
- History
1. Introduction
RLE means 'Run Length Encoding', this compression method focuses on repeats (runs) of same value to achieve compression. This also means that it is efficient only on well suited datasets. As it is really simple, encoding and decoding can be done by hand.
2. The RLE Algorithm
RLE is a lossless data compression algorithm applied to text. The principle is to replace runs of same data by the count and the data value only once. The operation is interesting for runs of 3 or more repeats. Since we use digits to encode runs, when input include digits, a translation must be done to avoid conflict between data and counts.
RLE is interesting every time one gets large bitmaps with long runs of same data:
- Faxes: RLE is in world wide use with faxes, because a fax is a drawing made of dots that are only black or white and with large runs of same color.
- Games levels: Games with 2D map levels like Sokoban are good candidates because at any point in the game, only the current level is needed in uncompressed form.
- Large dataset: Conway's Game of Life is a good example of large datsets with only two values, and dominated by dead cells.
Example
In this example, bitmap is the input, on the right is the RLE encoding for each line. Below is the full RLE. And last is the size of input and the size of RLE. The output is not optimized.
Bitmap Letter A
Bitmap RLE
...................... 22.$
...................... 22.$
.......xxxxxxx........ 7.7x8.$
.....xxxxxxxxxxx...... 5.11x6.$
....xxxxxxxxxxxxx..... 4.13x5.$
....xxxx......xxxx.... 4.4x6.4x4.$
...xxxx........xxx.... 3.4x8.3x4.$
...xxx.........xxx.... 3.3x9.3x4.$
...............xxx.... 15.3x4.$
..............xxxx.... 14.4x4.$
......xxxxxxxxxxxx.... 6.12x4.$
....xxxxxxxxxxxxxx.... 4.14x4.$
...xxxxxxxxx...xxx.... 3.9x3.3x4.$
..xxxxx........xxx.... 2.5x8.3x4.$
..xxx..........xxx.... 2.3x10.3x4.$
..xxx.........xxxx.... 2.3x9.4x4.$
..xxx.........xxxx.... 2.3x9.4x4.$
..xxxx......xxxxxx.... 2.4x6.6x4.$
...xxxxxxxxxxx.xxxxx.. 3.11x.5x2.$
....xxxxxxxxx...xxxx.. 4.9x3.4x2.$
.....xxxxxx.....xxxx.. 5.6x5.4x2.$
...................... 22.$
...................... 22.!
22.$22.$7.7x8.$5.11x6.$4.13x5.$4.4x6.4x4.$3.4x8.3x4.$3.3x9.3x4.$15.3x4.$14.4x
4.$6.12x4.$4.14x4.$3.9x3.3x4.$2.5x8.3x4.$2.3x10.3x4.$2.3x9.4x4.$2.3x9.4x4.$2.4x
6.6x4.$3.11x.5x2.$4.9x3.4x2.$5.6x5.4x2.$22.$22.!
Bitmap size= 506 RLE size= 204
RLE Translation table
Pattern | Encoding | Meaning |
. (dot) | . (dot) | White point |
x | x | Black point |
| Digits | Number of repeats |
| $ | End of Line |
| ! | End of Pattern |
Encoding
In RLE.zip, program file is RLE_Enc_fax.prg, input pattern is in file Fax.txt, result is in file Fax.log.
clear screen
mdl= {}
handle= fopen("Fax.txt")
tmp= freadln(handle, ,1024)
do while len(tmp) > 0
tmp= left(tmp,len(tmp)-2)
aadd(mdl, tmp)
tmp= freadln(handle, ,1024)
enddo
fclose(handle)
set printer to Fax.log
set printer on
title= .T.
for scan=1 to len(mdl)
? mdl[scan]
rle= ""
bitmap= 0
scan++
do while len(mdl[scan]) != 0
? mdl[scan]
rle_line= ""
bitmap += len(mdl[scan])
last= mdl[scan,1]
cnt= 1
for scanc=2 to len(mdl[scan])
if last= mdl[scan, scanc]
cnt++
else
if cnt> 1
rle_line += str(cnt,,, .t.)
endif
rle_line += last
last= mdl[scan, scanc]
cnt= 1
endif
next
* if last != "."
if cnt> 1
rle_line += str(cnt,,, .t.)
endif
rle_line += last
* endif
scan++
if len(mdl[scan]) = 0
rle_line += "!"
else
rle_line += "$"
endif
?? " "
?? rle_line
rle += rle_line
enddo
? rle
? "Bitmap size= "+str(bitmap,,, .t.)
?? " RLE size= "+str(len(rle),,, .t.)
?
next
set printer off
set printer to
return
Decoding
In RLE.zip, program file is RLE_Dec_fax.prg.
clear screen
pattern= "22.$22.$7.7x8.$5.11x6.$4.13x5.$4.4x6.4x4.$3.4x8.3x4.$3.3x9.3x
4.$15.3x4.$14.4x4.$6.12x4.$4.14x4.$3.9x3.3x4.$2.5x8.3x4.$2.3x10.3x
4.$2.3x9.4x4.$2.3x9.4x4.$2.4x6.6x4.$3.11x.5x2.$4.9x3.4x2.$5.6x5.4x2.$22.$22.!"
cnt=0
for scan=1 to len(pattern)
char= pattern[scan]
if isdigit(char)
cnt= cnt*10+ val(char)
else
if char= "$" .or. char= "!"
?
else
if cnt= 0
cnt=1
endif
for rep= 1 to cnt
? char
next
endif
cnt= 0
endif
next
?
return
3. Conway's Game of Life
GoL patterns fit very well with RLE compression because datasets are large, a cell is either dead or alive, and dead cells dominate datasets.
Run Length Encoded - LifeWiki[^]
Hundred of RLE compressed patterns can be found at Category:Patterns - LifeWiki[^].
Example
2-engine Cordership
...................OO.................... 19b2O$
...................OOOO.................. 19b4O$
...................O.OO.................. 19bOb2O$
......................................... $
....................O.................... 20bO$
...................OO.................... 19b2O$
...................OOO................... 19b3O$
.....................O................... 21bO$
.................................OO...... 33b2O$
.................................OO...... 33b2O$
......................................... $
......................................... $
......................................... $
......................................... $
......................................... $
......................................... $
....................................O.... 36bO$
...................................OO.... 35b2O$
..................................O...O.. 34bO3bO$
...................................OO..O. 35b2O2bO$
........................................O 40bO$
.....................................O.O. 37bObO$
......................................O.. 38bO$
......................................O.. 38bO$
......................................OO. 38b2O$
......................................OO. 38b2O$
......................................... $
......................................... $
.............O..........O................ 13bO10bO$
............OOOOO.....O.OO...........O... 12b5O5bOb2O11bO$
...........O..........O...O.........O.... 11bO10bO3bO9bO$
............OO........OOO.O.........OO... 12b2O8b3ObO9b2O$
.............OO.........OO............O.. 13b2O9b2O12bO$
OO.............O.....................OOO. 2O13bO21b3O$
OO...................................OOO. 2O35b3O$
......................................... $
......................................... $
......................................... $
......................................... $
......................................... $
......................................... $
........OO............................... 8b2O$
........OO...........OO.................. 8b2O11b2O$
...................OO..O................. 19b2O2bO$
........................O...O............ 24bO3bO$
..................O.....O...O............ 18bO5bO3bO$
...................O..OO...O.O........... 19bO2b2O3bObO$
....................OOO.....O............ 20b3O5bO$
............................O............ 28bO!
19b2O$19b4O$19bOb2O$$20bO$19b2O$19b3O$21bO$33b2O$33b2O$$$$$$$36bO$35b2O$3
4bO3bO$35b2O2bO$40bO$37bObO$38bO$38bO$38b2O$38b2O$$$13bO10bO$12b5O5bOb2O1
1bO$11bO10bO3bO9bO$12b2O8b3ObO9b2O$13b2O9b2O12bO$2O13bO21b3O$2O35b3O$$$$$
$$8b2O$8b2O11b2O$19b2O2bO$24bO3bO$18bO5bO3bO$19bO2b2O3bObO$20b3O5bO$28bO!
Bitmap size= 2009 RLE size= 292
Result without end of line optimization. Pay attention to the difference in RLE size.
19b2O20b$19b4O18b$19bOb2O18b$41b$20bO20b$19b2O20b$19b3O19b$21bO19b$33b2O6b$
33b2O6b$41b$41b$41b$41b$41b$41b$36bO4b$35b2O4b$34bO3bO2b$35b2O2bOb$40bO$37b
ObOb$38bO2b$38bO2b$38b2Ob$38b2Ob$41b$41b$13bO10bO16b$12b5O5bOb2O11bO3b$11bO
10bO3bO9bO4b$12b2O8b3ObO9b2O3b$13b2O9b2O12bO2b$2O13bO21b3Ob$2O35b3Ob$41b$41
b$41b$41b$41b$41b$8b2O31b$8b2O11b2O18b$19b2O2bO17b$24bO3bO12b$18bO5bO3bO12b
$19bO2b2O3bObO11b$20b3O5bO12b$28bO12b!
Bitmap size= 2009 RLE size= 413
RLE Translation Table
Pattern | Encoding | Meaning |
. (dot) | b | Dead cell |
O | o | Alive cell |
| Digits | Number of repeats |
| $ | End of Line |
| ! | End of Pattern |
Since playfield defaults to dead cells, dead cells ending a line can be removed.
Encoding
In RLE.zip, program file is RLE_Enc_gol.prg, input pattern is in file Gol.txt, result is in file Gol.log.
clear screen
mdl= {}
handle= fopen("gol.txt")
tmp= freadln(handle, ,1024)
do while len(tmp) > 0
tmp= left(tmp,len(tmp)-2)
aadd(mdl, tmp)
tmp= freadln(handle, ,1024)
enddo
fclose(handle)
set printer to GoL.log
set printer on
title= .T.
for scan=1 to len(mdl)
? mdl[scan]
rle= ""
bitmap= 0
scan++
do while len(mdl[scan]) != 0
? mdl[scan]
rle_line= ""
bitmap += len(mdl[scan])
last= mdl[scan,1]
cnt= 1
for scanc=2 to len(mdl[scan])
if last= mdl[scan, scanc]
cnt++
else
if cnt> 1
rle_line += str(cnt,,, .t.)
endif
if last= "."
rle_line += "b"
else
rle_line += last
endif
last= mdl[scan, scanc]
cnt= 1
endif
next
if last != "."
if cnt> 1
rle_line += str(cnt,,, .t.)
endif
if last= "."
rle_line += "b"
else
rle_line += last
endif
endif
scan++
if len(mdl[scan]) = 0
rle_line += "!"
else
rle_line += "$"
endif
?? " "
?? rle_line
rle += rle_line
enddo
? rle
? "Bitmap size= "+str(bitmap,,, .t.)
?? " RLE size= "+str(len(rle),,, .t.)
?
next
set printer off
set printer to
return
Decoding
In RLE.zip, program file is RLE_Dec_Gol.prg and Gol.cpp.
clear screen
pattern= "19b2O$19b4O$19bOb2O$$20bO$19b2O$19b3O$21bO$33b2O$33b2O$$$$$$$36bO
$35b2O$34bO3bO$35b2O2bO$40bO$37bObO$38bO$38bO$38b2O$38b2O$$$13bO10bO$12b5O5
bOb2O11bO$11bO10bO3bO9bO$12b2O8b3ObO9b2O$13b2O9b2O12bO$2O13bO21b3O$2O35b3O$
$$$$$$8b2O$8b2O11b2O$19b2O2bO$24bO3bO$18bO5bO3bO$19bO2b2O3bObO$20b3O5bO$28bO!"
cnt= 0
for scan=1 to len(pattern)
char= pattern[scan]
if isdigit(char)
cnt= cnt*10+ val(char)
else
if char= "$" .or. char= "!"
?
else
if cnt= 0
cnt=1
endif
if char= "b"
char= "."
endif
for rep= 1 to cnt
?? char
next
endif
cnt= 0
endif
next
?
return
RLEMap (px, py, RLE)
BEGIN
LOCAL Row, Col, Ptr, Cnt, Cod;
Row:= py; Col:= px+2; Cnt:= 0;
FOR Ptr FROM 1 TO DIM(RLE) DO
Cod:= MID(RLE, Ptr, 1);
CASE
IF INSTRING("0123456789", Cod) <> 0 THEN Cnt:= Cnt*10+EXPR(Cod); END;
IF Cod == "$" THEN Row:= Row+MAX(1, Cnt); Col:= px+2; Cnt:= 0; END;
IF INSTRING("b.", Cod) <> 0 THEN Col:= Col+MAX(1, Cnt); Cnt:= 0; END;
IF INSTRING("oA", Cod) <> 0 THEN
FOR Col FROM Col TO Col+MAX(1, Cnt)-1 DO
Lignes(IP(Row/60)+1, Col):= BITOR(Lignes(IP(Row/60)+1, Col),
BITSL(#2,(Row MOD 60) ));
END;
Cnt:= 0;
END;
END;
END;
LDisp();
END;
void RLEMap(int px, int py, string RLE) {
unsigned int Row, Col, Ptr, Cnt;
char Cod;
Row = py + 1;
Col = px + 1 + 2;
Cnt = 0;
for (Ptr = 0; Ptr < RLE.length(); Ptr++) {
Cod = RLE[Ptr];
switch (Cod) {
case '0':
case '1':
case '2':
case '3':
case '4':
case '5':
case '6':
case '7':
case '8':
case '9':
Cnt = Cnt * 10 + (Cod - '0');
break;
case '$':
if (Cnt == 0)
Cnt = 1;
Row = Row + Cnt;
Col = px + 1 + 2;
Cnt = 0;
break;
case 'b':
case '.':
if (Cnt == 0)
Cnt = 1;
Col = Col + Cnt;
Cnt = 0;
break;
case 'o':
case 'O':
if (Cnt == 0)
Cnt = 1;
for (; Cnt > 0; Cnt--) {
CellSet(Row, Col);
Col++;
}
Cnt = 0;
}
}
}
4. Sokoban: A Practical Case
Back in 2013-2014, I was involved in 'HP Prime Graphing Calculator' beta testing. A short time after the calculator came out, a member of a HP fans forum published a Sakoban game for the calculator.
In addition to add the possibility to use the keyboard (original version is touch screen only), I compressed all 40 levels using RLE.
The original source code SokobanBeta1.03.prime size was 112k (UTF16). My version SokobanBeta1.5.prime size was 52k (UTF16). Roughly half the size.
Example
Level 1-1 as matrix RLE
[[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0] $
,[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0] $
,[0,0,1,1,1,1,0,0,0,0,1,1,1,0,0,0,0,0,0,0] 2b4w4b3w$
,[0,0,1,3,0,1,1,1,0,0,1,4,1,0,0,0,0,0,0,0] 2bwdb3w2bwsw$
,[0,0,1,3,0,0,0,1,1,1,1,0,1,1,1,1,0,0,0,0] 2bwd3b4wb4w$
,[0,0,1,3,0,0,0,0,0,1,1,0,1,0,0,1,0,0,0,0] 2bwd5b2wbw2bw$
,[0,0,1,1,1,0,1,0,0,0,0,0,0,0,2,1,0,0,0,0] 2b3wbw7bcw$
,[0,0,0,0,1,0,0,0,1,0,1,1,1,1,0,1,1,0,0,0] 4bw3bwb4wb2w$
,[0,0,0,1,1,0,1,0,1,0,0,0,0,0,0,0,1,0,0,0] 3b2wbwbw7bw$
,[0,0,0,1,0,2,1,0,1,0,1,0,2,0,0,0,1,0,0,0] 3bwbcwbwbwbc3bw$
,[0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0] 3bw10b3w$
,[0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0] 3b12w$
,[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0] $
,[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0] $
,[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]] !
$$2b4w4b3w$2bwdb3w2bwsw$2bwd3b4wb4w$2bwd5b2wbw2bw$2b3wbw7bcw$4bw3bwb4wb2w$
3b2wbwbw7bw$3bwbcwbwbwbc3bw$3bw10b3w$3b12w$$$!
Matrix Size= 20* 15= 300 RLE Size= 120
Matrix Size in source code= 630
RLE Translation Table
Pattern | Encoding | Meaning |
0 | b | Background |
1 | w | Wall |
2 | c | Cube (Parcel) |
3 | d | Destination (Target) |
4 | s | Sokoban |
| Digits | Number of repeats |
| $ | End of Line |
| ! | End of Pattern |
Encoding
This function encodes a Sokoban level. Language is hpppc: Hewlett-Packard Prime Programming Language, a basic.
RLEEnc (Mt)
BEGIN
LOCAL Row, Col, Cod, Cnt, Tmp, Sz;
Tmp:= ""; Sz:=SIZE(Mt);
FOR Row FROM 1 TO Sz(1) DO
Cnt:= 1; Cod:= Mt(Row,1);
FOR Col FROM 2 TO Sz(2) DO
IF Cod == Mt(Row,Col) THEN
Cnt:= Cnt+1;
ELSE
IF Cnt <> 1 THEN Tmp:= Tmp+ STRING(Cnt,2,0); END;
Tmp:= Tmp+ MID("bwcds",Cod+1,1);
Cnt:= 1; Cod:= Mt(Row,Col);
END;
END;
IF Cod <> 0 THEN
IF Cnt <> 1 THEN Tmp:= Tmp+ STRING(Cnt,2,0); END;
Tmp:= Tmp+ MID("bwcds",Cod+1,1);
END;
Tmp:= Tmp+ "$";
END;
RETURN Tmp;
END;
As this is calculator programming, there is no easy way to do terminal output for encoding. I needed a complementary function to get around this problem.
Encoding function is internal to the program (it is invisible outside of the progam. So I created an exported function which is callable from user calculator interface.
The interest is that the result in user interface can be copy/pasted outside of the emulator and thus used as an easy way to paste the result in source code.
EXPORT Sok_RLE()
BEGIN
LOCAL Lst:={};
FOR NAJ FROM 10 TO 49 DO
loadsubnivel();
IF TYPE(matriz) == 2 THEN
Lst:= CONCAT(Lst, {{NAJ, matriz}});
ELSE
Lst:= CONCAT(Lst, {{NAJ, RLEEnc (matriz)}});
END;
END;
RETURN Lst;
END;
Decoding
Language is hpppc: Hewlett-Packard Prime Programming Language, a basic.
RLEDec (Rows, Cols, RLE)
BEGIN
LOCAL Row, Col, Ptr, Cnt, Cod, Tmp;
Row:= 1; Col:= 1; Cnt:= 0;
Tmp:= MAKEMAT(0,Rows,Cols);
FOR Ptr FROM 1 TO DIM(RLE) DO
CASE
IF INSTRING("0123456789", MID(RLE, Ptr, 1)) <> 0
THEN Cnt:= Cnt*10+EXPR(MID(RLE, Ptr, 1)); END;
IF MID(RLE, Ptr, 1) == "$" THEN Row:= Row+MAX(1, Cnt); Col:= 1; Cnt:= 0; END;
DEFAULT
Cod:= INSTRING("bwcds", MID(RLE, Ptr, 1));
FOR Col FROM Col TO Col+MAX(1, Cnt)-1 DO Tmp(Row,Col):=Cod- 1; END; Cnt:= 0;
END;
END;
RETURN Tmp;
END;
5. Points of Interest
- An easy to handle data compression algorithm
6. Links
7. History
- 21st March, 2021: First draft
- 6th September, 2021: First version
- 8th September, 2021: Second version, added Conway's Game of life
- 11th September, 2021: Third version, added Sokoban example
- 2nd January, 2022: Added reference .sok files
- 30th January, 2022: Fixed typo
- 16th January, 2022: Added a C decoder for Game of Life, download updated