Run-Length Encoding (C)
Topics: compression, encoding
Problem
Implement run-length encoding and decoding over caller buffers (no allocation):
collapse a run of identical bytes into a (count, byte) pair.
long rle_encode(const unsigned char *in, size_t inlen, unsigned char *out, size_t outcap);
long rle_decode(const unsigned char *in, size_t inlen, unsigned char *out, size_t outcap);
So aaabbc encodes to the bytes 03 'a' 02 'b' 01 'c', and decoding reverses it.
Both return the output length, or -1 on overflow (or malformed input for decode).
Key concepts
- Counts cap at 255. One byte holds the count, so a run of 300 identical bytes
splits into
255 x then 45 x. Encoding has to handle that split; decoding just
expands each pair.
- Decode validation. The input must be an even number of bytes (pairs), and a
count of
0 is meaningless, so both are errors. Reading a malformed stream should
fail, not produce garbage.
- It can make data bigger. Random data with no runs encodes to two bytes per
input byte. RLE only wins on data with long runs (scanlines of solid color, sparse
bitmaps), which is exactly where image formats apply it.
A tiny codec, but it teaches the shape of every compressor: a compact token format,
careful boundary handling, and a decoder that refuses bad input.
Run
cc -std=c11 -O2 -Wall tests.c -o /tmp/rle && /tmp/rle
Sign in to submit your solution.