Article 6HV4A Binary to text to binary

Binary to text to binary

by
John
from John D. Cook on (#6HV4A)

Gnu Privacy Guard includes a way to encode binary files as plain ASCII text files, and turn these text files back into binary. This is intended as a way to transmit encrypted data, but it can be used to convert any kind of file from binary to text and back to binary.

To illustrate this, I'll use Albrecht Durer's Melencolia I as an example. (More on this image and its mathematical significance here.)

Melancolia.jpg

This image is saved in the file Melancholia.jpg.

Binary to text

If we run

 gpg --enarmor Melencolia.jpg

at a command line, it produces a file Melancholia.jpg.asc, the asc suffix indicating an ASCII file.

We can look inside this file if we'd like.

-----BEGIN PGP ARMORED FILE-----Comment: Use "gpg --dearmor" for unpacking/9j/4AAQSkZJRgABAQEBLAEsAAD/4QBoRXhpZgAATU0AKgAAAAgABAEaAAUAAAABAAAAPgEbAAUAAAABAAAARgEoAAMAAAABAAIAAAExAAIAAAARAAAATgAAAAAAAAEsAAAAAQAAASwAAAABUGFpbnQuTkVUIHYzLjUuOAAA/9sAQwACAQECAQECAgICAgIC...wJ/wkzRf8IN4LCCVVwNCtM8sDnPl5zkk565NbcH7Lnw01K0gtrn4feCriB4QfLl0S2dV+bdgApwMknAwMk+tFFcktjqif//Z=rpd1-----END PGP ARMORED FILE-----

The cryptic text /9j/4A...//Z is a base 64 encoded representation of the binary file. Think of the file as one big binary number. Write that number in base 64, i.e. partition the bits into groups of six. Then represent the base 64 digits" as alphanumeric characters, plus the symbols + and /. More on the details here.

The line =rpd1 is a 24-bit CRC checksum. The equal sign is a separator, and rpd1 is a base 64 encoding the checksum.

The JPG file is 91,272 bytes and the ASCII file is 123,712 bytes. The ASCII file is about 1/3 larger because every six bits in the binary file corresponds to an eight-bit ASCII character. The ASCII file is a little bit more than 1/3 larger because of the human-friendly text above and below the base 64 encoding, the newline characters, and the checksum.

Text to binary

If we run

 gpg --dearmor Melencolia.jpg.asc

at a command line, it produces a file Melancholia.jpg.asc.gpg. This file is bit-for-bit exactly the same as the original file, which we could confirm by running

 diff Melencolia.jpg Melencolia.jpg.asc.gpg
Related postsThe post Binary to text to binary first appeared on John D. Cook.
External Content
Source RSS or Atom Feed
Feed Location http://feeds.feedburner.com/TheEndeavour?format=xml
Feed Title John D. Cook
Feed Link https://www.johndcook.com/blog
Reply 0 comments