
CSC 482/582 Encoding

Assignment #2 (50 points) Due: September 10, 2018

1 Introduction

The learning objective of this assignment is for students to gain experience with the command line by working with simple types of data encoding. Note that we are focusing on encoding, not encryption. While encoding does not secure data as encryption does, encoding techniques are used by hackers to hide data from defenders and to trick users, so it is important to understand encoding as well as encryption.

2 References

1. Base64
2. Caesar Cipher
3. Rot13

3 Resources Required

Students will need the following virtual machine and file to do this lab.

- SEED Ubuntu Linux VM, available in the CSC 482 folder on vSphere.
- csc482-a2-file.zip

4 Shift Ciphers

Shift ciphers like the Caesar Cipher or rot13, which encrypt text messages by consistently replacing each letter of the alphabet with the letter a fixed number of positions later, can be easily implemented with the tr (translate) command. The command takes two arguments: a set of input characters and a set of output characters. Wherever the first character of the input set is found in the input text, it is replaced with the first character of the output set. The same process is used for the second, third, and further characters in the input and output sets.
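The positional mapping described above can be seen with a small example that is not a cipher. This is a minimal sketch: the word cabbage, the sets abc and xyz, and the variable name mapped are arbitrary choices made for illustration.

```shell
# Each character in the first set is replaced by the character at the same
# position in the second set: a->x, b->y, c->z. Other characters pass through.
mapped=$(echo 'cabbage' | tr 'abc' 'xyz')
echo "$mapped"   # prints zxyyxge
```

The letters g and e appear in neither set, so tr leaves them unchanged.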

We can use the following command line to rot13 encode and decode text. The input alphabet begins with A, while the output alphabet begins with N, indicating that tr will replace all As with Ns. The sequence A-Z specifies the English alphabet in order, while in the output alphabet, N-Z indicates the second half of the alphabet and A-M the first half of the alphabet. This is exactly the wrapping of the alphabet performed by rot13. The input and output sets also specify the same translation for the lower case letters as they do for the upper case letters. Because rot13 shifts by half of the 26-letter alphabet, the same command both encodes and decodes.

    $ echo 'A Test' | tr 'A-Za-z' 'N-ZA-Mn-za-m'
    N Grfg
    $ echo 'N Grfg' | tr 'A-Za-z' 'N-ZA-Mn-za-m'
    A Test

Implement Caesar cipher (which we could call rot3) using the tr command.

5 Binary to Text Encoding

Binary to text encoding is used to send binary data, such as images or archives, over data channels that only permit text, such as e-mail. Text typically means printable characters and does not include representations of control characters or whitespace. One of the simplest techniques is to write the binary data in hexadecimal notation, e.g. write byte 254 as the string fe, but this encoding doubles the size of the data.

While there are dozens of techniques used in different applications, Base64 is the most widely used encoding scheme, found in applications ranging from e-mail attachments to web cookies. Base64 encodes binary using 64 text characters, meaning that log2(64) = 6 bits are stored in each Base64 character. Base64 converts three octets (8-bit bytes, 24 bits in total) into four encoded characters of 6 bits each. This means that Base64 is a 75% efficient coding scheme for binary.

Because not all files have sizes that are multiples of three, Base64 must specially handle the one or two bytes that may be left over at the end of the file. Base64 can handle this situation by padding the final group with zero bits so that it fills a whole number of encoded characters. It is often possible to identify a Base64 encoded string by the presence of one or two final equal signs that result from padding.

Unfortunately, while all Base64 systems agree that the first 62 characters are the upper and lower case letters of the English alphabet followed by the digits 0-9, the choices of the last two characters vary between implementations. Furthermore, different implementations of Base64 implement different padding schemes. The Wikipedia page describes over a dozen Base64 variants.

We can convert binary data to a hexadecimal string using the xxd command, which is demonstrated below using the -l option to only convert the first 8 bytes of a JPEG file. The -p option selects plain hexadecimal output, and the -r option reverses the conversion.

    $ xxd -p -l 8 ~/Pictures/Ubuntu.jpg
    ffd8ffe000104a46
    $ echo ffd8ffe000104a46 | xxd -r -p
    ...nonprintable output...
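The claim that hexadecimal notation doubles the size of the data can be checked directly. The sketch below assumes an arbitrary three-byte sample string, abc, and a variable named hex; neither comes from the assignment file.

```shell
# Three input bytes (the arbitrary sample string 'abc')...
hex=$(printf 'abc' | xxd -p)
echo "$hex"        # prints 616263
# ...become six hexadecimal characters, two per byte.
echo "${#hex}"     # prints 6
```

printf is used here instead of echo so that no trailing newline is added to the input.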

Linux offers a base64 command, which encodes data to Base64 by default. Using the -d option causes the command to decode data from Base64. Note that Base64 decoded data may be binary, resulting in output that contains strange or non-printable characters.

    $ echo "This is a test" | base64
    VGhpcyBpcyBhIHRlc3QK
    $ echo "VGhpcyBpcyBhIHRlc3QK" | base64 -d
    This is a test
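The padding behavior described earlier in this section can be observed by encoding inputs of one, two, and three bytes. This is a small sketch using arbitrary sample strings; the variable names one, two, and three are made up for illustration.

```shell
# printf is used instead of echo so that no trailing newline is encoded.
one=$(printf 'a' | base64)       # 1 leftover byte   -> two = signs
two=$(printf 'ab' | base64)      # 2 leftover bytes  -> one = sign
three=$(printf 'abc' | base64)   # full 3-byte group -> no padding
echo "$one $two $three"          # prints YQ== YWI= YWJj
```

Each output is four characters long, which matches the three-octets-to-four-characters grouping.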

6 Task

In this assignment, students will unpack the assignment file and find the secret phrase contained within the file. Note that the assignment file has been encoded several dozen times using different techniques described in this and previous assignments. Students will need to use the file command to check for filetypes, multiple types of decoding, and loops to decode multiple uses of the same encoding, to obtain the original file that contains the secret phrase.
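Undoing repeated layers of one encoding is a natural fit for a shell loop. The following is only a sketch on made-up data, not the assignment file: the message, the variable name data, and the count of five layers are all assumptions chosen for illustration.

```shell
# Build a hypothetical sample by Base64-encoding a short message five times.
# tr -d '\n' removes any line wrapping that base64 adds to long output.
data='sample message'
for i in 1 2 3 4 5; do
    data=$(printf '%s' "$data" | base64 | tr -d '\n')
done

# Undo the five layers with a matching loop.
for i in 1 2 3 4 5; do
    data=$(printf '%s' "$data" | base64 -d)
done
printf '%s\n' "$data"   # prints sample message
```

In practice the number of layers is not known in advance, so the output of the file command at each step would guide which decoder to apply next and when to stop.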

7 Deliverables

E-mail the instructor with a message containing the secret phrase and all commands used to decode the file. Do not use an attachment. Include all information in the message body. The e-mail must be from your NKU address and have a subject line of "CSC 482 Assignment 2."
