Creating a Data Type in C Bytes

CS 2505 Computer Organization I

Creating a Data Type in C

Assignment 9: Advanced Structures Part I

Bytes

For this assignment, you will use the struct mechanism in C to implement a data type that represents an array of bytes. This data structure could be used kind of like a base class for a string type, or a more general dynamic array type. An array of bytes can be modeled using the C struct:

// Bytes type, represents an array of byte-sized values. struct bytes_t {

// Automatically allocated default buffer. This is used to store the // bytes until usage > DEFAULT_SIZE, then we switch to using dynamic // memory (data). dflt is never used again once the switch occurs. uint8_t dflt[DEFAULT_SIZE];

// Initially NULL, points to the dynamically allocated array of bytes. uint8_t *data;

// The number of slots or bytes in use, initially 0. size_t usage;

// Actual size of the array, initially DEFAULT_SIZE. size_t dim; };

typedef struct bytes_t bytes_t;

Since C arrays don't automatically keep track of their dimensions or usage, it seems completely reasonable to capture all of that information inside of a struct, and use that to represent a flexible bytes type.

Your bytes_t type must be able grow and shrink as necessary, and will require you to dynamically allocate and free memory. For example, inserting some bytes into a bytes_t variable may cause dynamic allocation to occur, and when you no longer need a bytes_t variable you must free the associated memory (in the appropriate function).

Note: This is a two-part assignment, you will use the bytes_t type again so it's imperative that your code functions properly.

Operations A data type consists of a collection of values and a set of operations that may be performed on those values. For a bytes type, it would make sense to provide common operations that let you manipulate bytes; for example:

// Makes a deep copy of *src and stores it in *dest. Similar to *dest = *src.

//

// Pre:

//

src - Points to valid bytes_t variable and is not NULL.

//

dest - Points to valid bytes_t variable and is not NULL.

//

// Post:

//

If successful then *dest contains a copy of *src. Both *src and *dest

//

should be completely separate entities. Changing one should not

//

affect the other.

//

// Returns:

//

If copying *src into *dest succeeds the function will return true.

//

If dynamic allocation is required, but fails this function will

//

return false.

//

bool bytes_copy(bytes_t * const dest, const bytes_t * const src);

This is a purely individual assignment!

1

CS 2505 Computer Organization I

Assignment 9: Advanced Structures Part I

The design of bytes_copy() follows the expected semantics of an assignment statement. The function will make a copy of the contents of *src and store it in *dest, allocating memory as necessary and setting dest->usage and dest>dim as required.

We will copy one aspect of an OO design; it's useful to provide a function that will create a new bytes_t object:

// Initializes a bytes_t variable.

//

// Pre:

//

bytes - Points to valid bytes_t variable and is not NULL.

//

// Post:

//

data ? Is initialized to NULL.

//

usage - Is initialized to 0.

//

dim - Is initialized to DEFAULT_SIZE.

//

void bytes_init(bytes_t * const bytes);

To some degree, this plays the role of a constructor in an OO implementation; the basic difference is that this is actually responsible for creating the new object. In contrast, in a Java implementation, you would call new to allocate memory for the object dynamically and then call the constructor to initialize that memory correctly.

For this assignment an initialization function, a de-allocation function, and a print function are provided, your task is to complete the rest of the implementation. The required functions are:

// Status functions, these are one-liners: bool bytes_empty(const bytes_t * const bytes); uint8_t * bytes_data(const bytes_t * const bytes); size_t bytes_usage(const bytes_t * const bytes);

// More complex functions that provide common operations for the bytes_t. bool bytes_set(bytes_t * const bytes, size_t index, uint8_t val); bool bytes_get(const bytes_t * const bytes, size_t index,

width_t width, endian_t order, uint64_t * const result);

bool bytes_insert(bytes_t * const bytes, size_t index, uint8_t * buf, size_t len);

bool bytes_erase(bytes_t * const bytes, size_t index, size_t len); uint8_t * bytes_range(const bytes_t * const bytes, range_t * const range); bool bytes_copy(bytes_t * const dest, const bytes_t * const src);

The file bytes.h, posted on the course website, includes header comments for each of these functions. Pay attention to the comments in the header file. All the stated pre- and post-conditions are part of the assignment. Pay particular attention to avoiding memory leaks. You should place your implementation in the bytes.c file posted on the website.

You should consider implementing additional "helper" functions. Those should be private to your C file, so make them static; they will be invisible to the testing code and never called directly from outside your file.

Your solution will be compiled with a test driver, using the supplied header file, so if you do not conform to the specified interfaces there will very likely be compilation and/or linking errors.

Design Given the C structure above, a natural question is, why should we use an automatically allocated buffer by default? Why not just dynamically allocate all the time? The short answer is that it makes the assignment more interesting, but the longer answer is that there's some overhead to using dynamic memory.

To give you an idea of the penalty, below is a function that creates and then frees a few bytes_t variables. As a simple benchmark, this function was executed 232-1 times in a tight loop. Further, the assignment solution was modified to either

This is a purely individual assignment!

2

CS 2505 Computer Organization I

Assignment 9: Advanced Structures Part I

only use the automatically allocated buffer or only use dynamically allocated memory. When using only the automatically allocated buffer the program finished in about 5 minutes and 30 seconds, while using only dynamic memory required approximately 25 minutes to complete same task. This may or may not be a fair benchmark and may or may not represent real word use, however, it does illustrate that there is some additional overhead when dynamically allocating memory. If we choose DEFAULT_SIZE carefully, then most of our bytes_t variables won't require dynamically allocated memory at all, avoiding the performance penalty except when we actually need a large amount of bytes.

Using the default automatically allocated buffer is part of the assignment. You should only dynamically allocate when the usage grows beyond DEFAULT_SIZE. When this happens dynamically allocate memory and copy the contents from the default buffer, dlft. Expect do poorly on the tests if you don't follow this requirement.

// Rough benchmark function. Run me a bunch of times. void test_alloc() {

bytes_t a, b, c, d; bytes_init(&a); bytes_init(&b); bytes_init(&c); bytes_init(&d);

// Larger default size than the assignment. for (int x = 0; x < 50; x++) {

// Do some work a.data[x] = x; b.data[x] = x; c.data[x] = x; d.data[x] = x; }

bytes_free(&a); bytes_free(&b); bytes_free(&c); bytes_free(&d);

}

Helpful Hints and Information As you start working on the bytes project, here are some helpful hints and information that will be relevant for any assignment that uses dynamic memory in this class:

If the curator produces no output while grading your assignment, your code is crashing. The test suite is a wrapper that calls your functions and prints out the results. It doesn't do anything else, so the problem is most likely in your code. Given the nature of dynamic memory errors your program can crash in unexpected (unrelated) places and also run fine on one machine and then crash on another.

To expand on those points, where the code crashes is sometimes different than where the problem is actually located. For example, if the instructor provided bytes_free() function sometimes crashes on the curator, chances are there is something wrong with an another function. Most likely, the previously tested function broke something that is only now being triggered inside of bytes_free(). So long story short, you need to backtrack, look at the previous function calls etc.

Further, depending on the nature of the bug, the provided driver may run fine on your machine but crash on the curator. This is often due to very small array bounds issues, i.e. going 1 out of bounds, or improperly using realloc (ignoring the return value). Further, these errors may happen sporadically making the problems difficult to find. Look closely at your bytes.c file for the issues described above, gdb may also be helpful. If you really can't find the issue come to office hours or post on forum.

This is a purely individual assignment!

3

CS 2505 Computer Organization I

Other Tips Don't do things like this:

Assignment 9: Advanced Structures Part I

bytes_t *new; new->usage = etc; return *new;

bytes_t *new doesn't point to anything (or we don't know what it points to), so dereferencing it on the next line will likely cause problems. Surprisingly, this may actually run locally, but is a terrible idea, a logic error, and almost certainly won't work on the curator. Generally, the functions have bytes_t * as the parameters and you should use those preferentially rather than declaring new bytes_t variables.

If you do decide to create a new bytes_t variable, remember there is nothing wrong with using a normal variable rather than a pointer. You're allowed to declare and return structures like the primitive types:

bytes_t b; // fine b.data = malloc(. . .); return b; // also fine, returns a copy of the values in b

return &b; // BAD! returning a pointer to local variable, // DON'T do this, it's a terrible idea

Testing One issue you'll encounter in testing is that the Linux implementation of malloc() will zero memory when it is allocated to your process. That hides errors rather than fixing them. For better testing, add the following include:

#include

Then in main() add this call:

mallopt(M_PERTURB, 205);

This will guarantee that memory that is allocated with malloc() or realloc() is NOT automatically written with zeros. We recommend ALWAYS doing this when you're testing code that uses malloc() or realloc().

Valgrind Valgrind is another helpful tool for finding many array bounds issues as well as memory leaks. While it's available on Ubuntu/CentOS, it's already installed on the rlogin cluster. So if you are developing/testing on rlogin you should be to use Valgrind with no additional work. To invoke Valgrind: (assuming the executable is called driver):

valgrind --leak-check=yes driver

A quick start guide is here:



What to Submit You will submit a single .c file, containing nothing but the implementation of the specified C functions. Be sure to conform to the specified function interfaces.

Your submission will be compiled with a test driver and graded according to how many cases your solution handles correctly.

This assignment will be graded automatically. You will be allowed up to ten submissions for this assignment. Test your function thoroughly before submitting it. Make sure that your function produces correct results for every test case you can think of.

This is a purely individual assignment!

4

CS 2505 Computer Organization I

Assignment 9: Advanced Structures Part I

Any additional requirements specified above will be checked manually by a TA after the assignment is closed. Any shortcomings will be assessed a penalty, which will be applied to your score from the Curator.

The course policy is that the submission that yields the highest score will be checked. If several submissions are tied for the highest score, the latest of those will be checked.

The Student Guide and other pertinent information, such as the link to the proper submit page, can be found at:



Pledge Each of your program submissions must be pledged to conform to the Honor Code requirements for this course. Specifically, you must include the following pledge statement in the submitted file:

// On my honor:

//

// - I have not discussed the C language code in my program with

//

anyone other than my instructor or the teaching assistants

//

assigned to this course.

//

// - I have not used C language code obtained from another student,

//

or any other unauthorized source, either modified or unmodified.

//

// - If any C language code or documentation used in my program

//

was obtained from another source, such as a text book or course

//

notes, that has been clearly noted with a proper citation in

//

the comments of my program.

//

// - I have not designed this program in such a way as to defeat or

//

interfere with the normal operation of the Curator System.

//

//

Failure to include this pledge in a submission is a violation of the Honor Code.

This is a purely individual assignment!

5

................
................

In order to avoid copyright disputes, this page is only a partial summary.

Google Online Preview   Download