To simplify our allocator, we will only be using mmap, and we’ll allocate entire regions of memory at a time. The size of each region should be a multiple of the system page size; this means that if a program executes malloc(1) for a single byte of memory and our machine has a page size of 4096, we’ll allocate a region of 4096 bytes instead. Consequently, if the user requests 4097 bytes then we will allocate two regions’ worth of memory.
“Wait,” you’re thinking, “doesn’t that waste a whole lot of memory?” And the answer is a resounding yes. Herein lies the challenge: you will not only support allocating memory, but also use the free space managementalgorithms we discussed in class to split up and reuse empty regions:
You should be able to configure the active free space management algorithm via environment variables. To determine which algorithm to use:
char *algo = getenv("ALLOCATOR_ALGORITHM");
if (algo == NULL) {
algo = "first_fit";
}
if (strcmp(algo, "first_fit") == 0) {
...
} else if (strcmp(algo, "best_fit") == 0) {
...
} else if (strcmp(algo, "worst_fit") == 0) {
...
}
If $ALLOCATOR_ALGORITHM isn’t set, you’ll use first fit as the default.
Managing Memory
mmap will give us regions of memory… But how do we distinguish between different allocations (called blocks) stored inside the regions? One good approach is to prefix each allocation with some metadata; this is how many malloc implementations work. Simply embed a struct at the start of each memory block that describes how large it is, whether it has been freed or not, and – most importantly – include a pointer to the next block in the chain. Looks like studying linked lists paid off after all! Here’s how this looks logically in memory, with each allocation prefixed with a metadata struct:
This means that allocating a single byte of memory actually takes more space: the single byte, plus the size of the metadata.
Looking at the big picture, we can also see how separate regions play a role in this. Here’s a visualization of several allocations:
Note that both the block size as well as the user-requested allocation size (shown in parenthesis) are provided, along with block usage, memory addresses and ‘free’ status. Here is a metadata struct that contains this information:
/**
* Defines metadata structure for both memory 'regions' and 'blocks.' This
* structure is prefixed before each allocation's data area.
*/
struct mem_block {
/**
* Each allocation is given a unique ID number. If an allocation is split in
* two, then the resulting new block will be given a new ID.
*/
unsigned long alloc_id;
/**
* The name of this memory block. If the user doesn't specify a name for the
* block, it should be auto-generated based on the allocation ID.
*/
char name[32];
/** Size of the memory region */
size_t size;
/** Space used; if usage == 0, then the block has been freed. */
size_t usage;
/**
* Pointer to the start of the mapped memory region. This simplifies the
* process of finding where memory mappings begin.
*/
struct mem_block *region_start;
/**
* If this block is the beginning of a mapped memory region, the region_size
* member indicates the size of the mapping. In subsequent (split) blocks,
* this is undefined.
*/
size_t region_size;
/** Next block in the chain */
struct mem_block *next;
};
You are allowed to modify this struct if necessary.
Allocating Memory
Request new memory regions from the kernel via mmap. To perform the allocation, place a metadata struct at the start of a free memory address and then return a pointer to the ‘data’ portion of the memory shown in the first figure. Don’t return a pointer to the struct itself, because it will be overwritten by the user!
Memory allocations must be aligned to 8 bytes; in other words, the size of the memory blocks should be evenly divisible by 8.
Once basic allocation works, you can start splitting blocks that are not 100% used. For instance, if a block is 1000 bytes in size but only 100 bytes are used, an allocation of less than or equal to 900 bytes can be accommodated by splitting and resizing the block.
Freeing Memory
Set usage = 0. That’s it! You don’t need to clean anything up or change anything else. This lazy approach is why you sometimes can read ‘old’ values from memory that have been freed.
If the entire memory region has been freed (i.e., all of the blocks within it have a usage of 0), then you should free the region with munmap. This is the tricky case for correctly implementing free.
Reallocating Memory
If the user wants to realloc a pointer, check to see if its block can be resized in place. This is generally possible as long as the usage of the block is less than its size and there is enough space for the reallocation request. If not, simply malloc a new, appropriately sized block, copy the data there, and then free the old block.
Extra Features
Since we’re writing our own version of malloc, we might as well add some features while we’re at it.
Named Blocks: to help with debugging, you can optionally provide a name for each allocation. These names will be shown when state information is printed.
Memory State Information: your allocator should be able to print out the current state of the regions and blocks with the print_memory() function. See the format below:
-- Current Memory State --
[REGION] -
[BLOCK] - () ''
[BLOCK] - () ''
[BLOCK] - () ''
[REGION] -
[BLOCK] - () ''
In this example, there are two memory regions and four blocks. Each element is printed out in order, so there is an implied link between element 1 and element 2, and so on.
Note that there is a difference between the size of the block itself (which includes the metadata struct) and the allocation size requested by the user – malloc(14) results in an allocation size of 14.
Here’s an example print_memory() invocation from a program that performs a 100-byte malloc followed by a 128-byte malloc. You will note that the 100-byte allocation is aligned to 104 bytes and the printf function call also results in an extra allocation, shown as allocation (2).
-- Current Memory State --
[REGION] 0x00007f806a386000-0x00007f806a387000 4096
[BLOCK] 0x00007f806a386000-0x00007f806a386098 (0) 'Allocation 0' 152 152 104
[BLOCK] 0x00007f806a386098-0x00007f806a3864c8 (1) 'Allocation 1' 1072 1072 1024
[BLOCK] 0x00007f806a3864c8-0x00007f806a387000 (2) 'Allocation 2' 2872 176 128
printf calling malloc is a problem; if we try to print the current memory state from within our allocator, we can inadvertently create an infinite loop. You are given a version of print_memory that uses printf, but you will need to rewrite it such that it does not allocate memory (in other words, don’t use printf, see fputs et al.).
File Logging: You should also be able to write the same contents printed by print_memory to a file.
Visualization: the output from print_memory() can be passed into visualize.sh to display the memory regions and blocks as a PDF diagram. This is already prepared for you, but you’ll need to experiment with the scripts (and when to print out the memory state) to get a interesting visualization.
Scribbling: C beginners often get tripped up by a seemingly strange behavior exhibited by malloc: sometimes they get a nice, clean chunk of memory to work with, and other times it will have ‘residual’ values that crash their program (usually when it’s being graded!). One solution to this, of course, is to use calloc() to clear the newly-allocated block. Since you are implementing your own memory allocator, you now understand why this happens: free() leaves old values in memory without cleaning them up.
To help find these memory errors, you will provide scribbling functionality: when the ALLOCATOR_SCRIBBLEenvironment variable is set to 1, you will fill any new allocation with 0xAA (10101010 in binary). This means that if a program assumes memory allocated by malloc is zeroed out, it will be in for a rude awakening.
You should scribble new allocations (malloc()) as well when you are reusing blocks; however, you should notscribble when realloc() is called (hmm… why?)
Danger Ahead
Some C library functions call malloc, calloc, realloc, etc. This means that if your implementation isn’t correct, other functions may fail in strange and unpredictable ways. Finish implementing a simple (wasteful) allocator first with a single block per region before moving on to the other functionality. A suggested implementation sequence:
CS 340 Milestone One Guidelines and Rubric Overview: For this assignment, you will implement the fundamental operations of create, read, update,
Retail Transaction Programming Project Project Requirements: Develop a program to emulate a purchase transaction at a retail store. This
7COM1028 Secure Systems Programming Referral Coursework: Secure
Create a GUI program that:Accepts the following from a user:Item NameItem QuantityItem PriceAllows the user to create a file to store the sales receip
CS 340 Final Project Guidelines and Rubric Overview The final project will encompass developing a web service using a software stack and impleme