Lab 5: Synchronization
Due: March 24 @ 11:59 PM
In this lab you'll make a hash table implementation safe to use concurrently, and implement a spin semaphore.
Setup
Ensure you're in the ece353 directory within VSCode. Make sure you have the latest skeleton code from us by running git pull upstream main.
This will create a merge, which you should be able to complete cleanly. If you don't know how to do this, read Pro Git; be sure to read chapters 3.1 and 3.2 fully. This is how software developers coordinate and work together on large projects. For this course, you should always merge to maintain proper history between your repository and the provided one. You should never rebase in this course, and in general you should never rebase unless you have a good reason to. It is important to keep your code up-to-date during this lab, as the test cases may change with your help.
Part 1: Parallel Hash Table
You'll be given a serial hash table implementation, and two additional hash table implementations to modify. You're expected to implement two locking strategies and compare them with the base implementation. The hash table uses separate chaining to resolve collisions: each cell of the hash table is a singly linked list of key/value pairs. You are not to change the algorithm, only add mutex locks. Note that this is essentially the implementation of Java's concurrent hash tables, except they have an optimization that avoids creating a linked list when there's only one entry at a hash location.
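For illustration only, a separate-chaining layout built on SLIST might look roughly like the sketch below. The names and fields here are assumptions for the example; the real definitions are in the skeleton's headers.

#include <stdint.h>
#include <sys/queue.h>

/* Hypothetical sketch of separate chaining; the actual definitions live in
 * src/hash-table-common.h and src/hash-table-base.h. */
struct list_entry {
    const char *key;
    uint32_t value;
    SLIST_ENTRY(list_entry) pointers;   /* links this entry into its bucket's chain */
};

SLIST_HEAD(list_head, list_entry);

struct hash_table_entry {
    struct list_head list_head;         /* one singly linked list per bucket */
};

struct hash_table_base {
    struct hash_table_entry entries[4096];  /* fixed capacity, never resized */
};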
You can run cd pht to begin this part of the lab.
Task
In order to make a hash table work concurrently, you need to first understand how it works serially. Read and understand the initial base implementation. Afterwards you'll create version 1 (v1) of the hash table that's thread-safe. For this version you're only allowed to use a single mutex. Finally, you'll create version 2 (v2) of the hash table that's also thread-safe and performant with as many mutexes as you'd like.
Understand the Serial Hash Table Implementation
The initial base implementation is heavily commented; based on your previous work with linked lists, you should have an understanding of what the SLIST functions do. First read src/hash-table-common.h, which defines the capacity of the hash table (which, unlike real hash tables, does not change) and the hashing function. Next, read src/hash-table-base.h to see the declarations of all the functions in the base (serial) hash table. Finally, read src/hash-table-base.c for the implementation details.
Creating Hash Table v1
Using only pthread_mutex_*, you should create a thread-safe version of hash_table_v1_add_entry. All other functions are called serially, mainly for sanity checks. By default, there is a data race when finding and adding entries to the list. For the first version, v1, you should only be concerned with correctness.

Create a single mutex, only for v1, and make hash_table_v1_add_entry thread-safe by adding the proper locking calls. You should not create any global variables; instead add any data you'd like to the structs (hash_table_v1, hash_table_entry, and/or list_entry). You need to initialize the mutex in hash_table_v1_create and destroy it in hash_table_v1_destroy.

Your code changes should not modify contains or get_value. Any other code modifications are okay; however, you should not change any functionality of the hash tables. Only modify code in hash-table-v1.c for this version of the hash table.
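Continuing the hypothetical layout sketched in the introduction to this part (again, the names, signatures, and elided bodies below are placeholders, not the skeleton's actual code), the single-mutex approach amounts to something like:

#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

struct hash_table_v1 {
    struct hash_table_entry entries[4096];  /* hypothetical bucket array, as sketched above */
    pthread_mutex_t mutex;                  /* single mutex protecting the whole table */
};

struct hash_table_v1 *hash_table_v1_create(void)
{
    struct hash_table_v1 *hash_table = calloc(1, sizeof(*hash_table));
    if (hash_table == NULL)
        exit(1);
    /* ... the skeleton's existing per-bucket initialization goes here ... */
    int rc = pthread_mutex_init(&hash_table->mutex, NULL);
    if (rc != 0)
        exit(rc);
    return hash_table;
}

void hash_table_v1_add_entry(struct hash_table_v1 *hash_table,
                             const char *key, uint32_t value)
{
    pthread_mutex_lock(&hash_table->mutex);   /* check the return value in real code */
    /* ... the skeleton's existing find-or-add logic, unchanged ... */
    pthread_mutex_unlock(&hash_table->mutex);
}

void hash_table_v1_destroy(struct hash_table_v1 *hash_table)
{
    /* ... the skeleton's existing cleanup of each bucket's list ... */
    pthread_mutex_destroy(&hash_table->mutex);
    free(hash_table);
}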
Creating Hash Table v2
For the second version, v2, you should be concerned with correctness and performance. You can now create as many mutexes as you like in hash-table-v2.c. Make hash_table_v2_add_entry thread-safe by adding the proper locking calls. You should not create any global variables; instead add any data you'd like to the structs (hash_table_v2, hash_table_entry, and/or list_entry). You need to initialize the mutexes in hash_table_v2_create and destroy them in hash_table_v2_destroy.

Your code changes should not modify contains or get_value. Any other code modifications are okay; however, you should not change any functionality of the hash tables. Only modify code in hash-table-v2.c for this version of the hash table.
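One common strategy (though not the only acceptable one) is a mutex per bucket, so that threads adding entries to different buckets don't contend with each other. A hedged sketch follows, reusing the hypothetical layout from the earlier sketches; find_bucket is a made-up helper standing in for however the skeleton maps a key to its bucket via the hash function in hash-table-common.h.

#include <pthread.h>
#include <stdint.h>

struct hash_table_v2;  /* defined in the skeleton's hash-table-v2.c */

struct hash_table_entry {
    struct list_head list_head;  /* the bucket's chain, as before */
    pthread_mutex_t mutex;       /* one mutex per bucket */
};

/* Hypothetical helper: reuse whatever bucket lookup the base implementation does. */
static struct hash_table_entry *find_bucket(struct hash_table_v2 *hash_table,
                                            const char *key);

void hash_table_v2_add_entry(struct hash_table_v2 *hash_table,
                             const char *key, uint32_t value)
{
    struct hash_table_entry *bucket = find_bucket(hash_table, key);

    pthread_mutex_lock(&bucket->mutex);
    /* ... the skeleton's existing find-or-add logic on this bucket's list ... */
    pthread_mutex_unlock(&bucket->mutex);
}

With this layout, hash_table_v2_create must initialize every bucket's mutex and hash_table_v2_destroy must destroy all of them, mirroring the v1 requirements above.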
Additional APIs
Similar to the suggestion in Lab 4, the base implementation uses a linked list, but instead of TAILQ, it uses SLIST. You should note that the SLIST_ functions modify the pointers field of struct list_entry. For your implementation you should only use pthread_mutex_t and the associated init/lock/unlock/destroy functions. You will have to add the proper #include yourself.
Building
First, make sure you're in the pht directory if you're not already. Then run the following commands:
meson setup build
meson compile -C build
Whenever you make changes, you can run the compile command again. You should only need to run setup once.
Starting
After you build, you'll have a build/pht-tester executable. The executable takes two command line arguments: -t changes the number of threads to use (default 4), and -s changes the number of hash table entries to add per thread (default 25,000). For example, you can run build/pht-tester -t 8 -s 50000.
Tester
The tester code generates consistent entries in serial, such that every run with the same -t and -s flags will receive the same data. All hash tables have room for 4096 entries, so for any sufficiently large number of additions, there will be collisions. The tester code runs the base hash table in serial for timing comparisons, and the other two versions with the specified number of threads. For each version it reports the number of µs per implementation. It then runs a sanity check, in serial, that each hash table contains all the elements that were put into it.

By default, your hash tables should run -t times faster (assuming you have that number of cores). However, you should see missing entries (we made it fail faster!). Correct implementations should at least have no entries missing from the hash table. However, just because you have no entries missing, you may still have issues with your implementation (concurrent programming is significantly harder).
Example Output
You should be able to run build/pht-tester -t 8 -s 50000 and get the following output:
Generation: 130,340 usec
Hash table base: 1,581,974 usec
- 0 missing
Hash table v1: 359,149 usec
- 28 missing
Hash table v2: 396,051 usec
- 24 missing
Errors
You will need to check for errors for any pthread_mutex_* functions you use. You may simply exit with the proper error code. You are expected to destroy any locks you create. The given code passes valgrind with no memory leaks; you should not create any.
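For example, a minimal pattern (how you structure the checks is up to you) is to test the return value of each call, since the pthread mutex functions return the error number directly rather than setting errno:

int rc = pthread_mutex_lock(&hash_table->mutex);
if (rc != 0)
    exit(rc);  /* exit with the error code returned by the pthread call */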
Tips
Since this is a lab about concurrency and parallelism, you may want to significantly increase the number of cores given to your virtual machine, or run your code on a Linux machine with more cores.
Grading
There is no grade.py for this lab, because it's performance sensitive. We run your program on a multi-core machine and check that your performance is reasonable (v1 is likely slower than the base, and v2 is likely much faster).
Please see CompEng.gg.
This part of the lab is worth 50 marks.
Common Issues
My Performance Isn't What I Expected
You likely have a data race. You can look at the online tool, or run ThreadSanitizer yourself with the following commands:
meson setup -Db_sanitize=thread --wipe build
meson compile -C build
Then run the tester. If you want to disable ThreadSanitizer, use:
meson setup -Db_sanitize=none --wipe build
meson compile -C build
Note that you should be able to explain WHY you have a data race, and what incorrect behaviour may happen. The sanity check given to you catches only one of the possible unexpected outcomes.
Part 2: Basic Spin Semaphore
Thanks to Yang Su for the initial version of the lab, and Tianchen Zhang for editing and contributing test cases.
You can run cd bss to begin this part of the lab.
Task
You will be implementing a spin semaphore. A semaphore contains a single value that can be incremented by post, or decremented by wait when the value is above zero. When wait encounters a non-positive value, a spin semaphore spins until the value is above zero instead of putting the thread to sleep. Hence, the semaphore you are implementing differs from the POSIX implementation.
You will implement the following two functions in bss/src/bss.c:

void bss_post(int* bss);
void bss_wait(int* bss);

void bss_post(int* bss);
This function should increase the value in the semaphore by one every time it is called.

void bss_wait(int* bss);
This function should decrease the value in the semaphore by one every time it is called. When the value is not positive, it must not decrement it until the value is above zero.
Tips
Note that the semaphore is only ONE integer. Be aware that multiple threads can attempt to update the semaphore at the same time. Therefore, certain portions of the functions need to be atomic. The following API might be useful for your implementation:
__atomic_load_n
__atomic_store_n
__atomic_exchange_n
__atomic_compare_exchange_n
__atomic_fetch_add
__atomic_fetch_sub
You need to specify a memory model for each of these atomic operations. Memory models constrain how memory operations may be reordered by the compiler and the hardware. In a relaxed memory model, memory operations can be freely reordered. On the other hand, a strict memory model requires the memory operations to be executed in the order specified by the program.
In this lab, you may use __ATOMIC_RELEASE on all store operations, __ATOMIC_ACQUIRE on all load operations, and __ATOMIC_ACQ_REL on operations involving both a load and a store. When using compare and exchange, you may use the strong variation by specifying the weak argument as false. Detailed specifications are listed on this webpage: https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
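To illustrate the builtin API only (this is deliberately not a semaphore implementation; the shared variable and function here are made up for the example), a few of these operations on a shared integer look like this:

#include <stdbool.h>

static int shared = 0;

void atomics_example(void)
{
    /* Atomically read the current value. */
    int value = __atomic_load_n(&shared, __ATOMIC_ACQUIRE);

    /* Atomically add one; the value before the addition is returned. */
    __atomic_fetch_add(&shared, 1, __ATOMIC_ACQ_REL);

    /* Compare-and-exchange retry loop: store value + 1 only if shared still
     * equals value; on failure, value is updated to the current contents and
     * we try again. The fourth argument (false) selects the strong variation. */
    while (!__atomic_compare_exchange_n(&shared, &value, value + 1,
                                        false, __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE)) {
        /* value was refreshed by the failed exchange; retry with it. */
    }
}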
Grading
This part of the lab is worth 50 marks; each test case is worth 10 marks. Run meson test -C build to run the test cases for Part 2.
Submission
Simply push your code using git push origin main (or simply git push) to submit it. You need to create your own commits to push; you can use as many as you'd like. You'll need to use the git add and git commit commands. Push as many commits as you want; your latest commit that modifies the lab files counts as your submission. For submission time we will only look at the timestamp on our server. We will never use your commit times (or file access times) as proof of submission, only when you push your code to the course Git server.