Lab 2: Process Manager
Due: September 30 @ 11:59 PM
In this lab you'll create a small library called Subsubprocess
(ssp) that acts as a subreaper in addition to creating and
monitoring processes.
Your implementation should use the concepts learned during the lectures, along
with some new system calls.
As a change, you'll create a library called `libssp` in this lab, not an
executable.
You'll be using Git to submit your work and save your progress.
Setup
Ensure you're in the ece344 directory within VSCode.
Make sure you have the latest skeleton code from us by running:
git pull upstream main.
This will create a merge, which you should be able to do cleanly. If you don't know how to do this read Pro Git. Be sure to read chapter 3.1 and 3.2 fully. This is how software developers coordinate and work together in large projects. For this course, you should always merge to maintain proper history between yourself and the provided repository. You should never rebase in this course, and in general you should never rebase unless you have a good reason to. It will be important to keep your code up-to-date during this lab as the test cases may change with your help.
You can finally run: cd ssp to begin the lab.
Task
You're going to create a process manager similar to the Python
subprocess module in some respects.
Your version is going to be a C library with the following API:
void ssp_init();
int ssp_create(char *const *argv, int fd0, int fd1, int fd2);
void ssp_send_signal(int ssp_id, int signum);
int ssp_get_status(int ssp_id);
void ssp_wait();
void ssp_print();
The description of what each function should do is below:
void ssp_init()
This will always be called once before a user makes any other call to your library. You should initialize or setup anything you need here.
int ssp_create(char *const *argv, int fd0, int fd1, int fd2)
You will create a new process in this function, that new process should
eventually call execvp(argv[0], argv).
You must set file descriptors 0, 1, and 2 to match the arguments fd0,
fd1, and fd2 through calls to dup2.
Afterwards you must then close all other file descriptors except for 0, 1, and
2.
You will not rely on library users to properly manage file descriptors
(and in fact they can't since your library calls
fork).
You may find some interesting file descriptors left open by VSCode.
Your experience with Lab 1 will help you close the other file descriptors, since
each process has an fd directory.
As a hint: within this directory only consider files with d_type set
to DT_LNK.
The execvp wrapper will re-use the current environment, and search
for the program given by argv[0] using the PATH environment
variable (this variable is a list of directories to search for executables).
If execvp fails you should immediately exit with errno set
by the execvp call.
Your library should record the process ID of the newly created process, the
name (you need to copy the argv[0] string, the library user may
re-use the memory for something else, so you can't rely on it), and its status.
You should record the created process and initially set the status to
-1 to indicate it's running.
You should return a unique ssp_id that your library will use to refer
to this created process.
The IDs should be sequential and start with 0.
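One way to keep the bookkeeping described above is a growable array indexed by ssp_id. The sketch below is an assumption about the data layout, not the required design; `struct ssp_entry`, `ssp_table`, and `record_process` are hypothetical names.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical record for one managed process. */
struct ssp_entry {
    pid_t pid;
    char *name;  /* private copy of argv[0] */
    int status;  /* -1 while running */
};

static struct ssp_entry *ssp_table = NULL;
static int ssp_count = 0;
static int ssp_capacity = 0;

/* Append a record and return its index, which doubles as the ssp_id.
 * Because entries are only appended, IDs are sequential and start at 0. */
static int record_process(pid_t pid, const char *name) {
    if (ssp_count == ssp_capacity) {
        int new_capacity = ssp_capacity == 0 ? 8 : ssp_capacity * 2;
        struct ssp_entry *grown =
            realloc(ssp_table, (size_t)new_capacity * sizeof *grown);
        if (grown == NULL) {
            exit(errno);
        }
        ssp_table = grown;
        ssp_capacity = new_capacity;
    }
    /* Copy the name: the caller may reuse its buffer after we return. */
    char *copy = malloc(strlen(name) + 1);
    if (copy == NULL) {
        exit(errno);
    }
    strcpy(copy, name);
    ssp_table[ssp_count].pid = pid;
    ssp_table[ssp_count].name = copy;
    ssp_table[ssp_count].status = -1;
    return ssp_count++;
}
```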
void ssp_send_signal(int ssp_id, int signum)
You should send a signal signum to the process referred to by
ssp_id.
If the process is no longer running, you should not return an error and instead
do nothing.
int ssp_get_status(int ssp_id)
You should return the current status of the process referred to by
ssp_id without blocking.
void ssp_wait()
This function should block and only return when all processes created through
ssp_create terminate.
As a sanity check, all processes should have a status between 0 and
255 after this call completes.
void ssp_print()
This is a non-blocking call that outputs the PID, name, and current
status of every process created through ssp_create.
This should reflect the current state of the processes, so you should query
them in this function.
You should start by printing a header, which will be PID
right-justified with a width of 7 characters, a space, then CMD
left-justified to the width of the longest process name, a space, then
STATUS.
After the header, for each process created by ssp_create you should
output its pid, name, and current status.
Recall that the name is your copied argv[0] string.
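The column layout described above maps fairly directly onto `printf` width specifiers. The sketch below formats into a buffer so it is easy to check. `format_header` and `format_row` are hypothetical helper names, and `name_width` is assumed to be the length of the longest recorded name.

```c
#include <stdio.h>
#include <sys/types.h>

/* Header: PID right-justified to 7, CMD left-justified to name_width. */
static int format_header(char *out, size_t size, int name_width) {
    return snprintf(out, size, "%7s %-*s %s\n",
                    "PID", name_width, "CMD", "STATUS");
}

/* One row per process, using the same widths as the header. */
static int format_row(char *out, size_t size, int name_width,
                      pid_t pid, const char *name, int status) {
    return snprintf(out, size, "%7d %-*s %d\n",
                    (int)pid, name_width, name, status);
}
```

In `ssp_print` the same format strings could be passed straight to `printf`.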
status
The status of each process should match its exit status if it exits normally.
However, if the process terminates through a signal you should set the status
to be equal to the signal number plus 128. Recall that a status of -1
means the process is currently active.
Errors
You need to check for and properly handle errors.
Some errors are expected and should be handled without additional output
or exiting the process.
For fatal errors, you should exit with the errno of the first fatal
error.
Become a Subreaper (20% of the Grade)
This task may make your implementation more complex, or you may have to scrap
your first attempt.
It's advised to complete the other parts of the lab first.
However, when you're ready you should add a call to
prctl(PR_SET_CHILD_SUBREAPER, 1);
in `ssp_init` to become a subreaper.
A subreaper will adopt all orphan processes created by child processes.
In other words, your process will be the new parent when an orphan process
gets re-parented (instead of init).
It'll be your job to call waitpid on any adopted process as soon as
it terminates.
As part of being a subreaper you should record any time an adopted process
terminates.
You should record its pid and status.
For its name you should simply call it "<unknown>".
Add these unknown processes to be displayed when you call ssp_print
after all the processes managed directly by the library.
These processes should be output in the order they terminate.
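To see adoption in action, the sketch below becomes a subreaper and then reaps every child, counting those whose pid it does not recognize. Both helper names are hypothetical, and a real implementation would compare against every recorded pid and store the pid and status of each unknown process instead of only counting them.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: the call the handout asks for, with an error check. */
static void init_subreaper(void) {
    if (prctl(PR_SET_CHILD_SUBREAPER, 1) == -1) {
        exit(errno);
    }
}

/* Hypothetical helper: block until the caller has no children left, and
 * return how many reaped processes were not `known_pid` (i.e. adopted). */
static int reap_all_count_unknown(pid_t known_pid) {
    int unknown = 0;
    for (;;) {
        int wstatus;
        pid_t pid = waitpid(-1, &wstatus, 0);
        if (pid == -1) {
            if (errno == ECHILD) {
                break;    /* nothing left to wait for */
            }
            if (errno == EINTR) {
                continue; /* interrupted by a signal: retry */
            }
            exit(errno);
        }
        if (pid != known_pid) {
            ++unknown;    /* would be recorded as "<unknown>" */
        }
    }
    return unknown;
}
```

Passing -1 as the pid makes `waitpid` accept any child, which is what lets it pick up adopted processes the library never created itself.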
Building
First, make sure you're in the ssp directory if you're not already.
After, run the following commands:
meson setup build
meson compile -C build
Whenever you make changes, you can run the compile command again. You should only need to run setup once.
Testing
You cannot execute your library directly; however, you can run the test programs
manually.
Please find the files in tests/*.c.
You should be able to read and understand what they're doing with your library.
You'll find the executables in build/tests/*.
You may also choose to run the test suite provided with the command:
meson test --print-errorlogs -C build
Grading
Run the ./grade.py script in the directory.
This will rebuild your program, run the tests, and give you a grade out of
100 based on your test results.
Note that these test cases may not be complete, more may be added before the
due date, or there may be hidden test cases.
These labs are new, so we may need to make changes.
Tips
You'll want to read the documentation on some more C functions (some are light syscall wrappers). Some header files you'll need to use are provided for you in the skeleton code. You may include additional parts of the standard library. It's highly recommended to at least use the following functions:
open fdopendir readdir closedir dup2 waitpid fork execvp malloc perror exit
Reading Test Cases for Clarification
Often the best way to understand software is to read code, and labs are a good place to practice.
Each test consists of two parts, a .py file that checks the expected output
and a .c file that compiles with your library. We'll go over the single-cat
test to demonstrate. First, if you just want to run the test by itself, you can
do:
meson test --print-errorlogs tests/single-cat -C build
First, let's read the C file that defines main and uses your library, it's
located at tests/single-cat.c:
#include "ssp.h"
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>
int main() {
    ssp_init();
    char *const cat_argv[] = {
        "cat",
        NULL,
    };
    int pipedes[2];
    int err = pipe(pipedes);
    if (err == -1) {
        exit(errno);
    }
    int cat_id = ssp_create(cat_argv, pipedes[0], 1, 2);
    close(pipedes[0]);
    int status = ssp_get_status(cat_id);
    ssp_print();
    assert(status == -1);
    close(pipedes[1]);
    ssp_wait();
    status = ssp_get_status(cat_id);
    ssp_print();
    assert(status == 0);
    return 0;
}
The test starts by calling ssp_init();, as all tests do. After, it defines
an array of C strings to use as the argv argument of ssp_create and a pipe.
This program wants to create a new cat process, and make file descriptor 0 of
cat point to the read end of this pipe.
When the new process starts running the cat program you created in
ssp_create, it should only have 3 file descriptors open. File descriptor 0
should be the read end of the pipe, 1 should be the same as this process, and 2
should also be the same as this process. You should not modify the caller's file
descriptors; the caller is responsible for them. You should only modify the file
descriptors in your newly created process within ssp_create. ssp_create
returns a new ID, which is how you keep track of processes; it should be 0
in this case.
In this test, the cat process should block until we close our write end of the
pipe. We call ssp_get_status with the ID returned by ssp_create. In this
case it should be 0, and the return value of ssp_get_status should be -1
indicating that the child cat process is still running. It then calls
ssp_print which should print to standard out that the process is still running
as well.
The test then closes the write end of the pipe, which causes the cat program
to terminate and become a zombie. The test calls ssp_wait so that it can
acknowledge the terminated process and the OS can clean it up. Your code in
ssp_wait should read the exit status of the cat process and store it.
Next, the test calls ssp_get_status again for the cat process, which should
now return 0, indicating the process terminated without error. The test calls
ssp_print again which should show the updated exit status.
As long as every assert statement in the test is true, it'll check the output
of all your ssp_print calls. The output checking code is in the test's .py file:
#!/usr/bin/env python3
from base import test
expected = [
    '    PID CMD STATUS\n',
    '        cat -1\n',
    '    PID CMD STATUS\n',
    '        cat 0\n',
]
test(expected)
The only important part is the expected variable, that contains the expected
output of each line of your program. It shows the correct spacing and the
expected output of the CMD and STATUS. Note that it explicitly does not
include any values in the PID by design, you'll likely get a different pid
every time you run the program.
If you're debugging your implementation, it may be best to run the executable directly, outside the testing framework. In order to run this test you can run:
build/tests/single-cat
Since you're just running a normal executable again, you can use tools like
strace to see what's going on. For this lab I'd suggest using the -f option
so you can see system calls from new processes:
strace -f build/tests/single-cat
Submission
Simply push your code using git push origin main (or simply
git push) to submit it.
You need to create your own commits to push; you can use as many
as you'd like.
You'll need to use the git add and git commit commands.
You may push as many commits as you want, your latest commit that modifies
the lab files counts as your submission.
For submission time we will only look at the timestamp on our server.
We will never use your commit times (or file access times) as proof of
submission, only when you push your code to the course Git server.
Common Issues
I Didn't Check for Errors for Every Function
You absolutely should. You may waste hours and hours debugging something because
you didn't detect an error and exit early. For this lab you may find it helpful
to exit with a known exit status like 42, so you know when a child process
has that exit status it was because of a failure. Especially if you're modifying
file descriptors this may be the only easy way to debug.
This includes checking for errors from opendir and readdir. readdir is
especially annoying to check for errors, but you should still do it! If
you read the documentation you'll find you should handle it similar to the
following snippet:
while (1) {
    errno = 0;
    entry = readdir(dir);
    if (entry == NULL) {
        // At this point we don't know if we're done or if there's an error.
        if (errno == 0) {
            // No error, we're done! Stop this crazy loop.
            break;
        }
        else {
            // There was an error! errno is set to indicate what it is.
            // Panic! Exit! Do something!
            exit(errno);
        }
    }
    // Otherwise entry is valid: process it here.
}
Signals Causing System Calls to Fail
We saw this in Lecture 6 in signal-example.c. You may find it annoying or impossible to handle system calls
failing with EINTR (like in signal-example-2.c). An alternative is to add some flags to your struct sigaction,
as below:
new_action.sa_flags = SA_RESTART | SA_NOCLDSTOP;
SA_RESTART will automatically restart most interrupted system calls (so you generally won't see EINTR),
and SA_NOCLDSTOP tells the kernel to only send your process SIGCHLD
when a child terminates (and not when a child stops or resumes).
Signals Missing Children
Note that if a SIGCHLD is sent while you are in the signal handler, a new
signal will not be triggered. Your handler needs to make sure that there are no
more terminated children
before it exits.
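A common way to handle this is to loop inside the handler until `waitpid` reports that no terminated child is left. The sketch below is one such handler body. `drain_children` and `reaped_count` are hypothetical names, and a real library would record each pid and status instead of only counting.

```c
#include <errno.h>
#include <signal.h>
#include <sys/wait.h>

static volatile sig_atomic_t reaped_count = 0;

/* Intended as a SIGCHLD handler: reap every child that has already
 * terminated, since several exits may be covered by a single signal. */
static void drain_children(int signum) {
    (void)signum;
    int saved_errno = errno; /* waitpid may overwrite errno */
    int wstatus;
    /* WNOHANG returns 0 when children exist but none has terminated,
     * and -1 when there are no children at all; both end the loop. */
    while (waitpid(-1, &wstatus, WNOHANG) > 0) {
        ++reaped_count;
    }
    errno = saved_errno;
}
```

Saving and restoring `errno` keeps the handler from disturbing whatever code it interrupted.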
VSCode Broke with the Devcontainer on Windows
There seems to be an issue outlined
here.
Make sure to pull the latest code from upstream that modifies
devcontainer.json.
After, go to File > Preferences, and search for the "Mount Wayland Socket"
option and make sure it's turned off.
It should correspond to the following VSCode settings:
"customizations": {
   "vscode": {
     "settings": {
       "dev.containers.mountWaylandSocket": false
     }
   }
 }
After that press F1 and search for "Rebuild and Reopen Container". Click that and let everything rebuild.