Lab 2: Process Manager

Due: September 30 @ 11:59 PM

In this lab you'll create a small library called Subsubprocess (ssp) that acts as a subreaper in addition to creating and monitoring processes. Your implementation should use the concepts learned during the lectures, along with some new system calls. As a change, you'll create a library called `libssp` in this lab, not an executable. You'll be using Git to submit your work and save your progress.

Setup

Ensure you're in the ece344 directory within VSCode. Make sure you have the latest skeleton code from us by running: git pull upstream main.

This will create a merge, which you should be able to do cleanly. If you don't know how to do this read Pro Git. Be sure to read chapter 3.1 and 3.2 fully. This is how software developers coordinate and work together in large projects. For this course, you should always merge to maintain proper history between yourself and the provided repository. You should never rebase in this course, and in general you should never rebase unless you have a good reason to. It will be important to keep your code up-to-date during this lab as the test cases may change with your help.

You can finally run: cd ssp to begin the lab.

Task

You're going to create a process manager similar to the Python subprocess module in some respects. Your version is going to be a C library with the following API:

void ssp_init();
int ssp_create(char *const *argv, int fd0, int fd1, int fd2);
void ssp_send_signal(int ssp_id, int signum);
int ssp_get_status(int ssp_id);
void ssp_wait();
void ssp_print();

The description of what each function should do is below:

`void ssp_init()`

This will always be called once before a user makes any other call to your library. You should initialize or setup anything you need here.

`int ssp_create(char *const *argv, int fd0, int fd1, int fd2)`

You will create a new process in this function, and that new process should eventually call execvp(argv[0], argv). Before that call, you must set file descriptors 0, 1, and 2 to match the arguments fd0, fd1, and fd2 through calls to dup2. Afterwards, you must close all other file descriptors except for 0, 1, and 2.

You will not rely on library users to properly manage file descriptors (and in fact they can't since your library calls fork). You may find some interesting file descriptors left open by VSCode. Your experience with Lab 1 will help you close the other file descriptors, since each process has an fd directory. As a hint: within this directory only consider files with d_type set to DT_LNK.
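The hint above can be sketched roughly as follows. This is a minimal illustration, not the required implementation: the helper name `close_extra_fds` is hypothetical, and it assumes the fd directory is reachable as /proc/self/fd on Linux. It is meant to run in the child, after the dup2 calls and before execvp.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* for DT_LNK on strict compilers */
#endif
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: close every descriptor above 2 in the calling process.
 * Assumes the process's fd directory is available at /proc/self/fd. */
void close_extra_fds(void) {
    int dir_fd = open("/proc/self/fd", O_RDONLY | O_DIRECTORY);
    if (dir_fd == -1) {
        exit(errno);
    }
    DIR *dir = fdopendir(dir_fd);
    if (dir == NULL) {
        exit(errno);
    }
    for (;;) {
        errno = 0;
        struct dirent *entry = readdir(dir);
        if (entry == NULL) {
            if (errno != 0) {
                exit(errno); /* a real readdir error */
            }
            break; /* end of directory */
        }
        if (entry->d_type != DT_LNK) {
            continue; /* skips "." and ".." */
        }
        int fd = atoi(entry->d_name);
        /* Keep 0, 1, 2, and the descriptor we are iterating with. */
        if (fd > 2 && fd != dir_fd) {
            close(fd);
        }
    }
    closedir(dir); /* also closes dir_fd */
}
```

Note that the directory stream itself occupies a descriptor, which is why the sketch skips `dir_fd` during the loop and lets closedir release it at the end.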

The execvp wrapper will reuse the current environment and search for the program given by argv[0] using the PATH environment variable (this variable is a list of directories to search for executables). If execvp fails, you should immediately exit with the errno set by the execvp call. Your library should record the process ID of the newly created process, its name (you need to copy the argv[0] string, since the library user may reuse that memory for something else), and its status.
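As a rough sketch of the fork, dup2, and execvp sequence described above, consider the helper below. The name `spawn_with_fds` is hypothetical, and copying the arguments to descriptors at or above 3 with fcntl(F_DUPFD) is my own assumption for avoiding clobbering (for example when fd1 is 0), not something the lab requires. Closing the remaining descriptors is left out here.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not part of the required API: fork a child, wire up
 * fds 0..2, and exec argv[0]. Returns the child's pid to the parent. */
pid_t spawn_with_fds(char *const *argv, int fd0, int fd1, int fd2) {
    pid_t pid = fork();
    if (pid == -1) {
        exit(errno); /* treated as fatal in this sketch */
    }
    if (pid > 0) {
        return pid; /* parent: the caller records the pid */
    }

    /* Child. Copy the sources to descriptors >= 3 first, so an argument
     * such as fd1 == 0 is not overwritten by an earlier dup2. */
    int src[3] = {fd0, fd1, fd2};
    int tmp[3];
    for (int i = 0; i < 3; i++) {
        tmp[i] = fcntl(src[i], F_DUPFD, 3);
        if (tmp[i] == -1) {
            exit(errno);
        }
    }
    for (int i = 0; i < 3; i++) {
        if (dup2(tmp[i], i) == -1) {
            exit(errno);
        }
        close(tmp[i]);
    }
    /* Closing every other descriptor would go here. */
    execvp(argv[0], argv);
    exit(errno); /* only reached if execvp failed */
}
```

With this shape, a parent that later waits on a child whose program could not be found sees a nonzero exit status equal to the errno from execvp.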

You should record the created process and initially set the status to -1 to indicate it's running. You should return a unique ssp_id that your library will use to refer to this created process. The IDs should be sequential and start with 0.

void ssp_send_signal(int ssp_id, int signum)

You should send a signal signum to the process referred to by ssp_id. If the process is no longer running, you should not return an error and instead do nothing.
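A minimal sketch of this behaviour, assuming the library already knows the pid and recorded status for the given ssp_id. The helper name `signal_if_running` is hypothetical, and treating ESRCH as a non-error is my assumption about what "do nothing" should cover.

```c
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: `status` is the library's recorded status for `pid`. */
void signal_if_running(pid_t pid, int status, int signum) {
    if (status != -1) {
        return; /* already terminated: silently do nothing */
    }
    if (kill(pid, signum) == -1 && errno != ESRCH) {
        exit(errno); /* anything but "no such process" is unexpected */
    }
}
```

Checking the recorded status first also avoids signalling a pid that has been reaped and possibly reused by an unrelated process.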

int ssp_get_status(int ssp_id)

You should return the current status of the process referred to by ssp_id without blocking.
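One way to check on a child without blocking is waitpid with the WNOHANG option, sketched below. The helper name `try_reap` is hypothetical; converting the raw wait status into the lab's status value is a separate step.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 and fills *wstatus if `pid` has terminated,
 * or 0 if it is still running. Never blocks. */
int try_reap(pid_t pid, int *wstatus) {
    pid_t r = waitpid(pid, wstatus, WNOHANG);
    if (r == -1) {
        exit(errno);
    }
    return r == pid; /* waitpid returns 0 when nothing has changed yet */
}
```

A process can only be reaped once, so the result should be stored; later calls for the same ssp_id would return the stored status rather than calling waitpid again.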

void ssp_wait()

This function should block and only return when all processes created through ssp_create terminate. As a sanity check, all processes should have a status between 0 and 255 after this call completes.

void ssp_print()

This is a non-blocking call that outputs the PID, name, and current status of every process created through ssp_create. This should reflect the current state of the processes, so you should query them in this function.

You should start by printing a header, which will be PID right-justified with a width of 7 characters, a space, then CMD left-justified to the width of the longest process name, a space, then STATUS. After the header, for each process created by ssp_create you should output its pid, name, and current status. Recall that the name is your copied argv[0] string.
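The layout described above maps fairly directly onto printf-style format strings. The sketch below formats into a buffer only so that the result is easy to inspect; the helper names `format_header` and `format_row` are hypothetical, and `width` is assumed to be the length of the longest process name.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helpers; a real ssp_print would likely call printf directly.
 * `width` is the length of the longest recorded process name. */
int format_header(char *buf, size_t n, int width) {
    /* %7s right-justifies PID; %-*s left-justifies CMD to `width`. */
    return snprintf(buf, n, "%7s %-*s %s\n", "PID", width, "CMD", "STATUS");
}

int format_row(char *buf, size_t n, pid_t pid, const char *name, int width,
               int status) {
    return snprintf(buf, n, "%7d %-*s %d\n", (int)pid, width, name, status);
}
```

The `*` in `%-*s` takes the field width from an argument, which is convenient because the longest name is only known at run time.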

status

The status of each process should match its exit status if it exits normally. However, if the process terminates through a signal you should set the status to be equal to the signal number plus 128. Recall that a status of -1 means the process is currently active.
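The mapping above can be expressed with the standard wait macros. This is a small sketch; the helper name `decode_status` is hypothetical, and it takes the raw status value filled in by waitpid.

```c
#include <sys/wait.h>

/* Hypothetical helper: map a raw waitpid status to the lab's convention. */
int decode_status(int wstatus) {
    if (WIFEXITED(wstatus)) {
        return WEXITSTATUS(wstatus); /* normal exit: 0..255 */
    }
    if (WIFSIGNALED(wstatus)) {
        return 128 + WTERMSIG(wstatus); /* killed by a signal */
    }
    return -1; /* neither exited nor signalled: still active */
}
```

For instance, a process killed by SIGKILL (signal 9) would be reported with status 137.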

Errors

You need to check for and properly handle errors. Some errors are expected and should be handled without additional output or exiting the process. For fatal errors, you should exit with the errno of the first fatal error.

Become a Subreaper (20% of the Grade)

This task may make your implementation more complex, or you may have to scrap your first attempt. It's advised to complete the other parts of the lab first. However, when you're ready you should add a call to prctl(PR_SET_CHILD_SUBREAPER, 1); in `ssp_init` to become a subreaper.

A subreaper will adopt all orphan processes created by child processes. In other words, your process will be the new parent when an orphan process gets re-parented (instead of init). It'll be your job to call waitpid on any adopted process as soon as they terminate.

As part of being a subreaper you should record any time an adopted process terminates. You should record its pid and status. For its name you should simply call it "<unknown>". Add these unknown processes to be displayed when you call ssp_print after all the processes managed directly by the library. These processes should be output in the order they terminate.
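To illustrate how adoption looks from the subreaper's side, the sketch below sets the subreaper attribute and then waits for any child, direct or adopted. The helper names `become_subreaper` and `reap_any` are hypothetical, and the blocking wait is only for demonstration; a real implementation would need to notice terminations promptly and without blocking.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Ask the kernel to re-parent orphaned descendants to this process. */
void become_subreaper(void) {
    if (prctl(PR_SET_CHILD_SUBREAPER, 1) == -1) {
        exit(errno);
    }
}

/* Hypothetical helper: block until any child (direct or adopted) terminates.
 * Returns its pid, or -1 once there are no children left (ECHILD). */
pid_t reap_any(int *wstatus) {
    pid_t pid = waitpid(-1, wstatus, 0);
    if (pid == -1 && errno != ECHILD) {
        exit(errno);
    }
    return pid;
}
```

A pid returned here that does not match any process created through ssp_create belongs to an adopted process, which is the case the "<unknown>" name is for.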

Building

First, make sure you're in the ssp directory if you're not already. After, run the following commands:

meson setup build
meson compile -C build

Whenever you make changes, you can run the compile command again. You should only need to run setup once.

Testing

You cannot execute your library directly, however you can run the test programs manually. Please find the files in tests/*.c. You should be able to read and understand what they're doing with your library. You'll find the executables in build/tests/*.

You may also choose to run the test suite provided with the command:

meson test --print-errorlogs -C build

Grading

Run the ./grade.py script in the directory. This will rebuild your program, run the tests, and give you a grade out of 100 based on your test results. Note that these test cases may not be complete: more may be added before the due date, or there may be hidden test cases. These labs are new, so we may need to make changes.

Tips

You'll want to read the documentation on some more C functions (some are light syscall wrappers). Some header files you'll need to use are provided for you in the skeleton code. You may include additional parts of the standard library. It's highly recommended to at least use the following functions:

open fdopendir readdir closedir dup2 waitpid fork execvp malloc perror exit

Reading Test Cases for Clarification

Often the best way to understand software is to read code, and the labs are a good place to practice.

Each test consists of two parts, a .py file that checks the expected output and a .c file that compiles with your library. We'll go over the single-cat test to demonstrate. First, if you just want to run the test by itself, you can do:

meson test --print-errorlogs tests/single-cat -C build

First, let's read the C file that defines main and uses your library, it's located at tests/single-cat.c:

#include "ssp.h"

#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

int main() {
    ssp_init();

    char *const cat_argv[] = {
        "cat",
        NULL,
    };

    int pipedes[2];
    int err = pipe(pipedes);
    if (err == -1) {
        exit(errno);
    }
    int cat_id = ssp_create(cat_argv, pipedes[0], 1, 2);
    close(pipedes[0]);

    int status = ssp_get_status(cat_id);
    ssp_print();
    assert(status == -1);

    close(pipedes[1]);

    ssp_wait();
    status = ssp_get_status(cat_id);
    ssp_print();
    assert(status == 0);

    return 0;
}

The test starts by calling ssp_init();, as all tests do. After, it defines an array of C strings to use as the argv argument of ssp_create and a pipe. This program wants to create a new cat process, and make file descriptor 0 of cat point to the read end of this pipe.

When the new process you created in ssp_create starts running the cat program, it should only have 3 file descriptors open. File descriptor 0 should be the read end of the pipe, 1 should be the same as this process, and 2 should also be the same as this process. You should not modify the caller's file descriptors; the caller is responsible for them. You should only modify the file descriptors in your newly created process within ssp_create. ssp_create returns a new ID, which is how you keep track of processes; it should be 0 in this case.

In this test, the cat process should block until we close our write end of the pipe. We call ssp_get_status with the ID returned by ssp_create. In this case it should be 0, and the return value of ssp_get_status should be -1 indicating that the child cat process is still running. It then calls ssp_print which should print to standard out that the process is still running as well.

The test then closes the write end of the pipe, this causes the cat program to terminate and become a zombie. The test calls ssp_wait so that it can acknowledge the terminated process and the OS can clean it up. Your code in ssp_wait should read the exit status of the cat process and store it.

Next, the test calls ssp_get_status again for the cat process, which should now return 0, indicating the process terminated without error. The test calls ssp_print again, which should show the updated exit status.

As long as every assert statement in the test is true, it'll check the output of all your ssp_print calls. The output checking code is in the test's .py file:

#!/usr/bin/env python3

from base import test

expected = [
    '    PID CMD STATUS\n',
    '        cat -1\n',
    '    PID CMD STATUS\n',
    '        cat 0\n',
]

test(expected)

The only important part is the expected variable, that contains the expected output of each line of your program. It shows the correct spacing and the expected output of the CMD and STATUS. Note that it explicitly does not include any values in the PID by design, you'll likely get a different pid every time you run the program.

If you're debugging your implementation, it may be best to run the executable directly, outside the testing framework. In order to run this test you can run:

build/tests/single-cat

Since you're just running a normal executable again, you can use tools like strace to see what's going on. For this lab I'd suggest using the -f option so you can see system calls from new processes:

strace -f build/tests/single-cat

Submission

Simply push your code using git push origin main (or simply git push) to submit it. You need to create your own commits to push, you can use as many as you'd like. You'll need to use the git add and git commit commands. You may push as many commits as you want, your latest commit that modifies the lab files counts as your submission. For submission time we will only look at the timestamp on our server. We will never use your commit times (or file access times) as proof of submission, only when you push your code to the course Git server.

Common Issues

I Didn't Check for Errors for Every Function

You absolutely should. You may waste hours and hours debugging something because you didn't detect an error and exit early. For this lab you may find it helpful to exit with a known exit status like 42, so you know when a child process has that exit status it was because of a failure. Especially if you're modifying file descriptors this may be the only easy way to debug.

This includes checking for errors from opendir, and readdir. readdir is especially annoying to check for errors, but you should still do it! If you read the documentation you'll find you should handle it similar to the following snippet:

while (1) {
    errno = 0;
    entry = readdir(dir);
    if (entry == NULL) {
        // At this point we don't know if we're done or if there's an error.
        if (errno == 0) {
            // No error, we're done! Stop this crazy loop.
            break;
        }
        else {
            // There was an error! errno is set to indicate what it is.
            // Panic! Exit! Do something!
            exit(errno);
        }
    }
    // Otherwise entry is valid, use it here.
}

Signals Causing System Calls to Fail

We saw this in Lecture 6 in signal-example.c, you may find it annoying or impossible to handle system calls failing with EINTR (like in signal-example-2.c). An alternative is to add some flags to your struct sigaction, as below:

new_action.sa_flags = SA_RESTART | SA_NOCLDSTOP;

SA_RESTART will automatically restart system calls (so you won't see EINTR), and SA_NOCLDSTOP tells the kernel to send your process SIGCHLD only when a child terminates (and not when it stops or resumes).

Signals Missing Children

Note that if a SIGCHLD is sent while you are in the signal handler, a new signal will not be triggered. Your handler needs to make sure that there are no more terminated children before it exits.

VSCode Broke with the Devcontainer on Windows

There seems to be an issue outlined here. Make sure to pull the latest code from upstream that modifies devcontainer.json. After, go to File > Preferences, and search for the "Mount Wayland Socket" option and make sure it's turned off. It should correspond to the following VSCode settings:

"customizations": {
   "vscode": {
     "settings": {
       "dev.containers.mountWaylandSocket": false
     }
   }
 }

After that press F1 and search for "Rebuild and Reopen Container". Click that and let everything rebuild.
