lab07 : Application of graphs to machine learning

num     ready?   description                                   assigned            due
lab07   true     Application of graphs to machine learning     Wed 02/28 09:00AM   Tue 03/12 11:59PM

Lab Goals

This lab offers a real-life application of graphs to a highly relevant field of computer science, machine learning: you will implement the graph algorithms that underlie a neural network.

What this Lab IS NOT about
What this Lab IS about

Collaboration Policy

You may work on this programming assignment with a partner. Make sure you add both of your names at the top of main.cpp, along with your perm numbers. Partnering is highly encouraged: working in groups is a valuable skill, both for the rest of your studies and in industry.

Academic Integrity

All work submitted for this programming assignment must be your own or your partner’s.

Tips

Project Structure

The files you will work in are NeuralNetwork.cpp and Graph.cpp. Graph.cpp/hpp contains the following classes:

NeuralNetwork.cpp/hpp contains the following class:

Other important classes/files that are part of the project are:

Graph Structure

This particular graph structure maintains an adjacency list. The adjacency list uses integer identifiers to refer to each NodeInfo object. So, there is a std::vector<NodeInfo*> nodes member variable which maps an identifier to its corresponding NodeInfo (it holds pointers: see the description of resize in Step 3). The adjacency list is of type std::vector<std::unordered_map<int, Connection>>, that is, a vector of hash maps.
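As a concrete (and simplified) picture of that layout, here is a self-contained sketch of walking one node's out-neighbors through the adjacency list. It is illustrative only: Connection is reduced to a bare weight here, and the real declarations live in Graph.hpp.

```cpp
// Simplified sketch of the adjacency-list layout described above.
// adjacencyList[u] maps each out-neighbor's id to the Connection from u to it.
#include <iostream>
#include <unordered_map>
#include <vector>

struct Connection { double weight; };   // stand-in for the starter code's Connection

int main() {
    std::vector<std::unordered_map<int, Connection>> adjacencyList(7);
    adjacencyList[1][3] = {0.5};   // an edge 1 -> 3 with weight 0.5

    for (const auto& [neighborId, connection] : adjacencyList[1]) {
        std::cout << "1 -> " << neighborId
                  << " (weight " << connection.weight << ")\n";
    }
    return 0;
}
```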

NeuralNetwork Structure

This class inherits from the Graph class. You have not learned what inheritance means yet; for now, it means you can access Graph members within the NeuralNetwork class. This works because a NeuralNetwork is a specific type of graph, much like a binary tree is a type of graph.

Here are the specifications of the neural network structure:

Here is a diagram which outlines the structure of a neural network with three layers: NeuralNetwork Example

The purpose of a neural network is to find patterns in some training dataset in order to make accurate predictions on new, unseen data. In the image above, each $w_i$ and $b_i$ represents a weight and a bias, respectively.

Step by Step Instructions

Step 0: Create a Git Repo and get the starter code

Refer to lab01 for instructions on how to set up a GitHub repository and pull the starter code for this lab. Here is the link for this lab’s starter code: https://github.com/ucsb-cs24-w24/STARTER-lab07

Step 1: Run the executable

Review the Makefile provided and look on top of each file to understand how the code is organized. If you are having trouble, take a look at this Makefile tutorial guide: https://zackglazewski.github.io/UCSBCS24-MakefileIntroduction/

We have provided a simple test suite in test_neuralnet.cpp. For now, running ./test_neuralnet will fail because some important functions have not been implemented yet. Once you complete Steps 2 and 3, try playing around with main.cpp, which loads a pretrained neural network model and tests its accuracy.

Step 2: Implement Getters and Setters

Your first step is implementing the getters and setters for the Graph and NeuralNetwork classes.

The following getters and setters need to be completed:

Despite their names, NeuralNetwork::eval() and NeuralNetwork::train() actually work to set the evaluating member of NeuralNetwork. When NeuralNetwork::eval is called, the neural network should be put in “evaluation mode”, where evaluating is set to true. Likewise, when NeuralNetwork::train is called, the neural network should be put in “training mode”, where evaluating is set to false.
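For instance, assuming eval and train take no arguments and return void (check NeuralNetwork.hpp for the exact signatures), their bodies can be as small as:

```cpp
// Minimal sketch of the two mode setters described above; `evaluating` is the
// NeuralNetwork member this step refers to. Not compilable on its own: it
// relies on the class definition in NeuralNetwork.hpp.
void NeuralNetwork::eval()  { evaluating = true;  }   // evaluation mode
void NeuralNetwork::train() { evaluating = false; }   // training mode
```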

Step 3: Implement Graph methods

The next step is implementing some of the class methods for the Graph class.

Notice the function void Graph::resize(int size). It must be called before adding to or updating any part of the structure.

For example, calling resize(20) will initialize nodes to be a vector of size 20, where every element is a nullptr. Likewise, adjacencyList will be a vector of 20 empty maps. When writing these functions, assume that resize has already been called: it happens in the NeuralNetwork’s constructors.
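To make that precondition concrete, here is roughly what resize leaves you with. This is a sketch, not the starter code; it assumes nodes holds pointers (since every element starts out as nullptr) and relies on the class definition in Graph.hpp.

```cpp
// Sketch of resize(int size): after the call,
//   nodes         has `size` slots, each initially nullptr (no NodeInfo yet)
//   adjacencyList has `size` empty maps (no connections yet)
void Graph::resize(int size) {
    nodes.assign(size, nullptr);
    adjacencyList.assign(size, std::unordered_map<int, Connection>{});
}
```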

You will need to implement the following functions:

One indication that you have implemented the functions correctly so far is that the test_structure test case (./test_neuralnet 4) should pass.

Step 4: Implement Predict with BFT

The next two steps will be the most challenging part of the programming assignment. We recommend taking the self-test assignment on Gradescope before implementing these functions.

You will need to implement a BFT (Breadth First Traversal) algorithm for:

To implement a BFT, you should not use the NeuralNetwork::layers member variable. Instead, implement it using a queue as stated in the starter code of the predict function.

A BFT is required for this function because of how input flows through a neural network, which is discussed further below.

We have provided you two important functions that take care of the math for you:

Your job is to write the BFT algorithm that visits nodes and connections in the right order.

Here is a rundown of how the prediction algorithm works.

NeuralNetwork Example

For a neural network, the collection of weights and biases serves as the model, that is, the mechanism by which we make a prediction. We insert the input into each node of the input layer. The network in this diagram takes two inputs: since there are two nodes in the input layer, each node gets a different input. The value is then transformed by each weight as it “flows” to the next layer. Once it reaches the next layer, it is transformed again by the bias and the activation function, in that order. The same process repeats as the value flows onward. In this example, it flows until it has been transformed by the output layer, which leaves the value in a state we interpret as a prediction.

For any given layer, we first visit all of its nodes and its connections to the next layer before visiting the nodes and connections of the next layer. Here is how input flows from the input layer to the hidden layer in our example:

  1. Assume the node with id 1 contains the input value $x_1$, and the node with id 2 contains the input value $x_2$.
  2. To compute $h_1$, the value for node 3, we first accumulate the weighted sum: \(h_1 = x_1w_1 + x_2w_4\)
  3. Next, we add the bias term: \(h_1 = x_1w_1 + x_2w_4 + b_1\)
  4. Finally, activate the node with the activation function for its layer, which we will call activate:
\[h_1 = \text{activate}(x_1w_1 + x_2w_4 + b_1)\]

Using these steps, we can compute the value for every other node, including the output node ($y_1$), which will be our prediction.
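For instance, plugging purely illustrative numbers into the steps above (these values are not from the starter code), say $x_1 = 1$, $x_2 = 2$, $w_1 = 0.5$, $w_4 = -0.25$, $b_1 = 0.1$, and assuming for the sake of the example that the layer's activation is ReLU:

\[h_1 = \text{ReLU}(1 \cdot 0.5 + 2 \cdot (-0.25) + 0.1) = \text{ReLU}(0.1) = 0.1\]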

NOTE: Keep these two facts in mind when writing your prediction algorithm.

The other important thing to realize is that a layer must wait for the computation of the previous layer to finish, hence the need for a BFT traversal.
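The starter code tells you exactly which helpers to call, but the traversal skeleton itself is an ordinary breadth-first walk over the adjacency list. Here is a self-contained sketch of that skeleton on a graph shaped like the diagram above; the node ids, placeholder weights, and printed messages are illustrative stand-ins, not the starter code.

```cpp
// Standalone BFT skeleton over an adjacency list shaped like the Graph class
// (a vector of hash maps). In the real predict(), the print statements would
// be replaced by calls to the provided visit helpers.
#include <iostream>
#include <queue>
#include <unordered_map>
#include <vector>

int main() {
    // Edges of the example network: nodes 1,2 (input) -> 3,4,5 (hidden) -> 6 (output).
    std::vector<std::unordered_map<int, double>> adjacencyList(7);
    for (int in : {1, 2})
        for (int hidden : {3, 4, 5})
            adjacencyList[in][hidden] = 0.5;   // placeholder weights
    for (int hidden : {3, 4, 5})
        adjacencyList[hidden][6] = 0.5;

    std::queue<int> frontier;
    std::vector<bool> enqueued(adjacencyList.size(), false);
    for (int id : {1, 2}) {                    // start from the input layer
        frontier.push(id);
        enqueued[id] = true;
    }

    while (!frontier.empty()) {
        int current = frontier.front();
        frontier.pop();
        // Here predict() would apply the bias and activation to `current`.
        std::cout << "visit node " << current << "\n";

        for (const auto& [neighbor, weight] : adjacencyList[current]) {
            // Here predict() would apply `weight` as the value flows to `neighbor`.
            std::cout << "  visit connection " << current << " -> " << neighbor
                      << " (weight " << weight << ")\n";
            if (!enqueued[neighbor]) {
                frontier.push(neighbor);
                enqueued[neighbor] = true;
            }
        }
    }
    return 0;
}
```

Because every node in a layer is enqueued before any node in the next layer, each layer finishes its work before the next one starts, which is exactly the ordering described above.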

Step 5: Implement Contribute with DFT

Your next task is to implement a DFT (Depth First Traversal) algorithm for:

A DFT is required for this process because of how errors are “back-propagated” through the neural network.

When a neural network makes a prediction, there is a sense of “how bad” the prediction was. The neural network has a mechanism for finding this out, which we’ve abstracted behind the visit functions. Curious students are encouraged to check the Gentle Introduction to Neural Networks tutorial for more details, but this is not required to understand the lab.

Changing our weights and biases causes a different prediction, and the question to be answered is: how do we change these weights and biases to make a better prediction?

The error begins in the output layer. The output layer uses fancy math to figure out how much to change its weights and biases. Then, a part of that error is passed down to the previous layer, which performs the same procedure. This “back-propagation” continues until we reach the input layer and its weights are updated.

Here is a helpful gif to illustrate both the prediction flow and back-propagation flow:

Prediction and Backprop

Why do we use DFT for back-propagation? There are a couple of reasons:

  1. The error received in one layer depends on some new error that the next layer computes and passes down. The error a node receives from its out-neighbors is an “incoming contribution” and the error that a node passes down to its in-neighbors is its “outgoing contribution”.
  2. The connections are only forward-directed; you cannot traverse backward in the network without the help of a recursive call.

NeuralNetwork Example

Here is an example, referring back to our original neural network. Suppose we are currently at node 3:

  1. Iterate through all of node 3’s out-neighbors. For each neighbor:
    • Get the neighbor’s incoming contribution by recursively calling contribute on it; node 3’s only out-neighbor is node 6.
    • Using that incoming contribution, visit the connection corresponding to that neighbor.
    • In this case, that is the connection with weight $w_7$.
  2. Visit node 3 and pass in your running outgoing contribution to be updated.

Now, an example using node 1:

  1. Get contribution from node 3, and visit the connection with $w_1$.
  2. Get contribution from node 4, and visit the connection with $w_2$.
  3. Get contribution from node 5, and visit the connection with $w_3$.
  4. Visit node 1 and pass in your running outgoing contribution to be updated.

We have provided you with two important functions that take care of the math for you:

Notice the incomingContribution and outgoingContribution variables defined in the function. During your DFT, incomingContribution represents the recursive result of calling contribute on a neighbor. outgoingContribution is updated throughout the recursive call and is eventually returned, becoming the new incomingContribution for the previous layer.

In place of a “found” set, there is a map called contributions. This acts as a way to keep track of nodes that we have already visited, as well as store their previously computed contributions.

Your job is to write the DFT algorithm that visits nodes and connections in the right order.
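Structurally, contribute is a memoized depth-first recursion over the adjacency list. Below is a self-contained sketch of that shape on the same example network; the placeholder weights, the seeded output contribution, and the way the weight is folded in are illustrative stand-ins for the provided visit helpers and the starter code's types.

```cpp
// Standalone sketch of a memoized DFT shaped like contribute().
#include <iostream>
#include <unordered_map>
#include <vector>

// adjacencyList[u] maps each out-neighbor of u to a placeholder weight.
std::vector<std::unordered_map<int, double>> adjacencyList;
// Plays the role described above: both a "found" set and a cache of each
// node's already-computed contribution.
std::unordered_map<int, double> contributions;

double contribute(int node) {
    auto found = contributions.find(node);
    if (found != contributions.end())          // already visited: reuse its value
        return found->second;

    double outgoingContribution = 0.0;
    for (const auto& [neighbor, weight] : adjacencyList[node]) {
        // 1. Recurse first: the neighbor's outgoing contribution becomes our
        //    incoming contribution.
        double incomingContribution = contribute(neighbor);
        // 2. "Visit" the connection. Folding the weight in directly is only a
        //    stand-in for the provided connection-visit helper.
        outgoingContribution += incomingContribution * weight;
    }

    // 3. "Visit" the node itself (stand-in for the provided node-visit helper),
    //    then record the result so the node is processed only once.
    contributions[node] = outgoingContribution;
    return outgoingContribution;
}

int main() {
    // Same shape as the example network: 1,2 -> 3,4,5 -> 6.
    adjacencyList.assign(7, {});
    for (int in : {1, 2})
        for (int hidden : {3, 4, 5})
            adjacencyList[in][hidden] = 0.5;
    for (int hidden : {3, 4, 5})
        adjacencyList[hidden][6] = 0.5;

    contributions[6] = 1.0;   // seed the output node's contribution (illustrative)
    std::cout << "contribution of node 1: " << contribute(1) << "\n";   // prints 0.75
    return 0;
}
```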

When you are ready to proceed with this step, see the following section, which will give a brief background of back-propagation:

Step 6: Implement Update

After calling contribute, the update function assumes (precondition) that the delta of every node and connection has been accumulated. Now, each delta must be applied to every node’s bias and every connection’s weight.

Your goal in update is to traverse the graph (in any way you want) to update every weight and bias. A single call to update affects every node and connection and resets the $delta$ values to zero.

For each update, follow these steps:

  1. To update the bias: $bias_{new} = bias_{old} - (lr * delta)$
  2. To update the weight: $weight_{new} = weight_{old} - (lr * delta)$
  3. Reset the $delta$ values for each node and connection to zero.

where $lr$ is the learning rate (a member variable of NeuralNetwork), which controls how impactful we consider contributions to be.
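As a rough picture of that pass, here is a self-contained sketch. The NodeInfo and Connection structs below, including their field names, are simplified assumptions; in the real update() you would use the starter code's types, the Graph members, and the NeuralNetwork's learning-rate member.

```cpp
#include <unordered_map>
#include <vector>

// Simplified stand-ins for the starter code's node and connection types.
struct NodeInfo   { double bias   = 0.0; double delta = 0.0; };
struct Connection { double weight = 0.0; double delta = 0.0; };

// One update pass: apply every accumulated delta, then reset it to zero.
void update(std::vector<NodeInfo>& nodes,
            std::vector<std::unordered_map<int, Connection>>& adjacencyList,
            double lr) {
    for (std::size_t id = 0; id < nodes.size(); ++id) {
        nodes[id].bias -= lr * nodes[id].delta;     // bias_new = bias_old - (lr * delta)
        nodes[id].delta = 0.0;                      // reset the node's delta
        for (auto& [neighborId, connection] : adjacencyList[id]) {
            connection.weight -= lr * connection.delta;   // weight_new = weight_old - (lr * delta)
            connection.delta = 0.0;                       // reset the connection's delta
        }
    }
}

int main() {
    std::vector<NodeInfo> nodes(2);
    std::vector<std::unordered_map<int, Connection>> adjacencyList(2);
    nodes[0].delta = 0.4;
    adjacencyList[0][1].delta = 0.2;
    update(nodes, adjacencyList, 0.1);   // bias of node 0 -> -0.04, weight 0->1 -> -0.02
    return 0;
}
```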

Step 7: Test and Submit

test_neuralnet.cpp is provided as a brief way to check your neural network. Please test your code against this file and make sure you pass all the tests before submitting to Gradescope.

If you are having issues, it is helpful to run a debugger like gdb to make sure your code is following the logic described above.

Step 8: Rest

Take a break! You’ve done well.

External Resources

Graphviz

The Graph class overloads the << operator and outputs the graph in dot format. Copy the portion that looks like digraph G {...} and paste it here to visualize the graph:

BFS and DFS
Neural Networks

Here are some great resources to help you out. I would say that StatQuest has great videos for understanding how neural network structures are implemented, whereas 3Blue1Brown is more conceptual:

Credits

This assignment was conceived and created by UCSB CS ULA, Zackary Glazewski in consultation with Diba Mirza for CS 24. Thanks to the UCSB CS24 teaching staff for reviewing and providing feedback: Torin Schlunk, Mehak Dhaliwal, Nawel Alioua, Joseph Ng, Shinda Huang, Xinlei Feng, Yaoyi Bai, Ally Chu, and Sanjana Shankar.

CC BY-NC-SA 2.0, Zackary Glazewski and Diba Mirza, Feb 2024.