pa03 : Application of graphs to machine learning

num | ready? | description | assigned | due
pa03 | true | Application of graphs to machine learning | Wed 02/25 09:00AM | Fri 03/13 11:59PM

Required video watching before you start the lab

Watch this video before starting for a visual introduction to neural networks and how values flow forward through them.

Video 1 (19 min): 3Blue1Brown: But what is a neural network?

Lab Goals

This lab offers a real-life application of graphs to machine learning, a highly relevant field of computer science, by having you implement the underlying graph algorithms of a neural network.

What this Lab IS NOT about
What this Lab IS about

Collaboration Policy

You may work on this programming assignment with a partner. Make sure you add both your names, along with your perm numbers, at the top of main.cpp. Partnering is highly encouraged: working in a group is a valuable skill to practice, both in this course and in industry.

Academic Integrity

All work submitted for this programming assignment must be your own or your partner's.

Tips

Project Structure

The files you will work in are NeuralNetwork.cpp and Graph.cpp. Graph.cpp/hpp contains the following classes:

Why is Connection a class and not just a weight? In a simple weighted graph, an edge can be represented as a single number. But during training, each connection needs to store additional state: the current weight and a delta term that accumulates how much the weight should change. Since an edge needs to carry mutable state across multiple phases (forward pass, backward pass, update), it is modeled as an object rather than a plain number. This is a common pattern when edges in a graph carry rich metadata.

NeuralNetwork.cpp/hpp contains the following class:

Other important classes/files that are part of the project are:

Graph Structure

This particular graph structure maintains an adjacency list. The adjacency list uses integer identifiers to refer to each NodeInfo object. So, there is a std::vector<NodeInfo*> nodes member variable which maps an identifier to its corresponding NodeInfo. Note that the vector stores pointers to NodeInfo objects — this is why updateNode is required to allocate on the heap. The adjacency list is of type: std::vector<std::unordered_map<int, Connection>>, that is, a vector of hashmaps.

NeuralNetwork Structure

This class inherits from the Graph class. You have not learned what inheritance means yet; for now, it means you can access Graph members within the NeuralNetwork class. This works because a NeuralNetwork is a specific type of graph, much like how a binary tree is a type of graph.

Here are the specifications of the neural network structure:

Here is a diagram which outlines the structure of a neural network with three layers: NeuralNetwork Example

The purpose of a neural network is to find a pattern of some training dataset in order to make accurate predictions on new, unseen data. In the image above, each $w_i$ and $b_i$ represent something called a weight and bias, respectively.

Step by Step Instructions

Step 0: Create a Git Repo and get the starter code

Refer to lab01 for instructions on how to set up a GitHub repository and pull the starter code for this lab. Here is the link for this lab’s starter code: https://github.com/ucsb-cs24-w26/STARTER-pa03

Step 1: Run the executable

Review the Makefile provided and look on top of each file to understand how the code is organized. If you are having trouble, take a look at this Makefile tutorial guide: https://zackglazewski.github.io/UCSBCS24-MakefileIntroduction/

We provide a simple test suite in test_neuralnet.cpp. For now, running ./test_neuralnet will fail because some important functions have not been implemented yet. Once you complete steps 2 and 3, try playing around with main.cpp, which loads and tests the accuracy of a pretrained neural network model.

Step 2: Implement Getters and Setters

Your first step is implementing the getters and setters for the Graph and NeuralNetwork classes.

The following getters and setters need to be completed:

Despite their names, NeuralNetwork::eval() and NeuralNetwork::train() simply set the evaluating member of NeuralNetwork. When NeuralNetwork::eval is called, the neural network should be put in “evaluation mode”, where evaluating is set to true. Likewise, when NeuralNetwork::train is called, the neural network should be put in “training mode”, where evaluating is set to false.

Step 3: Implement Graph methods

The next step is implementing some of the class methods for the Graph class.

Notice the function void Graph::resize(int size). Before adding or updating any part of the structure, resize must be called first.

For example, calling resize(20) will initialize nodes to be a vector of size 20, where every element is a nullptr. Likewise, adjacencyList will be a vector of 20 empty maps. When writing these functions, assume that resize has already been called: it happens in the NeuralNetwork’s constructors.

You will need to implement the following functions:

One indication that you have implemented the functions correctly so far is that the test_structure test case (./test_neuralnet 4) should pass.

Step 4: Implement Predict with BFT

The next two steps will be the most challenging part of the programming assignment. We recommend taking the self-test assignment on Gradescope before implementing these functions.

You will need to implement a BFT (Breadth First Traversal) algorithm for:

To implement a BFT, you should not use the NeuralNetwork::layers member variable. Instead, implement it using a queue as stated in the starter code of the predict function.

A BFT is required for this function because of how input flows through a neural network, as described below.

We have provided you two important functions that take care of the math for you:

Your job is to write the BFT algorithm that visits nodes and connections in the right order.

Here is a rundown of how the prediction algorithm works.

NeuralNetwork Example

For a neural network, the collection of weights and biases serves as the model: the mechanism by which we make a prediction. We insert the input into each node of the input layer. The network in this diagram takes two inputs; since there are two nodes in the input layer, each node gets a different input. Each value is then transformed by a weight as it “flows” to the next layer. Once it reaches the next layer, it is transformed by the bias and then the activation function, in that order. The same process repeats as the value flows onward, until it has been transformed by the output layer, which leaves it in a state we interpret as a prediction. For any given layer, we first visit all of its nodes and its connections to the next layer before visiting the nodes and connections of the next layer. Here is an example of how input flows from the input layer to the hidden layer in our example:

  1. Assume the node with id 1 contains the input value $x_1$, and the node with id 2 contains the input value $x_2$.
  2. To compute the value for node 3: $h_1$, we first accumulate the weighted sum: \(h_1 = x_1w_1 + x_2w_4\)
  3. Next, we add the bias term: \(h_1 = x_1w_1 + x_2w_4 + b_1\)
  4. Finally, activate the node with the activation function for its layer, which we will call activate:
\[h_1 = activate(x_1w_1 + x_2w_4 + b_1)\]

Using these steps, we can compute the value for every other node, including the output node ($y_1$), which will be our prediction.

Activation functions transform the value of a node after the weighted sum and bias are applied:

The specific activation function used at each layer is set by the model — your code doesn’t need to choose it, just call visitPredictNode.

NOTE: Keep these two facts in mind when writing your prediction algorithm.

The other important thing to realize is that a layer must wait for the computation of the previous layer to finish, hence the need for a BFT.

What problem is training solving — and how? (Optional)

Training adjusts every weight and bias in the network to make better predictions. The math behind how those adjustments are computed is deliberately abstracted in this assignment — the visit functions handle it. But training is solving a real and interesting mathematical problem. If you’re curious about what that problem is and how backpropagation solves it, these questions are worth sitting with:

These videos answer those questions, visually and without heavy math:

Video 2 (20 min) — Gradient descent, how neural networks learn: 3Blue1Brown: Gradient descent, how neural networks learn

Video 3 (20 min) — Backpropagation, intuitively: Backpropagation, intuitively

If the videos feel abstract, this interactive visualizer walks through the same concepts — cost function, loss landscape, gradient descent, and backpropagation — step by step:

Training Neural Networks: Cost · Gradient Descent · Backpropagation

If you watch them, here is how the key visuals map to this assignment:

| What you saw in the video | What it is in the code |
| Neurons “lighting up” layer by layer | predict(): your BFT visits nodes and connections in this order |
| The cost/loss landscape (hilly surface) | The error value computed from the difference between prediction and ground truth |
| Gradient descent (rolling downhill) | weight -= lr * delta in update() |
| Error flowing backward through the network | contribute(): your DFT traversal |

Step 5: Implement Contribute with DFT

Your next task is to implement a DFT (Depth First Traversal) algorithm for:

A DFT is required because of the following structural problem.

The goal of contribute() is to compute a delta (a nudge) for every weight and bias in the network, based on how wrong the prediction was. To compute the delta for any node, you need the delta estimates from the nodes it connects to in the next layer. But the edges only go forward — there is no backward edge to follow. And the error is only measurable at the output node, where you can compare prediction to truth.

The question is: how do you get that error from the output back to every node in the network, when all edges point the other way?

The answer is a DFT. Recursive calls let you dive forward to the output first, compute the error there, then pass the result back up the call stack, naturally reversing the direction of traversal. visitContributeNode and visitContributeNeighbor abstract the math; your job is to write the traversal that calls them in the right order.

Step through the interactive visualizer below to see exactly how contribute(), visitContributeNeighbor(), and visitContributeNode() are called during one backprop pass — and why the contributions map prevents recomputation. Use the arrow keys or the Prev/Next buttons to advance through each step.

Open in full screen

Why do we use DFT for back-propagation? There are a couple of reasons:

  1. The error received in one layer depends on some new error that the next layer computes and passes down. The error a node receives from its out-neighbors is an “incoming contribution” and the error that a node passes down to its in-neighbors is its “outgoing contribution”.
  2. The connections are only forward-directed; you cannot traverse backward in the network without the help of a recursive call.

NeuralNetwork Example

Here is an example, referring back to our original neural network. Suppose we are currently at node 3:

  1. Iterate through all your neighbors. For each neighbor:
    • Get the neighbor’s incoming contribution by recursively calling contribute on it (here, node 6).
    • Using that incoming contribution, visit the connection corresponding to that neighbor.
    • In this case, we visit the connection with weight $w_7$.
  2. Visit node 3 and pass in your running outgoing contribution to be updated.
  2. Visit node 3 and pass in your running outgoing contribution to be updated.

Now, an example using node 1:

  1. Get contribution from node 3, and visit the connection with $w_1$.
  2. Get contribution from node 4, and visit the connection with $w_2$.
  3. Get contribution from node 5, and visit the connection with $w_3$.
  4. Visit node 1 and pass in your running outgoing contribution to be updated.

We have provided you with two important functions that take care of the math for you:

In the starter code for the recursive helper, two local variables are declared for you: incomingContribution and outgoingContribution. incomingContribution holds the value returned by a recursive call to contribute on a neighbor — it is the error signal coming back from the next layer. outgoingContribution accumulates across all neighbors during the loop and is eventually returned to the caller, where it becomes that node’s incomingContribution. Both are passed by reference into the visit functions, which update them with the appropriate math.

In place of a “visited” set, there is a map called contributions. This acts as a way to keep track of nodes that we have already visited, as well as store their previously computed contributions.

Your job is to write the DFT algorithm that visits nodes and connections in the right order.

Step 6: Implement Update

After calling contribute, the update function assumes (precondition) that the delta of every node and connection has been accumulated. Now, each delta must be applied to every node’s bias and every connection’s weight.

Your goal in update is to traverse the graph (in any way you want) to update every weight and bias. A single call to update affects every node and connection and resets the $delta$ values to zero.

For each update, follow these steps:

  1. To update the bias: $bias_{new} = bias_{old} - (lr * delta)$
  2. To update the weight: $weight_{new} = weight_{old} - (lr * delta)$
  3. Reset the $delta$ values for each node and connection to zero.

where $lr$ is the learning rate (a member variable of NeuralNetwork); it controls how strongly each accumulated contribution adjusts the weights and biases.

Step 7: Test and Submit

test_neuralnet.cpp is provided as a brief way to check your neural network. Please test your code against this file and make sure you pass all the tests before submitting to Gradescope.

If you are having issues, it is helpful to run a debugger like gdb to make sure your code is following the logic described above.

Step 8: Rest

Take a break! You’ve done well.

External Resources

Graphviz

The graph class overloads the << operator and outputs the graph in DOT format. Copy the portion that looks like: digraph G {...} and paste it here to visualize the graph:

BFS and DFS
Neural Networks
TensorFlow Playground

An interactive neural network running live in your browser. You can add and remove neurons and layers, choose activation functions, adjust the learning rate, and watch the network train in real time on classification and regression problems. This is useful for building intuition about what the weights and biases in your PA03 network are actually doing — you can see how the decision boundary shifts as training progresses, and experiment with how architecture and hyperparameter choices affect whether the network learns at all. No installation required.

Credits

This assignment was conceived and created by UCSB CS ULA, Zackary Glazewski in consultation with Diba Mirza for CS 24. Thanks to the UCSB CS24 teaching staff for reviewing and providing feedback: Torin Schlunk, Mehak Dhaliwal, Nawel Alioua, Joseph Ng, Shinda Huang, Xinlei Feng, Yaoyi Bai, Ally Chu, and Sanjana Shankar.

CC BY-NC-SA 2.0, Zackary Glazewski and Diba Mirza, May 2025.