Introduction
In this blog, we will build a simple neural network from scratch in Rust. We'll start by setting up our project, then implement the core components of a neural network, and finally train it on a basic dataset.
Project Setup
First, let's set up a new Rust project. Open your terminal and run:
cargo new neural_network
cd neural_network
If you're unfamiliar with machine learning in Rust, check out our previous guide on how to build a machine learning model in Rust.
Next, we'll add the ndarray crate for numerical operations and the rand crate for random number generation. Update your Cargo.toml file to include these dependencies:
[dependencies]
ndarray = "0.15"
rand = "0.8"
Implementing the Neural Network
We'll start by creating a network.rs file in the src directory to hold our neural network implementation.
Defining the Network Structure
Create a Network struct that holds the weights and biases for the input-to-hidden and hidden-to-output layers:
// src/network.rs
use ndarray::{Array1, Array2, Axis};
use rand::thread_rng;
use rand::Rng;

pub struct Network {
    weights1: Array2<f64>, // input -> hidden
    biases1: Array1<f64>,
    weights2: Array2<f64>, // hidden -> output
    biases2: Array1<f64>,
}

impl Network {
    pub fn new(input_size: usize, hidden_size: usize, output_size: usize) -> Self {
        let mut rng = thread_rng();
        // Initialize every weight and bias uniformly at random in [-1, 1).
        let weights1 = Array2::from_shape_fn((hidden_size, input_size), |_| rng.gen_range(-1.0..1.0));
        let biases1 = Array1::from_shape_fn(hidden_size, |_| rng.gen_range(-1.0..1.0));
        let weights2 = Array2::from_shape_fn((output_size, hidden_size), |_| rng.gen_range(-1.0..1.0));
        let biases2 = Array1::from_shape_fn(output_size, |_| rng.gen_range(-1.0..1.0));
        Network {
            weights1,
            biases1,
            weights2,
            biases2,
        }
    }
}
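If you want to convince yourself the dimensions line up, a small unit test can live at the bottom of the same file. This test is our own addition; nothing later in the post depends on it:

// src/network.rs (appended, optional)
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn new_produces_expected_shapes() {
        let net = Network::new(2, 3, 1);
        assert_eq!(net.weights1.dim(), (3, 2)); // hidden_size x input_size
        assert_eq!(net.biases1.len(), 3);
        assert_eq!(net.weights2.dim(), (1, 3)); // output_size x hidden_size
        assert_eq!(net.biases2.len(), 1);
    }
}

Run it with cargo test.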
Forward Pass
Implement the forward pass of the network, which involves calculating the activations for each layer:
impl Network {
    fn sigmoid(x: &Array1<f64>) -> Array1<f64> {
        x.mapv(|v| 1.0 / (1.0 + (-v).exp()))
    }

    // Note: this expects `x` to already be a sigmoid *output*, so the
    // derivative simplifies to x * (1 - x).
    fn sigmoid_derivative(x: &Array1<f64>) -> Array1<f64> {
        x.mapv(|v| v * (1.0 - v))
    }

    pub fn forward(&self, input: &Array1<f64>) -> (Array1<f64>, Array1<f64>, Array1<f64>) {
        let hidden_input = self.weights1.dot(input) + &self.biases1;
        let hidden_output = Self::sigmoid(&hidden_input);
        let final_input = self.weights2.dot(&hidden_output) + &self.biases2;
        let final_output = Self::sigmoid(&final_input);
        (hidden_output, final_input, final_output)
    }
}
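At this point you can already run an untrained prediction. Here's a throwaway main.rs just to poke at the forward pass (a sketch for illustration, not the final program we build below):

// src/main.rs (temporary sketch)
mod network;

use ndarray::array;
use network::Network;

fn main() {
    let net = Network::new(2, 3, 1);
    // An untrained network produces an arbitrary value in (0, 1),
    // because the sigmoid squashes every output into that range.
    let (_, _, prediction) = net.forward(&array![1.0, 0.0]);
    println!("untrained prediction: {:?}", prediction);
}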
Backpropagation
Add backpropagation to adjust the weights and biases based on the error:
impl Network {
    pub fn train(&mut self, input: &Array1<f64>, target: &Array1<f64>, learning_rate: f64) {
        let (hidden_output, _, final_output) = self.forward(input);
        // Error and delta at the output layer.
        let output_errors = target - &final_output;
        let output_delta = &output_errors * &Self::sigmoid_derivative(&final_output);
        // Propagate the error back through weights2 to the hidden layer.
        let hidden_errors = self.weights2.t().dot(&output_delta);
        let hidden_delta = &hidden_errors * &Self::sigmoid_derivative(&hidden_output);
        // insert_axis on a *view* turns the 1-D delta into a column and the
        // activations into a row, so `dot` computes the outer-product gradient
        // without moving arrays we still need for the bias updates below.
        let weights2_grad = output_delta.view().insert_axis(Axis(1)).dot(&hidden_output.view().insert_axis(Axis(0)));
        let weights1_grad = hidden_delta.view().insert_axis(Axis(1)).dot(&input.view().insert_axis(Axis(0)));
        self.weights2 = &self.weights2 + &(weights2_grad * learning_rate);
        self.biases2 = &self.biases2 + &(output_delta * learning_rate);
        self.weights1 = &self.weights1 + &(weights1_grad * learning_rate);
        self.biases1 = &self.biases1 + &(hidden_delta * learning_rate);
    }
}
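The gradient lines can look dense; the key step is the outer product between a layer's delta and its incoming activations. Here is a tiny standalone sketch (the values are made up purely for illustration) showing how insert_axis turns two 1-D arrays into a rank-2 gradient:

use ndarray::{array, Axis};

fn main() {
    let delta = array![0.5, -0.25];           // pretend deltas for 2 output units
    let activations = array![1.0, 0.0, 1.0];  // pretend activations for 3 hidden units
    // Column (2, 1) times row (1, 3) gives a (2, 3) matrix:
    // one gradient entry per weight connecting the two layers.
    let grad = delta.view().insert_axis(Axis(1)).dot(&activations.view().insert_axis(Axis(0)));
    println!("{:?}", grad);
}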
Training the Network
Now, let's create a main file to train our network on a simple dataset: the XOR function.
// src/main.rs
mod network;

use ndarray::array;
use network::Network;

fn main() {
    // 2 inputs, 3 hidden units, 1 output.
    let mut network = Network::new(2, 3, 1);

    // The XOR truth table.
    let inputs = vec![
        array![0.0, 0.0],
        array![0.0, 1.0],
        array![1.0, 0.0],
        array![1.0, 1.0],
    ];
    let targets = vec![
        array![0.0],
        array![1.0],
        array![1.0],
        array![0.0],
    ];

    let learning_rate = 0.1;
    let epochs = 10000;
    for _ in 0..epochs {
        for (input, target) in inputs.iter().zip(targets.iter()) {
            network.train(input, target, learning_rate);
        }
    }

    // Check what the trained network predicts for each input.
    for input in inputs.iter() {
        let (_, _, output) = network.forward(input);
        println!("{:?} -> {:?}", input, output);
    }
}
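If you want to watch the network converge rather than only inspecting the final predictions, you can add a small loss helper to main.rs. This is our own addition, and the name mse is just illustrative:

use ndarray::Array1;

// Mean squared error over the whole dataset.
fn mse(network: &Network, inputs: &[Array1<f64>], targets: &[Array1<f64>]) -> f64 {
    let mut total = 0.0;
    for (input, target) in inputs.iter().zip(targets) {
        let (_, _, output) = network.forward(input);
        total += (target - &output).mapv(|v| v * v).sum();
    }
    total / inputs.len() as f64
}

Then, renaming the loop variable from _ to epoch, log the loss periodically inside the training loop:

if epoch % 1000 == 0 {
    println!("epoch {}: mse = {:.5}", epoch, mse(&network, &inputs, &targets));
}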
Explanation
- Network Initialization: We initialize the network with random weights and biases.
- Forward Pass: We calculate the activations for the hidden and output layers using the sigmoid function.
- Backpropagation: We calculate the error at the output and hidden layers, then update the weights and biases using gradient descent.
- Training: We train the network on the XOR dataset for a specified number of epochs.
- Testing: After training, we test the network on the same inputs to see how well it learned the XOR function.
Running the Code
cargo run
You should see the network's predictions for the XOR function: values close to 0.0 for [0.0, 0.0] and [1.0, 1.0], and close to 1.0 for the other two inputs. The exact numbers vary from run to run because the weights are initialized randomly, and the occasional run may converge more slowly.
Conclusion
In this blog, we built a simple neural network from scratch in Rust. We covered the core components, including initialization, forward pass, and backpropagation. This example can be expanded to more complex networks and datasets, providing a solid foundation for neural network implementation in Rust.
Feel free to experiment with different architectures, activation functions, and learning rates to see how they affect the network's performance.
