Blog 1: A Deep Dive into Deep Learning
- rigelchan
- Oct 21, 2022
- 7 min read
Welcome, everyone, to my journey down into deep learning. This blog focuses more on my experience and a summary of what I have learned than on a deep tutorial covering all the information. Join me as I struggle or succeed in learning what a neural network is and how machine learning works, and hopefully I will be able to create some really cool models to showcase everything!
Neural Networks
What is a neural network, you may ask? Based on its name you may think it is just a network of neurons... and you would be correct! The model is based on how actual neurons work: a neuron accepts inputs and gives an output. Rather than accepting and delivering neurotransmitters (the chemical messengers that carry signals between biological neurons), the neurons in a neural network accept and deliver data.

Like the human body, many neurons are needed to make up a network. Each "neuron" or node receives information and delivers an output depending on the input it receives. In a neural network these nodes are arranged in layers. A basic neural network will have only 3 layers: an input layer, a hidden layer, and an output layer. The input layer is where the initial input is fed into the network.

Here the nodes process the data using weights and biases, and the result feeds into an activation function that determines the output. In less mathematical terms, the output depends on the input data. Each node multiplies its inputs by its weights, adds its bias, and passes the sum to the activation function, which decides whether the neuron fires (in the simplest case, whether it returns a 0 or a 1). This leads to two related ideas: the loss function and gradient descent. A loss function measures the difference between the actual output for a given input and the output the model predicted for that same input. You want your model to minimize its loss. Gradient descent is an optimization algorithm that repeatedly adjusts the weights and biases in small steps, each in the direction that reduces the loss, until it settles on a combination that minimizes the error.
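To make the loss function and gradient descent a bit more concrete, here is a minimal sketch in plain Python. It is a toy with a single neuron, not code from the MNIST example later in this post: the data points, the learning rate, the number of steps, and all the function names are made up for illustration, and the loss is assumed to be the mean squared error.

```python
import math

def sigmoid(z):
    """Squash any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, weight, bias):
    """One neuron: weighted input plus bias, passed through an activation."""
    return sigmoid(weight * x + bias)

def loss(data, weight, bias):
    """Mean squared error between the predicted and the actual outputs."""
    return sum((predict(x, weight, bias) - y) ** 2 for x, y in data) / len(data)

def train(data, weight=0.0, bias=0.0, learning_rate=0.5, steps=2000):
    """Gradient descent: nudge weight and bias downhill on the loss."""
    for _ in range(steps):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in data:
            p = predict(x, weight, bias)
            # Chain rule: slope of the squared error through the sigmoid
            dz = 2.0 * (p - y) * p * (1.0 - p)
            grad_w += dz * x
            grad_b += dz
        # Step against the average slope to reduce the loss
        weight -= learning_rate * grad_w / len(data)
        bias -= learning_rate * grad_b / len(data)
    return weight, bias

# Made-up toy data: the output should be 1 for positive x and 0 otherwise
data = [(-2.0, 0.0), (-1.0, 0.0), (1.0, 1.0), (2.0, 1.0)]

start_loss = loss(data, 0.0, 0.0)
weight, bias = train(data)
end_loss = loss(data, weight, bias)
print(f"loss before: {start_loss:.4f}, loss after: {end_loss:.4f}")
```

Running it prints the loss before and after training, and the second number should be the smaller one. A real network does the same thing with many more weights at once.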

As seen in the image above, gradient descent moves in from the left or the right toward the centre, where the loss is lowest and the weights are at their optimal position.

This brings us to activation functions. An activation function determines whether the weighted sum calculated by a node reaches a specific threshold, and what value the node outputs. There are many types. A step function, like an actual neuron, is an all-or-nothing situation: there is a threshold that must be reached to activate the neuron, no matter how close you may be. It does not work with gradient descent because its output is flat everywhere except for the jump at the threshold, so the slope gives no hint about which way to adjust the weights. A sigmoid activation function outputs a value between 0 and 1, so it is not all or nothing, and its smooth curve allows the use of gradient descent. A linear function is not a good activation function because a network built only from linear functions can only draw straight-line boundaries, no matter how many layers it has. This leads to an issue known as the XOR problem: the XOR outputs cannot be separated by a single straight line, so such a network fails on it.

The most widely used activation function is ReLU (Rectified Linear Unit), which returns 0 for any value of x below 0 and returns x itself for any value of x greater than or equal to 0. This means that if x = -2, the output is 0, and if x = 2, the output is 2. Although other activation functions may work better for your network, ReLU remains the popular choice and the best place to start.
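The three activation functions described above are short enough to write out by hand. This is only a sketch in plain Python: the function names and the default threshold of 0 for the step function are my own choices for illustration, not something taken from a particular library.

```python
import math

def step(x, threshold=0.0):
    """All or nothing: fire (1) only once the threshold is reached."""
    return 1.0 if x >= threshold else 0.0

def sigmoid(x):
    """Smooth curve that maps any input to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    """Rectified Linear Unit: 0 for negative inputs, x otherwise."""
    return max(0.0, x)

for x in (-2.0, 0.0, 2.0):
    print(x, step(x), round(sigmoid(x), 3), relu(x))
```

For x = -2 the ReLU line gives 0, and for x = 2 it gives 2, matching the example in the text.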



If you are like me, you probably understood a total of 3.5 things from that mess of a paragraph. Luckily for you, here is a video with a visual model that showcases how a neural network works.
Wow, that's a lot of information. But what does it all mean? Luckily, there is a well-known example that shows a neural network in action. The example asks: how do you know a number is a number? Sure, if it is standardized, say typed in Microsoft Word, it is very easy for a program to identify it even without a neural network. But what if the input is not standardized? What if the input consists of numbers handwritten by many different people? We can tell that a 4 is a 4 no matter how it is written, but how will a program know?

Using a large dataset such as MNIST and the Keras library, we can demonstrate how a neural network is able to recognize real-life images. Having done the example myself, I can say it involves a lot of technical math and rendering calculations that would not be much fun to show. Basically, using Python, you read the value of each pixel in an image from the MNIST dataset, a number that depends on what is on that pixel. After you prepare the data and set the parameters, you can start training the Keras model on the MNIST images of handwritten numbers, and it learns to tell the digits 0-9 apart with a high degree of accuracy. Here are just some snapshots of the procedure (don't worry about spending your time on this, there is a way more fun demo just up ahead).
Here you can see how I loaded the data and pulled up one of the images from the MNIST dataset.

Here you can see part of the processing of the data itself so that it can be fed into the Keras model.
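Since the snapshot may be hard to read, here is a rough idea of what this kind of processing usually looks like, written in plain Python on a made-up 2x2 "image". It is an assumption about typical preparation steps (scaling 0-255 grayscale pixels to the 0-1 range, flattening the grid, and one-hot encoding the label), not a copy of my actual notebook, and the function names are hypothetical.

```python
def normalize_image(pixels):
    """Scale a 2D grid of 0-255 grayscale values to 0.0-1.0 and flatten it."""
    return [value / 255.0 for row in pixels for value in row]

def one_hot(label, num_classes=10):
    """Turn a digit label into a list with a 1.0 in that digit's slot."""
    encoded = [0.0] * num_classes
    encoded[label] = 1.0
    return encoded

# Made-up stand-in for a 28x28 MNIST image: 0 is black, 255 is white
tiny_image = [[0, 128],
              [255, 64]]

features = normalize_image(tiny_image)
target = one_hot(4)  # pretend this image shows the digit 4
print(features)
print(target)
```

A real MNIST image goes through the same idea, just with 784 pixel values instead of 4.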

Once all the data is ready, we are able to create the model. Here is where I set the parameters of the model and a summary of it.

Once everything is ready, we run the model through the entire dataset. As you can see below, there is some exciting action in the form of loading bars for each epoch (one full pass through the training data).

When all the data is fed into the model, we are able to graph some of the results, such as the accuracy of the model.
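For anyone wondering what "accuracy" means here, it is simply the fraction of images the model labels correctly. The sketch below shows the arithmetic in plain Python, assuming the model outputs one score per digit and the highest score wins. The scores and labels are invented for illustration and are not results from my model.

```python
def predicted_digit(scores):
    """Pick the digit whose score is highest."""
    return max(range(len(scores)), key=lambda digit: scores[digit])

def accuracy(predicted_labels, true_labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for p, t in zip(predicted_labels, true_labels) if p == t)
    return correct / len(true_labels)

# Invented scores for three images, ten digits each (not real model output)
all_scores = [
    [0.0, 0.1, 0.0, 0.0, 0.8, 0.0, 0.0, 0.1, 0.0, 0.0],  # looks like a 4
    [0.0, 0.7, 0.0, 0.0, 0.0, 0.0, 0.0, 0.3, 0.0, 0.0],  # looks like a 1
    [0.0, 0.0, 0.0, 0.6, 0.0, 0.0, 0.0, 0.0, 0.4, 0.0],  # looks like a 3
]
true_labels = [4, 1, 8]  # the model gets the last one wrong

predictions = [predicted_digit(scores) for scores in all_scores]
print(predictions, accuracy(predictions, true_labels))
```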

As you can see, not very exciting stuff here, but the technical capabilities and the demonstration of the technology are important background knowledge for what is to come.
Machine Learning
Welcome to what I think was the hardest part of my journey... so far... (excluding the numerous hours of installing, downloading, reinstalling, and bug fixing version issues across different software and libraries, but that's for another day).
To put it in perspective, A.I. is the all-encompassing term for anything to do with, well... artificial intelligence. This includes machine learning. Machine learning covers models that are built by accepting data and that make decisions based on that data. Sound familiar? Neural networks, or deep learning, are a smaller section of machine learning, where they serve as the framework for the learning. Still with me? If you are, great! If not... the short version is that to fully apply a neural network, we need to know how machine learning works!
At a high level, machine learning teaches the program what to do. There are many different kinds of machine learning, but here we will focus on reinforcement learning. It works like this: the program observes its environment and decides on an action to take. The action is then executed, and depending on the action we give the program a reward, either positive or negative. From there the cycle repeats, and the program is trained by this reward system: it gets positive rewards when it takes an action we want, so it continues to take that action. On the flip side, when the program receives a negative reward, it starts to refrain from the action that earned it. This learning, although simple, has to go through hundreds or thousands of episodes of the program taking random actions until our desired behaviour is achieved.
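The observe, act, reward cycle can be sketched in a few lines of plain Python. To be clear about what this is: a toy using tabular Q-learning with some random exploration, which I picked because it is small enough to read, and not the method or the code used in the Unity demo below. The five-cell track, the reward values, and every name and number here are made up for illustration.

```python
import random

def train_agent(track_length=5, episodes=500, seed=0):
    """Teach an agent on a 1D track to walk right until it reaches the goal."""
    rng = random.Random(seed)
    goal = track_length - 1
    moves = (-1, +1)  # action 0 steps left, action 1 steps right
    # q[state][action]: how good the agent currently thinks each action is
    q = [[0.0, 0.0] for _ in range(track_length)]
    learning_rate, discount, explore_rate = 0.5, 0.9, 0.2

    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the steps so an episode always ends
            # Observe the state, then decide: explore randomly or use what we know
            if rng.random() < explore_rate or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1

            # Execute the action and hand out the reward
            next_state = min(max(state + moves[action], 0), goal)
            reward = 1.0 if next_state == goal else -0.1

            # Learn: move the estimate toward reward plus discounted future value
            future = 0.0 if next_state == goal else max(q[next_state])
            q[state][action] += learning_rate * (
                reward + discount * future - q[state][action]
            )

            state = next_state
            if state == goal:
                break
    return q

q_table = train_agent()
policy = ["right" if right > left else "left" for left, right in q_table[:-1]]
print(policy)
```

The small negative reward for every step that does not reach the goal is what discourages wandering, and the positive reward at the goal is what the agent ends up chasing.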
Now that we have all that sorted and out of the way, it's time for the fun stuff and the main reason I was drawn into learning about neural networks and machine learning!
Unity Modelling
For those who may not know, Unity is a very popular game engine used for many famous games such as Pokemon GO and Among Us. Unity is not limited to game making, which makes it a powerful tool for modelling some machine learning. Luckily for us, Unity already has an external library dedicated to machine learning. After following some tutorials on YouTube (and countless bug fixes and version adjustments to my installs and downloads), I was able to create my own, albeit simple, machine learning model. It works in tandem with other programs that track and train the agent. In the video below you will see how I did it:
The culmination of all that knowledge... just to have a box touch a yellow ball. Isn't that incredible!?!?! For those who don't think all that work was worth it, you can just join me on my adventure, and I will do the heavy lifting. This simple act of a box touching a ball, however, is the basis of so much more. People have designed models that let agents traverse mazes, race cars around a track, play Snake, and even play chess (oh boy, is this even more complicated). Hopefully you won't expect too much out of me, like hoping I create the next J.A.R.V.I.S., but if all goes well, I wish to demonstrate to you, my readers, all the cool and interesting things we can do from the groundwork that has been laid out. Who knows, maybe I will be able to show you all a simple but cool A.I. model.