Let’s try to understand machine learning.
In this post I’ll briefly cover topics to give you a very barebones introduction to machine learning.
Don’t worry if you’re not some prodigy or from some advanced civilization — the only knowledge you need to understand this is basic high school maths.
Microsoft defines Machine Learning as:
“Machine learning (ML) is the process of using mathematical models of data to help a computer learn without direct instruction.”
Machine Learning is getting computers to program themselves. If programming is automation, then machine learning is automating the process of automation. Wait. Did that make sense?
Machine learning is the science of getting computers to act without being explicitly programmed. That any better?
It is safe to say that there is no shortage of definitions for machine learning.
I personally like to define it this way: “Machine learning involves us giving the computer some data and telling it to go learn from that data.” It is basically imitating how a human learns from experience. It’s like gardening: the seeds are the algorithms, the nutrients are the data, the gardener is us, and the plants are the programs.
Now that the definition is clear, we can move on to understanding it a little better through some real-world applications.
Some interesting ways we can apply machine learning in the real world:
What is your area of interest and how could you use machine learning in that particular field?
There are a great many machine learning algorithms, and new ones are developed every year.
As I mentioned in my definition, machines need a lot of data to learn from and, ultimately, to make decisions. This data can be any unprocessed fact, value, sound, image, or text that can be interpreted and analyzed.
Once a data set is ready, it is used for training, validating, and testing the ML model. In general, the bigger the data set, the more the model has to learn from and the better the chances of accurate results.
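To make the training, validation, and test split concrete, here is a minimal Python sketch. The function name `split_dataset`, the 70/15/15 percentages, and the fixed shuffle seed are my own assumptions for illustration, not a standard recipe.

```python
import random

def split_dataset(rows, train_pct=70, val_pct=15, seed=0):
    """Shuffle rows, then cut them into train, validation, and test lists."""
    shuffled = list(rows)                      # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)      # fixed seed keeps the split repeatable
    n_train = len(shuffled) * train_pct // 100
    n_val = len(shuffled) * val_pct // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]          # whatever is left becomes the test set
    return train, val, test

data = list(range(100))                        # stand-in for 100 real examples
train, val, test = split_dataset(data)
print(len(train), len(val), len(test))         # 70 15 15
```

The model learns from `train`, you tune it against `val`, and you only look at `test` at the very end to see how it does on data it has never seen.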
This is where the 5Vs of Data come into play. Volume, Variety, Veracity, Velocity & Value.
Volume: A large data set gives the model more examples to learn from, which helps it make better decisions.
Variety: The data set can hold different forms of data, such as images and videos. Variety in data helps ensure accurate results.
Velocity: The speed at which data accumulates in the data set matters.
Value: The data set should contain meaningful information.
Veracity: The data in the data set needs to be accurate.
Think of an algorithm as a mathematical or logical procedure that turns a data set into a model. There are different types of algorithms to choose from, depending on the type of problem the model is trying to solve, the resources available, and the nature of the data.
In machine learning, a model is a computational representation of a real-world process. An ML model learns to recognize certain types of patterns by being trained on a set of data with a relevant algorithm. Once a model is trained, it can be used to make predictions.
To better explain this, let’s take a very simple example. If you feed the model images of cars along with their names over an extended period of time, it will learn to recognize images of new cars, and over time it will be able to predict a car’s name or make with high accuracy.
Model building also covers feature extraction: choosing relevant and adequate inputs from the data set. Take care not to overfit the model, which happens when it memorizes the training data, noise included, instead of learning patterns that generalize. Overfitting is detrimental to the model’s performance on new data.
Training includes approaches that allow ML models to identify patterns and make decisions. There are different ways to achieve this, including supervised learning, unsupervised learning, and reinforcement learning, all of which I will cover later in this post.
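Here is a toy Python sketch of what overfitting looks like. Everything in it is made up for illustration: the data, the `memorizer` model (an extreme case that simply memorizes the training set), and the `threshold_model` it is compared against.

```python
def accuracy(predict, examples):
    """Fraction of (x, label) examples that predict() gets right."""
    return sum(predict(x) == label for x, label in examples) / len(examples)

# Made-up data: the true rule is "label is 1 when x >= 5".
train_set = [(1, 0), (2, 0), (3, 0), (4, 0), (6, 1), (7, 1), (8, 1), (9, 1)]
test_set = [(0, 0), (5, 1), (10, 1), (11, 1)]

# Overfit model: memorizes every training example, guesses 0 for anything unseen.
memory = dict(train_set)

def memorizer(x):
    return memory.get(x, 0)

# Simpler model: learns a single threshold halfway between the two classes.
largest_zero = max(x for x, label in train_set if label == 0)
smallest_one = min(x for x, label in train_set if label == 1)
threshold = (largest_zero + smallest_one) / 2

def threshold_model(x):
    return 1 if x >= threshold else 0

print(accuracy(memorizer, train_set), accuracy(memorizer, test_set))              # 1.0 0.25
print(accuracy(threshold_model, train_set), accuracy(threshold_model, test_set))  # 1.0 1.0
```

Both models are perfect on the training data, but only the one that learned a general rule holds up on examples it has not seen before.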
Now that we’ve gone over the key elements of machine learning, let’s discuss the various methods of learning.
As with any method, there are different ways to train machine learning algorithms, each with their own advantages and disadvantages. To understand the pros and cons of each type of machine learning, we must first look at what kind of data they ingest. In ML, there are two kinds of data — labeled data and unlabeled data.
Labeled data has both the input and output parameters in a completely machine-readable pattern.
Unlabeled data only has one or none of the parameters in a machine-readable form.
Three main methods of learning are in use today: supervised learning, unsupervised learning, and reinforcement learning.
Supervised learning is one of the most basic types of machine learning. In this type, the machine learning algorithm is trained on labeled data. Even though the data needs to be labeled accurately for this method to work, supervised learning is extremely powerful when used in the right circumstances.
Basically, supervised learning is when we teach or train the machine using data that is well labelled, which means the data is already tagged with the correct answer. It’s like showing a child something and telling them what it is, so they can recognize it in the future.
A supervised model can also keep improving after it is deployed, provided it is retrained as new labeled data arrives, discovering new patterns and relationships along the way.
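A minimal Python sketch of supervised learning, using a nearest-neighbour rule that I picked only because it is short. The animals, the (height in cm, weight in kg) features, and the function names are all hypothetical.

```python
def squared_distance(a, b):
    """Squared straight-line distance between two feature tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def predict_label(labeled_examples, query):
    """Return the label of the labeled example closest to the query."""
    features, label = min(
        labeled_examples,
        key=lambda example: squared_distance(example[0], query),
    )
    return label

# Labeled data: each input (height_cm, weight_kg) is tagged with the correct answer.
labeled_examples = [
    ((25, 4), "cat"),
    ((23, 5), "cat"),
    ((60, 30), "dog"),
    ((55, 25), "dog"),
]

print(predict_label(labeled_examples, (24, 4.5)))  # cat
print(predict_label(labeled_examples, (58, 28)))   # dog
```

The labels are what make this supervised: the program never works out what a cat is, it only matches new inputs against answers a human already supplied.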
Unsupervised learning is the training of a machine using information that is neither classified nor labeled and allowing the algorithm to act on that information without guidance. This means that human labor is not required to make the dataset machine-readable, allowing much larger datasets to be worked on by the program.
In supervised learning, the labels allow the algorithm to find the exact nature of the relationship between any two data points. Unsupervised learning has no labels to work from, so the algorithm has to discover hidden structures in the data by itself.
You give the program a dataset, but no instructions on what the dataset means. You’re not defining any kind of outcome for the algorithm. Instead, it needs to figure out the patterns (if there are any) on its own.
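As a rough illustration, here is a tiny one-dimensional k-means clustering sketch in Python. K-means is just one unsupervised technique, chosen here by me for brevity; the data and the evenly spaced starting centroids are assumptions, and the sketch expects `k` to be at least 2.

```python
def k_means_1d(points, k=2, iterations=10):
    """Group numbers into k clusters without being told what the groups are."""
    lo, hi = min(points), max(points)
    # Deterministic start: spread the initial centroids evenly between min and max.
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Step 1: assign every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Step 2: move each centroid to the mean of its cluster (keep it if empty).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]   # no labels anywhere
centroids, clusters = k_means_1d(points)
print(clusters)
```

Nobody told the program that there is a “small” group and a “large” group; it found that structure from the numbers alone.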
Reinforcement learning is learning by interacting with an environment. An RL agent learns from the consequences of its actions rather than from being explicitly taught. It selects its actions partly on the basis of its past experiences (exploitation) and partly by trying new choices (exploration), which is essentially trial-and-error learning.
Think of your pet dog. You would give the dog a reward if it does what you want it to do, and a small penalty if it does something it shouldn’t. That way, the dog learns over time to avoid making the same mistake again.
Now that we’ve understood what the three main methods of learning are, we can move on to training a model.
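The dog example can be sketched in Python as a very small trial-and-error learner. This uses an epsilon-greedy rule, which is one simple way to balance exploration and exploitation; the actions, reward values, and `train_dog` function are hypothetical.

```python
import random

def train_dog(steps=200, epsilon=0.1, seed=0):
    """Learn the value of each action purely from rewards and penalties."""
    rng = random.Random(seed)
    rewards = {"sit": 1.0, "bark": -1.0}    # treat for sitting, penalty for barking
    actions = list(rewards)
    values = {a: 0.0 for a in actions}      # the agent's current estimate per action
    counts = {a: 0 for a in actions}
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(actions)             # exploration: try something at random
        else:
            action = max(actions, key=values.get)    # exploitation: use what it has learned
        reward = rewards[action]
        counts[action] += 1
        # Running average of the rewards seen so far for this action.
        values[action] += (reward - values[action]) / counts[action]
    return values

values = train_dog()
best_action = max(values, key=values.get)
print(best_action, values)
```

The agent is never told that sitting is the right behaviour. It ends up preferring it because that is the action its experience has rewarded.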
The process of training an ML model involves providing an ML algorithm (that is, the learning algorithm) with training data to learn from. The term ML model refers to the model artifact that is created by the training process.
The model takes input in the form of data (x) and generates an output (y) based on the input data and its parameters. The optimisation algorithm tries to find the best combination of parameters so that given the example x the model’s output y is as close to the expected output as possible. The trained model will represent a specific function f that given x produces output y. So: y=f(x).
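To see y=f(x) in action, here is a minimal Python sketch that fits a straight line f(x) = w * x + b to a few points. It uses the closed-form least-squares formula as the “optimisation” step, which is my choice for brevity; the data and the names `fit_line`, `w`, and `b` are assumptions.

```python
def fit_line(xs, ys):
    """Find the parameters w and b that minimize the squared error of w * x + b."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    w = covariance / variance
    b = mean_y - w * mean_x
    return w, b

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]        # made-up examples generated from y = 2x + 1
w, b = fit_line(xs, ys)

def f(x):
    """The trained model: a specific function with the learned parameters."""
    return w * x + b

print(w, b, f(5.0))              # 2.0 1.0 11.0
```

Training found the parameters; after that, the model is just the function `f`, and you can hand it an x it has never seen.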
For example, let’s say that you want to train an ML model to predict whether an email is spam or not spam. You would provide the ML algorithm with training data that contains emails for which you know the target (that is, a label that tells whether an email is spam or not spam). You would then train an ML model on this data, resulting in a model that attempts to predict whether a new email is spam or not spam.
There are three common types of models: binary classification, multiclass classification, and regression.
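A deliberately naive Python sketch of the spam example: it counts which words show up in spam versus non-spam training emails and scores new emails by those counts. Real spam filters are far more sophisticated; the emails, labels, and function names here are invented.

```python
from collections import Counter

def train_spam_model(labeled_emails):
    """Count how often each word appears in spam and in non-spam emails."""
    spam_words, ham_words = Counter(), Counter()
    for text, is_spam in labeled_emails:
        target = spam_words if is_spam else ham_words
        target.update(text.lower().split())
    return spam_words, ham_words

def predict_spam(model, text):
    """Predict spam when the email's words were seen more often in spam."""
    spam_words, ham_words = model
    words = text.lower().split()
    spam_score = sum(spam_words[w] for w in words)   # unseen words count as 0
    ham_score = sum(ham_words[w] for w in words)
    return spam_score > ham_score

# Training data: each email comes with its known target (True means spam).
labeled_emails = [
    ("win free money now", True),
    ("free prize claim now", True),
    ("meeting agenda for monday", False),
    ("lunch on monday with the team", False),
]
model = train_spam_model(labeled_emails)

print(predict_spam(model, "claim your free money"))        # True
print(predict_spam(model, "agenda for the team meeting"))  # False
```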
ML models for binary classification problems predict a binary outcome (one of two possible classes) i.e., True or False/Yes or No/1 or 0.
ML models for multiclass classification problems allow you to generate predictions for multiple classes (predict one of more than two outcomes).
ML models for regression problems predict a numeric value.
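The difference between the three types is mostly in what the prediction looks like. A rough Python sketch, with hypothetical helper names and hand-picked numbers standing in for a trained model’s raw outputs:

```python
def binary_prediction(probability, threshold=0.5):
    """Binary classification: one of two classes."""
    return probability >= threshold

def multiclass_prediction(class_scores):
    """Multiclass classification: the class with the highest score."""
    return max(class_scores, key=class_scores.get)

def regression_prediction(weight, bias, x):
    """Regression: a numeric value."""
    return weight * x + bias

print(binary_prediction(0.83))                                       # True
print(multiclass_prediction({"cat": 0.2, "dog": 0.7, "bird": 0.1}))  # dog
print(regression_prediction(2.0, 1.0, 3.0))                          # 7.0
```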
Now that we’ve gone through how training is done, we can talk about Neural Networks, a subset of Machine Learning.
Our brain is commonly estimated to have somewhere between 86 and 100 billion neurons, with trillions of connections between them.
Neurons (aka nerve cells) are the fundamental units of our brain and nervous system. They are responsible for receiving input from the external world and for sending output (commands to our muscles) in the form of electrical signals.
The dendrites are the part of the neuron that accepts multiple “input” signals. The signal then travels through the cell body and along the axon, and leaves through the axon terminals. A neuron in a neural network works in broadly the same way: it takes multiple inputs, processes them through a function, and outputs a value.
That is the idea behind an artificial neural network (ANN). In this sense, neural networks refer to systems of neurons, either organic or artificial in nature.
Artificial neural networks still fall well short of biological ones in many respects, but we can clearly see that the artificial neural network is a solution inspired by the animal brain.
Neural networks can adapt to changing input, so the network generates the best possible result without needing to redesign the output criteria.
As in the figure above, there are 3 layers present in an ANN.
1) Input Layer: It functions similarly to the dendrites. The purpose of this layer is to accept input from outside the network or from other neurons.
2) Hidden Layer: These are the layers that perform the actual computation.
3) Output Layer: It functions similarly to the axons. The purpose of this layer is to transmit the generated output to other neurons.
An artificial neuron (also called a node) receives one or more inputs. Each input has a weight, loosely analogous to a synapse. The neuron also has an “activation function”, which works on the weighted input and processes it to give an output.
The weighted sum of the inputs becomes the input signal to the activation function, which gives one output. The input weights are adjustable, so the neural network can tune its parameters to give the desired output. There is no fixed limit on the number of hidden layers: a network can have a single hidden layer or many hundreds, constrained in practice by the data and computing power available.
An activation function is a function of the input that the neuron receives. It converts the input signal arriving at a node of the ANN into an output signal.
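A single artificial neuron fits in a few lines of Python. This is only a sketch: sigmoid and ReLU are two common activation functions I picked as examples, and the `bias` term is an extra adjustable parameter that most networks include but that this post does not otherwise cover.

```python
import math

def sigmoid(z):
    """Squashes any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    """Passes positive values through and turns negative values into 0."""
    return max(0.0, z)

def neuron_output(inputs, weights, bias, activation=sigmoid):
    """Weighted sum of the inputs, then the activation function."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(weighted_sum)

# Weighted sum is 1.0 * 0.5 + 2.0 * -0.25 + 0.0 = 0.0, and sigmoid(0.0) is 0.5.
print(neuron_output([1.0, 2.0], [0.5, -0.25], 0.0))        # 0.5
# Weighted sum is 1.0 + 2.0 - 5.0 = -2.0, which ReLU turns into 0.0.
print(neuron_output([1.0, 2.0], [1.0, 1.0], -5.0, relu))   # 0.0
```

Changing the weights changes the output for the same inputs, which is exactly the knob that training turns.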
Having multiple layers in a neural network is where the term “Deep Learning” comes from. The benefit of using multiple layers in the model is that each layer can use the information extracted in the previous layer to build up a more complex representation of the data. It’s because of this that neural networks have been shown to be so powerful, successfully trained to recognise cats in videos, recognise speech, and even play Atari video games.
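Stacking layers is just feeding one layer’s outputs into the next. Here is a minimal Python sketch of a forward pass through a network with one hidden layer; the weights are hand-picked for illustration rather than trained, and the function names and bias terms are my own assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    return max(0.0, z)

def layer_forward(inputs, weights, biases, activation):
    """Each neuron in the layer takes a weighted sum of all inputs, then activates."""
    return [
        activation(sum(x * w for x, w in zip(inputs, neuron_weights)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]

def network_forward(inputs, layers):
    """Pass the data through every layer in turn."""
    activations = inputs
    for weights, biases, activation in layers:
        activations = layer_forward(activations, weights, biases, activation)
    return activations

layers = [
    # Hidden layer: 2 neurons, each with one weight per input.
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0], relu),
    # Output layer: 1 neuron that reads both hidden neurons.
    ([[1.0, 1.0]], [-1.0], sigmoid),
]

print(network_forward([2.0, 1.0], layers))   # [0.5]
```

For the input `[2.0, 1.0]` the hidden layer produces `[1.0, 0.0]`, and the output neuron turns that into `sigmoid(0.0)`, which is 0.5. A deeper network simply has more entries in `layers`.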
Any machine learning algorithm is incomplete without an optimization algorithm. The main goal of an optimization algorithm is to put our ML model (in this case, a neural network) through a series of trial-and-error steps that eventually result in a model with higher accuracy.
In the context of neural networks, the most widely used optimization algorithm is gradient descent.
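Gradient descent repeatedly nudges the model’s parameters in the direction that reduces the error. The larger the gradient, the steeper the slope and the bigger each step, so the faster the model can learn. If the slope is zero, the parameters stop changing and the model stops learning. In mathematical terms, the gradient is the set of partial derivatives of the error with respect to the model’s parameters.
There is a whole lot more to optimization that I haven’t covered in this post, to keep things brief and easy to understand.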
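Here is gradient descent at its simplest, as a Python sketch with a single parameter `w`. The error function `(w - 3) ** 2`, the learning rate, and the step count are all made up for illustration; a real network does the same thing over many parameters at once.

```python
def loss(w):
    """Error we want to minimize; it is smallest at w = 3."""
    return (w - 3.0) ** 2

def gradient(w):
    """Derivative of the loss with respect to w: the slope at the current w."""
    return 2.0 * (w - 3.0)

def gradient_descent(start, learning_rate=0.1, steps=100):
    w = start
    for _ in range(steps):
        # Step against the slope: a steep slope means a big step, zero slope means no step.
        w -= learning_rate * gradient(w)
    return w

w = gradient_descent(start=0.0)
print(round(w, 4))    # 3.0
print(gradient(3.0))  # 0.0
```

Starting from 0, each step moves `w` closer to 3, and the steps shrink as the slope flattens. At exactly 3 the gradient is zero, so nothing changes any more.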
There are a lot of different types of neural networks. A few of the most prominent ones are in this image.
Since explaining all of these would take a long time, I will cover them in a separate post soon. For now, here’s a list of their applications.
1. Convolutional Neural Network (CNN): used in image recognition and classification
2. Artificial Neural Network (ANN): used in image compression
3. Restricted Boltzmann Machine (RBM): used for a variety of tasks including classification, regression, dimensionality reduction
4. Generative Adversarial Network (GAN): used for fake news detection, face detection, etc.
5. Recurrent Neural Network (RNN): used in speech recognition
6. Self Organizing Maps (SOM): used for topology analysis
It’s fair to say that we know the basics of what a neural network is by now. If you’d like to play around with small neural network examples, try Google’s TensorFlow Playground.
I don’t think the machines will take over. I don’t believe we’re going to live in a dystopian, AI-dominated future. We’re going to be fine. I’m more worried about the kind of data programmers feed into algorithms. I think human bias is more of a monster than anything else.
For more, check out these resources.