Introduction To Machine Learning | Machine Learning Basics …

Introduction To Machine Learning:

Machine Learning is among the most in-demand technologies in today's market. Its applications range from self-driving cars to predicting deadly diseases such as ALS. The high demand for Machine Learning skills is the motivation behind this blog. In this Introduction To Machine Learning blog, you will learn the basic concepts of Machine Learning and walk through a practical implementation of Machine Learning in the R language.


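This Introduction To Machine Learning blog covers the following topics: why Machine Learning is important, what Machine Learning is, common Machine Learning terms, the Machine Learning process, the types of Machine Learning, the types of problems solved using Machine Learning, and a practical implementation in R.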

Ever since the technical revolution, we've been generating an enormous amount of data. As per research, we generate around 2.5 quintillion bytes of data every single day! It is estimated that by 2020, 1.7 MB of data will be created every second for every person on earth.

With the availability of so much data, it is finally possible to build predictive models that can study and analyze complex data to find useful insights and deliver more accurate results.

Top Tier companies such as Netflix and Amazon build such Machine Learning models by using tons of data in order to identify profitable opportunities and avoid unwanted risks.

Here's a list of reasons why Machine Learning is so important:

Figure: Importance of Machine Learning

To give you a better understanding of how important Machine Learning is, let's list a couple of Machine Learning applications:

These were a few examples of how Machine Learning is implemented in Top Tier companies. Here's a blog on the Top 10 Applications of Machine Learning; do give it a read to learn more.

Now that you know why Machine Learning is so important, let's look at what exactly Machine Learning is.

The term Machine Learning was coined by Arthur Samuel in 1959. Looking back, that was a significant year in terms of technological advancements.

If you search the web for what Machine Learning is, you'll get at least 100 different definitions. However, a widely cited formal definition was given by Tom M. Mitchell:

A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E.

In simple terms, Machine Learning is a subset of Artificial Intelligence (AI) that gives machines the ability to learn automatically and improve from experience without being explicitly programmed to do so. In that sense, it is the practice of getting machines to solve problems by gaining the ability to think.

But wait, can a machine think or make decisions? Well, if you feed a machine a good amount of data, it will learn how to interpret, process and analyze this data by using Machine Learning Algorithms, in order to solve real-world problems.

Before moving any further, let's discuss some of the most commonly used terms in Machine Learning.

Algorithm: A Machine Learning algorithm is a set of rules and statistical techniques used to learn patterns from data and draw significant information from it. It is the logic behind a Machine Learning model. An example of a Machine Learning algorithm is the Linear Regression algorithm.

Model: A model is the main component of Machine Learning. A model is trained by using a Machine Learning Algorithm. An algorithm maps all the decisions that a model is supposed to take based on the given input, in order to get the correct output.

Predictor Variable: It is a feature (or a set of features) of the data that can be used to predict the output.

Response Variable: It is the feature or the output variable that needs to be predicted by using the predictor variable(s).

Training Data: The Machine Learning model is built using the training data. The training data helps the model to identify key trends and patterns essential to predict the output.

Testing Data: After the model is trained, it must be tested to evaluate how accurately it can predict an outcome. This is done by using the testing data set.
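To make these terms concrete, here is a minimal sketch in base R. It assumes the built-in mtcars data set, which is not part of this blog's demo, and uses Linear Regression as the algorithm. The variable names train_data, test_data and model are chosen only for illustration.

```r
# Illustrative only: mtcars ships with base R and is not the blog's data set.
data(mtcars)

# Predictor variable: wt (car weight). Response variable: mpg (fuel economy).
set.seed(42)                                   # make the random split repeatable
train_idx  <- sample(nrow(mtcars), size = 24)  # 24 of the 32 rows go to training
train_data <- mtcars[train_idx, ]              # training data: used to fit the model
test_data  <- mtcars[-train_idx, ]             # testing data: held back for evaluation

# Algorithm: Linear Regression. Model: the fitted object returned by lm().
model <- lm(mpg ~ wt, data = train_data)

# Use the trained model to predict the response for rows it has not seen.
predicted_mpg <- predict(model, newdata = test_data)
rmse <- sqrt(mean((test_data$mpg - predicted_mpg)^2))  # typical size of the error
print(rmse)
```

The same pattern (split, fit, predict, measure) carries over to the other algorithms mentioned in this blog.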

Figure: What Is Machine Learning?

To sum it up, take a look at the above figure. A Machine Learning process begins by feeding the machine lots of data. By using this data, the machine is trained to detect hidden insights and trends. These insights are then used to build a Machine Learning model by using an algorithm in order to solve a problem.

The next topic in this Introduction to Machine Learning blog is the Machine Learning Process.

The Machine Learning process involves building a predictive model that can be used to find a solution for a problem statement. To understand the Machine Learning process, let's assume that you have been given a problem that needs to be solved by using Machine Learning.

Figure: Machine Learning Process

The problem is to predict the occurrence of rain in your local area by using Machine Learning.

The following steps make up a Machine Learning process:

Step 1: Define the objective of the Problem Statement

At this step, we must understand what exactly needs to be predicted. In our case, the objective is to predict the possibility of rain by studying weather conditions. At this stage, it is also essential to take mental notes on what kind of data can be used to solve this problem or the type of approach you must follow to get to the solution.

Step 2: Data Gathering

At this stage, you must be asking questions such as,

Once you know the types of data that are required, you must understand how you can obtain this data. Data collection can be done manually or by web scraping. However, if you're a beginner and you're just looking to learn Machine Learning, you don't have to worry about getting the data. There are thousands of data resources on the web; you can just download a data set and get going.

Coming back to the problem at hand, the data needed for weather forecasting includes measures such as humidity level, temperature, pressure, locality, whether or not you live in a hill station, etc. Such data must be collected and stored for analysis.

Step 3: Data Preparation

The data you collect is almost never in the right format. You will encounter a lot of inconsistencies in the data set, such as missing values, redundant variables and duplicate values. Removing such inconsistencies is essential because they might lead to incorrect computations and predictions. Therefore, at this stage, you scan the data set for any inconsistencies and fix them then and there.
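The clean-up described above can be sketched in base R as follows. The weather data frame and its column names are invented for illustration. Dropping rows with missing values is only one possible choice; filling them in is another.

```r
# Hypothetical weather records with typical problems:
# one duplicated row, one missing value, and one redundant column.
weather <- data.frame(
  temperature   = c(21.5, 18.0, 18.0, NA, 25.1),
  humidity      = c(60, 82, 82, 75, 40),
  humidity_copy = c(60, 82, 82, 75, 40),   # redundant: identical to humidity
  rain          = c(FALSE, TRUE, TRUE, TRUE, FALSE)
)

print(colSums(is.na(weather)))               # count missing values per column

weather$humidity_copy <- NULL                # drop the redundant variable
weather <- weather[!duplicated(weather), ]   # drop exact duplicate rows
weather <- na.omit(weather)                  # drop rows that contain missing values

print(weather)
```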

Step 4: Exploratory Data Analysis

Grab your detective glasses because this stage is all about diving deep into data and finding all the hidden data mysteries. EDA or Exploratory Data Analysis is the brainstorming stage of Machine Learning. Data Exploration involves understanding the patterns and trends in the data. At this stage, all the useful insights are drawn and correlations between the variables are understood.

For example, in the case of predicting rainfall, we know that there is a strong possibility of rain if the temperature has fallen low. Such correlations must be understood and mapped at this stage.
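As a rough sketch of what such an exploration could look like in base R, the snippet below summarises a small data frame and measures how temperature relates to rain. The numbers are made up for illustration and are not real measurements.

```r
# Made-up observations for illustration; these are not real measurements.
weather <- data.frame(
  temperature = c(30, 28, 27, 25, 22, 20, 18, 16, 15, 12),
  humidity    = c(35, 40, 42, 50, 65, 70, 78, 85, 88, 92),
  rain        = c(FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE)
)

print(summary(weather))   # range and spread of every variable

# Correlation between temperature and rain (rain coded as 0/1).
temp_rain_cor <- cor(weather$temperature, as.numeric(weather$rain))
print(temp_rain_cor)      # negative here: the colder days are the rainy ones

# Mean temperature on dry days versus rainy days.
print(aggregate(temperature ~ rain, data = weather, FUN = mean))
```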

Step 5: Building a Machine Learning Model

All the insights and patterns derived during Data Exploration are used to build the Machine Learning Model. This stage always begins by splitting the data set into two parts: training data and testing data. The training data will be used to build and analyze the model. The logic of the model is based on the Machine Learning Algorithm that is being implemented.

In the case of predicting rainfall, since the output will be in the form of True (if it will rain tomorrow) or False (no rain tomorrow), we can use a Classification Algorithm such as Logistic Regression.

Choosing the right algorithm depends on the type of problem you're trying to solve, the data set and the level of complexity of the problem. In the upcoming sections, we will discuss the different types of problems that can be solved by using Machine Learning.
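Step 5 can be sketched in base R as shown below. The data is synthetic and generated only for illustration, and the 80/20 split ratio is an assumption rather than a rule. glm() with family = binomial is the base R way to fit a Logistic Regression model.

```r
set.seed(1)
n <- 200

# Synthetic data (an assumption for illustration): rain is more likely
# when humidity is high and temperature is low.
humidity    <- runif(n, 30, 100)
temperature <- runif(n, 5, 35)
p_rain      <- plogis(-2 + 0.06 * humidity - 0.1 * temperature)
weather     <- data.frame(humidity, temperature,
                          rain = rbinom(n, 1, p_rain) == 1)

# Split the data set: 80% for training, 20% for testing.
train_idx  <- sample(n, size = round(0.8 * n))
train_data <- weather[train_idx, ]
test_data  <- weather[-train_idx, ]

# Fit Logistic Regression on the training data only.
model <- glm(rain ~ humidity + temperature, data = train_data, family = binomial)
print(coef(model))   # the sign of each coefficient shows the direction of its effect
```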

Step 6: Model Evaluation & Optimization

After building a model by using the training data set, it is finally time to put the model to a test. The testing data set is used to check the efficiency of the model and how accurately it can predict the outcome. Once the accuracy is calculated, any further improvements in the model can be implemented at this stage. Methods like parameter tuning and cross-validation can be used to improve the performance of the model.
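Here is a minimal sketch of hold-out evaluation and 5-fold cross-validation in base R. The data is synthetic, and accuracy_on() is a helper defined here for illustration, not a built-in function. The 0.5 probability cut-off is also an assumption.

```r
set.seed(7)
n <- 300

# Synthetic data, as an assumption for illustration only.
humidity <- runif(n, 30, 100)
rain     <- rbinom(n, 1, plogis(-5 + 0.08 * humidity)) == 1
weather  <- data.frame(humidity, rain)

# Helper: fit Logistic Regression on `train`, return the accuracy on `test`.
accuracy_on <- function(train, test) {
  model <- glm(rain ~ humidity, data = train, family = binomial)
  prob  <- predict(model, newdata = test, type = "response")
  mean((prob > 0.5) == test$rain)   # share of correct predictions
}

# Hold-out evaluation: 240 rows for training, the remaining 60 for testing.
train_idx        <- sample(n, size = 240)
holdout_accuracy <- accuracy_on(weather[train_idx, ], weather[-train_idx, ])

# 5-fold cross-validation: every row is used for testing exactly once.
fold        <- sample(rep(1:5, length.out = n))
cv_accuracy <- sapply(1:5, function(k) {
  accuracy_on(weather[fold != k, ], weather[fold == k, ])
})

print(holdout_accuracy)
print(mean(cv_accuracy))
```

Averaging over folds gives a steadier estimate than a single split, because it depends less on which rows happened to land in the testing set.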

Step 7: Predictions

Once the model is evaluated and improved, it is finally used to make predictions. The final output can be a categorical variable (e.g. True or False) or it can be a continuous quantity (e.g. the predicted value of a stock).

In our case, for predicting the occurrence of rainfall, the output will be a categorical variable.

So that was the entire Machine Learning process. Now it's time to learn about the different ways in which machines can learn.

A machine can learn to solve a problem by following any one of three approaches: Supervised Learning, Unsupervised Learning and Reinforcement Learning.

Supervised learning is a technique in which we teach or train the machine using data which is well labeled.

To understand Supervised Learning, let's consider an analogy. As kids, we all needed guidance to solve math problems. Our teachers helped us understand what addition is and how it is done. Similarly, you can think of supervised learning as a type of Machine Learning that involves a guide. The labeled data set is the teacher that will train you to understand patterns in the data. The labeled data set is nothing but the training data set.

Figure: Supervised Learning

Consider the above figure. Here we're feeding the machine images of Tom and Jerry, and the goal is for the machine to identify and classify the images into two groups (Tom images and Jerry images). The training data set that is fed to the model is labeled, as in, we're telling the machine, 'this is how Tom looks and this is Jerry'. By doing so, you're training the machine by using labeled data. In Supervised Learning, there is a well-defined training phase done with the help of labeled data.
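The idea of learning from labels can be sketched in base R. Since image data is out of scope here, this example assumes the built-in iris data set as a stand-in: two flower species play the role of Tom and Jerry, and the Species column is the label.

```r
# iris ships with base R; two species stand in for the labeled Tom and Jerry images.
two_species <- droplevels(subset(iris, Species != "setosa"))

set.seed(3)
train_idx  <- sample(nrow(two_species), size = 70)   # 70 of the 100 labeled rows
train_data <- two_species[train_idx, ]
test_data  <- two_species[-train_idx, ]

# Training phase: the Species column is the label the model learns from.
model <- glm(Species ~ Petal.Length, data = train_data, family = binomial)

# Prediction phase: glm() models the probability of the second factor level,
# which is "virginica" here.
prob      <- predict(model, newdata = test_data, type = "response")
predicted <- ifelse(prob > 0.5, "virginica", "versicolor")

accuracy <- mean(predicted == test_data$Species)
print(accuracy)
```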

Unsupervised learning involves training by using unlabeled data and allowing the model to act on that information without guidance.

Think of unsupervised learning as a smart kid that learns without any guidance. In this type of Machine Learning, the model is not fed labeled data; that is, the model has no clue that this image is Tom and this is Jerry. It figures out patterns and the differences between Tom and Jerry on its own by taking in tons of data.

Figure: Unsupervised Learning

For example, it identifies prominent features of Tom, such as pointy ears and bigger size, to understand that this image is of type 1. Similarly, it finds such features in Jerry and knows that this image is of type 2. Therefore, it separates the images into two different groups without knowing who Tom or Jerry is.
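A minimal sketch of this idea in base R uses k-means clustering, one common unsupervised algorithm. It assumes the built-in iris data set as a stand-in for the images, with the species label deliberately left out.

```r
# iris ships with base R; the Species label is deliberately left out
# so that the algorithm only sees measurements.
measurements <- iris[, c("Petal.Length", "Petal.Width")]

set.seed(10)
clusters <- kmeans(measurements, centers = 2, nstart = 10)

# The algorithm returns group numbers (1 or 2), not names.
print(table(clusters$cluster))

# For inspection only: compare the discovered groups with the hidden labels.
print(table(cluster = clusters$cluster, species = iris$Species))
```

The group numbers are arbitrary. The algorithm only says which rows belong together, and naming the groups is left to the analyst.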

Reinforcement Learning is a part of Machine Learning where an agent is put in an environment and learns to behave in this environment by performing certain actions and observing the rewards it gets from those actions.

Imagine that you were dropped off on an isolated island. What would you do? Panic? Yes, of course, initially we all would. But as time passes by, you will learn how to live on the island. You will explore the environment, understand the climate conditions, the type of food that grows there, the dangers of the island, etc. This is exactly how Reinforcement Learning works: it involves an agent (you, stuck on the island) that is put in an unknown environment (the island), where it must learn by observing and performing actions that result in rewards.

Reinforcement Learning is mainly used in advanced Machine Learning areas such as self-driving cars, AlphaGo, etc.
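The loop of acting, observing a reward and adjusting can be sketched in base R with a tiny two-action problem (often called a two-armed bandit). Everything here is an assumption made for illustration: the reward probabilities, the epsilon-greedy strategy and the variable names. Real applications such as the ones above are far more involved.

```r
set.seed(123)
reward_prob <- c(0.2, 0.8)   # hidden from the agent: chance of a reward per action
q_value     <- c(0, 0)       # the agent's current estimate of each action's value
pulls       <- c(0, 0)       # how often each action has been tried
epsilon     <- 0.1           # share of steps spent exploring at random

for (step in 1:1000) {
  # Explore with probability epsilon, otherwise exploit the best-known action.
  if (runif(1) < epsilon) {
    action <- sample(2, 1)
  } else {
    action <- which.max(q_value)
  }

  # The environment responds with a reward of 1 or 0.
  reward <- rbinom(1, 1, reward_prob[action])

  # Update the running average of the rewards seen for this action.
  pulls[action]   <- pulls[action] + 1
  q_value[action] <- q_value[action] + (reward - q_value[action]) / pulls[action]
}

print(q_value)   # estimated value of each action
print(pulls)     # how often each action was chosen
```

The agent is never told which action is better. It finds out by trying both and keeping track of the rewards.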

To better understand the difference between Supervised, Unsupervised and Reinforcement Learning, you can go through this short video.

So that sums up the types of Machine Learning. Now, let's look at the types of problems that are solved by using Machine Learning.

Figure: Types of Problems Solved Using Machine Learning

Consider the above figure. There are three main types of problems that can be solved in Machine Learning: Regression, Classification and Clustering.

Here's a table that sums up the difference between Regression, Classification, and Clustering.

Table: Regression vs Classification vs Clustering

Now to make things interesting, I will leave a couple of problem statements below and your homework is to guess what type of problem (Regression, Classification or Clustering) it is:

Don't forget to leave your answer in the comment section.

Now that you have a good idea about what Machine Learning is and the processes involved in it, let's execute a demo that will help you understand how Machine Learning really works.

A short disclaimer: I'll be using the R language to show how Machine Learning works. R is a statistical programming language mainly used for Data Science and Machine Learning. To learn more about R, you can go through the following blogs:

Now, let's get started.

Problem Statement: To study the Seattle Weather Forecast Data set and build a Machine Learning model that can predict the possibility of rain.

Data Set Description: The data set was gathered by researching and observing the weather conditions at the Seattle-Tacoma International Airport. The dataset contains the following variables:

The target or the response variable, in this case, is RAIN. If you notice, this variable is categorical in nature, i.e. its value is of two categories, either True or False. Therefore, this is a classification problem and we will be using a classification algorithm called Logistic Regression.

Even though the name suggests that it is a Regression algorithm, it is used here for classification. It models the probability of the outcome and belongs to the GLM (Generalised Linear Model) family, hence the name Logistic Regression.
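To see how a GLM produces a class, the sketch below fits a Logistic Regression model in base R and checks that the predicted probability is the logistic function applied to a linear combination of the inputs. The data is synthetic and the 0.5 cut-off is an assumption.

```r
set.seed(5)

# Synthetic data, purely illustrative.
x <- runif(100, -3, 3)
y <- rbinom(100, 1, plogis(0.5 + 1.5 * x))

# The binomial family uses the logit link by default.
model <- glm(y ~ x, family = binomial)

new_x <- data.frame(x = c(-2, 0, 2))

# Linear part of the model (log-odds), then the logistic function 1 / (1 + exp(-z)).
z        <- predict(model, newdata = new_x, type = "link")
by_hand  <- 1 / (1 + exp(-z))
from_glm <- predict(model, newdata = new_x, type = "response")

print(cbind(log_odds = z, probability = from_glm))

# Applying a cut-off turns the probabilities into True/False classes.
predicted_class <- from_glm > 0.5
print(predicted_class)
```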

Follow the Comprehensive Guide To Logistic Regression In R blog to learn more about Logistic Regression.

Logic: To build a Logistic Regression model in order to predict whether or not it will rain on a particular day based on the weather conditions.

Now that you know the objective of this demo, let's get our brains working and start coding.

Step 1: Install and load libraries

R provides thousands of packages to run Machine Learning algorithms and mathematical models. So the first step is to install and load all the relevant libraries.

Each of these libraries serves a specific purpose. You can read more about the libraries in the official R Documentation.

Step 2: Import the Data set

Luckily, I found the data set online, so I don't have to collect it manually. In the code snippet below, I've loaded the data set into a variable called data.df by using the read.csv() function provided by R. This function loads a Comma-Separated Values (CSV) file.
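As a rough sketch of how read.csv() is used, the snippet below writes a tiny stand-in file to a temporary location and loads it into data.df. The file contents and the TEMP column are invented so that the example runs on its own. They are not the Seattle data set, whose file path is not given here.

```r
# A tiny stand-in file is written to a temporary location so the sketch runs
# on its own. The rows and the TEMP column are invented for illustration.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "DATE,TEMP,RAIN",
  "2017-01-01,45,TRUE",
  "2017-01-02,48,FALSE",
  "2017-01-03,52,FALSE"
), csv_path)

# read.csv() returns a data frame; the first line of the file supplies the column names.
data.df <- read.csv(csv_path, header = TRUE, stringsAsFactors = FALSE)

print(dim(data.df))     # number of rows and columns
print(names(data.df))   # column names taken from the header line
```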

Step 3: Studying the Data Set

Let's take a look at a couple of observations in the data set. To do this, we can use the head() function provided by R. This will list the first 6 observations in the data set.

Now, let's look at the structure of the data set by using the str() function.

In the output of str(), you can see that the DATE and RAIN variables are not stored with the correct data types. The DATE variable must be of type Date and the RAIN variable must be a factor.
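The type conversion described above could look like the sketch below. Only the DATE and RAIN column names come from the data set description. The rows are invented, and the "%Y-%m-%d" date format is an assumption that must match how dates are written in the actual file.

```r
# Invented rows for illustration; only the DATE and RAIN column names
# come from the data set description.
data.df <- data.frame(
  DATE = c("2017-01-01", "2017-01-02", "2017-01-03"),
  RAIN = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

str(data.df)   # DATE is character and RAIN is logical at this point

data.df$DATE <- as.Date(data.df$DATE, format = "%Y-%m-%d")   # text -> Date
data.df$RAIN <- as.factor(data.df$RAIN)                      # logical -> factor

str(data.df)          # DATE is now a Date, RAIN is a factor
print(head(data.df))  # head() shows at most the first 6 rows
```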

Step 4: Data Cleaning
