1.0) What is machine learning?
Definition provided by experts:
Tom Mitchell, author of the book “Machine Learning” used in most introductory machine learning courses, defines it as:
“The field of machine learning is concerned with the question of how to construct computer programs that automatically improve with experience.”
In the introduction of the book, he formalizes this point into a more precise statement:
“A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”
The short formalism above, while great, is in my opinion incomprehensible to anyone but a machine learning expert :)
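To make E, T and P a bit less abstract, here is a minimal Python sketch of my own (it is not from Mitchell's book). The task T is labeling a message as spam or not, the experience E is a list of labeled messages, and the performance measure P is accuracy on a small held-out set. The messages, the function names and the very naive word-matching learner are all assumptions made up for illustration.

```python
# Task T: label a message as spam (True) or not spam (False).
# Experience E: labeled example messages.
# Performance P: fraction of held-out messages labeled correctly.
# All data and names here are hypothetical, chosen only for illustration.

def train(examples):
    """Collect words seen in spam examples but never in non-spam ones."""
    spam_words, ham_words = set(), set()
    for text, is_spam in examples:
        (spam_words if is_spam else ham_words).update(text.lower().split())
    return spam_words - ham_words


def predict(spam_only_words, text):
    """Call a message spam if it contains any spam-only word."""
    return any(word in spam_only_words for word in text.lower().split())


def accuracy(spam_only_words, test_set):
    """Performance measure P: share of test messages labeled correctly."""
    correct = sum(predict(spam_only_words, text) == is_spam
                  for text, is_spam in test_set)
    return correct / len(test_set)


experience = [
    ("win free money now", True),
    ("meeting at noon", False),
    ("claim your prize today", True),
    ("lunch today at noon", False),
]
test_set = [
    ("free prize inside", True),
    ("claim your reward", True),
    ("project meeting today", False),
    ("noon lunch", False),
]

# Measure P after seeing 0, 1, 2, 3 and 4 examples of E.
scores = [accuracy(train(experience[:n]), test_set)
          for n in range(len(experience) + 1)]
print(scores)
```

With this toy data the accuracy starts at 0.5 with no experience and reaches 1.0 once all four examples have been seen, which is what "improves with experience E" means in practice.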
In my humble opinion, the best way to approach machine learning is to look at a few problems that we (software engineering folks) have all encountered or heard of before. Let me list the 4 most common problems.
2.0) Common Machine Learning Problems
1) Spam Detection: Given the emails in an inbox, identify which messages are spam and which are not. A model of this problem would allow a program to leave non-spam emails in the inbox and move spam emails to a spam folder.
2) Stock Trading: Given the current and past price movements for a stock, determine whether the stock should be bought, held or sold. A model of this decision problem could provide decision support to financial analysts.
3) Face Detection: Given a digital photo album of many hundreds of digital photographs, identify those photos that include a given person. This problem is an example of machine learning as well as computer vision.
4) Movie Recommendation on Netflix: Based on your viewing history, recommend new movies for you to watch.
3.0) Three types of machine learning algorithms
1) Supervised Learning: the machine learning task of inferring a function from labeled training data. The labels for a learning problem are usually provided by a human. The human provides input variables (X) and an output variable (Y), and you use an algorithm to learn the mapping function from the input to the output.
Y = f(X)
The goal is to approximate the mapping function so well that, when you have new input data (X), you can predict the output variable (Y) for that data. Based on my understanding, this is what frontier signals is trying to achieve. Supervised learning methods can be grouped into classification and regression methods, discussed in section 4.0.
2) Semi-supervised Learning: Problems where you have a large amount of input data (X) and only some of it is labeled (Y) are called semi-supervised learning problems. These problems sit between supervised and unsupervised learning. A good example is a photo archive where only some of the images are labeled (e.g. chair, cat, person) and the majority are unlabeled. Many real-world machine learning problems fall into this area.
3) Unsupervised Learning: where you only have input data (X) and no corresponding output variables. Algorithms are left to their own devices to discover and present the interesting structure in the data. Unsupervised learning problems can be further grouped into clustering and association problems, discussed in section 4.0.
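The difference between the first and the third type can be sketched in a few lines of plain Python. This is only an illustration under my own assumptions: the numbers are made up, fit_line is an ordinary least-squares line fit standing in for "an algorithm that learns f", and two_means is a tiny k-means with k = 2 standing in for "discovering structure in the data".

```python
# Hypothetical data and helper names, for illustration only.

def fit_line(xs, ys):
    """Supervised: learn f from labeled pairs (x, y) via least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def two_means(points, iterations=10):
    """Unsupervised: split unlabeled 1-D points into two groups (k-means, k = 2)."""
    centers = [min(points), max(points)]
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for p in points:
            nearest = 0 if abs(p - centers[0]) <= abs(p - centers[1]) else 1
            groups[nearest].append(p)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c
                   for g, c in zip(groups, centers)]
    return groups


# Supervised: inputs X come with outputs Y, so we can approximate Y = f(X).
X = [1.0, 2.0, 3.0, 4.0]
Y = [3.0, 5.0, 7.0, 9.0]
f = fit_line(X, Y)
print(f(5.0))  # prediction for a new, unseen input

# Unsupervised: only inputs, no outputs; the algorithm finds the groups itself.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
groups = two_means(points)
print(groups)
```

In the supervised half the algorithm is told the right answers (Y) and learns to reproduce them; in the unsupervised half nobody says which group a point belongs to, and the grouping comes from the data alone.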
4.0) Solving problems in section 2.0 using machine learning methods.
I picked the 4 machine learning problems above for a reason: each of them can be solved using one of the 4 most commonly used machine learning methods listed below:
1) Classification: The first problem listed above can be solved using a technique called classification. Data is labeled, meaning it is assigned a class, for example spam or non-spam. The decision being modeled is to assign labels to new, unlabeled pieces of data. This can be thought of as a discrimination problem, modeling the differences or similarities between groups. Classification is an example of supervised learning.
2) Regression: The second problem above can be solved using regression. Data in this case is labeled with a real value rather than a class label. Examples that are easy to understand are time series data, like the price of a stock over time. The decision being modeled is what value to predict for new, unseen data.
3) Clustering: The third problem can be approached using clustering. Data is not labeled, but can be divided into groups based on similarity and other measures of natural structure in the data. Using the third problem above as an example, we can put all the pictures of similar-looking faces together in one folder without knowing the names of the people in them.
4) Association: The fourth problem can be solved using a more complex technique called association. Data is used as the basis for the extraction of propositional rules (think if-then). Such rules may be, but typically are not, directed, meaning that the methods discover statistically supportable relationships between attributes in the data, not necessarily involving something that is being predicted. An example in the case of Netflix might be a user liking a romantic movie because they liked an action movie with some romantic aspect to it.
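The if-then rules behind association can be sketched as follows. This is a minimal sketch under my own assumptions, not how Netflix actually does it: the viewing histories are invented, and the names count and mine_rules as well as the two thresholds are hypothetical. A rule "if a then b" is kept when enough users watched both genres (support) and when most users who watched a also watched b (confidence).

```python
from itertools import permutations

# Invented viewing histories: one set of genres per user.
histories = [
    {"action", "romance"},
    {"action", "romance", "comedy"},
    {"action", "comedy"},
    {"romance", "drama"},
    {"action", "romance", "drama"},
]


def count(itemset, histories):
    """Number of users whose history contains every genre in itemset."""
    return sum(itemset <= h for h in histories)


def mine_rules(histories, min_support=0.3, min_confidence=0.7):
    """Return (a, b, support, confidence) for each rule 'if a then b' kept."""
    genres = sorted(set().union(*histories))
    rules = []
    for a, b in permutations(genres, 2):
        both = count({a, b}, histories)
        support = both / len(histories)          # share of users with a and b
        if support < min_support:
            continue
        confidence = both / count({a}, histories)  # share of a-watchers with b
        if confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return rules


rules = mine_rules(histories)
for a, b, support, confidence in rules:
    print(f"if {a} then {b} (support={support:.2f}, confidence={confidence:.2f})")
```

Note that nothing here was labeled as "the thing to predict": the rules simply fall out of which genres tend to show up together, which is the undirected flavor described above.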
5.0) What it takes to be a machine learning expert
The Venn diagram below illustrates what it takes to be a machine learning expert :)