Introduction to Machine Learning

Machine learning (ML) gained significant mainstream attention over two decades ago when IBM’s Deep Blue defeated world chess champion Garry Kasparov. As a subfield of computer science, ML is both an essential component and a product of artificial intelligence (AI) research.

Today, ML is considered a mature technology area with widespread applications, including:

  • Recommending products to buyers (e.g., books or movies).
  • Predicting future market trends (e.g., real estate or stocks).
  • Assisting medical practitioners in diagnosis (e.g., classifying tumors as malignant or benign).
  • Optimizing energy consumption.

Human Learning

Humans must learn to perform tasks ranging from simple (like walking) to complex (like calculating a rocket launch angle). In cognitive science, learning is typically defined as the process of gaining information through observation.

  • Prior Information: Information related to a task is needed to execute that task properly.

  • Improvement through Learning: As a person acquires more information (learns), their efficiency in performing tasks improves. For example, experience from past rocket launches helps ensure the success of future launches.

Types of Human Learning (Analogies for ML)

The core philosophy of human learning involves learning from expert guidance and from experience. This is conventionally categorized into three types, which provide a strong analogy for the primary types of machine learning.

  1. Learning under expert guidance.
  2. Learning guided by knowledge gained from experts.
  3. Learning by self.

1. Guided Learning (Learning Under Expert Guidance)

Guided learning is the process of gaining information directly from a person (expert) possessing sufficient knowledge in a field due to past experience.

  • A student learning mathematics or science from a teacher.
  • A new professional learning the practical application of theoretical knowledge from an experienced mentor.

ML Analogy: Supervised Learning

Supervised learning involves learning from labeled training data (the "expert" or "teacher"), which provides known answers or values.

The machine is guided by this human-provided input to learn how to assign correct classes or values to new, unknown data.
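The following is a minimal sketch of supervised learning, assuming scikit-learn is installed; the toy measurements and labels are invented purely for illustration. The labeled examples play the role of the teacher.

```python
# A minimal supervised-learning sketch (scikit-learn assumed): labeled
# (input, answer) pairs guide the model, which then labels new data.
from sklearn.tree import DecisionTreeClassifier

# Toy labeled data: [height_cm, weight_kg] -> class (0 = "cat", 1 = "dog")
X_train = [[25, 4], [30, 5], [60, 25], [70, 30]]
y_train = [0, 0, 1, 1]

model = DecisionTreeClassifier()
model.fit(X_train, y_train)             # learn from the labeled training data

print(model.predict([[28, 4.5]]))       # assign a class to new, unknown data -> [0]
```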

2. Indirect Learning (Knowledge-Based Grouping)

This form of learning uses knowledge previously acquired from an expert in a different context to make decisions in a new context. There is no direct "teacher" or direct instruction for the task currently being performed.

  • Grouping by Color: A baby can group objects of the same color together even though it was never explicitly taught to group; it does so by leveraging prior, separate knowledge of what colors are (taught by a parent).

  • Identifying Word Types: A student can identify the "odd word" from a set (e.g., a verb among nouns). This ability stems from foundational knowledge (labeling words as verbs or nouns) taught by a teacher long ago, applied to a new task.

ML Analogy: Unsupervised Learning

Unsupervised learning works with unlabeled data and attempts to find natural patterns or groupings (clusters) within the data.

The learning is not guided by labeled answers but utilizes the inherent structure of the data itself.
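Below is a minimal sketch of unsupervised learning, assuming scikit-learn is installed; the unlabeled points are invented for illustration. No answers are provided, yet the algorithm recovers the two natural groups.

```python
# A minimal unsupervised-learning sketch (scikit-learn assumed): no labels are
# given; k-means finds natural groupings from the data's structure alone.
from sklearn.cluster import KMeans

# Unlabeled 2-D points that form two obvious groups
X = [[1.0, 1.1], [1.2, 0.9], [0.8, 1.0],
     [8.0, 8.2], [7.9, 8.1], [8.3, 7.8]]

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)   # cluster assignments discovered without any "teacher"
```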

3. Self-Learning (Learning from Experience)

In this scenario, humans learn autonomously, often without direct instruction, by relying on trial and error.

Learning happens through continuous observation and correction, typically learning from the outcomes (mistakes or successes) of past actions. This builds an internal set of rules based on individual experience.

  • A child learning to ride a bicycle or an adult learning to drive a car.

ML Analogy: Reinforcement Learning (RL)

The machine (the agent) learns to act autonomously to achieve a goal.

It improves performance by gathering "experience" and receiving a reward for successful actions or a penalty for incorrect ones. This feedback mechanism guides the machine to learn by itself.
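Here is a minimal, pure-Python sketch of the reward/penalty idea, using a two-armed bandit as a hypothetical environment (not a full RL algorithm such as Q-learning). The agent tries actions, observes rewards, and gradually prefers the action that works.

```python
# A trial-and-error sketch: the agent estimates each action's value from the
# rewards it receives and increasingly chooses the better action.
import random

true_reward = {"A": 0.2, "B": 0.8}   # hidden reward probabilities (the environment)
value = {"A": 0.0, "B": 0.0}         # the agent's learned value estimates
counts = {"A": 0, "B": 0}

for step in range(1000):
    # explore occasionally; otherwise exploit the best-known action
    if random.random() < 0.1:
        action = random.choice(["A", "B"])
    else:
        action = max(value, key=value.get)
    reward = 1 if random.random() < true_reward[action] else 0   # feedback from environment
    counts[action] += 1
    value[action] += (reward - value[action]) / counts[action]   # incremental average update

print(value)   # the estimate for "B" should approach 0.8
```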

Machine Learning

Machine learning (ML) is the science and engineering of building machines capable of performing useful tasks without being explicitly programmed to do so.

The most concise and universally accepted definition is provided by Tom M. Mitchell, who defines ML as follows:

“A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”

This means a machine learns if it can:

  • Gather experience (E) from past data by performing a task, and
  • Improve its performance (P) at a similar task (T) in the future.

Past experience refers to past data related to the task, which is input to the machine.
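To make E, T, and P concrete, here is a toy sketch assuming scikit-learn is installed: T is classifying handwritten digits, E is the labeled training examples, and P is accuracy on held-out data; the dataset and model are chosen only for illustration.

```python
# Mitchell's framing on a toy task: performance P (test accuracy) improves
# as the learner is given more experience E (more training examples).
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for n in (50, 200, 800):                              # increasing experience E
    model = LogisticRegression(max_iter=2000)
    model.fit(X_train[:n], y_train[:n])               # task T: classify digits
    print(n, model.score(X_test, y_test))             # performance P on unseen data
```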

ML can also be defined as the process of solving a practical problem by:

  1. Gathering a dataset.
  2. Algorithmically building a statistical model based on that dataset.

The Well-Posed Learning Problem

Before applying ML, it is essential to define the problem correctly. A framework for a well-posed learning problem involves answering three questions:

  1. What is the problem? Describe the problem informally and formally (using the Task, Experience and Performance definition) and list assumptions and similar problems.

  2. Why does the problem need to be solved? List the motivation for solving the problem, the benefits that the solution will provide and how the solution will be used.

  3. How to solve the problem? Describe how the problem would be solved manually to flush out domain knowledge.

The Machine Learning Process

The ML process emulates human learning by developing algorithms that generate models from data. This process can be divided into three basic parts:

1. Data Input (Experience)

Past data or information (experience) related to the provided task is used as the foundation for decision-making. The effectiveness of the learning process depends heavily on the quality of the data; poor-quality training data will lead to imprecise models.

2. Abstraction (Model Building)

Abstraction takes the input data and derives a conceptual map from it. The input data is represented in a summarized knowledge representation, known as a model.

  • Models can take forms such as computational blocks (if/else rules), mathematical equations (e.g., a linear regression equation), specific data structures (trees/graphs), or logical groupings of similar observations. This process is a form of Inductive Learning (a minimal sketch follows this list).
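The following is a minimal sketch of abstraction, assuming NumPy is available; the data points are made up for illustration. Raw observations are summarized into a compact model, here a fitted linear equation.

```python
# A minimal sketch of abstraction: raw observations are summarized into a
# compact model (a mathematical equation), assuming NumPy is installed.
import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=float)   # input data (experience)
y = np.array([2.1, 4.0, 6.2, 7.9, 10.1])     # observed outputs

a, b = np.polyfit(x, y, deg=1)               # derive the summarized representation
print(f"model: y = {a:.2f} * x + {b:.2f}")   # the equation stands in for the raw points
```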

Induction & Deduction

  • Induction (or inductive learning) is the tool of scientific reasoning that proceeds from specialization (data) to generalization (model). It involves summarizing specific observations into generalized rules. Learning from examples is an inductive process.

  • Deduction is reasoning from generalization to specialization, deriving specific cases and conclusions from basic principles.

3. Generalization (Application)

Generalization is the ability of a model to perform well on unseen samples (test data), which is the primary objective of machine learning. The model (abstracted representation) is applied to new, unknown data.

Rote Learning

The opposite of generalization is rote learning: the approach of simply memorizing training examples (also known as memorization-based learning), where a model can only "predict" data it has already seen.

It involves saving all input information as it is and retrieving it when needed, meaning it does not perform any real learning.

Since the model is trained on a finite set of data, generalization often requires an approximate or heuristic approach rather than perfect, reason-based decision-making, especially when the model faces data characteristics not present in the training set.

Risk of Overfitting

If a model is aligned too closely with its training data (capturing noise or outliers), its performance on new data will be poor. This is called overfitting.
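A minimal sketch of overfitting, assuming scikit-learn and NumPy are installed; the synthetic data and the choice of an unrestricted decision tree are illustrative only. A model that fits the noisy training data too closely scores much worse on unseen test data.

```python
# A sketch of overfitting: an unrestricted decision tree memorizes noise in the
# training data and generalizes poorly (scikit-learn and NumPy assumed).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))                       # inputs
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=200)     # noisy observations

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

tree = DecisionTreeRegressor(max_depth=None).fit(X_train, y_train)
print("train score:", tree.score(X_train, y_train))   # close to 1.0 (memorized)
print("test score: ", tree.score(X_test, y_test))     # noticeably lower (poor generalization)
```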

Core ML Concepts: Hypothesis and Bias

  • Hypothesis Space: This is the collection of all possible models (hypotheses) that an algorithm can create. In concept learning, ML is viewed as a search through this space for a hypothesis that is consistent with the training set. A consistent hypothesis correctly classifies all training examples (see the sketch after this list).

  • Inductive Bias (Bias): This refers to the preference mechanism or assumptions an algorithm uses to choose one specific hypothesis from all the possibilities that fit the training data.

  • Every effective learning algorithm must have its own inductive bias. Without it, the algorithm would be unable to make a definitive choice between different hypotheses that all fit the data, leading to uncertain predictions on new samples.
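Below is a minimal, pure-Python sketch of hypothesis-space search on made-up data; the threshold-classifier hypothesis space is a hypothetical toy example. Several hypotheses are consistent with the training set, so an inductive bias is needed to pick one.

```python
# Toy hypothesis space: each hypothesis is "predict 1 if x >= t" for a threshold t.
# We search for thresholds consistent with every (x, label) training example.
examples = [(1, 0), (2, 0), (3, 0), (6, 1), (7, 1), (8, 1)]   # made-up labeled data

hypothesis_space = range(0, 11)                  # all candidate thresholds t
consistent = [t for t in hypothesis_space
              if all((x >= t) == bool(label) for x, label in examples)]

print(consistent)        # [4, 5, 6]: several hypotheses fit the training data equally well
print(min(consistent))   # an inductive bias (e.g., "prefer the smallest threshold") picks one
```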

Problems where Machine Learning should not be applied

Certain problems are not suitable for Machine Learning due to the nature of the task or the available data.

Machine learning should generally not be applied in the following situations:

  • Tasks Where Humans Are Already Effective: Tasks where humans are already highly effective, or where frequent human intervention is necessary, are not ideal for machine learning (e.g., air traffic control, which is complex and requires intense human involvement).

  • Simplicity of the Task: If a problem can be solved using traditional programming paradigms or simple rule-based systems, machine learning is unnecessary (e.g., price calculators or dispute tracking systems, which rely on formulas or simple rules).

  • Already Optimized Processes: Machine learning is most valuable when there are gaps or inefficiencies in a business process. If a task or process is already optimized, implementing machine learning will not yield a sufficient return on investment.

  • Insufficient Data: If the available training data is insufficient, the algorithms cannot learn effectively, leading to poor-quality models and unreliable predictions.

Types of Machine Learning

Machine learning problems are broadly classified into three primary categories.

| Category | Learning Mechanism | Primary Goal |
| --- | --- | --- |
| Supervised Learning (Predictive Learning) | Learning from labeled training data (guided by a "teacher"). | Prediction of a class or value for unknown data, based on prior labeled data. |
| Unsupervised Learning (Descriptive Learning) | Learning from unlabeled data (self-discovery). | Discovery of underlying groups, patterns, or structures in unlabeled data. |
| Reinforcement Learning (Agent Learning) | Learning by self (trial and error). | Acting to achieve a goal in an environment, guided by a system of rewards and penalties. |

Sometimes, Semi-Supervised Learning (a hybrid of supervised and unsupervised) is included as a fourth category.

Questions

a) Explain Machine Learning using Tom Mitchell’s definition. Identify Task (T), Experience (E), and Performance (P) for a real-world problem.

b) Compare human learning and machine learning in terms of learning process and adaptability.

c) Compare human learning paradigms with machine learning paradigms at a conceptual level.

d) What are the different types of human learning, and how do they correspond to the types of machine learning?

e) Describe the Machine Learning lifecycle with emphasis on abstraction and generalization.

f) Explain why some problems are not suitable for Machine Learning.