Lecture Notes for Machine Learning: Gaussian Discriminant Analysis

Kris
4 min read · Sep 26, 2024


The following content is summarized from Lecture 5 of the Stanford course CS229.

Discriminative learning algorithms and generative learning algorithms differ fundamentally in how they model data and make predictions.

Discriminative Learning Algorithm

  • Directly model the decision boundary between different classes.
  • Method: Learn the conditional probability P(y∣x), where y is the class label and x is the input feature.
  • Example Algorithms: Logistic Regression, Support Vector Machines (SVM), Neural Networks.

Generative Learning Algorithm

  • Model how the data is generated for each class.
  • Method: Learn the joint probability P(x, y), which can be factored into P(x∣y)P(y). The algorithm derives P(y∣x) using Bayes’ theorem from this.
  • Example Algorithms: Naive Bayes, Gaussian Mixture Models (GMM), Hidden Markov Models (HMM).

Focus on the generative learning algorithm:

Recall Bayes’ theorem:

P(A∣B) = P(B∣A) P(A) / P(B)

P(A∣B): Posterior probability — the probability of event A (the hypothesis) occurring given that event B (the evidence) has occurred. This is what we are trying to compute. P(B∣A) is the likelihood of observing the evidence B given that A holds, P(A) is the prior probability of A, and P(B) is the marginal probability of the evidence.

Bayes’ Theorem plays a key role in some generative learning algorithms, especially for classification tasks. These algorithms model P(x∣y) (the likelihood) and P(y) (the prior probability), and then use Bayes’ Theorem to compute P(y∣x), which is the posterior probability used for classification.

The process is as follows:

  1. Generative models learn the distribution P(x∣y) for each class y and the prior probability P(y).
  2. When new data x needs to be classified, Bayes’ Theorem is applied to compute P(y∣x), and the class y with the highest posterior probability is chosen.
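The two steps above can be sketched with a toy discrete example. All the numbers (priors and likelihoods) below are made up for illustration, not taken from the lecture:

```python
# A minimal sketch of generative classification via Bayes' theorem,
# using hand-picked discrete likelihoods (all numbers are illustrative).

# Priors P(y) and class-conditional likelihoods P(x | y) for a binary label y
# and a feature x taking values "a" or "b".
prior = {0: 0.6, 1: 0.4}
likelihood = {0: {"a": 0.8, "b": 0.2},   # P(x | y=0)
              1: {"a": 0.3, "b": 0.7}}   # P(x | y=1)

def posterior(x):
    """Compute P(y | x) for each class with Bayes' theorem."""
    joint = {y: likelihood[y][x] * prior[y] for y in prior}  # P(x, y)
    evidence = sum(joint.values())                           # P(x)
    return {y: joint[y] / evidence for y in joint}

def classify(x):
    """Pick the class with the highest posterior probability."""
    post = posterior(x)
    return max(post, key=post.get)

print(classify("b"))  # → 1, since P(x="b", y=1) = 0.28 > P(x="b", y=0) = 0.12
```

The same structure carries over to GDA below; only the likelihood P(x∣y) changes from a lookup table to a Gaussian density.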

Gaussian Discriminant Analysis

Gaussian Discriminant Analysis (GDA) is a generative classification method, particularly suited to data in which the features of each class follow a Gaussian distribution.

GDA assumes that the data for each class is generated from a multivariate Gaussian distribution. Specifically, for each class y, the corresponding feature x is assumed to follow a Gaussian distribution.

The probability density function of the multivariate Gaussian distribution is given by:

p(x∣y=c) = 1 / ((2π)^(d/2) |Σc|^(1/2)) · exp( −(1/2) (x−μc)ᵀ Σc⁻¹ (x−μc) )

μc and Σc are the mean vector and covariance matrix of class c, respectively, and d is the dimensionality of the features. Note that Σc is symmetric and positive semi-definite; for the density above to be well-defined, Σc must in fact be invertible (positive definite).
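As a sanity check on the formula, here is a direct NumPy transcription of the density (written from scratch, not taken from the lecture code):

```python
import numpy as np

def gaussian_pdf(x, mu, Sigma):
    """Density of the multivariate Gaussian N(mu, Sigma) at x,
    transcribed term by term from the formula above."""
    d = len(mu)
    diff = x - mu
    norm_const = 1.0 / np.sqrt((2 * np.pi) ** d * np.linalg.det(Sigma))
    # Sigma^{-1} (x - mu) via a linear solve, avoiding an explicit inverse.
    return norm_const * np.exp(-0.5 * diff @ np.linalg.solve(Sigma, diff))

mu = np.array([0.0, 0.0])
Sigma = np.array([[2.0, 0.3], [0.3, 1.0]])
x = np.array([0.5, -0.2])
print(gaussian_pdf(x, mu, Sigma))
```

At x = μ with Σ = I in two dimensions, the density reduces to 1/(2π), which gives a quick numerical check of the normalization constant.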

GDA typically involves the following steps (n is the number of training examples and 1{⋅} is the indicator function):

Estimate the prior probability for each class:

P(y=c) = (1/n) · Σ_i 1{y^(i) = c}

Estimate the mean μc for each class:

μc = Σ_i 1{y^(i) = c} x^(i) / Σ_i 1{y^(i) = c}

Estimate the covariance matrix Σc for each class:

Σc = Σ_i 1{y^(i) = c} (x^(i) − μc)(x^(i) − μc)ᵀ / Σ_i 1{y^(i) = c}

Using Bayes’ theorem, calculate the posterior probability and choose the class with the highest posterior probability:

P(y=c∣x) = P(x∣y=c) P(y=c) / Σ_c′ P(x∣y=c′) P(y=c′),  ŷ = argmax_c P(y=c∣x)
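The steps above can be sketched end to end in NumPy. This is a minimal illustration with per-class covariances (names like `fit_gda` and `predict` are my own, not from the lecture):

```python
import numpy as np

def fit_gda(X, y):
    """Estimate P(y=c), mu_c, Sigma_c for each class by the formulas above."""
    params = {}
    n = len(y)
    for c in np.unique(y):
        Xc = X[y == c]                      # examples with label c
        mu = Xc.mean(axis=0)                # mu_c
        diff = Xc - mu
        Sigma = diff.T @ diff / len(Xc)     # Sigma_c
        params[c] = (len(Xc) / n, mu, Sigma)
    return params

def log_gaussian(x, mu, Sigma):
    """log p(x | y=c) for the multivariate Gaussian."""
    d = len(mu)
    diff = x - mu
    _, logdet = np.linalg.slogdet(Sigma)
    return -0.5 * (d * np.log(2 * np.pi) + logdet
                   + diff @ np.linalg.solve(Sigma, diff))

def predict(params, x):
    """argmax_c log P(x|y=c) + log P(y=c); the constant P(x) is dropped."""
    scores = {c: np.log(prior) + log_gaussian(x, mu, Sigma)
              for c, (prior, mu, Sigma) in params.items()}
    return max(scores, key=scores.get)

# Toy data: two well-separated Gaussian clusters.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 0.5, size=(50, 2)),
               rng.normal(2.0, 0.5, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)
params = fit_gda(X, y)
print(predict(params, np.array([-2.0, -2.0])))  # → 0
```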

GDA has two main variants: Linear Discriminant Analysis (LDA) and Quadratic Discriminant Analysis (QDA), which are applied based on different assumptions.

  • Linear Discriminant Analysis (LDA): Assumes all classes share the same covariance matrix, Σc=Σ, resulting in linear decision boundaries.
  • Quadratic Discriminant Analysis (QDA): Allows each class to have its own covariance matrix Σc, leading to quadratic decision boundaries.
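Why the shared covariance of LDA yields a linear boundary can be checked numerically: the quadratic terms xᵀΣ⁻¹x cancel in the log posterior odds, leaving an affine function of x. The parameter values below are arbitrary, chosen only for the check:

```python
import numpy as np

# With a shared covariance matrix, log P(y=1|x) - log P(y=0|x) is affine
# in x, so the decision boundary {x : log-odds = 0} is a hyperplane (LDA).
Sigma = np.array([[1.5, 0.4], [0.4, 1.0]])     # illustrative shared covariance
mu0, mu1 = np.array([-1.0, 0.0]), np.array([1.0, 0.5])
log_prior0, log_prior1 = np.log(0.5), np.log(0.5)

def log_odds(x):
    def quad(mu):
        diff = x - mu
        return -0.5 * diff @ np.linalg.solve(Sigma, diff)
    return (quad(mu1) + log_prior1) - (quad(mu0) + log_prior0)

# Affine functions satisfy f(t*a + (1-t)*b) = t*f(a) + (1-t)*f(b).
a, b, t = np.array([0.3, -0.7]), np.array([2.0, 1.1]), 0.25
lhs = log_odds(t * a + (1 - t) * b)
rhs = t * log_odds(a) + (1 - t) * log_odds(b)
print(abs(lhs - rhs))  # ~0 up to floating-point error
```

With class-specific Σ0 ≠ Σ1 the quadratic terms no longer cancel and the same check fails, which is exactly the QDA case.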

For Linear Discriminant Analysis (LDA) in binary classification, the model is:

y ~ Bernoulli(φ),  x∣y=0 ~ N(μ0, Σ),  x∣y=1 ~ N(μ1, Σ)

The log-likelihood function can be written as follows:

ℓ(φ, μ0, μ1, Σ) = log Π_i p(x^(i), y^(i); φ, μ0, μ1, Σ) = Σ_i [ log p(x^(i)∣y^(i); μ0, μ1, Σ) + log p(y^(i); φ) ]

Maximizing it gives the optimal parameters:

φ = (1/n) · Σ_i 1{y^(i) = 1}
μ0 = Σ_i 1{y^(i) = 0} x^(i) / Σ_i 1{y^(i) = 0}
μ1 = Σ_i 1{y^(i) = 1} x^(i) / Σ_i 1{y^(i) = 1}
Σ = (1/n) · Σ_i (x^(i) − μ_{y^(i)})(x^(i) − μ_{y^(i)})ᵀ

Note that 1{⋅} is the indicator function. In the expression for μ0, the numerator is the sum of the feature vectors of all training examples with y=0, while the denominator is the number of examples with y=0.
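These closed-form estimates translate directly into NumPy. The toy dataset below is made up for illustration; the variable names phi, mu0, mu1, Sigma mirror the formulas above:

```python
import numpy as np

# Closed-form MLE for binary LDA on a toy dataset (illustrative data).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-1.0, 1.0, size=(30, 2)),   # 30 examples of class 0
               rng.normal(1.0, 1.0, size=(20, 2))])   # 20 examples of class 1
y = np.array([0] * 30 + [1] * 20)
n = len(y)

phi = np.mean(y == 1)                  # phi = (1/n) sum_i 1{y^(i)=1}
mu0 = X[y == 0].mean(axis=0)           # mean of the y=0 examples
mu1 = X[y == 1].mean(axis=0)           # mean of the y=1 examples
mus = np.where((y == 1)[:, None], mu1, mu0)   # mu_{y^(i)} for every example
diff = X - mus                                 # x^(i) - mu_{y^(i)}
Sigma = diff.T @ diff / n                      # shared covariance matrix

print(phi)  # → 0.4, the fraction of y=1 examples (20 of 50)
```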

The figure below shows the multivariate Gaussian distribution in 3D and contour plots under different covariance matrices.

The figure below illustrates the concept of discriminative learning algorithms and generative learning algorithms, represented by logistic regression and GDA, respectively. It can be observed that the decision boundaries of the two approaches differ slightly.

Note that the green line is the decision boundary of logistic regression.

GDA and Logistic Regression

If p(x∣y) is multivariate Gaussian with a shared covariance matrix, then the posterior p(y=1∣x) is exactly a logistic function of x; the converse does not hold. GDA therefore makes stronger modeling assumptions than logistic regression: when those assumptions hold (at least approximately), GDA uses the data more efficiently, while logistic regression is more robust when they do not.
