2021

Introduction to Classification March 18

Classification is one of the most basic problems in Statistics and Machine Learning. It is an example of a supervised learning problem, where one learns from labeled data. In this note, we will learn about basic classification algorithms such as

  1. Logistic regression
  2. Linear and quadratic discriminant analysis
  3. Naive Bayes
  4. Multinomial logistic regression
  5. \(k\)-nearest neighbours
Continue reading…
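The last algorithm on the list is simple enough to sketch in a few lines. Below is a minimal, hypothetical implementation of \(k\)-nearest neighbours by majority vote; the helper name `knn_predict`, the toy data, and the choice \(k = 3\) are all illustrative, not from the note:

```python
import math
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    """Predict the label of x by majority vote among its k nearest
    training points (Euclidean distance)."""
    dists = sorted((math.dist(xi, x), yi) for xi, yi in zip(X_train, y_train))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Toy data: two well-separated clusters with labels 0 and 1.
X = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1)]
y = [0, 0, 1, 1]
print(knn_predict(X, y, (0.05, 0.1), k=3))  # → 0
```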

2020

Stochastic Processes: Notes 3 November 18

Continuous time Markov chains

You are familiar with discrete time Markov chains. In this note, we will consider the continuous time version, where a chain can stay in a state for a random (continuous) amount of time. We will see that the Markov property forces this random time to be exponentially distributed. Given this, it is not surprising that the homogeneous Poisson process \(\mathrm{PP}(\lambda)\) is a continuous time Markov chain.

Continue reading…
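As a rough illustration of the holding-time description above, here is a sketch that simulates a continuous time Markov chain by drawing exponential holding times and then jumping according to a jump chain. The two-state chain, its rates, and the function name `simulate_ctmc` are made-up examples, not from the notes:

```python
import random

def simulate_ctmc(rates, trans, state, t_max, rng):
    """Simulate a CTMC on [0, t_max]: in each state the chain waits an
    Exp(rates[state]) holding time, then jumps according to the
    transition probabilities in trans[state]."""
    t, path = 0.0, [(0.0, state)]
    while True:
        hold = rng.expovariate(rates[state])
        if t + hold > t_max:
            return path
        t += hold
        # Choose the next state from the embedded jump chain.
        state = rng.choices(list(trans[state]),
                            weights=list(trans[state].values()))[0]
        path.append((t, state))

rates = {0: 1.0, 1: 2.0}            # hypothetical holding rates
trans = {0: {1: 1.0}, 1: {0: 1.0}}  # two states that simply alternate
path = simulate_ctmc(rates, trans, 0, 10.0, random.Random(0))
```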
Stochastic Processes: Notes 2 October 27

In this note we will talk about Poisson point processes. There are many measure-theoretic issues involved, which we will quietly sweep under the carpet.

Poisson processes on \([0, \infty)\)

Poisson processes are the most prominent examples of counting processes, processes that count the number of events in some time interval.

Continue reading…
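A counting process of this kind is easy to simulate: a \(\mathrm{PP}(\lambda)\) on \([0, \infty)\) can be built from i.i.d. Exponential(\(\lambda\)) interarrival times, so \(N(t)\) is the number of partial sums that land in \([0, t]\). This is the standard construction; the parameter values below are purely illustrative:

```python
import random

def poisson_process_count(lam, t, rng):
    """N(t): number of arrivals of a PP(lam) in [0, t], built from
    i.i.d. Exponential(lam) interarrival times."""
    s, n = 0.0, 0
    while True:
        s += rng.expovariate(lam)
        if s > t:
            return n
        n += 1

rng = random.Random(42)
counts = [poisson_process_count(2.0, 5.0, rng) for _ in range(10_000)]
print(sum(counts) / len(counts))  # should be close to lam * t = 10
```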
Stochastic Processes: Notes 1 September 16

Branching Processes

Suppose that we are interested in tracking the size of a closed population through generations. The population begins with a single individual (constituting the \(0\)-th generation with size \(Z_0 = 1\)), which produces \(X_{1, 1}\) offspring in generation \(1\) and then dies. Thus the size of the first generation is \(Z_1 = X_{1, 1}\). Each of these first-generation individuals in turn produces more offspring in the next generation (the \(i\)-th one producing \(X_{2,i}\) many) and then dies. The total size of generation 2 is therefore \[ Z_2 = \sum_{i = 1}^{Z_1} X_{2, i}. \]

Continue reading…
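The recursion behind the display above, \(Z_{n+1} = \sum_{i=1}^{Z_n} X_{n+1, i}\), translates directly into code. A minimal sketch, with a made-up offspring distribution on \(\{0, 1, 2\}\) (the distribution and function names are illustrative, not from the notes):

```python
import random

def branching_generation_sizes(offspring_sampler, n_gens, rng):
    """Sizes Z_0, ..., Z_n of a branching process with Z_0 = 1; each
    individual independently produces offspring_sampler(rng) children."""
    sizes = [1]
    for _ in range(n_gens):
        z = sum(offspring_sampler(rng) for _ in range(sizes[-1]))
        sizes.append(z)
        if z == 0:  # extinction: all later generations are empty
            break
    return sizes

# Hypothetical offspring law on {0, 1, 2} with mean 1 (critical case).
offspring = lambda rng: rng.choices([0, 1, 2], weights=[0.25, 0.5, 0.25])[0]
sizes = branching_generation_sizes(offspring, 10, random.Random(1))
print(sizes)
```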
Stochastic Processes: Notes 0 September 6

Welcome to Stochastic Processes, 2020! This course will introduce you to some of the most important stochastic processes of probability theory. As you have not learned measure theory yet, we will brush aside all measure-theoretic issues.

What are stochastic processes?

Any collection \((X_t)_{t \in I}\) of random variables defined on the same probability space is a stochastic process. The set \(I\) is called the indexing set.

Continue reading…
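For a concrete instance of this definition, a simple random walk is a stochastic process with indexing set \(I = \{0, 1, \ldots, n\}\): each \(X_t\) is a partial sum of i.i.d. \(\pm 1\) steps. A quick sketch (the example is mine, not from the notes):

```python
import random

def simple_random_walk(n, rng):
    """(X_t) for t in {0, ..., n}: partial sums of i.i.d. +/-1 steps,
    a stochastic process with finite indexing set."""
    x, path = 0, [0]
    for _ in range(n):
        x += rng.choice([-1, 1])
        path.append(x)
    return path

walk = simple_random_walk(10, random.Random(0))
```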
Local CLT and Stirling's formula April 6

The very useful Stirling’s formula says that

\[ \tag{1} n! \sim \sqrt{2\pi n} e^{-n}n^{n}, \]

which can be equivalently rephrased as

\[ \tag{2} \frac{e^{-n} n^{n}}{n!} \sim \frac{1}{\sqrt{2\pi n}}. \]

Continue reading…
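Equation (2) is easy to check numerically. The sketch below evaluates \(\sqrt{2\pi n}\, e^{-n} n^{n} / n!\), which should approach \(1\) as \(n\) grows, working in logarithms (`math.lgamma(n + 1)` gives \(\log n!\) without overflow):

```python
import math

def stirling_ratio(n):
    """sqrt(2*pi*n) * e^{-n} * n^n / n!, computed via logarithms."""
    return math.exp(0.5 * math.log(2 * math.pi * n)
                    + n * math.log(n) - n - math.lgamma(n + 1))

for n in (10, 100, 1000):
    print(n, stirling_ratio(n))  # tends to 1 as n grows
```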

2019

Independence and homogeneity testing in contingency tables September 26

I am teaching Categorical Data Analysis this semester. An interesting phenomenon one encounters with two-dimensional contingency tables is that many tests of the hypothesis of independence of the two factors in the multinomial model coincide with the corresponding tests of homogeneity in the product-multinomial model: the relevant test statistics and their (asymptotic) null distributions are identical.

Continue reading…
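A concrete instance of this phenomenon: Pearson's \(X^2 = \sum (O - E)^2 / E\), with \(E_{ij}\) the product of the \(i\)-th row and \(j\)-th column totals divided by the grand total, is computed from the table alone, so the statistic is literally the same under both sampling models. A minimal sketch (the example table is made up):

```python
def pearson_chi2(table):
    """Pearson's X^2 for a two-way table: sum of (O - E)^2 / E, with
    E_ij = (row_i total)(col_j total) / grand total."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    return sum((table[i][j] - rows[i] * cols[j] / n) ** 2
               / (rows[i] * cols[j] / n)
               for i in range(len(rows)) for j in range(len(cols)))

print(round(pearson_chi2([[10, 20], [20, 10]]), 3))  # → 6.667
```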
Stochastic trace estimation August 19

Suppose we have an \(n \times n\) symmetric matrix \(M\) and we want to calculate the trace of \(M^k\) for some large exponent \(k\). Done naively, for large values of \(n\) and \(k\) this costs as much as repeated matrix multiplication. In this post we will discuss a trick used in numerical linear algebra, called stochastic trace estimation, to cut down this cost.

Continue reading…
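One standard trick of this kind, which the post may have in mind, is Hutchinson's estimator: for a random vector \(z\) with i.i.d. \(\pm 1\) (Rademacher) entries, \(\mathbb{E}[z^\top A z] = \operatorname{tr}(A)\), so averaging \(z^\top M^k z\) over a few draws needs only matrix-vector products with \(M\) rather than the matrix power itself. A sketch, with an illustrative toy matrix and sample count:

```python
import random

def hutchinson_trace(matvec, n, n_samples, rng):
    """Estimate tr(A) by averaging z^T A z over Rademacher vectors z;
    only matrix-vector products with A are required, so tr(M^k) needs
    k matvecs per sample instead of forming M^k."""
    total = 0.0
    for _ in range(n_samples):
        z = [rng.choice([-1.0, 1.0]) for _ in range(n)]
        total += sum(zi * wi for zi, wi in zip(z, matvec(z)))
    return total / n_samples

# Toy check on a 2x2 symmetric matrix with trace 5.
M = [[3.0, 1.0], [1.0, 2.0]]
matvec = lambda v: [sum(M[i][j] * v[j] for j in range(2)) for i in range(2)]
est = hutchinson_trace(matvec, 2, 20_000, random.Random(0))
```

To estimate \(\operatorname{tr}(M^k)\), one would pass a `matvec` that applies \(M\) to its argument \(k\) times in succession.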