[Leetcode 334] Increasing Triplet Subsequence

Given an array, return whether an increasing subsequence of length 3 exists or not in the array, i.e., given an array $A$, if there exist indices $i < j < k$ such that $A[i] < A[j] < A[k]$, return true; otherwise, return false. For more details, you can refer to LeetCode 334.

Solutions

Brute Force

The easiest idea would be brute force. It works as follows: for every $k$, check whether there exist $i < k$ and $j > k$ such that $A[i] < A[k]$ and $A[k] < A[j]$. The following gives its pseudo code.

def increasing_triplet_brute_force(A):
    n = len(A)
    # Try every index k as the middle element of the triplet.
    for k in range(1, n - 1):
        # Is there some A[i] < A[k] with i < k?
        found_smaller = False
        for i in range(k):
            if A[i] < A[k]:
                found_smaller = True
                break
        if found_smaller:
            # Is there some A[j] > A[k] with j > k?
            for j in range(k + 1, n):
                if A[j] > A[k]:
                    return True
    return False

Obviously, the above algorithm has time complexity of $O(n^2)$ and space complexity of $O(1)$.

Small Improvement

In the brute force algorithm, if you do some preprocessing, you can easily avoid the inner for loops, which means you can reduce the time complexity to $O(n)$. How does the preprocessing work? Actually it is quite easy. Imagine that you know the minimum number (denote it $T[k]$) before $A[k]$ and the maximum number (denote it $Q[k]$) after $A[k]$; then you only need to check whether $T[k] < A[k] < Q[k]$. And of course the preprocessing is to calculate $T$ and $Q$. Its pseudo code is shown in the following.

def increasing_triplet_prefix_suffix(A):
    n = len(A)
    if n < 3:
        return False

    # T[k] = minimum of A[0..k-1] (for k >= 1).
    T = [A[0]] * n
    for k in range(1, n):
        T[k] = min(T[k - 1], A[k - 1])

    # Q[k] = maximum of A[k+1..n-1] (for k <= n-2).
    Q = [A[-1]] * n
    for k in range(n - 2, -1, -1):
        Q[k] = max(Q[k + 1], A[k + 1])

    for k in range(1, n - 1):
        if T[k] < A[k] < Q[k]:
            return True
    return False

It is easy to figure out that this algorithm has time complexity of $O(n)$; however, it also has space complexity of $O(n)$.

Even Better

Can we do better? Yes, of course. First of all, we can at least reduce the space from $2n$ to $n$ (e.g., by replacing one of the two arrays with a single integer that is updated on the fly); however, the space complexity is still $O(n)$, just with a smaller constant factor. It seems that based on the previous idea (i.e., comparing the minimum number before an element and the maximum one after it), we can hardly reduce the space requirement further.
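For instance, here is a minimal sketch of that intermediate version, assuming it keeps the suffix-maximum array $Q$ and replaces $T$ with a running minimum (the function name and this particular choice are mine, not from the original post).

def increasing_triplet_half_space(A):
    n = len(A)
    if n < 3:
        return False

    # Q[k] = maximum of A[k+1..n-1]; still needs O(n) extra space.
    Q = [A[-1]] * n
    for k in range(n - 2, -1, -1):
        Q[k] = max(Q[k + 1], A[k + 1])

    running_min = A[0]  # minimum of A[0..k-1], maintained on the fly
    for k in range(1, n - 1):
        if running_min < A[k] < Q[k]:
            return True
        running_min = min(running_min, A[k])
    return False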

Let’s change our thought a little bit. Imagine that you know the “second order minimum element” of $A[0..k-1]$; here the “second order minimum element” is the minimum element that has at least one element before it that is less than it, mathematically, $y = \min\{A[j] : \exists\, i < j \text{ with } A[i] < A[j]\}$. Then, by comparing $A[k]$ with it, we can determine whether there is an increasing triplet subsequence in $A[0..k]$. Since, for any $k$, we only need to know the “second order minimum element” of $A[0..k-1]$, we do not need preprocessing, and if we can update the “second order minimum element” in $O(1)$ space and $O(1)$ time when we go from $k$ to $k+1$, then we can design an algorithm with time complexity of $O(n)$ and space complexity of $O(1)$.

Now, let’s focus on the key question: can we update the “second order minimum element” in $O(1)$ space and $O(1)$ time when we go from $k$ to $k+1$? Assume $y$ is the “second order minimum element” of $A[0..k-1]$. If $A[k] > y$, then we are done, because it means there exists an increasing triplet subsequence in $A[0..k]$. However, things get tricky when $A[k] \le y$. We cannot directly update $y$ to $A[k]$, since we cannot guarantee there exists at least one element that is less than $A[k]$ in $A[0..k-1]$. Go one step further: if we also keep a record of the minimum element (denote it $x$) of $A[0..k-1]$, then if $A[k] \le x$, $y$ is still the “second order minimum element” of $A[0..k]$ (and $x$ is updated to $A[k]$); and if $x < A[k] \le y$, then we can guarantee that there exists at least one element less than $A[k]$, therefore $A[k]$ becomes the “second order minimum element”. Therefore, we can update the “second order minimum element” in $O(1)$ space and $O(1)$ time when we go from $k$ to $k+1$.

The following shows the pseudo code.

import math

def increasing_triplet(A):
    x = math.inf  # minimum element seen so far
    y = math.inf  # "second order minimum element" seen so far
    for a in A:
        if a > y:
            # a is larger than an element that itself has a smaller
            # element before it, so an increasing triplet exists.
            return True
        elif a <= x:
            x = a
        else:
            y = a
    return False
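A quick sanity check on the sample inputs from the LeetCode problem statement (using the function name increasing_triplet as written above):

print(increasing_triplet([1, 2, 3, 4, 5]))     # True
print(increasing_triplet([5, 4, 3, 2, 1]))     # False
print(increasing_triplet([2, 1, 5, 0, 4, 6]))  # True: (0, 4, 6)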

1. Algorithm Packages

1.1. clrscode3e Package

The clrscode3e package was developed and is maintained by Professor Thomas H. Cormen. As indicated by its name, it is designed to reproduce the pseudocode style of the textbook Introduction to Algorithms (Third Edition) by Cormen, Leiserson, Rivest, and Stein (CLRS 3/e). For more details about the package, you can refer to its documentation.

1.1.1. Setup

STEP 1: download the clrscode3e package.

STEP 2: In the preamble of your TeX file, include the following line.

\usepackage{clrscode3e}

STEP 3: Use the syntax of this package to type your pseudocode.

1.1.2. An Example

The following gives an example of what pseudocode generated by the clrscode3e package looks like.

The source code is as follows (do not forget to include the clrscode3e package in the preamble).

\begin{codebox}
\Procname{$\proc{Insertion-Sort}(A)$}
\li \For $j \gets 2$ \To $\attrib{A}{length}$
\li \Do
$\id{key} \gets A[j]$
\li \Comment Insert $A[j]$ into the sorted sequence
$A[1 \twodots j-1]$.
\li $i \gets j-1$
\li \While $i > 0$ and $A[i] > \id{key}$
\li \Do
$A[i+1] \gets A[i]$
\li $i \gets i-1$
\End
\li $A[i+1] \gets \id{key}$
\End
\end{codebox}

The generated pseudocode looks as follows.

Hypergraphs

A hypergraph $H = (V, E)$ is a generalization of a graph in which each edge can contain any number of vertices. Here, $V$ is the set of vertices, and $E \subseteq \mathcal{P}(V)$ is the set of hyperedges, where $\mathcal{P}(V)$ is the power set of $V$ (i.e., the set of all subsets of $V$). If all hyperedges have the same size $k$, then the hypergraph is called a $k$-uniform hypergraph.
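For a concrete example (the specific sets below are mine, chosen only for illustration):

$$
V = \{1, 2, 3, 4, 5\}, \qquad E = \{\{1, 2\},\ \{2, 3, 4\},\ \{1, 3, 4, 5\}\}.
$$

Here $H = (V, E)$ is a hypergraph but not a $k$-uniform one, since its hyperedges have sizes 2, 3, and 4; an ordinary (loop-free) graph is exactly a 2-uniform hypergraph.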

In this post, a number of useful equations and inequalities are introduced, and most of them are proved formally.

3. In Probability Theory

3.1. Combinatorial Number Approximation

This subsection describes several ways to approximate binomial coefficients, which are very useful, for example, when doing some analysis on the binomial distribution.
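For orientation, one standard family of bounds of this flavor (these are well-known facts, though not necessarily the exact statements derived in the subsections below) is:

$$
\left(\frac{n}{k}\right)^{k} \;\le\; \binom{n}{k} \;\le\; \frac{n^{k}}{k!} \;\le\; \left(\frac{en}{k}\right)^{k}, \qquad 1 \le k \le n.
$$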

3.1.1. By Power Function

proof:

Lemma:

proof:

Since,

Therefore,

Hence,

3.1.2 By Exponential Function (Upper)

proof:

According to the above two inequalities, it is easy to obtain this inequality.

3.1.3. By Exponential Function (Lower)

proof:

As,

Hence, for any ,

Therefore,

3.2 Binomial Distribution

Given a binomial distribution $X \sim B(n, p)$ with $q = 1 - p$, the summation of all odd terms is,

$$
\Pr[X \text{ is odd}] = \sum_{\substack{k = 1 \\ k \text{ odd}}}^{n} \binom{n}{k} p^{k} q^{n-k} = \frac{1 - (q - p)^{n}}{2} = \frac{1 - (1 - 2p)^{n}}{2}.
$$
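One standard way to derive this identity, assuming the statement above is the intended one, is via the binomial theorem:

$$
(q + p)^{n} = \sum_{k=0}^{n} \binom{n}{k} p^{k} q^{n-k} = 1, \qquad
(q - p)^{n} = \sum_{k=0}^{n} \binom{n}{k} (-1)^{k} p^{k} q^{n-k}.
$$

Subtracting the second expansion from the first cancels the even terms and doubles the odd ones, so

$$
\sum_{k \text{ odd}} \binom{n}{k} p^{k} q^{n-k} = \frac{(q+p)^{n} - (q-p)^{n}}{2} = \frac{1 - (1 - 2p)^{n}}{2}.
$$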

3.1. Markov’s Inequality

Markov’s inequality gives an upper bound (as a function of its expectation) on the probability that a non-negative function of a random variable is no less than a positive constant.

3.1.1 Basic Version

Given any random variable $X$ and $a > 0$, we have,

$$
\Pr[|X| \ge a] \le \frac{\mathbb{E}[|X|]}{a}.
$$

proof: Since $|X| \ge 0$,
$$
\mathbb{E}[|X|] \;\ge\; \mathbb{E}\!\left[|X| \cdot \mathbf{1}\{|X| \ge a\}\right] \;\ge\; a \cdot \Pr[|X| \ge a],
$$
and dividing both sides by $a$ gives the inequality.

3.1.2 Extended Version

Given a monotonically increasing function $\varphi$ from the non-negative reals to the non-negative reals, a random variable $X$, $a \ge 0$, and $\varphi(a) > 0$, then,

$$
\Pr[|X| \ge a] \le \frac{\mathbb{E}[\varphi(|X|)]}{\varphi(a)}.
$$

proof: Since $\varphi$ is monotonically increasing, the event $\{|X| \ge a\}$ is contained in the event $\{\varphi(|X|) \ge \varphi(a)\}$, hence by the basic version,
$$
\Pr[|X| \ge a] \le \Pr[\varphi(|X|) \ge \varphi(a)] \le \frac{\mathbb{E}[\varphi(|X|)]}{\varphi(a)}.
$$

3.2. Chebyshev’s Inequality

Chebyshev’s inequality is about how “far” the values of a distribution can deviate from its mean. Formally speaking, it guarantees that for any distribution, no more than $\frac{1}{k^2}$ of the distribution’s values can be more than $k$ standard deviations away from the mean.

3.2.1 Basic Version

Let $X$ be a random variable with finite expected value $\mu$ and finite non-zero variance $\sigma^2$. Then for any real number $k > 0$,

$$
\Pr[|X - \mu| \ge k\sigma] \le \frac{1}{k^2}.
$$

proof: Apply the extended Markov’s inequality with $\varphi(t) = t^2$ to the random variable $X - \mu$ and $a = k\sigma$:
$$
\Pr[|X - \mu| \ge k\sigma] \le \frac{\mathbb{E}[(X - \mu)^2]}{k^2 \sigma^2} = \frac{\sigma^2}{k^2 \sigma^2} = \frac{1}{k^2}.
$$

Another expression is as follows:

For any $\varepsilon > 0$, the above inequality can also be written as,

$$
\Pr[|X - \mu| \ge \varepsilon] \le \frac{\sigma^2}{\varepsilon^2}.
$$

proof: Substitute $k = \varepsilon / \sigma$ into the basic version.

3.2.2 Extensions

  1. Asymmetric case: for any and , we have, proof
  2. Vector version: for a random vector $X$ with mean $\mu = \mathbb{E}[X]$, variance $\sigma^2 = \mathbb{E}[\|X - \mu\|^2]$, and an arbitrary norm $\|\cdot\|$, we have, for any $k > 0$, $\Pr[\|X - \mu\| \ge k\sigma] \le \frac{1}{k^2}$. proof

4. In Real Analysis

4.1. Hölder’s Inequality

Suppose that $a_1, \ldots, a_n$ and $b_1, \ldots, b_n$ are non-negative numbers, let $p > 1$, and let $q$ be the dual of $p$, that is,

$$
\frac{1}{p} + \frac{1}{q} = 1.
$$

Then, we have

$$
\sum_{i=1}^{n} a_i b_i \le \left(\sum_{i=1}^{n} a_i^{p}\right)^{1/p} \left(\sum_{i=1}^{n} b_i^{q}\right)^{1/q}.
$$

Lemma: Given any two non-negative numbers $u$ and $v$, and two positive numbers $p$ and $q$ such that,

$$
\frac{1}{p} + \frac{1}{q} = 1,
$$

then, we have,

$$
uv \le \frac{u^{p}}{p} + \frac{v^{q}}{q}.
$$
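A sketch of how the lemma yields Hölder’s inequality, using the standard normalization argument (the shorthand $\|a\|_p = \left(\sum_i a_i^p\right)^{1/p}$ and $\|b\|_q = \left(\sum_i b_i^q\right)^{1/q}$ is introduced here): assume both norms are non-zero (otherwise the inequality is trivial), and apply the lemma with $u = a_i / \|a\|_p$ and $v = b_i / \|b\|_q$, then sum over $i$:

$$
\sum_{i=1}^{n} \frac{a_i b_i}{\|a\|_p \|b\|_q}
\le \frac{1}{p} \sum_{i=1}^{n} \frac{a_i^{p}}{\|a\|_p^{p}}
 + \frac{1}{q} \sum_{i=1}^{n} \frac{b_i^{q}}{\|b\|_q^{q}}
= \frac{1}{p} + \frac{1}{q} = 1,
$$

which is exactly Hölder’s inequality after multiplying both sides by $\|a\|_p \|b\|_q$.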

4.2. Minkowski’s Inequality

Suppose that $a_1, \ldots, a_n$ and $b_1, \ldots, b_n$ are two non-negative sequences and $p \ge 1$, then,

$$
\left(\sum_{i=1}^{n} (a_i + b_i)^{p}\right)^{1/p} \le \left(\sum_{i=1}^{n} a_i^{p}\right)^{1/p} + \left(\sum_{i=1}^{n} b_i^{p}\right)^{1/p}.
$$
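A sketch of the usual proof via Hölder’s inequality (for $p > 1$; the case $p = 1$ is immediate): write

$$
\sum_{i} (a_i + b_i)^{p} = \sum_{i} a_i (a_i + b_i)^{p-1} + \sum_{i} b_i (a_i + b_i)^{p-1},
$$

apply Hölder’s inequality to each sum with exponents $p$ and $q = \frac{p}{p-1}$ (note $(p-1)q = p$), so that the first sum is at most $\left(\sum_i a_i^{p}\right)^{1/p} \left(\sum_i (a_i + b_i)^{p}\right)^{1 - 1/p}$ and similarly for the second, and finally divide both sides by $\left(\sum_i (a_i + b_i)^{p}\right)^{1 - 1/p}$ (assuming it is non-zero; otherwise the inequality is trivial).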

4.3. Infinite Norm

If $x = (x_1, \ldots, x_n)$ and $\|x\|_\infty = \max_{1 \le i \le n} |x_i|$, then,

$$
\|x\|_\infty \le \|x\|_p \le n^{1/p} \|x\|_\infty \quad \text{for every } p \ge 1,
$$

moreover,

$$
\lim_{p \to \infty} \|x\|_p = \|x\|_\infty.
$$

4.4 Convergent Sequence & Cauchy Sequence

If a sequence in a metric space is convergent, then it is a Cauchy sequence.
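A proof sketch using the triangle inequality (the notation $(M, d)$ for the metric space is introduced here): suppose $x_n \to x$ in $(M, d)$. Given $\varepsilon > 0$, choose $N$ such that $d(x_n, x) < \varepsilon/2$ for all $n \ge N$. Then for all $m, n \ge N$,

$$
d(x_m, x_n) \le d(x_m, x) + d(x, x_n) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon,
$$

so $(x_n)$ is a Cauchy sequence.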

Hashing

Basic Concepts

Definition 1: [Simple Uniform Hashing] A randomized algorithm for constructing hash functions is called simple uniform hashing if any given element is equally likely to hash into any of the $m$ slots.

Definition 2: [Universal Hashing] A randomized algorithm for constructing hash functions is called universal if, for any $x \ne y$ in the universe $U$, we have,

$$
\Pr_{h}[h(x) = h(y)] \le \frac{1}{m},
$$

where $m$ is the number of slots.

We also say that a set $H$ is a universal hash function family if choosing $h \in H$ uniformly at random produces a universal hashing.
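As a concrete illustration (this particular family is not described in the post; it is the classic Carter–Wegman multiply-add-mod-prime construction, which is known to be universal), here is a sketch in Python:

~~~python
import random

def make_universal_hash(m, p=2**61 - 1):
    """Draw one hash function h: {0, ..., p-1} -> {0, ..., m-1} from the
    Carter-Wegman family h_{a,b}(x) = ((a*x + b) mod p) mod m, p prime."""
    a = random.randint(1, p - 1)  # a must be non-zero
    b = random.randint(0, p - 1)
    return lambda x: ((a * x + b) % p) % m

# Picking h at random from this family gives a universal hashing:
# for any x != y (both less than p), Pr[h(x) == h(y)] <= 1/m.
h = make_universal_hash(m=128)
print(h(42), h(12345))
~~~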

Performances

1. Bloom Filter

The Bloom filter [Bloom1970] is a space-efficient probabilistic data structure. It is used for membership queries, i.e., answering the question whether an element $y$ belongs to a given set $S$. However, the Bloom filter has a so-called false positive issue, where it may answer “YES” even when $y \notin S$.

1.1 Algorithmic Description

In a standard Bloom filter, the baseline data structure is a bit array of $m$ bits together with $k$ independent hash functions. In the following description, we will use $B$ to denote the bit array, and $h_1, \ldots, h_k$ to represent the hash functions.

At the very beginning, all of its bits are set to 0. The standard Bloom filter supports the following two operations:

  1. INSERT: the following pseudo-code inserts all elements of the set $S$ into the Bloom filter.

    ~~~python
    for x in S:
        for h in hash_functions.values():
            B[h(x)] = 1
    ~~~

  2. QUERY: the following pseudo-code queries whether $y \in S$.

    ~~~python
    answer = True
    for h in hash_functions.values():
        answer &= B[h(y)] == 1
    ~~~
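Putting the two operations together, here is a minimal self-contained sketch (the class name, the use of salted SHA-256 as stand-in hash functions, and the sizes are my own illustrative choices, not something specified in the post):

~~~python
import hashlib

class BloomFilter:
    def __init__(self, m, k):
        self.m = m          # number of bits
        self.k = k          # number of hash functions
        self.bits = [0] * m

    def _positions(self, x):
        # Derive k "independent" hash values by salting SHA-256 with the
        # function index; real implementations often use double hashing.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{x}".encode()).hexdigest()
            yield int(digest, 16) % self.m

    def insert(self, x):
        for pos in self._positions(x):
            self.bits[pos] = 1

    def query(self, y):
        # May return True for y not in the set (false positive),
        # but never returns False for an inserted element.
        return all(self.bits[pos] == 1 for pos in self._positions(y))

bf = BloomFilter(m=1024, k=7)
for word in ["alice", "bob", "carol"]:
    bf.insert(word)
print(bf.query("alice"))    # True
print(bf.query("mallory"))  # False with high probability
~~~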

1.2 False Positive Analysis

From Section 1.1, you might notice that, for an element $y \notin S$, the corresponding bits (i.e., $B[h_1(y)], \ldots, B[h_k(y)]$) might all be 1’s, and therefore $y$ might be falsely claimed to be in the set $S$. This phenomenon is the so-called false positive. In this subsection, we want to analyze the probability of this phenomenon, which is usually called the false positive probability.

Assume that the Bloom filter has $m$ bits and $k$ independent hash functions, and $|S| = n$.

First of all, for any given bit, the probability that after inserting all the elements of $S$ its value is still 0 is,

$$
\left(1 - \frac{1}{m}\right)^{kn}.
$$

As $\left(1 - \frac{1}{m}\right)^{m} \approx e^{-1}$, when $m$ is large enough, the probability above can be approximated (well) by $e^{-kn/m}$. Therefore, the false positive probability can be expressed as,

$$
f = \left(1 - e^{-kn/m}\right)^{k}.
$$

Case 1: given $m$ and $n$, find the optimal $k$.

Let $f$ denote the false positive probability above, and $g = \ln f = k \ln\left(1 - e^{-kn/m}\right)$; minimizing $f$ is equivalent to minimizing $g$.

Hence, we have,

$$
\frac{\partial g}{\partial k} = \ln\left(1 - e^{-kn/m}\right) + \frac{kn}{m} \cdot \frac{e^{-kn/m}}{1 - e^{-kn/m}}.
$$

Let $p = e^{-kn/m}$ (so that $\frac{kn}{m} = -\ln p$), then,

$$
\frac{\partial g}{\partial k} = \ln(1 - p) - \frac{p \ln p}{1 - p}.
$$

It is easy to figure out that when $p = \frac{1}{2}$ (i.e., $k = \frac{m}{n} \ln 2$), $\frac{\partial g}{\partial k} = 0$. To check whether it is the minimal point and the only extreme point, we can draw the curve of $\frac{\partial g}{\partial k}$ as a function of $p$ (which is shown in the following figure).

From the above curve, we can verify that $k = \frac{m}{n} \ln 2$ is the only extreme point of $g$ and it is a local minimum, therefore it is the global minimum point. Therefore, the optimal $k$ is,

$$
k^{*} = \frac{m}{n} \ln 2.
$$
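As a quick numerical check (the specific numbers are mine, only for illustration), for $m/n = 8$ bits per element:

~~~python
import math

bits_per_element = 8          # m / n
k_opt = bits_per_element * math.log(2)
print(k_opt)                  # about 5.55, so use k = 5 or 6 in practice

# False positive probability f = (1 - exp(-k*n/m))**k at the optimum:
f_opt = (1 - math.exp(-k_opt / bits_per_element)) ** k_opt
print(f_opt)                  # about 0.021, i.e. roughly a 2% false positive rate
~~~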

References