[Leetcode 334] Increasing Triplet Subsequence

Given an array, return whether an increasing subsequence of length 3 exists or not in the array, i.e., given an array $A$, if there exist indices $i < j < k$ such that $A[i] < A[j] < A[k]$, return true; otherwise, return false. For more details, you can refer to LeetCode 334.

Solutions

Brute Force

The easiest idea would be brute force. It works as follows: for every $k$, check whether there exist $i < k$ and $j > k$ such that $A[i] < A[k]$ and $A[k] < A[j]$. The following gives its pseudo code.

def increasing_triplet_brute_force(A):
    n = len(A)
    # Try every index k as the middle element of the triplet.
    for k in range(1, n - 1):
        # Is there some A[i] < A[k] with i < k?
        found_smaller = False
        for i in range(k):
            if A[i] < A[k]:
                found_smaller = True
                break
        if found_smaller:
            # Is there some A[j] > A[k] with j > k?
            for j in range(k + 1, n):
                if A[j] > A[k]:
                    return True
    return False

Obviously, the above algorithm has time complexity of $O(n^2)$ and space complexity of $O(1)$.

Small Improvement

In the brute force algorithm, if you do some preprocessing, you can easily avoid the inner for loops, which means you can reduce the time complexity to $O(n)$. How does the preprocessing work? Actually it is quite easy. Imagine that you know the minimum number (denote it $T[k]$) before $A[k]$ and the maximum number (denote it $Q[k]$) after $A[k]$; then you only need to check whether $T[k] < A[k] < Q[k]$. And of course the preprocessing is to calculate $T$ and $Q$. Its pseudo code is shown in the following.

def increasing_triplet_prefix_suffix(A):
    n = len(A)
    if n < 3:
        return False

    # T[k] = minimum of A[0..k-1] (for k >= 1).
    T = [A[0]] * n
    for k in range(1, n):
        T[k] = min(T[k - 1], A[k - 1])

    # Q[k] = maximum of A[k+1..n-1] (for k <= n-2).
    Q = [A[-1]] * n
    for k in range(n - 2, -1, -1):
        Q[k] = max(Q[k + 1], A[k + 1])

    for k in range(1, n - 1):
        if T[k] < A[k] < Q[k]:
            return True
    return False

It is easy to figure out that this algorithm has time complexity of $O(n)$; however, it also has space complexity of $O(n)$.

Even Better

Can we do better? Yes, of course. First of all, we can at least reduce the space from $2n$ to $n$ (e.g., by replacing one of the two arrays with a single integer that is updated on the fly); however, the space complexity is still $O(n)$, just with a smaller constant factor. It seems that based on the previous idea (i.e., comparing the minimum number before an element and the maximum one after it), we can hardly reduce the space requirement further.
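For instance, here is a minimal sketch of that intermediate version, assuming it keeps the suffix-maximum array $Q$ and replaces $T$ with a running minimum (the function name and this particular choice are mine, not from the original post).

def increasing_triplet_half_space(A):
    n = len(A)
    if n < 3:
        return False

    # Q[k] = maximum of A[k+1..n-1]; still needs O(n) extra space.
    Q = [A[-1]] * n
    for k in range(n - 2, -1, -1):
        Q[k] = max(Q[k + 1], A[k + 1])

    running_min = A[0]  # minimum of A[0..k-1], maintained on the fly
    for k in range(1, n - 1):
        if running_min < A[k] < Q[k]:
            return True
        running_min = min(running_min, A[k])
    return False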

Let’s change our thought a little bit. Imagine that you know the “second order minimum element” of $A[0..k-1]$; here the “second order minimum element” is the minimum element that has at least one element before it that is less than it, mathematically, $y = \min\{A[j] : \exists\, i < j \text{ with } A[i] < A[j]\}$. Then, by comparing $A[k]$ with it, we can determine whether there is an increasing triplet subsequence in $A[0..k]$. Since, for any $k$, we only need to know the “second order minimum element” of $A[0..k-1]$, we do not need preprocessing, and if we can update the “second order minimum element” in $O(1)$ space and $O(1)$ time when we go from $k$ to $k+1$, then we can design an algorithm with time complexity of $O(n)$ and space complexity of $O(1)$.

Now, let’s focus on the key question: can we update the “second order minimum element” in $O(1)$ space and $O(1)$ time when we go from $k$ to $k+1$? Assume $y$ is the “second order minimum element” of $A[0..k-1]$. If $A[k] > y$, then we are done, because it means there exists an increasing triplet subsequence in $A[0..k]$. However, things get tricky when $A[k] \le y$. We cannot directly update $y$ to $A[k]$, since we cannot guarantee there exists at least one element that is less than $A[k]$ in $A[0..k-1]$. Go one step further: if we also keep a record of the minimum element (denote it $x$) of $A[0..k-1]$, then if $A[k] \le x$, $y$ is still the “second order minimum element” of $A[0..k]$ (and $x$ is updated to $A[k]$); and if $x < A[k] \le y$, then we can guarantee that there exists at least one element less than $A[k]$, therefore $A[k]$ becomes the “second order minimum element”. Therefore, we can update the “second order minimum element” in $O(1)$ space and $O(1)$ time when we go from $k$ to $k+1$.

The following shows the pseudo code.

import math

def increasing_triplet(A):
    x = math.inf  # minimum element seen so far
    y = math.inf  # "second order minimum element" seen so far
    for a in A:
        if a > y:
            # a is larger than an element that itself has a smaller
            # element before it, so an increasing triplet exists.
            return True
        elif a <= x:
            x = a
        else:
            y = a
    return False
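A quick sanity check on the sample inputs from the LeetCode problem statement (using the function name increasing_triplet as written above):

print(increasing_triplet([1, 2, 3, 4, 5]))     # True
print(increasing_triplet([5, 4, 3, 2, 1]))     # False
print(increasing_triplet([2, 1, 5, 0, 4, 6]))  # True: (0, 4, 6)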

1. Algorithm Packages

1.1. clrscode3e Package

The clrscode3e package was developed and is maintained by Professor Thomas H. Cormen. As indicated by its name, it is designed to reproduce the pseudocode style of the textbook Introduction to Algorithms (Third Edition) by Cormen, Leiserson, Rivest, and Stein (CLRS 3/e). For more details about the package, you can refer to its documentation.

1.1.1. Setup

STEP 1: download the clrscode3e package.

STEP 2: In the preamble of your TeX file, include the following line.

\usepackage{clrscode3e}

STEP 3: Use the syntax of this package to type your pseudocode.

1.1.2. An Example

The following gives an example of what pseudocode generated by the clrscode3e package looks like.

The source code is as follows (do not forget to include the clrscode3e package in the preamble).

\begin{codebox}
\Procname{$\proc{Insertion-Sort}(A)$}
\li \For $j \gets 2$ \To $\attrib{A}{length}$
\li \Do
$\id{key} \gets A[j]$
\li \Comment Insert $A[j]$ into the sorted sequence
$A[1 \twodots j-1]$.
\li $i \gets j-1$
\li \While $i > 0$ and $A[i] > \id{key}$
\li \Do
$A[i+1] \gets A[i]$
\li $i \gets i-1$
\End
\li $A[i+1] \gets \id{key}$
\End
\end{codebox}

The generated pseudocode looks as follows.

Hypergraphs

A hypergraph $H = (V, E)$ is a generalization of a graph in which each edge can contain any number of vertices. Here, $V$ is the set of vertices, and $E \subseteq \mathcal{P}(V)$ is the set of hyperedges, where $\mathcal{P}(V)$ is the power set of $V$ (i.e., the set of all subsets of $V$). If all hyperedges have the same size $k$, then the hypergraph is called a $k$-uniform hypergraph.
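For a concrete example (the specific sets below are mine, chosen only for illustration):

$$
V = \{1, 2, 3, 4, 5\}, \qquad E = \{\{1, 2\},\ \{2, 3, 4\},\ \{1, 3, 4, 5\}\}.
$$

Here $H = (V, E)$ is a hypergraph but not a $k$-uniform one, since its hyperedges have sizes 2, 3, and 4; an ordinary (loop-free) graph is exactly a 2-uniform hypergraph.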

In this post, a number of useful equations and inequalities are introduced, and most of them are proved formally.

3. In Probability Theory

3.1. Combinatorial Number Approximation

This subsection describes several ways to approximate binomial coefficients, which are very useful, for example, when doing some analysis on the binomial distribution.
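For orientation, one standard family of bounds of this flavor (these are well-known facts, though not necessarily the exact statements derived in the subsections below) is:

$$
\left(\frac{n}{k}\right)^{k} \;\le\; \binom{n}{k} \;\le\; \frac{n^{k}}{k!} \;\le\; \left(\frac{en}{k}\right)^{k}, \qquad 1 \le k \le n.
$$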

3.1.1. By Power Function

proof:

Lemma:

proof:

Since,

Therefore,

Hence,

3.1.2 By Exponential Function (Upper)

proof:

According to the above two inequalities, it is easy to obtain this inequality.

3.1.3. By Exponential Function (Lower)

proof:

As,

Hence, for any ,

Therefore,

3.2 Binomial Distribution

Given a binomial distribution $X \sim B(n, p)$ with $q = 1 - p$, the summation of all odd terms is,

$$
\Pr[X \text{ is odd}] = \sum_{\substack{k = 1 \\ k \text{ odd}}}^{n} \binom{n}{k} p^{k} q^{n-k} = \frac{1 - (q - p)^{n}}{2} = \frac{1 - (1 - 2p)^{n}}{2}.
$$
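One standard way to derive this identity, assuming the statement above is the intended one, is via the binomial theorem:

$$
(q + p)^{n} = \sum_{k=0}^{n} \binom{n}{k} p^{k} q^{n-k} = 1, \qquad
(q - p)^{n} = \sum_{k=0}^{n} \binom{n}{k} (-1)^{k} p^{k} q^{n-k}.
$$

Subtracting the second expansion from the first cancels the even terms and doubles the odd ones, so

$$
\sum_{k \text{ odd}} \binom{n}{k} p^{k} q^{n-k} = \frac{(q+p)^{n} - (q-p)^{n}}{2} = \frac{1 - (1 - 2p)^{n}}{2}.
$$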

3.1. Markov’s Inequality

Markov’s inequality gives an upper bound (as a function of its expectation) on the probability that a non-negative function of a random variable is no less than a positive constant.

3.1.1 Basic Version

Given any random variable $X$ and $a > 0$, we have,

$$
\Pr[|X| \ge a] \le \frac{\mathbb{E}[|X|]}{a}.
$$

proof: Since $|X| \ge 0$,
$$
\mathbb{E}[|X|] \;\ge\; \mathbb{E}\!\left[|X| \cdot \mathbf{1}\{|X| \ge a\}\right] \;\ge\; a \cdot \Pr[|X| \ge a],
$$
and dividing both sides by $a$ gives the inequality.

3.1.2 Extended Version

Given a monotonically increasing function $\varphi$ from the non-negative reals to the non-negative reals, a random variable $X$, $a \ge 0$, and $\varphi(a) > 0$, then,

$$
\Pr[|X| \ge a] \le \frac{\mathbb{E}[\varphi(|X|)]}{\varphi(a)}.
$$

proof: Since $\varphi$ is monotonically increasing, the event $\{|X| \ge a\}$ is contained in the event $\{\varphi(|X|) \ge \varphi(a)\}$, hence by the basic version,
$$
\Pr[|X| \ge a] \le \Pr[\varphi(|X|) \ge \varphi(a)] \le \frac{\mathbb{E}[\varphi(|X|)]}{\varphi(a)}.
$$

3.2. Chebyshev’s Inequality

Chebyshev’s inequality is about how “far” the values of a distribution can deviate from its mean. Formally speaking, it guarantees that for any distribution, no more than $\frac{1}{k^2}$ of the distribution’s values can be more than $k$ standard deviations away from the mean.

3.2.1 Basic Version

Let $X$ be a random variable with finite expected value $\mu$ and finite non-zero variance $\sigma^2$. Then for any real number $k > 0$,

$$
\Pr[|X - \mu| \ge k\sigma] \le \frac{1}{k^2}.
$$

proof: Apply the extended Markov’s inequality with $\varphi(t) = t^2$ to the random variable $X - \mu$ and $a = k\sigma$:
$$
\Pr[|X - \mu| \ge k\sigma] \le \frac{\mathbb{E}[(X - \mu)^2]}{k^2 \sigma^2} = \frac{\sigma^2}{k^2 \sigma^2} = \frac{1}{k^2}.
$$

Another expression is as follows:

For any $\varepsilon > 0$, the above inequality can also be written as,

$$
\Pr[|X - \mu| \ge \varepsilon] \le \frac{\sigma^2}{\varepsilon^2}.
$$

proof: Substitute $k = \varepsilon / \sigma$ into the basic version.

3.2.2 Extensions

  1. Asymmetric case: for any and , we have, proof
  2. Vector version: for a random vector $X$ with mean $\mu = \mathbb{E}[X]$, variance $\sigma^2 = \mathbb{E}[\|X - \mu\|^2]$, and an arbitrary norm $\|\cdot\|$, we have, for any $k > 0$, $\Pr[\|X - \mu\| \ge k\sigma] \le \frac{1}{k^2}$. proof

4. In Real Analysis

4.1. Hölder’s Inequality

Suppose that $a_1, \ldots, a_n$ and $b_1, \ldots, b_n$ are non-negative numbers, let $p > 1$, and let $q$ be the dual of $p$, that is,

$$
\frac{1}{p} + \frac{1}{q} = 1.
$$

Then, we have

$$
\sum_{i=1}^{n} a_i b_i \le \left(\sum_{i=1}^{n} a_i^{p}\right)^{1/p} \left(\sum_{i=1}^{n} b_i^{q}\right)^{1/q}.
$$

Lemma: Given any two non-negative numbers $u$ and $v$, and two positive numbers $p$ and $q$ such that,

$$
\frac{1}{p} + \frac{1}{q} = 1,
$$

then, we have,

$$
uv \le \frac{u^{p}}{p} + \frac{v^{q}}{q}.
$$
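A sketch of how the lemma yields Hölder’s inequality, using the standard normalization argument (the shorthand $\|a\|_p = \left(\sum_i a_i^p\right)^{1/p}$ and $\|b\|_q = \left(\sum_i b_i^q\right)^{1/q}$ is introduced here): assume both norms are non-zero (otherwise the inequality is trivial), and apply the lemma with $u = a_i / \|a\|_p$ and $v = b_i / \|b\|_q$, then sum over $i$:

$$
\sum_{i=1}^{n} \frac{a_i b_i}{\|a\|_p \|b\|_q}
\le \frac{1}{p} \sum_{i=1}^{n} \frac{a_i^{p}}{\|a\|_p^{p}}
 + \frac{1}{q} \sum_{i=1}^{n} \frac{b_i^{q}}{\|b\|_q^{q}}
= \frac{1}{p} + \frac{1}{q} = 1,
$$

which is exactly Hölder’s inequality after multiplying both sides by $\|a\|_p \|b\|_q$.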

4.2. Minkowski’s Inequality

Suppose that $a_1, \ldots, a_n$ and $b_1, \ldots, b_n$ are two non-negative sequences and $p \ge 1$, then,

$$
\left(\sum_{i=1}^{n} (a_i + b_i)^{p}\right)^{1/p} \le \left(\sum_{i=1}^{n} a_i^{p}\right)^{1/p} + \left(\sum_{i=1}^{n} b_i^{p}\right)^{1/p}.
$$
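A sketch of the usual proof via Hölder’s inequality (for $p > 1$; the case $p = 1$ is immediate): write

$$
\sum_{i} (a_i + b_i)^{p} = \sum_{i} a_i (a_i + b_i)^{p-1} + \sum_{i} b_i (a_i + b_i)^{p-1},
$$

apply Hölder’s inequality to each sum with exponents $p$ and $q = \frac{p}{p-1}$ (note $(p-1)q = p$), so that the first sum is at most $\left(\sum_i a_i^{p}\right)^{1/p} \left(\sum_i (a_i + b_i)^{p}\right)^{1 - 1/p}$ and similarly for the second, and finally divide both sides by $\left(\sum_i (a_i + b_i)^{p}\right)^{1 - 1/p}$ (assuming it is non-zero; otherwise the inequality is trivial).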

4.3. Infinite Norm

If $x = (x_1, \ldots, x_n)$ and $\|x\|_\infty = \max_{1 \le i \le n} |x_i|$, then,

$$
\|x\|_\infty \le \|x\|_p \le n^{1/p} \|x\|_\infty \quad \text{for every } p \ge 1,
$$

moreover,

$$
\lim_{p \to \infty} \|x\|_p = \|x\|_\infty.
$$

4.4 Convergent Sequence & Cauchy Sequence

If a sequence in a metric space is convergent, then it is a Cauchy sequence.
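A proof sketch using the triangle inequality (the notation $(M, d)$ for the metric space is introduced here): suppose $x_n \to x$ in $(M, d)$. Given $\varepsilon > 0$, choose $N$ such that $d(x_n, x) < \varepsilon/2$ for all $n \ge N$. Then for all $m, n \ge N$,

$$
d(x_m, x_n) \le d(x_m, x) + d(x, x_n) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon,
$$

so $(x_n)$ is a Cauchy sequence.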

Hashing

Basic Concepts

Definition 1: [Simple Uniform Hashing] A randomized algorithm for constructing hash functions is called simple uniform hashing if any given element is equally likely to hash into any of the $m$ slots.

Definition 2: [Universal Hashing] A randomized algorithm for constructing hash functions is called universal if, for any $x \ne y$ in the universe $U$, we have,

$$
\Pr_{h}[h(x) = h(y)] \le \frac{1}{m},
$$

where $m$ is the number of slots.

We also say that a set $H$ is a universal hash function family if choosing $h \in H$ uniformly at random produces a universal hashing.
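As a concrete illustration (this particular family is not described in the post; it is the classic Carter–Wegman multiply-add-mod-prime construction, which is known to be universal), here is a sketch in Python:

~~~python
import random

def make_universal_hash(m, p=2**61 - 1):
    """Draw one hash function h: {0, ..., p-1} -> {0, ..., m-1} from the
    Carter-Wegman family h_{a,b}(x) = ((a*x + b) mod p) mod m, p prime."""
    a = random.randint(1, p - 1)  # a must be non-zero
    b = random.randint(0, p - 1)
    return lambda x: ((a * x + b) % p) % m

# Picking h at random from this family gives a universal hashing:
# for any x != y (both less than p), Pr[h(x) == h(y)] <= 1/m.
h = make_universal_hash(m=128)
print(h(42), h(12345))
~~~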

Performances

1. Bloom Filter

The Bloom filter [Bloom1970] is a space-efficient probabilistic data structure. It is used for membership queries, i.e., answering the question whether an element $y$ belongs to a given set $S$. However, the Bloom filter has a so-called false positive issue, where it may answer “YES” even when $y \notin S$.

1.1 Algorithmic Description

In a standard Bloom filter, the baseline data structure is a bit array of $m$ bits together with $k$ independent hash functions. In the following description, we will use $B$ to denote the bit array, and $h_1, \ldots, h_k$ to represent the hash functions.

At the very beginning, all of its bits are set to 0. The standard Bloom filter supports the following two operations:

  1. INSERT: the following pseudo-code inserts all elements of the set $S$ into the Bloom filter.

    ~~~python
    for x in S:
        for h in hash_functions.values():
            B[h(x)] = 1
    ~~~

  2. QUERY: the following pseudo-code queries whether $y \in S$.

    ~~~python
    answer = True
    for h in hash_functions.values():
        answer &= B[h(y)] == 1
    ~~~
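Putting the two operations together, here is a minimal self-contained sketch (the class name, the use of salted SHA-256 as stand-in hash functions, and the sizes are my own illustrative choices, not something specified in the post):

~~~python
import hashlib

class BloomFilter:
    def __init__(self, m, k):
        self.m = m          # number of bits
        self.k = k          # number of hash functions
        self.bits = [0] * m

    def _positions(self, x):
        # Derive k "independent" hash values by salting SHA-256 with the
        # function index; real implementations often use double hashing.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{x}".encode()).hexdigest()
            yield int(digest, 16) % self.m

    def insert(self, x):
        for pos in self._positions(x):
            self.bits[pos] = 1

    def query(self, y):
        # May return True for y not in the set (false positive),
        # but never returns False for an inserted element.
        return all(self.bits[pos] == 1 for pos in self._positions(y))

bf = BloomFilter(m=1024, k=7)
for word in ["alice", "bob", "carol"]:
    bf.insert(word)
print(bf.query("alice"))    # True
print(bf.query("mallory"))  # False with high probability
~~~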

1.2 False Positive Analysis

From Section 1.1, you might notice that, for an element $y \notin S$, the corresponding bits (i.e., $B[h_1(y)], \ldots, B[h_k(y)]$) might all be 1’s, and therefore $y$ might be falsely claimed to be in the set $S$. This phenomenon is the so-called false positive. In this subsection, we want to analyze the probability of this phenomenon, which is usually called the false positive probability.

Assume that the Bloom filter has $m$ bits and $k$ independent hash functions, and $|S| = n$.

First of all, for any given bit, the probability that after inserting all the elements of $S$ its value is still 0 is,

$$
\left(1 - \frac{1}{m}\right)^{kn}.
$$

As $\left(1 - \frac{1}{m}\right)^{m} \approx e^{-1}$, when $m$ is large enough, the probability above can be approximated (well) by $e^{-kn/m}$. Therefore, the false positive probability can be expressed as,

$$
f = \left(1 - e^{-kn/m}\right)^{k}.
$$

Case 1: given $m$ and $n$, find the optimal $k$.

Let $f$ denote the false positive probability above, and $g = \ln f = k \ln\left(1 - e^{-kn/m}\right)$; minimizing $f$ is equivalent to minimizing $g$.

Hence, we have,

$$
\frac{\partial g}{\partial k} = \ln\left(1 - e^{-kn/m}\right) + \frac{kn}{m} \cdot \frac{e^{-kn/m}}{1 - e^{-kn/m}}.
$$

Let $p = e^{-kn/m}$ (so that $\frac{kn}{m} = -\ln p$), then,

$$
\frac{\partial g}{\partial k} = \ln(1 - p) - \frac{p \ln p}{1 - p}.
$$

It is easy to figure out that when $p = \frac{1}{2}$ (i.e., $k = \frac{m}{n} \ln 2$), $\frac{\partial g}{\partial k} = 0$. To check whether it is the minimal point and the only extreme point, we can draw the curve of $\frac{\partial g}{\partial k}$ as a function of $p$ (which is shown in the following figure).

From the above curve, we can verify that $k = \frac{m}{n} \ln 2$ is the only extreme point of $g$ and it is a local minimum, therefore it is the global minimum point. Therefore, the optimal $k$ is,

$$
k^{*} = \frac{m}{n} \ln 2.
$$
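As a quick numerical check (the specific numbers are mine, only for illustration), for $m/n = 8$ bits per element:

~~~python
import math

bits_per_element = 8          # m / n
k_opt = bits_per_element * math.log(2)
print(k_opt)                  # about 5.55, so use k = 5 or 6 in practice

# False positive probability f = (1 - exp(-k*n/m))**k at the optimum:
f_opt = (1 - math.exp(-k_opt / bits_per_element)) ** k_opt
print(f_opt)                  # about 0.021, i.e. roughly a 2% false positive rate
~~~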

References