Multivariate Logistic Mixtures

Abstract. Logistic mixtures, unlike normal mixtures, have not been studied for their topography. In this paper we discuss analogs of some of the multivariate normal mixture results for the multivariate logistic distribution. We focus on graphical techniques based on displaying the elevation of the density on the ridgeline. These techniques are quite elementary and carry full information about the location and relative heights of the modes and saddle points. We also consider a technique called the Π-plot, where Π is the ratio of the first derivative of the second component density to the difference between the first derivatives of the second and first component densities.


Introduction
Ray and Lindsay [6] studied the key features of multivariate normal mixtures, including the determination of the number of modes and general modality theorems. For the logistic distribution, on the other hand, such information seems to be lacking. The logistic distribution plays an important role in psychometrics, for instance in modeling item response functions (see [7]).
In this paper, we propose analogs of the multivariate normal mixture results for the multivariate logistic distribution (see [4]). The literature on determination of the number of modes in logistic mixture models has focused primarily on univariate mixtures. In fact, there is a simple description of modality when one is mixing two univariate components. Unlike the mixture of multivariate normal distributions (see [6]), for the logistic case it seems infeasible to express the ridgeline function explicitly. However, applying the implicit function theorem, we can prove that a unique explicit formula is possible locally. Moreover, we focus on displaying the elevation of the logistic mixture density on the ridgeline and address a technique called the Π-plot, both of which carry important information about modality properties of the mixture.
We conclude with remarks on the similarities and differences between the multivariate normal and logistic distributions with regard to their mixture properties.
The mixture density is

$$g(x) = \sum_{i=1}^{K} \pi_i \, \varphi(x; \mu_i, s_i),$$

where $\pi_i$ is the mixing proportion of component $i$, $\pi_i \in [0, 1]$, $\sum_{i=1}^{K} \pi_i = 1$, and $\varphi(x; \mu, s)$ is the density of a multivariate logistic distribution with mean $\mu$ and standard deviation $s$. We will sometimes use $\varphi_i(x)$ as shorthand notation for $\varphi(x; \mu_i, s_i)$, and call $\varphi_i$ the $i$th component density.

The set $S_K = \{\alpha \in \mathbb{R}^K : \alpha_i \in [0, 1], \ \sum_{i=1}^{K} \alpha_i = 1\}$ (following [4]) will be called the unit simplex. The function $x^*(\alpha)$ from $S_K$ into $\mathbb{R}^D$ will be called the ridgeline function, which satisfies the following condition

The image of this map will be denoted by $M$ and called the ridgeline surface or manifold. By Lemma 2 (see Appendix A), Definition 1 is well-defined.

Proof. Suppose that $\nabla g(x^*) = 0$, so $x^*$ is a critical point.
If we let Thus from equation (2.4), we have that for every critical value x * there exists an α such that The above formula gives the theorem.
Remark 1. Due to the Taylor expansion (see Appendix B) until first order at a = −1 from the right side of formula (2.2), omitting the remainder, we can get that Following formulae (2.5) and (2.6), we can get When D = 2, according to ∇φ(y) φ(y) = y 1 + y 2 + 6e 1 − e(y 1 + y 2 ) 2 1 − e(y 1 + y 2 ) y 1 y 2 , , and then the convex hull of (D + 1)D i F i contains the value of Proof. Following formula (2.4), we can get Thus the above formula gives the corollary.

The ridgeline elevation plot
The next step in our analysis is to consider the diagnostic properties of the elevation plot, which is a plot of the ridgeline elevation function defined by $h(\alpha) = g(x^*(\alpha))$.
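As a purely illustrative sketch, the elevation function can be evaluated numerically for a two-component bivariate mixture. The code below rests on assumptions that are not taken from the text above: the component density is assumed to have the form $D! \prod_j s_j^{-1} \exp(-\sum_j y_j) / (1 + \sum_j \exp(-y_j))^{D+1}$ with $y_j = (x_j - \mu_j)/s_j$, and the ridgeline point $x^*(\alpha)$ is assumed to be the maximizer of $\sum_i \alpha_i \log \varphi_i(x)$, as in the normal-mixture construction of [6]. The parameter values and all function names are hypothetical.

```python
import math

def log_density(x, mu, s):
    """Log of the assumed multivariate logistic density (an assumption, see text)."""
    D = len(x)
    y = [(xj - mj) / sj for xj, mj, sj in zip(x, mu, s)]
    total = 1.0 + sum(math.exp(-yj) for yj in y)
    return (math.lgamma(D + 1) - sum(math.log(sj) for sj in s)
            - sum(y) - (D + 1) * math.log(total))

def mixture(x, weights, mus, scales):
    """Mixture density g(x) = sum_i pi_i * phi_i(x)."""
    return sum(w * math.exp(log_density(x, m, s))
               for w, m, s in zip(weights, mus, scales))

def score_and_hessian(x, alpha, mus, scales):
    """Gradient and Hessian of sum_i alpha_i * log phi_i(x), written for D = 2."""
    grad = [0.0, 0.0]
    hess = [[0.0, 0.0], [0.0, 0.0]]
    for a, mu, s in zip(alpha, mus, scales):
        e = [math.exp(-(x[j] - mu[j]) / s[j]) for j in range(2)]
        total = 1.0 + e[0] + e[1]
        p = [e[0] / total, e[1] / total]
        for k in range(2):
            grad[k] += a * (3.0 * p[k] - 1.0) / s[k]
            for l in range(2):
                delta = p[k] if k == l else 0.0
                hess[k][l] += a * 3.0 * (p[k] * p[l] - delta) / (s[k] * s[l])
    return grad, hess

def ridgeline(alpha2, mus, scales, tol=1e-10, max_iter=200):
    """Maximize (1 - alpha2) * log phi_1 + alpha2 * log phi_2 by damped Newton steps."""
    alpha = (1.0 - alpha2, alpha2)

    def objective(z):
        return sum(a * log_density(z, m, s)
                   for a, m, s in zip(alpha, mus, scales))

    # Start on the segment joining the two component locations.
    x = [alpha[0] * mus[0][j] + alpha[1] * mus[1][j] for j in range(2)]
    for _ in range(max_iter):
        g, H = score_and_hessian(x, alpha, mus, scales)
        if math.hypot(g[0], g[1]) < tol:
            break
        det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
        # Newton step: -H^{-1} g, using the explicit 2 x 2 inverse.
        step = [-(H[1][1] * g[0] - H[0][1] * g[1]) / det,
                -(H[0][0] * g[1] - H[1][0] * g[0]) / det]
        t = 1.0
        current = objective(x)
        # Halve the step until the objective does not decrease.
        while t > 1e-10 and objective([x[0] + t * step[0],
                                       x[1] + t * step[1]]) < current:
            t *= 0.5
        x = [x[0] + t * step[0], x[1] + t * step[1]]
    return x

# Hypothetical parameters (not those of Example 1).
MUS = [(0.0, 0.0), (4.0, 2.0)]
SCALES = [(1.0, 1.0), (1.0, 1.0)]
WEIGHTS = [0.5, 0.5]

def elevation(alpha2):
    """Ridgeline elevation h(alpha) = g(x*(alpha))."""
    return mixture(ridgeline(alpha2, MUS, SCALES), WEIGHTS, MUS, SCALES)

# Points (alpha, h(alpha)) that could be passed to any plotting tool.
curve = [(k / 20.0, elevation(k / 20.0)) for k in range(21)]
```

Under these assumptions the weighted log-density is strictly concave, which is why a damped Newton iteration is a reasonable choice here; the local maxima and minima of the values in `curve` then indicate candidate modes and saddle points of the mixture.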

Remark 2. The positive and bounded density function $\varphi_i(x)$, characterized by the parameters $(\mu_i, s_i)$, depends on $x \in \mathbb{R}^D$ in its second-order Taylor expansion as a decreasing function of the Mahalanobis distance.

Remark 3. The density function of the $t$ distribution with $\nu$ degrees of freedom (see [5]), characterized by the parameters $(\mu, \Sigma)$, depends on $x \in \mathbb{R}^D$ as a decreasing function of the Mahalanobis distance, so the result for the $t$ distribution follows from Remark 4 of [6].
Example 1. Consider the logistic mixture with $D = 2$ and $K = 2$, and with the parameters

The corresponding ridgeline elevation function $h(\alpha)$ is shown in Figure 1.

Following formula (2.4), we know

We can define a linear subspace $W$ of vectors that are orthogonal to the surface's direction vectors $d_i$ in an appropriate sense, (2.13)

Theorem 2. If $w \in W$, then along the path $\{x^*(\alpha) + \delta w : \delta \in \mathbb{R}\}$ the function $g(x)$ takes its maximum value at $\delta = 0$.

Proof.
We know that the point $x^*(\alpha)$ lies on one of the elliptical contours of the density $\varphi_i$. According to formula (2.7), at this point the gradient in $x$ of the density $\varphi_i(x)$ is proportional to $v_i$, and so $v_i$ is orthogonal to the contour. Thus, if we start at $x^*(\alpha)$ and travel in any direction $w$ orthogonal to $v_i$, our path lies in the supporting hyperplane of the elliptically shaped upper set $\{x : \varphi_i(x) \ge \varphi_i(x^*(\alpha))\}$. Since this set is convex, our path lies outside the ellipse, and so in the set $\{x : \varphi_i(x) < \varphi_i(x^*(\alpha))\}$, except for equality at $x = x^*(\alpha)$. This means that the point $x^*(\alpha)$ is a local maximum of $\varphi_i(x)$ along any path orthogonal to $v_i$.

Now assume that $w \in W$. It follows from the form of $d_i$ that (2.14) holds. Moreover, from formula (2.4) and (2.13), we know that $w^\top v_K = 0$. Putting this together with (2.14) shows that $w^\top v_i = 0$ for $i = 1, \ldots, K$. That is, $w$ is orthogonal to every $v_i$, and hence, by the above paragraph, every component of the mixture density $g(x)$ is locally maximized along the given line $\{x^*(\alpha) + \delta w : \delta \in \mathbb{R}\}$ at $\delta = 0$, and thus so is $g(x)$.

Corollary 2. If $D \ge K - 1$, then at a critical point of $h(\alpha)$ whose second derivative matrix has $K - 1$ negative eigenvalues, the function $g(x)$ will have a critical point whose second derivative matrix has an additional $D - K + 1$ negative eigenvalues, corresponding to the dimension of the orthogonal directions $w$. In particular, if $D > K - 1$, so that the $h(\alpha)$ plot is a true dimension reduction, then $g(x)$ has no local minima, only saddlepoints and local maxima.
Proof. The directional vectors, together with their orthogonal complement W, span the space, and following Theorem 2, we know that the W vectors are all directions of local maximization.
Example 3. Consider the logistic mixture density with $D = 2$ and $K = 3$, and the parameters

Figure 3 shows the contours of the density given in Example 3.

Remark 4.
More expressive detail for the two-dimensional plots of this section could have been obtained by displaying the critical net of the density using, for example, the approximation techniques of Danovaro et al. [1]. This would show the maxima, saddlepoints and separatrices over the manifold region based on evaluating the elevation at a finite network of points.
Proof. We first prove that the logistic density is a Morse function. When $D = 2$, starting from formula (2.2) and after calculation, we find that the Hessian matrix is non-degenerate (invertible) at critical points. Thus, following [1], the remark holds.
Until this point we have focused on a terrain morphological technique that is based on Morse theory. This technique is quite elementary and carries full information about the location and relative heights of the modes and saddlepoints. Readers may also be interested in another technique, based on the Multi-Triangulation of [1], which corresponds to a refinement modification and a coarsening modification. This yields compact data structures for encoding a Multi-Triangulation, and algorithms for extracting Triangulated Irregular Networks according to user-defined resolution requirements, following [2].
We can also derive the following simple calculation formula for Π(α): which can be verified by formulae (2.5) and (2.6).
As an example, let us examine the Π-plot (the graph of Π(α)) of the two-component bivariate logistic mixture with two modes given in Example 1. As the mixing proportion in Example 1 is π = 0.5, we draw a horizontal line across the (α, Π(α)) plot (Figure 4) at height π = 0.5. This line crosses the curve once, near α = 0.5, and this crossing corresponds to the one mode, as was verified by the ridgeline elevation plot (see Figure 1).
For Example 2, as in Example 1, the line at height π = 0.5 crosses the curve near α = 0.5 (see Figure 5), which corresponds to one mode, as was verified by the ridgeline elevation plot (see Figure 2).
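The crossing count read off a Π-plot can be illustrated in the simplest setting, a univariate two-component logistic mixture, where the ridgeline can be found by bisection. This is only a sketch under assumptions not stated in the text above: the ridgeline point $x^*(\alpha)$ is taken to solve $(1 - \alpha)\,\partial_x \log \varphi_1(x) + \alpha\,\partial_x \log \varphi_2(x) = 0$, as in the normal-mixture construction of [6]; Π(α) is taken to be $\varphi_2'(\alpha) / (\varphi_2'(\alpha) - \varphi_1'(\alpha))$ with $\varphi_i(\alpha) = \varphi_i(x^*(\alpha))$ and derivatives approximated by central differences; and $\mu_1 < \mu_2$. All function names and parameter values are hypothetical.

```python
import math

def logistic_pdf(x, mu, s):
    """Univariate logistic density with location mu and scale s."""
    u = (x - mu) / (2.0 * s)
    return 1.0 / (4.0 * s * math.cosh(u) ** 2)

def ridgeline_1d(alpha, mu1, s1, mu2, s2, tol=1e-13):
    """Root in [mu1, mu2] of the weighted score equation (assumes mu1 < mu2)."""
    def r(x):
        # Negative of the weighted score; increasing in x.
        return ((1.0 - alpha) * math.tanh((x - mu1) / (2.0 * s1)) / s1
                + alpha * math.tanh((x - mu2) / (2.0 * s2)) / s2)
    lo, hi = mu1, mu2
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if r(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def pi_function(alpha, mu1, s1, mu2, s2, step=1e-5):
    """Pi(alpha) = phi_2'(alpha) / (phi_2'(alpha) - phi_1'(alpha)), by central differences."""
    def along(mu, s, a):
        # Component density evaluated on the ridgeline at parameter a.
        return logistic_pdf(ridgeline_1d(a, mu1, s1, mu2, s2), mu, s)
    d1 = (along(mu1, s1, alpha + step) - along(mu1, s1, alpha - step)) / (2.0 * step)
    d2 = (along(mu2, s2, alpha + step) - along(mu2, s2, alpha - step)) / (2.0 * step)
    return d2 / (d2 - d1)

def count_crossings(pi_level, mu1, s1, mu2, s2, n=200):
    """Number of times the horizontal line at pi_level crosses the Pi curve on a grid."""
    alphas = [(k + 0.5) / n for k in range(n)]
    above = [pi_function(a, mu1, s1, mu2, s2) > pi_level for a in alphas]
    return sum(1 for a, b in zip(above, above[1:]) if a != b)

# Hypothetical parameters: a well-separated pair and a close pair of components.
separated = count_crossings(0.5, 0.0, 1.0, 4.0, 1.0)
close = count_crossings(0.5, 0.0, 1.0, 2.0, 1.0)
```

With these hypothetical parameters the well-separated pair gives three crossings at height π = 0.5 (two modes and one antimode), whereas the close pair gives a single crossing, matching the reading of the Π-plot described above.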

The curvature function
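In this section, we look more deeply into the properties of the Π function. Recall that $\Pi(\alpha) = \varphi_2'(\alpha) / (\varphi_2'(\alpha) - \varphi_1'(\alpha))$, where $\varphi_i(\alpha)$ denotes $\varphi_i(x^*(\alpha))$. Its first derivative with respect to $\alpha$ is

$$\Pi'(\alpha) = \frac{\varphi_2'(\alpha)\,\varphi_1''(\alpha) - \varphi_1'(\alpha)\,\varphi_2''(\alpha)}{[\varphi_2'(\alpha) - \varphi_1'(\alpha)]^2}.$$

In general, we can use any function of $\alpha$ with the same numerator $\varphi_2'(\alpha)\varphi_1''(\alpha) - \varphi_1'(\alpha)\varphi_2''(\alpha)$ to determine the sign changes of $\Pi'$, because the denominator is a positive function of $\alpha$. In what follows we use the curvature function $\kappa(\alpha)$ defined by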

Properties of the curvature function κ(α)
We now give two properties of κ(α).
Lemma 1. (Two components, two dimensions, equal variance). In the equal variance case (s 1 = s 2 ), κ(α) reduces to the following expression The logistic mixture g will be bimodal if and only if π ∈ (π 1 , π 2 ), where and the α i are the solutions in [0, 1] of q(x) = 0.
Proof. Since the case is in two dimensions,

Then

Corollary 3. Let $g$ be the mixture of two logistic densities with means $\mu_1$ and $\mu_2$ and standard deviations $s_1$ and $s_2 = \sigma s_1$. Then $g$ is bimodal if and only if $\pi \in (\pi_1, \pi_2)$, where

and correspondingly the $\alpha_i$ are the solutions in $[0, 1]$ of

In particular, the one-dimensional case of the logistic mixture is treated in the preprint "A Note On The Convex Combination Of Two 1-, 2-, 3-, and 4-Parameter Logistic Item Response Functions", which follows [8].
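For the one-dimensional case just mentioned, the range of mixing proportions that yields bimodality can be explored numerically. The sketch below is an illustration under assumptions not stated in the text above: the mixture is $g(x) = \pi \varphi_1(x) + (1 - \pi)\varphi_2(x)$ with univariate logistic components and $\mu_1 < \mu_2$; for each $x$ between the locations, the weight that makes $x$ a critical point of $g$ is $\varphi_2'(x) / (\varphi_2'(x) - \varphi_1'(x))$; and $\pi_1$, $\pi_2$ are taken to be the values of this curve at its interior turning points, by analogy with the normal-mixture analysis of [6]. All function names and parameter values are hypothetical.

```python
import math

def pdf(x, mu, s):
    """Univariate logistic density with location mu and scale s."""
    return 1.0 / (4.0 * s * math.cosh((x - mu) / (2.0 * s)) ** 2)

def dpdf(x, mu, s):
    """Derivative of the logistic density with respect to x."""
    return -pdf(x, mu, s) * math.tanh((x - mu) / (2.0 * s)) / s

def critical_weight(x, mu1, s1, mu2, s2):
    """Weight pi on component 1 for which x is a critical point of the mixture."""
    d1 = dpdf(x, mu1, s1)
    d2 = dpdf(x, mu2, s2)
    return d2 / (d2 - d1)

def bimodal_range(mu1, s1, mu2, s2, n=4000):
    """Approximate (pi_1, pi_2), or None if the weight curve is monotone on the grid."""
    # Midpoint grid strictly inside (mu1, mu2), where d2 - d1 > 0.
    xs = [mu1 + (mu2 - mu1) * (k + 0.5) / n for k in range(n)]
    w = [critical_weight(x, mu1, s1, mu2, s2) for x in xs]
    turning = [w[k] for k in range(1, n - 1)
               if (w[k] - w[k - 1]) * (w[k + 1] - w[k]) < 0.0]
    if len(turning) < 2:
        return None
    return min(turning), max(turning)

# Hypothetical parameters: equal scales, locations 4 and 2 scale units apart.
wide = bimodal_range(0.0, 1.0, 4.0, 1.0)
narrow = bimodal_range(0.0, 1.0, 2.0, 1.0)
```

With these hypothetical values, `wide` is an interval containing 0.5, while `narrow` is `None`, meaning that no mixing proportion produces two modes for the closer pair. The grid search is a deliberately simple stand-in for solving the curvature equation exactly.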

Conclusion
In this paper, we have proposed a technique for studying the topography of multivariate logistic mixtures, based on the ridgeline function $x^*(\alpha)$ and the Π-plot. Unlike the case of multivariate normal mixtures, it is difficult to express the ridgeline function $x^*(\alpha)$ explicitly. However, using the implicit function theorem, we can prove that a unique explicit formula for $x^*(\alpha)$ exists locally. Thus we can obtain the Π-plot even if we cannot get the ridgeline contour plot when K ≥ 3.
Future developments of this work consist of improving the technique for displaying the contour plot when K = 3, using the Taylor expansion to account for the solution of the ridgeline equation. The mathematical expression of the constructive implicit function theorem in the logistic case is also of interest. In addition, this work could be applied in PISA (Programme for International Student Assessment) analysis.
A Definition 1 is well-defined
Lemma 2. The left-hand side of formula (2.4) satisfies the hypotheses of the implicit function theorem.
Proof. Let for any fixed k, we can get 1 + e y k + e y k D j =k e −yj 2 1 +