LYAPUNOV EXPONENTS AND LARGE DEVIATIONS ANALYSIS OF EIGENFUNCTIONS IN ANDERSON MODELS ON GRAPHS

We propose a new probabilistic approach to the analysis of decay of the Green’s functions and the eigenfunctions of the Anderson Hamiltonians on countable graphs. Our method is close in spirit to the Fractional Moment Method, but we show how the use of the fractional moments can be avoided, so that exponential decay of the Green’s functions can be established in some models where the fractional moments diverge, due to low regularity of the random potential. We elucidate the exceptional role of the Hölder continuity condition, usual in the FMM, in terms of Cramer’s condition in the large deviations problem for a suitably constructed rigorous path expansion.

We consider random Anderson Hamiltonians on locally finite countable graphs G, endowed with the canonical graph distance d(· , ·), of the form

H(ω) = ϵ∆_G + V(x; ω), (1.1)

where ∆_G is the canonical graph Laplacian on G, V : G × Ω → R is an IID random field on G relative to some probability space (Ω, F, P), and ϵ > 0 is a parameter; in fact, ϵ^{-1} measures the amplitude of the disorder. Starting from the very first mathematical works (cf. [17,18,29,13]) on Anderson localization in a multidimensional random environment, the analysis of the decay properties of the Green's functions (= matrix elements of the resolvent in a suitable basis) was carried out with the help of one or another decoupling technique, allowing one to decompose a large (ultimately, infinite) system into smaller subsystems. This was less pronounced in [17,18], but even there the multiscale approach referred to the local Hamiltonians in the subsystems of finite size. In the reformulation of the Multi-Scale Analysis (MSA) performed in [29,13] and later works, the entire inductive procedure was based on the analysis of the resolvents of the finite-dimensional subsystems. The main tool of such analysis is the well-known second resolvent equation. The drawback of this technique is the necessity to keep track of the "resonances", or "small denominators", occurring in the inductively treated finite (but growing) subsystems. The MSA has become by now a powerful method (or, rather, a family of methods) successfully applied to a very large class of Anderson models in discrete media (such as periodic lattices or, more generally, countable graphs with tempered rate of growth of balls) and continuous media (including those in Euclidean spaces and in the so-called "quantum graphs"). However, the above-mentioned drawback still prohibits the application of the MSA to graphs with exponential growth of balls, including the non-degenerate Cayley trees.
An important step forward was made in 1993 by Aizenman and Molchanov [2], who proposed a different approach, based on the complete decoupling of the fractional moments of the Green's functions in small subsystems of a larger system. Initially applied to strongly disordered systems, and having to rely on the Simon-Wolff criterion of localization [31] for the proof of spectral localization (a.s. pure point spectrum), this technique, known today as the Fractional Moment Method (FMM), has been improved and generalized in a series of works by Aizenman et al. (cf., e.g., [3,5,4]).
Compared to the MSA, the FMM is much less sensitive to the combinatorial complexity of the underlying graph Z in which the Anderson localization problem is considered; most notably, the FMM applies with no difficulty to graphs with exponential growth of the volume of balls, including the regular trees. The other side of the coin is the FMM's greater sensitivity to the regularity of the probability distribution of the random potential. Specifically, the key objects of the FMM, the expectations of some fractional power s ∈ (0, 1) of the Green's functions, are well-defined under the assumption of Hölder continuity of the marginal probability distribution of the random potential. In the framework of the FMM, this is simply taken as an observed fact; since the method stops working for less regular random potentials, the analysis of the role of the Hölder continuity also stops there.
On the other hand, the MSA excels in the situations where the random potential has an extremely poor marginal distribution (most notably, the Anderson-Bernoulli model in R^d; cf. [9]). Joint efforts by Germinet-Klein and Aizenman-Warzel resulted recently in a remarkable improvement of the celebrated Wegner estimate [6] and, ultimately, in the proof of Anderson localization in Euclidean space with alloy-type potential with any nontrivial probability distribution of the scatterers' amplitudes (cf. [22]). (The lattice counterpart of this model remains a challenging open problem.)
In the present work, we elucidate the deep reasons for the exceptional role of the Hölder continuity in the moment analysis of the Anderson Hamiltonians: it can be interpreted as Cramer's condition in a related large deviations analysis, appearing in a special, rigorous path expansion for the Green's functions. In the theory of the Large Deviations Estimates (LDE)³, the crucial role of Cramer's condition is well-known. It is also well understood how weaker LDE can be obtained under weaker assumptions on the tail probabilities of the random (e.g., IID = independent and identically distributed) summands. In our approach, the role of these summands is played by the logarithms of the Green's functions. An advantage of this language is that the logarithms can have finite moments of any given order under a very weak assumption of log-Hölder (and not necessarily Hölder) continuity of the marginal probability distribution of the potential.
Although the path expansions used in our paper, and originating in earlier works [5,23,30,32], do not solve all problems appearing in the moment analysis when only a relatively low regularity is assumed, they clearly show that there is much more room for ergodic theory in the localization analysis than there used to be, even beyond the particular class of trees, starting with the one-dimensional lattice Z^1, where one or another variant of the transfer operator can be employed. Basically, the two most important implications of the underlying ergodicity in general multi-dimensional media used so far were
• the a.s. non-randomness of the spectrum and of its components (a.c., s.c. and p.p. spectra);
• the large deviations estimates in the framework of the "Lifshitz tails", resulting in the non-perturbative (holding for an arbitrarily small amplitude of the disorder) result on localization near the spectral edges.
From the analytic perspective, the properly constructed path expansions allow one to achieve a rigorous decoupling of a large system into smaller subsystems before any probabilistic estimates are made; by comparison, the AM/FMM techniques achieve the efficient decoupling only in the expectations (fractional moments), and the latter may or may not exist.
Aiming at the decay bounds on the eigenfunction correlators, we also have to turn, at some point, to the expectations, but an important distinction from the general FMM strategy is that we calculate the expectations of the eigenfunction correlators (here P_I denotes the spectral projection on some interval I ⊂ R), and these quantities are bounded (by 1), while the Green's functions are not. (This important fact has been pointed out in numerous works on the FMM techniques.) Therefore, one can achieve a satisfactory upper bound on the eigenfunction correlators, even if the related estimates on the Green's functions are obtained only with high probability. In the FMM approach, the convergence of the fractional moments is a sine qua non condition.

³ Note that Aizenman and Warzel (cf., e.g., [8] and references therein) used the large deviations estimates in their deep analysis of the delocalization phenomena on trees, under the assumption of Lipschitz continuity of the IID random potential.
In a way, our approach, compared to the FMM, is reminiscent of the Cheshire cat's smile: it can be considered as the Fractional Moment Method ... possibly without fractional moments; the latter may appear in a disguised form, namely in Cramer's condition, if and when it is fulfilled.
I would like to summarize the above discussion by saying the following:
• First, the main point of the new approach, building on and further developing the moment analysis and the path expansions used earlier (cf. [2,5,23]), is that the decoupling is to be performed as early as possible in the analysis of the resolvents, while the use of expectations, on the contrary, has to be postponed to the latest possible stage.
• Secondly, in the main body of the decay analysis of the resolvents, the powerful techniques of the ergodic theory should (and, as we show, can, at least in some classes of models) be used more systematically, to obtain the key estimates in probability, when their counterparts in expectation are not available.
The initial motivation for this work was an attempt to find a method capable of "interpolating" between the MSA and the FMM. The technique presented here ostensibly gravitates toward the AM/FMM approach, but the remaining difficulties might require a more substantial use of the general MSA philosophy.
In this paper, we focus mainly on the positivity of the "Lyapunov exponents" (= decay exponents of the Green's functions), in the perspective of its application to the analysis of eigenfunction correlators (cf. Section 4). As to spectral localization, it is discussed only in passing (cf. subsection 4.2).
The derivation of the localization bounds from the fixed-energy analysis can be done (and has been done in the past) in different ways. Recall that Martinelli and Scoppola [26] proved the absence of the a.c. spectrum on a lattice under the assumption of fast decay of the Green's functions. Their argument, based on the Chebyshev inequality in the energy-disorder product space, has been further developed and used, e.g., by Bourgain and Kenig [9], where it is a part of an elaborate, all-in-one scaling procedure, and, in a more distinctly encapsulated form, by Elgart et al. [15]. This allows one to easily transform the fixed-energy estimates into their energy-interval counterparts. As to the spectral localization, it was derived from the energy-interval estimates for the Green's functions by Fröhlich et al. [18] and, in a modified way, by von Dreifus and Klein [13]. Dynamical localization was inferred from the energy-interval estimates by Germinet-De Bièvre [20] and by Damanik-Stollmann [12] (in a stronger form). A particularly short and transparent derivation was developed by Germinet and Klein [21], and we use their approach (with minor adaptations) in Section 4.
The passage from the Lyapunov exponents to the localization is not automatic, however. The two most spectacular counterexamples are provided by the random Anderson model on a non-degenerate (not one-dimensional) Cayley tree, and by the quasi-periodic Almost Mathieu (a.k.a. Harper's) operator in Z^1 in the strong disorder regime. In the former model, the exponential decay of the Green's functions can be insufficient to overcome the exponential growth of the spheres of radius L → ∞, while in the latter the spectrum may be purely singular continuous, due to a result by Gordon [19], if the basic frequency is abnormally fast approximated by rational numbers. Nevertheless, we prefer to address in this paper mainly the Lyapunov exponents.
The structure of the paper is as follows:
(1) The main path ("slim wormhole") expansion is introduced in Section 2 and further developed in Section 3. It appeared earlier in [23] (and, less explicitly, in [5]). See also recent works [30], [32]. Unlike [23], we derive it from the Schur complement formula, without the loop elimination in the formal random walk expansion.
(2) Some other expansions are discussed in Section 5. In particular, we propose the "fat" version of the wormhole expansion, not considered in [23], and a more promising "sandwiched" expansion.
(3) In Section 3, we employ the "slim wormhole" expansions and standard methods of the large deviations analysis to prove exponential decay of Green's functions for Anderson Hamiltonians on various types of locally finite graphs.
(4) The derivation of spectral and dynamical localization from the positivity of "Lyapunov exponents" is discussed in Section 4, where we follow essentially the works by Elgart et al. [15] and Germinet-Klein [21].
(5) The large deviations estimates used in the main text are proven in the Appendix.

Schur formula and wormhole expansions
2.1. Schur formula for block-inversion. The algorithm for inverting a finite-dimensional matrix represented in a block form has many names; an enlightening discussion of its amazing story can be found, e.g., in the book [34]. In the physical literature, particularly in quantum chemistry and ab initio numerical methods thereof, one usually refers to the Feshbach (-Fano) method. Since we are going to apply this method only to finite-dimensional operators, we refer to it as the Schur method (cf. [28]). The popularity of the Feshbach method in physics makes it (today!) quite surprising that it had not been used in the early physical papers on Anderson localization. A more problematic, perturbative path expansion, employed already in the seminal paper [1] by P. W. Anderson, results in a much more perilous journey towards the proof of localization, even for the fixed-energy Green's functions. There seems to be no obstacle to using a regularized, well-defined self-avoiding path expansion stemming from the Schur-Feshbach approach, which can even be finite yet provide all the information a physicist may need, while satisfying the most rigorous mathematician.
For a block matrix, which need not be real-symmetric or Hermitian, one has the block-inversion identity (2.1).

2.2. The case rank A = 1. Consider a finite connected graph G, the graph Laplacian ∆ = ∆_G associated with it, and a function V : G → R. In the explicit expression for ∆, n_G(x) = |S_1(x)| is called the coordination number (the number of the nearest neighbors) of the point x in the graph G. Introduce the discrete Schrödinger operator H = ϵ∆ + gV, and its matrix in the standard delta-basis in ℓ²(G). The parameter ϵ > 0 is often referred to as the "hopping" amplitude, while |g| measures the amplitude of the disorder. Certainly, by rescaling the energy one can eliminate one of the parameters ϵ and g. For the purposes of path expansions, it will be convenient to use essentially ϵ, and only occasionally g. Clearly, small values of ϵ > 0 correspond to the strong disorder regime.
Denote by G_{G\{x_0}}(x, y; E) the matrix elements (in the delta-basis) of the resolvent (D − E)^{-1} (where the latter is well-defined). With the abbreviations fixed in (2.3), the Schur formula yields, for y_0 ≠ x_0, the identity (2.4). The energy shift in E(·) is usually referred to as the "self-energy".
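The rank-one Schur step described above can be checked numerically. The following is a minimal sketch, not the paper's own formulas: it assumes the convention ∆ = (degree matrix) − (adjacency matrix) and g = 1, and the small graph, the potential values, and the function names are hypothetical choices made only for illustration. It splits off the single site x_0, computes the "self-energy" from the reduced resolvent on G \ {x_0}, and compares the result with a direct inversion.

```python
import numpy as np

def hamiltonian(adjacency, potential, eps):
    # Assumed convention: Laplacian = degree matrix minus adjacency matrix.
    adjacency = np.asarray(adjacency, dtype=float)
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    return eps * laplacian + np.diag(potential)

def resolvent(matrix, energy):
    # (matrix - E)^{-1}; a non-real E keeps the inverse well-defined.
    return np.linalg.inv(matrix - energy * np.eye(matrix.shape[0]))

def schur_column(h, energy, x0):
    """Resolvent entries G(x0, x0; E) and G(y, x0; E), y != x0, via one Schur step."""
    rest = [i for i in range(h.shape[0]) if i != x0]
    reduced = resolvent(h[np.ix_(rest, rest)], energy)   # resolvent on G \ {x0}
    coupling_col = h[rest, x0]
    coupling_row = h[x0, rest]
    self_energy = coupling_row @ reduced @ coupling_col  # energy shift at x0
    g_diag = 1.0 / (h[x0, x0] - energy - self_energy)
    g_off = -(reduced @ coupling_col) * g_diag
    return g_diag, g_off, rest

# Hypothetical example: a 5-cycle 0-1-2-3-4-0 with one chord 0-2.
ADJACENCY = [[0, 1, 1, 0, 1],
             [1, 0, 1, 0, 0],
             [1, 1, 0, 1, 0],
             [0, 0, 1, 0, 1],
             [1, 0, 0, 1, 0]]
POTENTIAL = [0.7, -1.2, 0.4, 1.9, -0.3]
EPS = 0.5
ENERGY = 0.3 + 0.2j

H = hamiltonian(ADJACENCY, POTENTIAL, EPS)
FULL = resolvent(H, ENERGY)
g_diag, g_off, rest = schur_column(H, ENERGY, x0=0)
print(abs(g_diag - FULL[0, 0]), np.max(np.abs(g_off - FULL[rest, 0])))
```

Both printed discrepancies should be at the level of rounding errors, whatever the choice of the split-off site.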
The set G \ {x_0} may be disconnected, i.e., decomposed into a union of disjoint connected subgraphs, among which exactly one, say, G_{i(y_0)}, contains y_0. More generally, given a subset of vertices X ⊂ G \ {y_0}, the set G \ X is a disjoint union of connected subgraphs, exactly one of which, G_{i(y_0)}, contains y_0. In this case, denote G ⊖ X := G_{i(y_0)}. Clearly, for any x_1 ∈ G \ X, the Green's function G_{G\X}(x_1, y_0; E) vanishes unless x_1 ∈ G ⊖ X, in which case it coincides with G_{G⊖X}(x_1, y_0; E). We see that the first step expands the initial Green's function in a sum of the "reduced" ones relative to G \ {x_0}, with x_1 ranging over the entire set G \ {x_0}.
By recursion, we derive from (2.4) the expansion (2.5), where some terms may be zero, due to the possible disconnectedness of the set G \ {x_0} (but not all, since G is connected).

Call a self-avoiding path (SAP) any finite sequence γ = (x_0, x_1, ..., x_n) of pairwise distinct points going over the edges (|γ| = n is the number of steps made by γ). It is also useful to have a notation for the subpaths of γ, indexed by j ≥ 0. Unless otherwise specified, below all paths will be assumed to be self-avoiding (SAP). Denote:
• by Γ_n(x) the set of all paths γ in G with γ(0) = x and |γ| = n;
• by Γ(x, y) the set of all paths γ starting at x and ending at y (γ : x ⇝ y);
• by Γ_n(x, y) the set of all paths γ : x ⇝ y of length n. We call such paths "n-bridges" (between x and y).

Now (2.5) becomes a particular case of a more general formula, easily obtained by recursion. Fix two distinct points x_0, y_0 ∈ G. Next, pick any 0 ≤ n ≤ d(x_0, y_0). Then we have the identity (2.6) for any non-real E. The above identity also holds, of course, for some real energies, viz. for those E for which the denominators do not vanish.

The role of the paths is two-fold:
• the point x_{j+1} may (but not necessarily does) come one step closer than x_j to the point y_0;
• the step x_j ⇝ x_{j+1} eliminates at least one point (possibly more) from the domain of the reduced resolvent.

The resolvents evolve along the excluded "tunnel" γ(j + 1) = {x_0, ..., x_{j+1}} across the space between x_0 and y_0, so we call the resulting sum a wormhole expansion. In the simplest version that we introduced, the tunnel is 1-point wide, so we call it slim. More general ("fat") wormholes are discussed in Section 5.
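The sets Γ_n(x, y) of n-bridges defined above are easy to enumerate on small graphs. The sketch below is only an illustration: the function names and the two test graphs (a 3 × 3 piece of the square lattice and a small binary tree) are hypothetical. It lists all self-avoiding paths between two vertices by a depth-first search and counts them by length, contrasting a looped graph with a tree, where the bridge is unique.

```python
def self_avoiding_paths(neighbors, start, end):
    """Yield every self-avoiding path from start to end as a tuple of vertices."""
    stack = [(start,)]
    while stack:
        path = stack.pop()
        last = path[-1]
        if last == end:
            yield path
            continue
        for nxt in neighbors[last]:
            if nxt not in path:          # pairwise distinct points only
                stack.append(path + (nxt,))

def bridge_counts(neighbors, start, end):
    """Map n -> number of n-bridges (self-avoiding paths of n steps) from start to end."""
    counts = {}
    for path in self_avoiding_paths(neighbors, start, end):
        steps = len(path) - 1
        counts[steps] = counts.get(steps, 0) + 1
    return counts

def grid_neighbors(width, height):
    """Nearest-neighbor structure of a width x height piece of the square lattice."""
    nbrs = {}
    for x in range(width):
        for y in range(height):
            candidates = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
            nbrs[(x, y)] = [(u, v) for (u, v) in candidates
                            if 0 <= u < width and 0 <= v < height]
    return nbrs

def binary_tree_neighbors(size):
    """Vertices 0..size-1; vertex i is joined to its children 2i+1 and 2i+2."""
    nbrs = {i: [] for i in range(size)}
    for i in range(size):
        for child in (2 * i + 1, 2 * i + 2):
            if child < size:
                nbrs[i].append(child)
                nbrs[child].append(i)
    return nbrs

grid_counts = bridge_counts(grid_neighbors(3, 3), (0, 0), (2, 2))
tree_counts = bridge_counts(binary_tree_neighbors(7), 3, 6)
print(grid_counts, tree_counts)
```

On the 3 × 3 grid there are 12 bridges between opposite corners (6 of the minimal length 4), while on the tree the bridge between any two vertices is unique.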

2.4. Assessing the remainder factor G_{G⊖γ}. Aizenman and Molchanov [2] and Aizenman et al. [5] used the Krein formula to reduce a similar estimation to a two-dimensional, random matrix problem, regardless of the size of the ambient graph G.

The factor G_{G⊖γ}(y_0, y_0; E) can be rewritten in a form similar to the other factors, again with the help of the Schur formula, applied to the operator H_{X_{|γ|+1}(γ)}, where the block operator A is the restriction 1_{y_0} H_{X_{|γ|+1}(γ)} 1_{y_0} to X_{|γ|+1}(γ). By the Schur formula (2.1) for the diagonal entry, one obtains the identity (2.9).

This is a rigorous identity; if |G| < ∞, then the sum is actually finite. Note that we could stop expanding the resolvent after any given number of steps n < ∞.
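For a finite graph, an expansion of this kind can be verified directly. The sketch below uses the standard self-avoiding path expansion of the resolvent, obtained by iterating the rank-one Schur step; it is assumed, not confirmed, to agree with (2.9) up to notation. The convention ∆ = (degree) − (adjacency), the test graph, and all names are hypothetical. Each path contributes the product of the diagonal Green's functions of the successively reduced graphs and of the hopping factors −H(x_j, x_{j+1}).

```python
import numpy as np

def hamiltonian(adjacency, potential, eps):
    # Assumed convention: Laplacian = degree matrix minus adjacency matrix.
    adjacency = np.asarray(adjacency, dtype=float)
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    return eps * laplacian + np.diag(potential)

def green_diag(h, energy, removed, x):
    """Diagonal resolvent entry at x for H restricted to the complement of `removed`."""
    keep = [i for i in range(h.shape[0]) if i not in removed]
    sub = h[np.ix_(keep, keep)] - energy * np.eye(len(keep))
    pos = keep.index(x)
    return np.linalg.inv(sub)[pos, pos]

def sap_expansion(h, energy, x0, y0):
    """Sum over self-avoiding paths x0 -> y0 of products of reduced diagonal resolvents."""
    total = 0.0 + 0.0j
    stack = [((x0,), 1.0 + 0.0j)]
    while stack:
        path, weight = stack.pop()
        last = path[-1]
        # Factor attached to the current point: the sites visited before are excluded.
        weight = weight * green_diag(h, energy, set(path[:-1]), last)
        if last == y0:
            total += weight
            continue
        for nxt in range(h.shape[0]):
            if nxt not in path and h[last, nxt] != 0:
                stack.append((path + (nxt,), weight * (-h[last, nxt])))
    return total

# Hypothetical example: a 5-cycle 0-1-2-3-4-0 with one chord 0-2.
ADJACENCY = [[0, 1, 1, 0, 1],
             [1, 0, 1, 0, 0],
             [1, 1, 0, 1, 0],
             [0, 0, 1, 0, 1],
             [1, 0, 0, 1, 0]]
H = hamiltonian(ADJACENCY, [0.7, -1.2, 0.4, 1.9, -0.3], eps=0.5)
ENERGY = 0.3 + 0.2j
EXACT = np.linalg.inv(H - ENERGY * np.eye(5))
expanded = sap_expansion(H, ENERGY, 0, 3)
print(abs(expanded - EXACT[0, 3]))
```

With the assumed sign convention, every hopping factor equals ϵ, so a path γ carries the prefactor ϵ^{|γ|}; the sum is finite because a self-avoiding path in a finite graph has bounded length.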
The identity (2.9) implies the inequality (2.10), which plays the crucial role in our analysis.

2.5. From paths to ergodic theory. Given a self-avoiding path γ = (x_0, ..., x_n), introduce the sequence of random variables X_j(γ), j = 0, ..., n. Although the sequence (X_0, ..., X_n) is not IID (not even independent), it will not be difficult to adapt standard methods of the large deviations theory to the sums S_n(γ) = ∑_{j=0}^{n} X_j(γ). In terms of X_j(γ) and S_n(γ), the inequality (2.10) takes the form of a bound in terms of the quantities

ϵ^{|γ|} e^{S_n(γ;E;ω)}. (2.11)

Suppose that sup_{x,y} |Γ_n(x, y)| ≤ e^{C_BR n} (here "BR" stands for "bridges"). With the exponent m defined accordingly, and assuming m > 0, we obtain an expansion suitable for finite and infinite graphs with |Γ_n(x, y)| ≤ e^{C_BR n}; here we allow the value +∞ for positive series and use the usual convention that the sum over the empty set of indices is zero. This occurs, for example, when n < d(x_0, y_0); if the graph G is finite, then the length of any SAP γ is bounded by |G| < ∞, rendering finite the above sum over n.

Large Deviations Estimates. Some natural limitations.
Fix an integer n ≥ 0, two distinct points x_0, y_0 ∈ G, and set for brevity Γ_n = Γ_n(x_0; y_0). For any a > 0, the quantity of interest is bounded in terms of the probability that the sums ∑_{j=0}^{n} X_j exceed a threshold determined by a. Let us first discuss the problem at hand informally.
• If the ergodic theorem were applicable to the sums ∑_{j=0}^{n} X_j, then this probability would tend to 0 as n → ∞, for any a > max_j E[X_j].
• The cardinality of Γ_n may grow exponentially in n, even for d-dimensional lattices with d > 1. If the r.v. X_j had some finite exponential moment, this would not be a problem, at least in the strong disorder regime, for one could apply the standard LDE method based on Cramer's condition (finiteness of an exponential moment).
• It is not difficult to see (cf. the discussion in subsection 2.7) that Cramer's condition on X_j amounts to the Hölder continuity of the marginal PDF F_V of the IID random potential. However, our goal is to find a method which can also afford probabilistic bounds much weaker than exponential, at least under some additional hypotheses upon the combinatorial properties of the underlying graph G.
• The (unwanted) events in the family are highly correlated; at least, a large number of them may be. Yet, it is not quite obvious how to turn such a high correlation to our advantage.
Some of the above mentioned difficulties may be the price to pay for limiting the analysis to the simplest path expansions ("slim wormholes"), where at each step a subspace of dimension 1 is split off with the help of the Schur formula. Quite possibly, some more elaborate variants of this procedure (see Section 5) could bypass these difficulties.
Summarizing these observations, it seems difficult to assess directly the sums of the original variables X j , unless they obey Cramer's condition. The general experience accumulated in the LDE theory suggests a different approach: replacing X j by their truncated counterparts, with a judiciously chosen threshold b n < +∞. See the details in Appendix C.
In our case, we cannot restrict the truncation procedure to the sites occupied by one given path γ, since the random variables to be truncated depend upon the path through λ_j(γ). What makes things considerably worse is that the number of relevant paths grows exponentially already in the periodic lattices Z^d with d > 1. There is a particular situation where the above-mentioned difficulty does not occur: the combinatorial explosion concerns the number of bridges between two distant points x_0, y_0 with d(x_0, y_0) = n, and not the volume of the ball B_n(x_0). From this point of view, the trees (starting with Z^1) represent the simplest case. A moderate number of loops can also be tolerated, if they result only in a tempered growth of the number of n-bridges. The slower the number of n-bridges grows as n → ∞, the lower the regularity of the random potential that can be tolerated.
Again, we see that the model where the Aizenman-Molchanov method excels in the localization analysis, the regular Anderson model on a Cayley tree, appears, with no surprise, in both categories which we have discussed above: in the case where F_V is Lipschitz- or Hölder-continuous (hence Cramer's condition holds for X_j), even if the balls grow exponentially and loops are present, and in the category of trees.
Remark 2.1. There is no point trying to hide the fact that we make use of the condition of tempered growth of n-bridges only out of necessity. Nevertheless, progress in nanophysics of the "quantum graphs" might lead to the study of some micro- or mesoscopic structures with a moderate number of loops, where such a condition may be fulfilled. In relatively small (or moderately large) graphs, the distinction between an exponential and a subexponential bound is merely a matter of choosing appropriate constants.
2.7. Right vs. left tails of X_j. Observe that the conditional right-tail probabilities P{ X_j > t | F_{j−1} }, albeit they are themselves random (dependent upon the condition), admit a common a.s. upper bound in terms of the continuity modulus s_V of the IID r.v. V(·; ω), owing to the F_{j−1}-measurability of λ_j. While the Green's functions cannot have finite moments of order r ≥ 1 even for Lipschitz-continuous random potentials V, the new variables X_j may have finite moments of any fixed order, under a very weak assumption of log-Hölder continuity of the PDF F_V. The Hölder continuity of F_V results, obviously, in finiteness of some exponential moment of X_j. The role of the left tails of the probability distribution of X_j is quite different: in the context of the localization analysis, the heavy left tails are more than welcome.

Note that S_n is bounded from above by the sum S_n^{(+)} of the positive parts X_j^+. Clearly, this is not an efficient way to assess the large deviations, for one consciously loses the benefit of the lower (possibly negative!) value of the expectation E[X_j], replacing X_j by X_j^+. However, this simplifies a number of technicalities in the subsequent analysis. After this reduction, the behavior of the tail probabilities of the random potential V becomes irrelevant, and the finiteness (resp., divergence) of the moments of S_n^{(+)} is determined only by the continuity modulus s_V.
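The link between the continuity modulus and the right tails can be made explicit in a short computation. The following is a sketch under stated assumptions, not a reproduction of the paper's displayed formulas: it assumes that X_j = −ln |V(x_j; ω) − λ_j| with λ_j being F_{j−1}-measurable and V(x_j; ·) independent of F_{j−1}, and it takes s_V(δ) = sup_a (F_V(a + δ) − F_V(a)); the constants C, β and A are illustrative names.

```latex
% Assumption: X_j = -\ln|V(x_j;\omega) - \lambda_j|, \lambda_j is F_{j-1}-measurable,
% and s_V(\delta) := \sup_a \bigl( F_V(a+\delta) - F_V(a) \bigr).
\begin{align*}
\mathbb{P}\{ X_j > t \mid \mathcal{F}_{j-1} \}
  &= \mathbb{P}\{ |V(x_j;\omega) - \lambda_j| < e^{-t} \mid \mathcal{F}_{j-1} \}
   \le s_V\!\left( 2 e^{-t} \right), \\
\text{H\"older: } s_V(\delta) \le C \delta^{\beta}
  &\;\Longrightarrow\;
  \mathbb{P}\{ X_j > t \mid \mathcal{F}_{j-1} \} \le 2^{\beta} C\, e^{-\beta t}, \\
  &\;\Longrightarrow\;
  \mathbb{E}\bigl[ e^{s X_j} \mid \mathcal{F}_{j-1} \bigr]
   = \mathbb{E}\bigl[ |V(x_j;\omega) - \lambda_j|^{-s} \mid \mathcal{F}_{j-1} \bigr]
   < \infty \quad \text{for } 0 < s < \beta, \\
\text{log-H\"older: } s_V(\delta) \le \frac{C}{|\ln \delta|^{A}}
  &\;\Longrightarrow\;
  \mathbb{P}\{ X_j > t \mid \mathcal{F}_{j-1} \} \le \frac{C}{(t - \ln 2)^{A}},
  \qquad t > \ln 2 .
\end{align*}
```

Under these assumptions, the exponential moment of X_j is literally a fractional moment of order s of the corresponding Green's function, which is one way to read the statement that Cramer's condition replaces the fractional moments; in the log-Hölder case only power moments of order below A survive.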

Slim wormholes and positivity of Lyapunov exponents
Formally speaking, the Lyapunov exponents for the eigenfunctions of random operators can be defined in the context of one-dimensional, or quasi-one-dimensional, media, such as Z^1 or strips of finite width. Nevertheless, one often considers the decay exponent of the Green's functions on lattices Z^d and more general graphs, as a direct analog of the Lyapunov exponent(s). In this section, we will employ the slim wormhole expansions to obtain some simple, single-point criteria of positivity of such exponents.
We investigate some specific classes of models; the most important parameters distinguishing these classes are: • the combinatorial characteristics of the underlying graph G; • the local regularity properties of the marginal distribution of the IID random potential V : G × Ω → R.
3.1. Graphs with exponential growth of balls and bridges. We call a bridge between two distinct points x, y ∈ G any SAP γ connecting these points. (Self-crossing bridges are indeed rare in real life.) Recall that the set of bridges of length n (n-bridges) between x and y was denoted by Γ_n(x, y). The case where |Γ_n(x, y)| may grow exponentially in n is the most challenging for our method; it has de facto been addressed by Aizenman and Molchanov [2] and, with the help of a different version of the fractional-moment analysis, by Aizenman et al. [5]. Actually, for the methods of [2,5] only the fact of exponentially bounded growth of balls is relevant (and represents, of course, the main difficulty that the MSA cannot overcome). We have to further distinguish between subclasses of this exponential class of graphs, depending on the asymptotic behavior of their number of n-bridges. This sets apart the trees, where specific recursive methods apply, as they do in the one-dimensional case, but also the "moderately looped" graphs, where one does not have the benefit of simple algebraic recursions, while the exponential explosion remains a combinatorial challenge.
Here we are forced to apply the classical assumption of the FMM: the Hölder regularity of the random potential. Note that our approach clarifies the true role of the Hölder regularity in the moment analysis. What might seem to be a lucky accident in the elegant Aizenman-Molchanov decoupling lemma, is actually interpreted as Cramer's condition in the LDE component of our analysis. In the LDE theory, the special role of Cramer's condition is well-understood: it is basically the only tool allowing to obtain exponential large deviations estimates.
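The interplay between Cramer's condition and the exponential number of bridges can be illustrated on a toy case. The sketch below is built on assumptions that are not taken from the paper: the potential is uniform on [0, 1] (so F_V is Lipschitz), the summands are X_j = −ln |V(x_j) − λ_j|, and their conditional exponential moments are bounded by the worst case over λ, which for this distribution equals 2^s / (1 − s) for 0 < s < 1. The Chernoff exponent I(a) = sup_s [s a − ln M(s)] then controls the probability that the sum along a fixed path exceeds the level a per step, and a union bound over at most e^{C_BR n} bridges is effective when I(a) > C_BR.

```python
import math

def log_mgf_bound(s):
    """ln of sup over lambda of E|V - lambda|^(-s), V uniform on [0, 1], 0 < s < 1."""
    return s * math.log(2.0) - math.log(1.0 - s)

def chernoff_rate(a, grid=10_000):
    """Numerical sup over s in (0, 1) of s*a - log_mgf_bound(s)."""
    best = 0.0
    for k in range(1, grid):
        s = k / grid
        best = max(best, s * a - log_mgf_bound(s))
    return best

def closed_form_rate(a):
    """Same supremum in closed form: u - 1 - ln u with u = a - ln 2, if u > 1."""
    u = a - math.log(2.0)
    return u - 1.0 - math.log(u) if u > 1.0 else 0.0

def union_bound_works(a, c_bridges):
    """True if the per-step Chernoff exponent beats the growth rate of the bridges."""
    return chernoff_rate(a) > c_bridges

rate_numeric = chernoff_rate(3.0)
rate_exact = closed_form_rate(3.0)
print(rate_numeric, rate_exact, union_bound_works(6.0, math.log(3.0)))
```

In this toy setting a larger threshold a buys a larger exponent, but it also has to be compensated by a smaller hopping amplitude ϵ, which is the usual strong disorder trade-off.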
Remark 3.1. While it does not follow from our analysis, it seems to be a reasonable conjecture that, under weaker regularity assumptions than Hölder-continuity of the marginal PDF F V , the sub-exponential estimates provided by the MSA (or any reasonable improvements thereof) cannot be made exponential. I would be glad to be proven wrong on this point, if some more advanced variant of the MSA could establish exponential decay of eigenfunction correlators under weak regularity assumptions on V (x; ·).
As to the FMM itself, it simply does not apply to the probability distributions not satisfying the Hölder-continuity condition.
In the next subsection, where the number of bridges will be required to grow sub-exponentially (or not at all, as on the tree graphs), we will show that the AM/FMM assumption of the Hölder continuity can be relaxed, depending upon the growth asymptotic of the number of bridges.
We stress that in this section we are concerned only with the proof of decay properties of the Green's functions. The derivation of the almost sure p.p. spectrum will be the subject of Section 4.
The main statement of this subsection is the following Theorem 3.1. The result in itself is not new, since the model in question has been studied by Aizenman and Molchanov [2] and in subsequent works on the FMM. However, we establish here an important link between the localization analysis and the classical large deviations theory.

The required decay bound, with some c ∈ R and m > c + C_BR, would follow if we could prove the corresponding large deviations bound for all γ ∈ Γ_n and the given value of c.

Step 3. Large deviations along a fixed path. We postpone the large deviations analysis until Appendix B; here we only state the final result (3.7), which is a reformulation of Proposition B.1 proven in Appendix B.

Conclusion.
The claim follows from the results of Steps 1-3.
Remark 3.2. It is worth mentioning that the exponential decay of the Green's functions established by Theorem 3.1 does not reveal yet another lower bound on the "mass" m > 0 required for the spectral localization. We prove that the maximum value of the Green's functions is exponentially small with high probability, but localization requires that it remain small after multiplication by the surface of the L-sphere.

3.2. Tempered growth of bridges. Now we will allow for a lower regularity of the probability distribution of the random potential, losing the benefit of Cramer's condition in the large deviations analysis. The price to pay will be a stronger restriction on the growth rate of the number of n-bridges.
Perhaps it would be appropriate to cite here the opening lines of the well-known paper by Dobrushin and Shlosman [14] (also devoted to the finite-volume criteria, in statistical mechanics), who in turn cite the opening lines of Lev N. Tolstoy's novel "Anna Karenina": "All happy families are alike, each unhappy family is unhappy in its own fashion". Put simply, the universal happiness ends where Cramer's condition ends (and the application of the FMM becomes impossible). To render things less dramatic, we abandon the very idea of presenting a reasonably complete set of results of the LDE theory and their applications to the path expansions in localization theory. Instead, in this subsection we make the weakest possible regularity assumption under which the MSA can still work in discrete, non-one-dimensional models.
The main result of this subsection is Theorem 3.3. Observe that the guaranteed rate of decay of the Green's functions remains exponential, but without Hölder continuity of F_V, we have to be content with sub-exponential probabilistic bounds. In Theorem 3.3, we consider a particular case, where the balls may grow polynomially and the PDF F_V is log-Hölder continuous (of sufficiently high order). While the combinatorial part of the adaptation to graphs with a rate of growth of balls intermediate between power-law and exponential is straightforward, it is more difficult (actually, impractical) to fit the large deviations estimates in one universal formula interpolating between these two extreme cases. This has been well-known in probability theory since the early results by Linnik [25], who introduced several classes of probability laws, each requiring a specific adaptation of his general method. We refer to the review by Nagaev [27] and the bibliography given there; there are many more reviews and books available by now on this subject.

Theorem 3.3 does not establish Anderson localization; we only prove exponential decay of the Green's functions. The novelty of this result, apart from the use of the large deviations theory, is that it applies, in particular, to a class of Anderson models on graphs of exponential growth which are "moderately looped"; the latter property is expressed in terms of the number |Γ_n(x, y)| of bridges of length n connecting two given points. For example, in a tree, |Γ_n(x, y)| = 1, while the balls may grow exponentially, e.g., in the Cayley tree G_K with the constant coordination number n_{G_K} ≡ K + 1, K ≥ 2. This renders inapplicable the existing MSA techniques; on the other hand, the FMM does not apply either, if the common marginal distribution of the IID random potential is only log-Hölder (but not Hölder) continuous, so that the fractional moments diverge.
Theorem 3.3. Assume that the number of n-bridges in the graph Z admits a power-law bound:

(3.11)

and the continuity modulus of the PDF F V of an IID random field V : Z × Ω → R on Z, relative to some probability space (Ω, B, P), satisfies the following upper bound: for some A > d + 1 and C H < ∞,

(3.12)

Then, with notations of Section 2 and subsection 3.1, one has, for some a ∈ R, δ > 0 and all L ≥ L 0 (δ),

(3.13)

Proof. We will see (cf. (3.16)) that the implicit requirement L ≥ L 0 can be replaced by a more explicit, but also more cumbersome probability bound valid for any L ≥ 1.
It follows from (3.12) that, for any t > 0, and uniformly in λ ∈ R, Denoting X + (ω) = X + (ω; x) = ln + |V (x; ω)−λ| ≥ 0, we have for any α ∈ (0, A), uniformly in x ∈ Z and λ ∈ R (cf., e.g., [16], Section V.6, Eqn (6.3)) Now we set α = A − δ > d + 1 + δ, so that A − α > 0 and the above expectation converges. Next, fix any path γ ∈ Γ n and set and S n (γ) = ∑ j X j (γ). Further let This makes unnecessary any restrictions on the decay rate of the left-tail probabilities P { X j < −t }, i.e., the tail probabilities P { |V (x; ω)| > s }, as s → +∞. From the hypothesis (3.11), we infer, therefore, that (3.14) Recall that we associated with each SAP γ the sequence of decreasing sigma-algebras F i so that X i (hence, (3.15) Now we need the following large deviation estimate proven in Appendix C (formulated there in a more general fashion, making no reference to graphs and paths).
Replacing X j by X + j is the simplest, but not the most efficient way to bound P { S n > an }, since this gives rise to a larger value of the expectation µ than for the original variables X j . It is worth mentioning also that the requirement of finiteness of the expectation is actually a very mild condition on the decay of the tail probabilities for the random potential V , since for any t > 0,

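The role of the restriction α < A in the moment bound can be illustrated numerically. The sketch below assumes a power-law tail of the form P{ Y > t } ≤ C t^(−A) for t ≥ 1 (our reading of the consequence of log-Hölder continuity, with the constant C and the function names being hypothetical) and uses the standard identity E[Y^α] = ∫ α t^(α−1) P{ Y > t } dt for Y ≥ 0. Splitting the integral at t = 1 gives the explicit bound 1 + Cα/(A − α), which is finite exactly when α < A.

```python
def moment_bound(alpha, A, C=1.0):
    """Upper bound 1 + C*alpha/(A - alpha) on E[Y**alpha] for Y >= 0,
    assuming P{Y > t} <= C * t**(-A) for t >= 1 and the trivial bound 1 for t < 1."""
    if not 0 < alpha < A:
        raise ValueError("need 0 < alpha < A for the tail integral to converge")
    return 1.0 + C * alpha / (A - alpha)


def tail_integral_moment(alpha, tail, upper, steps):
    """Midpoint-rule value of the integral of alpha * t**(alpha - 1) * tail(t)
    over [0, upper]."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += alpha * t ** (alpha - 1) * tail(t)
    return total * h


A, alpha = 5.0, 2.0


def pareto_tail(t):
    """Model tail P{Y > t} = min(1, t**(-A)), i.e. the extreme case with C = 1."""
    return min(1.0, t ** (-A))


numeric = tail_integral_moment(alpha, pareto_tail, upper=100.0, steps=200_000)
bound = moment_bound(alpha, A)
```

For this extremal model tail the numerical integral essentially saturates the bound, so nothing is lost in the splitting argument; as α approaches A the bound blows up, in agreement with the requirement A − α > 0.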
From wormhole expansions to expectations
Most of the facts discussed in this section are well-known; we state them in order to render our presentation reasonably self-contained. In essence, the following meta-theorem holds true:

Fast decay of Green's functions implies strong dynamical and spectral localization.
Certainly, the above statement is too vague and too general to be true. However, presenting a large variety of particular cases where its formal, precise instance holds true is beyond the scope of this work, which focuses, as was pointed out in the Introduction, on the positivity of Lyapunov exponents. To find a compromise between brevity and logical completeness of the text, we describe below the principal steps leading to the proof of the above statement in specific models. The necessary additional assumptions are introduced along the way, on an as-needed basis.

Eigenfunction correlators. Step 1. From fixed energy to an energy interval.
We start with the following statement, essentially going back to [15] and reformulated in [11,10]. It is the core of one of the forms of the spectral reduction, inferring probabilistic localization bounds in an entire energy interval I ⊂ R from the pointwise estimates valid for each E ∈ I. Note that the parameter a > 0 below does not have the same meaning as in Section 3. and sup There is an event has Lebesgue measure mes(E x (2a; ω)) ≤ b and is covered by a union of intervals Moreover, since the function M x is continuous outside its poles, the intersection I ′ := I ∩ E x (2a; ω) actually is a finite union of intervals, provided that p > 5d.

Example 2.
In the case where |B L | ≤ e^{αL}, α ≤ µ/2, for some µ > 0, inequality (4.1) is satisfied with (4.5) If we assume in addition that (4.6) Of course, the obtained bound is efficient when µ > 8α/δ. With the Hölder exponent δ fixed, this means that either µ is to be sufficiently large (given the rate of growth of balls α < ∞), or the rate of growth of balls α is small enough (with given Lyapunov exponent µ > 0). With α = 0 (sub-exponential graphs), any µ > 0 suffices, for L large enough.
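The trade-off between the Lyapunov exponent µ, the Hölder exponent δ and the growth rate α expressed by the condition µ > 8α/δ is plain arithmetic; the following short sketch spells it out. The function names are hypothetical and the numerical values are chosen for illustration only.

```python
def min_lyapunov_exponent(delta, alpha):
    """Threshold 8*alpha/delta which mu has to exceed for given delta and alpha."""
    return 8.0 * alpha / delta


def max_growth_rate(delta, mu):
    """Largest growth rate alpha compatible with mu > 8*alpha/delta (not attained)."""
    return delta * mu / 8.0


def bound_is_efficient(mu, delta, alpha):
    """Check the condition mu > 8 * alpha / delta."""
    return mu > min_lyapunov_exponent(delta, alpha)


# Illustrative values: Hölder exponent 1/2 and Lyapunov exponent 1.
slow_growth = bound_is_efficient(mu=1.0, delta=0.5, alpha=0.05)   # threshold 0.8
fast_growth = bound_is_efficient(mu=1.0, delta=0.5, alpha=0.10)   # threshold 1.6
sub_exponential = bound_is_efficient(mu=1e-6, delta=0.5, alpha=0.0)
```

With δ = 1/2, the admissible growth rate is α < µ/16, so a low regularity of the potential has to be compensated either by a large Lyapunov exponent or by a slow growth of balls.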
Step 2. Using the Bessel inequality (cf. [21]). The following statement is an adaptation of the Germinet-Klein argument [21] to finite subgraphs of a countable graph. Its assertion is logically independent of Theorem 4.1, so we are free to allow here any interval I ⊆ R. However, to avoid confusing differences between the assumptions of various statements given in this section, we prefer to address here only the eigenfunction correlators in an arbitrarily large, but bounded (and fixed) interval I. Naturally, this setting suffices for the localization analysis of a.s.-uniformly bounded random operators, including discrete Schrödinger operators with a.s.-uniformly bounded random potential.
One possible extension to the case where both the random potential and the interval I are allowed to be unbounded, is discussed in [10], where the tail probabilities are required to satisfy the power-law bound The main idea is that the maximum value |V (x, ω)| for Proof. It suffices to combine (4.7) and Lemma 4.
and that the marginal PDF F V of an IID random field V : Z × Ω → R is log-Hölder continuous: Then for any B > 0, with A > 0 large enough and ϵ small enough, the eigenfunction correlators for the operator H = −ϵ∆+V (ω) admit a power-law decay bound: Proof. The claim follows from the estimate (3.13), where the key probabilities decay at least as O(L −A+d+2 ), and Naturally, the regularity interpolating between the log-Hölder continuity and Hölder continuity results (e.g., in the graphs with a power-law bound on the volume of balls) in the decay bound on eigenfunction correlators intermediate between the power-law and exponential one. For example, assuming that the continuity modulus satisfies one can prove strong dynamical localization with rate hence, strong dynamical localization of all orders b > 0: On the other hand, one can also trade higher regularity of the random potential for higher rate of growth of balls. It is clear from Eqn (4.7) that, in the graphs with exponential growth of balls, one needs exponential bounds on the probabilities of large deviations to overcome the exponentially large factor S(L), so our technique, just as the FMM, requires Hölder continuity of the marginal distribution for the proof of dynamical localization in this class of models.
Step 3. From finite subgraphs to the entire configuration space.
Only finite (but large) configuration spaces are relevant for the application to the physical models of localization/quantum transport (except, perhaps, for general cosmology ...). The actually infinite models may present only a mathematical challenge. Fortunately enough, the transition from large (but finite) to infinite models is quite easy, once some uniform bounds on the eigenfunction correlators are established in arbitrarily large finite configuration spaces.
Fix a graph Z and an interval I ⊆ R. Assume that for any pair x, y ∈ Z and for all sufficiently large subgraphs G ⊂ Z containing x and y, for some function f : N → R and any f ∈ B 1 (I) we have , y)). (4.8) The derivation of the analog of (4.8) for G = Z is well-known by now; see [5,4]; it is based on the Fatou lemma, applied to the (real-valued) spectral measures ν (x,y) G , associated with the eigenfunction correlators and defined for compactly supported continuous (or bounded here, it is assumed that x, y ∈ G. The operators H G converge in the strong resolvent sense as G ↑ Z (cf. [24]), so the measures ν (x,y) G converge vaguely to the respective spectral measure ν (x,y) ( = ν (x,y) Z ) for the operator H = H Z . By the Fatou lemma combined with standard approximation arguments, one can conclude that , y)).
See the details in [5,4]. A more direct approach to strong dynamical localization in an infinite configuration space is also known. In fact, as was pointed out, Theorem 4.1 is itself an adaptation of the result by Germinet-Klein [21] which applies directly to the infinite systems. This requires an a priori upper bound on the growth of generalized spectral projections for H on Z. See details in [21]. Recall that the techniques by Germinet-De Bièvre [20] and Damanik-Stollmann [12] also work in the infinite configuration space (and also require some a priori functional-analytic bounds).

4.2. Towards the spectral localization.

By the well-known RAGE (Ruelle-Amrein-Georgescu-Enss) theorems, a sufficiently rapid decay rate of the eigenfunction correlators implies a.s. point spectrum (spectral localization). However, an efficient, quantitative estimate on the decay rate of the eigenfunctions requires additional arguments, first of all a priori bounds of Shnol-Simon type on the growth rate of the generalized eigenfunctions. The usual procedure goes back to the seminal work by Fröhlich et al. [18] and, in a reformulated form, to the paper by von Dreifus and Klein [13]. Lemma 4.1 suffices for assessing the eigenfunction correlators (which is indeed the main axis of the present paper). However, it has to be emphasized that the above approach does not provide the optimal bound on the eigenfunction decay. One of the alternatives to Lemma 4.1, based on a different analytic idea, was proposed in our work [10], but we are not going to discuss this issue further.

5. Other types of expansions

5.1. More general stopping rules.

It is clear that, in the recursive procedure described in Sections 2-3, one can set up more general rules to decide which paths γ starting at x 0 ∈ G are to be expanded at a given length n ≥ 0. Naturally, these rules have to be consistent with the recursion: given a path γ, either all or none of its descendants are to be retained in a more elaborate expansion, dictated by the stopping rule. (The latter term is inspired by a similar notion of stopping rule, or Markov moment, used in the theory of Markov processes and martingales.) We call a stopping rule any function τ : Γ G → N such that for any γ ∈ Γ G , (i) either τ (γ ′ ) = |γ| for all descendants γ ′ (⊃ γ) of γ, (ii) or τ (γ ′ ) > |γ| for all descendants γ ′ ⊃ γ.
In the case (i), the path γ is stopped at the moment τ (γ) = |γ|, and all its descendants are suppressed by the stopping rule τ . Respectively, in the case (ii) the stopping rule allows all descendants γ ′ ⊃ γ with |γ ′ | = |γ| + 1, i.e., at least one more step is allowed by τ for the descendants of γ, in all possible directions.
A simple example of a stopping rule is given by the function τ : γ → |γ| ∧ n ≡ min(|γ|, n), with n ≥ 0. This stopping rule had already appeared implicitly in the previous subsection: it stops all paths after exactly n steps, regardless of the shape of the path. Now we can generalize the wormhole expansions: given any stopping rule τ , we have the identity for all E ∈ C + ∪ R(G; V ) (i.e., for complex or allowed real energies) .
The summation in the RHS is formally carried over all self-avoiding paths, even infinite if and when G is infinite, but only the finite subpath In the present paper, we do not develop an elaborate diagrammatic technique, so introducing general stopping rules may appear superfluous and not quite natural. However, it seems equally unnatural to limit the wormhole expansions to those corresponding to stopping rules of the form τ (γ) = f (|γ|). We merely point out that rigorous -and quite simple -identities can be easily obtained, in a large variety of forms, for the Green's functions of Anderson-type Hamiltonians on arbitrary (locally finite) countable graphs. The choice of a particular stopping rule may be dictated by a specific model.
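The dichotomy (i)/(ii) in the definition of a stopping rule can be checked mechanically on a small graph. The sketch below does this for the fixed-length rule τ(γ) = min(|γ|, n) on a cycle with six vertices. It is an illustration under our own conventions: paths are stored as tuples of vertices, |γ| is the number of edges, a descendant of γ is any strictly longer self-avoiding path having γ as its initial segment, and only paths with |γ| ≤ n are classified (longer paths are already suppressed by the rule). All function names are hypothetical.

```python
def self_avoiding_paths(adj, start, max_len):
    """All self-avoiding paths from `start` with at most max_len edges."""
    paths = [(start,)]
    frontier = [(start,)]
    for _ in range(max_len):
        frontier = [p + (w,) for p in frontier for w in adj[p[-1]] if w not in p]
        paths.extend(frontier)
    return paths


def length(path):
    """Number of edges of a path given as a tuple of vertices."""
    return len(path) - 1


def fixed_length_rule(n):
    """The stopping rule tau(gamma) = min(|gamma|, n)."""
    return lambda path: min(length(path), n)


def classify(path, all_paths, tau):
    """Return 'stopped' (case (i)), 'continued' (case (ii)) or 'inconsistent'."""
    descendants = [q for q in all_paths
                   if len(q) > len(path) and q[:len(path)] == path]
    if all(tau(q) == length(path) for q in descendants):
        return "stopped"
    if all(tau(q) > length(path) for q in descendants):
        return "continued"
    return "inconsistent"


cycle = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
n = 3
paths = self_avoiding_paths(cycle, 0, max_len=5)
tau = fixed_length_rule(n)
result = {p: classify(p, paths, tau) for p in paths if length(p) <= n}
```

Every path of length exactly n is reported as stopped and every shorter path as continued, so no path is found for which some descendants are retained and others are not. Note that a path without any descendants would be classified as stopped vacuously; the cycle is chosen so that this does not occur for |γ| ≤ n.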

Fat wormholes.
One obvious generalization of the slim wormhole expansion is obtained by splitting off at each step a subspace in ℓ 2 (G) of dimension larger than 1, i.e., taking the decomposition of the form G = G A ⊔ G D with |G A | > 1. For example, in the procedure presented in Sections 2-3, one can replace are to be replaced by the respective local resolvents, making the estimates less explicit. However, this can be used in the framework of numerical analysis and provide a better description of the zone in the parameter space where localization occurs.
The "fat" version of the wormhole expansions was not discussed in [23], where the starting point is the perturbative, formal random walk expansion which is to be regularized with the help of the loop elimination. The Schur complement formula seems to be the easiest way to obtain such an expansion.
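As a sanity check of the mechanism behind these expansions, the standard Schur complement identity can be verified in exact arithmetic on a tiny example. The sketch below assumes a path graph with a diagonal potential and hopping amplitude −ϵ between neighbours (a convention chosen for the illustration, not taken from the text), splits off the single site A = {0}, and compares the Green's function G(0, 0; E) obtained by direct inversion with 1/(V(0) − E − K(E)), where K(E) = H AD (H D − E)^(−1) H DA plays the role of the self-energy. All function names are hypothetical.

```python
from fractions import Fraction


def inverse(m):
    """Invert a square matrix of Fractions by Gauss-Jordan elimination."""
    n = len(m)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def hamiltonian(potential, eps):
    """Path graph 0-1-...-(n-1): diagonal potential, hopping -eps (assumed convention)."""
    n = len(potential)
    h = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        h[i][i] = Fraction(potential[i])
        if i + 1 < n:
            h[i][i + 1] = h[i + 1][i] = -Fraction(eps)
    return h


def green_direct(h, energy):
    """Full resolvent (H - E)^(-1), computed by direct inversion."""
    n = len(h)
    shifted = [[h[i][j] - (energy if i == j else 0) for j in range(n)]
               for i in range(n)]
    return inverse(shifted)


def green_schur_00(h, energy):
    """G(0,0;E) via the Schur complement with A = {0} and D = {1, ..., n-1}."""
    n = len(h)
    h_d = [row[1:] for row in h[1:]]
    g_d = green_direct(h_d, energy)
    # Self-energy K(E) = H_AD (H_D - E)^(-1) H_DA
    k = sum(h[0][i + 1] * g_d[i][j] * h[j + 1][0]
            for i in range(n - 1) for j in range(n - 1))
    return 1 / (h[0][0] - energy - k)


H = hamiltonian([3, -1, 2, 5], Fraction(1, 2))
E = Fraction(1, 3)
direct = green_direct(H, E)[0][0]
via_schur = green_schur_00(H, E)
```

The two values agree exactly, since rational arithmetic is used throughout. Replacing the single site by a block G A with |G A | > 1 turns the scalar 1/(V(0) − E − K(E)) into the inverse of a matrix of size |G A |, which is the sense in which the "fat" expansions are less explicit.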

The Schur sandwiches.
Consider now the case where G is partitioned into three subgraphs, such that the complement of G A1 is itself disjoint: In other words, the subgraph G A is sandwiched between the components G D1 and G D2 which it separates completely. Applying the Schur decomposition formula (2.1) , we obtain for any x ∈ G D1 and y ∈ G D ′ 2 : and . This procedure can be iterated. For example, after two steps, using the decomposition ] (x, y), (5.4) and so on.
The "sandwiched" decomposition (5.2) is similar in spirit to the one introduced by Aizenman et al. [5], but, owing to the use of the Schur formula instead of the second resolvent equation, the middle component G GA 1 is not the full resolvent G G as it was in [5]. In (5.2), it is a modified resolvent of a small subsystem G A1 . It is, of course, rather implicit, but for the purposes of the resonance analysis (proximity of the spectrum of gV GA 1 − E − K(E) to E), it is basically as good as the explicit multiplication operator gV GA 1 − E, conditional on the random potential outside G A1 . Indeed, conditional on F GD 1 ∪G D ′ 2 , which renders K(E; ω) nonrandom, the distribution of the eigenvalues of M A1 is at least as regular as for K(E) = 0, for only the regularity of the diagonal elements of M counts, with K(E) fixed.
Note also that in (5.4), the resolvents G D1 , G D2 and G D ′ 3 are independent and explicit, particularly when |G D1 | = |G D2 | = 1 (G D ′ 3 is to be decomposed further), while G A1 and G A2 become conditionally independent (albeit not necessarily identically distributed), given the sigma- . We plan to address in a forthcoming paper the case where |G Dj | = 1, j = 1, 2, ..., n ("slim sandwiched expansion").

Conclusion
The two most popular methods of the mathematical theory of Anderson localization, the Multi-Scale Analysis and the Fractional Moment Method, have a large common domain of application, but also exclusive areas where each of them is the indisputable champion.
It is a legitimate question, what are the natural boundaries for the FMM techniques (or rather general philosophy), in terms of the regularity of the underlying probability distribution. In the present paper, we give only some partial answers in this direction, but hope that a more systematic use of the methods of ergodic theory can be beneficial for extending the general FMM's strategy to a larger class of models where the fractional moments per se do not exist.
It is also intriguing to find out if the MSA's strategy can be improved (see footnote 7) so as to apply to the exponential graphs, or to achieve exponential bounds on the eigenfunction correlators. In the author's opinion, which may of course be incorrect, in the case where the random potential is not Hölder continuous, there may be some natural limits to the decay rate of eigenfunction correlators, independent of the techniques employed, due to the limitations following from the accurate large deviations analysis of rare, but inevitable resonances.
It seems likely that more general results could be obtained by the synthesis of the "mono-scale" (FMM-type) and the multi-scale approaches.

Appendix A. Progressive measurability issues
Given an SAP γ = (x 0 , . . . , x n ), n = |γ| ≥ 1, we have a decreasing sequence of subgraphs (cf. (2.7)) where, for any given subset X ⊂ G, G ⊖ X is defined as the connected subgraph of G \ X containing x n . Clearly, Y 0 ⊃ Y 1 ⊃ · · · ⊃ Y n . Further, we have a decreasing sequence of sigma-algebras F i = σ [V (x; ·), x ∈ Y i ] and the random variables where the "self-energy" E(x i ; G ⊖ γ i ; E) is defined as in (2.6). We will never use any property of λ j other than "regressive" measurability with respect to the filtration F • : (Usually, one deals in probability theory with "progressive" measurability with respect to a growing family -"filtration" -of sigma-algebras.) At the same time, V (x i ; ω) are F i -measurable. We will say that (V (x i ; ·)) i≥0 are adapted to (F i ) i≥0 . Clearly, the random variables The above notations and facts will be used below without further notice.

Appendix B. Large deviations under Cramer's condition
In Proposition B.1, we adapt a classical argument, historically applied first to independent random variables, to the sequence {X j } adapted to the sigma-algebras {F i }. There is a vast literature devoted to the large deviations for the martingales, but in our situation the variables at hand do not necessarily form a martingale.

7 I thank Abel Klein for fruitful exchanges on this subject.
It is certainly possible to introduce a martingale, using the so-called compensator process, but a straightforward adaptation of Cramer's method is short enough and more transparent. The final result, of course, is not new, although it would be difficult to make a direct, formal reference to a published paper or monograph, without any adaptation.
In Section 5, we use Proposition B.1 in the particular case where X 1 , . . . , X n are independent (this is obviously allowed by the hypotheses).
Then for any a > 0 there exists I(a) > 0 such that

Proof. Fix any t 1 ∈ (0, t 0 ) and consider on [0, The conditional probability distribution of X i , given F i+1 , is a (random) probability measure on R, depending upon the condition; for brevity, we denote it by P i . The respective conditional expectation is denoted by Clearly, for each i, φ i (0) = 0, and its derivative Set R i = X i +. . .+X n , 1 ≤ i ≤ n, so that S i−1 +R i = S n .
(We always use the standard convention that the sum over the empty set of indices is zero.) Now assess the Laplace transform of S n (ω) as follows: ] ] (recall: and by Chebyshev's inequality, we conclude that } ≤ e −(n+1)I(a) , I(a) > 0.
The most important point in the above assertion is that the decay exponent I(a) of the probability of large (linear in n) deviations is strictly positive for an arbitrarily small excess a > 0 of the expectation. As to the large values of a > 0, taking t * = t * (a 0 ) for some a 0 > 0, as in (B.2), we see that, owing to the strict positivity of t * , one has I(a 0 + a) := sup_{t ∈ [0, t 1 ]} ((a 0 + a)t − φ(t)) ≥ (a 0 + a)t * − φ(t * ) = I(a 0 ) + at * → +∞ as a → +∞.
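The two features of I(a) discussed above, strict positivity for small a > 0 and the validity of the exponential bound, can be observed in the simplest independent case. The sketch below assumes X j = ±1 with probability 1/2 each (so the expectation is zero and φ(t) = ln cosh t), approximates I(a) = sup over t ∈ [0, t 1 ] of (at − φ(t)) on a grid, and compares the resulting bound e^(−nI(a)) with the exact binomial tail. The example and the function names are ours; the adapted, non-independent setting of Proposition B.1 is not simulated here.

```python
import math


def rate_function(a, log_mgf, t_max, grid=10_000):
    """Grid approximation of I(a) = sup over t in [0, t_max] of (a*t - log_mgf(t))."""
    return max(a * t - log_mgf(t)
               for t in (t_max * k / grid for k in range(grid + 1)))


def chernoff_bound(n, a, log_mgf, t_max):
    """The bound exp(-n * I(a)) on P{S_n >= a*n}."""
    return math.exp(-n * rate_function(a, log_mgf, t_max))


def sign_sum_tail(n, s):
    """Exact P{S_n >= s} for a sum S_n of n independent signs +-1 (integer s)."""
    k_min = (s + n + 1) // 2          # S_n = 2k - n >= s  iff  k >= (s + n) / 2
    return sum(math.comb(n, k) for k in range(k_min, n + 1)) / 2 ** n


def log_cosh(t):
    """Logarithm of the Laplace transform of a symmetric sign variable."""
    return math.log(math.cosh(t))


a, n = 0.1, 100
rate = rate_function(a, log_cosh, t_max=1.0)
exact_rate = 0.5 * ((1 + a) * math.log(1 + a) + (1 - a) * math.log(1 - a))
tail = sign_sum_tail(n, 10)            # P{S_100 >= 0.1 * 100}
bound = chernoff_bound(n, a, log_cosh, t_max=1.0)
```

Since the grid supremum never exceeds the true one, the resulting bound is slightly weaker than the optimal one but remains valid. The rate is small for a = 0.1 (of order a²/2) but strictly positive, which is all that is used in the main text.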

Appendix C. Large deviations for heavy-tailed sums
Here we use essentially, and adapt where necessary, the technique from [27], where the sums of independent random variables were considered. Again, it is to be emphasized that the result is not genuinely new and may constitute, at best, an exercise in a standard course of probability theory, covering the large deviations estimates.
Proposition C.1. Suppose that the random variables X i are adapted to a decreasing family {F i } and have uniformly bounded absolute moments of order α ≥ 2: Then for any a > µ and some c > 0, the following estimates hold true: (1) ) ) , n → ∞. (C.1)

Proof. As usual in the estimation of heavy-tailed r.v., introduce the truncated r.v.
It is readily seen that for any x ∈ R (C.2) Since all X i are bounded, S n is also bounded and has finite exponential moments of any order, but we will have to choose judiciously a value h n > 0, taking into account the cut-off threshold b, and then use the Chebyshev inequality ] .
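The interplay between the cut-off threshold b and the exponent h can be seen in a toy computation. The sketch below is not the argument of the proof: it assumes independent Pareto variables with P{ X > t } = t^(−A), t ≥ 1, a truncation of the form X̂ = X · 1{X ≤ b} (the precise form used in the text is not reproduced here), and hand-picked values of n, a, b and h. It combines the union bound for the event that some X i exceeds b with the exponential Chebyshev inequality for the truncated sum.

```python
import math


def truncated_mgf(h, b, A, steps=20_000):
    """E[exp(h * X_hat)] for X_hat = X * 1{X <= b}, where P{X > t} = t**(-A), t >= 1.

    The integral over [1, b] against the density A * t**(-A - 1) is computed
    by the midpoint rule; the event {X > b} contributes exp(0) * P{X > b}.
    """
    width = (b - 1.0) / steps
    integral = 0.0
    for k in range(steps):
        t = 1.0 + (k + 0.5) * width
        integral += math.exp(h * t) * A * t ** (-A - 1)
    return integral * width + b ** (-A)


def truncation_bound(n, a, b, h, A):
    """P{S_n > a*n} <= n * P{X > b} + exp(-h*a*n) * (E exp(h * X_hat))**n."""
    exceed = n * b ** (-A)
    chernoff = math.exp(-h * a * n) * truncated_mgf(h, b, A) ** n
    return exceed + chernoff


# Illustrative parameters: the mean is A / (A - 1) = 1.25 and the level is a = 2.
n, a, b, h, A = 200, 2.0, 20.0, 0.5, 5.0
mgf = truncated_mgf(h, b, A)
bound = truncation_bound(n, a, b, h, A)
```

With these values the exponential term is negligible and the bound is dominated by n b^(−A), i.e., by the probability that a single summand is exceptionally large. This is the typical heavy-tail picture, and it explains why only power-law (rather than exponential) probability bounds can be expected without Cramer's condition.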