
Some aspects of the probabilistic work

2007, Springer eBooks


Introduction

Anyone reading about the mathematical works of Kolmogorov naturally expects to find broad considerations about the axiomatization of probability, which Kolmogorov developed at the beginning of the 1930s and which forms the contents of his famous publication Grundbegriffe der Wahrscheinlichkeitsrechnung (Foundations of the Theory of Probability) [Kol33], published by Springer in 1933. Among all the works of the Soviet mathematician, this small opuscule of about sixty pages is certainly the most famous, and it is often the only one to which his name is attached, not only for a fairly large public but also for some mathematicians. Without wanting in any way to diminish the importance of this work, it is nevertheless quite astonishing that attention has thus been focused on what does not constitute Kolmogorov's most original creation in the field of probability. The aim of this part, devoted to certain aspects of his probabilistic works, is precisely to highlight some of his most remarkable achievements in that domain. Within this imposing body of work a drastic choice was necessary, and we chose to focus on the two purely probabilistic directions Kolmogorov pursued: on the one hand, the study of the various types of convergence for sums of independent random variables, which enabled him to continue the studies of his Russian predecessors Markov and Lyapounov; on the other hand, his literally revolutionary considerations about processes in continuous time, whose ramifications extend to discoveries hardly thirty years old. Nevertheless, as it seemed difficult, indeed almost impossible, for a chapter devoted to the probabilistic works of the Soviet mathematician not to refer to the axiomatization of probability, we begin with a brief overview of his main contributions to this topic, inviting the reader to consult the numerous articles dealing with the question in more detail (see e.g. [vonP94], [SV06]). We also refer to the essential text of Shiryaev [Shi89] for a more complete chart of Kolmogorov's works. Some indications on the life of the mathematician and the status of the discipline in the Stalinist USSR can be found in [Maz03].

The axiomatization of probability calculus

An abstract framework

As mentioned above, Kolmogorov's publication Grundbegriffe der Wahrscheinlichkeitsrechnung [Kol33] is a modest monograph of 60 pages published in 1933, alongside several articles devoted to modern probability theory. The Russian translation is dated 1936; it was produced mostly for political reasons, at a time when considerable pressure was put on Soviet scientists to publish their works in Russian and in the USSR rather than abroad. The first English translation is dated 1950. This relatively long delay shows that the axiomatization suggested by the Russian scientist was not as universally accepted as is usually thought. Several probabilists, among them some of the most eminent, such as Paul Lévy, never used Kolmogorov's axiomatization, which in no way prevented them from having extraordinary ideas. In fact, outside the USSR, before the 1950s, more or less only Cramér's treatise [Cra37] refers to it. Moreover, Cramér does not give any detailed explanation; he uses Kolmogorov's axiomatization simply because it is the most practical one among those available at the time (in particular, compared with the theory of collectives suggested by von Mises). From the 1950s onward, however, it was definitively adopted by the younger generation. What is attractive in the formal framework proposed by this axiomatization is, for example, that it provides a global explanation of the multiple paradoxes which had plagued the discipline in the past (like those of Joseph Bertrand, Émile Borel, etc.): each time, the precise definition of the probability space as a description of the random experiment under consideration removes the ambiguity. (On this subject, see below, as well as [SV06] and [Szé86]. One can also refer to Itô's comments in the foreword of [Itô86].)

The great strength of Kolmogorov's treatise is that it deliberately adopts a completely abstract framework, without seeking to build bridges to the applied aspects of probability theory beyond the case of finite probabilities. In general, the search for such links inevitably raises delicate philosophical questions and is thus likely to obscure the mathematical modelling.

By speaking about questions of application only in the part devoted to finite probabilities, Kolmogorov frees himself from this constraint and can avoid the pitfalls that von Mises had not always circumvented. Indeed, the theory of collectives also claimed to discriminate between the experiments to which probability could legitimately be applied and the others. But Kolmogorov, who presents a purely mathematical theory, has no such ambition, and thus no such limitation. Within the abstract framework he defines, any mathematical work is legitimate, and its validation for applications is a matter for other fields of knowledge. In particular, he allows himself to consider sets without any topological structure, while for finer studies (such as convergence phenomena) he remains free to work on better spaces through the use of images of probability laws. This fact, incidentally, would later place Kolmogorov's axiomatization in opposition to the Bourbaki topological set-up for measure theory. The very general character of his theory made it possible for the Russian mathematician to use in full force the measure theory of Borel and Lebesgue, still relatively new at the time, since its abstract version had mainly been developed by Fréchet (quoted in the Grundbegriffe as the one who liberated measure theory from geometry) and then by the Polish school (Banach, Sierpiński, Kuratowski, ...) in the 1920s.

From the beginning of the twentieth century, Borel had promoted the use of measure theory and the Lebesgue integral for the treatment of questions of probability. In 1909 he published a revolutionary paper in which this method enabled him to obtain a first strong version of the law of large numbers, together with interpretations concerning the distribution of real numbers. Undoubtedly his moderate regard for probabilistic mathematics and his serious doubts about the legitimacy of its applications prevented him from fully reaping the harvest of the seeds he had sown.

Kolmogorov introduces the by now classical concept of a probability space in the form of a triplet (Ω, F, P) composed of a set Ω equipped with a σ-algebra (which he calls a set field) F and a normalized measure (probability) P. The random variables are simply real-valued functions X defined on Ω such that for all a ∈ R, {ω ∈ Ω, X(ω) < a} ∈ F, and their laws are the image measures of the probability P defined by P^(X)(A) = P(X⁻¹(A)) for all A ∈ B(R), the Borel σ-algebra of R.
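On a finite space this formalism can be written out directly, which may help fix the ideas. The following sketch (two fair dice, a purely illustrative choice not taken from the text) takes F implicitly to be the power set of Ω and computes the law of a random variable as the image measure P^(X).

```python
from fractions import Fraction

# A finite probability space (Omega, F, P) for two fair dice:
# Omega is the set of outcomes, F is implicitly the power set of Omega,
# and P assigns each outcome the same mass 1/36.
Omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]
P = {w: Fraction(1, 36) for w in Omega}

# A random variable is just a real-valued function X on Omega.
X = lambda w: w[0] + w[1]          # the sum of the two dice

# Its law is the image measure P^(X)(A) = P(X^{-1}(A)).
def law(X, P):
    mu = {}
    for w, p in P.items():
        mu[X(w)] = mu.get(X(w), Fraction(0)) + p
    return mu

mu = law(X, P)   # e.g. mu[7] = P(X = 7) = 6/36 = 1/6
```

Exact rational arithmetic is used so that the image measure sums to exactly 1, mirroring the normalization of P.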

The major contributions of Kolmogorov's work to the clarification of probabilistic concepts are incontestably the construction of a probability measure on an infinite product of spaces, which plays an important part in the theory of stochastic processes, and the formalization of the conditional law via the Lebesgue-Radon-Nikodym theorem (the abstract version of this theorem had been published by Nikodym in 1930). Let us note in passing that this was not the first time a probability on a product space had been built: the most famous example is due to Wiener [Wie23], who in 1923, by applying techniques that Daniell had developed a few years earlier to extend Lebesgue's integral to infinite-dimensional spaces, constructed the probability measure, now called Wiener measure, associated with Brownian motion (see [RY91]).

Construction of the conditional law

Let us present in a few words the construction of the conditional law, following Kolmogorov's text but with modernized notations for the sake of clarity.

First of all, let us recall the definition of the elementary conditional probability of an event C (i.e. an element of the σ-algebra F) given an event D such that P(D) > 0: P(C | D) = P(C ∩ D)/P(D). Now, let U denote a real-valued random variable and B an event. We seek to build a random variable ω → π(U(ω); B), a Borel function of U, called the conditional probability of B knowing U, such that, for all A ∈ B(R) with P(U ∈ A) > 0, we have:

P(B | U ∈ A) = E(π(U; B) | U ∈ A).

For any A ∈ B(R), we write Q_B(A) = P(B ∩ U⁻¹(A)). Let us note that if P^(U) is the law of U, defined by P^(U)(A) = P(U⁻¹(A)), then P^(U)(A) = 0 implies Q_B(A) = 0. Thus, according to the Lebesgue-Radon-Nikodym theorem, we can find a Borel function f_B such that, for all A ∈ B(R),

Q_B(A) = ∫_A f_B(x) P^(U)(dx),

and we set π(U; B) = f_B ∘ U. From this point, Kolmogorov recovers all the classical properties of conditional probabilities. He nicely illustrates the strength of his formalism by explaining Borel's paradox related to the random drawing of a point on a sphere: the interested reader may refer e.g. to [Bil95], p. 462, and [SV06].
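On a finite space the Radon-Nikodym construction can be mimicked directly, the density f_B being just a ratio of masses wherever P^(U) is positive. The dice example below is, again, only an illustrative stand-in chosen for this sketch, not an example from the text.

```python
from fractions import Fraction

# Discrete analogue of Kolmogorov's construction, with U the first die
# and B the event "the sum of the two dice equals 7".
Omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]
P = {w: Fraction(1, 36) for w in Omega}
U = lambda w: w[0]
B = {w for w in Omega if w[0] + w[1] == 7}

# Q_B(A) = P(B ∩ U^{-1}(A)) and P^(U)(A) = P(U^{-1}(A)); on a discrete
# space the Radon-Nikodym density f_B is the pointwise ratio of masses.
PU  = {u: sum(p for w, p in P.items() if U(w) == u) for u in range(1, 7)}
QB  = {u: sum(p for w, p in P.items() if U(w) == u and w in B) for u in range(1, 7)}
f_B = {u: QB[u] / PU[u] for u in PU}   # conditional probability of B given U = u
```

Whatever the first die shows, exactly one value of the second die yields a sum of 7, so f_B(u) = 1/6 for every u: here the conditional probability happens not to depend on U.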

The 0-1 law (or all-or-nothing law)

As mentioned previously, in 1933 measure theory was not yet commonly accepted, at any rate not in its abstract form, and when Fréchet discovered the monograph of the Russian mathematician, he was disconcerted by the very abstract form taken by certain arguments, such as the law known as the 0-1 law, which Kolmogorov placed in the appendix of his work. This law was stated independently, in particular by Lévy in 1934 (when he did not yet know the Grundbegriffe), and it is interesting to compare the two approaches to this result, which we will do by way of illustration of the strongly synthesizing character of the axiomatics proposed by Kolmogorov. For ease of reading, we use today's vocabulary and notations, keeping only the spirit of the two proofs.

Theorem 1. Let (X_n)_{n≥1} denote a sequence of independent real-valued random variables. We introduce G_n = σ(X_n, X_{n+1}, ...) (the σ-algebra generated by X_n, X_{n+1}, ...), and G = ∩_{n≥1} G_n (the "tail σ-algebra").

Then any element of G has probability 0 or 1.

Kolmogorov's proof: It is the most common proof taught today. Let A be an element of G. Suppose that P(A) > 0 and denote by P_A the conditional probability knowing A. By the independence hypothesis on the X_k's, for all B ∈ F_n = σ(X_1, X_2, ..., X_n), B is independent of the elements of G_{n+1} and thus of A, and we have:

P_A(B) = P(A ∩ B)/P(A) = P(B).

Therefore the probabilities P_A and P coincide on all the F_n's, hence on the Boolean algebra ∪_{n≥1} F_n, and thus, by the monotone class theorem, on the σ-algebra it generates, i.e. F = σ(X_1, X_2, ..., X_n, ...). Since in particular A ∈ F, we have P_A(A) = P(A), i.e. P(A) = 1.

Lévy's proof: In fact, Lévy contents himself with proving the result when the variables X_n follow the uniform law on [0,1]. In this case, obtaining a realization of the sequence (X_n)_{n≥1} can be conceived as picking a point in a cube of side 1 with infinitely many dimensions, the probability law being given by Lebesgue measure. Lévy's argument then rests on an observation which he declares obvious and which is in fact equivalent to a monotone class result: he points out (we employ the modern formalism) that for any event A of the σ-algebra F = σ(X_1, X_2, ..., X_n, ...) (which may therefore be written as

A = {ω : (X_1(ω), X_2(ω), ...) ∈ B},

where B is a measurable set of R^N), and for all ε > 0, we can find n > 0 and D_n ∈ F_n = σ(X_1, X_2, ..., X_n) such that P(D_n Δ A) < ε. In fact, his explanation is that measurable sets in the infinite-dimensional cube are obtained by "M. Lebesgue's constructions" from the "intervals" of the cube, which are the sets of the form {ω : a_i < X_i(ω) ≤ b_i, 1 ≤ i ≤ n}.

Now let E be an element of G. The independence of the X_n's allows us to write that, for all n,

P(E ∩ D_n) = P(E) P(D_n).

However, choosing D_n such that P(D_n Δ E) < ε, we have |P(E ∩ D_n) − P(E)| ≤ P(D_n Δ E) < ε and |P(D_n) − P(E)| < ε, so that |P(E) − P(E)²| < 2ε. This is true for all ε > 0, thus P(E)(1 − P(E)) = 0.

Limit theorems and series of independent random variables

The direction in which Kolmogorov developed his first works in probability, undoubtedly guided by his elder Khinchin, stands in continuity with the earlier studies which, throughout the nineteenth century, had specified the conditions of validity of the limit theorems (in particular the law of large numbers) for sums of random variables. The first paper, which goes back to 1925 [KK25] and is the only article jointly written by Kolmogorov and Khinchin, is remarkable in that it introduces a number of techniques which would underlie later developments of probability theory, in particular the convergence results for martingales. This first work concerns the convergence of series of independent random variables. The main result is stated as follows (in modern terms):

Theorem 2. Let (X_n)_{n≥1} be a sequence of centered (i.e. with zero expectation) independent real-valued random variables. Let us suppose that

∑_{n≥1} E(X_n²) < +∞.

Then ∑_n X_n converges almost surely (a.s.), i.e. with probability 1.
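Theorem 2 can be illustrated numerically with the random harmonic series ∑ ±1/n (independent fair signs), a standard example satisfying the hypothesis since ∑ 1/n² < +∞; the seed and truncation points below are arbitrary choices for this sketch.

```python
import random

# A sketch of Theorem 2: X_n = s_n / n with independent fair signs
# s_n = ±1.  E(X_n) = 0 and sum E(X_n^2) = sum 1/n^2 < +infinity,
# so the series sum_n X_n converges almost surely.
def partial_sum(N, seed=12345):
    # Re-seeding gives the same sign sequence, hence nested partial sums
    # of one and the same sample path.
    rng = random.Random(seed)
    return sum(rng.choice((-1, 1)) / n for n in range(1, N + 1))

# Along one path the partial sums stabilize: the tail beyond n = 1000
# has standard deviation sqrt(sum_{n>1000} 1/n^2), about 0.03.
s1, s2 = partial_sum(1000), partial_sum(5000)
```

The difference s2 − s1 is exactly the tail of the series between indices 1001 and 5000 on this sample path, and it is tiny compared with the fluctuations of the first few terms.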

The proof suggested by Kolmogorov is based on a famous inequality which today bears his name:

Lemma 1 (Kolmogorov's inequality). With S_p = X_1 + ... + X_p, for all ε > 0,

ε² P(max_{1≤p≤n} |S_p| ≥ ε) ≤ E(S_n²).

Proof. We write, for 1 ≤ p ≤ n,

A_p = {|S_1| < ε, ..., |S_{p−1}| < ε, |S_p| ≥ ε}

(the sums exceed ε for the first time at index p). Note that the A_p's are pairwise disjoint, that their union is the event {max_{1≤p≤n} |S_p| ≥ ε}, and that for all 1 ≤ p ≤ n, S_n − S_p is independent of S_p 1I_{A_p} and has zero expectation. Hence

E(S_n² 1I_{A_p}) = E((S_p + (S_n − S_p))² 1I_{A_p}) ≥ E(S_p² 1I_{A_p}) + 2 E(S_p 1I_{A_p}) E(S_n − S_p) = E(S_p² 1I_{A_p}) ≥ ε² P(A_p),

and summing with respect to p, E(S_n²) ≥ ε² P(max_{1≤p≤n} |S_p| ≥ ε).
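A quick Monte Carlo experiment makes the inequality concrete; the ±1 steps, sample size and threshold below are arbitrary illustrative choices.

```python
import random

# Monte Carlo check of Kolmogorov's inequality for independent ±1 steps:
# P(max_{p<=n} |S_p| >= eps) <= E(S_n^2) / eps^2   (here E(S_n^2) = n).
rng = random.Random(0)
n, eps, trials = 100, 15, 2000
hits = 0
for _ in range(trials):
    s, m = 0, 0
    for _ in range(n):
        s += rng.choice((-1, 1))
        m = max(m, abs(s))
    if m >= eps:
        hits += 1

empirical = hits / trials      # frequency of {max |S_p| >= 15}
bound = n / eps**2             # Kolmogorov's bound: 100/225 ~ 0.444
```

The empirical frequency stays below the bound, as it must; the bound is not sharp, which is typical of Chebyshev-type inequalities.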

It was only a few years later, in a note in the Comptes Rendus de l'Académie des Sciences (CRAS) of Paris in 1930 [Kol30], that Kolmogorov obtained from the previous result its most famous consequence, namely the now classical version of the strong law of large numbers:

Corollary 1. Let us suppose that the variables X_n are independent and centered. We write E(X_n²) = b_n and we suppose that

∑_{n≥1} b_n/n² < +∞.

Then σ_n = S_n/n → 0 a.s.

Proof. First of all, let us note that for all fixed N > 0,

lim_n |σ_n| ≤ sup_{k≥N} max_{2^k < n ≤ 2^{k+1}} |S_n|/n ≤ sup_{k≥N} 2^{−k} max_{n ≤ 2^{k+1}} |S_n|,

where lim_n indicates the superior limit for n → +∞. Moreover, by Kolmogorov's inequality, it is clear that:

P(max_{n ≤ 2^{k+1}} |S_n| ≥ ε 2^k) ≤ (1/(ε² 4^k)) ∑_{j=1}^{2^{k+1}} b_j.

Therefore, for all ε > 0 and all N > 0:

P(sup_{k≥N} 2^{−k} max_{n ≤ 2^{k+1}} |S_n| > ε) ≤ (1/ε²) ∑_{k≥N} 4^{−k} ∑_{j=1}^{2^{k+1}} b_j ≤ (C/ε²) [4^{−N} ∑_{j ≤ 2^{N+1}} b_j + ∑_{j > 2^{N+1}} b_j/j²],

for a universal constant C, by exchanging the order of summation. By hypothesis, the last term can be made as small as we like (for the first term one uses Kronecker's lemma), and so lim_n σ_n = 0 a.s.
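A one-path simulation illustrates the corollary; the ±1 steps (for which b_n = 1, so ∑ b_n/n² < +∞) and the checkpoints are arbitrary choices for this sketch.

```python
import random

# Sketch of Corollary 1 for independent ±1 steps: b_n = E(X_n^2) = 1,
# so sum b_n / n^2 < +infinity and sigma_n = S_n / n -> 0 almost surely.
rng = random.Random(1)
s = 0
checkpoints = {}
for n in range(1, 100_001):
    s += rng.choice((-1, 1))
    if n in (100, 10_000, 100_000):
        checkpoints[n] = s / n   # sigma_n along one sample path
```

The typical size of σ_n here is 1/√n, so the recorded values shrink by roughly a factor of 10 between consecutive checkpoints.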

Remark 1. The independence of the variables X_n occurs twice in this proof: first when we apply Kolmogorov's inequality, and secondly when we write

E(S_n²) = ∑_{j=1}^{n} b_j

(using the additivity of the variance for non-correlated variables).

As mentioned above, the result of Lemma 1 and its proof can be applied directly to the case of discrete martingales:

Corollary 2. Let (M_n)_{n≥1} denote a square integrable martingale such that E(M_n) = 0. Then, for all ε > 0,

ε² P(max_{1≤k≤n} |M_k| ≥ ε) ≤ E(M_n²). (3.2)

It is easily proven that (3.2) may be strengthened a little, in the form of the important Doob inequality: if (M_n)_{n≥1} is a square integrable martingale such that E(M_n) = 0, we have

E(max_{1≤k≤n} M_k²) ≤ 4 E(M_n²).

As we know, the theory of martingales has, since Doob's work, invaded the scene of contemporary probability theory. In order to illustrate the strength of inequality (3.2) and of the notion of martingale, let us prove the following result, together with a corollary: a martingale (M_n)_{n≥0} which is bounded in L² converges a.s. and in L².

Proof: Since for fixed p the process (M_n − M_p)_{n≥p} is a square integrable martingale, we have according to (3.2), for all N ≥ p and all ε > 0,

P(max_{p≤n≤N} |M_n − M_p| ≥ ε) ≤ E((M_N − M_p)²)/ε².

As mentioned above, (M_k)_{k≥0} is a Cauchy sequence in L² and so, for each given m > 0, we can choose p_m such that sup_{N≥p_m} E((M_N − M_{p_m})²) ≤ 8^{−m}. Applying the previous inequality with ε = 2^{−m} and using the Borel-Cantelli lemma, we conclude that (M_n) converges a.s. (and in L²).

Corollary 3. Let (Z_n)_{n≥1} be a sequence of independent random variables with the same Bernoulli law P(Z = 1) = P(Z = −1) = 1/2. We consider the random walk on Z, S_n = Z_1 + ... + Z_n, and, for a ∈ Z, a ≠ 0, the hitting time τ = inf{n ≥ 1 : S_n = a}. Then P(τ < +∞) = 1.

Proof. We only indicate the broad outline of the proof, leaving the details to the interested reader (see e.g. [BMP01]). Let X_n^θ = e^{θ S_n}/(cosh θ)^n. We verify that it is a martingale and that (X_{n∧τ}^θ)_{n≥1} is an L² martingale, which converges a.s. and in L² towards the variable W^θ = e^{θa} (cosh θ)^{−τ} 1I_{τ<+∞}. Passing to the limit as θ → 0, which is made possible by dominated convergence, we obtain that P(τ < +∞) = 1, and thus the desired result.
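Corollary 3 can be observed empirically; since the expected hitting time is infinite, any finite-horizon simulation only sees the fraction of paths that have already hit the level, which tends to 1 slowly (P(τ > N) decays like 1/√N). The level, horizon and sample size below are arbitrary.

```python
import random

# Monte Carlo sketch of Corollary 3: the simple random walk S_n on Z
# reaches the level a = 1 with probability 1.  Within a finite horizon N,
# the fraction of simulated paths that have hit a is already close to 1.
rng = random.Random(2)
a, N, trials = 1, 10_000, 500
hit = 0
for _ in range(trials):
    s = 0
    for _ in range(N):
        s += rng.choice((-1, 1))
        if s == a:
            hit += 1
            break

fraction = hit / trials   # close to, but below, 1 for any finite horizon
```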

In 1924, Khinchin [Khi24] had proven a result which brought a radical precision to the law of large numbers: the law of the iterated logarithm. The generalization of Khinchin's result by Kolmogorov in 1929 ([Kol29]) was one of his greatest achievements.

Theorem 3. Let (X_n)_{n≥1} be a sequence of independent real-valued random variables. Let us suppose that for all n, E(X_n) = 0 and b_n = E(X_n²) < +∞. We write B_n = b_1 + ... + b_n and we suppose that B_n → +∞ and that |X_n| ≤ M_n a.s., with M_n = o(√(B_n/ln ln B_n)). Then we have a.s.

lim sup_n S_n/√(2 B_n ln ln B_n) = 1.
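A single simulated path gives a feeling for the statement; with ±1 steps, B_n = n, and the normalized sums S_n/√(2n ln ln n) stay of order 1 over the whole path (the range and seed below are arbitrary, and a finite path can of course only suggest the lim sup).

```python
import math, random

# One sample path against the iterated-logarithm normalization for
# independent ±1 steps (so B_n = n): the ratio S_n / sqrt(2 n ln ln n)
# remains of order 1, although S_n itself is unbounded.
rng = random.Random(3)
s = 0
ratios = []
for n in range(1, 200_001):
    s += rng.choice((-1, 1))
    if n >= 100:                       # ln ln n is only sensible for larger n
        ratios.append(s / math.sqrt(2 * n * math.log(math.log(n))))

peak = max(abs(r) for r in ratios)     # of order 1 on a typical path
```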

Today, Kolmogorov's proof remains very much up to date, as it introduces techniques, in particular of large deviations, which became fundamental in the study of many limiting phenomena in probability theory. We will only show the less technical part of the result, leaving the reader to consult one of the innumerable texts which present the complete proof (e.g. [Bil95]). Let us write, for ε > 0, φ_ε(n) = (1 + ε)√(2 B_n ln ln B_n); we shall prove that P(S_n ≥ φ_ε(n) infinitely often) = 0. According to the Borel-Cantelli lemma, it suffices to prove that for a well-chosen subsequence n_k ↑ +∞, we have

∑_k P(max_{n ≤ n_k} S_n ≥ φ_ε(n_{k−1})) < +∞. (3.4)

As mentioned above, we will obtain the result thanks to the following lemma, which gives some large deviations estimates for the sequence (S_n).

Lemma 2. Write S = S_n, B = B_n and suppose |X_k| ≤ M a.s. for k ≤ n. Then, for all x > 0:
(i) if xM ≤ B, P(S > x) ≤ exp(−(x²/(2B))(1 − xM/(2B)));
(ii) if xM > B, P(S > x) ≤ exp(−x/(4M));
(iii) P(max_{1≤k≤n} S_k ≥ x) ≤ 2 P(S ≥ x − √(2B)).

Proof.

Let us fix n and, to simplify the writing, omit this index in the next lines.

Let a > 0 be such that aM ≤ 1. Then, since E(X_k) = 0 and |X_k| ≤ M,

E(e^{a X_k}) ≤ 1 + (a² b_k/2)(1 + aM/2) ≤ exp[(a² b_k/2)(1 + aM/2)],

and thus, by independence:

E(e^{aS}) ≤ exp[(a² B/2)(1 + aM/2)].

As P(S > x) ≤ E(e^{aS}) e^{−ax} (for all a > 0), we obtain the inequality P(S > x) ≤ exp[−ax + (a² B/2)(1 + aM/2)], from which we easily deduce points (i) and (ii) by taking successively a = x/B and a = 1/M.

As for point (iii), let us write U = max_{1≤k≤n} S_k and note that (U ≥ x) is the disjoint union of the events

E_k = {S_1 < x, ..., S_{k−1} < x, S_k ≥ x}, 1 ≤ k ≤ n.

Thus, we have

P(U ≥ x, S ≤ x − √(2B)) = ∑_k P(E_k ∩ {S ≤ x − √(2B)}) ≤ ∑_k P(E_k ∩ {S − S_k ≤ −√(2B)}).

But S − S_k is independent of E_k, and therefore this last expression is also

∑_k P(E_k) P(S − S_k ≤ −√(2B)) ≤ (1/2) ∑_k P(E_k) = P(U ≥ x)/2,

since, by Chebyshev's inequality, P(S − S_k ≤ −√(2B)) ≤ Var(S − S_k)/(2B) ≤ 1/2. Hence

P(U ≥ x) ≤ P(S > x − √(2B)) + P(U ≥ x)/2,

and therefore P(U ≥ x) ≤ 2 P(S ≥ x − √(2B)).

From Lemma 2, we deduce (3.4) for a well-chosen subsequence (n_k). Indeed, let us choose these integers such that for all k, B_{n_{k−1}} ≤ (1 + τ)^k ≤ B_{n_k}, for some small τ > 0 (which is possible since B_n → +∞). From Lemma 2 (i)-(ii) we obtain, using the hypotheses, for all μ > 0 and k large enough,

P(S_{n_k} ≥ φ_ε(n_{k−1})) ≤ exp(−(1 − μ)(1 + ε)² ln ln B_{n_{k−1}}),

which is the general term of a convergent series for μ small enough. We conclude by applying Lemma 2 (iii) and the fact that √(2 B_{n_k}) is negligible compared with φ_ε(n_{k−1}).

Processes in continuous time

At the beginning of the 1930s, a great number of probabilistic works of the Soviet school were related to the study of stochastic processes in continuous time, thus meeting in particular the needs of physics, or aiming at describing certain "social phenomena". The axiomatization due to Kolmogorov, commented on above, brought an essential element to the establishment of this theory: the theorem on the construction of probability measures on a space of infinite dimension shows that the law of a stochastic process in continuous time is determined in a unique way by the family of finite-dimensional marginal laws of the process in question.

Chapman-Kolmogorov's equation

The first family of processes to which Kolmogorov, as a good heir of the Russian school of probability, naturally turned is that of Markov processes, i.e. those which satisfy the property (known as the Markov property) of independence of the future from the past, conditionally on the knowledge of the present. The article Über die analytischen Methoden in der Wahrscheinlichkeitsrechnung [Kol31] definitively set the analytical bases of the theory of Markov processes.

In this fundamental article published in 1931, the entire study of the process is focused on the function

P(s, x, t, A),

which represents the probability that at time t the random phenomenon is in one of the states of the set A, given that it was in the state x at time s prior to t (0 ≤ s < t). Provided the necessary measurability assumptions are satisfied, this function must verify the integral equation

P(s, x, t, A) = ∫_E P(s, x, u, dz) P(u, z, t, A), s < u < t, (3.5)

where E stands for the set of all possible states of the process. Equation (3.5), commonly called today the Chapman-Kolmogorov equation (Chapman had indeed noted it in a report [Cha28] on Brownian motion in 1928), is the analytic translation of the Markov property, and the measures P(s, x, t, dy) represent the transition probabilities of the process: if we denote by (X_t)_{t≥0} this process, then, for all measurable A (for the measure dy),

P(s, x, t, A) = P(X_t ∈ A | X_s = x).

However, as Kolmogorov's article is purely analytical, as its title shows, it makes no mention of pathwise realizations of the random process. Equation (3.5) cannot be solved explicitly in the very general framework in which it is posed. Thus, Kolmogorov seeks regularity conditions on the probabilities P(s, x, t, dy) which make it possible to obtain a more tractable form. Eager to use the new techniques of analysis related to Lebesgue's integral, he naturally focuses on the case where P(s, x, t, dy) is absolutely continuous with respect to Lebesgue measure, with density f(s, x, t, y) ≥ 0.


Equation (3.5), which is satisfied by the transition probabilities P(s, x, t, dy), translates for their densities as:

f(s, x, t, y) = ∫ f(s, x, u, z) f(u, z, t, y) dz, (3.6)

for all u ∈ ]s, t[ and all y ∈ R. Therefore, to obtain local conditions starting from (3.5), the natural idea is to perform a Taylor expansion of f, which requires regularity conditions on f and assumptions on the moments.
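The Chapman-Kolmogorov identity for densities can be checked numerically in the Brownian case, where the transition density is the Gaussian heat kernel; the intermediate time, end points and quadrature step below are arbitrary choices for this sketch.

```python
import math

# Numerical check of the Chapman-Kolmogorov identity for the Gaussian
# transition density of Brownian motion,
#   f(s, x, t, y) = exp(-(y - x)^2 / (2 (t - s))) / sqrt(2 pi (t - s)),
# integrating over the intermediate state z at a time u with s < u < t.
def f(s, x, t, y):
    v = t - s
    return math.exp(-(y - x) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)

s, u, t = 0.0, 0.4, 1.0
x, y = 0.0, 0.7

# Trapezoidal rule for the integral over z (the integrand decays fast,
# so truncating at |z| = 10 and a step of 0.01 is ample).
h, L = 0.01, 10.0
zs = [-L + k * h for k in range(int(2 * L / h) + 1)]
vals = [f(s, x, u, z) * f(u, z, t, y) for z in zs]
integral = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

direct = f(s, x, t, y)    # the two sides agree to high accuracy
```

This is just the statement that the convolution of two centered Gaussians of variances 0.4 and 0.6 is a Gaussian of variance 1.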

Kolmogorov requires that for all s, t, y, f(s, x, t, y) admits third order derivatives in x and in y which are uniformly bounded in s and t on any set of the type {(s, t) : t − s > k}, k > 0. Moreover, under the following assumption on the moments:

(1/Δ) ∫ |y − x|³ f(s, x, s + Δ, y) dy → 0 when Δ → 0,

he shows the existence of the limits

A(s, x) = lim_{Δ→0} (1/Δ) ∫ (y − x) f(s, x, s + Δ, y) dy, (3.8)

B²(s, x) = lim_{Δ→0} (1/Δ) ∫ (y − x)² f(s, x, s + Δ, y) dy, (3.9)

which he calls respectively the infinitesimal mean and the infinitesimal variance of the process, and which, in the case of diffusions, will later be known as the drift coefficient and the diffusion coefficient. Thus, from (3.5), the existence of the limits (3.8) and (3.9), and the differentiability assumptions on f mentioned previously, Kolmogorov obtains the two following partial differential equations:

∂f/∂s + A(s, x) ∂f/∂x + (B²(s, x)/2) ∂²f/∂x² = 0, (3.10)

∂f/∂t = −∂/∂y [A(t, y) f] + (1/2) ∂²/∂y² [B²(t, y) f]. (3.11)

The importance of these equations is such that one can consider them as the origin of the modern theory of stochastic processes. Let us give, e.g., the main arguments of the proof of the first equation, which the author calls the first fundamental differential equation and which is now known as the backward equation (the second being the forward equation).

Proof of equation (3.10). If we apply (3.5) between the times s, s + Δ and t, and expand z → f(s + Δ, z, t, y) by Taylor's formula around x up to third order, we obtain

f(s, x, t, y) = ∫ f(s, x, s + Δ, z) f(s + Δ, z, t, y) dz
= f(s + Δ, x, t, y) + (∂f/∂x)(s + Δ, x, t, y) ∫ (z − x) f(s, x, s + Δ, z) dz + (1/2)(∂²f/∂x²)(s + Δ, x, t, y) ∫ (z − x)² f(s, x, s + Δ, z) dz + c(s, x, Δ),

where the remainder c(s, x, Δ) is controlled by the third order moments. This brings us immediately to the finite difference formula

(f(s, x, t, y) − f(s + Δ, x, t, y))/Δ = (∂f/∂x)(s + Δ, x, t, y) (1/Δ) ∫ (z − x) f dz + (1/2)(∂²f/∂x²)(s + Δ, x, t, y) (1/Δ) ∫ (z − x)² f dz + c(s, x, Δ)/Δ,

and letting Δ → 0 yields (3.10), thanks to (3.8) and (3.9). To conclude, let us note that under the above assumptions the ratio c(s, x, Δ)/Δ tends towards 0 when Δ tends towards 0.
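In the Brownian case (drift 0, diffusion coefficient 1), where the backward equation (3.10) reduces to ∂f/∂s + (1/2) ∂²f/∂x² = 0, the equation can be verified by finite differences on the explicit Gaussian density; the evaluation point and step size below are arbitrary.

```python
import math

# Finite-difference check of the backward equation (3.10) in the
# Brownian case A = 0, B = 1, where it reads
#   df/ds + (1/2) d^2 f/dx^2 = 0
# for the Gaussian transition density f(s, x, t, y).
def f(s, x, t, y):
    v = t - s
    return math.exp(-(y - x) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)

s, t, x, y, h = 0.2, 1.0, 0.1, 0.5, 1e-4

df_ds   = (f(s + h, x, t, y) - f(s - h, x, t, y)) / (2 * h)
d2f_dx2 = (f(s, x + h, t, y) - 2 * f(s, x, t, y) + f(s, x - h, t, y)) / h**2

residual = df_ds + 0.5 * d2f_dx2    # ~0 up to discretization error
```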

As we already pointed out, the study of "random movements" whose law is governed by (3.5) had already been outlined by Chapman in 1928 ([Cha28]) in a context of theoretical physics. The name "Chapman-Kolmogorov equation" should not, however, suggest that these were the only occasions, before the article of 1931, on which this equation appeared. Kolmogorov himself, in this article, mentions a particular case studied by Louis Bachelier in 1900 ([Bac00]). He underlines, in a section devoted to the work of Bachelier, that equation (3.11) had been written in Bachelier's work of 1900, without however being proven, in the case where the process is homogeneous in space, i.e. when the densities f(s, x, t, y) depend only on s, t and the difference y − x. The equation also appears in the works of Marian Smoluchowski on Brownian motion during the 1910s.

Some continuations of the article Über die analytischen Methoden in der Wahrscheinlichkeitsrechnung [Kol31] appeared shortly after its publication, due to other probabilists such as Bernstein, whom Kolmogorov's equations inspired to develop his theory of stochastic differential equations in 1932. However, this theory is based on the discrete model and only allows one to obtain weak solutions in the continuous case. Another important work was that of Wolfgang Doeblin, carried out in 1940. Doeblin sent it while at war (where he died) to the Académie des Sciences de Paris, in a sealed envelope which was not opened until 2000 ([Doe40]) 10. In this manuscript Doeblin, very much ahead of his time, considers the pathwise aspects of stochastic processes. More precisely, he establishes links between the strictly analytical point of view of Kolmogorov and that of Lévy, who concentrated primarily on the construction of paths and the fine properties of processes, especially Brownian motion, by purely probabilistic methods which often left his contemporaries perplexed. Doeblin builds equations very close to Itô's stochastic differential equations, established some ten or fifteen years later, whose solutions are Brownian motions with a modified time variable: if the law of (X_t) satisfies (3.5), then

X_t = X_0 + β(H_t) + ∫_0^t A(s, X_s) ds,

where β is a real-valued Brownian motion and H the time change

H_t = ∫_0^t B²(s, X_s) ds.

This pathwise vision of processes offers much more than the analytical forward and backward equations (3.10) and (3.11). It allows Doeblin to establish results on the regularity of trajectories, the comparison of solutions, iterated logarithm properties, functional central limit theorems, and especially a preliminary version of the change of variable formula that Itô obtained a few years later ([Itô44]), which inaugurated the era of stochastic calculus proper.

To establish his formula, Doeblin considers a function ϕ(t, x) of class C^{1,2} (i.e. of class C¹ with respect to t and C² with respect to x), increasing with respect to x, which ensures quite easily that the law of the process Y_t = ϕ(t, X_t) is a solution of Kolmogorov's equation as soon as the law of (X_t) is. Thus, he proves that (Y_t) satisfies

Y_t = Y_0 + γ(H̃_t) + ∫_0^t [∂ϕ/∂s + A ∂ϕ/∂x + (B²/2) ∂²ϕ/∂x²](s, X_s) ds,

where γ is a real-valued Brownian motion and

H̃_t = ∫_0^t B²(s, X_s) (∂ϕ/∂x)²(s, X_s) ds.

10. The readers who may be less interested in the technical aspects can content themselves with reading the article [BY03].

It was necessary to await the construction of Itô's stochastic integral, during and after the Second World War, to see the solutions of (3.10) and (3.11) under a new aspect. These new equations are satisfied by the process itself and no longer only by its transition probabilities. If β stands for a real-valued Brownian motion and if A(t, x) and B(t, x) are the functions defined at the beginning of this section, then the transition probabilities of the process X solution of the stochastic differential equation

dX_t = A(t, X_t) dt + B(t, X_t) dβ_t (3.12)

satisfy (3.5), (3.10) and (3.11). An essential tool in obtaining this result was Itô's formula mentioned above. In its most common current form, this fundamental formula is stated as follows: if ϕ(t, x) is a function of class C^{1,2}, then

ϕ(t, X_t) = ϕ(0, X_0) + ∫_0^t (∂ϕ/∂x)(s, X_s) B(s, X_s) dβ_s + ∫_0^t [∂ϕ/∂s + A ∂ϕ/∂x + (B²/2) ∂²ϕ/∂x²](s, X_s) ds,

where X solves the stochastic differential equation (3.12). These are the bases of the stochastic calculus which was to see many developments during the whole second half of the twentieth century. From the 1950s onwards, Doob's martingale theory [Doo90], developed afterwards by P.A. Meyer and his school in Strasbourg, made it possible to weaken the conditions imposed until then on the functions A(t, x) and B(t, x) to ensure the construction of stochastic processes. An essential remark in this direction was the observation that under natural assumptions of local boundedness and Lipschitz continuity, (3.12) admits a single solution in law, in the sense that if β̃ is another Brownian motion (possibly defined on another probability space), a solution X̃ of

dX̃_t = A(t, X̃_t) dt + B(t, X̃_t) dβ̃_t

follows the same law as X. This made it possible in the 1970s to define the concept of weak solution of (3.12), no longer tied to the choice of a particular Brownian motion. A famous theorem of Yamada and Watanabe [YW71] asserts that pathwise uniqueness (the notion corresponding to an equation directed by a fixed Brownian motion) implies uniqueness in law of the weak solutions. A powerful formulation was proposed by Stroock and Varadhan [SV79] in terms of martingale problems. The generator associated with the Markov process (X_t) solution of (3.12) is the operator L on C^{1,2} defined by:

Lf(t, x) = (∂f/∂t)(t, x) + A(t, x) (∂f/∂x)(t, x) + (B²(t, x)/2) (∂²f/∂x²)(t, x).

Itô's formula allows one to express this definition by saying that for all f ∈ C^{1,2},

f(t, X_t) − f(0, X_0) − ∫_0^t Lf(s, X_s) ds (3.14)

is a local martingale. It then becomes natural to define a solution of (3.12) as follows. Let C = C(R_+, R) be the set of continuous functions from R_+ to R. We define the canonical projections on C by X_t(ω) = ω(t) for ω ∈ C, and the canonical filtration by C_t = σ(X_s, s ≤ t), t ≥ 0. Then a solution of (3.12) is a probability P on (C, (C_t)) such that under P the processes defined by (3.14) are local martingales.

The most interesting aspect of the work of Stroock and Varadhan is that under very weak conditions (roughly, the continuity of the functions A and B), they showed that the preceding martingale problem admits a solution P. Under this probability, the canonical process satisfies the Markov properties which were at the origin of Kolmogorov's studies. The reader interested in these subjects may consult the important treatise of Jacod [Jac79].
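Equation (3.12) can be simulated with the Euler-Maruyama discretization, a modern numerical scheme not found in the texts discussed here; as an illustration we take (an arbitrary choice) the linear drift A(t, x) = −x and constant diffusion B(t, x) = 1, for which the exact solution is the Ornstein-Uhlenbeck process with mean E(X_t) = X_0 e^{−t}.

```python
import math, random

# Euler-Maruyama sketch of the stochastic differential equation (3.12)
# with A(t, x) = -x and B(t, x) = 1 (Ornstein-Uhlenbeck process):
#   X_{k+1} = X_k + A(t_k, X_k) dt + B(t_k, X_k) sqrt(dt) N(0, 1).
rng = random.Random(4)
dt, steps, paths = 0.01, 100, 2000    # horizon t = 1
x0 = 1.0

final = []
for _ in range(paths):
    x = x0
    for _ in range(steps):
        x += -x * dt + math.sqrt(dt) * rng.gauss(0.0, 1.0)
    final.append(x)

mean_hat = sum(final) / paths   # estimates E(X_1) = e^{-1} for the exact process
```

The sample mean approaches e^{−1} up to the Monte Carlo error (of order 1/√paths) and the small bias of the time discretization.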

Processes with independent and stationary increments

Among the processes whose laws verify Chapman-Kolmogorov's equation (3.5), there is a very important family that Lévy began to study at the beginning of the 1930s: those for which the functions f(s, x, t, y) are homogeneous in time and space, i.e. depend only on the differences t − s and y − x. In other terms, these are the processes with independent and stationary increments, whose law Kolmogorov attempts to characterize in an article published in two parts in 1932: Sulla forma generale di un processo stocastico omogeneo [Kol32a] and Ancora sulla forma generale di un processo omogeneo [Kol32b]. He simply considers a "random time function" X(λ), where λ ≥ 0 represents the time variable, such that for all λ_1 and λ_2 (λ_2 ≥ λ_1), the difference X(λ_2) − X(λ_1) is independent of (X(λ), λ ≤ λ_1) and has a law depending only on λ_2 − λ_1, i.e., if we write

Φ_Δ(x) = P(X(λ + Δ) − X(λ) ≤ x),

this repartition function does not depend on λ.

Then he observes that the relation

Φ_{Δ_1+Δ_2} = Φ_{Δ_1} * Φ_{Δ_2} (convolution of the increment laws)

is a particular case of (3.5). Nevertheless, we may notice that at that time these processes were not related to Markov processes, which are the subject of the study mentioned above. In fact, this connection was only formalized in the 1950s.

The aim of the articles of 1932 is, according to Kolmogorov himself, to generalize some results given by Bruno de Finetti [Fin30] in the case where the laws given by the repartition functions Φ_Δ admit second order moments, i.e. ∫ x² dΦ_Δ(x) < ∞. We will use the following notations:

m_Δ = ∫ x dΦ_Δ(x), σ²_Δ = ∫ (x − m_Δ)² dΦ_Δ(x).

Thus Kolmogorov obtains a particular case of the famous Lévy-Khinchin formula: if ψ_Δ(t) = ∫ e^{itx} dΦ_Δ(x), then ψ_Δ(t) = [ψ_1(t)]^Δ and

log ψ_1(t) = i t m_1 − (σ_0²/2) t² + ∫ ((e^{itx} − 1 − itx)/x²) dG(x),

where G is a finite measure such that G({0}) = 0 and where m_1 ∈ R, σ_0² ≥ 0. This formula is commonly attributed to Lévy and Khinchin, who obtained its final version in 1934 ([Lév34]) and 1937 ([Khi37]), although, as we have seen, de Finetti and Kolmogorov had already established it in particular cases. More precisely, the result given by Kolmogorov is the following.

Theorem 4. When the law given by the repartition function Φ_Δ has a second order moment, we have

log ψ_1(t) = i t m_1 − (σ_0²/2) t² + ∫ π(x, t) dF(x),

where m_1 ∈ R, σ_0² ≥ 0, π(x, t) = (e^{itx} − 1 − itx)/x², and where the measure dF(x) is defined by an increasing and bounded function F.
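The representation of Theorem 4 can be checked exactly on a compound Poisson process; the rate and unit jump size below are arbitrary illustrative choices. For rate λ and jumps equal to 1, log ψ_1(t) = λ(e^{it} − 1), and in the canonical form m_1 = λ (the mean of the increment over unit time) while dF puts mass λ · 1² = λ at the jump size x = 1.

```python
import cmath

# Check of Theorem 4 for a compound Poisson process with rate lam and
# unit jumps: log psi_1(t) = lam (e^{it} - 1).  In Kolmogorov's canonical
# form, m_1 = lam and dF is the point mass lam at x = 1, with
# pi(x, t) = (e^{itx} - 1 - itx) / x^2, so that
#   i t m_1 + pi(1, t) * lam  =  lam (e^{it} - 1).
lam, t = 0.7, 1.3

direct = lam * (cmath.exp(1j * t) - 1)

m1 = lam
canonical = 1j * t * m1 + lam * (cmath.exp(1j * t) - 1 - 1j * t)
```

The compensating term −itx in π(x, t) is exactly cancelled by the i t m_1 term, which is why the two expressions agree identically in t.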

The previous formulae only concern the one-dimensional laws of the random function (X(λ)); consequently, it would be better to speak of a characterization of infinitely divisible laws rather than of a result on processes with independent and stationary increments. Nevertheless, let us note that at the time this terminology did not exist.

Kolmogorov's proof: First of all, we verify that ψ_Δ is continuous with respect to Δ. Indeed, for Δ ≤ 1/n, we have σ²_Δ = σ²_{1/n} − σ²_{1/n−Δ} ≤ σ²_{1/n} = (1/n) σ²_1, and so σ²_Δ → 0 when Δ → 0. Consequently, ψ_Δ(t) → 1 when Δ → 0. We conclude thanks to the equality ψ_{Δ_1+Δ_2}(t) = ψ_{Δ_1}(t) ψ_{Δ_2}(t). The continuity of ψ_Δ in Δ allows one to show that the equality

ψ_Δ(t) = [ψ_1(t)]^Δ,

which is true for all rationals Δ, is also verified for all reals Δ. The author then deduces (using de Finetti's proof) that

log ψ_1(t) = lim_{n→∞} n (ψ_{1/n}(t) − 1).

Moreover, we have

n (ψ_{1/n}(t) − 1) = i t m_1 + ∫ π(x, t) dF_{1/n}(x), where F_Δ(x) = (1/Δ) ∫_{−∞}^x u² dΦ_Δ(u).

A classical argument allows one to justify that for any sequence Δ_n decreasing towards 0, there exists a subsequence Δ_{n_k} such that F_{Δ_{n_k}}(x) converges, as k → +∞, towards a function F(x) at all points x where the latter is continuous. Let us remark that F is an increasing function such that:

F(+∞) − F(−∞) ≤ σ_1²,

and, taking into account that at fixed t, π(x, t) → 0 when x → ±∞, we obtain

log ψ_1(t) = i t m_1 + ∫ π(x, t) dF(x).

But as σ_1² is finite, we have log ψ_1(t) = i t m_1 − (σ_1²/2) t² + o(t²) (t → 0), and according to the previous computation, separating the mass of dF at 0 (where π(x, t) → −t²/2 as x → 0) from the rest yields the representation of Theorem 4, with σ_0² = F({0}).

with the discovery of new fields of application, such as financial mathematics, where models using Lévy processes make it possible to compensate for the defects of the Black-Scholes model based on geometric Brownian motion.