Extending the Algebraic Manipulability of Differentials
Jonathan Bartlett1 and Asatur Zh. Khurshudyan2
1 The Blyth Institute, [email protected]
2 Institute of Mechanics, NAS of Armenia
August 19, 2018
Abstract
Treating differentials as independent algebraic units has a long history of use and abuse. It is generally considered problematic to treat the derivative as a fraction of differentials rather than as a holistic unit acting as a limit, though for practical reasons it is often done for the first derivative. However, using a revised notation for the second and higher derivatives makes it possible to treat differentials as independent units in a much larger number of cases.

1 Introduction
The calculus of variations has had a long, rich history, with many competing notations and interpretations. The fluxion was the original concept of the derivative invented by Isaac Newton, and even had a notation similar to the modern Lagrange notation. A competing notation for the derivative is the Leibniz notation, where the derivative is expressed as a ratio of differentials, representing arbitrarily small (possibly infinitesimal) differences in each variable.

The calculus was originally thought of as examining infinitely small quantities. When these infinitely small quantities were put into ratio with each other, the result could potentially be within the reals (a likely result for smooth, continuous functions). But, on their own, these infinitesimals were thought of as infinitely close to zero.

The concept of an infinitesimal caused a great deal of difficulty within mathematics, and therefore calculus was revised for the derivative to represent the limit of a ratio. In such a conception, dx and dy are not really independent units, but, when placed in ratio with each other, represent the limit of that ratio as the changes get smaller and smaller. However, many were not pleased with the limit notion, preferring to view dx and dy as distinct mathematical objects.

This question over the ontological status of differentials was somewhat paralleled by preferences in notation. Those favoring the validity of infinitesimals generally preferred the Leibniz notation, where dx and dy are at least visually represented as individual units, while those favoring the limit conception of the derivative generally preferred the Lagrange notation, where the derivative is a holistic unit.

In an interesting turn of events, in the late 19th century, the Leibniz notation for the derivative largely won out, but the Lagrangian conception of the derivative has been the favored intellectual interpretation of it. Essentially, this means that equations are generally written as if there were distinct differentials available, but they are manipulated as if they only represent limits of a ratio which cannot be taken apart.

This dichotomy has led to an unfortunate lack of development of the notation. Because it is generally assumed that differentials are not independent algebraic units, the fact that issues arise when treating them as such has not caused great concern, and has simply reinforced the idea that they should not be treated algebraically. Therefore, there has been little effort to improve the notation to allow for a more algebraic treatment of individual differentials.

However, as will be shown, the algebraic manipulability of differentials can be greatly expanded if the notation for higher-order derivatives is revised. This leads to an overall simplification in working with calculus for both students and practitioners, as it allows items which are written as fractions to be treated as fractions. It prevents students from making mistakes, since their natural inclination is to treat differentials as fractions.1 Additionally, there are several little-known but extremely helpful formulas which are straightforwardly deducible from this new notation.
Even absent these practical concerns, we find that reconceptualizing differentials in terms of algebraically manipulable terms is an interesting project in its own right, and perhaps may help us see the derivative in a new way, and adapt it to new uses in the future. There may also be additional formulas which can in the future be more directly connected to the algebraic formulation of the derivative.

1 Since many in the engineering disciplines are not formally trained mathematicians, this also can prevent professionals in applied fields from making similar mistakes.

2 The Problem of Manipulating Differentials Algebraically
When dealing with the first derivative, there are generally few practical problems in treating differentials algebraically. If y is a function of x, then $\frac{dy}{dx}$ is the first derivative of y with respect to x. This can generally be treated as a fraction.

For instance, since $\frac{dx}{dy}$ is the first derivative of x with respect to y, it is easy to see that these values are merely the inverse of each other. The inverse function theorem of calculus states that $\frac{dy}{dx} = \frac{1}{\frac{dx}{dy}}$. The generalization of this theorem into the multivariable domain essentially provides for fraction-like behavior within the first derivative.

Likewise, in preparation for integration, both sides of the equation can be multiplied by dx. Even in multivariate equations, differentials can essentially be multiplied and divided freely, as long as the manipulations are dealing with the first derivative.

Even the chain rule goes along with this. Let x depend on parameter u. If one has the derivative $\frac{dy}{du}$ and multiplies it by the derivative $\frac{du}{dx}$, then the result will be $\frac{dy}{dx}$. This is identical to the chain rule in Lagrangian notation.

It is well recognized that problems occur if one tries to extend this technique to the second derivative [1]. Take for a simple example the function $y = x^3$. The first derivative is $\frac{dy}{dx} = 3x^2$. The second derivative is $\frac{d^2y}{dx^2} = 6x$.

Say that it is later discovered that x is a function of t so that $x = t^2$. The problem here is that the chain rule for the second derivative is not the same as what would be implied by the algebraic representation.

Here we arrive at one of the major problematic points for using the current notation of the second derivative algebraically. To demonstrate the problem explicitly, if one were to take the second derivative seriously as a set of algebraic units, one should be able to multiply $\frac{d^2y}{dx^2}$ by $\frac{dx^2}{dt^2}$ to get the second derivative of y with respect to t. However, this does not work. If the differentials are being treated as algebraic units, then $\frac{dx^2}{dt^2}$ is the same as $\left(\frac{dx}{dt}\right)^2$, which is just the first derivative of x with respect to t, squared. The first derivative of x with respect to t is $\frac{dx}{dt} = 2t$. Therefore, treating the second derivative algebraically would imply that all that is needed to convert the second derivative of y with respect to x into the second derivative of y with respect to t is to multiply by $(2t)^2$.

However, this reasoning leads to the false conclusion that $\frac{d^2y}{dt^2} = 24t^4$. If, instead, the substitution is done at the beginning, it can be easily seen that the result should be $30t^4$:

\[
\begin{aligned}
y &= x^3 \\
x &= t^2 \\
y &= (t^2)^3 \\
y &= t^6 \\
y' &= 6t^5 \\
y'' &= 30t^4
\end{aligned}
\]

This is also shown by the true chain rule for the second derivative, based on Faà di Bruno's formula [2]. This formula says that the chain rule for the second derivative should be:

\[
\frac{d^2y}{dt^2} = \frac{d^2y}{dx^2}\left(\frac{dx}{dt}\right)^2 + \frac{dy}{dx}\frac{d^2x}{dt^2} \tag{1}
\]

This, however, is extremely unintuitive, and essentially makes a mockery out of the concept of using the differential as an algebraic unit.

It is generally assumed that this is a problem for the idea that second differentials should be treated as algebraic units. However, it is possible that the real problem is that the notation for second differentials has not been given as careful attention as it should.

The habits of mind that have come from this have even affected nonstandard analysis, where authors, despite their appreciation for the algebraic properties of differentials, have left the algebraic nature of the second derivative either unexamined (as in [3]) or examined poorly (i.e., leaving out the problematic nature of the second derivative, as in [4, pg. 4]).
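The mismatch between the naive fraction manipulation and the true chain rule can be checked with exact arithmetic. The following is a minimal sketch for the running example y = x^3, x = t^2; the function names are illustrative and are not taken from the paper.

```python
from fractions import Fraction

def naive_second_derivative(t):
    """d2y/dx2 * (dx/dt)^2: what results from treating d2y/dx2 as a plain fraction."""
    x = t ** 2
    return 6 * x * (2 * t) ** 2

def chain_rule_second_derivative(t):
    """Equation (1): d2y/dx2 * (dx/dt)^2 + dy/dx * d2x/dt2."""
    x = t ** 2
    return 6 * x * (2 * t) ** 2 + 3 * x ** 2 * 2

def direct_second_derivative(t):
    """Second derivative of y = t^6 taken directly with respect to t."""
    return 30 * t ** 4

t = Fraction(3, 2)
print(naive_second_derivative(t))       # equals 24 t^4
print(chain_rule_second_derivative(t))  # equals 30 t^4
print(direct_second_derivative(t))      # equals 30 t^4
```

The naive product misses exactly the term (dy/dx)(d2x/dt2) = 6t^4, which is the difference between 24t^4 and 30t^4.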
3 A Few Notes on Differential Notation
Most calculus students gloss over the notation for higher derivatives, and few if any books bother to give any reasons behind what the notation means. It is important to go back and consider why the notation is what it is, and what the pieces are supposed to represent.

In modern calculus, the derivative is always taken with respect to some variable. However, this is not strictly required, as the differential operation can be used in a context-free manner. The processes of taking a differential and solving for a derivative (i.e., some ratio of differentials) can be separated out into logically separate operations.2

2 The idea that finding a differential (i.e., similar to a derivative, but not being with respect to any particular variable) can be separated from the operation of finding a derivative (i.e., differentiating with respect to some particular variable) is considered anathema by some, but this concept can be inferred directly from the activity of treating derivatives as fractions of differentials. The rules for taking a differential are identical to those for taking an implicit derivative, but simply leaving out dividing the final differential by the differential of the independent variable.
For those uncomfortable with taking a differential without a derivative (i.e., without specifying an independent variable), imagine the differential operator d() as combining the operations of taking an implicit derivative with respect to a non-present variable (such as q) followed by a multiplication by the differential of that variable (i.e., dq in this example). So, taking the differential of $e^x$ is written as $d(e^x)$ and the result of this operation is $e^x\,dx$. This is the same as if we had taken the derivative with respect to the non-present variable q and then multiplied by dq. So, for instance, taking the differential of the function $e^x$, the operation would start out with a derivative with respect to q, $\frac{d}{dq}(e^x) = e^x\frac{dx}{dq}$, followed by a multiplication by dq, yielding just $e^x\,dx$.
Doing this yields the standard set of differential rules, but allows them to be applied separately from (and prior to) a full derivative. Also note that because they have no dependency on any variable present in the equation, the rules work in the single-variable and multi-variable case. Solving for a derivative is then merely solving for a ratio of differentials that arise after performing the differential. It unifies explicit and implicit differentiation into a single process that is easier to teach, use, and understand, and requires few if any special cases, save the standard requirements of continuity and smoothness.

In such an operation, instead of doing $\frac{d}{dx}$ (taking the derivative with respect to the variable x), one would separate out performing the differential and dividing by dx as separate steps. Originally, in the Leibnizian conception of the differential, one did not even bother solving for derivatives, as they made little sense from the original geometric construction of them [5, pgs. 8, 59].

For a simple example, the differential of $x^3$ can be found using a basic differential operator such that $d(x^3) = 3x^2\,dx$. The derivative is simply the differential divided by dx. This would yield $\frac{d(x^3)}{dx} = 3x^2$.

For implicit derivatives, separating out taking the differential and finding a particular derivative greatly simplifies the process. Given an equation (say, $z^2 = \sin(q)$), the differential can be applied to both sides just like any other algebraic manipulation:

\[
\begin{aligned}
z^2 &= \sin(q) \\
d(z^2) &= d(\sin(q)) \\
2z\,dz &= \cos(q)\,dq
\end{aligned}
\]

From there, the equation can be manipulated to solve for $\frac{dz}{dq}$ or $\frac{dq}{dz}$, or it can just be left as-is.

The basic differential of a variable is normally written simply as $d(x) = dx$. In fact, dx can be viewed merely as a shorthand for d(x).

The second differential is merely the differential operator applied twice [5, pg. 17]:

\[
d(d(x)) = d(dx) = d^2x \tag{2}
\]

Therefore, the second differential of a function is merely the differential operator applied twice. However, one must be careful when doing this, as the product rule affects products of differentials as well.

For instance, $d(3x^2\,dx)$ will be found using the product rule, where $u = 3x^2$ and $v = dx$. In other words:

\[
\begin{aligned}
d(3x^2\,dx) &= 3x^2\,(d(dx)) + d(3x^2)\,dx \\
&= 3x^2\,d^2x + 6x\,dx\,dx \\
&= 3x^2\,d^2x + 6x\,dx^2
\end{aligned}
\]
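This product-rule step can be sanity-checked by parametrizing x with an auxiliary variable t whose differential dt is held constant. That parametrization is an assumption made only for this sketch, and the function names are illustrative. With x = t^2 we have dx = 2t dt and d^2x = 2 dt^2, so both sides become multiples of dt^2 that can be compared exactly.

```python
from fractions import Fraction

def lhs_coefficient(t):
    """Coefficient of dt^2 in d(3x^2 dx): here 3x^2 dx = 6 t^5 dt, whose differential is 30 t^4 dt^2."""
    return 30 * t ** 4

def rhs_coefficient(t):
    """Coefficient of dt^2 in 3x^2 d^2x + 6x dx^2, with dx = 2t dt and d^2x = 2 dt^2."""
    x, dx, d2x = t ** 2, 2 * t, 2
    return 3 * x ** 2 * d2x + 6 * x * dx ** 2

def rhs_without_second_differential(t):
    """Same right-hand side with the d^2x term dropped, for comparison."""
    x, dx = t ** 2, 2 * t
    return 6 * x * dx ** 2

for t in (Fraction(1, 2), Fraction(3), Fraction(-7, 5)):
    print(lhs_coefficient(t) == rhs_coefficient(t),
          lhs_coefficient(t) == rhs_without_second_differential(t))
```

Under these assumptions the two sides agree only when the $3x^2\,d^2x$ term is kept, which is the point of applying the product rule to products of differentials.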
The point of all of this is to realize that the notation $\frac{d^2y}{dx^2}$ is not some arbitrary arrangement of symbols, but has a deep (if, as will be shown, slightly incorrect or misleading) meaning. The notation means that the equation is showing the ratio of the second differential of y (i.e., d(d(y))) to the square of dx (i.e., $dx^2$).3

3 In Leibniz notation, $dx^2$ is equivalent to $(dx)^2$. If the differential of $x^2$ was wanted, it would be written as $d(x^2)$. The rules are given in [5, pg. 24].

In other words, starting with y, then applying the differential operator twice, and then dividing by dx twice, arrives at the result $\frac{d^2y}{dx^2}$. Unfortunately, that is not the same sequence of steps that happens when two derivatives are performed, and thus it leads to a faulty formulation of the second derivative.
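To see concretely that dividing the second differential by dx^2 is not the same as differentiating twice, take y = x^3 with x = t^2, treating t as an auxiliary variable with constant dt (an assumption of this sketch; the function names are illustrative). Then d^2y = 30t^4 dt^2 and dx^2 = 4t^2 dt^2, so the ratio of differentials is (15/2)t^2, whereas differentiating y = x^3 twice with respect to x gives 6x = 6t^2.

```python
from fractions import Fraction

def ratio_of_differentials(t):
    """(d^2 y) / (dx)^2 for y = t^6 and x = t^2, with dt held constant."""
    d2y = 30 * t ** 4            # coefficient of dt^2 in d(d(y))
    dx_squared = (2 * t) ** 2    # coefficient of dt^2 in (dx)^2
    return d2y / dx_squared

def derivative_taken_twice(t):
    """Second derivative of y = x^3 with respect to x, evaluated at x = t^2."""
    return 6 * t ** 2

t = Fraction(3, 2)
print(ratio_of_differentials(t), derivative_taken_twice(t))
```

The two values differ whenever d^2x is nonzero, that is, whenever x is not itself the variable whose differential is held constant.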
4 Extending the Second Derivative's Algebraic Manipulability
As a matter of fact, order of operations is very important when doing derivatives. When doing a derivative, one first takes the differential and then divides by dx. The second derivative is the derivative of the first, so the next differential occurs after the first derivative is complete, and the process finishes by dividing by dx again.

However, what does it look like to take the differential of the first derivative? Basic calculus rules tell us that the quotient rule should be used:

\[
\begin{aligned}
d\left(\frac{dy}{dx}\right) &= \frac{dx\,(d(dy)) - dy\,(d(dx))}{(dx)^2} \\
&= \frac{dx\,d^2y - dy\,d^2x}{dx^2} \\
&= \frac{dx\,d^2y}{dx^2} - \frac{dy\,d^2x}{dx^2} \\
&= \frac{dx}{dx}\frac{d^2y}{dx} - \frac{dy}{dx}\frac{d^2x}{dx} \\
&= \frac{d^2y}{dx} - \frac{dy}{dx}\frac{d^2x}{dx}
\end{aligned}
\]

Then, for the second step, this can be divided by dx, yielding:

\[
\frac{d\left(\frac{dy}{dx}\right)}{dx} = \frac{d^2y}{dx^2} - \frac{dy}{dx}\frac{d^2x}{dx^2} \tag{3}
\]

This, in fact, yields a notation for the second derivative which is equally algebraically manipulable as the first derivative. It is not very pretty or compact, but it works algebraically.

The chain rule for the second derivative fits this algebraic notation correctly, provided we replace each instance of the second derivative with its full form (cf. (1)):

\[
\frac{d^2y}{dt^2} - \frac{dy}{dt}\frac{d^2t}{dt^2} = \left(\frac{d^2y}{dx^2} - \frac{dy}{dx}\frac{d^2x}{dx^2}\right)\left(\frac{dx}{dt}\right)^2 + \frac{dy}{dx}\left(\frac{d^2x}{dt^2} - \frac{dx}{dt}\frac{d^2t}{dt^2}\right) \tag{4}
\]

This in fact works out perfectly algebraically.

One objection that has been given to the present authors by early reviewers about the formula presented in (3) is that the ratio $\frac{d^2x}{dx^2}$ reduces to zero. However, this is not necessarily true. The concern is that, since $\frac{dx}{dx}$ is always 1 (i.e., a constant), then $\frac{d^2x}{dx^2}$ should be zero. The problem with this concern is that we are no longer taking $\frac{d^2x}{dx^2}$ to be the derivative of $\frac{dx}{dx}$. Using the notation in (3), the derivative of $\frac{dx}{dx}$ would be:

\[
\frac{d\left(\frac{dx}{dx}\right)}{dx} = \frac{d^2x}{dx^2} - \frac{dx}{dx}\frac{d^2x}{dx^2} \tag{5}
\]

In this case, since $\frac{dx}{dx}$ reduces to 1, the expression is obviously zero. However, in (5), the term $\frac{d^2x}{dx^2}$ is not itself necessarily zero, since it is not the second derivative of x with respect to x.

5 The Notation for the Higher Order Derivatives
The notation for the third and higher derivatives can be found using the same techniques as for the second derivative. To find the third derivative of y with respect to x, one starts with the second derivative and takes the differential:

\[
\begin{aligned}
d\left(\frac{d\left(\frac{dy}{dx}\right)}{dx}\right) &= d\left(\frac{d^2y}{dx^2} - \frac{dy}{dx}\frac{d^2x}{dx^2}\right) \\
&= d\left(\frac{dx\,d^2y - dy\,d^2x}{dx^3}\right) \\
&= \frac{(dx^3)\,(d(dx\,d^2y - dy\,d^2x)) - (dx\,d^2y - dy\,d^2x)\,(d(dx^3))}{(dx^3)^2} \\
&= \frac{d^3y}{dx^2} - \frac{dy}{dx}\frac{d^3x}{dx^2} - 3\frac{d^2x}{dx^2}\frac{d^2y}{dx} + 3\frac{dy}{dx}\frac{(d^2x)^2}{dx^3}
\end{aligned}
\]

Finally, this result is divided by dx:

\[
\frac{d\left(\frac{d\left(\frac{dy}{dx}\right)}{dx}\right)}{dx} = \frac{d^3y}{dx^3} - \frac{dy}{dx}\frac{d^3x}{dx^3} - 3\frac{d^2x}{dx^2}\frac{d^2y}{dx^2} + 3\frac{dy}{dx}\frac{(d^2x)^2}{dx^4} \tag{6}
\]

This expression includes a lot of terms not normally seen, so some explanation is worthwhile. In this expression, $d^2x$ represents the second differential of x, or d(d(x)). Therefore, $(d^2x)^2$ represents $(d(d(x)))^2$. Likewise, $dx^4$ represents $(d(x))^4$.

Because the expanded notation for the second and higher derivatives is much more verbose than the first
derivative, it is often useful to adopt a slight modification of Arbogast's D notation (see [6, pgs. 209, 218–219]) for the total derivative instead of writing it as algebraic differentials:4

\[
D_x^2 y = \frac{d^2y}{dx^2} - \frac{dy}{dx}\frac{d^2x}{dx^2} \tag{7}
\]

\[
D_x^3 y = \frac{d^3y}{dx^3} - \frac{dy}{dx}\frac{d^3x}{dx^3} - 3\frac{d^2x}{dx^2}\frac{d^2y}{dx^2} + 3\frac{dy}{dx}\frac{(d^2x)^2}{dx^4} \tag{8}
\]

4 The difference between this notation and that of Arbogast is that we are subscripting the D with the variable with respect to which the derivative is being taken. Additionally, we are always supplying in the superscript the number of derivatives we are taking. Therefore, where Arbogast would write simply D, this notation would be written as $D_x^1$.

This gets even more important as the number of derivatives increases. Each one is more unwieldy than the previous one. However, each level can be interconverted into differential notation as follows:

\[
D_x^n y = \frac{d(D_x^{n-1} y)}{dx} \tag{9}
\]

The advantages of Arbogast's notation over Lagrangian notation are that (1) this modification of Arbogast's notation clearly specifies both the top and bottom differential, and (2) for very high order derivatives, Lagrangian notation takes up n superscript spaces to write for the nth derivative, while Arbogast's notation only takes up log(n) spaces.

Therefore, when a compact representation of higher order derivatives is needed, this paper will use Arbogast's notation for its clarity and succinctness.5

5 It may be surprising to find a paper on the algebraic notation of differentials using a non-algebraic notation. The goal, however, is to only use ratios when they act as ratios. When writing a ratio that works like a ratio is too cumbersome, we prefer simply avoiding the ratio notation altogether, to prevent making unwarranted leaps based on notation that may mislead the intuition.

6 Swapping the Independent and Dependent Variables
In fact, just as the algebraic manipulation of the first derivative can be used to convert the derivative of y with respect to x into the derivative of x with respect to y, combining it with Arbogast's notation for the second derivative can be used to generate the formula for doing this on the second derivative:

\[
\begin{aligned}
D_x^2 y &= \frac{d^2y}{dx^2} - \frac{dy}{dx}\frac{d^2x}{dx^2} \\
D_x^2 y\,\frac{dx^3}{dy^3} &= \frac{d^2y}{dx^2}\frac{dx^3}{dy^3} - \frac{dy}{dx}\frac{d^2x}{dx^2}\frac{dx^3}{dy^3} \\
D_x^2 y \left(\frac{dx}{dy}\right)^3 &= \frac{d^2y}{dy^2}\frac{dx}{dy} - \frac{d^2x}{dy^2} \\
-D_x^2 y \left(\frac{dx}{dy}\right)^3 &= \frac{d^2x}{dy^2} - \frac{dx}{dy}\frac{d^2y}{dy^2} \\
-D_x^2 y \left(\frac{1}{\frac{dy}{dx}}\right)^3 &= \frac{d^2x}{dy^2} - \frac{dx}{dy}\frac{d^2y}{dy^2} \\
-D_x^2 y \left(\frac{1}{D_x^1 y}\right)^3 &= \frac{d^2x}{dy^2} - \frac{dx}{dy}\frac{d^2y}{dy^2}
\end{aligned}
\]

It can be seen that the right-hand side of this final equation is the second derivative of x with respect to y, written in the form of (7) with the roles of x and y exchanged. Therefore, it can generally be stated that the second derivative of y with respect to x can be transformed into the second derivative of x with respect to y with the following formula:

\[
-D_x^2 y \left(\frac{1}{D_x^1 y}\right)^3 = D_y^2 x \tag{10}
\]

To see this formula in action on a simple equation, consider $y = x^3$. Performing two derivatives gives us:

\[
y = x^3 \tag{11}
\]

\[
D_x^1 y = 3x^2 \tag{12}
\]

\[
D_x^2 y = 6x \tag{13}
\]

According to (10), $D_y^2 x$ (or, $x''$ in Lagrangian notation) can be found by performing the following:

\[
D_y^2 x = -(6x)\left(\frac{1}{3x^2}\right)^3 = \frac{-6x}{27x^6} = -\frac{2}{9}x^{-5} \tag{14}
\]

This can be checked by taking successive derivatives of the inverse function of (11):

\[
\begin{aligned}
x &= y^{\frac{1}{3}} \\
D_y^1 x &= \frac{1}{3}y^{-\frac{2}{3}} \\
D_y^2 x &= -\frac{2}{9}y^{-\frac{5}{3}}
\end{aligned} \tag{15}
\]

(15) can be seen to be equivalent to (14) by substituting for y using (11):

\[
D_y^2 x = -\frac{2}{9}(x^3)^{-\frac{5}{3}} = -\frac{2}{9}x^{-5} \tag{16}
\]
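The agreement between (14) and (15)–(16) can also be confirmed numerically. The following is a minimal sketch for y = x^3 at an arbitrarily chosen positive x; the helper name and the sample point are illustrative choices, not part of the paper.

```python
import math

def second_derivative_of_inverse(d1, d2):
    """Inversion formula (10): D_y^2 x = -D_x^2 y * (1 / D_x^1 y)**3."""
    return -d2 * (1.0 / d1) ** 3

x = 1.7                      # arbitrary positive sample point
y = x ** 3                   # (11)
d1 = 3 * x ** 2              # (12): D_x^1 y
d2 = 6 * x                   # (13): D_x^2 y

from_formula = second_derivative_of_inverse(d1, d2)     # (14)
from_inverse = -(2.0 / 9.0) * y ** (-5.0 / 3.0)         # (15): x = y**(1/3) differentiated twice

print(from_formula, from_inverse)
print(math.isclose(from_formula, from_inverse, rel_tol=1e-12))
```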
This is the same result achieved by using the inversion formula (cf. (10)).

7 Using the Inversion Formula for the Second Derivative
While the inversion formula (cf. (10)) is not original, it is a tool that many mathematicians are unaware of, and is rarely considered for solving higher-order differentials.6

6 The authors of this paper, as well as several early reviewers, had originally thought that the inversion formula was a new finding. Again, that is the usefulness of the notation. Specific formulas such as the inversion formula do not need to be taught, as they simply flow naturally out of the notation. Even though the inversion formula is not new with this paper, showing how the present authors were able to use it to good benefit demonstrates the benefit of an improved notation: practitioners need not memorize endless formulas, but they can be developed straightforwardly as needed based upon basic intuitions.

As an example of how to apply (10), consider second order ordinary nonlinear differential equations of the form

\[
F(y'', y', y) = 0.
\]

Equations of this form can be solved implicitly for

\[
F(a, b, c) = a - b^3 f(c)
\]

for a generic function f. Indeed, consider the equation

\[
D_x^2 y = f(y)\left(\frac{dy}{dx}\right)^3. \tag{17}
\]

Then, by virtue of (10) we derive

\[
D_y^2 x = -f(y).
\]

Integration of this equation with respect to y twice will provide

\[
x(y) = -\iint f(y)\,dy\,dy. \tag{18}
\]

For simplicity, let

\[
f(y) = y,
\]

so that (17) is reduced to

\[
D_x^2 y = y\left(\frac{dy}{dx}\right)^3,
\]

the real exact solution of which is

\[
y(x) = \frac{6\sqrt[3]{2}\,c_1}{\sqrt[3]{162(x + c_2) + \sqrt{23328c_1^3 + [162(x + c_2)]^2}}} - \frac{\sqrt[3]{162(x + c_2) + \sqrt{23328c_1^3 + [162(x + c_2)]^2}}}{3\sqrt[3]{2}} \tag{19}
\]

Here $c_1$ and $c_2$ are integration constants that must be determined from given boundary or Cauchy conditions. On the other hand, (18) results in

\[
x(y) = -\frac{y^3}{6} + c_1 y + c_2,
\]

the real inverse of which exactly coincides with (19), after relabeling the arbitrary constants.

8 Relationship to Historic Leibnizian Thought
The view of differentials presented by Leibniz and those following in his footsteps differed significantly from the modern-day view of calculus. The modern view of calculus focuses on functions, which have defined independent and dependent variables. The Leibniz view, however, according to [5], is a much more geometric view. There is no preferred independent or dependent variable.

The modern concept of the derivative generally implies a dependent and an independent variable. The numerator is the dependent variable and the denominator is the independent variable. In the geometric view, however, there are only relationships, and these relationships do not necessarily have an implied dependency relationship.

Therefore, Leibnizian differentiation doesn't occur with respect to any independent variable. There is no preferred independent variable. Likewise, as we have seen in Sections 6 and 7, the version of the differential presented here allows for the reversal of variable dependency relationships. Similarly, the procedure of differentiation given in Section 3, which allows us to formulate the new notation for the second derivative given in (3), follows the Leibnizian methodology, where the differentiation is done mechanically without considering variable dependencies.
Leibniz did, however, consider certain kinds of variables which map very directly to what we would consider as “independent” variables. In the Leibniz conception, what we would consider an “independent” variable is a variable whose first differential is considered constant. This leads to numerous simplifications of differentials because, if a differential is constant, by standard differential rules its differential is zero. Therefore, if x is the independent variable (using modern terminology) then that implies that dx is constant. If dx is a constant (even if it is an infinitely small, unknowable constant), then that means that its differential is zero. Therefore, $d^2x$ and higher differentials of x reduce to zero, simplifying the equation.7

7 As a way of understanding this, imagine the common independent variable used in physics, especially prior to relativity: time. Especially consider the way that time flows in a prerelativistic era. It flows in a continual, constant fashion. Therefore, if the flow of time (i.e., dt) is constant, then by the rules of differentiation the second differential of time must be zero. Thus, an independent variable is one which acts in a similar fashion to time. Another way to consider this is to consider the independence of the independent variable. Its changes (i.e., differences) are, by definition, independent of anything else. Therefore, we may not assign a rule about the differences between the values. Thus, because there is no valid rule, the second differential may not be zero, but it is at most undefinable by definition.

As an example, given the equation

\[
xy = 3
\]

the first differential of this would be given by

\[
x\,dy + y\,dx = 0
\]

and the second differential of this would be given by

\[
x\,d^2y + 2\,dx\,dy + y\,d^2x = 0.
\]

Then, you could simplify the equation by choosing any single differential to hold constant. This is referred to in Leibnizian thought as choosing a “progression of variables,” and it is identical to choosing an independent variable [5, pg. 71]. Therefore, if one chooses x as the independent variable, then dx is constant, and therefore $d^2x = 0$. Thus, the equation reduces to

\[
x\,d^2y + 2\,dx\,dy = 0.
\]

However, if y is the independent variable, then dy is held constant and therefore its differential, $d^2y = 0$. This leads to the equation

\[
2\,dx\,dy + y\,d^2x = 0.
\]

This understanding explains the success of the modern notation of the second derivative. The notation given in (3) is

\[
D_x^2 y = \frac{d^2y}{dx^2} - \frac{dy}{dx}\frac{d^2x}{dx^2}.
\]

However, if we assume that x is truly the independent variable, then this means that $d^2x = 0$ and therefore the whole expression $\frac{dy}{dx}\frac{d^2x}{dx^2}$ reduces to 0 as well. This reduces (3) to the modern notation of $\frac{d^2y}{dx^2}$. Additionally, if we take the assumption that x is the independent variable, then the problems identified in Section 2 disappear, because x, as an independent variable, cannot then be dependent on t.8

8 To be clear, there is nothing preventing someone from making an independent variable dependent on a parameter. However, doing so then brings them around to needing to use the form of the second derivative defined here (which does not presume a particular choice of independent variable), or a compensating mechanism such as Faà di Bruno's formula.

In addition to (3) being reducible to $\frac{d^2y}{dx^2}$ under the assumption that x is the independent variable, the Leibnizian view also gives a set of tools that allows us to reinflate instances of $\frac{d^2y}{dx^2}$ into (3). Euler showed that, given an equation from a specific “progression of variables” (i.e., a particular choice of an independent variable), we can modify that equation in order to see what it would have been if no choice of independent variable had been made. According to [5, pg. 75], as the substitution for reinflating a differential from a particular progression of variables (i.e., a particular independent variable) into one that is independent of the progression of variables (i.e., no independent variable chosen), an expansion practically identical to (3) can be used.

9 Future Work
The notation presented here provides for a vast improvement in the ability for higher order differentials to be manipulated algebraically.

This improved notation yields several potential areas for study. These include:
1. developing a general formula for the algebraic expansion of higher derivatives,
2. identifying additional second order differential equations that are solvable by swapping the dependent and independent variable,
3. finding other ways that differential equations can be rendered solvable using insights from the new notation,
4. finding further reductions in special formulas that can be rendered by using algebraically manipulable notations,
5. extending this project to allow partial differentials to be algebraically manipulable.
10 Acknowledgements
The authors would like to thank Aleks Kleyn, Chris
Burba, Daniel Lichtblau, George Montañez, and others
who read early versions of this manuscript and provided
important feedback and suggestions.
References
[1] E. W. Swokowski, Calculus with Analytic Geometry. PWS Publishers, alternate ed., 1983.
[2] W. P. Johnson, “The curious history of Faà di Bruno's formula,” American Mathematical Monthly, pp. 217–234, 2002.
[3] J. M. Henle and E. M. Kleinberg, Infinitesimal Calculus. Dover Publications, 1979.
[4] H. J. Keisler, Elementary Calculus: An Infinitesimal Approach. PWS Publishers, second ed., 1985.
[5] H. J. M. Bos, “Differentials, higher-order differentials, and the derivative in the Leibnizian calculus,” Archive for History of Exact Sciences, vol. 14, no. 1, pp. 1–90, 1974.
[6] F. Cajori, A History of Mathematical Notations, Volume II. Open Court Publishing, 1929.