Conservative Vector Fields And Potential Functions

Cristian Mesiano
May 9

Last time, we looked at Fréchet differentiability: simply put, it's how we generalize the idea of a derivative from ℝ to multiple dimensions.

In this post, the focus is on a crucial concept that is used widely in both classical and quantum physics: conservative vector fields. The deep connection between conservative vector fields and potential functions hinges on Schwarz's Theorem.

Partial/Total Differentiability

It's worth highlighting that f(x_0 + h) = f(x_0) + l(h) + o(h). Geometrically, we are saying that the increment of the function is approximated by a linear function of h.

A function f is totally differentiable at x_0 if it admits such a linear map l(h), plus an error that goes to zero faster than |h| as h -> 0.

Foglietto nr. 2: Totally differentiable function, an example

Pay attention to the example shown in Foglietto nr. 2: the differential l(h) of a function f: ℝ^2->ℝ^3 is a 3x1 vector, and it gives the best linear approximation of the function near x_0.

Wait a minute: an approximation implies introducing an error. What happens to the error in the neighborhood of x_0 is what matters! If the error in approximating f(x_0+h) doesn't go to zero faster than |h|, then the closer we get to x_0, the more significant the error becomes relative to the step.

Foglietto nr.3: analysing the Error behaviour

Foglietto nr. 3 analyses the error behaviour. You can reach the same conclusions by writing h in polar form!
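
The error analysis can also be tried numerically. The sketch below is only an illustration: the map f: ℝ^2->ℝ^3, its hand-written Jacobian and the point x_0 are assumptions made for this example, not the function of Foglietto nr. 2. It measures the ratio between the approximation error and |h| for smaller and smaller steps.

```python
import math

def f(x, y):
    # Hypothetical smooth map R^2 -> R^3, chosen only for illustration.
    return (x * y, x + y * y, math.sin(x))

def jacobian(x, y):
    # 3x2 matrix of the partial derivatives of f, computed by hand.
    return [[y, x],
            [1.0, 2.0 * y],
            [math.cos(x), 0.0]]

def error_ratio(x0, y0, h1, h2):
    # |f(x0 + h) - f(x0) - l(h)| / |h|, with l(h) = J(x0) h.
    J = jacobian(x0, y0)
    l_h = [row[0] * h1 + row[1] * h2 for row in J]
    f0 = f(x0, y0)
    f1 = f(x0 + h1, y0 + h2)
    err = math.sqrt(sum((a - b - c) ** 2 for a, b, c in zip(f1, f0, l_h)))
    return err / math.hypot(h1, h2)

x0, y0 = 1.0, 2.0
steps = (1e-1, 1e-2, 1e-3)
ratios = [error_ratio(x0, y0, s, s) for s in steps]
for s, r in zip(steps, ratios):
    print(f"step {s:g}: error / |h| = {r:.2e}")
```

For this f the ratio shrinks roughly in proportion to the step, which is what "the error goes to zero faster than |h|" means in practice.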

From Pointwise Differentiability to Differentiability on Open Sets

So far, we have strictly confined the analysis of differentiability to specific points in the function's domain. Extending this to open sets is straightforward as shown in Foglietto nr 4.

This operation extends the range of applicability of differentiability from a single point to a whole set.

This is useful because we can then apply the Mean Value Theorem along a segment contained in the set.

Foglietto nr 4: Total differentiability on open sets

We'll leverage it during the proof of the following theorem.

Total differentiability theorem

The theorem gives us a powerful tool to approximate (locally) a multivariable function through hyperplanes.

Furthermore, it provides conditions that guarantee the continuity of the function.

Foglietto nr 5: Total Differentiability Theorem

Differentiability in multivariable calculus isn't just about "having a derivative"; it’s about the existence of a linear approximation that fits the surface perfectly at a point.

If the conditions of the theorem are met, we can approximate the function locally using the equation of a tangent hyperplane.

Partial derivatives: Tell you how the function changes along the axes.

Total differentiability: Tells you how the function changes in any direction, ensuring the "local flatness" required for physics and optimization.
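
The gap between the two notions can be seen with a classic counterexample, sketched below in Python. The function is an assumption chosen for illustration (it does not come from the Foglietti): both partial derivatives exist at the origin, yet no linear approximation works there.

```python
import math

def f(x, y):
    # Classic counterexample: f = x*y / (x^2 + y^2), extended by 0 at the origin.
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / (x * x + y * y)

def partial_x_at_origin(h):
    # Difference quotient along the x axis.
    return (f(h, 0.0) - f(0.0, 0.0)) / h

def partial_y_at_origin(h):
    # Difference quotient along the y axis.
    return (f(0.0, h) - f(0.0, 0.0)) / h

def diagonal_error_ratio(s):
    # Both partials vanish, so the only candidate differential is l(h) = 0.
    # The ratio error / |h| is measured along the diagonal h = (s, s).
    return abs(f(s, s) - f(0.0, 0.0)) / math.hypot(s, s)

px = partial_x_at_origin(1e-6)
py = partial_y_at_origin(1e-6)
ratios = [diagonal_error_ratio(s) for s in (1e-1, 1e-2, 1e-3)]
print(px, py)    # the function is identically 0 on both axes
print(ratios)    # grows as s shrinks instead of going to 0
```

Along the axes everything looks flat, but along the diagonal the function stays at 1/2 arbitrarily close to the origin, so the error ratio blows up.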

The conditions are shown in Foglietto nr 5.

Foglietto nr 6: Total differentiability scheme of proof.

The scheme of the proof is shown in Foglietto nr 6: it's not fully formal, but I hope it conveys the rationale behind it.

The reason for extending the theorem to open sets becomes clearer in the proof scheme, because it allows us to use the Lagrange theorem (the Mean Value Theorem)!

The Chain Rule

Have you ever wondered how a neural network actually works? One of the cornerstones of the learning systems widely used in LLMs, for example, is the chain rule. Weight corrections for every layer of the network are calculated precisely using the chain rule, and what's more, it happens across all layers in one fell swoop. How? Let's take a look at Foglietto nr. 7.

You can visualize the structure of the Jacobian of a composite function by thinking of the differentials as matrix operators. From this perspective, the chain rule boils down to multiplying the Jacobian matrices, and the associative property of matrix multiplication lets us group the products in whichever order is most convenient.

Back to the example of neural networks: the linearized actions of successive layers compose simply by matrix multiplication!

Let's dive deeper into the calculus. The proof in Foglietto nr. 8 shows exactly how the infinitesimals behave: notice how the higher-order errors collapse into a single o(h), leaving the Jacobian multiplication as the core of the composite derivative.
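
A minimal Python sketch of this statement: the Jacobian of a composite function, estimated by finite differences, is compared with the product of the two Jacobians. The maps f and g and the evaluation point are illustrative assumptions, and the Jacobians are written by hand.

```python
import math

def g(x, y):
    # Hypothetical inner map R^2 -> R^2.
    return (x * y, x + y)

def f(u, v):
    # Hypothetical outer map R^2 -> R^2.
    return (u * u + v, math.sin(u * v))

def jac_g(x, y):
    return [[y, x], [1.0, 1.0]]

def jac_f(u, v):
    c = math.cos(u * v)
    return [[2.0 * u, 1.0], [v * c, u * c]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def numeric_jacobian(func, x, y, eps=1e-6):
    # Central differences, one column per input variable.
    fxp, fxm = func(x + eps, y), func(x - eps, y)
    fyp, fym = func(x, y + eps), func(x, y - eps)
    return [[(fxp[i] - fxm[i]) / (2 * eps), (fyp[i] - fym[i]) / (2 * eps)]
            for i in range(len(fxp))]

x0, y0 = 0.5, -1.5
chain = matmul(jac_f(*g(x0, y0)), jac_g(x0, y0))   # J_f(g(x0)) * J_g(x0)
numeric = numeric_jacobian(lambda x, y: f(*g(x, y)), x0, y0)
max_gap = max(abs(chain[i][j] - numeric[i][j]) for i in range(2) for j in range(2))
print(chain)
print(numeric)
print(max_gap)
```

Note that J_f is evaluated at g(x0), not at x0: this is the detail that is easiest to get wrong when applying the chain rule by hand.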

From Jacobian to Gradient

The definition of the gradient, per se, doesn't justify its fame: in the end it's just the transpose of the Jacobian of a scalar-valued function. But consider the composition of functions shown in Foglietto nr. 9, and its power shines immediately!

Let's make it clear with an example:

Let f: ℝ^2->ℝ be the temperature measured by a probe, taking the coordinates (x,y) as input.
Let γ: ℝ->ℝ^2 be the function that gives the position of the probe at time t.
Then f(γ(t)) represents the temperature measured by the probe at time t.

The Jacobian J_f∘γ evaluated at t is simply the scalar product (inner product) between the gradient of f evaluated at γ(t) and the Jacobian of γ evaluated at t. Let's dive in and see how to interpret it!

From a physical point of view, the Jacobian of γ represents the velocity of the probe!
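
The probe example can be sketched in a few lines of Python. The temperature field and the trajectory below are invented for illustration; the point is only that the rate of change of the measured temperature equals the inner product between the gradient and the velocity.

```python
import math

def temperature(x, y):
    # Hypothetical temperature field f: R^2 -> R.
    return 20.0 + x * x + 3.0 * y

def grad_temperature(x, y):
    # Gradient of the field above, computed by hand.
    return (2.0 * x, 3.0)

def position(t):
    # Hypothetical trajectory gamma: R -> R^2 of the probe.
    return (math.cos(t), t * t)

def velocity(t):
    # Jacobian of gamma, i.e. the velocity of the probe.
    return (-math.sin(t), 2.0 * t)

def rate_by_chain_rule(t):
    gx, gy = grad_temperature(*position(t))
    vx, vy = velocity(t)
    return gx * vx + gy * vy

def rate_by_finite_difference(t, eps=1e-6):
    # Central difference of t -> f(gamma(t)).
    return (temperature(*position(t + eps)) - temperature(*position(t - eps))) / (2 * eps)

t0 = 0.7
chain_rate = rate_by_chain_rule(t0)
numeric_rate = rate_by_finite_difference(t0)
print(chain_rate, numeric_rate)
```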

Look at the contour lines in the chart (Foglietto nr. 10). The set f(.) = c is:

the set of points having the same temperature;
the image of γ, i.e. the set of points visited by the probe.

Since c is a constant --> the derivative of f(γ(t)) = 0 --> the inner product = 0 --> the gradient of the temperature and the Jacobian of γ are orthogonal:

The gradient vector is always perpendicular to the contour line.
The Jacobian of γ is tangent to the contour line.
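
A quick numerical check of this orthogonality, under illustrative assumptions: f(x,y) = x^2 + 4y^2 plays the role of the temperature, and γ(t) = (2 cos t, sin t) travels along its contour line f = 4.

```python
import math

def f(x, y):
    # Hypothetical scalar field whose contour lines are ellipses.
    return x * x + 4.0 * y * y

def grad_f(x, y):
    return (2.0 * x, 8.0 * y)

def gamma(t):
    # Parametrizes the contour line f = 4.
    return (2.0 * math.cos(t), math.sin(t))

def gamma_velocity(t):
    # Jacobian of gamma: tangent vector to the contour line.
    return (-2.0 * math.sin(t), math.cos(t))

samples = [0.1 * k for k in range(63)]
levels = [f(*gamma(t)) for t in samples]
dots = []
for t in samples:
    gx, gy = grad_f(*gamma(t))
    vx, vy = gamma_velocity(t)
    dots.append(gx * vx + gy * vy)

level_drift = max(abs(c - 4.0) for c in levels)
max_dot = max(abs(d) for d in dots)
print(level_drift)   # gamma never leaves the contour line
print(max_dot)       # gradient and tangent stay orthogonal
```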

The local linearisation of a function f: ℝ^2->ℝ can be easily visualised as a plane:

https://video.wixstatic.com/video/a0f8f4_96c6d86fbd5a47049b4b516ce1290a62/1080p/mp4/file.mp4

Video nr 1: linearisation example

Directional derivatives

We have seen that the Jacobian of γ is tangent to the level sets (or contour lines), while the gradient ∇f is orthogonal to them. This naturally raises a question: is it possible to calculate derivatives in other, arbitrary directions?

The answer is YES it is!

Foglietto nr 11: Directional Derivatives

Ultimately, it all comes down to an increment h applied to x0 along a specific direction defined by a unit vector v. The increment of the function is then simply f(x0 + h*v) - f(x0).

Dividing by h and taking the limit as h -> 0, we obtain the rate of change specifically along that direction.

The link with ∇

To exploit the link between the gradient and the directional derivatives, it's worth introducing a couple of substitutions into the definition of the directional derivative:

g(t) = f(x0 + t*v)
γ(t) = x0 + t*v

The first substitution shows that the directional derivative is nothing but an ordinary one-variable derivative along a given line: we are "slicing" the high-dimensional surface of f with a vertical plane that passes through x0 and follows the direction v:

The path x0 + t*v: this is the line we follow. We start at x0 and move in a straight line in the direction v.

The second substitution extends this equivalence to all values of t: the slice is the composition f(γ(t)), to which the chain rule applies.

Pay attention to the last step: in the end, the directional derivative is nothing but the inner product between the gradient of f and the unit vector v! (Again, look at the beauty of the chain rule in action :) )
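
The equivalence can be checked with a short Python sketch. The function f, the point x0 and the direction v are illustrative assumptions; the limit is approximated with a central difference quotient rather than the one-sided quotient of the definition.

```python
import math

def f(x, y):
    # Hypothetical scalar function of two variables.
    return x * x * y + math.sin(y)

def grad_f(x, y):
    # Gradient of f, computed by hand.
    return (2.0 * x * y, x * x + math.cos(y))

def directional_derivative(x0, y0, v, h=1e-6):
    # Central difference quotient of t -> f(x0 + t*v) at t = 0.
    vx, vy = v
    return (f(x0 + h * vx, y0 + h * vy) - f(x0 - h * vx, y0 - h * vy)) / (2 * h)

x0, y0 = 1.0, 0.5
v = (0.6, 0.8)                       # unit vector: 0.36 + 0.64 = 1
gx, gy = grad_f(x0, y0)
by_limit = directional_derivative(x0, y0, v)
by_gradient = gx * v[0] + gy * v[1]  # inner product grad f . v
along_e1 = directional_derivative(x0, y0, (1.0, 0.0))
print(by_limit, by_gradient)
print(along_e1, gx)                  # direction e_1 recovers df/dx
```

The last line anticipates a special case: choosing v as a standard basis vector returns the corresponding partial derivative.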

Foglietto nr.12: Directional derivatives, convenient notation

To wrap up, we can observe that the partial derivatives are just a special case of directional derivatives. By choosing the standard basis vectors e_i, we project the gradient onto the axes as shown in Foglietto nr 12.

This highlights a crucial distinction: while a partial derivative is a scalar representing the slope along one axis, the gradient is the vector that collects all these slopes, providing the complete orientation of the field in space.

The Fastest Increase

Foglietto nr 13: The gradient is the direction hab

When we optimise a loss function, looking for a (local) minimum or maximum, we usually follow the direction given by the gradient (or its opposite, when minimising). The reason is simple: the gradient points in the direction of fastest increase of the function, and its opposite in the direction of fastest decrease. The proof shown in Foglietto nr 13 is a direct consequence of the definition of the inner product.
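
A brute-force sketch of the same fact in Python: the rate of change ∇f · v is evaluated over a fine grid of unit vectors, and the best direction is compared with the direction of the gradient. The function and the point are assumptions made for this example.

```python
import math

def grad_f(x, y):
    # Gradient of the illustrative function f(x, y) = x^2 * y + y^3.
    return (2.0 * x * y, x * x + 3.0 * y * y)

gx, gy = grad_f(1.0, 2.0)
norm = math.hypot(gx, gy)

best_angle, best_rate = None, -math.inf
for k in range(3600):
    angle = 2.0 * math.pi * k / 3600
    # Directional derivative along v = (cos angle, sin angle).
    rate = gx * math.cos(angle) + gy * math.sin(angle)
    if rate > best_rate:
        best_angle, best_rate = angle, rate

gradient_angle = math.atan2(gy, gx)
print(best_rate, norm)              # the largest rate is close to |grad f|
print(best_angle, gradient_angle)   # reached in the direction of grad f
```

Since ∇f · v = |∇f| cos θ for a unit vector v, the maximum is attained at θ = 0, up to the resolution of the grid.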

Schwarz's Theorem

This is by far one of the most important theorems in multivariable calculus, since it is the backbone of physical and mathematical frameworks like conservative fields and Lagrangian mechanics.

We already encountered in this post the requirement of continuity of the partial derivatives: here too it comes from a proof scheme based on the Lagrange theorem.

Foglietto nr 15: Schwarz's Theorem Proof

The basic scheme of the proof rests on the idea of considering an infinitesimal rectangular patch of the function, with sides h1 and h2:

1. We hold x2 constant and consider x1 as the sole variable. Since f is differentiable on the patch, we can rightfully apply the Lagrange Theorem: there must exist ξ1 between 0 and h1 for which the difference quotient [u(h1,h2)-u(0,h2)]/h1 equals the derivative with respect to x1 of u(ξ1,h2).
2. The next step is to apply the Lagrange Theorem again, this time with respect to x2! Now ξ1 is fixed, and the intermediate point ξ2 lies between 0 and h2.

Let's repeat steps 1. and 2. with the roles swapped, holding x1 constant first and taking x2 as the variable, which yields the intermediate points γ1, γ2. Both routes express the same quantity, so the two results must coincide when h1,h2 tend to zero.

If h1,h2 tend to zero, then the intermediate points ξ1,ξ2,γ1,γ2 tend to zero too, since the intervals that contain them shrink to zero!

The continuity of the second-order partial derivatives then yields the final result: the two mixed derivatives are equal.
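
The two applications of the Lagrange theorem can be written out as the following sketch. As assumptions to keep the notation light, the base point of the patch is placed at the origin, and the auxiliary functions u and w and the quantity Δ are defined here; Foglietto nr. 15 may use different names.

```latex
\[
\begin{aligned}
\Delta(h_1,h_2) &:= f(h_1,h_2) - f(h_1,0) - f(0,h_2) + f(0,0) \\[6pt]
u(x_1,h_2) &:= f(x_1,h_2) - f(x_1,0) \\
\Delta &= u(h_1,h_2) - u(0,h_2)
        = h_1\,\partial_{x_1} u(\xi_1,h_2) \\
       &= h_1\bigl[\partial_{x_1} f(\xi_1,h_2) - \partial_{x_1} f(\xi_1,0)\bigr]
        = h_1 h_2\,\partial_{x_2}\partial_{x_1} f(\xi_1,\xi_2) \\[6pt]
w(h_1,x_2) &:= f(h_1,x_2) - f(0,x_2) \\
\Delta &= w(h_1,h_2) - w(h_1,0)
        = h_2\,\partial_{x_2} w(h_1,\gamma_2) \\
       &= h_2\bigl[\partial_{x_2} f(h_1,\gamma_2) - \partial_{x_2} f(0,\gamma_2)\bigr]
        = h_1 h_2\,\partial_{x_1}\partial_{x_2} f(\gamma_1,\gamma_2) \\[6pt]
\partial_{x_2}\partial_{x_1} f(\xi_1,\xi_2)
       &= \partial_{x_1}\partial_{x_2} f(\gamma_1,\gamma_2),
\qquad 0<\xi_1,\gamma_1<h_1,\quad 0<\xi_2,\gamma_2<h_2
\end{aligned}
\]
```

Letting h1,h2 tend to zero and using the continuity of the mixed derivatives at the origin, both sides converge to the mixed derivatives evaluated at (0,0), which are therefore equal.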

Foglietto nr.15 gives you a step by step formalization.
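
As a numerical sanity check of the theorem, the Python sketch below starts from the two first-order partial derivatives of an illustrative function (written by hand, an assumption of this example) and differentiates each of them numerically in the other variable.

```python
import math

def f_x(x, y):
    # Hand-computed df/dx of the illustrative f(x, y) = x^3 * y^2 + sin(x * y).
    return 3.0 * x * x * y * y + y * math.cos(x * y)

def f_y(x, y):
    # Hand-computed df/dy of the same f.
    return 2.0 * x ** 3 * y + x * math.cos(x * y)

def d_dx(func, x, y, eps=1e-5):
    # Central difference in the first variable.
    return (func(x + eps, y) - func(x - eps, y)) / (2 * eps)

def d_dy(func, x, y, eps=1e-5):
    # Central difference in the second variable.
    return (func(x, y + eps) - func(x, y - eps)) / (2 * eps)

x0, y0 = 0.8, -1.3
f_xy = d_dy(f_x, x0, y0)   # differentiate in x first, then in y
f_yx = d_dx(f_y, x0, y0)   # differentiate in y first, then in x
print(f_xy, f_yx)
```

The two values agree up to the accuracy of the finite differences, as expected for a function whose second derivatives are continuous.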

Conservative Vector Fields And Potential Functions

We are almost at the end of this post's journey, and we now have all the foundations to justify a concept that is extremely important in both classical and quantum physics: conservative fields and their tight relationship with the corresponding potential functions.

Look at the beauty of the definitions: the necessary conditions arise naturally as a consequence of Schwarz's theorem and the definition of the gradient!

Foglietto nr. 16: Conservative Vector Fields & Potential Function

Well, the theorem gives us the necessary conditions, but it doesn't tell us how to find the potential function associated with a conservative vector field.
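
The necessary condition itself is easy to test numerically. In the Python sketch below both fields are hypothetical examples, and the convention V = ∇U is an assumption (physics texts often use the opposite sign): the first field is a gradient, the second circulates around the origin and has no potential.

```python
def conservative_field(x, y):
    # Hypothetical field, equal to the gradient of U(x, y) = x^2 * y + y^3.
    return (2.0 * x * y, x * x + 3.0 * y * y)

def rotational_field(x, y):
    # Hypothetical field that circulates around the origin.
    return (-y, x)

def cross_derivative_gap(field, x, y, eps=1e-5):
    # dV1/dy - dV2/dx, estimated with central differences.
    dv1_dy = (field(x, y + eps)[0] - field(x, y - eps)[0]) / (2 * eps)
    dv2_dx = (field(x + eps, y)[1] - field(x - eps, y)[1]) / (2 * eps)
    return dv1_dy - dv2_dx

points = [(0.3, -1.2), (1.0, 2.0), (-0.7, 0.4)]
gaps_conservative = [cross_derivative_gap(conservative_field, x, y) for x, y in points]
gaps_rotational = [cross_derivative_gap(rotational_field, x, y) for x, y in points]
print(gaps_conservative)   # all close to 0
print(gaps_rotational)     # all close to -2
```

A vanishing gap does not build the potential for us; it only tells us that looking for one is not hopeless.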

Foglietto nr 17: Find the Potential Function of V

Simply integrating the components of a conservative vector field can hide some pitfalls whose solutions are not as obvious as one might think. Look at the example shown in Foglietto nr 17 and focus on the integration constants:

Integrating the first component of V with respect to x can be visualised as moving along the curve obtained by cutting the surface with a plane at a fixed y. The constant clearly depends on which y has been chosen -> c depends on y.
The same reasoning applies if we integrate the second component with respect to y: we obtain a constant depending on x.
If the two integrations produce mixed terms (terms containing both x and y), they must coincide. This is a direct consequence of the condition ∂V_1/∂y = ∂V_2/∂x at (x0,y0), where V = (V_1, V_2), which follows from Schwarz's theorem.
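
A short worked example may help with the bookkeeping of the constants. The field below is a hypothetical one chosen for illustration (it is not the field of Foglietto nr 17), and the convention V = ∇U is an assumption.

```latex
\[
\begin{aligned}
V(x,y) &= (V_1, V_2) = \bigl(2xy + 1,\; x^2 + 3y^2\bigr),
\qquad
\frac{\partial V_1}{\partial y} = 2x = \frac{\partial V_2}{\partial x} \\[6pt]
\int V_1\,dx &= x^2 y + x + c_1(y) \\
\int V_2\,dy &= x^2 y + y^3 + c_2(x) \\[6pt]
c_1(y) &= y^3 + C, \qquad c_2(x) = x + C \\[6pt]
U(x,y) &= x^2 y + x + y^3 + C,
\qquad
\nabla U = \bigl(2xy + 1,\; x^2 + 3y^2\bigr) = V
\end{aligned}
\]
```

The mixed term x^2 y appears in both integrals and coincides, as required. The leftover terms fix the "constants": y^3 can only live in c_1(y) and x can only live in c_2(x), so simply adding the two integrals together would count x^2 y twice.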

The example discussed above is represented in Image 1: notice the section that corresponds to integrating with respect to one variable.