By the way, if you find my note hard to read, I recommend that you read Ballmann's notes, whose link is in my post.
The idea is as follows: Although holonomy is usually described by an embedded square, I find it easier to use an embedded disk parameterized by a square. Given a point $p$ in a Riemannian manifold $M$ and a topologically trivial loop $c: [0,1]\rightarrow M$, where $c(0) = c(1) = p$, let $S \subset M$ be a parameterized square, where 3 of its sides map to $p$, and the 4th is the loop $c$. In other words, there is a map
$$
C: [0,1] \times [0,1] \rightarrow M
$$
such that $$C(\{0\}\times [0,1] \cup [0,1]\times\{0,1\}) = \{p\}$$
and $c(t) = C(1,t)$.
The idea now is to take a tangent vector $X \in T_pM$, extend it appropriately to the square, define a reference frame on the square, and measure the infinitesimal holonomy of $X$ at each point of the square. Done properly, the integral of the infinitesimal holonomy will be equal to $P_CX - X$ at $p$, where $P_C$ denotes parallel translation around the loop $c$. On the other hand, the infinitesimal holonomy at each point of the square is just the curvature tensor evaluated on four vectors: two vectors tangent to the square, $X$, and one of the vectors in the reference frame.
First, parallel translate $X$ around the loop $c$ from $p$ back to $p$. Extend $X$ to the square as follows: For each $0 \le s \le 1$, observe that the curve $C(s,\cdot)$ is a loop from $p$ to $p$, where if $s= 0$, it is the trivial loop at $p$ and if $s=1$, it is the original loop we started with. Parallel translate $X$ along each such loop, starting at $C(s,0) = p$, and let $X(s,t)$ denote the resulting vector at $C(s,t)$. Observe that, even though $C(0,0) = C(1,1) = p$, in general $X(1,1) - X(0,0) \ne 0$. In fact, $$ P_C(X)-X = X(1,1)-X(0,0).$$
To calculate the infinitesimal holonomy in the interior of the square, it is necessary to measure $X$ relative to a reference frame. Let $Y$ denote one of the vectors in the frame at $p$. We then want to extend $Y$ appropriately to the square and measure how $g(X,Y)$ varies in the interior of the square. Here, we want $Y(1,1) = Y(0,0)$, so that $g(X,Y)$ really does measure the holonomy of $X$ relative to $Y$. So what we do is parallel translate $Y$ along each curve $C(\cdot,t)$, starting at $C(0,t) = p$, and let $Y(s,t)$ denote the resulting vector at $C(s,t)$; in particular, this carries $Y$ from $p$ to each point $c(t) = C(1,t)$ of the loop. Since the curves $C(\cdot,0)$ and $C(\cdot,1)$ are constant at $p$, this gives $Y(1,1) = Y(0,0)$, as desired.
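In symbols, the two extensions can be summarized as follows. Here $\nabla_s$ and $\nabla_t$ are my shorthand (not notation from the note itself) for covariant differentiation along the curves $C(\cdot,t)$ and $C(s,\cdot)$, and $X$, $Y$ without arguments denote the original vectors at $p$:
$$
\nabla_t X(s,t) = 0, \quad X(s,0) = X, \qquad\qquad \nabla_s Y(s,t) = 0, \quad Y(0,t) = Y.
$$
Because three sides of the square map to $p$, these conditions force
$$
X(0,t) = X, \qquad Y(s,0) = Y(s,1) = Y, \qquad X(1,1) = P_C(X).
$$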
We now have vector fields $X(s,t)$ and $Y(s,t)$, where
$$
g(P_C(X)-X,Y) = g(X(1,1),Y(1,1)) - g(X(0,0),Y(0,0)).
$$
If you now calculate $\partial^2_{st}g(X(s,t),Y(s,t))$, the curvature tensor appears. The fundamental theorem of calculus says that the integral of this on the square is equal to $g(P_C(X)-X,Y)$.
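Here is a sketch of that calculation, under assumptions that should be checked against your own conventions. I write $\nabla_s$, $\nabla_t$ for covariant differentiation along the curves $C(\cdot,t)$ and $C(s,\cdot)$, and I assume the sign convention $R(U,V)W = \nabla_U\nabla_VW - \nabla_V\nabla_UW - \nabla_{[U,V]}W$ (with the opposite convention the curvature term changes sign). By construction $\nabla_tX = 0$ and $\nabla_sY = 0$, so
$$
\begin{aligned}
\partial_t\, g(X,Y) &= g(X,\nabla_tY),\\
\partial_s\partial_t\, g(X,Y) &= g(\nabla_sX,\nabla_tY) + g(X,\nabla_s\nabla_tY)\\
&= g(\nabla_sX,\nabla_tY) + g\big(X, R(\partial_sC,\partial_tC)Y\big),
\end{aligned}
$$
where the last step uses $\nabla_s\nabla_tY - \nabla_t\nabla_sY = R(\partial_sC,\partial_tC)Y$ together with $\nabla_sY = 0$. So the curvature appears, accompanied by a term involving only first derivatives of $X$ and $Y$. Integrating over the square and applying the fundamental theorem of calculus in each variable gives
$$
\int_0^1\!\!\int_0^1 \partial_s\partial_t\, g(X,Y)\,ds\,dt = g(X,Y)\big|_{(1,1)} - g(X,Y)\big|_{(1,0)} - g(X,Y)\big|_{(0,1)} + g(X,Y)\big|_{(0,0)}.
$$
At the corners $(1,0)$, $(0,1)$, and $(0,0)$ both fields take their original values at $p$, so those three terms combine to $-g(X(0,0),Y(0,0))$, and the right-hand side reduces to $g(P_C(X)-X,Y)$.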