coderodde's technical weblog

For the best rendering of the formal text, it is advised to view this page in a desktop browser.

Solving the most cost-effective loan problem

1 Introduction

In the most cost-effective loan problem, we are given a directed graph of actors where each actor may lend some amount of the resources it possesses to its child nodes. In case an actor needs more than its immediate parents can lend, the parents may in turn need to borrow from their own parents, adjust the interest rate to cover their own expenses, and pass the funds on to the original borrowing actor.

Formally, we are given a directed graph $G = (V, A)$, where $V$ is the set of actors, and $A \subseteq V^2 \setminus \{ (u, u) \colon u \in V\}$ is the set of directed arcs, excluding self-loops. By $V(G)$ we denote the actor set of $G$, and, likewise, by $A(G)$ we denote the arc set of $G$. Given an arc $(u, v) \in A$, we call $u$ a parent of $v$, and $v$ a child of $u$. The existence of such an arc indicates that $u$ may lend some or all of its resources to $v$. Along with the graph, we are given a potential function $\mathfrak{P} \colon V \to [0, \infty) = \mathbb{R}_{\geq 0}$ that maps each actor in the graph to the (non-negative) equity that very node has at its disposal. Finally, we are given an interest rate function $\mathfrak{I} \colon A \to \mathbb{R}_{\geq 0}$ that maps each arc $(u, v) \in A$ to the interest rate the actor $u$ can offer $v$ when $v$ decides to borrow from $u$. From now on, we will call the aforementioned amount of resources or equity simply potential.

Apart from the graph and the two functions above, in a problem instance we are given an actor $a \in V$ that applies for a loan, a required potential $P \in \mathbb{R}_{> 0}$, and a maximum tolerable interest rate $i \in \mathbb{R}_{\geq 0}$. Our aim, then, is to compute a loan (which may involve more than one lending actor) with minimized interest rates.

2 Interest rate model

Throughout the post, we assume a simple interest rate model: the accumulated balance at time $t$ since the time point at which the loan was issued, with initial principal $\mathfrak{A}$ and interest rate $r$, is given by \[ \mathfrak{A}(1 + r)^t. \] If the graph contains, for instance, a directed path $\langle u, v, z \rangle$ with $r_{u,v}$ being the interest rate of $(u, v)$ and $r_{v, z}$ being the interest rate of $(v, z)$, the interest rate equation becomes \[ \mathfrak{A}(1 + r_{u,v})^t (1 + r_{v,z})^t = \mathfrak{A}\big[ (1 + r_{u,v})(1 + r_{v,z}) \big]^t = \mathfrak{A}(1 + R)^t. \] Above, $R$ is the combined interest rate. Dividing both sides by $\mathfrak{A}$ and taking the $t$-th root, we obtain \begin{aligned} (1 + r_{u,v})(1 + r_{v,z}) &= 1 + R \\ R + 1 &= 1 + r_{u,v} + r_{v,z} + r_{u,v}r_{v,z} \\ R &= r_{u,v} + r_{v,z} + r_{u,v}r_{v,z}. \end{aligned} In general, we write $r_1 + r_2 + r_1 r_2 = \mathfrak{C}(r_1, r_2).$
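The rate combinator $\mathfrak{C}$ is straightforward to implement. Below is a minimal Java sketch; the class and method names are mine, for illustration only:

public final class InterestRates {

    // Combines two simple interest rates along a two-arc lending chain:
    // C(r1, r2) = r1 + r2 + r1 * r2, so that (1 + r1)(1 + r2) = 1 + C(r1, r2).
    public static double combine(double r1, double r2) {
        return r1 + r2 + r1 * r2;
    }

    public static void main(String[] args) {
        // Chaining 5% and 3% yields a combined rate of 8.15%:
        System.out.println(InterestRates.combine(0.05, 0.03)); // ~0.0815
    }
}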

Since we may deal with entire "loan chains," we need the concept of an effective interest rate, given by \[ I(u, v) = \min \Bigg(\mathfrak{I}(u, v), \underset{z \in {\rm C{\small HILDREN}}(G, u)}{\min} \mathfrak{C}(\mathfrak{I}(u, z), I(z, v))\Bigg), \] where ${\rm C{\small HILDREN}}(G, u)$ is the set of child nodes of $u$, or, formally, $\{ v \colon (u, v) \in A(G) \}.$ Also, for the above formula to be sensible, we set $\mathfrak{I}(u, v) = \infty$ for all missing arcs $(u, v) \not\in A(G)$, and $I(u, u) = 0$ for all $u \in V(G)$.
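For instance, if the only arcs are $(u, v)$ and $(v, z)$, with $\mathfrak{I}(u, v) = 0.05$ and $\mathfrak{I}(v, z) = 0.03$, then $I(v, z) = \mathfrak{I}(v, z) = 0.03$, and \[ I(u, z) = \min \big( \mathfrak{I}(u, z), \mathfrak{C}(\mathfrak{I}(u, v), I(v, z)) \big) = \min \big( \infty, \mathfrak{C}(0.05, 0.03) \big) = 0.0815. \]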

3 Problem statement

Given a problem instance $(G, a, \mathfrak{P}, \mathfrak{I}, P, i)$, we wish to compute two additional functions: $\pi$ and $d$. $\pi \colon V \to \mathbb{R}_{\geq 0}$ is called a solution potential function, and it maps each actor $u$ to the potential $\pi(u)$ that $u$ lends; $d \colon V \to V$ is called a direction function, and it maps each actor $u$ to one of its children $d(u)$ to which the $\pi(u)$ worth of potential is lent. As for constraints, no actor $u$ lending some of its potential shall have $I(u, a) > i$, since $a$ cannot afford effective interest rates above $i$. Also, if it is not possible, due to the first constraint, to obtain a loan worth $P$, the loan should be maximized from below, as close to $P$ as possible.

In order to implement the first constraint, we need to define the set of admissible solution potential functions: \[ \pi_{I, a, i, G} = \{ \pi \colon V(G) \to \mathbb{R}_{\geq 0} \; | \; \pi(u) = 0 \text{ if } I(u, a) > i \}. \] An admissible solution potential function $\pi$ is said to be valid if it also satisfies \[ \sum_{u \in V} \pi(u) \in [0, P], \] and we denote that by $\pi \in \mathfrak{V}$.

Now, we can state the objective formally: \[ \pi = \underset{\pi' \in \mathfrak{V}}{\arg \min}{\Bigg[ P - \sum_{u \in V} \pi'(u) \Bigg]}, \] and \[ d(u) = \underset{z \in {\rm C {\small HILDREN}}(G, u)}{\arg \min} \Bigg[\mathfrak{C}\big(\mathfrak{I}(u, z), I(z, a)\big)\Bigg]. \]

4 Solution algorithm

Since the effective interest rate does not decrease when more directed arcs are appended, for any actor $a \in V$ the most cost-effective lender must be an immediate parent. Whether the second most cost-effective lender is an immediate parent or not depends on the effective interest rates. Regardless, since we consider effective interest rates, we are essentially growing a directed "minimum spanning tree" that we extend in a greedy fashion one arc at a time: in the very beginning, the tree is trivial and consists only of $a$. Then the most cost-effective parent $u_1$ is selected and the arc $(u_1, a)$ is introduced to the tree. Then $u_2$ is selected: it must not belong to $\{a, u_1\}$ while having the lowest possible effective interest rate among all nodes in $V \setminus \{ a, u_1 \}$. This procedure continues until the desired potential $P$ is collected or until there are no nodes left with affordable effective interest rates. Below, ${\rm P{\small ARENTS}}(G, u) = \{ v \colon (v, u) \in A(G) \}$, and Priority-Queue-Insert$(Q, \langle a, b, c \rangle)$ stores the triple $\langle a, b, c \rangle$ in the priority queue $Q$ and uses $c$ as the priority key; if the first component of the triple already resides in $Q$, we assume the insertion merely lowers its key (and the stored triple) whenever $c$ is smaller than the current key.

Most-Cost-Effective-Loan$(G, a, P, i, \mathfrak{P}, \mathfrak{I})$:
 1: let $Q$ be an empty priority queue
 2: let $\pi$ be an empty solution potential function
 3: let $d$ be an empty direction function
 4: let $C \leftarrow \{a\}$
 5: let $P_{\text{collected}} \leftarrow 0$ be the funds currently collected
 6: for each $u \in V(G)$:
 7:     $\pi(u) \leftarrow 0$
 8:     $d(u) \leftarrow \text{Nil}$
 9: for each $u \in {\rm P{\small ARENTS}}(G, a)$:
10:     if $\mathfrak{I}(u, a) \leq i$:
11:         Priority-Queue-Insert$(Q, \langle u, a, \mathfrak{I}(u, a) \rangle)$
12: while $Q \neq \emptyset$ and $P_{\text{collected}} < P$:
13:     $\langle u, v, i_{\text{current}} \rangle \leftarrow$ Priority-Queue-Extract-Minimum$(Q)$
14:     $P_\Delta \leftarrow \min (P - P_{\text{collected}}, \mathfrak{P}(u))$
15:     $P_{\text{collected}} \leftarrow P_{\text{collected}} + P_\Delta$
16:     $\pi(u) \leftarrow P_\Delta$
17:     $d(u) \leftarrow v$
18:     $C \leftarrow C \cup \{ u \}$
19:     for each $z \in {\rm P{\small ARENTS}}(G, u)$:
20:         if $z \not\in C$:
21:             $i_{\text{next}} \leftarrow \mathfrak{C} \big( i_{\text{current}}, \mathfrak{I}(z, u) \big)$
22:             if $i_{\text{next}} \leq i$:
23:                 Priority-Queue-Insert$(Q, \langle z, u, i_{\text{next}}\rangle)$
24: return $(\pi, d)$
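For concreteness, below is a compact Java sketch of the procedure above. It reuses the combine method from the earlier snippet; the graph encoding (a map of parents, a map of potentials, and a per-arc interest rate map) and all the names are my own scaffolding rather than a polished implementation, and instead of assuming a decrease-key insert, stale priority queue entries are simply skipped upon extraction:

import java.util.*;

public final class LoanSolver {

    // A queue entry: actor u offers to lend towards its child v at the
    // effective interest rate 'rate' (as seen from the applicant).
    private record Offer(String u, String v, double rate) {}

    public static Map<String, Double> solve(Map<String, List<String>> parents,
                                            Map<String, Double> potential,
                                            Map<List<String>, Double> rate,
                                            String a, double requested, double maxRate) {
        Queue<Offer> queue = new PriorityQueue<>(Comparator.comparingDouble(Offer::rate));
        Map<String, Double> pi = new HashMap<>();  // solution potential function
        Map<String, String> d  = new HashMap<>();  // direction function
        Set<String> closed = new HashSet<>(List.of(a));
        double collected = 0.0;

        for (String u : parents.getOrDefault(a, List.of())) {
            double r = rate.get(List.of(u, a));
            if (r <= maxRate) queue.add(new Offer(u, a, r));
        }
        while (!queue.isEmpty() && collected < requested) {
            Offer offer = queue.remove();
            if (!closed.add(offer.u())) continue;  // lazily skip stale entries
            double delta = Math.min(requested - collected, potential.get(offer.u()));
            collected += delta;
            pi.put(offer.u(), delta);
            d.put(offer.u(), offer.v());
            for (String z : parents.getOrDefault(offer.u(), List.of())) {
                if (closed.contains(z)) continue;
                double next = InterestRates.combine(offer.rate(), rate.get(List.of(z, offer.u())));
                if (next <= maxRate) queue.add(new Offer(z, offer.u(), next));
            }
        }
        return pi;  // d is built alongside and could be returned as well
    }
}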

5 Proof of correctness

Lemma 1 (Termination) The algorithm terminates.
Note that each node removed from the priority queue at line 13 is inserted into the closed set $C$ at line 18. Line 23 will then never reintroduce the same node to $Q$, and so $Q$ eventually becomes empty (unless the algorithm terminates earlier because the required potential has been collected).

Lemma 2 (Optimality) The algorithm finds most cost-effective loans.
The effective interest rate function $I$ implicitly defines a transitive closure from any actor $u \neq a$ to $a$, provided the effective interest rate of $u$ is within the constraint $I(u, a) \leq i$. That way, $a$ borrows from all affordable lenders (immediate or intermediate) in a greedy fashion: borrow as much as possible from the most affordable actor, then as much as possible from the second most affordable actor, and so on, until the requested potential is collected or there are no affordable lenders left. It is easy to see that such a strategy minimizes the interest expenses.

Lemma 3 ($d$ is acyclic) Given a solution direction function $d$, there exists no actor sequence $\langle u_1, u_2, \dots, u_k \rangle$ such that $d(u_1) = u_2$, $d(u_2) = u_3$, $\dots$, $d(u_{k - 1}) = u_k$, and $d(u_k) = u_1$.
The only way for such a cycle to emerge would be to have $\mathfrak{I}(u_1, u_2) = \mathfrak{I}(u_2, u_3) = \dots = \mathfrak{I}(u_{k-1}, u_k) = \mathfrak{I}(u_k, u_1) = 0$ and to pass in $\infty$ as the required potential. Since it is reasonable not to accept infinity as the requested potential, and the algorithm maintains the closed set, this phenomenon cannot appear. Without loss of generality, suppose the search enters the cycle via $u_1$. When $u_k$ is removed from the priority queue, the arc $(u_k, u_1)$ is ignored at line 20 since $u_1$ is in $C$, and so there is no way $d(u_k)$ could be mapped to $u_1$.

Lemma 4 The sum of all solution potentials cannot exceed $P$.
This is trivially guaranteed by line 14 and the second test at line 12.

6 Running time analysis

All the operations except Priority-Queue-Insert and Priority-Queue-Extract-Minimum run in $\mathcal{O}(1)$ time. In particular, the operations on $\pi$, $d$, and $C$ may be expected to run in $\mathcal{O}(1)$ time on average by resorting to hash-table-based maps and sets. With a geometric expansion scheme [1], adding an element to a hash-table-based data structure runs in amortized $\mathcal{O}(1)$ time. (If the load factor reaches its upper bound, the underlying storage array must be made larger.)

As for the priority queue $Q$, it stores graph actors without duplicate entries for the same actor (recall the insertion semantics above). Now, if we choose a binary heap, both operations run in $\mathcal{O}(\log V)$ time. Priority-Queue-Insert is called no more than $|A|$ times, and Priority-Queue-Extract-Minimum is called at most once per node, so we have the running time $\mathcal{O}(A \log V + V \log V) = \mathcal{O}(A \log V)$. By deploying a Fibonacci heap instead, we may reduce this to $\mathcal{O}(A + V \log V)$.

7 Future work

We used a very simple interest rate model in our solution, which leaves open the case of more realistic interest rate models. In general, with initial principal $\mathfrak{A}$, interest rate $r > 0$, $n$ compounding periods per time unit, and time $t$ since the moment the loan was issued, the balance grows according to \[ \mathfrak{B} = \mathfrak{A}\Bigg( 1 + \frac{r}{n} \Bigg)^{\lfloor nt \rfloor}. \] Also, as $n \to \infty$, $\mathfrak{B} \to \mathfrak{A}e^{rt}$. Combining loans with different parameters in a meaningful way seems to be a non-trivial task, but we might address it later.
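As a quick numerical illustration of the limit (all names below are mine):

public final class CompoundInterest {

    // Balance under n compounding periods per time unit: A(1 + r/n)^floor(nt).
    static double compounded(double principal, double r, int n, double t) {
        return principal * Math.pow(1.0 + r / n, Math.floor(n * t));
    }

    public static void main(String[] args) {
        double principal = 100.0, r = 0.05, t = 2.0;
        for (int n : new int[] {1, 12, 365, 1_000_000}) {
            System.out.println(n + ": " + compounded(principal, r, n, t));
        }
        // Converges towards the continuous-compounding limit A * e^(rt):
        System.out.println("limit: " + principal * Math.exp(r * t)); // ~110.517
    }
}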

8 References

[1] Keeping vectors efficient (Retrieved 2018-03-09.)

Richard E. Korf's bidirectional depth-first iterative-deepening pathfinding algorithm

In this post, I will briefly discuss a rather simple graph search technique [1] that turns out to be rather efficient on sparse, directed, unweighted graphs. Before that, I will review some terminology. A directed, unweighted graph $G$ is an ordered pair $(V, A)$, where $V$ is the set of nodes, and $A \subseteq V \times V$ is the set of directed arcs. Given two terminal nodes $s, t \in V$, we wish to find a shortest $s,t$-path, or, in other words, a path that connects the node $s$ to the node $t$ using the minimum number of arcs. According to common jargon, $s$ is called a source node, and $t$ is called a target node. Also, by $A(G)$ we will denote the arc set of $G$, and, given an arc $a = (u, v)$, we call $u$ a parent of $v$ and $v$ a child of $u$. Finally, we reserve the term "search frontier" for the set of nodes no farther than $\Delta \in \mathbb{N} \cup \{ 0 \}$ steps from a given terminal node.

As for the actual algorithm in question, it utilizes iterative deepening in order to reduce memory consumption. It first checks whether the invocation is trivial (the source node is the target node), and if so, returns $\langle s \rangle = \langle t \rangle$. Now, let $F_f^k$ be the set of nodes in $G$ that are reachable from $s$ in $k$ hops following the directions of the arcs. Let also $F_b^k$ be the set of nodes in $G$ that are reachable from $t$ in $k$ hops following the opposite directions of the arcs. Formally, the algorithm computes $F_f^k$ and $F_b^k$ for $k = 0, 1, \dots$ until $F_f^k$ intersects $F_b^k$ for some $k$, and returns a "meeting node" $\mu$ for which $\mu \in F_f^k \cap F_b^k$. After finding such a meeting node, the shortest path is reconstructed by recursively finding the shortest $s,\mu$-path and appending the node stack of the backward search process. Otherwise, if a meeting node is not found, $k$ is incremented, and the search repeats once again. There is, however, a catch. Suppose the simple graph with $V = \{s, u_1, u_2, t\}$ and $A = \{ (s, u_1), (u_1, u_2), (u_2, t) \}$ is searched for a shortest $s,t$-path. The search procedure described so far returns with no path, and that happens because the forward and backward search frontiers, $F_f^k$ and $F_b^k$, respectively, go "through" each other: for $k = 1$, $F_f^k = \{ u_1 \}$ and $F_b^k = \{ u_2 \}$, and for $k = 2$, $F_f^k = \{ u_2 \}$ and $F_b^k = \{ u_1 \}$, and so the search frontiers are not able to agree on a meeting node. In order to remedy this, each turn of the backward search is run twice: once for detecting shortest paths with an even number of arcs, and once for detecting shortest paths with an odd number of arcs. In other words, for each $k$ in the forward direction, two backward searches are conducted: $F_b^k$ and $F_b^{k + 1}$. Last but not least, if there is no $s,t$-path, the algorithm will not terminate.

As has become evident, BIDDFS is also bidirectional, so we should briefly discuss the benefits of such an arrangement. Suppose that the average out-degree is $d$ and the shortest path consists of $N$ arcs. Now the total work done by unidirectional breadth-first search is roughly \[ W_1 = \sum_{i=0}^N d^i = \Theta(d^N). \]

Of course, the above expression assumes little variance in the nodes' out-degrees. If, however, we denote the average out-degree by $d_o$ and the average in-degree by $d_i$, the work done by bidirectional breadth-first search is \[ W_2 = \sum_{j=0}^{N/2} d_i^j + \sum_{j=0}^{N/2} d_o^j. \] Now, if we set $d_i = d_o = d$, the speedup is roughly \begin{aligned} \frac{W_1}{W_2} = \frac{\sum_{j = 0}^N d^j}{2 \sum_{j = 0}^{N / 2} d^j} = \frac{\frac{1 - d^{N + 1}}{1 - d}}{2 \frac{1 - d^{N / 2 + 1}}{1 - d}} = \frac{1 - d^{N + 1}}{2 (1 - d^{N / 2 + 1})} = \frac{d^{N + 1} - 1}{2(d^{N / 2 + 1} - 1)} \approx \frac{d^{N + 1}}{2d^{N / 2 + 1}} = \frac{d^{N / 2}}{2} = \Theta(d^{N/2}), \end{aligned}

which is a speedup exponential in the length of the shortest path.
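For example, with $d = 10$ and $N = 6$, the unidirectional search expands on the order of $10^6$ nodes, whereas the bidirectional variant expands roughly $2 \cdot 10^3$, a speedup of about $d^{N/2}/2 = 500$.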

Figures 1 (A) and 1 (B) demonstrate how the two search spaces go "through" each other, and Figure 1 (C) shows how to address that issue.

Before we proceed to the pseudocode, we have to review some notational conventions. Given a graph $G$, the set of all children of a node $u \in V(G)$ is written ${\rm C{\small HILDREN}}(G, u)$. Likewise, the set of all parents of a node $u$ is written ${\rm P{\small ARENTS}}(G, u)$. Algorithm 1 reconstructs the shortest path by recursively searching for the shortest $s,\mu$-path from the source node to the meeting node $\mu$, appending the $B$-path to the resulting path, and returning the concatenation. Algorithm 2 is the forward search routine that colors all the nodes exactly $\Delta$ hops away from the source. Note that it may also color nodes fewer than $\Delta$ hops away, since it may "make a turn" and proceed towards closer nodes. It is, however, sufficient for us that the routine does not reach nodes farther than $\Delta$ steps away. Algorithm 3, like Algorithm 2, makes a depth-first search backwards, along the opposite direction of the arcs (from child to parent). When the backward search reaches a node $\mu$ that belongs to the forward frontier $F$, the search is over and the shortest path is reconstructed via Algorithm 1. Finally, Algorithm 4 performs the actual search by going increasingly deeper into the graph until the two search frontiers $F_f^k, F_b^k$ (or $F_f^k, F_b^{k+1}$) intersect.

Pseudocode

Algorithm 1. Build-Path$(s, \mu, B, G)$:
    $\pi \leftarrow$ Find-Shortest-Path$(G, s, \mu)$
    remove the last node from $\pi$
    return $\pi \circ B$

Algorithm 2. Depth-Limited-Search-Forward$(G, u, \Delta, F)$:
    if $\Delta = 0$:
        $F \leftarrow F \cup \{ u \}$
        return
    for each $v \in {\rm C{\small HILDREN}}(G, u)$:
        Depth-Limited-Search-Forward$(G, v, \Delta - 1, F)$

Algorithm 3. Depth-Limited-Search-Backward$(G, u, \Delta, B, F)$:
    prepend $u$ to $B$
    if $\Delta = 0$:
        if $u \in F$:
            return $u$
        remove the head node in $B$
        return $\text{Nil}$
    for each $v \in {\rm P{\small ARENTS}}(G, u)$:
        $\mu \leftarrow$ Depth-Limited-Search-Backward$(G, v, \Delta - 1, B, F)$
        if $\mu \neq \text{Nil}$:
            return $\mu$
    remove the head node in $B$
    return $\text{Nil}$

Algorithm 4. Find-Shortest-Path$(G, s, t)$:
    if $s = t$:
        return $\langle s \rangle$
    $\langle F, B, \Delta \rangle \leftarrow \langle \emptyset, \emptyset, 0 \rangle$
    while true:
        Depth-Limited-Search-Forward$(G, s, \Delta, F)$
        for each $d \in \{ \Delta, \Delta + 1 \}$:
            $\mu \leftarrow$ Depth-Limited-Search-Backward$(G, t, d, B, F)$
            if $\mu \neq \text{Nil}$:
                return Build-Path$(s, \mu, B, G)$
        $F \leftarrow \emptyset$
        $\Delta \leftarrow \Delta + 1$
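To make the pseudocode concrete, the following Java sketch mirrors Algorithms 1 through 4; the adjacency-map graph representation and all names are my own scaffolding, under the same caveat that the search never terminates when no $s,t$-path exists:

import java.util.*;

public final class Biddfs {

    private final Map<Integer, List<Integer>> children;
    private final Map<Integer, List<Integer>> parents;

    public Biddfs(Map<Integer, List<Integer>> children, Map<Integer, List<Integer>> parents) {
        this.children = children;
        this.parents = parents;
    }

    // Algorithm 4: deepen both frontiers until they meet.
    public List<Integer> findShortestPath(int s, int t) {
        if (s == t) return List.of(s);
        for (int depth = 0; ; depth++) {
            Set<Integer> frontier = new HashSet<>();
            forwardSearch(s, depth, frontier);
            for (int backDepth : new int[] {depth, depth + 1}) {
                Deque<Integer> backPath = new ArrayDeque<>();
                Integer mu = backwardSearch(t, backDepth, backPath, frontier);
                if (mu != null) return buildPath(s, mu, backPath);
            }
        }
    }

    // Algorithm 1: recursively find the s,mu-prefix, then append the backward stack.
    private List<Integer> buildPath(int s, int mu, Deque<Integer> backPath) {
        List<Integer> path = new ArrayList<>(findShortestPath(s, mu));
        path.remove(path.size() - 1);  // mu is already the head of backPath
        path.addAll(backPath);
        return path;
    }

    // Algorithm 2: color all nodes exactly 'depth' hops forward from u.
    private void forwardSearch(int u, int depth, Set<Integer> frontier) {
        if (depth == 0) { frontier.add(u); return; }
        for (int v : children.getOrDefault(u, List.of()))
            forwardSearch(v, depth - 1, frontier);
    }

    // Algorithm 3: walk 'depth' hops backwards from u; report a meeting node.
    private Integer backwardSearch(int u, int depth, Deque<Integer> backPath, Set<Integer> frontier) {
        backPath.addFirst(u);
        if (depth == 0) {
            if (frontier.contains(u)) return u;
            backPath.removeFirst();
            return null;
        }
        for (int v : parents.getOrDefault(u, List.of())) {
            Integer mu = backwardSearch(v, depth - 1, backPath, frontier);
            if (mu != null) return mu;
        }
        backPath.removeFirst();
        return null;
    }
}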

The results are rather impressive. As for solving 15-puzzles, iterative deepening A* dominates BIDDFS by a factor of three or so, yet on general graphs BIDDFS is faster than iterative deepening depth-first search by an order of magnitude and outperforms unidirectional breadth-first search by three orders of magnitude. The following listing is a typical output of the benchmarking program:

*** 15-puzzle graph benchmark ***
Seed = 1544375493597
BreadthFirstSearch in 1050 milliseconds. Path length: 18
IterativeDeepeningDepthFirstSearch in 36302 milliseconds. Path length: 18
BidirectionalIterativeDeepeningDepthFirstSearch in 20 milliseconds. Path length: 18
IterativeDeepeningAStar in 116 milliseconds. Path length: 18
Algorithms agree: true

*** General graph benchmark ***
Seed = 1544375531134
BidirectionalIterativeDeepeningDepthFirstSearch in 0 milliseconds. Path length: 5
IterativeDeepeningDepthFirstSearch in 10 milliseconds. Path length: 5
BreadthFirstSearch in 472 milliseconds. Path length: 5

References

[1] Korf, R. E. (1985). Depth-first iterative-deepening: An optimal admissible tree search. New York: Department of Computer Science, Columbia University. (Retrieved 2019-06-05.)

Computing radial sets with overlapping areas

May 20, 2019

In this post, we shall compute the areas of radial sets in polar coordinates without counting overlapping regions twice. A function $r = f(\theta)$ is said to be monotonically reducing on the interval $I = [a, b]$ if we have $f(\theta) \geq f(\theta + 2\pi)$ for any $\theta \in [a, b - 2\pi]$. Now, the area of the radial set $A_f(a, b) = \{ (r \cos \theta, r \sin \theta) \colon r \in [0, f(\theta)], \theta \in [a, b] \}$ is calculated as \[ |A_f(a, b)| = \frac{1}{2} \int_a^b f^2(\theta) \, d\theta, \] if $b \leq a + 2\pi$. In case $b > a + 2\pi$, some of the area would be counted more than once. We can, however, deal with that issue: \begin{equation} |A_f(a,b)| = \frac{1}{2} \Bigg[ \int_a^b f^2(\theta)\,d\theta - \int_{a + 2\pi}^b f^2(\theta)\, d\theta \Bigg]. \end{equation}

Now, suppose instead that $f(\theta) \leq f(\theta + 2\pi)$. Then the last turn dominates, and we must have \[ |A_f(a, b)| = \frac{1}{2} \Bigg[ \int_a^b f^2(\theta) \,d\theta - \int_a^{b - 2\pi} f^2(\theta) \, d\theta \Bigg]. \]
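As a sanity check, take $f(\theta) = e^{-\theta}$ on $[a, b] = [0, 4\pi]$; this function is monotonically reducing, since $e^{-\theta} \geq e^{-(\theta + 2\pi)}$. Equation (1) then yields \[ |A_f(0, 4\pi)| = \frac{1}{2} \Bigg[ \int_0^{4\pi} e^{-2\theta}\,d\theta - \int_{2\pi}^{4\pi} e^{-2\theta}\,d\theta \Bigg] = \frac{1}{2} \int_0^{2\pi} e^{-2\theta}\,d\theta = \frac{1 - e^{-4\pi}}{4}, \] which is exactly the area swept during the first turn: the later, smaller turns lie inside it and contribute no new area.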

Rotating a parabola

In this post, we will investigate how to rotate a parabola $P$ in the plane. Let $V = (x_V, y_V)$ be the vertex of $P$, let $\ell$ be the line through $V$ parallel to the $x$-axis $(y = y_V)$, and let $\alpha$ be the angle in radians between $\ell$ and the tangent of $P$ at $V$. As the value of $\alpha$ grows, $P$ rotates in the counter-clockwise direction. Finally, we are given a parameter $\beta$, which defines the geometry of $P$: at (signed) distance $t$ from $V$ along the tangent, the parabola deviates $\beta t^2$ from it. When $\alpha = 0$, $P$ opens upwards. When $\alpha = \pi / 2$, $P$ opens to the left. What we wish to accomplish is to find out how to rotate a parabola around its vertex.

Next, we need a way to calculate the position and slope of the tangent line (call it $T$) passing through $V$ such that it makes an angle of $\alpha$ radians with $\ell$.

\begin{align} \frac{y - y_V}{x - x_V} &= \tan \alpha \\ y &= (\tan \alpha) (x - x_V) + y_V \\ &= (\tan \alpha) x - (\tan \alpha) x_V + y_V \\ &= T(x). \end{align}

Now, given an $x \in \mathbb{R}$, \begin{aligned} t &= d((x, T(x)), (x_V, y_V)) \\ &= \sqrt{(x - x_V)^2 + (T(x) - y_V)^2}, \end{aligned}

which implies \[ \beta t^2 = \beta \big[ (x - x_V)^2 + (T(x) - y_V)^2 \big]. \] Given $t \in \mathbb{R}$, we move from $V$ by $t$ units towards the positive part of $T$, and then by $\beta t^2$ units "upwards," perpendicular to $T$. Note that, for example, if $\alpha \in (\pi / 2, 3\pi / 2)$, "positive" becomes "negative" and "upwards" becomes "downwards".

Let $V_t$ be the point on the parabola with vertex $V$, vertex tangent angle $\alpha$, and distance $t$ from $V$. Since the unit tangent is $(\cos \alpha, \sin \alpha)$ and the unit normal pointing "upwards" is $(-\sin \alpha, \cos \alpha)$, the point $V_t$ is specified by \begin{aligned} V_t &= V + (t\cos \alpha, t\sin \alpha) + (-\beta t^2 \sin \alpha, \beta t^2 \cos \alpha) \\ &= (x_V + t\cos \alpha - \beta t^2 \sin \alpha, \; y_V + t \sin \alpha + \beta t^2 \cos \alpha). \end{aligned}
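A small Java helper (the names are mine) evaluating the formula above:

public final class RotatedParabola {

    // Point on a parabola with vertex (xV, yV), vertex tangent angle 'alpha'
    // (radians), and shape parameter 'beta', at signed tangent distance t:
    // V + t (cos a, sin a) + beta t^2 (-sin a, cos a).
    static double[] pointAt(double xV, double yV, double alpha, double beta, double t) {
        double x = xV + t * Math.cos(alpha) - beta * t * t * Math.sin(alpha);
        double y = yV + t * Math.sin(alpha) + beta * t * t * Math.cos(alpha);
        return new double[] {x, y};
    }

    public static void main(String[] args) {
        // With alpha = 0, this reduces to the familiar y = yV + beta (x - xV)^2:
        double[] p = pointAt(1.0, 2.0, 0.0, 0.5, 3.0); // (4.0, 6.5)
        System.out.println(p[0] + ", " + p[1]);
    }
}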


Deriving the polynomial derivatives

Let $f(x) = x^n$, where $n$ is a positive integer. Now we have

\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}x}f(x) &= \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \\ &= \lim_{h \to 0} \frac{(x + h)^n - x^n}{h}\\ &= \lim_{h \to 0} \frac{\sum_{k = 0}^n \binom{n}{k} x^{n - k}h^k - x^n}{h} \\ &= \lim_{h \to 0} \frac{x^n + nx^{n - 1}h + \sum_{k = 2}^n \binom{n}{k} x^{n - k}h^k - x^n}{h} \\ &= \lim_{h \to 0} \big[ nx^{n - 1} \big] + \lim_{h \to 0}\Bigg[ \frac{h^2 \sum_{k=2}^n \binom{n}{k} x^{n - k}h^{k - 2} }{h} \Bigg] \\ &= nx^{n - 1} + \lim_{h \to 0} \Bigg[ h \sum_{k = 2}^n \binom{n}{k}x^{n - k}h^{k - 2} \Bigg] \\ &= nx^{n - 1} \end{aligned}

Next, keeping $n$ in the set of positive integers, suppose $f(x) = x^{-n}$. Now we have

\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}x}f(x) &= \lim_{h \to 0} \frac{\frac{1}{(x + h)^n} - \frac{1}{x^n}} {h} \\ &= \lim_{h \to 0} \frac{1}{h} \frac{x^n - (x + h)^n}{x^n (x + h)^n } \\ &= \lim_{h \to 0} \frac{1}{h} \frac{x^n - \sum_{k = 0}^n \binom{n}{k} x^{n - k}h^k}{x^n (x + h)^n} \\ &= \lim_{h \to 0} \frac{x^n - x^n - nx^{n - 1}h - \sum_{k = 2}^n \binom{n}{k} x^{n - k}h^k}{hx^n(x + h)^n} \\ &= \lim_{h \to 0} \Bigg[ \frac{-nx^{n - 1}h}{hx^n (x + h)^n} - \frac{h^2\sum_{k = 2}^n \binom{n}{k} x^{n - k}h^{k - 2}}{hx^n (x + h)^n} \Bigg] \\ &= \lim_{h \to 0} \frac{-nx^{n - 1}}{x^n (x + h)^n} - \lim_{h \to 0} h \frac{\sum_{k = 2}^n \binom{n}{k} x^{n - k}h^{k - 2}}{x^n(x + h)^n} \\ &= \frac{\lim_{h \to 0} -nx^{n - 1}}{\lim_{h \to 0} x^n (x + h)^n} \\ &= \frac{-nx^{n - 1}}{ x^{2n} } \\ &= -nx^{-n - 1}. \end{aligned}
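Combining the two results with the linearity of the derivative gives, for example, \[ \frac{\mathrm{d}}{\mathrm{d}x} \big( 3x^4 + x^{-2} \big) = 3 \cdot 4x^3 + (-2)x^{-2 - 1} = 12x^3 - 2x^{-3}. \]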

Efficient indexed linked list

In this post, we will present an indexed, heuristic doubly-linked list data structure that runs all the single-element operations in $\mathcal{O}(\sqrt{N})$ time. The idea is to maintain, alongside the actual doubly-linked list $L$, a stack of fingers $L.fingers$. Each finger $f$ consists of a reference to a node $f.node$ somewhere in the list, and the appearance index $f.index$ of that node. Now, for example, when we want to access a node via an index (call that index $i$), the data structure scans all the fingers and chooses the finger whose index is closest to $i$, after which all we do is iterate from the node in the finger to the list node at index $i$. The process of choosing the closest finger is exemplified in the following procedure:

Get-Closest-Finger$(L, i)$:
    $\Delta \leftarrow \infty$
    $f \leftarrow \text{Nil}$
    for each $f' \in L.fingers$:
        $\delta \leftarrow |f'.index - i|$
        if $\delta = 0$:
            return $f'$
        if $\delta < \Delta$:
            $\Delta \leftarrow \delta$
            $f \leftarrow f'$
    return $f$
The actual routine for obtaining a reference to the target node is as follows:
Get-Node$(L, i)$:
    $f \leftarrow$ Get-Closest-Finger$(L, i)$
    $\delta \leftarrow f.index - i$
    $f.index \leftarrow f.index - \delta$
    if $\delta > 0$:
        repeat $\delta$ times: $f.node \leftarrow f.node.prev$
    else:
        repeat $-\delta$ times: $f.node \leftarrow f.node.next$
    return $f.node$
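In Java, the two routines might look as follows; the Node, Finger, and FingerList classes below are simplified stand-ins for the actual implementation linked at the end of this post:

import java.util.List;

final class Node<T> {
    T item;
    Node<T> prev, next;
}

final class Finger<T> {
    Node<T> node;
    int index;  // the appearance index of 'node' in the list
}

final class FingerList<T> {
    List<Finger<T>> fingers;

    // Get-Closest-Finger: a linear scan for the finger nearest to index i.
    Finger<T> getClosestFinger(int i) {
        int bestDistance = Integer.MAX_VALUE;
        Finger<T> best = null;
        for (Finger<T> finger : fingers) {
            int distance = Math.abs(finger.index - i);
            if (distance == 0) return finger;  // exact hit, nothing to rewind
            if (distance < bestDistance) {
                bestDistance = distance;
                best = finger;
            }
        }
        return best;
    }

    // Get-Node: rewind the closest finger to index i and return its node.
    Node<T> getNode(int i) {
        Finger<T> finger = getClosestFinger(i);
        int delta = finger.index - i;
        finger.index = i;  // the finger now points at the target index
        for (; delta > 0; delta--) finger.node = finger.node.prev;
        for (; delta < 0; delta++) finger.node = finger.node.next;
        return finger.node;
    }
}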

Suppose that the fingers are evenly distributed over the list, as depicted in Figure 1 ("Evenly distributed fingers").

Assuming that all the $f$ fingers are distributed evenly over a list of length $N$, the access operation performs \[ W_N(f) = \frac{N}{2f} + f \] steps of work. The idea is that we first consult all the $f$ fingers, choosing the one (call it $T$) closest to the target index, after which we walk from the node $T.node$ towards the node with the target index. Obviously, consulting the fingers runs in $\Theta(f)$ time, since the fingers are not sorted by their indices. Now, in the evenly-distributed finger list, each finger covers $N/f$ nodes, but since we choose the closest finger, it is enough to iterate over at most $\lceil N/(2f) \rceil$ nodes.

Next, we would like to find the value $f$ that minimizes $W_N(f)$. To achieve that, we first differentiate $W_N(f)$ with respect to $f$: \[ \begin{aligned} \frac{d}{df}W_N(f) &= \frac{d}{df}\Bigg[\frac{N}{2f} + f\Bigg] \\ &= -\frac{N}{2f^2} + 1. \end{aligned} \] Now, we need to find an $f_0$ such that \[ \frac{d}{df}W_N(f_0) = 0. \] A simple manipulation leads to $f_0 = \sqrt{N/2}$. Next, we need to verify that such an $f_0$ is, in fact, a minimizing value. To this end, we need the second derivative of $W_N(f)$: \[ \frac{d}{df} \Bigg[ -\frac{N}{2f^2} + 1 \Bigg] = \frac{d}{df} \Bigg[-\frac{N}{2f^2}\Bigg] = \frac{N}{f^3}. \] Since $f > 0$, the second derivative is positive, and so $f_0 = \sqrt{N/2}$ is a (global) minimum. Finally, we can conclude that \[ W_N(f_0) = \frac{N}{2f_0} + f_0 = \frac{N}{2\sqrt{N/2}} + \sqrt{N/2} = 2\sqrt{N/2} = \Theta(\sqrt{N}). \] Since $\sqrt{N/2}$ is seldom an integer, in our implementation we set $f = \lceil \sqrt{N/2} \rceil$.

(A Java implementation of the linked list under discussion may be found on GitHub.)