We first reinterpret the sums. To begin, expand the square in
(21.3.2) as a product of two copies of the summation:
\begin{equation*}
a(n) = \sum_{d_{i,1},d_{i,2} | n+h_i} \rho(d_{1,1},\dots,d_{k,1}) \rho(d_{1,2},\dots,d_{k,2}).
\end{equation*}
Then rewrite \(S_1\) and \(S_2^{(j)}\) as
\begin{align*}
S_1 \amp= \sum_{d_{i,1},d_{i,2}} \rho(d_{1,1},\dots,d_{k,1}) \rho(d_{1,2},\dots,d_{k,2}) \sum_{\substack{x \lt n \leq 2x \\ \lcm(d_{i,1},d_{i,2}) | n+h_i}} 1\\
S_2^{(j)} \amp= \sum_{d_{i,1},d_{i,2}} \rho(d_{1,1},\dots,d_{k,1}) \rho(d_{1,2},\dots,d_{k,2}) \sum_{\substack{x \lt n \leq 2x \\ \lcm(d_{i,1},d_{i,2}) | n+h_i}} \chi_P(n+h_j).
\end{align*}
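To see where the condition on least common multiples comes from, here is a sketch of the rewriting for \(S_1\text{.}\) It assumes (as the notation suggests, though the definition is not restated here) that \(S_1\) is the sum of \(a(n)\) over \(x \lt n \leq 2x\text{;}\) any further congruence condition on \(n\) would simply be carried along. Interchanging the order of summation gives
\begin{align*}
\sum_{x \lt n \leq 2x} a(n) \amp= \sum_{x \lt n \leq 2x} \, \sum_{d_{i,1},d_{i,2} | n+h_i} \rho(d_{1,1},\dots,d_{k,1}) \rho(d_{1,2},\dots,d_{k,2})\\
\amp= \sum_{d_{i,1},d_{i,2}} \rho(d_{1,1},\dots,d_{k,1}) \rho(d_{1,2},\dots,d_{k,2}) \sum_{\substack{x \lt n \leq 2x \\ d_{i,1} | n+h_i, \, d_{i,2} | n+h_i}} 1,
\end{align*}
and for each \(i\) the two conditions \(d_{i,1} | n+h_i\) and \(d_{i,2} | n+h_i\) together are equivalent to \(\lcm(d_{i,1},d_{i,2}) | n+h_i\text{.}\) The same manipulation with the weight \(\chi_P(n+h_j)\) attached to each \(n\) would give the formula for \(S_2^{(j)}\text{.}\)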
In \(S_2^{(j)}\text{,}\) the index \(i=j\) contributes only when \(d_{j,1}= d_{j,2} = 1\text{:}\) the summand vanishes unless \(n+h_j\) is prime, and the other option \(d_{j,*} = n+h_j\) lies beyond the cutoff. For \(i \neq j\text{,}\) we can think of first pinning \(n\) down among some number of arithmetic progressions modulo \(X := \prod_{i \neq j} \lcm(d_{i,1},d_{i,2})\) and then picking out prime values of \(n+h_j\) within each progression. We can estimate the effect of restricting to primes by replacing \(\chi_P(n+h_j)\) with \(\frac{X}{\varphi(X) \log (n+h_j)}\) in the sum, at the expense of creating a sum of error terms in the prime number theorem in arithmetic progressions.
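As a consistency check on this replacement, consider a single progression \(n+h_j \equiv a \pmod{X}\) with \(\gcd(a,X)=1\) (the residue \(a\) and the coprimality assumption are introduced here only for illustration), and treat \(\log(n+h_j)\) as \(\log x\) over the range \(x \lt n \leq 2x\text{.}\) Heuristically,
\begin{equation*}
\sum_{\substack{x \lt n \leq 2x \\ n+h_j \equiv a \pmod{X}}} \frac{X}{\varphi(X) \log(n+h_j)} \approx \left( \frac{x}{X} + O(1) \right) \frac{X}{\varphi(X) \log x} \approx \frac{x}{\varphi(X) \log x},
\end{equation*}
which is the number of primes in \((x,2x]\) in a single reduced residue class modulo \(X\) predicted by the prime number theorem in arithmetic progressions. The difference between this prediction and the true count is the error term referred to above.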
Lemma 21.7.
Set \(R = x^{1/4-\delta}\) for some small fixed \(\delta>0\text{.}\) For \(j=1,\dots,k\text{,}\) define
\begin{align*}
I_k(F) \amp= \int_0^1 \cdots \int_0^1 F(t_1,\dots,t_k)^2 \, dt_1 \cdots dt_k\\
J_k^{(j)}(F) \amp= \int_0^1 \cdots \int_0^1 \left( \int_0^1 F(t_1,\dots,t_k) \, dt_j \right)^2 dt_1 \cdots dt_{j-1} \, dt_{j+1} \cdots dt_k.
\end{align*}
Then, provided that these quantities are all nonzero, we have
\begin{align*}
S_1 \amp = \frac{(1+o(1)) \varphi(W)^k x (\log R)^k}{W^{k+1}} I_k(F)\\
S_2^{(j)} \amp = \frac{(1+o(1)) \varphi(W)^k x (\log R)^{k+1}}{W^{k+1} \log x} J_k^{(j)}(F).
\end{align*}
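For a concrete feel for these integrals, here is a small worked example with \(k=2\text{.}\) The choice of \(F\) is an assumption made purely for illustration and is not taken from the text: let \(F(t_1,t_2) = 1\) when \(t_1+t_2 \leq 1\) and \(F(t_1,t_2) = 0\) otherwise. Then
\begin{align*}
I_2(F) \amp= \int_0^1 \int_0^1 F(t_1,t_2)^2 \, dt_1 \, dt_2 = \int_0^1 (1-t_2) \, dt_2 = \frac{1}{2}\\
J_2^{(1)}(F) \amp= \int_0^1 \left( \int_0^1 F(t_1,t_2) \, dt_1 \right)^2 dt_2 = \int_0^1 (1-t_2)^2 \, dt_2 = \frac{1}{3},
\end{align*}
and \(J_2^{(2)}(F) = \frac{1}{3}\) by symmetry. Dividing the two asymptotic formulas above and using \(\log R / \log x = \frac{1}{4} - \delta\) gives, for any \(F\) to which the lemma applies,
\begin{equation*}
\frac{S_2^{(j)}}{S_1} = (1+o(1)) \left( \frac{1}{4} - \delta \right) \frac{J_k^{(j)}(F)}{I_k(F)},
\end{equation*}
so for this illustrative \(F\) (if it satisfies whatever hypotheses on \(F\) are in force) the ratio \(J_2^{(j)}(F)/I_2(F)\) would be \(\frac{2}{3}\) for \(j=1,2\text{.}\)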