
Krishnapur et al. [15] studied the fluctuations of nodal lengths of random Laplace eigenfunctions on the standard 2-torus. A key step in the paper is a nontrivial bound for the sixth-order correlation of the integer solutions of the equation |$m=x^2+y^2$|. This is a problem about a certain diophantine equation, studied here in depth using a variety of methods.

In a recent paper, Krishnapur et al. [15] studied the fluctuations of nodal lengths of random Laplace eigenfunctions on the standard 2-torus |${{\mathbb {T}}}={{\mathbb {R}}}^2/{{\mathbb {Z}}}^2$|. This problem is closely related to the fine distribution of solutions of the diophantine equation |$x^2+y^2=m$|, to be solved in integers x and y. A crucial point of their paper depends on showing that the variance of the nodal length (as defined in [15]) is small. This reduces (see [15], Section 2 and Theorem 2.2) to the following estimate.

Let |$\Lambda_m$| be the set of Gaussian integers |$\lambda$| with norm |$\lambda \bar {\lambda } =m$| and let |$N=|\Lambda_m|$|. Define |$S_6(m)$| to be the set of 6-correlations

\[ S_6(m) =\{(\lambda_1,\ldots,\lambda_6)\in \Lambda_m^6\,:\,\lambda_1+\lambda_2+\lambda_3=\lambda_4+\lambda_5+\lambda_6\}. \]

Then |$|S_6(m)|=o(N^4)$| when |$N\to \infty $|.

The proof of this result in Section 6 of [15], provided by J. Bourgain, depends on subtle sum-product theorems. The argument provides a quantitative version, with a rather small gain as a function of N.

On the other hand, the problem of estimating |$|S_6(m)|$| is a diophantine question, which may be explicitly written in the following elementary form.

 

Give a nontrivial upper bound for the number of integer solutions |$(x_i,y_i)$| (|$i=1,2,\ldots,6$|) of the system of equations

\begin{equation}\label{eq1} \begin{split} x_i^2+y_i^2&=m (i=1,2,\ldots,6),\quad \xi_i=x_i+\sqrt{-1} y_i, \\ \xi_1+\xi_2+\xi_3&=\xi_4+\xi_5+\xi_6. \end{split} \end{equation}

(1)

Solutions to this system will be called trivial if the |$\xi_i$| cancel out in pairs, degenerate if there is cancellation within a proper subset of the |$\xi_i$|, and nondegenerate otherwise. The obvious conjecture is that almost all solutions of the system (1) are trivial.
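For small m the system (1) can simply be enumerated. The following is a minimal sketch, not part of the original argument: it counts all solutions by brute force and compares the count with the number of solutions in which the right-hand triple is a rearrangement of the left-hand triple. These are only some of the trivial solutions, and the closed form 6N^3 - 9N^2 + 4N for them is an elementary count added here for illustration. The function names are hypothetical.

```python
from collections import Counter
from itertools import product
from math import isqrt


def lattice_points(m):
    """Return all integer pairs (x, y) with x^2 + y^2 = m."""
    r = isqrt(m)
    return [(x, y) for x in range(-r, r + 1) for y in range(-r, r + 1)
            if x * x + y * y == m]


def count_six_correlations(m):
    """Return (N, number of solutions of system (1)) by brute force.

    r3[s] is the number of ordered triples of points with sum s, so the
    number of solutions of xi_1+xi_2+xi_3 = xi_4+xi_5+xi_6 is sum r3[s]^2.
    """
    pts = lattice_points(m)
    r3 = Counter((a[0] + b[0] + c[0], a[1] + b[1] + c[1])
                 for a, b, c in product(pts, repeat=3))
    return len(pts), sum(v * v for v in r3.values())


def permutation_solutions(n):
    """Number of 6-tuples whose right side is a rearrangement of the left side."""
    return 6 * n**3 - 9 * n**2 + 4 * n


if __name__ == "__main__":
    for m in (5, 25, 65):
        n, s6 = count_six_correlations(m)
        print(m, n, s6, permutation_solutions(n))
```

The printed counts give a quick feel for how close the total is to the contribution of these rearrangements for small m; nothing here says anything about the asymptotic problem.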

In this paper, we study once again the problem of estimating |$|S_6(m)|$|, attacking it with a variety of methods. A summary of the results proved here is as follows.

In Section 2, Theorem 1, we provide a somewhat surprising proof of a bound in which we gain a power of N, using an argument based only on combinatorics. Section 3 interprets the problem as solving a certain S-unit equation, and we show in Theorem 3 that under certain conditions on the arithmetic structure of m the optimal bound |$|S_6(m)|=O(N^3)$| holds. In fact, there is an asymptotic estimate with main term given by the trivial solutions; this part of the paper depends on the deep Absolute Subspace Theorem of Evertse, Schlickewei, and Schmidt.

Sections 4–9 use arithmetic and algebraic geometry to reduce the problem to a difficult diophantine problem on certain elliptic curves. Although the results obtained in this way are only partial, the algebraic geometry objects encountered are very interesting on their own and, somewhat surprisingly, they were studied by Halphen in 1882 and a century later by Beauville in 1982, who pointed out connections with modular forms and certain simple examples of Shimura varieties. We believe that the new problems encountered are worthy of further study and for this reason we present our still incomplete results to the reader. More precisely, Section 4 reduces the problem to the study of a certain one-dimensional family of plane curves of degree 6 and genus 1 and Section 5 applies the arithmetical theory of elliptic curves to obtain some conditional results on the problem. In particular, Theorem 8 shows that if |$\log N/\log \log m \to \infty $| then a hypothetical failure of the conjectural bound |$O(N^{3+\varepsilon})$| is possible only if many of the elliptic curves associated to the problem have unbounded rank.

Section 6 shows that the totality of sextic curves associated to the problem is in fact a Halphen pencil and studies in detail its geometry. After blowing up the nine base points of the linear system one obtains a smooth, projective, rational, elliptic surface with one double fiber. Theorem 10 determines all special fibers of the elliptic fibering. Section 7 studies the associated jacobian surface, which turns out to be Type IV of the six rational elliptic surfaces, found by Beauville, which have the minimal number of special fibers all of them being semistable. These surfaces are rigid and are Shimura surfaces parametrizing elliptic curves with certain torsion groups, in our case |${{\mathbb {Z}}}/6$|⁠. Section 8 determines explicitly in Theorem 11 the minimal model of the fibers of the jacobian surface. This is used later on to study the rank of the group of rational points of a fiber. Section 9 shows, as a consequence of the preceding analysis, that the Szpiro ratio of the elliptic curves associated to a squarefree m is uniformly bounded from above (Proposition 13). Section 10 gives a refinement of Theorem 8 by using the result on the Szpiro ratio obtained in Section 9.

In the second half of this paper, occupying Sections 11–19, we study the average distribution of solutions to Equation (1), first for squarefree m without many small prime factors, showing that for most m Equation (1) has no nontrivial solutions, and then for squarefree m with a large number of prime factors, showing that for most m the number of nondegenerate solutions is small on the assumption of the still unproven, but widely believed, Birch and Swinnerton-Dyer conjecture and the Riemann hypothesis for L-functions of elliptic curves over |${\mathbb {Q}}$|⁠. The content of these sections is as follows.

Section 11 studies the problem for random squarefree numbers m. Theorem 14 deals with numbers m for which the smallest prime factor tends to |$\infty $| and shows that for almost all such numbers the system (1) has only trivial solutions. This is obtained by showing first that for almost all such m the sequence of prime factors of m grows at a rate faster than exponential and thereafter by studying the algebraic structure of the system (1) by a combinatorial method. Theorem 17 shows how to remove the condition on the smallest prime factor.

Sections 12–19 study the problem in what may be the most interesting case from the point of view of mathematical physics, namely the case in which N is very large with |$\log N$| proportional to |$\log m/\log \log m$|⁠. As yet, this cannot be done unconditionally and in order to study the distribution of the rank for the corresponding elliptic curves we relate it, as in the work of previous authors, to the analytic rank of the L-function of the elliptic curve via the Birch and Swinnerton-Dyer conjecture. In order to study the average behavior of the analytic rank, we make use of the still unproven Riemann hypothesis for these L-functions.

Sections 12–14 prepare the study of the rank by means of Heath-Brown's method of taking moments of high order of the rank. The major difficulty here consists in the fact that, unlike all previous work by other authors, the family of curves we study is very thin and arithmetically defined, so the classic averaging over smooth variables is not available to us. Thus we resort to methods from probability theory to deal with this difficulty. Section 15 begins the study of numbers m for which the number of prime factors is of order |$\log m/\log \log m$| and all such primes belong to the same dyadic interval. We define first a natural probability distribution on such numbers and then a lemma (conditional on the Birch and Swinnerton-Dyer conjecture and a Riemann hypothesis) is proved relating the average number of solutions of the system (1) to the average of an exponential function of the rank. Section 16 introduces a probability distribution function on the Gaussian primes associated to the numbers m and proves a large deviation estimate with Lemma 20. This is fundamental in the proof of the next rather technical but very important Lemma 21. Section 17 prepares Sections 18 and 19 where it is shown that for curves associated to numbers m with |$r \sim A^{-1}\log m/\log \log m$| factors, all in the same dyadic interval, the probability of the rank to be of order |$\log m/\log \log m$| is less than |$\delta^r$| for any fixed |$\delta>0$|, provided |$A>24$|. This is sufficient to conclude the proof of the final Theorem 25, proving the conjectural bound |$O(N^{3+\varepsilon})$| for a random m in the given family, conditionally on the Birch and Swinnerton-Dyer conjecture and the Riemann hypothesis for the associated L-functions.

2 Combinatorics

We denote by |${\mathcal X}$| the set of integer pairs |$\textbf{x}=(x,y)$| such that |$x^2+y^2=m$| and by |$N=|{\mathcal X}|$| its cardinality.

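It is readily seen that |$O(N^4)$| is the trivial bound and that we may assume |$\textbf{x}_i\ne \pm \textbf{x}_j$| for |$i\ne j$|, because the contribution of the solutions with |$\textbf{x}_i=\pm \textbf{x}_j$| for some |$i\ne j$| is the optimal |$O(N^3)$|.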

 

Theorem 1. The number of integer solutions of the system of equations

\begin{align*} x_i^2+y_i^2&=m (i=1,2,\ldots,6),\quad \xi_i=x_i+\sqrt{-1} y_i, \\ \xi_1+\xi_2+\xi_3&=\xi_4+\xi_5+\xi_6, \end{align*}

is at most |$O(N^{\frac 72})$|⁠.

Before embarking on the proof of this theorem, we need a simple lemma. Fix |$\xi_4,\xi_5,\xi_6$|. Then

\[ \xi_1+\xi_2+\xi_3=A+\sqrt{-1} B \]

is a known quantity. Thus we have five equations at our disposal:

\begin{equation} x_1+x_2+x_3=A,\quad y_1+y_2+y_3=B,\label{eq2}\end{equation}

(2)

\begin{equation}y_1^2=m-x_1^2,\quad y_2^2=m-x_2^2,\quad y_3^2=m-x_3^2,\label{eq3} \end{equation}

(3)

in the six unknowns |$x_1,x_2,\ldots,y_3$|. We start by eliminating variables, in order to get an algebraic relation between the first coordinates |$x_1$| and |$x_2$| of the two points |$\textbf{x}_1=(x_1,y_1)$| and |$\textbf{x}_2=(x_2,y_2)$|.

We begin by introducing new coordinates (u,t) defined by

\begin{equation}\label{eq4} u=x_1+x_2,\quad t=y_1+y_2. \end{equation}

(4)

We have |$(y_1,y_2)=(\pm \sqrt {m-x_1^2},\pm \sqrt {m-x_2^2})$| for an appropriate choice of the signs ±. Let ϕ be the map defined by

\[ \phi(\textbf{x}_1,\textbf{x}_2)=(x_1+x_2,y_1+y_2), \]

and set

\[ {\mathcal P} = \phi({\mathcal X}^2\setminus\{\textbf{x}_1=\pm\textbf{x}_2\}). \]

We have the following lemma.

 

Lemma 2. The map |$\phi$| is two-to-one.

 

Proof. Suppose that

\[ x_1+x_2=u,\quad y_1+y_2=t. \]

Clearly, |$(u,t)\ne (0,0)$| because |$\textbf{x}_1\ne \pm \textbf{x}_2$|. Then eliminating |$y_i$| in the equation |$t=y_1+y_2$| using |$x_i^2+y_i^2=m$| and |$x_1+x_2=u$|, we obtain

\[ x_1x_2= \frac{(u^2 + t^2)^2 - 4 t^2 m}{4(u^2+t^2)}. \]

Since |$x_1+x_2=u$| and |$x_1x_2$| are both determined, the pair |$(x_1,x_2)$| is determined up to a permutation. In the same way, the pair |$(y_1,y_2)$| is also determined up to permutation. Moreover, since |$x_1^2+y_1^2=m$| and |$x_2^2+y_2^2=m$|, we see that the possible pairs |$(x_1,y_2)$| and |$(x_2,y_1)$| do not occur unless u=0 or t=0, hence |$x_1=-x_2$| or |$y_1=-y_2$|.
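The lemma and the displayed formula for the product of the first coordinates can be checked numerically for a small value of m. The sketch below is only an illustration under the assumption m=65 (any m with a few representations would do); the helper names are hypothetical. It groups ordered pairs of distinct, non-antipodal points by their sum (u,t), then tests that every fiber has exactly two elements and that the product of first coordinates agrees with the formula.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product
from math import isqrt


def lattice_points(m):
    """Return all integer pairs (x, y) with x^2 + y^2 = m."""
    r = isqrt(m)
    return [(x, y) for x in range(-r, r + 1) for y in range(-r, r + 1)
            if x * x + y * y == m]


def sum_map_fibers(m):
    """Group ordered pairs (p, q), q != +-p, by their sum (u, t) = p + q."""
    pts = lattice_points(m)
    fibers = defaultdict(list)
    for p, q in product(pts, repeat=2):
        if q == p or q == (-p[0], -p[1]):
            continue
        fibers[(p[0] + q[0], p[1] + q[1])].append((p, q))
    return fibers


def product_formula_holds(m, fibers):
    """Check x1*x2 = ((u^2+t^2)^2 - 4 t^2 m) / (4 (u^2+t^2)) on every pair."""
    for (u, t), pairs in fibers.items():
        s = u * u + t * t  # nonzero because q != -p
        predicted = Fraction(s * s - 4 * t * t * m, 4 * s)
        if any(p[0] * q[0] != predicted for p, q in pairs):
            return False
    return True


if __name__ == "__main__":
    fibers = sum_map_fibers(65)
    print(len(fibers), {len(v) for v in fibers.values()},
          product_formula_holds(65, fibers))
```

Exact rational arithmetic is used so that the comparison with the formula involves no rounding.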

 

Proof of Theorem 1. Let |$(\textbf{x}_1,\textbf{x}_2,\textbf{x}_3)$| be a solution of our system of equations, namely Equations (2) and (3). Then Equation (2) can be written as

\begin{equation}\label{eq5} (A-u)^2+(B-t)^2=m, \end{equation}

(5)

because |$A-u=x_3$|, |$B-t=y_3$|, and |$x_3^2+y_3^2=m$|. This is a circle |$C_{A,B}$| in the (u,t)-plane, of radius |$\sqrt m$| and centered at the point (A,B).

We denote by |${\mathcal C}$| the set of circles |$C_{A,B}$| obtained in this way. This gives an incidence relation between the set |${\mathcal P}$| of points and the above set |${\mathcal C}$| of circles. Now we define

\[ \|C_{A,B}\|:=|C_{A,B}\cap {\mathcal P}|, \]

and denote by |$\nu(m)$| the number of solutions of the system (1) restricted so that |$\textbf{x}_i\ne \pm \textbf{x}_j$| (|$1\leqslant i<j\leqslant 3$|); this restriction changes the total count only by an error term |$O(N^3)$|.

By the lemma, the number ν(m) is

\[ \nu(m)=2\sum_{{(A,B)}\in {\mathcal C}} \|C_{A,B}\|^2. \]

Our first remark is that

\begin{equation}\label{eq6} \sum_{{(A,B)}\in {\mathcal C}} \|C_{A,B}\| \leqslant N^3, \end{equation}

(6)

because the triple |$(\textbf{x}_1,\textbf{x}_2,\textbf{x}_3)$| determines both (u,t) and (A,B). Moreover,

\begin{equation}\label{eq7} \|C_{A,B}\| \leqslant N, \end{equation}

(7)

because the equation |$(u-A)^2+(t-B)^2=m$| has exactly N integer solutions, confirming our previous statement that |$O(N^4)$| is a bound for the number of solutions of the system (1).

In order to obtain a better upper bound for |$\nu(m)$|, we relate |$\nu(m)$| to the incidence numbers of subsets of |${\mathcal C}$|, obtained as approximate level sets in the following way. Since |$\|C_{A,B}\|\leqslant N$|, we cover the interval [1,N] by means of |$O(\log N)$| dyadic intervals of type [D,2D] and accordingly we define |${\mathcal C}(D)$| to be the subset of |${\mathcal C}$| consisting of all curves |$C_{A,B}$| for which

\[ D \leqslant \|C_{A,B}\| < 2D. \]

It is obvious that |$D\leqslant N$| and, by Equation (6), we also have

\begin{equation}\label{eq8} D |{\mathcal C}(D)| \leqslant N^3. \end{equation}

(8)

Clearly, we have

\begin{equation}\label{eq9} \nu(m)=O\left(\sum_D D^2|{\mathcal C}(D)|\right), \end{equation}

(9)

where |$D \leqslant N$| runs over the dyadic intervals for which |${\mathcal C}(D)$| is not empty. Now we appeal to a deep theorem about incidences of points and curves, due to Szemerédi and Trotter in the case when the curves are lines or circles with equal radius, see [20]. The Szemerédi–Trotter theorem for circles yields

\begin{equation}\label{eq10} D|{\mathcal C}(D)| \leqslant I({\mathcal P},{\mathcal C}(D)) =O(|{\mathcal P}|^{\frac 23} |{\mathcal C}(D)|^{\frac 23} +|{\mathcal P}| + |{\mathcal C}(D)|), \end{equation}

(10)

where

\[ I({\mathcal P},{\mathcal C}(D))= \sum_{(A,B)\in {\mathcal C}(D)}\|C_{A,B}\| \]

is the incidence number.

If in the right-hand side of the inequality (10) the term |$|{\mathcal C}(D)|$| dominates, we have D=O(1) and we use the estimate (8) in this range of D. If instead the term |$|{\mathcal P}|<N^2$| dominates, we get

\begin{equation}\label{eq11} D|{\mathcal C}(D)| =O(N^2). \end{equation}

(11)

Otherwise, we have

\[ D|{\mathcal C}(D)| =O(N^{\frac 43} |{\mathcal C}(D)|^{\frac 23}), \]

whence

\begin{equation}\label{eq12} D|{\mathcal C}(D)| = O(N^4 D^{-2}). \end{equation}

(12)

This covers the alternative (11) because |$D\leqslant N$|⁠, hence (12) always holds. Now from Equations (9), (12), and (8) we conclude that

\[ \nu(m) = O\left(\sum_D\min(N^3 D,N^4 D^{-1})\right)= O(N^{\frac 72}), \]

because the sum over D runs over a geometric progression and is dominated by the term where D is about |$\sqrt N$|⁠.
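The last step can be illustrated numerically. The sketch below, which is only an illustration and not part of the proof, sums the quantity min(N^3 D, N^4/D) over the powers of two D up to N and compares the result with N^{7/2}. The function name is hypothetical; the constants 1/2 and 4 mentioned in the comments come from summing the two geometric progressions on either side of D near the square root of N.

```python
def dyadic_sum(n):
    """Sum min(n^3 * D, n^4 / D) over D = 1, 2, 4, ... with D <= n."""
    total = 0.0
    d = 1
    while d <= n:
        total += min(n**3 * d, n**4 / d)
        d *= 2
    return total


if __name__ == "__main__":
    for n in (16, 1000, 10**6):
        # The ratio stays between 1/2 and 4: the terms grow geometrically
        # up to D ~ sqrt(n) and decay geometrically afterwards.
        print(n, dyadic_sum(n) / n**3.5)
```

The ratio remaining bounded as n grows is exactly the statement that the sum is dominated by the terms with D close to the square root of N.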

Figure 1 shows incidence in the simplest case m=5×13. We present the dual picture, circles centered at |${\mathcal P}$| and points at |${\mathcal C}$|⁠.


Figure 1. Circles of radius |$\sqrt {65}$| centered at |${\mathcal P}$|, incidences as small circles centered at (A,B).

The set |${\mathcal X}$| consists of the 16 solutions (±1,±8), (±4,±7), (±7,±4), (±8,±1). We have |$|{\mathcal P}|=112$| and |$|{\mathcal C}|=372$|⁠.
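The incidence structure for m=65 is small enough to enumerate. The following sketch builds the point set and the set of centers (A,B) and computes the incidence numbers. It rests on one reading of the definitions, stated here as an assumption: the centers are taken to be the sums of three points that pairwise satisfy the restriction that no point equals plus or minus another. The resulting number of centers can be compared with the value reported in the text. The bounds (6) and (7) are checked directly; the function names are hypothetical.

```python
from itertools import product
from math import isqrt


def lattice_points(m):
    """Return all integer pairs (x, y) with x^2 + y^2 = m."""
    r = isqrt(m)
    return [(x, y) for x in range(-r, r + 1) for y in range(-r, r + 1)
            if x * x + y * y == m]


def admissible(p, q):
    """True when q is neither p nor -p."""
    return q != p and q != (-p[0], -p[1])


def incidence_data(m):
    """Return (N, point set P, dict mapping each center (A, B) to ||C_{A,B}||)."""
    pts = lattice_points(m)
    pts_set = set(pts)
    points = {(p[0] + q[0], p[1] + q[1])
              for p, q in product(pts, repeat=2) if admissible(p, q)}
    # Assumption: centers are sums of triples that are pairwise admissible.
    centers = {(p[0] + q[0] + s[0], p[1] + q[1] + s[1])
               for p, q, s in product(pts, repeat=3)
               if admissible(p, q) and admissible(p, s) and admissible(q, s)}
    # (u, t) lies on C_{A,B} exactly when (A - u, B - t) is a lattice point.
    norms = {c: sum((c[0] - u, c[1] - t) in pts_set for (u, t) in points)
             for c in centers}
    return len(pts), points, norms


if __name__ == "__main__":
    n, points, norms = incidence_data(65)
    print(n, len(points), len(norms), max(norms.values()), sum(norms.values()))
```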

3 The Unit Equation

Some additional evidence that the number of nondegenerate solutions of the system (1) is very small comes from reducing the problem to a question about S-unit equations.

Consider again the equation |$\xi_1+\xi_2+\xi_3=\xi_4+\xi_5+\xi_6$|. Then |$\xi_i$| belongs to the multiplicative group G generated by |$\pm 1$|, |$1\pm i$|, and the Gaussian primes |$\pi_i$|, |$\bar {\pi }_i$| such that |$\pi _i\bar {\pi }_i=p$| when p divides m and is congruent to 1 modulo 4. The |${\mathbb {Q}}$|-rank of this group is at most |$1+2\omega_{4,1}(m)$|, where |$\omega_{4,1}(m)$| is the number of prime factors |$p\equiv 1 \bmod 4$| of m, counted without multiplicity. Thus we have an equation

\[ \sum_{j=1}^5 \zeta_j =1, \quad (\zeta_j\in G,\ \zeta_j=\pm \xi_j/\xi_6). \]

A degenerate solution of this unit equation occurs if there is a subsum equal to 0, since multiplying the subsum by any element in G still yields a solution. A very deep theorem due to Evertse et al. [8] states that any unit equation |$a_1g_1+a_2g_2+\dots +a_ng_n=1$| in a multiplicative group G over |${\mathbb {C}}$| has at most |$\exp (c(n) (r+1))$| nondegenerate solutions where r is the |${\mathbb {Q}}$|-rank of the group and c(n) depends only on n. In fact, they give the value |$c(n)=(4n)^{3n}$| but this is unimportant here. In our case, |$n=5$| and |$r\leqslant 1+2\omega_{4,1}(m)$|.

 

Theorem 3. There is a positive absolute constant C such that the number of solutions of the system (1) is |$O(N^3)+O(N\times C^{\omega_{4,1}(m)})$|. In particular, if |$\log N/\omega _{4,1}(m)\ge \frac {1}{2} \log C$| the number of solutions of the system (1) is |$O(N^3)$|.

 

Remark 4. This result is not vacuous, because if |$m=\prod p^{\alpha _p}$| with |$p\equiv 1 \bmod 4$| or p=2 then |$N=2^{1+\varepsilon }\prod (\alpha _p+1)$| with |$\varepsilon=1$| or 0 according as m is even or odd. Hence, if the exponents |$\alpha_p$| are sufficiently large on average, then the above theorem applies.

 

Proof. In our case, we have a bounded number of unit equations in a finitely generated group of rank |$r\leqslant 1+2\omega _{4,1}(m)$|. Noting that |$\xi_6$| takes at most N values, the theorem of Evertse et al. yields the bound |$O(N\times C^{\omega_{4,1}(m)})$| for the number of nondegenerate solutions of (1), with |$\log C = 2\times 20^{15}$|. By Equation (20), we have |$N\geqslant 2^{1+\omega _{4,1}(m)}$|, hence this is |$O(N^3)$| if |$\log N/\omega _{4,1}(m)\ge \frac {1}{2} \log C$|. As noted before, the number of trivial solutions is |$O(N^3)$| and the theorem follows. Note also that Amoroso and Viada [1] have removed one exponential in the bound for c(n), see [1, Theorem 6.2].
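To make the size of the constants concrete, the arithmetic in this proof can be written out as follows. This is only a sketch of the bookkeeping, with hypothetical function names: it evaluates c(5) from the stated value of c(n), forms log C as twice that number (since r+1 is at most twice 1 plus the number of prime factors counted), and tests the sufficient condition of the theorem for a given value of log N, passed as a float because N itself would be far too large to store.

```python
def unit_equation_exponent(n):
    """The value c(n) = (4n)^(3n) quoted in the text."""
    return (4 * n) ** (3 * n)


# With n = 5 and r + 1 <= 2 * (1 + omega), exp(c(5) * (r + 1)) is at most
# C^(1 + omega) where log C = 2 * c(5).
LOG_C = 2 * unit_equation_exponent(5)


def condition_of_theorem(log_n, omega):
    """True when log N / omega >= (1/2) log C, i.e. N * C^omega <= N^3."""
    return 2 * log_n >= omega * LOG_C


if __name__ == "__main__":
    print(unit_equation_exponent(5), LOG_C)
    print(condition_of_theorem(1e20, 1), condition_of_theorem(1e19, 1))
```

The threshold is astronomically large, which is consistent with the remark that the exponents in the factorization of m must be large on average for the theorem to apply.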

Finally, we remark that it has been suggested that the true upper bound for the number of nondegenerate solutions of a unit equation could be subexponential in the rank, possibly |$\exp (c(n) r^{\beta (n)})$| for some |$\beta(n)<1$|; |$\beta(n)=1-1/(n+1)$| has been considered as a likely possibility. This of course would imply the bound |$O(N^3)$| for the number of solutions of the system (1) and even that the number of solutions is asymptotic to the number of degenerate solutions, with an error term |$O(N^{1+\varepsilon})$| for any fixed |$\varepsilon>0$|.

4 Elliptic Curves

We give a reduction of the problem to the study of certain curves of genus 1. This will provide evidence in favor of the conjecture that the number of solutions is |$O(N^{3+\varepsilon})$| for every fixed |$\varepsilon>0$|. We start with the equations

\[ x_1^2+y_1^2=m,\quad x_2^2+y_2^2=m,\quad (A-x_1-x_2)^2+(B-y_1-y_2)^2=m, \]

and eliminate y1, y2 by taking resultants, getting a polynomial relation between x1 and x2. Our calculations turn out to be simpler after making the affine change of variables

\[ x_1=\tfrac{1}{2} (u+v),\quad x_2=\tfrac{1}{2} (u-v), \]

thus |$u=x_1+x_2$| and |$v=x_1-x_2$|, and write

\[ K=m-A^2-B^2. \]

Using these coordinates, we obtain a plane sextic Φ in the (u,v)-plane, given by an equation

\begin{equation}\label{eq13} f(u,v):=U_6(u) + U_4(u)v^2+ U_2(u)v^4=0 \end{equation}

(13)

where |$U_2$|, |$U_4$|, |$U_6$| are certain polynomials of degree 2, 4, and 6 in u. We find

\begin{equation} U_6(u) =(K + 2 A u - u^2)^2 (K^2 + 16 A^2 m + 8 K m + 4 A (K - 4 m) u - 4 (K - m) u^2) ,\label{eq14}\end{equation}

(14)

\begin{align} U_4(u)&= -4 A^2 K^2 - 2 K^3 - 4 K^2 m - 4A K (4 A^2 + K +4 m) u \notag\\ &\quad +(32 A^2 K + 6K^2 - 32 A^2m) u^2 - 8 A(3K - 4m) u^3 + 8 (K - m) u^4,\label{eq15}\end{align}

(15)

\begin{align} U_2(u)=K^2 + 4 A K u - 4 (K - m) u^2.\label{eq16} \end{align}

(16)

A very remarkable feature of this change of variables is that the polynomial f(u,v) is only linear in m. This gives rise to a very interesting geometry.

We compute the genus of the plane curve Φ defined by the equation |$f(u,v)=0$|, by looking at its singularities. We use homogeneous coordinates [u:v:w] and write |$F(u,v,w)=w^6f(u/w,v/w)$| for the homogeneous form of f(u,v).

The line at infinity w=0 intersects the curve in three points, namely [0:1:0], [1:1:0], [1:−1:0]; these points are singular points. At the point [0:1:0] we have

\[ F(u,1,w) = 4(m-K) u^2 + 4 A K u w + K^2 w^2 +\hbox{higher order terms,} \]

hence we have an ordinary double point if |$K(A^2+K-m)\ne 0$|.

In a similar way, we verify that [1:1:0] is an ordinary double point if |$K(A^2+K-m)\ne 0$|. The analysis for the other point [1:−1:0] is the same, with the same result. We conclude that if |$K(A^2+K-m)\ne 0$|, then the three points at infinity are ordinary double points of Φ. There are six other finite singular points, which in the generic case are again ordinary double points. They are, writing for simplicity |$T=A^2+K$|:

\[ (A\pm\sqrt{{T}},0),\quad \left(\frac{A+\sqrt {T}}2,\pm \frac{ A -3\sqrt {T}}2\right),\quad \left(\frac{A-\sqrt {T}}2,\pm\frac{ A+3\sqrt {T}}2\right). \]

This gives a total of nine ordinary double points, so for generic A, K, m, the curve f(u,v)=0 has genus 1.

We label the singular points as follows:

  • |$P_{-1}$|, |$P_0$| and |$P_1$| for the three double points at infinity,

  • |$P_+$|, |$P_-$| for the two finite double points with v=0,

  • |$P_{++}$|, |$P_{+-}$|, |$P_{-+}$|, |$P_{--}$| for the remaining four double points,

according to the signs of the square root in the coordinates of the point. These nine points appear in a configuration of seven lines and nine points with every line containing three points, namely the triplets of points |$\{P_{-1},P_-,P_{+-}\}$|, |$\{P_{-1},P_{-+},P_+\}$|, |$\{P_0,P_{-+},P_{--}\}$|, |$\{P_0,P_{++},P_{+-}\}$|, |$\{P_1,P_{++},P_-\}$|, |$\{P_1,P_+,P_{--}\}$|, |$\{P_{-1},P_0,P_1\}$| (Figure 2).

Figure 2. The six full lines form a fully reducible sextic, occurring when m=0.

These points are defined over the field |${\mathbb {Q}}(A,\sqrt {A^2+K})$|⁠, which for general integers A, K, m, will be a quadratic extension of |${\mathbb {Q}}$|⁠. In that case, the field automorphism |$\sigma : \sqrt {A^2+K}\to -\sqrt {A^2+K}$| transforms a singular point into its conjugate over |${\mathbb {Q}}(\sqrt {A^2+K})$|⁠.

We have already seen that the double points at infinity require the condition |$K(A^2+K-m)\ne 0$|. A calculation for the six finite singularities shows that we have ordinary double points with normal crossing if and only if

\begin{equation}\label{eq17} K(A^2 + K)(A^2 + K- m)(8 A^2 + 9K)(K^2 - 16 A^2m- 8K m + 16m^2)\ne 0. \end{equation}

(17)

The above condition characterizes the generic situation where the parameters A, K, m, are independent. A nongeneric situation will occur not only if one of the above singularities is no longer an ordinary node, but also if new singularities appear or if there is a confluence of singularities. Confluence of singularities is covered by condition (17). One can show that the further conditions coming from reducibility or from the appearance of other singularities and not already covered by condition (17) can be summarized by

\begin{equation}\label{eq18} m (K-m) (K+m) (8m+K) \ne 0 \end{equation}

(18)

together with the limiting case |$m=\infty $|⁠.

We have proved the following result.

 

Theorem 5. Let |$x_i^2+y_i^2=m$| (i=1,2,3) satisfy the equations |$x_1+x_2+x_3=A$| and |$y_1+y_2+y_3=B$|, and let |$f(u,v)=U_6(u)+U_4(u)v^2+U_2(u)v^4$| be the polynomial of degree six defined by Equations (14)–(16). Then we have

\[ f(x_1+x_2,x_1-x_2) = 0. \]

Moreover, if A, K, m satisfy the conditions (17) and (18), then the plane sextic Φ defined by the equation |$f(u,v)=0$| has genus 1 and has nine ordinary double points, namely the three points at infinity [0:1:0], [1:1:0], [1:−1:0], and six other points located at

\[ (A\pm\sqrt{{T}},0),\quad \left(\frac{A+\sqrt {T}}2,\pm \frac{ A -3\sqrt {T}}2\right),\quad \left(\frac{A-\sqrt {T}}2,\pm\frac{ A+3\sqrt {T}}2\right) \]

with |$T=A^2+K$|.
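The first assertion of this result lends itself to a direct numerical check. The sketch below evaluates the sextic from Equations (14)–(16) on every ordered triple of lattice points for m=65. It assumes the relation K = m - A^2 - B^2, which is inferred from the two expressions T = A^2 + K and T = m - B^2 used in the text; the function names are hypothetical. Integer arithmetic is exact in Python, so a nonzero value would indicate a genuine discrepancy.

```python
from itertools import product
from math import isqrt


def lattice_points(m):
    """Return all integer pairs (x, y) with x^2 + y^2 = m."""
    r = isqrt(m)
    return [(x, y) for x in range(-r, r + 1) for y in range(-r, r + 1)
            if x * x + y * y == m]


def sextic(u, v, a, k, m):
    """f(u, v) = U6(u) + U4(u) v^2 + U2(u) v^4 as in Equations (13)-(16)."""
    u6 = ((k + 2 * a * u - u * u) ** 2
          * (k * k + 16 * a * a * m + 8 * k * m
             + 4 * a * (k - 4 * m) * u - 4 * (k - m) * u * u))
    u4 = (-4 * a * a * k * k - 2 * k**3 - 4 * k * k * m
          - 4 * a * k * (4 * a * a + k + 4 * m) * u
          + (32 * a * a * k + 6 * k * k - 32 * a * a * m) * u * u
          - 8 * a * (3 * k - 4 * m) * u**3 + 8 * (k - m) * u**4)
    u2 = k * k + 4 * a * k * u - 4 * (k - m) * u * u
    return u6 + u4 * v * v + u2 * v**4


def sextic_vanishes_on_solutions(m):
    """Check f(x1 + x2, x1 - x2) = 0 for every triple of lattice points."""
    pts = lattice_points(m)
    for p1, p2, p3 in product(pts, repeat=3):
        a = p1[0] + p2[0] + p3[0]
        b = p1[1] + p2[1] + p3[1]
        k = m - a * a - b * b  # assumed definition of K
        if sextic(p1[0] + p2[0], p1[0] - p2[0], a, k, m) != 0:
            return False
    return True


if __name__ == "__main__":
    print(sextic_vanishes_on_solutions(65))
```

For instance, the triple (1,8), (4,7), (7,4) gives A=12, B=19, K=-440 and the point (u,v)=(5,-3).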

Note also that the homogeneous part of highest degree of the polynomial f(u,v) is |$-4(K-m)u^2(u-v)^2(u+v)^2$|. Since it is reducible over |${\mathbb {Z}}$| and is not a power of an irreducible polynomial, Runge's method can be applied to determine effectively the integral points on Φ. However, the solutions of interest to us are fairly small in size compared with the size of coefficients and it remains unclear whether this method can be used efficiently to determine a good bound for their number.

5 Application of the Theory of Elliptic Curves

In this section, we will use basic definitions and facts about heights which can be gathered from [3], Ch. 9. The information gathered so far implies a partial result that we view as evidence for the validity of the conjecture that the number of solutions of our problem is |$O(N^{3+\varepsilon})$|. Curiously, the logic in our argument will show that at least one of the two statements in Theorem 8 is true, although as yet we are unable to tip the balance toward one of them by proceeding in this way.

In order to see this, we choose as origin 0 of the curve Φ of genus 1 one of the integral points |$(x_1+x_2,x_1-x_2)$| thus found, say |$0=(x_1^*+x_2^*,x_1^*-x_2^*)$|, and consider a minimal model |$E^*$| of Φ with 0 as origin of the group law. Since we work over |${\mathbb {Q}}$|, a global minimal model |$\phi : E^*\rightarrow \Phi $| and a global minimal discriminant |$\Delta(E^*)$| exist. Finding an explicit equation for the elliptic curve |$E^*$| may be fierce work, but what matters to us is that this is done through purely algebraic and arithmetic operations. The only information we need for our modest purposes is contained in the bounds

\begin{equation}\label{eq19} h(\Delta(E^*)) \leqslant c_1 \log m, \quad \hat h(\phi^{-1}(x_1+x_2,x_1-x_2)) \leqslant c_2 \log m \end{equation}

(19)

for the height of the discriminant of |$E^*$| and for the Néron–Tate height of the points on |$E^*$| which correspond to the integral points |$(x_1+x_2,x_1-x_2)$| we have found on Φ. Here |$c_1$| and |$c_2$| are absolute positive constants. The discriminant bound is obvious because the minimal discriminant is not larger than the usual discriminant, which is polynomial in m, A, K. The comparison with the Néron–Tate height is obtained by appealing to the explicit estimates obtained by Zimmer [21].

Let |$\mathfrak f(E^*)$| be the conductor of E*, which we identify with a positive integer (rather than defining it as a certain ideal) because we work over |${\mathbb {Q}}$|⁠. In our case, |$\mathfrak f(E^*)$| is divisible by the product of primes dividing the discriminant (which is not 1 for curves over |${\mathbb {Q}}$|⁠). The Szpiro ratio σ(E*) is then

\[ \sigma(E^*) = \frac {\log|\Delta(E^*)|}{\log{\mathfrak f}(E^*)} \]

and certainly |$1 \leqslant \sigma (E^*) \leqslant 2\log |\Delta (E^*)|$|⁠. We need one more result.

 

Lemma 6. Let |$r={\mathrm {rank}}(E^*({\mathbb {Q}}))\geqslant 1$|. Then

\[ |\{P\in E^*({\mathbb{Q}}) : \hat h(P) \leqslant c_2 \log m\}| \leqslant r^{r/2} (c_3\log m)^{r+2}\log\log(3m) + c_4. \]

Here c3 and c4 are absolute constants.

 

Proof. We consider the Mordell–Weil lattice |$\Lambda =E^*({\mathbb {Q}})/{\mathrm {tors}}$| in the space |${\mathbb {R}}^r=\Lambda \otimes {\mathbb {R}}$|, equipped with the distance |$\|\textbf {x}\|^2 = \hat h(\textbf {x})$|. Note that by a famous theorem of Mazur the order of the torsion group of |$E^*$| is uniformly bounded, so an upper bound for the number of rational points P in a ball of radius t is proportional to the number of lattice points in that ball.

With respect to this metric, the volume of a ball |$B_t(\textbf {y}):=\{\textbf {x} : \|\textbf {x}-\textbf {y}\|\leqslant t\}$| of radius t and center y is simply |$\alpha_r t^r$| with |$\alpha_r$| the volume of the unit ball. Next, we note that we can cover the ball |$B_T(\textbf{0})$| with not more than |$(\sqrt r T/t)^r$| balls |$B_t(\textbf{y})$| with centers |$\textbf{y}\in \Lambda$|. Now we recall Proposition 8 of Petsche's paper [18, p. 264], namely

 

Proposition 7. Let k be a number field of degree |$d=[k:{\mathbb {Q}}]$| and let E/k be an elliptic curve with Szpiro ratio σ. Then

\[ \left|\left\{P \in E(k)\,:\, \hat h(P) \le\frac{\log|{\rm Norm}_{k/{\mathbb{Q}}} \Delta(E/k)|} {2^{13}3 d\sigma^2} \right\}\right| \leqslant c_5 \log^2(c_6 d\sigma^2) \]

with c5=134,861 and c6=104,613.

See also David [7], where the first results of similar quality were proved in the semistable case.

We apply this with E=E*, |$k={{\mathbb {Q}}}$|⁠, d=1, taking

\[ T=\sqrt{c_2\log m},\quad t = \sqrt{\log|\Delta(E^*)|}/(64\sqrt 6 \sigma(E^*)). \]

We cover the set of points |$\Lambda\cap B_T(\textbf{0})$| with at most |$(\sqrt r T/t)^r$| balls |$B_t(\textbf{y})$| centered at points of Λ, so by the proposition we get the bound

\[ |\Lambda\cap B_T(\textbf{0})|\leqslant c_7 (\sqrt r\, T/t)^{r} \log(c_6\sigma(E^*)^2) \]

for the number of lattice points in |$B_T(\textbf{0})$|. The lemma follows using |$\log |\Delta (E^*)| \leqslant c_1\log m$| and |$\sigma \leqslant 2\log |\Delta (E^*)|$|.

Finally, we are able to prove a weak unconditional result.

Recall that the number N of integer solutions of the equation x2+y2=m is

\begin{equation}\label{eq20} N=2^{1+\varepsilon} \prod_{\substack{p^{a_p}\| m\\ p\equiv 1\bmod 4}}(a_p+1) \end{equation}

(20)

where the symbol |$a\,\|\,b$| means ‘a exactly divides b’ and |$\varepsilon=0$| or 1 according as m is odd or even. It is clear that if |$\omega_{4,1}(m)$| denotes the number of prime factors |$p\equiv 1 \bmod 4$| of m then |$N \geqslant 2^{1+\omega _{4,1}(m)}$|.
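The inequality between N and the number of prime factors congruent to 1 modulo 4 is easy to test by direct enumeration. The sketch below is an illustration only, with hypothetical function names: it counts the representations of m as a sum of two squares by brute force, counts the distinct primes congruent to 1 modulo 4 dividing m by trial division, and checks the inequality for all small m that are sums of two squares (for other m there are no representations at all).

```python
from math import isqrt


def count_representations(m):
    """Number N of integer pairs (x, y) with x^2 + y^2 = m."""
    total = 0
    r = isqrt(m)
    for x in range(-r, r + 1):
        rest = m - x * x
        y = isqrt(rest)
        if y * y == rest:
            total += 1 if y == 0 else 2  # y and -y
    return total


def omega_4_1(m):
    """Number of distinct primes p = 1 (mod 4) dividing m."""
    count = 0
    p = 2
    while p * p <= m:
        if m % p == 0:
            if p % 4 == 1:
                count += 1
            while m % p == 0:
                m //= p
        p += 1
    if m > 1 and m % 4 == 1:
        count += 1
    return count


if __name__ == "__main__":
    for m in (5, 65, 325, 1105):
        print(m, count_representations(m), omega_4_1(m))
```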

 

Theorem 8. At least one of the two following statements holds.

  • (I) If |$\log N/\log \log m \to \infty $|, the number of solutions of the system of Equations (1) is |$O(N^{3+o(1)})$|.

  • (II) There exist elliptic curves |$E/{\mathbb {Q}}$| of unbounded rank.

 

Remark 9. The condition |$\log N/\log \log m\to \infty $| is not vacuous, because |$\log N \geqslant (\log 2) \omega _{4,1}(m)$| and |$\omega_{4,1}(m)$| can be as large as |$(\frac {1}{2} +o(1))\log m/\log \log m$|.

 

Proof. Suppose the conclusion of (I) does not hold. Then there are |$\delta>0$| and an infinite sequence |$(m_\nu)$| with |$\log N_\nu /\log \log m_\nu \to \infty $| (|$N_\nu$| denotes the number of representations of |$m_\nu$| as the sum of two squares) such that the system (1) with |$m=m_\nu$| has at least |$N_\nu ^{3+\delta }$| solutions. The number of pairs (A,B) is |$O(N_\nu ^3)$|, hence there is a pair (A,B) such that there are at least |$N_\nu ^\delta $| solutions of |$x_1+x_2+x_3=A$|, |$y_1+y_2+y_3=B$|, |$x_i^2+y_i^2=m_\nu $| (i=1,2,3).

The corresponding elliptic curve |$E_{A,B}$| has at least |$N_\nu ^{\delta }$| integral points |$(x_1+x_2,x_1-x_2)$|, all of them having size at most |$2\sqrt {m_\nu }$|. Then the associated minimal model |$E^*$| has at least |$N_\nu ^{\delta }+O(1)$| rational points P with |$\hat h(P) \leqslant c_2 \log m_\nu $|. By the preceding lemma, if |$r_\nu$| is the rank of |$E^*$| we get

\[ N_\nu^\delta \le r_\nu^{r_\nu/2} (c_3\log m_\nu)^{r_\nu+2}\log\log(3m_\nu) + c_4. \]

Taking logarithms, we find (we may assume |$r_\nu \geqslant 1$|⁠)

\[ \delta \log N_\nu =O(r_\nu(\log r_\nu +\log\log m_\nu)). \]

If |$r_\nu >\log m_\nu $|⁠, conclusion (II) follows. Otherwise, |$\log r _\nu + \log \log m_\nu \leqslant 2\log \log m_\nu $| and we get

\[ \delta \log N_\nu =O(r_\nu \log\log m_\nu). \]

Now |$r_\nu \to \infty $| follows from |$\log N_\nu /\log \log m_\nu \to \infty $|⁠.

6 The Halphen Pencil and Rational Elliptic Surfaces

The elliptic curves we have studied give rise to a very interesting geometry associated with the rational elliptic surfaces constructed by Halphen in 1882, in a famous memoir [11].

The polynomial f(u,v), viewed as a polynomial in (A,T,u,v), is homogeneous of degree (1,2,1,2), thus we may eliminate one variable after a homogeneous change of coordinates. We set |$u=Ax$|, |$v=Ay$|, |$T=m-B^2=tA^2$|, |$m=\mu A^2$|. After factoring out |$A^8$| (assuming |$A\ne 0$|) we are left with a polynomial h, which we write as |$h=h_0+\mu h_\infty $|. Factoring |$h_0$| and |$h_\infty $| we find

\begin{align} h_0&=(t-1)(1 + \sqrt t - 2 x) (-1 + \sqrt t + 2 x) (1 + \sqrt t - x - y)\notag\\ &\quad \times(-1 + \sqrt t + x - y) (1 + \sqrt t - x + y) (-1 + \sqrt t + x + y),\label{eq21}\end{align}

(21)

\begin{align}h_\infty&= 4(-1 + t + 2 x - x^2 - x y) (-1 + t + 2 x - x^2 + x y)\notag\\ &\quad\times(2 + 2 t- 4 x + x^2 - y^2)\label{eq22} \end{align}

(22)

If A=0 and B≠0, there is a similar analysis obtained by interchanging the roles of A and B. The special case A=B=0 has little relevance for our goal of studying the system (1), because it corresponds to a degenerate solution of the system and, as is immediate from Equation (5), fixing |$(x_2,y_2)$| determines |$(x_1,y_1)$| as the intersection of two distinct circles, hence with no more than 2 points, so the number of solutions arising from A=B=0 is at most |$2N^2$|.

In this paragraph t is a purely transcendental parameter over |${\mathbb {Q}}$|. The curve |$E_{A,B}$| now depends only on t and linearly on μ, and is defined over the field |${\mathbb {Q}}(\mu)$|. We denote by |$E(\mu,t)$| this curve, considered as a curve over the function field |${\mathbb {Q}}(\mu ,t)$|. In fact, it is defined over the ring |${\mathbb {Z}}[\mu ,t]$|. This is in general a sextic with nine double points.

Now the basic remark is that the set |${\mathfrak {B}}$| of the nine singular points of E(μ,t) is independent of μ. The set |${\mathfrak {B}}$| is defined over |${\mathbb {Q}}(t)$|, but the points in |${\mathfrak {B}}$| are defined only over the quadratic extension |${\mathbb {Q}}(\sqrt t)$|. More precisely, the set |${\mathfrak {B}}$| consists of the three points at infinity [−1:1:0], [0:1:0], [1:1:0], the two points |$(1\pm \sqrt t,0)$|, and the four points |$((1\pm \sqrt t)/2,(1\mp 3\sqrt t)/2)$|, |$((1\pm \sqrt t)/2,(-1\pm 3\sqrt t)/2)$|. We label these points in the same way as in Section 4.

We have obtained a linear system of sextic curves with, generically, ordinary double points on a set |${\mathfrak {B}}$|, defined over |${\mathbb {Q}}(t)$|, consisting of nine distinct points in the projective plane |${\mathbb {P}}^2/{\mathbb {Q}}(t)$|. This provides an example of a Halphen pencil of degree 2, first studied by Halphen in 1882 in a classic memoir [11].

First of all, there is a unique cubic curve C0, defined over |${\mathbb {Q}}(t)$|, passing through the nine base points. In order to find this cubic, one could use a computer-aided calculation, but a direct way of determining it is as follows. Consider the linear system of plane cubics going through the seven double points

\begin{align*} &[0:1:0],\quad [1:1:0],\quad [-1:1:0],\\ &\left(\frac{1\pm \sqrt{t}}2,\frac{1\mp3 \sqrt{t}}2\right),\quad \left(\frac{1\pm \sqrt{t}}2,\frac{-1\pm3 \sqrt{t}}2\right), \end{align*}

namely the three points at infinity and the four singular points with y≠0. The first three points are defined over |${\mathbb {Q}}$|⁠, but the remaining four points are defined only over |${\mathbb {Q}}(\sqrt t)$|⁠.

These cubics arise from a linear system M of dimension 3. Two independent cubic forms in M are easily found. Since the three points of |${\mathfrak {B}}$| at infinity lie on a line, to find one such cubic it suffices to add to this line two other lines that go through the four finite singular points with y≠0. This will give the linear subsystem of dimension 1 of cubics which split into a conic plus the line at infinity. An easy calculation now shows that two independent cubics are defined by the equations

\[ C_1(x,y):=(2x- 1)^2 - t=0,\tag{$C_1$} \]

\[ C_2(x,y):=-2(1 +t) +4 x - x^2 + y^2 =0,\tag{$C_2$} \]

where we must add the line at infinity to the two conics. A third independent cubic is given by

\[ C_3(x,y):=(-2(1+t) + 4 x - x^2 + y^2)x=0\tag{$C_3$} \]

because it certainly vanishes at the four finite base points of M and because |$x(x^2-y^2)$| vanishes when (x,y) is one of (−1,1), (0,1), (1,1), hence |$C_3(x,y)$| vanishes at the three points at infinity. Therefore, the general cubic curve of M is given by the equation

\begin{equation}\label{eq23} \lambda C_1(x,y)+\mu C_2(x,y)+ \nu C_3(x,y)=0. \end{equation}

(23)

By using this parametrization it is easy to find a cubic that goes through all nine singular points of E(μ,t). This means that, in addition to the seven base points we have already imposed, we need the vanishing of λC1+μC2+νC3 at the two points |$(1\pm \sqrt t,0)$|⁠. This yields the unique cubic curve C0 defined by

\begin{equation}\label{eq24} C_0(x,y)=-(1 - t) (1 + 3 t) +4 (1 + t) x - (5 + 3 t) x^2 +(1 - t) y^2 + 2 x (x - y) (x + y)=0. \end{equation}

(24)
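Equation (24) can be spot-checked with exact rational arithmetic. The following is a minimal sketch, not part of the original argument: it specializes |$t=s^2$| with rational s (so that |$\sqrt t$| is rational) and tests that C0 vanishes at the six finite points of |${\mathfrak {B}}$|, that its cubic form vanishes at the three points at infinity, and that C0 lies in the system (23). The coefficients (1−t, 1−t, −2) used for the last test are our own computation and are not stated in the text; the function names are illustrative.

```python
from fractions import Fraction as Fr

def C1(x, y, t):
    return (2*x - 1)**2 - t

def C2(x, y, t):
    return -2*(1 + t) + 4*x - x**2 + y**2

def C3(x, y, t):
    return C2(x, y, t) * x

def C0(x, y, t):
    # Equation (24)
    return (-(1 - t)*(1 + 3*t) + 4*(1 + t)*x - (5 + 3*t)*x**2
            + (1 - t)*y**2 + 2*x*(x - y)*(x + y))

def finite_base_points(s):
    """The six finite points of B for t = s**2 (s plays the role of sqrt(t))."""
    pts = [(1 + s, Fr(0)), (1 - s, Fr(0))]
    for e in (1, -1):
        x = (1 + e*s) / 2
        pts += [(x, (1 - 3*e*s) / 2), (x, (-1 + 3*e*s) / 2)]
    return pts

def check_cubic(s):
    s = Fr(s)
    t = s*s
    through_finite = all(C0(x, y, t) == 0 for x, y in finite_base_points(s))
    # At infinity only the degree-3 form 2x(x - y)(x + y) of C0 matters.
    through_infinity = all(2*x*(x - y)*(x + y) == 0
                           for x, y in [(-1, 1), (0, 1), (1, 1)])
    # Assumed combination (our computation): C0 = (1-t)(C1 + C2) - 2*C3.
    grid = [Fr(i, 3) for i in range(-3, 4)]
    in_system = all(
        C0(x, y, t) == (1 - t)*(C1(x, y, t) + C2(x, y, t)) - 2*C3(x, y, t)
        for x in grid for y in grid)
    return through_finite, through_infinity, in_system
```

Since every polynomial involved has degree at most 3 in each variable, agreement on the 7×7 grid is already enough to establish the polynomial identity for the chosen value of t.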

The homaloidic transform of the projective plane |${\mathbb {P}}^2/{\mathbb {Q}}(t)$|⁠, by means of the linear system of all quartics with base points in |${\mathfrak {B}}$|⁠, is a nonsingular surface |$S_t/{\mathbb {Q}}(t)$| of degree |$4\times 4-|{\mathfrak {B}}|=7$|⁠, in projective space |${\mathbb {P}}^5/{\mathbb {Q}}(t)$| (the quartic forms depend on 15 homogeneous parameters, while we impose nine conditions at the base points).

The general sextic with double points at |${\mathfrak {B}}$| has genus |$g=\frac {(6-1)(6-2)}2-9=1$|. Its image |${\mathcal E}_\mu $| by the homaloidic transform has degree |$6\times 4 - 2 |{\mathfrak {B}}|=6$| and, generically, is not singular. The intersection number of the images of two sextics is 6×6−4×9=0, so the image of the linear system of curves |$h_0+\mu h_\infty =0$| gives to St a structure of elliptic surface, with a pencil of curves of degree 6. This is a particular case of the well-known family of rational elliptic surfaces of Halphen, with a multiple fiber of order 2 (the image of the double cubic). In our case, the special multiple fiber is a nonsingular elliptic curve of degree 3, taken with multiplicity 2.

We have the following theorem.

 

The surface St is a smooth, projective, rational elliptic surface, of degree 7 in |${\mathbb {P}}^5/{\mathbb {Q}}(t)$|. It has nine exceptional curves of the first kind. The general fiber |$\phi ^{-1}(\mu)$| of the pencil |$\phi : S_t\to {\mathbb {P}}^1$| is a smooth curve |${\mathcal E}_\mu $| of degree 6 and genus 1, bisecant to each of the nine exceptional curves of the first kind. The curve |$\mathcal E_\mu $| is the proper transform of the plane sextic with equation |$h_0+\mu h_\infty =0$|. The fibers over μ=0, |$\mu =\infty $|, μ=t, μ=t−1 are the only reducible fibers of the pencil. More precisely, all special fibers are described as follows.

  • The fiber |${\mathcal E}_0$| splits into the union of six lines Λi (i=1,2,…,6), which are the proper transform of the sextic h0(x,y)=0. The curve |${\mathcal E}_0$| is of Kodaira type I6, a loop of six lines.

  • The fiber |${\mathcal E}_\infty $| splits into the sum of three conics, image of the sextic that splits into the sum of three conics in the (x,y)-plane, passing through the three sextuples of points of |${\mathfrak {B}}$| given by

    \begin{align*} &\{P_{0}, P_{1}, P_{-}, P_{+}, P_{+-},P_{-+}\},\\ &\{P_{-1}, P_{0}, P_{-}, P_{+},P_{++},P_{--}\},\\ &\{P_{-1}, P_{1}, P_{+-}, P_{-+}, P_{++},P_{--}\}. \end{align*}

    The curve |${\mathcal E}_\infty $| has three ordinary double points and is of Kodaira type I3.

  • The fiber |${\mathcal E}_{t}$| is a nonsingular cubic curve C0, with multiplicity 2.

  • The fiber |${\mathcal E}_{t-1}$| splits into a line Λ, image of the line |$\ell _\infty $| at infinity, and a nonsingular rational quintic normal curve Γ5 in |${\mathbb {P}}^5$|⁠, image of a rational quintic curve Q5 with six finite double points belonging to |${\mathfrak {B}}$| and going simply through the remaining three points of |${\mathfrak {B}}$| at infinity. The curve |${\mathcal E}_{t-1}$| has two ordinary double points at the intersection of Λ and Γ5, hence it is of Kodaira type I2.

  • The fiber |${\mathcal E}_{(1 - t)/8}$| is the only other singular fiber, with one ordinary double point, image of the point |$(x,y)=(\frac {2}{3},0)$|⁠. It is of Kodaira type I1.

 

The fibers at μ=0 and |$\mu =\infty $| must be of special type, because our original problem degenerates in that case. The factorization of the polynomials h0 and |$h_\infty $|⁠, already provided in Equations (21) and (22), gives the reducible structure of these fibers in the Halphen pencil.

The value μ=t which gives the double cubic in the elliptic pencil in St is also easily found, because the corresponding sextic polynomial is |$C_0^2$|⁠. The value of μ is calculated from the equation |$C_0(0,0)^2=h_0(0,0)+\mu h_\infty (0,0)$|⁠.

In order to find the reducible fiber |${\mathcal E}_{t-1}$| we note that because h0=0 and |$h_\infty =0$| have the same points at infinity, there is a linear combination |$h_0+ \mu h_\infty $| such that the resulting polynomial has degree less than 6; the value of μ is μ=t−1. The corresponding plane sextic is reducible, acquiring the line at infinity |$\ell _\infty $|⁠, and its singularities outside of |${\mathfrak {B}}$| are also easily determined.
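The two special values μ=t and μ=t−1 can be tested directly on the factorizations (21), (22) and on Equation (24). The sketch below is only a numerical sanity check under the assumption |$t=s^2$| with s rational: it tests the identity |$h_0+t\,h_\infty =C_0^2$| on a grid, and it extracts the coefficient of degree 6 along a ray through the origin by a sixth finite difference, which should vanish exactly when μ=t−1. The helper names are ours.

```python
from fractions import Fraction as Fr
from math import comb

def h0(x, y, s):
    t = s*s
    return ((t - 1)*(1 + s - 2*x)*(-1 + s + 2*x)*(1 + s - x - y)
            *(-1 + s + x - y)*(1 + s - x + y)*(-1 + s + x + y))

def h_inf(x, y, s):
    t = s*s
    return (4*(-1 + t + 2*x - x**2 - x*y)*(-1 + t + 2*x - x**2 + x*y)
            *(2 + 2*t - 4*x + x**2 - y**2))

def C0(x, y, s):
    t = s*s
    return (-(1 - t)*(1 + 3*t) + 4*(1 + t)*x - (5 + 3*t)*x**2
            + (1 - t)*y**2 + 2*x*(x - y)*(x + y))

def pencil(x, y, s, mu):
    return h0(x, y, s) + mu*h_inf(x, y, s)

def top_degree_coefficient(s, mu, x, y):
    """Coefficient of r**6 in r -> pencil(r*x, r*y): a 6th finite difference over 6!."""
    return sum((-1)**(6 - k) * comb(6, k) * pencil(k*x, k*y, s, mu)
               for k in range(7)) / 720

def check_special_fibers(s):
    s = Fr(s)
    t = s*s
    grid = [Fr(i, 2) for i in range(-3, 4)]
    double_cubic = all(pencil(x, y, s, t) == C0(x, y, s)**2
                       for x in grid for y in grid)
    degree_drops = all(top_degree_coefficient(s, t - 1, x, y) == 0
                       for x in grid for y in grid)
    return double_cubic, degree_drops
```

A 7×7 grid suffices for the first test because both sides have degree at most 6 in each variable. For a value of μ other than t−1 the top coefficient is generally nonzero, so the degree really does drop only at μ=t−1.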

The proof of statement (v) is a bit more complicated. To this end, we use Kodaira's formula determining the Euler characteristic number in terms of the special fibers of the pencil, comparing the result with a direct computation using Zeuthen's formula for the arithmetic genus of the surface.

The surface St is rational, hence has arithmetic genus |$\chi ({\mathcal O}_{S_t})=1$| because it is a birational invariant. The first Chern number is |$c_1(S_t)^2=0$| because St is elliptic. Now Zeuthen's formula |$\chi ({\mathcal O}_{S_t})=(c_1(S_t)^2+c_2(S_t))/12$| shows that the Euler characteristic |$\chi (S_t)=c_2(S_t)$| of St is |$c_2(S_t)=12$|.

The Euler characteristic χ(St) is the sum

\[ \chi(S_t)= \sum_\mu \chi({\mathcal E}_\mu) \]

of the Euler characteristic of the fibers of the elliptic pencil. We have |$\chi ({\mathcal E}_\mu)=0$| if |$\mathcal E_\mu $| is a nonsingular fiber, while always |$\chi (\mathcal E_\mu)\geqslant 0$|⁠. For our surface, the reducible fibers we have found are |${\mathcal E}_0$|⁠, |${\mathcal E}_\infty $|⁠, |${\mathcal E}_{t-1}$|⁠, respectively, of Kodaira-type I6, I3, I2. Accordingly, we have

\[ \chi({\mathcal E}_0)=6,\quad \chi({\mathcal E}_\infty)=3,\quad \chi({\mathcal E}_{t-1})=2, \]

hence they contribute 6+3+2=11 to χ(St). Since χ(St)=12, there must be another singular fiber, necessarily of type I1 because other types would give a contribution of at least 2. Since the elliptic pencil is defined over |${\mathbb {Q}}(t)$|⁠, the uniqueness of this fiber implies that the node is also defined over |${\mathbb {Q}}(t)$|⁠.

It remains to find the values of the parameter μ such that the curve h(x,y)=0 acquires a double point, without becoming reducible. This must be a finite singularity, so we need to find for which values of μ≠0,t,t−1, the system of equations h=∂xh=∂yh=0 has a solution (x,y) lying outside of the singular set |${\mathfrak {B}}$|⁠. By what we have determined before, the solution (x,y) is unique, otherwise the curve h(x,y)=0 would be reducible, forcing μ to take one of the excluded values. Now h(x,y) is even in y, ∂xh(x,y) is even in y, but ∂yh(x,y) is odd in y. Therefore, since (x,y) is unique we must have y=0. Now it becomes a simple matter to find that μ=(1−t)/8 and |$x=\frac {2}{3}$|⁠.
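The location of the node can be confirmed by evaluating h and its two partial derivatives exactly. The sketch below uses dual numbers (numbers a+bε with ε²=0) over the rationals to obtain exact first derivatives; this device is our choice for the illustration and is not the method of the text. As before we specialize |$t=s^2$| with s rational, and the class and function names are hypothetical.

```python
from fractions import Fraction as Fr

class Dual:
    """a + b*eps with eps**2 = 0; the eps-part carries an exact first derivative."""
    def __init__(self, a, b=0):
        self.a, self.b = Fr(a), Fr(b)
    @staticmethod
    def lift(v):
        return v if isinstance(v, Dual) else Dual(v)
    def __add__(self, o):
        o = Dual.lift(o)
        return Dual(self.a + o.a, self.b + o.b)
    __radd__ = __add__
    def __neg__(self):
        return Dual(-self.a, -self.b)
    def __sub__(self, o):
        return self + (-Dual.lift(o))
    def __rsub__(self, o):
        return Dual.lift(o) - self
    def __mul__(self, o):
        o = Dual.lift(o)
        return Dual(self.a*o.a, self.a*o.b + self.b*o.a)
    __rmul__ = __mul__
    def __pow__(self, n):
        r = Dual(1)
        for _ in range(n):
            r = r * self
        return r

def h0(x, y, s):
    t = s*s
    return ((t - 1)*(1 + s - 2*x)*(-1 + s + 2*x)*(1 + s - x - y)
            *(-1 + s + x - y)*(1 + s - x + y)*(-1 + s + x + y))

def h_inf(x, y, s):
    t = s*s
    return (4*(-1 + t + 2*x - x**2 - x*y)*(-1 + t + 2*x - x**2 + x*y)
            *(2 + 2*t - 4*x + x**2 - y**2))

def value_and_gradient(s, mu, x0, y0):
    """Return (h, dh/dx, dh/dy) at (x0, y0) for h = h0 + mu*h_inf."""
    hx = h0(Dual(x0, 1), Dual(y0, 0), s) + mu*h_inf(Dual(x0, 1), Dual(y0, 0), s)
    hy = h0(Dual(x0, 0), Dual(y0, 1), s) + mu*h_inf(Dual(x0, 0), Dual(y0, 1), s)
    return hx.a, hx.b, hy.b

def node_check(s):
    s = Fr(s)
    mu = (1 - s*s) / 8
    return value_and_gradient(s, mu, Fr(2, 3), Fr(0))
```

For rational s with |$s\ne \pm 1/3$| (so that (2/3,0) is not one of the base points), `node_check(s)` is expected to return three zeros, in agreement with μ=(1−t)/8 and |$x=\frac {2}{3}$|.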

A canonical divisor KS of the surface St is −Γ0, where Γ0 is the nonsingular cubic in the double fiber of the elliptic pencil, see for example [4, Theorem 2]. Thus |−KS|=Γ0, and |−2KS| is the elliptic pencil.

Although the family of our surfaces St does not have a section (it has a double fiber), one can construct from St another rational elliptic surface Σt with isomorphic fibers, but now with simple fibers everywhere. This can be done analytically using an inverse logarithmic transformation of Kodaira, but in the case of rational elliptic surfaces it can also be done geometrically. The construction is classical and we give only an outline of it.

One starts with the jacobian |${\mathcal J}_\mu $| of the generic fiber |${\mathcal E}_\mu $| and proceeds to find a relative global minimal model |${\mathcal J}$| over the ring |${\mathbb {C}}(t)[\mu ]$|⁠. This can be done because |${\mathbb {C}}(t)[\mu ]$| is a principal ideal ring. Then, by assigning a closed point μ0 and a point of order 2 on the fiber |${\mathcal J}_{\mu _0}$|⁠, there is a principal homogeneous space of |${\mathcal J}$| (in modern parlance, a torsor) such that the fiber at μ0 is twice the fiber Jμ0. The zero-section of the minimal model J now lifts to a 2-section of the homogeneous space.

In order to obtain the surface St by this method, we need to find a torsion point of order 2, rational over |${\mathbb {Q}}(t)$|, on the jacobian J0 of Γ0. We can work directly on the cubic C0, which is canonically isomorphic over |${\mathbb {Q}}(t)$| to Γ0. Let |$P_{+}+P_{-}+P^*$| be the intersection divisor of C0 with the line y=0, where |$P_{\pm }=(1\pm \sqrt t,0)$|. (We have |$P^*=((1+3t)/2,0)$|.) By Abel's theorem, there is a linear equivalence

\[ 2 \sum_{P\in{\mathfrak{B}}} P \sim 6 (P_{+} +P_{-}+P^*), \]

as one sees comparing the intersections of C0 with a sextic singular at |${\mathfrak {B}}$| and with the line y=0, counted with multiplicity 6. Therefore, the divisor

\[ \Omega=\sum_{P\in{\mathfrak{B}}} P - 3 (P_{+}+P_{-}+P^*) \]

determines a point ω of order 2 on the jacobian J0 of Γ0, rational over |${\mathbb {Q}}(t)$|⁠. This point ω is not 0 because Ω is not a complete intersection, as one sees recalling that C0 is the unique cubic curve determined by the nine points of |${\mathfrak {B}}$|⁠. A detailed analytic construction, valid also in the limiting cases when the nine base points have infinitely near points, can be found in the paper of Fujimoto [9].

Thus, we have obtained a smooth rational elliptic surface |$\Sigma _t\to {\mathbb {P}}^1$|, defined over |${\mathbb {Q}}(t)$| and without multiple fibers, such that its fibers for μ≠t are isomorphic to |${\mathcal E}_\mu $|, while the fiber at μ=t is isomorphic to Γ0. All isomorphisms are defined over |${\mathbb {Q}}(t)$|, because so is the 2-torsion point ω. We simply write |$J_\mu $| for the fiber of |$\Sigma _t$| corresponding to the fiber |${\mathcal E}_\mu $| on St. The restriction map |${\mathcal E}_\mu \to J_\mu $| makes |${\mathcal E}_\mu $| a principal homogeneous space over |$J_\mu $|, of degree 2.

Since Σt has nine independent vertical components and a section, it is an extremal surface. All surfaces Σt are isomorphic, with the isomorphisms preserving the two fibers J0 and |$J_\infty $|⁠, but moving the other fibers. This can be seen by noting that the cross ratio of the points |$(\frac {1-t}8,t-1;\infty ,0)$| (the position of the fibers in increasing order of complexity) is the constant −8.
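The constancy of the cross ratio is a one-line computation, sketched here for a few rational values of t. We assume the convention |$(a,b;c,d)=\frac {(a-c)(b-d)}{(a-d)(b-c)}$|, whose limit as |$c\to \infty $| is (b−d)/(a−d); other conventions would give a different constant. The function names are ours.

```python
from fractions import Fraction as Fr

def cross_ratio_with_infinity(a, b, d):
    """Cross ratio (a, b; oo, d) = (b - d)/(a - d), the limit of
    (a - c)(b - d)/((a - d)(b - c)) as c tends to infinity."""
    return (b - d) / (a - d)

def fiber_cross_ratio(t):
    """Cross ratio of the fiber positions ((1-t)/8, t-1; oo, 0)."""
    t = Fr(t)
    return cross_ratio_with_infinity((1 - t) / 8, t - 1, Fr(0))
```

For every t≠1 the value is (t−1)/((1−t)/8)=−8, independent of t.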

Six sections of Σt rational over |${\mathbb {Q}}(t)$| are determined by the bisecants of the linear system of sextics that defines St. These bisecants are the lines

\begin{equation}\label{eq25} \begin{split} &\Lambda_{-1},\ \Lambda_0,\ \Lambda_1,\quad (\hbox{blow-ups of }P_{-1}, P_{0}, P_{1}),\\ &\Lambda_{*},\quad (\hbox{image of }y=0,\hbox{ joining }P_{+}\hbox{ and }P_{-}),\\ &\Lambda_{+-},\quad (\hbox{image of }3x+y=2,\hbox{ joining }P_{+-}\hbox{ and }P_{-+}),\\ &\Lambda_{++},\quad (\hbox{image of }3x-y=2,\hbox{ joining }P_{++}\hbox{ and }P_{--}). \end{split} \end{equation}

(25)

The last three sections intersect in a common point, image of the point |$(\frac {2}{3},0)$|⁠. The corresponding fiber where the three sections meet in a single point is the special fiber at μ=(1−t)/8, with the three sections meeting at the node of the fiber.

The six sections of Σt determined by the above bisecants of St are the only rational sections of Σt and generate a group isomorphic to |${\mathbb {Z}}/6$|⁠.

7 Changing Parameters

Since the surface Σt is unique up to isomorphism, we can eliminate the parameter t and work with a fixed surface Σ*. We do this by going to the model where the fibers I3, I2, I1, I6, are now located at |$\mu =0,1,-8,\infty $|⁠, because it gives a particularly simple equation to study. However, the isomorphism depends on t and we must be careful with questions of rationality over |${\mathbb {Q}}(t)$|⁠.

In what follows, we denote by λ a point on the parameter space of the elliptic fibration of Σ* and by |$J_\lambda $| the corresponding fiber over λ. Let |$\phi _{t'}:\Sigma _t\to \Sigma ^*$| be an isomorphism that preserves the elliptic pencil, but sends the fiber |$J_t$| to |$J_{t'}$| while exchanging |$J_0$| and |$J_\infty $|. Then it will act on the base |${\mathbb {P}}^1$| with an inversion μ→c/μ. Sending the fiber at μ=t−1 to |$J_1$| gives c=t−1. The fiber of type I1 at μ=(1−t)/8 then goes to λ=−8, and the fiber |$J_\mu $| goes to |$J_{(t-1)/\mu }$|. Therefore, we must have t′=(t−1)/t=1−1/t and

\begin{equation}\label{eq26} \phi_{t'}(J_\mu)=J_{(t-1)/\mu}. \end{equation}

(26)

Rational elliptic surfaces with a section are called ‘extremal’ if the vertical components of the fibers together with the section generate the Néron–Severi group. Extremal rational elliptic surfaces over |${\mathbb {C}}$|⁠, with a section and with semistable special fibers, have been classified by Beauville [2]. They are unique up to isomorphism and have only four singular fibers. There are six distinct types and our surface Σ* is type IV in Beauville's classification. A very simple equation for this surface Σ* is in [2], with the elliptic pencil given in affine coordinates by

\begin{equation}\label{eq27} (x + y)(x + 1)(y +1) + \lambda x y =0. \end{equation}

(27)

The six surfaces studied by Beauville are universal elliptic curves parametrizing elliptic curves over |${\mathbb {C}}$| having torsion groups of certain types. The parameter space is identified with |${\mathcal H}/\Gamma $| where |${\mathcal H}$| is the upper half-plane and |$\Gamma < SL_2({\mathbb {Z}})$| is a subgroup such that |$\mathcal H/\Gamma $| has genus 0. In our case, the surface Σ* is the unique surface of type [1,2,3,6] in Beauville's classification. The associated group Γ is the group |$\Gamma _0^0(6)$| and the Mordell–Weil group of Σ* is the torsion group |${\mathbb {Z}}/6$|. In particular, the parametric solutions of the system (1) in rational functions of A, B, m, are only the trivial ones.

Next, we obtain the extended Weierstrass form of the equation of the elliptic curve (27). This is obtained as the proper transform of the curve under the composition of the following sequence of birational transformations of the plane (x,y), first to (x,y′), then to (x′,y′), then to (x′,X), and finally to (X,Y):

\begin{align*} & x = x, \quad y = -1 + \frac{x y'}{2 + 2 x};\quad x = x' - \frac{1}{2} y' - \lambda, \quad y' = y';\\ & x' = x', \quad y' = 2\frac{X + \lambda}{x'};\quad X=X,\quad x'= 1 + \frac Y X. \end{align*}

The extended Weierstrass form turns out to be

\begin{equation}\label{eq28} Y^2 - (\lambda -2) X Y - \lambda Y = X^3 + 2 \lambda X^2 + 2 \lambda X. \end{equation}

(28)

The discriminant of this equation is

\[ \Delta = \lambda^3 (1-\lambda)^2 (8+\lambda). \]

8 The Minimal Model

Here we investigate the minimal discriminant of the Weierstrass equation (28) when m and A2+B2 arise from our problem (1).

We are interested in the fibers with rational λ, say λ=a/c with (a,c)=1. In the case of special interest to us, we have

\[ \lambda=\frac{t-1}\mu = \frac {m-A^2-B^2}{m}, \]

hence

\begin{equation}\label{eq29} m - A^2 - B^2 = d a,\quad m = d c,\quad d = {\mathrm{GCD}}(m,A^2+B^2). \end{equation}

(29)

The Weierstrass equation (28) has coefficients

\[ a_1= \frac{2c-a}{c},\quad a_3=-\frac{a}{c},\quad a_2=a_4=2\frac{a}{c},\quad a_6=0. \]

In order to get a minimal equation over |${\mathbb {Z}}$| we begin by clearing denominators, replacing |$a_i$| by |$c^ia_i$|. (See Silverman [19, III, §1].) This yields new parameters

\begin{equation}\label{eq30} \begin{split} &a'_1 = 2 c - a ,\quad a'_3 = -c^2 a,\\ &a'_2 = 2 c a,\quad a'_4= 2 c^3 a,\quad a'_6=0. \end{split} \end{equation}

(30)

The discriminant Δ′ and the invariants |$c'_4$| and |$c'_6$| of this elliptic curve are

\begin{equation}\label{eq31} \begin{split} \Delta' &= (-a)^3 (c-a)^2 (-8c-a) c^6,\\ c'_4 &= (a + 2 c) (a^3 + 6 a^2 c - 12 a c^2 + 8 c^3),\\ c'_6 &= -(-a^2 - 4 a c + 8 c^2) (-a^4 - 8 a^3 c - 8 a c^3 + 8 c^4). \end{split} \end{equation}

(31)
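The closed forms (31) can be compared with the standard formulas for the invariants of a Weierstrass equation. The sketch below, with function names of our choosing, computes |$b_2,b_4,b_6,b_8$| and then |$c_4$|, |$c_6$|, Δ from the coefficients (30) and compares them with the right-hand sides of (31) for small integers a, c. Taking c=1 and a=λ gives the same check for the discriminant of Equation (28).

```python
def invariants(a1, a2, a3, a4, a6):
    """Standard c4, c6 and discriminant of y^2 + a1xy + a3y = x^3 + a2x^2 + a4x + a6."""
    b2 = a1*a1 + 4*a2
    b4 = 2*a4 + a1*a3
    b6 = a3*a3 + 4*a6
    b8 = a1*a1*a6 + 4*a2*a6 - a1*a3*a4 + a2*a3*a3 - a4*a4
    c4 = b2*b2 - 24*b4
    c6 = -b2**3 + 36*b2*b4 - 216*b6
    disc = -b2*b2*b8 - 8*b4**3 - 27*b6*b6 + 9*b2*b4*b6
    return c4, c6, disc

def cleared_coefficients(a, c):
    """(a'_1, a'_2, a'_3, a'_4, a'_6) of Equation (30)."""
    return (2*c - a, 2*c*a, -c*c*a, 2*c**3*a, 0)

def closed_forms(a, c):
    """Right-hand sides of Equation (31), in the order (c'_4, c'_6, Delta')."""
    c4 = (a + 2*c)*(a**3 + 6*a*a*c - 12*a*c*c + 8*c**3)
    c6 = -(-a*a - 4*a*c + 8*c*c)*(-a**4 - 8*a**3*c - 8*a*c**3 + 8*c**4)
    disc = (-a)**3 * (c - a)**2 * (-8*c - a) * c**6
    return c4, c6, disc

def formulas_agree(a, c):
    return invariants(*cleared_coefficients(a, c)) == closed_forms(a, c)
```

A further consistency test is the classical relation |$c_4^3-c_6^2=1728\Delta $|, which the closed forms should also satisfy.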

The factors of the discriminant correspond to the special fibers of type I3, I2, I1, I6, located at |$\lambda = 0,1,-8,\infty $|.

In order to look for minimality at a prime p we recall that if |${\mathrm {ord}}_p(c'_4)<4$| or |${\mathrm {ord}}_p(c'_6)<6$|, then the given Weierstrass equation is minimal at p, see for example [19, VII, Remark 1.1]. Since the resultant of the two forms |$c'_4$| and |$c'_6$| with respect to a is |$-2^{20}3^{9}c^{24}$|, with respect to c is |$-2^{20}3^{9}a^{24}$|, and (a,c)=1, no prime p>3 divides both |$c'_4$| and |$c'_6$|; we infer that the Weierstrass equation with coefficients |$a'_i$| is minimal for every prime p>3.

If 3 does not divide c−a, then ord3(c′4)=0 and the equation remains minimal. If instead 3|c−a, then ord3(c′6)=3, so the equation is again minimal.

At the prime 2, in order to examine minimality we check first |${\mathrm {ord}}_2(c'_i)$|. If a is odd, then |${\mathrm {ord}}_2(c'_4)=0$| and the equation is minimal. If a is even, then c is odd and we verify that |${\mathrm {ord}}_2(c'_6)=5$| (hence the equation is minimal) unless a≡0 mod 4, in which case |${\mathrm {ord}}_2(c'_6)=6$| and |${\mathrm {ord}}_2(c'_4)\geqslant 4$|. In order to analyze this case, we set |$a=2^kv$|, where v is odd and |$k\geqslant 2$|, and we write |$c=r+2^ku$|. Then

\begin{align*} a'_1 &= 2 r + 2^k(2 u - v), \quad a'_3 = -2^k (r + 2^k u)^2 v,\\ a'_2 &= 2^{k+1} (r + 2^k u) v,\quad a'_4 = 2^{k+1} (r + 2^k u)^3 v,\quad a'_6=0. \end{align*}

If k=2, then the discriminant Δ′ is

\[ \Delta' = 64 v^3 (c-4v)^2 (8c+4v) c^6, \]

hence ord2(Δ′)=8<12 because c and v are both odd. Therefore, if k=2 the equation is minimal at the prime 2.

Finally, if |$k\geqslant 3$| we have

\[ {\rm ord}_2(a'_1)=1,\quad {\rm ord}_2(a'_3)=k, \quad {\rm ord}_2(a'_2)=k+1, \quad {\rm ord}_2(a'_4)=k+1, \]

and a′6=0. Now the transformations X=4X′, Y =8Y ′, yield a new equation with coefficients in |${\mathbb {Z}}$|⁠, namely

\begin{equation}\label{eq32} Y^{\prime 2} + a''_1 X' Y' +a''_3 Y'=X^{\prime 3}+a''_2 X^{\prime 2}+a''_4 X', \end{equation}

(32)

where the coefficients and discriminant are given by

\[ a''_i=2^{-i}a'_i \quad (i=1,2,3,4),\quad \Delta''=2^{-12}\Delta'. \]

The coefficient a′′1 has order ord2(a′′1)=0, hence no further reduction at the prime 2 is possible and the Weierstrass equation is minimal at the prime 2.
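The 2-adic statements of this discussion are easy to test on small examples. The sketch below assumes c odd and |$a=2^kv$| with v odd, and checks the claimed orders of |$c'_4$|, |$c'_6$|, Δ′ and of the coefficients |$a'_i$|, using the closed forms (30) and (31). It is a brute-force illustration with hypothetical function names, not a proof.

```python
def ord2(n):
    """2-adic valuation of a nonzero integer."""
    if n == 0:
        raise ValueError("ord2(0) is infinite")
    n, k = abs(n), 0
    while n % 2 == 0:
        n //= 2
        k += 1
    return k

def c4_c6_disc(a, c):
    """Closed forms (31)."""
    c4 = (a + 2*c)*(a**3 + 6*a*a*c - 12*a*c*c + 8*c**3)
    c6 = -(-a*a - 4*a*c + 8*c*c)*(-a**4 - 8*a**3*c - 8*a*c**3 + 8*c**4)
    disc = (-a)**3 * (c - a)**2 * (-8*c - a) * c**6
    return c4, c6, disc

def claims_hold(k, v, c):
    """Check the statements at the prime 2 for a = 2**k * v, with v and c odd."""
    a = 2**k * v
    c4, c6, disc = c4_c6_disc(a, c)
    if k == 1:                      # a = 2 mod 4
        return ord2(c6) == 5
    ok = ord2(c6) == 6 and ord2(c4) >= 4
    if k == 2:                      # minimal because ord2(disc) = 8 < 12
        return ok and ord2(disc) == 8
    # k >= 3: the substitution X = 4X', Y = 8Y' gives a''_i = a'_i / 2**i.
    coeffs = {1: 2*c - a, 2: 2*c*a, 3: -c*c*a, 4: 2*c**3*a}
    ords = {i: ord2(x) for i, x in coeffs.items()}
    integral = all(coeffs[i] % 2**i == 0 for i in coeffs)
    return ok and ords == {1: 1, 2: k + 1, 3: k, 4: k + 1} and integral
```

In the case |$k\geqslant 3$| the test `ords[1] == 1` says precisely that |$a''_1=a'_1/2$| is odd, which is the reason given in the text for stopping the reduction at the prime 2.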

 

Let λ=a/c with (a,c)=1. Then a Weierstrass equation of the fiber Jλ of the Beauville surface is

\begin{equation}\label{eq33} y^2 + (2 c - a) x y - c^2 a y = x^3 + 2 c a x^2 + 2 c^3 a x, \end{equation}

(33)

with discriminant

\[ \Delta = (- a)^3(c - a)^2\,(- 8 c - a)c^6. \]

The point (0,0) is a torsion point of order 3 and the point |$(-c^2,c^3)$| is a torsion point of order 2.

This Weierstrass equation is a minimal model over |${\mathbb {Z}}$|⁠, unless c≡1 mod 2 and a≡0 mod 8. In this special case, a minimal model is

\begin{equation}\label{eq34} y^2 +\tfrac{1}{2} (2 c - a) x y - \tfrac 18 c^2 a y = x^3 + \tfrac{1}{2} c a x^2 + \tfrac 18 c^3 a x, \end{equation}

(34)

with discriminant

\[ \Delta = 2^{-12} (- a)^3(c - a)^2(- 8 c - a) c^6. \]

The point (0,0) is a torsion point of order 3 and the point |$(-c^2/4,c^3/8)$| is a torsion point of order 2.

 

In the preceding discussion we have proved all statements except the statement about torsion. This can be obtained by direct computation from the formulas in [19], III, Group Law Algorithm 2.3. Incidentally, we may note that the Weierstrass equation (34) is an example of nonintegral 2-torsion with maximal denominators in both coordinates, see [19, VII, Theorem 3.4].
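The direct computation of the torsion can be reproduced with the usual chord-and-tangent formulas for a general Weierstrass equation over the rationals. The following is a minimal sketch with our own function names; `None` stands for the point at infinity, and the sample values of (a,c) are arbitrary choices with nonzero discriminant.

```python
from fractions import Fraction as Fr

def add(P, Q, co):
    """Group law on y^2 + a1xy + a3y = x^3 + a2x^2 + a4x + a6; None is the origin."""
    a1, a2, a3, a4, a6 = co
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and y1 + y2 + a1*x2 + a3 == 0:
        return None                              # Q = -P
    if x1 == x2:                                 # doubling
        den = 2*y1 + a1*x1 + a3
        lam = (3*x1*x1 + 2*a2*x1 + a4 - a1*y1) / den
        nu = (-x1**3 + a4*x1 + 2*a6 - a3*y1) / den
    else:                                        # chord
        lam = (y2 - y1) / (x2 - x1)
        nu = (y1*x2 - y2*x1) / (x2 - x1)
    x3 = lam*lam + a1*lam - a2 - x1 - x2
    return (x3, -(lam + a1)*x3 - nu - a3)

def order(P, co, bound=12):
    """Order of P if it is at most `bound`, else None."""
    Q, n = P, 1
    while Q is not None:
        Q, n = add(Q, P, co), n + 1
        if n > bound:
            return None
    return n

def model_33(a, c):
    return (Fr(2*c - a), Fr(2*c*a), Fr(-c*c*a), Fr(2*c**3*a), Fr(0))

def model_34(a, c):
    return (Fr(2*c - a, 2), Fr(c*a, 2), Fr(-c*c*a, 8), Fr(c**3*a, 8), Fr(0))

def torsion_orders(co, T):
    """Orders of (0,0), of T, and of their sum."""
    P = (Fr(0), Fr(0))
    return order(P, co), order(T, co), order(add(P, T, co), co)
```

For example, with (a,c)=(3,5) in Equation (33) and |$T=(-c^2,c^3)$|, or with (a,c)=(8,1) in Equation (34) and |$T=(-c^2/4,c^3/8)$|, the three orders should be 3, 2 and 6, consistent with a torsion subgroup containing |${\mathbb {Z}}/6$|.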

 

Let m, A, B, be the parameters associated to the system of Equations (2), (3), and suppose that

\[ (A^2 + B^2) (m -A^2 - B^2) (9m -A^2 - B^2) m \ne 0. \]

Then the plane sextic Φ defined in Section 2 has genus 1. A minimal Weierstrass model of the jacobian of Φ is the model given above for the fiber |$J_{a/c}$| of the Beauville surface, where

\[ m-A^2-B^2 = a d,\quad m = c d,\quad d = {\rm GCD}(m,A^2+B^2). \]

We conclude this section with two simple remarks. Let d=GCD(K,m). Then we verify that

\begin{align}\label{eq35} f(u,v)&\equiv 4 u (2 A - u - v) (2 A - u + v) \nonumber\\ &\quad \times(K (A - u) (u - v) (u + v)+m u (2 A - u - v) (2 A - u + v))\bmod d^2. \end{align}

(35)

This congruence may be useful if we want to study the integer solutions of the equation f(u,v)=0 by Runge's method.

In order to get further information about the equation f(u,v)=0 we use the fact that the Halphen pencil has a double fiber, which we have determined explicitly in Section 6 with Equation (24). The double fiber corresponds to the value |$m=K+A^2$|, which in turn determines the decomposition

\begin{equation}\label{eq36} \begin{split} f(u,v)&= c(u,v)^2 + 4 (m-K-A^2) g_1(u,v)g_2(u,v)g_3(u,v),\\ g_1(u,v)&=K + 2 A u - u^2 - u v,\\ g_2(u,v)&=K + 2 A u - u^2 + u v,\\ g_3(u,v)&=4 A^2 + 2K - 4 A u +u^2 - v^2, \end{split} \end{equation}

(36)

where

\begin{equation}\label{eq37} c(u,v)=3K^2+4 A^2K + (4AK+8A^3)u - (8 A^2 +3K)u^2 -Kv^2+ 2 A u^3 - 2 A u v^2. \end{equation}

(37)

Now m−K−A2=m−(m−A2−B2)−A2=B2, hence if f(u,v)=0 we see that 2B divides c(u,v) and −g1(u,v)g2(u,v)g3(u,v) is a perfect square, because

\begin{equation}\label{eq38} c(u,v)^2 + 4 B^2 g_1(u,v)g_2(u,v)g_3(u,v) =f(u,v)=0. \end{equation}

(38)

9 The Szpiro Ratio

A consequence of the computation of the minimal discriminant is that it gives information about the Szpiro ratio of the particular elliptic fibers of the Beauville surface which are associated to our problem. We have the following.

 

Suppose that m is squarefree. Then the Szpiro ratio σ(Jλ) of an elliptic curve Jλ with λ=(m−A2−B2)/m as in Corollary 12 is bounded by

\begin{align}\label{eq39} 1 \leqslant \sigma(J_\lambda) < 39. \end{align}

(39)

 

We have

\begin{align*} m-A^2-B^2 &= m-(x_1+x_2+x_3)^2-(y_1+y_2+y_3)^2\\ &= -2m - 2x_1x_2-2x_1x_3-2x_2x_3 - 2y_1y_2-2y_1y_3-2y_2y_3 \\ & >-2m - 2\sum_{i=1}^3(x_i^2+y_i^2) = -8m. \end{align*}

Also m−A2−B2=ad, m=cd with d=GCD(m−A2−B2,m), hence

\begin{equation}\label{eq40} -8c <a <c. \end{equation}

(40)
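The bounds (40) can be illustrated by brute force for a small value of m. The sketch below enumerates all triples of lattice points on the circle |$x^2+y^2=m$|, discards the cases |$A^2+B^2=0$| and |$A^2+B^2=9m$| (which are excluded by the nonvanishing hypothesis of Corollary 12), and returns the resulting pairs (a,c). The function names and the choice m=65 in the usage note are ours.

```python
from math import gcd, isqrt
from itertools import product

def lattice_points(m):
    """All integer points (x, y) with x^2 + y^2 = m."""
    pts = []
    for x in range(-isqrt(m), isqrt(m) + 1):
        r = m - x*x
        y = isqrt(r)
        if y*y == r:
            pts.extend({(x, y), (x, -y)})   # the set merges the two copies when y = 0
    return pts

def reduced_parameters(m):
    """Pairs (a, c) with m - A^2 - B^2 = a*d, m = c*d, d = GCD(m, A^2 + B^2)."""
    out = set()
    for p1, p2, p3 in product(lattice_points(m), repeat=3):
        A = p1[0] + p2[0] + p3[0]
        B = p1[1] + p2[1] + p3[1]
        S = A*A + B*B
        if S == 0 or S == 9*m:
            continue
        d = gcd(m, S)
        out.add(((m - S)//d, m//d))
    return out
```

For m=65 there are 16 lattice points, hence 4096 triples, and every pair (a,c) returned should satisfy −8c<a<c with a and c coprime.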

By Theorem 11, the discriminant Δ of the elliptic curve |$J_\lambda $| is |$(-a)^3(c-a)^2(-8c-a)c^6$| unless c≡1 mod 2 and a≡0 mod 8, in which case we must divide the above quantity by |$2^{12}$|. Thus it follows from the bounds (40) that

\[ \log|\Delta| \leqslant 12 \log c +10.06. \]

On the other hand, c divides m and m is squarefree, hence c is squarefree. Therefore, noting that c or c/2 is a divisor of the minimal discriminant Δ, we also have that c or c/2 is a divisor of the conductor N(Jλ) of Jλ. Since the conductor is at least 2, we conclude that

\[ \sigma(J_\lambda) = \frac{\log |\Delta|}{\log N({\mathcal J}_\lambda)} \leqslant \frac{12 \log c + 10.06}{\max(\log 2, \log c -\log 2)} < 38.52. \]
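The two numerical constants 10.06 and 38.52 can be recovered with a short floating-point computation. The sketch below is only a numerical illustration: with x=a/c in the range (40), it maximizes |$\log (|x|^3(1-x)^2(8+x))=\log |\Delta |-12\log c$| on a grid, and it evaluates the final quotient for integer c. The grid size and the range of c are arbitrary choices of ours.

```python
import math

def log_disc_minus_12_log_c(x):
    """log(|x|^3 (1-x)^2 (8+x)) for x = a/c with -8 < x < 1, x != 0."""
    return 3*math.log(abs(x)) + 2*math.log(1 - x) + math.log(8 + x)

def max_on_grid(n=100003):
    """Largest value of the function above on a uniform grid of (-8, 1)."""
    best = -math.inf
    for i in range(1, n):
        x = -8 + 9*i/n
        if abs(x) < 1e-12:
            continue
        best = max(best, log_disc_minus_12_log_c(x))
    return best

def szpiro_upper(c):
    """The upper bound for the Szpiro ratio as a function of c >= 1."""
    return (12*math.log(c) + 10.06) / max(math.log(2), math.log(c) - math.log(2))
```

The maximum of the first function is attained near |$x=-3-\sqrt {13}$| and is about 10.054, and the quotient is largest at c=4, where it equals |$24+10.06/\log 2\approx 38.51$|. Dividing Δ by |$2^{12}$| in the exceptional case only lowers the numerator.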

10 A Remark on Bounds for the Rank

Given a triple (m,A,B) arising from a solution of our system of Equations (1), we want to study the behavior of the associated curve of genus 1, which we have shown is the fiber |${\mathcal E}_\mu $| of the surface Σt, where t=(m−B2)/A2 and μ=m/A2.

The surface Σt is isomorphic to another elliptic surface Σ* independent of t, such that its fibers over its base |${\mathbb {P}}^1$| are obtained by a projective transformation sending μ to λ=(t−1)/μ=(m−A2−B2)/m. Thus the surface Σ* contains as fibers all curves of genus 1 associated to the system (1), for varying m, A, B, through the identification of |${\mathcal E}_\mu $| with the fiber over λ=(m−A2−B2)/m in the fibration |$\Sigma ^* \to {\mathbb {P}}^1$|⁠.

As noted before, the jacobian surface of the surface Σ* is the unique Beauville surface of type (1,2,3,6) with fibers of type (I3,I2,I1,I6) located with λ at |$(0,1,-8,\infty)$|⁠, respectively. Now we have shown that the surface Σt has six bisections, namely the blow-ups of the three base points at |$\infty $| and the images of the lines y=0, 3x+y=2, 3x−y=2 of the Halphen pencil. These bisections are defined over |${\mathbb {Q}}$| and yield correspondingly bisections of Σ*, again defined over |${\mathbb {Q}}$|⁠. Let us choose one of these bisections and for each fiber |${\mathcal E}_\lambda $| let D(λ) be the divisor of degree 2, defined over |${\mathbb {Q}}(\lambda)$|⁠, intersection of the fiber with the bisection. Then the class map P↦cl(2P−D(λ)) yields a finite morphism |$cl_\mu : {\mathcal E}_\mu \to J_\lambda $| of degree 4.

Consider now a fiber |${\mathcal E}_\mu $| with |$\mu \in {\mathbb {Q}}$|⁠. If |${\mathcal E}_\mu $| has no rational points, we will say (by abuse of language) that it has rank 0 over |${\mathbb {Q}}$|⁠, although the notion of rank over |$\mathbb Q$| has, strictly speaking, no meaning here. If however it has a rational point P0 (the case we are interested in), then |${\mathcal E}_\mu $| has a structure of elliptic curve with P0 as origin. The morphism |$cl_\mu :{\mathcal E}_\mu \to J_\lambda $| we have just defined is the composition of a |${\mathbb {Q}}$|-isogeny of degree 4 and the translation by 2P0−D(λ). Therefore, |${\mathcal E}_\mu $| and Jλ have the same rank, say r(λ). This allows us to study the behavior of the rank of the curves |${\mathcal E}_\mu $| by studying the rank r(λ) of Jλ, as λ=(m−A2−B2)/m varies.

The following discussion supports the statement that Theorem 8 still holds, at least for m squarefree, even if we omit the condition |$\log N/\log \log m \to \infty $|.

We have the lower bound (see Hindry and Silverman [14], David [7], Petsche [18], Theorem 2, p. 259)

\[ \hat h(P) \geqslant c(d,\sigma) \log|{\rm Norm}_{k/{\mathbb{Q}}}\Delta(E/k)| \]

for P not a torsion point, where |$c(d,\sigma)=1/(10^{15}d^3\sigma ^6\log ^2(c_6d\sigma ^2))$|⁠; the uniform bound |$\hat h(P) \gg \log |{\rm Norm}_{k/{\mathbb {Q}}}\Delta (E/k)| $| that follows from the Szpiro conjecture is a well-known conjecture due earlier to Lang (apparently, it is weaker than the Szpiro conjecture). If we take

\[ t=\tfrac{1}{2}\sqrt{c(1,\sigma)\log|\Delta(J_\lambda)|} \]

the balls of radius t centered at the points |$P\in \Lambda \cap B_T$| will be disjoint. For our curve |$J_\lambda $| with m squarefree we have by Proposition 13 the bound |$\sigma (J_\lambda)<39$|, whence the number of rational points on |$J_\lambda $| of height at most |$T=100\log m$|, say, does not exceed

\[ (T/t + 1)^r \leqslant \left(c_7\frac{\log m}{\log|\Delta(J_\lambda)|}\right)^{r/2} \]

for an absolute constant |$c_7$|. Therefore, for pairs (A,B) such that |$|\Delta (J_\lambda)|\gg m^\eta $| we get a bound |$(c_8/\eta)^{r+1}$| for the number of integral points on the curve |$C_{A,B}$|. In that case, the condition on |$\log N$| in Theorem 8 becomes superfluous. The condition on |$\Delta (J_\lambda)$| is certainly verified unless |${\mathrm {GCD}}(A^2+B^2,m)>m^{1-\eta }$|. This appears to be a rather onerous condition on the pair (A,B) if η is small, so it is plausible to conclude that if there is a sequence |$(m_\nu)$| such that there are |$(A_\nu ,B_\nu)$| with the property that the number of nontrivial solutions of the system of equations |$x_1+x_2+x_3=A_\nu $|, |$y_1+y_2+y_3=B_\nu $|, |$x_i^2+y_i^2=m_\nu $| (i=1,2,3) becomes unbounded as |$\nu \to \infty $|, then the elliptic curves |$E_{A_\nu ,B_\nu }$| necessarily have unbounded rank. In particular, either the system (1) has only |$O(N^3)$| solutions or there exist elliptic curves over |${\mathbb {Q}}$| of unbounded rank.

We strongly believe that the first statement is true and that the main contribution to the counting of solutions comes exclusively from the trivial solutions. However, in our case the elliptic curves in question belong to a small subset of a one-parameter family over |${\mathbb {Q}}$|⁠, so in our opinion there is no compelling evidence, either way, whether the Mordell–Weil rank of our curves is bounded or not.

11 Random Squarefree Numbers with Large Prime Factors

While the unconditional study of the solutions of the system (1) remains elusive, we can say much more if we study it on average with respect to m.

Let bm=1 if m is a sum of two squares and zero otherwise. The generating Dirichlet series F(s) of the squarefree integers which are sums of two squares is

\begin{equation}\label{eq41} F(s)=\sum_{m=1}^\infty\frac{\mu(m)^2 b_m}{m^s} = \left(1+\frac 1 {2^s}\right) \prod_{p\equiv 1 \bmod 4} \left(1+\frac 1 {p^s}\right). \end{equation}

(41)
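The Euler product in (41) encodes the classical fact that a squarefree integer is a sum of two squares exactly when none of its prime factors is congruent to 3 mod 4. The sketch below tests this equivalence by brute force for small m; it is an illustration only, and all function names are ours.

```python
from math import isqrt

def is_sum_of_two_squares(m):
    """True if m = x^2 + y^2 for some integers x, y (this is b_m = 1)."""
    return any(isqrt(m - x*x)**2 == m - x*x for x in range(isqrt(m) + 1))

def factorize(m):
    """Prime factorization of m >= 1 by trial division, as {prime: exponent}."""
    factors, p = {}, 2
    while p*p <= m:
        while m % p == 0:
            factors[p] = factors.get(p, 0) + 1
            m //= p
        p += 1
    if m > 1:
        factors[m] = factors.get(m, 0) + 1
    return factors

def is_squarefree(m):
    return all(e == 1 for e in factorize(m).values())

def in_euler_product_support(m):
    """True if m is a product of distinct primes, each equal to 2 or congruent to 1 mod 4."""
    return all(e == 1 and (p == 2 or p % 4 == 1)
               for p, e in factorize(m).items())

def dirichlet_coefficient(m):
    """The coefficient mu(m)^2 * b_m of m^(-s) in F(s)."""
    return int(is_squarefree(m) and is_sum_of_two_squares(m))
```

For every m the coefficient `dirichlet_coefficient(m)` should equal 1 exactly when `in_euler_product_support(m)` holds, which is what expanding the product in (41) predicts.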

We easily verify that

\begin{equation}\label{eq42} F(s)^2 = \left(1+\frac 1 {2^s}\right) \prod_{p\equiv 1 \bmod 4} \left(1-\frac 1 {p^{2s}}\right) \frac {\zeta(s)}{\zeta(2s)} L(s,\chi_{-4}). \end{equation}

(42)

By following Landau's well-known method for the asymptotic count of integers that are sums of two squares, we obtain the analogous formula for squarefree integers which are sums of two squares. Let ΩM be the set of squarefree integers up to M which are sums of two squares. Then by formula (42) we have

\[ |\Omega_M| \sim \frac {3}{ 2\pi} \prod_{p\equiv 1 \bmod 4} \left(1-\frac 1{p^2}\right)^{\frac{1}{2}} \frac {M}{\sqrt{\log M}}. \]

Let also ΩM,K be the set

\begin{equation}\label{eq43} \Omega_{M,K} = \{ m\in \Omega_M\,:\, p\,|\,m \implies p >K\}. \end{equation}

(43)

The simple inclusion–exclusion sieve using the generating series (41) shows that for fixed K and |$M\to \infty $| we have uniformly in K an asymptotic formula

\begin{equation}\label{eq44} |\Omega_{M,K}| \sim \frac 23 \prod_{\substack{p\leqslant K\\ p\equiv 1 \bmod 4}} \left(1+\frac 1 p\right)^{-1} |\Omega_M| \sim \frac {\beta_1}{\sqrt{\log K}} |\Omega_M| \end{equation}

(44)

with β1 an absolute positive constant.

 

Let δ>0 be fixed. If |$K>K_0(\delta)$| and |$M\to \infty $|, then for all but |$K^{-1+\delta }|\Omega _{M,K}|$| elements |$m\in \Omega _{M,K}$| the system (1) has only trivial solutions.

 

This result is not limited to 6-correlations and its proof extends quite easily to correlations of any order.

We first need a lemma about the distribution of the prime factors of elements of the set |$\Omega_{M,K}$|, showing that they usually form a rapidly increasing sequence. There are precise results on the number of integers up to x which are sums of two squares and have no prime factors larger than y; see, for example, Moree's essay [17, Chapter 3, Theorem 1, p. 71]. However, the uniformity with respect to y is not sufficient for our purposes, because the important information for us concerns the distribution of small prime factors of squarefree numbers which are sums of two squares. Instead, we follow a direct route to obtain a weaker result valid in all ranges.

For the rest of this section, we denote by Φ(x) a fixed positive function monotonically increasing to |$\infty $| with x. Moreover, we assume that Φ(x) is a slowly growing function satisfying

\begin{equation}\label{eq45} \Phi(2x)\sim \Phi(x). \end{equation}

(45)

 

For |$m\in\Omega_{M,K}$| let |$m=p_1p_2\cdots p_r$| be its factorization, with |$K<p_1<\dots <p_r$|. Then as |$M\to \infty $| we have |$p_s>2^{s\Phi(s)}$| for |$s=1,2,\ldots,r$| and all |$m\in \Omega _{M,K} \setminus \Omega ^{(1)}_{M,K}$|, where the exceptional set |$\Omega ^{(1)}_{M,K}$| has cardinality

\[ |\Omega^{(1)}_{M,K}| \leqslant \eta(K,\Phi) |\Omega_{M,K}| \]

with |$\eta(K,\Phi)\to 0$| as |$K\to \infty $|. If |$\Phi (x)=o(\log x)$|, then we can take |$\eta(K,\Phi)=K^{-1+\delta}$| for every fixed δ>0.

 

Let |$m=p_1p_2\cdots p_r$|, and write |$D_j=(2^{j-1},2^j]$| for a dyadic interval. For |$m\in \Omega ^{(1)}_{M,K}$| let s be the index of the first prime |$p_s$| for which |$p_s \leqslant 2^{s\Phi (s)}$|, if there is such an occurrence. Then we define |$\Omega ^{(1)}_{D_j,K,s}$| to be the set

\begin{equation}\label{eq46} \Omega^{(1)}_{D_j,K,s} = \{p_1p_2\cdots p_s\in\Omega_{M,K}\cap D_j\,:\, p_i >2^{i\Phi(i)}\ \hbox{for } i<s,\ p_s \leqslant 2^{s\Phi(s)}\}. \end{equation}

(46)

It is clear that

\begin{equation}\label{eq47} |\Omega^{(1)}_{M,K}|\leqslant \sum_{s,j}|\Omega^{(1)}_{D_j,K,s}|\cdot |\Omega_{M/2^{j-1},K}|. \end{equation}

(47)

We begin by studying the relative sizes of s, j, and K.

Firstly, we have

\[ 2^j \geqslant p_1 p_2\cdots p_s \geqslant (K+s)!/K! > 2^{s\log(s+K)} \]

hence

\begin{equation}\label{eq48} j >\max\left(s\log(s+K), \frac{\log K}{\log 2}\right). \end{equation}

(48)

Secondly, since elements of |$\Omega ^{(1)}_{D_j,K,s}$| satisfy |$p_i>2^{i\Phi(i)}$| for |$i=1,2,\ldots,s-1$| and |$p_1 p_2\cdots p_s \leqslant 2^j$|, we have

\begin{equation}\label{eq49} \frac{1}{2} s^2 \Phi(s) \sim \sum_{i=1}^{s-1} i \Phi(i)\leqslant j. \end{equation}

(49)

Thirdly, |$2^{j-1}<p_1p_2\cdots p_s \leqslant p_s^s \leqslant 2^{s^2\Phi (s)}$|⁠, hence

\begin{equation}\label{eq50} s^2\Phi(s) >j-1. \end{equation}

(50)

Therefore, the above bounds (49) and (50) show that

\begin{equation}\label{eq51} j-1 < s^2\Phi(s) \lesssim 2 j. \end{equation}

(51)

From this it follows

\begin{equation}\label{eq52} u := \frac{\log(2^{j})}{\log(2^{s\Phi(s)})} = \frac{ j}{s\Phi(s)} \geqslant\left(\frac{1}{2} +o(1)\right) s. \end{equation}

(52)

Let us denote

\[ \Psi(x,y) = \{ n \leqslant x\,:\, p\,|\,n \implies p\leqslant y\} \]

and, as usual, ψ(x,y)=|Ψ(x,y)|. Recall that we have the useful approximation

\begin{equation}\label{eq53} \psi(x,y) = x \exp(-(1+o(1))u\log u)\quad (u = \log x/\log y) \end{equation}

(53)

uniformly for |$y >(\log x)^{1+\varepsilon }$| and |$u\to \infty $|⁠, see the survey by Hildebrand and Tenenbaum [13, Corollary 1.3].
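To get a feeling for the quantity |$\psi(x,y)$|, one can count smooth numbers directly for small x. The sketch below uses a sieve for the largest prime factor; the function name `psi` and the chosen ranges are ours, and values of x this small are far from the asymptotic regime of (53), so the printed comparison is only qualitative.

```python
import math


def psi(x, y):
    """Number of integers 1 <= n <= x all of whose prime factors are <= y."""
    largest = [1] * (x + 1)  # largest[n] will hold the largest prime factor of n
    for p in range(2, x + 1):
        if largest[p] == 1:  # no smaller prime divides p, so p is prime
            for multiple in range(p, x + 1, p):
                largest[multiple] = p  # later (larger) primes overwrite earlier ones
    return sum(1 for n in range(1, x + 1) if largest[n] <= y)


if __name__ == "__main__":
    x = 10**5
    for y in (10, 30, 100):
        u = math.log(x) / math.log(y)
        # Compare the observed density with exp(-u log u), ignoring the o(1) term.
        print(y, round(u, 2), psi(x, y) / x, math.exp(-u * math.log(u)))
```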

By definition, p1p2⋯ps∈Ψ(p1p2⋯ps,ps). Therefore,

\[ |\Omega^{(1)}_{D_j,K,s}| \leqslant \psi(2^j,2^{s\Phi(s)}). \]

Case I: |$s\Phi (s) >\frac {1+\varepsilon }{\log 2} \log j$|. In this case, the condition |$y >(\log x)^{1+\varepsilon }$| holds for |$x=2^j$| and |$y=2^{s\Phi(s)}$|, thus we may apply the estimates (53) and (52), getting

\begin{equation}\label{eq54} |\Omega^{(1)}_{D_j,K,s}| \leqslant \psi(2^j,2^{s\Phi(s)}) \ll 2^j \,{\mathrm{e}}^{-(\frac{1}{2}+o(1)) s \log s}. \end{equation}

(54)

Case II: |$s\Phi (s)\leqslant \frac {1+\varepsilon }{\log 2}\log j$|⁠. In this case

\begin{equation}\label{eq55} |\Omega^{(1)}_{D_j,K,s}| \leqslant \psi(2^j,2^{s\Phi(s)}) \leqslant \psi(2^j,2^{(1+\varepsilon)\log j}) \ll 2^{\varepsilon j} \end{equation}

(55)

as one sees applying again the bound (53) with a new value |$u \sim \frac {(\log 2)j}{(1+\varepsilon)\log j}$|⁠.

The contribution to |$|\Omega ^{(1)}_{M,K}|$| coming from values of s in Case II is majorized, up to an absolute constant factor, by

\begin{align*} \sum_{s,j} \psi(2^j,2^{(1+\varepsilon)\log j}) |\Omega_{M/2^{j-1},K}| &\ll \sum_{ K < 2^j}(\log j) 2^{\varepsilon j} \frac{M/2^j}{\sqrt{\log(M/2^{j-1}+2)}}\\ &\ll \frac {\log K}{K^{1-\varepsilon}}\frac {M}{\sqrt{\log M}}\\ &\ll K^{-1+\delta} |\Omega_{M,K}| \end{align*}

as long as ɛ<δ, the last step because of the bound (44). This shows that we need to deal only with Case I.

Since the range of exponents j for which a given s may occur is

\begin{equation}\label{eq56} j \asymp s^2\Phi(s), \end{equation}

(56)

we have to bound

\[ \sum_s \sum_{j\asymp s^2\Phi(s)} 2^j \,{\rm e}^{-(\frac{1}{2}+o(1)) s\log s} |\Omega_{M/2^{j-1},K}|. \]

The sum for |$s >\log \log M$| is majorized by |$O(M/(\log M)^B)$| for arbitrarily large fixed B, so it can be neglected.

For |$s < \log \log M$|, we have by the bound (44) the estimate

\[ |\Omega_{M/2^{j-1},K}| \ll\frac 1{\sqrt{\log K}}\frac{ M2^{-j}}{\sqrt{\log M}} \]

and using Equation (56) we are left with the task of estimating |$\sum s^2\Phi (s) \exp (- \varepsilon _1 s\log s)$| over the remaining range |$s<\log \log M$|. In order to complete the proof of the lemma, we need to show that as |$K\to \infty $| we have

\[ \sum_s s^2\Phi(s) \,{\rm e}^{-(\frac{1}{2}+o(1)) s\log s}\to 0 \]

where the sum is over the s that may occur in Case I. To do so we note that |$ s^2\Phi (s) \asymp j >s \log K $|, hence |$s \Phi (s) \gg \log K\to \infty $| as |$K\to \infty $|. It follows that |$s\to \infty $| as |$K\to \infty $| because the function Φ does not depend on K. Since the sum over s is convergent, we get what we want. More precisely, if |$\Phi (x)=o(\log x)$| the sum over |$s\Phi (s)>\log K$| will be |$O(K^{-A})$| for any fixed A>0.

 

For each prime p≡1 mod 4 we fix once and for all a factorization |$p=\pi \bar {\pi }$|. Recall that if |$m=p_1p_2\cdots p_r$| with |$2<p_1<p_2<\dots <p_r$| is a sum of two squares then |$m = \xi \bar {\xi }$| with |$\xi =\pi _1^*\cdots \pi _r^*$| and |$\pi ^*_i\in \{\pi _i,\bar {\pi }_i\}$|, giving us the |$2^{r+1+\varepsilon}$| representations of m as a sum of two squares (ɛ=1 or 0 according as m is even or odd). Our goal is to count the total number of nontrivial solutions of the equation

\begin{equation}\label{eq57} \pm\xi_1\pm\xi_2\pm\xi_3\pm\xi_4\pm\xi_5\pm\xi_6 = 0 \end{equation}

(57)

when |$m\in \Omega _{M,K}\setminus \Omega _{M,K}^{(1)}$|⁠, a solution being trivial if the terms ±ξi cancel in pairs.
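For small m the total number of 6-correlations, trivial or not, can be computed directly. The sketch below counts the set |$S_6(m)$| of the introduction by tabulating the sums of triples of lattice points on the circle |$x^2+y^2=m$|; the function names are ours, and no attempt is made here to separate trivial from nontrivial solutions.

```python
from collections import Counter
from itertools import product
from math import isqrt


def lattice_points(m):
    """All (x, y) in Z^2 with x^2 + y^2 = m, that is, the set Lambda_m."""
    pts = []
    for x in range(-isqrt(m), isqrt(m) + 1):
        y2 = m - x * x
        y = isqrt(y2)
        if y * y == y2:
            pts.append((x, y))
            if y != 0:
                pts.append((x, -y))
    return pts


def s6(m):
    """Number of (l1, ..., l6) in Lambda_m^6 with l1 + l2 + l3 = l4 + l5 + l6."""
    pts = lattice_points(m)
    # Count how often each vector occurs as a sum of three points of Lambda_m.
    sums = Counter(
        (a[0] + b[0] + c[0], a[1] + b[1] + c[1])
        for a, b, c in product(pts, repeat=3)
    )
    # A 6-correlation is a pair of triples with the same sum.
    return sum(count * count for count in sums.values())


if __name__ == "__main__":
    for m in (5, 65, 1105):
        N = len(lattice_points(m))
        print(m, N, s6(m), N**4)
```

Since |$\Lambda_m$| is symmetric under negation, counting solutions of |$\lambda_1+\lambda_2+\lambda_3=\lambda_4+\lambda_5+\lambda_6$| is the same as counting solutions of (57) for a fixed choice of signs.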

Define for algebraically independent indeterminates ωi, |$\bar {\omega }_i$| the set

\[ {\mathcal M}_r = \{\omega_1,\bar{\omega}_1\}\times \cdots \times\{\omega_r,\bar{\omega}_r\} \]

as the set of the |$2^r$| monomials |$\omega _1^*\cdots \omega _r^*$| with |$\omega _s^*\in \{\omega _s,\bar {\omega }_s\}$|. Note that the correspondence |$\omega _i^* \to \pi _i^*$| remains one-to-one when extended to |${\mathcal M}_r$|, because of unique factorization in the ring |${\mathbb {Z}}[\sqrt {-1}]$| of Gaussian integers. Thus we begin by studying the algebraic structure of the equation (57) by associating to each |$\xi_i$| (i=1,…,6) the corresponding monomial |$f_i\in \mathcal M_r$|, where now the |$\omega_i$| and |$\bar {\omega }_i$| are independent variables.

We denote by |${\mathcal F}_r$| the set

\[ {\mathcal F}_r := \left\{F=\sum_{i=1}^k \pm f_i \,:\, f_i\in {\mathcal M}_r,\ k\leqslant 6 \right\}\setminus \{0\}; \]

we have removed the polynomial 0 because it corresponds to the trivial solutions of the equation (57). The number of possible polynomials |$F_r$| is |$O(64^r)$|.

We write uniquely

\[ F_r = \omega_r G_{r-1}+\bar{\omega}_r H_{r-1} \]

and choose some |$F_{r-1}\in\{G_{r-1},H_{r-1}\}$| not equal to 0, which is possible because |$F_r\ne 0$|. Now |$F_{r-1}$| consists of not more than 6 monomials, is not 0, does not contain |$\omega_r$| or |$\bar {\omega }_r$|, and we can continue the process. In this way we derive from |$F_r$| a sequence of nonzero polynomials

\[ F_r,F_{r-1},\ldots,F_1 \]

where each |$F_s$| is a sum of not more than 6 monomials in |${\mathcal M}_s$| and hence ranges in a collection of not more than |$O(64^s)$| elements. We view such a sequence as nodes of a tree |${\mathcal T}$| which, starting with a root |$F_r$|, produces from the node |$F_s$| one or two branches with next node |$F_{s-1}\in\{G_{s-1},H_{s-1}\}$| (because of the condition |$F_{s-1}\ne 0$|).
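The descent from |$F_r$| to |$F_1$| is purely combinatorial and can be sketched in a few lines of Python. The encoding is our own choice for illustration: a monomial in |${\mathcal M}_s$| is a tuple of s entries equal to +1 (for |$\omega_i$|) or −1 (for |$\bar{\omega}_i$|), and a polynomial is a dictionary from monomials to nonzero integer coefficients.

```python
def split(F):
    """Write F = omega_s * G + omega_bar_s * H and return the pair (G, H).

    Monomials are tuples of +1 / -1 of length s; the last entry records
    whether omega_s (+1) or its conjugate (-1) occurs in the monomial.
    """
    G, H = {}, {}
    for monomial, coeff in F.items():
        target = G if monomial[-1] == 1 else H
        target[monomial[:-1]] = coeff  # distinct monomials stay distinct
    return G, H


def branch(F):
    """One chain F_r, F_{r-1}, ..., F_1 of nonzero polynomials below F.

    Assumes F is nonzero. At each step G is taken when it is nonzero and
    H otherwise; the tree in the text keeps both when both are nonzero.
    """
    chain = [F]
    while len(next(iter(chain[-1]))) > 1:
        G, H = split(chain[-1])
        chain.append(G if G else H)
    return chain


if __name__ == "__main__":
    # F = omega_1 omega_2 conj(omega_3) - conj(omega_1) omega_2 conj(omega_3)
    F = {(1, 1, -1): 1, (-1, 1, -1): -1}
    for node in branch(F):
        print(node)
```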

For any element |$F\in {\mathcal F}_s$| with |$s\leqslant r$| and |$m=p_1p_2\cdots p_r$| with |$p_1 <p_2 <\dots <p_r$| we write for simplicity |$F_s|_m$| for the value obtained when we specialize |$\omega_i$| and |$\bar {\omega }_i$| to |$\pi_i$| and |$\bar {\pi }_i$|. The number m will determine a nontrivial relation |$\sum \pm \xi _i=0$| if the corresponding element |$F\in {\mathcal F}_r$| vanishes by specialization, that is |$F|_m=0$|.

Fix r and consider the set Zr of elements m of |$\Omega _{M,K}\setminus \Omega _{M,K}^{(1)}$| with r prime factors. We want to bound the number of m∈Zr giving rise to a nontrivial relation. Thus writing

\begin{equation}\label{eq58} Z^{(0)}_r:=\{m\in Z_r\,:\, F|_m=0 \hbox{ for some } F\in{\mathcal F}_r\} \end{equation}

(58)

it is clear that

\begin{equation}\label{eq59} |Z^{(0)}_r| \leqslant \sum_{s\leqslant r} |\{m\in Z_r\,:\, F_s|_m=0,\ F_{s-1}|_m\ne 0 \hbox{ for all } F_{s-1}\in{\mathcal F}_{s-1}\}|. \end{equation}

(59)

Next, |$F_s=\omega _s G_{s-1}+\bar {\omega }_s H_{s-1}$| and both |$G_{s-1}$| and |$H_{s-1}$| belong to |${\mathcal F}_{s-1}$|, so their specializations by m are not 0. Since |$F_s|_m=0$| we have

\[\frac{\pi_s}{\bar{\pi}_s }= - \frac{H_{s-1}|_m}{G_{s-1}|_m},\]

so |$\pi _s/\bar {\pi }_s$| is uniquely determined by |$\{p_1,\dots p_{s-1}\}$| and the branch (Fr,Fr−1,…,Fs).

Let a+ib and x+iy be two Gaussian integers and suppose that

\[ \frac {a+{\rm i}b}{a-{\rm i}b}=\frac{x+{\rm i}y}{x-{\rm i}y}. \]

Then a/b=x/y and it follows that |$(a^2+b^2)/(x^2+y^2)$| is the square of a rational number. Therefore, if |$a^2+b^2$| and |$x^2+y^2$| are both squarefree we must have x+iy=±(a+ib). In particular, the ratio |$\pi _s/\bar {\pi }_s$| determines uniquely the prime |$p_s$|.
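The rigidity statement just proved is easy to test numerically. The following sketch, under the assumption that both norms are squarefree, groups the Gaussian integers in a small box by the exact value of |$(a+{\rm i}b)/(a-{\rm i}b)$| and lets one check that each group has the form {z, −z}. The helper names and the box size are ours.

```python
from collections import defaultdict
from fractions import Fraction


def is_squarefree(n):
    """True if no square d^2 > 1 divides the positive integer n."""
    d = 2
    while d * d <= n:
        if n % (d * d) == 0:
            return False
        d += 1
    return True


def ratio(a, b):
    """(a+ib)/(a-ib) = (a+ib)^2/(a^2+b^2) as exact (real, imaginary) parts."""
    n = a * a + b * b
    return (Fraction(a * a - b * b, n), Fraction(2 * a * b, n))


def groups_by_ratio(bound):
    """Group all a+ib with |a|, |b| <= bound and squarefree norm by ratio."""
    groups = defaultdict(set)
    for a in range(-bound, bound + 1):
        for b in range(-bound, bound + 1):
            n = a * a + b * b
            if n > 0 and is_squarefree(n):
                groups[ratio(a, b)].add((a, b))
    return groups


if __name__ == "__main__":
    sizes = {len(g) for g in groups_by_ratio(12).values()}
    print(sizes)  # the set of group sizes that occur
```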

Now it is easy to get a bound for |$\sum _r|Z^{(0)}_r|$|. Fix s. At stage s, given |$(p_1,\ldots,p_{s-1})$| and |$F_s$|, the prime |$p_s$| is determined and there are at most |$O(64^s)$| possibilities for |$F_s$|. Hence the total number of values of m excluded in this way is, using |$p_s>2^{s\Phi(s)}$|, at most of order

\begin{align}\label{eq60} &64^{s}\sum_{(p_1,\ldots,p_{s-1})}\left[1+\sum_{r>s}\left| (p_{s+1},\ldots,p_r)\,:\, p_{s+1}\cdots p_r <\frac M{2^{s\Phi(s)}p_1\cdots p_{s-1}}\right|\right] \notag\\ &\quad\leqslant 64^{s}\left|(p_1,\dots ,p_{s-1}) \,:\, p_1\cdots p_{s-1} < \frac M{2^{s\Phi(s)}}\right| \notag \\ &\qquad +64^{s} \sum_{r>s}\left|(p_1,\dots ,p_{s-1},p_{s+1},\ldots,p_r)\,:\, p_1\cdots p_{s-1}p_{s+1}\cdots p_r < \frac M{2^{s\Phi(s)}}\right|\notag\\ &\quad\leqslant 64^{s} \left| m < \frac M{2^{s\Phi(s)}}\,:\, m =\square+\square\right|. \end{align}

(60)

The same estimate holds replacing |$2^{s\Phi(s)}$| by K, because in any case |$K<p_1\leqslant p_s$|.

It remains to sum this bound for all values of s and show that the total is negligible compared with |ΩM,K|, provided |$K\to \infty $| sufficiently slowly. As in the proof of the lemma, we split the range of s into small values, large values, and intermediate values, and more precisely the ranges |$s\sqrt {\Phi (s)}<\log K$|⁠, |$s >\log \log M$|⁠, and the complementary range.

The contribution from large values is |$O(M/(\log M)^A)$|⁠, as in the proof of the lemma.

For intermediate values, we use the bound |$O(M/\sqrt {\log M})$| for the number of m<M which are sum of two squares, getting a total of

\[ O\left(\frac M{\sqrt{\log M}}\sum_{\sqrt{\Phi(s)}\log K < s\Phi(s)} 64^s 2^{-s\Phi(s)}\right). \]

Since Φ(s) is independent of K and increasing to |$\infty $| as |$s\to \infty $|, the sum is |$O(K^{-A})$| for arbitrarily large A, so the total is negligible compared with |$|\Omega_{M,K}|$| when |$K\to \infty $|.

For small values, we use the estimate (60) with K in place of |$2^{s\Phi(s)}$|. Then the bound is

\[ O\left(\frac M{K \sqrt{\log M}} \sum_{s\sqrt{\Phi(s)}<\log K} 64^s\right), \]

and the sum is |$O(K^{\delta})$| for any fixed δ>0, so again the total is negligible compared with |$|\Omega_{M,K}|$| when |$K\to \infty $|.

Now we are ready to prove the following theorem.

 

As |$M\to \infty $|, for almost all elements |$m\in\Omega_M$| the system (1) has only |$O(8^{\omega_{4,1}(m)})$| solutions. More precisely, the number of trivial solutions is of order |$8^{\omega_{4,1}(m)}$|, the number of nontrivial degenerate solutions is |$O(4^{\omega_{4,1}(m)})$|, and there are at most |$O(2^{\omega_{4,1}(m)})$| nondegenerate solutions.

 

Let |${\mathcal P}_K = \prod _{p\leqslant K} p$|⁠. For m∈ΩM we write uniquely m=dh with |$d\in {\mathcal P}_K$| and h∈ΩM/d,K. A decomposition |$\xi \bar {\xi } = m$| of m as a sum of two squares can be written as ξ=αβ with |$\alpha \bar {\alpha } = d$| and |$\beta \bar {\beta } = h$|⁠. This decomposition is unique up to multiplication of α and β by ±1.

This being said, we proceed as in the proof of Theorem 14. We want to count the total number of solutions of the equation

\begin{equation}\label{eq61} \pm\alpha_1\beta_1 \pm\alpha_2\beta_2 \pm\alpha_3\beta_3 \pm\alpha_4\beta_4\pm\alpha_5\beta_5 \pm\alpha_6\beta_6 =0. \end{equation}

(61)

We first fix |$(\alpha_1,\ldots,\alpha_6)$|. The number of possible sextuples is |$O(64^{\omega_{4,1}(d)})$|.

If |$h=p_1p_2\cdots p_r$| we write |$h=\beta \bar {\beta }$| with |$\beta =\pi _1^*\cdots \pi _r^*$| and |$\pi _i^*\in \{\pi _i,\bar {\pi }_i\}$|. Now the proof proceeds step by step as before, except that each element of |$\mathcal F_r$| is now a linear combination of the polynomials |$f_i$| whose coefficients are certain sums of the |$\pm\alpha_i$| (i=1,…,6).

Note that here we do not exclude the possibility of obtaining the null polynomial, but simply observe that in this case the polynomial

\[ F= \pm\alpha_1f_1 \pm\alpha_2f_2 \pm\alpha_3f_3 \pm\alpha_4f_4\pm\alpha_5f_5 \pm\alpha_6f_6 \]

vanishes identically only if |$\sum \pm \alpha _i$| decomposes into disjoint sums |$\sum _{i\in I}\pm \alpha _i =0$| and |$f_i$| is fixed for i∈I. Since the cardinality of I is at least 2, a null polynomial either corresponds to a degenerate solution or |I|=6. In the latter case, all polynomials |$f_i$| are equal, giving the bound |$O(2^{\omega_{4,1}(h)})$| for the number of nondegenerate solutions. The number of trivial solutions is of order |$8^{\omega_{4,1}(h)}$| and the number of degenerate nontrivial solutions is of order at most |$4^{\omega_{4,1}(h)}$|.

Therefore, by Theorem 14 the total number of nondegenerate solutions is at most of order

\begin{align*} \sum_{d\in\mathcal P_K} 64^{\omega_{4,1}(d)} |\Omega_{M/d,K}|& \ll \sum_{d\in\mathcal P_K} 64^{\omega_{4,1}(d)}(K^{-1+\delta} +o(1))|\Omega_{M/d}|\\ & \ll \left(\sum_{d\in\mathcal P_K}\frac{ 64^{\omega_{4,1}(d)}}d\right) (K^{-1+\delta}+o(1)) \frac M{\sqrt{\log M}} \\ & \ll \left(\frac{(\log K)^{33}}{K^{1-\delta}} + o(1) \right) \frac M{\sqrt{\log M}}. \end{align*}

Since K can be taken arbitrarily large, this completes the proof.

12 Random Squarefree Numbers with Many Small Prime Factors

The results obtained in the preceding section show that for almost all m the number of solutions of Equation (1) is as predicted. On the other hand, this does not preclude the existence of a small set of squarefree numbers m for which the number of solutions is much larger than expected. The analysis there also showed that small prime factors are usually responsible for nondegenerate solutions of the system (1), so it is of interest to examine the case of random squarefree numbers with many small prime factors.

Consider the quantity

\[ V_{\mathcal M}:=\sum_{m\in\mathcal M} \left|\left\{(\xi_1,\ldots,\xi_6)\,:\, \xi_i\in{\mathbb{Z}}[\sqrt{-1}], \ \xi_i\bar{\xi}_i=m,\ \sum_{i=1}^6\xi_i=0\right\}\right| \]

where |$\sum '$| means that m is squarefree and |$\mathcal M$| is a set of numbers of size about M with a large number of prime factors, say

\begin{equation}\label{eq62} \mathcal M =\{ m \in[M,2M]\,:\, m=\square+\square,\ \omega_{4,1}(m) \sim r\} \end{equation}

(62)

where r is large, hence |$ r \asymp \log m/\log \log m$|⁠.

In Sections 4 and 11 we have obtained some conditional results depending on the size of the rank of the elliptic curves associated to m. The curves in question are the curves |$J_\lambda$| with |$\lambda=1-(A^2+B^2)/m$|, where A and B are determined by |$\xi_1+\xi_2+\xi_3=A+{\rm i}B$| and |$\xi _i\bar {\xi }_i=m$|. For fixed m and varying A, B subject to the above constraint, the parameter λ varies in too thin a set to use directly the analytic methods introduced by Brumer, Heath-Brown, and several other authors, to study the average value of the rank of the curves in the family.

All these analytic methods rely on the assumption of a Riemann hypothesis for the L-functions associated to the relevant elliptic curves, as well as the validity of the Birch and Swinnerton-Dyer conjecture about the equality between the arithmetic rank and the analytic rank of an elliptic curve, and also on the datum of a precise geometric structure of the parametrizing space of the family. So, it may be of some interest to show here how varying A, B, and also m, keeping the constraints on A and B imposed by m, yields a set of elliptic curves for which one does not have a family with a precise geometric structure but for which one can combine the previous techniques with other methods derived from probability theory, again arriving in the end at nontrivial, albeit still conditional, conclusions.

The curves of interest to us are given in Theorem 11 and its Corollary 12, which we restate for the reader's convenience.

Theorem 11. Let λ=a/c with (a,c)=1. Then a Weierstrass equation of the fiber Jλ of the Beauville curve is

\begin{equation}\label{eq63} y^2 + (2 c - a) x y - c^2 a y = x^3 + 2 c a x^2 + 2 c^3 a x, \end{equation}

(63)

with discriminant

\[ \Delta_\lambda = (- a)^3 (c - a)^2(- 8 c - a) c^6. \]

The point (0,0) is a torsion point of order 3 and the point |$(-c^2,c^3)$| is a torsion point of order 2.

This Weierstrass equation is a minimal model over |${\mathbb {Z}}$|⁠, unless c≡1 mod 2 and a≡0 mod 8. In this special case, a minimal model is

\begin{equation}\label{eq64} y^2 +\tfrac{1}{2} (2 c - a) x y - \tfrac 18 c^2 a y = x^3 + \tfrac{1}{2} c a x^2 + \tfrac 18 c^3 a x, \end{equation}

(64)

with discriminant

\[ \Delta_\lambda = 2^{-12} (- a)^3 (c - a)^2(- 8 c - a) c^6. \]

The point (0,0) is a torsion point of order 3 and the point |$(-c^2/4,c^3/8)$| is a torsion point of order 2. |$\square $|
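The algebraic assertions of Theorem 11 about the model (63) can be checked mechanically for any particular pair (a,c). The sketch below uses the standard formulas for the discriminant and the group law of a general Weierstrass equation with exact rational arithmetic; the function names are ours, and minimality of the model is not tested.

```python
from fractions import Fraction


def a_invariants(a, c):
    """(a1, a2, a3, a4, a6) of the Weierstrass equation (63)."""
    return (2 * c - a, 2 * c * a, -c**2 * a, 2 * c**3 * a, 0)


def on_curve(P, inv):
    a1, a2, a3, a4, a6 = inv
    x, y = P
    return y * y + a1 * x * y + a3 * y == x**3 + a2 * x * x + a4 * x + a6


def discriminant(inv):
    """Standard discriminant in terms of b2, b4, b6, b8."""
    a1, a2, a3, a4, a6 = inv
    b2 = a1 * a1 + 4 * a2
    b4 = 2 * a4 + a1 * a3
    b6 = a3 * a3 + 4 * a6
    b8 = a1 * a1 * a6 + 4 * a2 * a6 - a1 * a3 * a4 + a2 * a3 * a3 - a4 * a4
    return -b2 * b2 * b8 - 8 * b4**3 - 27 * b6 * b6 + 9 * b2 * b4 * b6


def negate(P, inv):
    """Inverse of P for the group law."""
    a1, _, a3, _, _ = inv
    x, y = P
    return (x, -y - a1 * x - a3)


def double(P, inv):
    """2P by the tangent formula; assumes P is not a point of order 2."""
    a1, a2, a3, a4, _ = inv
    x, y = P
    slope = Fraction(3 * x * x + 2 * a2 * x + a4 - a1 * y, 2 * y + a1 * x + a3)
    x3 = slope * slope + a1 * slope - a2 - 2 * x
    y3 = -(slope + a1) * x3 - (y - slope * x) - a3
    return (x3, y3)


if __name__ == "__main__":
    a, c = 3, 1
    inv = a_invariants(a, c)
    T = (-c**2, c**3)
    print(on_curve((0, 0), inv), on_curve(T, inv))
    print(discriminant(inv) == (-a)**3 * (c - a)**2 * (-8 * c - a) * c**6)
    print(double((0, 0), inv) == negate((0, 0), inv))  # 2P = -P, so 3P = O
    print(negate(T, inv) == T)                         # T = -T, so 2T = O
```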

Our family has parameters

\begin{equation}\label{eq65} \begin{split} &\lambda = 1 -\frac{A^2+B^2}{m},\quad \xi_1+\xi_2+\xi_3=A+{\mathrm{i}}B,\\ &\xi_i\in{\mathbb{Z}}[\sqrt{-1}],\quad \xi_i\bar{\xi}_i=m \enspace (i=1,2,3), \end{split} \end{equation}

(65)

and |$m\in \mathcal M$|⁠, where |$\mathcal M$| is given by Equation (62).

13 Heath-Brown's Method: Preliminaries

From the discussion at the end of Section 11 we know that the number N(m;A,B) of solutions of the system of Equations (2) and (3) is bounded by

\begin{equation}\label{eq66} N(m;A,B) \leqslant \left(c_8 \frac{\log m}{\log|\Delta(J_\lambda)|}\right)^{r(\lambda)/2} +c_9, \end{equation}

(66)

where r(λ) is the rank of the elliptic curve Jλ; the constant c9 takes care of the case of rank 0 and we exclude the special fibers for λ=0, 1, −8, |$\infty $|⁠.

Thus the problem of the number of solutions of the system (1) when m varies can be reduced to the study of the average distribution of the rank r(λ). The study of the distribution of the rank of a family of elliptic curves was pioneered by Goldfeld [10], by controlling the rank using the L-function associated to an elliptic curve. This assumed several very deep unproved hypotheses about the L-function, namely the modularity conjecture, the Birch and Swinnerton-Dyer conjecture, and a Riemann hypothesis. Since then, only the modularity conjecture has been proved, the Birch and Swinnerton-Dyer conjecture has been proved only for ranks 0 and 1, and the Riemann hypothesis remains totally inaccessible. Improvements of Goldfeld's method were made by Brumer [5] and then by Heath-Brown [12], who introduced the idea of studying high order moments of the rank. We also refer the reader to the paper [16] of Miller and Wong for the most recent developments in this direction.

The families studied in these works are smooth families, typically the family of quadratic twists of an elliptic curve or the full Weierstrass family. However, in our case the family in question is determined by the parameter |$\lambda=1-(A^2+B^2)/m$| and we know very little about the parameters A and B and even less about the GCD of |$A^2+B^2$| and m. This creates serious new difficulties and we have to appeal to probability methods to overcome this obstacle. Although our final result still depends on the Birch and Swinnerton-Dyer conjecture and on a Riemann hypothesis for L-functions of an elliptic curve, we hope that the introduction of probability methods in this context is sufficiently interesting in its own right and may find further applications in the study of other problems.

Let E be an elliptic curve over |${\mathbb {Q}}$| and let |$\tilde E$| be a minimal model of E. The L-function LE(s) attached to E is given by a certain Dirichlet series

\[ L_{\tilde E}(s) = \sum_{n=1}^\infty \frac {a_n(\tilde E)}{n^s}\quad (\Re(s) >2). \]

The function |$L_{\tilde E}(s)$| admits an Euler product. Its Euler factors at primes of good reduction do not depend on the model. When E is a minimal model over |${\mathbb {Z}}$|⁠, then

\[ L_E(s) = \prod_{p|N_E} \left(1-\frac {a_p(E)}{p^s}\right)^{-1} \prod_{p\not{\hskip 2pt|}N_E} \left(1-\frac {\alpha_1(p)}{p^s}\right)^{-1} \left(1-\frac {\alpha_2(p)}{p^s}\right)^{-1}, \]

where NE, the conductor of E, is an integer dividing the discriminant of E and whose prime divisors are precisely those at which E has bad reduction. The coefficients ap(E) and αi(p) are determined as follows.

For the primes p ∤ NE where E has good reduction we have

\begin{equation} |\alpha_1(p)| = |\alpha_2(p)| = \sqrt p \quad (\hbox{the Riemann hypothesis over }{\mathbb{F}}_p),\label{eq67} \end{equation}

(67)

\begin{equation} a_p(E)=\alpha_1(p)+\alpha_2(p)=1+p-|E({\mathbb{F}}_p)| \hbox{(the trace)},\label{eq68} \end{equation}

(68)

and for the primes |$p\,|\,N_E$| we have |$a_p(E)=1$|, −1, or 0 according as E has split multiplicative reduction, nonsplit multiplicative reduction, or additive reduction.

The curves we are interested in are the curves E=Jλ with extended Weierstrass form given by the equation

\[ Y^2 - (\lambda -2) X Y - \lambda Y = X^3 + 2 \lambda X^2 + 2 \lambda X.\tag{28} \]

The minimal model for this curve is fully described in Theorem 11; it is a minimal model for λ=a/c at all primes p>2 with p ∤ c. If p=2 or p|c, we have bad reduction of multiplicative type.

This is brought to standard Weierstrass short form |$Y_1^2=X_1^3 + r X_1 + s$| by a transformation

\begin{align*} X&=X_1+w, \\ Y &= Y_1+u X_1+v, \end{align*}

giving the linear system

\begin{align*} 2u &= \lambda -2, \\ 3w &= u^2-(\lambda-2) u - 2\lambda, \\ 2v &= (\lambda-2)w + \lambda, \end{align*}

from which it follows

\begin{equation}\label{eq69} \begin{split} r &=P_4(\lambda):=-2^{-4}3^{-1}(\lambda^4+8\lambda^3-16\lambda+16),\\ s&=P_6(\lambda):=2^{-5}3^{-3} (\lambda^6+12\lambda^5+24\lambda^4-56\lambda^3+ 24\lambda^2-96\lambda+64). \end{split} \end{equation}

(69)

The discriminant of this curve is |$4r^3+27s^2=-2^{-4}\lambda^3(\lambda-1)^2(\lambda+8)$|. For λ=a/c, the Weierstrass curve |$y^2=x^3+P_4(\lambda)x+P_6(\lambda)$| has good reduction at a prime p≠2,3 unless |$p\,|\,a(a-c)(a+8c)c$|. Again, the Weierstrass short form is a minimal model at all primes p ∤ 2⋅3⋅c.
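The passage from the long Weierstrass form (28) to the short form can be double-checked with exact rational arithmetic. The sketch below recomputes the standard invariants |$c_4$|, |$c_6$| of (28), for which the short form is |$Y_1^2=X_1^3-(c_4/48)X_1-c_6/864$|, and compares them with |$P_4$| and |$P_6$| of (69); it also evaluates the constant factor relating |$4r^3+27s^2$| to |$\lambda^3(\lambda-1)^2(\lambda+8)$|. The function names are ours.

```python
from fractions import Fraction


def P4(lam):
    return -Fraction(1, 48) * (lam**4 + 8 * lam**3 - 16 * lam + 16)


def P6(lam):
    return Fraction(1, 864) * (lam**6 + 12 * lam**5 + 24 * lam**4
                               - 56 * lam**3 + 24 * lam**2 - 96 * lam + 64)


def c_invariants(lam):
    """(c4, c6) of Y^2 - (lam-2) X Y - lam Y = X^3 + 2 lam X^2 + 2 lam X."""
    a1, a2, a3, a4, a6 = -(lam - 2), 2 * lam, -lam, 2 * lam, 0
    b2 = a1 * a1 + 4 * a2
    b4 = 2 * a4 + a1 * a3
    b6 = a3 * a3 + 4 * a6
    return b2 * b2 - 24 * b4, -b2**3 + 36 * b2 * b4 - 216 * b6


def discriminant_ratio(lam):
    """(4 r^3 + 27 s^2) / (lam^3 (lam-1)^2 (lam+8)) with r = P4, s = P6.

    Assumes lam is not one of the special values 0, 1, -8.
    """
    r, s = P4(lam), P6(lam)
    return (4 * r**3 + 27 * s**2) / (lam**3 * (lam - 1)**2 * (lam + 8))


if __name__ == "__main__":
    for lam in (Fraction(2), Fraction(3, 7), Fraction(-5, 2)):
        c4, c6 = c_invariants(lam)
        print(lam, P4(lam) == -c4 / 48, P6(lam) == -c6 / 864,
              discriminant_ratio(lam))
```

The ratio printed in the last column should not depend on λ, which gives a quick consistency check on the normalization of the discriminant.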

At a place of good reduction we have

\begin{equation}\label{eq70} a_p(J_\lambda) = - \sum_{x\bmod p}\left(\frac{x^3+P_4(\lambda)x+P_6(\lambda)}{p}\right), \end{equation}

(70)

where |$(\frac \cdot p)$| is the Legendre symbol. If instead p ∤ 2⋅3⋅c, then λ is invertible mod p and the sum above is still well defined even if p is a place of bad reduction. In this case, the condition of having bad reduction is equivalent to

\[ \lambda\equiv 0,1,-8 \bmod p. \]

If p|c one needs to make the change of variables |$Y\to Y/p^3$|, |$X\to X/p^2$| and clear denominators by multiplying by |$p^6$|. Accordingly, at these primes we have multiplicative reduction as in the following table, with λ=a/c:

\begin{equation} a\equiv 0 \bmod p: \quad a_p(J_\lambda)=1, \label{eq71}\end{equation}

(71)

\begin{equation}a\equiv c \bmod p: \quad a_p(J_\lambda)=\left(\frac{-3}p\right),\label{eq72} \end{equation}

(72)

\begin{equation}a\equiv -8c \bmod p: \quad a_p(J_\lambda)=\left(\frac{-3}p\right)\label{eq73}, \end{equation}

(73)

\begin{equation} c\equiv 0 \bmod p: \quad a_p(J_\lambda)=1.\label{eq74} \end{equation}

(74)

The formula

\begin{equation}\label{eq75} a_p(J_\lambda) = - \sum_{x\bmod p}\left(\frac{x^3+c^4P_4(a/c)x+c^6P_6(a/c)}{p}\right) \end{equation}

(75)

holds for all primes |$p\geqslant 5$|⁠. An equivalent formula is

\begin{equation}\label{eq76} a_p(J_\lambda) = - \tau_p^{-1}\sum_{x,y\bmod p}\left(\frac{y}{p}\right) e_p(yf(x;a,c)), \end{equation}

(76)

where |$\tau_p$| is the Gauss sum, hence |$\tau _p=\sqrt p$| if p≡1 mod 4 and |$\tau _p= \hbox {i}\sqrt p$| if p≡3 mod 4, where as usual |$e_p(x)={\mathrm{e}}^{2\pi {\mathrm{i}}x/p}$|, and where we have abbreviated

\begin{equation}\label{eq77} f(x;a,c)=x^3+c^4P_4(a/c)x+c^6P_6(a/c). \end{equation}

(77)
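For small primes the character sum (75) is easy to evaluate, and at primes of good reduction it can be compared with a direct point count on the model (63). The following sketch does both; the function names are ours, the brute-force count is only meant for small p, and the modular inverse via `pow(n, -1, p)` assumes Python 3.8 or later.

```python
def legendre(n, p):
    """Legendre symbol (n/p) for an odd prime p, via Euler's criterion."""
    n %= p
    if n == 0:
        return 0
    return 1 if pow(n, (p - 1) // 2, p) == 1 else -1


def f_coeffs(a, c, p):
    """Coefficients c^4 P4(a/c) and c^6 P6(a/c) of f(x; a, c) in (77), mod p."""
    inv48, inv864 = pow(48, -1, p), pow(864, -1, p)  # invertible since p >= 5
    r = -inv48 * (a**4 + 8 * a**3 * c - 16 * a * c**3 + 16 * c**4)
    s = inv864 * (a**6 + 12 * a**5 * c + 24 * a**4 * c**2 - 56 * a**3 * c**3
                  + 24 * a**2 * c**4 - 96 * a * c**5 + 64 * c**6)
    return r % p, s % p


def a_p(a, c, p):
    """a_p(J_lambda) for lambda = a/c from the character sum (75), p >= 5."""
    r, s = f_coeffs(a, c, p)
    return -sum(legendre(x**3 + r * x + s, p) for x in range(p))


def a_p_by_point_count(a, c, p):
    """p + 1 - #E(F_p) for the model (63), including the point at infinity."""
    a1, a2, a3, a4 = 2 * c - a, 2 * c * a, -c**2 * a, 2 * c**3 * a
    affine = sum(
        1 for x in range(p) for y in range(p)
        if (y * y + a1 * x * y + a3 * y - x**3 - a2 * x * x - a4 * x) % p == 0
    )
    return p + 1 - (affine + 1)


if __name__ == "__main__":
    a, c = 3, 1  # lambda = 3; the odd bad primes are 3 and 11
    for p in (5, 7, 13, 17, 19):
        print(p, a_p(a, c, p), a_p_by_point_count(a, c, p))
```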

14 The Analytic Rank

Here we assume the Birch and Swinnerton-Dyer conjecture and the Riemann hypothesis for the L-function of the elliptic curves Jλ considered in the preceding section.

We recall that the analytic rank ran(E) is the order of zero at s=1 of the L-function LE(s) of the elliptic curve E. The Birch and Swinnerton-Dyer conjecture then asserts that the analytic rank coincides with the rank of the Mordell-Weil group |$E({\mathbb {Q}})$|⁠. Therefore, the conjecture gives r(λ)=ran(Jλ).

The next step consists in obtaining an explicit bound for the analytic rank by appealing to the explicit formula in the theory of L-functions and to a Riemann hypothesis. The final, and most delicate step, consists in showing that assuming the above conjectures the analytic rank is small on average on the set of integers m.

We denote by h(t) the triangle function

\[ h(t)= \max(1-|t|,0), \]

and write |$h_X(t) = h(t/\log X)$| where X>1 is a large parameter. Then from the estimates of Brumer [5, Section 2] and Heath-Brown [12, p. 596] one has, on the basic hypotheses mentioned above, a uniform estimate

\begin{equation}\label{eq78} r(E) =r_{\mathrm{an}}(E)\leqslant \frac{\log N_E}{\log X} + \frac 2{\log X} (U_1(E,X)+U_2(E,X)) + O\left(\frac 1{\log X}\right), \end{equation}

(78)

where

\[ U_k(E,X) = -\sum_{p^k\leqslant X, p\geqslant 5} \frac{a_{p^k}(E)}{k p^k} \log(p^k)h_X(k\log p). \]

We have already determined |$a_p(E)$|. In our case |$E=J_\lambda$| we have |$a_{p^2}(J_\lambda)=1$| if |$p\,|\,N_E$| and |$a_{p^2}(E)=\alpha_1(p)^2+\alpha_2(p)^2=a_p(E)^2-2p$| otherwise.

Our goal is only an estimate (on average, but with a very small number of exceptions)

\[ r(E) = o(\log N_E/\log\log N_E), \]

and it turns out that the U2(E,X) term is unimportant. Indeed, since

\[ |a_{p^k}(E)/p^k| \leqslant 2 p^{-k/2}, \]

we have

\[ U_2(E,X) \leqslant 2 \sum_ {p^2\leqslant X} \frac {\log p}p = \log X + O(1), \]

so the contribution of the U2 term to inequality (78) is just O(1). Therefore,

\begin{equation}\label{eq79} r(E) \leqslant \frac{\log N_E}{\log X} + \frac 2{\log X}U_1(E,X) + O(1). \end{equation}

(79)
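The trivial bound for the |$U_2$| term rests on Mertens' estimate |$\sum_{p\leqslant y}(\log p)/p=\log y+O(1)$|. As a numerical illustration only, the sketch below evaluates |$2\sum_{p^2\leqslant X}(\log p)/p$| and compares it with log X; the function names are ours, and the sum is taken over all primes rather than only |$p\geqslant 5$|.

```python
import math


def primes_up_to(n):
    """All primes <= n (n >= 1) by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            # Strike out the multiples of p starting from p^2.
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]


def u2_trivial_bound(X):
    """2 * sum over primes p with p^2 <= X of log(p)/p."""
    return 2 * sum(math.log(p) / p for p in primes_up_to(math.isqrt(X)))


if __name__ == "__main__":
    for X in (10**4, 10**6, 10**8):
        print(X, u2_trivial_bound(X), math.log(X))
```

The difference between the two printed columns should stay bounded as X grows, which is all that is used in passing from (78) to (79).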

The trivial estimate for U1(E,X) obtained in the same way is only

\[ U_1(E,X) \leqslant 2\sum_{p\leqslant X} \frac {\log p}{\sqrt p} h_X(\log p) \ll \sqrt X, \]

which gives, by taking for example |$X=(\log N_E)^2/(\log \log N_E)$|⁠, the known bound

\begin{equation}\label{eq80} r(E) \leqslant\left(\frac12+o(1)\right) \frac{\log N_E}{\log\log N_E}, \end{equation}

(80)

which just fails to reach our goal of a bound |$o(\log m/\log \log m)$|⁠.

15 The Distribution μ of Numbers m

Fix M and K and assume

\begin{equation}\label{eq81} K > (\log M)^A, \quad A >2, \quad \log K < c \log\log M. \end{equation}

(81)

Denote

\begin{equation}\label{eq82} G_K := \left\{ \pi = a+\sqrt{-1}b\,:\, \pi \hbox{ a Gaussian prime and }\frac K 2 \leqslant|\pi|^2 < K \right\}, \end{equation}

(82)

and write for simplicity

\begin{equation}\label{eq83} r := \frac{\log M}{\log K} \asymp \frac {\log M}{\log\log M}. \end{equation}

(83)

Denote by μ the normalized counting measure on |$G_K^r$| and consider the map

\begin{equation}\label{eq84} \psi: G_K^r \rightarrow {\mathbb{Z}}_+ , \quad (\pi_1,\ldots,\pi_r)\mapsto |\pi_1|^2\cdots |\pi_r|^2. \end{equation}

(84)

Thus the image (probability) distribution ψ[μ] is supported by smooth numbers in [1,M]. Note that, in contrast to what was done in Section 12, the sampling distribution on m given by ψ[μ] is not the uniform distribution on |$[1,M]\cap {\mathbb {Z}}$| but rather an example of a distribution on smooth numbers; there are variants that may be treated in a similar way. In the present situation, the number N of points |$\{(x,y)\in {\mathbb {Z}}^2 : x^2+y^2=m\}$| is essentially maximal, that is

\[ \log N \asymp \frac{\log M}{\log\log M}. \]

In fact, from the choice of K, with large probability, all π1,…,πr will be distinct and outside of an exceptional set of μ-measure less than |$(\log M)^{-o(r)}$| we will have at most o(r) repetitions. It follows that

\begin{equation}\label{eq85} \log N = (\log 2+o(1))r. \end{equation}

(85)

Fix |$\pi_1,\ldots,\pi_r$| and |$m=|\pi_1|^2\cdots|\pi_r|^2$|. Let |$S_1,S_2,S_3\subset\{1,\ldots,r\}$| and define

\begin{equation}\label{eq86} P_i = \prod_{j\in S_i}\pi_j \times \prod_{j\notin S_i} \bar{\pi}_j \in {\mathbb{Z}}[\sqrt{-1}]. \end{equation}

(86)

Now set |$A+\sqrt {-1}B = P_1+P_2+P_3$|⁠, leading to the genus-1 curve CA,B defined in Section 4. Our aim is to evaluate (on μ-average) the quantity

\begin{align}\label{eq87} Q(m) &:= |\{(P_1,\ldots,P_6)\in({\mathbb{Z}}[\sqrt{-1}])^6 \,:\, |P_i|^2=m \hbox{ and } P_1+P_2+P_3=P_4+P_5+P_6\}|\notag \\ &\leqslant \sum_{P_1,P_2,P_3} \min(|C_{A,B}\cap{\mathbb{Z}}^2|,2^{(1+\varepsilon)r}), \end{align}

(87)

for any fixed ɛ>0 and |$M\to \infty $|⁠.

Thus assuming |$C_{A,B}\cap {\mathbb {Z}}^2\ne \emptyset $| we may bound the quantity above according to Equation (66) and the fact that |$|\Delta (J_\lambda)| \geqslant (m/{\mathrm {GCD}}(m,A^2+B^2))^6$| (see end of Section 9), getting

\begin{equation}\label{eq88} |C_{A,B}\cap{\mathbb{Z}}^2| \leqslant \left(c_{10}\frac{\log M}{\log(m/{\rm GCD}(m,A^2+B^2))}\right)^{\frac{1}{2} r(C_{A,B})} + c_9. \end{equation}

(88)

In fact, we have r(CA,B)=r(λ).

 

Assume the Birch and Swinnerton-Dyer conjecture and the Riemann hypothesis for the L-function of the elliptic curves |$J_\lambda$|. Then the contribution to Q(m) due to a pair (A,B) for which |${\mathrm {GCD}}(A^2+B^2,m) \geqslant m^{1-\varepsilon }$| does not exceed |$N^{c_{12}A\varepsilon \log \frac 1{\varepsilon }}$|, where N is the number of solutions of |$m=x^2+y^2$| with |$(x,y)\in {\mathbb {Z}}^2$|. (In this bound, |$A= r^{-1}\log M/\log \log M$|.)

 

Let Nλ be the conductor of Jλ. We may assume Nλ>2. By the bounds (66) and (80) we have

\[ |C_{A,B}\cap{\mathbb{Z}}^2| \leqslant \left(\frac{c_{11}\log m}{\log N_\lambda}\right)^{(\frac 14+o(1)) \log N_\lambda /\log\log N_\lambda}, \]

and the result follows because

\[ N_\lambda\ll c^{12}\ll m^{12\varepsilon},\quad N = 2^{r+o(r)}, \quad A = r^{-1}\frac{\log M}{\log\log M}. \]

 

Let ɛ>0 be given. Then as |$M\to \infty $| we have

\begin{equation}\label{eq89} Q(m) \leqslant \sum_{P_1,P_2,P_3} \min\left(\left(\frac {c_{13}}{\varepsilon}\right)^{\frac{1}{2} r(\lambda)},2^{(1+\varepsilon)r}\right)+ O(N^{3+c_{12}A\varepsilon\log\frac 1{\varepsilon}}). \end{equation}

(89)

In order to continue the analysis of Q(m) we make further restrictions on P1,P2,P3 by requiring S1,S2,S3 to satisfy

\begin{equation}\label{eq90} \max\left(\max_{i,j,k\ {\mathrm{distinct}}}|S_i\setminus(S_j\cup S_k)|,| S_1\cap S_2\cap S_3| \right) \geqslant \varepsilon r \end{equation}

(90)

and

\begin{equation}\label{eq91} \min_{i\ne j} |(S_i\setminus S_j)\cup(S_j\setminus S_i)| \geqslant \varepsilon r. \end{equation}

(91)

These restrictions do not affect the overall counting because

\[ |\{(S_1,S_2,S_3)\,:\, (90)\hbox{ fails}\}| < 4^r 8^{c_{14}\varepsilon\log\frac 1\varepsilon}\ll N^{2+c_{14}\varepsilon\log\frac 1\varepsilon} \]

and

\[ |\{(S_1,S_2,S_3)\,:\, (91)\hbox{ fails}\}| < 2^{r+c_{15}\varepsilon\log \frac 1\varepsilon} 2^r \ll N^{2+c_{15}\varepsilon\log \frac 1\varepsilon}. \]

Since we have the trivial bound |$|C_{A,B}\cap {\mathbb {Z}}^2|\ll N=2^{r+o(r)}$| (as already noted in Section 2, Equation (7)), the total contribution is again |$N^{3+c_{16}\varepsilon \log \frac {1}{\varepsilon }}$|⁠.

Define

\[ z_i=\prod_{j\in S_i} \pi_j \in {\mathbb{Z}}[\sqrt{-1}]\quad (i=1,2,3), \]

thus

\begin{equation}\label{eq92} P_i=\frac{z_i}{\bar{z}_i} \prod_{j=1}^r \bar{\pi}_j,\quad A^2+B^2 = \left|\sum_{i=1}^3\frac{z_i}{\bar{z}_i}\right|^2m,\quad \lambda = 1 - \left|\sum_{i=1}^3\frac{z_i}{\bar{z}_i}\right|^2. \end{equation}

(92)

We explain the purpose behind introducing the restrictions (90) and (91). Write

\begin{equation} z_1 = \prod_{j\in S_1\cap S_2\cap S_3}\pi_j\cdot \prod_{j\in (S_1\cap S_2)\setminus S_3}\pi_j\cdot \prod_{j\in (S_1\cap S_3)\setminus S_2}\pi_j\cdot \prod_{j\in S_1\setminus (S_2\cup S_3)}\pi_j, \label{eq93} \end{equation}

(93)

\begin{equation} z_2 = \prod_{j\in S_1\cap S_2\cap S_3}\pi_j\cdot \prod_{j\in (S_2\cap S_3)\setminus S_1}\pi_j\cdot \prod_{j\in (S_2\cap S_1)\setminus S_3}\pi_j\cdot \prod_{j\in S_2\setminus (S_1\cup S_3)}\pi_j, \label{eq94} \end{equation}

(94)

\begin{equation} z_3 = \prod_{j\in S_1\cap S_2\cap S_3}\pi_j\cdot \prod_{j\in (S_3\cap S_1)\setminus S_2}\pi_j\cdot \prod_{j\in (S_3\cap S_2)\setminus S_1}\pi_j\cdot \prod_{j\in S_3\setminus (S_1\cup S_2)}\pi_j. \label{eq95} \end{equation}

(95)

Let |$q\in {\mathbb {Z}}_+$|, q<M, and consider the image distribution on |$([{\mathbb {Z}}[\sqrt {-1}]/(q)]^*)^3$| of the distribution |$\mu|_{(\pi_1\cdots\pi_r,q)=1}$| under the map |$(\pi_1,\ldots,\pi_r)\mapsto(z_1,z_2,z_3)$|. Assume now that each factor |$\prod _{j\in S'}\pi _j$| with S′ of cardinality |$|S'|>\varepsilon r$| appearing in the defining Equations (93)–(95) is uniformly distributed mod q. Then if |$(S_1,S_2,S_3)$| satisfies the constraints (90) and (91), it follows that the distribution of |$\textbf{z}=(z_1,z_2,z_3)$| considered above is uniform.

16 The Distribution ν

This leads us to the problem of analyzing the distribution of |$[{\mathbb {Z}}[\sqrt {-1}]/(q)]^*$| induced by the map

\begin{equation}\label{eq96} G^{r'}_{K,q} \to [{\mathbb{Z}}[\sqrt{-1}]/(q)]^*\quad ((\pi_1,\ldots,\pi_{r'})\mapsto \pi_1\cdots \pi_{r'}\bmod q) \end{equation}

(96)

where |$G_{K,q}=\{\pi\in G_K : (\pi,q)=1\}$|, for r′∼r. (Since |${\mathbb {Z}}[\sqrt {-1}]$| is a unique factorization ring, the notation (π,q)=1 to express that the prime ideal (π) does not divide (q) is acceptable in this context.)

The next lemma is a large deviation estimate which will play an important role in the sequel.

 

Let |$1< Q\in {\mathbb {Z}}$|⁠, 0<γ<1. Denote

\[ \Omega_Q := \{\chi\ \hbox{multiplicative on}\ {\mathbb{Z}}[\sqrt{-1}]\ \hbox{and primitive}\bmod q,\ q\leqslant Q \} \]

and

\[ \Omega'_Q := \left\{\chi \in \Omega_Q\, :\, \left|\sum_{\pi\in G_K} \chi(\pi)\right| >\gamma |G_K| \right\}. \]

Then for positive absolute constants c18, c19, we have

\begin{equation}\label{eq97} |\Omega'_Q| \leqslant c_{18}Q^{(4\log\log Q + 8\log \frac 1\gamma+c_{19})/\log K} (\log (QK))^3. \end{equation}

(97)

 

For r1 a positive integer write, for certain coefficients |$\sigma _{\chi }\in {\mathbb {C}}$| with |σχ|=1:

\begin{align} \gamma^{r_1}|G_K|^{r_1}|\Omega'_Q| & \leqslant \sum_{\chi\in\Omega'_Q} \sigma_{\chi} \left(\sum_{\pi\in G_K}\chi(\pi)\right)^{r_1}\notag\\ &\leqslant \sum_{\chi\in\Omega'_Q} \sigma_{\chi} \sum_{\substack{|z|^2 \leqslant K^{r_1}\\ z\in G_K^{r_1}}}\eta(z)\chi(z),\label{eq98} \end{align}

(98)

where

\[ \eta(z) = |\{(\pi_1,\ldots,\pi_{r_1}) \in G_K^{r_1}\,:\, \pi_1\cdots\pi_{r_1} = z\}|. \]

We apply Cauchy's inequality to (98) after inverting the order of summation, note that |$\|\eta \|^2_2 \leqslant r_1! |G_K|^{r_1}$|⁠, and get

\begin{align}\label{eq99} \gamma^{r_1}|G_K|^{r_1}|\Omega'_Q| &\leqslant \|\eta\|_2\cdot \left[\sum_{\substack{|z|<K^{r_1/2}\\ z\in G_K^{r_1}}} \left|\sum_{\chi\in \Omega'_Q} \chi(z) \sigma_{\chi}\right|^2\right]^{\frac{1}{2}}\nonumber\\ &\leqslant (r_1!)^{1/2}|G_K|^{r_1/2}\cdot \left[\sum_{z \in B} \left|\sum_{\chi\in \Omega'_Q} \chi(z) \sigma_{\chi}\right|^2\right]^{\frac{1}{2}}, \end{align}

(99)

where B is the square box |$B = \{z\,:\, |\Re (z)|\leqslant K^{r_1/2},\ |\Im (z)|\leqslant K^{r_1/2}\}$|⁠, because enlarging the range of z always maintains the inequality. This last step, which replaces the summation over points in a disk by the summation over all points in a box with sides parallel to the coordinate axes, actually turns out to be to our advantage.

Once this is done we expand |$\sum _z |\sum _\chi |^2$|⁠, invert the order of summation once more, and obtain

\begin{equation}\label{eq100} \gamma^{r_1}|G_K|^{r_1}|\Omega'_Q|\leqslant (r_1!)^{1/2}|G_K|^{r_1/2}\cdot \left[ \sum_{\chi,\chi'\in \Omega'_Q} \left| \sum_{z\in B}(\chi\overline{\chi'})(z) \right| \right]^{\frac{1}{2}}. \end{equation}

(100)

The contribution of the diagonal |$\chi=\chi'$| to |$\sum |\sum |$| is bounded by |$O(K^{r_1}|\Omega'_Q|)$|. For the terms |$\chi\ne\chi'$| off the diagonal we note that |$\chi \overline {\chi '}$| is a character to a modulus |$q^*\leqslant Q^2$| and this character is not principal, because χ and χ′ are distinct primitive characters. The sum |$\sum _{z\in B_{m,n}(q^*)}(\chi \overline {\chi '})(z) = 0$| whenever |$B_{m,n}(q^*)$| is the |$q^*\times q^*$| box

\[ B_{m,n}(q^*)=\{z : 0\leqslant (\Re(z)-m) <q^*,0 \leqslant (\Im(z)-n) <q^*\}, \]

with |$m,n\in {\mathbb {Z}}$|, hence in the sum over the main box B we can remove all boxes |$B_{m,n}(q^*)$| properly contained in it, and we are left only with the contribution of the points belonging to boxes |$B_{m,n}(q^*)$| that intersect the boundary ∂B of the main box.
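The vanishing of a nonprincipal character sum over a complete box can be checked numerically in a toy case. The sketch below is only an illustration under stated assumptions: it uses the specific character |$\chi(x+\sqrt{-1}\,y)=\big(\frac{x^2+y^2}{p}\big)$| on |${\mathbb {Z}}[\sqrt {-1}]$| modulo an odd prime p (a convenient example, not one of the characters of the proof), and the function names `legendre`, `chi`, `box_sum` are hypothetical.

```python
def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, with value in {-1, 0, 1}."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def chi(x: int, y: int, p: int) -> int:
    """Illustrative character on Z[i] mod p: chi(x + iy) = ((x^2 + y^2)/p).

    It is multiplicative because the norm is, and nonprincipal because the
    norm takes every nonzero residue mod p.
    """
    return legendre(x * x + y * y, p)


def box_sum(p: int, m: int, n: int) -> int:
    """Sum of chi over the p-by-p box with lower-left corner (m, n)."""
    return sum(chi(x, y, p) for x in range(m, m + p) for y in range(n, n + p))


# Complete boxes at several offsets, for an inert prime (7) and a split prime (13).
box_sums = {
    (p, m, n): box_sum(p, m, n)
    for p in (7, 13)
    for (m, n) in ((0, 0), (3, -5), (100, 41))
}
```

Every entry of `box_sums` should be zero, which is the mechanism that lets one discard the interior boxes and keep only those meeting the boundary.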

The estimation of the sum |$\sum _z(\chi \overline {\chi '})(z)$| for z in these partial boxes |$B_{m,n}(q^*)\cap B$| is done in the usual way by Fourier analysis of the truncation of the box, using additive characters mod |$q^*$| in a 2D setting. This leads to generalized Gauss sums for nontrivial characters mod |$q^*$| on |${\mathbb {Z}}[\sqrt {-1}]$| (the reader may consult the classical paper [6] for the theory and evaluation of generalized Gauss sums). The result of the analysis is the bound

\[ \sum_{\substack{z\in B_{m,n}(q^*)\cap B\\ B_{m,n}(q^*)\cap \partial B\ne\emptyset}}(\chi\overline{\chi'})(z) \ll q^*(\log q^*)^2, \]

which is the Pólya-Vinogradov inequality for multiplicative characters on |${\mathbb {Z}}[\sqrt {-1}]$|⁠.

For a fixed character mod |$q^*$|, the number of boxes that intersect the boundary of the main box is not more than |$8\lfloor K^{r_1/2}/q^*\rfloor+4$| and the number of pairs of characters χ, χ′ to consider is not more than |$|\Omega'_Q|^2$|. As noted before, |$q^*\leqslant Q^2$|. Hence, recalling that the contribution of the diagonal was |$O(K^{r_1}|\Omega'_Q|)$|, Equation (100) simplifies to

\begin{equation}\label{eq101} \left(\frac{\gamma^2|G_K|}{(r_1!)^{1/r_1}}\right)^{r_1/2}|\Omega'_Q|\ll K^{r_1/2}|\Omega'_Q|^{\frac{1}{2}}+(K^{r_1/4} + Q)(\log Q)|\Omega'_Q|. \end{equation}

(101)

This inequality yields an upper bound for |Ω′Q| provided

\begin{equation}\label{eq102} \left(\frac{\gamma^2|G_K|}{r_1}\right)^{r_1/2}> c_{17} (K^{r_1/4} + Q) \log Q, \end{equation}

(102)

with c17 a sufficiently large absolute constant, whose value does not matter here.

We analyze the consequences of (101) together with (102) in the following way. The number of primes p≡1 mod 4 in [K/2,K] is asymptotic to |$K/(4\log K)$|, hence |$|G_K|\sim K/(2\log K)$|. We distinguish three cases, according to whether or not (102) holds:

Case I: (102) is not satisfied and |$K^{r_1/4} \geqslant Q$|⁠. In this case, we have

\[ \left(\frac{\gamma^2K}{3 r_1 \log K}\right)^{r_1}\ll K^{r_1/2}(\log Q)^2, \]

hence after reshuffling the formula and using |$Q^2 \geqslant |\Omega '_Q|$| we get

\[ \left(\frac {3r_1\log K}{\gamma^2}\right)^{r_1} \gg K^{r_1/2}(\log Q)^{-2} \gg Q^2(\log Q)^{-2}\gg |\Omega'_Q|(\log Q)^{-2}, \]

which can be rewritten as

\[ |\Omega'_Q| \ll \left(\frac {3r_1\log K}{\gamma^2}\right)^{r_1}(\log Q)^2. \]

Case II: Equation (102) is not satisfied and |$Q>K^{r_1/4}$|. We can dispose of this case by taking |$r_1$| large enough, so we take

\begin{equation}\label{eq103} r_1 = \left\lfloor\frac {4 \log Q}{\log K}\right\rfloor + 1. \end{equation}

(103)

Case III: Equation (102) is satisfied. In this case (101) simplifies to

\[ \left(\frac{\gamma^2|G_K|}{(r_1!)^{1/r_1}}\right)^{r_1/2}|\Omega'_Q|\ll K^{r_1/2}|\Omega'_Q|^{\frac{1}{2}}, \]

yielding

\begin{equation}\label{eq104} |\Omega'_Q|\ll \left(\frac {3r_1\log K}{\gamma^2}\right)^{r_1}. \end{equation}

(104)

With the above choice (103) for r1 and combining the bounds obtained separating the subcases |$\log Q>\log K$| and |$\log Q<\log K$|⁠, we conclude that always

\[ |\Omega'_Q| \ll \left(\frac{ 15 \log Q}{\gamma^2}\right)^{4\log Q/\log K }(\log(QK))^3. \]

Our next goal is to establish an estimate for

\begin{equation}\label{eq105} \Sigma:= \sum_{\substack{p_1,\ldots,p_n\\ {\mathrm{distinct}}}}\int \prod_{j=1}^n f_{p_j}(\textbf{z})\,{\rm d} \mu, \end{equation}

(105)

where z is defined by means of sets Si satisfying the constraints (90) and (91).

 

For s=1,…,n, let fp(z,s) be functions depending only on the value of z mod p and satisfying

\begin{equation}\label{eq106} |f_p(\textbf{z},s)|\leqslant C\sqrt p,\quad \left| \sum_{H_p^3} \kern-0.21in\hbox{---}\kern 0.12in f_p(\textbf{z},s)\right| < \beta. \end{equation}

(106)

(The slashed summation symbol indicates the averaged sum, thus here with a factor |$1/|H_p^3|$|⁠.)

Let |$\log K \sim A\log \log M$|, |$P>(\log M)^{2-\varepsilon }$| for every fixed ɛ>0, |$\log P \ll \log \log M$|, |$n<\log M$|, and suppose that A>24. Then

\begin{equation}\label{eq107} \sum_{\substack{p_1,\ldots,p_n\asymp P\\ {\mathrm{distinct}}}}\ \int_{\substack{\textbf{z}\bmod p_s\in H_{p_s}^3\\ s=1,\ldots,n}} \prod_{s=1}^n f_{p_s}(\textbf{z},s)\,{\rm d} \mu \ll \left(\frac{c_{27}\beta P}{\log P}\right)^n [\min(n, P^{\frac {1}{2}-\frac{12}{A}+o(1)}/\log P)]^{-n}. \end{equation}

(107)

 

Let |$H_q:=[{\mathbb {Z}}[\sqrt {-1}]/(q)]^*$| be the group of invertible elements of the ring |${\mathbb {Z}}[\sqrt {-1}]/(q)$| and denote by ν the image measure on |$H_q$| of the normalized counting measure on |$G_K^{r'}$| restricted to |$G_{K,q}^{r'}$| under the map (96).

We shall restrict q by further assuming:

\begin{equation}\label{eq108} q\ \hbox{ squarefree},\quad \log q \ll r,\quad r = o(r'). \end{equation}

(108)

Using multiplicative characters, write

\[ \nu = \frac 1{|H_q|} \sum_{\chi\bmod q} \hat \nu (\chi)\chi, \]

where

\begin{equation}\label{eq109} \hat \nu (\chi) = \sum_{x\in H_q} \nu(x) \overline{\chi(x)} =\left[\frac 1{|G_K|}\sum_{\pi\in G_K} \bar{\chi}(\pi)\right]^{r'}. \end{equation}

(109)

From this, if

\[ \left|\sum_{\pi\in G_K} \chi(\pi)\right| < \gamma |G_K|, \]

then it follows that

\begin{equation}\label{eq110} |\hat\nu(\chi)| < \gamma^{r'} < q^{-c}, \end{equation}

(110)

with c arbitrarily large as γ→0.

Those characters for which this bound fails are induced by primitive characters |$\chi_1$| in |$\bigcup_{q_1|q}\Omega'_{q_1}\subset \Omega'_q$|. From the preceding discussion, we may identify ν (up to a negligible error) with the distribution

\begin{equation}\label{eq111} \nu^*=\frac 1{|H_q|}\chi_0 + \frac 1{|H_q|}\sum_{q_1|q} \hat \nu(\chi_1\chi_0)\chi_1 \end{equation}

(111)

on Hq with χ0 the principal character mod q.

Let z1,z2,z3 be as defined before at the end of Section 15. We write for simplicity z=(z1,z2,z3). By the definition of zi, the related discussion, Equation (111), and the assumption that the moduli q are restricted to be squarefree, we verify that the sum above may be bounded by an expression of the form

\[ \sum_{\substack{p_1,\ldots,p_n\\ {\rm distinct}}}\sum_{\substack{q_i|p_1\cdots p_n\\ i=1,2,3}}\ \sum_{\substack{\chi_i\in\Omega'_{q_i}\\ i=1,2,3}} \frac{c_{ \chi_1,\chi_2,\chi_3}}{|H_{p_1\cdots p_n}|^3} \sum_{z_i\in H_{p_1\cdots p_n}}\kern -0.1in \chi_1(z_1)\chi_2(z_2)\chi_3(z_3) \prod_{s=1}^n f_{p_s}(\textbf{z},s), \]

where the coefficients |$c_{\chi_1,\chi_2,\chi_3}$| are bounded by 1 and where |$\Omega'_1$| reduces to the trivial character; here

\[ \chi_i=\prod_{j=1}^{n_i} \chi_{ij},\quad \chi_{ij}\bmod p_j,\quad \chi_{ij}=\chi_0\enspace \hbox{if } p_j{\!\not|\,}q_i. \]

In particular, |$\chi_{1s}(z_1)\chi_{2s}(z_2)\chi_{3s}(z_3)=1$| for all |$z_i\in H_{p_s}$| whenever |$p_s\nmid q_1q_2q_3$|.

It follows that

\begin{equation}\label{eq112} |\Sigma|\leqslant \sum_{\substack{p_1,\ldots,p_n\\ {\mathrm{distinct}}}}\ \sum_{q_1,q_2,q_3|p_1\cdots p_n}\ \sum_{\substack{\chi_i\in\Omega'_{q_i}\\ i=1,2,3}} \prod_{s=1}^n\left|\sum_{H_{p_s}^3}\kern-0.21in\hbox{---}\kern 0.12in\chi_1(z_1)\chi_2(z_2)\chi_3(z_3) f_{p_s}(\textbf{z},s)\right|. \end{equation}

(112)

We bound the terms in the right-hand side of this equation in two different ways. Let |$\bar {n}$| be the number of primes ps that divide q1q2q3 and ni be the number of primes ps that divide qi. For the corresponding factors in the product, we use the first estimate of Equation (106), getting from these factors a contribution of |$(c_{21}\sqrt P)^{\bar {n}}$| to the product. If instead ps does not divide q1q2q3, an event which occurs |$n-\bar {n}$| times, then χ1(z1)χ2(z2)χ3(z3)=1 and we may use the second estimate in (106), getting from the corresponding factors a contribution |$\beta ^{n-\bar {n}}$| to the product. The number of possible |$n-\bar {n}$| tuples is majorized by

\[ \binom{\lfloor c_{22}P/\log P\rfloor}{n-\bar{n}} < \left(\frac{c_{22}P}{\log P}\right)^{n-\bar{n}}\frac 1{(n-\bar{n})!}, \]

the constant |$c_{22}$| reflecting the range of the approximation |$p\asymp P$|.

Given |$n_i$|, the size of |$q_i$| is |$q_i < (c_{23}P)^{n_i}\leqslant (c_{23}P)^{\bar {n}}$| and we take |$Q = (c_{23}P)^{\bar {n}}$|. Then |$\Omega'_{q_i}\subset \Omega'_Q$| for all |$q_i$|. Thus from (112) and the preceding discussion we have the bound

\begin{equation}\label{eq113} \begin{split} |\Sigma| \leqslant c_{24}^n\ \sum_{\bar{n} \leqslant n}\ \ \sum_{n_i\leqslant \bar{n}\ (i=1,2,3)} |\Omega'_Q| ^3 (\sqrt P)^{\bar{n}} \left(\frac {\beta P}{\log P}\right)^{n-\bar{n}} \frac 1{(n-\bar{n})!}, \end{split} \end{equation}

(113)

for some constant c24 whose value is immaterial.

We estimate |$|\Omega'_Q|$| by appealing to Lemma 20. Since |$\log \log Q = \log \bar {n} + \log \log P + O(1)$|, |$\log \bar {n}<\log \log M$| (because |$\bar{n}\leqslant n<\log M$|), |$\log K >A \log \log M$|, and |$\log \log P \ll _A \log \log \log M$|, we obtain

\begin{align*} |\Omega'_Q| &\leqslant (c_{25} P^{\bar{n}})^{4(\log \bar{n}+ \log\log P + 2\log\frac 1\gamma+c_{19})/\log K}(\bar{n} \log P)^3\\ &\ll ((c_{25} P^{\bar{n}})^ {4(\log\log M+ \log\log P +2\log \frac1{\gamma}+c_{19})/(A\log\log M)} (\bar{n} \log P)^3)\\ & \ll_{\gamma} (c_{25} P)^{\kappa\bar{n}}, \end{align*}

where κ is

\[ \left. \kappa=\frac{4}{A} + 4\left[\log\log P +2\log\frac 1\gamma+c_{19}+3\log(\bar{n} \log P)/\bar{n}\right]\right/A\log\log M = \frac{4}{A} +o(1). \]

After taking γ sufficiently small so as to satisfy (110) with a sufficiently large exponent, and assuming |$\beta \geqslant 1$|, which is no restriction for us, we obtain:

\[ |\Sigma| \ll \left(\frac{c_{26}\beta P}{\log P}\right)^n\max_{\bar{n}\leqslant n} P^{3\kappa\bar{n}} \left(\frac {\log P}{\sqrt P}\right)^{\bar{n}} \frac{1}{(n-\bar{n})!}. \]

We may replace |$(n-\bar {n})!$| by |$(n-\bar {n})^{n-\bar {n}}\,{\mathrm {e}}^{-(n-\bar {n})}$|. Since the maximum of |$B^x/x^x$| is |$\exp (B/e)$|, achieved at x=B/e, the maximum occurs at |$\bar {n} = 0$| if |$n\leqslant P^{\frac {1}{2}-3\kappa }/(e\log P)$| and at |$n-\bar {n} = P^{\frac {1}{2}-3\kappa }/(e\log P)$| otherwise. Thus

\[ |\Sigma|\leqslant \begin{cases} \displaystyle\left(\frac {c_{26}\beta P}{n \log P}\right)^n&\hbox{if } n\leqslant P^{\frac{1}{2} - 3\kappa}/(e\log P)\\ \displaystyle\left(e c_{26}\beta P^{\frac{1}{2}+3\kappa}\right)^n \exp\left(\frac {P^{\frac{1}{2}-3\kappa}}{e \log P}\right) <(c_{26}e \beta P^{\frac{1}{2}+3\kappa})^n&\hbox{if } n>P^{\frac{1}{2} - 3\kappa}/(e\log P) \end{cases} \]

and the conclusion of the lemma follows from this. The bound we have obtained is not trivial if A>24.
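The calculus fact used in the optimization above, namely that |$B^x/x^x$| is maximized at x=B/e with maximum value |$\exp(B/e)$|, can be confirmed numerically. The following is a minimal sketch; the helper names `log_g` and `argmax_on_grid` and the sample values of B are illustrative assumptions, not quantities from the proof.

```python
import math


def log_g(x: float, B: float) -> float:
    """Logarithm of B**x / x**x for x > 0, i.e. x * (log B - log x)."""
    return x * (math.log(B) - math.log(x))


def argmax_on_grid(B: float, steps: int = 20000) -> float:
    """Grid point in (0, B] at which log_g(., B) is largest."""
    xs = [B * k / steps for k in range(1, steps + 1)]
    return max(xs, key=lambda x: log_g(x, B))


# Compare the grid maximizer with the predicted maximizer B/e.
for B in (5.0, 50.0):
    x_star = B / math.e
    print(B, argmax_on_grid(B), x_star, log_g(x_star, B), B / math.e)
```

Working with the logarithm avoids overflow for large B; since |$x(\log B-\log x)$| is concave in x, the grid maximizer lies within one grid step of B/e.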

In the next step, we will analyze the distribution of λ mod p, where |$\lambda = 1 -|\sum z_i/\bar {z}_i|^2$|⁠, for zi∈Hp (i=1,2,3) uniformly distributed mod p. The map (z1,z2,z3)↦λ is the composition of

\begin{equation}\label{eq114} H_p^3 \rightarrow {\mathbb{Z}}[\sqrt{-1}]/(p)\quad (z_1,z_2,z_3) \mapsto z:=\sum_{i=1}^3 \frac {z_i}{\bar{z}_i} \end{equation}

(114)

and

\begin{equation}\label{eq115} {\mathbb{Z}}[\sqrt{-1}]/(p)\rightarrow {\mathbb{Z}}/(p) \quad z \mapsto 1 - z\cdot \bar{z} = \lambda. \end{equation}

(115)

 

This map is well defined because zi∈Hp and |$H_p=[{\mathbb {Z}}[\sqrt {-1}]/(p)]^*$|⁠, so the image of |$\bar {z}_i$| in Hp is also invertible.

17 The Distribution of λ mod p

We consider first the image distribution Δ3 in |${\mathbb {Z}}[\sqrt {-1}]/(p)$| under the map (114); here Hp is endowed with the normalized counting measure. To this end, we begin with the image distribution Δ1 in |${\mathbb {Z}}[\sqrt {-1}]/(p)$| under the map |$z\mapsto z/\bar {z}$|⁠. We have

\begin{equation} \Delta_1 = \frac 1{p^2} \sum_\psi \hat{\Delta}_1(\psi)\psi,\label{eq116} \end{equation}

(116)

\begin{equation}\Delta_3 = \frac 1{p^2} \sum_\psi \hat{\Delta}_1(\psi)^3\psi,\label{eq117} \end{equation}

(117)

where ψ runs over the additive characters of |${\mathbb {Z}}[\sqrt {-1}]/(p)$|⁠.

Now distinguishing cases according as p is inert or split in |${\mathbb {Z}}[\sqrt {-1}]$|⁠, we have

\[ \max_{\psi\ \hbox{non-trivial}} |\hat{\Delta_1}(\psi)| =\begin{cases} \displaystyle\max_{(a,b)\ne(0,0)}\left|\,\sum_{\substack{(x,y)\ne(0,0)\\ {}\bmod p}} \kern-0.36in\hbox{---}\kern 0.12in e_p\left(\frac{a(x^2-y^2)+2bxy}{x^2+y^2}\right)\right|= O(p^{-\frac{1}{2}}),\\ \displaystyle\max_{(a,b)\ne(0,0)}\left|\, \sum_{\substack{x^2+y^2\ne 0\\ {}\bmod p}} \kern-0.31in\hbox{---}\kern 0.12in e_p\left(\frac{a(x^2-y^2)+2bxy}{x^2+y^2}\right)\right|= O(p^{-\frac{1}{2}}), \end{cases} \]

where the slashed summation symbol indicates the averaged sum, hence here with a factor asymptotic to |$p^{-2}$|. The two-dimensional exponential sum is |$O(p^{\frac 32})$| by the classical Weil estimate of one-dimensional exponential sums (note that the argument is homogeneous of degree 0), yielding the bound |$O(p^{-\frac {1}{2}})$| for the averaged sum.
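For small primes the averaged sums above can be computed by brute force. The sketch below is only a numerical illustration: the function names are hypothetical, and the explicit comparison value |$2\sqrt p/(p-1)$| is an assumption on our part, obtained by viewing the sum as p−1 copies of a Kloosterman-type sum over the conic |$u^2+v^2=1$| and using the Weil bound |$2\sqrt p$| for the latter.

```python
import cmath
import math


def averaged_sum(a: int, b: int, p: int) -> complex:
    """Average of e_p((a(x^2-y^2)+2bxy)/(x^2+y^2)) over (x, y) mod p
    with x^2+y^2 nonzero mod p (this covers both the inert and split cases)."""
    total = 0j
    count = 0
    for x in range(p):
        for y in range(p):
            n = (x * x + y * y) % p
            if n == 0:
                continue  # x + iy is not invertible mod p
            t = (a * (x * x - y * y) + 2 * b * x * y) * pow(n, -1, p) % p
            total += cmath.exp(2j * math.pi * t / p)
            count += 1
    return total / count


def max_nontrivial(p: int) -> float:
    """Largest |averaged_sum| over the nontrivial frequencies (a, b) != (0, 0)."""
    return max(
        abs(averaged_sum(a, b, p))
        for a in range(p)
        for b in range(p)
        if (a, b) != (0, 0)
    )


for p in (5, 7, 11, 13):
    print(p, max_nontrivial(p), 2 * math.sqrt(p) / (p - 1))
```

The modular inverse `pow(n, -1, p)` requires Python 3.8 or later.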

In view of this estimate and Equation (117), we infer that

\[ \Delta_3 = \frac 1{p^2} +\frac 1{p^2}\sum_{\psi\ {\rm non}\hbox{-}{\rm trivial}} c(\psi) \psi, \]

with |$c(\psi) \ll p^{-\frac 32}$|⁠, implying by Parseval's theorem that

\begin{equation}\label{eq118} \sum_{z\in{\mathbb{Z}}[\sqrt{-1}]/(p)} \left|\Delta_3(z)-\frac 1{p^2}\right|^2 = O\left(\frac 1{p^3}\right). \end{equation}

(118)

Next, consider the image distribution of Δ3 under the map |$z\mapsto 1 -z\cdot \bar {z} \bmod p$|⁠, or equivalently |$z\mapsto z\cdot \bar {z} \bmod p$|⁠. Denote by Δ′ the image distribution on |${\mathbb {Z}}/(p)$| of the normalized uniform measure on |${\mathbb {Z}}[\sqrt {-1}]/(p)$|⁠. Then from the preceding (118) we find

\begin{align}\label{eq119} \sum_{x \bmod p}|\Delta(x)-\Delta'(x)|^2 &= \sum_{x \bmod p} \left|\sum_{z\bar{z}\equiv x \bmod p}\left(\Delta_3(z)-\frac 1{p^2}\right)\right|^2\nonumber\\ &\leqslant 4 p \sum_{z \bmod p} \left|\Delta_3(z)-\frac 1{p^2}\right|^2 = O\left(\frac 1{p^2}\right). \end{align}

(119)

The Fourier coefficients of Δ′ are bounded by

\begin{equation}\label{eq120} \max_{a\ne 0 \bmod p} \left|\frac 1{p^2}\sum_{x,y \bmod p} e_p(a(x^2+y^2))\right| = O\left(\frac 1p\right), \end{equation}

(120)

implying, again by Parseval's theorem, the bound

\begin{equation}\label{eq121} \sum_{x \bmod p}\left|\Delta'(x)-\frac 1p\right|^2 = O\left(\frac 1{p^2}\right). \end{equation}

(121)
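For the distribution Δ′ of |$x^2+y^2 \bmod p$| the left-hand side of (121) can be computed exactly. In the sketch below (function names hypothetical) the closed form |$(p-1)/p^3$| is our own evaluation from the classical count of solutions of |$x^2+y^2\equiv c \bmod p$|; it is consistent with the |$O(1/p^2)$| bound but is not stated in the text.

```python
from fractions import Fraction


def delta_prime(p: int) -> list:
    """Exact distribution of x^2 + y^2 mod p for (x, y) uniform on (Z/p)^2."""
    counts = [0] * p
    for x in range(p):
        for y in range(p):
            counts[(x * x + y * y) % p] += 1
    return [Fraction(c, p * p) for c in counts]


def l2_deviation(p: int) -> Fraction:
    """Sum over residues x of |Delta'(x) - 1/p|^2, as an exact rational."""
    return sum((d - Fraction(1, p)) ** 2 for d in delta_prime(p))


for p in (3, 5, 7, 11, 13):
    print(p, l2_deviation(p), Fraction(p - 1, p ** 3))
```

Exact rational arithmetic is used so that the comparison does not depend on floating-point rounding.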

It follows from Equations (119) and (121) that the density Δ of the distribution of λ under the map |$H_p^3\to {\mathbb {Z}}/(p)$| given by |$(z_1,z_2,z_3)\mapsto 1-(\sum z_i/\bar {z}_i)(\sum \bar {z}_i/z_i)$| satisfies

\begin{equation}\label{eq122} \sum_{\lambda \bmod p}\left|\Delta(\lambda)-\frac 1p\right|^2= O\left(\frac 1{p^2}\right). \end{equation}

(122)

Now we are in a position to estimate the quantity β introduced in (106) when |$f_p(\textbf{z})$| is a function of the single variable |$\lambda = 1-(\sum z_i/\bar {z}_i)(\sum \bar {z}_i/z_i)$|. We write |$f_p(\lambda)$| for this function; as noted in Remark 22, this function is well defined. In view of the definition of |$z_i$|, the primes p for which |$z_i$| may not be in |$H_p$| are those for which p|m. We obtain

\begin{equation}\label{eq123} \begin{split} \left|\sum_{H_p^3} \kern-0.21in\hbox{---}\kern 0.12in f_p(\lambda) \right| &=\left| \sum_{\lambda \bmod p} \Delta(\lambda)f_p(\lambda)\right|\\ &\ll \frac 1p\left|\sum_{\lambda \bmod p} f_p(\lambda)\right| + \sum_ {\lambda \bmod p}|f_p(\lambda)|\cdot \left|\Delta(\lambda)-\frac 1p\right|. \end{split} \end{equation}

(123)

By Equations (106) and (122) and Cauchy's inequality, the second term is bounded by

\[ C\sqrt p \cdot \sum_{\lambda \bmod p} \left|\Delta(\lambda)-\frac 1p\right| \ll \sqrt p\cdot \sqrt p\cdot \frac 1p\ll 1, \]

giving for p∤m the bound

\begin{equation}\label{eq124} \beta \ll \frac 1p \left| \sum_{\lambda \bmod p} f_p(\lambda)\right| + 1. \end{equation}

(124)

18 Application of the Explicit Formula: First Steps

In the last two sections we apply the explicit formula to give a good average bound for the rank r(λ) of the elliptic curve |$J_\lambda /{\mathbb {Q}}$| when λ=1−(A2+B2)/m. The average in question will be taken over squarefree numbers m, taken with the probability distribution ψ[μ] as defined in (84). In order to do this we will estimate very high moments of r(λ). The main new difficulty consists in the fact that the parameter λ is restricted to a thin set and is arithmetically defined rather than geometrically defined, so it becomes necessary to use the results of the preceding section obtained with methods of probability theory.

As in all work of this type, everything here is conditional on the validity of the Riemann hypothesis and of the Birch and Swinnerton-Dyer conjecture for the relevant L-functions.

Recall that Jλ has the extended Weierstrass form

\[ Y^2 - (\lambda -2) X Y - \lambda Y = X^3 + 2 \lambda X^2 + 2 \lambda X\tag{28} \]

and has a short Weierstrass form |$y^2=x^3+rx+s$| with |$r=P_4(\lambda)$|, |$s=P_6(\lambda)$|, where |$-48P_4(\lambda)$| and |$864P_6(\lambda)$| are monic polynomials in |${\mathbb {Z}}[\lambda ]$| given by Equation (69). This provides a good model of the reduction of Jλ mod p for |$\lambda=1-(A^2+B^2)/m$| and p∤6m. In particular, for p∤6m the coefficients |$a_p$| appearing in the explicit formula are given by

\[ a_p(J_\lambda) = - \sum_{x \bmod p}\left(\frac{x^3+P_4(\lambda)x+P_6(\lambda)}{p}\right)\tag{70} \]

where of course P4(λ) and P6(λ) must be taken mod p.
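The Legendre-sum formula for |$a_p$| is easy to evaluate directly for a curve in short Weierstrass form. The following sketch uses the stand-in coefficients r=−1, s=1 (a hypothetical example, since the polynomials |$P_4$|, |$P_6$| of Equation (69) are not reproduced here) and compares the result with a direct count of affine points and with the Hasse bound |$|a_p|\leqslant 2\sqrt p$|.

```python
import math


def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, with value in {-1, 0, 1}."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def a_p(r: int, s: int, p: int) -> int:
    """a_p = -sum over x mod p of ((x^3 + r x + s)/p)."""
    return -sum(legendre(x ** 3 + r * x + s, p) for x in range(p))


def affine_points(r: int, s: int, p: int) -> int:
    """Number of (x, y) mod p with y^2 = x^3 + r x + s."""
    return sum(
        1
        for x in range(p)
        for y in range(p)
        if (y * y - (x ** 3 + r * x + s)) % p == 0
    )


# Stand-in curve y^2 = x^3 - x + 1; here 4 r^3 + 27 s^2 = 23, so the curve
# has good reduction at every prime in the list below.
R_DEMO, S_DEMO = -1, 1
for p in (5, 7, 11, 13):
    print(p, a_p(R_DEMO, S_DEMO, p), affine_points(R_DEMO, S_DEMO, p))
```

The affine count equals |$p-a_p$| because each x contributes |$1+\big(\frac{x^3+rx+s}{p}\big)$| values of y.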

For simplicity, we write now r(λ), ap(λ), U1(λ,X), Nλ for the corresponding quantities associated to the elliptic curve Jλ. By the bounds obtained by the explicit formula in Section 13 and |$\log N_\lambda =O(\log M)$|⁠, we have (the constant 2 takes care of the O(1) term)

\[ r(\lambda) \leqslant 2\frac{\log M}{\log X} +\frac 2{\log X}U_1(\lambda,X)\tag{79} \]

for |$\log X =o(\log M)$| and where, setting |$h_X(t)=\max (1-|t|/\log X,0)$|⁠, we have

\begin{equation}\label{eq125} U_1(\lambda,X) = -\sum_{p\leqslant X, p\geqslant 5} a_p(\lambda)\frac{\log p}{p} h_X(\log p). \end{equation}

(125)

Our goal is to obtain an estimate |$o(\log M/\log \log M)$| on average on m, where m is counted with the probability distribution ψ[μ].

The set |$\mathcal M$| of numbers considered was as follows. Given M, K, A, with

\[ K = (\log M)^A, \quad A >2, \quad \log K < c \log\log M, \tag{81} \]

we defined |$G_K$| to be the set of Gaussian primes π satisfying |$K/2 \leqslant |\pi |^2<K$| and |$\mathcal M$| to be the set of squarefree integers m∈[M/2,M] whose prime factors p are all of type |$|\pi|^2$|, with π as before. Since |$\log p\sim \log K=A\log \log M$|, for |$m\in \mathcal M$| we have

\begin{equation}\label{eq126} \omega(m) \sim r :=\frac {\log M}{\log K} = \frac{\log M}{A\log\log M}. \end{equation}

(126)

Therefore, an easy upper bound and lower bound estimate yields

\begin{equation}\label{eq127} |\mathcal M| = M^{1-\frac 1A+o(1)} \end{equation}

(127)

and we are dealing with a relatively thin set of numbers m. We set

\begin{equation}\label{eq128} X = K^{1/\eta} \end{equation}

(128)

where η>0 is fixed but remains at our disposal, giving

\[ \frac {\log M}{\log X} = \eta \frac {\log M}{\log K} = \eta r. \]

Note also that the contribution of terms with

\begin{equation}\label{eq129} p < X_1 =\left(\frac{\log M}{\log\log M}\right)^2 \end{equation}

(129)

in the sum in Equation (125), weighted by the factor |$1/\log X$| coming from Equation (79), is bounded by

\[ \frac 1{\log X} \sum_{p < X_1} \frac{\log p}p 2\sqrt p \ll \frac{\sqrt{X_1}}{\log X} \ll\frac{\log M}{(\log X)\log\log M} \]

and we may therefore restrict the p-summation to the range

\[ X_1 \leqslant p \leqslant X. \]

Further, given |$\textbf{z}=(z_1,z_2,z_3)$|, its reduction mod p is in |$H_p^3$| unless p divides |$|z_1z_2z_3|^2$|. If this happens then p|m, so p>K/2. Thus the contribution to the sum from terms with p|m is

\[ \sum_{\substack{X_1< p < X\\ p|m}}h_X(\log p)\frac{\log p}p a_p(\lambda)\leqslant 2(\sqrt{K/2})^{-1} \sum_{p|m} \log p \ll (\log M)^{1-A/2}, \]

hence it is negligible because A>2 by hypothesis. Therefore, taking moments of order 2l of Equation (79), we get the estimate

\begin{equation}\label{eq130} \mu\left[ r(\lambda) >4 \frac{\log M}{\log X} \right] \leqslant\left(\frac 1{\log M}\right)^{2l} \ \int \left[\sum_{\substack{X_1<p<X\\ \textbf{z} \bmod p\in H_p^3}} h_X(\log p) \frac{\log p}p a_p(\lambda) \right]^{2l} {\mathrm{d}} \mu. \end{equation}

(130)

From now on, we assume that

\begin{equation}\label{eq131} l = o(\log M). \end{equation}

(131)

Next, we subdivide the interval for p into dyadic ranges |$p\asymp 2^k$| with |$X_1\ll 2^k\ll X$|, apply Hölder's inequality to separate the dyadic intervals, and expand the power 2l of the sum over each dyadic interval, obtaining

\begin{equation}\label{eq132} {\mathrm{r.h.s.}} (130) \leqslant \left(\frac{c_{28} \log X}{\log M}\right)^{2l}\sum_k \left(\frac {k}{2^k}\right)^{2l}\sum_{\sum j_p=2l} \frac{(2l)!}{\prod j_p!} \left|\int_{\substack{\textbf{z} \bmod p\in H_p^3\\ p\asymp 2^k}} \prod_p a_p(\lambda)^{j_p}\,{\rm d} \mu\right| \end{equation}

(132)

where c28 is some absolute constant.

19 Application of the Explicit Formula: Conclusion

In this final section we give the estimation of the right-hand side of Equation (132).

We rewrite the sum and product in Equation (132) in terms of tuples |$(h_1,h_2,\ldots,h_t)$| of t positive integers with |$\sum h_s=2l$| and tuples of primes |$(p_1,p_2,\ldots,p_t)$| such that |$j_{p_s}=h_s$| and |$p_s\asymp 2^k$|. Once we fix a tuple |$(h_1,\ldots,h_t)$|, we define an associated set of functions |$f_p(\textbf{z},s)$| for s=1,…,t, depending only on the value of λ mod p, by setting

\[ f_p(\textbf{z},s) = \begin{cases} a_p(\lambda) &\hbox{if } h_s=1 \\ (2\sqrt p)^{-h_s} a_p(\lambda)^{h_s} & \hbox{if } h_s>1 \end{cases} \]

which makes it plain that

\begin{equation}\label{eq133} |f_p(\textbf{z},s)|\leqslant 2\sqrt p \enspace\hbox{if } h_s=1,\quad |f_p(\textbf{z},s)|\leqslant 1 \enspace\hbox{if } h_s\geqslant 2. \end{equation}

(133)

Then the inner sum in Equation (132) is majorized by

\[ \sum_{t\leqslant 2l}\sum_{\sum h_s=2l} \frac{(2l)!}{\prod h_s!}\left(\prod_{h_s\geqslant 2} (2\sqrt{p_s})^{h_s}\right) \left|\sum_{\substack{p_1,\ldots,p_t\asymp 2^k\\ {\rm distinct}}} \int_{\substack{\textbf{z} \bmod {p_s}\in H_{p_s}^3\\ s=1,\ldots, t}} \prod_{s=1}^t f_{p_s}(\textbf{z},s) \,{\rm d}\mu\right| \]

and we evaluate it by appealing to Lemma 21. A bound for the critical parameter |$\beta =|\sum \kern -0.16in\hbox {--\!\!--}\ \ f_p(\textbf {z},s)|$| is readily obtained. Indeed, if hs=1 then by Equation (124) we get

\[ \left|\sum \kern-0.21in\hbox{---}\kern 0.12in f_{p_s}(\textbf{z},s)\right| \ll \frac 1{p}\left|\sum_{\lambda \bmod p}\ \sum_{x \bmod p} \left(\frac{x^3+P_4(\lambda)x+P_6(\lambda)}{p}\right)\right| + 1. \]

Now a simple elementary direct proof of the bound O(p) for the double sum goes as follows. We state first an easy lemma.

 

Let X be a rational variety of dimension n defined over |${\mathbb {Z}}$| and suppose that there is a birational map ϕ defined over |${\mathbb {Z}}$| which is an isomorphism |$\phi :Y\rightarrow V$| between a nonempty Zariski open set Y of X, defined over |${\mathbb {Z}}$|, and a nonempty Zariski open set V, also defined over |${\mathbb {Z}}$|, of the affine space |${\mathbb {A}}^n$|. (By a variety we mean an absolutely irreducible algebraic set.)

Let p be a prime number, let |$X/{\mathbb {F}}_p=X\otimes {\mathbb {F}}_p$| be the reduction of X modulo p and let |$X({\mathbb {F}}_p)$| denote its number of points defined over |${\mathbb {F}}_p$|. Let U be a nonempty Zariski open subset of X. Then for all but a finite set of primes p (depending only on X and ϕ) we have

\begin{equation}\label{eq134} U({\mathbb{F}}_p)=p^n+O(p^{n-1}). \end{equation}

(134)

 

For all but a finite set of primes p determined by X and ϕ (the reduction of X modulo p must be a variety), we have

\begin{align*} U({\mathbb{F}}_p)&=X({\mathbb{F}}_p)+O(p^{n-1})\\ Y({\mathbb{F}}_p)&=X({\mathbb{F}}_p)+O(p^{n-1})\\ V({\mathbb{F}}_p)&={\mathbb{A}}^n({\mathbb{F}}_p)+O(p^{n-1}) \end{align*}

because |$X\setminus U$|, |$X\setminus Y$|, |${\mathbb {A}}^n\setminus V$| are Zariski closed sets of dimension at most n−1, all defined over |${\mathbb {Z}}$|. On the other hand, |${\mathbb {A}}^n({\mathbb {F}}_p)=p^n$|, while |$Y({\mathbb {F}}_p)=V({\mathbb {F}}_p)$| for all but a finite set of primes p determined by ϕ, because ϕ is an isomorphism defined over |${\mathbb {Z}}$|. The conclusion of the lemma follows.

The following remark is well known to the experts, and we state it (without proof), because it may be useful in other settings.

 

The lemma remains valid also when we remove the condition that the birational map ϕ is defined over |${\mathbb {Z}}$|⁠, but the proof is not so elementary.

We apply the lemma with X the affine surface y2=x3+P4(λ)x+P6(λ) in the parameters (x,y,λ). The surface X is birationally equivalent over |${\mathbb {Q}}$| to the Beauville cubic surface (27), as shown by the sequence of birational transformations provided at the end of Section 7. In homogeneous coordinates [x:y:λ:w], the Beauville cubic has only one singularity at [0:0:1:0], which is a double point. A birational map of it on |${\mathbb {A}}^2$|⁠, defined over |${\mathbb {Z}}$|⁠, is obtained by projecting the cubic surface from the singular point to a plane, so the lemma is applicable here.

The sum of Legendre symbols we want to estimate is

\[ \sum_{\lambda \bmod p}\ \sum_{x \bmod p} \left(\frac{x^3+P_4(\lambda)x+P_6(\lambda)}{p}\right) = X({\mathbb{F}}_p) - {\mathbb{A}}^2({\mathbb{F}}_p) =X({\mathbb{F}}_p)-p^2 \]

and the bound O(p) follows from the lemma. Hence β=O(1) if |$h_s=1$|. Instead, if |$h_s\geqslant 2$| then

\[ \left|\sum\kern-0.21in\hbox{---}\kern 0.12in f_{p_s}(\textbf{z},s)\right| \leqslant 1 \]

by Equation (133). Hence β=O(1) in this case too.
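The identity between the double Legendre sum and the point count |$X({\mathbb {F}}_p)-p^2$| used above holds for any choice of coefficient polynomials, which makes it easy to test by brute force. In the sketch below the polynomials `P4_demo` and `P6_demo` are hypothetical stand-ins (the true |$P_4$|, |$P_6$| are given by Equation (69), which is not reproduced here), so the code illustrates only the identity, not the O(p) bound for the actual surface.

```python
def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, with value in {-1, 0, 1}."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def P4_demo(lam: int) -> int:
    """Hypothetical stand-in for P_4(lambda)."""
    return lam * lam + 1


def P6_demo(lam: int) -> int:
    """Hypothetical stand-in for P_6(lambda)."""
    return lam ** 3 - 2 * lam + 3


def double_legendre_sum(p: int) -> int:
    """Sum over lambda, x mod p of ((x^3 + P4(lambda) x + P6(lambda))/p)."""
    return sum(
        legendre(x ** 3 + P4_demo(lam) * x + P6_demo(lam), p)
        for lam in range(p)
        for x in range(p)
    )


def surface_points(p: int) -> int:
    """Number of (x, y, lambda) mod p on y^2 = x^3 + P4(lambda) x + P6(lambda)."""
    return sum(
        1
        for lam in range(p)
        for x in range(p)
        for y in range(p)
        if (y * y - (x ** 3 + P4_demo(lam) * x + P6_demo(lam))) % p == 0
    )


for p in (5, 7, 11):
    print(p, double_legendre_sum(p), surface_points(p) - p * p)
```

For each pair (λ, x) the number of admissible y is one plus the Legendre symbol, and summing over the |$p^2$| pairs gives the identity.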

In the application of Lemma 21 we take n=t, |$P\asymp 2^k$|, and the conditions on |$f_{p_s}(\textbf{z},s)$| are verified with β=O(1). Thus we get

\begin{align}\label{eq135} &\mu\left[ r(\lambda) >4 \frac{\log M}{\log X} \right] \nonumber\\ & \quad \ll\left(\frac{c_{28}\log X}{\log M}\right)^{2l}\sum_k \left(\frac {k}{2^k}\right)^{2l}\sum_{t=1}^{2l}\sum_{\sum h_s=2l} \frac{(2l)!}{\prod h_s!}2^{\frac k2 \sum_{h_s\geqslant 2} h_s} \left(\frac{c_{25} 2^k}{k \min\left(t, (2^k)^\kappa\right)}\right)^t \end{align}

(135)

with |$\kappa = \frac {1}{2}-\frac {12}{A}+o(1)$|⁠.

It remains to simplify the bound so obtained. Let u=|{s : hs=1}|. Then |$2^{\frac k2 \sum _{h_s\geqslant 2} h_s}=2^{k(l-u/2)}$|⁠. Moreover, we have

\[ \sum_{\substack{\sum h_s=2l\\ h_s\geqslant 1}} \frac{(2l)!}{h_1!\cdots h_t!}<\sum_{\sum h_s=2l} \frac{(2l)!}{h_1!\cdots h_t!}=(\underbrace{1+1+\cdots+1}_{t\ {\rm times}})^{2l} =t^{2l} \]

hence noting that |$t\leqslant 2l$| and absorbing the constant c25 in a new constant c29, Equation (135) simplifies to

\begin{equation}\label{eq136} \mu\left[ r(\lambda) >4 \frac{\log M}{\log X} \right] \ll \left(\frac{c_{29} t\log X}{\log M}\right)^{2l}\sum_{k ,t,u} k^{l } \left(\frac {k}{2^k}\right)^{l+\frac u2-t} \left(\frac{1}{\min(t, (2^k)^\kappa)}\right)^t. \end{equation}

(136)
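The multinomial bound used in passing from (135) to (136) can be verified by brute force for small parameters. The sketch below (function name and parameter ranges are illustrative choices, not from the paper) sums the multinomial coefficients over all t-tuples with sum n=2l, once with every |$h_s\geqslant 1$| and once with |$h_s\geqslant 0$|; the unrestricted sum equals |$t^{2l}$| by the multinomial theorem, and the restricted sum counts surjections from a set of size 2l onto a set of size t.

```python
from itertools import product
from math import factorial


def multinomial_sum(t, n, min_part):
    """Sum of n!/(h_1! ... h_t!) over t-tuples with h_s >= min_part, sum h_s = n."""
    total = 0
    for h in product(range(min_part, n + 1), repeat=t):
        if sum(h) != n:
            continue
        denom = 1
        for hs in h:
            denom *= factorial(hs)
        total += factorial(n) // denom
    return total


for l in (1, 2, 3):
    n = 2 * l
    for t in range(1, n + 1):
        unrestricted = multinomial_sum(t, n, 0)
        restricted = multinomial_sum(t, n, 1)
        assert unrestricted == t**n      # multinomial theorem
        assert restricted <= unrestricted  # strict as soon as t >= 2
```

For t=1 the two sums coincide, so the inequality is strict only for |$t\geqslant 2$|; this does not affect the upper bound |$t^{2l}$|.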

Further easy simplifications can be done as follows. The triple sum can be replaced by |$\max $| if |$l\gg \log \log M$|, which we may suppose, by changing the constant c29 and taking l of order of magnitude larger than |$\log \log M$|. The term |$k^l$| is bounded by |$(\log X)^l$|, so it can be removed by replacing |$\log X$| in the formula by |$(\log X)^{3/2} = O((\log \log M)^{3/2})$|. Since |$t\leqslant 2l$| and |$l=o(\log M)$|, we can simply replace |$c_{29} t(\log X)^{3/2}$| by the cleaner bound |$(\log \log M)^2 \log M$|, for example. Recalling that |$(\log M)/\log X = \eta r$|, this gives

\begin{equation}\label{eq137} \mu[ r(\lambda) >4 \eta r] < (\log\log M)^{4l}\max_{k,t,u}\left(\frac {k}{2^k}\right)^{l+\frac u2-t} \left(\frac{1}{\min(t,(2^k)^\kappa)}\right)^t \end{equation}

(137)

where |$\kappa =\frac {1}{2}-\frac {12}{A}+o(1)$|⁠.

Note now that |$2l=u+\sum _{h_s\geqslant 2}h_s \geqslant u+2(t-u)=2t-u$|, hence |$t\leqslant l+u/2$|, so the exponent of the fraction |$k/2^k$| is nonnegative. In particular, the corresponding term is always less than or equal to 1.
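The two combinatorial facts used here, namely |$\sum_{h_s\geqslant 2}h_s=2l-u$| and |$t\leqslant l+u/2$|, can be confirmed by enumerating all admissible tuples for small l. This is an illustrative sketch; the helper names are not from the paper.

```python
def compositions(n, t):
    """Yield all t-tuples of integers >= 1 that sum to n."""
    if t == 1:
        if n >= 1:
            yield (n,)
        return
    # The first part can be at most n - (t - 1), leaving 1 for each other part.
    for first in range(1, n - t + 2):
        for rest in compositions(n - first, t - 1):
            yield (first,) + rest


def check_exponent_nonnegative(l):
    """Check sum_{h_s >= 2} h_s = 2l - u and t <= l + u/2 for all tuples."""
    for t in range(1, 2 * l + 1):
        for h in compositions(2 * l, t):
            u = sum(1 for hs in h if hs == 1)
            large_part = sum(hs for hs in h if hs >= 2)
            assert large_part == 2 * l - u
            assert 2 * t <= 2 * l + u  # equivalent to t <= l + u/2
    return True


for l in (1, 2, 3, 4):
    assert check_exponent_nonnegative(l)
```

Equality |$t=l+u/2$| occurs exactly when every |$h_s$| is 1 or 2, which is where the factor |$(k/2^k)^{l+u/2-t}$| gives no saving and the second factor in (137) must be used.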

We have two cases, according to whether t is small or large compared with l.

Case I: We have t<l/2. In this case,

\[ \left(\frac{k}{2^k}\right)^{l+\frac u2-t} < \left(\frac {k}{2^k}\right)^{l/2} <\left(\frac{\log X_1}{X_1}\right)^{l/2}< (\log M)^{-\frac{1}{2} l} \]

because |$2^k>X_1/2$| and |$X_1>(\log M)^{2-\varepsilon }$|. Thus Equation (137) yields

\begin{equation}\label{eq138} \mu[ r(\lambda) >4 \eta r] < \left(\frac{(\log\log M)^4}{(\log M)^{\frac{1}{2}}}\right)^{l} <\delta^{l} \end{equation}

(138)

for any fixed δ>0, as |$M\to \infty $|⁠.

Case II: We have |$t \geqslant l/2$|. In this case,

\[ \left(\frac{1}{\min(t,(2^k)^\kappa)}\right)^t \leqslant (X_1/2)^{-\kappa l/2} <(\log M)^{-\frac{1}{2}\kappa l} \]

hence

\begin{equation}\label{eq139} \mu[ r(\lambda) >4 \eta r] < \left(\frac{(\log\log M)^4}{(\log M)^\tau}\right)^{l} < \delta^{l} \end{equation}

(139)

with |$\tau = \frac {1}{2}\kappa >0$| (recall that κ>0 because A>24), as |$M\to \infty $|⁠.

We take |$l = \lceil r \rceil = \lceil \log M/(A\log \log M)\rceil $|⁠, which we may because both conditions |$l \gg \log \log M$| and |$l=o(\log M)$| are satisfied. In view of Equations (138) and (139), in either case we have, as |$M\to \infty $|⁠,

\begin{equation}\label{eq140} \mu[ r(\lambda) >4 \eta r] < \delta^{r} \end{equation}

(140)

for any fixed η>0 and δ>0, provided A>24.
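To get a feeling for how large M must be before the bases in (138) and (139) drop below a given δ, one can tabulate |$(\log\log M)^4/(\log M)^{e}$| as a function of |$\log M$|, with e=1/2 in Case I and e=τ in Case II. The sketch below works directly with |$\log M$| to avoid overflow; the function names and sample values are illustrative assumptions, and implied constants are ignored.

```python
import math


def decay_ratio(log_M, exponent):
    """Return (log log M)^4 / (log M)^exponent, given log M directly."""
    return math.log(log_M) ** 4 / log_M ** exponent


def choose_l(log_M, A):
    """The choice l = ceil(log M / (A log log M)) made in the text."""
    return math.ceil(log_M / (A * math.log(log_M)))


# Case I exponent 1/2: the ratio falls below delta = 1/16 only for huge log M.
for log_M in (1e8, 1e16, 1e24, 1e40):
    print(f"log M = {log_M:.0e}: ratio = {decay_ratio(log_M, 0.5):.3e}")

# The chosen l sits between log log M and log M, as the argument requires.
log_M, A = 1e6, 25.0
l = choose_l(log_M, A)
assert math.log(log_M) < l < log_M
```

The convergence is very slow, and slower still in Case II when A is close to 24, since then τ=κ/2 is close to 0.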

Finally, we are ready to determine the average behavior of the number of solutions of the system (1) for |$m\in \mathcal M$| as defined in Section 19.

 

Assume the Riemann hypothesis and the Birch and Swinnerton-Dyer conjecture for the L-functions of elliptic curves over |${\mathbb {Q}}$|⁠.

For fixed A>24 and squarefree |$m\in \mathcal M$| chosen at random according to the distribution ψ[μ] as defined in Section 16, the number of solutions of

\[ P_1+P_2+P_3=P_4+P_5+P_6,\quad |P_i|^2=m,\quad P_i\in{\mathbb{Z}}[\sqrt{-1}] \]

is at most |$N^{3+\varepsilon }$| as |$M\to \infty $|, where |$N=2^{\omega (m)}$|.

 

At the beginning of Section 15 we defined, for squarefree m, the quantity

\begin{align*} Q(m) &:= |\{(P_1,\ldots,P_6)\in({\mathbb{Z}}[\sqrt{-1}])^6 \,:\, |P_i|^2=m \hbox{ and } P_1+P_2+P_3=P_4+P_5+P_6\}| \\ &\leqslant \sum_{P_1,P_2,P_3} \min\left(|C_{A,B}\cap{\mathbb{Z}}^2|,2^{(1+\varepsilon)r}\right)\tag{87} \end{align*}

with |$r=\log M/\log K \sim \omega (m)$|, so that, up to a factor |$2^{o(r)}$|, this is the number of solutions of the system (1). Assuming the Riemann hypothesis for the L-functions of elliptic curves over |${\mathbb {Q}}$| and the Birch and Swinnerton-Dyer conjecture, we obtained in Corollary 19 the estimate

\[ Q(m) \leqslant \sum_{P_1,P_2,P_3} \min\left(\left(\frac {c_{13}}{\varepsilon}\right)^{\frac{1}{2} r(\lambda)},2^{(1+\varepsilon)r}\right)+ O(N^{3+c_{12}A\varepsilon\log\frac 1{\varepsilon}})\tag{89} \]

for |$m\in \mathcal M$| and |$M\to \infty $|⁠.

The average of Q(m) with respect to the distribution ψ[μ] is estimated by appealing to Equation (89). By Equation (140) with δ=1/16, for any fixed η>0 as |$M\to \infty $| we have |$r(\lambda )<4\eta r$| outside of a set of μ-measure at most |$16^{-r}$|. In this exceptional set, we use the trivial bound |$2^{(1+\varepsilon )r}$| in taking the minimum in Equation (89). In the nonexceptional set, we use instead the nontrivial bound |$C^{\frac {1}{2}r(\lambda)}< C^{2\eta r}$|, with |$C=c_{13}/\varepsilon $|. This gives the bound

\begin{align*} \int Q(m) {\mathrm{d}} \mu &\leqslant \int\left[\sum_{P_1,P_2,P_3}\min\left(C^{\frac{1}{2} r(\lambda)},2^{(1+\varepsilon)r}\right)\right]{\rm d} \mu +O(N^{3+c_{12}A\varepsilon\log\frac 1{\varepsilon}})\\ &\ll N^{3}(c_{13}/\varepsilon)^{2\eta r} + 16^{-r} N^3 2^{(1+\varepsilon)r}+ N^{3+c_{12}A\varepsilon\log\frac 1{\varepsilon}}. \end{align*}

Since we can choose ε arbitrarily small and, afterwards, η>0 also arbitrarily small, for example η=ε, recalling that |$N=2^{(1+o(1))r}$| we obtain |$Q(m)=O(N^{3+o(1)})$| on average on m with respect to the distribution ψ[μ].
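The exponent bookkeeping in this last step can be spelled out as follows. This is a sketch that uses only |$N=2^{(1+o(1))r}$| and the three terms of the preceding bound:

\begin{align*} N^{3}(c_{13}/\varepsilon)^{2\eta r} &= N^{3}\, 2^{2\eta r\log_2(c_{13}/\varepsilon)} = N^{3+(2+o(1))\eta\log_2(c_{13}/\varepsilon)},\\ 16^{-r} N^{3} 2^{(1+\varepsilon)r} &= N^{3}\, 2^{-(3-\varepsilon)r} = N^{\varepsilon+o(1)}. \end{align*}

With η=ε the first exponent is |$3+(2+o(1))\varepsilon \log _2(c_{13}/\varepsilon )$| and the exponent of the third term is |$3+c_{12}A\varepsilon \log \frac {1}{\varepsilon }$|. Both tend to 3 as |$\varepsilon \to 0$|, because |$\varepsilon \log (1/\varepsilon )\to 0$|, while the second term is negligible in comparison.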

Funding

E. B. thanks the AXA Foundation and the Mittag-Leffler Institute for their generous support during the preparation of this work, and also the support of a grant from the School of Mathematics Council of the Institute for Advanced Study.

Acknowledgements

E.B. thanks Ciro Ciliberto for pointing out in a conversation at the Mittag-Leffler Institute the connection of the pencil of sextics with Halphen's rational elliptic surface and for useful discussions on its geometry, and also thanks N.M. Katz for continued enlightening discussions on the arithmetic of elliptic curves.

