Convergence failure in secular equation DLAED4

The issue was originally filed as a [convergence failure of SYEVD]( https://github.com/jax-ml/jax/issues/32641) in Jax. It can be traced to a convergence failure of DLAED4. Here is a minimal standalone reproducer. 

```C++
// main.cpp
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <vector>
#include <iostream>
#include <iomanip>

extern "C" {
  void dlaed4_(const int* N, const int* I,
               const double* D, const double* Z,
               double* DELTA, const double* RHO,
               double* DLAM, int* INFO);
}

int main(int argc, char** argv) {
  const int n = 29;
  const int i = 23;
  const double rho = 6.2964808955495258e-05;

  const double d[n] = {
    -8.9907411945275305e-17, 3.1527281176230556e-06, 7.8104209108493992e-06, 1.4528378412105169e-05, 1.6571840586254615e-05, 1.6791959834500457e-05, 1.6951003434697501e-05, 1.6961635049587819e-05, 1.6969218460422806e-05,     1.7028037406879576e-05, 1.9135232081591496e-05, 2.1876238530929982e-05, 2.1920146416725336e-05, 3.9357203097119847e-05, 1.0557423781645928e-04, 1.3686127039388258e-04, 1.3686300570614894e-04, 1.3686313427158908e-04, 1.3686314099616321e-04, 1.3686314127052412e-04, 1.3686314210337733e-04, 1.3686314231692531e-04, 1.3686314257213152e-04, 5.3532577970471340e-02, 9.9996866934433004e-01, 1.0000000849390769e+00, 1.0000000901678945e+00, 1.0000000918404301e+00, 1.0000001095998177e+00
  };
  const double z[n] = {
    1.6700411548440888e-07, 2.2704845744451106e-06, -6.8596999842130412e-06, -1.9813464524908256e-04, 3.4676472662621390e-03, 5.8131816598176476e-02, -2.4389840480820292e-03, 2.2571085689296944e-03, -2.0150012240231657e-03, -9.7432047705138750e-03, 7.3184880793391433e-04, -2.9319792346511997e-02, 8.3364827297377855e-07, -8.9722126354645391e-09, 7.0412738449665979e-01, -1.4042299148200350e-06, 3.8731670039916887e-06, -5.4330381400333468e-07, 4.4026694557354993e-07, 2.2528424563293299e-07, 3.6064772311683760e-06, 2.8472995783701232e-07, 8.3086036055888118e-07, 2.8464273601162125e-02, -7.0642255076744132e-01, -5.2009238883566685e-08, 8.6348330718758654e-08, -2.9664496252475970e-08, 1.4932776629316714e-09
  };

  std::vector<double> delta(n, 0.0);
  double dlam = 0.0;
  int info = 0;

  dlaed4_(&n, &i, d, z, delta.data(), &rho, &dlam, &info);

  std::cout << std::setprecision(17) << std::scientific;
  std::cout << "info = " << info << "\n";
  std::cout << "dlam = " << dlam << "\n";

  return (info == 0) ? 0 : 1;
}
```
The code returns `info = 1`, which means not converged. The maximum number of iterations in DLAED4 is 30; the algorithm requires 31 iterations. 

The eigenvalue to be found is very close to the left pole. There are several iterations where the Newton-like step overshoots and a fallback is taken. This happens for the first time in iteration 5. The step taken by the fallback turns out to be a bad step. The objective function value (`W` in the table) grows a lot and the remaining iterations do not suffice to correct this. Recall that this is a root finder, so `W` should go to zero. 

| iter | W | relative position |
| ---: | ---: | ---: |
| 3 | -15.183206754205958 | 9.3224424664962507E-012 |
| 4 | -7.2541345644250672 | 2.8321247865713130E-011 |
| 5 | 15845.035200581398 | 0.25000000001416062 |
| 6 | 7937.0973190230861 | 5.8695311144112583E-004 |
| 7 | 5292.0409126050627 | 2.9347656988118687E-004 |
| 8 | 2646.0775072154233 | 1.1736977032438319E-004 |
| 9 | 1443.4324592139662 | 5.8684899322815527E-005 |
| 10 | 721.72516385211111 | 2.7943564578650639E-005 |
| 11 | 369.27924724566680 | 1.3971796449949251E-005 |
| 12 | 184.64794882583919 | 6.9033154364403488E-006 |
| 13 | 92.881305918680368 | 3.4516718788441071E-006 |
| 14 | 46.448843381611326 | 1.7204602732982808E-006 |
| 15 | -2.9323820134484166 | 8.3977374357369141E-011 |
| 16 | -1.4556166910784307 | 1.8246643456796507E-010 |
| 17 | 23.277161269594469 | 8.6032136986642434E-007 |
| 18 | 11.646269558195501 | 4.2954626252879772E-007 |
| 19 | -0.91250965489136449 | 3.0284357148388565E-010 |
| 20 | -0.45016815654621373 | 6.3928690716673472E-010 |
| 21 | 5.8486784388543924 | 2.1509277471798222E-007 |
| 22 | 2.9300774463199879 | 1.0726528303886306E-007 |
| 23 | 1.4829047906270059 | 5.3952284973014894E-008 |
| 24 | 0.73997721204365996 | 2.6852690757367693E-008 |
| 25 | -0.35749940737873870 | 8.0525079151536220E-010 |
| 26 | -0.15628904239033362 | 1.6119748312661070E-009 |
| 27 | 7.0768362172978744E-002 | 4.8489293641088358E-009 |
| 28 | 1.1908566236531963E-002 | 3.6314105287554539E-009 |
| 29 | -3.2274931252138529E-004 | 3.4136194909394072E-009 |
| 30 | -2.3743868398512502E-007 | 3.4192031981775864E-009 |

The root cause of the slow convergence is the fallback to bisection.

https://github.com/Reference-LAPACK/lapack/blob/06f5ba3143280d6ccf7b4dff3c0761f1ea9479a2/SRC/dlaed4.f#L849-L856

As we see by the relative position in iteration 5, bisection is really bad in this case and moves the candidate point far away from the root. 

I would like to understand why bisection was chosen as the fallback.

* LAPACK 2.0 had a different update (I did not try to decipher what exactly it does) and goes into a cyclic infinite loop for this case.
* LAPACK 3.0 introduced bisection as fallback. This part has not been changed since.

Are there any release notes for LAPACK 3.0? Was there any other kind of dissemination, email lists etc, where I could find a hint?

	TEMP = TAU + ETA
	IF( TEMP.GT.DLTUB .OR. TEMP.LT.DLTLB ) THEN
	IF( W.LT.ZERO ) THEN
	ETA = ( DLTUB-TAU ) / TWO
	ELSE
	ETA = ( DLTLB-TAU ) / TWO
	END IF
	END IF

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Convergence failure in secular equation DLAED4 #1166

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

iter	W	relative position
3	-15.183206754205958	9.3224424664962507E-012
4	-7.2541345644250672	2.8321247865713130E-011
5	15845.035200581398	0.25000000001416062
6	7937.0973190230861	5.8695311144112583E-004
7	5292.0409126050627	2.9347656988118687E-004
8	2646.0775072154233	1.1736977032438319E-004
9	1443.4324592139662	5.8684899322815527E-005
10	721.72516385211111	2.7943564578650639E-005
11	369.27924724566680	1.3971796449949251E-005
12	184.64794882583919	6.9033154364403488E-006
13	92.881305918680368	3.4516718788441071E-006
14	46.448843381611326	1.7204602732982808E-006
15	-2.9323820134484166	8.3977374357369141E-011
16	-1.4556166910784307	1.8246643456796507E-010
17	23.277161269594469	8.6032136986642434E-007
18	11.646269558195501	4.2954626252879772E-007
19	-0.91250965489136449	3.0284357148388565E-010
20	-0.45016815654621373	6.3928690716673472E-010
21	5.8486784388543924	2.1509277471798222E-007
22	2.9300774463199879	1.0726528303886306E-007
23	1.4829047906270059	5.3952284973014894E-008
24	0.73997721204365996	2.6852690757367693E-008
25	-0.35749940737873870	8.0525079151536220E-010
26	-0.15628904239033362	1.6119748312661070E-009
27	7.0768362172978744E-002	4.8489293641088358E-009
28	1.1908566236531963E-002	3.6314105287554539E-009
29	-3.2274931252138529E-004	3.4136194909394072E-009
30	-2.3743868398512502E-007	3.4192031981775864E-009

Convergence failure in secular equation DLAED4 #1166

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions