Fix initial format namenode corruption: Disable restart-controller

Since the restart controller was enabled in https://github.com/stackabletech/hdfs-operator/pull/743, we have seen more and more test failures caused by namenodes not coming up.

## Description

The problem is the format-namenode init container, which checks whether the file `/stackable/data/namenode/current/VERSION` exists.
If it does, we assume that everything is formatted properly and do not format the node.

In the failed tests, the `/stackable/data/namenode/current/VERSION` file was there, but the directory was missing other important files like `fsimage_xxx` that a correct format would have created. This leads to the fsImage not found error in the tests.
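The three states described above can be told apart with a small check. This is a minimal sketch in Python, not the init container's actual script. The function name `classify_namenode_dir` and the state labels are hypothetical, and it assumes that any file whose name starts with `fsimage_` counts as an fsimage.

```python
from pathlib import Path


def classify_namenode_dir(current_dir: Path) -> str:
    """Classify a namenode `current` directory.

    Returns "unformatted", "formatted", or "partial".
    """
    has_version = (current_dir / "VERSION").is_file()
    # Assumption: any file named fsimage_* counts as an fsimage.
    has_fsimage = any(p.is_file() for p in current_dir.glob("fsimage_*"))

    if not has_version:
        # Mirrors the existing check: no VERSION file means "not formatted".
        return "unformatted"
    if has_fsimage:
        return "formatted"
    # VERSION without any fsimage: the state seen in the failed tests.
    return "partial"
```

Calling `classify_namenode_dir(Path("/stackable/data/namenode/current"))` would return `"partial"` for the broken directories described here, which a check on `VERSION` alone reports as formatted.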

We think that the increased number of restarts, partially introduced by the restart-controller (which only makes the problem appear more often; it existed before), is one of the factors that might interfere with proper namenode formatting, or might interrupt an ongoing format and leave a corrupt data directory.

We identified one restart that reliably happens: the HDFS ZNode ConfigMap (!) is applied AFTER the HDFS cluster. This is the case in most product tests, but e.g. not in the HDFS smoke test itself (previously it didn't use any ZNode). It results in a restart of the journalnodes and namenodes.

We then tried to check whether any `fsimage_xxx` files exist at all, and to skip the formatting in the format-namenode script only if they do.

But the format command itself checks whether the storage directory already contains data, and refuses to format in non-interactive mode if it does:
```
Running in non-interactive mode, and data appears to exist in Storage Directory root= /stackable/data/namenode; location= null. Not formatting.
```
https://github.com/apache/hadoop/blob/fd18f0647dc1cae191925ac0f593b5962effc583/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/Storage.java#L1191
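The effect of that message can be modeled roughly as follows. This is a simplified sketch inferred from the log line above, not a port of Hadoop's `Storage.java`. Both function names are hypothetical, and the handling of a non-directory path is an assumption.

```python
from pathlib import Path


def data_appears_to_exist(storage_root: Path) -> bool:
    """Simplified model: True if the storage root exists and is not empty."""
    if not storage_root.exists():
        return False
    if not storage_root.is_dir():
        # Assumption: a non-directory at this path counts as existing data.
        return True
    return any(storage_root.iterdir())


def non_interactive_format_runs(storage_root: Path) -> bool:
    """A non-interactive format only proceeds if no data appears to exist."""
    return not data_appears_to_exist(storage_root)
```

Under this model, a storage root that holds only `current/VERSION` is already enough to make the format a no-op, so improving the check in the init container alone does not repair a partially formatted directory.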

TLDR: The problem exists only for fresh clusters (which is good), and we have a concept of a plan to solve it!

We thought about a couple of options for the case where the `VERSION` file exists but no fsimage does.

1. Error out, never start the namenode, and let administrators clean up that PVC / folder? This blocks the cluster.
2. Move the `/stackable/data/namenode/current/` directory to something like `/stackable/data/namenode/current_bk/` and restart the formatting, therefore not losing any existing data and not blocking the cluster (but cluttering up the PVC)?
3. Force format (CLI option), since there are no fsimages and so no data can be lost (not too sure about this one)?
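To make the trade-off between Options 1 and 2 concrete, here is a minimal sketch of both. All names (`PartialFormatError`, `error_out`, `move_aside`) are hypothetical, the `_bk` suffix follows the example path above, and refusing to overwrite an earlier backup is an assumption. Option 3 is left out because it would invoke the Hadoop CLI.

```python
from pathlib import Path


class PartialFormatError(RuntimeError):
    """Raised when VERSION exists but no fsimage file does (Option 1)."""


def error_out(current_dir: Path) -> None:
    # Option 1: refuse to continue and leave cleanup to an administrator.
    raise PartialFormatError(
        f"{current_dir} contains VERSION but no fsimage; manual cleanup required"
    )


def move_aside(current_dir: Path) -> Path:
    # Option 2: keep the old data under a backup name so formatting can rerun.
    backup = current_dir.with_name(current_dir.name + "_bk")
    if backup.exists():
        # Assumption: never overwrite an earlier backup.
        raise FileExistsError(f"backup directory {backup} already exists")
    current_dir.rename(backup)
    return backup
```

Option 1 never touches the data, at the price of a blocked cluster. Option 2 unblocks the cluster, but every failed format leaves another directory on the PVC.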

## Tasks

We decided on Option 1.
- [x] Disable restart-controller
- [x] Emit a warning if the `VERSION` file exists but e.g. no `fsimage_xxx` file does
- [x] Document the behavior and add a troubleshoot guide

