Cecilia Data AB - How to recover from type-mismatch gluster split-brain problem

Details: Category: Blog; Published: Wednesday, 19 January 2022 16:47; Written by Lars Berntzon

According to all the documentation I have read, it is not possible to recover from a GlusterFS split-brain situation when the problem is a "Type-mismatch". You can see this in the self-heal daemon log, /var/log/glusterfs/glustershd.log, for instance in lines like:

[2022-01-18 23:06:34.791698] E [MSGID: 108008] [afr-self-heal-common.c:385:afr_gfid_split_brain_source] 13-gluster-normal-px2jenkins-replicate-0: Gfid mismatch detected for <gfid:4c21e24c-1d22-4a12-a139-8a515f4b8d13>/build.xml>, 1d4f68e1-1a11-474d-8712-55ab8164d06f on my-volume-client-1 and cc8057e4-4395-411b-8315-34d941d1acae on my-volume-client-0.
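
To get an overview of all such messages, the log can be filtered. This is only a minimal sketch, assuming the messages have exactly the format of the example line above. The function name gfid_mismatches is made up for illustration.

```shell
# Print one line per "Gfid mismatch" message: the entry, then client=gfid
# for both sides. Assumes the message format of the example line above;
# gfid_mismatches is a hypothetical helper name.
gfid_mismatches() {
    # $1: path to the log, e.g. /var/log/glusterfs/glustershd.log
    sed -n 's/.*Gfid mismatch detected for \(.*\), \([0-9a-f-]*\) on \([^ ]*\) and \([0-9a-f-]*\) on \([^ ]*\)\.$/\1 \3=\2 \5=\4/p' "$1"
}

# Usage on a server:
#   gfid_mismatches /var/log/glusterfs/glustershd.log | sort -u
```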

The symptom the user experiences is that access to some files fails with: "endpoint not connected"

The only suggestion I found was to recreate the volumes completely.

I could however recover by following this procedure (using the volume "my-volume"):

Find the files with problems

First run:

gluster volume heal my-volume info

This gives you a list of possible victims, one list per brick, like:

Brick server11:/srv/gluster-normal/my-volume
Status: Connected
Number of entries: 0

Brick server19:/srv/gluster-normal/my-volume
<gfid:f083d990-71a6-40b7-8106-b058cb2acbcb>
/problem-file
Status: Connected
Number of entries: 2

Brick server15:/srv/gluster-normal/my-volume
<gfid:f083d990-71a6-40b7-8106-b058cb2acbcb>
/problem-file
/jobs/testrunner/builds/.487462 
Status: Connected
Number of entries: 2

This list might change if you run the command several times, because the self-healing daemon is running. But if you see a file that is constantly there, like the file you had the initial problem with, continue and check the GFID of that file on all bricks.
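
One way to spot the entries that stay in the list is to save the output twice and compare. This is a rough sketch, assuming the output looks like the example above, where entries start with "/" or "<gfid:". The helper names heal_entries and persistent_entries and the /tmp file names are made up for illustration.

```shell
# Extract the entry lines (paths and <gfid:...> items) from a saved copy of
# the heal info output. heal_entries is a hypothetical helper name.
heal_entries() {
    grep -E '^(/|<gfid:)' "$1" | sed 's/[[:space:]]*$//' | sort -u
}

# Print the entries that are present in both of two saved snapshots.
persistent_entries() {
    a=$(mktemp) && b=$(mktemp) || return 1
    heal_entries "$1" > "$a"
    heal_entries "$2" > "$b"
    comm -12 "$a" "$b"
    rm -f "$a" "$b"
}

# Usage on a server:
#   gluster volume heal my-volume info > /tmp/heal.1
#   sleep 60
#   gluster volume heal my-volume info > /tmp/heal.2
#   persistent_entries /tmp/heal.1 /tmp/heal.2
```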

Check GFID on proper server

Log in to each brick of the volume and check the GFID of the file to correct. The GFID is the Gluster-internal ID of a file, a bit like an inode number. Start with the brick that reports no problems for this file.

On server11:

getfattr -e hex -n trusted.gfid /srv/gluster-normal/my-volume/problem-file
# file: /srv/gluster-normal/my-volume/problem-file
trusted.gfid=0x4c9d33a1fa064ccea30e0295807de94b

This is the proper ID. Next, run the same check on the failing bricks.
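
getfattr prints the GFID as a plain hex string, while the log and the heal info output use the dashed UUID form. A small conversion makes the two easier to compare. This is just a sketch, and gfid_hex_to_uuid is a made-up helper name.

```shell
# Convert a GFID as printed by "getfattr -e hex" (0x + 32 hex digits)
# to the dashed 8-4-4-4-12 form used in the log and in heal info.
# gfid_hex_to_uuid is a hypothetical helper name.
gfid_hex_to_uuid() {
    printf '%s\n' "${1#0x}" |
        sed 's/^\(.\{8\}\)\(.\{4\}\)\(.\{4\}\)\(.\{4\}\)\(.\{12\}\)$/\1-\2-\3-\4-\5/'
}

# Usage:
#   gfid_hex_to_uuid 0x4c9d33a1fa064ccea30e0295807de94b
```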

Correct the GFID on broken servers

On server19 first check that there is indeed a mismatch:

cd /srv/gluster-normal/my-volume
getfattr -e hex -n trusted.gfid problem-file
# file: /srv/gluster-normal/my-volume/problem-file
trusted.gfid=0x8ae8de84bf1c49c1ba622b38d80579af

The GFID differs from the proper one, so this is the first problem. Correct it with:

cd /srv/gluster-normal/my-volume
setfattr -n trusted.gfid -v 0x4c9d33a1fa064ccea30e0295807de94b problem-file

Repeat this correction on every server where the GFID of the file does not match.
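
When there are many files to go through, it can help to have the command printed for review instead of typing it each time. This is a minimal sketch. The helper name plan_gfid_fix is made up, it only prints and changes nothing, and it assumes paths without whitespace.

```shell
# Print the setfattr command needed to give a file the proper GFID, or
# print nothing when the GFID already matches. Nothing is modified here;
# review the output and run it yourself in the brick directory.
# plan_gfid_fix is a hypothetical helper name.
plan_gfid_fix() {
    good=$1      # GFID from the healthy brick, as printed by getfattr
    current=$2   # GFID found on this brick
    file=$3      # path relative to the brick directory (no whitespace assumed)
    if [ "$good" != "$current" ]; then
        printf 'setfattr -n trusted.gfid -v %s %s\n' "$good" "$file"
    fi
}

# Usage with the values from this article:
#   plan_gfid_fix 0x4c9d33a1fa064ccea30e0295807de94b \
#                 0x8ae8de84bf1c49c1ba622b38d80579af problem-file
```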

Note! If a directory has mismatching GFIDs, it must be corrected before any files below it.
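
To work through a list of paths in an order that respects this, the paths can be sorted so that the shallowest come first. This is a small sketch. order_by_depth is a made-up helper name, and it only sorts by the number of path components.

```shell
# Read paths on stdin and print them with the shallowest first, so that a
# directory always comes before the entries below it.
# order_by_depth is a hypothetical helper name.
order_by_depth() {
    awk -F/ '{ print NF "\t" $0 }' | LC_ALL=C sort -n | cut -f2-
}

# Usage:
#   printf '%s\n' /jobs/testrunner/builds /jobs /jobs/testrunner | order_by_depth
```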

Remove files not known

Entries like <gfid:c12161df-cdb8-4605-9381-3e9da7e458ba> in the "gluster v heal my-volume info" output usually mean a missing file, which can be removed on all bricks that warn about it. The file is then located at <brick-dir>/.glusterfs/c1/21/c12161df-cdb8-4605-9381-3e9da7e458ba and is a symbolic link to the proper path (for regular files, GlusterFS normally keeps a hard link there instead). Both can be removed normally, but make sure it is OK to delete them.
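
The location under .glusterfs follows from the GFID itself: the first two characters, then the next two, then the full GFID. This is a sketch of that mapping. gfid_backend_path is a made-up helper name, and it only prints the path without touching anything.

```shell
# Print where a GFID lives under a brick's .glusterfs directory.
# Accepts either "<gfid:...>" as shown by heal info or the bare GFID.
# gfid_backend_path is a hypothetical helper name.
gfid_backend_path() {
    brick=$1
    gfid=$(printf '%s\n' "$2" | sed 's/^<gfid:\(.*\)>$/\1/')
    d1=$(printf '%s\n' "$gfid" | cut -c1-2)
    d2=$(printf '%s\n' "$gfid" | cut -c3-4)
    printf '%s/.glusterfs/%s/%s/%s\n' "$brick" "$d1" "$d2" "$gfid"
}

# Usage: inspect the entry before deciding whether to delete anything
#   ls -l "$(gfid_backend_path /srv/gluster-normal/my-volume '<gfid:c12161df-cdb8-4605-9381-3e9da7e458ba>')"
```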

Some tips

If the list of non-healed files still persists, verify whether you can access them through a FUSE-mounted directory. If so, try copying them to a temporary file and then copying them back, so that the Gluster replicas are updated.
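
The copy-out-and-back step could look something like this. It is only a sketch: rewrite_in_place is a made-up helper name, and the path in the usage line assumes the volume is FUSE-mounted at /mnt/my-volume, which is my assumption and not something given above.

```shell
# Copy a file to a temporary file and back again, keeping mode and times,
# so that the file is rewritten through the mount.
# rewrite_in_place is a hypothetical helper name.
rewrite_in_place() {
    file=$1
    tmp=$(mktemp) || return 1
    if cp -p "$file" "$tmp" && cp -p "$tmp" "$file"; then
        rc=0
    else
        rc=1
    fi
    rm -f "$tmp"
    return "$rc"
}

# Usage (mount point is an assumption):
#   rewrite_in_place /mnt/my-volume/problem-file
```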

Keep cleaning out broken GFIDs, or possibly deleting broken files, until "gluster v heal my-volume info" returns 0 entries (for a quiet filesystem, that is).
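
To check quickly whether you are down to zero, the "Number of entries" lines can be summed up. This is a small sketch, assuming the output format shown earlier. total_heal_entries is a made-up helper name.

```shell
# Sum all "Number of entries: N" lines from heal info output on stdin.
# Non-numeric values count as 0. total_heal_entries is a hypothetical name.
total_heal_entries() {
    awk -F': *' '/^Number of entries:/ { sum += $2 } END { print sum + 0 }'
}

# Usage on a server:
#   gluster volume heal my-volume info | total_heal_entries
```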
