View Issue Details
| ID | Project | Category | View Status | Date Submitted | Last Update |
|---|---|---|---|---|---|
| 0016208 | Buildsys | Ci.centos.org Ecosystem Testing | public | 2019-06-22 14:16 | 2019-09-04 14:51 |

Summary: 0016208: cockpit-images volume (os-pv-100gi-00000002) inexplicably full
In the "cockpit" OpenShift project we use a persistent volume (https://console.apps.ci.centos.org:8443/console/project/cockpit/browse/persistentvolumeclaims/cockpit-images) as VM image cache. This is now shown as full:
172.22.6.19:/exports/os-pv-100gi-00000002 187G 187G 1.0M 100% /cache/images
and regularly leads to ENOSPC. However, the actual files on this device only amount to 30 GB:
$ du -hsc /cache/images/
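For reference, the gap can be quantified directly. A small helper (hypothetical, nothing specific to this setup) that subtracts what `du` can account for from what `df` reports as used; a large positive result points at space the client can't see, such as deleted-but-still-open files or server-side allocation:

```shell
unaccounted() {
    # Bytes df reports as used, minus bytes du can actually find
    mnt=$1
    df_used=$(df --output=used -B1 "$mnt" | tail -n1)
    du_used=$(du -sxB1 "$mnt" 2>/dev/null | cut -f1)
    echo $((df_used - du_used))
}
# e.g. on the affected mount: unaccounted /cache/images
```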
Yesterday I already tried to shut down all containers that mount this volume, just in case there are processes holding open fds to deleted files, but that didn't help at all. There are no hidden files on this device either. At this point I don't know what I could do on my end.
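For what it's worth, the "open fd to a deleted file" mechanism is easy to reproduce locally (a minimal sketch on a local filesystem, nothing specific to this volume): the blocks stay allocated and counted by `df`, but `du` no longer sees the file. Over NFS the picture differs slightly, since a client-side unlink of an open file usually leaves a visible `.nfsXXXX` silly-rename file instead, which would have shown up in `ls -la`.

```shell
tmp=$(mktemp -d)
dd if=/dev/zero of="$tmp/blob" bs=1M count=10 status=none
exec 3<"$tmp/blob"   # keep an fd open...
rm "$tmp/blob"       # ...then unlink the file
du -sh "$tmp"        # near zero: du can no longer see the 10 MB
ls -l /proc/$$/fd/3  # but the kernel still shows the target as "(deleted)"
exec 3<&-            # only closing the fd releases the blocks
rm -rf "$tmp"
```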
Is there something obviously wrong on the NFS server side?
In case repairing the volume isn't easy: It wouldn't be a problem to entirely delete, or remove/re-add the PV, and re-populate it. I just copied all files to our other (Red Hat internal) store.
Tags: No tags attached.
Odd: today this looks different, even though nothing substantial changed on our end since I filed this 4 days ago (aside from shutting down all pods that work with the images):
172.22.6.19:/exports/os-pv-100gi-00000002 187G 143G 44G 77% /cache/images
I re-enabled the pods for now, as it no longer seems dangerously close to ENOSPC. However, we still only allocate ~30 GB, not 143.
This is still an issue today:
187G 171G 16G 92% /cache/images
$ du -hs /cache/images/
I have now tried completely emptying the volume's file system:
$ ls -la /cache/images/
drwxrwxrwx. 2 nobody nobody 6 Sep 4 14:45 .
drwxrwsrwx. 4 root 1000190000 34 Sep 4 09:13 ..
$ du -hs /cache/images/
$ df /cache/images/
Filesystem 1K-blocks Used Available Use% Mounted on
172.22.6.19:/exports/os-pv-100gi-00000002 195265536 142974976 52290560 74% /cache/images
I then shut down all pods that use that volume, just in case they were holding open fds to deleted files there. But that didn't help either.
So there really is some inexplicable 137 G of usage that I just can't get rid of. The fun thing is that this is supposed to be a 100 GiB PV, not 187 G. 100 would be enough for us, but 50 is just too small: we need some 70 GiB for our files.
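Since the pods (the NFS clients) are ruled out, one thing that could be checked on the NFS server itself (assuming 172.22.6.19 exports this from a local filesystem) is whether open-but-deleted files there account for the gap. A small sketch that sums them up, to be run as root on the server:

```shell
# Walk every process's open fds and total the sizes of files that
# have been unlinked but are still held open (their blocks are
# counted by df, but invisible to du and to NFS clients).
total=0
for fd in /proc/[0-9]*/fd/*; do
    target=$(readlink "$fd" 2>/dev/null) || continue
    case "$target" in
        *'(deleted)') size=$(stat -Lc %s "$fd" 2>/dev/null) || continue
                      total=$((total + size)) ;;
    esac
done
echo "open-but-deleted bytes: $total"
```

`lsof +L1` on the exported path would give the same information per process, if lsof is available there.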