Tuesday, December 10, 2013

Resolving an "out of memory" situation on ZFS import, without adding actual DRAM

This entry has several screenshots with Japanese text, and you may not be able to read them. My apologies for the inconvenience.


You won't go to heaven once you have enabled the deduplication feature of ZFS.


When dedup is set to "on", ZFS keeps the DDT (deduplication table), which tracks data blocks, both in the ARC and on the pool's media. This generally works well for reads and writes, but tragedy strikes when you try to delete a large amount of deduplicated data. ZFS may stall if the system doesn't have enough DRAM to hold the DDT. Sometimes even deleting 1GB of files on a system with 8GB of DRAM can be challenging.
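To get a feel for why the DDT can outgrow DRAM, here is a minimal back-of-the-envelope sketch in Python. It assumes roughly 320 bytes of memory per DDT entry, which is a commonly quoted rule of thumb and not a number measured on my system; the function name and the 1 TiB example pool are hypothetical and only for illustration.

```python
# Rough estimate of the RAM needed to hold a ZFS dedup table (DDT).
# ASSUMPTION: ~320 bytes of memory per DDT entry is a commonly quoted rule
# of thumb, not a figure from this post; the real size varies by release.
BYTES_PER_DDT_ENTRY = 320  # assumed rule-of-thumb value

GIB = 1024 ** 3


def estimate_ddt_bytes(unique_data_bytes, avg_block_bytes,
                       bytes_per_entry=BYTES_PER_DDT_ENTRY):
    """Return an approximate in-memory DDT size in bytes.

    One DDT entry is assumed per unique block, so the entry count is the
    unique data size divided by the average block size, rounded up.
    """
    if avg_block_bytes <= 0:
        raise ValueError("avg_block_bytes must be positive")
    entries = -(-unique_data_bytes // avg_block_bytes)  # ceiling division
    return entries * bytes_per_entry


if __name__ == "__main__":
    # Hypothetical pool: 1 TiB of unique data, in 128 KiB and 8 KiB blocks.
    for block in (128 * 1024, 8 * 1024):
        size = estimate_ddt_bytes(1024 * GIB, block)
        print(f"{block // 1024:>4} KiB blocks: ~{size / GIB:.2f} GiB of DDT")
```

Under these assumptions, smaller blocks mean many more entries, so a pool full of small blocks needs far more memory for its DDT than the same amount of data stored in large blocks.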


Even when ZFS has stopped due to out-of-memory, rebooting the system usually works for me. The pool is imported, the partial transaction is completed, and the volume becomes available. I have experienced this kind of issue several dozen times(!), but I've never lost any data on ZFS, and I believe ZFS is still reliable.


And... it happened again....


The other day I ran "zfs destroy array2/backup" to free an unneeded filesystem holding 4GB of data, but it caused an out-of-memory condition and stopped ZFS again. As I mentioned, rebooting the system should be the remedy, but not this time.


"Argh!
I will lose my data!"


Anyway I tried to recover from this situation...


■ Update OpenIndiana system

I had been running the OpenIndiana 151a7 kernel for a while. I believe this kernel has a ZFS issue that causes a deadlock when ZFS is starved of free memory. This is just my guess, but updating the kernel might help, so I updated the OpenIndiana packages with the pkg command.

■ Over-commit the vRAM using ESXi

My first ZFS-based system ran in an environment with less than 4GB of DRAM. Today I allocate 16GB of DRAM to the guest on ESXi, but that seems not to be enough for this situation. In the end I decided to over-commit DRAM as far as I could, because the system touches lots of memory but I didn't see much random access across the entire memory region.




Yes, I gave it 32GB of memory (this host actually has only 24GB). Fortunately, it is quite easy to configure vRAM over-commit on ESXi. The memory size can be set to 32GB even if the host has only 8GB or 16GB of physical memory. The paging file is on a Fusion-io ioDrive2 flash tier.

■ Imported successfully

I had some confidence that additional memory would resolve this, and my guess was right: the pool was finally imported successfully. According to ESXi, the OpenIndiana VM consumed 18.91GB of vRAM.
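The numbers in this post can be put side by side with a small Python sketch. It only does arithmetic on the figures reported here; the function name is hypothetical, and it ignores ESXi's own memory overhead and any other guests on the host, so treat it as a rough illustration rather than an accurate accounting.

```python
# Figures reported in this post, in GB. ESXi overhead and other guests
# on the same host are ignored (a simplifying assumption).
HOST_PHYSICAL_GB = 24.0
VM_CONFIGURED_GB = 32.0
VM_PREVIOUS_GB = 16.0
VM_PEAK_CONSUMED_GB = 18.91


def overcommit_summary(host_gb, configured_gb, previous_gb, peak_gb):
    """Summarize how far the guest was over-committed and what it used."""
    return {
        # Configured guest memory relative to physical host memory.
        "overcommit_ratio": configured_gb / host_gb,
        # Guest memory that has no physical backing on the host.
        "overcommitted_gb": max(0.0, configured_gb - host_gb),
        # How much more the import needed than the old allocation.
        "needed_beyond_previous_gb": max(0.0, peak_gb - previous_gb),
        # Whether the peak could, in principle, stay in physical RAM.
        "peak_fits_in_host_ram": peak_gb <= host_gb,
    }


if __name__ == "__main__":
    summary = overcommit_summary(HOST_PHYSICAL_GB, VM_CONFIGURED_GB,
                                 VM_PREVIOUS_GB, VM_PEAK_CONSUMED_GB)
    for key, value in summary.items():
        print(f"{key}: {value}")
```

Seen this way, the import needed only about 3GB more than the 16GB the guest had before, and the peak was still below the 24GB of physical memory in the host.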



Anyway, once the ZFS pool has been imported successfully, it doesn't need that much vRAM anymore. It required only 1.9GB of vRAM after I restarted the guest.



■ ZFS Dedupe? ... No...

I now believe we should avoid using the deduplication feature of ZFS; otherwise you will run into difficulty when deleting data from the pool.



I've struggled with ZFS dedupe several times, and I'm going to build a new ZFS pool with dedup=ABSOLUTELY off.
