Some time ago Pawel Dawidek ported Sun's ZFS to FreeBSD. ZFS has many interesting features.
One of the more problematic features on general purpose systems is the disk block caching in the ARC (Adjustable Replacement Cache). FreeBSD's native file systems do their caching by storing a file's pages at the appropriate offsets in the vm object associated with the file's vnode. Typically, unless a page is actively being written to or read from disk, i.e. it is associated with a buf, it can be taken away from the vm object by the pagedaemon and used somewhere else in the system. In ZFS's ARC, disk blocks are cached by their DVA (oversimplifying, but effectively their block offset on disk). I won't go into the rationale for this design choice, but it precludes us from using the existing mechanism for coupling file caching with the VM system. ARC buffers are allocated from "wired" memory, meaning that the pagedaemon cannot evict them and has no information about their usage relative to other pages in the system (i.e. whether a buffer could be released and its memory re-allocated to some other use).
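To make the contrast concrete, here is a minimal sketch of how a native file system's cached data for a given file offset is reachable through the vnode's vm object. This is illustrative only, not code from the FreeBSD tree; locking and page busying are simplified.

    #include <sys/param.h>
    #include <sys/lock.h>
    #include <sys/mutex.h>
    #include <sys/vnode.h>
    #include <vm/vm.h>
    #include <vm/vm_object.h>
    #include <vm/vm_page.h>

    /*
     * Illustrative sketch: a native file system's cached page for a file
     * offset lives in the vnode's vm object, indexed by page.  The
     * pagedaemon can see and reclaim these pages; wired, DVA-indexed ARC
     * buffers give it no such visibility.
     */
    static vm_page_t
    lookup_cached_file_page(struct vnode *vp, off_t offset)
    {
            vm_object_t obj = vp->v_object;
            vm_page_t m = NULL;

            if (obj != NULL) {
                    VM_OBJECT_LOCK(obj);
                    m = vm_page_lookup(obj, OFF_TO_IDX(offset));
                    /* Real code would busy or hold the page before unlocking. */
                    VM_OBJECT_UNLOCK(obj);
            }
            return (m);
    }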
As a result, the VM has no way to determine how "hot" a ZFS buffer is vis-à-vis other ZFS buffers or other parts of memory. The only influence the VM has over overall ARC usage is calling the lowmem handler to free some number of buffers; it has no way of weighing the relative priority of ZFS memory usage against user applications. A system administrator is left to decide a priori what the minimum ARC size is (the size below which memory pressure and the lowmem handler will not shrink it) and what the maximum size is (the size above which the ARC will not allocate further buffers). On a dedicated system one can set a large arc_max and arc_min; however, none of this yields graceful management of resources under mixed workloads.
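For reference, that one hook is shaped roughly like this: the port registers a handler on the VM's vm_lowmem event, and when memory gets tight the handler simply asks the ARC to shed buffers. This is a simplified sketch with illustrative names, not the actual arc.c code.

    #include <sys/param.h>
    #include <sys/eventhandler.h>
    #include <sys/kernel.h>

    /*
     * Simplified sketch of the VM's one lever over the ARC: a vm_lowmem
     * event handler.  It carries no information about which ARC buffers
     * are cold relative to other memory in the system.
     */
    static eventhandler_tag arc_lowmem_tag;

    static void
    arc_lowmem(void *arg __unused, int howto __unused)
    {
            /*
             * In the real code this flags the ARC reclaim thread to
             * evict buffers until the target size is reached.
             */
    }

    static void
    arc_lowmem_register(void)
    {
            arc_lowmem_tag = EVENTHANDLER_REGISTER(vm_lowmem, arc_lowmem,
                NULL, EVENTHANDLER_PRI_FIRST);
    }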
Ideally one could keep the default ARC settings small but still use all of memory for block caching if it isn't in use by other applications.
To this end I've added a caching layer between the memory allocation functions (zio_buf_alloc and zio_buf_free) and their consumers. Caching of buffer pages is limited by ZFS's rather idiosyncratic allocation patterns, so the bulk of allocations by count remain malloc-backed and are not eligible for page caching. Only the 128k buffer allocations tend to be size-aligned (the worst alignment I've seen in practice is 32k), which makes them practical to track by their alignment on disk. Unfortunately, the block offset usually isn't available at allocation time, so I allocate 128k buffers with the anonymous allocation function, geteblk(), and then synchronize with the vm object when I/O is done using the buffer.
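As a rough illustration of the shape of that layer (the wrapper name and bookkeeping here are my own sketch; only zio_buf_alloc and geteblk are the real interfaces involved), the 128k allocations get anonymous buf-backed pages while everything else stays on the existing path.

    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/buf.h>

    #define ZIO_CACHED_BUFSIZE      (128 * 1024)

    extern void *zio_buf_alloc(size_t size);    /* existing ZFS allocator */

    /*
     * Hypothetical sketch of the caching layer interposed between
     * zio_buf_alloc()/zio_buf_free() and their consumers.  Only 128k
     * buffers are allocated anonymously via geteblk(); smaller
     * allocations remain malloc-backed.  The bookkeeping needed to map
     * the returned pointer back to its buf on free is omitted.
     */
    void *
    zio_buf_alloc_cached(size_t size)
    {
            struct buf *bp;

            if (size != ZIO_CACHED_BUFSIZE)
                    return (zio_buf_alloc(size));

            /*
             * The block offset isn't known yet, so the buffer is
             * anonymous; its pages are reconciled with the vm object
             * later, when the I/O is issued (see zio_sync_cache()).
             * Depending on the FreeBSD version, geteblk() may also take
             * a flags argument.
             */
            bp = geteblk(ZIO_CACHED_BUFSIZE);
            return (bp->b_data);
    }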
All I/O traverses zio_create, so when a read or write is issued to a top-level logical device the I/O is synchronized with the cache in the new function zio_sync_cache().
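Only the name zio_sync_cache comes from the change itself; the argument list and page walk below are a hypothetical illustration of what that synchronization amounts to: once the block offset is known, walk the corresponding page range of the device's vm object and reconcile it with the 128k buffer.

    #include <sys/param.h>
    #include <sys/lock.h>
    #include <sys/mutex.h>
    #include <vm/vm.h>
    #include <vm/vm_object.h>
    #include <vm/vm_page.h>

    /*
     * Hypothetical sketch of the synchronization step performed when a
     * read or write zio reaches a top-level logical device.  The actual
     * copy-in/copy-out and page insertion are elided.
     */
    static void
    zio_sync_cache(vm_object_t obj, uint64_t blkoff, void *data __unused,
        size_t size, int is_write)
    {
            vm_pindex_t first = OFF_TO_IDX(blkoff);
            vm_pindex_t npages = OFF_TO_IDX(round_page(size));
            vm_pindex_t i;

            VM_OBJECT_LOCK(obj);
            for (i = 0; i < npages; i++) {
                    vm_page_t m = vm_page_lookup(obj, first + i);

                    if (is_write || m == NULL) {
                            /*
                             * Write, or read miss: (re)populate the cached
                             * page for this disk offset from the buffer so
                             * the VM now tracks the block's usage.
                             */
                            continue;
                    }
                    /*
                     * Read hit: the cached page can satisfy this part of
                     * the I/O without going to disk.
                     */
            }
            VM_OBJECT_UNLOCK(obj);
    }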
more later ...