|
@@ -0,0 +1,94 @@
|
|
|
+Split page table lock
|
|
|
+=====================
|
|
|
+
|
|
|
+Originally, mm->page_table_lock spinlock protected all page tables of the
|
|
|
+mm_struct. But this approach leads to poor page fault scalability of
|
|
|
+multi-threaded applications due high contention on the lock. To improve
|
|
|
+scalability, split page table lock was introduced.
|
|
|
+
|
|
|
+With split page table lock we have separate per-table lock to serialize
|
|
|
+access to the table. At the moment we use split lock for PTE and PMD
|
|
|
+tables. Access to higher level tables protected by mm->page_table_lock.
|
|
|
+
|
|
|
+There are helpers to lock/unlock a table and other accessor functions:
|
|
|
+ - pte_offset_map_lock()
|
|
|
+ maps pte and takes PTE table lock, returns pointer to the taken
|
|
|
+ lock;
|
|
|
+ - pte_unmap_unlock()
|
|
|
+ unlocks and unmaps PTE table;
|
|
|
+ - pte_alloc_map_lock()
|
|
|
+ allocates PTE table if needed and take the lock, returns pointer
|
|
|
+ to taken lock or NULL if allocation failed;
|
|
|
+ - pte_lockptr()
|
|
|
+ returns pointer to PTE table lock;
|
|
|
+ - pmd_lock()
|
|
|
+ takes PMD table lock, returns pointer to taken lock;
|
|
|
+ - pmd_lockptr()
|
|
|
+ returns pointer to PMD table lock;
|
|
|
+
|
|
|
+Split page table lock for PTE tables is enabled compile-time if
|
|
|
+CONFIG_SPLIT_PTLOCK_CPUS (usually 4) is less or equal to NR_CPUS.
|
|
|
+If split lock is disabled, all tables guaded by mm->page_table_lock.
|
|
|
+
|
|
|
+Split page table lock for PMD tables is enabled, if it's enabled for PTE
|
|
|
+tables and the architecture supports it (see below).
|
|
|
+
|
|
|
+Hugetlb and split page table lock
|
|
|
+---------------------------------
|
|
|
+
|
|
|
+Hugetlb can support several page sizes. We use split lock only for PMD
|
|
|
+level, but not for PUD.
|
|
|
+
|
|
|
+Hugetlb-specific helpers:
|
|
|
+ - huge_pte_lock()
|
|
|
+ takes pmd split lock for PMD_SIZE page, mm->page_table_lock
|
|
|
+ otherwise;
|
|
|
+ - huge_pte_lockptr()
|
|
|
+ returns pointer to table lock;
|
|
|
+
|
|
|
+Support of split page table lock by an architecture
|
|
|
+---------------------------------------------------
|
|
|
+
|
|
|
+There's no need in special enabling of PTE split page table lock:
|
|
|
+everything required is done by pgtable_page_ctor() and pgtable_page_dtor(),
|
|
|
+which must be called on PTE table allocation / freeing.
|
|
|
+
|
|
|
+Make sure the architecture doesn't use slab allocator for page table
|
|
|
+allocation: slab uses page->slab_cache and page->first_page for its pages.
|
|
|
+These fields share storage with page->ptl.
|
|
|
+
|
|
|
+PMD split lock only makes sense if you have more than two page table
|
|
|
+levels.
|
|
|
+
|
|
|
+PMD split lock enabling requires pgtable_pmd_page_ctor() call on PMD table
|
|
|
+allocation and pgtable_pmd_page_dtor() on freeing.
|
|
|
+
|
|
|
+Allocation usually happens in pmd_alloc_one(), freeing in pmd_free(), but
|
|
|
+make sure you cover all PMD table allocation / freeing paths: i.e X86_PAE
|
|
|
+preallocate few PMDs on pgd_alloc().
|
|
|
+
|
|
|
+With everything in place you can set CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK.
|
|
|
+
|
|
|
+NOTE: pgtable_page_ctor() and pgtable_pmd_page_ctor() can fail -- it must
|
|
|
+be handled properly.
|
|
|
+
|
|
|
+page->ptl
|
|
|
+---------
|
|
|
+
|
|
|
+page->ptl is used to access split page table lock, where 'page' is struct
|
|
|
+page of page containing the table. It shares storage with page->private
|
|
|
+(and few other fields in union).
|
|
|
+
|
|
|
+To avoid increasing size of struct page and have best performance, we use a
|
|
|
+trick:
|
|
|
+ - if spinlock_t fits into long, we use page->ptr as spinlock, so we
|
|
|
+ can avoid indirect access and save a cache line.
|
|
|
+ - if size of spinlock_t is bigger then size of long, we use page->ptl as
|
|
|
+ pointer to spinlock_t and allocate it dynamically. This allows to use
|
|
|
+ split lock with enabled DEBUG_SPINLOCK or DEBUG_LOCK_ALLOC, but costs
|
|
|
+ one more cache line for indirect access;
|
|
|
+
|
|
|
+The spinlock_t allocated in pgtable_page_ctor() for PTE table and in
|
|
|
+pgtable_pmd_page_ctor() for PMD table.
|
|
|
+
|
|
|
+Please, never access page->ptl directly -- use appropriate helper.
|