123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225 |
- Fault injection capabilities infrastructure
- ===========================================
- See also drivers/md/faulty.c and "every_nth" module option for scsi_debug.
- Available fault injection capabilities
- --------------------------------------
- o failslab
- injects slab allocation failures. (kmalloc(), kmem_cache_alloc(), ...)
- o fail_page_alloc
- injects page allocation failures. (alloc_pages(), get_free_pages(), ...)
- o fail_make_request
- injects disk IO errors on devices permitted by setting
- /sys/block/<device>/make-it-fail or
- /sys/block/<device>/<partition>/make-it-fail. (generic_make_request())
- Configure fault-injection capabilities behavior
- -----------------------------------------------
- o debugfs entries
- fault-inject-debugfs kernel module provides some debugfs entries for runtime
- configuration of fault-injection capabilities.
- - /debug/fail*/probability:
- likelihood of failure injection, in percent.
- Format: <percent>
- Note that one-failure-per-hundred is a very high error rate
- for some testcases. Consider setting probability=100 and configure
- /debug/fail*/interval for such testcases.
- - /debug/fail*/interval:
- specifies the interval between failures, for calls to
- should_fail() that pass all the other tests.
- Note that if you enable this, by setting interval>1, you will
- probably want to set probability=100.
- - /debug/fail*/times:
- specifies how many times failures may happen at most.
- A value of -1 means "no limit".
- - /debug/fail*/space:
- specifies an initial resource "budget", decremented by "size"
- on each call to should_fail(,size). Failure injection is
- suppressed until "space" reaches zero.
- - /debug/fail*/verbose
- Format: { 0 | 1 | 2 }
- specifies the verbosity of the messages when failure is
- injected. '0' means no messages; '1' will print only a single
- log line per failure; '2' will print a call trace too -- useful
- to debug the problems revealed by fault injection.
- - /debug/fail*/task-filter:
- Format: { 'Y' | 'N' }
- A value of 'N' disables filtering by process (default).
- Any positive value limits failures to only processes indicated by
- /proc/<pid>/make-it-fail==1.
- - /debug/fail*/require-start:
- - /debug/fail*/require-end:
- - /debug/fail*/reject-start:
- - /debug/fail*/reject-end:
- specifies the range of virtual addresses tested during
- stacktrace walking. Failure is injected only if some caller
- in the walked stacktrace lies within the required range, and
- none lies within the rejected range.
- Default required range is [0,ULONG_MAX) (whole of virtual address space).
- Default rejected range is [0,0).
- - /debug/fail*/stacktrace-depth:
- specifies the maximum stacktrace depth walked during search
- for a caller within [require-start,require-end) OR
- [reject-start,reject-end).
- - /debug/fail_page_alloc/ignore-gfp-highmem:
- Format: { 'Y' | 'N' }
- default is 'N', setting it to 'Y' won't inject failures into
- highmem/user allocations.
- - /debug/failslab/ignore-gfp-wait:
- - /debug/fail_page_alloc/ignore-gfp-wait:
- Format: { 'Y' | 'N' }
- default is 'N', setting it to 'Y' will inject failures
- only into non-sleep allocations (GFP_ATOMIC allocations).
- o Boot option
- In order to inject faults while debugfs is not available (early boot time),
- use the boot option:
- failslab=
- fail_page_alloc=
- fail_make_request=<interval>,<probability>,<space>,<times>
- How to add new fault injection capability
- -----------------------------------------
- o #include <linux/fault-inject.h>
- o define the fault attributes
- DECLARE_FAULT_INJECTION(name);
- Please see the definition of struct fault_attr in fault-inject.h
- for details.
- o provide a way to configure fault attributes
- - boot option
- If you need to enable the fault injection capability from boot time, you can
- provide boot option to configure it. There is a helper function for it:
- setup_fault_attr(attr, str);
- - debugfs entries
- failslab, fail_page_alloc, and fail_make_request use this way.
- Helper functions:
- init_fault_attr_entries(entries, attr, name);
- void cleanup_fault_attr_entries(entries);
- - module parameters
- If the scope of the fault injection capability is limited to a
- single kernel module, it is better to provide module parameters to
- configure the fault attributes.
- o add a hook to insert failures
- Upon should_fail() returning true, client code should inject a failure.
- should_fail(attr, size);
- Application Examples
- --------------------
- o inject slab allocation failures into module init/cleanup code
- ------------------------------------------------------------------------------
- #!/bin/bash
- FAILCMD=Documentation/fault-injection/failcmd.sh
- BLACKLIST="root_plug evbug"
- FAILNAME=failslab
- echo Y > /debug/$FAILNAME/task-filter
- echo 10 > /debug/$FAILNAME/probability
- echo 100 > /debug/$FAILNAME/interval
- echo -1 > /debug/$FAILNAME/times
- echo 2 > /debug/$FAILNAME/verbose
- echo 1 > /debug/$FAILNAME/ignore-gfp-wait
- blacklist()
- {
- echo $BLACKLIST | grep $1 > /dev/null 2>&1
- }
- oops()
- {
- dmesg | grep BUG > /dev/null 2>&1
- }
- find /lib/modules/`uname -r` -name '*.ko' -exec basename {} .ko \; |
- while read i
- do
- oops && exit 1
- if ! blacklist $i
- then
- echo inserting $i...
- bash $FAILCMD modprobe $i
- fi
- done
- lsmod | awk '{ if ($3 == 0) { print $1 } }' |
- while read i
- do
- oops && exit 1
- if ! blacklist $i
- then
- echo removing $i...
- bash $FAILCMD modprobe -r $i
- fi
- done
- ------------------------------------------------------------------------------
- o inject slab allocation failures only for a specific module
- ------------------------------------------------------------------------------
- #!/bin/bash
- FAILMOD=Documentation/fault-injection/failmodule.sh
- echo injecting errors into the module $1...
- modprobe $1
- bash $FAILMOD failslab $1 10
- echo 25 > /debug/failslab/probability
- ------------------------------------------------------------------------------
|