Fault injection capabilities infrastructure
===========================================

See also drivers/md/faulty.c and "every_nth" module option for scsi_debug.

Available fault injection capabilities
--------------------------------------

o failslab

  injects slab allocation failures. (kmalloc(), kmem_cache_alloc(), ...)

o fail_page_alloc

  injects page allocation failures. (alloc_pages(), get_free_pages(), ...)

o fail_make_request

  injects disk I/O errors on devices permitted by setting
  /sys/block/<device>/make-it-fail or
  /sys/block/<device>/<partition>/make-it-fail. (generic_make_request())

Configure fault-injection capabilities behavior
-----------------------------------------------

o debugfs entries

The fault-inject-debugfs kernel module provides debugfs entries for runtime
configuration of the fault-injection capabilities.

- /debug/*/probability:

	likelihood of failure injection, in percent.
	Format: <percent>

	Note that one-failure-per-hundred is a very high error rate
	for some testcases. For such testcases, set probability=100 and
	configure /debug/*/interval instead.

- /debug/*/interval:

	specifies the interval between failures, for calls to
	should_fail() that pass all the other tests.

	Note that if you enable this, by setting interval>1, you will
	probably want to set probability=100.

- /debug/*/times:

	specifies how many times failures may happen at most.
	A value of -1 means "no limit".

- /debug/*/space:

	specifies an initial resource "budget", decremented by "size"
	on each call to should_fail(,size). Failure injection is
	suppressed until "space" reaches zero.

- /debug/*/verbose:

	Format: { 0 | 1 | 2 }
	specifies the verbosity of the messages printed when a failure is
	injected. The default is 0 (no extra messages). A value of '1'
	prints only a message saying that a failure happened; '2' also
	prints a call trace, which is useful for debugging the problems
	revealed by the fault injection capabilities.

- /debug/*/task-filter:

	Format: { 0 | 1 }
	A value of '0' disables filtering by process (default).
	Any positive value limits failures to processes for which
	/proc/<pid>/make-it-fail is 1.

- /debug/*/address-start:
- /debug/*/address-end:

	specifies the range of virtual addresses tested during
	stacktrace walking. Failure is injected only if some caller
	in the walked stacktrace lies within this range.
	Default is [0,ULONG_MAX) (whole of virtual address space).

- /debug/*/stacktrace-depth:

	specifies the maximum stacktrace depth walked during search
	for a caller within [address-start,address-end).

- /debug/fail_page_alloc/ignore-gfp-highmem:

	Format: { 0 | 1 }
	The default is 0. Setting it to '1' suppresses failure injection
	into highmem/user allocations.

- /debug/failslab/ignore-gfp-wait:
- /debug/fail_page_alloc/ignore-gfp-wait:

	Format: { 0 | 1 }
	The default is 0. Setting it to '1' injects failures only into
	non-sleeping allocations (GFP_ATOMIC allocations).
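
As a rough illustration of the entries above, the following shell sketch sets
several attributes of one capability in a single call. The function name
set_fault_attr and the DEBUGFS_ROOT variable are hypothetical and exist only
in this example; /debug is assumed to be the debugfs mount point, as in the
paths above.

```shell
#!/bin/bash
# Hypothetical helper: set_fault_attr <capability> <entry>=<value>...
# DEBUGFS_ROOT is an assumption of this sketch; it defaults to /debug,
# the mount point used by the paths in this document.
DEBUGFS_ROOT=${DEBUGFS_ROOT:-/debug}

set_fault_attr()
{
	local dir="$DEBUGFS_ROOT/$1"
	shift

	if [ ! -d "$dir" ]; then
		echo "no such fault-injection capability: $dir" >&2
		return 1
	fi

	local pair
	for pair in "$@"; do
		# Split "entry=value" at the first '=' and write the value
		# into the entry of the same name.
		printf '%s\n' "${pair#*=}" > "$dir/${pair%%=*}" || return 1
	done
}
```

For example, "set_fault_attr failslab probability=100 interval=10 times=-1"
would make every 10th eligible slab allocation fail, with no limit on the
number of failures.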

o Boot option

In order to inject faults while debugfs is not available (early boot time),
use the following boot options:

	failslab=
	fail_page_alloc=
	fail_make_request=<interval>,<probability>,<space>,<times>
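
The field order of the boot option is easy to get wrong, since interval comes
before probability. The small sketch below only formats such a string; the
function name fault_boot_option is hypothetical and the values in the usage
note are arbitrary.

```shell
#!/bin/bash
# Hypothetical helper: print a boot option using the
# <interval>,<probability>,<space>,<times> field order given above.
fault_boot_option()
{
	if [ $# -ne 5 ]; then
		echo "usage: fault_boot_option <name> <interval> <probability> <space> <times>" >&2
		return 1
	fi
	printf '%s=%s,%s,%s,%s\n' "$1" "$2" "$3" "$4" "$5"
}
```

"fault_boot_option failslab 1 10 0 -1" prints "failslab=1,10,0,-1", which
would be appended to the kernel command line.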

How to add new fault injection capability
-----------------------------------------

o #include <linux/fault-inject.h>

o define the fault attributes

  DECLARE_FAULT_INJECTION(name);

  Please see the definition of struct fault_attr in fault-inject.h
  for details.

o provide the way to configure fault attributes

- boot option

  If you need to enable the fault injection capability from boot time, you can
  provide a boot option to configure it. There is a helper function for it.

  setup_fault_attr(attr, str);

- debugfs entries

  failslab, fail_page_alloc, and fail_make_request use this method.
  There are helper functions for it.

  init_fault_attr_entries(entries, attr, name);
  cleanup_fault_attr_entries(entries);

- module parameters

  If the scope of the fault injection capability is limited to a
  single kernel module, it is better to provide module parameters to
  configure the fault attributes.

o add a hook to insert failures

  should_fail() returns 1 when a failure should be injected.

  should_fail(attr,size);
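
To make the interplay of times, space, interval and probability more concrete,
here is a simplified userspace model of the decision should_fail() makes. It
is written only from the entry descriptions in this document and is not the
kernel code: the order of the checks is an assumption, and task-filter and the
stacktrace filter are left out. Note that the model follows the shell
convention (exit status 0 means "inject a failure"), whereas should_fail()
returns 1 in that case.

```shell
#!/bin/bash
# Simplified model of the should_fail() decision; NOT the kernel code.
# The variables mirror the debugfs entries; the order of the checks is an
# assumption of this sketch.
probability=100		# percent
interval=1
times=-1		# -1 means "no limit"
space=0
count=0			# model-internal: calls that reached the interval check

# model_should_fail <size>: exit status 0 when a failure would be injected,
# 1 otherwise.
model_should_fail()
{
	local size=${1:-0}

	# "times" exhausted: no more failures.
	[ "$times" -eq 0 ] && return 1

	# Spend the "space" budget first; no failures until it is used up.
	if [ "$space" -gt 0 ]; then
		space=$((space > size ? space - size : 0))
		return 1
	fi

	# Only every "interval"-th remaining call is a candidate.
	if [ "$interval" -gt 1 ]; then
		count=$((count + 1))
		[ $((count % interval)) -ne 0 ] && return 1
	fi

	# Fail with the configured percentage.
	[ $((RANDOM % 100)) -lt "$probability" ] || return 1

	# A failure is injected; consume one of the remaining "times".
	[ "$times" -gt 0 ] && times=$((times - 1))
	return 0
}
```

With probability=100, interval=3 and times=2, the model injects a failure on
the 3rd and 6th calls and never again, which is why a large interval is
usually paired with probability=100.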

Application Examples
--------------------

o inject slab allocation failures into module init/cleanup code

------------------------------------------------------------------------------
#!/bin/bash

FAILCMD=Documentation/fault-injection/failcmd.sh
BLACKLIST="root_plug evbug"

FAILNAME=failslab
# Only tasks with /proc/<pid>/make-it-fail set are affected; the command
# run through $FAILCMD is expected to be such a task.
echo Y > /debug/$FAILNAME/task-filter
echo 10 > /debug/$FAILNAME/probability
echo 100 > /debug/$FAILNAME/interval
echo -1 > /debug/$FAILNAME/times
echo 2 > /debug/$FAILNAME/verbose
echo 1 > /debug/$FAILNAME/ignore-gfp-wait

# Succeeds if module $1 is listed in $BLACKLIST.
blacklist()
{
	echo $BLACKLIST | grep -qw -- "$1"
}

# Succeeds if the kernel log contains a BUG report.
oops()
{
	dmesg | grep -q BUG
}

# Load every module of the running kernel with fault injection enabled.
find /lib/modules/$(uname -r) -name '*.ko' -exec basename {} .ko \; |
	while read i
	do
		oops && exit 1

		if ! blacklist "$i"
		then
			echo inserting $i...
			bash $FAILCMD modprobe "$i"
		fi
	done

# Unload every module that is no longer in use.
lsmod | awk '{ if ($3 == 0) { print $1 } }' |
	while read i
	do
		oops && exit 1

		if ! blacklist "$i"
		then
			echo removing $i...
			bash $FAILCMD modprobe -r "$i"
		fi
	done
------------------------------------------------------------------------------
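
The script above relies on failcmd.sh to mark the command it runs, so that
task-filter applies to it. The contents of failcmd.sh are not shown in this
document; a wrapper of that kind could look roughly like the sketch below,
which sets make-it-fail for a subshell and then replaces that subshell with
the given command. The names run_failable and MAKE_IT_FAIL are hypothetical,
and the sketch assumes that the make-it-fail setting stays in effect for the
executed command.

```shell
#!/bin/bash
# Hypothetical wrapper: run_failable <command> [args...]
# MAKE_IT_FAIL is an assumption of this sketch; by default it points at the
# per-process flag described under task-filter.
MAKE_IT_FAIL=${MAKE_IT_FAIL:-/proc/self/make-it-fail}

run_failable()
{
	(
		# Mark this subshell as a target for fault injection...
		echo 1 > "$MAKE_IT_FAIL" || exit 1
		# ...then become the command, keeping the same process.
		exec "$@"
	)
}
```

With task-filter enabled, "run_failable modprobe <module>" would then expose
only that modprobe invocation to injected failures.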
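
o inject slab allocation failures only for a specific module

------------------------------------------------------------------------------
#!/bin/bash

FAILMOD=Documentation/fault-injection/failmodule.sh

echo injecting errors into the module $1...

# Load the module first, then restrict failslab to it.
modprobe "$1"
bash $FAILMOD failslab "$1" 10
echo 25 > /debug/failslab/probability
------------------------------------------------------------------------------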
  146. bash $FAILMOD failslab $1 10
  147. echo 25 > /debug/failslab/probability
  148. ------------------------------------------------------------------------------