                          The Linux IPMI Driver
                          ---------------------
                              Corey Minyard
                          <minyard@mvista.com>
                            <minyard@acm.org>

The Intelligent Platform Management Interface, or IPMI, is a
standard for controlling intelligent devices that monitor a system.
It provides for dynamic discovery of sensors in the system and the
ability to monitor the sensors and be informed when the sensors'
values change or go outside certain boundaries. It also has a
standardized database for field-replaceable units (FRUs) and a watchdog
timer.

To use this, you need an interface to an IPMI controller in your
system (called a Baseboard Management Controller, or BMC) and
management software that can use the IPMI system.

This document describes how to use the IPMI driver for Linux. If you
are not familiar with IPMI itself, see the web site at
http://www.intel.com/design/servers/ipmi/index.htm. IPMI is a big
subject and I can't cover it all here!

Configuration
-------------

The Linux IPMI driver is modular, which means you have to pick several
things to have it work right depending on your hardware. Most of
these are available in the 'Character Devices' menu.

No matter what, you must pick 'IPMI top-level message handler' to use
IPMI. What you do beyond that depends on your needs and hardware.

The message handler does not provide any user-level interfaces.
Kernel code (like the watchdog) can still use it. If you need access
from userland through a device driver, select 'Device interface for
IPMI'. Another interface is also available: you may select 'IPMI
sockets' in the 'Networking Support' main menu. This provides a socket
interface to IPMI. You may select both of these at the same time; they
will work together.

The driver interface depends on your hardware. If you have a board
with a standard interface (generally "KCS", "SMIC", or "BT"; consult
your hardware manual), choose the 'IPMI SI handler' option. A driver
also exists for direct I2C access to the IPMI management controller.
Some boards support this, but it is unknown if it will work on every
board. For this, choose 'IPMI SMBus handler', but be prepared to
experiment to see whether it works on your board.

There is also a KCS-only driver interface supplied, but it is
deprecated in favor of the SI interface.

You should generally enable ACPI on your system, as systems with IPMI
should have ACPI tables describing them.

If you have a standard interface and the board manufacturer has done
their job correctly, the IPMI controller should be automatically
detected (via ACPI or SMBIOS tables) and should just work. Sadly, many
boards do not have this information. The driver attempts standard
defaults, but they may not work. If you fall into this situation, you
need to read the section below named 'The SI Driver' on how to
hand-configure your system.

IPMI defines a standard watchdog timer. You can enable this with the
'IPMI Watchdog Timer' config option. If you compile the driver into
the kernel, then via a kernel command-line option you can have the
watchdog timer start as soon as it initializes. It also has a lot
of other options; see the 'Watchdog' section below for more details.
Note that you can also have the watchdog continue to run if it is
closed (by default it is disabled on close). Go into the 'Watchdog
Cards' menu, enable 'Watchdog Timer Support', and enable the option
'Disable watchdog shutdown on close'.

Basic Design
------------

The Linux IPMI driver is designed to be very modular and flexible; you
only need to take the pieces you need and you can use it in many
different ways. Because of that, it's broken into many chunks of
code. These chunks are:

ipmi_msghandler - This is the central piece of software for the IPMI
system. It handles all messages, message timing, and responses. The
IPMI users tie into this, and the IPMI physical interfaces (called
System Management Interfaces, or SMIs) also tie in here. This
provides the kernelland interface for IPMI, but does not provide an
interface for use by application processes.

ipmi_devintf - This provides a userland IOCTL interface for the IPMI
driver. Each open file for this device ties in to the message handler
as an IPMI user.

ipmi_si - A driver for various system interfaces. This supports
KCS, SMIC, and may support BT in the future. Unless you have your own
custom interface, you probably need to use this.

ipmi_smb - A driver for accessing BMCs on the SMBus. It uses the
I2C kernel driver's SMBus interfaces to send and receive IPMI messages
over the SMBus.

af_ipmi - A network socket interface to IPMI. This doesn't take up
a character device in your system.

Note that the KCS-only interface has been removed.

Much documentation for the interface is in the include files. The
IPMI include files are:

  net/af_ipmi.h - Contains the socket interface.

  linux/ipmi.h - Contains the user interface and IOCTL interface for IPMI.

  linux/ipmi_smi.h - Contains the interface for system management interfaces
  (things that interface to IPMI controllers) to use.

  linux/ipmi_msgdefs.h - General definitions for base IPMI messaging.

Addressing
----------

IPMI addressing works much like IP addressing: you have an overlay
to handle the different address types. The overlay is:

  struct ipmi_addr
  {
        int   addr_type;
        short channel;
        char  data[IPMI_MAX_ADDR_SIZE];
  };

The addr_type determines what the address really is. The driver
currently understands two different types of addresses.

"System Interface" addresses are defined as:

  struct ipmi_system_interface_addr
  {
        int   addr_type;
        short channel;
  };

and the type is IPMI_SYSTEM_INTERFACE_ADDR_TYPE. This is used for talking
straight to the BMC on the current card. The channel must be
IPMI_BMC_CHANNEL.

Messages that are destined to go out on the IPMB bus use the
IPMI_IPMB_ADDR_TYPE address type. The format is:

  struct ipmi_ipmb_addr
  {
        int           addr_type;
        short         channel;
        unsigned char slave_addr;
        unsigned char lun;
  };

The "channel" here is generally zero, but some devices support more
than one channel. It corresponds to the channel as defined in the IPMI
spec.
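
As a rough illustration of how the overlay is meant to be used, the
sketch below fills in both address types and stores them in a generic
struct ipmi_addr. The structures are local mirrors of the ones above so
that it builds without kernel headers, the numeric constants are
assumptions copied from linux/ipmi.h, and the helper names
fill_bmc_addr() and fill_ipmb_addr() are made up for this example.

```c
#include <assert.h>
#include <string.h>

/* Local mirrors of the structures above.  The numeric values are
 * assumptions copied from linux/ipmi.h; real code should include
 * that header instead of redefining them. */
#define IPMI_MAX_ADDR_SIZE              32
#define IPMI_SYSTEM_INTERFACE_ADDR_TYPE 0x0c
#define IPMI_IPMB_ADDR_TYPE             0x01
#define IPMI_BMC_CHANNEL                0xf

struct ipmi_addr {
	int   addr_type;
	short channel;
	char  data[IPMI_MAX_ADDR_SIZE];
};

struct ipmi_system_interface_addr {
	int   addr_type;
	short channel;
};

struct ipmi_ipmb_addr {
	int           addr_type;
	short         channel;
	unsigned char slave_addr;
	unsigned char lun;
};

/* Address the BMC on the local card (hypothetical helper). */
static void fill_bmc_addr(struct ipmi_addr *out)
{
	struct ipmi_system_interface_addr si;

	assert(sizeof(si) <= sizeof(*out));
	memset(out, 0, sizeof(*out));
	memset(&si, 0, sizeof(si));
	si.addr_type = IPMI_SYSTEM_INTERFACE_ADDR_TYPE;
	si.channel = IPMI_BMC_CHANNEL;
	memcpy(out, &si, sizeof(si));
}

/* Address a device on the IPMB bus (hypothetical helper). */
static void fill_ipmb_addr(struct ipmi_addr *out, short channel,
			   unsigned char slave_addr, unsigned char lun)
{
	struct ipmi_ipmb_addr ipmb;

	assert(sizeof(ipmb) <= sizeof(*out));
	memset(out, 0, sizeof(*out));
	memset(&ipmb, 0, sizeof(ipmb));
	ipmb.addr_type = IPMI_IPMB_ADDR_TYPE;
	ipmb.channel = channel;
	ipmb.slave_addr = slave_addr;
	ipmb.lun = lun;
	memcpy(out, &ipmb, sizeof(ipmb));
}
```

Because every address type starts with the same addr_type and channel
fields, a receiver can look at addr_type in the overlay first and only
then decide how to interpret the rest.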

Messages
--------

Messages are defined as:

  struct ipmi_msg
  {
        unsigned char netfn;
        unsigned char lun;
        unsigned char cmd;
        unsigned char *data;
        int           data_len;
  };

The driver takes care of adding/stripping the header information. The
data portion is just the data to be sent (do NOT put addressing info
here) or the response. Note that the completion code of a response is
the first item in "data"; it is not stripped out because that is how
all the messages are defined in the spec (and thus makes counting the
offsets a little easier :-).
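
To make the layout of a response concrete, here is a small sketch that
pulls the completion code and the remaining payload out of a message.
It uses a local mirror of struct ipmi_msg as shown above, and the
helper names are invented for this example. The even/odd netfn check
relies on the IPMI spec's convention that request netfns are even and
response netfns are odd.

```c
#include <assert.h>
#include <stddef.h>

/* Local mirror of the structure shown above. */
struct ipmi_msg {
	unsigned char  netfn;
	unsigned char  lun;
	unsigned char  cmd;
	unsigned char *data;
	int            data_len;
};

/* Responses have the lowest-order bit of the netfn set. */
static int ipmi_msg_is_response(const struct ipmi_msg *msg)
{
	assert(msg != NULL);
	return msg->netfn & 1;
}

/* Return the completion code (data[0]), or -1 if there is no data. */
static int ipmi_completion_code(const struct ipmi_msg *rsp)
{
	assert(rsp != NULL);
	if (rsp->data == NULL || rsp->data_len < 1)
		return -1;
	return rsp->data[0];
}

/* Point *payload at the bytes following the completion code and
 * return how many there are (0 if the response is empty). */
static int ipmi_response_payload(const struct ipmi_msg *rsp,
				 const unsigned char **payload)
{
	assert(rsp != NULL && payload != NULL);
	if (rsp->data == NULL || rsp->data_len < 1) {
		*payload = NULL;
		return 0;
	}
	*payload = rsp->data + 1;
	return rsp->data_len - 1;
}
```

Keeping the completion code at data[0] means the offsets in the payload
line up with the byte numbers used in the spec's response tables.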

When using the IOCTL interface from userland, you must provide a block
of data for "data", fill it, and set data_len to the length of the
block of data, even when receiving messages. Otherwise the driver
will have no place to put the message.

Messages coming up from the message handler in kernelland will come in
as:

  struct ipmi_recv_msg
  {
        struct list_head link;

        /* The type of message as defined in the "Receive Types"
           defines above. */
        int         recv_type;

        ipmi_user_t      *user;
        struct ipmi_addr addr;
        long             msgid;
        struct ipmi_msg  msg;

        /* Call this when done with the message.  It will presumably free
           the message and do any other necessary cleanup. */
        void (*done)(struct ipmi_recv_msg *msg);

        /* Place-holder for the data, don't make any assumptions about
           the size or existence of this, since it may change. */
        unsigned char   msg_data[IPMI_MAX_MSG_LENGTH];
  };

You should look at the receive type and handle the message
appropriately.

The Upper Layer Interface (Message Handler)
-------------------------------------------

The upper layer of the interface provides the users with a consistent
view of the IPMI interfaces. It allows multiple SMI interfaces to be
addressed (because some boards actually have multiple BMCs on them)
and the user should not have to care what type of SMI is below them.


Creating the User

To use the message handler, you must first create a user using
ipmi_create_user. The interface number specifies which SMI you want
to connect to, and you must supply callback functions to be called
when data comes in. The callback function can run at interrupt level,
so be careful using the callbacks. This also allows you to pass in a
piece of data, the handler_data, that will be passed back to you on
all calls.

Once you are done, call ipmi_destroy_user() to get rid of the user.

From userland, opening the device automatically creates a user, and
closing the device automatically destroys the user.


Messaging

To send a message from kernel-land, the ipmi_request() call does
pretty much all message handling. Most of the parameters are
self-explanatory. However, it takes a "msgid" parameter. This is NOT
the sequence number of messages. It is simply a long value that is
passed back when the response for the message is returned. You may
use it for anything you like.

Responses come back in the function pointed to by the ipmi_recv_hndl
field of the "handler" that you passed in to ipmi_create_user().
Remember again, these may be running at interrupt level. Remember to
look at the receive type, too.

From userland, you fill out an ipmi_req_t structure and use the
IPMICTL_SEND_COMMAND ioctl. For incoming messages, you can use select()
or poll() to wait for messages to come in. However, you cannot use
read() to get them; you must call the IPMICTL_RECEIVE_MSG ioctl with the
ipmi_recv_t structure to actually get the message. Remember that you
must supply a pointer to a block of data in the msg.data field, and
you must fill in the msg.data_len field with the size of the data.
This gives the receiver a place to actually put the message.

If the message cannot fit into the data you provide, you will get an
EMSGSIZE error and the driver will leave the data in the receive
queue. If you want to get it and have it truncate the message, use
the IPMICTL_RECEIVE_MSG_TRUNC ioctl.
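
The userland send and receive steps above might look roughly like the
following. This is only a sketch: it assumes the structure names
struct ipmi_req and struct ipmi_recv found in current copies of
linux/ipmi.h (the ipmi_req_t and ipmi_recv_t typedefs named above
belong to older headers), it assumes the caller has already opened the
IPMI device and waited with select() or poll(), and the function names
are invented for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/ipmi.h>

/* Send "Get Device ID" to the local BMC on an already-open device fd.
 * Returns 0 on success, -1 with errno set on failure. */
static int send_get_device_id(int fd, long msgid)
{
	struct ipmi_system_interface_addr addr;
	struct ipmi_req req;

	memset(&addr, 0, sizeof(addr));
	addr.addr_type = IPMI_SYSTEM_INTERFACE_ADDR_TYPE;
	addr.channel = IPMI_BMC_CHANNEL;

	memset(&req, 0, sizeof(req));
	req.addr = (unsigned char *)&addr;
	req.addr_len = sizeof(addr);
	req.msgid = msgid;              /* handed back with the response */
	req.msg.netfn = IPMI_NETFN_APP_REQUEST;
	req.msg.cmd = IPMI_GET_DEVICE_ID_CMD;
	req.msg.data = NULL;            /* this command carries no data */
	req.msg.data_len = 0;

	return ioctl(fd, IPMICTL_SEND_COMMAND, &req);
}

/* Fetch one queued message into buf.  If it does not fit, retry with
 * the truncating ioctl.  Returns the number of bytes stored (the
 * completion code is buf[0]) or -1 with errno set. */
static int receive_msg(int fd, unsigned char *buf, unsigned short buf_len,
		       long *msgid)
{
	struct ipmi_addr addr;
	struct ipmi_recv recv;

	assert(buf != NULL && buf_len > 0);

	memset(&recv, 0, sizeof(recv));
	recv.addr = (unsigned char *)&addr;
	recv.addr_len = sizeof(addr);
	recv.msg.data = buf;            /* the driver copies into here */
	recv.msg.data_len = buf_len;

	if (ioctl(fd, IPMICTL_RECEIVE_MSG, &recv) < 0) {
		if (errno != EMSGSIZE)
			return -1;
		recv.msg.data_len = buf_len;
		if (ioctl(fd, IPMICTL_RECEIVE_MSG_TRUNC, &recv) < 0)
			return -1;
	}
	if (msgid != NULL)
		*msgid = recv.msgid;
	return recv.msg.data_len;
}
```

A caller would typically match the msgid returned by receive_msg()
against the value it passed to send_get_device_id() to pair the
response with its request.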

When you send a command (which is defined by the lowest-order bit of
the netfn per the IPMI spec) on the IPMB bus, the driver will
automatically assign the sequence number to the command and save the
command. If the response is not received in the IPMI-specified 5
seconds, it will generate a response automatically saying the command
timed out. If an unsolicited response comes in (if it was after 5
seconds, for instance), that response will be ignored.

In kernelland, after you receive a message and are done with it, you
MUST call ipmi_free_recv_msg() on it, or you will leak messages. Note
that you should NEVER mess with the "done" field of a message; it is
required to properly clean up the message.

Note that when sending, there is an ipmi_request_supply_msgs() call
that lets you supply the SMI message and the receive message. This is
useful for pieces of code that need to work even if the system is out
of buffers (the watchdog timer uses this, for instance). You supply
your own buffer and own free routines. This is not recommended for
normal use, though, since it is tricky to manage your own buffers.


Events and Incoming Commands

The driver takes care of polling for IPMI events and receiving
commands (commands are messages that are not responses; they are
commands that other things on the IPMB bus have sent you). To receive
these, you must register for them. They will not automatically be sent
to you.

To receive events, you must call ipmi_set_gets_events() and set the
"val" to non-zero. Any events that have been received by the driver
since startup will immediately be delivered to the first user that
registers for events. After that, if multiple users are registered
for events, they will all receive all events that come in.

For receiving commands, you have to individually register commands you
want to receive. Call ipmi_register_for_cmd() and supply the netfn
and command for each command you want to receive. Only one user
may be registered for each netfn/cmd, but different users may register
for different commands.

From userland, equivalent IOCTLs are provided to do these functions.
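
A minimal sketch of the userland side, assuming the ioctl names
IPMICTL_SET_GETS_EVENTS_CMD and IPMICTL_REGISTER_FOR_CMD and the
struct ipmi_cmdspec type from linux/ipmi.h, could look like this. The
wrapper function names are made up for the example, and the assert on
the netfn reflects the spec's rule that incoming commands (requests)
carry an even netfn.

```c
#include <assert.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/ipmi.h>

/* Turn event delivery on (non-zero) or off (zero) for this open file.
 * Returns 0 on success, -1 with errno set on failure. */
static int set_gets_events(int fd, int enable)
{
	int val = enable ? 1 : 0;

	return ioctl(fd, IPMICTL_SET_GETS_EVENTS_CMD, &val);
}

/* Ask to receive one incoming netfn/cmd pair.  Fails if another user
 * has already registered for the same pair. */
static int register_for_cmd(int fd, unsigned char netfn, unsigned char cmd)
{
	struct ipmi_cmdspec spec;

	assert((netfn & 1) == 0);	/* commands use even netfns */

	memset(&spec, 0, sizeof(spec));
	spec.netfn = netfn;
	spec.cmd = cmd;
	return ioctl(fd, IPMICTL_REGISTER_FOR_CMD, &spec);
}
```

Once registered, events and commands arrive through the same receive
ioctl as responses, so the receive type is what tells them apart.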

The Lower Layer (SMI) Interface
-------------------------------

As mentioned before, multiple SMI interfaces may be registered to the
message handler. Each of these is assigned an interface number when
it registers with the message handler. They are generally assigned
in the order they register, although if an SMI unregisters and then
another one registers, all bets are off.

The ipmi_smi.h header defines the interface for management interfaces;
see that for more details.

The SI Driver
-------------

The SI driver allows up to 4 KCS or SMIC interfaces to be configured
in the system. By default, the driver scans the ACPI tables for
interfaces, and if it doesn't find any it will attempt to register one
KCS interface at the spec-specified I/O port 0xca2 without interrupts.
You can change this at module load time (for a module) with:

  modprobe ipmi_si.o type=<type1>,<type2>....
       ports=<port1>,<port2>... addrs=<addr1>,<addr2>...
       irqs=<irq1>,<irq2>... trydefaults=[0|1]
       regspacings=<sp1>,<sp2>,... regsizes=<size1>,<size2>,...
       regshifts=<shift1>,<shift2>,...
       slave_addrs=<addr1>,<addr2>,...

Each of these except trydefaults is a list: the first item is for the
first interface, the second item for the second interface, etc.

The type may be either "kcs", "smic", or "bt". If you leave it blank, it
defaults to "kcs".

If you specify addrs as non-zero for an interface, the driver will
use the memory address given as the address of the device. This
overrides ports.

If you specify ports as non-zero for an interface, the driver will
use the I/O port given as the device address.

If you specify irqs as non-zero for an interface, the driver will
attempt to use the given interrupt for the device.

trydefaults sets whether the standard IPMI interface at 0xca2 and
any interfaces specified by ACPI are tried. By default, the driver
tries them; set this value to zero to turn this off.

The next three parameters have to do with register layout. The
registers used by the interfaces may not appear at successive
locations and they may not be in 8-bit registers. These parameters
allow the layout of the data in the registers to be more precisely
specified.

The regspacings parameter gives the number of bytes between successive
register start addresses. For instance, if the regspacing is set to 4
and the start address is 0xca2, then the address for the second
register would be 0xca6. This defaults to 1.

The regsizes parameter gives the size of a register, in bytes. The
data used by IPMI is 8 bits wide, but it may be inside a larger
register. This parameter allows the read and write type to be specified.
It may be 1, 2, 4, or 8. The default is 1.

Since the register may be wider than the 8 bits of IPMI data, the data
may not be in the lower 8 bits. The regshifts parameter gives the amount
to shift the data to get to the actual IPMI data.
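
The register layout arithmetic can be sketched as follows. The
structure and function names here are invented for illustration and do
not come from the driver, and the sketch assumes that the shift amount
is counted in bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for one interface's layout parameters. */
struct si_reg_layout {
	unsigned long base;       /* port or memory start address */
	unsigned int  regspacing; /* bytes between register starts */
	unsigned int  regsize;    /* register width in bytes: 1, 2, 4 or 8 */
	unsigned int  regshift;   /* bits to shift to reach the IPMI byte */
};

/* Address of register number n (0 is the first register). */
static unsigned long si_reg_addr(const struct si_reg_layout *l,
				 unsigned int n)
{
	return l->base + (unsigned long)n * l->regspacing;
}

/* Extract the 8-bit IPMI value from a raw register read. */
static uint8_t si_extract_data(const struct si_reg_layout *l, uint64_t raw)
{
	assert(l->regshift < 8 * l->regsize);
	return (uint8_t)((raw >> l->regshift) & 0xff);
}

/* Position an 8-bit IPMI value for a raw register write. */
static uint64_t si_place_data(const struct si_reg_layout *l, uint8_t value)
{
	assert(l->regshift < 8 * l->regsize);
	return (uint64_t)value << l->regshift;
}
```

With base 0xca2 and regspacing 4, si_reg_addr() gives 0xca6 for the
second register, matching the example in the text.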

The slave_addrs parameter specifies the IPMI address of the local BMC.
This is usually 0x20 and the driver defaults to that, but in case it's
not, it can be specified when the driver starts up.

When compiled into the kernel, the addresses can be specified on the
kernel command line as:

  ipmi_si.type=<type1>,<type2>...
       ipmi_si.ports=<port1>,<port2>... ipmi_si.addrs=<addr1>,<addr2>...
       ipmi_si.irqs=<irq1>,<irq2>... ipmi_si.trydefaults=[0|1]
       ipmi_si.regspacings=<sp1>,<sp2>,...
       ipmi_si.regsizes=<size1>,<size2>,...
       ipmi_si.regshifts=<shift1>,<shift2>,...
       ipmi_si.slave_addrs=<addr1>,<addr2>,...

These work the same as the module parameters of the same names.

By default, the driver will attempt to detect any device specified by
ACPI, and if it finds none of those, a KCS device at the spec-specified
0xca2. If you want to turn this off, set the "trydefaults" option to
false.

If you have high-res timers compiled into the kernel, the driver will
use them to provide much better performance. Note that if you do not
have high-res timers enabled in the kernel and you don't have
interrupts enabled, the driver will run VERY slowly. Don't blame me;
these interfaces suck.

The SMBus Driver
----------------

The SMBus driver allows up to 4 SMBus devices to be configured in the
system. By default, the driver will register any SMBus interfaces it finds
in the I2C address range of 0x20 to 0x4f on any adapter. You can change this
at module load time (for a module) with:

  modprobe ipmi_smb.o
        addr=<adapter1>,<i2caddr1>[,<adapter2>,<i2caddr2>[,...]]
        dbg=<flags1>,<flags2>...
        [defaultprobe=0] [dbg_probe=1]

The addresses are specified in pairs: the first is the adapter ID and the
second is the I2C address on that adapter.

The debug flags are bit flags for each BMC found. They are:
IPMI messages: 1, driver state: 2, timing: 4, I2C probe: 8
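
Since the debug flags are bit flags, several can be combined into one
value by OR-ing them together. The sketch below shows the idea; the
constant and function names are invented for this example and are not
taken from the driver source.

```c
#include <assert.h>

/* Hypothetical names for the documented flag values. */
enum {
	SMB_DBG_MSG       = 1,	/* IPMI messages */
	SMB_DBG_STATE     = 2,	/* driver state */
	SMB_DBG_TIMING    = 4,	/* timing */
	SMB_DBG_I2C_PROBE = 8,	/* I2C probe */
	SMB_DBG_ALL       = 15
};

/* Return non-zero if the given flag is set in a dbg value. */
static int smb_dbg_enabled(unsigned int flags, unsigned int which)
{
	assert((which & ~(unsigned int)SMB_DBG_ALL) == 0);
	return (flags & which) != 0;
}
```

For instance, a dbg value of 5 for a BMC would enable message and
timing debugging for it, but not driver state or I2C probe debugging.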

Setting defaultprobe to zero disables the default probing of SMBus
interfaces at address range 0x20 to 0x4f. This means that only the
BMCs specified on the addr line will be detected.

Setting dbg_probe to 1 will enable debugging of the probing and
detection process for BMCs on the SMBuses.

Discovering the IPMI-compliant BMC on the SMBus can cause devices
on the I2C bus to fail. The SMBus driver writes a "Get Device ID" IPMI
message as a block write to the I2C bus and waits for a response.
This action can be detrimental to some I2C devices. It is highly recommended
that the known I2C address be given to the SMBus driver in the addr
parameter. The default address range will not be used when an addr
parameter is provided.

When compiled into the kernel, the addresses can be specified on the
kernel command line as:

  ipmi_smb.addr=<adapter1>,<i2caddr1>[,<adapter2>,<i2caddr2>[,...]]
        ipmi_smb.dbg=<flags1>,<flags2>...
        ipmi_smb.defaultprobe=0 ipmi_smb.dbg_probe=1

These are the same options as on the module command line.

Note that you might need some I2C changes if CONFIG_IPMI_PANIC_EVENT
is enabled along with this, so the I2C driver knows to run to
completion while sending a panic event.

Other Pieces
------------

Watchdog
--------

A watchdog timer is provided that implements the Linux-standard
watchdog timer interface. It has several module parameters that can be
used to control it:

  modprobe ipmi_watchdog timeout=<t> pretimeout=<t> action=<action type>
      preaction=<preaction type> preop=<preop type> start_now=x
      nowayout=x

The timeout is the number of seconds to the action, and the pretimeout
is the number of seconds before the reset that the pre-timeout panic will
occur (if pretimeout is zero, then pretimeout will not be enabled). Note
that the pretimeout is the time before the final timeout. So if the
timeout is 50 seconds and the pretimeout is 10 seconds, then the pretimeout
will occur in 40 seconds (10 seconds before the timeout).
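
The relationship between the two values is simple subtraction, as this
small sketch shows. The function name is made up, and returning -1 for
a pretimeout that is not shorter than the timeout is an assumption of
the sketch, not documented driver behavior.

```c
#include <assert.h>

/* Seconds after the timer is started at which the pretimeout fires,
 * or -1 if the pretimeout is disabled (zero) or not shorter than the
 * timeout. */
static int pretimeout_fires_at(int timeout, int pretimeout)
{
	assert(timeout > 0);
	if (pretimeout <= 0 || pretimeout >= timeout)
		return -1;
	return timeout - pretimeout;
}
```

With timeout=50 and pretimeout=10 this gives 40, as in the example
above.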

The action may be "reset", "power_cycle", or "power_off", and
specifies what to do when the timer times out. It defaults to
"reset".

The preaction may be "pre_smi" for an indication through the SMI
interface, "pre_int" for an indication through the SMI with an
interrupt, and "pre_nmi" for an NMI on a preaction. This is how
the driver is informed of the pretimeout.

The preop may be set to "preop_none" for no operation on a pretimeout,
"preop_panic" to set the preoperation to panic, or "preop_give_data"
to provide data to read from the watchdog device when the pretimeout
occurs. A "pre_nmi" setting CANNOT be used with "preop_give_data"
because you can't do data operations from an NMI.

When preop is set to "preop_give_data", one byte comes ready to read
on the device when the pretimeout occurs. Select and fasync work on
the device, as well.

If start_now is set to 1, the watchdog timer will start running as
soon as the driver is loaded.

If nowayout is set to 1, the watchdog timer will not stop when the
watchdog device is closed. The default value of nowayout is true
if the CONFIG_WATCHDOG_NOWAYOUT option is enabled, or false if not.

When compiled into the kernel, the kernel command line is available
for configuring the watchdog:

  ipmi_watchdog.timeout=<t> ipmi_watchdog.pretimeout=<t>
        ipmi_watchdog.action=<action type>
        ipmi_watchdog.preaction=<preaction type>
        ipmi_watchdog.preop=<preop type>
        ipmi_watchdog.start_now=x
        ipmi_watchdog.nowayout=x

The options are the same as the module parameter options.

The watchdog will panic and start a 120 second reset timeout if it
gets a pre-action. During a panic or a reboot, the watchdog will
start a 120 second timer if it is running to make sure the reboot
occurs.

Note that if you use the NMI preaction for the watchdog, you MUST
NOT use nmi watchdog mode 1. If you use the NMI watchdog, you
must use mode 2.

Once you open the watchdog timer, you must write a 'V' character to the
device before closing it, or the timer will not stop. This is a new
semantic for the driver, but makes it consistent with the rest of the
watchdog drivers in Linux.
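
From a program's point of view, the 'V' rule might be handled as in
the sketch below. It works on an already-open file descriptor; the
device is conventionally opened as /dev/watchdog, but that path is an
assumption here and is deliberately not opened by the sketch. The
function names are made up for the example.

```c
#include <assert.h>
#include <unistd.h>

/* Restart the countdown.  Linux watchdog devices treat any write as
 * a keepalive.  Returns 0 on success, -1 on failure. */
static int watchdog_pet(int fd)
{
	assert(fd >= 0);
	return write(fd, "\0", 1) == 1 ? 0 : -1;
}

/* Write 'V' and then close the device so the timer stops.  If the
 * write fails, the descriptor is left open and -1 is returned, so the
 * caller can retry or keep petting the watchdog. */
static int watchdog_release(int fd)
{
	assert(fd >= 0);
	if (write(fd, "V", 1) != 1)
		return -1;
	return close(fd);
}
```

Keep in mind that if nowayout is set, the timer keeps running after
the close even when the 'V' has been written.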