浏览代码

Merge branch 'linus' into core/rcu

Ingo Molnar 16 年之前
父节点
当前提交
c4c0c56a7a
共有 100 个文件被更改,包括 4401 次插入1827 次删除
  1. 9 2
      CREDITS
  2. 0 4
      Documentation/00-INDEX
  3. 315 0
      Documentation/ABI/testing/sysfs-class-regulator
  4. 20 0
      Documentation/ABI/testing/sysfs-dev
  5. 24 0
      Documentation/ABI/testing/sysfs-devices-memory
  6. 6 0
      Documentation/ABI/testing/sysfs-kernel-mm
  7. 15 0
      Documentation/ABI/testing/sysfs-kernel-mm-hugepages
  8. 23 19
      Documentation/CodingStyle
  9. 2 2
      Documentation/DMA-API.txt
  10. 9 0
      Documentation/DMA-attributes.txt
  11. 1 1
      Documentation/DocBook/Makefile
  12. 38 0
      Documentation/DocBook/gadget.tmpl
  13. 24 33
      Documentation/DocBook/kernel-locking.tmpl
  14. 18 0
      Documentation/DocBook/kgdb.tmpl
  15. 2 2
      Documentation/DocBook/procfs-guide.tmpl
  16. 4 4
      Documentation/DocBook/s390-drivers.tmpl
  17. 105 0
      Documentation/DocBook/sh.tmpl
  18. 51 12
      Documentation/DocBook/uio-howto.tmpl
  19. 1 1
      Documentation/DocBook/videobook.tmpl
  20. 12 26
      Documentation/DocBook/z8530book.tmpl
  21. 1 1
      Documentation/HOWTO
  22. 2 2
      Documentation/Intel-IOMMU.txt
  23. 26 0
      Documentation/SubmittingPatches
  24. 8 3
      Documentation/accounting/delay-accounting.txt
  25. 6 2
      Documentation/accounting/getdelays.c
  26. 8 1
      Documentation/accounting/taskstats-struct.txt
  27. 1 1
      Documentation/arm/IXP4xx
  28. 3 9
      Documentation/arm/Interrupts
  29. 2 2
      Documentation/arm/README
  30. 4 4
      Documentation/arm/Samsung-S3C24XX/GPIO.txt
  31. 1 1
      Documentation/arm/Samsung-S3C24XX/Overview.txt
  32. 1 1
      Documentation/arm/Samsung-S3C24XX/USB-Host.txt
  33. 67 0
      Documentation/bt8xxgpio.txt
  34. 6 15
      Documentation/cciss.txt
  35. 0 133
      Documentation/cli-sti-removal.txt
  36. 1 2
      Documentation/controllers/memory.txt
  37. 1 1
      Documentation/cpu-freq/governors.txt
  38. 73 80
      Documentation/edac.txt
  39. 131 0
      Documentation/fb/sh7760fb.txt
  40. 31 15
      Documentation/fb/tridentfb.txt
  41. 53 23
      Documentation/feature-removal-schedule.txt
  42. 7 0
      Documentation/filesystems/Locking
  43. 5 5
      Documentation/filesystems/bfs.txt
  44. 18 9
      Documentation/filesystems/configfs/configfs.txt
  45. 0 487
      Documentation/filesystems/configfs/configfs_example.c
  46. 485 0
      Documentation/filesystems/configfs/configfs_example_explicit.c
  47. 448 0
      Documentation/filesystems/configfs/configfs_example_macros.c
  48. 59 44
      Documentation/filesystems/nfs-rdma.txt
  49. 106 0
      Documentation/filesystems/omfs.txt
  50. 46 2
      Documentation/filesystems/proc.txt
  51. 10 0
      Documentation/filesystems/relay.txt
  52. 6 0
      Documentation/filesystems/sysfs.txt
  53. 8 0
      Documentation/filesystems/vfat.txt
  54. 3 3
      Documentation/filesystems/vfs.txt
  55. 1 0
      Documentation/ftrace.txt
  56. 129 6
      Documentation/gpio.txt
  57. 41 16
      Documentation/hwmon/dme1737
  58. 7 6
      Documentation/hwmon/it87
  59. 4 7
      Documentation/hwmon/lm85
  60. 0 4
      Documentation/hwmon/w83627hf
  61. 3 3
      Documentation/hwmon/w83791d
  62. 281 0
      Documentation/i2c/upgrading-clients
  63. 4 4
      Documentation/ia64/kvm.txt
  64. 137 0
      Documentation/ia64/paravirt_ops.txt
  65. 1 1
      Documentation/input/cs461x.txt
  66. 0 2
      Documentation/input/gameport-programming.txt
  67. 0 1
      Documentation/input/input.txt
  68. 0 2
      Documentation/input/joystick-api.txt
  69. 0 1
      Documentation/input/joystick-parport.txt
  70. 0 1
      Documentation/input/joystick.txt
  71. 2 2
      Documentation/ioctl/ioctl-decoding.txt
  72. 1 1
      Documentation/iostats.txt
  73. 6 0
      Documentation/isdn/README.mISDN
  74. 10 10
      Documentation/kdump/kdump.txt
  75. 50 12
      Documentation/kernel-parameters.txt
  76. 1 1
      Documentation/keys.txt
  77. 18 8
      Documentation/laptops/thinkpad-acpi.txt
  78. 1 1
      Documentation/leds-class.txt
  79. 386 133
      Documentation/lguest/lguest.c
  80. 1 1
      Documentation/local_ops.txt
  81. 29 1
      Documentation/md.txt
  82. 252 140
      Documentation/moxa-smartio
  83. 81 31
      Documentation/networking/bonding.txt
  84. 2 2
      Documentation/networking/can.txt
  85. 167 0
      Documentation/networking/dm9000.txt
  86. 2 12
      Documentation/networking/e1000.txt
  87. 17 4
      Documentation/networking/ip-sysctl.txt
  88. 320 99
      Documentation/networking/ixgb.txt
  89. 67 0
      Documentation/networking/mac80211_hwsim/README
  90. 11 0
      Documentation/networking/mac80211_hwsim/hostapd.conf
  91. 10 0
      Documentation/networking/mac80211_hwsim/wpa_supplicant.conf
  92. 1 89
      Documentation/networking/multiqueue.txt
  93. 1 1
      Documentation/networking/packet_mmap.txt
  94. 2 5
      Documentation/networking/s2io.txt
  95. 8 7
      Documentation/networking/tc-actions-env-rules.txt
  96. 1 1
      Documentation/networking/udplite.txt
  97. 2 2
      Documentation/power/00-INDEX
  98. 32 0
      Documentation/power/apm-acpi.txt
  99. 0 257
      Documentation/power/pm.txt
  100. 6 1
      Documentation/power/pm_qos_interface.txt

+ 9 - 2
CREDITS

@@ -317,6 +317,14 @@ S: 2322 37th Ave SW
 S: Seattle, Washington 98126-2010
 S: USA
 
+N: Muli Ben-Yehuda
+E: mulix@mulix.org
+E: muli@il.ibm.com
+W: http://www.mulix.org
+D: trident OSS sound driver, x86-64 dma-ops and Calgary IOMMU,
+D: KVM and Xen bits and other misc. hackery.
+S: Haifa, Israel
+
 N: Johannes Berg
 E: johannes@sipsolutions.net
 W: http://johannes.sipsolutions.net/
@@ -3344,8 +3352,7 @@ S: Spain
 N: Linus Torvalds
 E: torvalds@linux-foundation.org
 D: Original kernel hacker
-S: 12725 SW Millikan Way, Suite 400
-S: Beaverton, Oregon 97005
+S: Portland, Oregon 97005
 S: USA
 
 N: Marcelo Tosatti

+ 0 - 4
Documentation/00-INDEX

@@ -89,8 +89,6 @@ cciss.txt
 	- info, major/minor #'s for Compaq's SMART Array Controllers.
 cdrom/
 	- directory with information on the CD-ROM drivers that Linux has.
-cli-sti-removal.txt
-	- cli()/sti() removal guide.
 computone.txt
 	- info on Computone Intelliport II/Plus Multiport Serial Driver.
 connector/
@@ -361,8 +359,6 @@ telephony/
 	- directory with info on telephony (e.g. voice over IP) support.
 time_interpolators.txt
 	- info on time interpolators.
-tipar.txt
-	- information about Parallel link cable for Texas Instruments handhelds.
 tty.txt
 	- guide to the locking policies of the tty layer.
 uml/

+ 315 - 0
Documentation/ABI/testing/sysfs-class-regulator

@@ -0,0 +1,315 @@
+What:		/sys/class/regulator/.../state
+Date:		April 2008
+KernelVersion:	2.6.26
+Contact:	Liam Girdwood <lg@opensource.wolfsonmicro.com>
+Description:
+		Each regulator directory will contain a field called
+		state. This holds the regulator output state.
+
+		This will be one of the following strings:
+
+		'enabled'
+		'disabled'
+		'unknown'
+
+		'enabled' means the regulator output is ON and is supplying
+		power to the system.
+
+		'disabled' means the regulator output is OFF and is not
+		supplying power to the system..
+
+		'unknown' means software cannot determine the state.
+
+		NOTE: this field can be used in conjunction with microvolts
+		and microamps to determine regulator output levels.
+
+
+What:		/sys/class/regulator/.../type
+Date:		April 2008
+KernelVersion:	2.6.26
+Contact:	Liam Girdwood <lg@opensource.wolfsonmicro.com>
+Description:
+		Each regulator directory will contain a field called
+		type. This holds the regulator type.
+
+		This will be one of the following strings:
+
+		'voltage'
+		'current'
+		'unknown'
+
+		'voltage' means the regulator output voltage can be controlled
+		by software.
+
+		'current' means the regulator output current limit can be
+		controlled by software.
+
+		'unknown' means software cannot control either voltage or
+		current limit.
+
+
+What:		/sys/class/regulator/.../microvolts
+Date:		April 2008
+KernelVersion:	2.6.26
+Contact:	Liam Girdwood <lg@opensource.wolfsonmicro.com>
+Description:
+		Each regulator directory will contain a field called
+		microvolts. This holds the regulator output voltage setting
+		measured in microvolts (i.e. E-6 Volts).
+
+		NOTE: This value should not be used to determine the regulator
+		output voltage level as this value is the same regardless of
+		whether the regulator is enabled or disabled.
+
+
+What:		/sys/class/regulator/.../microamps
+Date:		April 2008
+KernelVersion:	2.6.26
+Contact:	Liam Girdwood <lg@opensource.wolfsonmicro.com>
+Description:
+		Each regulator directory will contain a field called
+		microamps. This holds the regulator output current limit
+		setting measured in microamps (i.e. E-6 Amps).
+
+		NOTE: This value should not be used to determine the regulator
+		output current level as this value is the same regardless of
+		whether the regulator is enabled or disabled.
+
+
+What:		/sys/class/regulator/.../opmode
+Date:		April 2008
+KernelVersion:	2.6.26
+Contact:	Liam Girdwood <lg@opensource.wolfsonmicro.com>
+Description:
+		Each regulator directory will contain a field called
+		opmode. This holds the regulator operating mode setting.
+
+		The opmode value can be one of the following strings:
+
+		'fast'
+		'normal'
+		'idle'
+		'standby'
+		'unknown'
+
+		The modes are described in include/linux/regulator/regulator.h
+
+		NOTE: This value should not be used to determine the regulator
+		output operating mode as this value is the same regardless of
+		whether the regulator is enabled or disabled.
+
+
+What:		/sys/class/regulator/.../min_microvolts
+Date:		April 2008
+KernelVersion:	2.6.26
+Contact:	Liam Girdwood <lg@opensource.wolfsonmicro.com>
+Description:
+		Each regulator directory will contain a field called
+		min_microvolts. This holds the minimum safe working regulator
+		output voltage setting for this domain measured in microvolts.
+
+		NOTE: this will return the string 'constraint not defined' if
+		the power domain has no min microvolts constraint defined by
+		platform code.
+
+
+What:		/sys/class/regulator/.../max_microvolts
+Date:		April 2008
+KernelVersion:	2.6.26
+Contact:	Liam Girdwood <lg@opensource.wolfsonmicro.com>
+Description:
+		Each regulator directory will contain a field called
+		max_microvolts. This holds the maximum safe working regulator
+		output voltage setting for this domain measured in microvolts.
+
+		NOTE: this will return the string 'constraint not defined' if
+		the power domain has no max microvolts constraint defined by
+		platform code.
+
+
+What:		/sys/class/regulator/.../min_microamps
+Date:		April 2008
+KernelVersion:	2.6.26
+Contact:	Liam Girdwood <lg@opensource.wolfsonmicro.com>
+Description:
+		Each regulator directory will contain a field called
+		min_microamps. This holds the minimum safe working regulator
+		output current limit setting for this domain measured in
+		microamps.
+
+		NOTE: this will return the string 'constraint not defined' if
+		the power domain has no min microamps constraint defined by
+		platform code.
+
+
+What:		/sys/class/regulator/.../max_microamps
+Date:		April 2008
+KernelVersion:	2.6.26
+Contact:	Liam Girdwood <lg@opensource.wolfsonmicro.com>
+Description:
+		Each regulator directory will contain a field called
+		max_microamps. This holds the maximum safe working regulator
+		output current limit setting for this domain measured in
+		microamps.
+
+		NOTE: this will return the string 'constraint not defined' if
+		the power domain has no max microamps constraint defined by
+		platform code.
+
+
+What:		/sys/class/regulator/.../num_users
+Date:		April 2008
+KernelVersion:	2.6.26
+Contact:	Liam Girdwood <lg@opensource.wolfsonmicro.com>
+Description:
+		Each regulator directory will contain a field called
+		num_users. This holds the number of consumer devices that
+		have called regulator_enable() on this regulator.
+
+
+What:		/sys/class/regulator/.../requested_microamps
+Date:		April 2008
+KernelVersion:	2.6.26
+Contact:	Liam Girdwood <lg@opensource.wolfsonmicro.com>
+Description:
+		Each regulator directory will contain a field called
+		requested_microamps. This holds the total requested load
+		current in microamps for this regulator from all its consumer
+		devices.
+
+
+What:		/sys/class/regulator/.../parent
+Date:		April 2008
+KernelVersion:	2.6.26
+Contact:	Liam Girdwood <lg@opensource.wolfsonmicro.com>
+Description:
+		Some regulator directories will contain a link called parent.
+		This points to the parent or supply regulator if one exists.
+
+What:		/sys/class/regulator/.../suspend_mem_microvolts
+Date:		May 2008
+KernelVersion:	2.6.26
+Contact:	Liam Girdwood <lg@opensource.wolfsonmicro.com>
+Description:
+		Each regulator directory will contain a field called
+		suspend_mem_microvolts. This holds the regulator output
+		voltage setting for this domain measured in microvolts when
+		the system is suspended to memory.
+
+		NOTE: this will return the string 'not defined' if
+		the power domain has no suspend to memory voltage defined by
+		platform code.
+
+What:		/sys/class/regulator/.../suspend_disk_microvolts
+Date:		May 2008
+KernelVersion:	2.6.26
+Contact:	Liam Girdwood <lg@opensource.wolfsonmicro.com>
+Description:
+		Each regulator directory will contain a field called
+		suspend_disk_microvolts. This holds the regulator output
+		voltage setting for this domain measured in microvolts when
+		the system is suspended to disk.
+
+		NOTE: this will return the string 'not defined' if
+		the power domain has no suspend to disk voltage defined by
+		platform code.
+
+What:		/sys/class/regulator/.../suspend_standby_microvolts
+Date:		May 2008
+KernelVersion:	2.6.26
+Contact:	Liam Girdwood <lg@opensource.wolfsonmicro.com>
+Description:
+		Each regulator directory will contain a field called
+		suspend_standby_microvolts. This holds the regulator output
+		voltage setting for this domain measured in microvolts when
+		the system is suspended to standby.
+
+		NOTE: this will return the string 'not defined' if
+		the power domain has no suspend to standby voltage defined by
+		platform code.
+
+What:		/sys/class/regulator/.../suspend_mem_mode
+Date:		May 2008
+KernelVersion:	2.6.26
+Contact:	Liam Girdwood <lg@opensource.wolfsonmicro.com>
+Description:
+		Each regulator directory will contain a field called
+		suspend_mem_mode. This holds the regulator operating mode
+		setting for this domain when the system is suspended to
+		memory.
+
+		NOTE: this will return the string 'not defined' if
+		the power domain has no suspend to memory mode defined by
+		platform code.
+
+What:		/sys/class/regulator/.../suspend_disk_mode
+Date:		May 2008
+KernelVersion:	2.6.26
+Contact:	Liam Girdwood <lg@opensource.wolfsonmicro.com>
+Description:
+		Each regulator directory will contain a field called
+		suspend_disk_mode. This holds the regulator operating mode
+		setting for this domain when the system is suspended to disk.
+
+		NOTE: this will return the string 'not defined' if
+		the power domain has no suspend to disk mode defined by
+		platform code.
+
+What:		/sys/class/regulator/.../suspend_standby_mode
+Date:		May 2008
+KernelVersion:	2.6.26
+Contact:	Liam Girdwood <lg@opensource.wolfsonmicro.com>
+Description:
+		Each regulator directory will contain a field called
+		suspend_standby_mode. This holds the regulator operating mode
+		setting for this domain when the system is suspended to
+		standby.
+
+		NOTE: this will return the string 'not defined' if
+		the power domain has no suspend to standby mode defined by
+		platform code.
+
+What:		/sys/class/regulator/.../suspend_mem_state
+Date:		May 2008
+KernelVersion:	2.6.26
+Contact:	Liam Girdwood <lg@opensource.wolfsonmicro.com>
+Description:
+		Each regulator directory will contain a field called
+		suspend_mem_state. This holds the regulator operating state
+		when suspended to memory.
+
+		This will be one of the following strings:
+
+		'enabled'
+		'disabled'
+		'not defined'
+
+What:		/sys/class/regulator/.../suspend_disk_state
+Date:		May 2008
+KernelVersion:	2.6.26
+Contact:	Liam Girdwood <lg@opensource.wolfsonmicro.com>
+Description:
+		Each regulator directory will contain a field called
+		suspend_disk_state. This holds the regulator operating state
+		when suspended to disk.
+
+		This will be one of the following strings:
+
+		'enabled'
+		'disabled'
+		'not defined'
+
+What:		/sys/class/regulator/.../suspend_standby_state
+Date:		May 2008
+KernelVersion:	2.6.26
+Contact:	Liam Girdwood <lg@opensource.wolfsonmicro.com>
+Description:
+		Each regulator directory will contain a field called
+		suspend_standby_state. This holds the regulator operating
+		state when suspended to standby.
+
+		This will be one of the following strings:
+
+		'enabled'
+		'disabled'
+		'not defined'

+ 20 - 0
Documentation/ABI/testing/sysfs-dev

@@ -0,0 +1,20 @@
+What:		/sys/dev
+Date:		April 2008
+KernelVersion:	2.6.26
+Contact:	Dan Williams <dan.j.williams@intel.com>
+Description:	The /sys/dev tree provides a method to look up the sysfs
+		path for a device using the information returned from
+		stat(2).  There are two directories, 'block' and 'char',
+		beneath /sys/dev containing symbolic links with names of
+		the form "<major>:<minor>".  These links point to the
+		corresponding sysfs path for the given device.
+
+		Example:
+		$ readlink /sys/dev/block/8:32
+		../../block/sdc
+
+		Entries in /sys/dev/char and /sys/dev/block will be
+		dynamically created and destroyed as devices enter and
+		leave the system.
+
+Users:		mdadm <linux-raid@vger.kernel.org>

+ 24 - 0
Documentation/ABI/testing/sysfs-devices-memory

@@ -0,0 +1,24 @@
+What:		/sys/devices/system/memory
+Date:		June 2008
+Contact:	Badari Pulavarty <pbadari@us.ibm.com>
+Description:
+		The /sys/devices/system/memory contains a snapshot of the
+		internal state of the kernel memory blocks. Files could be
+		added or removed dynamically to represent hot-add/remove
+		operations.
+
+Users:		hotplug memory add/remove tools
+		https://w3.opensource.ibm.com/projects/powerpc-utils/
+
+What:		/sys/devices/system/memory/memoryX/removable
+Date:		June 2008
+Contact:	Badari Pulavarty <pbadari@us.ibm.com>
+Description:
+		The file /sys/devices/system/memory/memoryX/removable
+		indicates whether this memory block is removable or not.
+		This is useful for a user-level agent to determine
+		identify removable sections of the memory before attempting
+		potentially expensive hot-remove memory operation
+
+Users:		hotplug memory remove tools
+		https://w3.opensource.ibm.com/projects/powerpc-utils/

+ 6 - 0
Documentation/ABI/testing/sysfs-kernel-mm

@@ -0,0 +1,6 @@
+What:		/sys/kernel/mm
+Date:		July 2008
+Contact:	Nishanth Aravamudan <nacc@us.ibm.com>, VM maintainers
+Description:
+		/sys/kernel/mm/ should contain any and all VM
+		related information in /sys/kernel/.

+ 15 - 0
Documentation/ABI/testing/sysfs-kernel-mm-hugepages

@@ -0,0 +1,15 @@
+What:		/sys/kernel/mm/hugepages/
+Date:		June 2008
+Contact:	Nishanth Aravamudan <nacc@us.ibm.com>, hugetlb maintainers
+Description:
+		/sys/kernel/mm/hugepages/ contains a number of subdirectories
+		of the form hugepages-<size>kB, where <size> is the page size
+		of the hugepages supported by the kernel/CPU combination.
+
+		Under these directories are a number of files:
+			nr_hugepages
+			nr_overcommit_hugepages
+			free_hugepages
+			surplus_hugepages
+			resv_hugepages
+		See Documentation/vm/hugetlbpage.txt for details.

+ 23 - 19
Documentation/CodingStyle

@@ -474,25 +474,29 @@ make a good program).
 So, you can either get rid of GNU emacs, or change it to use saner
 values.  To do the latter, you can stick the following in your .emacs file:
 
-(defun linux-c-mode ()
-  "C mode with adjusted defaults for use with the Linux kernel."
-  (interactive)
-  (c-mode)
-  (c-set-style "K&R")
-  (setq tab-width 8)
-  (setq indent-tabs-mode t)
-  (setq c-basic-offset 8))
-
-This will define the M-x linux-c-mode command.  When hacking on a
-module, if you put the string -*- linux-c -*- somewhere on the first
-two lines, this mode will be automatically invoked. Also, you may want
-to add
-
-(setq auto-mode-alist (cons '("/usr/src/linux.*/.*\\.[ch]$" . linux-c-mode)
-			auto-mode-alist))
-
-to your .emacs file if you want to have linux-c-mode switched on
-automagically when you edit source files under /usr/src/linux.
+(defun c-lineup-arglist-tabs-only (ignored)
+  "Line up argument lists by tabs, not spaces"
+  (let* ((anchor (c-langelem-pos c-syntactic-element))
+	 (column (c-langelem-2nd-pos c-syntactic-element))
+	 (offset (- (1+ column) anchor))
+	 (steps (floor offset c-basic-offset)))
+    (* (max steps 1)
+       c-basic-offset)))
+
+(add-hook 'c-mode-hook
+          (lambda ()
+            (let ((filename (buffer-file-name)))
+              ;; Enable kernel mode for the appropriate files
+              (when (and filename
+                         (string-match "~/src/linux-trees" filename))
+                (setq indent-tabs-mode t)
+                (c-set-style "linux")
+                (c-set-offset 'arglist-cont-nonempty
+                              '(c-lineup-gcc-asm-reg
+                                c-lineup-arglist-tabs-only))))))
+
+This will make emacs go better with the kernel coding style for C
+files below ~/src/linux-trees.
 
 But even if you fail in getting emacs to do sane formatting, not
 everything is lost: use "indent".

+ 2 - 2
Documentation/DMA-API.txt

@@ -298,10 +298,10 @@ recommended that you never use these unless you really know what the
 cache width is.
 
 int
-dma_mapping_error(dma_addr_t dma_addr)
+dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
 
 int
-pci_dma_mapping_error(dma_addr_t dma_addr)
+pci_dma_mapping_error(struct pci_dev *hwdev, dma_addr_t dma_addr)
 
 In some circumstances dma_map_single and dma_map_page will fail to create
 a mapping. A driver can check for these errors by testing the returned

+ 9 - 0
Documentation/DMA-attributes.txt

@@ -22,3 +22,12 @@ ready and available in memory.  The DMA of the "completion indication"
 could race with data DMA.  Mapping the memory used for completion
 indications with DMA_ATTR_WRITE_BARRIER would prevent the race.
 
+DMA_ATTR_WEAK_ORDERING
+----------------------
+
+DMA_ATTR_WEAK_ORDERING specifies that reads and writes to the mapping
+may be weakly ordered, that is that reads and writes may pass each other.
+
+Since it is optional for platforms to implement DMA_ATTR_WEAK_ORDERING,
+those that do not will simply ignore the attribute and exhibit default
+behavior.

+ 1 - 1
Documentation/DocBook/Makefile

@@ -12,7 +12,7 @@ DOCBOOKS := wanbook.xml z8530book.xml mcabook.xml videobook.xml \
 	    kernel-api.xml filesystems.xml lsm.xml usb.xml kgdb.xml \
 	    gadget.xml libata.xml mtdnand.xml librs.xml rapidio.xml \
 	    genericirq.xml s390-drivers.xml uio-howto.xml scsi.xml \
-	    mac80211.xml debugobjects.xml
+	    mac80211.xml debugobjects.xml sh.xml
 
 ###
 # The build process is as follows (targets):

+ 38 - 0
Documentation/DocBook/gadget.tmpl

@@ -524,6 +524,44 @@ These utilities include endpoint autoconfiguration.
 <!-- !Edrivers/usb/gadget/epautoconf.c -->
 </sect1>
 
+<sect1 id="composite"><title>Composite Device Framework</title>
+
+<para>The core API is sufficient for writing drivers for composite
+USB devices (with more than one function in a given configuration),
+and also multi-configuration devices (also more than one function,
+but not necessarily sharing a given configuration).
+There is however an optional framework which makes it easier to
+reuse and combine functions.
+</para>
+
+<para>Devices using this framework provide a <emphasis>struct
+usb_composite_driver</emphasis>, which in turn provides one or
+more <emphasis>struct usb_configuration</emphasis> instances.
+Each such configuration includes at least one
+<emphasis>struct usb_function</emphasis>, which packages a user
+visible role such as "network link" or "mass storage device".
+Management functions may also exist, such as "Device Firmware
+Upgrade".
+</para>
+
+!Iinclude/linux/usb/composite.h
+!Edrivers/usb/gadget/composite.c
+
+</sect1>
+
+<sect1 id="functions"><title>Composite Device Functions</title>
+
+<para>At this writing, a few of the current gadget drivers have
+been converted to this framework.
+Near-term plans include converting all of them, except for "gadgetfs".
+</para>
+
+!Edrivers/usb/gadget/f_acm.c
+!Edrivers/usb/gadget/f_serial.c
+
+</sect1>
+
+
 </chapter>
 
 <chapter id="controllers"><title>Peripheral Controller Drivers</title>

+ 24 - 33
Documentation/DocBook/kernel-locking.tmpl

@@ -219,10 +219,10 @@
    </para>
 
    <sect1 id="lock-intro">
-   <title>Three Main Types of Kernel Locks: Spinlocks, Mutexes and Semaphores</title>
+   <title>Two Main Types of Kernel Locks: Spinlocks and Mutexes</title>
 
    <para>
-     There are three main types of kernel locks.  The fundamental type
+     There are two main types of kernel locks.  The fundamental type
      is the spinlock 
      (<filename class="headerfile">include/asm/spinlock.h</filename>),
      which is a very simple single-holder lock: if you can't get the 
@@ -239,14 +239,6 @@
      can't sleep (see <xref linkend="sleeping-things"/>), and so have to
      use a spinlock instead.
    </para>
-   <para>
-     The third type is a semaphore
-     (<filename class="headerfile">include/linux/semaphore.h</filename>): it
-     can have more than one holder at any time (the number decided at
-     initialization time), although it is most commonly used as a
-     single-holder lock (a mutex).  If you can't get a semaphore, your
-     task will be suspended and later on woken up - just like for mutexes.
-   </para>
    <para>
      Neither type of lock is recursive: see
      <xref linkend="deadlock"/>.
@@ -278,7 +270,7 @@
     </para>
 
     <para>
-      Semaphores still exist, because they are required for
+      Mutexes still exist, because they are required for
       synchronization between <firstterm linkend="gloss-usercontext">user 
       contexts</firstterm>, as we will see below.
     </para>
@@ -289,18 +281,17 @@
 
      <para>
        If you have a data structure which is only ever accessed from
-       user context, then you can use a simple semaphore
-       (<filename>linux/linux/semaphore.h</filename>) to protect it.  This
-       is the most trivial case: you initialize the semaphore to the number 
-       of resources available (usually 1), and call
-       <function>down_interruptible()</function> to grab the semaphore, and 
-       <function>up()</function> to release it.  There is also a 
-       <function>down()</function>, which should be avoided, because it 
+       user context, then you can use a simple mutex
+       (<filename>include/linux/mutex.h</filename>) to protect it.  This
+       is the most trivial case: you initialize the mutex.  Then you can
+       call <function>mutex_lock_interruptible()</function> to grab the mutex,
+       and <function>mutex_unlock()</function> to release it.  There is also a 
+       <function>mutex_lock()</function>, which should be avoided, because it 
        will not return if a signal is received.
      </para>
 
      <para>
-       Example: <filename>linux/net/core/netfilter.c</filename> allows 
+       Example: <filename>net/netfilter/nf_sockopt.c</filename> allows 
        registration of new <function>setsockopt()</function> and 
        <function>getsockopt()</function> calls, with
        <function>nf_register_sockopt()</function>.  Registration and 
@@ -515,7 +506,7 @@
       <listitem>
 	<para>
           If you are in a process context (any syscall) and want to
-	lock other process out, use a semaphore.  You can take a semaphore
+	lock other process out, use a mutex.  You can take a mutex
 	and sleep (<function>copy_from_user*(</function> or
 	<function>kmalloc(x,GFP_KERNEL)</function>).
       </para>
@@ -662,7 +653,7 @@
 <entry>SLBH</entry>
 <entry>SLBH</entry>
 <entry>SLBH</entry>
-<entry>DI</entry>
+<entry>MLI</entry>
 <entry>None</entry>
 </row>
 
@@ -692,8 +683,8 @@
 <entry>spin_lock_bh</entry>
 </row>
 <row>
-<entry>DI</entry>
-<entry>down_interruptible</entry>
+<entry>MLI</entry>
+<entry>mutex_lock_interruptible</entry>
 </row>
 
 </tbody>
@@ -1310,7 +1301,7 @@ as Alan Cox says, <quote>Lock data, not code</quote>.
     <para>
       There is a coding bug where a piece of code tries to grab a
       spinlock twice: it will spin forever, waiting for the lock to
-      be released (spinlocks, rwlocks and semaphores are not
+      be released (spinlocks, rwlocks and mutexes are not
       recursive in Linux).  This is trivial to diagnose: not a
       stay-up-five-nights-talk-to-fluffy-code-bunnies kind of
       problem.
@@ -1335,7 +1326,7 @@ as Alan Cox says, <quote>Lock data, not code</quote>.
 
     <para>
       This complete lockup is easy to diagnose: on SMP boxes the
-      watchdog timer or compiling with <symbol>DEBUG_SPINLOCKS</symbol> set
+      watchdog timer or compiling with <symbol>DEBUG_SPINLOCK</symbol> set
       (<filename>include/linux/spinlock.h</filename>) will show this up 
       immediately when it happens.
     </para>
@@ -1558,7 +1549,7 @@ the amount of locking which needs to be done.
    <title>Read/Write Lock Variants</title>
 
    <para>
-      Both spinlocks and semaphores have read/write variants:
+      Both spinlocks and mutexes have read/write variants:
       <type>rwlock_t</type> and <structname>struct rw_semaphore</structname>.
       These divide users into two classes: the readers and the writers.  If
       you are only reading the data, you can get a read lock, but to write to
@@ -1681,7 +1672,7 @@ the amount of locking which needs to be done.
  #include &lt;linux/slab.h&gt;
  #include &lt;linux/string.h&gt;
 +#include &lt;linux/rcupdate.h&gt;
- #include &lt;linux/semaphore.h&gt;
+ #include &lt;linux/mutex.h&gt;
  #include &lt;asm/errno.h&gt;
 
  struct object
@@ -1913,7 +1904,7 @@ machines due to caching.
        </listitem>
        <listitem>
         <para>
-          <function> put_user()</function>
+          <function>put_user()</function>
         </para>
        </listitem>
       </itemizedlist>
@@ -1927,13 +1918,13 @@ machines due to caching.
 
      <listitem>
       <para>
-      <function>down_interruptible()</function> and
-      <function>down()</function>
+      <function>mutex_lock_interruptible()</function> and
+      <function>mutex_lock()</function>
       </para>
       <para>
-       There is a <function>down_trylock()</function> which can be
+       There is a <function>mutex_trylock()</function> which can be
        used inside interrupt context, as it will not sleep.
-       <function>up()</function> will also never sleep.
+       <function>mutex_unlock()</function> will also never sleep.
       </para>
      </listitem>
     </itemizedlist>
@@ -2023,7 +2014,7 @@ machines due to caching.
       <para>
         Prior to 2.5, or when <symbol>CONFIG_PREEMPT</symbol> is
         unset, processes in user context inside the kernel would not
-        preempt each other (ie. you had that CPU until you have it up,
+        preempt each other (ie. you had that CPU until you gave it up,
         except for interrupts).  With the addition of
         <symbol>CONFIG_PREEMPT</symbol> in 2.5.4, this changed: when
         in user context, higher priority tasks can "cut in": spinlocks

+ 18 - 0
Documentation/DocBook/kgdb.tmpl

@@ -98,6 +98,24 @@
     "Kernel debugging" select "KGDB: kernel debugging with remote gdb".
     </para>
     <para>
+    It is advised, but not required that you turn on the
+    CONFIG_FRAME_POINTER kernel option.  This option inserts code to
+    into the compiled executable which saves the frame information in
+    registers or on the stack at different points which will allow a
+    debugger such as gdb to more accurately construct stack back traces
+    while debugging the kernel.
+    </para>
+    <para>
+    If the architecture that you are using supports the kernel option
+    CONFIG_DEBUG_RODATA, you should consider turning it off.  This
+    option will prevent the use of software breakpoints because it
+    marks certain regions of the kernel's memory space as read-only.
+    If kgdb supports it for the architecture you are using, you can
+    use hardware breakpoints if you desire to run with the
+    CONFIG_DEBUG_RODATA option turned on, else you need to turn off
+    this option.
+    </para>
+    <para>
     Next you should choose one of more I/O drivers to interconnect debugging
     host and debugged target.  Early boot debugging requires a KGDB
     I/O driver that supports early debugging and the driver must be

+ 2 - 2
Documentation/DocBook/procfs-guide.tmpl

@@ -29,12 +29,12 @@
 
     <revhistory>
       <revision>
-	<revnumber>1.0&nbsp;</revnumber>
+	<revnumber>1.0</revnumber>
 	<date>May 30, 2001</date>
 	<revremark>Initial revision posted to linux-kernel</revremark>
       </revision>
       <revision>
-	<revnumber>1.1&nbsp;</revnumber>
+	<revnumber>1.1</revnumber>
 	<date>June 3, 2001</date>
 	<revremark>Revised after comments from linux-kernel</revremark>
       </revision>

+ 4 - 4
Documentation/DocBook/s390-drivers.tmpl

@@ -100,7 +100,7 @@
       the hardware structures represented here, please consult the Principles
       of Operation.
     </para>
-!Iinclude/asm-s390/cio.h
+!Iarch/s390/include/asm/cio.h
     </sect1>
     <sect1 id="ccwdev">
      <title>ccw devices</title>
@@ -114,7 +114,7 @@
       ccw device structure. Device drivers must not bypass those functions
       or strange side effects may happen.
     </para>
-!Iinclude/asm-s390/ccwdev.h
+!Iarch/s390/include/asm/ccwdev.h
 !Edrivers/s390/cio/device.c
 !Edrivers/s390/cio/device_ops.c
     </sect1>
@@ -125,7 +125,7 @@
 	measurement data which is made available by the channel subsystem
 	for each channel attached device.
   </para>
-!Iinclude/asm-s390/cmb.h
+!Iarch/s390/include/asm/cmb.h
 !Edrivers/s390/cio/cmf.c
     </sect1>
   </chapter>
@@ -142,7 +142,7 @@
   </para>
    <sect1 id="ccwgroupdevices">
     <title>ccw group devices</title>
-!Iinclude/asm-s390/ccwgroup.h
+!Iarch/s390/include/asm/ccwgroup.h
 !Edrivers/s390/cio/ccwgroup.c
    </sect1>
   </chapter>

+ 105 - 0
Documentation/DocBook/sh.tmpl

@@ -0,0 +1,105 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
+	"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
+
+<book id="sh-drivers">
+ <bookinfo>
+  <title>SuperH Interfaces Guide</title>
+  
+  <authorgroup>
+   <author>
+    <firstname>Paul</firstname>
+    <surname>Mundt</surname>
+    <affiliation>
+     <address>
+      <email>lethal@linux-sh.org</email>
+     </address>
+    </affiliation>
+   </author>
+  </authorgroup>
+
+  <copyright>
+   <year>2008</year>
+   <holder>Paul Mundt</holder>
+  </copyright>
+  <copyright>
+   <year>2008</year>
+   <holder>Renesas Technology Corp.</holder>
+  </copyright>
+
+  <legalnotice>
+   <para>
+     This documentation is free software; you can redistribute
+     it and/or modify it under the terms of the GNU General Public
+     License version 2 as published by the Free Software Foundation.
+   </para>
+      
+   <para>
+     This program is distributed in the hope that it will be
+     useful, but WITHOUT ANY WARRANTY; without even the implied
+     warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+     See the GNU General Public License for more details.
+   </para>
+      
+   <para>
+     You should have received a copy of the GNU General Public
+     License along with this program; if not, write to the Free
+     Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
+     MA 02111-1307 USA
+   </para>
+      
+   <para>
+     For more details see the file COPYING in the source
+     distribution of Linux.
+   </para>
+  </legalnotice>
+ </bookinfo>
+
+<toc></toc>
+
+  <chapter id="mm">
+    <title>Memory Management</title>
+    <sect1 id="sh4">
+    <title>SH-4</title>
+      <sect2 id="sq">
+        <title>Store Queue API</title>
+!Earch/sh/kernel/cpu/sh4/sq.c
+      </sect2>
+    </sect1>
+    <sect1 id="sh5">
+      <title>SH-5</title>
+      <sect2 id="tlb">
+	<title>TLB Interfaces</title>
+!Iarch/sh/mm/tlb-sh5.c
+!Iarch/sh/include/asm/tlb_64.h
+      </sect2>
+    </sect1>
+  </chapter>
+  <chapter id="clk">
+    <title>Clock Framework Extensions</title>
+!Iarch/sh/include/asm/clock.h
+  </chapter>
+  <chapter id="mach">
+    <title>Machine Specific Interfaces</title>
+    <sect1 id="dreamcast">
+      <title>mach-dreamcast</title>
+!Iarch/sh/boards/mach-dreamcast/rtc.c
+    </sect1>
+    <sect1 id="x3proto">
+      <title>mach-x3proto</title>
+!Earch/sh/boards/mach-x3proto/ilsel.c
+    </sect1>
+  </chapter>
+  <chapter id="busses">
+    <title>Busses</title>
+    <sect1 id="superhyway">
+      <title>SuperHyway</title>
+!Edrivers/sh/superhyway/superhyway.c
+    </sect1>
+
+    <sect1 id="maple">
+      <title>Maple</title>
+!Edrivers/sh/maple/maple.c
+    </sect1>
+  </chapter>
+</book>

+ 51 - 12
Documentation/DocBook/uio-howto.tmpl

@@ -21,6 +21,18 @@
     </affiliation>
 </author>
 
+<copyright>
+	<year>2006-2008</year>
+	<holder>Hans-Jürgen Koch.</holder>
+</copyright>
+
+<legalnotice>
+<para>
+This documentation is Free Software licensed under the terms of the
+GPL version 2.
+</para>
+</legalnotice>
+
 <pubdate>2006-12-11</pubdate>
 
 <abstract>
@@ -29,6 +41,12 @@
 </abstract>
 
 <revhistory>
+	<revision>
+	<revnumber>0.5</revnumber>
+	<date>2008-05-22</date>
+	<authorinitials>hjk</authorinitials>
+	<revremark>Added description of write() function.</revremark>
+	</revision>
 	<revision>
 	<revnumber>0.4</revnumber>
 	<date>2007-11-26</date>
@@ -57,20 +75,9 @@
 </bookinfo>
 
 <chapter id="aboutthisdoc">
-<?dbhtml filename="about.html"?>
+<?dbhtml filename="aboutthis.html"?>
 <title>About this document</title>
 
-<sect1 id="copyright">
-<?dbhtml filename="copyright.html"?>
-<title>Copyright and License</title>
-<para>
-      Copyright (c) 2006 by Hans-Jürgen Koch.</para>
-<para>
-This documentation is Free Software licensed under the terms of the
-GPL version 2.
-</para>
-</sect1>
-
 <sect1 id="translations">
 <?dbhtml filename="translations.html"?>
 <title>Translations</title>
@@ -189,6 +196,30 @@ interested in translating it, please email me
 	represents the total interrupt count. You can use this number
 	to figure out if you missed some interrupts.
 	</para>
+	<para>
+	For some hardware that has more than one interrupt source internally,
+	but not separate IRQ mask and status registers, there might be
+	situations where userspace cannot determine what the interrupt source
+	was if the kernel handler disables them by writing to the chip's IRQ
+	register. In such a case, the kernel has to disable the IRQ completely
+	to leave the chip's register untouched. Now the userspace part can
+	determine the cause of the interrupt, but it cannot re-enable
+	interrupts. Another cornercase is chips where re-enabling interrupts
+	is a read-modify-write operation to a combined IRQ status/acknowledge
+	register. This would be racy if a new interrupt occurred
+	simultaneously.
+	</para>
+	<para>
+	To address these problems, UIO also implements a write() function. It
+	is normally not used and can be ignored for hardware that has only a
+	single interrupt source or has separate IRQ mask and status registers.
+	If you need it, however, a write to <filename>/dev/uioX</filename>
+	will call the <function>irqcontrol()</function> function implemented
+	by the driver. You have to write a 32-bit value that is usually either
+	0 or 1 to disable or enable interrupts. If a driver does not implement
+	<function>irqcontrol()</function>, <function>write()</function> will
+	return with <varname>-ENOSYS</varname>.
+	</para>
 
 	<para>
 	To handle interrupts properly, your custom kernel module can
@@ -362,6 +393,14 @@ device is actually used.
 <function>open()</function>, you will probably also want a custom
 <function>release()</function> function.
 </para></listitem>
+
+<listitem><para>
+<varname>int (*irqcontrol)(struct uio_info *info, s32 irq_on)
+</varname>: Optional. If you need to be able to enable or disable
+interrupts from userspace by writing to <filename>/dev/uioX</filename>,
+you can implement this function. The parameter <varname>irq_on</varname>
+will be 0 to disable interrupts and 1 to enable them.
+</para></listitem>
 </itemizedlist>
 
 <para>

+ 1 - 1
Documentation/DocBook/videobook.tmpl

@@ -1648,7 +1648,7 @@ static struct video_buffer capture_fb;
 
   <chapter id="pubfunctions">
      <title>Public Functions Provided</title>
-!Edrivers/media/video/videodev.c
+!Edrivers/media/video/v4l2-dev.c
   </chapter>
 
 </book>

+ 12 - 26
Documentation/DocBook/z8530book.tmpl

@@ -69,12 +69,6 @@
 	device to be used as both a tty interface and as a synchronous 
 	controller is a project for Linux post the 2.4 release
   </para>
-  <para>
-	The support code handles most common card configurations and
-	supports running both Cisco HDLC and Synchronous PPP. With extra
-	glue the frame relay and X.25 protocols can also be used with this
-	driver.
-  </para>
   </chapter>
   
   <chapter id="Driver_Modes">
@@ -179,35 +173,27 @@
   <para>
 	If you wish to use the network interface facilities of the driver,
 	then you need to attach a network device to each channel that is
-	present and in use. In addition to use the SyncPPP and Cisco HDLC
+	present and in use. In addition to use the generic HDLC
 	you need to follow some additional plumbing rules. They may seem 
 	complex but a look at the example hostess_sv11 driver should
 	reassure you.
   </para>
   <para>
 	The network device used for each channel should be pointed to by
-	the netdevice field of each channel. The dev-&gt; priv field of the
+	the netdevice field of each channel. The hdlc-&gt; priv field of the
 	network device points to your private data - you will need to be
-	able to find your ppp device from this. In addition to use the
-	sync ppp layer the private data must start with a void * pointer
-	to the syncppp structures.
+	able to find your private data from this.
   </para>
   <para>
 	The way most drivers approach this particular problem is to
 	create a structure holding the Z8530 device definition and
-	put that and the syncppp pointer into the private field of
-	the network device. The network device fields of the channels
-	then point back to the network devices. The ppp_device can also
-	be put in the private structure conveniently.
+	put that into the private field of the network device. The
+	network device fields of the channels then point back to the
+	network devices.
   </para>
   <para>
-	If you wish to use the synchronous ppp then you need to attach
-	the syncppp layer to the network device. You should do this before
-	you register the network device. The
-	<function>sppp_attach</function> requires that the first void *
-	pointer in your private data is pointing to an empty struct
-	ppp_device. The function fills in the initial data for the
-	ppp/hdlc layer.
+	If you wish to use the generic HDLC then you need to register
+	the HDLC device.
   </para>
   <para>
 	Before you register your network device you will also need to
@@ -314,10 +300,10 @@
 	buffer in sk_buff format and queues it for transmission. The
 	caller must provide the entire packet with the exception of the
 	bitstuffing and CRC. This is normally done by the caller via
-	the syncppp interface layer. It returns 0 if the buffer has been 
-        queued and non zero values  for queue full. If the function accepts 
-	the buffer it becomes property of the Z8530 layer and the caller 
-	should not free it. 
+	the generic HDLC interface layer. It returns 0 if the buffer has been
+	queued and non zero values for queue full. If the function accepts
+	the buffer it becomes property of the Z8530 layer and the caller
+	should not free it.
   </para>
   <para>
 	The function <function>z8530_get_stats</function> returns a pointer

+ 1 - 1
Documentation/HOWTO

@@ -358,7 +358,7 @@ Here is a list of some of the different kernel trees available:
     - pcmcia, Dominik Brodowski <linux@dominikbrodowski.net>
 	git.kernel.org:/pub/scm/linux/kernel/git/brodo/pcmcia-2.6.git
 
-    - SCSI, James Bottomley <James.Bottomley@SteelEye.com>
+    - SCSI, James Bottomley <James.Bottomley@hansenpartnership.com>
 	git.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6.git
 
     - x86, Ingo Molnar <mingo@elte.hu>

+ 2 - 2
Documentation/Intel-IOMMU.txt

@@ -48,7 +48,7 @@ IOVA generation is pretty generic. We used the same technique as vmalloc()
 but these are not global address spaces, but separate for each domain.
 Different DMA engines may support different number of domains.
 
-We also allocate gaurd pages with each mapping, so we can attempt to catch
+We also allocate guard pages with each mapping, so we can attempt to catch
 any overflow that might happen.
 
 
@@ -112,4 +112,4 @@ TBD
 
 - For compatibility testing, could use unity map domain for all devices, just
   provide a 1-1 for all useful memory under a single domain for all devices.
-- API for paravirt ops for abstracting functionlity for VMM folks.
+- API for paravirt ops for abstracting functionality for VMM folks.

+ 26 - 0
Documentation/SubmittingPatches

@@ -528,7 +528,33 @@ See more details on the proper patch format in the following
 references.
 
 
+16) Sending "git pull" requests  (from Linus emails)
 
+Please write the git repo address and branch name alone on the same line
+so that I can't even by mistake pull from the wrong branch, and so
+that a triple-click just selects the whole thing.
+
+So the proper format is something along the lines of:
+
+	"Please pull from
+
+		git://jdelvare.pck.nerim.net/jdelvare-2.6 i2c-for-linus
+
+	 to get these changes:"
+
+so that I don't have to hunt-and-peck for the address and inevitably
+get it wrong (actually, I've only gotten it wrong a few times, and
+checking against the diffstat tells me when I get it wrong, but I'm
+just a lot more comfortable when I don't have to "look for" the right
+thing to pull, and double-check that I have the right branch-name).
+
+
+Please use "git diff -M --stat --summary" to generate the diffstat:
+the -M enables rename detection, and the summary enables a summary of
+new/deleted or renamed files.
+
+With rename detection, the statistics are rather different [...]
+because git will notice that a fair number of the changes are renames.
 
 -----------------------------------
 SECTION 2 - HINTS, TIPS, AND TRICKS

+ 8 - 3
Documentation/accounting/delay-accounting.txt

@@ -11,6 +11,7 @@ the delays experienced by a task while
 a) waiting for a CPU (while being runnable)
 b) completion of synchronous block I/O initiated by the task
 c) swapping in pages
+d) memory reclaim
 
 and makes these statistics available to userspace through
 the taskstats interface.
@@ -41,7 +42,7 @@ this structure. See
      include/linux/taskstats.h
 for a description of the fields pertaining to delay accounting.
 It will generally be in the form of counters returning the cumulative
-delay seen for cpu, sync block I/O, swapin etc.
+delay seen for cpu, sync block I/O, swapin, memory reclaim etc.
 
 Taking the difference of two successive readings of a given
 counter (say cpu_delay_total) for a task will give the delay
@@ -94,7 +95,9 @@ CPU	count	real total	virtual total	delay total
 	7876	92005750	100000000	24001500
 IO	count	delay total
 	0	0
-MEM	count	delay total
+SWAP	count	delay total
+	0	0
+RECLAIM	count	delay total
 	0	0
 
 Get delays seen in executing a given simple command
@@ -108,5 +111,7 @@ CPU	count	real total	virtual total	delay total
 	6	4000250		4000000		0
 IO	count	delay total
 	0	0
-MEM	count	delay total
+SWAP	count	delay total
+	0	0
+RECLAIM	count	delay total
 	0	0

+ 6 - 2
Documentation/accounting/getdelays.c

@@ -196,14 +196,18 @@ void print_delayacct(struct taskstats *t)
 	       "      %15llu%15llu%15llu%15llu\n"
 	       "IO    %15s%15s\n"
 	       "      %15llu%15llu\n"
-	       "MEM   %15s%15s\n"
+	       "SWAP  %15s%15s\n"
+	       "      %15llu%15llu\n"
+	       "RECLAIM  %12s%15s\n"
 	       "      %15llu%15llu\n",
 	       "count", "real total", "virtual total", "delay total",
 	       t->cpu_count, t->cpu_run_real_total, t->cpu_run_virtual_total,
 	       t->cpu_delay_total,
 	       "count", "delay total",
 	       t->blkio_count, t->blkio_delay_total,
-	       "count", "delay total", t->swapin_count, t->swapin_delay_total);
+	       "count", "delay total", t->swapin_count, t->swapin_delay_total,
+	       "count", "delay total",
+	       t->freepages_count, t->freepages_delay_total);
 }
 
 void task_context_switch_counts(struct taskstats *t)

+ 8 - 1
Documentation/accounting/taskstats-struct.txt

@@ -6,7 +6,7 @@ This document contains an explanation of the struct taskstats fields.
 There are three different groups of fields in the struct taskstats:
 
 1) Common and basic accounting fields
-    If CONFIG_TASKSTATS is set, the taskstats inteface is enabled and
+    If CONFIG_TASKSTATS is set, the taskstats interface is enabled and
     the common fields and basic accounting fields are collected for
     delivery at do_exit() of a task.
 2) Delay accounting fields
@@ -26,6 +26,8 @@ There are three different groups of fields in the struct taskstats:
 
 5) Time accounting for SMT machines
 
+6) Extended delay accounting fields for memory reclaim
+
 Future extension should add fields to the end of the taskstats struct, and
 should not change the relative position of each field within the struct.
 
@@ -170,4 +172,9 @@ struct taskstats {
 	__u64	ac_utimescaled;		/* utime scaled on frequency etc */
 	__u64	ac_stimescaled;		/* stime scaled on frequency etc */
 	__u64	cpu_scaled_run_real_total; /* scaled cpu_run_real_total */
+
+6) Extended delay accounting fields for memory reclaim
+	/* Delay waiting for memory reclaim */
+	__u64	freepages_count;
+	__u64	freepages_delay_total;
 }

+ 1 - 1
Documentation/arm/IXP4xx

@@ -32,7 +32,7 @@ Linux currently supports the following features on the IXP4xx chips:
 - Flash access (MTD/JFFS)
 - I2C through GPIO on IXP42x
 - GPIO for input/output/interrupts 
-  See include/asm-arm/arch-ixp4xx/platform.h for access functions.
+  See arch/arm/mach-ixp4xx/include/mach/platform.h for access functions.
 - Timers (watchdog, OS)
 
 The following components of the chips are not supported by Linux and

+ 3 - 9
Documentation/arm/Interrupts

@@ -138,14 +138,8 @@ So, what's changed?
 
                 Set active the IRQ edge(s)/level.  This replaces the
                 SA1111 INTPOL manipulation, and the set_GPIO_IRQ_edge()
-                function.  Type should be one of the following:
-
-                #define IRQT_NOEDGE     (0)
-                #define IRQT_RISING     (__IRQT_RISEDGE)
-                #define IRQT_FALLING    (__IRQT_FALEDGE)
-                #define IRQT_BOTHEDGE   (__IRQT_RISEDGE|__IRQT_FALEDGE)
-                #define IRQT_LOW        (__IRQT_LOWLVL)
-                #define IRQT_HIGH       (__IRQT_HIGHLVL)
+                function.  Type should be one of IRQ_TYPE_xxx defined in
+		<linux/irq.h>
 
 3. set_GPIO_IRQ_edge() is obsolete, and should be replaced by set_irq_type.
 
@@ -164,7 +158,7 @@ So, what's changed?
    be re-checked for pending events.  (see the Neponset IRQ handler for
    details).
 
-7. fixup_irq() is gone, as is include/asm-arm/arch-*/irq.h
+7. fixup_irq() is gone, as is arch/arm/mach-*/include/mach/irq.h
 
 Please note that this will not solve all problems - some of them are
 hardware based.  Mixing level-based and edge-based IRQs on the same

+ 2 - 2
Documentation/arm/README

@@ -79,7 +79,7 @@ Machine/Platform support
   To this end, we now have arch/arm/mach-$(MACHINE) directories which are
   designed to house the non-driver files for a particular machine (eg, PCI,
   memory management, architecture definitions etc).  For all future
-  machines, there should be a corresponding include/asm-arm/arch-$(MACHINE)
+  machines, there should be a corresponding arch/arm/mach-$(MACHINE)/include/mach
   directory.
 
 
@@ -176,7 +176,7 @@ Kernel entry (head.S)
   class typically based around one or more system on a chip devices, and
   acts as a natural container around the actual implementations.  These
   classes are given directories - arch/arm/mach-<class> and
-  include/asm-arm/arch-<class> - which contain the source files to
+  arch/arm/mach-<class> - which contain the source files to/include/mach
   support the machine class.  This directories also contain any machine
   specific supporting code.
 

+ 4 - 4
Documentation/arm/Samsung-S3C24XX/GPIO.txt

@@ -16,13 +16,13 @@ Introduction
 Headers
 -------
 
-  See include/asm-arm/arch-s3c2410/regs-gpio.h for the list
+  See arch/arm/mach-s3c2410/include/mach/regs-gpio.h for the list
   of GPIO pins, and the configuration values for them. This
-  is included by using #include <asm/arch/regs-gpio.h>
+  is included by using #include <mach/regs-gpio.h>
 
   The GPIO management functions are defined in the hardware
-  header include/asm-arm/arch-s3c2410/hardware.h which can be
-  included by #include <asm/arch/hardware.h>
+  header arch/arm/mach-s3c2410/include/mach/hardware.h which can be
+  included by #include <mach/hardware.h>
 
   A useful amount of documentation can be found in the hardware
   header on how the GPIO functions (and others) work.

+ 1 - 1
Documentation/arm/Samsung-S3C24XX/Overview.txt

@@ -36,7 +36,7 @@ Layout
   in arch/arm/mach-s3c2410 and S3C2440 in arch/arm/mach-s3c2440
 
   Register, kernel and platform data definitions are held in the
-  include/asm-arm/arch-s3c2410 directory.
+  arch/arm/mach-s3c2410 directory./include/mach
 
 
 Machines

+ 1 - 1
Documentation/arm/Samsung-S3C24XX/USB-Host.txt

@@ -49,7 +49,7 @@ Board Support
 Platform Data
 -------------
 
-  See linux/include/asm-arm/arch-s3c2410/usb-control.h for the
+  See arch/arm/mach-s3c2410/include/mach/usb-control.h for the
   descriptions of the platform device data. An implementation
   can be found in linux/arch/arm/mach-s3c2410/usb-simtec.c .
 

+ 67 - 0
Documentation/bt8xxgpio.txt

@@ -0,0 +1,67 @@
+===============================================================
+==  BT8XXGPIO driver                                         ==
+==                                                           ==
+==  A driver for a selfmade cheap BT8xx based PCI GPIO-card  ==
+==                                                           ==
+==  For advanced documentation, see                          ==
+==  http://www.bu3sch.de/btgpio.php                          ==
+===============================================================
+
+
+A generic digital 24-port PCI GPIO card can be built out of an ordinary
+Brooktree bt848, bt849, bt878 or bt879 based analog TV tuner card. The
+Brooktree chip is used in old analog Hauppauge WinTV PCI cards. You can easily
+find them used for low prices on the net.
+
+The bt8xx chip does have 24 digital GPIO ports.
+These ports are accessible via 24 pins on the SMD chip package.
+
+
+==============================================
+==  How to physically access the GPIO pins  ==
+==============================================
+
+The are several ways to access these pins. One might unsolder the whole chip
+and put it on a custom PCI board, or one might only unsolder each individual
+GPIO pin and solder that to some tiny wire. As the chip package really is tiny
+there are some advanced soldering skills needed in any case.
+
+The physical pinouts are drawn in the following ASCII art.
+The GPIO pins are marked with G00-G23
+
+                                           G G G G G G G G G G G G     G G G G G G
+                                           0 0 0 0 0 0 0 0 0 0 1 1     1 1 1 1 1 1
+                                           0 1 2 3 4 5 6 7 8 9 0 1     2 3 4 5 6 7
+           | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
+           ---------------------------------------------------------------------------
+         --|                               ^                                     ^   |--
+         --|                               pin 86                           pin 67   |--
+         --|                                                                         |--
+         --|                                                               pin 61 >  |-- G18
+         --|                                                                         |-- G19
+         --|                                                                         |-- G20
+         --|                                                                         |-- G21
+         --|                                                                         |-- G22
+         --|                                                               pin 56 >  |-- G23
+         --|                                                                         |--
+         --|                           Brooktree 878/879                             |--
+         --|                                                                         |--
+         --|                                                                         |--
+         --|                                                                         |--
+         --|                                                                         |--
+         --|                                                                         |--
+         --|                                                                         |--
+         --|                                                                         |--
+         --|                                                                         |--
+         --|                                                                         |--
+         --|                                                                         |--
+         --|                                                                         |--
+         --|                                                                         |--
+         --|                                                                         |--
+         --|   O                                                                     |--
+         --|                                                                         |--
+           ---------------------------------------------------------------------------
+           | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
+           ^
+           This is pin 1
+

+ 6 - 15
Documentation/cciss.txt

@@ -112,27 +112,18 @@ Hot plug support for SCSI tape drives
 
 Hot plugging of SCSI tape drives is supported, with some caveats.
 The cciss driver must be informed that changes to the SCSI bus
-have been made, in addition to and prior to informing the SCSI 
-mid layer.  This may be done via the /proc filesystem.  For example:
+have been made.  This may be done via the /proc filesystem.
+For example:
 
 	echo "rescan" > /proc/scsi/cciss0/1
 
-This causes the adapter to query the adapter about changes to the 
-physical SCSI buses and/or fibre channel arbitrated loop and the 
+This causes the driver to query the adapter about changes to the
+physical SCSI buses and/or fibre channel arbitrated loop and the
 driver to make note of any new or removed sequential access devices
 or medium changers.  The driver will output messages indicating what 
 devices have been added or removed and the controller, bus, target and 
-lun used to address the device.  Once this is done, the SCSI mid layer 
-can be informed of changes to the virtual SCSI bus which the driver 
-presents to it in the usual way. For example: 
-
-	echo scsi add-single-device 3 2 1 0 > /proc/scsi/scsi
- 
-to add a device on controller 3, bus 2, target 1, lun 0.   Note that
-the driver makes an effort to preserve the devices positions
-in the virtual SCSI bus, so if you are only moving tape drives 
-around on the same adapter and not adding or removing tape drives 
-from the adapter, informing the SCSI mid layer may not be necessary.
+lun used to address the device.  It then notifies the SCSI mid layer
+of these changes.
 
 Note that the naming convention of the /proc filesystem entries 
 contains a number in addition to the driver name.  (E.g. "cciss0" 

+ 0 - 133
Documentation/cli-sti-removal.txt

@@ -1,133 +0,0 @@
-
-#### cli()/sti() removal guide, started by Ingo Molnar <mingo@redhat.com>
-
-
-as of 2.5.28, five popular macros have been removed on SMP, and
-are being phased out on UP:
-
- cli(), sti(), save_flags(flags), save_flags_cli(flags), restore_flags(flags)
-
-until now it was possible to protect driver code against interrupt
-handlers via a cli(), but from now on other, more lightweight methods
-have to be used for synchronization, such as spinlocks or semaphores.
-
-for example, driver code that used to do something like:
-
-	struct driver_data;
-
-	irq_handler (...)
-	{
-		....
-		driver_data.finish = 1;
-		driver_data.new_work = 0;
-		....
-	}
-
-	...
-
-	ioctl_func (...)
-	{
-		...
-		cli();
-		...
-		driver_data.finish = 0;
-		driver_data.new_work = 2;
-		...
-		sti();
-		...
-	}
-
-was SMP-correct because the cli() function ensured that no
-interrupt handler (amongst them the above irq_handler()) function
-would execute while the cli()-ed section is executing.
-
-but from now on a more direct method of locking has to be used:
-
-	DEFINE_SPINLOCK(driver_lock);
-	struct driver_data;
-
-	irq_handler (...)
-	{
-		unsigned long flags;
-		....
-		spin_lock_irqsave(&driver_lock, flags);
-		....
-		driver_data.finish = 1;
-		driver_data.new_work = 0;
-		....
-		spin_unlock_irqrestore(&driver_lock, flags);
-		....
-	}
-
-	...
-
-	ioctl_func (...)
-	{
-		...
-		spin_lock_irq(&driver_lock);
-		...
-		driver_data.finish = 0;
-		driver_data.new_work = 2;
-		...
-		spin_unlock_irq(&driver_lock);
-		...
-	}
-
-the above code has a number of advantages:
-
-- the locking relation is easier to understand - actual lock usage
-  pinpoints the critical sections. cli() usage is too opaque.
-  Easier to understand means it's easier to debug.
-
-- it's faster, because spinlocks are faster to acquire than the
-  potentially heavily-used IRQ lock. Furthermore, your driver does
-  not have to wait eg. for a big heavy SCSI interrupt to finish,
-  because the driver_lock spinlock is only used by your driver.
-  cli() on the other hand was used by many drivers, and extended
-  the critical section to the whole IRQ handler function - creating
-  serious lock contention.
-
- 
-to make the transition easier, we've still kept the cli(), sti(),
-save_flags(), save_flags_cli() and restore_flags() macros defined
-on UP systems - but their usage will be phased out until 2.6 is
-released.
-
-drivers that want to disable local interrupts (interrupts on the
-current CPU), can use the following five macros:
-
-  local_irq_disable(), local_irq_enable(), local_save_flags(flags),
-  local_irq_save(flags), local_irq_restore(flags)
-
-but beware, their meaning and semantics are much simpler, far from
-that of the old cli(), sti(), save_flags(flags) and restore_flags(flags)
-SMP meaning:
-
-    local_irq_disable()       => turn local IRQs off
-
-    local_irq_enable()        => turn local IRQs on
-
-    local_save_flags(flags)   => save the current IRQ state into flags. The
-                                 state can be on or off. (on some
-                                 architectures there's even more bits in it.)
-
-    local_irq_save(flags)     => save the current IRQ state into flags and
-                                 disable interrupts.
-
-    local_irq_restore(flags)  => restore the IRQ state from flags.
-
-(local_irq_save can save both irqs on and irqs off state, and
-local_irq_restore can restore into both irqs on and irqs off state.)
-
-another related change is that synchronize_irq() now takes a parameter:
-synchronize_irq(irq). This change too has the purpose of making SMP
-synchronization more lightweight - this way you can wait for your own
-interrupt handler to finish, no need to wait for other IRQ sources.
-
-
-why were these changes done? The main reason was the architectural burden
-of maintaining the cli()/sti() interface - it became a real problem. The
-new interrupt system is much more streamlined, easier to understand, debug,
-and it's also a bit faster - the same happened to it that will happen to
-cli()/sti() using drivers once they convert to spinlocks :-)
-

+ 1 - 2
Documentation/controllers/memory.txt

@@ -242,8 +242,7 @@ rmdir() if there are no tasks.
 1. Add support for accounting huge pages (as a separate controller)
 2. Make per-cgroup scanner reclaim not-shared pages first
 3. Teach controller to account for shared-pages
-4. Start reclamation when the limit is lowered
-5. Start reclamation in the background when the limit is
+4. Start reclamation in the background when the limit is
    not yet hit but the usage is getting closer
 
 Summary

+ 1 - 1
Documentation/cpu-freq/governors.txt

@@ -122,7 +122,7 @@ around '10000' or more.
 show_sampling_rate_(min|max): the minimum and maximum sampling rates
 available that you may set 'sampling_rate' to.
 
-up_threshold: defines what the average CPU usaged between the samplings
+up_threshold: defines what the average CPU usage between the samplings
 of 'sampling_rate' needs to be for the kernel to make a decision on
 whether it should increase the frequency.  For example when it is set
 to its default value of '80' it means that between the checking

+ 73 - 80
Documentation/edac.txt

@@ -222,74 +222,9 @@ both csrow2 and csrow3 are populated, this indicates a dual ranked
 set of DIMMs for channels 0 and 1.
 
 
-Within each of the 'mc','mcX' and 'csrowX' directories are several
+Within each of the 'mcX' and 'csrowX' directories are several
 EDAC control and attribute files.
 
-
-============================================================================
-DIRECTORY 'mc'
-
-In directory 'mc' are EDAC system overall control and attribute files:
-
-
-Panic on UE control file:
-
-	'edac_mc_panic_on_ue'
-
-	An uncorrectable error will cause a machine panic.  This is usually
-	desirable.  It is a bad idea to continue when an uncorrectable error
-	occurs - it is indeterminate what was uncorrected and the operating
-	system context might be so mangled that continuing will lead to further
-	corruption. If the kernel has MCE configured, then EDAC will never
-	notice the UE.
-
-	LOAD TIME: module/kernel parameter: panic_on_ue=[0|1]
-
-	RUN TIME:  echo "1" >/sys/devices/system/edac/mc/edac_mc_panic_on_ue
-
-
-Log UE control file:
-
-	'edac_mc_log_ue'
-
-	Generate kernel messages describing uncorrectable errors.  These errors
-	are reported through the system message log system.  UE statistics
-	will be accumulated even when UE logging is disabled.
-
-	LOAD TIME: module/kernel parameter: log_ue=[0|1]
-
-	RUN TIME: echo "1" >/sys/devices/system/edac/mc/edac_mc_log_ue
-
-
-Log CE control file:
-
-	'edac_mc_log_ce'
-
-	Generate kernel messages describing correctable errors.  These
-	errors are reported through the system message log system.
-	CE statistics will be accumulated even when CE logging is disabled.
-
-	LOAD TIME: module/kernel parameter: log_ce=[0|1]
-
-	RUN TIME: echo "1" >/sys/devices/system/edac/mc/edac_mc_log_ce
-
-
-Polling period control file:
-
-	'edac_mc_poll_msec'
-
-	The time period, in milliseconds, for polling for error information.
-	Too small a value wastes resources.  Too large a value might delay
-	necessary handling of errors and might loose valuable information for
-	locating the error.  1000 milliseconds (once each second) is the current
-	default. Systems which require all the bandwidth they can get, may
-	increase this.
-
-	LOAD TIME: module/kernel parameter: poll_msec=[0|1]
-
-	RUN TIME: echo "1000" >/sys/devices/system/edac/mc/edac_mc_poll_msec
-
-
 ============================================================================
 'mcX' DIRECTORIES
 
@@ -392,7 +327,7 @@ Sdram memory scrubbing rate:
 	'sdram_scrub_rate'
 
 	Read/Write attribute file that controls memory scrubbing. The scrubbing
-	rate is set by writing a minimum bandwith in bytes/sec to the attribute
+	rate is set by writing a minimum bandwidth in bytes/sec to the attribute
 	file. The rate will be translated to an internal value that gives at
 	least the specified rate.
 
@@ -537,7 +472,6 @@ Channel 1 DIMM Label control file:
 	motherboard specific and determination of this information
 	must occur in userland at this time.
 
-
 ============================================================================
 SYSTEM LOGGING
 
@@ -570,7 +504,6 @@ error type, a notice of "no info" and then an optional,
 driver-specific error message.
 
 
-
 ============================================================================
 PCI Bus Parity Detection
 
@@ -604,6 +537,74 @@ Enable/Disable PCI Parity checking control file:
 	echo "0" >/sys/devices/system/edac/pci/check_pci_parity
 
 
+Parity Count:
+
+	'pci_parity_count'
+
+	This attribute file will display the number of parity errors that
+	have been detected.
+
+
+============================================================================
+MODULE PARAMETERS
+
+Panic on UE control file:
+
+	'edac_mc_panic_on_ue'
+
+	An uncorrectable error will cause a machine panic.  This is usually
+	desirable.  It is a bad idea to continue when an uncorrectable error
+	occurs - it is indeterminate what was uncorrected and the operating
+	system context might be so mangled that continuing will lead to further
+	corruption. If the kernel has MCE configured, then EDAC will never
+	notice the UE.
+
+	LOAD TIME: module/kernel parameter: edac_mc_panic_on_ue=[0|1]
+
+	RUN TIME:  echo "1" > /sys/module/edac_core/parameters/edac_mc_panic_on_ue
+
+
+Log UE control file:
+
+	'edac_mc_log_ue'
+
+	Generate kernel messages describing uncorrectable errors.  These errors
+	are reported through the system message log system.  UE statistics
+	will be accumulated even when UE logging is disabled.
+
+	LOAD TIME: module/kernel parameter: edac_mc_log_ue=[0|1]
+
+	RUN TIME: echo "1" > /sys/module/edac_core/parameters/edac_mc_log_ue
+
+
+Log CE control file:
+
+	'edac_mc_log_ce'
+
+	Generate kernel messages describing correctable errors.  These
+	errors are reported through the system message log system.
+	CE statistics will be accumulated even when CE logging is disabled.
+
+	LOAD TIME: module/kernel parameter: edac_mc_log_ce=[0|1]
+
+	RUN TIME: echo "1" > /sys/module/edac_core/parameters/edac_mc_log_ce
+
+
+Polling period control file:
+
+	'edac_mc_poll_msec'
+
+	The time period, in milliseconds, for polling for error information.
+	Too small a value wastes resources.  Too large a value might delay
+	necessary handling of errors and might loose valuable information for
+	locating the error.  1000 milliseconds (once each second) is the current
+	default. Systems which require all the bandwidth they can get, may
+	increase this.
+
+	LOAD TIME: module/kernel parameter: edac_mc_poll_msec=[0|1]
+
+	RUN TIME: echo "1000" > /sys/module/edac_core/parameters/edac_mc_poll_msec
+
 
 Panic on PCI PARITY Error:
 
@@ -614,21 +615,13 @@ Panic on PCI PARITY Error:
 	error has been detected.
 
 
-	module/kernel parameter: panic_on_pci_parity=[0|1]
+	module/kernel parameter: edac_panic_on_pci_pe=[0|1]
 
 	Enable:
-	echo "1" >/sys/devices/system/edac/pci/panic_on_pci_parity
+	echo "1" > /sys/module/edac_core/parameters/edac_panic_on_pci_pe
 
 	Disable:
-	echo "0" >/sys/devices/system/edac/pci/panic_on_pci_parity
-
-
-Parity Count:
-
-	'pci_parity_count'
-
-	This attribute file will display the number of parity errors that
-	have been detected.
+	echo "0" > /sys/module/edac_core/parameters/edac_panic_on_pci_pe
 
 
 

+ 131 - 0
Documentation/fb/sh7760fb.txt

@@ -0,0 +1,131 @@
+SH7760/SH7763 integrated LCDC Framebuffer driver
+================================================
+
+0. Overwiew
+-----------
+The SH7760/SH7763 have an integrated LCD Display controller (LCDC) which
+supports (in theory) resolutions ranging from 1x1 to 1024x1024,
+with color depths ranging from 1 to 16 bits, on STN, DSTN and TFT Panels.
+
+Caveats:
+* Framebuffer memory must be a large chunk allocated at the top
+  of Area3 (HW requirement). Because of this requirement you should NOT
+  make the driver a module since at runtime it may become impossible to
+  get a large enough contiguous chunk of memory.
+
+* The driver does not support changing resolution while loaded
+  (displays aren't hotpluggable anyway)
+
+* Heavy flickering may be observed
+  a) if you're using 15/16bit color modes at >= 640x480 px resolutions,
+  b) during PCMCIA (or any other slow bus) activity.
+
+* Rotation works only 90degress clockwise, and only if horizontal
+  resolution is <= 320 pixels.
+
+files:   drivers/video/sh7760fb.c
+        include/asm-sh/sh7760fb.h
+        Documentation/fb/sh7760fb.txt
+
+1. Platform setup
+-----------------
+SH7760:
+ Video data is fetched via the DMABRG DMA engine, so you have to
+ configure the SH DMAC for DMABRG mode (write 0x94808080 to the
+ DMARSRA register somewhere at boot).
+
+ PFC registers PCCR and PCDR must be set to peripheral mode.
+ (write zeros to both).
+
+The driver does NOT do the above for you since board setup is, well, job
+of the board setup code.
+
+2. Panel definitions
+--------------------
+The LCDC must explicitly be told about the type of LCD panel
+attached.  Data must be wrapped in a "struct sh7760fb_platdata" and
+passed to the driver as platform_data.
+
+Suggest you take a closer look at the SH7760 Manual, Section 30.
+(http://documentation.renesas.com/eng/products/mpumcu/e602291_sh7760.pdf)
+
+The following code illustrates what needs to be done to
+get the framebuffer working on a 640x480 TFT:
+
+====================== cut here ======================================
+
+#include <linux/fb.h>
+#include <asm/sh7760fb.h>
+
+/*
+ * NEC NL6440bc26-01 640x480 TFT
+ * dotclock 25175 kHz
+ * Xres                640     Yres            480
+ * Htotal      800     Vtotal          525
+ * HsynStart   656     VsynStart       490
+ * HsynLenn    30      VsynLenn        2
+ *
+ * The linux framebuffer layer does not use the syncstart/synclen
+ * values but right/left/upper/lower margin values. The comments
+ * for the x_margin explain how to calculate those from given
+ * panel sync timings.
+ */
+static struct fb_videomode nl6448bc26 = {
+       .name           = "NL6448BC26",
+       .refresh        = 60,
+       .xres           = 640,
+       .yres           = 480,
+       .pixclock       = 39683,        /* in picoseconds! */
+       .hsync_len      = 30,
+       .vsync_len      = 2,
+       .left_margin    = 114,  /* HTOT - (HSYNSLEN + HSYNSTART) */
+       .right_margin   = 16,   /* HSYNSTART - XRES */
+       .upper_margin   = 33,   /* VTOT - (VSYNLEN + VSYNSTART) */
+       .lower_margin   = 10,   /* VSYNSTART - YRES */
+       .sync           = FB_SYNC_HOR_HIGH_ACT | FB_SYNC_VERT_HIGH_ACT,
+       .vmode          = FB_VMODE_NONINTERLACED,
+       .flag           = 0,
+};
+
+static struct sh7760fb_platdata sh7760fb_nl6448 = {
+       .def_mode       = &nl6448bc26,
+       .ldmtr          = LDMTR_TFT_COLOR_16,   /* 16bit TFT panel */
+       .lddfr          = LDDFR_8BPP,           /* we want 8bit output */
+       .ldpmmr         = 0x0070,
+       .ldpspr         = 0x0500,
+       .ldaclnr        = 0,
+       .ldickr         = LDICKR_CLKSRC(LCDC_CLKSRC_EXTERNAL) |
+                         LDICKR_CLKDIV(1),
+       .rotate         = 0,
+       .novsync        = 1,
+       .blank          = NULL,
+};
+
+/* SH7760:
+ * 0xFE300800: 256 * 4byte xRGB palette ram
+ * 0xFE300C00: 42 bytes ctrl registers
+ */
+static struct resource sh7760_lcdc_res[] = {
+       [0] = {
+               .start  = 0xFE300800,
+               .end    = 0xFE300CFF,
+               .flags  = IORESOURCE_MEM,
+       },
+       [1] = {
+               .start  = 65,
+               .end    = 65,
+               .flags  = IORESOURCE_IRQ,
+       },
+};
+
+static struct platform_device sh7760_lcdc_dev = {
+       .dev    = {
+               .platform_data = &sh7760fb_nl6448,
+       },
+       .name           = "sh7760-lcdc",
+       .id             = -1,
+       .resource       = sh7760_lcdc_res,
+       .num_resources  = ARRAY_SIZE(sh7760_lcdc_res),
+};
+
+====================== cut here ======================================

+ 31 - 15
Documentation/fb/tridentfb.txt

@@ -3,11 +3,25 @@ Tridentfb is a framebuffer driver for some Trident chip based cards.
 The following list of chips is thought to be supported although not all are
 tested:
 
-those from the Image series with Cyber in their names - accelerated
-those with Blade in their names (Blade3D,CyberBlade...) - accelerated
-the newer CyberBladeXP family  - nonaccelerated
-
-Only PCI/AGP based cards are supported, none of the older Tridents.
+those from the TGUI series 9440/96XX and with Cyber in their names
+those from the Image series and with Cyber in their names
+those with Blade in their names (Blade3D,CyberBlade...)
+the newer CyberBladeXP family
+
+All families are accelerated. Only PCI/AGP based cards are supported,
+none of the older Tridents.
+The driver supports 8, 16 and 32 bits per pixel depths.
+The TGUI family requires a line length to be power of 2 if acceleration
+is enabled. This means that range of possible resolutions and bpp is
+limited comparing to the range if acceleration is disabled (see list
+of parameters below).
+
+Known bugs:
+1. The driver randomly locks up on 3DImage975 chip with acceleration
+   enabled. The same happens in X11 (Xorg).
+2. The ramdac speeds require some more fine tuning. It is possible to
+   switch resolution which the chip does not support at some depths for
+   older chips.
 
 How to use it?
 ==============
@@ -17,12 +31,11 @@ video=tridentfb
 
 The parameters for tridentfb are concatenated with a ':' as in this example.
 
-video=tridentfb:800x600,bpp=16,noaccel
+video=tridentfb:800x600-16@75,noaccel
 
 The second level parameters that tridentfb understands are:
 
 noaccel - turns off acceleration (when it doesn't work for your card)
-accel - force text acceleration (for boards which by default are noacceled)
 
 fp	- use flat panel related stuff
 crt 	- assume monitor is present instead of fp
@@ -31,21 +44,24 @@ center 	- for flat panels and resolutions smaller than native size center the
 	  image, otherwise use
 stretch
 
-memsize - integer value in Kb, use if your card's memory size is misdetected.
+memsize - integer value in KB, use if your card's memory size is misdetected.
 	  look at the driver output to see what it says when initializing.
-memdiff - integer value in Kb,should be nonzero if your card reports
-	  more memory than it actually has.For instance mine is 192K less than
+
+memdiff - integer value in KB, should be nonzero if your card reports
+	  more memory than it actually has. For instance mine is 192K less than
 	  detection says in all three BIOS selectable situations 2M, 4M, 8M.
 	  Only use if your video memory is taken from main memory hence of
-	  configurable size.Otherwise use memsize.
-	  If in some modes which barely fit the memory you see garbage at the bottom
-	  this might help by not letting change to that mode anymore.
+	  configurable size. Otherwise use memsize.
+	  If in some modes which barely fit the memory you see garbage
+	  at the bottom this might help by not letting change to that mode
+	  anymore.
 
 nativex - the width in pixels of the flat panel.If you know it (usually 1024
 	  800 or 1280) and it is not what the driver seems to detect use it.
 
-bpp  - bits per pixel (8,16 or 32)
-mode - a mode name like 800x600 (as described in Documentation/fb/modedb.txt)
+bpp	- bits per pixel (8,16 or 32)
+mode	- a mode name like 800x600-8@75 as described in
+	  Documentation/fb/modedb.txt
 
 Using insane values for the above parameters will probably result in driver
 misbehaviour so take care(for instance memsize=12345678 or memdiff=23784 or

+ 53 - 23
Documentation/feature-removal-schedule.txt

@@ -47,6 +47,30 @@ Who:	Mauro Carvalho Chehab <mchehab@infradead.org>
 
 ---------------------------
 
+What:	old tuner-3036 i2c driver
+When:	2.6.28
+Why:	This driver is for VERY old i2c-over-parallel port teletext receiver
+	boxes. Rather then spending effort on converting this driver to V4L2,
+	and since it is extremely unlikely that anyone still uses one of these
+	devices, it was decided to drop it.
+Who:	Hans Verkuil <hverkuil@xs4all.nl>
+	Mauro Carvalho Chehab <mchehab@infradead.org>
+
+ ---------------------------
+
+What:   V4L2 dpc7146 driver
+When:   2.6.28
+Why:    Old driver for the dpc7146 demonstration board that is no longer
+	relevant. The last time this was tested on actual hardware was
+	probably around 2002. Since this is a driver for a demonstration
+	board the decision was made to remove it rather than spending a
+	lot of effort continually updating this driver to stay in sync
+	with the latest internal V4L2 or I2C API.
+Who:    Hans Verkuil <hverkuil@xs4all.nl>
+	Mauro Carvalho Chehab <mchehab@infradead.org>
+
+---------------------------
+
 What:	PCMCIA control ioctl (needed for pcmcia-cs [cardmgr, cardctl])
 When:	November 2005
 Files:	drivers/pcmcia/: pcmcia_ioctl.c
@@ -138,24 +162,6 @@ Who:	Kay Sievers <kay.sievers@suse.de>
 
 ---------------------------
 
-What:	find_task_by_pid
-When:	2.6.26
-Why:	With pid namespaces, calling this funciton will return the
-	wrong task when called from inside a namespace.
-
-	The best way to save a task pid and find a task by this
-	pid later, is to find this task's struct pid pointer (or get
-	it directly from the task) and call pid_task() later.
-
-	If someone really needs to get a task by its pid_t, then
-	he most likely needs the find_task_by_vpid() to get the
-	task from the same namespace as the current task is in, but
-	this may be not so in general.
-
-Who:	Pavel Emelyanov <xemul@openvz.org>
-
----------------------------
-
 What:	ACPI procfs interface
 When:	July 2008
 Why:	ACPI sysfs conversion should be finished by January 2008.
@@ -300,11 +306,15 @@ Who:	ocfs2-devel@oss.oracle.com
 
 ---------------------------
 
-What:	asm/semaphore.h
-When:	2.6.26
-Why:	Implementation became generic; users should now include
-	linux/semaphore.h instead.
-Who:	Matthew Wilcox <willy@linux.intel.com>
+What:	SCTP_GET_PEER_ADDRS_NUM_OLD, SCTP_GET_PEER_ADDRS_OLD,
+	SCTP_GET_LOCAL_ADDRS_NUM_OLD, SCTP_GET_LOCAL_ADDRS_OLD
+When: 	June 2009
+Why:    A newer version of the options have been introduced in 2005 that
+	removes the limitions of the old API.  The sctp library has been
+        converted to use these new options at the same time.  Any user
+	space app that directly uses the old options should convert to using
+	the new options.
+Who:	Vlad Yasevich <vladislav.yasevich@hp.com>
 
 ---------------------------
 
@@ -314,3 +324,23 @@ Why:	This option was introduced just to allow older lm-sensors userspace
 	to keep working over the upgrade to 2.6.26. At the scheduled time of
 	removal fixed lm-sensors (2.x or 3.x) should be readily available.
 Who:	Rene Herman <rene.herman@gmail.com>
+
+---------------------------
+
+What:	Code that is now under CONFIG_WIRELESS_EXT_SYSFS
+	(in net/core/net-sysfs.c)
+When:	After the only user (hal) has seen a release with the patches
+	for enough time, probably some time in 2010.
+Why:	Over 1K .text/.data size reduction, data is available in other
+	ways (ioctls)
+Who:	Johannes Berg <johannes@sipsolutions.net>
+
+---------------------------
+
+What: CONFIG_NF_CT_ACCT
+When: 2.6.29
+Why:  Accounting can now be enabled/disabled without kernel recompilation.
+      Currently used only to set a default value for a feature that is also
+      controlled by a kernel/module/sysfs/sysctl parameter.
+Who:  Krzysztof Piotr Oledzki <ole@ans.pl>
+

+ 7 - 0
Documentation/filesystems/Locking

@@ -510,6 +510,7 @@ prototypes:
 	void (*close)(struct vm_area_struct*);
 	int (*fault)(struct vm_area_struct*, struct vm_fault *);
 	int (*page_mkwrite)(struct vm_area_struct *, struct page *);
+	int (*access)(struct vm_area_struct *, unsigned long, void*, int, int);
 
 locking rules:
 		BKL	mmap_sem	PageLocked(page)
@@ -517,6 +518,7 @@ open:		no	yes
 close:		no	yes
 fault:		no	yes
 page_mkwrite:	no	yes		no
+access:		no	yes
 
 	->page_mkwrite() is called when a previously read-only page is
 about to become writeable. The file system is responsible for
@@ -525,6 +527,11 @@ taking to lock out truncate, the page range should be verified to be
 within i_size. The page mapping should also be checked that it is not
 NULL.
 
+	->access() is called when get_user_pages() fails in
+acces_process_vm(), typically used to debug a process through
+/proc/pid/mem or ptrace.  This function is needed only for
+VM_IO | VM_PFNMAP VMAs.
+
 ================================================================================
 			Dubious stuff
 

+ 5 - 5
Documentation/filesystems/bfs.txt

@@ -26,11 +26,11 @@ You can simplify mounting by just typing:
 
 this will allocate the first available loopback device (and load loop.o 
 kernel module if necessary) automatically. If the loopback driver is not
-loaded automatically, make sure that your kernel is compiled with kmod 
-support (CONFIG_KMOD) enabled. Beware that umount will not
-deallocate /dev/loopN device if /etc/mtab file on your system is a
-symbolic link to /proc/mounts. You will need to do it manually using
-"-d" switch of losetup(8). Read losetup(8) manpage for more info.
+loaded automatically, make sure that you have compiled the module and
+that modprobe is functioning. Beware that umount will not deallocate
+/dev/loopN device if /etc/mtab file on your system is a symbolic link to
+/proc/mounts. You will need to do it manually using "-d" switch of
+losetup(8). Read losetup(8) manpage for more info.
 
 To create the BFS image under UnixWare you need to find out first which
 slice contains it. The command prtvtoc(1M) is your friend:

+ 18 - 9
Documentation/filesystems/configfs/configfs.txt

@@ -233,12 +233,10 @@ accomplished via the group operations specified on the group's
 config_item_type.
 
 	struct configfs_group_operations {
-		int (*make_item)(struct config_group *group,
-				 const char *name,
-				 struct config_item **new_item);
-		int (*make_group)(struct config_group *group,
-				  const char *name,
-				  struct config_group **new_group);
+		struct config_item *(*make_item)(struct config_group *group,
+						 const char *name);
+		struct config_group *(*make_group)(struct config_group *group,
+						   const char *name);
 		int (*commit_item)(struct config_item *item);
 		void (*disconnect_notify)(struct config_group *group,
 					  struct config_item *item);
@@ -313,9 +311,20 @@ the subsystem must be ready for it.
 [An Example]
 
 The best example of these basic concepts is the simple_children
-subsystem/group and the simple_child item in configfs_example.c  It
-shows a trivial object displaying and storing an attribute, and a simple
-group creating and destroying these children.
+subsystem/group and the simple_child item in configfs_example_explicit.c
+and configfs_example_macros.c.  It shows a trivial object displaying and
+storing an attribute, and a simple group creating and destroying these
+children.
+
+The only difference between configfs_example_explicit.c and
+configfs_example_macros.c is how the attributes of the childless item
+are defined.  The childless item has extended attributes, each with
+their own show()/store() operation.  This follows a convention commonly
+used in sysfs.  configfs_example_explicit.c creates these attributes
+by explicitly defining the structures involved.  Conversely
+configfs_example_macros.c uses some convenience macros from configfs.h
+to define the attributes.  These macros are similar to their sysfs
+counterparts.
 
 [Hierarchy Navigation and the Subsystem Mutex]
 

+ 0 - 487
Documentation/filesystems/configfs/configfs_example.c

@@ -1,487 +0,0 @@
-/*
- * vim: noexpandtab ts=8 sts=0 sw=8:
- *
- * configfs_example.c - This file is a demonstration module containing
- *      a number of configfs subsystems.
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public
- * License as published by the Free Software Foundation; either
- * version 2 of the License, or (at your option) any later version.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * General Public License for more details.
- *
- * You should have received a copy of the GNU General Public
- * License along with this program; if not, write to the
- * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
- * Boston, MA 021110-1307, USA.
- *
- * Based on sysfs:
- * 	sysfs is Copyright (C) 2001, 2002, 2003 Patrick Mochel
- *
- * configfs Copyright (C) 2005 Oracle.  All rights reserved.
- */
-
-#include <linux/init.h>
-#include <linux/module.h>
-#include <linux/slab.h>
-
-#include <linux/configfs.h>
-
-
-
-/*
- * 01-childless
- *
- * This first example is a childless subsystem.  It cannot create
- * any config_items.  It just has attributes.
- *
- * Note that we are enclosing the configfs_subsystem inside a container.
- * This is not necessary if a subsystem has no attributes directly
- * on the subsystem.  See the next example, 02-simple-children, for
- * such a subsystem.
- */
-
-struct childless {
-	struct configfs_subsystem subsys;
-	int showme;
-	int storeme;
-};
-
-struct childless_attribute {
-	struct configfs_attribute attr;
-	ssize_t (*show)(struct childless *, char *);
-	ssize_t (*store)(struct childless *, const char *, size_t);
-};
-
-static inline struct childless *to_childless(struct config_item *item)
-{
-	return item ? container_of(to_configfs_subsystem(to_config_group(item)), struct childless, subsys) : NULL;
-}
-
-static ssize_t childless_showme_read(struct childless *childless,
-				     char *page)
-{
-	ssize_t pos;
-
-	pos = sprintf(page, "%d\n", childless->showme);
-	childless->showme++;
-
-	return pos;
-}
-
-static ssize_t childless_storeme_read(struct childless *childless,
-				      char *page)
-{
-	return sprintf(page, "%d\n", childless->storeme);
-}
-
-static ssize_t childless_storeme_write(struct childless *childless,
-				       const char *page,
-				       size_t count)
-{
-	unsigned long tmp;
-	char *p = (char *) page;
-
-	tmp = simple_strtoul(p, &p, 10);
-	if (!p || (*p && (*p != '\n')))
-		return -EINVAL;
-
-	if (tmp > INT_MAX)
-		return -ERANGE;
-
-	childless->storeme = tmp;
-
-	return count;
-}
-
-static ssize_t childless_description_read(struct childless *childless,
-					  char *page)
-{
-	return sprintf(page,
-"[01-childless]\n"
-"\n"
-"The childless subsystem is the simplest possible subsystem in\n"
-"configfs.  It does not support the creation of child config_items.\n"
-"It only has a few attributes.  In fact, it isn't much different\n"
-"than a directory in /proc.\n");
-}
-
-static struct childless_attribute childless_attr_showme = {
-	.attr	= { .ca_owner = THIS_MODULE, .ca_name = "showme", .ca_mode = S_IRUGO },
-	.show	= childless_showme_read,
-};
-static struct childless_attribute childless_attr_storeme = {
-	.attr	= { .ca_owner = THIS_MODULE, .ca_name = "storeme", .ca_mode = S_IRUGO | S_IWUSR },
-	.show	= childless_storeme_read,
-	.store	= childless_storeme_write,
-};
-static struct childless_attribute childless_attr_description = {
-	.attr = { .ca_owner = THIS_MODULE, .ca_name = "description", .ca_mode = S_IRUGO },
-	.show = childless_description_read,
-};
-
-static struct configfs_attribute *childless_attrs[] = {
-	&childless_attr_showme.attr,
-	&childless_attr_storeme.attr,
-	&childless_attr_description.attr,
-	NULL,
-};
-
-static ssize_t childless_attr_show(struct config_item *item,
-				   struct configfs_attribute *attr,
-				   char *page)
-{
-	struct childless *childless = to_childless(item);
-	struct childless_attribute *childless_attr =
-		container_of(attr, struct childless_attribute, attr);
-	ssize_t ret = 0;
-
-	if (childless_attr->show)
-		ret = childless_attr->show(childless, page);
-	return ret;
-}
-
-static ssize_t childless_attr_store(struct config_item *item,
-				    struct configfs_attribute *attr,
-				    const char *page, size_t count)
-{
-	struct childless *childless = to_childless(item);
-	struct childless_attribute *childless_attr =
-		container_of(attr, struct childless_attribute, attr);
-	ssize_t ret = -EINVAL;
-
-	if (childless_attr->store)
-		ret = childless_attr->store(childless, page, count);
-	return ret;
-}
-
-static struct configfs_item_operations childless_item_ops = {
-	.show_attribute		= childless_attr_show,
-	.store_attribute	= childless_attr_store,
-};
-
-static struct config_item_type childless_type = {
-	.ct_item_ops	= &childless_item_ops,
-	.ct_attrs	= childless_attrs,
-	.ct_owner	= THIS_MODULE,
-};
-
-static struct childless childless_subsys = {
-	.subsys = {
-		.su_group = {
-			.cg_item = {
-				.ci_namebuf = "01-childless",
-				.ci_type = &childless_type,
-			},
-		},
-	},
-};
-
-
-/* ----------------------------------------------------------------- */
-
-/*
- * 02-simple-children
- *
- * This example merely has a simple one-attribute child.  Note that
- * there is no extra attribute structure, as the child's attribute is
- * known from the get-go.  Also, there is no container for the
- * subsystem, as it has no attributes of its own.
- */
-
-struct simple_child {
-	struct config_item item;
-	int storeme;
-};
-
-static inline struct simple_child *to_simple_child(struct config_item *item)
-{
-	return item ? container_of(item, struct simple_child, item) : NULL;
-}
-
-static struct configfs_attribute simple_child_attr_storeme = {
-	.ca_owner = THIS_MODULE,
-	.ca_name = "storeme",
-	.ca_mode = S_IRUGO | S_IWUSR,
-};
-
-static struct configfs_attribute *simple_child_attrs[] = {
-	&simple_child_attr_storeme,
-	NULL,
-};
-
-static ssize_t simple_child_attr_show(struct config_item *item,
-				      struct configfs_attribute *attr,
-				      char *page)
-{
-	ssize_t count;
-	struct simple_child *simple_child = to_simple_child(item);
-
-	count = sprintf(page, "%d\n", simple_child->storeme);
-
-	return count;
-}
-
-static ssize_t simple_child_attr_store(struct config_item *item,
-				       struct configfs_attribute *attr,
-				       const char *page, size_t count)
-{
-	struct simple_child *simple_child = to_simple_child(item);
-	unsigned long tmp;
-	char *p = (char *) page;
-
-	tmp = simple_strtoul(p, &p, 10);
-	if (!p || (*p && (*p != '\n')))
-		return -EINVAL;
-
-	if (tmp > INT_MAX)
-		return -ERANGE;
-
-	simple_child->storeme = tmp;
-
-	return count;
-}
-
-static void simple_child_release(struct config_item *item)
-{
-	kfree(to_simple_child(item));
-}
-
-static struct configfs_item_operations simple_child_item_ops = {
-	.release		= simple_child_release,
-	.show_attribute		= simple_child_attr_show,
-	.store_attribute	= simple_child_attr_store,
-};
-
-static struct config_item_type simple_child_type = {
-	.ct_item_ops	= &simple_child_item_ops,
-	.ct_attrs	= simple_child_attrs,
-	.ct_owner	= THIS_MODULE,
-};
-
-
-struct simple_children {
-	struct config_group group;
-};
-
-static inline struct simple_children *to_simple_children(struct config_item *item)
-{
-	return item ? container_of(to_config_group(item), struct simple_children, group) : NULL;
-}
-
-static int simple_children_make_item(struct config_group *group, const char *name, struct config_item **new_item)
-{
-	struct simple_child *simple_child;
-
-	simple_child = kzalloc(sizeof(struct simple_child), GFP_KERNEL);
-	if (!simple_child)
-		return -ENOMEM;
-
-
-	config_item_init_type_name(&simple_child->item, name,
-				   &simple_child_type);
-
-	simple_child->storeme = 0;
-
-	*new_item = &simple_child->item;
-	return 0;
-}
-
-static struct configfs_attribute simple_children_attr_description = {
-	.ca_owner = THIS_MODULE,
-	.ca_name = "description",
-	.ca_mode = S_IRUGO,
-};
-
-static struct configfs_attribute *simple_children_attrs[] = {
-	&simple_children_attr_description,
-	NULL,
-};
-
-static ssize_t simple_children_attr_show(struct config_item *item,
-			   		 struct configfs_attribute *attr,
-			   		 char *page)
-{
-	return sprintf(page,
-"[02-simple-children]\n"
-"\n"
-"This subsystem allows the creation of child config_items.  These\n"
-"items have only one attribute that is readable and writeable.\n");
-}
-
-static void simple_children_release(struct config_item *item)
-{
-	kfree(to_simple_children(item));
-}
-
-static struct configfs_item_operations simple_children_item_ops = {
-	.release 	= simple_children_release,
-	.show_attribute	= simple_children_attr_show,
-};
-
-/*
- * Note that, since no extra work is required on ->drop_item(),
- * no ->drop_item() is provided.
- */
-static struct configfs_group_operations simple_children_group_ops = {
-	.make_item	= simple_children_make_item,
-};
-
-static struct config_item_type simple_children_type = {
-	.ct_item_ops	= &simple_children_item_ops,
-	.ct_group_ops	= &simple_children_group_ops,
-	.ct_attrs	= simple_children_attrs,
-	.ct_owner	= THIS_MODULE,
-};
-
-static struct configfs_subsystem simple_children_subsys = {
-	.su_group = {
-		.cg_item = {
-			.ci_namebuf = "02-simple-children",
-			.ci_type = &simple_children_type,
-		},
-	},
-};
-
-
-/* ----------------------------------------------------------------- */
-
-/*
- * 03-group-children
- *
- * This example reuses the simple_children group from above.  However,
- * the simple_children group is not the subsystem itself, it is a
- * child of the subsystem.  Creation of a group in the subsystem creates
- * a new simple_children group.  That group can then have simple_child
- * children of its own.
- */
-
-static int group_children_make_group(struct config_group *group, const char *name, struct config_group **new_group)
-{
-	struct simple_children *simple_children;
-
-	simple_children = kzalloc(sizeof(struct simple_children),
-				  GFP_KERNEL);
-	if (!simple_children)
-		return -ENOMEM;
-
-
-	config_group_init_type_name(&simple_children->group, name,
-				    &simple_children_type);
-
-	*new_group = &simple_children->group;
-	return 0;
-}
-
-static struct configfs_attribute group_children_attr_description = {
-	.ca_owner = THIS_MODULE,
-	.ca_name = "description",
-	.ca_mode = S_IRUGO,
-};
-
-static struct configfs_attribute *group_children_attrs[] = {
-	&group_children_attr_description,
-	NULL,
-};
-
-static ssize_t group_children_attr_show(struct config_item *item,
-			   		struct configfs_attribute *attr,
-			   		char *page)
-{
-	return sprintf(page,
-"[03-group-children]\n"
-"\n"
-"This subsystem allows the creation of child config_groups.  These\n"
-"groups are like the subsystem simple-children.\n");
-}
-
-static struct configfs_item_operations group_children_item_ops = {
-	.show_attribute	= group_children_attr_show,
-};
-
-/*
- * Note that, since no extra work is required on ->drop_item(),
- * no ->drop_item() is provided.
- */
-static struct configfs_group_operations group_children_group_ops = {
-	.make_group	= group_children_make_group,
-};
-
-static struct config_item_type group_children_type = {
-	.ct_item_ops	= &group_children_item_ops,
-	.ct_group_ops	= &group_children_group_ops,
-	.ct_attrs	= group_children_attrs,
-	.ct_owner	= THIS_MODULE,
-};
-
-static struct configfs_subsystem group_children_subsys = {
-	.su_group = {
-		.cg_item = {
-			.ci_namebuf = "03-group-children",
-			.ci_type = &group_children_type,
-		},
-	},
-};
-
-/* ----------------------------------------------------------------- */
-
-/*
- * We're now done with our subsystem definitions.
- * For convenience in this module, here's a list of them all.  It
- * allows the init function to easily register them.  Most modules
- * will only have one subsystem, and will only call register_subsystem
- * on it directly.
- */
-static struct configfs_subsystem *example_subsys[] = {
-	&childless_subsys.subsys,
-	&simple_children_subsys,
-	&group_children_subsys,
-	NULL,
-};
-
-static int __init configfs_example_init(void)
-{
-	int ret;
-	int i;
-	struct configfs_subsystem *subsys;
-
-	for (i = 0; example_subsys[i]; i++) {
-		subsys = example_subsys[i];
-
-		config_group_init(&subsys->su_group);
-		mutex_init(&subsys->su_mutex);
-		ret = configfs_register_subsystem(subsys);
-		if (ret) {
-			printk(KERN_ERR "Error %d while registering subsystem %s\n",
-			       ret,
-			       subsys->su_group.cg_item.ci_namebuf);
-			goto out_unregister;
-		}
-	}
-
-	return 0;
-
-out_unregister:
-	for (; i >= 0; i--) {
-		configfs_unregister_subsystem(example_subsys[i]);
-	}
-
-	return ret;
-}
-
-static void __exit configfs_example_exit(void)
-{
-	int i;
-
-	for (i = 0; example_subsys[i]; i++) {
-		configfs_unregister_subsystem(example_subsys[i]);
-	}
-}
-
-module_init(configfs_example_init);
-module_exit(configfs_example_exit);
-MODULE_LICENSE("GPL");

+ 485 - 0
Documentation/filesystems/configfs/configfs_example_explicit.c

@@ -0,0 +1,485 @@
+/*
+ * vim: noexpandtab ts=8 sts=0 sw=8:
+ *
+ * configfs_example_explicit.c - This file is a demonstration module
+ *      containing a number of configfs subsystems.  It explicitly defines
+ *      each structure without using the helper macros defined in
+ *      configfs.h.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; if not, write to the
+ * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
+ * Boston, MA 021110-1307, USA.
+ *
+ * Based on sysfs:
+ * 	sysfs is Copyright (C) 2001, 2002, 2003 Patrick Mochel
+ *
+ * configfs Copyright (C) 2005 Oracle.  All rights reserved.
+ */
+
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+
+#include <linux/configfs.h>
+
+
+
+/*
+ * 01-childless
+ *
+ * This first example is a childless subsystem.  It cannot create
+ * any config_items.  It just has attributes.
+ *
+ * Note that we are enclosing the configfs_subsystem inside a container.
+ * This is not necessary if a subsystem has no attributes directly
+ * on the subsystem.  See the next example, 02-simple-children, for
+ * such a subsystem.
+ */
+
+struct childless {
+	struct configfs_subsystem subsys;
+	int showme;
+	int storeme;
+};
+
+struct childless_attribute {
+	struct configfs_attribute attr;
+	ssize_t (*show)(struct childless *, char *);
+	ssize_t (*store)(struct childless *, const char *, size_t);
+};
+
+static inline struct childless *to_childless(struct config_item *item)
+{
+	return item ? container_of(to_configfs_subsystem(to_config_group(item)), struct childless, subsys) : NULL;
+}
+
+static ssize_t childless_showme_read(struct childless *childless,
+				     char *page)
+{
+	ssize_t pos;
+
+	pos = sprintf(page, "%d\n", childless->showme);
+	childless->showme++;
+
+	return pos;
+}
+
+static ssize_t childless_storeme_read(struct childless *childless,
+				      char *page)
+{
+	return sprintf(page, "%d\n", childless->storeme);
+}
+
+static ssize_t childless_storeme_write(struct childless *childless,
+				       const char *page,
+				       size_t count)
+{
+	unsigned long tmp;
+	char *p = (char *) page;
+
+	tmp = simple_strtoul(p, &p, 10);
+	if (!p || (*p && (*p != '\n')))
+		return -EINVAL;
+
+	if (tmp > INT_MAX)
+		return -ERANGE;
+
+	childless->storeme = tmp;
+
+	return count;
+}
+
+static ssize_t childless_description_read(struct childless *childless,
+					  char *page)
+{
+	return sprintf(page,
+"[01-childless]\n"
+"\n"
+"The childless subsystem is the simplest possible subsystem in\n"
+"configfs.  It does not support the creation of child config_items.\n"
+"It only has a few attributes.  In fact, it isn't much different\n"
+"than a directory in /proc.\n");
+}
+
+static struct childless_attribute childless_attr_showme = {
+	.attr	= { .ca_owner = THIS_MODULE, .ca_name = "showme", .ca_mode = S_IRUGO },
+	.show	= childless_showme_read,
+};
+static struct childless_attribute childless_attr_storeme = {
+	.attr	= { .ca_owner = THIS_MODULE, .ca_name = "storeme", .ca_mode = S_IRUGO | S_IWUSR },
+	.show	= childless_storeme_read,
+	.store	= childless_storeme_write,
+};
+static struct childless_attribute childless_attr_description = {
+	.attr = { .ca_owner = THIS_MODULE, .ca_name = "description", .ca_mode = S_IRUGO },
+	.show = childless_description_read,
+};
+
+static struct configfs_attribute *childless_attrs[] = {
+	&childless_attr_showme.attr,
+	&childless_attr_storeme.attr,
+	&childless_attr_description.attr,
+	NULL,
+};
+
+static ssize_t childless_attr_show(struct config_item *item,
+				   struct configfs_attribute *attr,
+				   char *page)
+{
+	struct childless *childless = to_childless(item);
+	struct childless_attribute *childless_attr =
+		container_of(attr, struct childless_attribute, attr);
+	ssize_t ret = 0;
+
+	if (childless_attr->show)
+		ret = childless_attr->show(childless, page);
+	return ret;
+}
+
+static ssize_t childless_attr_store(struct config_item *item,
+				    struct configfs_attribute *attr,
+				    const char *page, size_t count)
+{
+	struct childless *childless = to_childless(item);
+	struct childless_attribute *childless_attr =
+		container_of(attr, struct childless_attribute, attr);
+	ssize_t ret = -EINVAL;
+
+	if (childless_attr->store)
+		ret = childless_attr->store(childless, page, count);
+	return ret;
+}
+
+static struct configfs_item_operations childless_item_ops = {
+	.show_attribute		= childless_attr_show,
+	.store_attribute	= childless_attr_store,
+};
+
+static struct config_item_type childless_type = {
+	.ct_item_ops	= &childless_item_ops,
+	.ct_attrs	= childless_attrs,
+	.ct_owner	= THIS_MODULE,
+};
+
+static struct childless childless_subsys = {
+	.subsys = {
+		.su_group = {
+			.cg_item = {
+				.ci_namebuf = "01-childless",
+				.ci_type = &childless_type,
+			},
+		},
+	},
+};
+
+
+/* ----------------------------------------------------------------- */
+
+/*
+ * 02-simple-children
+ *
+ * This example merely has a simple one-attribute child.  Note that
+ * there is no extra attribute structure, as the child's attribute is
+ * known from the get-go.  Also, there is no container for the
+ * subsystem, as it has no attributes of its own.
+ */
+
+struct simple_child {
+	struct config_item item;
+	int storeme;
+};
+
+static inline struct simple_child *to_simple_child(struct config_item *item)
+{
+	return item ? container_of(item, struct simple_child, item) : NULL;
+}
+
+static struct configfs_attribute simple_child_attr_storeme = {
+	.ca_owner = THIS_MODULE,
+	.ca_name = "storeme",
+	.ca_mode = S_IRUGO | S_IWUSR,
+};
+
+static struct configfs_attribute *simple_child_attrs[] = {
+	&simple_child_attr_storeme,
+	NULL,
+};
+
+static ssize_t simple_child_attr_show(struct config_item *item,
+				      struct configfs_attribute *attr,
+				      char *page)
+{
+	ssize_t count;
+	struct simple_child *simple_child = to_simple_child(item);
+
+	count = sprintf(page, "%d\n", simple_child->storeme);
+
+	return count;
+}
+
+static ssize_t simple_child_attr_store(struct config_item *item,
+				       struct configfs_attribute *attr,
+				       const char *page, size_t count)
+{
+	struct simple_child *simple_child = to_simple_child(item);
+	unsigned long tmp;
+	char *p = (char *) page;
+
+	tmp = simple_strtoul(p, &p, 10);
+	if (!p || (*p && (*p != '\n')))
+		return -EINVAL;
+
+	if (tmp > INT_MAX)
+		return -ERANGE;
+
+	simple_child->storeme = tmp;
+
+	return count;
+}
+
+static void simple_child_release(struct config_item *item)
+{
+	kfree(to_simple_child(item));
+}
+
+static struct configfs_item_operations simple_child_item_ops = {
+	.release		= simple_child_release,
+	.show_attribute		= simple_child_attr_show,
+	.store_attribute	= simple_child_attr_store,
+};
+
+static struct config_item_type simple_child_type = {
+	.ct_item_ops	= &simple_child_item_ops,
+	.ct_attrs	= simple_child_attrs,
+	.ct_owner	= THIS_MODULE,
+};
+
+
+struct simple_children {
+	struct config_group group;
+};
+
+static inline struct simple_children *to_simple_children(struct config_item *item)
+{
+	return item ? container_of(to_config_group(item), struct simple_children, group) : NULL;
+}
+
+static struct config_item *simple_children_make_item(struct config_group *group, const char *name)
+{
+	struct simple_child *simple_child;
+
+	simple_child = kzalloc(sizeof(struct simple_child), GFP_KERNEL);
+	if (!simple_child)
+		return ERR_PTR(-ENOMEM);
+
+	config_item_init_type_name(&simple_child->item, name,
+				   &simple_child_type);
+
+	simple_child->storeme = 0;
+
+	return &simple_child->item;
+}
+
+static struct configfs_attribute simple_children_attr_description = {
+	.ca_owner = THIS_MODULE,
+	.ca_name = "description",
+	.ca_mode = S_IRUGO,
+};
+
+static struct configfs_attribute *simple_children_attrs[] = {
+	&simple_children_attr_description,
+	NULL,
+};
+
+static ssize_t simple_children_attr_show(struct config_item *item,
+					 struct configfs_attribute *attr,
+					 char *page)
+{
+	return sprintf(page,
+"[02-simple-children]\n"
+"\n"
+"This subsystem allows the creation of child config_items.  These\n"
+"items have only one attribute that is readable and writeable.\n");
+}
+
+static void simple_children_release(struct config_item *item)
+{
+	kfree(to_simple_children(item));
+}
+
+static struct configfs_item_operations simple_children_item_ops = {
+	.release	= simple_children_release,
+	.show_attribute	= simple_children_attr_show,
+};
+
+/*
+ * Note that, since no extra work is required on ->drop_item(),
+ * no ->drop_item() is provided.
+ */
+static struct configfs_group_operations simple_children_group_ops = {
+	.make_item	= simple_children_make_item,
+};
+
+static struct config_item_type simple_children_type = {
+	.ct_item_ops	= &simple_children_item_ops,
+	.ct_group_ops	= &simple_children_group_ops,
+	.ct_attrs	= simple_children_attrs,
+	.ct_owner	= THIS_MODULE,
+};
+
+static struct configfs_subsystem simple_children_subsys = {
+	.su_group = {
+		.cg_item = {
+			.ci_namebuf = "02-simple-children",
+			.ci_type = &simple_children_type,
+		},
+	},
+};
+
+
+/* ----------------------------------------------------------------- */
+
+/*
+ * 03-group-children
+ *
+ * This example reuses the simple_children group from above.  However,
+ * the simple_children group is not the subsystem itself, it is a
+ * child of the subsystem.  Creation of a group in the subsystem creates
+ * a new simple_children group.  That group can then have simple_child
+ * children of its own.
+ */
+
+static struct config_group *group_children_make_group(struct config_group *group, const char *name)
+{
+	struct simple_children *simple_children;
+
+	simple_children = kzalloc(sizeof(struct simple_children),
+				  GFP_KERNEL);
+	if (!simple_children)
+		return ERR_PTR(-ENOMEM);
+
+	config_group_init_type_name(&simple_children->group, name,
+				    &simple_children_type);
+
+	return &simple_children->group;
+}
+
+static struct configfs_attribute group_children_attr_description = {
+	.ca_owner = THIS_MODULE,
+	.ca_name = "description",
+	.ca_mode = S_IRUGO,
+};
+
+static struct configfs_attribute *group_children_attrs[] = {
+	&group_children_attr_description,
+	NULL,
+};
+
+static ssize_t group_children_attr_show(struct config_item *item,
+					struct configfs_attribute *attr,
+					char *page)
+{
+	return sprintf(page,
+"[03-group-children]\n"
+"\n"
+"This subsystem allows the creation of child config_groups.  These\n"
+"groups are like the subsystem simple-children.\n");
+}
+
+static struct configfs_item_operations group_children_item_ops = {
+	.show_attribute	= group_children_attr_show,
+};
+
+/*
+ * Note that, since no extra work is required on ->drop_item(),
+ * no ->drop_item() is provided.
+ */
+static struct configfs_group_operations group_children_group_ops = {
+	.make_group	= group_children_make_group,
+};
+
+static struct config_item_type group_children_type = {
+	.ct_item_ops	= &group_children_item_ops,
+	.ct_group_ops	= &group_children_group_ops,
+	.ct_attrs	= group_children_attrs,
+	.ct_owner	= THIS_MODULE,
+};
+
+static struct configfs_subsystem group_children_subsys = {
+	.su_group = {
+		.cg_item = {
+			.ci_namebuf = "03-group-children",
+			.ci_type = &group_children_type,
+		},
+	},
+};
+
+/* ----------------------------------------------------------------- */
+
+/*
+ * We're now done with our subsystem definitions.
+ * For convenience in this module, here's a list of them all.  It
+ * allows the init function to easily register them.  Most modules
+ * will only have one subsystem, and will only call register_subsystem
+ * on it directly.
+ */
+static struct configfs_subsystem *example_subsys[] = {
+	&childless_subsys.subsys,
+	&simple_children_subsys,
+	&group_children_subsys,
+	NULL,
+};
+
+static int __init configfs_example_init(void)
+{
+	int ret;
+	int i;
+	struct configfs_subsystem *subsys;
+
+	for (i = 0; example_subsys[i]; i++) {
+		subsys = example_subsys[i];
+
+		config_group_init(&subsys->su_group);
+		mutex_init(&subsys->su_mutex);
+		ret = configfs_register_subsystem(subsys);
+		if (ret) {
+			printk(KERN_ERR "Error %d while registering subsystem %s\n",
+			       ret,
+			       subsys->su_group.cg_item.ci_namebuf);
+			goto out_unregister;
+		}
+	}
+
+	return 0;
+
+out_unregister:
+	for (; i >= 0; i--) {
+		configfs_unregister_subsystem(example_subsys[i]);
+	}
+
+	return ret;
+}
+
+static void __exit configfs_example_exit(void)
+{
+	int i;
+
+	for (i = 0; example_subsys[i]; i++) {
+		configfs_unregister_subsystem(example_subsys[i]);
+	}
+}
+
+module_init(configfs_example_init);
+module_exit(configfs_example_exit);
+MODULE_LICENSE("GPL");

+ 448 - 0
Documentation/filesystems/configfs/configfs_example_macros.c

@@ -0,0 +1,448 @@
+/*
+ * vim: noexpandtab ts=8 sts=0 sw=8:
+ *
+ * configfs_example_macros.c - This file is a demonstration module
+ *      containing a number of configfs subsystems.  It uses the helper
+ *      macros defined by configfs.h
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; if not, write to the
+ * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
+ * Boston, MA 021110-1307, USA.
+ *
+ * Based on sysfs:
+ * 	sysfs is Copyright (C) 2001, 2002, 2003 Patrick Mochel
+ *
+ * configfs Copyright (C) 2005 Oracle.  All rights reserved.
+ */
+
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+
+#include <linux/configfs.h>
+
+
+
+/*
+ * 01-childless
+ *
+ * This first example is a childless subsystem.  It cannot create
+ * any config_items.  It just has attributes.
+ *
+ * Note that we are enclosing the configfs_subsystem inside a container.
+ * This is not necessary if a subsystem has no attributes directly
+ * on the subsystem.  See the next example, 02-simple-children, for
+ * such a subsystem.
+ */
+
+struct childless {
+	struct configfs_subsystem subsys;
+	int showme;
+	int storeme;
+};
+
+static inline struct childless *to_childless(struct config_item *item)
+{
+	return item ? container_of(to_configfs_subsystem(to_config_group(item)), struct childless, subsys) : NULL;
+}
+
+CONFIGFS_ATTR_STRUCT(childless);
+#define CHILDLESS_ATTR(_name, _mode, _show, _store)	\
+struct childless_attribute childless_attr_##_name = __CONFIGFS_ATTR(_name, _mode, _show, _store)
+#define CHILDLESS_ATTR_RO(_name, _show)	\
+struct childless_attribute childless_attr_##_name = __CONFIGFS_ATTR_RO(_name, _show);
+
+static ssize_t childless_showme_read(struct childless *childless,
+				     char *page)
+{
+	ssize_t pos;
+
+	pos = sprintf(page, "%d\n", childless->showme);
+	childless->showme++;
+
+	return pos;
+}
+
+static ssize_t childless_storeme_read(struct childless *childless,
+				      char *page)
+{
+	return sprintf(page, "%d\n", childless->storeme);
+}
+
+static ssize_t childless_storeme_write(struct childless *childless,
+				       const char *page,
+				       size_t count)
+{
+	unsigned long tmp;
+	char *p = (char *) page;
+
+	tmp = simple_strtoul(p, &p, 10);
+	if (!p || (*p && (*p != '\n')))
+		return -EINVAL;
+
+	if (tmp > INT_MAX)
+		return -ERANGE;
+
+	childless->storeme = tmp;
+
+	return count;
+}
+
+static ssize_t childless_description_read(struct childless *childless,
+					  char *page)
+{
+	return sprintf(page,
+"[01-childless]\n"
+"\n"
+"The childless subsystem is the simplest possible subsystem in\n"
+"configfs.  It does not support the creation of child config_items.\n"
+"It only has a few attributes.  In fact, it isn't much different\n"
+"than a directory in /proc.\n");
+}
+
+CHILDLESS_ATTR_RO(showme, childless_showme_read);
+CHILDLESS_ATTR(storeme, S_IRUGO | S_IWUSR, childless_storeme_read,
+	       childless_storeme_write);
+CHILDLESS_ATTR_RO(description, childless_description_read);
+
+static struct configfs_attribute *childless_attrs[] = {
+	&childless_attr_showme.attr,
+	&childless_attr_storeme.attr,
+	&childless_attr_description.attr,
+	NULL,
+};
+
+CONFIGFS_ATTR_OPS(childless);
+static struct configfs_item_operations childless_item_ops = {
+	.show_attribute		= childless_attr_show,
+	.store_attribute	= childless_attr_store,
+};
+
+static struct config_item_type childless_type = {
+	.ct_item_ops	= &childless_item_ops,
+	.ct_attrs	= childless_attrs,
+	.ct_owner	= THIS_MODULE,
+};
+
+static struct childless childless_subsys = {
+	.subsys = {
+		.su_group = {
+			.cg_item = {
+				.ci_namebuf = "01-childless",
+				.ci_type = &childless_type,
+			},
+		},
+	},
+};
+
+
+/* ----------------------------------------------------------------- */
+
+/*
+ * 02-simple-children
+ *
+ * This example merely has a simple one-attribute child.  Note that
+ * there is no extra attribute structure, as the child's attribute is
+ * known from the get-go.  Also, there is no container for the
+ * subsystem, as it has no attributes of its own.
+ */
+
+struct simple_child {
+	struct config_item item;
+	int storeme;
+};
+
+static inline struct simple_child *to_simple_child(struct config_item *item)
+{
+	return item ? container_of(item, struct simple_child, item) : NULL;
+}
+
+static struct configfs_attribute simple_child_attr_storeme = {
+	.ca_owner = THIS_MODULE,
+	.ca_name = "storeme",
+	.ca_mode = S_IRUGO | S_IWUSR,
+};
+
+static struct configfs_attribute *simple_child_attrs[] = {
+	&simple_child_attr_storeme,
+	NULL,
+};
+
+static ssize_t simple_child_attr_show(struct config_item *item,
+				      struct configfs_attribute *attr,
+				      char *page)
+{
+	ssize_t count;
+	struct simple_child *simple_child = to_simple_child(item);
+
+	count = sprintf(page, "%d\n", simple_child->storeme);
+
+	return count;
+}
+
+static ssize_t simple_child_attr_store(struct config_item *item,
+				       struct configfs_attribute *attr,
+				       const char *page, size_t count)
+{
+	struct simple_child *simple_child = to_simple_child(item);
+	unsigned long tmp;
+	char *p = (char *) page;
+
+	tmp = simple_strtoul(p, &p, 10);
+	if (!p || (*p && (*p != '\n')))
+		return -EINVAL;
+
+	if (tmp > INT_MAX)
+		return -ERANGE;
+
+	simple_child->storeme = tmp;
+
+	return count;
+}
+
+static void simple_child_release(struct config_item *item)
+{
+	kfree(to_simple_child(item));
+}
+
+static struct configfs_item_operations simple_child_item_ops = {
+	.release		= simple_child_release,
+	.show_attribute		= simple_child_attr_show,
+	.store_attribute	= simple_child_attr_store,
+};
+
+static struct config_item_type simple_child_type = {
+	.ct_item_ops	= &simple_child_item_ops,
+	.ct_attrs	= simple_child_attrs,
+	.ct_owner	= THIS_MODULE,
+};
+
+
+struct simple_children {
+	struct config_group group;
+};
+
+static inline struct simple_children *to_simple_children(struct config_item *item)
+{
+	return item ? container_of(to_config_group(item), struct simple_children, group) : NULL;
+}
+
+static struct config_item *simple_children_make_item(struct config_group *group, const char *name)
+{
+	struct simple_child *simple_child;
+
+	simple_child = kzalloc(sizeof(struct simple_child), GFP_KERNEL);
+	if (!simple_child)
+		return ERR_PTR(-ENOMEM);
+
+	config_item_init_type_name(&simple_child->item, name,
+				   &simple_child_type);
+
+	simple_child->storeme = 0;
+
+	return &simple_child->item;
+}
+
+static struct configfs_attribute simple_children_attr_description = {
+	.ca_owner = THIS_MODULE,
+	.ca_name = "description",
+	.ca_mode = S_IRUGO,
+};
+
+static struct configfs_attribute *simple_children_attrs[] = {
+	&simple_children_attr_description,
+	NULL,
+};
+
+static ssize_t simple_children_attr_show(struct config_item *item,
+					 struct configfs_attribute *attr,
+					 char *page)
+{
+	return sprintf(page,
+"[02-simple-children]\n"
+"\n"
+"This subsystem allows the creation of child config_items.  These\n"
+"items have only one attribute that is readable and writeable.\n");
+}
+
+static void simple_children_release(struct config_item *item)
+{
+	kfree(to_simple_children(item));
+}
+
+static struct configfs_item_operations simple_children_item_ops = {
+	.release	= simple_children_release,
+	.show_attribute	= simple_children_attr_show,
+};
+
+/*
+ * Note that, since no extra work is required on ->drop_item(),
+ * no ->drop_item() is provided.
+ */
+static struct configfs_group_operations simple_children_group_ops = {
+	.make_item	= simple_children_make_item,
+};
+
+static struct config_item_type simple_children_type = {
+	.ct_item_ops	= &simple_children_item_ops,
+	.ct_group_ops	= &simple_children_group_ops,
+	.ct_attrs	= simple_children_attrs,
+	.ct_owner	= THIS_MODULE,
+};
+
+static struct configfs_subsystem simple_children_subsys = {
+	.su_group = {
+		.cg_item = {
+			.ci_namebuf = "02-simple-children",
+			.ci_type = &simple_children_type,
+		},
+	},
+};
+
+
+/* ----------------------------------------------------------------- */
+
+/*
+ * 03-group-children
+ *
+ * This example reuses the simple_children group from above.  However,
+ * the simple_children group is not the subsystem itself, it is a
+ * child of the subsystem.  Creation of a group in the subsystem creates
+ * a new simple_children group.  That group can then have simple_child
+ * children of its own.
+ */
+
+static struct config_group *group_children_make_group(struct config_group *group, const char *name)
+{
+	struct simple_children *simple_children;
+
+	simple_children = kzalloc(sizeof(struct simple_children),
+				  GFP_KERNEL);
+	if (!simple_children)
+		return ERR_PTR(-ENOMEM);
+
+	config_group_init_type_name(&simple_children->group, name,
+				    &simple_children_type);
+
+	return &simple_children->group;
+}
+
+static struct configfs_attribute group_children_attr_description = {
+	.ca_owner = THIS_MODULE,
+	.ca_name = "description",
+	.ca_mode = S_IRUGO,
+};
+
+static struct configfs_attribute *group_children_attrs[] = {
+	&group_children_attr_description,
+	NULL,
+};
+
+static ssize_t group_children_attr_show(struct config_item *item,
+					struct configfs_attribute *attr,
+					char *page)
+{
+	return sprintf(page,
+"[03-group-children]\n"
+"\n"
+"This subsystem allows the creation of child config_groups.  These\n"
+"groups are like the subsystem simple-children.\n");
+}
+
+static struct configfs_item_operations group_children_item_ops = {
+	.show_attribute	= group_children_attr_show,
+};
+
+/*
+ * Note that, since no extra work is required on ->drop_item(),
+ * no ->drop_item() is provided.
+ */
+static struct configfs_group_operations group_children_group_ops = {
+	.make_group	= group_children_make_group,
+};
+
+static struct config_item_type group_children_type = {
+	.ct_item_ops	= &group_children_item_ops,
+	.ct_group_ops	= &group_children_group_ops,
+	.ct_attrs	= group_children_attrs,
+	.ct_owner	= THIS_MODULE,
+};
+
+static struct configfs_subsystem group_children_subsys = {
+	.su_group = {
+		.cg_item = {
+			.ci_namebuf = "03-group-children",
+			.ci_type = &group_children_type,
+		},
+	},
+};
+
+/* ----------------------------------------------------------------- */
+
+/*
+ * We're now done with our subsystem definitions.
+ * For convenience in this module, here's a list of them all.  It
+ * allows the init function to easily register them.  Most modules
+ * will only have one subsystem, and will only call register_subsystem
+ * on it directly.
+ */
+static struct configfs_subsystem *example_subsys[] = {
+	&childless_subsys.subsys,
+	&simple_children_subsys,
+	&group_children_subsys,
+	NULL,
+};
+
+static int __init configfs_example_init(void)
+{
+	int ret;
+	int i;
+	struct configfs_subsystem *subsys;
+
+	for (i = 0; example_subsys[i]; i++) {
+		subsys = example_subsys[i];
+
+		config_group_init(&subsys->su_group);
+		mutex_init(&subsys->su_mutex);
+		ret = configfs_register_subsystem(subsys);
+		if (ret) {
+			printk(KERN_ERR "Error %d while registering subsystem %s\n",
+			       ret,
+			       subsys->su_group.cg_item.ci_namebuf);
+			goto out_unregister;
+		}
+	}
+
+	return 0;
+
+out_unregister:
+	for (; i >= 0; i--) {
+		configfs_unregister_subsystem(example_subsys[i]);
+	}
+
+	return ret;
+}
+
+static void __exit configfs_example_exit(void)
+{
+	int i;
+
+	for (i = 0; example_subsys[i]; i++) {
+		configfs_unregister_subsystem(example_subsys[i]);
+	}
+}
+
+module_init(configfs_example_init);
+module_exit(configfs_example_exit);
+MODULE_LICENSE("GPL");

+ 59 - 44
Documentation/filesystems/nfs-rdma.txt

@@ -5,7 +5,7 @@
 ################################################################################
 
  Author: NetApp and Open Grid Computing
- Date: April 15, 2008
+ Date: May 29, 2008
 
 Table of Contents
 ~~~~~~~~~~~~~~~~~
@@ -60,16 +60,18 @@ Installation
     The procedures described in this document have been tested with
     distributions from Red Hat's Fedora Project (http://fedora.redhat.com/).
 
-  - Install nfs-utils-1.1.1 or greater on the client
+  - Install nfs-utils-1.1.2 or greater on the client
 
-    An NFS/RDMA mount point can only be obtained by using the mount.nfs
-    command in nfs-utils-1.1.1 or greater. To see which version of mount.nfs
-    you are using, type:
+    An NFS/RDMA mount point can be obtained by using the mount.nfs command in
+    nfs-utils-1.1.2 or greater (nfs-utils-1.1.1 was the first nfs-utils
+    version with support for NFS/RDMA mounts, but for various reasons we
+    recommend using nfs-utils-1.1.2 or greater). To see which version of
+    mount.nfs you are using, type:
 
-    > /sbin/mount.nfs -V
+    $ /sbin/mount.nfs -V
 
-    If the version is less than 1.1.1 or the command does not exist,
-    then you will need to install the latest version of nfs-utils.
+    If the version is less than 1.1.2 or the command does not exist,
+    you should install the latest version of nfs-utils.
 
     Download the latest package from:
 
@@ -77,22 +79,33 @@ Installation
 
     Uncompress the package and follow the installation instructions.
 
-    If you will not be using GSS and NFSv4, the installation process
-    can be simplified by disabling these features when running configure:
+    If you will not need the idmapper and gssd executables (you do not need
+    these to create an NFS/RDMA enabled mount command), the installation
+    process can be simplified by disabling these features when running
+    configure:
 
-    > ./configure --disable-gss --disable-nfsv4
+    $ ./configure --disable-gss --disable-nfsv4
 
-    For more information on this see the package's README and INSTALL files.
+    To build nfs-utils you will need the tcp_wrappers package installed. For
+    more information on this see the package's README and INSTALL files.
 
     After building the nfs-utils package, there will be a mount.nfs binary in
     the utils/mount directory. This binary can be used to initiate NFS v2, v3,
-    or v4 mounts. To initiate a v4 mount, the binary must be called mount.nfs4.
-    The standard technique is to create a symlink called mount.nfs4 to mount.nfs.
+    or v4 mounts. To initiate a v4 mount, the binary must be called
+    mount.nfs4.  The standard technique is to create a symlink called
+    mount.nfs4 to mount.nfs.
 
-    NOTE: mount.nfs and therefore nfs-utils-1.1.1 or greater is only needed
+    This mount.nfs binary should be installed at /sbin/mount.nfs as follows:
+
+    $ sudo cp utils/mount/mount.nfs /sbin/mount.nfs
+
+    In this location, mount.nfs will be invoked automatically for NFS mounts
+    by the system mount commmand.
+
+    NOTE: mount.nfs and therefore nfs-utils-1.1.2 or greater is only needed
     on the NFS client machine. You do not need this specific version of
     nfs-utils on the server. Furthermore, only the mount.nfs command from
-    nfs-utils-1.1.1 is needed on the client.
+    nfs-utils-1.1.2 is needed on the client.
 
   - Install a Linux kernel with NFS/RDMA
 
@@ -156,8 +169,8 @@ Check RDMA and NFS Setup
     this time. For example, if you are using a Mellanox Tavor/Sinai/Arbel
     card:
 
-    > modprobe ib_mthca
-    > modprobe ib_ipoib
+    $ modprobe ib_mthca
+    $ modprobe ib_ipoib
 
     If you are using InfiniBand, make sure there is a Subnet Manager (SM)
     running on the network. If your IB switch has an embedded SM, you can
@@ -166,7 +179,7 @@ Check RDMA and NFS Setup
 
     If an SM is running on your network, you should see the following:
 
-    > cat /sys/class/infiniband/driverX/ports/1/state
+    $ cat /sys/class/infiniband/driverX/ports/1/state
     4: ACTIVE
 
     where driverX is mthca0, ipath5, ehca3, etc.
@@ -174,10 +187,10 @@ Check RDMA and NFS Setup
     To further test the InfiniBand software stack, use IPoIB (this
     assumes you have two IB hosts named host1 and host2):
 
-    host1> ifconfig ib0 a.b.c.x
-    host2> ifconfig ib0 a.b.c.y
-    host1> ping a.b.c.y
-    host2> ping a.b.c.x
+    host1$ ifconfig ib0 a.b.c.x
+    host2$ ifconfig ib0 a.b.c.y
+    host1$ ping a.b.c.y
+    host2$ ping a.b.c.x
 
     For other device types, follow the appropriate procedures.
 
@@ -202,11 +215,11 @@ NFS/RDMA Setup
     /vol0   192.168.0.47(fsid=0,rw,async,insecure,no_root_squash)
     /vol0   192.168.0.0/255.255.255.0(fsid=0,rw,async,insecure,no_root_squash)
 
-    The IP address(es) is(are) the client's IPoIB address for an InfiniBand HCA or the
-    cleint's iWARP address(es) for an RNIC.
+    The IP address(es) is(are) the client's IPoIB address for an InfiniBand
+    HCA or the cleint's iWARP address(es) for an RNIC.
 
-    NOTE: The "insecure" option must be used because the NFS/RDMA client does not
-    use a reserved port.
+    NOTE: The "insecure" option must be used because the NFS/RDMA client does
+    not use a reserved port.
 
  Each time a machine boots:
 
@@ -214,43 +227,45 @@ NFS/RDMA Setup
 
     For InfiniBand using a Mellanox adapter:
 
-    > modprobe ib_mthca
-    > modprobe ib_ipoib
-    > ifconfig ib0 a.b.c.d
+    $ modprobe ib_mthca
+    $ modprobe ib_ipoib
+    $ ifconfig ib0 a.b.c.d
 
     NOTE: use unique addresses for the client and server
 
   - Start the NFS server
 
-    If the NFS/RDMA server was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in kernel config),
-    load the RDMA transport module:
+    If the NFS/RDMA server was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in
+    kernel config), load the RDMA transport module:
 
-    > modprobe svcrdma
+    $ modprobe svcrdma
 
-    Regardless of how the server was built (module or built-in), start the server:
+    Regardless of how the server was built (module or built-in), start the
+    server:
 
-    > /etc/init.d/nfs start
+    $ /etc/init.d/nfs start
 
     or
 
-    > service nfs start
+    $ service nfs start
 
     Instruct the server to listen on the RDMA transport:
 
-    > echo rdma 2050 > /proc/fs/nfsd/portlist
+    $ echo rdma 2050 > /proc/fs/nfsd/portlist
 
   - On the client system
 
-    If the NFS/RDMA client was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in kernel config),
-    load the RDMA client module:
+    If the NFS/RDMA client was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in
+    kernel config), load the RDMA client module:
 
-    > modprobe xprtrdma.ko
+    $ modprobe xprtrdma.ko
 
-    Regardless of how the client was built (module or built-in), issue the mount.nfs command:
+    Regardless of how the client was built (module or built-in), use this
+    command to mount the NFS/RDMA server:
 
-    > /path/to/your/mount.nfs <IPoIB-server-name-or-address>:/<export> /mnt -i -o rdma,port=2050
+    $ mount -o rdma,port=2050 <IPoIB-server-name-or-address>:/<export> /mnt
 
-    To verify that the mount is using RDMA, run "cat /proc/mounts" and check the
-    "proto" field for the given mount.
+    To verify that the mount is using RDMA, run "cat /proc/mounts" and check
+    the "proto" field for the given mount.
 
   Congratulations! You're using NFS/RDMA!

+ 106 - 0
Documentation/filesystems/omfs.txt

@@ -0,0 +1,106 @@
+Optimized MPEG Filesystem (OMFS)
+
+Overview
+========
+
+OMFS is a filesystem created by SonicBlue for use in the ReplayTV DVR
+and Rio Karma MP3 player.  The filesystem is extent-based, utilizing
+block sizes from 2k to 8k, with hash-based directories.  This
+filesystem driver may be used to read and write disks from these
+devices.
+
+Note, it is not recommended that this FS be used in place of a general
+filesystem for your own streaming media device.  Native Linux filesystems
+will likely perform better.
+
+More information is available at:
+
+    http://linux-karma.sf.net/
+
+Various utilities, including mkomfs and omfsck, are included with
+omfsprogs, available at:
+
+    http://bobcopeland.com/karma/
+
+Instructions are included in its README.
+
+Options
+=======
+
+OMFS supports the following mount-time options:
+
+    uid=n        - make all files owned by specified user
+    gid=n        - make all files owned by specified group
+    umask=xxx    - set permission umask to xxx
+    fmask=xxx    - set umask to xxx for files
+    dmask=xxx    - set umask to xxx for directories
+
+Disk format
+===========
+
+OMFS discriminates between "sysblocks" and normal data blocks.  The sysblock
+group consists of super block information, file metadata, directory structures,
+and extents.  Each sysblock has a header containing CRCs of the entire
+sysblock, and may be mirrored in successive blocks on the disk.  A sysblock may
+have a smaller size than a data block, but since they are both addressed by the
+same 64-bit block number, any remaining space in the smaller sysblock is
+unused.
+
+Sysblock header information:
+
+struct omfs_header {
+        __be64 h_self;                  /* FS block where this is located */
+        __be32 h_body_size;             /* size of useful data after header */
+        __be16 h_crc;                   /* crc-ccitt of body_size bytes */
+        char h_fill1[2];
+        u8 h_version;                   /* version, always 1 */
+        char h_type;                    /* OMFS_INODE_X */
+        u8 h_magic;                     /* OMFS_IMAGIC */
+        u8 h_check_xor;                 /* XOR of header bytes before this */
+        __be32 h_fill2;
+};
+
+Files and directories are both represented by omfs_inode:
+
+struct omfs_inode {
+        struct omfs_header i_head;      /* header */
+        __be64 i_parent;                /* parent containing this inode */
+        __be64 i_sibling;               /* next inode in hash bucket */
+        __be64 i_ctime;                 /* ctime, in milliseconds */
+        char i_fill1[35];
+        char i_type;                    /* OMFS_[DIR,FILE] */
+        __be32 i_fill2;
+        char i_fill3[64];
+        char i_name[OMFS_NAMELEN];      /* filename */
+        __be64 i_size;                  /* size of file, in bytes */
+};
+
+Directories in OMFS are implemented as a large hash table.  Filenames are
+hashed then prepended into the bucket list beginning at OMFS_DIR_START.
+Lookup requires hashing the filename, then seeking across i_sibling pointers
+until a match is found on i_name.  Empty buckets are represented by block
+pointers with all-1s (~0).
+
+A file is an omfs_inode structure followed by an extent table beginning at
+OMFS_EXTENT_START:
+
+struct omfs_extent_entry {
+        __be64 e_cluster;               /* start location of a set of blocks */
+        __be64 e_blocks;                /* number of blocks after e_cluster */
+};
+
+struct omfs_extent {
+        __be64 e_next;                  /* next extent table location */
+        __be32 e_extent_count;          /* total # extents in this table */
+        __be32 e_fill;
+        struct omfs_extent_entry e_entry;       /* start of extent entries */
+};
+
+Each extent holds the block offset followed by number of blocks allocated to
+the extent.  The final extent in each table is a terminator with e_cluster
+being ~0 and e_blocks being ones'-complement of the total number of blocks
+in the table.
+
+If this table overflows, a continuation inode is written and pointed to by
+e_next.  These have a header but lack the rest of the inode structure.
+

+ 46 - 2
Documentation/filesystems/proc.txt

@@ -296,6 +296,7 @@ Table 1-4: Kernel info in /proc
  uptime      System uptime                                     
  version     Kernel version                                    
  video	     bttv info of video resources			(2.4)
+ vmallocinfo Show vmalloced areas
 ..............................................................................
 
 You can,  for  example,  check  which interrupts are currently in use and what
@@ -557,6 +558,49 @@ VmallocTotal: total size of vmalloc memory area
  VmallocUsed: amount of vmalloc area which is used
 VmallocChunk: largest contigious block of vmalloc area which is free
 
+..............................................................................
+
+vmallocinfo:
+
+Provides information about vmalloced/vmaped areas. One line per area,
+containing the virtual address range of the area, size in bytes,
+caller information of the creator, and optional information depending
+on the kind of area :
+
+ pages=nr    number of pages
+ phys=addr   if a physical address was specified
+ ioremap     I/O mapping (ioremap() and friends)
+ vmalloc     vmalloc() area
+ vmap        vmap()ed pages
+ user        VM_USERMAP area
+ vpages      buffer for pages pointers was vmalloced (huge area)
+ N<node>=nr  (Only on NUMA kernels)
+             Number of pages allocated on memory node <node>
+
+> cat /proc/vmallocinfo
+0xffffc20000000000-0xffffc20000201000 2101248 alloc_large_system_hash+0x204 ...
+  /0x2c0 pages=512 vmalloc N0=128 N1=128 N2=128 N3=128
+0xffffc20000201000-0xffffc20000302000 1052672 alloc_large_system_hash+0x204 ...
+  /0x2c0 pages=256 vmalloc N0=64 N1=64 N2=64 N3=64
+0xffffc20000302000-0xffffc20000304000    8192 acpi_tb_verify_table+0x21/0x4f...
+  phys=7fee8000 ioremap
+0xffffc20000304000-0xffffc20000307000   12288 acpi_tb_verify_table+0x21/0x4f...
+  phys=7fee7000 ioremap
+0xffffc2000031d000-0xffffc2000031f000    8192 init_vdso_vars+0x112/0x210
+0xffffc2000031f000-0xffffc2000032b000   49152 cramfs_uncompress_init+0x2e ...
+  /0x80 pages=11 vmalloc N0=3 N1=3 N2=2 N3=3
+0xffffc2000033a000-0xffffc2000033d000   12288 sys_swapon+0x640/0xac0      ...
+  pages=2 vmalloc N1=2
+0xffffc20000347000-0xffffc2000034c000   20480 xt_alloc_table_info+0xfe ...
+  /0x130 [x_tables] pages=4 vmalloc N0=4
+0xffffffffa0000000-0xffffffffa000f000   61440 sys_init_module+0xc27/0x1d00 ...
+   pages=14 vmalloc N2=14
+0xffffffffa000f000-0xffffffffa0014000   20480 sys_init_module+0xc27/0x1d00 ...
+   pages=4 vmalloc N1=4
+0xffffffffa0014000-0xffffffffa0017000   12288 sys_init_module+0xc27/0x1d00 ...
+   pages=2 vmalloc N1=2
+0xffffffffa0017000-0xffffffffa0022000   45056 sys_init_module+0xc27/0x1d00 ...
+   pages=10 vmalloc N0=10
 
 1.3 IDE devices in /proc/ide
 ----------------------------
@@ -887,7 +931,7 @@ group_prealloc  max_to_scan  mb_groups  mb_history  min_to_scan  order2_req
 stats  stream_req
 
 mb_groups:
-This file gives the details of mutiblock allocator buddy cache of free blocks
+This file gives the details of multiblock allocator buddy cache of free blocks
 
 mb_history:
 Multiblock allocation history.
@@ -1430,7 +1474,7 @@ used because pages_free(1355) is smaller than watermark + protection[2]
 normal page requirement. If requirement is DMA zone(index=0), protection[0]
 (=0) is used.
 
-zone[i]'s protection[j] is calculated by following exprssion.
+zone[i]'s protection[j] is calculated by following expression.
 
 (i < j):
   zone[i]->protection[j]

+ 10 - 0
Documentation/filesystems/relay.txt

@@ -294,6 +294,16 @@ user-defined data with a channel, and is immediately available
 (including in create_buf_file()) via chan->private_data or
 buf->chan->private_data.
 
+Buffer-only channels
+--------------------
+
+These channels have no files associated and can be created with
+relay_open(NULL, NULL, ...). Such channels are useful in scenarios such
+as when doing early tracing in the kernel, before the VFS is up. In these
+cases, one may open a buffer-only channel and then call
+relay_late_setup_files() when the kernel is ready to handle files,
+to expose the buffered data to the userspace.
+
 Channel 'modes'
 ---------------
 

+ 6 - 0
Documentation/filesystems/sysfs.txt

@@ -248,6 +248,7 @@ The top level sysfs directory looks like:
 block/
 bus/
 class/
+dev/
 devices/
 firmware/
 net/
@@ -274,6 +275,11 @@ fs/ contains a directory for some filesystems.  Currently each
 filesystem wanting to export attributes must create its own hierarchy
 below fs/ (see ./fuse.txt for an example).
 
+dev/ contains two directories char/ and block/. Inside these two
+directories there are symlinks named <major>:<minor>.  These symlinks
+point to the sysfs directory for the given device.  /sys/dev provides a
+quick way to lookup the sysfs interface for a device from the result of
+a stat(2) operation.
 
 More information can driver-model specific features can be found in
 Documentation/driver-model/. 

+ 8 - 0
Documentation/filesystems/vfat.txt

@@ -96,6 +96,14 @@ shortname=lower|win95|winnt|mixed
 			emulate the Windows 95 rule for create.
 		 Default setting is `lower'.
 
+tz=UTC        -- Interpret timestamps as UTC rather than local time.
+                 This option disables the conversion of timestamps
+                 between local time (as used by Windows on FAT) and UTC
+                 (which Linux uses internally).  This is particuluarly
+                 useful when mounting devices (like digital cameras)
+                 that are set to UTC in order to avoid the pitfalls of
+                 local time.
+
 <bool>: 0,1,yes,no,true,false
 
 TODO

+ 3 - 3
Documentation/filesystems/vfs.txt

@@ -143,7 +143,7 @@ struct file_system_type {
 
 The get_sb() method has the following arguments:
 
-  struct file_system_type *fs_type: decribes the filesystem, partly initialized
+  struct file_system_type *fs_type: describes the filesystem, partly initialized
   	by the specific filesystem code
 
   int flags: mount flags
@@ -895,9 +895,9 @@ struct dentry_operations {
 	iput() yourself
 
   d_dname: called when the pathname of a dentry should be generated.
-	Usefull for some pseudo filesystems (sockfs, pipefs, ...) to delay
+	Useful for some pseudo filesystems (sockfs, pipefs, ...) to delay
 	pathname generation. (Instead of doing it when dentry is created,
-	its done only when the path is needed.). Real filesystems probably
+	it's done only when the path is needed.). Real filesystems probably
 	dont want to use it, because their dentries are present in global
 	dcache hash, so their hash should be an invariant. As no lock is
 	held, d_dname() should not try to modify the dentry itself, unless

+ 1 - 0
Documentation/ftrace.txt

@@ -4,6 +4,7 @@
 Copyright 2008 Red Hat Inc.
    Author:   Steven Rostedt <srostedt@redhat.com>
   License:   The GNU Free Documentation License, Version 1.2
+               (dual licensed under the GPL v2)
 Reviewers:   Elias Oltmanns, Randy Dunlap, Andrew Morton,
 	     John Kacur, and David Teigland.
 

+ 129 - 6
Documentation/gpio.txt

@@ -347,15 +347,12 @@ necessarily be nonportable.
 Dynamic definition of GPIOs is not currently standard; for example, as
 a side effect of configuring an add-on board with some GPIO expanders.
 
-These calls are purely for kernel space, but a userspace API could be built
-on top of them.
-
 
 GPIO implementor's framework (OPTIONAL)
 =======================================
 As noted earlier, there is an optional implementation framework making it
 easier for platforms to support different kinds of GPIO controller using
-the same programming interface.
+the same programming interface.  This framework is called "gpiolib".
 
 As a debugging aid, if debugfs is available a /sys/kernel/debug/gpio file
 will be found there.  That will list all the controllers registered through
@@ -392,11 +389,21 @@ either NULL or the label associated with that GPIO when it was requested.
 
 Platform Support
 ----------------
-To support this framework, a platform's Kconfig will "select HAVE_GPIO_LIB"
+To support this framework, a platform's Kconfig will "select" either
+ARCH_REQUIRE_GPIOLIB or ARCH_WANT_OPTIONAL_GPIOLIB
 and arrange that its <asm/gpio.h> includes <asm-generic/gpio.h> and defines
 three functions: gpio_get_value(), gpio_set_value(), and gpio_cansleep().
 They may also want to provide a custom value for ARCH_NR_GPIOS.
 
+ARCH_REQUIRE_GPIOLIB means that the gpio-lib code will always get compiled
+into the kernel on that architecture.
+
+ARCH_WANT_OPTIONAL_GPIOLIB means the gpio-lib code defaults to off and the user
+can enable it and build it into the kernel optionally.
+
+If neither of these options are selected, the platform does not support
+GPIOs through GPIO-lib and the code cannot be enabled by the user.
+
 Trivial implementations of those functions can directly use framework
 code, which always dispatches through the gpio_chip:
 
@@ -439,4 +446,120 @@ becomes available.  That may mean the device should not be registered until
 calls for that GPIO can work.  One way to address such dependencies is for
 such gpio_chip controllers to provide setup() and teardown() callbacks to
 board specific code; those board specific callbacks would register devices
-once all the necessary resources are available.
+once all the necessary resources are available, and remove them later when
+the GPIO controller device becomes unavailable.
+
+
+Sysfs Interface for Userspace (OPTIONAL)
+========================================
+Platforms which use the "gpiolib" implementors framework may choose to
+configure a sysfs user interface to GPIOs.  This is different from the
+debugfs interface, since it provides control over GPIO direction and
+value instead of just showing a gpio state summary.  Plus, it could be
+present on production systems without debugging support.
+
+Given approprate hardware documentation for the system, userspace could
+know for example that GPIO #23 controls the write protect line used to
+protect boot loader segments in flash memory.  System upgrade procedures
+may need to temporarily remove that protection, first importing a GPIO,
+then changing its output state, then updating the code before re-enabling
+the write protection.  In normal use, GPIO #23 would never be touched,
+and the kernel would have no need to know about it.
+
+Again depending on appropriate hardware documentation, on some systems
+userspace GPIO can be used to determine system configuration data that
+standard kernels won't know about.  And for some tasks, simple userspace
+GPIO drivers could be all that the system really needs.
+
+Note that standard kernel drivers exist for common "LEDs and Buttons"
+GPIO tasks:  "leds-gpio" and "gpio_keys", respectively.  Use those
+instead of talking directly to the GPIOs; they integrate with kernel
+frameworks better than your userspace code could.
+
+
+Paths in Sysfs
+--------------
+There are three kinds of entry in /sys/class/gpio:
+
+   -	Control interfaces used to get userspace control over GPIOs;
+
+   -	GPIOs themselves; and
+
+   -	GPIO controllers ("gpio_chip" instances).
+
+That's in addition to standard files including the "device" symlink.
+
+The control interfaces are write-only:
+
+    /sys/class/gpio/
+
+    	"export" ... Userspace may ask the kernel to export control of
+		a GPIO to userspace by writing its number to this file.
+
+		Example:  "echo 19 > export" will create a "gpio19" node
+		for GPIO #19, if that's not requested by kernel code.
+
+    	"unexport" ... Reverses the effect of exporting to userspace.
+
+		Example:  "echo 19 > unexport" will remove a "gpio19"
+		node exported using the "export" file.
+
+GPIO signals have paths like /sys/class/gpio/gpio42/ (for GPIO #42)
+and have the following read/write attributes:
+
+    /sys/class/gpio/gpioN/
+
+	"direction" ... reads as either "in" or "out".  This value may
+		normally be written.  Writing as "out" defaults to
+		initializing the value as low.  To ensure glitch free
+		operation, values "low" and "high" may be written to
+		configure the GPIO as an output with that initial value.
+
+		Note that this attribute *will not exist* if the kernel
+		doesn't support changing the direction of a GPIO, or
+		it was exported by kernel code that didn't explicitly
+		allow userspace to reconfigure this GPIO's direction.
+
+	"value" ... reads as either 0 (low) or 1 (high).  If the GPIO
+		is configured as an output, this value may be written;
+		any nonzero value is treated as high.
+
+GPIO controllers have paths like /sys/class/gpio/chipchip42/ (for the
+controller implementing GPIOs starting at #42) and have the following
+read-only attributes:
+
+    /sys/class/gpio/gpiochipN/
+
+    	"base" ... same as N, the first GPIO managed by this chip
+
+    	"label" ... provided for diagnostics (not always unique)
+
+    	"ngpio" ... how many GPIOs this manges (N to N + ngpio - 1)
+
+Board documentation should in most cases cover what GPIOs are used for
+what purposes.  However, those numbers are not always stable; GPIOs on
+a daughtercard might be different depending on the base board being used,
+or other cards in the stack.  In such cases, you may need to use the
+gpiochip nodes (possibly in conjunction with schematics) to determine
+the correct GPIO number to use for a given signal.
+
+
+Exporting from Kernel code
+--------------------------
+Kernel code can explicitly manage exports of GPIOs which have already been
+requested using gpio_request():
+
+	/* export the GPIO to userspace */
+	int gpio_export(unsigned gpio, bool direction_may_change);
+
+	/* reverse gpio_export() */
+	void gpio_unexport();
+
+After a kernel driver requests a GPIO, it may only be made available in
+the sysfs interface by gpio_export().  The driver can control whether the
+signal direction may change.  This helps drivers prevent userspace code
+from accidentally clobbering important system state.
+
+This explicit exporting can help with debugging (by making some kinds
+of experiments easier), or can provide an always-there interface that's
+suitable for documenting as part of a board support package.

+ 41 - 16
Documentation/hwmon/dme1737

@@ -10,6 +10,10 @@ Supported chips:
     Prefix: 'sch311x'
     Addresses scanned: none, address read from Super-I/O config space
     Datasheet: http://www.nuhorizons.com/FeaturedProducts/Volume1/SMSC/311x.pdf
+  * SMSC SCH5027
+    Prefix: 'sch5027'
+    Addresses scanned: I2C 0x2c, 0x2d, 0x2e
+    Datasheet: Provided by SMSC upon request and under NDA
 
 Authors:
     Juerg Haefliger <juergh@gmail.com>
@@ -22,34 +26,36 @@ Module Parameters
 			and PWM output control functions. Using this parameter
 			shouldn't be required since the BIOS usually takes care
 			of this.
-
-Note that there is no need to use this parameter if the driver loads without
-complaining. The driver will say so if it is necessary.
+* probe_all_addr: bool	Include non-standard LPC addresses 0x162e and 0x164e
+			when probing for ISA devices. This is required for the
+			following boards:
+			- VIA EPIA SN18000
 
 
 Description
 -----------
 
 This driver implements support for the hardware monitoring capabilities of the
-SMSC DME1737 and Asus A8000 (which are the same) and SMSC SCH311x Super-I/O
-chips. These chips feature monitoring of 3 temp sensors temp[1-3] (2 remote
-diodes and 1 internal), 7 voltages in[0-6] (6 external and 1 internal) and up
-to 6 fan speeds fan[1-6]. Additionally, the chips implement up to 5 PWM
-outputs pwm[1-3,5-6] for controlling fan speeds both manually and
+SMSC DME1737 and Asus A8000 (which are the same), SMSC SCH5027, and SMSC
+SCH311x Super-I/O chips. These chips feature monitoring of 3 temp sensors
+temp[1-3] (2 remote diodes and 1 internal), 7 voltages in[0-6] (6 external and
+1 internal) and up to 6 fan speeds fan[1-6]. Additionally, the chips implement
+up to 5 PWM outputs pwm[1-3,5-6] for controlling fan speeds both manually and
 automatically.
 
-For the DME1737 and A8000, fan[1-2] and pwm[1-2] are always present. Fan[3-6]
-and pwm[3,5-6] are optional features and their availability depends on the
-configuration of the chip. The driver will detect which features are present
-during initialization and create the sysfs attributes accordingly.
+For the DME1737, A8000 and SCH5027, fan[1-2] and pwm[1-2] are always present.
+Fan[3-6] and pwm[3,5-6] are optional features and their availability depends on
+the configuration of the chip. The driver will detect which features are
+present during initialization and create the sysfs attributes accordingly.
 
 For the SCH311x, fan[1-3] and pwm[1-3] are always present and fan[4-6] and
 pwm[5-6] don't exist.
 
-The hardware monitoring features of the DME1737 and A8000 are only accessible
-via SMBus, while the SCH311x only provides access via the ISA bus. The driver
-will therefore register itself as an I2C client driver if it detects a DME1737
-or A8000 and as a platform driver if it detects a SCH311x chip.
+The hardware monitoring features of the DME1737, A8000, and SCH5027 are only
+accessible via SMBus, while the SCH311x only provides access via the ISA bus.
+The driver will therefore register itself as an I2C client driver if it detects
+a DME1737, A8000, or SCH5027 and as a platform driver if it detects a SCH311x
+chip.
 
 
 Voltage Monitoring
@@ -60,6 +66,7 @@ scaling resistors. The values returned by the driver therefore reflect true
 millivolts and don't need scaling. The voltage inputs are mapped as follows
 (the last column indicates the input ranges):
 
+DME1737, A8000:
 	in0: +5VTR	(+5V standby)		0V - 6.64V
 	in1: Vccp	(processor core)	0V - 3V
 	in2: VCC	(internal +3.3V)	0V - 4.38V
@@ -68,6 +75,24 @@ millivolts and don't need scaling. The voltage inputs are mapped as follows
 	in5: VTR	(+3.3V standby)		0V - 4.38V
 	in6: Vbat	(+3.0V)			0V - 4.38V
 
+SCH311x:
+	in0: +2.5V				0V - 6.64V
+	in1: Vccp	(processor core)	0V - 2V
+	in2: VCC	(internal +3.3V)	0V - 4.38V
+	in3: +5V				0V - 6.64V
+	in4: +12V				0V - 16V
+	in5: VTR	(+3.3V standby)		0V - 4.38V
+	in6: Vbat	(+3.0V)			0V - 4.38V
+
+SCH5027:
+	in0: +5VTR	(+5V standby)		0V - 6.64V
+	in1: Vccp	(processor core)	0V - 3V
+	in2: VCC	(internal +3.3V)	0V - 4.38V
+	in3: V2_IN				0V - 1.5V
+	in4: V1_IN				0V - 1.5V
+	in5: VTR	(+3.3V standby)		0V - 4.38V
+	in6: Vbat	(+3.0V)			0V - 4.38V
+
 Each voltage input has associated min and max limits which trigger an alarm
 when crossed.
 

+ 7 - 6
Documentation/hwmon/it87

@@ -6,12 +6,14 @@ Supported chips:
     Prefix: 'it87'
     Addresses scanned: from Super I/O config space (8 I/O ports)
     Datasheet: Publicly available at the ITE website
-               http://www.ite.com.tw/
+               http://www.ite.com.tw/product_info/file/pc/IT8705F_V.0.4.1.pdf
   * IT8712F
     Prefix: 'it8712'
     Addresses scanned: from Super I/O config space (8 I/O ports)
     Datasheet: Publicly available at the ITE website
-               http://www.ite.com.tw/
+               http://www.ite.com.tw/product_info/file/pc/IT8712F_V0.9.1.pdf
+               http://www.ite.com.tw/product_info/file/pc/Errata%20V0.1%20for%20IT8712F%20V0.9.1.pdf
+               http://www.ite.com.tw/product_info/file/pc/IT8712F_V0.9.3.pdf
   * IT8716F/IT8726F
     Prefix: 'it8716'
     Addresses scanned: from Super I/O config space (8 I/O ports)
@@ -90,14 +92,13 @@ upper VID bits share their pins with voltage inputs (in5 and in6) so you
 can't have both on a given board.
 
 The IT8716F, IT8718F and later IT8712F revisions have support for
-2 additional fans. They are supported by the driver for the IT8716F and
-IT8718F but not for the IT8712F
+2 additional fans. The additional fans are supported by the driver.
 
 The IT8716F and IT8718F, and late IT8712F and IT8705F also have optional
 16-bit tachometer counters for fans 1 to 3. This is better (no more fan
 clock divider mess) but not compatible with the older chips and
-revisions. For now, the driver only uses the 16-bit mode on the
-IT8716F and IT8718F.
+revisions. The 16-bit tachometer mode is enabled by the driver when one
+of the above chips is detected.
 
 The IT8726F is just bit enhanced IT8716F with additional hardware
 for AMD power sequencing. Therefore the chip will appear as IT8716F

+ 4 - 7
Documentation/hwmon/lm85

@@ -96,11 +96,6 @@ initial testing of the ADM1027 it was 1.00 degC steps. Analog Devices has
 confirmed this "bug". The ADT7463 is reported to work as described in the
 documentation. The current lm85 driver does not show the offset register.
 
-The ADT7463 has a THERM asserted counter. This counter has a 22.76ms
-resolution and a range of 5.8 seconds. The driver implements a 32-bit
-accumulator of the counter value to extend the range to over a year. The
-counter will stay at it's max value until read.
-
 See the vendor datasheets for more information. There is application note
 from National (AN-1260) with some additional information about the LM85.
 The Analog Devices datasheet is very detailed and describes a procedure for
@@ -206,13 +201,15 @@ Configuration choices:
 
 The National LM85's have two vendor specific configuration
 features. Tach. mode and Spinup Control. For more details on these,
-see the LM85 datasheet or Application Note AN-1260.
+see the LM85 datasheet or Application Note AN-1260. These features
+are not currently supported by the lm85 driver.
 
 The Analog Devices ADM1027 has several vendor specific enhancements.
 The number of pulses-per-rev of the fans can be set, Tach monitoring
 can be optimized for PWM operation, and an offset can be applied to
 the temperatures to compensate for systemic errors in the
-measurements.
+measurements. These features are not currently supported by the lm85
+driver.
 
 In addition to the ADM1027 features, the ADT7463 also has Tmin control
 and THERM asserted counts. Automatic Tmin control acts to adjust the

+ 0 - 4
Documentation/hwmon/w83627hf

@@ -40,10 +40,6 @@ Module Parameters
   (default is 1)
   Use 'init=0' to bypass initializing the chip.
   Try this if your computer crashes when you load the module.
-* reset: int
-  (default is 0)
-  The driver used to reset the chip on load, but does no more. Use
-  'reset=1' to restore the old behavior. Report if you need to do this.
 
 Description
 -----------

+ 3 - 3
Documentation/hwmon/w83791d

@@ -22,6 +22,7 @@ Credits:
 
 Additional contributors:
     Sven Anders <anders@anduras.de>
+    Marc Hulsman <m.hulsman@tudelft.nl>
 
 Module Parameters
 -----------------
@@ -67,9 +68,8 @@ on until the temperature falls below the Hysteresis value.
 
 Fan rotation speeds are reported in RPM (rotations per minute). An alarm is
 triggered if the rotation speed has dropped below a programmable limit. Fan
-readings can be divided by a programmable divider (1, 2, 4, 8 for fan 1/2/3
-and 1, 2, 4, 8, 16, 32, 64 or 128 for fan 4/5) to give the readings more
-range or accuracy.
+readings can be divided by a programmable divider (1, 2, 4, 8, 16,
+32, 64 or 128 for all fans) to give the readings more range or accuracy.
 
 Voltage sensors (also known as IN sensors) report their values in millivolts.
 An alarm is triggered if the voltage has crossed a programmable minimum

+ 281 - 0
Documentation/i2c/upgrading-clients

@@ -0,0 +1,281 @@
+Upgrading I2C Drivers to the new 2.6 Driver Model
+=================================================
+
+Ben Dooks <ben-linux@fluff.org>
+
+Introduction
+------------
+
+This guide outlines how to alter existing Linux 2.6 client drivers from
+the old to the new new binding methods.
+
+
+Example old-style driver
+------------------------
+
+
+struct example_state {
+	struct i2c_client	client;
+	....
+};
+
+static struct i2c_driver example_driver;
+
+static unsigned short ignore[] = { I2C_CLIENT_END };
+static unsigned short normal_addr[] = { OUR_ADDR, I2C_CLIENT_END };
+
+I2C_CLIENT_INSMOD;
+
+static int example_attach(struct i2c_adapter *adap, int addr, int kind)
+{
+	struct example_state *state;
+	struct device *dev = &adap->dev;  /* to use for dev_ reports */
+	int ret;
+
+	state = kzalloc(sizeof(struct example_state), GFP_KERNEL);
+	if (state == NULL) {
+		dev_err(dev, "failed to create our state\n");
+		return -ENOMEM;
+	}
+
+	example->client.addr    = addr;
+	example->client.flags   = 0;
+	example->client.adapter = adap;
+
+	i2c_set_clientdata(&state->i2c_client, state);
+	strlcpy(client->i2c_client.name, "example", I2C_NAME_SIZE);
+
+	ret = i2c_attach_client(&state->i2c_client);
+	if (ret < 0) {
+		dev_err(dev, "failed to attach client\n");
+		kfree(state);
+		return ret;
+	}
+
+	dev = &state->i2c_client.dev;
+
+	/* rest of the initialisation goes here. */
+
+	dev_info(dev, "example client created\n");
+
+	return 0;
+}
+
+static int __devexit example_detach(struct i2c_client *client)
+{
+	struct example_state *state = i2c_get_clientdata(client);
+
+	i2c_detach_client(client);
+	kfree(state);
+	return 0;
+}
+
+static int example_attach_adapter(struct i2c_adapter *adap)
+{
+	return i2c_probe(adap, &addr_data, example_attach);
+}
+
+static struct i2c_driver example_driver = {
+ 	.driver		= {
+		.owner		= THIS_MODULE,
+		.name		= "example",
+	},
+	.attach_adapter = example_attach_adapter,
+	.detach_client	= __devexit_p(example_detach),
+	.suspend	= example_suspend,
+	.resume		= example_resume,
+};
+
+
+Updating the client
+-------------------
+
+The new style binding model will check against a list of supported
+devices and their associated address supplied by the code registering
+the busses. This means that the driver .attach_adapter and
+.detach_adapter methods can be removed, along with the addr_data,
+as follows:
+
+- static struct i2c_driver example_driver;
+
+- static unsigned short ignore[] = { I2C_CLIENT_END };
+- static unsigned short normal_addr[] = { OUR_ADDR, I2C_CLIENT_END };
+
+- I2C_CLIENT_INSMOD;
+
+- static int example_attach_adapter(struct i2c_adapter *adap)
+- {
+- 	return i2c_probe(adap, &addr_data, example_attach);
+- }
+
+ static struct i2c_driver example_driver = {
+-	.attach_adapter = example_attach_adapter,
+-	.detach_client	= __devexit_p(example_detach),
+ }
+
+Add the probe and remove methods to the i2c_driver, as so:
+
+ static struct i2c_driver example_driver = {
++	.probe		= example_probe,
++	.remove		= __devexit_p(example_remove),
+ }
+
+Change the example_attach method to accept the new parameters
+which include the i2c_client that it will be working with:
+
+- static int example_attach(struct i2c_adapter *adap, int addr, int kind)
++ static int example_probe(struct i2c_client *client,
++			   const struct i2c_device_id *id)
+
+Change the name of example_attach to example_probe to align it with the
+i2c_driver entry names. The rest of the probe routine will now need to be
+changed as the i2c_client has already been setup for use.
+
+The necessary client fields have already been setup before
+the probe function is called, so the following client setup
+can be removed:
+
+-	example->client.addr    = addr;
+-	example->client.flags   = 0;
+-	example->client.adapter = adap;
+-
+-	strlcpy(client->i2c_client.name, "example", I2C_NAME_SIZE);
+
+The i2c_set_clientdata is now:
+
+-	i2c_set_clientdata(&state->client, state);
++	i2c_set_clientdata(client, state);
+
+The call to i2c_attach_client is no longer needed, if the probe
+routine exits successfully, then the driver will be automatically
+attached by the core. Change the probe routine as so:
+
+-	ret = i2c_attach_client(&state->i2c_client);
+-	if (ret < 0) {
+-		dev_err(dev, "failed to attach client\n");
+-		kfree(state);
+-		return ret;
+-	}
+
+
+Remove the storage of 'struct i2c_client' from the 'struct example_state'
+as we are provided with the i2c_client in our example_probe. Instead we
+store a pointer to it for when it is needed.
+
+struct example_state {
+-	struct i2c_client	client;
++	struct i2c_client	*client;
+
+the new i2c client as so:
+
+-	struct device *dev = &adap->dev;  /* to use for dev_ reports */
++ 	struct device *dev = &i2c_client->dev;  /* to use for dev_ reports */
+
+And remove the change after our client is attached, as the driver no
+longer needs to register a new client structure with the core:
+
+-	dev = &state->i2c_client.dev;
+
+In the probe routine, ensure that the new state has the client stored
+in it:
+
+static int example_probe(struct i2c_client *i2c_client,
+			 const struct i2c_device_id *id)
+{
+	struct example_state *state;
+ 	struct device *dev = &i2c_client->dev;
+	int ret;
+
+	state = kzalloc(sizeof(struct example_state), GFP_KERNEL);
+	if (state == NULL) {
+		dev_err(dev, "failed to create our state\n");
+		return -ENOMEM;
+	}
+
++	state->client = i2c_client;
+
+Update the detach method, by changing the name to _remove and
+to delete the i2c_detach_client call. It is possible that you
+can also remove the ret variable as it is not not needed for
+any of the core functions.
+
+- static int __devexit example_detach(struct i2c_client *client)
++ static int __devexit example_remove(struct i2c_client *client)
+{
+	struct example_state *state = i2c_get_clientdata(client);
+
+-	i2c_detach_client(client);
+
+And finally ensure that we have the correct ID table for the i2c-core
+and other utilities:
+
++ struct i2c_device_id example_idtable[] = {
++       { "example", 0 },
++       { }
++};
++
++MODULE_DEVICE_TABLE(i2c, example_idtable);
+
+static struct i2c_driver example_driver = {
+ 	.driver		= {
+		.owner		= THIS_MODULE,
+		.name		= "example",
+	},
++	.id_table	= example_ids,
+
+
+Our driver should now look like this:
+
+struct example_state {
+	struct i2c_client	*client;
+	....
+};
+
+static int example_probe(struct i2c_client *client,
+		     	 const struct i2c_device_id *id)
+{
+	struct example_state *state;
+	struct device *dev = &client->dev;
+
+	state = kzalloc(sizeof(struct example_state), GFP_KERNEL);
+	if (state == NULL) {
+		dev_err(dev, "failed to create our state\n");
+		return -ENOMEM;
+	}
+
+	state->client = client;
+	i2c_set_clientdata(client, state);
+
+	/* rest of the initialisation goes here. */
+
+	dev_info(dev, "example client created\n");
+
+	return 0;
+}
+
+static int __devexit example_remove(struct i2c_client *client)
+{
+	struct example_state *state = i2c_get_clientdata(client);
+
+	kfree(state);
+	return 0;
+}
+
+static struct i2c_device_id example_idtable[] = {
+	{ "example", 0 },
+	{ }
+};
+
+MODULE_DEVICE_TABLE(i2c, example_idtable);
+
+static struct i2c_driver example_driver = {
+ 	.driver		= {
+		.owner		= THIS_MODULE,
+		.name		= "example",
+	},
+	.id_table	= example_idtable,
+	.probe		= example_probe,
+	.remove		= __devexit_p(example_remove),
+	.suspend	= example_suspend,
+	.resume		= example_resume,
+};

+ 4 - 4
Documentation/ia64/kvm.txt

@@ -50,9 +50,9 @@ Note: For step 2, please make sure that host page size == TARGET_PAGE_SIZE of qe
 		/usr/local/bin/qemu-system-ia64 -smp xx -m 512 -hda $your_image
 		(xx is the number of virtual processors for the guest, now the maximum value is 4)
 
-5. Known possibile issue on some platforms with old Firmware.
+5. Known possible issue on some platforms with old Firmware.
 
-If meet strange host crashe issues, try to solve it through either of the following ways:
+In the event of strange host crash issues, try to solve it through either of the following ways:
 
 (1): Upgrade your Firmware to the latest one.
 
@@ -65,8 +65,8 @@ index 0b53344..f02b0f7 100644
 	mov ar.pfs = loc1
 	mov rp = loc0
 	;;
--	srlz.d				// seralize restoration of psr.l
-+	srlz.i			// seralize restoration of psr.l
+-	srlz.d				// serialize restoration of psr.l
++	srlz.i			// serialize restoration of psr.l
 +	;;
 	br.ret.sptk.many b0
  END(ia64_pal_call_static)

+ 137 - 0
Documentation/ia64/paravirt_ops.txt

@@ -0,0 +1,137 @@
+Paravirt_ops on IA64
+====================
+                          21 May 2008, Isaku Yamahata <yamahata@valinux.co.jp>
+
+
+Introduction
+------------
+The aim of this documentation is to help with maintainability and/or to
+encourage people to use paravirt_ops/IA64.
+
+paravirt_ops (pv_ops in short) is a way for virtualization support of
+Linux kernel on x86. Several ways for virtualization support were
+proposed, paravirt_ops is the winner.
+On the other hand, now there are also several IA64 virtualization
+technologies like kvm/IA64, xen/IA64 and many other academic IA64
+hypervisors so that it is good to add generic virtualization
+infrastructure on Linux/IA64.
+
+
+What is paravirt_ops?
+---------------------
+It has been developed on x86 as virtualization support via API, not ABI.
+It allows each hypervisor to override operations which are important for
+hypervisors at API level. And it allows a single kernel binary to run on
+all supported execution environments including native machine.
+Essentially paravirt_ops is a set of function pointers which represent
+operations corresponding to low level sensitive instructions and high
+level functionalities in various area. But one significant difference
+from usual function pointer table is that it allows optimization with
+binary patch. It is because some of these operations are very
+performance sensitive and indirect call overhead is not negligible.
+With binary patch, indirect C function call can be transformed into
+direct C function call or in-place execution to eliminate the overhead.
+
+Thus, operations of paravirt_ops are classified into three categories.
+- simple indirect call
+  These operations correspond to high level functionality so that the
+  overhead of indirect call isn't very important.
+
+- indirect call which allows optimization with binary patch
+  Usually these operations correspond to low level instructions. They
+  are called frequently and performance critical. So the overhead is
+  very important.
+
+- a set of macros for hand written assembly code
+  Hand written assembly codes (.S files) also need paravirtualization
+  because they include sensitive instructions or some of code paths in
+  them are very performance critical.
+
+
+The relation to the IA64 machine vector
+---------------------------------------
+Linux/IA64 has the IA64 machine vector functionality which allows the
+kernel to switch implementations (e.g. initialization, ipi, dma api...)
+depending on executing platform.
+We can replace some implementations very easily defining a new machine
+vector. Thus another approach for virtualization support would be
+enhancing the machine vector functionality.
+But paravirt_ops approach was taken because
+- virtualization support needs wider support than machine vector does.
+  e.g. low level instruction paravirtualization. It must be
+       initialized very early before platform detection.
+
+- virtualization support needs more functionality like binary patch.
+  Probably the calling overhead might not be very large compared to the
+  emulation overhead of virtualization. However in the native case, the
+  overhead should be eliminated completely.
+  A single kernel binary should run on each environment including native,
+  and the overhead of paravirt_ops on native environment should be as
+  small as possible.
+
+- for full virtualization technology, e.g. KVM/IA64 or
+  Xen/IA64 HVM domain, the result would be
+  (the emulated platform machine vector. probably dig) + (pv_ops).
+  This means that the virtualization support layer should be under
+  the machine vector layer.
+
+Possibly it might be better to move some function pointers from
+paravirt_ops to machine vector. In fact, Xen domU case utilizes both
+pv_ops and machine vector.
+
+
+IA64 paravirt_ops
+-----------------
+In this section, the concrete paravirt_ops will be discussed.
+Because of the architecture difference between ia64 and x86, the
+resulting set of functions is very different from x86 pv_ops.
+
+- C function pointer tables
+They are not very performance critical so that simple C indirect
+function call is acceptable. The following structures are defined at
+this moment. For details see linux/include/asm-ia64/paravirt.h
+  - struct pv_info
+    This structure describes the execution environment.
+  - struct pv_init_ops
+    This structure describes the various initialization hooks.
+  - struct pv_iosapic_ops
+    This structure describes hooks to iosapic operations.
+  - struct pv_irq_ops
+    This structure describes hooks to irq related operations
+  - struct pv_time_op
+    This structure describes hooks to steal time accounting.
+
+- a set of indirect calls which need optimization
+Currently this class of functions correspond to a subset of IA64
+intrinsics. At this moment the optimization with binary patch isn't
+implemented yet.
+struct pv_cpu_op is defined. For details see
+linux/include/asm-ia64/paravirt_privop.h
+Mostly they correspond to ia64 intrinsics 1-to-1.
+Caveat: Now they are defined as C indirect function pointers, but in
+order to support binary patch optimization, they will be changed
+using GCC extended inline assembly code.
+
+- a set of macros for hand written assembly code (.S files)
+For maintenance purpose, the taken approach for .S files is single
+source code and compile multiple times with different macros definitions.
+Each pv_ops instance must define those macros to compile.
+The important thing here is that sensitive, but non-privileged
+instructions must be paravirtualized and that some privileged
+instructions also need paravirtualization for reasonable performance.
+Developers who modify .S files must be aware of that. At this moment
+an easy checker is implemented to detect paravirtualization breakage.
+But it doesn't cover all the cases.
+
+Sometimes this set of macros is called pv_cpu_asm_op. But there is no
+corresponding structure in the source code.
+Those macros mostly 1:1 correspond to a subset of privileged
+instructions. See linux/include/asm-ia64/native/inst.h.
+And some functions written in assembly also need to be overrided so
+that each pv_ops instance have to define some macros. Again see
+linux/include/asm-ia64/native/inst.h.
+
+
+Those structures must be initialized very early before start_kernel.
+Probably initialized in head.S using multi entry point or some other trick.
+For native case implementation see linux/arch/ia64/kernel/paravirt.c.

+ 1 - 1
Documentation/input/cs461x.txt

@@ -31,7 +31,7 @@ The driver works with ALSA drivers simultaneously. For example, the xracer
 uses joystick as input device and PCM device as sound output in one time.
 There are no sound or input collisions detected. The source code have
 comments about them; but I've found the joystick can be initialized 
-separately of ALSA modules. So, you canm use only one joystick driver
+separately of ALSA modules. So, you can use only one joystick driver
 without ALSA drivers. The ALSA drivers are not needed to compile or
 run this driver.
 

+ 0 - 2
Documentation/input/gameport-programming.txt

@@ -1,5 +1,3 @@
-$Id: gameport-programming.txt,v 1.3 2001/04/24 13:51:37 vojtech Exp $
-
 Programming gameport drivers
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 

+ 0 - 1
Documentation/input/input.txt

@@ -1,7 +1,6 @@
 			  Linux Input drivers v1.0
 	       (c) 1999-2001 Vojtech Pavlik <vojtech@ucw.cz>
 			     Sponsored by SuSE
-	    $Id: input.txt,v 1.8 2002/05/29 03:15:01 bradleym Exp $
 ----------------------------------------------------------------------------
 
 0. Disclaimer

+ 0 - 2
Documentation/input/joystick-api.txt

@@ -5,8 +5,6 @@
 
 			      7 Aug 1998
 
-	$Id: joystick-api.txt,v 1.2 2001/05/08 21:21:23 vojtech Exp $
-
 1. Initialization
 ~~~~~~~~~~~~~~~~~
 

+ 0 - 1
Documentation/input/joystick-parport.txt

@@ -2,7 +2,6 @@
 	       (c) 1998-2000 Vojtech Pavlik <vojtech@ucw.cz>
 	       (c) 1998 Andree Borrmann <a.borrmann@tu-bs.de>
 			     Sponsored by SuSE
-	$Id: joystick-parport.txt,v 1.6 2001/09/25 09:31:32 vojtech Exp $
 ----------------------------------------------------------------------------
 
 0. Disclaimer

+ 0 - 1
Documentation/input/joystick.txt

@@ -1,7 +1,6 @@
 		       Linux Joystick driver v2.0.0
 	       (c) 1996-2000 Vojtech Pavlik <vojtech@ucw.cz>
 			     Sponsored by SuSE
-	   $Id: joystick.txt,v 1.12 2002/03/03 12:13:07 jdeneux Exp $
 ----------------------------------------------------------------------------
 
 0. Disclaimer

+ 2 - 2
Documentation/ioctl/ioctl-decoding.txt

@@ -1,6 +1,6 @@
 To decode a hex IOCTL code:
 
-Most architecures use this generic format, but check
+Most architectures use this generic format, but check
 include/ARCH/ioctl.h for specifics, e.g. powerpc
 uses 3 bits to encode read/write and 13 bits for size.
 
@@ -18,7 +18,7 @@ uses 3 bits to encode read/write and 13 bits for size.
  7-0	function #
 
 
- So for example 0x82187201 is a read with arg length of 0x218,
+So for example 0x82187201 is a read with arg length of 0x218,
 character 'r' function 1. Grepping the source reveals this is:
 
 #define VFAT_IOCTL_READDIR_BOTH         _IOR('r', 1, struct dirent [2])

+ 1 - 1
Documentation/iostats.txt

@@ -143,7 +143,7 @@ disk and partition statistics are consistent again. Since we still don't
 keep record of the partition-relative address, an operation is attributed to
 the partition which contains the first sector of the request after the
 eventual merges. As requests can be merged across partition, this could lead
-to some (probably insignificant) innacuracy.
+to some (probably insignificant) inaccuracy.
 
 Additional notes
 ----------------

+ 6 - 0
Documentation/isdn/README.mISDN

@@ -0,0 +1,6 @@
+mISDN is a new modular ISDN driver, in the long term it should replace
+the old I4L driver architecture for passiv ISDN cards.
+It was designed to allow a broad range of applications and interfaces
+but only have the basic function in kernel, the interface to the user
+space is based on sockets with a own address family AF_ISDN.
+

+ 10 - 10
Documentation/kdump/kdump.txt

@@ -65,26 +65,26 @@ Install kexec-tools
 
 2) Download the kexec-tools user-space package from the following URL:
 
-http://www.kernel.org/pub/linux/kernel/people/horms/kexec-tools/kexec-tools-testing.tar.gz
+http://www.kernel.org/pub/linux/kernel/people/horms/kexec-tools/kexec-tools.tar.gz
 
-This is a symlink to the latest version, which at the time of writing is
-20061214, the only release of kexec-tools-testing so far. As other versions
-are released, the older ones will remain available at
-http://www.kernel.org/pub/linux/kernel/people/horms/kexec-tools/
+This is a symlink to the latest version.
 
-Note: Latest kexec-tools-testing git tree is available at
+The latest kexec-tools git tree is available at:
 
-git://git.kernel.org/pub/scm/linux/kernel/git/horms/kexec-tools-testing.git
+git://git.kernel.org/pub/scm/linux/kernel/git/horms/kexec-tools.git
 or
-http://www.kernel.org/git/?p=linux/kernel/git/horms/kexec-tools-testing.git;a=summary
+http://www.kernel.org/git/?p=linux/kernel/git/horms/kexec-tools.git
+
+More information about kexec-tools can be found at
+http://www.kernel.org/pub/linux/kernel/people/horms/kexec-tools/README.html
 
 3) Unpack the tarball with the tar command, as follows:
 
-   tar xvpzf kexec-tools-testing.tar.gz
+   tar xvpzf kexec-tools.tar.gz
 
 4) Change to the kexec-tools directory, as follows:
 
-   cd kexec-tools-testing-VERSION
+   cd kexec-tools-VERSION
 
 5) Configure the package, as follows:
 

+ 50 - 12
Documentation/kernel-parameters.txt

@@ -87,7 +87,8 @@ parameter is applicable:
 	SH	SuperH architecture is enabled.
 	SMP	The kernel is an SMP kernel.
 	SPARC	Sparc architecture is enabled.
-	SWSUSP	Software suspend is enabled.
+	SWSUSP	Software suspend (hibernation) is enabled.
+	SUSPEND	System suspend states are enabled.
 	TS	Appropriate touchscreen support is enabled.
 	USB	USB support is enabled.
 	USBHID	USB Human Interface Device support is enabled.
@@ -147,10 +148,12 @@ and is between 256 and 4096 characters. It is defined in the file
 			default: 0
 
 	acpi_sleep=	[HW,ACPI] Sleep options
-			Format: { s3_bios, s3_mode, s3_beep, old_ordering }
+			Format: { s3_bios, s3_mode, s3_beep, s4_nohwsig, old_ordering }
 			See Documentation/power/video.txt for s3_bios and s3_mode.
 			s3_beep is for debugging; it makes the PC's speaker beep
 			as soon as the kernel's real-mode entry point is called.
+			s4_nohwsig prevents ACPI hardware signature from being
+			used during resume from hibernation.
 			old_ordering causes the ACPI 1.0 ordering of the _PTS
 			control method, wrt putting devices into low power
 			states, to be enforced (the ACPI 2.0 ordering of _PTS is
@@ -774,8 +777,22 @@ and is between 256 and 4096 characters. It is defined in the file
 	hisax=		[HW,ISDN]
 			See Documentation/isdn/README.HiSax.
 
-	hugepages=	[HW,X86-32,IA-64] Maximal number of HugeTLB pages.
-	hugepagesz=	[HW,IA-64,PPC] The size of the HugeTLB pages.
+	hugepages=	[HW,X86-32,IA-64] HugeTLB pages to allocate at boot.
+	hugepagesz=	[HW,IA-64,PPC,X86-64] The size of the HugeTLB pages.
+			On x86-64 and powerpc, this option can be specified
+			multiple times interleaved with hugepages= to reserve
+			huge pages of different sizes. Valid pages sizes on
+			x86-64 are 2M (when the CPU supports "pse") and 1G
+			(when the CPU supports the "pdpe1gb" cpuinfo flag)
+			Note that 1GB pages can only be allocated at boot time
+			using hugepages= and not freed afterwards.
+	default_hugepagesz=
+			[same as hugepagesz=] The size of the default
+			HugeTLB page size. This is the size represented by
+			the legacy /proc/ hugepages APIs, used for SHM, and
+			default size when mounting hugetlbfs filesystems.
+			Defaults to the default architecture's huge page size
+			if not specified.
 
 	i8042.direct	[HW] Put keyboard port into non-translated mode
 	i8042.dumbkbd	[HW] Pretend that controller can only read data from
@@ -1206,7 +1223,7 @@ and is between 256 and 4096 characters. It is defined in the file
 			         or
 			         memmap=0x10000$0x18690000
 
-	memtest=	[KNL,X86_64] Enable memtest
+	memtest=	[KNL,X86] Enable memtest
 			Format: <integer>
 			range: 0,4 : pattern number
 			default : 0 <disable>
@@ -1225,6 +1242,14 @@ and is between 256 and 4096 characters. It is defined in the file
 
 	mga=		[HW,DRM]
 
+	mminit_loglevel=
+			[KNL] When CONFIG_DEBUG_MEMORY_INIT is set, this
+			parameter allows control of the logging verbosity for
+			the additional memory initialisation checks. A value
+			of 0 disables mminit logging and a level of 4 will
+			log everything. Information is printed at KERN_DEBUG
+			so loglevel=8 may also need to be specified.
+
 	mousedev.tap_time=
 			[MOUSE] Maximum time between finger touching and
 			leaving touchpad surface for touch to be considered
@@ -1279,6 +1304,13 @@ and is between 256 and 4096 characters. It is defined in the file
 			This usage is only documented in each driver source
 			file if at all.
 
+	nf_conntrack.acct=
+			[NETFILTER] Enable connection tracking flow accounting
+			0 to disable accounting
+			1 to enable accounting
+			Default value depends on CONFIG_NF_CT_ACCT that is
+			going to be removed in 2.6.29.
+
 	nfsaddrs=	[NFS]
 			See Documentation/filesystems/nfsroot.txt.
 
@@ -2027,6 +2059,9 @@ and is between 256 and 4096 characters. It is defined in the file
 
 	snd-ymfpci=	[HW,ALSA]
 
+	softlockup_panic=
+			[KNL] Should the soft-lockup detector generate panics.
+
 	sonypi.*=	[HW] Sony Programmable I/O Control Device driver
 			See Documentation/sonypi.txt
 
@@ -2091,6 +2126,12 @@ and is between 256 and 4096 characters. It is defined in the file
 
 	tdfx=		[HW,DRM]
 
+	test_suspend=	[SUSPEND]
+			Specify "mem" (for Suspend-to-RAM) or "standby" (for
+			standby suspend) as the system sleep state to briefly
+			enter during system startup.  The system is woken from
+			this state using a wakeup-capable RTC alarm.
+
 	thash_entries=	[KNL,NET]
 			Set number of hash buckets for TCP connection
 
@@ -2118,13 +2159,6 @@ and is between 256 and 4096 characters. It is defined in the file
 			<deci-seconds>: poll all this frequency
 			0: no polling (default)
 
-	tipar.timeout=	[HW,PPT]
-			Set communications timeout in tenths of a second
-			(default 15).
-
-	tipar.delay=	[HW,PPT]
-			Set inter-bit delay in microseconds (default 10).
-
 	tmscsim=	[HW,SCSI]
 			See comment before function dc390_setup() in
 			drivers/scsi/tmscsim.c.
@@ -2158,6 +2192,10 @@ and is between 256 and 4096 characters. It is defined in the file
 			Note that genuine overcurrent events won't be
 			reported either.
 
+	unknown_nmi_panic
+			[X86-32,X86-64]
+			Set unknown_nmi_panic=1 early on boot.
+
 	usbcore.autosuspend=
 			[USB] The autosuspend time delay (in seconds) used
 			for newly-detected USB devices (default 2).  This

+ 1 - 1
Documentation/keys.txt

@@ -864,7 +864,7 @@ payload contents" for more information.
     request_key_with_auxdata() respectively.
 
     These two functions return with the key potentially still under
-    construction.  To wait for contruction completion, the following should be
+    construction.  To wait for construction completion, the following should be
     called:
 
 	int wait_for_key_construction(struct key *key, bool intr);

+ 18 - 8
Documentation/laptops/thinkpad-acpi.txt

@@ -1,7 +1,7 @@
 		     ThinkPad ACPI Extras Driver
 
-                            Version 0.20
-                          April 09th, 2008
+                            Version 0.21
+                           May 29th, 2008
 
                Borislav Deianov <borislav@users.sf.net>
              Henrique de Moraes Holschuh <hmh@hmh.eng.br>
@@ -621,7 +621,8 @@ Bluetooth
 ---------
 
 procfs: /proc/acpi/ibm/bluetooth
-sysfs device attribute: bluetooth_enable
+sysfs device attribute: bluetooth_enable (deprecated)
+sysfs rfkill class: switch "tpacpi_bluetooth_sw"
 
 This feature shows the presence and current state of a ThinkPad
 Bluetooth device in the internal ThinkPad CDC slot.
@@ -643,8 +644,12 @@ Sysfs notes:
 		0: disables Bluetooth / Bluetooth is disabled
 		1: enables Bluetooth / Bluetooth is enabled.
 
-	Note: this interface will be probably be superseded by the
-	generic rfkill class, so it is NOT to be considered stable yet.
+	Note: this interface has been superseded by the	generic rfkill
+	class.  It has been deprecated, and it will be removed in year
+	2010.
+
+	rfkill controller switch "tpacpi_bluetooth_sw": refer to
+	Documentation/rfkill.txt for details.
 
 Video output control -- /proc/acpi/ibm/video
 --------------------------------------------
@@ -1374,7 +1379,8 @@ EXPERIMENTAL: WAN
 -----------------
 
 procfs: /proc/acpi/ibm/wan
-sysfs device attribute: wwan_enable
+sysfs device attribute: wwan_enable (deprecated)
+sysfs rfkill class: switch "tpacpi_wwan_sw"
 
 This feature is marked EXPERIMENTAL because the implementation
 directly accesses hardware registers and may not work as expected. USE
@@ -1404,8 +1410,12 @@ Sysfs notes:
 		0: disables WWAN card / WWAN card is disabled
 		1: enables WWAN card / WWAN card is enabled.
 
-	Note: this interface will be probably be superseded by the
-	generic rfkill class, so it is NOT to be considered stable yet.
+	Note: this interface has been superseded by the	generic rfkill
+	class.  It has been deprecated, and it will be removed in year
+	2010.
+
+	rfkill controller switch "tpacpi_wwan_sw": refer to
+	Documentation/rfkill.txt for details.
 
 Multiple Commands, Module Parameters
 ------------------------------------

+ 1 - 1
Documentation/leds-class.txt

@@ -59,7 +59,7 @@ Hardware accelerated blink of LEDs
 
 Some LEDs can be programmed to blink without any CPU interaction. To
 support this feature, a LED driver can optionally implement the
-blink_set() function (see <linux/leds.h>). If implemeted, triggers can
+blink_set() function (see <linux/leds.h>). If implemented, triggers can
 attempt to use it before falling back to software timers. The blink_set()
 function should return 0 if the blink setting is supported, or -EINVAL
 otherwise, which means that LED blinking will be handled by software.

+ 386 - 133
Documentation/lguest/lguest.c

@@ -36,11 +36,13 @@
 #include <sched.h>
 #include <limits.h>
 #include <stddef.h>
+#include <signal.h>
 #include "linux/lguest_launcher.h"
 #include "linux/virtio_config.h"
 #include "linux/virtio_net.h"
 #include "linux/virtio_blk.h"
 #include "linux/virtio_console.h"
+#include "linux/virtio_rng.h"
 #include "linux/virtio_ring.h"
 #include "asm-x86/bootparam.h"
 /*L:110 We can ignore the 39 include files we need for this program, but I do
@@ -64,8 +66,8 @@ typedef uint8_t u8;
 #endif
 /* We can have up to 256 pages for devices. */
 #define DEVICE_PAGES 256
-/* This will occupy 2 pages: it must be a power of 2. */
-#define VIRTQUEUE_NUM 128
+/* This will occupy 3 pages: it must be a power of 2. */
+#define VIRTQUEUE_NUM 256
 
 /*L:120 verbose is both a global flag and a macro.  The C preprocessor allows
  * this, and although I wouldn't recommend it, it works quite nicely here. */
@@ -74,12 +76,19 @@ static bool verbose;
 	do { if (verbose) printf(args); } while(0)
 /*:*/
 
-/* The pipe to send commands to the waker process */
-static int waker_fd;
+/* File descriptors for the Waker. */
+struct {
+	int pipe[2];
+	int lguest_fd;
+} waker_fds;
+
 /* The pointer to the start of guest memory. */
 static void *guest_base;
 /* The maximum guest physical address allowed, and maximum possible. */
 static unsigned long guest_limit, guest_max;
+/* The pipe for signal hander to write to. */
+static int timeoutpipe[2];
+static unsigned int timeout_usec = 500;
 
 /* a per-cpu variable indicating whose vcpu is currently running */
 static unsigned int __thread cpu_id;
@@ -155,11 +164,14 @@ struct virtqueue
 	/* Last available index we saw. */
 	u16 last_avail_idx;
 
-	/* The routine to call when the Guest pings us. */
-	void (*handle_output)(int fd, struct virtqueue *me);
+	/* The routine to call when the Guest pings us, or timeout. */
+	void (*handle_output)(int fd, struct virtqueue *me, bool timeout);
 
 	/* Outstanding buffers */
 	unsigned int inflight;
+
+	/* Is this blocked awaiting a timer? */
+	bool blocked;
 };
 
 /* Remember the arguments to the program so we can "reboot" */
@@ -190,6 +202,9 @@ static void *_convert(struct iovec *iov, size_t size, size_t align,
 	return iov->iov_base;
 }
 
+/* Wrapper for the last available index.  Makes it easier to change. */
+#define lg_last_avail(vq)	((vq)->last_avail_idx)
+
 /* The virtio configuration space is defined to be little-endian.  x86 is
  * little-endian too, but it's nice to be explicit so we have these helpers. */
 #define cpu_to_le16(v16) (v16)
@@ -199,6 +214,33 @@ static void *_convert(struct iovec *iov, size_t size, size_t align,
 #define le32_to_cpu(v32) (v32)
 #define le64_to_cpu(v64) (v64)
 
+/* Is this iovec empty? */
+static bool iov_empty(const struct iovec iov[], unsigned int num_iov)
+{
+	unsigned int i;
+
+	for (i = 0; i < num_iov; i++)
+		if (iov[i].iov_len)
+			return false;
+	return true;
+}
+
+/* Take len bytes from the front of this iovec. */
+static void iov_consume(struct iovec iov[], unsigned num_iov, unsigned len)
+{
+	unsigned int i;
+
+	for (i = 0; i < num_iov; i++) {
+		unsigned int used;
+
+		used = iov[i].iov_len < len ? iov[i].iov_len : len;
+		iov[i].iov_base += used;
+		iov[i].iov_len -= used;
+		len -= used;
+	}
+	assert(len == 0);
+}
+
 /* The device virtqueue descriptors are followed by feature bitmasks. */
 static u8 *get_feature_bits(struct device *dev)
 {
@@ -254,6 +296,7 @@ static void *map_zeroed_pages(unsigned int num)
 		    PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE, fd, 0);
 	if (addr == MAP_FAILED)
 		err(1, "Mmaping %u pages of /dev/zero", num);
+	close(fd);
 
 	return addr;
 }
@@ -540,69 +583,64 @@ static void add_device_fd(int fd)
  * watch, but handing a file descriptor mask through to the kernel is fairly
  * icky.
  *
- * Instead, we fork off a process which watches the file descriptors and writes
+ * Instead, we clone off a thread which watches the file descriptors and writes
  * the LHREQ_BREAK command to the /dev/lguest file descriptor to tell the Host
  * stop running the Guest.  This causes the Launcher to return from the
  * /dev/lguest read with -EAGAIN, where it will write to /dev/lguest to reset
  * the LHREQ_BREAK and wake us up again.
  *
  * This, of course, is merely a different *kind* of icky.
+ *
+ * Given my well-known antipathy to threads, I'd prefer to use processes.  But
+ * it's easier to share Guest memory with threads, and trivial to share the
+ * devices.infds as the Launcher changes it.
  */
-static void wake_parent(int pipefd, int lguest_fd)
+static int waker(void *unused)
 {
-	/* Add the pipe from the Launcher to the fdset in the device_list, so
-	 * we watch it, too. */
-	add_device_fd(pipefd);
+	/* Close the write end of the pipe: only the Launcher has it open. */
+	close(waker_fds.pipe[1]);
 
 	for (;;) {
 		fd_set rfds = devices.infds;
 		unsigned long args[] = { LHREQ_BREAK, 1 };
+		unsigned int maxfd = devices.max_infd;
+
+		/* We also listen to the pipe from the Launcher. */
+		FD_SET(waker_fds.pipe[0], &rfds);
+		if (waker_fds.pipe[0] > maxfd)
+			maxfd = waker_fds.pipe[0];
 
 		/* Wait until input is ready from one of the devices. */
-		select(devices.max_infd+1, &rfds, NULL, NULL, NULL);
-		/* Is it a message from the Launcher? */
-		if (FD_ISSET(pipefd, &rfds)) {
-			int fd;
-			/* If read() returns 0, it means the Launcher has
-			 * exited.  We silently follow. */
-			if (read(pipefd, &fd, sizeof(fd)) == 0)
-				exit(0);
-			/* Otherwise it's telling us to change what file
-			 * descriptors we're to listen to.  Positive means
-			 * listen to a new one, negative means stop
-			 * listening. */
-			if (fd >= 0)
-				FD_SET(fd, &devices.infds);
-			else
-				FD_CLR(-fd - 1, &devices.infds);
-		} else /* Send LHREQ_BREAK command. */
-			pwrite(lguest_fd, args, sizeof(args), cpu_id);
+		select(maxfd+1, &rfds, NULL, NULL, NULL);
+
+		/* Message from Launcher? */
+		if (FD_ISSET(waker_fds.pipe[0], &rfds)) {
+			char c;
+			/* If this fails, then assume Launcher has exited.
+			 * Don't do anything on exit: we're just a thread! */
+			if (read(waker_fds.pipe[0], &c, 1) != 1)
+				_exit(0);
+			continue;
+		}
+
+		/* Send LHREQ_BREAK command to snap the Launcher out of it. */
+		pwrite(waker_fds.lguest_fd, args, sizeof(args), cpu_id);
 	}
+	return 0;
 }
 
 /* This routine just sets up a pipe to the Waker process. */
-static int setup_waker(int lguest_fd)
-{
-	int pipefd[2], child;
-
-	/* We create a pipe to talk to the Waker, and also so it knows when the
-	 * Launcher dies (and closes pipe). */
-	pipe(pipefd);
-	child = fork();
-	if (child == -1)
-		err(1, "forking");
-
-	if (child == 0) {
-		/* We are the Waker: close the "writing" end of our copy of the
-		 * pipe and start waiting for input. */
-		close(pipefd[1]);
-		wake_parent(pipefd[0], lguest_fd);
-	}
-	/* Close the reading end of our copy of the pipe. */
-	close(pipefd[0]);
+static void setup_waker(int lguest_fd)
+{
+	/* This pipe is closed when Launcher dies, telling Waker. */
+	if (pipe(waker_fds.pipe) != 0)
+		err(1, "Creating pipe for Waker");
 
-	/* Here is the fd used to talk to the waker. */
-	return pipefd[1];
+	/* Waker also needs to know the lguest fd */
+	waker_fds.lguest_fd = lguest_fd;
+
+	if (clone(waker, malloc(4096) + 4096, CLONE_VM | SIGCHLD, NULL) == -1)
+		err(1, "Creating Waker");
 }
 
 /*
@@ -661,19 +699,22 @@ static unsigned get_vq_desc(struct virtqueue *vq,
 			    unsigned int *out_num, unsigned int *in_num)
 {
 	unsigned int i, head;
+	u16 last_avail;
 
 	/* Check it isn't doing very strange things with descriptor numbers. */
-	if ((u16)(vq->vring.avail->idx - vq->last_avail_idx) > vq->vring.num)
+	last_avail = lg_last_avail(vq);
+	if ((u16)(vq->vring.avail->idx - last_avail) > vq->vring.num)
 		errx(1, "Guest moved used index from %u to %u",
-		     vq->last_avail_idx, vq->vring.avail->idx);
+		     last_avail, vq->vring.avail->idx);
 
 	/* If there's nothing new since last we looked, return invalid. */
-	if (vq->vring.avail->idx == vq->last_avail_idx)
+	if (vq->vring.avail->idx == last_avail)
 		return vq->vring.num;
 
 	/* Grab the next descriptor number they're advertising, and increment
 	 * the index we've seen. */
-	head = vq->vring.avail->ring[vq->last_avail_idx++ % vq->vring.num];
+	head = vq->vring.avail->ring[last_avail % vq->vring.num];
+	lg_last_avail(vq)++;
 
 	/* If their number is silly, that's a fatal mistake. */
 	if (head >= vq->vring.num)
@@ -821,8 +862,8 @@ static bool handle_console_input(int fd, struct device *dev)
 				unsigned long args[] = { LHREQ_BREAK, 0 };
 				/* Close the fd so Waker will know it has to
 				 * exit. */
-				close(waker_fd);
-				/* Just in case waker is blocked in BREAK, send
+				close(waker_fds.pipe[1]);
+				/* Just in case Waker is blocked in BREAK, send
 				 * unbreak now. */
 				write(fd, args, sizeof(args));
 				exit(2);
@@ -839,7 +880,7 @@ static bool handle_console_input(int fd, struct device *dev)
 
 /* Handling output for console is simple: we just get all the output buffers
  * and write them to stdout. */
-static void handle_console_output(int fd, struct virtqueue *vq)
+static void handle_console_output(int fd, struct virtqueue *vq, bool timeout)
 {
 	unsigned int head, out, in;
 	int len;
@@ -854,6 +895,21 @@ static void handle_console_output(int fd, struct virtqueue *vq)
 	}
 }
 
+static void block_vq(struct virtqueue *vq)
+{
+	struct itimerval itm;
+
+	vq->vring.used->flags |= VRING_USED_F_NO_NOTIFY;
+	vq->blocked = true;
+
+	itm.it_interval.tv_sec = 0;
+	itm.it_interval.tv_usec = 0;
+	itm.it_value.tv_sec = 0;
+	itm.it_value.tv_usec = timeout_usec;
+
+	setitimer(ITIMER_REAL, &itm, NULL);
+}
+
 /*
  * The Network
  *
@@ -861,22 +917,34 @@ static void handle_console_output(int fd, struct virtqueue *vq)
  * and write them (ignoring the first element) to this device's file descriptor
  * (/dev/net/tun).
  */
-static void handle_net_output(int fd, struct virtqueue *vq)
+static void handle_net_output(int fd, struct virtqueue *vq, bool timeout)
 {
-	unsigned int head, out, in;
+	unsigned int head, out, in, num = 0;
 	int len;
 	struct iovec iov[vq->vring.num];
+	static int last_timeout_num;
 
 	/* Keep getting output buffers from the Guest until we run out. */
 	while ((head = get_vq_desc(vq, iov, &out, &in)) != vq->vring.num) {
 		if (in)
 			errx(1, "Input buffers in output queue?");
-		/* Check header, but otherwise ignore it (we told the Guest we
-		 * supported no features, so it shouldn't have anything
-		 * interesting). */
-		(void)convert(&iov[0], struct virtio_net_hdr);
-		len = writev(vq->dev->fd, iov+1, out-1);
+		len = writev(vq->dev->fd, iov, out);
+		if (len < 0)
+			err(1, "Writing network packet to tun");
 		add_used_and_trigger(fd, vq, head, len);
+		num++;
+	}
+
+	/* Block further kicks and set up a timer if we saw anything. */
+	if (!timeout && num)
+		block_vq(vq);
+
+	if (timeout) {
+		if (num < last_timeout_num)
+			timeout_usec += 10;
+		else if (timeout_usec > 1)
+			timeout_usec--;
+		last_timeout_num = num;
 	}
 }
 
@@ -887,7 +955,6 @@ static bool handle_tun_input(int fd, struct device *dev)
 	unsigned int head, in_num, out_num;
 	int len;
 	struct iovec iov[dev->vq->vring.num];
-	struct virtio_net_hdr *hdr;
 
 	/* First we need a network buffer from the Guests's recv virtqueue. */
 	head = get_vq_desc(dev->vq, iov, &out_num, &in_num);
@@ -896,25 +963,23 @@ static bool handle_tun_input(int fd, struct device *dev)
 		 * early, the Guest won't be ready yet.  Wait until the device
 		 * status says it's ready. */
 		/* FIXME: Actually want DRIVER_ACTIVE here. */
-		if (dev->desc->status & VIRTIO_CONFIG_S_DRIVER_OK)
-			warn("network: no dma buffer!");
+
+		/* Now tell it we want to know if new things appear. */
+		dev->vq->vring.used->flags &= ~VRING_USED_F_NO_NOTIFY;
+		wmb();
+
 		/* We'll turn this back on if input buffers are registered. */
 		return false;
 	} else if (out_num)
 		errx(1, "Output buffers in network recv queue?");
 
-	/* First element is the header: we set it to 0 (no features). */
-	hdr = convert(&iov[0], struct virtio_net_hdr);
-	hdr->flags = 0;
-	hdr->gso_type = VIRTIO_NET_HDR_GSO_NONE;
-
 	/* Read the packet from the device directly into the Guest's buffer. */
-	len = readv(dev->fd, iov+1, in_num-1);
+	len = readv(dev->fd, iov, in_num);
 	if (len <= 0)
 		err(1, "reading network");
 
 	/* Tell the Guest about the new packet. */
-	add_used_and_trigger(fd, dev->vq, head, sizeof(*hdr) + len);
+	add_used_and_trigger(fd, dev->vq, head, len);
 
 	verbose("tun input packet len %i [%02x %02x] (%s)\n", len,
 		((u8 *)iov[1].iov_base)[0], ((u8 *)iov[1].iov_base)[1],
@@ -927,11 +992,18 @@ static bool handle_tun_input(int fd, struct device *dev)
 /*L:215 This is the callback attached to the network and console input
  * virtqueues: it ensures we try again, in case we stopped console or net
  * delivery because Guest didn't have any buffers. */
-static void enable_fd(int fd, struct virtqueue *vq)
+static void enable_fd(int fd, struct virtqueue *vq, bool timeout)
 {
 	add_device_fd(vq->dev->fd);
-	/* Tell waker to listen to it again */
-	write(waker_fd, &vq->dev->fd, sizeof(vq->dev->fd));
+	/* Snap the Waker out of its select loop. */
+	write(waker_fds.pipe[1], "", 1);
+}
+
+static void net_enable_fd(int fd, struct virtqueue *vq, bool timeout)
+{
+	/* We don't need to know again when Guest refills receive buffer. */
+	vq->vring.used->flags |= VRING_USED_F_NO_NOTIFY;
+	enable_fd(fd, vq, timeout);
 }
 
 /* When the Guest tells us they updated the status field, we handle it. */
@@ -951,7 +1023,7 @@ static void update_device_status(struct device *dev)
 		for (vq = dev->vq; vq; vq = vq->next) {
 			memset(vq->vring.desc, 0,
 			       vring_size(vq->config.num, getpagesize()));
-			vq->last_avail_idx = 0;
+			lg_last_avail(vq) = 0;
 		}
 	} else if (dev->desc->status & VIRTIO_CONFIG_S_FAILED) {
 		warnx("Device %s configuration FAILED", dev->name);
@@ -960,10 +1032,10 @@ static void update_device_status(struct device *dev)
 
 		verbose("Device %s OK: offered", dev->name);
 		for (i = 0; i < dev->desc->feature_len; i++)
-			verbose(" %08x", get_feature_bits(dev)[i]);
+			verbose(" %02x", get_feature_bits(dev)[i]);
 		verbose(", accepted");
 		for (i = 0; i < dev->desc->feature_len; i++)
-			verbose(" %08x", get_feature_bits(dev)
+			verbose(" %02x", get_feature_bits(dev)
 				[dev->desc->feature_len+i]);
 
 		if (dev->ready)
@@ -1000,7 +1072,7 @@ static void handle_output(int fd, unsigned long addr)
 			if (strcmp(vq->dev->name, "console") != 0)
 				verbose("Output to %s\n", vq->dev->name);
 			if (vq->handle_output)
-				vq->handle_output(fd, vq);
+				vq->handle_output(fd, vq, false);
 			return;
 		}
 	}
@@ -1014,6 +1086,29 @@ static void handle_output(int fd, unsigned long addr)
 	      strnlen(from_guest_phys(addr), guest_limit - addr));
 }
 
+static void handle_timeout(int fd)
+{
+	char buf[32];
+	struct device *i;
+	struct virtqueue *vq;
+
+	/* Clear the pipe */
+	read(timeoutpipe[0], buf, sizeof(buf));
+
+	/* Check each device and virtqueue: flush blocked ones. */
+	for (i = devices.dev; i; i = i->next) {
+		for (vq = i->vq; vq; vq = vq->next) {
+			if (!vq->blocked)
+				continue;
+
+			vq->vring.used->flags &= ~VRING_USED_F_NO_NOTIFY;
+			vq->blocked = false;
+			if (vq->handle_output)
+				vq->handle_output(fd, vq, true);
+		}
+	}
+}
+
 /* This is called when the Waker wakes us up: check for incoming file
  * descriptors. */
 static void handle_input(int fd)
@@ -1024,16 +1119,20 @@ static void handle_input(int fd)
 	for (;;) {
 		struct device *i;
 		fd_set fds = devices.infds;
+		int num;
 
+		num = select(devices.max_infd+1, &fds, NULL, NULL, &poll);
+		/* Could get interrupted */
+		if (num < 0)
+			continue;
 		/* If nothing is ready, we're done. */
-		if (select(devices.max_infd+1, &fds, NULL, NULL, &poll) == 0)
+		if (num == 0)
 			break;
 
 		/* Otherwise, call the device(s) which have readable file
 		 * descriptors and a method of handling them.  */
 		for (i = devices.dev; i; i = i->next) {
 			if (i->handle_input && FD_ISSET(i->fd, &fds)) {
-				int dev_fd;
 				if (i->handle_input(fd, i))
 					continue;
 
@@ -1043,13 +1142,12 @@ static void handle_input(int fd)
 				 * buffers to deliver into.  Console also uses
 				 * it when it discovers that stdin is closed. */
 				FD_CLR(i->fd, &devices.infds);
-				/* Tell waker to ignore it too, by sending a
-				 * negative fd number (-1, since 0 is a valid
-				 * FD number). */
-				dev_fd = -i->fd - 1;
-				write(waker_fd, &dev_fd, sizeof(dev_fd));
 			}
 		}
+
+		/* Is this the timeout fd? */
+		if (FD_ISSET(timeoutpipe[0], &fds))
+			handle_timeout(fd);
 	}
 }
 
@@ -1098,7 +1196,7 @@ static struct lguest_device_desc *new_dev_desc(u16 type)
 /* Each device descriptor is followed by the description of its virtqueues.  We
  * specify how many descriptors the virtqueue is to have. */
 static void add_virtqueue(struct device *dev, unsigned int num_descs,
-			  void (*handle_output)(int fd, struct virtqueue *me))
+			  void (*handle_output)(int, struct virtqueue *, bool))
 {
 	unsigned int pages;
 	struct virtqueue **i, *vq = malloc(sizeof(*vq));
@@ -1114,6 +1212,7 @@ static void add_virtqueue(struct device *dev, unsigned int num_descs,
 	vq->last_avail_idx = 0;
 	vq->dev = dev;
 	vq->inflight = 0;
+	vq->blocked = false;
 
 	/* Initialize the configuration. */
 	vq->config.num = num_descs;
@@ -1246,6 +1345,24 @@ static void setup_console(void)
 }
 /*:*/
 
+static void timeout_alarm(int sig)
+{
+	write(timeoutpipe[1], "", 1);
+}
+
+static void setup_timeout(void)
+{
+	if (pipe(timeoutpipe) != 0)
+		err(1, "Creating timeout pipe");
+
+	if (fcntl(timeoutpipe[1], F_SETFL,
+		  fcntl(timeoutpipe[1], F_GETFL) | O_NONBLOCK) != 0)
+		err(1, "Making timeout pipe nonblocking");
+
+	add_device_fd(timeoutpipe[0]);
+	signal(SIGALRM, timeout_alarm);
+}
+
 /*M:010 Inter-guest networking is an interesting area.  Simplest is to have a
  * --sharenet=<name> option which opens or creates a named pipe.  This can be
  * used to send packets to another guest in a 1:1 manner.
@@ -1264,10 +1381,25 @@ static void setup_console(void)
 
 static u32 str2ip(const char *ipaddr)
 {
-	unsigned int byte[4];
+	unsigned int b[4];
 
-	sscanf(ipaddr, "%u.%u.%u.%u", &byte[0], &byte[1], &byte[2], &byte[3]);
-	return (byte[0] << 24) | (byte[1] << 16) | (byte[2] << 8) | byte[3];
+	if (sscanf(ipaddr, "%u.%u.%u.%u", &b[0], &b[1], &b[2], &b[3]) != 4)
+		errx(1, "Failed to parse IP address '%s'", ipaddr);
+	return (b[0] << 24) | (b[1] << 16) | (b[2] << 8) | b[3];
+}
+
+static void str2mac(const char *macaddr, unsigned char mac[6])
+{
+	unsigned int m[6];
+	if (sscanf(macaddr, "%02x:%02x:%02x:%02x:%02x:%02x",
+		   &m[0], &m[1], &m[2], &m[3], &m[4], &m[5]) != 6)
+		errx(1, "Failed to parse mac address '%s'", macaddr);
+	mac[0] = m[0];
+	mac[1] = m[1];
+	mac[2] = m[2];
+	mac[3] = m[3];
+	mac[4] = m[4];
+	mac[5] = m[5];
 }
 
 /* This code is "adapted" from libbridge: it attaches the Host end of the
@@ -1288,6 +1420,7 @@ static void add_to_bridge(int fd, const char *if_name, const char *br_name)
 		errx(1, "interface %s does not exist!", if_name);
 
 	strncpy(ifr.ifr_name, br_name, IFNAMSIZ);
+	ifr.ifr_name[IFNAMSIZ-1] = '\0';
 	ifr.ifr_ifindex = ifidx;
 	if (ioctl(fd, SIOCBRADDIF, &ifr) < 0)
 		err(1, "can't add %s to bridge %s", if_name, br_name);
@@ -1296,64 +1429,90 @@ static void add_to_bridge(int fd, const char *if_name, const char *br_name)
 /* This sets up the Host end of the network device with an IP address, brings
  * it up so packets will flow, the copies the MAC address into the hwaddr
  * pointer. */
-static void configure_device(int fd, const char *devname, u32 ipaddr,
-			     unsigned char hwaddr[6])
+static void configure_device(int fd, const char *tapif, u32 ipaddr)
 {
 	struct ifreq ifr;
 	struct sockaddr_in *sin = (struct sockaddr_in *)&ifr.ifr_addr;
 
-	/* Don't read these incantations.  Just cut & paste them like I did! */
 	memset(&ifr, 0, sizeof(ifr));
-	strcpy(ifr.ifr_name, devname);
+	strcpy(ifr.ifr_name, tapif);
+
+	/* Don't read these incantations.  Just cut & paste them like I did! */
 	sin->sin_family = AF_INET;
 	sin->sin_addr.s_addr = htonl(ipaddr);
 	if (ioctl(fd, SIOCSIFADDR, &ifr) != 0)
-		err(1, "Setting %s interface address", devname);
+		err(1, "Setting %s interface address", tapif);
 	ifr.ifr_flags = IFF_UP;
 	if (ioctl(fd, SIOCSIFFLAGS, &ifr) != 0)
-		err(1, "Bringing interface %s up", devname);
+		err(1, "Bringing interface %s up", tapif);
+}
+
+static void get_mac(int fd, const char *tapif, unsigned char hwaddr[6])
+{
+	struct ifreq ifr;
+
+	memset(&ifr, 0, sizeof(ifr));
+	strcpy(ifr.ifr_name, tapif);
 
 	/* SIOC stands for Socket I/O Control.  G means Get (vs S for Set
 	 * above).  IF means Interface, and HWADDR is hardware address.
 	 * Simple! */
 	if (ioctl(fd, SIOCGIFHWADDR, &ifr) != 0)
-		err(1, "getting hw address for %s", devname);
+		err(1, "getting hw address for %s", tapif);
 	memcpy(hwaddr, ifr.ifr_hwaddr.sa_data, 6);
 }
 
-/*L:195 Our network is a Host<->Guest network.  This can either use bridging or
- * routing, but the principle is the same: it uses the "tun" device to inject
- * packets into the Host as if they came in from a normal network card.  We
- * just shunt packets between the Guest and the tun device. */
-static void setup_tun_net(const char *arg)
+static int get_tun_device(char tapif[IFNAMSIZ])
 {
-	struct device *dev;
 	struct ifreq ifr;
-	int netfd, ipfd;
-	u32 ip;
-	const char *br_name = NULL;
-	struct virtio_net_config conf;
+	int netfd;
+
+	/* Start with this zeroed.  Messy but sure. */
+	memset(&ifr, 0, sizeof(ifr));
 
 	/* We open the /dev/net/tun device and tell it we want a tap device.  A
 	 * tap device is like a tun device, only somehow different.  To tell
 	 * the truth, I completely blundered my way through this code, but it
 	 * works now! */
 	netfd = open_or_die("/dev/net/tun", O_RDWR);
-	memset(&ifr, 0, sizeof(ifr));
-	ifr.ifr_flags = IFF_TAP | IFF_NO_PI;
+	ifr.ifr_flags = IFF_TAP | IFF_NO_PI | IFF_VNET_HDR;
 	strcpy(ifr.ifr_name, "tap%d");
 	if (ioctl(netfd, TUNSETIFF, &ifr) != 0)
 		err(1, "configuring /dev/net/tun");
+
+	if (ioctl(netfd, TUNSETOFFLOAD,
+		  TUN_F_CSUM|TUN_F_TSO4|TUN_F_TSO6|TUN_F_TSO_ECN) != 0)
+		err(1, "Could not set features for tun device");
+
 	/* We don't need checksums calculated for packets coming in this
 	 * device: trust us! */
 	ioctl(netfd, TUNSETNOCSUM, 1);
 
+	memcpy(tapif, ifr.ifr_name, IFNAMSIZ);
+	return netfd;
+}
+
+/*L:195 Our network is a Host<->Guest network.  This can either use bridging or
+ * routing, but the principle is the same: it uses the "tun" device to inject
+ * packets into the Host as if they came in from a normal network card.  We
+ * just shunt packets between the Guest and the tun device. */
+static void setup_tun_net(char *arg)
+{
+	struct device *dev;
+	int netfd, ipfd;
+	u32 ip = INADDR_ANY;
+	bool bridging = false;
+	char tapif[IFNAMSIZ], *p;
+	struct virtio_net_config conf;
+
+	netfd = get_tun_device(tapif);
+
 	/* First we create a new network device. */
 	dev = new_device("net", VIRTIO_ID_NET, netfd, handle_tun_input);
 
 	/* Network devices need a receive and a send queue, just like
 	 * console. */
-	add_virtqueue(dev, VIRTQUEUE_NUM, enable_fd);
+	add_virtqueue(dev, VIRTQUEUE_NUM, net_enable_fd);
 	add_virtqueue(dev, VIRTQUEUE_NUM, handle_net_output);
 
 	/* We need a socket to perform the magic network ioctls to bring up the
@@ -1364,28 +1523,56 @@ static void setup_tun_net(const char *arg)
 
 	/* If the command line was --tunnet=bridge:<name> do bridging. */
 	if (!strncmp(BRIDGE_PFX, arg, strlen(BRIDGE_PFX))) {
-		ip = INADDR_ANY;
-		br_name = arg + strlen(BRIDGE_PFX);
-		add_to_bridge(ipfd, ifr.ifr_name, br_name);
-	} else /* It is an IP address to set up the device with */
+		arg += strlen(BRIDGE_PFX);
+		bridging = true;
+	}
+
+	/* A mac address may follow the bridge name or IP address */
+	p = strchr(arg, ':');
+	if (p) {
+		str2mac(p+1, conf.mac);
+		*p = '\0';
+	} else {
+		p = arg + strlen(arg);
+		/* None supplied; query the randomly assigned mac. */
+		get_mac(ipfd, tapif, conf.mac);
+	}
+
+	/* arg is now either an IP address or a bridge name */
+	if (bridging)
+		add_to_bridge(ipfd, tapif, arg);
+	else
 		ip = str2ip(arg);
 
-	/* Set up the tun device, and get the mac address for the interface. */
-	configure_device(ipfd, ifr.ifr_name, ip, conf.mac);
+	/* Set up the tun device. */
+	configure_device(ipfd, tapif, ip);
 
 	/* Tell Guest what MAC address to use. */
 	add_feature(dev, VIRTIO_NET_F_MAC);
 	add_feature(dev, VIRTIO_F_NOTIFY_ON_EMPTY);
+	/* Expect Guest to handle everything except UFO */
+	add_feature(dev, VIRTIO_NET_F_CSUM);
+	add_feature(dev, VIRTIO_NET_F_GUEST_CSUM);
+	add_feature(dev, VIRTIO_NET_F_MAC);
+	add_feature(dev, VIRTIO_NET_F_GUEST_TSO4);
+	add_feature(dev, VIRTIO_NET_F_GUEST_TSO6);
+	add_feature(dev, VIRTIO_NET_F_GUEST_ECN);
+	add_feature(dev, VIRTIO_NET_F_HOST_TSO4);
+	add_feature(dev, VIRTIO_NET_F_HOST_TSO6);
+	add_feature(dev, VIRTIO_NET_F_HOST_ECN);
 	set_config(dev, sizeof(conf), &conf);
 
 	/* We don't need the socket any more; setup is done. */
 	close(ipfd);
 
-	verbose("device %u: tun net %u.%u.%u.%u\n",
-		devices.device_num++,
-		(u8)(ip>>24),(u8)(ip>>16),(u8)(ip>>8),(u8)ip);
-	if (br_name)
-		verbose("attached to bridge: %s\n", br_name);
+	devices.device_num++;
+
+	if (bridging)
+		verbose("device %u: tun %s attached to bridge: %s\n",
+			devices.device_num, tapif, arg);
+	else
+		verbose("device %u: tun %s: %s\n",
+			devices.device_num, tapif, arg);
 }
 
 /* Our block (disk) device should be really simple: the Guest asks for a block
@@ -1550,7 +1737,7 @@ static bool handle_io_finish(int fd, struct device *dev)
 }
 
 /* When the Guest submits some I/O, we just need to wake the I/O thread. */
-static void handle_virtblk_output(int fd, struct virtqueue *vq)
+static void handle_virtblk_output(int fd, struct virtqueue *vq, bool timeout)
 {
 	struct vblk_info *vblk = vq->dev->priv;
 	char c = 0;
@@ -1621,6 +1808,64 @@ static void setup_block_file(const char *filename)
 	verbose("device %u: virtblock %llu sectors\n",
 		devices.device_num, le64_to_cpu(conf.capacity));
 }
+
+/* Our random number generator device reads from /dev/random into the Guest's
+ * input buffers.  The usual case is that the Guest doesn't want random numbers
+ * and so has no buffers although /dev/random is still readable, whereas
+ * console is the reverse.
+ *
+ * The same logic applies, however. */
+static bool handle_rng_input(int fd, struct device *dev)
+{
+	int len;
+	unsigned int head, in_num, out_num, totlen = 0;
+	struct iovec iov[dev->vq->vring.num];
+
+	/* First we need a buffer from the Guests's virtqueue. */
+	head = get_vq_desc(dev->vq, iov, &out_num, &in_num);
+
+	/* If they're not ready for input, stop listening to this file
+	 * descriptor.  We'll start again once they add an input buffer. */
+	if (head == dev->vq->vring.num)
+		return false;
+
+	if (out_num)
+		errx(1, "Output buffers in rng?");
+
+	/* This is why we convert to iovecs: the readv() call uses them, and so
+	 * it reads straight into the Guest's buffer.  We loop to make sure we
+	 * fill it. */
+	while (!iov_empty(iov, in_num)) {
+		len = readv(dev->fd, iov, in_num);
+		if (len <= 0)
+			err(1, "Read from /dev/random gave %i", len);
+		iov_consume(iov, in_num, len);
+		totlen += len;
+	}
+
+	/* Tell the Guest about the new input. */
+	add_used_and_trigger(fd, dev->vq, head, totlen);
+
+	/* Everything went OK! */
+	return true;
+}
+
+/* And this creates a "hardware" random number device for the Guest. */
+static void setup_rng(void)
+{
+	struct device *dev;
+	int fd;
+
+	fd = open_or_die("/dev/random", O_RDONLY);
+
+	/* The device responds to return from I/O thread. */
+	dev = new_device("rng", VIRTIO_ID_RNG, fd, handle_rng_input);
+
+	/* The device has one virtqueue, where the Guest places inbufs. */
+	add_virtqueue(dev, VIRTQUEUE_NUM, enable_fd);
+
+	verbose("device %u: rng\n", devices.device_num++);
+}
 /* That's the end of device setup. */
 
 /*L:230 Reboot is pretty easy: clean up and exec() the Launcher afresh. */
@@ -1628,11 +1873,12 @@ static void __attribute__((noreturn)) restart_guest(void)
 {
 	unsigned int i;
 
-	/* Closing pipes causes the Waker thread and io_threads to die, and
-	 * closing /dev/lguest cleans up the Guest.  Since we don't track all
-	 * open fds, we simply close everything beyond stderr. */
+	/* Since we don't track all open fds, we simply close everything beyond
+	 * stderr. */
 	for (i = 3; i < FD_SETSIZE; i++)
 		close(i);
+
+	/* The exec automatically gets rid of the I/O and Waker threads. */
 	execv(main_args[0], main_args);
 	err(1, "Could not exec %s", main_args[0]);
 }
@@ -1663,7 +1909,7 @@ static void __attribute__((noreturn)) run_guest(int lguest_fd)
 		/* ERESTART means that we need to reboot the guest */
 		} else if (errno == ERESTART) {
 			restart_guest();
-		/* EAGAIN means the Waker wanted us to look at some input.
+		/* EAGAIN means a signal (timeout).
 		 * Anything else means a bug or incompatible change. */
 		} else if (errno != EAGAIN)
 			err(1, "Running guest failed");
@@ -1691,13 +1937,14 @@ static struct option opts[] = {
 	{ "verbose", 0, NULL, 'v' },
 	{ "tunnet", 1, NULL, 't' },
 	{ "block", 1, NULL, 'b' },
+	{ "rng", 0, NULL, 'r' },
 	{ "initrd", 1, NULL, 'i' },
 	{ NULL },
 };
 static void usage(void)
 {
 	errx(1, "Usage: lguest [--verbose] "
-	     "[--tunnet=(<ipaddr>|bridge:<bridgename>)\n"
+	     "[--tunnet=(<ipaddr>:<macaddr>|bridge:<bridgename>:<macaddr>)\n"
 	     "|--block=<filename>|--initrd=<filename>]...\n"
 	     "<mem-in-mb> vmlinux [args...]");
 }
@@ -1765,6 +2012,9 @@ int main(int argc, char *argv[])
 		case 'b':
 			setup_block_file(optarg);
 			break;
+		case 'r':
+			setup_rng();
+			break;
 		case 'i':
 			initrd_name = optarg;
 			break;
@@ -1783,6 +2033,9 @@ int main(int argc, char *argv[])
 	/* We always have a console device */
 	setup_console();
 
+	/* We can timeout waiting for Guest network transmit. */
+	setup_timeout();
+
 	/* Now we load the kernel */
 	start = load_kernel(open_or_die(argv[optind+1], O_RDONLY));
 
@@ -1826,10 +2079,10 @@ int main(int argc, char *argv[])
 	 * /dev/lguest file descriptor. */
 	lguest_fd = tell_kernel(pgdir, start);
 
-	/* We fork off a child process, which wakes the Launcher whenever one
-	 * of the input file descriptors needs attention.  We call this the
-	 * Waker, and we'll cover it in a moment. */
-	waker_fd = setup_waker(lguest_fd);
+	/* We clone off a thread, which wakes the Launcher whenever one of the
+	 * input file descriptors needs attention.  We call this the Waker, and
+	 * we'll cover it in a moment. */
+	setup_waker(lguest_fd);
 
 	/* Finally, run the Guest.  This doesn't return. */
 	run_guest(lguest_fd);

+ 1 - 1
Documentation/local_ops.txt

@@ -36,7 +36,7 @@ It can be done by slightly modifying the standard atomic operations : only
 their UP variant must be kept. It typically means removing LOCK prefix (on
 i386 and x86_64) and any SMP sychronization barrier. If the architecture does
 not have a different behavior between SMP and UP, including asm-generic/local.h
-in your archtecture's local.h is sufficient.
+in your architecture's local.h is sufficient.
 
 The local_t type is defined as an opaque signed long by embedding an
 atomic_long_t inside a structure. This is made so a cast from this type to a

+ 29 - 1
Documentation/md.txt

@@ -236,6 +236,11 @@ All md devices contain:
      writing the word for the desired state, however some states
      cannot be explicitly set, and some transitions are not allowed.
 
+     Select/poll works on this file.  All changes except between
+     	active_idle and active (which can be frequent and are not
+	very interesting) are notified.  active->active_idle is
+	reported if the metadata is externally managed.
+
      clear
          No devices, no size, no level
          Writing is equivalent to STOP_ARRAY ioctl
@@ -292,6 +297,10 @@ Each directory contains:
 	      writemostly - device will only be subject to read
 		         requests if there are no other options.
 			 This applies only to raid1 arrays.
+	      blocked  - device has failed, metadata is "external",
+	                 and the failure hasn't been acknowledged yet.
+			 Writes that would write to this device if
+			 it were not faulty are blocked.
 	      spare    - device is working, but not a full member.
 			 This includes spares that are in the process
 			 of being recovered to
@@ -301,6 +310,12 @@ Each directory contains:
 	Writing "remove" removes the device from the array.
 	Writing "writemostly" sets the writemostly flag.
 	Writing "-writemostly" clears the writemostly flag.
+	Writing "blocked" sets the "blocked" flag.
+	Writing "-blocked" clear the "blocked" flag and allows writes
+		to complete.
+
+	This file responds to select/poll. Any change to 'faulty'
+	or 'blocked' causes an event.
 
       errors
 	An approximate count of read errors that have been detected on
@@ -332,7 +347,7 @@ Each directory contains:
         for storage of data.  This will normally be the same as the
 	component_size.  This can be written while assembling an
         array.  If a value less than the current component_size is
-        written, component_size will be reduced to this value.
+        written, it will be rejected.
 
 
 An active md device will also contain and entry for each active device
@@ -381,6 +396,19 @@ also have
 	'check' and 'repair' will start the appropriate process
            providing the current state is 'idle'.
 
+      This file responds to select/poll.  Any important change in the value
+      triggers a poll event.  Sometimes the value will briefly be
+      "recover" if a recovery seems to be needed, but cannot be
+      achieved. In that case, the transition to "recover" isn't
+      notified, but the transition away is.
+
+   degraded
+      This contains a count of the number of devices by which the
+      arrays is degraded.  So an optimal array with show '0'.  A
+      single failed/missing drive will show '1', etc.
+      This file responds to select/poll, any increase or decrease
+      in the count of missing devices will trigger an event.
+
    mismatch_count
       When performing 'check' and 'repair', and possibly when
       performing 'resync', md will count the number of errors that are

+ 252 - 140
Documentation/moxa-smartio

@@ -1,14 +1,22 @@
 =============================================================================
-
-	MOXA Smartio Family Device Driver Ver 1.1 Installation Guide
-		    for Linux Kernel 2.2.x and 2.0.3x
-	       Copyright (C) 1999, Moxa Technologies Co, Ltd.
+          MOXA Smartio/Industio Family Device Driver Installation Guide
+		    for Linux Kernel 2.4.x, 2.6.x
+	       Copyright (C) 2008, Moxa Inc.
 =============================================================================
+Date: 01/21/2008
+
 Content
 
 1. Introduction
 2. System Requirement
 3. Installation
+   3.1 Hardware installation
+   3.2 Driver files
+   3.3 Device naming convention
+   3.4 Module driver configuration
+   3.5 Static driver configuration for Linux kernel 2.4.x and 2.6.x.
+   3.6 Custom configuration
+   3.7 Verify driver installation
 4. Utilities
 5. Setserial
 6. Troubleshooting
@@ -16,27 +24,48 @@ Content
 -----------------------------------------------------------------------------
 1. Introduction
 
-   The Smartio family Linux driver, Ver. 1.1, supports following multiport
+   The Smartio/Industio/UPCI family Linux driver supports following multiport
    boards.
 
-    -C104P/H/HS, C104H/PCI, C104HS/PCI, CI-104J 4 port multiport board.
-    -C168P/H/HS, C168H/PCI 8 port multiport board.
-
-   This driver has been modified a little and cleaned up from the Moxa
-   contributed driver code and merged into Linux 2.2.14pre. In particular
-   official major/minor numbers have been assigned which are different to
-   those the original Moxa supplied driver used.
+    - 2 ports multiport board
+	CP-102U, CP-102UL, CP-102UF
+	CP-132U-I, CP-132UL,
+	CP-132, CP-132I, CP132S, CP-132IS,
+	CI-132, CI-132I, CI-132IS,
+	(C102H, C102HI, C102HIS, C102P, CP-102, CP-102S)
+
+    - 4 ports multiport board
+	CP-104EL,
+	CP-104UL, CP-104JU,
+	CP-134U, CP-134U-I,
+	C104H/PCI, C104HS/PCI,
+	CP-114, CP-114I, CP-114S, CP-114IS, CP-114UL,
+	C104H, C104HS,
+	CI-104J, CI-104JS,
+	CI-134, CI-134I, CI-134IS,
+	(C114HI, CT-114I, C104P)
+	POS-104UL,
+	CB-114,
+	CB-134I
+
+    - 8 ports multiport board
+	CP-118EL, CP-168EL,
+	CP-118U, CP-168U,
+	C168H/PCI,
+	C168H, C168HS,
+	(C168P),
+	CB-108
 
    This driver and installation procedure have been developed upon Linux Kernel
-   2.2.5 and backward compatible to 2.0.3x. This driver supports Intel x86 and
-   Alpha hardware platform. In order to maintain compatibility, this version
-   has also been properly tested with RedHat, OpenLinux, TurboLinux and
-   S.u.S.E Linux. However, if compatibility problem occurs, please contact
-   Moxa at support@moxa.com.tw.
+   2.4.x and 2.6.x. This driver supports Intel x86 hardware platform. In order
+   to maintain compatibility, this version has also been properly tested with
+   RedHat, Mandrake, Fedora and S.u.S.E Linux. However, if compatibility problem
+   occurs, please contact Moxa at support@moxa.com.tw.
 
    In addition to device driver, useful utilities are also provided in this
    version. They are
-    - msdiag     Diagnostic program for detecting installed Moxa Smartio boards.
+    - msdiag     Diagnostic program for displaying installed Moxa
+                 Smartio/Industio boards.
     - msmon      Monitor program to observe data count and line status signals.
     - msterm     A simple terminal program which is useful in testing serial
 	         ports.
@@ -47,8 +76,7 @@ Content
    GNU General Public License in this version. Please refer to GNU General
    Public License announcement in each source code file for more detail.
 
-   In Moxa's ftp sites, you may always find latest driver at
-   ftp://ftp.moxa.com  or ftp://ftp.moxa.com.tw.
+   In Moxa's Web sites, you may always find latest driver at http://web.moxa.com.
 
    This version of driver can be installed as Loadable Module (Module driver)
    or built-in into kernel (Static driver). You may refer to following
@@ -61,18 +89,27 @@ Content
 
 -----------------------------------------------------------------------------
 2. System Requirement
-   - Hardware platform: Intel x86 or Alpha machine
-   - Kernel version: 2.0.3x or 2.2.x
+   - Hardware platform: Intel x86 machine
+   - Kernel version: 2.4.x or 2.6.x
    - gcc version 2.72 or later
    - Maximum 4 boards can be installed in combination
 
 -----------------------------------------------------------------------------
 3. Installation
 
+   3.1 Hardware installation
+   3.2 Driver files
+   3.3 Device naming convention
+   3.4 Module driver configuration
+   3.5 Static driver configuration for Linux kernel 2.4.x, 2.6.x.
+   3.6 Custom configuration
+   3.7 Verify driver installation
+
+
    3.1 Hardware installation
 
-       There are two types of buses, ISA and PCI, for Smartio family multiport
-       board.
+       There are two types of buses, ISA and PCI, for Smartio/Industio
+       family multiport board.
 
        ISA board
        ---------
@@ -81,47 +118,57 @@ Content
        installation procedure in User's Manual before proceed any further.
        Please make sure the JP1 is open after the ISA board is set properly.
 
-       PCI board
-       ---------
+       PCI/UPCI board
+       --------------
        You may need to adjust IRQ usage in BIOS to avoid from IRQ conflict
        with other ISA devices. Please refer to hardware installation
        procedure in User's Manual in advance.
 
-       IRQ Sharing
+       PCI IRQ Sharing
        -----------
        Each port within the same multiport board shares the same IRQ. Up to
-       4 Moxa Smartio Family multiport boards can be installed together on
-       one system and they can share the same IRQ.
+       4 Moxa Smartio/Industio PCI Family multiport boards can be installed
+       together on one system and they can share the same IRQ.
+
 
-   3.2 Driver files and device naming convention
+   3.2 Driver files
 
        The driver file may be obtained from ftp, CD-ROM or floppy disk. The
        first step, anyway, is to copy driver file "mxser.tgz" into specified
        directory. e.g. /moxa. The execute commands as below.
 
+       # cd /
+       # mkdir moxa
        # cd /moxa
-       # tar xvf /dev/fd0 
+       # tar xvf /dev/fd0
+
        or
+
+       # cd /
+       # mkdir moxa
        # cd /moxa
        # cp /mnt/cdrom/<driver directory>/mxser.tgz .
        # tar xvfz mxser.tgz
 
+
+   3.3 Device naming convention
+
        You may find all the driver and utilities files in /moxa/mxser.
        Following installation procedure depends on the model you'd like to
-       run the driver. If you prefer module driver, please refer to 3.3.
-       If static driver is required, please refer to 3.4.
+       run the driver. If you prefer module driver, please refer to 3.4.
+       If static driver is required, please refer to 3.5.
 
        Dialin and callout port
        -----------------------
-       This driver remains traditional serial device properties. There're
+       This driver remains traditional serial device properties. There are
        two special file name for each serial port. One is dial-in port
        which is named "ttyMxx". For callout port, the naming convention
        is "cumxx".
 
        Device naming when more than 2 boards installed
        -----------------------------------------------
-       Naming convention for each Smartio multiport board is pre-defined
-       as below.
+       Naming convention for each Smartio/Industio multiport board is
+       pre-defined as below.
 
        Board Num.	 Dial-in Port	      Callout port
        1st board	ttyM0  - ttyM7	      cum0  - cum7
@@ -129,6 +176,12 @@ Content
        3rd board	ttyM16 - ttyM23       cum16 - cum23
        4th board	ttyM24 - ttym31       cum24 - cum31
 
+
+       !!!!!!!!!!!!!!!!!!!! NOTE !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
+       Under Kernel 2.6 the cum Device is Obsolete. So use ttyM*
+       device instead.
+       !!!!!!!!!!!!!!!!!!!! NOTE !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
+
        Board sequence
        --------------
        This driver will activate ISA boards according to the parameter set
@@ -138,69 +191,131 @@ Content
        For PCI boards, their sequence will be after ISA boards and C168H/PCI
        has higher priority than C104H/PCI boards.
 
-   3.3 Module driver configuration
+   3.4 Module driver configuration
        Module driver is easiest way to install. If you prefer static driver
        installation, please skip this paragraph.
-       1. Find "Makefile" in /moxa/mxser, then run
 
-	  # make install
+
+       ------------- Prepare to use the MOXA driver--------------------
+       3.4.1 Create tty device with correct major number
+          Before using MOXA driver, your system must have the tty devices
+          which are created with driver's major number. We offer one shell
+          script "msmknod" to simplify the procedure.
+          This step is only needed to be executed once. But you still
+          need to do this procedure when:
+          a. You change the driver's major number. Please refer the "3.7"
+             section.
+          b. Your total installed MOXA boards number is changed. Maybe you
+             add/delete one MOXA board.
+          c. You want to change the tty name. This needs to modify the
+             shell script "msmknod"
+
+          The procedure is:
+	  # cd /moxa/mxser/driver
+	  # ./msmknod
+
+          This shell script will require the major number for dial-in
+          device and callout device to create tty device. You also need
+          to specify the total installed MOXA board number. Default major
+          numbers for dial-in device and callout device are 30, 35. If
+          you need to change to other number, please refer section "3.7"
+          for more detailed procedure.
+          Msmknod will delete any special files occupying the same device
+          naming.
+
+       3.4.2 Build the MOXA driver and utilities
+          Before using the MOXA driver and utilities, you need compile the
+          all the source code. This step is only need to be executed once.
+          But you still re-compile the source code if you modify the source
+          code. For example, if you change the driver's major number (see
+          "3.7" section), then you need to do this step again.
+
+          Find "Makefile" in /moxa/mxser, then run
+
+	  # make clean; make install
+
+          !!!!!!!!!! NOTE !!!!!!!!!!!!!!!!!
+	  For Red Hat 9, Red Hat Enterprise Linux AS3/ES3/WS3 & Fedora Core1:
+	  # make clean; make installsp1
+
+	  For Red Hat Enterprise Linux AS4/ES4/WS4:
+	  # make clean; make installsp2
+          !!!!!!!!!! NOTE !!!!!!!!!!!!!!!!!
 
 	  The driver files "mxser.o" and utilities will be properly compiled
-	  and copied to system directories respectively.Then run
+	  and copied to system directories respectively.
 
-	  # insmod mxser
+       ------------- Load MOXA driver--------------------
+       3.4.3 Load the MOXA driver
 
-	  to activate the modular driver. You may run "lsmod" to check
-	  if "mxser.o" is activated.
+	  # modprobe mxser <argument>
 
-       2. Create special files by executing "msmknod".
-	  # cd /moxa/mxser/driver
-	  # ./msmknod
+	  will activate the module driver. You may run "lsmod" to check
+	  if "mxser" is activated. If the MOXA board is ISA board, the
+          <argument> is needed. Please refer to section "3.4.5" for more
+          information.
+
+
+       ------------- Load MOXA driver on boot --------------------
+       3.4.4 For the above description, you may manually execute
+          "modprobe mxser" to activate this driver and run
+	  "rmmod mxser" to remove it.
+          However, it's better to have a boot time configuration to
+          eliminate manual operation. Boot time configuration can be
+          achieved by rc file. We offer one "rc.mxser" file to simplify
+          the procedure under "moxa/mxser/driver".
 
-	  Default major numbers for dial-in device and callout device are
-	  174, 175. Msmknod will delete any special files occupying the same
-	  device naming.
+          But if you use ISA board, please modify the "modprobe ..." command
+          to add the argument (see "3.4.5" section). After modifying the
+          rc.mxser, please try to execute "/moxa/mxser/driver/rc.mxser"
+          manually to make sure the modification is ok. If any error
+          encountered, please try to modify again. If the modification is
+          completed, follow the below step.
 
-       3. Up to now, you may manually execute "insmod mxser" to activate
-	  this driver and run "rmmod mxser" to remove it. However, it's
-	  better to have a boot time configuration to eliminate manual
-	  operation.
-	  Boot time configuration can be achieved by rc file. Run following
-	  command for setting rc files.
+	  Run following command for setting rc files.
 
 	  # cd /moxa/mxser/driver
 	  # cp ./rc.mxser /etc/rc.d
 	  # cd /etc/rc.d
 
-	  You may have to modify part of the content in rc.mxser to specify
-          parameters for ISA board. Please refer to rc.mxser for more detail.
-          Find "rc.serial". If "rc.serial" doesn't exist, create it by vi.
-	  Add "rc.mxser" in last line. Next, open rc.local by vi
-	  and append following content.
+	  Check "rc.serial" is existed or not. If "rc.serial" doesn't exist,
+	  create it by vi, run "chmod 755 rc.serial" to change the permission.
+	  Add "/etc/rc.d/rc.mxser" in last line,
 
-	  if [ -f /etc/rc.d/rc.serial ]; then
-	     sh /etc/rc.d/rc.serial
-	  fi
+          Reboot and check if moxa.o activated by "lsmod" command.
 
-       4. Reboot and check if mxser.o activated by "lsmod" command.
-       5. If you'd like to drive Smartio ISA boards in the system, you'll
-	  have to add parameter to specify CAP address of given board while
-          activating "mxser.o". The format for parameters are as follows.
+       3.4.5. If you'd like to drive Smartio/Industio ISA boards in the system,
+          you'll have to add parameter to specify CAP address of given
+	  board while activating "mxser.o". The format for parameters are
+	  as follows.
 
-	  insmod mxser ioaddr=0x???,0x???,0x???,0x???
+	  modprobe mxser ioaddr=0x???,0x???,0x???,0x???
 				|      |     |	  |
 				|      |     |	  +- 4th ISA board
 				|      |     +------ 3rd ISA board
 				|      +------------ 2nd ISA board
 				+------------------- 1st ISA board
 
-   3.4 Static driver configuration
+   3.5 Static driver configuration for Linux kernel 2.4.x and 2.6.x
+
+       Note: To use static driver, you must install the linux kernel
+             source package.
+
+       3.5.1 Backup the built-in driver in the kernel.
+          # cd /usr/src/linux/drivers/char
+          # mv mxser.c mxser.c.old
+
+          For Red Hat 7.x user, you need to create link:
+          # cd /usr/src
+          # ln -s linux-2.4 linux
 
-       1. Create link
+       3.5.2 Create link
 	  # cd /usr/src/linux/drivers/char
 	  # ln -s /moxa/mxser/driver/mxser.c mxser.c
 
-       2. Add CAP address list for ISA boards
+       3.5.3 Add CAP address list for ISA boards. For PCI boards user,
+          please skip this step.
+
 	  In module mode, the CAP address for ISA board is given by
 	  parameter. In static driver configuration, you'll have to
 	  assign it within driver's source code. If you will not
@@ -222,73 +337,55 @@ Content
 	     static int mxserBoardCAP[]
 	     = {0x280, 0x180, 0x00, 0x00};
 
-       3. Modify tty_io.c
-	  # cd /usr/src/linux/drivers/char/
-	  # vi tty_io.c
-	    Find pty_init(), insert "mxser_init()" as
+       3.5.4 Setup kernel configuration
 
-	    pty_init();
-	    mxser_init();
+          Configure the kernel:
 
-       4. Modify tty.h
-	  # cd /usr/src/linux/include/linux
-	  # vi tty.h
-	    Find extern int tty_init(void), insert "mxser_init()" as
+            # cd /usr/src/linux
+            # make menuconfig
 
-	    extern int tty_init(void);
-	    extern int mxser_init(void);
-     
-       5. Modify Makefile
-	  # cd /usr/src/linux/drivers/char
-	  # vi Makefile
-	    Find L_OBJS := tty_io.o ...... random.o, add
-	    "mxser.o" at last of this line as
-	    L_OBJS := tty_io.o ....... mxser.o
+          You will go into a menu-driven system. Please select [Character
+          devices][Non-standard serial port support], enable the [Moxa
+          SmartIO support] driver with "[*]" for built-in (not "[M]"), then
+          select [Exit] to exit this program.
 
-       6. Rebuild kernel
-	  The following are for Linux kernel rebuilding,for your reference only.
+       3.5.5 Rebuild kernel
+	  The following are for Linux kernel rebuilding, for your
+          reference only.
 	  For appropriate details, please refer to the Linux document.
 
-	  If 'lilo' utility is installed, please use 'make zlilo' to rebuild
-	  kernel. If 'lilo' is not installed, please follow the following steps.
-
 	   a. cd /usr/src/linux
-	   b. make clean			     /* take a few minutes */
-	   c. make bzImage		   /* take probably 10-20 minutes */
-	   d. Backup original boot kernel.		  /* optional step */
-	   e. cp /usr/src/linux/arch/i386/boot/bzImage /boot/vmlinuz
+	   b. make clean	     /* take a few minutes */
+	   c. make dep		     /* take a few minutes */
+	   d. make bzImage	     /* take probably 10-20 minutes */
+	   e. make install	     /* copy boot image to correct position */
 	   f. Please make sure the boot kernel (vmlinuz) is in the
-	      correct position. If you use 'lilo' utility, you should
-	      check /etc/lilo.conf 'image' item specified the path
-	      which is the 'vmlinuz' path, or you will load wrong
-	      (or old) boot kernel image (vmlinuz).
-	   g. chmod 400 /vmlinuz
-	   h. lilo
-	   i. rdev -R /vmlinuz 1
-	   j. sync
-
-	  Note that if the result of "make zImage" is ERROR, then you have to
-	  go back to Linux configuration Setup. Type "make config" in directory
-	  /usr/src/linux or "setup".
-
-	  Since system include file, /usr/src/linux/include/linux/interrupt.h,
-	  is modified each time the MOXA driver is installed, kernel rebuilding
-	  is inevitable. And it takes about 10 to 20 minutes depends on the
-	  machine.
-
-       7. Make utility
-	  # cd /moxa/mxser/utility
-	  # make install
-       
-       8. Make special file
+	      correct position.
+	   g. If you use 'lilo' utility, you should check /etc/lilo.conf
+	      'image' item specified the path which is the 'vmlinuz' path,
+	      or you will load wrong (or old) boot kernel image (vmlinuz).
+	      After checking /etc/lilo.conf, please run "lilo".
+
+	  Note that if the result of "make bzImage" is ERROR, then you have to
+	  go back to Linux configuration Setup. Type "make menuconfig" in
+          directory /usr/src/linux.
+
+
+       3.5.6 Make tty device and special file
           # cd /moxa/mxser/driver
           # ./msmknod
 
-       9. Reboot
+       3.5.7 Make utility
+	  # cd /moxa/mxser/utility
+	  # make clean; make install
+
+       3.5.8 Reboot
 
-   3.5 Custom configuration
+
+
+   3.6 Custom configuration
        Although this driver already provides you default configuration, you
-       still can change the device name and major number.The instruction to
+       still can change the device name and major number. The instruction to
        change these parameters are shown as below.
 
        Change Device name
@@ -306,33 +403,37 @@ Content
        2 free major numbers for this driver. There are 3 steps to change
        major numbers.
 
-       1. Find free major numbers
+       3.6.1 Find free major numbers
 	  In /proc/devices, you may find all the major numbers occupied
 	  in the system. Please select 2 major numbers that are available.
 	  e.g. 40, 45.
-       2. Create special files
+       3.6.2 Create special files
 	  Run /moxa/mxser/driver/msmknod to create special files with
 	  specified major numbers.
-       3. Modify driver with new major number
+       3.6.3 Modify driver with new major number
 	  Run vi to open /moxa/mxser/driver/mxser.c. Locate the line
 	  contains "MXSERMAJOR". Change the content as below.
 	  #define	  MXSERMAJOR		  40
 	  #define	  MXSERCUMAJOR		  45
-       4. Run # make install in /moxa/mxser/driver.
+       3.6.4 Run "make clean; make install" in /moxa/mxser/driver.
 
-   3.6 Verify driver installation
+   3.7 Verify driver installation
        You may refer to /var/log/messages to check the latest status
        log reported by this driver whenever it's activated.
+
 -----------------------------------------------------------------------------
 4. Utilities
    There are 3 utilities contained in this driver. They are msdiag, msmon and
    msterm. These 3 utilities are released in form of source code. They should
    be compiled into executable file and copied into /usr/bin.
 
+   Before using these utilities, please load driver (refer 3.4 & 3.5) and
+   make sure you had run the "msmknod" utility.
+
    msdiag - Diagnostic
    --------------------
-   This utility provides the function to detect what Moxa Smartio multiport
-   board exists in the system.
+   This utility provides the function to display what Moxa Smartio/Industio
+   board found by driver in the system.
 
    msmon - Port Monitoring
    -----------------------
@@ -353,12 +454,13 @@ Content
    application, for example, sending AT command to a modem connected to the
    port or used as a terminal for login purpose. Note that this is only a
    dumb terminal emulation without handling full screen operation.
+
 -----------------------------------------------------------------------------
 5. Setserial
 
    Supported Setserial parameters are listed as below.
 
-   uart 	  set UART type(16450-->disable FIFO, 16550A-->enable FIFO)
+   uart		  set UART type(16450-->disable FIFO, 16550A-->enable FIFO)
    close_delay	  set the amount of time(in 1/100 of a second) that DTR
 		  should be kept low while being closed.
    closing_wait   set the amount of time(in 1/100 of a second) that the
@@ -366,7 +468,13 @@ Content
 		  being closed, before the receiver is disable.
    spd_hi	  Use  57.6kb  when  the application requests 38.4kb.
    spd_vhi	  Use  115.2kb	when  the application requests 38.4kb.
+   spd_shi	  Use  230.4kb	when  the application requests 38.4kb.
+   spd_warp	  Use  460.8kb	when  the application requests 38.4kb.
    spd_normal	  Use  38.4kb  when  the application requests 38.4kb.
+   spd_cust	  Use  the custom divisor to set the speed when  the
+		  application requests 38.4kb.
+   divisor	  This option set the custom divison.
+   baud_base	  This option set the base baud rate.
 
 -----------------------------------------------------------------------------
 6. Troubleshooting
@@ -375,8 +483,9 @@ Content
    possible. If all the possible solutions fail, please contact our technical
    support team to get more help.
 
-   Error msg: More than 4 Moxa Smartio family boards found. Fifth board and
-	      after are ignored.
+
+   Error msg: More than 4 Moxa Smartio/Industio family boards found. Fifth board
+              and after are ignored.
    Solution:
    To avoid this problem, please unplug fifth and after board, because Moxa
    driver supports up to 4 boards.
@@ -384,7 +493,7 @@ Content
    Error msg: Request_irq fail, IRQ(?) may be conflict with another device.
    Solution:
    Other PCI or ISA devices occupy the assigned IRQ. If you are not sure
-   which device causes the situation,please check /proc/interrupts to find
+   which device causes the situation, please check /proc/interrupts to find
    free IRQ and simply change another free IRQ for Moxa board.
 
    Error msg: Board #: C1xx Series(CAP=xxx) interrupt number invalid.
@@ -397,15 +506,18 @@ Content
    Moxa ISA board needs an interrupt vector.Please refer to user's manual
    "Hardware Installation" chapter to set interrupt vector.
 
-   Error msg: Couldn't install MOXA Smartio family driver!
+   Error msg: Couldn't install MOXA Smartio/Industio family driver!
    Solution:
    Load Moxa driver fail, the major number may conflict with other devices.
-   Please refer to previous section 3.5 to change a free major number for
+   Please refer to previous section 3.7 to change a free major number for
    Moxa driver.
 
-   Error msg: Couldn't install MOXA Smartio family callout driver!
+   Error msg: Couldn't install MOXA Smartio/Industio family callout driver!
    Solution:
    Load Moxa callout driver fail, the callout device major number may
-   conflict with other devices. Please refer to previous section 3.5 to
+   conflict with other devices. Please refer to previous section 3.7 to
    change a free callout device major number for Moxa driver.
+
+
 -----------------------------------------------------------------------------
+

+ 81 - 31
Documentation/networking/bonding.txt

@@ -289,35 +289,73 @@ downdelay
 fail_over_mac
 
 	Specifies whether active-backup mode should set all slaves to
-	the same MAC address (the traditional behavior), or, when
-	enabled, change the bond's MAC address when changing the
-	active interface (i.e., fail over the MAC address itself).
-
-	Fail over MAC is useful for devices that cannot ever alter
-	their MAC address, or for devices that refuse incoming
-	broadcasts with their own source MAC (which interferes with
-	the ARP monitor).
-
-	The down side of fail over MAC is that every device on the
-	network must be updated via gratuitous ARP, vs. just updating
-	a switch or set of switches (which often takes place for any
-	traffic, not just ARP traffic, if the switch snoops incoming
-	traffic to update its tables) for the traditional method.  If
-	the gratuitous ARP is lost, communication may be disrupted.
-
-	When fail over MAC is used in conjuction with the mii monitor,
-	devices which assert link up prior to being able to actually
-	transmit and receive are particularly susecptible to loss of
-	the gratuitous ARP, and an appropriate updelay setting may be
-	required.
-
-	A value of 0 disables fail over MAC, and is the default.  A
-	value of 1 enables fail over MAC.  This option is enabled
-	automatically if the first slave added cannot change its MAC
-	address.  This option may be modified via sysfs only when no
-	slaves are present in the bond.
-
-	This option was added in bonding version 3.2.0.
+	the same MAC address at enslavement (the traditional
+	behavior), or, when enabled, perform special handling of the
+	bond's MAC address in accordance with the selected policy.
+
+	Possible values are:
+
+	none or 0
+
+		This setting disables fail_over_mac, and causes
+		bonding to set all slaves of an active-backup bond to
+		the same MAC address at enslavement time.  This is the
+		default.
+
+	active or 1
+
+		The "active" fail_over_mac policy indicates that the
+		MAC address of the bond should always be the MAC
+		address of the currently active slave.  The MAC
+		address of the slaves is not changed; instead, the MAC
+		address of the bond changes during a failover.
+
+		This policy is useful for devices that cannot ever
+		alter their MAC address, or for devices that refuse
+		incoming broadcasts with their own source MAC (which
+		interferes with the ARP monitor).
+
+		The down side of this policy is that every device on
+		the network must be updated via gratuitous ARP,
+		vs. just updating a switch or set of switches (which
+		often takes place for any traffic, not just ARP
+		traffic, if the switch snoops incoming traffic to
+		update its tables) for the traditional method.  If the
+		gratuitous ARP is lost, communication may be
+		disrupted.
+
+		When this policy is used in conjuction with the mii
+		monitor, devices which assert link up prior to being
+		able to actually transmit and receive are particularly
+		susecptible to loss of the gratuitous ARP, and an
+		appropriate updelay setting may be required.
+
+	follow or 2
+
+		The "follow" fail_over_mac policy causes the MAC
+		address of the bond to be selected normally (normally
+		the MAC address of the first slave added to the bond).
+		However, the second and subsequent slaves are not set
+		to this MAC address while they are in a backup role; a
+		slave is programmed with the bond's MAC address at
+		failover time (and the formerly active slave receives
+		the newly active slave's MAC address).
+
+		This policy is useful for multiport devices that
+		either become confused or incur a performance penalty
+		when multiple ports are programmed with the same MAC
+		address.
+
+
+	The default policy is none, unless the first slave cannot
+	change its MAC address, in which case the active policy is
+	selected by default.
+
+	This option may be modified via sysfs only when no slaves are
+	present in the bond.
+
+	This option was added in bonding version 3.2.0.  The "follow"
+	policy was added in bonding version 3.3.0.
 
 lacp_rate
 
@@ -338,7 +376,8 @@ max_bonds
 	Specifies the number of bonding devices to create for this
 	instance of the bonding driver.  E.g., if max_bonds is 3, and
 	the bonding driver is not already loaded, then bond0, bond1
-	and bond2 will be created.  The default value is 1.
+	and bond2 will be created.  The default value is 1.  Specifying
+	a value of 0 will load bonding, but will not create any devices.
 
 miimon
 
@@ -501,6 +540,17 @@ mode
 		swapped with the new curr_active_slave that was
 		chosen.
 
+num_grat_arp
+
+	Specifies the number of gratuitous ARPs to be issued after a
+	failover event.  One gratuitous ARP is issued immediately after
+	the failover, subsequent ARPs are sent at a rate of one per link
+	monitor interval (arp_interval or miimon, whichever is active).
+
+	The valid range is 0 - 255; the default value is 1.  This option
+	affects only the active-backup mode.  This option was added for
+	bonding version 3.3.0.
+
 primary
 
 	A string (eth0, eth2, etc) specifying which slave is the
@@ -581,7 +631,7 @@ xmit_hash_policy
 		in environments where a layer3 gateway device is
 		required to reach most destinations.
 
-		This algorithm is 802.3ad complient.
+		This algorithm is 802.3ad compliant.
 
 	layer3+4
 

+ 2 - 2
Documentation/networking/can.txt

@@ -186,7 +186,7 @@ solution for a couple of reasons:
 
   The Linux network devices (by default) just can handle the
   transmission and reception of media dependent frames. Due to the
-  arbritration on the CAN bus the transmission of a low prio CAN-ID
+  arbitration on the CAN bus the transmission of a low prio CAN-ID
   may be delayed by the reception of a high prio CAN frame. To
   reflect the correct* traffic on the node the loopback of the sent
   data has to be performed right after a successful transmission. If
@@ -481,7 +481,7 @@ solution for a couple of reasons:
   - stats_timer: To calculate the Socket CAN core statistics
     (e.g. current/maximum frames per second) this 1 second timer is
     invoked at can.ko module start time by default. This timer can be
-    disabled by using stattimer=0 on the module comandline.
+    disabled by using stattimer=0 on the module commandline.
 
   - debug: (removed since SocketCAN SVN r546)
 

+ 167 - 0
Documentation/networking/dm9000.txt

@@ -0,0 +1,167 @@
+DM9000 Network driver
+=====================
+
+Copyright 2008 Simtec Electronics,
+	  Ben Dooks <ben@simtec.co.uk> <ben-linux@fluff.org>
+
+
+Introduction
+------------
+
+This file describes how to use the DM9000 platform-device based network driver
+that is contained in the files drivers/net/dm9000.c and drivers/net/dm9000.h.
+
+The driver supports three DM9000 variants, the DM9000E which is the first chip
+supported as well as the newer DM9000A and DM9000B devices. It is currently
+maintained and tested by Ben Dooks, who should be CC: to any patches for this
+driver.
+
+
+Defining the platform device
+----------------------------
+
+The minimum set of resources attached to the platform device are as follows:
+
+    1) The physical address of the address register
+    2) The physical address of the data register
+    3) The IRQ line the device's interrupt pin is connected to.
+
+These resources should be specified in that order, as the ordering of the
+two address regions is important (the driver expects these to be address
+and then data).
+
+An example from arch/arm/mach-s3c2410/mach-bast.c is:
+
+static struct resource bast_dm9k_resource[] = {
+	[0] = {
+		.start = S3C2410_CS5 + BAST_PA_DM9000,
+		.end   = S3C2410_CS5 + BAST_PA_DM9000 + 3,
+		.flags = IORESOURCE_MEM,
+	},
+	[1] = {
+		.start = S3C2410_CS5 + BAST_PA_DM9000 + 0x40,
+		.end   = S3C2410_CS5 + BAST_PA_DM9000 + 0x40 + 0x3f,
+		.flags = IORESOURCE_MEM,
+	},
+	[2] = {
+		.start = IRQ_DM9000,
+		.end   = IRQ_DM9000,
+		.flags = IORESOURCE_IRQ | IORESOURCE_IRQ_HIGHLEVEL,
+	}
+};
+
+static struct platform_device bast_device_dm9k = {
+	.name		= "dm9000",
+	.id		= 0,
+	.num_resources	= ARRAY_SIZE(bast_dm9k_resource),
+	.resource	= bast_dm9k_resource,
+};
+
+Note the setting of the IRQ trigger flag in bast_dm9k_resource[2].flags,
+as this will generate a warning if it is not present. The trigger from
+the flags field will be passed to request_irq() when registering the IRQ
+handler to ensure that the IRQ is setup correctly.
+
+This shows a typical platform device, without the optional configuration
+platform data supplied. The next example uses the same resources, but adds
+the optional platform data to pass extra configuration data:
+
+static struct dm9000_plat_data bast_dm9k_platdata = {
+	.flags		= DM9000_PLATF_16BITONLY,
+};
+
+static struct platform_device bast_device_dm9k = {
+	.name		= "dm9000",
+	.id		= 0,
+	.num_resources	= ARRAY_SIZE(bast_dm9k_resource),
+	.resource	= bast_dm9k_resource,
+	.dev		= {
+		.platform_data = &bast_dm9k_platdata,
+	}
+};
+
+The platform data is defined in include/linux/dm9000.h and described below.
+
+
+Platform data
+-------------
+
+Extra platform data for the DM9000 can describe the IO bus width to the
+device, whether or not an external PHY is attached to the device and
+the availability of an external configuration EEPROM.
+
+The flags for the platform data .flags field are as follows:
+
+DM9000_PLATF_8BITONLY
+
+	The IO should be done with 8bit operations.
+
+DM9000_PLATF_16BITONLY
+
+	The IO should be done with 16bit operations.
+
+DM9000_PLATF_32BITONLY
+
+	The IO should be done with 32bit operations.
+
+DM9000_PLATF_EXT_PHY
+
+	The chip is connected to an external PHY.
+
+DM9000_PLATF_NO_EEPROM
+
+	This can be used to signify that the board does not have an
+	EEPROM, or that the EEPROM should be hidden from the user.
+
+DM9000_PLATF_SIMPLE_PHY
+
+	Switch to using the simpler PHY polling method which does not
+	try and read the MII PHY state regularly. This is only available
+	when using the internal PHY. See the section on link state polling
+	for more information.
+
+	The config symbol DM9000_FORCE_SIMPLE_PHY_POLL, Kconfig entry
+	"Force simple NSR based PHY polling" allows this flag to be
+	forced on at build time.
+
+
+PHY Link state polling
+----------------------
+
+The driver keeps track of the link state and informs the network core
+about link (carrier) availablilty. This is managed by several methods
+depending on the version of the chip and on which PHY is being used.
+
+For the internal PHY, the original (and currently default) method is
+to read the MII state, either when the status changes if we have the
+necessary interrupt support in the chip or every two seconds via a
+periodic timer.
+
+To reduce the overhead for the internal PHY, there is now the option
+of using the DM9000_FORCE_SIMPLE_PHY_POLL config, or DM9000_PLATF_SIMPLE_PHY
+platform data option to read the summary information without the
+expensive MII accesses. This method is faster, but does not print
+as much information.
+
+When using an external PHY, the driver currently has to poll the MII
+link status as there is no method for getting an interrupt on link change.
+
+
+DM9000A / DM9000B
+-----------------
+
+These chips are functionally similar to the DM9000E and are supported easily
+by the same driver. The features are:
+
+   1) Interrupt on internal PHY state change. This means that the periodic
+      polling of the PHY status may be disabled on these devices when using
+      the internal PHY.
+
+   2) TCP/UDP checksum offloading, which the driver does not currently support.
+
+
+ethtool
+-------
+
+The driver supports the ethtool interface for access to the driver
+state information, the PHY state and the EEPROM.

+ 2 - 12
Documentation/networking/e1000.txt

@@ -513,21 +513,11 @@ Additional Configurations
   Intel(R) PRO/1000 PT Dual Port Server Connection
   Intel(R) PRO/1000 PT Dual Port Server Adapter
   Intel(R) PRO/1000 PF Dual Port Server Adapter
-  Intel(R) PRO/1000 PT Quad Port Server Adapter 
+  Intel(R) PRO/1000 PT Quad Port Server Adapter
 
   NAPI
   ----
-  NAPI (Rx polling mode) is supported in the e1000 driver.  NAPI is enabled
-  or disabled based on the configuration of the kernel.  To override
-  the default, use the following compile-time flags.
-
-  To enable NAPI, compile the driver module, passing in a configuration option:
-
-       make CFLAGS_EXTRA=-DE1000_NAPI install
-
-  To disable NAPI, compile the driver module, passing in a configuration option:
-
-       make CFLAGS_EXTRA=-DE1000_NO_NAPI install
+  NAPI (Rx polling mode) is enabled in the e1000 driver.
 
   See www.cyberus.ca/~hadi/usenix-paper.tgz for more information on NAPI.
 

+ 17 - 4
Documentation/networking/ip-sysctl.txt

@@ -551,8 +551,9 @@ icmp_echo_ignore_broadcasts - BOOLEAN
 icmp_ratelimit - INTEGER
 	Limit the maximal rates for sending ICMP packets whose type matches
 	icmp_ratemask (see below) to specific targets.
-	0 to disable any limiting, otherwise the maximal rate in jiffies(1)
-	Default: 100
+	0 to disable any limiting,
+	otherwise the minimal space between responses in milliseconds.
+	Default: 1000
 
 icmp_ratemask - INTEGER
 	Mask made of ICMP types for which rates are being limited.
@@ -1023,11 +1024,23 @@ max_addresses - INTEGER
 	autoconfigured addresses.
 	Default: 16
 
+disable_ipv6 - BOOLEAN
+	Disable IPv6 operation.
+	Default: FALSE (enable IPv6 operation)
+
+accept_dad - INTEGER
+	Whether to accept DAD (Duplicate Address Detection).
+	0: Disable DAD
+	1: Enable DAD (default)
+	2: Enable DAD, and disable IPv6 operation if MAC-based duplicate
+	   link-local address has been found.
+
 icmp/*:
 ratelimit - INTEGER
 	Limit the maximal rates for sending ICMPv6 packets.
-	0 to disable any limiting, otherwise the maximal rate in jiffies(1)
-	Default: 100
+	0 to disable any limiting,
+	otherwise the minimal space between responses in milliseconds.
+	Default: 1000
 
 
 IPv6 Update by:

+ 320 - 99
Documentation/networking/ixgb.txt

@@ -1,7 +1,7 @@
-Linux* Base Driver for the Intel(R) PRO/10GbE Family of Adapters
-================================================================
+Linux Base Driver for 10 Gigabit Intel(R) Network Connection
+=============================================================
 
-November 17, 2004
+October 9, 2007
 
 
 Contents
@@ -9,94 +9,151 @@ Contents
 
 - In This Release
 - Identifying Your Adapter
+- Building and Installation
 - Command Line Parameters
 - Improving Performance
+- Additional Configurations
+- Known Issues/Troubleshooting
 - Support
 
 
+
 In This Release
 ===============
 
-This file describes the Linux* Base Driver for the Intel(R) PRO/10GbE Family 
-of Adapters, version 1.0.x.  
+This file describes the ixgb Linux Base Driver for the 10 Gigabit Intel(R)
+Network Connection.  This driver includes support for Itanium(R)2-based
+systems.
+
+For questions related to hardware requirements, refer to the documentation
+supplied with your 10 Gigabit adapter.  All hardware requirements listed apply
+to use with Linux.
+
+The following features are available in this kernel:
+ - Native VLANs
+ - Channel Bonding (teaming)
+ - SNMP
+
+Channel Bonding documentation can be found in the Linux kernel source:
+/Documentation/networking/bonding.txt
+
+The driver information previously displayed in the /proc filesystem is not
+supported in this release.  Alternatively, you can use ethtool (version 1.6
+or later), lspci, and ifconfig to obtain the same information.
+
+Instructions on updating ethtool can be found in the section "Additional
+Configurations" later in this document.
 
-For questions related to hardware requirements, refer to the documentation 
-supplied with your Intel PRO/10GbE adapter. All hardware requirements listed 
-apply to use with Linux.
 
 Identifying Your Adapter
 ========================
 
-To verify your Intel adapter is supported, find the board ID number on the 
-adapter. Look for a label that has a barcode and a number in the format  
-A12345-001. 
+The following Intel network adapters are compatible with the drivers in this
+release:
+
+Controller  Adapter Name                 Physical Layer
+----------  ------------                 --------------
+82597EX     Intel(R) PRO/10GbE LR/SR/CX4 10G Base-LR (1310 nm optical fiber)
+            Server Adapters              10G Base-SR (850 nm optical fiber)
+                                         10G Base-CX4(twin-axial copper cabling)
+
+For more information on how to identify your adapter, go to the Adapter &
+Driver ID Guide at:
+
+    http://support.intel.com/support/network/sb/CS-012904.htm
+
+
+Building and Installation
+=========================
+
+select m for "Intel(R) PRO/10GbE support" located at:
+      Location:
+        -> Device Drivers
+          -> Network device support (NETDEVICES [=y])
+            -> Ethernet (10000 Mbit) (NETDEV_10000 [=y])
+1. make modules && make modules_install
+
+2. Load the module:
+
+    modprobe ixgb <parameter>=<value>
+
+   The insmod command can be used if the full
+   path to the driver module is specified.  For example:
+
+     insmod /lib/modules/<KERNEL VERSION>/kernel/drivers/net/ixgb/ixgb.ko
+
+   With 2.6 based kernels also make sure that older ixgb drivers are
+   removed from the kernel, before loading the new module:
 
-Use the above information and the Adapter & Driver ID Guide at:
+     rmmod ixgb; modprobe ixgb
 
-  http://support.intel.com/support/network/adapter/pro100/21397.htm
+3. Assign an IP address to the interface by entering the following, where
+   x is the interface number:
 
-For the latest Intel network drivers for Linux, go to:
+     ifconfig ethx <IP_address>
+
+4. Verify that the interface works. Enter the following, where <IP_address>
+   is the IP address for another machine on the same subnet as the interface
+   that is being tested:
+
+     ping  <IP_address>
 
-    http://downloadfinder.intel.com/scripts-df/support_intel.asp
 
 Command Line Parameters
 =======================
 
-If the driver is built as a module, the  following optional parameters are 
-used by entering them on the command line with the modprobe or insmod command
-using this syntax:
+If the driver is built as a module, the  following optional parameters are
+used by entering them on the command line with the modprobe command using
+this syntax:
 
      modprobe ixgb [<option>=<VAL1>,<VAL2>,...]
 
-     insmod ixgb [<option>=<VAL1>,<VAL2>,...]
+For example, with two 10GbE PCI adapters, entering:
 
-For example, with two PRO/10GbE PCI adapters, entering:
+     modprobe ixgb TxDescriptors=80,128
 
-    insmod ixgb TxDescriptors=80,128
-
-loads the ixgb driver with 80 TX resources for the first adapter and 128 TX 
+loads the ixgb driver with 80 TX resources for the first adapter and 128 TX
 resources for the second adapter.
 
 The default value for each parameter is generally the recommended setting,
-unless otherwise noted. Also, if the driver is statically built into the
-kernel, the driver is loaded with the default values for all the parameters.
-Ethtool can be used to change some of the parameters at runtime.
+unless otherwise noted.
 
 FlowControl
 Valid Range: 0-3 (0=none, 1=Rx only, 2=Tx only, 3=Rx&Tx)
 Default: Read from the EEPROM
-         If EEPROM is not detected, default is 3
-    This parameter controls the automatic generation(Tx) and response(Rx) to 
-    Ethernet PAUSE frames.
+         If EEPROM is not detected, default is 1
+    This parameter controls the automatic generation(Tx) and response(Rx) to
+    Ethernet PAUSE frames.  There are hardware bugs associated with enabling
+    Tx flow control so beware.
 
 RxDescriptors
 Valid Range: 64-512
 Default Value: 512
-    This value is the number of receive descriptors allocated by the driver. 
-    Increasing this value allows the driver to buffer more incoming packets. 
-    Each descriptor is 16 bytes.  A receive buffer is also allocated for 
-    each descriptor and can be either 2048, 4056, 8192, or 16384 bytes, 
-    depending on the MTU setting. When the MTU size is 1500 or less, the 
+    This value is the number of receive descriptors allocated by the driver.
+    Increasing this value allows the driver to buffer more incoming packets.
+    Each descriptor is 16 bytes.  A receive buffer is also allocated for
+    each descriptor and can be either 2048, 4056, 8192, or 16384 bytes,
+    depending on the MTU setting.  When the MTU size is 1500 or less, the
     receive buffer size is 2048 bytes. When the MTU is greater than 1500 the
-    receive buffer size will be either 4056, 8192, or 16384 bytes. The 
+    receive buffer size will be either 4056, 8192, or 16384 bytes.  The
     maximum MTU size is 16114.
 
 RxIntDelay
 Valid Range: 0-65535 (0=off)
-Default Value: 6
-    This value delays the generation of receive interrupts in units of 
-    0.8192 microseconds.  Receive interrupt reduction can improve CPU 
-    efficiency if properly tuned for specific network traffic. Increasing 
-    this value adds extra latency to frame reception and can end up 
-    decreasing the throughput of TCP traffic. If the system is reporting 
-    dropped receives, this value may be set too high, causing the driver to 
+Default Value: 72
+    This value delays the generation of receive interrupts in units of
+    0.8192 microseconds.  Receive interrupt reduction can improve CPU
+    efficiency if properly tuned for specific network traffic.  Increasing
+    this value adds extra latency to frame reception and can end up
+    decreasing the throughput of TCP traffic.  If the system is reporting
+    dropped receives, this value may be set too high, causing the driver to
     run out of available receive descriptors.
 
 TxDescriptors
 Valid Range: 64-4096
 Default Value: 256
     This value is the number of transmit descriptors allocated by the driver.
-    Increasing this value allows the driver to queue more transmits. Each 
+    Increasing this value allows the driver to queue more transmits.  Each
     descriptor is 16 bytes.
 
 XsumRX
@@ -105,51 +162,49 @@ Default Value: 1
     A value of '1' indicates that the driver should enable IP checksum
     offload for received packets (both UDP and TCP) to the adapter hardware.
 
-XsumTX
-Valid Range: 0-1
-Default Value: 1
-    A value of '1' indicates that the driver should enable IP checksum
-    offload for transmitted packets (both UDP and TCP) to the adapter 
-    hardware.
 
 Improving Performance
 =====================
 
-With the Intel PRO/10 GbE adapter, the default Linux configuration will very 
-likely limit the total available throughput artificially.  There is a set of 
-things that when applied together increase the ability of Linux to transmit 
-and receive data.  The following enhancements were originally acquired from
-settings published at http://www.spec.org/web99 for various submitted results 
-using Linux.
+With the 10 Gigabit server adapters, the default Linux configuration will
+very likely limit the total available throughput artificially.  There is a set
+of configuration changes that, when applied together, will increase the ability
+of Linux to transmit and receive data.  The following enhancements were
+originally acquired from settings published at http://www.spec.org/web99/ for
+various submitted results using Linux.
 
-NOTE: These changes are only suggestions, and serve as a starting point for 
-tuning your network performance.
+NOTE: These changes are only suggestions, and serve as a starting point for
+      tuning your network performance.
 
 The changes are made in three major ways, listed in order of greatest effect:
-- Use ifconfig to modify the mtu (maximum transmission unit) and the txqueuelen 
+- Use ifconfig to modify the mtu (maximum transmission unit) and the txqueuelen
   parameter.
 - Use sysctl to modify /proc parameters (essentially kernel tuning)
-- Use setpci to modify the MMRBC field in PCI-X configuration space to increase 
+- Use setpci to modify the MMRBC field in PCI-X configuration space to increase
   transmit burst lengths on the bus.
 
-NOTE: setpci modifies the adapter's configuration registers to allow it to read 
-up to 4k bytes at a time (for transmits).  However, for some systems the 
-behavior after modifying this register may be undefined (possibly errors of some 
-kind). A power-cycle, hard reset or explicitly setting the e6 register back to 
-22 (setpci -d 8086:1048 e6.b=22) may be required to get back to a stable 
-configuration.
+NOTE: setpci modifies the adapter's configuration registers to allow it to read
+up to 4k bytes at a time (for transmits).  However, for some systems the
+behavior after modifying this register may be undefined (possibly errors of
+some kind).  A power-cycle, hard reset or explicitly setting the e6 register
+back to 22 (setpci -d 8086:1a48 e6.b=22) may be required to get back to a
+stable configuration.
 
 - COPY these lines and paste them into ixgb_perf.sh:
 #!/bin/bash
-echo "configuring network performance , edit this file to change the interface"
+echo "configuring network performance , edit this file to change the interface
+or device ID of 10GbE card"
 # set mmrbc to 4k reads, modify only Intel 10GbE device IDs
-setpci -d 8086:1048 e6.b=2e
-# set the MTU (max transmission unit) - it requires your switch and clients to change too!
+# replace 1a48 with appropriate 10GbE device's ID installed on the system,
+# if needed.
+setpci -d 8086:1a48 e6.b=2e
+# set the MTU (max transmission unit) - it requires your switch and clients
+# to change as well.
 # set the txqueuelen
 # your ixgb adapter should be loaded as eth1 for this to work, change if needed
 ifconfig eth1 mtu 9000 txqueuelen 1000 up
-# call the sysctl utility to modify /proc/sys entries 
-sysctl -p ./sysctl_ixgb.conf 
+# call the sysctl utility to modify /proc/sys entries
+sysctl -p ./sysctl_ixgb.conf
 - END ixgb_perf.sh
 
 - COPY these lines and paste them into sysctl_ixgb.conf:
@@ -159,54 +214,220 @@ sysctl -p ./sysctl_ixgb.conf
 # several network benchmark tests, your mileage may vary
 
 ### IPV4 specific settings
-net.ipv4.tcp_timestamps = 0 # turns TCP timestamp support off, default 1, reduces CPU use
-net.ipv4.tcp_sack = 0 # turn SACK support off, default on
-# on systems with a VERY fast bus -> memory interface this is the big gainer 
-net.ipv4.tcp_rmem = 10000000 10000000 10000000 # sets min/default/max TCP read buffer, default 4096 87380 174760
-net.ipv4.tcp_wmem = 10000000 10000000 10000000 # sets min/pressure/max TCP write buffer, default 4096 16384 131072
-net.ipv4.tcp_mem = 10000000 10000000 10000000 # sets min/pressure/max TCP buffer space, default 31744 32256 32768
+# turn TCP timestamp support off, default 1, reduces CPU use
+net.ipv4.tcp_timestamps = 0
+# turn SACK support off, default on
+# on systems with a VERY fast bus -> memory interface this is the big gainer
+net.ipv4.tcp_sack = 0
+# set min/default/max TCP read buffer, default 4096 87380 174760
+net.ipv4.tcp_rmem = 10000000 10000000 10000000
+# set min/pressure/max TCP write buffer, default 4096 16384 131072
+net.ipv4.tcp_wmem = 10000000 10000000 10000000
+# set min/pressure/max TCP buffer space, default 31744 32256 32768
+net.ipv4.tcp_mem = 10000000 10000000 10000000
 
 ### CORE settings (mostly for socket and UDP effect)
-net.core.rmem_max = 524287 # maximum receive socket buffer size, default 131071
-net.core.wmem_max = 524287 # maximum send socket buffer size, default 131071
-net.core.rmem_default = 524287 # default receive socket buffer size, default 65535
-net.core.wmem_default = 524287 # default send socket buffer size, default 65535
-net.core.optmem_max = 524287 # maximum amount of option memory buffers, default 10240
-net.core.netdev_max_backlog = 300000 # number of unprocessed input packets before kernel starts dropping them, default 300
+# set maximum receive socket buffer size, default 131071
+net.core.rmem_max = 524287
+# set maximum send socket buffer size, default 131071
+net.core.wmem_max = 524287
+# set default receive socket buffer size, default 65535
+net.core.rmem_default = 524287
+# set default send socket buffer size, default 65535
+net.core.wmem_default = 524287
+# set maximum amount of option memory buffers, default 10240
+net.core.optmem_max = 524287
+# set number of unprocessed input packets before kernel starts dropping them; default 300
+net.core.netdev_max_backlog = 300000
 - END sysctl_ixgb.conf
 
-Edit the ixgb_perf.sh script if necessary to change eth1 to whatever interface 
-your ixgb driver is using.
+Edit the ixgb_perf.sh script if necessary to change eth1 to whatever interface
+your ixgb driver is using and/or replace '1a48' with appropriate 10GbE device's
+ID installed on the system.
 
-NOTE: Unless these scripts are added to the boot process, these changes will 
-only last only until the next system reboot.
+NOTE: Unless these scripts are added to the boot process, these changes will
+      only last only until the next system reboot.
 
 
 Resolving Slow UDP Traffic
 --------------------------
+If your server does not seem to be able to receive UDP traffic as fast as it
+can receive TCP traffic, it could be because Linux, by default, does not set
+the network stack buffers as large as they need to be to support high UDP
+transfer rates.  One way to alleviate this problem is to allow more memory to
+be used by the IP stack to store incoming data.
 
-If your server does not seem to be able to receive UDP traffic as fast as it 
-can receive TCP traffic, it could be because Linux, by default, does not set 
-the network stack buffers as large as they need to be to support high UDP 
-transfer rates. One way to alleviate this problem is to allow more memory to 
-be used by the IP stack to store incoming data. 
-
-For instance, use the commands: 
+For instance, use the commands:
     sysctl -w net.core.rmem_max=262143
 and
     sysctl -w net.core.rmem_default=262143
-to increase the read buffer memory max and default to 262143 (256k - 1) from 
-defaults of max=131071 (128k - 1) and default=65535 (64k - 1). These variables 
-will increase the amount of memory used by the network stack for receives, and 
+to increase the read buffer memory max and default to 262143 (256k - 1) from
+defaults of max=131071 (128k - 1) and default=65535 (64k - 1).  These variables
+will increase the amount of memory used by the network stack for receives, and
 can be increased significantly more if necessary for your application.
 
+
+Additional Configurations
+=========================
+
+  Configuring the Driver on Different Distributions
+  -------------------------------------------------
+  Configuring a network driver to load properly when the system is started is
+  distribution dependent. Typically, the configuration process involves adding
+  an alias line to /etc/modprobe.conf as well as editing other system startup
+  scripts and/or configuration files.  Many popular Linux distributions ship
+  with tools to make these changes for you.  To learn the proper way to
+  configure a network device for your system, refer to your distribution
+  documentation.  If during this process you are asked for the driver or module
+  name, the name for the Linux Base Driver for the Intel 10GbE Family of
+  Adapters is ixgb.
+
+  Viewing Link Messages
+  ---------------------
+  Link messages will not be displayed to the console if the distribution is
+  restricting system messages. In order to see network driver link messages on
+  your console, set dmesg to eight by entering the following:
+
+       dmesg -n 8
+
+  NOTE: This setting is not saved across reboots.
+
+
+  Jumbo Frames
+  ------------
+  The driver supports Jumbo Frames for all adapters. Jumbo Frames support is
+  enabled by changing the MTU to a value larger than the default of 1500.
+  The maximum value for the MTU is 16114.  Use the ifconfig command to
+  increase the MTU size.  For example:
+
+        ifconfig ethx mtu 9000 up
+
+  The maximum MTU setting for Jumbo Frames is 16114.  This value coincides
+  with the maximum Jumbo Frames size of 16128.
+
+
+  Ethtool
+  -------
+  The driver utilizes the ethtool interface for driver configuration and
+  diagnostics, as well as displaying statistical information.  Ethtool
+  version 1.6 or later is required for this functionality.
+
+  The latest release of ethtool can be found from
+  http://sourceforge.net/projects/gkernel
+
+  NOTE: Ethtool 1.6 only supports a limited set of ethtool options. Support
+        for a more complete ethtool feature set can be enabled by upgrading
+        to the latest version.
+
+
+  NAPI
+  ----
+
+  NAPI (Rx polling mode) is supported in the ixgb driver.  NAPI is enabled
+  or disabled based on the configuration of the kernel.  see CONFIG_IXGB_NAPI
+
+  See www.cyberus.ca/~hadi/usenix-paper.tgz for more information on NAPI.
+
+
+Known Issues/Troubleshooting
+============================
+
+  NOTE: After installing the driver, if your Intel Network Connection is not
+  working, verify in the "In This Release" section of the readme that you have
+  installed the correct driver.
+
+  Intel(R) PRO/10GbE CX4 Server Adapter Cable Interoperability Issue with
+  Fujitsu XENPAK Module in SmartBits Chassis
+  ---------------------------------------------------------------------
+  Excessive CRC errors may be observed if the Intel(R) PRO/10GbE CX4
+  Server adapter is connected to a Fujitsu XENPAK CX4 module in a SmartBits
+  chassis using 15 m/24AWG cable assemblies manufactured by Fujitsu or Leoni.
+  The CRC errors may be received either by the Intel(R) PRO/10GbE CX4
+  Server adapter or the SmartBits. If this situation occurs using a different
+  cable assembly may resolve the issue.
+
+  CX4 Server Adapter Cable Interoperability Issues with HP Procurve 3400cl
+  Switch Port
+  ------------------------------------------------------------------------
+  Excessive CRC errors may be observed if the Intel(R) PRO/10GbE CX4 Server
+  adapter is connected to an HP Procurve 3400cl switch port using short cables
+  (1 m or shorter). If this situation occurs, using a longer cable may resolve
+  the issue.
+
+  Excessive CRC errors may be observed using Fujitsu 24AWG cable assemblies that
+  Are 10 m or longer or where using a Leoni 15 m/24AWG cable assembly. The CRC
+  errors may be received either by the CX4 Server adapter or at the switch. If
+  this situation occurs, using a different cable assembly may resolve the issue.
+
+
+  Jumbo Frames System Requirement
+  -------------------------------
+  Memory allocation failures have been observed on Linux systems with 64 MB
+  of RAM or less that are running Jumbo Frames.  If you are using Jumbo
+  Frames, your system may require more than the advertised minimum
+  requirement of 64 MB of system memory.
+
+
+  Performance Degradation with Jumbo Frames
+  -----------------------------------------
+  Degradation in throughput performance may be observed in some Jumbo frames
+  environments.  If this is observed, increasing the application's socket buffer
+  size and/or increasing the /proc/sys/net/ipv4/tcp_*mem entry values may help.
+  See the specific application manual and /usr/src/linux*/Documentation/
+  networking/ip-sysctl.txt for more details.
+
+
+  Allocating Rx Buffers when Using Jumbo Frames
+  ---------------------------------------------
+  Allocating Rx buffers when using Jumbo Frames on 2.6.x kernels may fail if
+  the available memory is heavily fragmented. This issue may be seen with PCI-X
+  adapters or with packet split disabled. This can be reduced or eliminated
+  by changing the amount of available memory for receive buffer allocation, by
+  increasing /proc/sys/vm/min_free_kbytes.
+
+
+  Multiple Interfaces on Same Ethernet Broadcast Network
+  ------------------------------------------------------
+  Due to the default ARP behavior on Linux, it is not possible to have
+  one system on two IP networks in the same Ethernet broadcast domain
+  (non-partitioned switch) behave as expected.  All Ethernet interfaces
+  will respond to IP traffic for any IP address assigned to the system.
+  This results in unbalanced receive traffic.
+
+  If you have multiple interfaces in a server, do either of the following:
+
+  - Turn on ARP filtering by entering:
+      echo 1 > /proc/sys/net/ipv4/conf/all/arp_filter
+
+  - Install the interfaces in separate broadcast domains - either in
+    different switches or in a switch partitioned to VLANs.
+
+
+  UDP Stress Test Dropped Packet Issue
+  --------------------------------------
+  Under small packets UDP stress test with 10GbE driver, the Linux system
+  may drop UDP packets due to the fullness of socket buffers. You may want
+  to change the driver's Flow Control variables to the minimum value for
+  controlling packet reception.
+
+
+  Tx Hangs Possible Under Stress
+  ------------------------------
+  Under stress conditions, if TX hangs occur, turning off TSO
+  "ethtool -K eth0 tso off" may resolve the problem.
+
+
 Support
 =======
 
-For general information and support, go to the Intel support website at:
+For general information, go to the Intel support website at:
 
     http://support.intel.com
 
+or the Intel Wired Networking project hosted by Sourceforge at:
+
+    http://sourceforge.net/projects/e1000
+
 If an issue is identified with the released source code on the supported
-kernel with a supported adapter, email the specific information related to 
-the issue to linux.nics@intel.com.
+kernel with a supported adapter, email the specific information related
+to the issue to e1000-devel@lists.sf.net

+ 67 - 0
Documentation/networking/mac80211_hwsim/README

@@ -0,0 +1,67 @@
+mac80211_hwsim - software simulator of 802.11 radio(s) for mac80211
+Copyright (c) 2008, Jouni Malinen <j@w1.fi>
+
+This program is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License version 2 as
+published by the Free Software Foundation.
+
+
+Introduction
+
+mac80211_hwsim is a Linux kernel module that can be used to simulate
+arbitrary number of IEEE 802.11 radios for mac80211. It can be used to
+test most of the mac80211 functionality and user space tools (e.g.,
+hostapd and wpa_supplicant) in a way that matches very closely with
+the normal case of using real WLAN hardware. From the mac80211 view
+point, mac80211_hwsim is yet another hardware driver, i.e., no changes
+to mac80211 are needed to use this testing tool.
+
+The main goal for mac80211_hwsim is to make it easier for developers
+to test their code and work with new features to mac80211, hostapd,
+and wpa_supplicant. The simulated radios do not have the limitations
+of real hardware, so it is easy to generate an arbitrary test setup
+and always reproduce the same setup for future tests. In addition,
+since all radio operation is simulated, any channel can be used in
+tests regardless of regulatory rules.
+
+mac80211_hwsim kernel module has a parameter 'radios' that can be used
+to select how many radios are simulated (default 2). This allows
+configuration of both very simply setups (e.g., just a single access
+point and a station) or large scale tests (multiple access points with
+hundreds of stations).
+
+mac80211_hwsim works by tracking the current channel of each virtual
+radio and copying all transmitted frames to all other radios that are
+currently enabled and on the same channel as the transmitting
+radio. Software encryption in mac80211 is used so that the frames are
+actually encrypted over the virtual air interface to allow more
+complete testing of encryption.
+
+A global monitoring netdev, hwsim#, is created independent of
+mac80211. This interface can be used to monitor all transmitted frames
+regardless of channel.
+
+
+Simple example
+
+This example shows how to use mac80211_hwsim to simulate two radios:
+one to act as an access point and the other as a station that
+associates with the AP. hostapd and wpa_supplicant are used to take
+care of WPA2-PSK authentication. In addition, hostapd is also
+processing access point side of association.
+
+Please note that the current Linux kernel does not enable AP mode, so a
+simple patch is needed to enable AP mode selection:
+http://johannes.sipsolutions.net/patches/kernel/all/LATEST/006-allow-ap-vlan-modes.patch
+
+
+# Build mac80211_hwsim as part of kernel configuration
+
+# Load the module
+modprobe mac80211_hwsim
+
+# Run hostapd (AP) for wlan0
+hostapd hostapd.conf
+
+# Run wpa_supplicant (station) for wlan1
+wpa_supplicant -Dwext -iwlan1 -c wpa_supplicant.conf

+ 11 - 0
Documentation/networking/mac80211_hwsim/hostapd.conf

@@ -0,0 +1,11 @@
+interface=wlan0
+driver=nl80211
+
+hw_mode=g
+channel=1
+ssid=mac80211 test
+
+wpa=2
+wpa_key_mgmt=WPA-PSK
+wpa_pairwise=CCMP
+wpa_passphrase=12345678

+ 10 - 0
Documentation/networking/mac80211_hwsim/wpa_supplicant.conf

@@ -0,0 +1,10 @@
+ctrl_interface=/var/run/wpa_supplicant
+
+network={
+	ssid="mac80211 test"
+	psk="12345678"
+	key_mgmt=WPA-PSK
+	proto=WPA2
+	pairwise=CCMP
+	group=CCMP
+}

+ 1 - 89
Documentation/networking/multiqueue.txt

@@ -3,19 +3,11 @@
 		===========================================
 
 Section 1: Base driver requirements for implementing multiqueue support
-Section 2: Qdisc support for multiqueue devices
-Section 3: Brief howto using PRIO or RR for multiqueue devices
-
 
 Intro: Kernel support for multiqueue devices
 ---------------------------------------------------------
 
-Kernel support for multiqueue devices is only an API that is presented to the
-netdevice layer for base drivers to implement.  This feature is part of the
-core networking stack, and all network devices will be running on the
-multiqueue-aware stack.  If a base driver only has one queue, then these
-changes are transparent to that driver.
-
+Kernel support for multiqueue devices is always present.
 
 Section 1: Base driver requirements for implementing multiqueue support
 -----------------------------------------------------------------------
@@ -32,84 +24,4 @@ netif_{start|stop|wake}_subqueue() functions to manage each queue while the
 device is still operational.  netdev->queue_lock is still used when the device
 comes online or when it's completely shut down (unregister_netdev(), etc.).
 
-Finally, the base driver should indicate that it is a multiqueue device.  The
-feature flag NETIF_F_MULTI_QUEUE should be added to the netdev->features
-bitmap on device initialization.  Below is an example from e1000:
-
-#ifdef CONFIG_E1000_MQ
-	if ( (adapter->hw.mac.type == e1000_82571) ||
-	     (adapter->hw.mac.type == e1000_82572) ||
-	     (adapter->hw.mac.type == e1000_80003es2lan))
-		netdev->features |= NETIF_F_MULTI_QUEUE;
-#endif
-
-
-Section 2: Qdisc support for multiqueue devices
------------------------------------------------
-
-Currently two qdiscs support multiqueue devices.  A new round-robin qdisc,
-sch_rr, and sch_prio. The qdisc is responsible for classifying the skb's to
-bands and queues, and will store the queue mapping into skb->queue_mapping.
-Use this field in the base driver to determine which queue to send the skb
-to.
-
-sch_rr has been added for hardware that doesn't want scheduling policies from
-software, so it's a straight round-robin qdisc.  It uses the same syntax and
-classification priomap that sch_prio uses, so it should be intuitive to
-configure for people who've used sch_prio.
-
-In order to utilitize the multiqueue features of the qdiscs, the network
-device layer needs to enable multiple queue support.  This can be done by
-selecting NETDEVICES_MULTIQUEUE under Drivers.
-
-The PRIO qdisc naturally plugs into a multiqueue device.  If
-NETDEVICES_MULTIQUEUE is selected, then on qdisc load, the number of
-bands requested is compared to the number of queues on the hardware.  If they
-are equal, it sets a one-to-one mapping up between the queues and bands.  If
-they're not equal, it will not load the qdisc.  This is the same behavior
-for RR.  Once the association is made, any skb that is classified will have
-skb->queue_mapping set, which will allow the driver to properly queue skb's
-to multiple queues.
-
-
-Section 3: Brief howto using PRIO and RR for multiqueue devices
----------------------------------------------------------------
-
-The userspace command 'tc,' part of the iproute2 package, is used to configure
-qdiscs.  To add the PRIO qdisc to your network device, assuming the device is
-called eth0, run the following command:
-
-# tc qdisc add dev eth0 root handle 1: prio bands 4 multiqueue
-
-This will create 4 bands, 0 being highest priority, and associate those bands
-to the queues on your NIC.  Assuming eth0 has 4 Tx queues, the band mapping
-would look like:
-
-band 0 => queue 0
-band 1 => queue 1
-band 2 => queue 2
-band 3 => queue 3
-
-Traffic will begin flowing through each queue if your TOS values are assigning
-traffic across the various bands.  For example, ssh traffic will always try to
-go out band 0 based on TOS -> Linux priority conversion (realtime traffic),
-so it will be sent out queue 0.  ICMP traffic (pings) fall into the "normal"
-traffic classification, which is band 1.  Therefore pings will be send out
-queue 1 on the NIC.
-
-Note the use of the multiqueue keyword.  This is only in versions of iproute2
-that support multiqueue networking devices; if this is omitted when loading
-a qdisc onto a multiqueue device, the qdisc will load and operate the same
-if it were loaded onto a single-queue device (i.e. - sends all traffic to
-queue 0).
-
-Another alternative to multiqueue band allocation can be done by using the
-multiqueue option and specify 0 bands.  If this is the case, the qdisc will
-allocate the number of bands to equal the number of queues that the device
-reports, and bring the qdisc online.
-
-The behavior of tc filters remains the same, where it will override TOS priority
-classification.
-
-
 Author: Peter P. Waskiewicz Jr. <peter.p.waskiewicz.jr@intel.com>

+ 1 - 1
Documentation/networking/packet_mmap.txt

@@ -326,7 +326,7 @@ just one call to mmap is needed:
     mmap(0, size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
 
 If tp_frame_size is a divisor of tp_block_size frames will be 
-contiguosly spaced by tp_frame_size bytes. If not, each 
+contiguously spaced by tp_frame_size bytes. If not, each
 tp_block_size/tp_frame_size frames there will be a gap between 
 the frames. This is because a frame cannot be spawn across two
 blocks. 

+ 2 - 5
Documentation/networking/s2io.txt

@@ -52,13 +52,10 @@ d. MSI/MSI-X. Can be enabled on platforms which support this feature
 (IA64, Xeon) resulting in noticeable performance improvement(upto 7%
 on certain platforms).
 
-e. NAPI. Compile-time option(CONFIG_S2IO_NAPI) for better Rx interrupt 
-moderation.
-
-f. Statistics. Comprehensive MAC-level and software statistics displayed
+e. Statistics. Comprehensive MAC-level and software statistics displayed
 using "ethtool -S" option.
 
-g. Multi-FIFO/Ring. Supports up to 8 transmit queues and receive rings, 
+f. Multi-FIFO/Ring. Supports up to 8 transmit queues and receive rings,
 with multiple steering options.
 
 4.  Command line parameters

+ 8 - 7
Documentation/networking/tc-actions-env-rules.txt

@@ -4,26 +4,27 @@ The "enviromental" rules for authors of any new tc actions are:
 1) If you stealeth or borroweth any packet thou shalt be branching
 from the righteous path and thou shalt cloneth.
 
-For example if your action queues a packet to be processed later
-or intentionaly branches by redirecting a packet then you need to
+For example if your action queues a packet to be processed later,
+or intentionally branches by redirecting a packet, then you need to
 clone the packet.
+
 There are certain fields in the skb tc_verd that need to be reset so we
-avoid loops etc. A few are generic enough so much so that skb_act_clone()
-resets them for you. So invoke skb_act_clone() rather than skb_clone()
+avoid loops, etc.  A few are generic enough that skb_act_clone()
+resets them for you, so invoke skb_act_clone() rather than skb_clone().
 
 2) If you munge any packet thou shalt call pskb_expand_head in the case
 someone else is referencing the skb. After that you "own" the skb.
 You must also tell us if it is ok to munge the packet (TC_OK2MUNGE),
 this way any action downstream can stomp on the packet.
 
-3) dropping packets you dont own is a nono. You simply return
+3) Dropping packets you don't own is a no-no. You simply return
 TC_ACT_SHOT to the caller and they will drop it.
 
 The "enviromental" rules for callers of actions (qdiscs etc) are:
 
-*) thou art responsible for freeing anything returned as being
+*) Thou art responsible for freeing anything returned as being
 TC_ACT_SHOT/STOLEN/QUEUED. If none of TC_ACT_SHOT/STOLEN/QUEUED is
-returned then all is great and you dont need to do anything.
+returned, then all is great and you don't need to do anything.
 
 Post on netdev if something is unclear.
 

+ 1 - 1
Documentation/networking/udplite.txt

@@ -148,7 +148,7 @@
         getsockopt(sockfd, SOL_SOCKET, SO_NO_CHECK, &value, ...);
 
   is meaningless (as in TCP). Packets with a zero checksum field are
-  illegal (cf. RFC 3828, sec. 3.1) will be silently discarded.
+  illegal (cf. RFC 3828, sec. 3.1) and will be silently discarded.
 
   4) Fragmentation
 

+ 2 - 2
Documentation/power/00-INDEX

@@ -1,5 +1,7 @@
 00-INDEX
 	- This file
+apm-acpi.txt
+	- basic info about the APM and ACPI support.
 basic-pm-debugging.txt
 	- Debugging suspend and resume
 devices.txt
@@ -14,8 +16,6 @@ notifiers.txt
 	- Registering suspend notifiers in device drivers
 pci.txt
 	- How the PCI Subsystem Does Power Management
-pm.txt
-	- info on Linux power management support.
 pm_qos_interface.txt
 	- info on Linux PM Quality of Service interface
 power_supply_class.txt

+ 32 - 0
Documentation/power/apm-acpi.txt

@@ -0,0 +1,32 @@
+APM or ACPI?
+------------
+If you have a relatively recent x86 mobile, desktop, or server system,
+odds are it supports either Advanced Power Management (APM) or
+Advanced Configuration and Power Interface (ACPI).  ACPI is the newer
+of the two technologies and puts power management in the hands of the
+operating system, allowing for more intelligent power management than
+is possible with BIOS controlled APM.
+
+The best way to determine which, if either, your system supports is to
+build a kernel with both ACPI and APM enabled (as of 2.3.x ACPI is
+enabled by default).  If a working ACPI implementation is found, the
+ACPI driver will override and disable APM, otherwise the APM driver
+will be used.
+
+No, sorry, you cannot have both ACPI and APM enabled and running at
+once.  Some people with broken ACPI or broken APM implementations
+would like to use both to get a full set of working features, but you
+simply cannot mix and match the two.  Only one power management
+interface can be in control of the machine at once.  Think about it..
+
+User-space Daemons
+------------------
+Both APM and ACPI rely on user-space daemons, apmd and acpid
+respectively, to be completely functional.  Obtain both of these
+daemons from your Linux distribution or from the Internet (see below)
+and be sure that they are started sometime in the system boot process.
+Go ahead and start both.  If ACPI or APM is not available on your
+system the associated daemon will exit gracefully.
+
+  apmd:   http://worldvisions.ca/~apenwarr/apmd/
+  acpid:  http://acpid.sf.net/

+ 0 - 257
Documentation/power/pm.txt

@@ -1,257 +0,0 @@
-               Linux Power Management Support
-
-This document briefly describes how to use power management with your
-Linux system and how to add power management support to Linux drivers.
-
-APM or ACPI?
-------------
-If you have a relatively recent x86 mobile, desktop, or server system,
-odds are it supports either Advanced Power Management (APM) or
-Advanced Configuration and Power Interface (ACPI).  ACPI is the newer
-of the two technologies and puts power management in the hands of the
-operating system, allowing for more intelligent power management than
-is possible with BIOS controlled APM.
-
-The best way to determine which, if either, your system supports is to
-build a kernel with both ACPI and APM enabled (as of 2.3.x ACPI is
-enabled by default).  If a working ACPI implementation is found, the
-ACPI driver will override and disable APM, otherwise the APM driver
-will be used.
-
-No, sorry, you cannot have both ACPI and APM enabled and running at
-once.  Some people with broken ACPI or broken APM implementations
-would like to use both to get a full set of working features, but you
-simply cannot mix and match the two.  Only one power management
-interface can be in control of the machine at once.  Think about it..
-
-User-space Daemons
-------------------
-Both APM and ACPI rely on user-space daemons, apmd and acpid
-respectively, to be completely functional.  Obtain both of these
-daemons from your Linux distribution or from the Internet (see below)
-and be sure that they are started sometime in the system boot process.
-Go ahead and start both.  If ACPI or APM is not available on your
-system the associated daemon will exit gracefully.
-
-  apmd:   http://worldvisions.ca/~apenwarr/apmd/
-  acpid:  http://acpid.sf.net/
-
-Driver Interface -- OBSOLETE, DO NOT USE!
-----------------*************************
-
-Note: pm_register(), pm_access(), pm_dev_idle() and friends are
-obsolete. Please do not use them. Instead you should properly hook
-your driver into the driver model, and use its suspend()/resume()
-callbacks to do this kind of stuff.
-
-If you are writing a new driver or maintaining an old driver, it
-should include power management support.  Without power management
-support, a single driver may prevent a system with power management
-capabilities from ever being able to suspend (safely).
-
-Overview:
-1) Register each instance of a device with "pm_register"
-2) Call "pm_access" before accessing the hardware.
-   (this will ensure that the hardware is awake and ready)
-3) Your "pm_callback" is called before going into a
-   suspend state (ACPI D1-D3) or after resuming (ACPI D0)
-   from a suspend.
-4) Call "pm_dev_idle" when the device is not being used
-   (optional but will improve device idle detection)
-5) When unloaded, unregister the device with "pm_unregister"
-
-/*
- * Description: Register a device with the power-management subsystem
- *
- * Parameters:
- *   type - device type (PCI device, system device, ...)
- *   id - instance number or unique identifier
- *   cback - request handler callback (suspend, resume, ...)
- *
- * Returns: Registered PM device or NULL on error
- *
- * Examples:
- *   dev = pm_register(PM_SYS_DEV, PM_SYS_VGA, vga_callback);
- *
- *   struct pci_dev *pci_dev = pci_find_dev(...);
- *   dev = pm_register(PM_PCI_DEV, PM_PCI_ID(pci_dev), callback);
- */
-struct pm_dev *pm_register(pm_dev_t type, unsigned long id, pm_callback cback);
-
-/*
- * Description: Unregister a device with the power management subsystem
- *
- * Parameters:
- *   dev - PM device previously returned from pm_register
- */
-void pm_unregister(struct pm_dev *dev);
-
-/*
- * Description: Unregister all devices with a matching callback function
- *
- * Parameters:
- *   cback - previously registered request callback
- *
- * Notes: Provided for easier porting from old APM interface
- */
-void pm_unregister_all(pm_callback cback);
-
-/*
- * Power management request callback
- *
- * Parameters:
- *   dev - PM device previously returned from pm_register
- *   rqst - request type
- *   data - data, if any, associated with the request
- *
- * Returns: 0 if the request is successful
- *          EINVAL if the request is not supported
- *          EBUSY if the device is now busy and cannot handle the request
- *          ENOMEM if the device was unable to handle the request due to memory
- *
- * Details: The device request callback will be called before the
- *          device/system enters a suspend state (ACPI D1-D3) or
- *          or after the device/system resumes from suspend (ACPI D0).
- *          For PM_SUSPEND, the ACPI D-state being entered is passed
- *          as the "data" argument to the callback.  The device
- *          driver should save (PM_SUSPEND) or restore (PM_RESUME)
- *          device context when the request callback is called.
- *
- *          Once a driver returns 0 (success) from a suspend
- *          request, it should not process any further requests or
- *          access the device hardware until a call to "pm_access" is made.
- */
-typedef int (*pm_callback)(struct pm_dev *dev, pm_request_t rqst, void *data);
-
-Driver Details
---------------
-This is just a quick Q&A as a stopgap until a real driver writers'
-power management guide is available.
-
-Q: When is a device suspended?
-
-Devices can be suspended based on direct user request (eg. laptop lid
-closes), system power policy (eg.  sleep after 30 minutes of console
-inactivity), or device power policy (eg. power down device after 5
-minutes of inactivity)
-
-Q: Must a driver honor a suspend request?
-
-No, a driver can return -EBUSY from a suspend request and this
-will stop the system from suspending.  When a suspend request
-fails, all suspended devices are resumed and the system continues
-to run.  Suspend can be retried at a later time.
-
-Q: Can the driver block suspend/resume requests?
-
-Yes, a driver can delay its return from a suspend or resume
-request until the device is ready to handle requests.  It
-is advantageous to return as quickly as possible from a
-request as suspend/resume are done serially.
-
-Q: What context is a suspend/resume initiated from?
-
-A suspend or resume is initiated from a kernel thread context.
-It is safe to block, allocate memory, initiate requests
-or anything else you can do within the kernel.
-
-Q: Will requests continue to arrive after a suspend?
-
-Possibly.  It is the driver's responsibility to queue(*),
-fail, or drop any requests that arrive after returning
-success to a suspend request.  It is important that the
-driver not access its device until after it receives
-a resume request as the device's bus may no longer
-be active.
-
-(*) If a driver queues requests for processing after
-    resume be aware that the device, network, etc.
-    might be in a different state than at suspend time.
-    It's probably better to drop requests unless
-    the driver is a storage device.
-
-Q: Do I have to manage bus-specific power management registers
-
-No.  It is the responsibility of the bus driver to manage
-PCI, USB, etc. power management registers.  The bus driver
-or the power management subsystem will also enable any
-wake-on functionality that the device has.
-
-Q: So, really, what do I need to do to support suspend/resume?
-
-You need to save any device context that would
-be lost if the device was powered off and then restore
-it at resume time.  When ACPI is active, there are
-three levels of device suspend states; D1, D2, and D3.
-(The suspend state is passed as the "data" argument
-to the device callback.)  With D3, the device is powered
-off and loses all context, D1 and D2 are shallower power
-states and require less device context to be saved.  To
-play it safe, just save everything at suspend and restore
-everything at resume.
-
-Q: Where do I store device context for suspend?
-
-Anywhere in memory, kmalloc a buffer or store it
-in the device descriptor.  You are guaranteed that the
-contents of memory will be restored and accessible
-before resume, even when the system suspends to disk.
-
-Q: What do I need to do for ACPI vs. APM vs. etc?
-
-Drivers need not be aware of the specific power management
-technology that is active.  They just need to be aware
-of when the overlying power management system requests
-that they suspend or resume.
-
-Q: What about device dependencies?
-
-When a driver registers a device, the power management
-subsystem uses the information provided to build a
-tree of device dependencies (eg. USB device X is on
-USB controller Y which is on PCI bus Z)  When power
-management wants to suspend a device, it first sends
-a suspend request to its driver, then the bus driver,
-and so on up to the system bus.  Device resumes
-proceed in the opposite direction.
-
-Q: Who do I contact for additional information about
-   enabling power management for my specific driver/device?
-
-ACPI Development mailing list: linux-acpi@vger.kernel.org
-
-System Interface -- OBSOLETE, DO NOT USE!
-----------------*************************
-If you are providing new power management support to Linux (ie.
-adding support for something like APM or ACPI), you should
-communicate with drivers through the existing generic power
-management interface.
-
-/*
- * Send a request to all devices
- *
- * Parameters:
- *   rqst - request type
- *   data - data, if any, associated with the request
- *
- * Returns: 0 if the request is successful
- *          See "pm_callback" return for errors
- *
- * Details: Walk list of registered devices and call pm_send
- *          for each until complete or an error is encountered.
- *          If an error is encountered for a suspend request,
- *          return all devices to the state they were in before
- *          the suspend request.
- */
-int pm_send_all(pm_request_t rqst, void *data);
-
-/*
- * Find a matching device
- *
- * Parameters:
- *   type - device type (PCI device, system device, or 0 to match all devices)
- *   from - previous match or NULL to start from the beginning
- *
- * Returns: Matching device or NULL if none found
- */
-struct pm_dev *pm_find(pm_dev_t type, struct pm_dev *from);

+ 6 - 1
Documentation/power/pm_qos_interface.txt

@@ -1,4 +1,4 @@
-PM quality of Service interface.
+PM Quality Of Service Interface.
 
 This interface provides a kernel and user mode interface for registering
 performance expectations by drivers, subsystems and user space applications on
@@ -7,6 +7,11 @@ one of the parameters.
 Currently we have {cpu_dma_latency, network_latency, network_throughput} as the
 initial set of pm_qos parameters.
 
+Each parameters have defined units:
+ * latency: usec
+ * timeout: usec
+ * throughput: kbs (kilo bit / sec)
+
 The infrastructure exposes multiple misc device nodes one per implemented
 parameter.  The set of parameters implement is defined by pm_qos_power_init()
 and pm_qos_params.h.  This is done because having the available parameters

部分文件因为文件数量过多而无法显示