diff --git a/CREDITS b/CREDITS index e97bea06b59ff12bc827c0661fb57974a6eff4d6..077b147388bd593c971ec2643e0c8b9ecd194b65 100644 --- a/CREDITS +++ b/CREDITS @@ -3344,8 +3344,7 @@ S: Spain N: Linus Torvalds E: torvalds@linux-foundation.org D: Original kernel hacker -S: 12725 SW Millikan Way, Suite 400 -S: Beaverton, Oregon 97005 +S: Portland, Oregon 97005 S: USA N: Marcelo Tosatti diff --git a/Documentation/ABI/testing/sysfs-dev b/Documentation/ABI/testing/sysfs-dev new file mode 100644 index 0000000000000000000000000000000000000000..a9f2b8b0530fb193be12ccfb7d0a529bf0ba435d --- /dev/null +++ b/Documentation/ABI/testing/sysfs-dev @@ -0,0 +1,20 @@ +What: /sys/dev +Date: April 2008 +KernelVersion: 2.6.26 +Contact: Dan Williams +Description: The /sys/dev tree provides a method to look up the sysfs + path for a device using the information returned from + stat(2). There are two directories, 'block' and 'char', + beneath /sys/dev containing symbolic links with names of + the form ":". These links point to the + corresponding sysfs path for the given device. + + Example: + $ readlink /sys/dev/block/8:32 + ../../block/sdc + + Entries in /sys/dev/char and /sys/dev/block will be + dynamically created and destroyed as devices enter and + leave the system. + +Users: mdadm diff --git a/Documentation/DMA-attributes.txt b/Documentation/DMA-attributes.txt index 6d772f84b477c9f5a4698e87d50fb63649af83d3..b768cc0e402b8e6f1cfa8ae0b0f23c4e1c50168a 100644 --- a/Documentation/DMA-attributes.txt +++ b/Documentation/DMA-attributes.txt @@ -22,3 +22,12 @@ ready and available in memory. The DMA of the "completion indication" could race with data DMA. Mapping the memory used for completion indications with DMA_ATTR_WRITE_BARRIER would prevent the race. +DMA_ATTR_WEAK_ORDERING +---------------------- + +DMA_ATTR_WEAK_ORDERING specifies that reads and writes to the mapping +may be weakly ordered, that is that reads and writes may pass each other. + +Since it is optional for platforms to implement DMA_ATTR_WEAK_ORDERING, +those that do not will simply ignore the attribute and exhibit default +behavior. diff --git a/Documentation/DocBook/gadget.tmpl b/Documentation/DocBook/gadget.tmpl index 5a8ffa761e09991bfc59dc97fa01a4deef94e9f7..ea3bc9565e6a7e7ae48a168841c222e07bfd0154 100644 --- a/Documentation/DocBook/gadget.tmpl +++ b/Documentation/DocBook/gadget.tmpl @@ -524,6 +524,44 @@ These utilities include endpoint autoconfiguration. +Composite Device Framework + +The core API is sufficient for writing drivers for composite +USB devices (with more than one function in a given configuration), +and also multi-configuration devices (also more than one function, +but not necessarily sharing a given configuration). +There is however an optional framework which makes it easier to +reuse and combine functions. + + +Devices using this framework provide a struct +usb_composite_driver, which in turn provides one or +more struct usb_configuration instances. +Each such configuration includes at least one +struct usb_function, which packages a user +visible role such as "network link" or "mass storage device". +Management functions may also exist, such as "Device Firmware +Upgrade". + + +!Iinclude/linux/usb/composite.h +!Edrivers/usb/gadget/composite.c + + + +Composite Device Functions + +At this writing, a few of the current gadget drivers have +been converted to this framework. +Near-term plans include converting all of them, except for "gadgetfs". + + +!Edrivers/usb/gadget/f_acm.c +!Edrivers/usb/gadget/f_serial.c + + + + Peripheral Controller Drivers diff --git a/Documentation/DocBook/uio-howto.tmpl b/Documentation/DocBook/uio-howto.tmpl index fdd7f4f887b75ba7bc96b7ab81b9abc7d3348d73..df87d1b93605ac54cf400bd7b8d44b0866d0789a 100644 --- a/Documentation/DocBook/uio-howto.tmpl +++ b/Documentation/DocBook/uio-howto.tmpl @@ -21,6 +21,18 @@ + + 2006-2008 + Hans-Jürgen Koch. + + + + +This documentation is Free Software licensed under the terms of the +GPL version 2. + + + 2006-12-11 @@ -29,6 +41,12 @@ + + 0.5 + 2008-05-22 + hjk + Added description of write() function. + 0.4 2007-11-26 @@ -57,20 +75,9 @@ - + About this document - - -Copyright and License - - Copyright (c) 2006 by Hans-Jürgen Koch. - -This documentation is Free Software licensed under the terms of the -GPL version 2. - - - Translations @@ -189,6 +196,30 @@ interested in translating it, please email me represents the total interrupt count. You can use this number to figure out if you missed some interrupts. + + For some hardware that has more than one interrupt source internally, + but not separate IRQ mask and status registers, there might be + situations where userspace cannot determine what the interrupt source + was if the kernel handler disables them by writing to the chip's IRQ + register. In such a case, the kernel has to disable the IRQ completely + to leave the chip's register untouched. Now the userspace part can + determine the cause of the interrupt, but it cannot re-enable + interrupts. Another cornercase is chips where re-enabling interrupts + is a read-modify-write operation to a combined IRQ status/acknowledge + register. This would be racy if a new interrupt occurred + simultaneously. + + + To address these problems, UIO also implements a write() function. It + is normally not used and can be ignored for hardware that has only a + single interrupt source or has separate IRQ mask and status registers. + If you need it, however, a write to /dev/uioX + will call the irqcontrol() function implemented + by the driver. You have to write a 32-bit value that is usually either + 0 or 1 to disable or enable interrupts. If a driver does not implement + irqcontrol(), write() will + return with -ENOSYS. + To handle interrupts properly, your custom kernel module can @@ -362,6 +393,14 @@ device is actually used. open(), you will probably also want a custom release() function. + + +int (*irqcontrol)(struct uio_info *info, s32 irq_on) +: Optional. If you need to be able to enable or disable +interrupts from userspace by writing to /dev/uioX, +you can implement this function. The parameter irq_on +will be 0 to disable interrupts and 1 to enable them. + diff --git a/Documentation/HOWTO b/Documentation/HOWTO index 619e8caf30db88508a3f2859d140f41ce363f552..c2371c5a98f99b5eaa785bd0affd6c40187e84e3 100644 --- a/Documentation/HOWTO +++ b/Documentation/HOWTO @@ -358,7 +358,7 @@ Here is a list of some of the different kernel trees available: - pcmcia, Dominik Brodowski git.kernel.org:/pub/scm/linux/kernel/git/brodo/pcmcia-2.6.git - - SCSI, James Bottomley + - SCSI, James Bottomley git.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6.git - x86, Ingo Molnar diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt index 65a1482457a89ec9d6c5beec89464bc049f3f5e2..9f73587219e87a601f361b0942530888e56c3303 100644 --- a/Documentation/feature-removal-schedule.txt +++ b/Documentation/feature-removal-schedule.txt @@ -308,9 +308,41 @@ Who: Matthew Wilcox --------------------------- +What: SCTP_GET_PEER_ADDRS_NUM_OLD, SCTP_GET_PEER_ADDRS_OLD, + SCTP_GET_LOCAL_ADDRS_NUM_OLD, SCTP_GET_LOCAL_ADDRS_OLD +When: June 2009 +Why: A newer version of the options have been introduced in 2005 that + removes the limitions of the old API. The sctp library has been + converted to use these new options at the same time. Any user + space app that directly uses the old options should convert to using + the new options. +Who: Vlad Yasevich + +--------------------------- + What: CONFIG_THERMAL_HWMON When: January 2009 Why: This option was introduced just to allow older lm-sensors userspace to keep working over the upgrade to 2.6.26. At the scheduled time of removal fixed lm-sensors (2.x or 3.x) should be readily available. Who: Rene Herman + +--------------------------- + +What: Code that is now under CONFIG_WIRELESS_EXT_SYSFS + (in net/core/net-sysfs.c) +When: After the only user (hal) has seen a release with the patches + for enough time, probably some time in 2010. +Why: Over 1K .text/.data size reduction, data is available in other + ways (ioctls) +Who: Johannes Berg + +--------------------------- + +What: CONFIG_NF_CT_ACCT +When: 2.6.29 +Why: Accounting can now be enabled/disabled without kernel recompilation. + Currently used only to set a default value for a feature that is also + controlled by a kernel/module/sysfs/sysctl parameter. +Who: Krzysztof Piotr Oledzki + diff --git a/Documentation/filesystems/bfs.txt b/Documentation/filesystems/bfs.txt index ea825e178e797b3b8af53d8d6d5fa2c1f1974a0d..78043d5a8fc35f01c343ced6f956aadee03e1026 100644 --- a/Documentation/filesystems/bfs.txt +++ b/Documentation/filesystems/bfs.txt @@ -26,11 +26,11 @@ You can simplify mounting by just typing: this will allocate the first available loopback device (and load loop.o kernel module if necessary) automatically. If the loopback driver is not -loaded automatically, make sure that your kernel is compiled with kmod -support (CONFIG_KMOD) enabled. Beware that umount will not -deallocate /dev/loopN device if /etc/mtab file on your system is a -symbolic link to /proc/mounts. You will need to do it manually using -"-d" switch of losetup(8). Read losetup(8) manpage for more info. +loaded automatically, make sure that you have compiled the module and +that modprobe is functioning. Beware that umount will not deallocate +/dev/loopN device if /etc/mtab file on your system is a symbolic link to +/proc/mounts. You will need to do it manually using "-d" switch of +losetup(8). Read losetup(8) manpage for more info. To create the BFS image under UnixWare you need to find out first which slice contains it. The command prtvtoc(1M) is your friend: diff --git a/Documentation/filesystems/configfs/configfs.txt b/Documentation/filesystems/configfs/configfs.txt index 15838d706ea24afaac3d1951d3452006b15d9742..44c97e6accb2655faed63b52e4bc63848571a7ab 100644 --- a/Documentation/filesystems/configfs/configfs.txt +++ b/Documentation/filesystems/configfs/configfs.txt @@ -233,12 +233,10 @@ accomplished via the group operations specified on the group's config_item_type. struct configfs_group_operations { - int (*make_item)(struct config_group *group, - const char *name, - struct config_item **new_item); - int (*make_group)(struct config_group *group, - const char *name, - struct config_group **new_group); + struct config_item *(*make_item)(struct config_group *group, + const char *name); + struct config_group *(*make_group)(struct config_group *group, + const char *name); int (*commit_item)(struct config_item *item); void (*disconnect_notify)(struct config_group *group, struct config_item *item); diff --git a/Documentation/filesystems/configfs/configfs_example.c b/Documentation/filesystems/configfs/configfs_example.c index 0b422acd470c5826e269d6a98628b4cd0d81cf07..039648791701f6e1b74a970d36f361cab695ab64 100644 --- a/Documentation/filesystems/configfs/configfs_example.c +++ b/Documentation/filesystems/configfs/configfs_example.c @@ -273,13 +273,13 @@ static inline struct simple_children *to_simple_children(struct config_item *ite return item ? container_of(to_config_group(item), struct simple_children, group) : NULL; } -static int simple_children_make_item(struct config_group *group, const char *name, struct config_item **new_item) +static struct config_item *simple_children_make_item(struct config_group *group, const char *name) { struct simple_child *simple_child; simple_child = kzalloc(sizeof(struct simple_child), GFP_KERNEL); if (!simple_child) - return -ENOMEM; + return ERR_PTR(-ENOMEM); config_item_init_type_name(&simple_child->item, name, @@ -287,8 +287,7 @@ static int simple_children_make_item(struct config_group *group, const char *nam simple_child->storeme = 0; - *new_item = &simple_child->item; - return 0; + return &simple_child->item; } static struct configfs_attribute simple_children_attr_description = { @@ -360,21 +359,20 @@ static struct configfs_subsystem simple_children_subsys = { * children of its own. */ -static int group_children_make_group(struct config_group *group, const char *name, struct config_group **new_group) +static struct config_group *group_children_make_group(struct config_group *group, const char *name) { struct simple_children *simple_children; simple_children = kzalloc(sizeof(struct simple_children), GFP_KERNEL); if (!simple_children) - return -ENOMEM; + return ERR_PTR(-ENOMEM); config_group_init_type_name(&simple_children->group, name, &simple_children_type); - *new_group = &simple_children->group; - return 0; + return &simple_children->group; } static struct configfs_attribute group_children_attr_description = { diff --git a/Documentation/filesystems/nfs-rdma.txt b/Documentation/filesystems/nfs-rdma.txt index d0ec45ae4e7dfa4585fa74c0757be4d80dc02bf5..44bd766f2e5d2225488337681a948c160bd2a821 100644 --- a/Documentation/filesystems/nfs-rdma.txt +++ b/Documentation/filesystems/nfs-rdma.txt @@ -5,7 +5,7 @@ ################################################################################ Author: NetApp and Open Grid Computing - Date: April 15, 2008 + Date: May 29, 2008 Table of Contents ~~~~~~~~~~~~~~~~~ @@ -60,16 +60,18 @@ Installation The procedures described in this document have been tested with distributions from Red Hat's Fedora Project (http://fedora.redhat.com/). - - Install nfs-utils-1.1.1 or greater on the client + - Install nfs-utils-1.1.2 or greater on the client - An NFS/RDMA mount point can only be obtained by using the mount.nfs - command in nfs-utils-1.1.1 or greater. To see which version of mount.nfs - you are using, type: + An NFS/RDMA mount point can be obtained by using the mount.nfs command in + nfs-utils-1.1.2 or greater (nfs-utils-1.1.1 was the first nfs-utils + version with support for NFS/RDMA mounts, but for various reasons we + recommend using nfs-utils-1.1.2 or greater). To see which version of + mount.nfs you are using, type: - > /sbin/mount.nfs -V + $ /sbin/mount.nfs -V - If the version is less than 1.1.1 or the command does not exist, - then you will need to install the latest version of nfs-utils. + If the version is less than 1.1.2 or the command does not exist, + you should install the latest version of nfs-utils. Download the latest package from: @@ -77,22 +79,33 @@ Installation Uncompress the package and follow the installation instructions. - If you will not be using GSS and NFSv4, the installation process - can be simplified by disabling these features when running configure: + If you will not need the idmapper and gssd executables (you do not need + these to create an NFS/RDMA enabled mount command), the installation + process can be simplified by disabling these features when running + configure: - > ./configure --disable-gss --disable-nfsv4 + $ ./configure --disable-gss --disable-nfsv4 - For more information on this see the package's README and INSTALL files. + To build nfs-utils you will need the tcp_wrappers package installed. For + more information on this see the package's README and INSTALL files. After building the nfs-utils package, there will be a mount.nfs binary in the utils/mount directory. This binary can be used to initiate NFS v2, v3, - or v4 mounts. To initiate a v4 mount, the binary must be called mount.nfs4. - The standard technique is to create a symlink called mount.nfs4 to mount.nfs. + or v4 mounts. To initiate a v4 mount, the binary must be called + mount.nfs4. The standard technique is to create a symlink called + mount.nfs4 to mount.nfs. - NOTE: mount.nfs and therefore nfs-utils-1.1.1 or greater is only needed + This mount.nfs binary should be installed at /sbin/mount.nfs as follows: + + $ sudo cp utils/mount/mount.nfs /sbin/mount.nfs + + In this location, mount.nfs will be invoked automatically for NFS mounts + by the system mount commmand. + + NOTE: mount.nfs and therefore nfs-utils-1.1.2 or greater is only needed on the NFS client machine. You do not need this specific version of nfs-utils on the server. Furthermore, only the mount.nfs command from - nfs-utils-1.1.1 is needed on the client. + nfs-utils-1.1.2 is needed on the client. - Install a Linux kernel with NFS/RDMA @@ -156,8 +169,8 @@ Check RDMA and NFS Setup this time. For example, if you are using a Mellanox Tavor/Sinai/Arbel card: - > modprobe ib_mthca - > modprobe ib_ipoib + $ modprobe ib_mthca + $ modprobe ib_ipoib If you are using InfiniBand, make sure there is a Subnet Manager (SM) running on the network. If your IB switch has an embedded SM, you can @@ -166,7 +179,7 @@ Check RDMA and NFS Setup If an SM is running on your network, you should see the following: - > cat /sys/class/infiniband/driverX/ports/1/state + $ cat /sys/class/infiniband/driverX/ports/1/state 4: ACTIVE where driverX is mthca0, ipath5, ehca3, etc. @@ -174,10 +187,10 @@ Check RDMA and NFS Setup To further test the InfiniBand software stack, use IPoIB (this assumes you have two IB hosts named host1 and host2): - host1> ifconfig ib0 a.b.c.x - host2> ifconfig ib0 a.b.c.y - host1> ping a.b.c.y - host2> ping a.b.c.x + host1$ ifconfig ib0 a.b.c.x + host2$ ifconfig ib0 a.b.c.y + host1$ ping a.b.c.y + host2$ ping a.b.c.x For other device types, follow the appropriate procedures. @@ -202,11 +215,11 @@ NFS/RDMA Setup /vol0 192.168.0.47(fsid=0,rw,async,insecure,no_root_squash) /vol0 192.168.0.0/255.255.255.0(fsid=0,rw,async,insecure,no_root_squash) - The IP address(es) is(are) the client's IPoIB address for an InfiniBand HCA or the - cleint's iWARP address(es) for an RNIC. + The IP address(es) is(are) the client's IPoIB address for an InfiniBand + HCA or the cleint's iWARP address(es) for an RNIC. - NOTE: The "insecure" option must be used because the NFS/RDMA client does not - use a reserved port. + NOTE: The "insecure" option must be used because the NFS/RDMA client does + not use a reserved port. Each time a machine boots: @@ -214,43 +227,45 @@ NFS/RDMA Setup For InfiniBand using a Mellanox adapter: - > modprobe ib_mthca - > modprobe ib_ipoib - > ifconfig ib0 a.b.c.d + $ modprobe ib_mthca + $ modprobe ib_ipoib + $ ifconfig ib0 a.b.c.d NOTE: use unique addresses for the client and server - Start the NFS server - If the NFS/RDMA server was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in kernel config), - load the RDMA transport module: + If the NFS/RDMA server was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in + kernel config), load the RDMA transport module: - > modprobe svcrdma + $ modprobe svcrdma - Regardless of how the server was built (module or built-in), start the server: + Regardless of how the server was built (module or built-in), start the + server: - > /etc/init.d/nfs start + $ /etc/init.d/nfs start or - > service nfs start + $ service nfs start Instruct the server to listen on the RDMA transport: - > echo rdma 2050 > /proc/fs/nfsd/portlist + $ echo rdma 2050 > /proc/fs/nfsd/portlist - On the client system - If the NFS/RDMA client was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in kernel config), - load the RDMA client module: + If the NFS/RDMA client was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in + kernel config), load the RDMA client module: - > modprobe xprtrdma.ko + $ modprobe xprtrdma.ko - Regardless of how the client was built (module or built-in), issue the mount.nfs command: + Regardless of how the client was built (module or built-in), use this + command to mount the NFS/RDMA server: - > /path/to/your/mount.nfs :/ /mnt -i -o rdma,port=2050 + $ mount -o rdma,port=2050 :/ /mnt - To verify that the mount is using RDMA, run "cat /proc/mounts" and check the - "proto" field for the given mount. + To verify that the mount is using RDMA, run "cat /proc/mounts" and check + the "proto" field for the given mount. Congratulations! You're using NFS/RDMA! diff --git a/Documentation/filesystems/sysfs.txt b/Documentation/filesystems/sysfs.txt index 7f27b8f840d06210df1511db6ef603cedb8f0e58..9e9c348275a96fd362acf9916d2789be8a08c4c8 100644 --- a/Documentation/filesystems/sysfs.txt +++ b/Documentation/filesystems/sysfs.txt @@ -248,6 +248,7 @@ The top level sysfs directory looks like: block/ bus/ class/ +dev/ devices/ firmware/ net/ @@ -274,6 +275,11 @@ fs/ contains a directory for some filesystems. Currently each filesystem wanting to export attributes must create its own hierarchy below fs/ (see ./fuse.txt for an example). +dev/ contains two directories char/ and block/. Inside these two +directories there are symlinks named :. These symlinks +point to the sysfs directory for the given device. /sys/dev provides a +quick way to lookup the sysfs interface for a device from the result of +a stat(2) operation. More information can driver-model specific features can be found in Documentation/driver-model/. diff --git a/Documentation/ia64/paravirt_ops.txt b/Documentation/ia64/paravirt_ops.txt new file mode 100644 index 0000000000000000000000000000000000000000..39ded02ec33fc8a06c3e09d34ad1760d9e417b08 --- /dev/null +++ b/Documentation/ia64/paravirt_ops.txt @@ -0,0 +1,137 @@ +Paravirt_ops on IA64 +==================== + 21 May 2008, Isaku Yamahata + + +Introduction +------------ +The aim of this documentation is to help with maintainability and/or to +encourage people to use paravirt_ops/IA64. + +paravirt_ops (pv_ops in short) is a way for virtualization support of +Linux kernel on x86. Several ways for virtualization support were +proposed, paravirt_ops is the winner. +On the other hand, now there are also several IA64 virtualization +technologies like kvm/IA64, xen/IA64 and many other academic IA64 +hypervisors so that it is good to add generic virtualization +infrastructure on Linux/IA64. + + +What is paravirt_ops? +--------------------- +It has been developed on x86 as virtualization support via API, not ABI. +It allows each hypervisor to override operations which are important for +hypervisors at API level. And it allows a single kernel binary to run on +all supported execution environments including native machine. +Essentially paravirt_ops is a set of function pointers which represent +operations corresponding to low level sensitive instructions and high +level functionalities in various area. But one significant difference +from usual function pointer table is that it allows optimization with +binary patch. It is because some of these operations are very +performance sensitive and indirect call overhead is not negligible. +With binary patch, indirect C function call can be transformed into +direct C function call or in-place execution to eliminate the overhead. + +Thus, operations of paravirt_ops are classified into three categories. +- simple indirect call + These operations correspond to high level functionality so that the + overhead of indirect call isn't very important. + +- indirect call which allows optimization with binary patch + Usually these operations correspond to low level instructions. They + are called frequently and performance critical. So the overhead is + very important. + +- a set of macros for hand written assembly code + Hand written assembly codes (.S files) also need paravirtualization + because they include sensitive instructions or some of code paths in + them are very performance critical. + + +The relation to the IA64 machine vector +--------------------------------------- +Linux/IA64 has the IA64 machine vector functionality which allows the +kernel to switch implementations (e.g. initialization, ipi, dma api...) +depending on executing platform. +We can replace some implementations very easily defining a new machine +vector. Thus another approach for virtualization support would be +enhancing the machine vector functionality. +But paravirt_ops approach was taken because +- virtualization support needs wider support than machine vector does. + e.g. low level instruction paravirtualization. It must be + initialized very early before platform detection. + +- virtualization support needs more functionality like binary patch. + Probably the calling overhead might not be very large compared to the + emulation overhead of virtualization. However in the native case, the + overhead should be eliminated completely. + A single kernel binary should run on each environment including native, + and the overhead of paravirt_ops on native environment should be as + small as possible. + +- for full virtualization technology, e.g. KVM/IA64 or + Xen/IA64 HVM domain, the result would be + (the emulated platform machine vector. probably dig) + (pv_ops). + This means that the virtualization support layer should be under + the machine vector layer. + +Possibly it might be better to move some function pointers from +paravirt_ops to machine vector. In fact, Xen domU case utilizes both +pv_ops and machine vector. + + +IA64 paravirt_ops +----------------- +In this section, the concrete paravirt_ops will be discussed. +Because of the architecture difference between ia64 and x86, the +resulting set of functions is very different from x86 pv_ops. + +- C function pointer tables +They are not very performance critical so that simple C indirect +function call is acceptable. The following structures are defined at +this moment. For details see linux/include/asm-ia64/paravirt.h + - struct pv_info + This structure describes the execution environment. + - struct pv_init_ops + This structure describes the various initialization hooks. + - struct pv_iosapic_ops + This structure describes hooks to iosapic operations. + - struct pv_irq_ops + This structure describes hooks to irq related operations + - struct pv_time_op + This structure describes hooks to steal time accounting. + +- a set of indirect calls which need optimization +Currently this class of functions correspond to a subset of IA64 +intrinsics. At this moment the optimization with binary patch isn't +implemented yet. +struct pv_cpu_op is defined. For details see +linux/include/asm-ia64/paravirt_privop.h +Mostly they correspond to ia64 intrinsics 1-to-1. +Caveat: Now they are defined as C indirect function pointers, but in +order to support binary patch optimization, they will be changed +using GCC extended inline assembly code. + +- a set of macros for hand written assembly code (.S files) +For maintenance purpose, the taken approach for .S files is single +source code and compile multiple times with different macros definitions. +Each pv_ops instance must define those macros to compile. +The important thing here is that sensitive, but non-privileged +instructions must be paravirtualized and that some privileged +instructions also need paravirtualization for reasonable performance. +Developers who modify .S files must be aware of that. At this moment +an easy checker is implemented to detect paravirtualization breakage. +But it doesn't cover all the cases. + +Sometimes this set of macros is called pv_cpu_asm_op. But there is no +corresponding structure in the source code. +Those macros mostly 1:1 correspond to a subset of privileged +instructions. See linux/include/asm-ia64/native/inst.h. +And some functions written in assembly also need to be overrided so +that each pv_ops instance have to define some macros. Again see +linux/include/asm-ia64/native/inst.h. + + +Those structures must be initialized very early before start_kernel. +Probably initialized in head.S using multi entry point or some other trick. +For native case implementation see linux/arch/ia64/kernel/paravirt.c. diff --git a/Documentation/input/gameport-programming.txt b/Documentation/input/gameport-programming.txt index 14e0a8b70225f5cce93c2cf8bdb85a0aa984b8ee..03a74fc3b49679c0326a59faa43b2e423751e0c5 100644 --- a/Documentation/input/gameport-programming.txt +++ b/Documentation/input/gameport-programming.txt @@ -1,5 +1,3 @@ -$Id: gameport-programming.txt,v 1.3 2001/04/24 13:51:37 vojtech Exp $ - Programming gameport drivers ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/Documentation/input/input.txt b/Documentation/input/input.txt index ff8cea0225f90bdce6c28adc5b3af5816f9eea44..686ee9932dffd86fa78775859a97815ecc514be2 100644 --- a/Documentation/input/input.txt +++ b/Documentation/input/input.txt @@ -1,7 +1,6 @@ Linux Input drivers v1.0 (c) 1999-2001 Vojtech Pavlik Sponsored by SuSE - $Id: input.txt,v 1.8 2002/05/29 03:15:01 bradleym Exp $ ---------------------------------------------------------------------------- 0. Disclaimer diff --git a/Documentation/input/joystick-api.txt b/Documentation/input/joystick-api.txt index acbd32b88454dc5c48c7a10bb112757910d3c7e5..c507330740cd4fb14e37e617d01d95262057d129 100644 --- a/Documentation/input/joystick-api.txt +++ b/Documentation/input/joystick-api.txt @@ -5,8 +5,6 @@ 7 Aug 1998 - $Id: joystick-api.txt,v 1.2 2001/05/08 21:21:23 vojtech Exp $ - 1. Initialization ~~~~~~~~~~~~~~~~~ diff --git a/Documentation/input/joystick-parport.txt b/Documentation/input/joystick-parport.txt index ede5f33daad3dbc0f7346654a348fd4eb28d7387..1c856f32ff2c3bc880746533c149ec6cc146806f 100644 --- a/Documentation/input/joystick-parport.txt +++ b/Documentation/input/joystick-parport.txt @@ -2,7 +2,6 @@ (c) 1998-2000 Vojtech Pavlik (c) 1998 Andree Borrmann Sponsored by SuSE - $Id: joystick-parport.txt,v 1.6 2001/09/25 09:31:32 vojtech Exp $ ---------------------------------------------------------------------------- 0. Disclaimer diff --git a/Documentation/input/joystick.txt b/Documentation/input/joystick.txt index 389de9bd987894017bb25ddc95b911c98526bdcb..154d767b2acbc4be459044d78e7f63b975faa207 100644 --- a/Documentation/input/joystick.txt +++ b/Documentation/input/joystick.txt @@ -1,7 +1,6 @@ Linux Joystick driver v2.0.0 (c) 1996-2000 Vojtech Pavlik Sponsored by SuSE - $Id: joystick.txt,v 1.12 2002/03/03 12:13:07 jdeneux Exp $ ---------------------------------------------------------------------------- 0. Disclaimer diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 09ad7450647bc81dff32a3eaf7ea3c0858f4a896..47e7d8794fc6fbad46c37edbc1d9a5bc4f6c3e82 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1206,7 +1206,7 @@ and is between 256 and 4096 characters. It is defined in the file or memmap=0x10000$0x18690000 - memtest= [KNL,X86_64] Enable memtest + memtest= [KNL,X86] Enable memtest Format: range: 0,4 : pattern number default : 0 @@ -1279,6 +1279,13 @@ and is between 256 and 4096 characters. It is defined in the file This usage is only documented in each driver source file if at all. + nf_conntrack.acct= + [NETFILTER] Enable connection tracking flow accounting + 0 to disable accounting + 1 to enable accounting + Default value depends on CONFIG_NF_CT_ACCT that is + going to be removed in 2.6.29. + nfsaddrs= [NFS] See Documentation/filesystems/nfsroot.txt. @@ -2027,6 +2034,9 @@ and is between 256 and 4096 characters. It is defined in the file snd-ymfpci= [HW,ALSA] + softlockup_panic= + [KNL] Should the soft-lockup detector generate panics. + sonypi.*= [HW] Sony Programmable I/O Control Device driver See Documentation/sonypi.txt @@ -2158,6 +2168,10 @@ and is between 256 and 4096 characters. It is defined in the file Note that genuine overcurrent events won't be reported either. + unknown_nmi_panic + [X86-32,X86-64] + Set unknown_nmi_panic=1 early on boot. + usbcore.autosuspend= [USB] The autosuspend time delay (in seconds) used for newly-detected USB devices (default 2). This diff --git a/Documentation/md.txt b/Documentation/md.txt index a8b430627473aa243995ab6f6e173b9cf1ff819e..1da9d1b1793f3436b3561a48de7fbcd8c893e74e 100644 --- a/Documentation/md.txt +++ b/Documentation/md.txt @@ -236,6 +236,11 @@ All md devices contain: writing the word for the desired state, however some states cannot be explicitly set, and some transitions are not allowed. + Select/poll works on this file. All changes except between + active_idle and active (which can be frequent and are not + very interesting) are notified. active->active_idle is + reported if the metadata is externally managed. + clear No devices, no size, no level Writing is equivalent to STOP_ARRAY ioctl @@ -292,6 +297,10 @@ Each directory contains: writemostly - device will only be subject to read requests if there are no other options. This applies only to raid1 arrays. + blocked - device has failed, metadata is "external", + and the failure hasn't been acknowledged yet. + Writes that would write to this device if + it were not faulty are blocked. spare - device is working, but not a full member. This includes spares that are in the process of being recovered to @@ -301,6 +310,12 @@ Each directory contains: Writing "remove" removes the device from the array. Writing "writemostly" sets the writemostly flag. Writing "-writemostly" clears the writemostly flag. + Writing "blocked" sets the "blocked" flag. + Writing "-blocked" clear the "blocked" flag and allows writes + to complete. + + This file responds to select/poll. Any change to 'faulty' + or 'blocked' causes an event. errors An approximate count of read errors that have been detected on @@ -332,7 +347,7 @@ Each directory contains: for storage of data. This will normally be the same as the component_size. This can be written while assembling an array. If a value less than the current component_size is - written, component_size will be reduced to this value. + written, it will be rejected. An active md device will also contain and entry for each active device @@ -381,6 +396,19 @@ also have 'check' and 'repair' will start the appropriate process providing the current state is 'idle'. + This file responds to select/poll. Any important change in the value + triggers a poll event. Sometimes the value will briefly be + "recover" if a recovery seems to be needed, but cannot be + achieved. In that case, the transition to "recover" isn't + notified, but the transition away is. + + degraded + This contains a count of the number of devices by which the + arrays is degraded. So an optimal array with show '0'. A + single failed/missing drive will show '1', etc. + This file responds to select/poll, any increase or decrease + in the count of missing devices will trigger an event. + mismatch_count When performing 'check' and 'repair', and possibly when performing 'resync', md will count the number of errors that are diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt index a0cda062bc33b6e1de3dd2bb60d5ff62bdf97cd4..7fa7fe71d7a8310c00c8cade78688ece2d5250c4 100644 --- a/Documentation/networking/bonding.txt +++ b/Documentation/networking/bonding.txt @@ -289,35 +289,73 @@ downdelay fail_over_mac Specifies whether active-backup mode should set all slaves to - the same MAC address (the traditional behavior), or, when - enabled, change the bond's MAC address when changing the - active interface (i.e., fail over the MAC address itself). - - Fail over MAC is useful for devices that cannot ever alter - their MAC address, or for devices that refuse incoming - broadcasts with their own source MAC (which interferes with - the ARP monitor). - - The down side of fail over MAC is that every device on the - network must be updated via gratuitous ARP, vs. just updating - a switch or set of switches (which often takes place for any - traffic, not just ARP traffic, if the switch snoops incoming - traffic to update its tables) for the traditional method. If - the gratuitous ARP is lost, communication may be disrupted. - - When fail over MAC is used in conjuction with the mii monitor, - devices which assert link up prior to being able to actually - transmit and receive are particularly susecptible to loss of - the gratuitous ARP, and an appropriate updelay setting may be - required. - - A value of 0 disables fail over MAC, and is the default. A - value of 1 enables fail over MAC. This option is enabled - automatically if the first slave added cannot change its MAC - address. This option may be modified via sysfs only when no - slaves are present in the bond. - - This option was added in bonding version 3.2.0. + the same MAC address at enslavement (the traditional + behavior), or, when enabled, perform special handling of the + bond's MAC address in accordance with the selected policy. + + Possible values are: + + none or 0 + + This setting disables fail_over_mac, and causes + bonding to set all slaves of an active-backup bond to + the same MAC address at enslavement time. This is the + default. + + active or 1 + + The "active" fail_over_mac policy indicates that the + MAC address of the bond should always be the MAC + address of the currently active slave. The MAC + address of the slaves is not changed; instead, the MAC + address of the bond changes during a failover. + + This policy is useful for devices that cannot ever + alter their MAC address, or for devices that refuse + incoming broadcasts with their own source MAC (which + interferes with the ARP monitor). + + The down side of this policy is that every device on + the network must be updated via gratuitous ARP, + vs. just updating a switch or set of switches (which + often takes place for any traffic, not just ARP + traffic, if the switch snoops incoming traffic to + update its tables) for the traditional method. If the + gratuitous ARP is lost, communication may be + disrupted. + + When this policy is used in conjuction with the mii + monitor, devices which assert link up prior to being + able to actually transmit and receive are particularly + susecptible to loss of the gratuitous ARP, and an + appropriate updelay setting may be required. + + follow or 2 + + The "follow" fail_over_mac policy causes the MAC + address of the bond to be selected normally (normally + the MAC address of the first slave added to the bond). + However, the second and subsequent slaves are not set + to this MAC address while they are in a backup role; a + slave is programmed with the bond's MAC address at + failover time (and the formerly active slave receives + the newly active slave's MAC address). + + This policy is useful for multiport devices that + either become confused or incur a performance penalty + when multiple ports are programmed with the same MAC + address. + + + The default policy is none, unless the first slave cannot + change its MAC address, in which case the active policy is + selected by default. + + This option may be modified via sysfs only when no slaves are + present in the bond. + + This option was added in bonding version 3.2.0. The "follow" + policy was added in bonding version 3.3.0. lacp_rate @@ -338,7 +376,8 @@ max_bonds Specifies the number of bonding devices to create for this instance of the bonding driver. E.g., if max_bonds is 3, and the bonding driver is not already loaded, then bond0, bond1 - and bond2 will be created. The default value is 1. + and bond2 will be created. The default value is 1. Specifying + a value of 0 will load bonding, but will not create any devices. miimon @@ -501,6 +540,17 @@ mode swapped with the new curr_active_slave that was chosen. +num_grat_arp + + Specifies the number of gratuitous ARPs to be issued after a + failover event. One gratuitous ARP is issued immediately after + the failover, subsequent ARPs are sent at a rate of one per link + monitor interval (arp_interval or miimon, whichever is active). + + The valid range is 0 - 255; the default value is 1. This option + affects only the active-backup mode. This option was added for + bonding version 3.3.0. + primary A string (eth0, eth2, etc) specifying which slave is the diff --git a/Documentation/networking/dm9000.txt b/Documentation/networking/dm9000.txt new file mode 100644 index 0000000000000000000000000000000000000000..65df3dea55613bd581dd787305054b7062657507 --- /dev/null +++ b/Documentation/networking/dm9000.txt @@ -0,0 +1,167 @@ +DM9000 Network driver +===================== + +Copyright 2008 Simtec Electronics, + Ben Dooks + + +Introduction +------------ + +This file describes how to use the DM9000 platform-device based network driver +that is contained in the files drivers/net/dm9000.c and drivers/net/dm9000.h. + +The driver supports three DM9000 variants, the DM9000E which is the first chip +supported as well as the newer DM9000A and DM9000B devices. It is currently +maintained and tested by Ben Dooks, who should be CC: to any patches for this +driver. + + +Defining the platform device +---------------------------- + +The minimum set of resources attached to the platform device are as follows: + + 1) The physical address of the address register + 2) The physical address of the data register + 3) The IRQ line the device's interrupt pin is connected to. + +These resources should be specified in that order, as the ordering of the +two address regions is important (the driver expects these to be address +and then data). + +An example from arch/arm/mach-s3c2410/mach-bast.c is: + +static struct resource bast_dm9k_resource[] = { + [0] = { + .start = S3C2410_CS5 + BAST_PA_DM9000, + .end = S3C2410_CS5 + BAST_PA_DM9000 + 3, + .flags = IORESOURCE_MEM, + }, + [1] = { + .start = S3C2410_CS5 + BAST_PA_DM9000 + 0x40, + .end = S3C2410_CS5 + BAST_PA_DM9000 + 0x40 + 0x3f, + .flags = IORESOURCE_MEM, + }, + [2] = { + .start = IRQ_DM9000, + .end = IRQ_DM9000, + .flags = IORESOURCE_IRQ | IORESOURCE_IRQ_HIGHLEVEL, + } +}; + +static struct platform_device bast_device_dm9k = { + .name = "dm9000", + .id = 0, + .num_resources = ARRAY_SIZE(bast_dm9k_resource), + .resource = bast_dm9k_resource, +}; + +Note the setting of the IRQ trigger flag in bast_dm9k_resource[2].flags, +as this will generate a warning if it is not present. The trigger from +the flags field will be passed to request_irq() when registering the IRQ +handler to ensure that the IRQ is setup correctly. + +This shows a typical platform device, without the optional configuration +platform data supplied. The next example uses the same resources, but adds +the optional platform data to pass extra configuration data: + +static struct dm9000_plat_data bast_dm9k_platdata = { + .flags = DM9000_PLATF_16BITONLY, +}; + +static struct platform_device bast_device_dm9k = { + .name = "dm9000", + .id = 0, + .num_resources = ARRAY_SIZE(bast_dm9k_resource), + .resource = bast_dm9k_resource, + .dev = { + .platform_data = &bast_dm9k_platdata, + } +}; + +The platform data is defined in include/linux/dm9000.h and described below. + + +Platform data +------------- + +Extra platform data for the DM9000 can describe the IO bus width to the +device, whether or not an external PHY is attached to the device and +the availability of an external configuration EEPROM. + +The flags for the platform data .flags field are as follows: + +DM9000_PLATF_8BITONLY + + The IO should be done with 8bit operations. + +DM9000_PLATF_16BITONLY + + The IO should be done with 16bit operations. + +DM9000_PLATF_32BITONLY + + The IO should be done with 32bit operations. + +DM9000_PLATF_EXT_PHY + + The chip is connected to an external PHY. + +DM9000_PLATF_NO_EEPROM + + This can be used to signify that the board does not have an + EEPROM, or that the EEPROM should be hidden from the user. + +DM9000_PLATF_SIMPLE_PHY + + Switch to using the simpler PHY polling method which does not + try and read the MII PHY state regularly. This is only available + when using the internal PHY. See the section on link state polling + for more information. + + The config symbol DM9000_FORCE_SIMPLE_PHY_POLL, Kconfig entry + "Force simple NSR based PHY polling" allows this flag to be + forced on at build time. + + +PHY Link state polling +---------------------- + +The driver keeps track of the link state and informs the network core +about link (carrier) availablilty. This is managed by several methods +depending on the version of the chip and on which PHY is being used. + +For the internal PHY, the original (and currently default) method is +to read the MII state, either when the status changes if we have the +necessary interrupt support in the chip or every two seconds via a +periodic timer. + +To reduce the overhead for the internal PHY, there is now the option +of using the DM9000_FORCE_SIMPLE_PHY_POLL config, or DM9000_PLATF_SIMPLE_PHY +platform data option to read the summary information without the +expensive MII accesses. This method is faster, but does not print +as much information. + +When using an external PHY, the driver currently has to poll the MII +link status as there is no method for getting an interrupt on link change. + + +DM9000A / DM9000B +----------------- + +These chips are functionally similar to the DM9000E and are supported easily +by the same driver. The features are: + + 1) Interrupt on internal PHY state change. This means that the periodic + polling of the PHY status may be disabled on these devices when using + the internal PHY. + + 2) TCP/UDP checksum offloading, which the driver does not currently support. + + +ethtool +------- + +The driver supports the ethtool interface for access to the driver +state information, the PHY state and the EEPROM. diff --git a/Documentation/networking/e1000.txt b/Documentation/networking/e1000.txt index 61b171cf5313c9b94c3e4fe185a9ff9edf9b5cbc..2df71861e578b6ebcb2b94371d4a32fbdf7fca61 100644 --- a/Documentation/networking/e1000.txt +++ b/Documentation/networking/e1000.txt @@ -513,21 +513,11 @@ Additional Configurations Intel(R) PRO/1000 PT Dual Port Server Connection Intel(R) PRO/1000 PT Dual Port Server Adapter Intel(R) PRO/1000 PF Dual Port Server Adapter - Intel(R) PRO/1000 PT Quad Port Server Adapter + Intel(R) PRO/1000 PT Quad Port Server Adapter NAPI ---- - NAPI (Rx polling mode) is supported in the e1000 driver. NAPI is enabled - or disabled based on the configuration of the kernel. To override - the default, use the following compile-time flags. - - To enable NAPI, compile the driver module, passing in a configuration option: - - make CFLAGS_EXTRA=-DE1000_NAPI install - - To disable NAPI, compile the driver module, passing in a configuration option: - - make CFLAGS_EXTRA=-DE1000_NO_NAPI install + NAPI (Rx polling mode) is enabled in the e1000 driver. See www.cyberus.ca/~hadi/usenix-paper.tgz for more information on NAPI. diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt index 946b66e1b65214f4becc5d2563f73180fe331837..d84932650fd3e926cc11d5fa0bcaee6cc8a8cd33 100644 --- a/Documentation/networking/ip-sysctl.txt +++ b/Documentation/networking/ip-sysctl.txt @@ -551,8 +551,9 @@ icmp_echo_ignore_broadcasts - BOOLEAN icmp_ratelimit - INTEGER Limit the maximal rates for sending ICMP packets whose type matches icmp_ratemask (see below) to specific targets. - 0 to disable any limiting, otherwise the maximal rate in jiffies(1) - Default: 100 + 0 to disable any limiting, + otherwise the minimal space between responses in milliseconds. + Default: 1000 icmp_ratemask - INTEGER Mask made of ICMP types for which rates are being limited. @@ -1023,11 +1024,23 @@ max_addresses - INTEGER autoconfigured addresses. Default: 16 +disable_ipv6 - BOOLEAN + Disable IPv6 operation. + Default: FALSE (enable IPv6 operation) + +accept_dad - INTEGER + Whether to accept DAD (Duplicate Address Detection). + 0: Disable DAD + 1: Enable DAD (default) + 2: Enable DAD, and disable IPv6 operation if MAC-based duplicate + link-local address has been found. + icmp/*: ratelimit - INTEGER Limit the maximal rates for sending ICMPv6 packets. - 0 to disable any limiting, otherwise the maximal rate in jiffies(1) - Default: 100 + 0 to disable any limiting, + otherwise the minimal space between responses in milliseconds. + Default: 1000 IPv6 Update by: diff --git a/Documentation/networking/ixgb.txt b/Documentation/networking/ixgb.txt index 7c98277777ebcc14ee7c30991873cc6c92296d28..a0d0ffb5e584c1ae1b24a49552bfaef4c3b4684a 100644 --- a/Documentation/networking/ixgb.txt +++ b/Documentation/networking/ixgb.txt @@ -1,7 +1,7 @@ -Linux* Base Driver for the Intel(R) PRO/10GbE Family of Adapters -================================================================ +Linux Base Driver for 10 Gigabit Intel(R) Network Connection +============================================================= -November 17, 2004 +October 9, 2007 Contents @@ -9,94 +9,151 @@ Contents - In This Release - Identifying Your Adapter +- Building and Installation - Command Line Parameters - Improving Performance +- Additional Configurations +- Known Issues/Troubleshooting - Support + In This Release =============== -This file describes the Linux* Base Driver for the Intel(R) PRO/10GbE Family -of Adapters, version 1.0.x. +This file describes the ixgb Linux Base Driver for the 10 Gigabit Intel(R) +Network Connection. This driver includes support for Itanium(R)2-based +systems. + +For questions related to hardware requirements, refer to the documentation +supplied with your 10 Gigabit adapter. All hardware requirements listed apply +to use with Linux. + +The following features are available in this kernel: + - Native VLANs + - Channel Bonding (teaming) + - SNMP + +Channel Bonding documentation can be found in the Linux kernel source: +/Documentation/networking/bonding.txt + +The driver information previously displayed in the /proc filesystem is not +supported in this release. Alternatively, you can use ethtool (version 1.6 +or later), lspci, and ifconfig to obtain the same information. + +Instructions on updating ethtool can be found in the section "Additional +Configurations" later in this document. -For questions related to hardware requirements, refer to the documentation -supplied with your Intel PRO/10GbE adapter. All hardware requirements listed -apply to use with Linux. Identifying Your Adapter ======================== -To verify your Intel adapter is supported, find the board ID number on the -adapter. Look for a label that has a barcode and a number in the format -A12345-001. +The following Intel network adapters are compatible with the drivers in this +release: + +Controller Adapter Name Physical Layer +---------- ------------ -------------- +82597EX Intel(R) PRO/10GbE LR/SR/CX4 10G Base-LR (1310 nm optical fiber) + Server Adapters 10G Base-SR (850 nm optical fiber) + 10G Base-CX4(twin-axial copper cabling) + +For more information on how to identify your adapter, go to the Adapter & +Driver ID Guide at: + + http://support.intel.com/support/network/sb/CS-012904.htm + + +Building and Installation +========================= + +select m for "Intel(R) PRO/10GbE support" located at: + Location: + -> Device Drivers + -> Network device support (NETDEVICES [=y]) + -> Ethernet (10000 Mbit) (NETDEV_10000 [=y]) +1. make modules && make modules_install + +2. Load the module: + +    modprobe ixgb = + + The insmod command can be used if the full + path to the driver module is specified. For example: + + insmod /lib/modules//kernel/drivers/net/ixgb/ixgb.ko + + With 2.6 based kernels also make sure that older ixgb drivers are + removed from the kernel, before loading the new module: -Use the above information and the Adapter & Driver ID Guide at: + rmmod ixgb; modprobe ixgb - http://support.intel.com/support/network/adapter/pro100/21397.htm +3. Assign an IP address to the interface by entering the following, where + x is the interface number: -For the latest Intel network drivers for Linux, go to: + ifconfig ethx + +4. Verify that the interface works. Enter the following, where + is the IP address for another machine on the same subnet as the interface + that is being tested: + + ping - http://downloadfinder.intel.com/scripts-df/support_intel.asp Command Line Parameters ======================= -If the driver is built as a module, the following optional parameters are -used by entering them on the command line with the modprobe or insmod command -using this syntax: +If the driver is built as a module, the following optional parameters are +used by entering them on the command line with the modprobe command using +this syntax: modprobe ixgb [