-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
USB DAC not working on r34 #4
Comments
DerRomtester
referenced
this issue
in DerRomtester/one_plus_one
Jan 8, 2015
net/ping: handle protocol mismatching scenario ping_lookup() may return a wrong sock if sk_buff's and sock's protocols dont' match. For example, sk_buff's protocol is ETH_P_IPV6, but sock's sk_family is AF_INET, in that case, if sk->sk_bound_dev_if is zero, a wrong sock will be returned. the fix is to "continue" the searching, if no matching, return NULL. [cherry-pick of net 91a0b603469069cdcce4d572b7525ffc9fd352a6] Bug: 18512516 Change-Id: I520223ce53c0d4e155c37d6b65a03489cc7fd494 Cc: "David S. Miller" <[email protected]> Cc: Alexey Kuznetsov <[email protected]> Cc: James Morris <[email protected]> Cc: Hideaki YOSHIFUJI <[email protected]> Cc: Patrick McHardy <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Jane Zhou <[email protected]> Signed-off-by: Yiwei Zhao <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Lorenzo Colitti <[email protected]> tcp: must unclone packets before mangling them [ Upstream commit c52e2421f7368fd36cbe330d2cf41b10452e39a9 ] TCP stack should make sure it owns skbs before mangling them. We had various crashes using bnx2x, and it turned out gso_size was cleared right before bnx2x driver was populating TC descriptor of the _previous_ packet send. TCP stack can sometime retransmit packets that are still in Qdisc. Of course we could make bnx2x driver more robust (using ACCESS_ONCE(shinfo->gso_size) for example), but the bug is TCP stack. We have identified two points where skb_unclone() was needed. This patch adds a WARN_ON_ONCE() to warn us if we missed another fix of this kind. Kudos to Neal for finding the root cause of this bug. Its visible using small MSS. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: Neal Cardwell <[email protected]> Cc: Yuchung Cheng <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> tcp: do not forget FIN in tcp_shifted_skb() [ Upstream commit 5e8a402f831dbe7ee831340a91439e46f0d38acd ] Yuchung found following problem : There are bugs in the SACK processing code, merging part in tcp_shift_skb_data(), that incorrectly resets or ignores the sacked skbs FIN flag. When a receiver first SACK the FIN sequence, and later throw away ofo queue (e.g., sack-reneging), the sender will stop retransmitting the FIN flag, and hangs forever. Following packetdrill test can be used to reproduce the bug. $ cat sack-merge-bug.pkt `sysctl -q net.ipv4.tcp_fack=0` // Establish a connection and send 10 MSS. 0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 +.000 bind(3, ..., ...) = 0 +.000 listen(3, 1) = 0 +.050 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7> +.000 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 6> +.001 < . 1:1(0) ack 1 win 1024 +.000 accept(3, ..., ...) = 4 +.100 write(4, ..., 12000) = 12000 +.000 shutdown(4, SHUT_WR) = 0 +.000 > . 1:10001(10000) ack 1 +.050 < . 1:1(0) ack 2001 win 257 +.000 > FP. 10001:12001(2000) ack 1 +.050 < . 1:1(0) ack 2001 win 257 <sack 10001:11001,nop,nop> +.050 < . 1:1(0) ack 2001 win 257 <sack 10001:12002,nop,nop> // SACK reneg +.050 < . 1:1(0) ack 12001 win 257 +0 %{ print "unacked: ",tcpi_unacked }% +5 %{ print "" }% First, a typo inverted left/right of one OR operation, then code forgot to advance end_seq if the merged skb carried FIN. Bug was added in 2.6.29 by commit 832d11c5cd076ab ("tcp: Try to restore large SKBs while SACK processing") Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: Yuchung Cheng <[email protected]> Acked-by: Neal Cardwell <[email protected]> Cc: Ilpo Järvinen <[email protected]> Acked-by: Ilpo Järvinen <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: do not call sock_put() on TIMEWAIT sockets [ Upstream commit 80ad1d61e72d626e30ebe8529a0455e660ca4693 ] commit 3ab5aee7fe84 ("net: Convert TCP & DCCP hash tables to use RCU / hlist_nulls") incorrectly used sock_put() on TIMEWAIT sockets. We should instead use inet_twsk_put() Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: mv643xx_eth: update statistics timer from timer context only [ Upstream commit 041b4ddb84989f06ff1df0ca869b950f1ee3cb1c ] Each port driver installs a periodic timer to update port statistics by calling mib_counters_update. As mib_counters_update is also called from non-timer context, we should not reschedule the timer there but rather move it to timer-only context. Signed-off-by: Sebastian Hesselbarth <[email protected]> Acked-by: Jason Cooper <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: mv643xx_eth: fix orphaned statistics timer crash [ Upstream commit f564412c935111c583b787bcc18157377b208e2e ] The periodic statistics timer gets started at port _probe() time, but is stopped on _stop() only. In a modular environment, this can cause the timer to access already deallocated memory, if the module is unloaded without starting the eth device. To fix this, we add the timer right before the port is started, instead of at _probe() time. Signed-off-by: Sebastian Hesselbarth <[email protected]> Acked-by: Jason Cooper <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: heap overflow in __audit_sockaddr() [ Upstream commit 1661bf364ae9c506bc8795fef70d1532931be1e8 ] We need to cap ->msg_namelen or it leads to a buffer overflow when we to the memcpy() in __audit_sockaddr(). It requires CAP_AUDIT_CONTROL to exploit this bug. The call tree is: ___sys_recvmsg() move_addr_to_user() audit_sockaddr() __audit_sockaddr() Reported-by: Jüri Aedla <[email protected]> Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> proc connector: fix info leaks [ Upstream commit e727ca82e0e9616ab4844301e6bae60ca7327682 ] Initialize event_data for all possible message types to prevent leaking kernel stack contents to userland (up to 20 bytes). Also set the flags member of the connector message to 0 to prevent leaking two more stack bytes this way. Signed-off-by: Mathias Krause <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ipv4: fix ineffective source address selection [ Upstream commit 0a7e22609067ff524fc7bbd45c6951dd08561667 ] When sending out multicast messages, the source address in inet->mc_addr is ignored and rewritten by an autoselected one. This is caused by a typo in commit 813b3b5db831 ("ipv4: Use caller's on-stack flowi as-is in output route lookups"). Signed-off-by: Jiri Benc <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> can: dev: fix nlmsg size calculation in can_get_size() [ Upstream commit fe119a05f8ca481623a8d02efcc984332e612528 ] This patch fixes the calculation of the nlmsg size, by adding the missing nla_total_size(). Signed-off-by: Marc Kleine-Budde <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ipv6: restrict neighbor entry creation to output flow This patch is based on 3.2.y branch, the one used by reporter. Please let me know if it should be different. Thanks. The patch which introduced the regression was applied on stables: 3.0.64 3.4.31 3.7.8 3.2.39 The patch which introduced the regression was for stable trees only. ---8<--- Commit 0d6a77079c475033cb622c07c5a880b392ef664e "ipv6: do not create neighbor entries for local delivery" introduced a regression on which routes to local delivery would not work anymore. Like this: $ ip -6 route add local 2001::/64 dev lo $ ping6 -c1 2001::9 PING 2001::9(2001::9) 56 data bytes ping: sendmsg: Invalid argument As this is a local delivery, that commit would not allow the creation of a neighbor entry and thus the packet cannot be sent. But as TPROXY scenario actually needs to avoid the neighbor entry creation only for input flow, this patch now limits previous patch to input flow, keeping output as before that patch. Reported-by: Debabrata Banerjee <[email protected]> Signed-off-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: Jiri Pirko <[email protected]> Acked-by: Hannes Frederic Sowa <[email protected]> CC: Hannes Frederic Sowa <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> bridge: Correctly clamp MAX forward_delay when enabling STP [ Upstream commit 4b6c7879d84ad06a2ac5b964808ed599187a188d ] Commit be4f154d5ef0ca147ab6bcd38857a774133f5450 bridge: Clamp forward_delay when enabling STP had a typo when attempting to clamp maximum forward delay. It is possible to set bridge_forward_delay to be higher then permitted maximum when STP is off. When turning STP on, the higher then allowed delay has to be clamed down to max value. Signed-off-by: Vlad Yasevich <[email protected]> CC: Herbert Xu <[email protected]> CC: Stephen Hemminger <[email protected]> Reviewed-by: Veaceslav Falico <[email protected]> Acked-by: Herbert Xu <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: vlan: fix nlmsg size calculation in vlan_get_size() [ Upstream commit c33a39c575068c2ea9bffb22fd6de2df19c74b89 ] This patch fixes the calculation of the nlmsg size, by adding the missing nla_total_size(). Cc: Patrick McHardy <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> l2tp: must disable bh before calling l2tp_xmit_skb() [ Upstream commit 455cc32bf128e114455d11ad919321ab89a2c312 ] François Cachereul made a very nice bug report and suspected the bh_lock_sock() / bh_unlok_sock() pair used in l2tp_xmit_skb() from process context was not good. This problem was added by commit 6af88da14ee284aaad6e4326da09a89191ab6165 ("l2tp: Fix locking in l2tp_core.c"). l2tp_eth_dev_xmit() runs from BH context, so we must disable BH from other l2tp_xmit_skb() users. [ 452.060011] BUG: soft lockup - CPU#1 stuck for 23s! [accel-pppd:6662] [ 452.061757] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core pppoe pppox ppp_generic slhc ipv6 ext3 mbcache jbd virtio_balloon xfs exportfs dm_mod virtio_blk ata_generic virtio_net floppy ata_piix libata virtio_pci virtio_ring virtio [last unloaded: scsi_wait_scan] [ 452.064012] CPU 1 [ 452.080015] BUG: soft lockup - CPU#2 stuck for 23s! [accel-pppd:6643] [ 452.080015] CPU 2 [ 452.080015] [ 452.080015] Pid: 6643, comm: accel-pppd Not tainted 3.2.46.mini #1 Bochs Bochs [ 452.080015] RIP: 0010:[<ffffffff81059f6c>] [<ffffffff81059f6c>] do_raw_spin_lock+0x17/0x1f [ 452.080015] RSP: 0018:ffff88007125fc18 EFLAGS: 00000293 [ 452.080015] RAX: 000000000000aba9 RBX: ffffffff811d0703 RCX: 0000000000000000 [ 452.080015] RDX: 00000000000000ab RSI: ffff8800711f6896 RDI: ffff8800745c8110 [ 452.080015] RBP: ffff88007125fc18 R08: 0000000000000020 R09: 0000000000000000 [ 452.080015] R10: 0000000000000000 R11: 0000000000000280 R12: 0000000000000286 [ 452.080015] R13: 0000000000000020 R14: 0000000000000240 R15: 0000000000000000 [ 452.080015] FS: 00007fdc0cc24700(0000) GS:ffff8800b6f00000(0000) knlGS:0000000000000000 [ 452.080015] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 452.080015] CR2: 00007fdb054899b8 CR3: 0000000074404000 CR4: 00000000000006a0 [ 452.080015] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 452.080015] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 452.080015] Process accel-pppd (pid: 6643, threadinfo ffff88007125e000, task ffff8800b27e6dd0) [ 452.080015] Stack: [ 452.080015] ffff88007125fc28 ffffffff81256559 ffff88007125fc98 ffffffffa01b2bd1 [ 452.080015] ffff88007125fc58 000000000000000c 00000000029490d0 0000009c71dbe25e [ 452.080015] 000000000000005c 000000080000000e 0000000000000000 ffff880071170600 [ 452.080015] Call Trace: [ 452.080015] [<ffffffff81256559>] _raw_spin_lock+0xe/0x10 [ 452.080015] [<ffffffffa01b2bd1>] l2tp_xmit_skb+0x189/0x4ac [l2tp_core] [ 452.080015] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp] [ 452.080015] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24 [ 452.080015] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6 [ 452.080015] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616 [ 452.080015] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c [ 452.080015] [<ffffffff810bbd21>] ? fget_light+0x75/0x89 [ 452.080015] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56 [ 452.080015] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b [ 452.080015] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b [ 452.080015] Code: 81 48 89 e5 72 0c 31 c0 48 81 ff 45 66 25 81 0f 92 c0 5d c3 55 b8 00 01 00 00 48 89 e5 f0 66 0f c1 07 0f b6 d4 38 d0 74 06 f3 90 <8a> 07 eb f6 5d c3 90 90 55 48 89 e5 9c 58 0f 1f 44 00 00 5d c3 [ 452.080015] Call Trace: [ 452.080015] [<ffffffff81256559>] _raw_spin_lock+0xe/0x10 [ 452.080015] [<ffffffffa01b2bd1>] l2tp_xmit_skb+0x189/0x4ac [l2tp_core] [ 452.080015] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp] [ 452.080015] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24 [ 452.080015] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6 [ 452.080015] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616 [ 452.080015] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c [ 452.080015] [<ffffffff810bbd21>] ? fget_light+0x75/0x89 [ 452.080015] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56 [ 452.080015] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b [ 452.080015] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b [ 452.064012] [ 452.064012] Pid: 6662, comm: accel-pppd Not tainted 3.2.46.mini #1 Bochs Bochs [ 452.064012] RIP: 0010:[<ffffffff81059f6e>] [<ffffffff81059f6e>] do_raw_spin_lock+0x19/0x1f [ 452.064012] RSP: 0018:ffff8800b6e83ba0 EFLAGS: 00000297 [ 452.064012] RAX: 000000000000aaa9 RBX: ffff8800b6e83b40 RCX: 0000000000000002 [ 452.064012] RDX: 00000000000000aa RSI: 000000000000000a RDI: ffff8800745c8110 [ 452.064012] RBP: ffff8800b6e83ba0 R08: 000000000000c802 R09: 000000000000001c [ 452.064012] R10: ffff880071096c4e R11: 0000000000000006 R12: ffff8800b6e83b18 [ 452.064012] R13: ffffffff8125d51e R14: ffff8800b6e83ba0 R15: ffff880072a589c0 [ 452.064012] FS: 00007fdc0b81e700(0000) GS:ffff8800b6e80000(0000) knlGS:0000000000000000 [ 452.064012] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 452.064012] CR2: 0000000000625208 CR3: 0000000074404000 CR4: 00000000000006a0 [ 452.064012] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 452.064012] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 452.064012] Process accel-pppd (pid: 6662, threadinfo ffff88007129a000, task ffff8800744f7410) [ 452.064012] Stack: [ 452.064012] ffff8800b6e83bb0 ffffffff81256559 ffff8800b6e83bc0 ffffffff8121c64a [ 452.064012] ffff8800b6e83bf0 ffffffff8121ec7a ffff880072a589c0 ffff880071096c62 [ 452.064012] 0000000000000011 ffffffff81430024 ffff8800b6e83c80 ffffffff8121f276 [ 452.064012] Call Trace: [ 452.064012] <IRQ> [ 452.064012] [<ffffffff81256559>] _raw_spin_lock+0xe/0x10 [ 452.064012] [<ffffffff8121c64a>] spin_lock+0x9/0xb [ 452.064012] [<ffffffff8121ec7a>] udp_queue_rcv_skb+0x186/0x269 [ 452.064012] [<ffffffff8121f276>] __udp4_lib_rcv+0x297/0x4ae [ 452.064012] [<ffffffff8121c178>] ? raw_rcv+0xe9/0xf0 [ 452.064012] [<ffffffff8121f4a7>] udp_rcv+0x1a/0x1c [ 452.064012] [<ffffffff811fe385>] ip_local_deliver_finish+0x12b/0x1a5 [ 452.064012] [<ffffffff811fe54e>] ip_local_deliver+0x53/0x84 [ 452.064012] [<ffffffff811fe1d0>] ip_rcv_finish+0x2bc/0x2f3 [ 452.064012] [<ffffffff811fe78f>] ip_rcv+0x210/0x269 [ 452.064012] [<ffffffff8101911e>] ? kvm_clock_get_cycles+0x9/0xb [ 452.064012] [<ffffffff811d88cd>] __netif_receive_skb+0x3a5/0x3f7 [ 452.064012] [<ffffffff811d8eba>] netif_receive_skb+0x57/0x5e [ 452.064012] [<ffffffff811cf30f>] ? __netdev_alloc_skb+0x1f/0x3b [ 452.064012] [<ffffffffa0049126>] virtnet_poll+0x4ba/0x5a4 [virtio_net] [ 452.064012] [<ffffffff811d9417>] net_rx_action+0x73/0x184 [ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core] [ 452.064012] [<ffffffff810343b9>] __do_softirq+0xc3/0x1a8 [ 452.064012] [<ffffffff81013b56>] ? ack_APIC_irq+0x10/0x12 [ 452.064012] [<ffffffff81256559>] ? _raw_spin_lock+0xe/0x10 [ 452.064012] [<ffffffff8125e0ac>] call_softirq+0x1c/0x26 [ 452.064012] [<ffffffff81003587>] do_softirq+0x45/0x82 [ 452.064012] [<ffffffff81034667>] irq_exit+0x42/0x9c [ 452.064012] [<ffffffff8125e146>] do_IRQ+0x8e/0xa5 [ 452.064012] [<ffffffff8125676e>] common_interrupt+0x6e/0x6e [ 452.064012] <EOI> [ 452.064012] [<ffffffff810b82a1>] ? kfree+0x8a/0xa3 [ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core] [ 452.064012] [<ffffffffa01b2c25>] ? l2tp_xmit_skb+0x1dd/0x4ac [l2tp_core] [ 452.064012] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp] [ 452.064012] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24 [ 452.064012] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6 [ 452.064012] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616 [ 452.064012] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c [ 452.064012] [<ffffffff810bbd21>] ? fget_light+0x75/0x89 [ 452.064012] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56 [ 452.064012] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b [ 452.064012] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b [ 452.064012] Code: 89 e5 72 0c 31 c0 48 81 ff 45 66 25 81 0f 92 c0 5d c3 55 b8 00 01 00 00 48 89 e5 f0 66 0f c1 07 0f b6 d4 38 d0 74 06 f3 90 8a 07 <eb> f6 5d c3 90 90 55 48 89 e5 9c 58 0f 1f 44 00 00 5d c3 55 48 [ 452.064012] Call Trace: [ 452.064012] <IRQ> [<ffffffff81256559>] _raw_spin_lock+0xe/0x10 [ 452.064012] [<ffffffff8121c64a>] spin_lock+0x9/0xb [ 452.064012] [<ffffffff8121ec7a>] udp_queue_rcv_skb+0x186/0x269 [ 452.064012] [<ffffffff8121f276>] __udp4_lib_rcv+0x297/0x4ae [ 452.064012] [<ffffffff8121c178>] ? raw_rcv+0xe9/0xf0 [ 452.064012] [<ffffffff8121f4a7>] udp_rcv+0x1a/0x1c [ 452.064012] [<ffffffff811fe385>] ip_local_deliver_finish+0x12b/0x1a5 [ 452.064012] [<ffffffff811fe54e>] ip_local_deliver+0x53/0x84 [ 452.064012] [<ffffffff811fe1d0>] ip_rcv_finish+0x2bc/0x2f3 [ 452.064012] [<ffffffff811fe78f>] ip_rcv+0x210/0x269 [ 452.064012] [<ffffffff8101911e>] ? kvm_clock_get_cycles+0x9/0xb [ 452.064012] [<ffffffff811d88cd>] __netif_receive_skb+0x3a5/0x3f7 [ 452.064012] [<ffffffff811d8eba>] netif_receive_skb+0x57/0x5e [ 452.064012] [<ffffffff811cf30f>] ? __netdev_alloc_skb+0x1f/0x3b [ 452.064012] [<ffffffffa0049126>] virtnet_poll+0x4ba/0x5a4 [virtio_net] [ 452.064012] [<ffffffff811d9417>] net_rx_action+0x73/0x184 [ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core] [ 452.064012] [<ffffffff810343b9>] __do_softirq+0xc3/0x1a8 [ 452.064012] [<ffffffff81013b56>] ? ack_APIC_irq+0x10/0x12 [ 452.064012] [<ffffffff81256559>] ? _raw_spin_lock+0xe/0x10 [ 452.064012] [<ffffffff8125e0ac>] call_softirq+0x1c/0x26 [ 452.064012] [<ffffffff81003587>] do_softirq+0x45/0x82 [ 452.064012] [<ffffffff81034667>] irq_exit+0x42/0x9c [ 452.064012] [<ffffffff8125e146>] do_IRQ+0x8e/0xa5 [ 452.064012] [<ffffffff8125676e>] common_interrupt+0x6e/0x6e [ 452.064012] <EOI> [<ffffffff810b82a1>] ? kfree+0x8a/0xa3 [ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core] [ 452.064012] [<ffffffffa01b2c25>] ? l2tp_xmit_skb+0x1dd/0x4ac [l2tp_core] [ 452.064012] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp] [ 452.064012] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24 [ 452.064012] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6 [ 452.064012] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616 [ 452.064012] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c [ 452.064012] [<ffffffff810bbd21>] ? fget_light+0x75/0x89 [ 452.064012] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56 [ 452.064012] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b [ 452.064012] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b Reported-by: François Cachereul <[email protected]> Tested-by: François Cachereul <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Cc: James Chapman <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> farsync: fix info leak in ioctl [ Upstream commit 96b340406724d87e4621284ebac5e059d67b2194 ] The fst_get_iface() code fails to initialize the two padding bytes of struct sync_serial_settings after the ->loopback member. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> unix_diag: fix info leak [ Upstream commit 6865d1e834be84ddd5808d93d5035b492346c64a ] When filling the netlink message we miss to wipe the pad field, therefore leak one byte of heap memory to userland. Fix this by setting pad to 0. Signed-off-by: Mathias Krause <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> connector: use nlmsg_len() to check message length [ Upstream commit 162b2bedc084d2d908a04c93383ba02348b648b0 ] The current code tests the length of the whole netlink message to be at least as long to fit a cn_msg. This is wrong as nlmsg_len includes the length of the netlink message header. Use nlmsg_len() instead to fix this "off-by-NLMSG_HDRLEN" size check. Signed-off-by: Mathias Krause <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> bnx2x: record rx queue for LRO packets [ Upstream commit 60e66fee56b2256dcb1dc2ea1b2ddcb6e273857d ] RPS support is kind of broken on bnx2x, because only non LRO packets get proper rx queue information. This triggers reorders, as it seems bnx2x like to generate a non LRO packet for segment including TCP PUSH flag : (this might be pure coincidence, but all the reorders I've seen involve segments with a PUSH) 11:13:34.335847 IP A > B: . 415808:447136(31328) ack 1 win 457 <nop,nop,timestamp 3789336 3985797> 11:13:34.335992 IP A > B: . 447136:448560(1424) ack 1 win 457 <nop,nop,timestamp 3789336 3985797> 11:13:34.336391 IP A > B: . 448560:479888(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985797> 11:13:34.336425 IP A > B: P 511216:512640(1424) ack 1 win 457 <nop,nop,timestamp 3789337 3985798> 11:13:34.336423 IP A > B: . 479888:511216(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985798> 11:13:34.336924 IP A > B: . 512640:543968(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985798> 11:13:34.336963 IP A > B: . 543968:575296(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985798> We must call skb_record_rx_queue() to properly give to RPS (and more generally for TX queue selection on forward path) the receive queue information. Similar fix is needed for skb_mark_napi_id(), but will be handled in a separate patch to ease stable backports. Signed-off-by: Eric Dumazet <[email protected]> Cc: Willem de Bruijn <[email protected]> Cc: Eilon Greenstein <[email protected]> Acked-by: Dmitry Kravkov <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: dst: provide accessor function to dst->xfrm [ Upstream commit e87b3998d795123b4139bc3f25490dd236f68212 ] dst->xfrm is conditionally defined. Provide accessor funtion that is always available. Signed-off-by: Vlad Yasevich <[email protected]> Acked-by: Neil Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> sctp: Use software crc32 checksum when xfrm transform will happen. [ Upstream commit 27127a82561a2a3ed955ce207048e1b066a80a2a ] igb/ixgbe have hardware sctp checksum support, when this feature is enabled and also IPsec is armed to protect sctp traffic, ugly things happened as xfrm_output checks CHECKSUM_PARTIAL to do checksum operation(sum every thing up and pack the 16bits result in the checksum field). The result is fail establishment of sctp communication. Signed-off-by: Fan Du <[email protected]> Cc: Neil Horman <[email protected]> Cc: Steffen Klassert <[email protected]> Signed-off-by: Vlad Yasevich <[email protected]> Acked-by: Neil Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> sctp: Perform software checksum if packet has to be fragmented. [ Upstream commit d2dbbba77e95dff4b4f901fee236fef6d9552072 ] IP/IPv6 fragmentation knows how to compute only TCP/UDP checksum. This causes problems if SCTP packets has to be fragmented and ipsummed has been set to PARTIAL due to checksum offload support. This condition can happen when retransmitting after MTU discover, or when INIT or other control chunks are larger then MTU. Check for the rare fragmentation condition in SCTP and use software checksum calculation in this case. CC: Fan Du <[email protected]> Signed-off-by: Vlad Yasevich <[email protected]> Acked-by: Neil Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> wanxl: fix info leak in ioctl [ Upstream commit 2b13d06c9584b4eb773f1e80bbaedab9a1c344e1 ] The wanxl_ioctl() code fails to initialize the two padding bytes of struct sync_serial_settings after the ->loopback member. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Salva Peiró <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: unix: inherit SOCK_PASS{CRED, SEC} flags from socket to fix race [ Upstream commit 90c6bd34f884cd9cee21f1d152baf6c18bcac949 ] In the case of credentials passing in unix stream sockets (dgram sockets seem not affected), we get a rather sparse race after commit 16e5726 ("af_unix: dont send SCM_CREDENTIALS by default"). We have a stream server on receiver side that requests credential passing from senders (e.g. nc -U). Since we need to set SO_PASSCRED on each spawned/accepted socket on server side to 1 first (as it's not inherited), it can happen that in the time between accept() and setsockopt() we get interrupted, the sender is being scheduled and continues with passing data to our receiver. At that time SO_PASSCRED is neither set on sender nor receiver side, hence in cmsg's SCM_CREDENTIALS we get eventually pid:0, uid:65534, gid:65534 (== overflow{u,g}id) instead of what we actually would like to see. On the sender side, here nc -U, the tests in maybe_add_creds() invoked through unix_stream_sendmsg() would fail, as at that exact time, as mentioned, the sender has neither SO_PASSCRED on his side nor sees it on the server side, and we have a valid 'other' socket in place. Thus, sender believes it would just look like a normal connection, not needing/requesting SO_PASSCRED at that time. As reverting 16e5726 would not be an option due to the significant performance regression reported when having creds always passed, one way/trade-off to prevent that would be to set SO_PASSCRED on the listener socket and allow inheriting these flags to the spawned socket on server side in accept(). It seems also logical to do so if we'd tell the listener socket to pass those flags onwards, and would fix the race. Before, strace: recvmsg(4, {msg_name(0)=NULL, msg_iov(1)=[{"blub\n", 4096}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=0, uid=65534, gid=65534}}, msg_flags=0}, 0) = 5 After, strace: recvmsg(4, {msg_name(0)=NULL, msg_iov(1)=[{"blub\n", 4096}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=11580, uid=1000, gid=1000}}, msg_flags=0}, 0) = 5 Signed-off-by: Daniel Borkmann <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Eric W. Biederman <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: fix cipso packet validation when !NETLABEL [ Upstream commit f2e5ddcc0d12f9c4c7b254358ad245c9dddce13b ] When CONFIG_NETLABEL is disabled, the cipso_v4_validate() function could loop forever in the main loop if opt[opt_iter +1] == 0, this will causing a kernel crash in an SMP system, since the CPU executing this function will stall /not respond to IPIs. This problem can be reproduced by running the IP Stack Integrity Checker (http://isic.sourceforge.net) using the following command on a Linux machine connected to DUT: "icmpsic -s rand -d <DUT IP address> -r 123456" wait (1-2 min) Signed-off-by: Seif Mazareeb <[email protected]> Acked-by: Paul Moore <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> inet: fix possible memory corruption with UDP_CORK and UFO [ This is a simplified -stable version of a set of upstream commits. ] This is a replacement patch only for stable which does fix the problems handled by the following two commits in -net: "ip_output: do skb ufo init for peeked non ufo skb as well" (e93b7d748be887cd7639b113ba7d7ef792a7efb9) "ip6_output: do skb ufo init for peeked non ufo skb as well" (c547dbf55d5f8cf615ccc0e7265e98db27d3fb8b) Three frames are written on a corked udp socket for which the output netdevice has UFO enabled. If the first and third frame are smaller than the mtu and the second one is bigger, we enqueue the second frame with skb_append_datato_frags without initializing the gso fields. This leads to the third frame appended regulary and thus constructing an invalid skb. This fixes the problem by always using skb_append_datato_frags as soon as the first frag got enqueued to the skb without marking the packet as SKB_GSO_UDP. The problem with only two frames for ipv6 was fixed by "ipv6: udp packets following an UFO enqueued packet need also be handled by UFO" (2811ebac2521ceac84f2bdae402455baa6a7fb47). Cc: Jiri Pirko <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: David Miller <[email protected]> Signed-off-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> davinci_emac.c: Fix IFF_ALLMULTI setup [ Upstream commit d69e0f7ea95fef8059251325a79c004bac01f018 ] When IFF_ALLMULTI flag is set on interface and IFF_PROMISC isn't, emac_dev_mcast_set should only enable RX of multicasts and reset MACHASH registers. It does this, but afterwards it either sets up multicast MACs filtering or disables RX of multicasts and resets MACHASH registers again, rendering IFF_ALLMULTI flag useless. This patch fixes emac_dev_mcast_set, so that multicast MACs filtering and disabling of RX of multicasts are skipped when IFF_ALLMULTI flag is set. Tested with kernel 2.6.37. Signed-off-by: Mariusz Ceier <[email protected]> Acked-by: Mugunthan V N <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ext3: return 32/64-bit dir name hash according to usage type commit d7dab39b6e16d5eea78ed3c705d2a2d0772b4f06 upstream. This is based on commit d1f5273e9adb40724a85272f248f210dc4ce919a ext4: return 32/64-bit dir name hash according to usage type by Fan Yong <[email protected]> Traditionally ext2/3/4 has returned a 32-bit hash value from llseek() to appease NFSv2, which can only handle a 32-bit cookie for seekdir() and telldir(). However, this causes problems if there are 32-bit hash collisions, since the NFSv2 server can get stuck resending the same entries from the directory repeatedly. Allow ext3 to return a full 64-bit hash (both major and minor) for telldir to decrease the chance of hash collisions. This patch does implement a new ext3_dir_llseek op, because with 64-bit hashes, nfs will attempt to seek to a hash "offset" which is much larger than ext3's s_maxbytes. So for dx dirs, we call generic_file_llseek_size() with the appropriate max hash value as the maximum seekable size. Otherwise we just pass through to generic_file_llseek(). Patch-updated-by: Bernd Schubert <[email protected]> Patch-updated-by: Eric Sandeen <[email protected]> (blame us if something is not correct) Signed-off-by: Eric Sandeen <[email protected]> Signed-off-by: Jan Kara <[email protected]> Cc: Benjamin LaHaise <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> dm snapshot: fix data corruption commit e9c6a182649f4259db704ae15a91ac820e63b0ca upstream. This patch fixes a particular type of data corruption that has been encountered when loading a snapshot's metadata from disk. When we allocate a new chunk in persistent_prepare, we increment ps->next_free and we make sure that it doesn't point to a metadata area by further incrementing it if necessary. When we load metadata from disk on device activation, ps->next_free is positioned after the last used data chunk. However, if this last used data chunk is followed by a metadata area, ps->next_free is positioned erroneously to the metadata area. A newly-allocated chunk is placed at the same location as the metadata area, resulting in data or metadata corruption. This patch changes the code so that ps->next_free skips the metadata area when metadata are loaded in function read_exceptions. The patch also moves a piece of code from persistent_prepare_exception to a separate function skip_metadata to avoid code duplication. CVE-2013-4299 Signed-off-by: Mikulas Patocka <[email protected]> Cc: Mike Snitzer <[email protected]> Signed-off-by: Alasdair G Kergon <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> writeback: fix negative bdi max pause commit e3b6c655b91e01a1dade056cfa358581b47a5351 upstream. Toralf runs trinity on UML/i386. After some time it hangs and the last message line is BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521] It's found that pages_dirtied becomes very large. More than 1000000000 pages in this case: period = HZ * pages_dirtied / task_ratelimit; BUG_ON(pages_dirtied > 2000000000); BUG_ON(pages_dirtied > 1000000000); <--------- UML debug printf shows that we got negative pause here: ick: pause : -984 ick: pages_dirtied : 0 ick: task_ratelimit: 0 pause: + if (pause < 0) { + extern int printf(char *, ...); + printf("ick : pause : %li\n", pause); + printf("ick: pages_dirtied : %lu\n", pages_dirtied); + printf("ick: task_ratelimit: %lu\n", task_ratelimit); + BUG_ON(1); + } trace_balance_dirty_pages(bdi, Since pause is bounded by [min_pause, max_pause] where min_pause is also bounded by max_pause. It's suspected and demonstrated that the max_pause calculation goes wrong: ick: pause : -717 ick: min_pause : -177 ick: max_pause : -717 ick: pages_dirtied : 14 ick: task_ratelimit: 0 The problem lies in the two "long = unsigned long" assignments in bdi_max_pause() which might go negative if the highest bit is 1, and the min_t(long, ...) check failed to protect it falling under 0. Fix all of them by using "unsigned long" throughout the function. Signed-off-by: Fengguang Wu <[email protected]> Reported-by: Toralf Förster <[email protected]> Tested-by: Toralf Förster <[email protected]> Reviewed-by: Jan Kara <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> wireless: radiotap: fix parsing buffer overrun commit f5563318ff1bde15b10e736e97ffce13be08bc1a upstream. When parsing an invalid radiotap header, the parser can overrun the buffer that is passed in because it doesn't correctly check 1) the minimum radiotap header size 2) the space for extended bitmaps The first issue doesn't affect any in-kernel user as they all check the minimum size before calling the radiotap function. The second issue could potentially affect the kernel if an skb is passed in that consists only of the radiotap header with a lot of extended bitmaps that extend past the SKB. In that case a read-only buffer overrun by at most 4 bytes is possible. Fix this by adding the appropriate checks to the parser. Reported-by: Evan Huus <[email protected]> Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> USB: serial: ti_usb_3410_5052: add Abbott strip port ID to combined table as well. commit c9d09dc7ad106492c17c587b6eeb99fe3f43e522 upstream. Without this change, the USB cable for Freestyle Option and compatible glucometers will not be detected by the driver. Signed-off-by: Diego Elio Pettenò <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> USB: serial: option: add support for Inovia SEW858 device commit f4c19b8e165cff1a6607c21f8809441d61cab7ec upstream. This patch adds the device id for the Inovia SEW858 device to the option driver. Reported-by: Pavel Parkhomenko <[email protected]> Tested-by: Pavel Parkhomenko <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> usb: serial: option: blacklist Olivetti Olicard200 commit fd8573f5828873343903215f203f14dc82de397c upstream. Interface 6 of this device speaks QMI as per tests done by us. Credits go to Antonella for providing the hardware. Signed-off-by: Enrico Mioso <[email protected]> Signed-off-by: Antonella Pellizzari <[email protected]> Tested-by: Dan Williams <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> Linux 3.4.68 Signed-off-by: Chet Kener <[email protected]> USB: support new huawei devices in option.c commit d544db293a44a2a3b09feab7dbd59668b692de71 upstream. Add new supporting declarations to option.c, to support Huawei new devices with new bInterfaceSubClass value. Signed-off-by: fangxiaozhi <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> USB: quirks.c: add one device that cannot deal with suspension commit 4294bca7b423d1a5aa24307e3d112a04075e3763 upstream. The device is not responsive when resumed, unless it is reset. Signed-off-by: Oliver Neukum <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> USB: quirks: add touchscreen that is dazzeled by remote wakeup commit 614ced91fc6fbb5a1cdd12f0f1b6c9197d9f1350 upstream. The device descriptors are messed up after remote wakeup Signed-off-by: Oliver Neukum <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> USB: serial: ftdi_sio: add id for Z3X Box device commit e1466ad5b1aeda303f9282463d55798d2eda218c upstream. Custom VID/PID for Z3X Box device, popular tool for cellphone flashing. Signed-off-by: Alexey E. Kramarenko <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> mac80211: correctly close cancelled scans commit a754055a1296fcbe6f32de3a5eaca6efb2fd1865 upstream. __ieee80211_scan_completed is called from a worker. This means that the following flow is possible. * driver calls ieee80211_scan_completed * mac80211 cancels the scan (that is already complete) * __ieee80211_scan_completed runs When scan_work will finally run, it will see that the scan hasn't been aborted and might even trigger another scan on another band. This leads to a situation where cfg80211's scan is not done and no further scan can be issued. Fix this by setting a new flag when a HW scan is being cancelled so that no other scan will be triggered. Signed-off-by: Emmanuel Grumbach <[email protected]> Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> mac80211: update sta->last_rx on acked tx frames commit 0c5b93290b2f3c7a376567c03ae8d385b0e99851 upstream. When clients are idle for too long, hostapd sends nullfunc frames for probing. When those are acked by the client, the idle time needs to be updated. To make this work (and to avoid unnecessary probing), update sta->last_rx whenever an ACK was received for a tx packet. Only do this if the flag IEEE80211_HW_REPORTS_TX_ACK_STATUS is set. Signed-off-by: Felix Fietkau <[email protected]> Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> rtlwifi: rtl8192cu: Fix error in pointer arithmetic commit 9473ca6e920a3b9ca902753ce52833657f9221cc upstream. An error in calculating the offset in an skb causes the driver to read essential device info from the wrong locations. The main effect is that automatic gain calculations are nonsense. Signed-off-by: Mark Cave-Ayland <[email protected]> Signed-off-by: Larry Finger <[email protected]> Signed-off-by: John W. Linville <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> jfs: fix error path in ialloc commit 8660998608cfa1077e560034db81885af8e1e885 upstream. If insert_inode_locked() fails, we shouldn't be calling unlock_new_inode(). Signed-off-by: Dave Kleikamp <[email protected]> Tested-by: Michael L. Semon <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> can: flexcan: flexcan_chip_start: fix regression, mark one MB for TX and abort pending TX commit d5a7b406c529e4595ce03dc8f6dcf7fa36f106fa upstream. In patch 0d1862e can: flexcan: fix flexcan_chip_start() on imx6 the loop in flexcan_chip_start() that iterates over all mailboxes after the soft reset of the CAN core was removed. This loop put all mailboxes (even the ones marked as reserved 1...7) into EMPTY/INACTIVE mode. On mailboxes 8...63, this aborts any pending TX messages. After a cold boot there is random garbage in the mailboxes, which leads to spontaneous transmit of CAN frames during first activation. Further if the interface was disabled with a pending message (usually due to an error condition on the CAN bus), this message is retransmitted after enabling the interface again. This patch fixes the regression by: 1) Limiting the maximum number of used mailboxes to 8, 0...7 are used by the RX FIFO, 8 is used by TX. 2) Marking the TX mailbox as EMPTY/INACTIVE, so that any pending TX of that mailbox is aborted. Cc: Lothar Waßmann <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> libata: make ata_eh_qc_retry() bump scmd->allowed on bogus failures commit f13e220161e738c2710b9904dcb3cf8bb0bcce61 upstream. libata EH decrements scmd->retries when the command failed for reasons unrelated to the command itself so that, for example, commands aborted due to suspend / resume cycle don't get penalized; however, decrementing scmd->retries isn't enough for ATA passthrough commands. Without this fix, ATA passthrough commands are not resend to the drive, and no error is signalled to the caller because: - allowed retry count is 1 - ata_eh_qc_complete fill the sense data, so result is valid - sense data is filled with untouched ATA registers. Signed-off-by: Gwendal Grignou <[email protected]> Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> md: Fix skipping recovery for read-only arrays. commit 61e4947c99c4494336254ec540c50186d186150b upstream. Since: commit 7ceb17e87bde79d285a8b988cfed9eaeebe60b86 md: Allow devices to be re-added to a read-only array. spares are activated on a read-only array. In case of raid1 and raid10 personalities it causes that not-in-sync devices are marked in-sync without checking if recovery has been finished. If a read-only array is degraded and one of its devices is not in-sync (because the array has been only partially recovered) recovery will be skipped. This patch adds checking if recovery has been finished before marking a device in-sync for raid1 and raid10 personalities. In case of raid5 personality such condition is already present (at raid5.c:6029). Bug was introduced in 3.10 and causes data corruption. Signed-off-by: Pawel Baldysiak <[email protected]> Signed-off-by: Lukasz Dorau <[email protected]> Signed-off-by: NeilBrown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> clockevents: Sanitize ticks to nsec conversion commit 97b9410643475d6557d2517c2aff9fd2221141a9 upstream. Marc Kleine-Budde pointed out, that commit 77cc982 "clocksource: use clockevents_config_and_register() where possible" caused a regression for some of the converted subarchs. The reason is, that the clockevents core code converts the minimal hardware tick delta to a nanosecond value for core internal usage. This conversion is affected by integer math rounding loss, so the backwards conversion to hardware ticks will likely result in a value which is less than the configured hardware limitation. The affected subarchs used their own workaround (SIGH!) which got lost in the conversion. The solution for the issue at hand is simple: adding evt->mult - 1 to the shifted value before the integer divison in the core conversion function takes care of it. But this only works for the case where for the scaled math mult/shift pair "mult <= 1 << shift" is true. For the case where "mult > 1 << shift" we can apply the rounding add only for the minimum delta value to make sure that the backward conversion is not less than the given hardware limit. For the upper bound we need to omit the rounding add, because the backwards conversion is always larger than the original latch value. That would violate the upper bound of the hardware device. Though looking closer at the details of that function reveals another bogosity: The upper bounds check is broken as well. Checking for a resulting "clc" value greater than KTIME_MAX after the conversion is pointless. The conversion does: u64 clc = (latch << evt->shift) / evt->mult; So there is no sanity check for (latch << evt->shift) exceeding the 64bit boundary. The latch argument is "unsigned long", so on a 64bit arch the handed in argument could easily lead to an unnoticed shift overflow. With the above rounding fix applied the calculation before the divison is: u64 clc = (latch << evt->shift) + evt->mult - 1; So we need to make sure, that neither the shift nor the rounding add is overflowing the u64 boundary. [ukl: move assignment to rnd after eventually changing mult, fix build issue and correct comment with the right math] Signed-off-by: Thomas Gleixner <[email protected]> Cc: Russell King - ARM Linux <[email protected]> Cc: Marc Kleine-Budde <[email protected]> Cc: [email protected] Cc: Marc Pignat <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Ronald Wahl <[email protected]> Cc: LAK <[email protected]> Cc: Ludovic Desroches <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Uwe Kleine-König <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> parisc: Do not crash 64bit SMP kernels on machines with >= 4GB RAM commit 54e181e073fc1415e41917d725ebdbd7de956455 upstream. Since the beginning of the parisc-linux port, sometimes 64bit SMP kernels were not able to bring up other CPUs than the monarch CPU and instead crashed the kernel. The reason was unclear, esp. since it involved various machines (e.g. J5600, J6750 and SuperDome). Testing showed, that those crashes didn't happened when less than 4GB were installed, or if a 32bit Linux kernel was booted. In the end, the fix for those SMP problems is trivial: During the early phase of the initialization of the CPUs, including the monarch CPU, the PDC_PSW firmware function to enable WIDE (=64bit) mode is called. It's documented that this firmware function may clobber various registers, and one one of those possibly clobbered registers is %cr30 which holds the task thread info pointer. Now, if %cr30 would always have been clobbered, then this bug would have been detected much earlier. But lots of testing finally showed, that - at least for %cr30 - on some machines only the upper 32bits of the 64bit register suddenly turned zero after the firmware call. So, after finding the root cause, the explanation for the various crashes became clear: - On 32bit SMP Linux kernels all upper 32bit were zero, so we didn't faced this problem. - Monarch CPUs in 64bit mode always booted sucessfully, because the inital task thread info pointer was below 4GB. - Secondary CPUs booted sucessfully on machines with less than 4GB RAM because the upper 32bit were zero anyay. - Secondary CPus failed to boot if we had more than 4GB RAM and the task thread info pointer was located above the 4GB boundary. Finally, the patch to fix this problem is trivial by saving the %cr30 register before the firmware call and restoring it afterwards. Signed-off-by: Helge Deller <[email protected]> Signed-off-by: John David Anglin <[email protected]> Signed-off-by: Helge Deller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ALSA: hda - Add a fixup for ASUS N76VZ commit 6fc16e58adf50c0f1e4478538983fb5ff6f453d4 upstream. ASUS N76VZ needs the same fixup as N56VZ for supporting the boost speaker. Bugzilla: https://bugzilla.novell.com/show_bug.cgi?id=846529 Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ASoC: wm_hubs: Add missing break in hp_supply_event() commit 268ff14525edba31da29a12a9dd693cdd6a7872e upstream. Spotted by coverity CID 115170. Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ASoC: dapm: Fix source list debugfs outputs commit ff18620c2157671a8ee21ebb8e6a3520ea209b1f upstream. ... due to a copy & paste error. Spotted by coverity CID 710923. Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> staging: ozwpan: prevent overflow in oz_cdev_write() commit c2c65cd2e14ada6de44cb527e7f1990bede24e15 upstream. We need to check "count" so we don't overflow the ei->data buffer. Reported-by: Nico Golde <[email protected]> Reported-by: Fabian Yamaguchi <[email protected]> Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> Staging: bcm: info leak in ioctl commit 8d1e72250c847fa96498ec029891de4dc638a5ba upstream. The DevInfo.u32Reserved[] array isn't initialized so it leaks kernel information to user space. Reported-by: Nico Golde <[email protected]> Reported-by: Fabian Yamaguchi <[email protected]> Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> uml: check length in exitcode_proc_write() commit 201f99f170df14ba52ea4c52847779042b7a623b upstream. We don't cap the size of buffer from the user so we could write past the end of the array here. Only root can write to this file. Reported-by: Nico Golde <[email protected]> Reported-by: Fabian Yamaguchi <[email protected]> Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> xtensa: don't use alternate signal stack on threads commit cba9a90053e3b7973eff4f1946f33032e98eeed5 upstream. According to create_thread(3): "The new thread does not inherit the creating thread's alternate signal stack". Since commit f9a3879a (Fix sigaltstack corruption among cloned threads), current->sas_ss_size is set to 0 for cloned processes sharing VM with their parent. Don't use the (nonexistent) alternate signal stack in this case. This has been broken since commit 29c4dfd9 ([XTENSA] Remove non-rt signal handling). Fixes the SA_ONSTACK part of the nptl/tst-cancel20 test from uClibc. Signed-off-by: Baruch Siach <[email protected]> Signed-off-by: Max Filippov <[email protected]> Signed-off-by: Chris Zankel <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> lib/scatterlist.c: don't flush_kernel_dcache_page on slab page commit 3d77b50c5874b7e923be946ba793644f82336b75 upstream. Commit b1adaf65ba03 ("[SCSI] block: add sg buffer copy helper functions") introduces two sg buffer copy helpers, and calls flush_kernel_dcache_page() on pages in SG list after these pages are written to. Unfortunately, the commit may introduce a potential bug: - Before sending some SCSI commands, kmalloc() buffer may be passed to block layper, so flush_kernel_dcache_page() can see a slab page finally - According to cachetlb.txt, flush_kernel_dcache_page() is only called on "a user page", which surely can't be a slab page. - ARCH's implementation of flush_kernel_dcache_page() may use page mapping information to do optimization so page_mapping() will see the slab page, then VM_BUG_ON() is triggered. Aaro Koskinen reported the bug on ARM/kirkwood when DEBUG_VM is enabled, and this patch fixes the bug by adding test of '!PageSlab(miter->page)' before calling flush_kernel_dcache_page(). Signed-off-by: Ming Lei <[email protected]> Reported-by: Aaro Koskinen <[email protected]> Tested-by: Simon Baatz <[email protected]> Cc: Russell King - ARM Linux <[email protected]> Cc: Will Deacon <[email protected]> Cc: Aaro Koskinen <[email protected]> Acked-by: Catalin Marinas <[email protected]> Cc: FUJITA Tomonori <[email protected]> Cc: Tejun Heo <[email protected]> Cc: "James E.J. Bottomley" <[email protected]> Cc: Jens Axboe <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> aacraid: missing capable() check in compat ioctl commit f856567b930dfcdbc3323261bf77240ccdde01f5 upstream. In commit d496f94d22d1 ('[SCSI] aacraid: fix security weakness') we added a check on CAP_SYS_RAWIO to the ioctl. The compat ioctls need the check as well. Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> mm: fix aio performance regression for database caused by THP commit 7cb2ef56e6a8b7b368b2e883a0a47d02fed66911 upstream. I am working with a tool that simulates oracle database I/O workload. This tool (orion to be specific - <http://docs.oracle.com/cd/E11882_01/server.112/e16638/iodesign.htm#autoId24>) allocates hugetlbfs pages using shmget() with SHM_HUGETLB flag. It then does aio into these pages from flash disks using various common block sizes used by database. I am looking at performance with two of the most common block sizes - 1M and 64K. aio performance with these two block sizes plunged after Transparent HugePages was introduced in the kernel. Here are performance numbers: pre-THP 2.6.39 3.11-rc5 1M read 8384 MB/s 5629 MB/s 6501 MB/s 64K read 7867 MB/s 4576 MB/s 4251 MB/s I have narrowed the performance impact down to the overheads introduced by THP in __get_page_tail() and put_compound_page() routines. perf top shows >40% of cycles being spent in these two routines. Every time direct I/O to hugetlbfs pages starts, kernel calls get_page() to grab a reference to the pages and calls put_page() when I/O completes to put the reference away. THP introduced significant amount of locking overhead to get_page() and put_page() when dealing with compound pages because hugepages can be split underneath get_page() and put_page(). It added this overhead irrespective of whether it is dealing with hugetlbfs pages or transparent hugepages. This resulted in 20%-45% drop in aio performance when using hugetlbfs pages. Since hugetlbfs pages can not be split, there is no reason to go through all the locking overhead for these pages from what I can see. I added code to __get_page_tail() and put_compound_page() to bypass all the locking code when working with hugetlbfs pages. This improved performance significantly. Performance numbers with this patch: pre-THP 3.11-rc5 3.11-rc5 + Patch 1M read 8384 MB/s 6501 MB/s 8371 MB/s 64K read 7867 MB/s 4251 MB/s 6510 MB/s Performance with 64K read is still lower than what it was before THP, but still a 53% improvement. It does mean there is more work to be done but I will take a 53% improvement for now. Please take a look at the following patch and let me know if it looks reasonable. [[email protected]: tweak comments] Signed-off-by: Khalid Aziz <[email protected]> Cc: Pravin B Shelar <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Andi Kleen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> drm: Prevent overwriting from userspace underallocating core ioctl structs commit b062672e305ce071f21eb9e18b102c2a430e0999 upstream. Apply the protections from commit 1b2f1489633888d4a06028315dc19d65768a1c05 Author: Dave Airlie <[email protected]> Date: Sat Aug 14 20:20:34 2010 +1000 drm: block userspace under allocating buffer and having drivers overwrite it (v2) to the core ioctl structs as well, for we found one instance where there is a 32-/64-bit size mismatch and were guilty of writing beyond the end of the user's buffer. Signed-off-by: Chris Wilson <[email protected]> Cc: Dave Airlie <[email protected]> Reviewed-by: Ville Syrjälä <[email protected]> Cc: [email protected] Signed-off-by: Dave Airlie <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> drm/radeon/atom: workaround vbios bug in transmitter table on rs780 commit c23632d4e57c0dd20bf50eca08fa0eb8ad3ff680 upstream. Some rs780 asics seem to be affected as well. See: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=91f3a6aaf280294b07c05dfe606e6c27b7ba3c72 Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=60791 Signed-off-by: Alex Deucher <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> Linux 3.4.69 Signed-off-by: Chet Kener <[email protected]> xen-netback: use jiffies_64 value to calculate credit timeout [ Upstream commit 059dfa6a93b779516321e5112db9d7621b1367ba ] time_after_eq() only works if the delta is < MAX_ULONG/2. For a 32bit Dom0, if netfront sends packets at a very low rate, the time between subsequent calls to tx_credit_exceeded() may exceed MAX_ULONG/2 and the test for timer_after_eq() will be incorrect. Credit will not be replenished and the guest may become unable to send packets (e.g., if prior to the long gap, all credit was exhausted). Use jiffies_64 variant to mitigate this problem for 32bit Dom0. Suggested-by: Jan Beulich <[email protected]> Signed-off-by: Wei Liu <[email protected]> Reviewed-by: David Vrabel <[email protected]> Cc: Ian Campbell <[email protected]> Cc: Jason Luan <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> PCI: fix truncation of resource size to 32 bits commit d6776e6d5c2f8db0252f447b09736075e1bbe387 upstream. _pci_assign_resource() took an int "size" argument, which meant that sizes larger than 4GB were truncated. Change type to resource_size_t. [bhelgaas: changelog] Signed-off-by: Nikhil P Rao <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Cc: Jiri Slaby <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> USB: add new zte 3g-dongle's pid to option.c commit 0636fc507a976cdc40f21bdbcce6f0b98ff1dfe9 upstream. Signed-off-by: Rui li <[email protected]> Cc: stable <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ALSA: hda - Move one-time init codes from generic_hdmi_init() commit 8b8d654b55648561287bd8baca0f75f964a17038 upstream. The codes to initialize work struct or create a proc interface should be called only once and never although it's called many times through the init callback. Move that stuff into patch_generic_hdmi() so that it's called only once. Signed-off-by: Takashi Iwai <[email protected]> BugLink: https://bugs.launchpad.net/bugs/1212160 Signed-off-by: David Henningsson <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> netfilter: nf_ct_sip: don't drop packets with offsets pointing outside the packet commit 3a7b…
DerRomtester
referenced
this issue
in DerRomtester/one_plus_one
Jan 8, 2015
net/ping: handle protocol mismatching scenario ping_lookup() may return a wrong sock if sk_buff's and sock's protocols dont' match. For example, sk_buff's protocol is ETH_P_IPV6, but sock's sk_family is AF_INET, in that case, if sk->sk_bound_dev_if is zero, a wrong sock will be returned. the fix is to "continue" the searching, if no matching, return NULL. [cherry-pick of net 91a0b603469069cdcce4d572b7525ffc9fd352a6] Bug: 18512516 Change-Id: I520223ce53c0d4e155c37d6b65a03489cc7fd494 Cc: "David S. Miller" <[email protected]> Cc: Alexey Kuznetsov <[email protected]> Cc: James Morris <[email protected]> Cc: Hideaki YOSHIFUJI <[email protected]> Cc: Patrick McHardy <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Jane Zhou <[email protected]> Signed-off-by: Yiwei Zhao <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Lorenzo Colitti <[email protected]> tcp: must unclone packets before mangling them [ Upstream commit c52e2421f7368fd36cbe330d2cf41b10452e39a9 ] TCP stack should make sure it owns skbs before mangling them. We had various crashes using bnx2x, and it turned out gso_size was cleared right before bnx2x driver was populating TC descriptor of the _previous_ packet send. TCP stack can sometime retransmit packets that are still in Qdisc. Of course we could make bnx2x driver more robust (using ACCESS_ONCE(shinfo->gso_size) for example), but the bug is TCP stack. We have identified two points where skb_unclone() was needed. This patch adds a WARN_ON_ONCE() to warn us if we missed another fix of this kind. Kudos to Neal for finding the root cause of this bug. Its visible using small MSS. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: Neal Cardwell <[email protected]> Cc: Yuchung Cheng <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> tcp: do not forget FIN in tcp_shifted_skb() [ Upstream commit 5e8a402f831dbe7ee831340a91439e46f0d38acd ] Yuchung found following problem : There are bugs in the SACK processing code, merging part in tcp_shift_skb_data(), that incorrectly resets or ignores the sacked skbs FIN flag. When a receiver first SACK the FIN sequence, and later throw away ofo queue (e.g., sack-reneging), the sender will stop retransmitting the FIN flag, and hangs forever. Following packetdrill test can be used to reproduce the bug. $ cat sack-merge-bug.pkt `sysctl -q net.ipv4.tcp_fack=0` // Establish a connection and send 10 MSS. 0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 +.000 bind(3, ..., ...) = 0 +.000 listen(3, 1) = 0 +.050 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7> +.000 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 6> +.001 < . 1:1(0) ack 1 win 1024 +.000 accept(3, ..., ...) = 4 +.100 write(4, ..., 12000) = 12000 +.000 shutdown(4, SHUT_WR) = 0 +.000 > . 1:10001(10000) ack 1 +.050 < . 1:1(0) ack 2001 win 257 +.000 > FP. 10001:12001(2000) ack 1 +.050 < . 1:1(0) ack 2001 win 257 <sack 10001:11001,nop,nop> +.050 < . 1:1(0) ack 2001 win 257 <sack 10001:12002,nop,nop> // SACK reneg +.050 < . 1:1(0) ack 12001 win 257 +0 %{ print "unacked: ",tcpi_unacked }% +5 %{ print "" }% First, a typo inverted left/right of one OR operation, then code forgot to advance end_seq if the merged skb carried FIN. Bug was added in 2.6.29 by commit 832d11c5cd076ab ("tcp: Try to restore large SKBs while SACK processing") Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: Yuchung Cheng <[email protected]> Acked-by: Neal Cardwell <[email protected]> Cc: Ilpo Järvinen <[email protected]> Acked-by: Ilpo Järvinen <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: do not call sock_put() on TIMEWAIT sockets [ Upstream commit 80ad1d61e72d626e30ebe8529a0455e660ca4693 ] commit 3ab5aee7fe84 ("net: Convert TCP & DCCP hash tables to use RCU / hlist_nulls") incorrectly used sock_put() on TIMEWAIT sockets. We should instead use inet_twsk_put() Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: mv643xx_eth: update statistics timer from timer context only [ Upstream commit 041b4ddb84989f06ff1df0ca869b950f1ee3cb1c ] Each port driver installs a periodic timer to update port statistics by calling mib_counters_update. As mib_counters_update is also called from non-timer context, we should not reschedule the timer there but rather move it to timer-only context. Signed-off-by: Sebastian Hesselbarth <[email protected]> Acked-by: Jason Cooper <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: mv643xx_eth: fix orphaned statistics timer crash [ Upstream commit f564412c935111c583b787bcc18157377b208e2e ] The periodic statistics timer gets started at port _probe() time, but is stopped on _stop() only. In a modular environment, this can cause the timer to access already deallocated memory, if the module is unloaded without starting the eth device. To fix this, we add the timer right before the port is started, instead of at _probe() time. Signed-off-by: Sebastian Hesselbarth <[email protected]> Acked-by: Jason Cooper <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: heap overflow in __audit_sockaddr() [ Upstream commit 1661bf364ae9c506bc8795fef70d1532931be1e8 ] We need to cap ->msg_namelen or it leads to a buffer overflow when we to the memcpy() in __audit_sockaddr(). It requires CAP_AUDIT_CONTROL to exploit this bug. The call tree is: ___sys_recvmsg() move_addr_to_user() audit_sockaddr() __audit_sockaddr() Reported-by: Jüri Aedla <[email protected]> Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> proc connector: fix info leaks [ Upstream commit e727ca82e0e9616ab4844301e6bae60ca7327682 ] Initialize event_data for all possible message types to prevent leaking kernel stack contents to userland (up to 20 bytes). Also set the flags member of the connector message to 0 to prevent leaking two more stack bytes this way. Signed-off-by: Mathias Krause <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ipv4: fix ineffective source address selection [ Upstream commit 0a7e22609067ff524fc7bbd45c6951dd08561667 ] When sending out multicast messages, the source address in inet->mc_addr is ignored and rewritten by an autoselected one. This is caused by a typo in commit 813b3b5db831 ("ipv4: Use caller's on-stack flowi as-is in output route lookups"). Signed-off-by: Jiri Benc <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> can: dev: fix nlmsg size calculation in can_get_size() [ Upstream commit fe119a05f8ca481623a8d02efcc984332e612528 ] This patch fixes the calculation of the nlmsg size, by adding the missing nla_total_size(). Signed-off-by: Marc Kleine-Budde <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ipv6: restrict neighbor entry creation to output flow This patch is based on 3.2.y branch, the one used by reporter. Please let me know if it should be different. Thanks. The patch which introduced the regression was applied on stables: 3.0.64 3.4.31 3.7.8 3.2.39 The patch which introduced the regression was for stable trees only. ---8<--- Commit 0d6a77079c475033cb622c07c5a880b392ef664e "ipv6: do not create neighbor entries for local delivery" introduced a regression on which routes to local delivery would not work anymore. Like this: $ ip -6 route add local 2001::/64 dev lo $ ping6 -c1 2001::9 PING 2001::9(2001::9) 56 data bytes ping: sendmsg: Invalid argument As this is a local delivery, that commit would not allow the creation of a neighbor entry and thus the packet cannot be sent. But as TPROXY scenario actually needs to avoid the neighbor entry creation only for input flow, this patch now limits previous patch to input flow, keeping output as before that patch. Reported-by: Debabrata Banerjee <[email protected]> Signed-off-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: Jiri Pirko <[email protected]> Acked-by: Hannes Frederic Sowa <[email protected]> CC: Hannes Frederic Sowa <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> bridge: Correctly clamp MAX forward_delay when enabling STP [ Upstream commit 4b6c7879d84ad06a2ac5b964808ed599187a188d ] Commit be4f154d5ef0ca147ab6bcd38857a774133f5450 bridge: Clamp forward_delay when enabling STP had a typo when attempting to clamp maximum forward delay. It is possible to set bridge_forward_delay to be higher then permitted maximum when STP is off. When turning STP on, the higher then allowed delay has to be clamed down to max value. Signed-off-by: Vlad Yasevich <[email protected]> CC: Herbert Xu <[email protected]> CC: Stephen Hemminger <[email protected]> Reviewed-by: Veaceslav Falico <[email protected]> Acked-by: Herbert Xu <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: vlan: fix nlmsg size calculation in vlan_get_size() [ Upstream commit c33a39c575068c2ea9bffb22fd6de2df19c74b89 ] This patch fixes the calculation of the nlmsg size, by adding the missing nla_total_size(). Cc: Patrick McHardy <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> l2tp: must disable bh before calling l2tp_xmit_skb() [ Upstream commit 455cc32bf128e114455d11ad919321ab89a2c312 ] François Cachereul made a very nice bug report and suspected the bh_lock_sock() / bh_unlok_sock() pair used in l2tp_xmit_skb() from process context was not good. This problem was added by commit 6af88da14ee284aaad6e4326da09a89191ab6165 ("l2tp: Fix locking in l2tp_core.c"). l2tp_eth_dev_xmit() runs from BH context, so we must disable BH from other l2tp_xmit_skb() users. [ 452.060011] BUG: soft lockup - CPU#1 stuck for 23s! [accel-pppd:6662] [ 452.061757] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core pppoe pppox ppp_generic slhc ipv6 ext3 mbcache jbd virtio_balloon xfs exportfs dm_mod virtio_blk ata_generic virtio_net floppy ata_piix libata virtio_pci virtio_ring virtio [last unloaded: scsi_wait_scan] [ 452.064012] CPU 1 [ 452.080015] BUG: soft lockup - CPU#2 stuck for 23s! [accel-pppd:6643] [ 452.080015] CPU 2 [ 452.080015] [ 452.080015] Pid: 6643, comm: accel-pppd Not tainted 3.2.46.mini #1 Bochs Bochs [ 452.080015] RIP: 0010:[<ffffffff81059f6c>] [<ffffffff81059f6c>] do_raw_spin_lock+0x17/0x1f [ 452.080015] RSP: 0018:ffff88007125fc18 EFLAGS: 00000293 [ 452.080015] RAX: 000000000000aba9 RBX: ffffffff811d0703 RCX: 0000000000000000 [ 452.080015] RDX: 00000000000000ab RSI: ffff8800711f6896 RDI: ffff8800745c8110 [ 452.080015] RBP: ffff88007125fc18 R08: 0000000000000020 R09: 0000000000000000 [ 452.080015] R10: 0000000000000000 R11: 0000000000000280 R12: 0000000000000286 [ 452.080015] R13: 0000000000000020 R14: 0000000000000240 R15: 0000000000000000 [ 452.080015] FS: 00007fdc0cc24700(0000) GS:ffff8800b6f00000(0000) knlGS:0000000000000000 [ 452.080015] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 452.080015] CR2: 00007fdb054899b8 CR3: 0000000074404000 CR4: 00000000000006a0 [ 452.080015] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 452.080015] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 452.080015] Process accel-pppd (pid: 6643, threadinfo ffff88007125e000, task ffff8800b27e6dd0) [ 452.080015] Stack: [ 452.080015] ffff88007125fc28 ffffffff81256559 ffff88007125fc98 ffffffffa01b2bd1 [ 452.080015] ffff88007125fc58 000000000000000c 00000000029490d0 0000009c71dbe25e [ 452.080015] 000000000000005c 000000080000000e 0000000000000000 ffff880071170600 [ 452.080015] Call Trace: [ 452.080015] [<ffffffff81256559>] _raw_spin_lock+0xe/0x10 [ 452.080015] [<ffffffffa01b2bd1>] l2tp_xmit_skb+0x189/0x4ac [l2tp_core] [ 452.080015] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp] [ 452.080015] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24 [ 452.080015] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6 [ 452.080015] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616 [ 452.080015] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c [ 452.080015] [<ffffffff810bbd21>] ? fget_light+0x75/0x89 [ 452.080015] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56 [ 452.080015] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b [ 452.080015] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b [ 452.080015] Code: 81 48 89 e5 72 0c 31 c0 48 81 ff 45 66 25 81 0f 92 c0 5d c3 55 b8 00 01 00 00 48 89 e5 f0 66 0f c1 07 0f b6 d4 38 d0 74 06 f3 90 <8a> 07 eb f6 5d c3 90 90 55 48 89 e5 9c 58 0f 1f 44 00 00 5d c3 [ 452.080015] Call Trace: [ 452.080015] [<ffffffff81256559>] _raw_spin_lock+0xe/0x10 [ 452.080015] [<ffffffffa01b2bd1>] l2tp_xmit_skb+0x189/0x4ac [l2tp_core] [ 452.080015] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp] [ 452.080015] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24 [ 452.080015] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6 [ 452.080015] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616 [ 452.080015] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c [ 452.080015] [<ffffffff810bbd21>] ? fget_light+0x75/0x89 [ 452.080015] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56 [ 452.080015] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b [ 452.080015] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b [ 452.064012] [ 452.064012] Pid: 6662, comm: accel-pppd Not tainted 3.2.46.mini #1 Bochs Bochs [ 452.064012] RIP: 0010:[<ffffffff81059f6e>] [<ffffffff81059f6e>] do_raw_spin_lock+0x19/0x1f [ 452.064012] RSP: 0018:ffff8800b6e83ba0 EFLAGS: 00000297 [ 452.064012] RAX: 000000000000aaa9 RBX: ffff8800b6e83b40 RCX: 0000000000000002 [ 452.064012] RDX: 00000000000000aa RSI: 000000000000000a RDI: ffff8800745c8110 [ 452.064012] RBP: ffff8800b6e83ba0 R08: 000000000000c802 R09: 000000000000001c [ 452.064012] R10: ffff880071096c4e R11: 0000000000000006 R12: ffff8800b6e83b18 [ 452.064012] R13: ffffffff8125d51e R14: ffff8800b6e83ba0 R15: ffff880072a589c0 [ 452.064012] FS: 00007fdc0b81e700(0000) GS:ffff8800b6e80000(0000) knlGS:0000000000000000 [ 452.064012] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 452.064012] CR2: 0000000000625208 CR3: 0000000074404000 CR4: 00000000000006a0 [ 452.064012] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 452.064012] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 452.064012] Process accel-pppd (pid: 6662, threadinfo ffff88007129a000, task ffff8800744f7410) [ 452.064012] Stack: [ 452.064012] ffff8800b6e83bb0 ffffffff81256559 ffff8800b6e83bc0 ffffffff8121c64a [ 452.064012] ffff8800b6e83bf0 ffffffff8121ec7a ffff880072a589c0 ffff880071096c62 [ 452.064012] 0000000000000011 ffffffff81430024 ffff8800b6e83c80 ffffffff8121f276 [ 452.064012] Call Trace: [ 452.064012] <IRQ> [ 452.064012] [<ffffffff81256559>] _raw_spin_lock+0xe/0x10 [ 452.064012] [<ffffffff8121c64a>] spin_lock+0x9/0xb [ 452.064012] [<ffffffff8121ec7a>] udp_queue_rcv_skb+0x186/0x269 [ 452.064012] [<ffffffff8121f276>] __udp4_lib_rcv+0x297/0x4ae [ 452.064012] [<ffffffff8121c178>] ? raw_rcv+0xe9/0xf0 [ 452.064012] [<ffffffff8121f4a7>] udp_rcv+0x1a/0x1c [ 452.064012] [<ffffffff811fe385>] ip_local_deliver_finish+0x12b/0x1a5 [ 452.064012] [<ffffffff811fe54e>] ip_local_deliver+0x53/0x84 [ 452.064012] [<ffffffff811fe1d0>] ip_rcv_finish+0x2bc/0x2f3 [ 452.064012] [<ffffffff811fe78f>] ip_rcv+0x210/0x269 [ 452.064012] [<ffffffff8101911e>] ? kvm_clock_get_cycles+0x9/0xb [ 452.064012] [<ffffffff811d88cd>] __netif_receive_skb+0x3a5/0x3f7 [ 452.064012] [<ffffffff811d8eba>] netif_receive_skb+0x57/0x5e [ 452.064012] [<ffffffff811cf30f>] ? __netdev_alloc_skb+0x1f/0x3b [ 452.064012] [<ffffffffa0049126>] virtnet_poll+0x4ba/0x5a4 [virtio_net] [ 452.064012] [<ffffffff811d9417>] net_rx_action+0x73/0x184 [ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core] [ 452.064012] [<ffffffff810343b9>] __do_softirq+0xc3/0x1a8 [ 452.064012] [<ffffffff81013b56>] ? ack_APIC_irq+0x10/0x12 [ 452.064012] [<ffffffff81256559>] ? _raw_spin_lock+0xe/0x10 [ 452.064012] [<ffffffff8125e0ac>] call_softirq+0x1c/0x26 [ 452.064012] [<ffffffff81003587>] do_softirq+0x45/0x82 [ 452.064012] [<ffffffff81034667>] irq_exit+0x42/0x9c [ 452.064012] [<ffffffff8125e146>] do_IRQ+0x8e/0xa5 [ 452.064012] [<ffffffff8125676e>] common_interrupt+0x6e/0x6e [ 452.064012] <EOI> [ 452.064012] [<ffffffff810b82a1>] ? kfree+0x8a/0xa3 [ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core] [ 452.064012] [<ffffffffa01b2c25>] ? l2tp_xmit_skb+0x1dd/0x4ac [l2tp_core] [ 452.064012] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp] [ 452.064012] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24 [ 452.064012] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6 [ 452.064012] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616 [ 452.064012] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c [ 452.064012] [<ffffffff810bbd21>] ? fget_light+0x75/0x89 [ 452.064012] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56 [ 452.064012] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b [ 452.064012] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b [ 452.064012] Code: 89 e5 72 0c 31 c0 48 81 ff 45 66 25 81 0f 92 c0 5d c3 55 b8 00 01 00 00 48 89 e5 f0 66 0f c1 07 0f b6 d4 38 d0 74 06 f3 90 8a 07 <eb> f6 5d c3 90 90 55 48 89 e5 9c 58 0f 1f 44 00 00 5d c3 55 48 [ 452.064012] Call Trace: [ 452.064012] <IRQ> [<ffffffff81256559>] _raw_spin_lock+0xe/0x10 [ 452.064012] [<ffffffff8121c64a>] spin_lock+0x9/0xb [ 452.064012] [<ffffffff8121ec7a>] udp_queue_rcv_skb+0x186/0x269 [ 452.064012] [<ffffffff8121f276>] __udp4_lib_rcv+0x297/0x4ae [ 452.064012] [<ffffffff8121c178>] ? raw_rcv+0xe9/0xf0 [ 452.064012] [<ffffffff8121f4a7>] udp_rcv+0x1a/0x1c [ 452.064012] [<ffffffff811fe385>] ip_local_deliver_finish+0x12b/0x1a5 [ 452.064012] [<ffffffff811fe54e>] ip_local_deliver+0x53/0x84 [ 452.064012] [<ffffffff811fe1d0>] ip_rcv_finish+0x2bc/0x2f3 [ 452.064012] [<ffffffff811fe78f>] ip_rcv+0x210/0x269 [ 452.064012] [<ffffffff8101911e>] ? kvm_clock_get_cycles+0x9/0xb [ 452.064012] [<ffffffff811d88cd>] __netif_receive_skb+0x3a5/0x3f7 [ 452.064012] [<ffffffff811d8eba>] netif_receive_skb+0x57/0x5e [ 452.064012] [<ffffffff811cf30f>] ? __netdev_alloc_skb+0x1f/0x3b [ 452.064012] [<ffffffffa0049126>] virtnet_poll+0x4ba/0x5a4 [virtio_net] [ 452.064012] [<ffffffff811d9417>] net_rx_action+0x73/0x184 [ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core] [ 452.064012] [<ffffffff810343b9>] __do_softirq+0xc3/0x1a8 [ 452.064012] [<ffffffff81013b56>] ? ack_APIC_irq+0x10/0x12 [ 452.064012] [<ffffffff81256559>] ? _raw_spin_lock+0xe/0x10 [ 452.064012] [<ffffffff8125e0ac>] call_softirq+0x1c/0x26 [ 452.064012] [<ffffffff81003587>] do_softirq+0x45/0x82 [ 452.064012] [<ffffffff81034667>] irq_exit+0x42/0x9c [ 452.064012] [<ffffffff8125e146>] do_IRQ+0x8e/0xa5 [ 452.064012] [<ffffffff8125676e>] common_interrupt+0x6e/0x6e [ 452.064012] <EOI> [<ffffffff810b82a1>] ? kfree+0x8a/0xa3 [ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core] [ 452.064012] [<ffffffffa01b2c25>] ? l2tp_xmit_skb+0x1dd/0x4ac [l2tp_core] [ 452.064012] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp] [ 452.064012] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24 [ 452.064012] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6 [ 452.064012] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616 [ 452.064012] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c [ 452.064012] [<ffffffff810bbd21>] ? fget_light+0x75/0x89 [ 452.064012] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56 [ 452.064012] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b [ 452.064012] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b Reported-by: François Cachereul <[email protected]> Tested-by: François Cachereul <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Cc: James Chapman <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> farsync: fix info leak in ioctl [ Upstream commit 96b340406724d87e4621284ebac5e059d67b2194 ] The fst_get_iface() code fails to initialize the two padding bytes of struct sync_serial_settings after the ->loopback member. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> unix_diag: fix info leak [ Upstream commit 6865d1e834be84ddd5808d93d5035b492346c64a ] When filling the netlink message we miss to wipe the pad field, therefore leak one byte of heap memory to userland. Fix this by setting pad to 0. Signed-off-by: Mathias Krause <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> connector: use nlmsg_len() to check message length [ Upstream commit 162b2bedc084d2d908a04c93383ba02348b648b0 ] The current code tests the length of the whole netlink message to be at least as long to fit a cn_msg. This is wrong as nlmsg_len includes the length of the netlink message header. Use nlmsg_len() instead to fix this "off-by-NLMSG_HDRLEN" size check. Signed-off-by: Mathias Krause <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> bnx2x: record rx queue for LRO packets [ Upstream commit 60e66fee56b2256dcb1dc2ea1b2ddcb6e273857d ] RPS support is kind of broken on bnx2x, because only non LRO packets get proper rx queue information. This triggers reorders, as it seems bnx2x like to generate a non LRO packet for segment including TCP PUSH flag : (this might be pure coincidence, but all the reorders I've seen involve segments with a PUSH) 11:13:34.335847 IP A > B: . 415808:447136(31328) ack 1 win 457 <nop,nop,timestamp 3789336 3985797> 11:13:34.335992 IP A > B: . 447136:448560(1424) ack 1 win 457 <nop,nop,timestamp 3789336 3985797> 11:13:34.336391 IP A > B: . 448560:479888(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985797> 11:13:34.336425 IP A > B: P 511216:512640(1424) ack 1 win 457 <nop,nop,timestamp 3789337 3985798> 11:13:34.336423 IP A > B: . 479888:511216(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985798> 11:13:34.336924 IP A > B: . 512640:543968(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985798> 11:13:34.336963 IP A > B: . 543968:575296(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985798> We must call skb_record_rx_queue() to properly give to RPS (and more generally for TX queue selection on forward path) the receive queue information. Similar fix is needed for skb_mark_napi_id(), but will be handled in a separate patch to ease stable backports. Signed-off-by: Eric Dumazet <[email protected]> Cc: Willem de Bruijn <[email protected]> Cc: Eilon Greenstein <[email protected]> Acked-by: Dmitry Kravkov <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: dst: provide accessor function to dst->xfrm [ Upstream commit e87b3998d795123b4139bc3f25490dd236f68212 ] dst->xfrm is conditionally defined. Provide accessor funtion that is always available. Signed-off-by: Vlad Yasevich <[email protected]> Acked-by: Neil Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> sctp: Use software crc32 checksum when xfrm transform will happen. [ Upstream commit 27127a82561a2a3ed955ce207048e1b066a80a2a ] igb/ixgbe have hardware sctp checksum support, when this feature is enabled and also IPsec is armed to protect sctp traffic, ugly things happened as xfrm_output checks CHECKSUM_PARTIAL to do checksum operation(sum every thing up and pack the 16bits result in the checksum field). The result is fail establishment of sctp communication. Signed-off-by: Fan Du <[email protected]> Cc: Neil Horman <[email protected]> Cc: Steffen Klassert <[email protected]> Signed-off-by: Vlad Yasevich <[email protected]> Acked-by: Neil Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> sctp: Perform software checksum if packet has to be fragmented. [ Upstream commit d2dbbba77e95dff4b4f901fee236fef6d9552072 ] IP/IPv6 fragmentation knows how to compute only TCP/UDP checksum. This causes problems if SCTP packets has to be fragmented and ipsummed has been set to PARTIAL due to checksum offload support. This condition can happen when retransmitting after MTU discover, or when INIT or other control chunks are larger then MTU. Check for the rare fragmentation condition in SCTP and use software checksum calculation in this case. CC: Fan Du <[email protected]> Signed-off-by: Vlad Yasevich <[email protected]> Acked-by: Neil Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> wanxl: fix info leak in ioctl [ Upstream commit 2b13d06c9584b4eb773f1e80bbaedab9a1c344e1 ] The wanxl_ioctl() code fails to initialize the two padding bytes of struct sync_serial_settings after the ->loopback member. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Salva Peiró <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: unix: inherit SOCK_PASS{CRED, SEC} flags from socket to fix race [ Upstream commit 90c6bd34f884cd9cee21f1d152baf6c18bcac949 ] In the case of credentials passing in unix stream sockets (dgram sockets seem not affected), we get a rather sparse race after commit 16e5726 ("af_unix: dont send SCM_CREDENTIALS by default"). We have a stream server on receiver side that requests credential passing from senders (e.g. nc -U). Since we need to set SO_PASSCRED on each spawned/accepted socket on server side to 1 first (as it's not inherited), it can happen that in the time between accept() and setsockopt() we get interrupted, the sender is being scheduled and continues with passing data to our receiver. At that time SO_PASSCRED is neither set on sender nor receiver side, hence in cmsg's SCM_CREDENTIALS we get eventually pid:0, uid:65534, gid:65534 (== overflow{u,g}id) instead of what we actually would like to see. On the sender side, here nc -U, the tests in maybe_add_creds() invoked through unix_stream_sendmsg() would fail, as at that exact time, as mentioned, the sender has neither SO_PASSCRED on his side nor sees it on the server side, and we have a valid 'other' socket in place. Thus, sender believes it would just look like a normal connection, not needing/requesting SO_PASSCRED at that time. As reverting 16e5726 would not be an option due to the significant performance regression reported when having creds always passed, one way/trade-off to prevent that would be to set SO_PASSCRED on the listener socket and allow inheriting these flags to the spawned socket on server side in accept(). It seems also logical to do so if we'd tell the listener socket to pass those flags onwards, and would fix the race. Before, strace: recvmsg(4, {msg_name(0)=NULL, msg_iov(1)=[{"blub\n", 4096}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=0, uid=65534, gid=65534}}, msg_flags=0}, 0) = 5 After, strace: recvmsg(4, {msg_name(0)=NULL, msg_iov(1)=[{"blub\n", 4096}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=11580, uid=1000, gid=1000}}, msg_flags=0}, 0) = 5 Signed-off-by: Daniel Borkmann <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Eric W. Biederman <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: fix cipso packet validation when !NETLABEL [ Upstream commit f2e5ddcc0d12f9c4c7b254358ad245c9dddce13b ] When CONFIG_NETLABEL is disabled, the cipso_v4_validate() function could loop forever in the main loop if opt[opt_iter +1] == 0, this will causing a kernel crash in an SMP system, since the CPU executing this function will stall /not respond to IPIs. This problem can be reproduced by running the IP Stack Integrity Checker (http://isic.sourceforge.net) using the following command on a Linux machine connected to DUT: "icmpsic -s rand -d <DUT IP address> -r 123456" wait (1-2 min) Signed-off-by: Seif Mazareeb <[email protected]> Acked-by: Paul Moore <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> inet: fix possible memory corruption with UDP_CORK and UFO [ This is a simplified -stable version of a set of upstream commits. ] This is a replacement patch only for stable which does fix the problems handled by the following two commits in -net: "ip_output: do skb ufo init for peeked non ufo skb as well" (e93b7d748be887cd7639b113ba7d7ef792a7efb9) "ip6_output: do skb ufo init for peeked non ufo skb as well" (c547dbf55d5f8cf615ccc0e7265e98db27d3fb8b) Three frames are written on a corked udp socket for which the output netdevice has UFO enabled. If the first and third frame are smaller than the mtu and the second one is bigger, we enqueue the second frame with skb_append_datato_frags without initializing the gso fields. This leads to the third frame appended regulary and thus constructing an invalid skb. This fixes the problem by always using skb_append_datato_frags as soon as the first frag got enqueued to the skb without marking the packet as SKB_GSO_UDP. The problem with only two frames for ipv6 was fixed by "ipv6: udp packets following an UFO enqueued packet need also be handled by UFO" (2811ebac2521ceac84f2bdae402455baa6a7fb47). Cc: Jiri Pirko <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: David Miller <[email protected]> Signed-off-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> davinci_emac.c: Fix IFF_ALLMULTI setup [ Upstream commit d69e0f7ea95fef8059251325a79c004bac01f018 ] When IFF_ALLMULTI flag is set on interface and IFF_PROMISC isn't, emac_dev_mcast_set should only enable RX of multicasts and reset MACHASH registers. It does this, but afterwards it either sets up multicast MACs filtering or disables RX of multicasts and resets MACHASH registers again, rendering IFF_ALLMULTI flag useless. This patch fixes emac_dev_mcast_set, so that multicast MACs filtering and disabling of RX of multicasts are skipped when IFF_ALLMULTI flag is set. Tested with kernel 2.6.37. Signed-off-by: Mariusz Ceier <[email protected]> Acked-by: Mugunthan V N <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ext3: return 32/64-bit dir name hash according to usage type commit d7dab39b6e16d5eea78ed3c705d2a2d0772b4f06 upstream. This is based on commit d1f5273e9adb40724a85272f248f210dc4ce919a ext4: return 32/64-bit dir name hash according to usage type by Fan Yong <[email protected]> Traditionally ext2/3/4 has returned a 32-bit hash value from llseek() to appease NFSv2, which can only handle a 32-bit cookie for seekdir() and telldir(). However, this causes problems if there are 32-bit hash collisions, since the NFSv2 server can get stuck resending the same entries from the directory repeatedly. Allow ext3 to return a full 64-bit hash (both major and minor) for telldir to decrease the chance of hash collisions. This patch does implement a new ext3_dir_llseek op, because with 64-bit hashes, nfs will attempt to seek to a hash "offset" which is much larger than ext3's s_maxbytes. So for dx dirs, we call generic_file_llseek_size() with the appropriate max hash value as the maximum seekable size. Otherwise we just pass through to generic_file_llseek(). Patch-updated-by: Bernd Schubert <[email protected]> Patch-updated-by: Eric Sandeen <[email protected]> (blame us if something is not correct) Signed-off-by: Eric Sandeen <[email protected]> Signed-off-by: Jan Kara <[email protected]> Cc: Benjamin LaHaise <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> dm snapshot: fix data corruption commit e9c6a182649f4259db704ae15a91ac820e63b0ca upstream. This patch fixes a particular type of data corruption that has been encountered when loading a snapshot's metadata from disk. When we allocate a new chunk in persistent_prepare, we increment ps->next_free and we make sure that it doesn't point to a metadata area by further incrementing it if necessary. When we load metadata from disk on device activation, ps->next_free is positioned after the last used data chunk. However, if this last used data chunk is followed by a metadata area, ps->next_free is positioned erroneously to the metadata area. A newly-allocated chunk is placed at the same location as the metadata area, resulting in data or metadata corruption. This patch changes the code so that ps->next_free skips the metadata area when metadata are loaded in function read_exceptions. The patch also moves a piece of code from persistent_prepare_exception to a separate function skip_metadata to avoid code duplication. CVE-2013-4299 Signed-off-by: Mikulas Patocka <[email protected]> Cc: Mike Snitzer <[email protected]> Signed-off-by: Alasdair G Kergon <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> writeback: fix negative bdi max pause commit e3b6c655b91e01a1dade056cfa358581b47a5351 upstream. Toralf runs trinity on UML/i386. After some time it hangs and the last message line is BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521] It's found that pages_dirtied becomes very large. More than 1000000000 pages in this case: period = HZ * pages_dirtied / task_ratelimit; BUG_ON(pages_dirtied > 2000000000); BUG_ON(pages_dirtied > 1000000000); <--------- UML debug printf shows that we got negative pause here: ick: pause : -984 ick: pages_dirtied : 0 ick: task_ratelimit: 0 pause: + if (pause < 0) { + extern int printf(char *, ...); + printf("ick : pause : %li\n", pause); + printf("ick: pages_dirtied : %lu\n", pages_dirtied); + printf("ick: task_ratelimit: %lu\n", task_ratelimit); + BUG_ON(1); + } trace_balance_dirty_pages(bdi, Since pause is bounded by [min_pause, max_pause] where min_pause is also bounded by max_pause. It's suspected and demonstrated that the max_pause calculation goes wrong: ick: pause : -717 ick: min_pause : -177 ick: max_pause : -717 ick: pages_dirtied : 14 ick: task_ratelimit: 0 The problem lies in the two "long = unsigned long" assignments in bdi_max_pause() which might go negative if the highest bit is 1, and the min_t(long, ...) check failed to protect it falling under 0. Fix all of them by using "unsigned long" throughout the function. Signed-off-by: Fengguang Wu <[email protected]> Reported-by: Toralf Förster <[email protected]> Tested-by: Toralf Förster <[email protected]> Reviewed-by: Jan Kara <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> wireless: radiotap: fix parsing buffer overrun commit f5563318ff1bde15b10e736e97ffce13be08bc1a upstream. When parsing an invalid radiotap header, the parser can overrun the buffer that is passed in because it doesn't correctly check 1) the minimum radiotap header size 2) the space for extended bitmaps The first issue doesn't affect any in-kernel user as they all check the minimum size before calling the radiotap function. The second issue could potentially affect the kernel if an skb is passed in that consists only of the radiotap header with a lot of extended bitmaps that extend past the SKB. In that case a read-only buffer overrun by at most 4 bytes is possible. Fix this by adding the appropriate checks to the parser. Reported-by: Evan Huus <[email protected]> Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> USB: serial: ti_usb_3410_5052: add Abbott strip port ID to combined table as well. commit c9d09dc7ad106492c17c587b6eeb99fe3f43e522 upstream. Without this change, the USB cable for Freestyle Option and compatible glucometers will not be detected by the driver. Signed-off-by: Diego Elio Pettenò <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> USB: serial: option: add support for Inovia SEW858 device commit f4c19b8e165cff1a6607c21f8809441d61cab7ec upstream. This patch adds the device id for the Inovia SEW858 device to the option driver. Reported-by: Pavel Parkhomenko <[email protected]> Tested-by: Pavel Parkhomenko <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> usb: serial: option: blacklist Olivetti Olicard200 commit fd8573f5828873343903215f203f14dc82de397c upstream. Interface 6 of this device speaks QMI as per tests done by us. Credits go to Antonella for providing the hardware. Signed-off-by: Enrico Mioso <[email protected]> Signed-off-by: Antonella Pellizzari <[email protected]> Tested-by: Dan Williams <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> Linux 3.4.68 Signed-off-by: Chet Kener <[email protected]> USB: support new huawei devices in option.c commit d544db293a44a2a3b09feab7dbd59668b692de71 upstream. Add new supporting declarations to option.c, to support Huawei new devices with new bInterfaceSubClass value. Signed-off-by: fangxiaozhi <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> USB: quirks.c: add one device that cannot deal with suspension commit 4294bca7b423d1a5aa24307e3d112a04075e3763 upstream. The device is not responsive when resumed, unless it is reset. Signed-off-by: Oliver Neukum <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> USB: quirks: add touchscreen that is dazzeled by remote wakeup commit 614ced91fc6fbb5a1cdd12f0f1b6c9197d9f1350 upstream. The device descriptors are messed up after remote wakeup Signed-off-by: Oliver Neukum <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> USB: serial: ftdi_sio: add id for Z3X Box device commit e1466ad5b1aeda303f9282463d55798d2eda218c upstream. Custom VID/PID for Z3X Box device, popular tool for cellphone flashing. Signed-off-by: Alexey E. Kramarenko <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> mac80211: correctly close cancelled scans commit a754055a1296fcbe6f32de3a5eaca6efb2fd1865 upstream. __ieee80211_scan_completed is called from a worker. This means that the following flow is possible. * driver calls ieee80211_scan_completed * mac80211 cancels the scan (that is already complete) * __ieee80211_scan_completed runs When scan_work will finally run, it will see that the scan hasn't been aborted and might even trigger another scan on another band. This leads to a situation where cfg80211's scan is not done and no further scan can be issued. Fix this by setting a new flag when a HW scan is being cancelled so that no other scan will be triggered. Signed-off-by: Emmanuel Grumbach <[email protected]> Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> mac80211: update sta->last_rx on acked tx frames commit 0c5b93290b2f3c7a376567c03ae8d385b0e99851 upstream. When clients are idle for too long, hostapd sends nullfunc frames for probing. When those are acked by the client, the idle time needs to be updated. To make this work (and to avoid unnecessary probing), update sta->last_rx whenever an ACK was received for a tx packet. Only do this if the flag IEEE80211_HW_REPORTS_TX_ACK_STATUS is set. Signed-off-by: Felix Fietkau <[email protected]> Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> rtlwifi: rtl8192cu: Fix error in pointer arithmetic commit 9473ca6e920a3b9ca902753ce52833657f9221cc upstream. An error in calculating the offset in an skb causes the driver to read essential device info from the wrong locations. The main effect is that automatic gain calculations are nonsense. Signed-off-by: Mark Cave-Ayland <[email protected]> Signed-off-by: Larry Finger <[email protected]> Signed-off-by: John W. Linville <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> jfs: fix error path in ialloc commit 8660998608cfa1077e560034db81885af8e1e885 upstream. If insert_inode_locked() fails, we shouldn't be calling unlock_new_inode(). Signed-off-by: Dave Kleikamp <[email protected]> Tested-by: Michael L. Semon <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> can: flexcan: flexcan_chip_start: fix regression, mark one MB for TX and abort pending TX commit d5a7b406c529e4595ce03dc8f6dcf7fa36f106fa upstream. In patch 0d1862e can: flexcan: fix flexcan_chip_start() on imx6 the loop in flexcan_chip_start() that iterates over all mailboxes after the soft reset of the CAN core was removed. This loop put all mailboxes (even the ones marked as reserved 1...7) into EMPTY/INACTIVE mode. On mailboxes 8...63, this aborts any pending TX messages. After a cold boot there is random garbage in the mailboxes, which leads to spontaneous transmit of CAN frames during first activation. Further if the interface was disabled with a pending message (usually due to an error condition on the CAN bus), this message is retransmitted after enabling the interface again. This patch fixes the regression by: 1) Limiting the maximum number of used mailboxes to 8, 0...7 are used by the RX FIFO, 8 is used by TX. 2) Marking the TX mailbox as EMPTY/INACTIVE, so that any pending TX of that mailbox is aborted. Cc: Lothar Waßmann <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> libata: make ata_eh_qc_retry() bump scmd->allowed on bogus failures commit f13e220161e738c2710b9904dcb3cf8bb0bcce61 upstream. libata EH decrements scmd->retries when the command failed for reasons unrelated to the command itself so that, for example, commands aborted due to suspend / resume cycle don't get penalized; however, decrementing scmd->retries isn't enough for ATA passthrough commands. Without this fix, ATA passthrough commands are not resend to the drive, and no error is signalled to the caller because: - allowed retry count is 1 - ata_eh_qc_complete fill the sense data, so result is valid - sense data is filled with untouched ATA registers. Signed-off-by: Gwendal Grignou <[email protected]> Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> md: Fix skipping recovery for read-only arrays. commit 61e4947c99c4494336254ec540c50186d186150b upstream. Since: commit 7ceb17e87bde79d285a8b988cfed9eaeebe60b86 md: Allow devices to be re-added to a read-only array. spares are activated on a read-only array. In case of raid1 and raid10 personalities it causes that not-in-sync devices are marked in-sync without checking if recovery has been finished. If a read-only array is degraded and one of its devices is not in-sync (because the array has been only partially recovered) recovery will be skipped. This patch adds checking if recovery has been finished before marking a device in-sync for raid1 and raid10 personalities. In case of raid5 personality such condition is already present (at raid5.c:6029). Bug was introduced in 3.10 and causes data corruption. Signed-off-by: Pawel Baldysiak <[email protected]> Signed-off-by: Lukasz Dorau <[email protected]> Signed-off-by: NeilBrown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> clockevents: Sanitize ticks to nsec conversion commit 97b9410643475d6557d2517c2aff9fd2221141a9 upstream. Marc Kleine-Budde pointed out, that commit 77cc982 "clocksource: use clockevents_config_and_register() where possible" caused a regression for some of the converted subarchs. The reason is, that the clockevents core code converts the minimal hardware tick delta to a nanosecond value for core internal usage. This conversion is affected by integer math rounding loss, so the backwards conversion to hardware ticks will likely result in a value which is less than the configured hardware limitation. The affected subarchs used their own workaround (SIGH!) which got lost in the conversion. The solution for the issue at hand is simple: adding evt->mult - 1 to the shifted value before the integer divison in the core conversion function takes care of it. But this only works for the case where for the scaled math mult/shift pair "mult <= 1 << shift" is true. For the case where "mult > 1 << shift" we can apply the rounding add only for the minimum delta value to make sure that the backward conversion is not less than the given hardware limit. For the upper bound we need to omit the rounding add, because the backwards conversion is always larger than the original latch value. That would violate the upper bound of the hardware device. Though looking closer at the details of that function reveals another bogosity: The upper bounds check is broken as well. Checking for a resulting "clc" value greater than KTIME_MAX after the conversion is pointless. The conversion does: u64 clc = (latch << evt->shift) / evt->mult; So there is no sanity check for (latch << evt->shift) exceeding the 64bit boundary. The latch argument is "unsigned long", so on a 64bit arch the handed in argument could easily lead to an unnoticed shift overflow. With the above rounding fix applied the calculation before the divison is: u64 clc = (latch << evt->shift) + evt->mult - 1; So we need to make sure, that neither the shift nor the rounding add is overflowing the u64 boundary. [ukl: move assignment to rnd after eventually changing mult, fix build issue and correct comment with the right math] Signed-off-by: Thomas Gleixner <[email protected]> Cc: Russell King - ARM Linux <[email protected]> Cc: Marc Kleine-Budde <[email protected]> Cc: [email protected] Cc: Marc Pignat <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Ronald Wahl <[email protected]> Cc: LAK <[email protected]> Cc: Ludovic Desroches <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Uwe Kleine-König <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> parisc: Do not crash 64bit SMP kernels on machines with >= 4GB RAM commit 54e181e073fc1415e41917d725ebdbd7de956455 upstream. Since the beginning of the parisc-linux port, sometimes 64bit SMP kernels were not able to bring up other CPUs than the monarch CPU and instead crashed the kernel. The reason was unclear, esp. since it involved various machines (e.g. J5600, J6750 and SuperDome). Testing showed, that those crashes didn't happened when less than 4GB were installed, or if a 32bit Linux kernel was booted. In the end, the fix for those SMP problems is trivial: During the early phase of the initialization of the CPUs, including the monarch CPU, the PDC_PSW firmware function to enable WIDE (=64bit) mode is called. It's documented that this firmware function may clobber various registers, and one one of those possibly clobbered registers is %cr30 which holds the task thread info pointer. Now, if %cr30 would always have been clobbered, then this bug would have been detected much earlier. But lots of testing finally showed, that - at least for %cr30 - on some machines only the upper 32bits of the 64bit register suddenly turned zero after the firmware call. So, after finding the root cause, the explanation for the various crashes became clear: - On 32bit SMP Linux kernels all upper 32bit were zero, so we didn't faced this problem. - Monarch CPUs in 64bit mode always booted sucessfully, because the inital task thread info pointer was below 4GB. - Secondary CPUs booted sucessfully on machines with less than 4GB RAM because the upper 32bit were zero anyay. - Secondary CPus failed to boot if we had more than 4GB RAM and the task thread info pointer was located above the 4GB boundary. Finally, the patch to fix this problem is trivial by saving the %cr30 register before the firmware call and restoring it afterwards. Signed-off-by: Helge Deller <[email protected]> Signed-off-by: John David Anglin <[email protected]> Signed-off-by: Helge Deller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ALSA: hda - Add a fixup for ASUS N76VZ commit 6fc16e58adf50c0f1e4478538983fb5ff6f453d4 upstream. ASUS N76VZ needs the same fixup as N56VZ for supporting the boost speaker. Bugzilla: https://bugzilla.novell.com/show_bug.cgi?id=846529 Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ASoC: wm_hubs: Add missing break in hp_supply_event() commit 268ff14525edba31da29a12a9dd693cdd6a7872e upstream. Spotted by coverity CID 115170. Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ASoC: dapm: Fix source list debugfs outputs commit ff18620c2157671a8ee21ebb8e6a3520ea209b1f upstream. ... due to a copy & paste error. Spotted by coverity CID 710923. Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> staging: ozwpan: prevent overflow in oz_cdev_write() commit c2c65cd2e14ada6de44cb527e7f1990bede24e15 upstream. We need to check "count" so we don't overflow the ei->data buffer. Reported-by: Nico Golde <[email protected]> Reported-by: Fabian Yamaguchi <[email protected]> Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> Staging: bcm: info leak in ioctl commit 8d1e72250c847fa96498ec029891de4dc638a5ba upstream. The DevInfo.u32Reserved[] array isn't initialized so it leaks kernel information to user space. Reported-by: Nico Golde <[email protected]> Reported-by: Fabian Yamaguchi <[email protected]> Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> uml: check length in exitcode_proc_write() commit 201f99f170df14ba52ea4c52847779042b7a623b upstream. We don't cap the size of buffer from the user so we could write past the end of the array here. Only root can write to this file. Reported-by: Nico Golde <[email protected]> Reported-by: Fabian Yamaguchi <[email protected]> Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> xtensa: don't use alternate signal stack on threads commit cba9a90053e3b7973eff4f1946f33032e98eeed5 upstream. According to create_thread(3): "The new thread does not inherit the creating thread's alternate signal stack". Since commit f9a3879a (Fix sigaltstack corruption among cloned threads), current->sas_ss_size is set to 0 for cloned processes sharing VM with their parent. Don't use the (nonexistent) alternate signal stack in this case. This has been broken since commit 29c4dfd9 ([XTENSA] Remove non-rt signal handling). Fixes the SA_ONSTACK part of the nptl/tst-cancel20 test from uClibc. Signed-off-by: Baruch Siach <[email protected]> Signed-off-by: Max Filippov <[email protected]> Signed-off-by: Chris Zankel <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> lib/scatterlist.c: don't flush_kernel_dcache_page on slab page commit 3d77b50c5874b7e923be946ba793644f82336b75 upstream. Commit b1adaf65ba03 ("[SCSI] block: add sg buffer copy helper functions") introduces two sg buffer copy helpers, and calls flush_kernel_dcache_page() on pages in SG list after these pages are written to. Unfortunately, the commit may introduce a potential bug: - Before sending some SCSI commands, kmalloc() buffer may be passed to block layper, so flush_kernel_dcache_page() can see a slab page finally - According to cachetlb.txt, flush_kernel_dcache_page() is only called on "a user page", which surely can't be a slab page. - ARCH's implementation of flush_kernel_dcache_page() may use page mapping information to do optimization so page_mapping() will see the slab page, then VM_BUG_ON() is triggered. Aaro Koskinen reported the bug on ARM/kirkwood when DEBUG_VM is enabled, and this patch fixes the bug by adding test of '!PageSlab(miter->page)' before calling flush_kernel_dcache_page(). Signed-off-by: Ming Lei <[email protected]> Reported-by: Aaro Koskinen <[email protected]> Tested-by: Simon Baatz <[email protected]> Cc: Russell King - ARM Linux <[email protected]> Cc: Will Deacon <[email protected]> Cc: Aaro Koskinen <[email protected]> Acked-by: Catalin Marinas <[email protected]> Cc: FUJITA Tomonori <[email protected]> Cc: Tejun Heo <[email protected]> Cc: "James E.J. Bottomley" <[email protected]> Cc: Jens Axboe <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> aacraid: missing capable() check in compat ioctl commit f856567b930dfcdbc3323261bf77240ccdde01f5 upstream. In commit d496f94d22d1 ('[SCSI] aacraid: fix security weakness') we added a check on CAP_SYS_RAWIO to the ioctl. The compat ioctls need the check as well. Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> mm: fix aio performance regression for database caused by THP commit 7cb2ef56e6a8b7b368b2e883a0a47d02fed66911 upstream. I am working with a tool that simulates oracle database I/O workload. This tool (orion to be specific - <http://docs.oracle.com/cd/E11882_01/server.112/e16638/iodesign.htm#autoId24>) allocates hugetlbfs pages using shmget() with SHM_HUGETLB flag. It then does aio into these pages from flash disks using various common block sizes used by database. I am looking at performance with two of the most common block sizes - 1M and 64K. aio performance with these two block sizes plunged after Transparent HugePages was introduced in the kernel. Here are performance numbers: pre-THP 2.6.39 3.11-rc5 1M read 8384 MB/s 5629 MB/s 6501 MB/s 64K read 7867 MB/s 4576 MB/s 4251 MB/s I have narrowed the performance impact down to the overheads introduced by THP in __get_page_tail() and put_compound_page() routines. perf top shows >40% of cycles being spent in these two routines. Every time direct I/O to hugetlbfs pages starts, kernel calls get_page() to grab a reference to the pages and calls put_page() when I/O completes to put the reference away. THP introduced significant amount of locking overhead to get_page() and put_page() when dealing with compound pages because hugepages can be split underneath get_page() and put_page(). It added this overhead irrespective of whether it is dealing with hugetlbfs pages or transparent hugepages. This resulted in 20%-45% drop in aio performance when using hugetlbfs pages. Since hugetlbfs pages can not be split, there is no reason to go through all the locking overhead for these pages from what I can see. I added code to __get_page_tail() and put_compound_page() to bypass all the locking code when working with hugetlbfs pages. This improved performance significantly. Performance numbers with this patch: pre-THP 3.11-rc5 3.11-rc5 + Patch 1M read 8384 MB/s 6501 MB/s 8371 MB/s 64K read 7867 MB/s 4251 MB/s 6510 MB/s Performance with 64K read is still lower than what it was before THP, but still a 53% improvement. It does mean there is more work to be done but I will take a 53% improvement for now. Please take a look at the following patch and let me know if it looks reasonable. [[email protected]: tweak comments] Signed-off-by: Khalid Aziz <[email protected]> Cc: Pravin B Shelar <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Andi Kleen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> drm: Prevent overwriting from userspace underallocating core ioctl structs commit b062672e305ce071f21eb9e18b102c2a430e0999 upstream. Apply the protections from commit 1b2f1489633888d4a06028315dc19d65768a1c05 Author: Dave Airlie <[email protected]> Date: Sat Aug 14 20:20:34 2010 +1000 drm: block userspace under allocating buffer and having drivers overwrite it (v2) to the core ioctl structs as well, for we found one instance where there is a 32-/64-bit size mismatch and were guilty of writing beyond the end of the user's buffer. Signed-off-by: Chris Wilson <[email protected]> Cc: Dave Airlie <[email protected]> Reviewed-by: Ville Syrjälä <[email protected]> Cc: [email protected] Signed-off-by: Dave Airlie <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> drm/radeon/atom: workaround vbios bug in transmitter table on rs780 commit c23632d4e57c0dd20bf50eca08fa0eb8ad3ff680 upstream. Some rs780 asics seem to be affected as well. See: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=91f3a6aaf280294b07c05dfe606e6c27b7ba3c72 Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=60791 Signed-off-by: Alex Deucher <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> Linux 3.4.69 Signed-off-by: Chet Kener <[email protected]> xen-netback: use jiffies_64 value to calculate credit timeout [ Upstream commit 059dfa6a93b779516321e5112db9d7621b1367ba ] time_after_eq() only works if the delta is < MAX_ULONG/2. For a 32bit Dom0, if netfront sends packets at a very low rate, the time between subsequent calls to tx_credit_exceeded() may exceed MAX_ULONG/2 and the test for timer_after_eq() will be incorrect. Credit will not be replenished and the guest may become unable to send packets (e.g., if prior to the long gap, all credit was exhausted). Use jiffies_64 variant to mitigate this problem for 32bit Dom0. Suggested-by: Jan Beulich <[email protected]> Signed-off-by: Wei Liu <[email protected]> Reviewed-by: David Vrabel <[email protected]> Cc: Ian Campbell <[email protected]> Cc: Jason Luan <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> PCI: fix truncation of resource size to 32 bits commit d6776e6d5c2f8db0252f447b09736075e1bbe387 upstream. _pci_assign_resource() took an int "size" argument, which meant that sizes larger than 4GB were truncated. Change type to resource_size_t. [bhelgaas: changelog] Signed-off-by: Nikhil P Rao <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Cc: Jiri Slaby <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> USB: add new zte 3g-dongle's pid to option.c commit 0636fc507a976cdc40f21bdbcce6f0b98ff1dfe9 upstream. Signed-off-by: Rui li <[email protected]> Cc: stable <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ALSA: hda - Move one-time init codes from generic_hdmi_init() commit 8b8d654b55648561287bd8baca0f75f964a17038 upstream. The codes to initialize work struct or create a proc interface should be called only once and never although it's called many times through the init callback. Move that stuff into patch_generic_hdmi() so that it's called only once. Signed-off-by: Takashi Iwai <[email protected]> BugLink: https://bugs.launchpad.net/bugs/1212160 Signed-off-by: David Henningsson <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> netfilter: nf_ct_sip: don't drop packets with offsets pointing outside the packet commit 3a7b…
DerRomtester
referenced
this issue
in DerRomtester/one_plus_one
Jan 16, 2015
net/ping: handle protocol mismatching scenario ping_lookup() may return a wrong sock if sk_buff's and sock's protocols dont' match. For example, sk_buff's protocol is ETH_P_IPV6, but sock's sk_family is AF_INET, in that case, if sk->sk_bound_dev_if is zero, a wrong sock will be returned. the fix is to "continue" the searching, if no matching, return NULL. [cherry-pick of net 91a0b603469069cdcce4d572b7525ffc9fd352a6] Bug: 18512516 Change-Id: I520223ce53c0d4e155c37d6b65a03489cc7fd494 Cc: "David S. Miller" <[email protected]> Cc: Alexey Kuznetsov <[email protected]> Cc: James Morris <[email protected]> Cc: Hideaki YOSHIFUJI <[email protected]> Cc: Patrick McHardy <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Jane Zhou <[email protected]> Signed-off-by: Yiwei Zhao <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Lorenzo Colitti <[email protected]> tcp: must unclone packets before mangling them [ Upstream commit c52e2421f7368fd36cbe330d2cf41b10452e39a9 ] TCP stack should make sure it owns skbs before mangling them. We had various crashes using bnx2x, and it turned out gso_size was cleared right before bnx2x driver was populating TC descriptor of the _previous_ packet send. TCP stack can sometime retransmit packets that are still in Qdisc. Of course we could make bnx2x driver more robust (using ACCESS_ONCE(shinfo->gso_size) for example), but the bug is TCP stack. We have identified two points where skb_unclone() was needed. This patch adds a WARN_ON_ONCE() to warn us if we missed another fix of this kind. Kudos to Neal for finding the root cause of this bug. Its visible using small MSS. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: Neal Cardwell <[email protected]> Cc: Yuchung Cheng <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> tcp: do not forget FIN in tcp_shifted_skb() [ Upstream commit 5e8a402f831dbe7ee831340a91439e46f0d38acd ] Yuchung found following problem : There are bugs in the SACK processing code, merging part in tcp_shift_skb_data(), that incorrectly resets or ignores the sacked skbs FIN flag. When a receiver first SACK the FIN sequence, and later throw away ofo queue (e.g., sack-reneging), the sender will stop retransmitting the FIN flag, and hangs forever. Following packetdrill test can be used to reproduce the bug. $ cat sack-merge-bug.pkt `sysctl -q net.ipv4.tcp_fack=0` // Establish a connection and send 10 MSS. 0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 +.000 bind(3, ..., ...) = 0 +.000 listen(3, 1) = 0 +.050 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7> +.000 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 6> +.001 < . 1:1(0) ack 1 win 1024 +.000 accept(3, ..., ...) = 4 +.100 write(4, ..., 12000) = 12000 +.000 shutdown(4, SHUT_WR) = 0 +.000 > . 1:10001(10000) ack 1 +.050 < . 1:1(0) ack 2001 win 257 +.000 > FP. 10001:12001(2000) ack 1 +.050 < . 1:1(0) ack 2001 win 257 <sack 10001:11001,nop,nop> +.050 < . 1:1(0) ack 2001 win 257 <sack 10001:12002,nop,nop> // SACK reneg +.050 < . 1:1(0) ack 12001 win 257 +0 %{ print "unacked: ",tcpi_unacked }% +5 %{ print "" }% First, a typo inverted left/right of one OR operation, then code forgot to advance end_seq if the merged skb carried FIN. Bug was added in 2.6.29 by commit 832d11c5cd076ab ("tcp: Try to restore large SKBs while SACK processing") Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: Yuchung Cheng <[email protected]> Acked-by: Neal Cardwell <[email protected]> Cc: Ilpo Järvinen <[email protected]> Acked-by: Ilpo Järvinen <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: do not call sock_put() on TIMEWAIT sockets [ Upstream commit 80ad1d61e72d626e30ebe8529a0455e660ca4693 ] commit 3ab5aee7fe84 ("net: Convert TCP & DCCP hash tables to use RCU / hlist_nulls") incorrectly used sock_put() on TIMEWAIT sockets. We should instead use inet_twsk_put() Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: mv643xx_eth: update statistics timer from timer context only [ Upstream commit 041b4ddb84989f06ff1df0ca869b950f1ee3cb1c ] Each port driver installs a periodic timer to update port statistics by calling mib_counters_update. As mib_counters_update is also called from non-timer context, we should not reschedule the timer there but rather move it to timer-only context. Signed-off-by: Sebastian Hesselbarth <[email protected]> Acked-by: Jason Cooper <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: mv643xx_eth: fix orphaned statistics timer crash [ Upstream commit f564412c935111c583b787bcc18157377b208e2e ] The periodic statistics timer gets started at port _probe() time, but is stopped on _stop() only. In a modular environment, this can cause the timer to access already deallocated memory, if the module is unloaded without starting the eth device. To fix this, we add the timer right before the port is started, instead of at _probe() time. Signed-off-by: Sebastian Hesselbarth <[email protected]> Acked-by: Jason Cooper <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: heap overflow in __audit_sockaddr() [ Upstream commit 1661bf364ae9c506bc8795fef70d1532931be1e8 ] We need to cap ->msg_namelen or it leads to a buffer overflow when we to the memcpy() in __audit_sockaddr(). It requires CAP_AUDIT_CONTROL to exploit this bug. The call tree is: ___sys_recvmsg() move_addr_to_user() audit_sockaddr() __audit_sockaddr() Reported-by: Jüri Aedla <[email protected]> Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> proc connector: fix info leaks [ Upstream commit e727ca82e0e9616ab4844301e6bae60ca7327682 ] Initialize event_data for all possible message types to prevent leaking kernel stack contents to userland (up to 20 bytes). Also set the flags member of the connector message to 0 to prevent leaking two more stack bytes this way. Signed-off-by: Mathias Krause <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ipv4: fix ineffective source address selection [ Upstream commit 0a7e22609067ff524fc7bbd45c6951dd08561667 ] When sending out multicast messages, the source address in inet->mc_addr is ignored and rewritten by an autoselected one. This is caused by a typo in commit 813b3b5db831 ("ipv4: Use caller's on-stack flowi as-is in output route lookups"). Signed-off-by: Jiri Benc <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> can: dev: fix nlmsg size calculation in can_get_size() [ Upstream commit fe119a05f8ca481623a8d02efcc984332e612528 ] This patch fixes the calculation of the nlmsg size, by adding the missing nla_total_size(). Signed-off-by: Marc Kleine-Budde <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ipv6: restrict neighbor entry creation to output flow This patch is based on 3.2.y branch, the one used by reporter. Please let me know if it should be different. Thanks. The patch which introduced the regression was applied on stables: 3.0.64 3.4.31 3.7.8 3.2.39 The patch which introduced the regression was for stable trees only. ---8<--- Commit 0d6a77079c475033cb622c07c5a880b392ef664e "ipv6: do not create neighbor entries for local delivery" introduced a regression on which routes to local delivery would not work anymore. Like this: $ ip -6 route add local 2001::/64 dev lo $ ping6 -c1 2001::9 PING 2001::9(2001::9) 56 data bytes ping: sendmsg: Invalid argument As this is a local delivery, that commit would not allow the creation of a neighbor entry and thus the packet cannot be sent. But as TPROXY scenario actually needs to avoid the neighbor entry creation only for input flow, this patch now limits previous patch to input flow, keeping output as before that patch. Reported-by: Debabrata Banerjee <[email protected]> Signed-off-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: Jiri Pirko <[email protected]> Acked-by: Hannes Frederic Sowa <[email protected]> CC: Hannes Frederic Sowa <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> bridge: Correctly clamp MAX forward_delay when enabling STP [ Upstream commit 4b6c7879d84ad06a2ac5b964808ed599187a188d ] Commit be4f154d5ef0ca147ab6bcd38857a774133f5450 bridge: Clamp forward_delay when enabling STP had a typo when attempting to clamp maximum forward delay. It is possible to set bridge_forward_delay to be higher then permitted maximum when STP is off. When turning STP on, the higher then allowed delay has to be clamed down to max value. Signed-off-by: Vlad Yasevich <[email protected]> CC: Herbert Xu <[email protected]> CC: Stephen Hemminger <[email protected]> Reviewed-by: Veaceslav Falico <[email protected]> Acked-by: Herbert Xu <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: vlan: fix nlmsg size calculation in vlan_get_size() [ Upstream commit c33a39c575068c2ea9bffb22fd6de2df19c74b89 ] This patch fixes the calculation of the nlmsg size, by adding the missing nla_total_size(). Cc: Patrick McHardy <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> l2tp: must disable bh before calling l2tp_xmit_skb() [ Upstream commit 455cc32bf128e114455d11ad919321ab89a2c312 ] François Cachereul made a very nice bug report and suspected the bh_lock_sock() / bh_unlok_sock() pair used in l2tp_xmit_skb() from process context was not good. This problem was added by commit 6af88da14ee284aaad6e4326da09a89191ab6165 ("l2tp: Fix locking in l2tp_core.c"). l2tp_eth_dev_xmit() runs from BH context, so we must disable BH from other l2tp_xmit_skb() users. [ 452.060011] BUG: soft lockup - CPU#1 stuck for 23s! [accel-pppd:6662] [ 452.061757] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core pppoe pppox ppp_generic slhc ipv6 ext3 mbcache jbd virtio_balloon xfs exportfs dm_mod virtio_blk ata_generic virtio_net floppy ata_piix libata virtio_pci virtio_ring virtio [last unloaded: scsi_wait_scan] [ 452.064012] CPU 1 [ 452.080015] BUG: soft lockup - CPU#2 stuck for 23s! [accel-pppd:6643] [ 452.080015] CPU 2 [ 452.080015] [ 452.080015] Pid: 6643, comm: accel-pppd Not tainted 3.2.46.mini #1 Bochs Bochs [ 452.080015] RIP: 0010:[<ffffffff81059f6c>] [<ffffffff81059f6c>] do_raw_spin_lock+0x17/0x1f [ 452.080015] RSP: 0018:ffff88007125fc18 EFLAGS: 00000293 [ 452.080015] RAX: 000000000000aba9 RBX: ffffffff811d0703 RCX: 0000000000000000 [ 452.080015] RDX: 00000000000000ab RSI: ffff8800711f6896 RDI: ffff8800745c8110 [ 452.080015] RBP: ffff88007125fc18 R08: 0000000000000020 R09: 0000000000000000 [ 452.080015] R10: 0000000000000000 R11: 0000000000000280 R12: 0000000000000286 [ 452.080015] R13: 0000000000000020 R14: 0000000000000240 R15: 0000000000000000 [ 452.080015] FS: 00007fdc0cc24700(0000) GS:ffff8800b6f00000(0000) knlGS:0000000000000000 [ 452.080015] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 452.080015] CR2: 00007fdb054899b8 CR3: 0000000074404000 CR4: 00000000000006a0 [ 452.080015] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 452.080015] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 452.080015] Process accel-pppd (pid: 6643, threadinfo ffff88007125e000, task ffff8800b27e6dd0) [ 452.080015] Stack: [ 452.080015] ffff88007125fc28 ffffffff81256559 ffff88007125fc98 ffffffffa01b2bd1 [ 452.080015] ffff88007125fc58 000000000000000c 00000000029490d0 0000009c71dbe25e [ 452.080015] 000000000000005c 000000080000000e 0000000000000000 ffff880071170600 [ 452.080015] Call Trace: [ 452.080015] [<ffffffff81256559>] _raw_spin_lock+0xe/0x10 [ 452.080015] [<ffffffffa01b2bd1>] l2tp_xmit_skb+0x189/0x4ac [l2tp_core] [ 452.080015] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp] [ 452.080015] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24 [ 452.080015] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6 [ 452.080015] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616 [ 452.080015] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c [ 452.080015] [<ffffffff810bbd21>] ? fget_light+0x75/0x89 [ 452.080015] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56 [ 452.080015] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b [ 452.080015] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b [ 452.080015] Code: 81 48 89 e5 72 0c 31 c0 48 81 ff 45 66 25 81 0f 92 c0 5d c3 55 b8 00 01 00 00 48 89 e5 f0 66 0f c1 07 0f b6 d4 38 d0 74 06 f3 90 <8a> 07 eb f6 5d c3 90 90 55 48 89 e5 9c 58 0f 1f 44 00 00 5d c3 [ 452.080015] Call Trace: [ 452.080015] [<ffffffff81256559>] _raw_spin_lock+0xe/0x10 [ 452.080015] [<ffffffffa01b2bd1>] l2tp_xmit_skb+0x189/0x4ac [l2tp_core] [ 452.080015] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp] [ 452.080015] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24 [ 452.080015] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6 [ 452.080015] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616 [ 452.080015] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c [ 452.080015] [<ffffffff810bbd21>] ? fget_light+0x75/0x89 [ 452.080015] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56 [ 452.080015] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b [ 452.080015] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b [ 452.064012] [ 452.064012] Pid: 6662, comm: accel-pppd Not tainted 3.2.46.mini #1 Bochs Bochs [ 452.064012] RIP: 0010:[<ffffffff81059f6e>] [<ffffffff81059f6e>] do_raw_spin_lock+0x19/0x1f [ 452.064012] RSP: 0018:ffff8800b6e83ba0 EFLAGS: 00000297 [ 452.064012] RAX: 000000000000aaa9 RBX: ffff8800b6e83b40 RCX: 0000000000000002 [ 452.064012] RDX: 00000000000000aa RSI: 000000000000000a RDI: ffff8800745c8110 [ 452.064012] RBP: ffff8800b6e83ba0 R08: 000000000000c802 R09: 000000000000001c [ 452.064012] R10: ffff880071096c4e R11: 0000000000000006 R12: ffff8800b6e83b18 [ 452.064012] R13: ffffffff8125d51e R14: ffff8800b6e83ba0 R15: ffff880072a589c0 [ 452.064012] FS: 00007fdc0b81e700(0000) GS:ffff8800b6e80000(0000) knlGS:0000000000000000 [ 452.064012] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 452.064012] CR2: 0000000000625208 CR3: 0000000074404000 CR4: 00000000000006a0 [ 452.064012] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 452.064012] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 452.064012] Process accel-pppd (pid: 6662, threadinfo ffff88007129a000, task ffff8800744f7410) [ 452.064012] Stack: [ 452.064012] ffff8800b6e83bb0 ffffffff81256559 ffff8800b6e83bc0 ffffffff8121c64a [ 452.064012] ffff8800b6e83bf0 ffffffff8121ec7a ffff880072a589c0 ffff880071096c62 [ 452.064012] 0000000000000011 ffffffff81430024 ffff8800b6e83c80 ffffffff8121f276 [ 452.064012] Call Trace: [ 452.064012] <IRQ> [ 452.064012] [<ffffffff81256559>] _raw_spin_lock+0xe/0x10 [ 452.064012] [<ffffffff8121c64a>] spin_lock+0x9/0xb [ 452.064012] [<ffffffff8121ec7a>] udp_queue_rcv_skb+0x186/0x269 [ 452.064012] [<ffffffff8121f276>] __udp4_lib_rcv+0x297/0x4ae [ 452.064012] [<ffffffff8121c178>] ? raw_rcv+0xe9/0xf0 [ 452.064012] [<ffffffff8121f4a7>] udp_rcv+0x1a/0x1c [ 452.064012] [<ffffffff811fe385>] ip_local_deliver_finish+0x12b/0x1a5 [ 452.064012] [<ffffffff811fe54e>] ip_local_deliver+0x53/0x84 [ 452.064012] [<ffffffff811fe1d0>] ip_rcv_finish+0x2bc/0x2f3 [ 452.064012] [<ffffffff811fe78f>] ip_rcv+0x210/0x269 [ 452.064012] [<ffffffff8101911e>] ? kvm_clock_get_cycles+0x9/0xb [ 452.064012] [<ffffffff811d88cd>] __netif_receive_skb+0x3a5/0x3f7 [ 452.064012] [<ffffffff811d8eba>] netif_receive_skb+0x57/0x5e [ 452.064012] [<ffffffff811cf30f>] ? __netdev_alloc_skb+0x1f/0x3b [ 452.064012] [<ffffffffa0049126>] virtnet_poll+0x4ba/0x5a4 [virtio_net] [ 452.064012] [<ffffffff811d9417>] net_rx_action+0x73/0x184 [ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core] [ 452.064012] [<ffffffff810343b9>] __do_softirq+0xc3/0x1a8 [ 452.064012] [<ffffffff81013b56>] ? ack_APIC_irq+0x10/0x12 [ 452.064012] [<ffffffff81256559>] ? _raw_spin_lock+0xe/0x10 [ 452.064012] [<ffffffff8125e0ac>] call_softirq+0x1c/0x26 [ 452.064012] [<ffffffff81003587>] do_softirq+0x45/0x82 [ 452.064012] [<ffffffff81034667>] irq_exit+0x42/0x9c [ 452.064012] [<ffffffff8125e146>] do_IRQ+0x8e/0xa5 [ 452.064012] [<ffffffff8125676e>] common_interrupt+0x6e/0x6e [ 452.064012] <EOI> [ 452.064012] [<ffffffff810b82a1>] ? kfree+0x8a/0xa3 [ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core] [ 452.064012] [<ffffffffa01b2c25>] ? l2tp_xmit_skb+0x1dd/0x4ac [l2tp_core] [ 452.064012] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp] [ 452.064012] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24 [ 452.064012] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6 [ 452.064012] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616 [ 452.064012] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c [ 452.064012] [<ffffffff810bbd21>] ? fget_light+0x75/0x89 [ 452.064012] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56 [ 452.064012] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b [ 452.064012] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b [ 452.064012] Code: 89 e5 72 0c 31 c0 48 81 ff 45 66 25 81 0f 92 c0 5d c3 55 b8 00 01 00 00 48 89 e5 f0 66 0f c1 07 0f b6 d4 38 d0 74 06 f3 90 8a 07 <eb> f6 5d c3 90 90 55 48 89 e5 9c 58 0f 1f 44 00 00 5d c3 55 48 [ 452.064012] Call Trace: [ 452.064012] <IRQ> [<ffffffff81256559>] _raw_spin_lock+0xe/0x10 [ 452.064012] [<ffffffff8121c64a>] spin_lock+0x9/0xb [ 452.064012] [<ffffffff8121ec7a>] udp_queue_rcv_skb+0x186/0x269 [ 452.064012] [<ffffffff8121f276>] __udp4_lib_rcv+0x297/0x4ae [ 452.064012] [<ffffffff8121c178>] ? raw_rcv+0xe9/0xf0 [ 452.064012] [<ffffffff8121f4a7>] udp_rcv+0x1a/0x1c [ 452.064012] [<ffffffff811fe385>] ip_local_deliver_finish+0x12b/0x1a5 [ 452.064012] [<ffffffff811fe54e>] ip_local_deliver+0x53/0x84 [ 452.064012] [<ffffffff811fe1d0>] ip_rcv_finish+0x2bc/0x2f3 [ 452.064012] [<ffffffff811fe78f>] ip_rcv+0x210/0x269 [ 452.064012] [<ffffffff8101911e>] ? kvm_clock_get_cycles+0x9/0xb [ 452.064012] [<ffffffff811d88cd>] __netif_receive_skb+0x3a5/0x3f7 [ 452.064012] [<ffffffff811d8eba>] netif_receive_skb+0x57/0x5e [ 452.064012] [<ffffffff811cf30f>] ? __netdev_alloc_skb+0x1f/0x3b [ 452.064012] [<ffffffffa0049126>] virtnet_poll+0x4ba/0x5a4 [virtio_net] [ 452.064012] [<ffffffff811d9417>] net_rx_action+0x73/0x184 [ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core] [ 452.064012] [<ffffffff810343b9>] __do_softirq+0xc3/0x1a8 [ 452.064012] [<ffffffff81013b56>] ? ack_APIC_irq+0x10/0x12 [ 452.064012] [<ffffffff81256559>] ? _raw_spin_lock+0xe/0x10 [ 452.064012] [<ffffffff8125e0ac>] call_softirq+0x1c/0x26 [ 452.064012] [<ffffffff81003587>] do_softirq+0x45/0x82 [ 452.064012] [<ffffffff81034667>] irq_exit+0x42/0x9c [ 452.064012] [<ffffffff8125e146>] do_IRQ+0x8e/0xa5 [ 452.064012] [<ffffffff8125676e>] common_interrupt+0x6e/0x6e [ 452.064012] <EOI> [<ffffffff810b82a1>] ? kfree+0x8a/0xa3 [ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core] [ 452.064012] [<ffffffffa01b2c25>] ? l2tp_xmit_skb+0x1dd/0x4ac [l2tp_core] [ 452.064012] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp] [ 452.064012] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24 [ 452.064012] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6 [ 452.064012] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616 [ 452.064012] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c [ 452.064012] [<ffffffff810bbd21>] ? fget_light+0x75/0x89 [ 452.064012] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56 [ 452.064012] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b [ 452.064012] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b Reported-by: François Cachereul <[email protected]> Tested-by: François Cachereul <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Cc: James Chapman <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> farsync: fix info leak in ioctl [ Upstream commit 96b340406724d87e4621284ebac5e059d67b2194 ] The fst_get_iface() code fails to initialize the two padding bytes of struct sync_serial_settings after the ->loopback member. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> unix_diag: fix info leak [ Upstream commit 6865d1e834be84ddd5808d93d5035b492346c64a ] When filling the netlink message we miss to wipe the pad field, therefore leak one byte of heap memory to userland. Fix this by setting pad to 0. Signed-off-by: Mathias Krause <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> connector: use nlmsg_len() to check message length [ Upstream commit 162b2bedc084d2d908a04c93383ba02348b648b0 ] The current code tests the length of the whole netlink message to be at least as long to fit a cn_msg. This is wrong as nlmsg_len includes the length of the netlink message header. Use nlmsg_len() instead to fix this "off-by-NLMSG_HDRLEN" size check. Signed-off-by: Mathias Krause <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> bnx2x: record rx queue for LRO packets [ Upstream commit 60e66fee56b2256dcb1dc2ea1b2ddcb6e273857d ] RPS support is kind of broken on bnx2x, because only non LRO packets get proper rx queue information. This triggers reorders, as it seems bnx2x like to generate a non LRO packet for segment including TCP PUSH flag : (this might be pure coincidence, but all the reorders I've seen involve segments with a PUSH) 11:13:34.335847 IP A > B: . 415808:447136(31328) ack 1 win 457 <nop,nop,timestamp 3789336 3985797> 11:13:34.335992 IP A > B: . 447136:448560(1424) ack 1 win 457 <nop,nop,timestamp 3789336 3985797> 11:13:34.336391 IP A > B: . 448560:479888(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985797> 11:13:34.336425 IP A > B: P 511216:512640(1424) ack 1 win 457 <nop,nop,timestamp 3789337 3985798> 11:13:34.336423 IP A > B: . 479888:511216(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985798> 11:13:34.336924 IP A > B: . 512640:543968(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985798> 11:13:34.336963 IP A > B: . 543968:575296(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985798> We must call skb_record_rx_queue() to properly give to RPS (and more generally for TX queue selection on forward path) the receive queue information. Similar fix is needed for skb_mark_napi_id(), but will be handled in a separate patch to ease stable backports. Signed-off-by: Eric Dumazet <[email protected]> Cc: Willem de Bruijn <[email protected]> Cc: Eilon Greenstein <[email protected]> Acked-by: Dmitry Kravkov <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: dst: provide accessor function to dst->xfrm [ Upstream commit e87b3998d795123b4139bc3f25490dd236f68212 ] dst->xfrm is conditionally defined. Provide accessor funtion that is always available. Signed-off-by: Vlad Yasevich <[email protected]> Acked-by: Neil Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> sctp: Use software crc32 checksum when xfrm transform will happen. [ Upstream commit 27127a82561a2a3ed955ce207048e1b066a80a2a ] igb/ixgbe have hardware sctp checksum support, when this feature is enabled and also IPsec is armed to protect sctp traffic, ugly things happened as xfrm_output checks CHECKSUM_PARTIAL to do checksum operation(sum every thing up and pack the 16bits result in the checksum field). The result is fail establishment of sctp communication. Signed-off-by: Fan Du <[email protected]> Cc: Neil Horman <[email protected]> Cc: Steffen Klassert <[email protected]> Signed-off-by: Vlad Yasevich <[email protected]> Acked-by: Neil Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> sctp: Perform software checksum if packet has to be fragmented. [ Upstream commit d2dbbba77e95dff4b4f901fee236fef6d9552072 ] IP/IPv6 fragmentation knows how to compute only TCP/UDP checksum. This causes problems if SCTP packets has to be fragmented and ipsummed has been set to PARTIAL due to checksum offload support. This condition can happen when retransmitting after MTU discover, or when INIT or other control chunks are larger then MTU. Check for the rare fragmentation condition in SCTP and use software checksum calculation in this case. CC: Fan Du <[email protected]> Signed-off-by: Vlad Yasevich <[email protected]> Acked-by: Neil Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> wanxl: fix info leak in ioctl [ Upstream commit 2b13d06c9584b4eb773f1e80bbaedab9a1c344e1 ] The wanxl_ioctl() code fails to initialize the two padding bytes of struct sync_serial_settings after the ->loopback member. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Salva Peiró <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: unix: inherit SOCK_PASS{CRED, SEC} flags from socket to fix race [ Upstream commit 90c6bd34f884cd9cee21f1d152baf6c18bcac949 ] In the case of credentials passing in unix stream sockets (dgram sockets seem not affected), we get a rather sparse race after commit 16e5726 ("af_unix: dont send SCM_CREDENTIALS by default"). We have a stream server on receiver side that requests credential passing from senders (e.g. nc -U). Since we need to set SO_PASSCRED on each spawned/accepted socket on server side to 1 first (as it's not inherited), it can happen that in the time between accept() and setsockopt() we get interrupted, the sender is being scheduled and continues with passing data to our receiver. At that time SO_PASSCRED is neither set on sender nor receiver side, hence in cmsg's SCM_CREDENTIALS we get eventually pid:0, uid:65534, gid:65534 (== overflow{u,g}id) instead of what we actually would like to see. On the sender side, here nc -U, the tests in maybe_add_creds() invoked through unix_stream_sendmsg() would fail, as at that exact time, as mentioned, the sender has neither SO_PASSCRED on his side nor sees it on the server side, and we have a valid 'other' socket in place. Thus, sender believes it would just look like a normal connection, not needing/requesting SO_PASSCRED at that time. As reverting 16e5726 would not be an option due to the significant performance regression reported when having creds always passed, one way/trade-off to prevent that would be to set SO_PASSCRED on the listener socket and allow inheriting these flags to the spawned socket on server side in accept(). It seems also logical to do so if we'd tell the listener socket to pass those flags onwards, and would fix the race. Before, strace: recvmsg(4, {msg_name(0)=NULL, msg_iov(1)=[{"blub\n", 4096}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=0, uid=65534, gid=65534}}, msg_flags=0}, 0) = 5 After, strace: recvmsg(4, {msg_name(0)=NULL, msg_iov(1)=[{"blub\n", 4096}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=11580, uid=1000, gid=1000}}, msg_flags=0}, 0) = 5 Signed-off-by: Daniel Borkmann <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Eric W. Biederman <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: fix cipso packet validation when !NETLABEL [ Upstream commit f2e5ddcc0d12f9c4c7b254358ad245c9dddce13b ] When CONFIG_NETLABEL is disabled, the cipso_v4_validate() function could loop forever in the main loop if opt[opt_iter +1] == 0, this will causing a kernel crash in an SMP system, since the CPU executing this function will stall /not respond to IPIs. This problem can be reproduced by running the IP Stack Integrity Checker (http://isic.sourceforge.net) using the following command on a Linux machine connected to DUT: "icmpsic -s rand -d <DUT IP address> -r 123456" wait (1-2 min) Signed-off-by: Seif Mazareeb <[email protected]> Acked-by: Paul Moore <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> inet: fix possible memory corruption with UDP_CORK and UFO [ This is a simplified -stable version of a set of upstream commits. ] This is a replacement patch only for stable which does fix the problems handled by the following two commits in -net: "ip_output: do skb ufo init for peeked non ufo skb as well" (e93b7d748be887cd7639b113ba7d7ef792a7efb9) "ip6_output: do skb ufo init for peeked non ufo skb as well" (c547dbf55d5f8cf615ccc0e7265e98db27d3fb8b) Three frames are written on a corked udp socket for which the output netdevice has UFO enabled. If the first and third frame are smaller than the mtu and the second one is bigger, we enqueue the second frame with skb_append_datato_frags without initializing the gso fields. This leads to the third frame appended regulary and thus constructing an invalid skb. This fixes the problem by always using skb_append_datato_frags as soon as the first frag got enqueued to the skb without marking the packet as SKB_GSO_UDP. The problem with only two frames for ipv6 was fixed by "ipv6: udp packets following an UFO enqueued packet need also be handled by UFO" (2811ebac2521ceac84f2bdae402455baa6a7fb47). Cc: Jiri Pirko <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: David Miller <[email protected]> Signed-off-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> davinci_emac.c: Fix IFF_ALLMULTI setup [ Upstream commit d69e0f7ea95fef8059251325a79c004bac01f018 ] When IFF_ALLMULTI flag is set on interface and IFF_PROMISC isn't, emac_dev_mcast_set should only enable RX of multicasts and reset MACHASH registers. It does this, but afterwards it either sets up multicast MACs filtering or disables RX of multicasts and resets MACHASH registers again, rendering IFF_ALLMULTI flag useless. This patch fixes emac_dev_mcast_set, so that multicast MACs filtering and disabling of RX of multicasts are skipped when IFF_ALLMULTI flag is set. Tested with kernel 2.6.37. Signed-off-by: Mariusz Ceier <[email protected]> Acked-by: Mugunthan V N <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ext3: return 32/64-bit dir name hash according to usage type commit d7dab39b6e16d5eea78ed3c705d2a2d0772b4f06 upstream. This is based on commit d1f5273e9adb40724a85272f248f210dc4ce919a ext4: return 32/64-bit dir name hash according to usage type by Fan Yong <[email protected]> Traditionally ext2/3/4 has returned a 32-bit hash value from llseek() to appease NFSv2, which can only handle a 32-bit cookie for seekdir() and telldir(). However, this causes problems if there are 32-bit hash collisions, since the NFSv2 server can get stuck resending the same entries from the directory repeatedly. Allow ext3 to return a full 64-bit hash (both major and minor) for telldir to decrease the chance of hash collisions. This patch does implement a new ext3_dir_llseek op, because with 64-bit hashes, nfs will attempt to seek to a hash "offset" which is much larger than ext3's s_maxbytes. So for dx dirs, we call generic_file_llseek_size() with the appropriate max hash value as the maximum seekable size. Otherwise we just pass through to generic_file_llseek(). Patch-updated-by: Bernd Schubert <[email protected]> Patch-updated-by: Eric Sandeen <[email protected]> (blame us if something is not correct) Signed-off-by: Eric Sandeen <[email protected]> Signed-off-by: Jan Kara <[email protected]> Cc: Benjamin LaHaise <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> dm snapshot: fix data corruption commit e9c6a182649f4259db704ae15a91ac820e63b0ca upstream. This patch fixes a particular type of data corruption that has been encountered when loading a snapshot's metadata from disk. When we allocate a new chunk in persistent_prepare, we increment ps->next_free and we make sure that it doesn't point to a metadata area by further incrementing it if necessary. When we load metadata from disk on device activation, ps->next_free is positioned after the last used data chunk. However, if this last used data chunk is followed by a metadata area, ps->next_free is positioned erroneously to the metadata area. A newly-allocated chunk is placed at the same location as the metadata area, resulting in data or metadata corruption. This patch changes the code so that ps->next_free skips the metadata area when metadata are loaded in function read_exceptions. The patch also moves a piece of code from persistent_prepare_exception to a separate function skip_metadata to avoid code duplication. CVE-2013-4299 Signed-off-by: Mikulas Patocka <[email protected]> Cc: Mike Snitzer <[email protected]> Signed-off-by: Alasdair G Kergon <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> writeback: fix negative bdi max pause commit e3b6c655b91e01a1dade056cfa358581b47a5351 upstream. Toralf runs trinity on UML/i386. After some time it hangs and the last message line is BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521] It's found that pages_dirtied becomes very large. More than 1000000000 pages in this case: period = HZ * pages_dirtied / task_ratelimit; BUG_ON(pages_dirtied > 2000000000); BUG_ON(pages_dirtied > 1000000000); <--------- UML debug printf shows that we got negative pause here: ick: pause : -984 ick: pages_dirtied : 0 ick: task_ratelimit: 0 pause: + if (pause < 0) { + extern int printf(char *, ...); + printf("ick : pause : %li\n", pause); + printf("ick: pages_dirtied : %lu\n", pages_dirtied); + printf("ick: task_ratelimit: %lu\n", task_ratelimit); + BUG_ON(1); + } trace_balance_dirty_pages(bdi, Since pause is bounded by [min_pause, max_pause] where min_pause is also bounded by max_pause. It's suspected and demonstrated that the max_pause calculation goes wrong: ick: pause : -717 ick: min_pause : -177 ick: max_pause : -717 ick: pages_dirtied : 14 ick: task_ratelimit: 0 The problem lies in the two "long = unsigned long" assignments in bdi_max_pause() which might go negative if the highest bit is 1, and the min_t(long, ...) check failed to protect it falling under 0. Fix all of them by using "unsigned long" throughout the function. Signed-off-by: Fengguang Wu <[email protected]> Reported-by: Toralf Förster <[email protected]> Tested-by: Toralf Förster <[email protected]> Reviewed-by: Jan Kara <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> wireless: radiotap: fix parsing buffer overrun commit f5563318ff1bde15b10e736e97ffce13be08bc1a upstream. When parsing an invalid radiotap header, the parser can overrun the buffer that is passed in because it doesn't correctly check 1) the minimum radiotap header size 2) the space for extended bitmaps The first issue doesn't affect any in-kernel user as they all check the minimum size before calling the radiotap function. The second issue could potentially affect the kernel if an skb is passed in that consists only of the radiotap header with a lot of extended bitmaps that extend past the SKB. In that case a read-only buffer overrun by at most 4 bytes is possible. Fix this by adding the appropriate checks to the parser. Reported-by: Evan Huus <[email protected]> Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> USB: serial: ti_usb_3410_5052: add Abbott strip port ID to combined table as well. commit c9d09dc7ad106492c17c587b6eeb99fe3f43e522 upstream. Without this change, the USB cable for Freestyle Option and compatible glucometers will not be detected by the driver. Signed-off-by: Diego Elio Pettenò <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> USB: serial: option: add support for Inovia SEW858 device commit f4c19b8e165cff1a6607c21f8809441d61cab7ec upstream. This patch adds the device id for the Inovia SEW858 device to the option driver. Reported-by: Pavel Parkhomenko <[email protected]> Tested-by: Pavel Parkhomenko <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> usb: serial: option: blacklist Olivetti Olicard200 commit fd8573f5828873343903215f203f14dc82de397c upstream. Interface 6 of this device speaks QMI as per tests done by us. Credits go to Antonella for providing the hardware. Signed-off-by: Enrico Mioso <[email protected]> Signed-off-by: Antonella Pellizzari <[email protected]> Tested-by: Dan Williams <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> Linux 3.4.68 Signed-off-by: Chet Kener <[email protected]> USB: support new huawei devices in option.c commit d544db293a44a2a3b09feab7dbd59668b692de71 upstream. Add new supporting declarations to option.c, to support Huawei new devices with new bInterfaceSubClass value. Signed-off-by: fangxiaozhi <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> USB: quirks.c: add one device that cannot deal with suspension commit 4294bca7b423d1a5aa24307e3d112a04075e3763 upstream. The device is not responsive when resumed, unless it is reset. Signed-off-by: Oliver Neukum <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> USB: quirks: add touchscreen that is dazzeled by remote wakeup commit 614ced91fc6fbb5a1cdd12f0f1b6c9197d9f1350 upstream. The device descriptors are messed up after remote wakeup Signed-off-by: Oliver Neukum <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> USB: serial: ftdi_sio: add id for Z3X Box device commit e1466ad5b1aeda303f9282463d55798d2eda218c upstream. Custom VID/PID for Z3X Box device, popular tool for cellphone flashing. Signed-off-by: Alexey E. Kramarenko <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> mac80211: correctly close cancelled scans commit a754055a1296fcbe6f32de3a5eaca6efb2fd1865 upstream. __ieee80211_scan_completed is called from a worker. This means that the following flow is possible. * driver calls ieee80211_scan_completed * mac80211 cancels the scan (that is already complete) * __ieee80211_scan_completed runs When scan_work will finally run, it will see that the scan hasn't been aborted and might even trigger another scan on another band. This leads to a situation where cfg80211's scan is not done and no further scan can be issued. Fix this by setting a new flag when a HW scan is being cancelled so that no other scan will be triggered. Signed-off-by: Emmanuel Grumbach <[email protected]> Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> mac80211: update sta->last_rx on acked tx frames commit 0c5b93290b2f3c7a376567c03ae8d385b0e99851 upstream. When clients are idle for too long, hostapd sends nullfunc frames for probing. When those are acked by the client, the idle time needs to be updated. To make this work (and to avoid unnecessary probing), update sta->last_rx whenever an ACK was received for a tx packet. Only do this if the flag IEEE80211_HW_REPORTS_TX_ACK_STATUS is set. Signed-off-by: Felix Fietkau <[email protected]> Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> rtlwifi: rtl8192cu: Fix error in pointer arithmetic commit 9473ca6e920a3b9ca902753ce52833657f9221cc upstream. An error in calculating the offset in an skb causes the driver to read essential device info from the wrong locations. The main effect is that automatic gain calculations are nonsense. Signed-off-by: Mark Cave-Ayland <[email protected]> Signed-off-by: Larry Finger <[email protected]> Signed-off-by: John W. Linville <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> jfs: fix error path in ialloc commit 8660998608cfa1077e560034db81885af8e1e885 upstream. If insert_inode_locked() fails, we shouldn't be calling unlock_new_inode(). Signed-off-by: Dave Kleikamp <[email protected]> Tested-by: Michael L. Semon <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> can: flexcan: flexcan_chip_start: fix regression, mark one MB for TX and abort pending TX commit d5a7b406c529e4595ce03dc8f6dcf7fa36f106fa upstream. In patch 0d1862e can: flexcan: fix flexcan_chip_start() on imx6 the loop in flexcan_chip_start() that iterates over all mailboxes after the soft reset of the CAN core was removed. This loop put all mailboxes (even the ones marked as reserved 1...7) into EMPTY/INACTIVE mode. On mailboxes 8...63, this aborts any pending TX messages. After a cold boot there is random garbage in the mailboxes, which leads to spontaneous transmit of CAN frames during first activation. Further if the interface was disabled with a pending message (usually due to an error condition on the CAN bus), this message is retransmitted after enabling the interface again. This patch fixes the regression by: 1) Limiting the maximum number of used mailboxes to 8, 0...7 are used by the RX FIFO, 8 is used by TX. 2) Marking the TX mailbox as EMPTY/INACTIVE, so that any pending TX of that mailbox is aborted. Cc: Lothar Waßmann <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> libata: make ata_eh_qc_retry() bump scmd->allowed on bogus failures commit f13e220161e738c2710b9904dcb3cf8bb0bcce61 upstream. libata EH decrements scmd->retries when the command failed for reasons unrelated to the command itself so that, for example, commands aborted due to suspend / resume cycle don't get penalized; however, decrementing scmd->retries isn't enough for ATA passthrough commands. Without this fix, ATA passthrough commands are not resend to the drive, and no error is signalled to the caller because: - allowed retry count is 1 - ata_eh_qc_complete fill the sense data, so result is valid - sense data is filled with untouched ATA registers. Signed-off-by: Gwendal Grignou <[email protected]> Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> md: Fix skipping recovery for read-only arrays. commit 61e4947c99c4494336254ec540c50186d186150b upstream. Since: commit 7ceb17e87bde79d285a8b988cfed9eaeebe60b86 md: Allow devices to be re-added to a read-only array. spares are activated on a read-only array. In case of raid1 and raid10 personalities it causes that not-in-sync devices are marked in-sync without checking if recovery has been finished. If a read-only array is degraded and one of its devices is not in-sync (because the array has been only partially recovered) recovery will be skipped. This patch adds checking if recovery has been finished before marking a device in-sync for raid1 and raid10 personalities. In case of raid5 personality such condition is already present (at raid5.c:6029). Bug was introduced in 3.10 and causes data corruption. Signed-off-by: Pawel Baldysiak <[email protected]> Signed-off-by: Lukasz Dorau <[email protected]> Signed-off-by: NeilBrown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> clockevents: Sanitize ticks to nsec conversion commit 97b9410643475d6557d2517c2aff9fd2221141a9 upstream. Marc Kleine-Budde pointed out, that commit 77cc982 "clocksource: use clockevents_config_and_register() where possible" caused a regression for some of the converted subarchs. The reason is, that the clockevents core code converts the minimal hardware tick delta to a nanosecond value for core internal usage. This conversion is affected by integer math rounding loss, so the backwards conversion to hardware ticks will likely result in a value which is less than the configured hardware limitation. The affected subarchs used their own workaround (SIGH!) which got lost in the conversion. The solution for the issue at hand is simple: adding evt->mult - 1 to the shifted value before the integer divison in the core conversion function takes care of it. But this only works for the case where for the scaled math mult/shift pair "mult <= 1 << shift" is true. For the case where "mult > 1 << shift" we can apply the rounding add only for the minimum delta value to make sure that the backward conversion is not less than the given hardware limit. For the upper bound we need to omit the rounding add, because the backwards conversion is always larger than the original latch value. That would violate the upper bound of the hardware device. Though looking closer at the details of that function reveals another bogosity: The upper bounds check is broken as well. Checking for a resulting "clc" value greater than KTIME_MAX after the conversion is pointless. The conversion does: u64 clc = (latch << evt->shift) / evt->mult; So there is no sanity check for (latch << evt->shift) exceeding the 64bit boundary. The latch argument is "unsigned long", so on a 64bit arch the handed in argument could easily lead to an unnoticed shift overflow. With the above rounding fix applied the calculation before the divison is: u64 clc = (latch << evt->shift) + evt->mult - 1; So we need to make sure, that neither the shift nor the rounding add is overflowing the u64 boundary. [ukl: move assignment to rnd after eventually changing mult, fix build issue and correct comment with the right math] Signed-off-by: Thomas Gleixner <[email protected]> Cc: Russell King - ARM Linux <[email protected]> Cc: Marc Kleine-Budde <[email protected]> Cc: [email protected] Cc: Marc Pignat <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Ronald Wahl <[email protected]> Cc: LAK <[email protected]> Cc: Ludovic Desroches <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Uwe Kleine-König <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> parisc: Do not crash 64bit SMP kernels on machines with >= 4GB RAM commit 54e181e073fc1415e41917d725ebdbd7de956455 upstream. Since the beginning of the parisc-linux port, sometimes 64bit SMP kernels were not able to bring up other CPUs than the monarch CPU and instead crashed the kernel. The reason was unclear, esp. since it involved various machines (e.g. J5600, J6750 and SuperDome). Testing showed, that those crashes didn't happened when less than 4GB were installed, or if a 32bit Linux kernel was booted. In the end, the fix for those SMP problems is trivial: During the early phase of the initialization of the CPUs, including the monarch CPU, the PDC_PSW firmware function to enable WIDE (=64bit) mode is called. It's documented that this firmware function may clobber various registers, and one one of those possibly clobbered registers is %cr30 which holds the task thread info pointer. Now, if %cr30 would always have been clobbered, then this bug would have been detected much earlier. But lots of testing finally showed, that - at least for %cr30 - on some machines only the upper 32bits of the 64bit register suddenly turned zero after the firmware call. So, after finding the root cause, the explanation for the various crashes became clear: - On 32bit SMP Linux kernels all upper 32bit were zero, so we didn't faced this problem. - Monarch CPUs in 64bit mode always booted sucessfully, because the inital task thread info pointer was below 4GB. - Secondary CPUs booted sucessfully on machines with less than 4GB RAM because the upper 32bit were zero anyay. - Secondary CPus failed to boot if we had more than 4GB RAM and the task thread info pointer was located above the 4GB boundary. Finally, the patch to fix this problem is trivial by saving the %cr30 register before the firmware call and restoring it afterwards. Signed-off-by: Helge Deller <[email protected]> Signed-off-by: John David Anglin <[email protected]> Signed-off-by: Helge Deller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ALSA: hda - Add a fixup for ASUS N76VZ commit 6fc16e58adf50c0f1e4478538983fb5ff6f453d4 upstream. ASUS N76VZ needs the same fixup as N56VZ for supporting the boost speaker. Bugzilla: https://bugzilla.novell.com/show_bug.cgi?id=846529 Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ASoC: wm_hubs: Add missing break in hp_supply_event() commit 268ff14525edba31da29a12a9dd693cdd6a7872e upstream. Spotted by coverity CID 115170. Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ASoC: dapm: Fix source list debugfs outputs commit ff18620c2157671a8ee21ebb8e6a3520ea209b1f upstream. ... due to a copy & paste error. Spotted by coverity CID 710923. Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> staging: ozwpan: prevent overflow in oz_cdev_write() commit c2c65cd2e14ada6de44cb527e7f1990bede24e15 upstream. We need to check "count" so we don't overflow the ei->data buffer. Reported-by: Nico Golde <[email protected]> Reported-by: Fabian Yamaguchi <[email protected]> Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> Staging: bcm: info leak in ioctl commit 8d1e72250c847fa96498ec029891de4dc638a5ba upstream. The DevInfo.u32Reserved[] array isn't initialized so it leaks kernel information to user space. Reported-by: Nico Golde <[email protected]> Reported-by: Fabian Yamaguchi <[email protected]> Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> uml: check length in exitcode_proc_write() commit 201f99f170df14ba52ea4c52847779042b7a623b upstream. We don't cap the size of buffer from the user so we could write past the end of the array here. Only root can write to this file. Reported-by: Nico Golde <[email protected]> Reported-by: Fabian Yamaguchi <[email protected]> Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> xtensa: don't use alternate signal stack on threads commit cba9a90053e3b7973eff4f1946f33032e98eeed5 upstream. According to create_thread(3): "The new thread does not inherit the creating thread's alternate signal stack". Since commit f9a3879a (Fix sigaltstack corruption among cloned threads), current->sas_ss_size is set to 0 for cloned processes sharing VM with their parent. Don't use the (nonexistent) alternate signal stack in this case. This has been broken since commit 29c4dfd9 ([XTENSA] Remove non-rt signal handling). Fixes the SA_ONSTACK part of the nptl/tst-cancel20 test from uClibc. Signed-off-by: Baruch Siach <[email protected]> Signed-off-by: Max Filippov <[email protected]> Signed-off-by: Chris Zankel <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> lib/scatterlist.c: don't flush_kernel_dcache_page on slab page commit 3d77b50c5874b7e923be946ba793644f82336b75 upstream. Commit b1adaf65ba03 ("[SCSI] block: add sg buffer copy helper functions") introduces two sg buffer copy helpers, and calls flush_kernel_dcache_page() on pages in SG list after these pages are written to. Unfortunately, the commit may introduce a potential bug: - Before sending some SCSI commands, kmalloc() buffer may be passed to block layper, so flush_kernel_dcache_page() can see a slab page finally - According to cachetlb.txt, flush_kernel_dcache_page() is only called on "a user page", which surely can't be a slab page. - ARCH's implementation of flush_kernel_dcache_page() may use page mapping information to do optimization so page_mapping() will see the slab page, then VM_BUG_ON() is triggered. Aaro Koskinen reported the bug on ARM/kirkwood when DEBUG_VM is enabled, and this patch fixes the bug by adding test of '!PageSlab(miter->page)' before calling flush_kernel_dcache_page(). Signed-off-by: Ming Lei <[email protected]> Reported-by: Aaro Koskinen <[email protected]> Tested-by: Simon Baatz <[email protected]> Cc: Russell King - ARM Linux <[email protected]> Cc: Will Deacon <[email protected]> Cc: Aaro Koskinen <[email protected]> Acked-by: Catalin Marinas <[email protected]> Cc: FUJITA Tomonori <[email protected]> Cc: Tejun Heo <[email protected]> Cc: "James E.J. Bottomley" <[email protected]> Cc: Jens Axboe <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> aacraid: missing capable() check in compat ioctl commit f856567b930dfcdbc3323261bf77240ccdde01f5 upstream. In commit d496f94d22d1 ('[SCSI] aacraid: fix security weakness') we added a check on CAP_SYS_RAWIO to the ioctl. The compat ioctls need the check as well. Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> mm: fix aio performance regression for database caused by THP commit 7cb2ef56e6a8b7b368b2e883a0a47d02fed66911 upstream. I am working with a tool that simulates oracle database I/O workload. This tool (orion to be specific - <http://docs.oracle.com/cd/E11882_01/server.112/e16638/iodesign.htm#autoId24>) allocates hugetlbfs pages using shmget() with SHM_HUGETLB flag. It then does aio into these pages from flash disks using various common block sizes used by database. I am looking at performance with two of the most common block sizes - 1M and 64K. aio performance with these two block sizes plunged after Transparent HugePages was introduced in the kernel. Here are performance numbers: pre-THP 2.6.39 3.11-rc5 1M read 8384 MB/s 5629 MB/s 6501 MB/s 64K read 7867 MB/s 4576 MB/s 4251 MB/s I have narrowed the performance impact down to the overheads introduced by THP in __get_page_tail() and put_compound_page() routines. perf top shows >40% of cycles being spent in these two routines. Every time direct I/O to hugetlbfs pages starts, kernel calls get_page() to grab a reference to the pages and calls put_page() when I/O completes to put the reference away. THP introduced significant amount of locking overhead to get_page() and put_page() when dealing with compound pages because hugepages can be split underneath get_page() and put_page(). It added this overhead irrespective of whether it is dealing with hugetlbfs pages or transparent hugepages. This resulted in 20%-45% drop in aio performance when using hugetlbfs pages. Since hugetlbfs pages can not be split, there is no reason to go through all the locking overhead for these pages from what I can see. I added code to __get_page_tail() and put_compound_page() to bypass all the locking code when working with hugetlbfs pages. This improved performance significantly. Performance numbers with this patch: pre-THP 3.11-rc5 3.11-rc5 + Patch 1M read 8384 MB/s 6501 MB/s 8371 MB/s 64K read 7867 MB/s 4251 MB/s 6510 MB/s Performance with 64K read is still lower than what it was before THP, but still a 53% improvement. It does mean there is more work to be done but I will take a 53% improvement for now. Please take a look at the following patch and let me know if it looks reasonable. [[email protected]: tweak comments] Signed-off-by: Khalid Aziz <[email protected]> Cc: Pravin B Shelar <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Andi Kleen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> drm: Prevent overwriting from userspace underallocating core ioctl structs commit b062672e305ce071f21eb9e18b102c2a430e0999 upstream. Apply the protections from commit 1b2f1489633888d4a06028315dc19d65768a1c05 Author: Dave Airlie <[email protected]> Date: Sat Aug 14 20:20:34 2010 +1000 drm: block userspace under allocating buffer and having drivers overwrite it (v2) to the core ioctl structs as well, for we found one instance where there is a 32-/64-bit size mismatch and were guilty of writing beyond the end of the user's buffer. Signed-off-by: Chris Wilson <[email protected]> Cc: Dave Airlie <[email protected]> Reviewed-by: Ville Syrjälä <[email protected]> Cc: [email protected] Signed-off-by: Dave Airlie <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> drm/radeon/atom: workaround vbios bug in transmitter table on rs780 commit c23632d4e57c0dd20bf50eca08fa0eb8ad3ff680 upstream. Some rs780 asics seem to be affected as well. See: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=91f3a6aaf280294b07c05dfe606e6c27b7ba3c72 Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=60791 Signed-off-by: Alex Deucher <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> Linux 3.4.69 Signed-off-by: Chet Kener <[email protected]> xen-netback: use jiffies_64 value to calculate credit timeout [ Upstream commit 059dfa6a93b779516321e5112db9d7621b1367ba ] time_after_eq() only works if the delta is < MAX_ULONG/2. For a 32bit Dom0, if netfront sends packets at a very low rate, the time between subsequent calls to tx_credit_exceeded() may exceed MAX_ULONG/2 and the test for timer_after_eq() will be incorrect. Credit will not be replenished and the guest may become unable to send packets (e.g., if prior to the long gap, all credit was exhausted). Use jiffies_64 variant to mitigate this problem for 32bit Dom0. Suggested-by: Jan Beulich <[email protected]> Signed-off-by: Wei Liu <[email protected]> Reviewed-by: David Vrabel <[email protected]> Cc: Ian Campbell <[email protected]> Cc: Jason Luan <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> PCI: fix truncation of resource size to 32 bits commit d6776e6d5c2f8db0252f447b09736075e1bbe387 upstream. _pci_assign_resource() took an int "size" argument, which meant that sizes larger than 4GB were truncated. Change type to resource_size_t. [bhelgaas: changelog] Signed-off-by: Nikhil P Rao <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Cc: Jiri Slaby <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> USB: add new zte 3g-dongle's pid to option.c commit 0636fc507a976cdc40f21bdbcce6f0b98ff1dfe9 upstream. Signed-off-by: Rui li <[email protected]> Cc: stable <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ALSA: hda - Move one-time init codes from generic_hdmi_init() commit 8b8d654b55648561287bd8baca0f75f964a17038 upstream. The codes to initialize work struct or create a proc interface should be called only once and never although it's called many times through the init callback. Move that stuff into patch_generic_hdmi() so that it's called only once. Signed-off-by: Takashi Iwai <[email protected]> BugLink: https://bugs.launchpad.net/bugs/1212160 Signed-off-by: David Henningsson <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> netfilter: nf_ct_sip: don't drop packets with offsets pointing outside the packet commit 3a7b…
DerRomtester
referenced
this issue
in DerRomtester/one_plus_one
Jan 21, 2015
net/ping: handle protocol mismatching scenario ping_lookup() may return a wrong sock if sk_buff's and sock's protocols dont' match. For example, sk_buff's protocol is ETH_P_IPV6, but sock's sk_family is AF_INET, in that case, if sk->sk_bound_dev_if is zero, a wrong sock will be returned. the fix is to "continue" the searching, if no matching, return NULL. [cherry-pick of net 91a0b603469069cdcce4d572b7525ffc9fd352a6] Bug: 18512516 Change-Id: I520223ce53c0d4e155c37d6b65a03489cc7fd494 Cc: "David S. Miller" <[email protected]> Cc: Alexey Kuznetsov <[email protected]> Cc: James Morris <[email protected]> Cc: Hideaki YOSHIFUJI <[email protected]> Cc: Patrick McHardy <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Jane Zhou <[email protected]> Signed-off-by: Yiwei Zhao <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Lorenzo Colitti <[email protected]> tcp: must unclone packets before mangling them [ Upstream commit c52e2421f7368fd36cbe330d2cf41b10452e39a9 ] TCP stack should make sure it owns skbs before mangling them. We had various crashes using bnx2x, and it turned out gso_size was cleared right before bnx2x driver was populating TC descriptor of the _previous_ packet send. TCP stack can sometime retransmit packets that are still in Qdisc. Of course we could make bnx2x driver more robust (using ACCESS_ONCE(shinfo->gso_size) for example), but the bug is TCP stack. We have identified two points where skb_unclone() was needed. This patch adds a WARN_ON_ONCE() to warn us if we missed another fix of this kind. Kudos to Neal for finding the root cause of this bug. Its visible using small MSS. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: Neal Cardwell <[email protected]> Cc: Yuchung Cheng <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> tcp: do not forget FIN in tcp_shifted_skb() [ Upstream commit 5e8a402f831dbe7ee831340a91439e46f0d38acd ] Yuchung found following problem : There are bugs in the SACK processing code, merging part in tcp_shift_skb_data(), that incorrectly resets or ignores the sacked skbs FIN flag. When a receiver first SACK the FIN sequence, and later throw away ofo queue (e.g., sack-reneging), the sender will stop retransmitting the FIN flag, and hangs forever. Following packetdrill test can be used to reproduce the bug. $ cat sack-merge-bug.pkt `sysctl -q net.ipv4.tcp_fack=0` // Establish a connection and send 10 MSS. 0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 +.000 bind(3, ..., ...) = 0 +.000 listen(3, 1) = 0 +.050 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7> +.000 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 6> +.001 < . 1:1(0) ack 1 win 1024 +.000 accept(3, ..., ...) = 4 +.100 write(4, ..., 12000) = 12000 +.000 shutdown(4, SHUT_WR) = 0 +.000 > . 1:10001(10000) ack 1 +.050 < . 1:1(0) ack 2001 win 257 +.000 > FP. 10001:12001(2000) ack 1 +.050 < . 1:1(0) ack 2001 win 257 <sack 10001:11001,nop,nop> +.050 < . 1:1(0) ack 2001 win 257 <sack 10001:12002,nop,nop> // SACK reneg +.050 < . 1:1(0) ack 12001 win 257 +0 %{ print "unacked: ",tcpi_unacked }% +5 %{ print "" }% First, a typo inverted left/right of one OR operation, then code forgot to advance end_seq if the merged skb carried FIN. Bug was added in 2.6.29 by commit 832d11c5cd076ab ("tcp: Try to restore large SKBs while SACK processing") Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: Yuchung Cheng <[email protected]> Acked-by: Neal Cardwell <[email protected]> Cc: Ilpo Järvinen <[email protected]> Acked-by: Ilpo Järvinen <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: do not call sock_put() on TIMEWAIT sockets [ Upstream commit 80ad1d61e72d626e30ebe8529a0455e660ca4693 ] commit 3ab5aee7fe84 ("net: Convert TCP & DCCP hash tables to use RCU / hlist_nulls") incorrectly used sock_put() on TIMEWAIT sockets. We should instead use inet_twsk_put() Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: mv643xx_eth: update statistics timer from timer context only [ Upstream commit 041b4ddb84989f06ff1df0ca869b950f1ee3cb1c ] Each port driver installs a periodic timer to update port statistics by calling mib_counters_update. As mib_counters_update is also called from non-timer context, we should not reschedule the timer there but rather move it to timer-only context. Signed-off-by: Sebastian Hesselbarth <[email protected]> Acked-by: Jason Cooper <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: mv643xx_eth: fix orphaned statistics timer crash [ Upstream commit f564412c935111c583b787bcc18157377b208e2e ] The periodic statistics timer gets started at port _probe() time, but is stopped on _stop() only. In a modular environment, this can cause the timer to access already deallocated memory, if the module is unloaded without starting the eth device. To fix this, we add the timer right before the port is started, instead of at _probe() time. Signed-off-by: Sebastian Hesselbarth <[email protected]> Acked-by: Jason Cooper <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: heap overflow in __audit_sockaddr() [ Upstream commit 1661bf364ae9c506bc8795fef70d1532931be1e8 ] We need to cap ->msg_namelen or it leads to a buffer overflow when we to the memcpy() in __audit_sockaddr(). It requires CAP_AUDIT_CONTROL to exploit this bug. The call tree is: ___sys_recvmsg() move_addr_to_user() audit_sockaddr() __audit_sockaddr() Reported-by: Jüri Aedla <[email protected]> Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> proc connector: fix info leaks [ Upstream commit e727ca82e0e9616ab4844301e6bae60ca7327682 ] Initialize event_data for all possible message types to prevent leaking kernel stack contents to userland (up to 20 bytes). Also set the flags member of the connector message to 0 to prevent leaking two more stack bytes this way. Signed-off-by: Mathias Krause <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ipv4: fix ineffective source address selection [ Upstream commit 0a7e22609067ff524fc7bbd45c6951dd08561667 ] When sending out multicast messages, the source address in inet->mc_addr is ignored and rewritten by an autoselected one. This is caused by a typo in commit 813b3b5db831 ("ipv4: Use caller's on-stack flowi as-is in output route lookups"). Signed-off-by: Jiri Benc <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> can: dev: fix nlmsg size calculation in can_get_size() [ Upstream commit fe119a05f8ca481623a8d02efcc984332e612528 ] This patch fixes the calculation of the nlmsg size, by adding the missing nla_total_size(). Signed-off-by: Marc Kleine-Budde <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ipv6: restrict neighbor entry creation to output flow This patch is based on 3.2.y branch, the one used by reporter. Please let me know if it should be different. Thanks. The patch which introduced the regression was applied on stables: 3.0.64 3.4.31 3.7.8 3.2.39 The patch which introduced the regression was for stable trees only. ---8<--- Commit 0d6a77079c475033cb622c07c5a880b392ef664e "ipv6: do not create neighbor entries for local delivery" introduced a regression on which routes to local delivery would not work anymore. Like this: $ ip -6 route add local 2001::/64 dev lo $ ping6 -c1 2001::9 PING 2001::9(2001::9) 56 data bytes ping: sendmsg: Invalid argument As this is a local delivery, that commit would not allow the creation of a neighbor entry and thus the packet cannot be sent. But as TPROXY scenario actually needs to avoid the neighbor entry creation only for input flow, this patch now limits previous patch to input flow, keeping output as before that patch. Reported-by: Debabrata Banerjee <[email protected]> Signed-off-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: Jiri Pirko <[email protected]> Acked-by: Hannes Frederic Sowa <[email protected]> CC: Hannes Frederic Sowa <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> bridge: Correctly clamp MAX forward_delay when enabling STP [ Upstream commit 4b6c7879d84ad06a2ac5b964808ed599187a188d ] Commit be4f154d5ef0ca147ab6bcd38857a774133f5450 bridge: Clamp forward_delay when enabling STP had a typo when attempting to clamp maximum forward delay. It is possible to set bridge_forward_delay to be higher then permitted maximum when STP is off. When turning STP on, the higher then allowed delay has to be clamed down to max value. Signed-off-by: Vlad Yasevich <[email protected]> CC: Herbert Xu <[email protected]> CC: Stephen Hemminger <[email protected]> Reviewed-by: Veaceslav Falico <[email protected]> Acked-by: Herbert Xu <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: vlan: fix nlmsg size calculation in vlan_get_size() [ Upstream commit c33a39c575068c2ea9bffb22fd6de2df19c74b89 ] This patch fixes the calculation of the nlmsg size, by adding the missing nla_total_size(). Cc: Patrick McHardy <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> l2tp: must disable bh before calling l2tp_xmit_skb() [ Upstream commit 455cc32bf128e114455d11ad919321ab89a2c312 ] François Cachereul made a very nice bug report and suspected the bh_lock_sock() / bh_unlok_sock() pair used in l2tp_xmit_skb() from process context was not good. This problem was added by commit 6af88da14ee284aaad6e4326da09a89191ab6165 ("l2tp: Fix locking in l2tp_core.c"). l2tp_eth_dev_xmit() runs from BH context, so we must disable BH from other l2tp_xmit_skb() users. [ 452.060011] BUG: soft lockup - CPU#1 stuck for 23s! [accel-pppd:6662] [ 452.061757] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core pppoe pppox ppp_generic slhc ipv6 ext3 mbcache jbd virtio_balloon xfs exportfs dm_mod virtio_blk ata_generic virtio_net floppy ata_piix libata virtio_pci virtio_ring virtio [last unloaded: scsi_wait_scan] [ 452.064012] CPU 1 [ 452.080015] BUG: soft lockup - CPU#2 stuck for 23s! [accel-pppd:6643] [ 452.080015] CPU 2 [ 452.080015] [ 452.080015] Pid: 6643, comm: accel-pppd Not tainted 3.2.46.mini #1 Bochs Bochs [ 452.080015] RIP: 0010:[<ffffffff81059f6c>] [<ffffffff81059f6c>] do_raw_spin_lock+0x17/0x1f [ 452.080015] RSP: 0018:ffff88007125fc18 EFLAGS: 00000293 [ 452.080015] RAX: 000000000000aba9 RBX: ffffffff811d0703 RCX: 0000000000000000 [ 452.080015] RDX: 00000000000000ab RSI: ffff8800711f6896 RDI: ffff8800745c8110 [ 452.080015] RBP: ffff88007125fc18 R08: 0000000000000020 R09: 0000000000000000 [ 452.080015] R10: 0000000000000000 R11: 0000000000000280 R12: 0000000000000286 [ 452.080015] R13: 0000000000000020 R14: 0000000000000240 R15: 0000000000000000 [ 452.080015] FS: 00007fdc0cc24700(0000) GS:ffff8800b6f00000(0000) knlGS:0000000000000000 [ 452.080015] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 452.080015] CR2: 00007fdb054899b8 CR3: 0000000074404000 CR4: 00000000000006a0 [ 452.080015] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 452.080015] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 452.080015] Process accel-pppd (pid: 6643, threadinfo ffff88007125e000, task ffff8800b27e6dd0) [ 452.080015] Stack: [ 452.080015] ffff88007125fc28 ffffffff81256559 ffff88007125fc98 ffffffffa01b2bd1 [ 452.080015] ffff88007125fc58 000000000000000c 00000000029490d0 0000009c71dbe25e [ 452.080015] 000000000000005c 000000080000000e 0000000000000000 ffff880071170600 [ 452.080015] Call Trace: [ 452.080015] [<ffffffff81256559>] _raw_spin_lock+0xe/0x10 [ 452.080015] [<ffffffffa01b2bd1>] l2tp_xmit_skb+0x189/0x4ac [l2tp_core] [ 452.080015] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp] [ 452.080015] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24 [ 452.080015] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6 [ 452.080015] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616 [ 452.080015] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c [ 452.080015] [<ffffffff810bbd21>] ? fget_light+0x75/0x89 [ 452.080015] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56 [ 452.080015] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b [ 452.080015] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b [ 452.080015] Code: 81 48 89 e5 72 0c 31 c0 48 81 ff 45 66 25 81 0f 92 c0 5d c3 55 b8 00 01 00 00 48 89 e5 f0 66 0f c1 07 0f b6 d4 38 d0 74 06 f3 90 <8a> 07 eb f6 5d c3 90 90 55 48 89 e5 9c 58 0f 1f 44 00 00 5d c3 [ 452.080015] Call Trace: [ 452.080015] [<ffffffff81256559>] _raw_spin_lock+0xe/0x10 [ 452.080015] [<ffffffffa01b2bd1>] l2tp_xmit_skb+0x189/0x4ac [l2tp_core] [ 452.080015] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp] [ 452.080015] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24 [ 452.080015] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6 [ 452.080015] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616 [ 452.080015] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c [ 452.080015] [<ffffffff810bbd21>] ? fget_light+0x75/0x89 [ 452.080015] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56 [ 452.080015] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b [ 452.080015] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b [ 452.064012] [ 452.064012] Pid: 6662, comm: accel-pppd Not tainted 3.2.46.mini #1 Bochs Bochs [ 452.064012] RIP: 0010:[<ffffffff81059f6e>] [<ffffffff81059f6e>] do_raw_spin_lock+0x19/0x1f [ 452.064012] RSP: 0018:ffff8800b6e83ba0 EFLAGS: 00000297 [ 452.064012] RAX: 000000000000aaa9 RBX: ffff8800b6e83b40 RCX: 0000000000000002 [ 452.064012] RDX: 00000000000000aa RSI: 000000000000000a RDI: ffff8800745c8110 [ 452.064012] RBP: ffff8800b6e83ba0 R08: 000000000000c802 R09: 000000000000001c [ 452.064012] R10: ffff880071096c4e R11: 0000000000000006 R12: ffff8800b6e83b18 [ 452.064012] R13: ffffffff8125d51e R14: ffff8800b6e83ba0 R15: ffff880072a589c0 [ 452.064012] FS: 00007fdc0b81e700(0000) GS:ffff8800b6e80000(0000) knlGS:0000000000000000 [ 452.064012] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 452.064012] CR2: 0000000000625208 CR3: 0000000074404000 CR4: 00000000000006a0 [ 452.064012] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 452.064012] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 452.064012] Process accel-pppd (pid: 6662, threadinfo ffff88007129a000, task ffff8800744f7410) [ 452.064012] Stack: [ 452.064012] ffff8800b6e83bb0 ffffffff81256559 ffff8800b6e83bc0 ffffffff8121c64a [ 452.064012] ffff8800b6e83bf0 ffffffff8121ec7a ffff880072a589c0 ffff880071096c62 [ 452.064012] 0000000000000011 ffffffff81430024 ffff8800b6e83c80 ffffffff8121f276 [ 452.064012] Call Trace: [ 452.064012] <IRQ> [ 452.064012] [<ffffffff81256559>] _raw_spin_lock+0xe/0x10 [ 452.064012] [<ffffffff8121c64a>] spin_lock+0x9/0xb [ 452.064012] [<ffffffff8121ec7a>] udp_queue_rcv_skb+0x186/0x269 [ 452.064012] [<ffffffff8121f276>] __udp4_lib_rcv+0x297/0x4ae [ 452.064012] [<ffffffff8121c178>] ? raw_rcv+0xe9/0xf0 [ 452.064012] [<ffffffff8121f4a7>] udp_rcv+0x1a/0x1c [ 452.064012] [<ffffffff811fe385>] ip_local_deliver_finish+0x12b/0x1a5 [ 452.064012] [<ffffffff811fe54e>] ip_local_deliver+0x53/0x84 [ 452.064012] [<ffffffff811fe1d0>] ip_rcv_finish+0x2bc/0x2f3 [ 452.064012] [<ffffffff811fe78f>] ip_rcv+0x210/0x269 [ 452.064012] [<ffffffff8101911e>] ? kvm_clock_get_cycles+0x9/0xb [ 452.064012] [<ffffffff811d88cd>] __netif_receive_skb+0x3a5/0x3f7 [ 452.064012] [<ffffffff811d8eba>] netif_receive_skb+0x57/0x5e [ 452.064012] [<ffffffff811cf30f>] ? __netdev_alloc_skb+0x1f/0x3b [ 452.064012] [<ffffffffa0049126>] virtnet_poll+0x4ba/0x5a4 [virtio_net] [ 452.064012] [<ffffffff811d9417>] net_rx_action+0x73/0x184 [ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core] [ 452.064012] [<ffffffff810343b9>] __do_softirq+0xc3/0x1a8 [ 452.064012] [<ffffffff81013b56>] ? ack_APIC_irq+0x10/0x12 [ 452.064012] [<ffffffff81256559>] ? _raw_spin_lock+0xe/0x10 [ 452.064012] [<ffffffff8125e0ac>] call_softirq+0x1c/0x26 [ 452.064012] [<ffffffff81003587>] do_softirq+0x45/0x82 [ 452.064012] [<ffffffff81034667>] irq_exit+0x42/0x9c [ 452.064012] [<ffffffff8125e146>] do_IRQ+0x8e/0xa5 [ 452.064012] [<ffffffff8125676e>] common_interrupt+0x6e/0x6e [ 452.064012] <EOI> [ 452.064012] [<ffffffff810b82a1>] ? kfree+0x8a/0xa3 [ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core] [ 452.064012] [<ffffffffa01b2c25>] ? l2tp_xmit_skb+0x1dd/0x4ac [l2tp_core] [ 452.064012] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp] [ 452.064012] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24 [ 452.064012] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6 [ 452.064012] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616 [ 452.064012] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c [ 452.064012] [<ffffffff810bbd21>] ? fget_light+0x75/0x89 [ 452.064012] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56 [ 452.064012] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b [ 452.064012] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b [ 452.064012] Code: 89 e5 72 0c 31 c0 48 81 ff 45 66 25 81 0f 92 c0 5d c3 55 b8 00 01 00 00 48 89 e5 f0 66 0f c1 07 0f b6 d4 38 d0 74 06 f3 90 8a 07 <eb> f6 5d c3 90 90 55 48 89 e5 9c 58 0f 1f 44 00 00 5d c3 55 48 [ 452.064012] Call Trace: [ 452.064012] <IRQ> [<ffffffff81256559>] _raw_spin_lock+0xe/0x10 [ 452.064012] [<ffffffff8121c64a>] spin_lock+0x9/0xb [ 452.064012] [<ffffffff8121ec7a>] udp_queue_rcv_skb+0x186/0x269 [ 452.064012] [<ffffffff8121f276>] __udp4_lib_rcv+0x297/0x4ae [ 452.064012] [<ffffffff8121c178>] ? raw_rcv+0xe9/0xf0 [ 452.064012] [<ffffffff8121f4a7>] udp_rcv+0x1a/0x1c [ 452.064012] [<ffffffff811fe385>] ip_local_deliver_finish+0x12b/0x1a5 [ 452.064012] [<ffffffff811fe54e>] ip_local_deliver+0x53/0x84 [ 452.064012] [<ffffffff811fe1d0>] ip_rcv_finish+0x2bc/0x2f3 [ 452.064012] [<ffffffff811fe78f>] ip_rcv+0x210/0x269 [ 452.064012] [<ffffffff8101911e>] ? kvm_clock_get_cycles+0x9/0xb [ 452.064012] [<ffffffff811d88cd>] __netif_receive_skb+0x3a5/0x3f7 [ 452.064012] [<ffffffff811d8eba>] netif_receive_skb+0x57/0x5e [ 452.064012] [<ffffffff811cf30f>] ? __netdev_alloc_skb+0x1f/0x3b [ 452.064012] [<ffffffffa0049126>] virtnet_poll+0x4ba/0x5a4 [virtio_net] [ 452.064012] [<ffffffff811d9417>] net_rx_action+0x73/0x184 [ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core] [ 452.064012] [<ffffffff810343b9>] __do_softirq+0xc3/0x1a8 [ 452.064012] [<ffffffff81013b56>] ? ack_APIC_irq+0x10/0x12 [ 452.064012] [<ffffffff81256559>] ? _raw_spin_lock+0xe/0x10 [ 452.064012] [<ffffffff8125e0ac>] call_softirq+0x1c/0x26 [ 452.064012] [<ffffffff81003587>] do_softirq+0x45/0x82 [ 452.064012] [<ffffffff81034667>] irq_exit+0x42/0x9c [ 452.064012] [<ffffffff8125e146>] do_IRQ+0x8e/0xa5 [ 452.064012] [<ffffffff8125676e>] common_interrupt+0x6e/0x6e [ 452.064012] <EOI> [<ffffffff810b82a1>] ? kfree+0x8a/0xa3 [ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core] [ 452.064012] [<ffffffffa01b2c25>] ? l2tp_xmit_skb+0x1dd/0x4ac [l2tp_core] [ 452.064012] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp] [ 452.064012] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24 [ 452.064012] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6 [ 452.064012] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616 [ 452.064012] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c [ 452.064012] [<ffffffff810bbd21>] ? fget_light+0x75/0x89 [ 452.064012] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56 [ 452.064012] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b [ 452.064012] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b Reported-by: François Cachereul <[email protected]> Tested-by: François Cachereul <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Cc: James Chapman <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> farsync: fix info leak in ioctl [ Upstream commit 96b340406724d87e4621284ebac5e059d67b2194 ] The fst_get_iface() code fails to initialize the two padding bytes of struct sync_serial_settings after the ->loopback member. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> unix_diag: fix info leak [ Upstream commit 6865d1e834be84ddd5808d93d5035b492346c64a ] When filling the netlink message we miss to wipe the pad field, therefore leak one byte of heap memory to userland. Fix this by setting pad to 0. Signed-off-by: Mathias Krause <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> connector: use nlmsg_len() to check message length [ Upstream commit 162b2bedc084d2d908a04c93383ba02348b648b0 ] The current code tests the length of the whole netlink message to be at least as long to fit a cn_msg. This is wrong as nlmsg_len includes the length of the netlink message header. Use nlmsg_len() instead to fix this "off-by-NLMSG_HDRLEN" size check. Signed-off-by: Mathias Krause <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> bnx2x: record rx queue for LRO packets [ Upstream commit 60e66fee56b2256dcb1dc2ea1b2ddcb6e273857d ] RPS support is kind of broken on bnx2x, because only non LRO packets get proper rx queue information. This triggers reorders, as it seems bnx2x like to generate a non LRO packet for segment including TCP PUSH flag : (this might be pure coincidence, but all the reorders I've seen involve segments with a PUSH) 11:13:34.335847 IP A > B: . 415808:447136(31328) ack 1 win 457 <nop,nop,timestamp 3789336 3985797> 11:13:34.335992 IP A > B: . 447136:448560(1424) ack 1 win 457 <nop,nop,timestamp 3789336 3985797> 11:13:34.336391 IP A > B: . 448560:479888(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985797> 11:13:34.336425 IP A > B: P 511216:512640(1424) ack 1 win 457 <nop,nop,timestamp 3789337 3985798> 11:13:34.336423 IP A > B: . 479888:511216(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985798> 11:13:34.336924 IP A > B: . 512640:543968(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985798> 11:13:34.336963 IP A > B: . 543968:575296(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985798> We must call skb_record_rx_queue() to properly give to RPS (and more generally for TX queue selection on forward path) the receive queue information. Similar fix is needed for skb_mark_napi_id(), but will be handled in a separate patch to ease stable backports. Signed-off-by: Eric Dumazet <[email protected]> Cc: Willem de Bruijn <[email protected]> Cc: Eilon Greenstein <[email protected]> Acked-by: Dmitry Kravkov <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: dst: provide accessor function to dst->xfrm [ Upstream commit e87b3998d795123b4139bc3f25490dd236f68212 ] dst->xfrm is conditionally defined. Provide accessor funtion that is always available. Signed-off-by: Vlad Yasevich <[email protected]> Acked-by: Neil Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> sctp: Use software crc32 checksum when xfrm transform will happen. [ Upstream commit 27127a82561a2a3ed955ce207048e1b066a80a2a ] igb/ixgbe have hardware sctp checksum support, when this feature is enabled and also IPsec is armed to protect sctp traffic, ugly things happened as xfrm_output checks CHECKSUM_PARTIAL to do checksum operation(sum every thing up and pack the 16bits result in the checksum field). The result is fail establishment of sctp communication. Signed-off-by: Fan Du <[email protected]> Cc: Neil Horman <[email protected]> Cc: Steffen Klassert <[email protected]> Signed-off-by: Vlad Yasevich <[email protected]> Acked-by: Neil Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> sctp: Perform software checksum if packet has to be fragmented. [ Upstream commit d2dbbba77e95dff4b4f901fee236fef6d9552072 ] IP/IPv6 fragmentation knows how to compute only TCP/UDP checksum. This causes problems if SCTP packets has to be fragmented and ipsummed has been set to PARTIAL due to checksum offload support. This condition can happen when retransmitting after MTU discover, or when INIT or other control chunks are larger then MTU. Check for the rare fragmentation condition in SCTP and use software checksum calculation in this case. CC: Fan Du <[email protected]> Signed-off-by: Vlad Yasevich <[email protected]> Acked-by: Neil Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> wanxl: fix info leak in ioctl [ Upstream commit 2b13d06c9584b4eb773f1e80bbaedab9a1c344e1 ] The wanxl_ioctl() code fails to initialize the two padding bytes of struct sync_serial_settings after the ->loopback member. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Salva Peiró <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: unix: inherit SOCK_PASS{CRED, SEC} flags from socket to fix race [ Upstream commit 90c6bd34f884cd9cee21f1d152baf6c18bcac949 ] In the case of credentials passing in unix stream sockets (dgram sockets seem not affected), we get a rather sparse race after commit 16e5726 ("af_unix: dont send SCM_CREDENTIALS by default"). We have a stream server on receiver side that requests credential passing from senders (e.g. nc -U). Since we need to set SO_PASSCRED on each spawned/accepted socket on server side to 1 first (as it's not inherited), it can happen that in the time between accept() and setsockopt() we get interrupted, the sender is being scheduled and continues with passing data to our receiver. At that time SO_PASSCRED is neither set on sender nor receiver side, hence in cmsg's SCM_CREDENTIALS we get eventually pid:0, uid:65534, gid:65534 (== overflow{u,g}id) instead of what we actually would like to see. On the sender side, here nc -U, the tests in maybe_add_creds() invoked through unix_stream_sendmsg() would fail, as at that exact time, as mentioned, the sender has neither SO_PASSCRED on his side nor sees it on the server side, and we have a valid 'other' socket in place. Thus, sender believes it would just look like a normal connection, not needing/requesting SO_PASSCRED at that time. As reverting 16e5726 would not be an option due to the significant performance regression reported when having creds always passed, one way/trade-off to prevent that would be to set SO_PASSCRED on the listener socket and allow inheriting these flags to the spawned socket on server side in accept(). It seems also logical to do so if we'd tell the listener socket to pass those flags onwards, and would fix the race. Before, strace: recvmsg(4, {msg_name(0)=NULL, msg_iov(1)=[{"blub\n", 4096}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=0, uid=65534, gid=65534}}, msg_flags=0}, 0) = 5 After, strace: recvmsg(4, {msg_name(0)=NULL, msg_iov(1)=[{"blub\n", 4096}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=11580, uid=1000, gid=1000}}, msg_flags=0}, 0) = 5 Signed-off-by: Daniel Borkmann <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Eric W. Biederman <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: fix cipso packet validation when !NETLABEL [ Upstream commit f2e5ddcc0d12f9c4c7b254358ad245c9dddce13b ] When CONFIG_NETLABEL is disabled, the cipso_v4_validate() function could loop forever in the main loop if opt[opt_iter +1] == 0, this will causing a kernel crash in an SMP system, since the CPU executing this function will stall /not respond to IPIs. This problem can be reproduced by running the IP Stack Integrity Checker (http://isic.sourceforge.net) using the following command on a Linux machine connected to DUT: "icmpsic -s rand -d <DUT IP address> -r 123456" wait (1-2 min) Signed-off-by: Seif Mazareeb <[email protected]> Acked-by: Paul Moore <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> inet: fix possible memory corruption with UDP_CORK and UFO [ This is a simplified -stable version of a set of upstream commits. ] This is a replacement patch only for stable which does fix the problems handled by the following two commits in -net: "ip_output: do skb ufo init for peeked non ufo skb as well" (e93b7d748be887cd7639b113ba7d7ef792a7efb9) "ip6_output: do skb ufo init for peeked non ufo skb as well" (c547dbf55d5f8cf615ccc0e7265e98db27d3fb8b) Three frames are written on a corked udp socket for which the output netdevice has UFO enabled. If the first and third frame are smaller than the mtu and the second one is bigger, we enqueue the second frame with skb_append_datato_frags without initializing the gso fields. This leads to the third frame appended regulary and thus constructing an invalid skb. This fixes the problem by always using skb_append_datato_frags as soon as the first frag got enqueued to the skb without marking the packet as SKB_GSO_UDP. The problem with only two frames for ipv6 was fixed by "ipv6: udp packets following an UFO enqueued packet need also be handled by UFO" (2811ebac2521ceac84f2bdae402455baa6a7fb47). Cc: Jiri Pirko <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: David Miller <[email protected]> Signed-off-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> davinci_emac.c: Fix IFF_ALLMULTI setup [ Upstream commit d69e0f7ea95fef8059251325a79c004bac01f018 ] When IFF_ALLMULTI flag is set on interface and IFF_PROMISC isn't, emac_dev_mcast_set should only enable RX of multicasts and reset MACHASH registers. It does this, but afterwards it either sets up multicast MACs filtering or disables RX of multicasts and resets MACHASH registers again, rendering IFF_ALLMULTI flag useless. This patch fixes emac_dev_mcast_set, so that multicast MACs filtering and disabling of RX of multicasts are skipped when IFF_ALLMULTI flag is set. Tested with kernel 2.6.37. Signed-off-by: Mariusz Ceier <[email protected]> Acked-by: Mugunthan V N <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ext3: return 32/64-bit dir name hash according to usage type commit d7dab39b6e16d5eea78ed3c705d2a2d0772b4f06 upstream. This is based on commit d1f5273e9adb40724a85272f248f210dc4ce919a ext4: return 32/64-bit dir name hash according to usage type by Fan Yong <[email protected]> Traditionally ext2/3/4 has returned a 32-bit hash value from llseek() to appease NFSv2, which can only handle a 32-bit cookie for seekdir() and telldir(). However, this causes problems if there are 32-bit hash collisions, since the NFSv2 server can get stuck resending the same entries from the directory repeatedly. Allow ext3 to return a full 64-bit hash (both major and minor) for telldir to decrease the chance of hash collisions. This patch does implement a new ext3_dir_llseek op, because with 64-bit hashes, nfs will attempt to seek to a hash "offset" which is much larger than ext3's s_maxbytes. So for dx dirs, we call generic_file_llseek_size() with the appropriate max hash value as the maximum seekable size. Otherwise we just pass through to generic_file_llseek(). Patch-updated-by: Bernd Schubert <[email protected]> Patch-updated-by: Eric Sandeen <[email protected]> (blame us if something is not correct) Signed-off-by: Eric Sandeen <[email protected]> Signed-off-by: Jan Kara <[email protected]> Cc: Benjamin LaHaise <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> dm snapshot: fix data corruption commit e9c6a182649f4259db704ae15a91ac820e63b0ca upstream. This patch fixes a particular type of data corruption that has been encountered when loading a snapshot's metadata from disk. When we allocate a new chunk in persistent_prepare, we increment ps->next_free and we make sure that it doesn't point to a metadata area by further incrementing it if necessary. When we load metadata from disk on device activation, ps->next_free is positioned after the last used data chunk. However, if this last used data chunk is followed by a metadata area, ps->next_free is positioned erroneously to the metadata area. A newly-allocated chunk is placed at the same location as the metadata area, resulting in data or metadata corruption. This patch changes the code so that ps->next_free skips the metadata area when metadata are loaded in function read_exceptions. The patch also moves a piece of code from persistent_prepare_exception to a separate function skip_metadata to avoid code duplication. CVE-2013-4299 Signed-off-by: Mikulas Patocka <[email protected]> Cc: Mike Snitzer <[email protected]> Signed-off-by: Alasdair G Kergon <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> writeback: fix negative bdi max pause commit e3b6c655b91e01a1dade056cfa358581b47a5351 upstream. Toralf runs trinity on UML/i386. After some time it hangs and the last message line is BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521] It's found that pages_dirtied becomes very large. More than 1000000000 pages in this case: period = HZ * pages_dirtied / task_ratelimit; BUG_ON(pages_dirtied > 2000000000); BUG_ON(pages_dirtied > 1000000000); <--------- UML debug printf shows that we got negative pause here: ick: pause : -984 ick: pages_dirtied : 0 ick: task_ratelimit: 0 pause: + if (pause < 0) { + extern int printf(char *, ...); + printf("ick : pause : %li\n", pause); + printf("ick: pages_dirtied : %lu\n", pages_dirtied); + printf("ick: task_ratelimit: %lu\n", task_ratelimit); + BUG_ON(1); + } trace_balance_dirty_pages(bdi, Since pause is bounded by [min_pause, max_pause] where min_pause is also bounded by max_pause. It's suspected and demonstrated that the max_pause calculation goes wrong: ick: pause : -717 ick: min_pause : -177 ick: max_pause : -717 ick: pages_dirtied : 14 ick: task_ratelimit: 0 The problem lies in the two "long = unsigned long" assignments in bdi_max_pause() which might go negative if the highest bit is 1, and the min_t(long, ...) check failed to protect it falling under 0. Fix all of them by using "unsigned long" throughout the function. Signed-off-by: Fengguang Wu <[email protected]> Reported-by: Toralf Förster <[email protected]> Tested-by: Toralf Förster <[email protected]> Reviewed-by: Jan Kara <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> wireless: radiotap: fix parsing buffer overrun commit f5563318ff1bde15b10e736e97ffce13be08bc1a upstream. When parsing an invalid radiotap header, the parser can overrun the buffer that is passed in because it doesn't correctly check 1) the minimum radiotap header size 2) the space for extended bitmaps The first issue doesn't affect any in-kernel user as they all check the minimum size before calling the radiotap function. The second issue could potentially affect the kernel if an skb is passed in that consists only of the radiotap header with a lot of extended bitmaps that extend past the SKB. In that case a read-only buffer overrun by at most 4 bytes is possible. Fix this by adding the appropriate checks to the parser. Reported-by: Evan Huus <[email protected]> Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> USB: serial: ti_usb_3410_5052: add Abbott strip port ID to combined table as well. commit c9d09dc7ad106492c17c587b6eeb99fe3f43e522 upstream. Without this change, the USB cable for Freestyle Option and compatible glucometers will not be detected by the driver. Signed-off-by: Diego Elio Pettenò <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> USB: serial: option: add support for Inovia SEW858 device commit f4c19b8e165cff1a6607c21f8809441d61cab7ec upstream. This patch adds the device id for the Inovia SEW858 device to the option driver. Reported-by: Pavel Parkhomenko <[email protected]> Tested-by: Pavel Parkhomenko <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> usb: serial: option: blacklist Olivetti Olicard200 commit fd8573f5828873343903215f203f14dc82de397c upstream. Interface 6 of this device speaks QMI as per tests done by us. Credits go to Antonella for providing the hardware. Signed-off-by: Enrico Mioso <[email protected]> Signed-off-by: Antonella Pellizzari <[email protected]> Tested-by: Dan Williams <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> Linux 3.4.68 Signed-off-by: Chet Kener <[email protected]> USB: support new huawei devices in option.c commit d544db293a44a2a3b09feab7dbd59668b692de71 upstream. Add new supporting declarations to option.c, to support Huawei new devices with new bInterfaceSubClass value. Signed-off-by: fangxiaozhi <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> USB: quirks.c: add one device that cannot deal with suspension commit 4294bca7b423d1a5aa24307e3d112a04075e3763 upstream. The device is not responsive when resumed, unless it is reset. Signed-off-by: Oliver Neukum <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> USB: quirks: add touchscreen that is dazzeled by remote wakeup commit 614ced91fc6fbb5a1cdd12f0f1b6c9197d9f1350 upstream. The device descriptors are messed up after remote wakeup Signed-off-by: Oliver Neukum <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> USB: serial: ftdi_sio: add id for Z3X Box device commit e1466ad5b1aeda303f9282463d55798d2eda218c upstream. Custom VID/PID for Z3X Box device, popular tool for cellphone flashing. Signed-off-by: Alexey E. Kramarenko <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> mac80211: correctly close cancelled scans commit a754055a1296fcbe6f32de3a5eaca6efb2fd1865 upstream. __ieee80211_scan_completed is called from a worker. This means that the following flow is possible. * driver calls ieee80211_scan_completed * mac80211 cancels the scan (that is already complete) * __ieee80211_scan_completed runs When scan_work will finally run, it will see that the scan hasn't been aborted and might even trigger another scan on another band. This leads to a situation where cfg80211's scan is not done and no further scan can be issued. Fix this by setting a new flag when a HW scan is being cancelled so that no other scan will be triggered. Signed-off-by: Emmanuel Grumbach <[email protected]> Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> mac80211: update sta->last_rx on acked tx frames commit 0c5b93290b2f3c7a376567c03ae8d385b0e99851 upstream. When clients are idle for too long, hostapd sends nullfunc frames for probing. When those are acked by the client, the idle time needs to be updated. To make this work (and to avoid unnecessary probing), update sta->last_rx whenever an ACK was received for a tx packet. Only do this if the flag IEEE80211_HW_REPORTS_TX_ACK_STATUS is set. Signed-off-by: Felix Fietkau <[email protected]> Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> rtlwifi: rtl8192cu: Fix error in pointer arithmetic commit 9473ca6e920a3b9ca902753ce52833657f9221cc upstream. An error in calculating the offset in an skb causes the driver to read essential device info from the wrong locations. The main effect is that automatic gain calculations are nonsense. Signed-off-by: Mark Cave-Ayland <[email protected]> Signed-off-by: Larry Finger <[email protected]> Signed-off-by: John W. Linville <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> jfs: fix error path in ialloc commit 8660998608cfa1077e560034db81885af8e1e885 upstream. If insert_inode_locked() fails, we shouldn't be calling unlock_new_inode(). Signed-off-by: Dave Kleikamp <[email protected]> Tested-by: Michael L. Semon <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> can: flexcan: flexcan_chip_start: fix regression, mark one MB for TX and abort pending TX commit d5a7b406c529e4595ce03dc8f6dcf7fa36f106fa upstream. In patch 0d1862e can: flexcan: fix flexcan_chip_start() on imx6 the loop in flexcan_chip_start() that iterates over all mailboxes after the soft reset of the CAN core was removed. This loop put all mailboxes (even the ones marked as reserved 1...7) into EMPTY/INACTIVE mode. On mailboxes 8...63, this aborts any pending TX messages. After a cold boot there is random garbage in the mailboxes, which leads to spontaneous transmit of CAN frames during first activation. Further if the interface was disabled with a pending message (usually due to an error condition on the CAN bus), this message is retransmitted after enabling the interface again. This patch fixes the regression by: 1) Limiting the maximum number of used mailboxes to 8, 0...7 are used by the RX FIFO, 8 is used by TX. 2) Marking the TX mailbox as EMPTY/INACTIVE, so that any pending TX of that mailbox is aborted. Cc: Lothar Waßmann <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> libata: make ata_eh_qc_retry() bump scmd->allowed on bogus failures commit f13e220161e738c2710b9904dcb3cf8bb0bcce61 upstream. libata EH decrements scmd->retries when the command failed for reasons unrelated to the command itself so that, for example, commands aborted due to suspend / resume cycle don't get penalized; however, decrementing scmd->retries isn't enough for ATA passthrough commands. Without this fix, ATA passthrough commands are not resend to the drive, and no error is signalled to the caller because: - allowed retry count is 1 - ata_eh_qc_complete fill the sense data, so result is valid - sense data is filled with untouched ATA registers. Signed-off-by: Gwendal Grignou <[email protected]> Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> md: Fix skipping recovery for read-only arrays. commit 61e4947c99c4494336254ec540c50186d186150b upstream. Since: commit 7ceb17e87bde79d285a8b988cfed9eaeebe60b86 md: Allow devices to be re-added to a read-only array. spares are activated on a read-only array. In case of raid1 and raid10 personalities it causes that not-in-sync devices are marked in-sync without checking if recovery has been finished. If a read-only array is degraded and one of its devices is not in-sync (because the array has been only partially recovered) recovery will be skipped. This patch adds checking if recovery has been finished before marking a device in-sync for raid1 and raid10 personalities. In case of raid5 personality such condition is already present (at raid5.c:6029). Bug was introduced in 3.10 and causes data corruption. Signed-off-by: Pawel Baldysiak <[email protected]> Signed-off-by: Lukasz Dorau <[email protected]> Signed-off-by: NeilBrown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> clockevents: Sanitize ticks to nsec conversion commit 97b9410643475d6557d2517c2aff9fd2221141a9 upstream. Marc Kleine-Budde pointed out, that commit 77cc982 "clocksource: use clockevents_config_and_register() where possible" caused a regression for some of the converted subarchs. The reason is, that the clockevents core code converts the minimal hardware tick delta to a nanosecond value for core internal usage. This conversion is affected by integer math rounding loss, so the backwards conversion to hardware ticks will likely result in a value which is less than the configured hardware limitation. The affected subarchs used their own workaround (SIGH!) which got lost in the conversion. The solution for the issue at hand is simple: adding evt->mult - 1 to the shifted value before the integer divison in the core conversion function takes care of it. But this only works for the case where for the scaled math mult/shift pair "mult <= 1 << shift" is true. For the case where "mult > 1 << shift" we can apply the rounding add only for the minimum delta value to make sure that the backward conversion is not less than the given hardware limit. For the upper bound we need to omit the rounding add, because the backwards conversion is always larger than the original latch value. That would violate the upper bound of the hardware device. Though looking closer at the details of that function reveals another bogosity: The upper bounds check is broken as well. Checking for a resulting "clc" value greater than KTIME_MAX after the conversion is pointless. The conversion does: u64 clc = (latch << evt->shift) / evt->mult; So there is no sanity check for (latch << evt->shift) exceeding the 64bit boundary. The latch argument is "unsigned long", so on a 64bit arch the handed in argument could easily lead to an unnoticed shift overflow. With the above rounding fix applied the calculation before the divison is: u64 clc = (latch << evt->shift) + evt->mult - 1; So we need to make sure, that neither the shift nor the rounding add is overflowing the u64 boundary. [ukl: move assignment to rnd after eventually changing mult, fix build issue and correct comment with the right math] Signed-off-by: Thomas Gleixner <[email protected]> Cc: Russell King - ARM Linux <[email protected]> Cc: Marc Kleine-Budde <[email protected]> Cc: [email protected] Cc: Marc Pignat <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Ronald Wahl <[email protected]> Cc: LAK <[email protected]> Cc: Ludovic Desroches <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Uwe Kleine-König <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> parisc: Do not crash 64bit SMP kernels on machines with >= 4GB RAM commit 54e181e073fc1415e41917d725ebdbd7de956455 upstream. Since the beginning of the parisc-linux port, sometimes 64bit SMP kernels were not able to bring up other CPUs than the monarch CPU and instead crashed the kernel. The reason was unclear, esp. since it involved various machines (e.g. J5600, J6750 and SuperDome). Testing showed, that those crashes didn't happened when less than 4GB were installed, or if a 32bit Linux kernel was booted. In the end, the fix for those SMP problems is trivial: During the early phase of the initialization of the CPUs, including the monarch CPU, the PDC_PSW firmware function to enable WIDE (=64bit) mode is called. It's documented that this firmware function may clobber various registers, and one one of those possibly clobbered registers is %cr30 which holds the task thread info pointer. Now, if %cr30 would always have been clobbered, then this bug would have been detected much earlier. But lots of testing finally showed, that - at least for %cr30 - on some machines only the upper 32bits of the 64bit register suddenly turned zero after the firmware call. So, after finding the root cause, the explanation for the various crashes became clear: - On 32bit SMP Linux kernels all upper 32bit were zero, so we didn't faced this problem. - Monarch CPUs in 64bit mode always booted sucessfully, because the inital task thread info pointer was below 4GB. - Secondary CPUs booted sucessfully on machines with less than 4GB RAM because the upper 32bit were zero anyay. - Secondary CPus failed to boot if we had more than 4GB RAM and the task thread info pointer was located above the 4GB boundary. Finally, the patch to fix this problem is trivial by saving the %cr30 register before the firmware call and restoring it afterwards. Signed-off-by: Helge Deller <[email protected]> Signed-off-by: John David Anglin <[email protected]> Signed-off-by: Helge Deller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ALSA: hda - Add a fixup for ASUS N76VZ commit 6fc16e58adf50c0f1e4478538983fb5ff6f453d4 upstream. ASUS N76VZ needs the same fixup as N56VZ for supporting the boost speaker. Bugzilla: https://bugzilla.novell.com/show_bug.cgi?id=846529 Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ASoC: wm_hubs: Add missing break in hp_supply_event() commit 268ff14525edba31da29a12a9dd693cdd6a7872e upstream. Spotted by coverity CID 115170. Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ASoC: dapm: Fix source list debugfs outputs commit ff18620c2157671a8ee21ebb8e6a3520ea209b1f upstream. ... due to a copy & paste error. Spotted by coverity CID 710923. Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> staging: ozwpan: prevent overflow in oz_cdev_write() commit c2c65cd2e14ada6de44cb527e7f1990bede24e15 upstream. We need to check "count" so we don't overflow the ei->data buffer. Reported-by: Nico Golde <[email protected]> Reported-by: Fabian Yamaguchi <[email protected]> Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> Staging: bcm: info leak in ioctl commit 8d1e72250c847fa96498ec029891de4dc638a5ba upstream. The DevInfo.u32Reserved[] array isn't initialized so it leaks kernel information to user space. Reported-by: Nico Golde <[email protected]> Reported-by: Fabian Yamaguchi <[email protected]> Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> uml: check length in exitcode_proc_write() commit 201f99f170df14ba52ea4c52847779042b7a623b upstream. We don't cap the size of buffer from the user so we could write past the end of the array here. Only root can write to this file. Reported-by: Nico Golde <[email protected]> Reported-by: Fabian Yamaguchi <[email protected]> Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> xtensa: don't use alternate signal stack on threads commit cba9a90053e3b7973eff4f1946f33032e98eeed5 upstream. According to create_thread(3): "The new thread does not inherit the creating thread's alternate signal stack". Since commit f9a3879a (Fix sigaltstack corruption among cloned threads), current->sas_ss_size is set to 0 for cloned processes sharing VM with their parent. Don't use the (nonexistent) alternate signal stack in this case. This has been broken since commit 29c4dfd9 ([XTENSA] Remove non-rt signal handling). Fixes the SA_ONSTACK part of the nptl/tst-cancel20 test from uClibc. Signed-off-by: Baruch Siach <[email protected]> Signed-off-by: Max Filippov <[email protected]> Signed-off-by: Chris Zankel <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> lib/scatterlist.c: don't flush_kernel_dcache_page on slab page commit 3d77b50c5874b7e923be946ba793644f82336b75 upstream. Commit b1adaf65ba03 ("[SCSI] block: add sg buffer copy helper functions") introduces two sg buffer copy helpers, and calls flush_kernel_dcache_page() on pages in SG list after these pages are written to. Unfortunately, the commit may introduce a potential bug: - Before sending some SCSI commands, kmalloc() buffer may be passed to block layper, so flush_kernel_dcache_page() can see a slab page finally - According to cachetlb.txt, flush_kernel_dcache_page() is only called on "a user page", which surely can't be a slab page. - ARCH's implementation of flush_kernel_dcache_page() may use page mapping information to do optimization so page_mapping() will see the slab page, then VM_BUG_ON() is triggered. Aaro Koskinen reported the bug on ARM/kirkwood when DEBUG_VM is enabled, and this patch fixes the bug by adding test of '!PageSlab(miter->page)' before calling flush_kernel_dcache_page(). Signed-off-by: Ming Lei <[email protected]> Reported-by: Aaro Koskinen <[email protected]> Tested-by: Simon Baatz <[email protected]> Cc: Russell King - ARM Linux <[email protected]> Cc: Will Deacon <[email protected]> Cc: Aaro Koskinen <[email protected]> Acked-by: Catalin Marinas <[email protected]> Cc: FUJITA Tomonori <[email protected]> Cc: Tejun Heo <[email protected]> Cc: "James E.J. Bottomley" <[email protected]> Cc: Jens Axboe <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> aacraid: missing capable() check in compat ioctl commit f856567b930dfcdbc3323261bf77240ccdde01f5 upstream. In commit d496f94d22d1 ('[SCSI] aacraid: fix security weakness') we added a check on CAP_SYS_RAWIO to the ioctl. The compat ioctls need the check as well. Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> mm: fix aio performance regression for database caused by THP commit 7cb2ef56e6a8b7b368b2e883a0a47d02fed66911 upstream. I am working with a tool that simulates oracle database I/O workload. This tool (orion to be specific - <http://docs.oracle.com/cd/E11882_01/server.112/e16638/iodesign.htm#autoId24>) allocates hugetlbfs pages using shmget() with SHM_HUGETLB flag. It then does aio into these pages from flash disks using various common block sizes used by database. I am looking at performance with two of the most common block sizes - 1M and 64K. aio performance with these two block sizes plunged after Transparent HugePages was introduced in the kernel. Here are performance numbers: pre-THP 2.6.39 3.11-rc5 1M read 8384 MB/s 5629 MB/s 6501 MB/s 64K read 7867 MB/s 4576 MB/s 4251 MB/s I have narrowed the performance impact down to the overheads introduced by THP in __get_page_tail() and put_compound_page() routines. perf top shows >40% of cycles being spent in these two routines. Every time direct I/O to hugetlbfs pages starts, kernel calls get_page() to grab a reference to the pages and calls put_page() when I/O completes to put the reference away. THP introduced significant amount of locking overhead to get_page() and put_page() when dealing with compound pages because hugepages can be split underneath get_page() and put_page(). It added this overhead irrespective of whether it is dealing with hugetlbfs pages or transparent hugepages. This resulted in 20%-45% drop in aio performance when using hugetlbfs pages. Since hugetlbfs pages can not be split, there is no reason to go through all the locking overhead for these pages from what I can see. I added code to __get_page_tail() and put_compound_page() to bypass all the locking code when working with hugetlbfs pages. This improved performance significantly. Performance numbers with this patch: pre-THP 3.11-rc5 3.11-rc5 + Patch 1M read 8384 MB/s 6501 MB/s 8371 MB/s 64K read 7867 MB/s 4251 MB/s 6510 MB/s Performance with 64K read is still lower than what it was before THP, but still a 53% improvement. It does mean there is more work to be done but I will take a 53% improvement for now. Please take a look at the following patch and let me know if it looks reasonable. [[email protected]: tweak comments] Signed-off-by: Khalid Aziz <[email protected]> Cc: Pravin B Shelar <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Andi Kleen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> drm: Prevent overwriting from userspace underallocating core ioctl structs commit b062672e305ce071f21eb9e18b102c2a430e0999 upstream. Apply the protections from commit 1b2f1489633888d4a06028315dc19d65768a1c05 Author: Dave Airlie <[email protected]> Date: Sat Aug 14 20:20:34 2010 +1000 drm: block userspace under allocating buffer and having drivers overwrite it (v2) to the core ioctl structs as well, for we found one instance where there is a 32-/64-bit size mismatch and were guilty of writing beyond the end of the user's buffer. Signed-off-by: Chris Wilson <[email protected]> Cc: Dave Airlie <[email protected]> Reviewed-by: Ville Syrjälä <[email protected]> Cc: [email protected] Signed-off-by: Dave Airlie <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> drm/radeon/atom: workaround vbios bug in transmitter table on rs780 commit c23632d4e57c0dd20bf50eca08fa0eb8ad3ff680 upstream. Some rs780 asics seem to be affected as well. See: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=91f3a6aaf280294b07c05dfe606e6c27b7ba3c72 Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=60791 Signed-off-by: Alex Deucher <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> Linux 3.4.69 Signed-off-by: Chet Kener <[email protected]> xen-netback: use jiffies_64 value to calculate credit timeout [ Upstream commit 059dfa6a93b779516321e5112db9d7621b1367ba ] time_after_eq() only works if the delta is < MAX_ULONG/2. For a 32bit Dom0, if netfront sends packets at a very low rate, the time between subsequent calls to tx_credit_exceeded() may exceed MAX_ULONG/2 and the test for timer_after_eq() will be incorrect. Credit will not be replenished and the guest may become unable to send packets (e.g., if prior to the long gap, all credit was exhausted). Use jiffies_64 variant to mitigate this problem for 32bit Dom0. Suggested-by: Jan Beulich <[email protected]> Signed-off-by: Wei Liu <[email protected]> Reviewed-by: David Vrabel <[email protected]> Cc: Ian Campbell <[email protected]> Cc: Jason Luan <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> PCI: fix truncation of resource size to 32 bits commit d6776e6d5c2f8db0252f447b09736075e1bbe387 upstream. _pci_assign_resource() took an int "size" argument, which meant that sizes larger than 4GB were truncated. Change type to resource_size_t. [bhelgaas: changelog] Signed-off-by: Nikhil P Rao <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Cc: Jiri Slaby <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> USB: add new zte 3g-dongle's pid to option.c commit 0636fc507a976cdc40f21bdbcce6f0b98ff1dfe9 upstream. Signed-off-by: Rui li <[email protected]> Cc: stable <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ALSA: hda - Move one-time init codes from generic_hdmi_init() commit 8b8d654b55648561287bd8baca0f75f964a17038 upstream. The codes to initialize work struct or create a proc interface should be called only once and never although it's called many times through the init callback. Move that stuff into patch_generic_hdmi() so that it's called only once. Signed-off-by: Takashi Iwai <[email protected]> BugLink: https://bugs.launchpad.net/bugs/1212160 Signed-off-by: David Henningsson <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> netfilter: nf_ct_sip: don't drop packets with offsets pointing outside the packet commit 3a7b…
franciscofranco
pushed a commit
that referenced
this issue
Mar 18, 2015
commit 71abdc15adf8c702a1dd535f8e30df50758848d2 upstream. When kswapd exits, it can end up taking locks that were previously held by allocating tasks while they waited for reclaim. Lockdep currently warns about this: On Wed, May 28, 2014 at 06:06:34PM +0800, Gu Zheng wrote: > inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-R} usage. > kswapd2/1151 [HC0[0]:SC0[0]:HE1:SE1] takes: > (&sig->group_rwsem){+++++?}, at: exit_signals+0x24/0x130 > {RECLAIM_FS-ON-W} state was registered at: > mark_held_locks+0xb9/0x140 > lockdep_trace_alloc+0x7a/0xe0 > kmem_cache_alloc_trace+0x37/0x240 > flex_array_alloc+0x99/0x1a0 > cgroup_attach_task+0x63/0x430 > attach_task_by_pid+0x210/0x280 > cgroup_procs_write+0x16/0x20 > cgroup_file_write+0x120/0x2c0 > vfs_write+0xc0/0x1f0 > SyS_write+0x4c/0xa0 > tracesys+0xdd/0xe2 > irq event stamp: 49 > hardirqs last enabled at (49): _raw_spin_unlock_irqrestore+0x36/0x70 > hardirqs last disabled at (48): _raw_spin_lock_irqsave+0x2b/0xa0 > softirqs last enabled at (0): copy_process.part.24+0x627/0x15f0 > softirqs last disabled at (0): (null) > > other info that might help us debug this: > Possible unsafe locking scenario: > > CPU0 > ---- > lock(&sig->group_rwsem); > <Interrupt> > lock(&sig->group_rwsem); > > *** DEADLOCK *** > > no locks held by kswapd2/1151. > > stack backtrace: > CPU: 30 PID: 1151 Comm: kswapd2 Not tainted 3.10.39+ #4 > Call Trace: > dump_stack+0x19/0x1b > print_usage_bug+0x1f7/0x208 > mark_lock+0x21d/0x2a0 > __lock_acquire+0x52a/0xb60 > lock_acquire+0xa2/0x140 > down_read+0x51/0xa0 > exit_signals+0x24/0x130 > do_exit+0xb5/0xa50 > kthread+0xdb/0x100 > ret_from_fork+0x7c/0xb0 This is because the kswapd thread is still marked as a reclaimer at the time of exit. But because it is exiting, nobody is actually waiting on it to make reclaim progress anymore, and it's nothing but a regular thread at this point. Be tidy and strip it of all its powers (PF_MEMALLOC, PF_SWAPWRITE, PF_KSWAPD, and the lockdep reclaim state) before returning from the thread function. Signed-off-by: Johannes Weiner <[email protected]> Reported-by: Gu Zheng <[email protected]> Cc: Yasuaki Ishimatsu <[email protected]> Cc: Tang Chen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: franciscofranco <[email protected]>
franciscofranco
pushed a commit
that referenced
this issue
Mar 18, 2015
commit 504d58745c9ca28d33572e2d8a9990b43e06075d upstream. clockevents_increase_min_delta() calls printk() from under hrtimer_bases.lock. That causes lock inversion on scheduler locks because printk() can call into the scheduler. Lockdep puts it as: ====================================================== [ INFO: possible circular locking dependency detected ] 3.15.0-rc8-06195-g939f04b #2 Not tainted ------------------------------------------------------- trinity-main/74 is trying to acquire lock: (&port_lock_key){-.....}, at: [<811c60be>] serial8250_console_write+0x8c/0x10c but task is already holding lock: (hrtimer_bases.lock){-.-...}, at: [<8103caeb>] hrtimer_try_to_cancel+0x13/0x66 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #5 (hrtimer_bases.lock){-.-...}: [<8104a942>] lock_acquire+0x92/0x101 [<8142f11d>] _raw_spin_lock_irqsave+0x2e/0x3e [<8103c918>] __hrtimer_start_range_ns+0x1c/0x197 [<8107ec20>] perf_swevent_start_hrtimer.part.41+0x7a/0x85 [<81080792>] task_clock_event_start+0x3a/0x3f [<810807a4>] task_clock_event_add+0xd/0x14 [<8108259a>] event_sched_in+0xb6/0x17a [<810826a2>] group_sched_in+0x44/0x122 [<81082885>] ctx_sched_in.isra.67+0x105/0x11f [<810828e6>] perf_event_sched_in.isra.70+0x47/0x4b [<81082bf6>] __perf_install_in_context+0x8b/0xa3 [<8107eb8e>] remote_function+0x12/0x2a [<8105f5af>] smp_call_function_single+0x2d/0x53 [<8107e17d>] task_function_call+0x30/0x36 [<8107fb82>] perf_install_in_context+0x87/0xbb [<810852c9>] SYSC_perf_event_open+0x5c6/0x701 [<810856f9>] SyS_perf_event_open+0x17/0x19 [<8142f8ee>] syscall_call+0x7/0xb -> #4 (&ctx->lock){......}: [<8104a942>] lock_acquire+0x92/0x101 [<8142f04c>] _raw_spin_lock+0x21/0x30 [<81081df3>] __perf_event_task_sched_out+0x1dc/0x34f [<8142cacc>] __schedule+0x4c6/0x4cb [<8142cae0>] schedule+0xf/0x11 [<8142f9a6>] work_resched+0x5/0x30 -> #3 (&rq->lock){-.-.-.}: [<8104a942>] lock_acquire+0x92/0x101 [<8142f04c>] _raw_spin_lock+0x21/0x30 [<81040873>] __task_rq_lock+0x33/0x3a [<8104184c>] wake_up_new_task+0x25/0xc2 [<8102474b>] do_fork+0x15c/0x2a0 [<810248a9>] kernel_thread+0x1a/0x1f [<814232a2>] rest_init+0x1a/0x10e [<817af949>] start_kernel+0x303/0x308 [<817af2ab>] i386_start_kernel+0x79/0x7d -> #2 (&p->pi_lock){-.-...}: [<8104a942>] lock_acquire+0x92/0x101 [<8142f11d>] _raw_spin_lock_irqsave+0x2e/0x3e [<810413dd>] try_to_wake_up+0x1d/0xd6 [<810414cd>] default_wake_function+0xb/0xd [<810461f3>] __wake_up_common+0x39/0x59 [<81046346>] __wake_up+0x29/0x3b [<811b8733>] tty_wakeup+0x49/0x51 [<811c3568>] uart_write_wakeup+0x17/0x19 [<811c5dc1>] serial8250_tx_chars+0xbc/0xfb [<811c5f28>] serial8250_handle_irq+0x54/0x6a [<811c5f57>] serial8250_default_handle_irq+0x19/0x1c [<811c56d8>] serial8250_interrupt+0x38/0x9e [<810510e7>] handle_irq_event_percpu+0x5f/0x1e2 [<81051296>] handle_irq_event+0x2c/0x43 [<81052cee>] handle_level_irq+0x57/0x80 [<81002a72>] handle_irq+0x46/0x5c [<810027df>] do_IRQ+0x32/0x89 [<8143036e>] common_interrupt+0x2e/0x33 [<8142f23c>] _raw_spin_unlock_irqrestore+0x3f/0x49 [<811c25a4>] uart_start+0x2d/0x32 [<811c2c04>] uart_write+0xc7/0xd6 [<811bc6f6>] n_tty_write+0xb8/0x35e [<811b9beb>] tty_write+0x163/0x1e4 [<811b9cd9>] redirected_tty_write+0x6d/0x75 [<810b6ed6>] vfs_write+0x75/0xb0 [<810b7265>] SyS_write+0x44/0x77 [<8142f8ee>] syscall_call+0x7/0xb -> #1 (&tty->write_wait){-.....}: [<8104a942>] lock_acquire+0x92/0x101 [<8142f11d>] _raw_spin_lock_irqsave+0x2e/0x3e [<81046332>] __wake_up+0x15/0x3b [<811b8733>] tty_wakeup+0x49/0x51 [<811c3568>] uart_write_wakeup+0x17/0x19 [<811c5dc1>] serial8250_tx_chars+0xbc/0xfb [<811c5f28>] serial8250_handle_irq+0x54/0x6a [<811c5f57>] serial8250_default_handle_irq+0x19/0x1c [<811c56d8>] serial8250_interrupt+0x38/0x9e [<810510e7>] handle_irq_event_percpu+0x5f/0x1e2 [<81051296>] handle_irq_event+0x2c/0x43 [<81052cee>] handle_level_irq+0x57/0x80 [<81002a72>] handle_irq+0x46/0x5c [<810027df>] do_IRQ+0x32/0x89 [<8143036e>] common_interrupt+0x2e/0x33 [<8142f23c>] _raw_spin_unlock_irqrestore+0x3f/0x49 [<811c25a4>] uart_start+0x2d/0x32 [<811c2c04>] uart_write+0xc7/0xd6 [<811bc6f6>] n_tty_write+0xb8/0x35e [<811b9beb>] tty_write+0x163/0x1e4 [<811b9cd9>] redirected_tty_write+0x6d/0x75 [<810b6ed6>] vfs_write+0x75/0xb0 [<810b7265>] SyS_write+0x44/0x77 [<8142f8ee>] syscall_call+0x7/0xb -> #0 (&port_lock_key){-.....}: [<8104a62d>] __lock_acquire+0x9ea/0xc6d [<8104a942>] lock_acquire+0x92/0x101 [<8142f11d>] _raw_spin_lock_irqsave+0x2e/0x3e [<811c60be>] serial8250_console_write+0x8c/0x10c [<8104e402>] call_console_drivers.constprop.31+0x87/0x118 [<8104f5d5>] console_unlock+0x1d7/0x398 [<8104fb70>] vprintk_emit+0x3da/0x3e4 [<81425f76>] printk+0x17/0x19 [<8105bfa0>] clockevents_program_min_delta+0x104/0x116 [<8105c548>] clockevents_program_event+0xe7/0xf3 [<8105cc1c>] tick_program_event+0x1e/0x23 [<8103c43c>] hrtimer_force_reprogram+0x88/0x8f [<8103c49e>] __remove_hrtimer+0x5b/0x79 [<8103cb21>] hrtimer_try_to_cancel+0x49/0x66 [<8103cb4b>] hrtimer_cancel+0xd/0x18 [<8107f102>] perf_swevent_cancel_hrtimer.part.60+0x2b/0x30 [<81080705>] task_clock_event_stop+0x20/0x64 [<81080756>] task_clock_event_del+0xd/0xf [<81081350>] event_sched_out+0xab/0x11e [<810813e0>] group_sched_out+0x1d/0x66 [<81081682>] ctx_sched_out+0xaf/0xbf [<81081e04>] __perf_event_task_sched_out+0x1ed/0x34f [<8142cacc>] __schedule+0x4c6/0x4cb [<8142cae0>] schedule+0xf/0x11 [<8142f9a6>] work_resched+0x5/0x30 other info that might help us debug this: Chain exists of: &port_lock_key --> &ctx->lock --> hrtimer_bases.lock Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(hrtimer_bases.lock); lock(&ctx->lock); lock(hrtimer_bases.lock); lock(&port_lock_key); *** DEADLOCK *** 4 locks held by trinity-main/74: #0: (&rq->lock){-.-.-.}, at: [<8142c6f3>] __schedule+0xed/0x4cb #1: (&ctx->lock){......}, at: [<81081df3>] __perf_event_task_sched_out+0x1dc/0x34f #2: (hrtimer_bases.lock){-.-...}, at: [<8103caeb>] hrtimer_try_to_cancel+0x13/0x66 #3: (console_lock){+.+...}, at: [<8104fb5d>] vprintk_emit+0x3c7/0x3e4 stack backtrace: CPU: 0 PID: 74 Comm: trinity-main Not tainted 3.15.0-rc8-06195-g939f04b #2 00000000 81c3a310 8b995c14 81426f69 8b995c44 81425a99 8161f671 8161f570 8161f538 8161f559 8161f538 8b995c78 8b142bb0 00000004 8b142fdc 8b142bb0 8b995ca8 8104a62d 8b142fac 000016f2 81c3a310 00000001 00000001 00000003 Call Trace: [<81426f69>] dump_stack+0x16/0x18 [<81425a99>] print_circular_bug+0x18f/0x19c [<8104a62d>] __lock_acquire+0x9ea/0xc6d [<8104a942>] lock_acquire+0x92/0x101 [<811c60be>] ? serial8250_console_write+0x8c/0x10c [<811c6032>] ? wait_for_xmitr+0x76/0x76 [<8142f11d>] _raw_spin_lock_irqsave+0x2e/0x3e [<811c60be>] ? serial8250_console_write+0x8c/0x10c [<811c60be>] serial8250_console_write+0x8c/0x10c [<8104af87>] ? lock_release+0x191/0x223 [<811c6032>] ? wait_for_xmitr+0x76/0x76 [<8104e402>] call_console_drivers.constprop.31+0x87/0x118 [<8104f5d5>] console_unlock+0x1d7/0x398 [<8104fb70>] vprintk_emit+0x3da/0x3e4 [<81425f76>] printk+0x17/0x19 [<8105bfa0>] clockevents_program_min_delta+0x104/0x116 [<8105cc1c>] tick_program_event+0x1e/0x23 [<8103c43c>] hrtimer_force_reprogram+0x88/0x8f [<8103c49e>] __remove_hrtimer+0x5b/0x79 [<8103cb21>] hrtimer_try_to_cancel+0x49/0x66 [<8103cb4b>] hrtimer_cancel+0xd/0x18 [<8107f102>] perf_swevent_cancel_hrtimer.part.60+0x2b/0x30 [<81080705>] task_clock_event_stop+0x20/0x64 [<81080756>] task_clock_event_del+0xd/0xf [<81081350>] event_sched_out+0xab/0x11e [<810813e0>] group_sched_out+0x1d/0x66 [<81081682>] ctx_sched_out+0xaf/0xbf [<81081e04>] __perf_event_task_sched_out+0x1ed/0x34f [<8104416d>] ? __dequeue_entity+0x23/0x27 [<81044505>] ? pick_next_task_fair+0xb1/0x120 [<8142cacc>] __schedule+0x4c6/0x4cb [<81047574>] ? trace_hardirqs_off_caller+0xd7/0x108 [<810475b0>] ? trace_hardirqs_off+0xb/0xd [<81056346>] ? rcu_irq_exit+0x64/0x77 Fix the problem by using printk_deferred() which does not call into the scheduler. Reported-by: Fengguang Wu <[email protected]> Signed-off-by: Jan Kara <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: franciscofranco <[email protected]>
franciscofranco
pushed a commit
that referenced
this issue
Apr 17, 2015
commit 71abdc15adf8c702a1dd535f8e30df50758848d2 upstream. When kswapd exits, it can end up taking locks that were previously held by allocating tasks while they waited for reclaim. Lockdep currently warns about this: On Wed, May 28, 2014 at 06:06:34PM +0800, Gu Zheng wrote: > inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-R} usage. > kswapd2/1151 [HC0[0]:SC0[0]:HE1:SE1] takes: > (&sig->group_rwsem){+++++?}, at: exit_signals+0x24/0x130 > {RECLAIM_FS-ON-W} state was registered at: > mark_held_locks+0xb9/0x140 > lockdep_trace_alloc+0x7a/0xe0 > kmem_cache_alloc_trace+0x37/0x240 > flex_array_alloc+0x99/0x1a0 > cgroup_attach_task+0x63/0x430 > attach_task_by_pid+0x210/0x280 > cgroup_procs_write+0x16/0x20 > cgroup_file_write+0x120/0x2c0 > vfs_write+0xc0/0x1f0 > SyS_write+0x4c/0xa0 > tracesys+0xdd/0xe2 > irq event stamp: 49 > hardirqs last enabled at (49): _raw_spin_unlock_irqrestore+0x36/0x70 > hardirqs last disabled at (48): _raw_spin_lock_irqsave+0x2b/0xa0 > softirqs last enabled at (0): copy_process.part.24+0x627/0x15f0 > softirqs last disabled at (0): (null) > > other info that might help us debug this: > Possible unsafe locking scenario: > > CPU0 > ---- > lock(&sig->group_rwsem); > <Interrupt> > lock(&sig->group_rwsem); > > *** DEADLOCK *** > > no locks held by kswapd2/1151. > > stack backtrace: > CPU: 30 PID: 1151 Comm: kswapd2 Not tainted 3.10.39+ #4 > Call Trace: > dump_stack+0x19/0x1b > print_usage_bug+0x1f7/0x208 > mark_lock+0x21d/0x2a0 > __lock_acquire+0x52a/0xb60 > lock_acquire+0xa2/0x140 > down_read+0x51/0xa0 > exit_signals+0x24/0x130 > do_exit+0xb5/0xa50 > kthread+0xdb/0x100 > ret_from_fork+0x7c/0xb0 This is because the kswapd thread is still marked as a reclaimer at the time of exit. But because it is exiting, nobody is actually waiting on it to make reclaim progress anymore, and it's nothing but a regular thread at this point. Be tidy and strip it of all its powers (PF_MEMALLOC, PF_SWAPWRITE, PF_KSWAPD, and the lockdep reclaim state) before returning from the thread function. Signed-off-by: Johannes Weiner <[email protected]> Reported-by: Gu Zheng <[email protected]> Cc: Yasuaki Ishimatsu <[email protected]> Cc: Tang Chen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: franciscofranco <[email protected]>
franciscofranco
pushed a commit
that referenced
this issue
Apr 17, 2015
commit 504d58745c9ca28d33572e2d8a9990b43e06075d upstream. clockevents_increase_min_delta() calls printk() from under hrtimer_bases.lock. That causes lock inversion on scheduler locks because printk() can call into the scheduler. Lockdep puts it as: ====================================================== [ INFO: possible circular locking dependency detected ] 3.15.0-rc8-06195-g939f04b #2 Not tainted ------------------------------------------------------- trinity-main/74 is trying to acquire lock: (&port_lock_key){-.....}, at: [<811c60be>] serial8250_console_write+0x8c/0x10c but task is already holding lock: (hrtimer_bases.lock){-.-...}, at: [<8103caeb>] hrtimer_try_to_cancel+0x13/0x66 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #5 (hrtimer_bases.lock){-.-...}: [<8104a942>] lock_acquire+0x92/0x101 [<8142f11d>] _raw_spin_lock_irqsave+0x2e/0x3e [<8103c918>] __hrtimer_start_range_ns+0x1c/0x197 [<8107ec20>] perf_swevent_start_hrtimer.part.41+0x7a/0x85 [<81080792>] task_clock_event_start+0x3a/0x3f [<810807a4>] task_clock_event_add+0xd/0x14 [<8108259a>] event_sched_in+0xb6/0x17a [<810826a2>] group_sched_in+0x44/0x122 [<81082885>] ctx_sched_in.isra.67+0x105/0x11f [<810828e6>] perf_event_sched_in.isra.70+0x47/0x4b [<81082bf6>] __perf_install_in_context+0x8b/0xa3 [<8107eb8e>] remote_function+0x12/0x2a [<8105f5af>] smp_call_function_single+0x2d/0x53 [<8107e17d>] task_function_call+0x30/0x36 [<8107fb82>] perf_install_in_context+0x87/0xbb [<810852c9>] SYSC_perf_event_open+0x5c6/0x701 [<810856f9>] SyS_perf_event_open+0x17/0x19 [<8142f8ee>] syscall_call+0x7/0xb -> #4 (&ctx->lock){......}: [<8104a942>] lock_acquire+0x92/0x101 [<8142f04c>] _raw_spin_lock+0x21/0x30 [<81081df3>] __perf_event_task_sched_out+0x1dc/0x34f [<8142cacc>] __schedule+0x4c6/0x4cb [<8142cae0>] schedule+0xf/0x11 [<8142f9a6>] work_resched+0x5/0x30 -> #3 (&rq->lock){-.-.-.}: [<8104a942>] lock_acquire+0x92/0x101 [<8142f04c>] _raw_spin_lock+0x21/0x30 [<81040873>] __task_rq_lock+0x33/0x3a [<8104184c>] wake_up_new_task+0x25/0xc2 [<8102474b>] do_fork+0x15c/0x2a0 [<810248a9>] kernel_thread+0x1a/0x1f [<814232a2>] rest_init+0x1a/0x10e [<817af949>] start_kernel+0x303/0x308 [<817af2ab>] i386_start_kernel+0x79/0x7d -> #2 (&p->pi_lock){-.-...}: [<8104a942>] lock_acquire+0x92/0x101 [<8142f11d>] _raw_spin_lock_irqsave+0x2e/0x3e [<810413dd>] try_to_wake_up+0x1d/0xd6 [<810414cd>] default_wake_function+0xb/0xd [<810461f3>] __wake_up_common+0x39/0x59 [<81046346>] __wake_up+0x29/0x3b [<811b8733>] tty_wakeup+0x49/0x51 [<811c3568>] uart_write_wakeup+0x17/0x19 [<811c5dc1>] serial8250_tx_chars+0xbc/0xfb [<811c5f28>] serial8250_handle_irq+0x54/0x6a [<811c5f57>] serial8250_default_handle_irq+0x19/0x1c [<811c56d8>] serial8250_interrupt+0x38/0x9e [<810510e7>] handle_irq_event_percpu+0x5f/0x1e2 [<81051296>] handle_irq_event+0x2c/0x43 [<81052cee>] handle_level_irq+0x57/0x80 [<81002a72>] handle_irq+0x46/0x5c [<810027df>] do_IRQ+0x32/0x89 [<8143036e>] common_interrupt+0x2e/0x33 [<8142f23c>] _raw_spin_unlock_irqrestore+0x3f/0x49 [<811c25a4>] uart_start+0x2d/0x32 [<811c2c04>] uart_write+0xc7/0xd6 [<811bc6f6>] n_tty_write+0xb8/0x35e [<811b9beb>] tty_write+0x163/0x1e4 [<811b9cd9>] redirected_tty_write+0x6d/0x75 [<810b6ed6>] vfs_write+0x75/0xb0 [<810b7265>] SyS_write+0x44/0x77 [<8142f8ee>] syscall_call+0x7/0xb -> #1 (&tty->write_wait){-.....}: [<8104a942>] lock_acquire+0x92/0x101 [<8142f11d>] _raw_spin_lock_irqsave+0x2e/0x3e [<81046332>] __wake_up+0x15/0x3b [<811b8733>] tty_wakeup+0x49/0x51 [<811c3568>] uart_write_wakeup+0x17/0x19 [<811c5dc1>] serial8250_tx_chars+0xbc/0xfb [<811c5f28>] serial8250_handle_irq+0x54/0x6a [<811c5f57>] serial8250_default_handle_irq+0x19/0x1c [<811c56d8>] serial8250_interrupt+0x38/0x9e [<810510e7>] handle_irq_event_percpu+0x5f/0x1e2 [<81051296>] handle_irq_event+0x2c/0x43 [<81052cee>] handle_level_irq+0x57/0x80 [<81002a72>] handle_irq+0x46/0x5c [<810027df>] do_IRQ+0x32/0x89 [<8143036e>] common_interrupt+0x2e/0x33 [<8142f23c>] _raw_spin_unlock_irqrestore+0x3f/0x49 [<811c25a4>] uart_start+0x2d/0x32 [<811c2c04>] uart_write+0xc7/0xd6 [<811bc6f6>] n_tty_write+0xb8/0x35e [<811b9beb>] tty_write+0x163/0x1e4 [<811b9cd9>] redirected_tty_write+0x6d/0x75 [<810b6ed6>] vfs_write+0x75/0xb0 [<810b7265>] SyS_write+0x44/0x77 [<8142f8ee>] syscall_call+0x7/0xb -> #0 (&port_lock_key){-.....}: [<8104a62d>] __lock_acquire+0x9ea/0xc6d [<8104a942>] lock_acquire+0x92/0x101 [<8142f11d>] _raw_spin_lock_irqsave+0x2e/0x3e [<811c60be>] serial8250_console_write+0x8c/0x10c [<8104e402>] call_console_drivers.constprop.31+0x87/0x118 [<8104f5d5>] console_unlock+0x1d7/0x398 [<8104fb70>] vprintk_emit+0x3da/0x3e4 [<81425f76>] printk+0x17/0x19 [<8105bfa0>] clockevents_program_min_delta+0x104/0x116 [<8105c548>] clockevents_program_event+0xe7/0xf3 [<8105cc1c>] tick_program_event+0x1e/0x23 [<8103c43c>] hrtimer_force_reprogram+0x88/0x8f [<8103c49e>] __remove_hrtimer+0x5b/0x79 [<8103cb21>] hrtimer_try_to_cancel+0x49/0x66 [<8103cb4b>] hrtimer_cancel+0xd/0x18 [<8107f102>] perf_swevent_cancel_hrtimer.part.60+0x2b/0x30 [<81080705>] task_clock_event_stop+0x20/0x64 [<81080756>] task_clock_event_del+0xd/0xf [<81081350>] event_sched_out+0xab/0x11e [<810813e0>] group_sched_out+0x1d/0x66 [<81081682>] ctx_sched_out+0xaf/0xbf [<81081e04>] __perf_event_task_sched_out+0x1ed/0x34f [<8142cacc>] __schedule+0x4c6/0x4cb [<8142cae0>] schedule+0xf/0x11 [<8142f9a6>] work_resched+0x5/0x30 other info that might help us debug this: Chain exists of: &port_lock_key --> &ctx->lock --> hrtimer_bases.lock Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(hrtimer_bases.lock); lock(&ctx->lock); lock(hrtimer_bases.lock); lock(&port_lock_key); *** DEADLOCK *** 4 locks held by trinity-main/74: #0: (&rq->lock){-.-.-.}, at: [<8142c6f3>] __schedule+0xed/0x4cb #1: (&ctx->lock){......}, at: [<81081df3>] __perf_event_task_sched_out+0x1dc/0x34f #2: (hrtimer_bases.lock){-.-...}, at: [<8103caeb>] hrtimer_try_to_cancel+0x13/0x66 #3: (console_lock){+.+...}, at: [<8104fb5d>] vprintk_emit+0x3c7/0x3e4 stack backtrace: CPU: 0 PID: 74 Comm: trinity-main Not tainted 3.15.0-rc8-06195-g939f04b #2 00000000 81c3a310 8b995c14 81426f69 8b995c44 81425a99 8161f671 8161f570 8161f538 8161f559 8161f538 8b995c78 8b142bb0 00000004 8b142fdc 8b142bb0 8b995ca8 8104a62d 8b142fac 000016f2 81c3a310 00000001 00000001 00000003 Call Trace: [<81426f69>] dump_stack+0x16/0x18 [<81425a99>] print_circular_bug+0x18f/0x19c [<8104a62d>] __lock_acquire+0x9ea/0xc6d [<8104a942>] lock_acquire+0x92/0x101 [<811c60be>] ? serial8250_console_write+0x8c/0x10c [<811c6032>] ? wait_for_xmitr+0x76/0x76 [<8142f11d>] _raw_spin_lock_irqsave+0x2e/0x3e [<811c60be>] ? serial8250_console_write+0x8c/0x10c [<811c60be>] serial8250_console_write+0x8c/0x10c [<8104af87>] ? lock_release+0x191/0x223 [<811c6032>] ? wait_for_xmitr+0x76/0x76 [<8104e402>] call_console_drivers.constprop.31+0x87/0x118 [<8104f5d5>] console_unlock+0x1d7/0x398 [<8104fb70>] vprintk_emit+0x3da/0x3e4 [<81425f76>] printk+0x17/0x19 [<8105bfa0>] clockevents_program_min_delta+0x104/0x116 [<8105cc1c>] tick_program_event+0x1e/0x23 [<8103c43c>] hrtimer_force_reprogram+0x88/0x8f [<8103c49e>] __remove_hrtimer+0x5b/0x79 [<8103cb21>] hrtimer_try_to_cancel+0x49/0x66 [<8103cb4b>] hrtimer_cancel+0xd/0x18 [<8107f102>] perf_swevent_cancel_hrtimer.part.60+0x2b/0x30 [<81080705>] task_clock_event_stop+0x20/0x64 [<81080756>] task_clock_event_del+0xd/0xf [<81081350>] event_sched_out+0xab/0x11e [<810813e0>] group_sched_out+0x1d/0x66 [<81081682>] ctx_sched_out+0xaf/0xbf [<81081e04>] __perf_event_task_sched_out+0x1ed/0x34f [<8104416d>] ? __dequeue_entity+0x23/0x27 [<81044505>] ? pick_next_task_fair+0xb1/0x120 [<8142cacc>] __schedule+0x4c6/0x4cb [<81047574>] ? trace_hardirqs_off_caller+0xd7/0x108 [<810475b0>] ? trace_hardirqs_off+0xb/0xd [<81056346>] ? rcu_irq_exit+0x64/0x77 Fix the problem by using printk_deferred() which does not call into the scheduler. Reported-by: Fengguang Wu <[email protected]> Signed-off-by: Jan Kara <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: franciscofranco <[email protected]>
DerRomtester
referenced
this issue
in DerRomtester/one_plus_one
May 2, 2015
Linux 3.4.68 - 3.4.105 net/ping: handle protocol mismatching scenario ping_lookup() may return a wrong sock if sk_buff's and sock's protocols dont' match. For example, sk_buff's protocol is ETH_P_IPV6, but sock's sk_family is AF_INET, in that case, if sk->sk_bound_dev_if is zero, a wrong sock will be returned. the fix is to "continue" the searching, if no matching, return NULL. [cherry-pick of net 91a0b603469069cdcce4d572b7525ffc9fd352a6] Bug: 18512516 Change-Id: I520223ce53c0d4e155c37d6b65a03489cc7fd494 Cc: "David S. Miller" <[email protected]> Cc: Alexey Kuznetsov <[email protected]> Cc: James Morris <[email protected]> Cc: Hideaki YOSHIFUJI <[email protected]> Cc: Patrick McHardy <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Jane Zhou <[email protected]> Signed-off-by: Yiwei Zhao <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Lorenzo Colitti <[email protected]> tcp: must unclone packets before mangling them [ Upstream commit c52e2421f7368fd36cbe330d2cf41b10452e39a9 ] TCP stack should make sure it owns skbs before mangling them. We had various crashes using bnx2x, and it turned out gso_size was cleared right before bnx2x driver was populating TC descriptor of the _previous_ packet send. TCP stack can sometime retransmit packets that are still in Qdisc. Of course we could make bnx2x driver more robust (using ACCESS_ONCE(shinfo->gso_size) for example), but the bug is TCP stack. We have identified two points where skb_unclone() was needed. This patch adds a WARN_ON_ONCE() to warn us if we missed another fix of this kind. Kudos to Neal for finding the root cause of this bug. Its visible using small MSS. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: Neal Cardwell <[email protected]> Cc: Yuchung Cheng <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> tcp: do not forget FIN in tcp_shifted_skb() [ Upstream commit 5e8a402f831dbe7ee831340a91439e46f0d38acd ] Yuchung found following problem : There are bugs in the SACK processing code, merging part in tcp_shift_skb_data(), that incorrectly resets or ignores the sacked skbs FIN flag. When a receiver first SACK the FIN sequence, and later throw away ofo queue (e.g., sack-reneging), the sender will stop retransmitting the FIN flag, and hangs forever. Following packetdrill test can be used to reproduce the bug. $ cat sack-merge-bug.pkt `sysctl -q net.ipv4.tcp_fack=0` // Establish a connection and send 10 MSS. 0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 +.000 bind(3, ..., ...) = 0 +.000 listen(3, 1) = 0 +.050 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7> +.000 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 6> +.001 < . 1:1(0) ack 1 win 1024 +.000 accept(3, ..., ...) = 4 +.100 write(4, ..., 12000) = 12000 +.000 shutdown(4, SHUT_WR) = 0 +.000 > . 1:10001(10000) ack 1 +.050 < . 1:1(0) ack 2001 win 257 +.000 > FP. 10001:12001(2000) ack 1 +.050 < . 1:1(0) ack 2001 win 257 <sack 10001:11001,nop,nop> +.050 < . 1:1(0) ack 2001 win 257 <sack 10001:12002,nop,nop> // SACK reneg +.050 < . 1:1(0) ack 12001 win 257 +0 %{ print "unacked: ",tcpi_unacked }% +5 %{ print "" }% First, a typo inverted left/right of one OR operation, then code forgot to advance end_seq if the merged skb carried FIN. Bug was added in 2.6.29 by commit 832d11c5cd076ab ("tcp: Try to restore large SKBs while SACK processing") Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: Yuchung Cheng <[email protected]> Acked-by: Neal Cardwell <[email protected]> Cc: Ilpo Järvinen <[email protected]> Acked-by: Ilpo Järvinen <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: do not call sock_put() on TIMEWAIT sockets [ Upstream commit 80ad1d61e72d626e30ebe8529a0455e660ca4693 ] commit 3ab5aee7fe84 ("net: Convert TCP & DCCP hash tables to use RCU / hlist_nulls") incorrectly used sock_put() on TIMEWAIT sockets. We should instead use inet_twsk_put() Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: mv643xx_eth: update statistics timer from timer context only [ Upstream commit 041b4ddb84989f06ff1df0ca869b950f1ee3cb1c ] Each port driver installs a periodic timer to update port statistics by calling mib_counters_update. As mib_counters_update is also called from non-timer context, we should not reschedule the timer there but rather move it to timer-only context. Signed-off-by: Sebastian Hesselbarth <[email protected]> Acked-by: Jason Cooper <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: mv643xx_eth: fix orphaned statistics timer crash [ Upstream commit f564412c935111c583b787bcc18157377b208e2e ] The periodic statistics timer gets started at port _probe() time, but is stopped on _stop() only. In a modular environment, this can cause the timer to access already deallocated memory, if the module is unloaded without starting the eth device. To fix this, we add the timer right before the port is started, instead of at _probe() time. Signed-off-by: Sebastian Hesselbarth <[email protected]> Acked-by: Jason Cooper <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: heap overflow in __audit_sockaddr() [ Upstream commit 1661bf364ae9c506bc8795fef70d1532931be1e8 ] We need to cap ->msg_namelen or it leads to a buffer overflow when we to the memcpy() in __audit_sockaddr(). It requires CAP_AUDIT_CONTROL to exploit this bug. The call tree is: ___sys_recvmsg() move_addr_to_user() audit_sockaddr() __audit_sockaddr() Reported-by: Jüri Aedla <[email protected]> Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> proc connector: fix info leaks [ Upstream commit e727ca82e0e9616ab4844301e6bae60ca7327682 ] Initialize event_data for all possible message types to prevent leaking kernel stack contents to userland (up to 20 bytes). Also set the flags member of the connector message to 0 to prevent leaking two more stack bytes this way. Signed-off-by: Mathias Krause <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ipv4: fix ineffective source address selection [ Upstream commit 0a7e22609067ff524fc7bbd45c6951dd08561667 ] When sending out multicast messages, the source address in inet->mc_addr is ignored and rewritten by an autoselected one. This is caused by a typo in commit 813b3b5db831 ("ipv4: Use caller's on-stack flowi as-is in output route lookups"). Signed-off-by: Jiri Benc <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> can: dev: fix nlmsg size calculation in can_get_size() [ Upstream commit fe119a05f8ca481623a8d02efcc984332e612528 ] This patch fixes the calculation of the nlmsg size, by adding the missing nla_total_size(). Signed-off-by: Marc Kleine-Budde <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ipv6: restrict neighbor entry creation to output flow This patch is based on 3.2.y branch, the one used by reporter. Please let me know if it should be different. Thanks. The patch which introduced the regression was applied on stables: 3.0.64 3.4.31 3.7.8 3.2.39 The patch which introduced the regression was for stable trees only. ---8<--- Commit 0d6a77079c475033cb622c07c5a880b392ef664e "ipv6: do not create neighbor entries for local delivery" introduced a regression on which routes to local delivery would not work anymore. Like this: $ ip -6 route add local 2001::/64 dev lo $ ping6 -c1 2001::9 PING 2001::9(2001::9) 56 data bytes ping: sendmsg: Invalid argument As this is a local delivery, that commit would not allow the creation of a neighbor entry and thus the packet cannot be sent. But as TPROXY scenario actually needs to avoid the neighbor entry creation only for input flow, this patch now limits previous patch to input flow, keeping output as before that patch. Reported-by: Debabrata Banerjee <[email protected]> Signed-off-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: Jiri Pirko <[email protected]> Acked-by: Hannes Frederic Sowa <[email protected]> CC: Hannes Frederic Sowa <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> bridge: Correctly clamp MAX forward_delay when enabling STP [ Upstream commit 4b6c7879d84ad06a2ac5b964808ed599187a188d ] Commit be4f154d5ef0ca147ab6bcd38857a774133f5450 bridge: Clamp forward_delay when enabling STP had a typo when attempting to clamp maximum forward delay. It is possible to set bridge_forward_delay to be higher then permitted maximum when STP is off. When turning STP on, the higher then allowed delay has to be clamed down to max value. Signed-off-by: Vlad Yasevich <[email protected]> CC: Herbert Xu <[email protected]> CC: Stephen Hemminger <[email protected]> Reviewed-by: Veaceslav Falico <[email protected]> Acked-by: Herbert Xu <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: vlan: fix nlmsg size calculation in vlan_get_size() [ Upstream commit c33a39c575068c2ea9bffb22fd6de2df19c74b89 ] This patch fixes the calculation of the nlmsg size, by adding the missing nla_total_size(). Cc: Patrick McHardy <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> l2tp: must disable bh before calling l2tp_xmit_skb() [ Upstream commit 455cc32bf128e114455d11ad919321ab89a2c312 ] François Cachereul made a very nice bug report and suspected the bh_lock_sock() / bh_unlok_sock() pair used in l2tp_xmit_skb() from process context was not good. This problem was added by commit 6af88da14ee284aaad6e4326da09a89191ab6165 ("l2tp: Fix locking in l2tp_core.c"). l2tp_eth_dev_xmit() runs from BH context, so we must disable BH from other l2tp_xmit_skb() users. [ 452.060011] BUG: soft lockup - CPU#1 stuck for 23s! [accel-pppd:6662] [ 452.061757] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core pppoe pppox ppp_generic slhc ipv6 ext3 mbcache jbd virtio_balloon xfs exportfs dm_mod virtio_blk ata_generic virtio_net floppy ata_piix libata virtio_pci virtio_ring virtio [last unloaded: scsi_wait_scan] [ 452.064012] CPU 1 [ 452.080015] BUG: soft lockup - CPU#2 stuck for 23s! [accel-pppd:6643] [ 452.080015] CPU 2 [ 452.080015] [ 452.080015] Pid: 6643, comm: accel-pppd Not tainted 3.2.46.mini #1 Bochs Bochs [ 452.080015] RIP: 0010:[<ffffffff81059f6c>] [<ffffffff81059f6c>] do_raw_spin_lock+0x17/0x1f [ 452.080015] RSP: 0018:ffff88007125fc18 EFLAGS: 00000293 [ 452.080015] RAX: 000000000000aba9 RBX: ffffffff811d0703 RCX: 0000000000000000 [ 452.080015] RDX: 00000000000000ab RSI: ffff8800711f6896 RDI: ffff8800745c8110 [ 452.080015] RBP: ffff88007125fc18 R08: 0000000000000020 R09: 0000000000000000 [ 452.080015] R10: 0000000000000000 R11: 0000000000000280 R12: 0000000000000286 [ 452.080015] R13: 0000000000000020 R14: 0000000000000240 R15: 0000000000000000 [ 452.080015] FS: 00007fdc0cc24700(0000) GS:ffff8800b6f00000(0000) knlGS:0000000000000000 [ 452.080015] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 452.080015] CR2: 00007fdb054899b8 CR3: 0000000074404000 CR4: 00000000000006a0 [ 452.080015] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 452.080015] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 452.080015] Process accel-pppd (pid: 6643, threadinfo ffff88007125e000, task ffff8800b27e6dd0) [ 452.080015] Stack: [ 452.080015] ffff88007125fc28 ffffffff81256559 ffff88007125fc98 ffffffffa01b2bd1 [ 452.080015] ffff88007125fc58 000000000000000c 00000000029490d0 0000009c71dbe25e [ 452.080015] 000000000000005c 000000080000000e 0000000000000000 ffff880071170600 [ 452.080015] Call Trace: [ 452.080015] [<ffffffff81256559>] _raw_spin_lock+0xe/0x10 [ 452.080015] [<ffffffffa01b2bd1>] l2tp_xmit_skb+0x189/0x4ac [l2tp_core] [ 452.080015] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp] [ 452.080015] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24 [ 452.080015] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6 [ 452.080015] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616 [ 452.080015] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c [ 452.080015] [<ffffffff810bbd21>] ? fget_light+0x75/0x89 [ 452.080015] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56 [ 452.080015] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b [ 452.080015] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b [ 452.080015] Code: 81 48 89 e5 72 0c 31 c0 48 81 ff 45 66 25 81 0f 92 c0 5d c3 55 b8 00 01 00 00 48 89 e5 f0 66 0f c1 07 0f b6 d4 38 d0 74 06 f3 90 <8a> 07 eb f6 5d c3 90 90 55 48 89 e5 9c 58 0f 1f 44 00 00 5d c3 [ 452.080015] Call Trace: [ 452.080015] [<ffffffff81256559>] _raw_spin_lock+0xe/0x10 [ 452.080015] [<ffffffffa01b2bd1>] l2tp_xmit_skb+0x189/0x4ac [l2tp_core] [ 452.080015] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp] [ 452.080015] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24 [ 452.080015] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6 [ 452.080015] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616 [ 452.080015] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c [ 452.080015] [<ffffffff810bbd21>] ? fget_light+0x75/0x89 [ 452.080015] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56 [ 452.080015] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b [ 452.080015] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b [ 452.064012] [ 452.064012] Pid: 6662, comm: accel-pppd Not tainted 3.2.46.mini #1 Bochs Bochs [ 452.064012] RIP: 0010:[<ffffffff81059f6e>] [<ffffffff81059f6e>] do_raw_spin_lock+0x19/0x1f [ 452.064012] RSP: 0018:ffff8800b6e83ba0 EFLAGS: 00000297 [ 452.064012] RAX: 000000000000aaa9 RBX: ffff8800b6e83b40 RCX: 0000000000000002 [ 452.064012] RDX: 00000000000000aa RSI: 000000000000000a RDI: ffff8800745c8110 [ 452.064012] RBP: ffff8800b6e83ba0 R08: 000000000000c802 R09: 000000000000001c [ 452.064012] R10: ffff880071096c4e R11: 0000000000000006 R12: ffff8800b6e83b18 [ 452.064012] R13: ffffffff8125d51e R14: ffff8800b6e83ba0 R15: ffff880072a589c0 [ 452.064012] FS: 00007fdc0b81e700(0000) GS:ffff8800b6e80000(0000) knlGS:0000000000000000 [ 452.064012] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 452.064012] CR2: 0000000000625208 CR3: 0000000074404000 CR4: 00000000000006a0 [ 452.064012] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 452.064012] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 452.064012] Process accel-pppd (pid: 6662, threadinfo ffff88007129a000, task ffff8800744f7410) [ 452.064012] Stack: [ 452.064012] ffff8800b6e83bb0 ffffffff81256559 ffff8800b6e83bc0 ffffffff8121c64a [ 452.064012] ffff8800b6e83bf0 ffffffff8121ec7a ffff880072a589c0 ffff880071096c62 [ 452.064012] 0000000000000011 ffffffff81430024 ffff8800b6e83c80 ffffffff8121f276 [ 452.064012] Call Trace: [ 452.064012] <IRQ> [ 452.064012] [<ffffffff81256559>] _raw_spin_lock+0xe/0x10 [ 452.064012] [<ffffffff8121c64a>] spin_lock+0x9/0xb [ 452.064012] [<ffffffff8121ec7a>] udp_queue_rcv_skb+0x186/0x269 [ 452.064012] [<ffffffff8121f276>] __udp4_lib_rcv+0x297/0x4ae [ 452.064012] [<ffffffff8121c178>] ? raw_rcv+0xe9/0xf0 [ 452.064012] [<ffffffff8121f4a7>] udp_rcv+0x1a/0x1c [ 452.064012] [<ffffffff811fe385>] ip_local_deliver_finish+0x12b/0x1a5 [ 452.064012] [<ffffffff811fe54e>] ip_local_deliver+0x53/0x84 [ 452.064012] [<ffffffff811fe1d0>] ip_rcv_finish+0x2bc/0x2f3 [ 452.064012] [<ffffffff811fe78f>] ip_rcv+0x210/0x269 [ 452.064012] [<ffffffff8101911e>] ? kvm_clock_get_cycles+0x9/0xb [ 452.064012] [<ffffffff811d88cd>] __netif_receive_skb+0x3a5/0x3f7 [ 452.064012] [<ffffffff811d8eba>] netif_receive_skb+0x57/0x5e [ 452.064012] [<ffffffff811cf30f>] ? __netdev_alloc_skb+0x1f/0x3b [ 452.064012] [<ffffffffa0049126>] virtnet_poll+0x4ba/0x5a4 [virtio_net] [ 452.064012] [<ffffffff811d9417>] net_rx_action+0x73/0x184 [ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core] [ 452.064012] [<ffffffff810343b9>] __do_softirq+0xc3/0x1a8 [ 452.064012] [<ffffffff81013b56>] ? ack_APIC_irq+0x10/0x12 [ 452.064012] [<ffffffff81256559>] ? _raw_spin_lock+0xe/0x10 [ 452.064012] [<ffffffff8125e0ac>] call_softirq+0x1c/0x26 [ 452.064012] [<ffffffff81003587>] do_softirq+0x45/0x82 [ 452.064012] [<ffffffff81034667>] irq_exit+0x42/0x9c [ 452.064012] [<ffffffff8125e146>] do_IRQ+0x8e/0xa5 [ 452.064012] [<ffffffff8125676e>] common_interrupt+0x6e/0x6e [ 452.064012] <EOI> [ 452.064012] [<ffffffff810b82a1>] ? kfree+0x8a/0xa3 [ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core] [ 452.064012] [<ffffffffa01b2c25>] ? l2tp_xmit_skb+0x1dd/0x4ac [l2tp_core] [ 452.064012] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp] [ 452.064012] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24 [ 452.064012] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6 [ 452.064012] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616 [ 452.064012] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c [ 452.064012] [<ffffffff810bbd21>] ? fget_light+0x75/0x89 [ 452.064012] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56 [ 452.064012] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b [ 452.064012] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b [ 452.064012] Code: 89 e5 72 0c 31 c0 48 81 ff 45 66 25 81 0f 92 c0 5d c3 55 b8 00 01 00 00 48 89 e5 f0 66 0f c1 07 0f b6 d4 38 d0 74 06 f3 90 8a 07 <eb> f6 5d c3 90 90 55 48 89 e5 9c 58 0f 1f 44 00 00 5d c3 55 48 [ 452.064012] Call Trace: [ 452.064012] <IRQ> [<ffffffff81256559>] _raw_spin_lock+0xe/0x10 [ 452.064012] [<ffffffff8121c64a>] spin_lock+0x9/0xb [ 452.064012] [<ffffffff8121ec7a>] udp_queue_rcv_skb+0x186/0x269 [ 452.064012] [<ffffffff8121f276>] __udp4_lib_rcv+0x297/0x4ae [ 452.064012] [<ffffffff8121c178>] ? raw_rcv+0xe9/0xf0 [ 452.064012] [<ffffffff8121f4a7>] udp_rcv+0x1a/0x1c [ 452.064012] [<ffffffff811fe385>] ip_local_deliver_finish+0x12b/0x1a5 [ 452.064012] [<ffffffff811fe54e>] ip_local_deliver+0x53/0x84 [ 452.064012] [<ffffffff811fe1d0>] ip_rcv_finish+0x2bc/0x2f3 [ 452.064012] [<ffffffff811fe78f>] ip_rcv+0x210/0x269 [ 452.064012] [<ffffffff8101911e>] ? kvm_clock_get_cycles+0x9/0xb [ 452.064012] [<ffffffff811d88cd>] __netif_receive_skb+0x3a5/0x3f7 [ 452.064012] [<ffffffff811d8eba>] netif_receive_skb+0x57/0x5e [ 452.064012] [<ffffffff811cf30f>] ? __netdev_alloc_skb+0x1f/0x3b [ 452.064012] [<ffffffffa0049126>] virtnet_poll+0x4ba/0x5a4 [virtio_net] [ 452.064012] [<ffffffff811d9417>] net_rx_action+0x73/0x184 [ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core] [ 452.064012] [<ffffffff810343b9>] __do_softirq+0xc3/0x1a8 [ 452.064012] [<ffffffff81013b56>] ? ack_APIC_irq+0x10/0x12 [ 452.064012] [<ffffffff81256559>] ? _raw_spin_lock+0xe/0x10 [ 452.064012] [<ffffffff8125e0ac>] call_softirq+0x1c/0x26 [ 452.064012] [<ffffffff81003587>] do_softirq+0x45/0x82 [ 452.064012] [<ffffffff81034667>] irq_exit+0x42/0x9c [ 452.064012] [<ffffffff8125e146>] do_IRQ+0x8e/0xa5 [ 452.064012] [<ffffffff8125676e>] common_interrupt+0x6e/0x6e [ 452.064012] <EOI> [<ffffffff810b82a1>] ? kfree+0x8a/0xa3 [ 452.064012] [<ffffffffa01b2cc2>] ? l2tp_xmit_skb+0x27a/0x4ac [l2tp_core] [ 452.064012] [<ffffffffa01b2c25>] ? l2tp_xmit_skb+0x1dd/0x4ac [l2tp_core] [ 452.064012] [<ffffffffa01c2d36>] pppol2tp_sendmsg+0x15e/0x19c [l2tp_ppp] [ 452.064012] [<ffffffff811c7872>] __sock_sendmsg_nosec+0x22/0x24 [ 452.064012] [<ffffffff811c83bd>] sock_sendmsg+0xa1/0xb6 [ 452.064012] [<ffffffff81254e88>] ? __schedule+0x5c1/0x616 [ 452.064012] [<ffffffff8103c7c6>] ? __dequeue_signal+0xb7/0x10c [ 452.064012] [<ffffffff810bbd21>] ? fget_light+0x75/0x89 [ 452.064012] [<ffffffff811c8444>] ? sockfd_lookup_light+0x20/0x56 [ 452.064012] [<ffffffff811c9b34>] sys_sendto+0x10c/0x13b [ 452.064012] [<ffffffff8125cac2>] system_call_fastpath+0x16/0x1b Reported-by: François Cachereul <[email protected]> Tested-by: François Cachereul <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Cc: James Chapman <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> farsync: fix info leak in ioctl [ Upstream commit 96b340406724d87e4621284ebac5e059d67b2194 ] The fst_get_iface() code fails to initialize the two padding bytes of struct sync_serial_settings after the ->loopback member. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> unix_diag: fix info leak [ Upstream commit 6865d1e834be84ddd5808d93d5035b492346c64a ] When filling the netlink message we miss to wipe the pad field, therefore leak one byte of heap memory to userland. Fix this by setting pad to 0. Signed-off-by: Mathias Krause <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> connector: use nlmsg_len() to check message length [ Upstream commit 162b2bedc084d2d908a04c93383ba02348b648b0 ] The current code tests the length of the whole netlink message to be at least as long to fit a cn_msg. This is wrong as nlmsg_len includes the length of the netlink message header. Use nlmsg_len() instead to fix this "off-by-NLMSG_HDRLEN" size check. Signed-off-by: Mathias Krause <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> bnx2x: record rx queue for LRO packets [ Upstream commit 60e66fee56b2256dcb1dc2ea1b2ddcb6e273857d ] RPS support is kind of broken on bnx2x, because only non LRO packets get proper rx queue information. This triggers reorders, as it seems bnx2x like to generate a non LRO packet for segment including TCP PUSH flag : (this might be pure coincidence, but all the reorders I've seen involve segments with a PUSH) 11:13:34.335847 IP A > B: . 415808:447136(31328) ack 1 win 457 <nop,nop,timestamp 3789336 3985797> 11:13:34.335992 IP A > B: . 447136:448560(1424) ack 1 win 457 <nop,nop,timestamp 3789336 3985797> 11:13:34.336391 IP A > B: . 448560:479888(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985797> 11:13:34.336425 IP A > B: P 511216:512640(1424) ack 1 win 457 <nop,nop,timestamp 3789337 3985798> 11:13:34.336423 IP A > B: . 479888:511216(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985798> 11:13:34.336924 IP A > B: . 512640:543968(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985798> 11:13:34.336963 IP A > B: . 543968:575296(31328) ack 1 win 457 <nop,nop,timestamp 3789337 3985798> We must call skb_record_rx_queue() to properly give to RPS (and more generally for TX queue selection on forward path) the receive queue information. Similar fix is needed for skb_mark_napi_id(), but will be handled in a separate patch to ease stable backports. Signed-off-by: Eric Dumazet <[email protected]> Cc: Willem de Bruijn <[email protected]> Cc: Eilon Greenstein <[email protected]> Acked-by: Dmitry Kravkov <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: dst: provide accessor function to dst->xfrm [ Upstream commit e87b3998d795123b4139bc3f25490dd236f68212 ] dst->xfrm is conditionally defined. Provide accessor funtion that is always available. Signed-off-by: Vlad Yasevich <[email protected]> Acked-by: Neil Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> sctp: Use software crc32 checksum when xfrm transform will happen. [ Upstream commit 27127a82561a2a3ed955ce207048e1b066a80a2a ] igb/ixgbe have hardware sctp checksum support, when this feature is enabled and also IPsec is armed to protect sctp traffic, ugly things happened as xfrm_output checks CHECKSUM_PARTIAL to do checksum operation(sum every thing up and pack the 16bits result in the checksum field). The result is fail establishment of sctp communication. Signed-off-by: Fan Du <[email protected]> Cc: Neil Horman <[email protected]> Cc: Steffen Klassert <[email protected]> Signed-off-by: Vlad Yasevich <[email protected]> Acked-by: Neil Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> sctp: Perform software checksum if packet has to be fragmented. [ Upstream commit d2dbbba77e95dff4b4f901fee236fef6d9552072 ] IP/IPv6 fragmentation knows how to compute only TCP/UDP checksum. This causes problems if SCTP packets has to be fragmented and ipsummed has been set to PARTIAL due to checksum offload support. This condition can happen when retransmitting after MTU discover, or when INIT or other control chunks are larger then MTU. Check for the rare fragmentation condition in SCTP and use software checksum calculation in this case. CC: Fan Du <[email protected]> Signed-off-by: Vlad Yasevich <[email protected]> Acked-by: Neil Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> wanxl: fix info leak in ioctl [ Upstream commit 2b13d06c9584b4eb773f1e80bbaedab9a1c344e1 ] The wanxl_ioctl() code fails to initialize the two padding bytes of struct sync_serial_settings after the ->loopback member. Add an explicit memset(0) before filling the structure to avoid the info leak. Signed-off-by: Salva Peiró <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: unix: inherit SOCK_PASS{CRED, SEC} flags from socket to fix race [ Upstream commit 90c6bd34f884cd9cee21f1d152baf6c18bcac949 ] In the case of credentials passing in unix stream sockets (dgram sockets seem not affected), we get a rather sparse race after commit 16e5726 ("af_unix: dont send SCM_CREDENTIALS by default"). We have a stream server on receiver side that requests credential passing from senders (e.g. nc -U). Since we need to set SO_PASSCRED on each spawned/accepted socket on server side to 1 first (as it's not inherited), it can happen that in the time between accept() and setsockopt() we get interrupted, the sender is being scheduled and continues with passing data to our receiver. At that time SO_PASSCRED is neither set on sender nor receiver side, hence in cmsg's SCM_CREDENTIALS we get eventually pid:0, uid:65534, gid:65534 (== overflow{u,g}id) instead of what we actually would like to see. On the sender side, here nc -U, the tests in maybe_add_creds() invoked through unix_stream_sendmsg() would fail, as at that exact time, as mentioned, the sender has neither SO_PASSCRED on his side nor sees it on the server side, and we have a valid 'other' socket in place. Thus, sender believes it would just look like a normal connection, not needing/requesting SO_PASSCRED at that time. As reverting 16e5726 would not be an option due to the significant performance regression reported when having creds always passed, one way/trade-off to prevent that would be to set SO_PASSCRED on the listener socket and allow inheriting these flags to the spawned socket on server side in accept(). It seems also logical to do so if we'd tell the listener socket to pass those flags onwards, and would fix the race. Before, strace: recvmsg(4, {msg_name(0)=NULL, msg_iov(1)=[{"blub\n", 4096}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=0, uid=65534, gid=65534}}, msg_flags=0}, 0) = 5 After, strace: recvmsg(4, {msg_name(0)=NULL, msg_iov(1)=[{"blub\n", 4096}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=11580, uid=1000, gid=1000}}, msg_flags=0}, 0) = 5 Signed-off-by: Daniel Borkmann <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Eric W. Biederman <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> net: fix cipso packet validation when !NETLABEL [ Upstream commit f2e5ddcc0d12f9c4c7b254358ad245c9dddce13b ] When CONFIG_NETLABEL is disabled, the cipso_v4_validate() function could loop forever in the main loop if opt[opt_iter +1] == 0, this will causing a kernel crash in an SMP system, since the CPU executing this function will stall /not respond to IPIs. This problem can be reproduced by running the IP Stack Integrity Checker (http://isic.sourceforge.net) using the following command on a Linux machine connected to DUT: "icmpsic -s rand -d <DUT IP address> -r 123456" wait (1-2 min) Signed-off-by: Seif Mazareeb <[email protected]> Acked-by: Paul Moore <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> inet: fix possible memory corruption with UDP_CORK and UFO [ This is a simplified -stable version of a set of upstream commits. ] This is a replacement patch only for stable which does fix the problems handled by the following two commits in -net: "ip_output: do skb ufo init for peeked non ufo skb as well" (e93b7d748be887cd7639b113ba7d7ef792a7efb9) "ip6_output: do skb ufo init for peeked non ufo skb as well" (c547dbf55d5f8cf615ccc0e7265e98db27d3fb8b) Three frames are written on a corked udp socket for which the output netdevice has UFO enabled. If the first and third frame are smaller than the mtu and the second one is bigger, we enqueue the second frame with skb_append_datato_frags without initializing the gso fields. This leads to the third frame appended regulary and thus constructing an invalid skb. This fixes the problem by always using skb_append_datato_frags as soon as the first frag got enqueued to the skb without marking the packet as SKB_GSO_UDP. The problem with only two frames for ipv6 was fixed by "ipv6: udp packets following an UFO enqueued packet need also be handled by UFO" (2811ebac2521ceac84f2bdae402455baa6a7fb47). Cc: Jiri Pirko <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: David Miller <[email protected]> Signed-off-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> davinci_emac.c: Fix IFF_ALLMULTI setup [ Upstream commit d69e0f7ea95fef8059251325a79c004bac01f018 ] When IFF_ALLMULTI flag is set on interface and IFF_PROMISC isn't, emac_dev_mcast_set should only enable RX of multicasts and reset MACHASH registers. It does this, but afterwards it either sets up multicast MACs filtering or disables RX of multicasts and resets MACHASH registers again, rendering IFF_ALLMULTI flag useless. This patch fixes emac_dev_mcast_set, so that multicast MACs filtering and disabling of RX of multicasts are skipped when IFF_ALLMULTI flag is set. Tested with kernel 2.6.37. Signed-off-by: Mariusz Ceier <[email protected]> Acked-by: Mugunthan V N <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ext3: return 32/64-bit dir name hash according to usage type commit d7dab39b6e16d5eea78ed3c705d2a2d0772b4f06 upstream. This is based on commit d1f5273e9adb40724a85272f248f210dc4ce919a ext4: return 32/64-bit dir name hash according to usage type by Fan Yong <[email protected]> Traditionally ext2/3/4 has returned a 32-bit hash value from llseek() to appease NFSv2, which can only handle a 32-bit cookie for seekdir() and telldir(). However, this causes problems if there are 32-bit hash collisions, since the NFSv2 server can get stuck resending the same entries from the directory repeatedly. Allow ext3 to return a full 64-bit hash (both major and minor) for telldir to decrease the chance of hash collisions. This patch does implement a new ext3_dir_llseek op, because with 64-bit hashes, nfs will attempt to seek to a hash "offset" which is much larger than ext3's s_maxbytes. So for dx dirs, we call generic_file_llseek_size() with the appropriate max hash value as the maximum seekable size. Otherwise we just pass through to generic_file_llseek(). Patch-updated-by: Bernd Schubert <[email protected]> Patch-updated-by: Eric Sandeen <[email protected]> (blame us if something is not correct) Signed-off-by: Eric Sandeen <[email protected]> Signed-off-by: Jan Kara <[email protected]> Cc: Benjamin LaHaise <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> dm snapshot: fix data corruption commit e9c6a182649f4259db704ae15a91ac820e63b0ca upstream. This patch fixes a particular type of data corruption that has been encountered when loading a snapshot's metadata from disk. When we allocate a new chunk in persistent_prepare, we increment ps->next_free and we make sure that it doesn't point to a metadata area by further incrementing it if necessary. When we load metadata from disk on device activation, ps->next_free is positioned after the last used data chunk. However, if this last used data chunk is followed by a metadata area, ps->next_free is positioned erroneously to the metadata area. A newly-allocated chunk is placed at the same location as the metadata area, resulting in data or metadata corruption. This patch changes the code so that ps->next_free skips the metadata area when metadata are loaded in function read_exceptions. The patch also moves a piece of code from persistent_prepare_exception to a separate function skip_metadata to avoid code duplication. CVE-2013-4299 Signed-off-by: Mikulas Patocka <[email protected]> Cc: Mike Snitzer <[email protected]> Signed-off-by: Alasdair G Kergon <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> writeback: fix negative bdi max pause commit e3b6c655b91e01a1dade056cfa358581b47a5351 upstream. Toralf runs trinity on UML/i386. After some time it hangs and the last message line is BUG: soft lockup - CPU#0 stuck for 22s! [trinity-child0:1521] It's found that pages_dirtied becomes very large. More than 1000000000 pages in this case: period = HZ * pages_dirtied / task_ratelimit; BUG_ON(pages_dirtied > 2000000000); BUG_ON(pages_dirtied > 1000000000); <--------- UML debug printf shows that we got negative pause here: ick: pause : -984 ick: pages_dirtied : 0 ick: task_ratelimit: 0 pause: + if (pause < 0) { + extern int printf(char *, ...); + printf("ick : pause : %li\n", pause); + printf("ick: pages_dirtied : %lu\n", pages_dirtied); + printf("ick: task_ratelimit: %lu\n", task_ratelimit); + BUG_ON(1); + } trace_balance_dirty_pages(bdi, Since pause is bounded by [min_pause, max_pause] where min_pause is also bounded by max_pause. It's suspected and demonstrated that the max_pause calculation goes wrong: ick: pause : -717 ick: min_pause : -177 ick: max_pause : -717 ick: pages_dirtied : 14 ick: task_ratelimit: 0 The problem lies in the two "long = unsigned long" assignments in bdi_max_pause() which might go negative if the highest bit is 1, and the min_t(long, ...) check failed to protect it falling under 0. Fix all of them by using "unsigned long" throughout the function. Signed-off-by: Fengguang Wu <[email protected]> Reported-by: Toralf Förster <[email protected]> Tested-by: Toralf Förster <[email protected]> Reviewed-by: Jan Kara <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> wireless: radiotap: fix parsing buffer overrun commit f5563318ff1bde15b10e736e97ffce13be08bc1a upstream. When parsing an invalid radiotap header, the parser can overrun the buffer that is passed in because it doesn't correctly check 1) the minimum radiotap header size 2) the space for extended bitmaps The first issue doesn't affect any in-kernel user as they all check the minimum size before calling the radiotap function. The second issue could potentially affect the kernel if an skb is passed in that consists only of the radiotap header with a lot of extended bitmaps that extend past the SKB. In that case a read-only buffer overrun by at most 4 bytes is possible. Fix this by adding the appropriate checks to the parser. Reported-by: Evan Huus <[email protected]> Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> USB: serial: ti_usb_3410_5052: add Abbott strip port ID to combined table as well. commit c9d09dc7ad106492c17c587b6eeb99fe3f43e522 upstream. Without this change, the USB cable for Freestyle Option and compatible glucometers will not be detected by the driver. Signed-off-by: Diego Elio Pettenò <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> USB: serial: option: add support for Inovia SEW858 device commit f4c19b8e165cff1a6607c21f8809441d61cab7ec upstream. This patch adds the device id for the Inovia SEW858 device to the option driver. Reported-by: Pavel Parkhomenko <[email protected]> Tested-by: Pavel Parkhomenko <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> usb: serial: option: blacklist Olivetti Olicard200 commit fd8573f5828873343903215f203f14dc82de397c upstream. Interface 6 of this device speaks QMI as per tests done by us. Credits go to Antonella for providing the hardware. Signed-off-by: Enrico Mioso <[email protected]> Signed-off-by: Antonella Pellizzari <[email protected]> Tested-by: Dan Williams <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> Linux 3.4.68 Signed-off-by: Chet Kener <[email protected]> USB: support new huawei devices in option.c commit d544db293a44a2a3b09feab7dbd59668b692de71 upstream. Add new supporting declarations to option.c, to support Huawei new devices with new bInterfaceSubClass value. Signed-off-by: fangxiaozhi <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> USB: quirks.c: add one device that cannot deal with suspension commit 4294bca7b423d1a5aa24307e3d112a04075e3763 upstream. The device is not responsive when resumed, unless it is reset. Signed-off-by: Oliver Neukum <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> USB: quirks: add touchscreen that is dazzeled by remote wakeup commit 614ced91fc6fbb5a1cdd12f0f1b6c9197d9f1350 upstream. The device descriptors are messed up after remote wakeup Signed-off-by: Oliver Neukum <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> USB: serial: ftdi_sio: add id for Z3X Box device commit e1466ad5b1aeda303f9282463d55798d2eda218c upstream. Custom VID/PID for Z3X Box device, popular tool for cellphone flashing. Signed-off-by: Alexey E. Kramarenko <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> mac80211: correctly close cancelled scans commit a754055a1296fcbe6f32de3a5eaca6efb2fd1865 upstream. __ieee80211_scan_completed is called from a worker. This means that the following flow is possible. * driver calls ieee80211_scan_completed * mac80211 cancels the scan (that is already complete) * __ieee80211_scan_completed runs When scan_work will finally run, it will see that the scan hasn't been aborted and might even trigger another scan on another band. This leads to a situation where cfg80211's scan is not done and no further scan can be issued. Fix this by setting a new flag when a HW scan is being cancelled so that no other scan will be triggered. Signed-off-by: Emmanuel Grumbach <[email protected]> Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> mac80211: update sta->last_rx on acked tx frames commit 0c5b93290b2f3c7a376567c03ae8d385b0e99851 upstream. When clients are idle for too long, hostapd sends nullfunc frames for probing. When those are acked by the client, the idle time needs to be updated. To make this work (and to avoid unnecessary probing), update sta->last_rx whenever an ACK was received for a tx packet. Only do this if the flag IEEE80211_HW_REPORTS_TX_ACK_STATUS is set. Signed-off-by: Felix Fietkau <[email protected]> Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> rtlwifi: rtl8192cu: Fix error in pointer arithmetic commit 9473ca6e920a3b9ca902753ce52833657f9221cc upstream. An error in calculating the offset in an skb causes the driver to read essential device info from the wrong locations. The main effect is that automatic gain calculations are nonsense. Signed-off-by: Mark Cave-Ayland <[email protected]> Signed-off-by: Larry Finger <[email protected]> Signed-off-by: John W. Linville <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> jfs: fix error path in ialloc commit 8660998608cfa1077e560034db81885af8e1e885 upstream. If insert_inode_locked() fails, we shouldn't be calling unlock_new_inode(). Signed-off-by: Dave Kleikamp <[email protected]> Tested-by: Michael L. Semon <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> can: flexcan: flexcan_chip_start: fix regression, mark one MB for TX and abort pending TX commit d5a7b406c529e4595ce03dc8f6dcf7fa36f106fa upstream. In patch 0d1862e can: flexcan: fix flexcan_chip_start() on imx6 the loop in flexcan_chip_start() that iterates over all mailboxes after the soft reset of the CAN core was removed. This loop put all mailboxes (even the ones marked as reserved 1...7) into EMPTY/INACTIVE mode. On mailboxes 8...63, this aborts any pending TX messages. After a cold boot there is random garbage in the mailboxes, which leads to spontaneous transmit of CAN frames during first activation. Further if the interface was disabled with a pending message (usually due to an error condition on the CAN bus), this message is retransmitted after enabling the interface again. This patch fixes the regression by: 1) Limiting the maximum number of used mailboxes to 8, 0...7 are used by the RX FIFO, 8 is used by TX. 2) Marking the TX mailbox as EMPTY/INACTIVE, so that any pending TX of that mailbox is aborted. Cc: Lothar Waßmann <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> libata: make ata_eh_qc_retry() bump scmd->allowed on bogus failures commit f13e220161e738c2710b9904dcb3cf8bb0bcce61 upstream. libata EH decrements scmd->retries when the command failed for reasons unrelated to the command itself so that, for example, commands aborted due to suspend / resume cycle don't get penalized; however, decrementing scmd->retries isn't enough for ATA passthrough commands. Without this fix, ATA passthrough commands are not resend to the drive, and no error is signalled to the caller because: - allowed retry count is 1 - ata_eh_qc_complete fill the sense data, so result is valid - sense data is filled with untouched ATA registers. Signed-off-by: Gwendal Grignou <[email protected]> Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> md: Fix skipping recovery for read-only arrays. commit 61e4947c99c4494336254ec540c50186d186150b upstream. Since: commit 7ceb17e87bde79d285a8b988cfed9eaeebe60b86 md: Allow devices to be re-added to a read-only array. spares are activated on a read-only array. In case of raid1 and raid10 personalities it causes that not-in-sync devices are marked in-sync without checking if recovery has been finished. If a read-only array is degraded and one of its devices is not in-sync (because the array has been only partially recovered) recovery will be skipped. This patch adds checking if recovery has been finished before marking a device in-sync for raid1 and raid10 personalities. In case of raid5 personality such condition is already present (at raid5.c:6029). Bug was introduced in 3.10 and causes data corruption. Signed-off-by: Pawel Baldysiak <[email protected]> Signed-off-by: Lukasz Dorau <[email protected]> Signed-off-by: NeilBrown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> clockevents: Sanitize ticks to nsec conversion commit 97b9410643475d6557d2517c2aff9fd2221141a9 upstream. Marc Kleine-Budde pointed out, that commit 77cc982 "clocksource: use clockevents_config_and_register() where possible" caused a regression for some of the converted subarchs. The reason is, that the clockevents core code converts the minimal hardware tick delta to a nanosecond value for core internal usage. This conversion is affected by integer math rounding loss, so the backwards conversion to hardware ticks will likely result in a value which is less than the configured hardware limitation. The affected subarchs used their own workaround (SIGH!) which got lost in the conversion. The solution for the issue at hand is simple: adding evt->mult - 1 to the shifted value before the integer divison in the core conversion function takes care of it. But this only works for the case where for the scaled math mult/shift pair "mult <= 1 << shift" is true. For the case where "mult > 1 << shift" we can apply the rounding add only for the minimum delta value to make sure that the backward conversion is not less than the given hardware limit. For the upper bound we need to omit the rounding add, because the backwards conversion is always larger than the original latch value. That would violate the upper bound of the hardware device. Though looking closer at the details of that function reveals another bogosity: The upper bounds check is broken as well. Checking for a resulting "clc" value greater than KTIME_MAX after the conversion is pointless. The conversion does: u64 clc = (latch << evt->shift) / evt->mult; So there is no sanity check for (latch << evt->shift) exceeding the 64bit boundary. The latch argument is "unsigned long", so on a 64bit arch the handed in argument could easily lead to an unnoticed shift overflow. With the above rounding fix applied the calculation before the divison is: u64 clc = (latch << evt->shift) + evt->mult - 1; So we need to make sure, that neither the shift nor the rounding add is overflowing the u64 boundary. [ukl: move assignment to rnd after eventually changing mult, fix build issue and correct comment with the right math] Signed-off-by: Thomas Gleixner <[email protected]> Cc: Russell King - ARM Linux <[email protected]> Cc: Marc Kleine-Budde <[email protected]> Cc: [email protected] Cc: Marc Pignat <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Ronald Wahl <[email protected]> Cc: LAK <[email protected]> Cc: Ludovic Desroches <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Uwe Kleine-König <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> parisc: Do not crash 64bit SMP kernels on machines with >= 4GB RAM commit 54e181e073fc1415e41917d725ebdbd7de956455 upstream. Since the beginning of the parisc-linux port, sometimes 64bit SMP kernels were not able to bring up other CPUs than the monarch CPU and instead crashed the kernel. The reason was unclear, esp. since it involved various machines (e.g. J5600, J6750 and SuperDome). Testing showed, that those crashes didn't happened when less than 4GB were installed, or if a 32bit Linux kernel was booted. In the end, the fix for those SMP problems is trivial: During the early phase of the initialization of the CPUs, including the monarch CPU, the PDC_PSW firmware function to enable WIDE (=64bit) mode is called. It's documented that this firmware function may clobber various registers, and one one of those possibly clobbered registers is %cr30 which holds the task thread info pointer. Now, if %cr30 would always have been clobbered, then this bug would have been detected much earlier. But lots of testing finally showed, that - at least for %cr30 - on some machines only the upper 32bits of the 64bit register suddenly turned zero after the firmware call. So, after finding the root cause, the explanation for the various crashes became clear: - On 32bit SMP Linux kernels all upper 32bit were zero, so we didn't faced this problem. - Monarch CPUs in 64bit mode always booted sucessfully, because the inital task thread info pointer was below 4GB. - Secondary CPUs booted sucessfully on machines with less than 4GB RAM because the upper 32bit were zero anyay. - Secondary CPus failed to boot if we had more than 4GB RAM and the task thread info pointer was located above the 4GB boundary. Finally, the patch to fix this problem is trivial by saving the %cr30 register before the firmware call and restoring it afterwards. Signed-off-by: Helge Deller <[email protected]> Signed-off-by: John David Anglin <[email protected]> Signed-off-by: Helge Deller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ALSA: hda - Add a fixup for ASUS N76VZ commit 6fc16e58adf50c0f1e4478538983fb5ff6f453d4 upstream. ASUS N76VZ needs the same fixup as N56VZ for supporting the boost speaker. Bugzilla: https://bugzilla.novell.com/show_bug.cgi?id=846529 Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ASoC: wm_hubs: Add missing break in hp_supply_event() commit 268ff14525edba31da29a12a9dd693cdd6a7872e upstream. Spotted by coverity CID 115170. Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ASoC: dapm: Fix source list debugfs outputs commit ff18620c2157671a8ee21ebb8e6a3520ea209b1f upstream. ... due to a copy & paste error. Spotted by coverity CID 710923. Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> staging: ozwpan: prevent overflow in oz_cdev_write() commit c2c65cd2e14ada6de44cb527e7f1990bede24e15 upstream. We need to check "count" so we don't overflow the ei->data buffer. Reported-by: Nico Golde <[email protected]> Reported-by: Fabian Yamaguchi <[email protected]> Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> Staging: bcm: info leak in ioctl commit 8d1e72250c847fa96498ec029891de4dc638a5ba upstream. The DevInfo.u32Reserved[] array isn't initialized so it leaks kernel information to user space. Reported-by: Nico Golde <[email protected]> Reported-by: Fabian Yamaguchi <[email protected]> Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> uml: check length in exitcode_proc_write() commit 201f99f170df14ba52ea4c52847779042b7a623b upstream. We don't cap the size of buffer from the user so we could write past the end of the array here. Only root can write to this file. Reported-by: Nico Golde <[email protected]> Reported-by: Fabian Yamaguchi <[email protected]> Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> xtensa: don't use alternate signal stack on threads commit cba9a90053e3b7973eff4f1946f33032e98eeed5 upstream. According to create_thread(3): "The new thread does not inherit the creating thread's alternate signal stack". Since commit f9a3879a (Fix sigaltstack corruption among cloned threads), current->sas_ss_size is set to 0 for cloned processes sharing VM with their parent. Don't use the (nonexistent) alternate signal stack in this case. This has been broken since commit 29c4dfd9 ([XTENSA] Remove non-rt signal handling). Fixes the SA_ONSTACK part of the nptl/tst-cancel20 test from uClibc. Signed-off-by: Baruch Siach <[email protected]> Signed-off-by: Max Filippov <[email protected]> Signed-off-by: Chris Zankel <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> lib/scatterlist.c: don't flush_kernel_dcache_page on slab page commit 3d77b50c5874b7e923be946ba793644f82336b75 upstream. Commit b1adaf65ba03 ("[SCSI] block: add sg buffer copy helper functions") introduces two sg buffer copy helpers, and calls flush_kernel_dcache_page() on pages in SG list after these pages are written to. Unfortunately, the commit may introduce a potential bug: - Before sending some SCSI commands, kmalloc() buffer may be passed to block layper, so flush_kernel_dcache_page() can see a slab page finally - According to cachetlb.txt, flush_kernel_dcache_page() is only called on "a user page", which surely can't be a slab page. - ARCH's implementation of flush_kernel_dcache_page() may use page mapping information to do optimization so page_mapping() will see the slab page, then VM_BUG_ON() is triggered. Aaro Koskinen reported the bug on ARM/kirkwood when DEBUG_VM is enabled, and this patch fixes the bug by adding test of '!PageSlab(miter->page)' before calling flush_kernel_dcache_page(). Signed-off-by: Ming Lei <[email protected]> Reported-by: Aaro Koskinen <[email protected]> Tested-by: Simon Baatz <[email protected]> Cc: Russell King - ARM Linux <[email protected]> Cc: Will Deacon <[email protected]> Cc: Aaro Koskinen <[email protected]> Acked-by: Catalin Marinas <[email protected]> Cc: FUJITA Tomonori <[email protected]> Cc: Tejun Heo <[email protected]> Cc: "James E.J. Bottomley" <[email protected]> Cc: Jens Axboe <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> aacraid: missing capable() check in compat ioctl commit f856567b930dfcdbc3323261bf77240ccdde01f5 upstream. In commit d496f94d22d1 ('[SCSI] aacraid: fix security weakness') we added a check on CAP_SYS_RAWIO to the ioctl. The compat ioctls need the check as well. Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> mm: fix aio performance regression for database caused by THP commit 7cb2ef56e6a8b7b368b2e883a0a47d02fed66911 upstream. I am working with a tool that simulates oracle database I/O workload. This tool (orion to be specific - <http://docs.oracle.com/cd/E11882_01/server.112/e16638/iodesign.htm#autoId24>) allocates hugetlbfs pages using shmget() with SHM_HUGETLB flag. It then does aio into these pages from flash disks using various common block sizes used by database. I am looking at performance with two of the most common block sizes - 1M and 64K. aio performance with these two block sizes plunged after Transparent HugePages was introduced in the kernel. Here are performance numbers: pre-THP 2.6.39 3.11-rc5 1M read 8384 MB/s 5629 MB/s 6501 MB/s 64K read 7867 MB/s 4576 MB/s 4251 MB/s I have narrowed the performance impact down to the overheads introduced by THP in __get_page_tail() and put_compound_page() routines. perf top shows >40% of cycles being spent in these two routines. Every time direct I/O to hugetlbfs pages starts, kernel calls get_page() to grab a reference to the pages and calls put_page() when I/O completes to put the reference away. THP introduced significant amount of locking overhead to get_page() and put_page() when dealing with compound pages because hugepages can be split underneath get_page() and put_page(). It added this overhead irrespective of whether it is dealing with hugetlbfs pages or transparent hugepages. This resulted in 20%-45% drop in aio performance when using hugetlbfs pages. Since hugetlbfs pages can not be split, there is no reason to go through all the locking overhead for these pages from what I can see. I added code to __get_page_tail() and put_compound_page() to bypass all the locking code when working with hugetlbfs pages. This improved performance significantly. Performance numbers with this patch: pre-THP 3.11-rc5 3.11-rc5 + Patch 1M read 8384 MB/s 6501 MB/s 8371 MB/s 64K read 7867 MB/s 4251 MB/s 6510 MB/s Performance with 64K read is still lower than what it was before THP, but still a 53% improvement. It does mean there is more work to be done but I will take a 53% improvement for now. Please take a look at the following patch and let me know if it looks reasonable. [[email protected]: tweak comments] Signed-off-by: Khalid Aziz <[email protected]> Cc: Pravin B Shelar <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Andi Kleen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> drm: Prevent overwriting from userspace underallocating core ioctl structs commit b062672e305ce071f21eb9e18b102c2a430e0999 upstream. Apply the protections from commit 1b2f1489633888d4a06028315dc19d65768a1c05 Author: Dave Airlie <[email protected]> Date: Sat Aug 14 20:20:34 2010 +1000 drm: block userspace under allocating buffer and having drivers overwrite it (v2) to the core ioctl structs as well, for we found one instance where there is a 32-/64-bit size mismatch and were guilty of writing beyond the end of the user's buffer. Signed-off-by: Chris Wilson <[email protected]> Cc: Dave Airlie <[email protected]> Reviewed-by: Ville Syrjälä <[email protected]> Cc: [email protected] Signed-off-by: Dave Airlie <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> drm/radeon/atom: workaround vbios bug in transmitter table on rs780 commit c23632d4e57c0dd20bf50eca08fa0eb8ad3ff680 upstream. Some rs780 asics seem to be affected as well. See: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=91f3a6aaf280294b07c05dfe606e6c27b7ba3c72 Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=60791 Signed-off-by: Alex Deucher <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> Linux 3.4.69 Signed-off-by: Chet Kener <[email protected]> xen-netback: use jiffies_64 value to calculate credit timeout [ Upstream commit 059dfa6a93b779516321e5112db9d7621b1367ba ] time_after_eq() only works if the delta is < MAX_ULONG/2. For a 32bit Dom0, if netfront sends packets at a very low rate, the time between subsequent calls to tx_credit_exceeded() may exceed MAX_ULONG/2 and the test for timer_after_eq() will be incorrect. Credit will not be replenished and the guest may become unable to send packets (e.g., if prior to the long gap, all credit was exhausted). Use jiffies_64 variant to mitigate this problem for 32bit Dom0. Suggested-by: Jan Beulich <[email protected]> Signed-off-by: Wei Liu <[email protected]> Reviewed-by: David Vrabel <[email protected]> Cc: Ian Campbell <[email protected]> Cc: Jason Luan <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> PCI: fix truncation of resource size to 32 bits commit d6776e6d5c2f8db0252f447b09736075e1bbe387 upstream. _pci_assign_resource() took an int "size" argument, which meant that sizes larger than 4GB were truncated. Change type to resource_size_t. [bhelgaas: changelog] Signed-off-by: Nikhil P Rao <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Cc: Jiri Slaby <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> USB: add new zte 3g-dongle's pid to option.c commit 0636fc507a976cdc40f21bdbcce6f0b98ff1dfe9 upstream. Signed-off-by: Rui li <[email protected]> Cc: stable <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> ALSA: hda - Move one-time init codes from generic_hdmi_init() commit 8b8d654b55648561287bd8baca0f75f964a17038 upstream. The codes to initialize work struct or create a proc interface should be called only once and never although it's called many times through the init callback. Move that stuff into patch_generic_hdmi() so that it's called only once. Signed-off-by: Takashi Iwai <[email protected]> BugLink: https://bugs.launchpad.net/bugs/1212160 Signed-off-by: David Henningsson <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Chet Kener <[email protected]> netfilter: nf_ct_sip: don't drop packets with offsets p…
franciscofranco
pushed a commit
that referenced
this issue
Mar 26, 2017
Change-Id: Ib07ead1e23e816c96552254c049016825a164f2c UPSTREAM: zram/zcomp: use GFP_NOIO to allocate streams (cherry picked from commit 3d5fe03a3ea013060ebba2a811aeb0f23f56aefa) We can end up allocating a new compression stream with GFP_KERNEL from within the IO path, which may result is nested (recursive) IO operations. That can introduce problems if the IO path in question is a reclaimer, holding some locks that will deadlock nested IOs. Allocate streams and working memory using GFP_NOIO flag, forbidding recursive IO and FS operations. An example: inconsistent {IN-RECLAIM_FS-W} -> {RECLAIM_FS-ON-W} usage. git/20158 [HC0[0]:SC0[0]:HE1:SE1] takes: (jbd2_handle){+.+.?.}, at: start_this_handle+0x4ca/0x555 {IN-RECLAIM_FS-W} state was registered at: __lock_acquire+0x8da/0x117b lock_acquire+0x10c/0x1a7 start_this_handle+0x52d/0x555 jbd2__journal_start+0xb4/0x237 __ext4_journal_start_sb+0x108/0x17e ext4_dirty_inode+0x32/0x61 __mark_inode_dirty+0x16b/0x60c iput+0x11e/0x274 __dentry_kill+0x148/0x1b8 shrink_dentry_list+0x274/0x44a prune_dcache_sb+0x4a/0x55 super_cache_scan+0xfc/0x176 shrink_slab.part.14.constprop.25+0x2a2/0x4d3 shrink_zone+0x74/0x140 kswapd+0x6b7/0x930 kthread+0x107/0x10f ret_from_fork+0x3f/0x70 irq event stamp: 138297 hardirqs last enabled at (138297): debug_check_no_locks_freed+0x113/0x12f hardirqs last disabled at (138296): debug_check_no_locks_freed+0x33/0x12f softirqs last enabled at (137818): __do_softirq+0x2d3/0x3e9 softirqs last disabled at (137813): irq_exit+0x41/0x95 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(jbd2_handle); <Interrupt> lock(jbd2_handle); *** DEADLOCK *** 5 locks held by git/20158: #0: (sb_writers#7){.+.+.+}, at: [<ffffffff81155411>] mnt_want_write+0x24/0x4b #1: (&type->i_mutex_dir_key#2/1){+.+.+.}, at: [<ffffffff81145087>] lock_rename+0xd9/0xe3 #2: (&sb->s_type->i_mutex_key#11){+.+.+.}, at: [<ffffffff8114f8e2>] lock_two_nondirectories+0x3f/0x6b #3: (&sb->s_type->i_mutex_key#11/4){+.+.+.}, at: [<ffffffff8114f909>] lock_two_nondirectories+0x66/0x6b #4: (jbd2_handle){+.+.?.}, at: [<ffffffff811e31db>] start_this_handle+0x4ca/0x555 stack backtrace: CPU: 2 PID: 20158 Comm: git Not tainted 4.1.0-rc7-next-20150615-dbg-00016-g8bdf555-dirty #211 Call Trace: dump_stack+0x4c/0x6e mark_lock+0x384/0x56d mark_held_locks+0x5f/0x76 lockdep_trace_alloc+0xb2/0xb5 kmem_cache_alloc_trace+0x32/0x1e2 zcomp_strm_alloc+0x25/0x73 [zram] zcomp_strm_multi_find+0xe7/0x173 [zram] zcomp_strm_find+0xc/0xe [zram] zram_bvec_rw+0x2ca/0x7e0 [zram] zram_make_request+0x1fa/0x301 [zram] generic_make_request+0x9c/0xdb submit_bio+0xf7/0x120 ext4_io_submit+0x2e/0x43 ext4_bio_write_page+0x1b7/0x300 mpage_submit_page+0x60/0x77 mpage_map_and_submit_buffers+0x10f/0x21d ext4_writepages+0xc8c/0xe1b do_writepages+0x23/0x2c __filemap_fdatawrite_range+0x84/0x8b filemap_flush+0x1c/0x1e ext4_alloc_da_blocks+0xb8/0x117 ext4_rename+0x132/0x6dc ? mark_held_locks+0x5f/0x76 ext4_rename2+0x29/0x2b vfs_rename+0x540/0x636 SyS_renameat2+0x359/0x44d SyS_rename+0x1e/0x20 entry_SYSCALL_64_fastpath+0x12/0x6f [[email protected]: add stable mark] Signed-off-by: Sergey Senozhatsky <[email protected]> Acked-by: Minchan Kim <[email protected]> Cc: Kyeongdon Kim <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> UPSTREAM: zram: try vmalloc() after kmalloc() (cherry picked from commit d913897abace843bba20249f3190167f7895e9c3) When we're using LZ4 multi compression streams for zram swap, we found out page allocation failure message in system running test. That was not only once, but a few(2 - 5 times per test). Also, some failure cases were continually occurring to try allocation order 3. In order to make parallel compression private data, we should call kzalloc() with order 2/3 in runtime(lzo/lz4). But if there is no order 2/3 size memory to allocate in that time, page allocation fails. This patch makes to use vmalloc() as fallback of kmalloc(), this prevents page alloc failure warning. After using this, we never found warning message in running test, also It could reduce process startup latency about 60-120ms in each case. For reference a call trace : Binder_1: page allocation failure: order:3, mode:0x10c0d0 CPU: 0 PID: 424 Comm: Binder_1 Tainted: GW 3.10.49-perf-g991d02b-dirty #20 Call trace: dump_backtrace+0x0/0x270 show_stack+0x10/0x1c dump_stack+0x1c/0x28 warn_alloc_failed+0xfc/0x11c __alloc_pages_nodemask+0x724/0x7f0 __get_free_pages+0x14/0x5c kmalloc_order_trace+0x38/0xd8 zcomp_lz4_create+0x2c/0x38 zcomp_strm_alloc+0x34/0x78 zcomp_strm_multi_find+0x124/0x1ec zcomp_strm_find+0xc/0x18 zram_bvec_rw+0x2fc/0x780 zram_make_request+0x25c/0x2d4 generic_make_request+0x80/0xbc submit_bio+0xa4/0x15c __swap_writepage+0x218/0x230 swap_writepage+0x3c/0x4c shrink_page_list+0x51c/0x8d0 shrink_inactive_list+0x3f8/0x60c shrink_lruvec+0x33c/0x4cc shrink_zone+0x3c/0x100 try_to_free_pages+0x2b8/0x54c __alloc_pages_nodemask+0x514/0x7f0 __get_free_pages+0x14/0x5c proc_info_read+0x50/0xe4 vfs_read+0xa0/0x12c SyS_read+0x44/0x74 DMA: 3397*4kB (MC) 26*8kB (RC) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 13796kB [[email protected]: change vmalloc gfp and adding comment about gfp] [[email protected]: tweak comments and styles] Signed-off-by: Kyeongdon Kim <[email protected]> Signed-off-by: Minchan Kim <[email protected]> Acked-by: Sergey Senozhatsky <[email protected]> Sergey Senozhatsky <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> UPSTREAM: zram: pass gfp from zcomp frontend to backend (cherry picked from commit 75d8947a36d0c9aedd69118d1f14bf424005c7c2) Each zcomp backend uses own gfp flag but it's pointless because the context they could be called is driven by upper layer(ie, zcomp frontend). As well, zcomp frondend could call them in different context. One context(ie, zram init part) is it should be better to make sure successful allocation other context(ie, further stream allocation part for accelarating I/O speed) is just optional so let's pass gfp down from driver (ie, zcomp frontend) like normal MM convention. [[email protected]: add missing __vmalloc zero and highmem gfps] Signed-off-by: Minchan Kim <[email protected]> Signed-off-by: Sergey Senozhatsky <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> UPSTREAM: zram/zcomp: do not zero out zcomp private pages (cherry picked from commit e02d238c9852a91b30da9ea32ce36d1416cdc683) Do not __GFP_ZERO allocated zcomp ->private pages. We keep allocated streams around and use them for read/write requests, so we supply a zeroed out ->private to compression algorithm as a scratch buffer only once -- the first time we use that stream. For the rest of IO requests served by this stream ->private usually contains some temporarily data from the previous requests. Signed-off-by: Sergey Senozhatsky <[email protected]> Acked-by: Minchan Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> UPSTREAM: block: disable entropy contributions for nonrot devices (cherry picked from commit b277da0a8a594308e17881f4926879bd5fca2a2d) Clear QUEUE_FLAG_ADD_RANDOM in all block drivers that set QUEUE_FLAG_NONROT. Historically, all block devices have automatically made entropy contributions. But as previously stated in commit e2e1a14 ("block: add sysfs knob for turning off disk entropy contributions"): - On SSD disks, the completion times aren't as random as they are for rotational drives. So it's questionable whether they should contribute to the random pool in the first place. - Calling add_disk_randomness() has a lot of overhead. There are more reliable sources for randomness than non-rotational block devices. From a security perspective it is better to err on the side of caution than to allow entropy contributions from unreliable "random" sources. Change-Id: I2a4f86bacee8786e2cb1a82d45156338f79d64e0 Signed-off-by: Mike Snitzer <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
franciscofranco
pushed a commit
that referenced
this issue
Mar 26, 2017
commit e81107d4c6bd098878af9796b24edc8d4a9524fd upstream. My colleague ran into a program stall on a x86_64 server, where n_tty_read() was waiting for data even if there was data in the buffer in the pty. kernel stack for the stuck process looks like below. #0 [ffff88303d107b58] __schedule at ffffffff815c4b20 #1 [ffff88303d107bd0] schedule at ffffffff815c513e #2 [ffff88303d107bf0] schedule_timeout at ffffffff815c7818 #3 [ffff88303d107ca0] wait_woken at ffffffff81096bd2 #4 [ffff88303d107ce0] n_tty_read at ffffffff8136fa23 #5 [ffff88303d107dd0] tty_read at ffffffff81368013 #6 [ffff88303d107e20] __vfs_read at ffffffff811a3704 #7 [ffff88303d107ec0] vfs_read at ffffffff811a3a57 #8 [ffff88303d107f00] sys_read at ffffffff811a4306 #9 [ffff88303d107f50] entry_SYSCALL_64_fastpath at ffffffff815c86d7 There seems to be two problems causing this issue. First, in drivers/tty/n_tty.c, __receive_buf() stores the data and updates ldata->commit_head using smp_store_release() and then checks the wait queue using waitqueue_active(). However, since there is no memory barrier, __receive_buf() could return without calling wake_up_interactive_poll(), and at the same time, n_tty_read() could start to wait in wait_woken() as in the following chart. __receive_buf() n_tty_read() ------------------------------------------------------------------------ if (waitqueue_active(&tty->read_wait)) /* Memory operations issued after the RELEASE may be completed before the RELEASE operation has completed */ add_wait_queue(&tty->read_wait, &wait); ... if (!input_available_p(tty, 0)) { smp_store_release(&ldata->commit_head, ldata->read_head); ... timeout = wait_woken(&wait, TASK_INTERRUPTIBLE, timeout); ------------------------------------------------------------------------ The second problem is that n_tty_read() also lacks a memory barrier call and could also cause __receive_buf() to return without calling wake_up_interactive_poll(), and n_tty_read() to wait in wait_woken() as in the chart below. __receive_buf() n_tty_read() ------------------------------------------------------------------------ spin_lock_irqsave(&q->lock, flags); /* from add_wait_queue() */ ... if (!input_available_p(tty, 0)) { /* Memory operations issued after the RELEASE may be completed before the RELEASE operation has completed */ smp_store_release(&ldata->commit_head, ldata->read_head); if (waitqueue_active(&tty->read_wait)) __add_wait_queue(q, wait); spin_unlock_irqrestore(&q->lock,flags); /* from add_wait_queue() */ ... timeout = wait_woken(&wait, TASK_INTERRUPTIBLE, timeout); ------------------------------------------------------------------------ There are also other places in drivers/tty/n_tty.c which have similar calls to waitqueue_active(), so instead of adding many memory barrier calls, this patch simply removes the call to waitqueue_active(), leaving just wake_up*() behind. This fixes both problems because, even though the memory access before or after the spinlocks in both wake_up*() and add_wait_queue() can sneak into the critical section, it cannot go past it and the critical section assures that they will be serialized (please see "INTER-CPU ACQUIRING BARRIER EFFECTS" in Documentation/memory-barriers.txt for a better explanation). Moreover, the resulting code is much simpler. Latency measurement using a ping-pong test over a pty doesn't show any visible performance drop. Signed-off-by: Kosuke Tatsukawa <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> [lizf: Backported to 3.4: - adjust context - s/wake_up_interruptible_poll/wake_up_interruptible/ - drop changes to __receive_buf() and n_tty_set_termios()] Signed-off-by: Zefan Li <[email protected]>
franciscofranco
pushed a commit
that referenced
this issue
Mar 26, 2017
We can end up allocating a new compression stream with GFP_KERNEL from within the IO path, which may result is nested (recursive) IO operations. That can introduce problems if the IO path in question is a reclaimer, holding some locks that will deadlock nested IOs. Allocate streams and working memory using GFP_NOIO flag, forbidding recursive IO and FS operations. An example: [ 747.233722] inconsistent {IN-RECLAIM_FS-W} -> {RECLAIM_FS-ON-W} usage. [ 747.233724] git/20158 [HC0[0]:SC0[0]:HE1:SE1] takes: [ 747.233725] (jbd2_handle){+.+.?.}, at: [<ffffffff811e31db>] start_this_handle+0x4ca/0x555 [ 747.233733] {IN-RECLAIM_FS-W} state was registered at: [ 747.233735] [<ffffffff8107b8e9>] __lock_acquire+0x8da/0x117b [ 747.233738] [<ffffffff8107c950>] lock_acquire+0x10c/0x1a7 [ 747.233740] [<ffffffff811e323e>] start_this_handle+0x52d/0x555 [ 747.233742] [<ffffffff811e331a>] jbd2__journal_start+0xb4/0x237 [ 747.233744] [<ffffffff811cc6c7>] __ext4_journal_start_sb+0x108/0x17e [ 747.233748] [<ffffffff811a90bf>] ext4_dirty_inode+0x32/0x61 [ 747.233750] [<ffffffff8115f37e>] __mark_inode_dirty+0x16b/0x60c [ 747.233754] [<ffffffff81150ad6>] iput+0x11e/0x274 [ 747.233757] [<ffffffff8114bfbd>] __dentry_kill+0x148/0x1b8 [ 747.233759] [<ffffffff8114c9d9>] shrink_dentry_list+0x274/0x44a [ 747.233761] [<ffffffff8114d38a>] prune_dcache_sb+0x4a/0x55 [ 747.233763] [<ffffffff8113b1ad>] super_cache_scan+0xfc/0x176 [ 747.233767] [<ffffffff810fa089>] shrink_slab.part.14.constprop.25+0x2a2/0x4d3 [ 747.233770] [<ffffffff810fcccb>] shrink_zone+0x74/0x140 [ 747.233772] [<ffffffff810fd924>] kswapd+0x6b7/0x930 [ 747.233774] [<ffffffff81058887>] kthread+0x107/0x10f [ 747.233778] [<ffffffff814fadff>] ret_from_fork+0x3f/0x70 [ 747.233783] irq event stamp: 138297 [ 747.233784] hardirqs last enabled at (138297): [<ffffffff8107aff3>] debug_check_no_locks_freed+0x113/0x12f [ 747.233786] hardirqs last disabled at (138296): [<ffffffff8107af13>] debug_check_no_locks_freed+0x33/0x12f [ 747.233788] softirqs last enabled at (137818): [<ffffffff81040f89>] __do_softirq+0x2d3/0x3e9 [ 747.233792] softirqs last disabled at (137813): [<ffffffff81041292>] irq_exit+0x41/0x95 [ 747.233794] other info that might help us debug this: [ 747.233796] Possible unsafe locking scenario: [ 747.233797] CPU0 [ 747.233798] ---- [ 747.233799] lock(jbd2_handle); [ 747.233801] <Interrupt> [ 747.233801] lock(jbd2_handle); [ 747.233803] *** DEADLOCK *** [ 747.233805] 5 locks held by git/20158: [ 747.233806] #0: (sb_writers#7){.+.+.+}, at: [<ffffffff81155411>] mnt_want_write+0x24/0x4b [ 747.233811] #1: (&type->i_mutex_dir_key#2/1){+.+.+.}, at: [<ffffffff81145087>] lock_rename+0xd9/0xe3 [ 747.233817] #2: (&sb->s_type->i_mutex_key#11){+.+.+.}, at: [<ffffffff8114f8e2>] lock_two_nondirectories+0x3f/0x6b [ 747.233822] #3: (&sb->s_type->i_mutex_key#11/4){+.+.+.}, at: [<ffffffff8114f909>] lock_two_nondirectories+0x66/0x6b [ 747.233827] #4: (jbd2_handle){+.+.?.}, at: [<ffffffff811e31db>] start_this_handle+0x4ca/0x555 [ 747.233831] stack backtrace: [ 747.233834] CPU: 2 PID: 20158 Comm: git Not tainted 4.1.0-rc7-next-20150615-dbg-00016-g8bdf555-dirty #211 [ 747.233837] ffff8800a56cea40 ffff88010d0a75f8 ffffffff814f446d ffffffff81077036 [ 747.233840] ffffffff823a84b0 ffff88010d0a7638 ffffffff814f3849 0000000000000001 [ 747.233843] 000000000000000a ffff8800a56cf6f8 ffff8800a56cea40 ffffffff810795dd [ 747.233846] Call Trace: [ 747.233849] [<ffffffff814f446d>] dump_stack+0x4c/0x6e [ 747.233852] [<ffffffff81077036>] ? up+0x39/0x3e [ 747.233854] [<ffffffff814f3849>] print_usage_bug.part.23+0x25b/0x26a [ 747.233857] [<ffffffff810795dd>] ? print_shortest_lock_dependencies+0x182/0x182 [ 747.233859] [<ffffffff8107a9c9>] mark_lock+0x384/0x56d [ 747.233862] [<ffffffff8107ac11>] mark_held_locks+0x5f/0x76 [ 747.233865] [<ffffffffa023d2f3>] ? zcomp_strm_alloc+0x25/0x73 [zram] [ 747.233867] [<ffffffff8107d13b>] lockdep_trace_alloc+0xb2/0xb5 [ 747.233870] [<ffffffff8112bac7>] kmem_cache_alloc_trace+0x32/0x1e2 [ 747.233873] [<ffffffffa023d2f3>] zcomp_strm_alloc+0x25/0x73 [zram] [ 747.233876] [<ffffffffa023d428>] zcomp_strm_multi_find+0xe7/0x173 [zram] [ 747.233879] [<ffffffffa023d58b>] zcomp_strm_find+0xc/0xe [zram] [ 747.233881] [<ffffffffa023f292>] zram_bvec_rw+0x2ca/0x7e0 [zram] [ 747.233885] [<ffffffffa023fa8c>] zram_make_request+0x1fa/0x301 [zram] [ 747.233889] [<ffffffff812142f8>] generic_make_request+0x9c/0xdb [ 747.233891] [<ffffffff8121442e>] submit_bio+0xf7/0x120 [ 747.233895] [<ffffffff810f1c0c>] ? __test_set_page_writeback+0x1a0/0x1b8 [ 747.233897] [<ffffffff811a9d00>] ext4_io_submit+0x2e/0x43 [ 747.233899] [<ffffffff811a9efa>] ext4_bio_write_page+0x1b7/0x300 [ 747.233902] [<ffffffff811a2106>] mpage_submit_page+0x60/0x77 [ 747.233905] [<ffffffff811a25b0>] mpage_map_and_submit_buffers+0x10f/0x21d [ 747.233907] [<ffffffff811a6814>] ext4_writepages+0xc8c/0xe1b [ 747.233910] [<ffffffff810f3f77>] do_writepages+0x23/0x2c [ 747.233913] [<ffffffff810ea5d1>] __filemap_fdatawrite_range+0x84/0x8b [ 747.233915] [<ffffffff810ea657>] filemap_flush+0x1c/0x1e [ 747.233917] [<ffffffff811a3851>] ext4_alloc_da_blocks+0xb8/0x117 [ 747.233919] [<ffffffff811af52a>] ext4_rename+0x132/0x6dc [ 747.233921] [<ffffffff8107ac11>] ? mark_held_locks+0x5f/0x76 [ 747.233924] [<ffffffff811afafd>] ext4_rename2+0x29/0x2b [ 747.233926] [<ffffffff811427ea>] vfs_rename+0x540/0x636 [ 747.233928] [<ffffffff81146a01>] SyS_renameat2+0x359/0x44d [ 747.233931] [<ffffffff81146b26>] SyS_rename+0x1e/0x20 [ 747.233933] [<ffffffff814faa17>] entry_SYSCALL_64_fastpath+0x12/0x6f [[email protected]: add stable mark] Signed-off-by: Sergey Senozhatsky <[email protected]> Acked-by: Minchan Kim <[email protected]> Cc: Kyeongdon Kim <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Francisco Franco <[email protected]>
franciscofranco
pushed a commit
that referenced
this issue
May 9, 2018
We can end up allocating a new compression stream with GFP_KERNEL from within the IO path, which may result is nested (recursive) IO operations. That can introduce problems if the IO path in question is a reclaimer, holding some locks that will deadlock nested IOs. Allocate streams and working memory using GFP_NOIO flag, forbidding recursive IO and FS operations. An example: [ 747.233722] inconsistent {IN-RECLAIM_FS-W} -> {RECLAIM_FS-ON-W} usage. [ 747.233724] git/20158 [HC0[0]:SC0[0]:HE1:SE1] takes: [ 747.233725] (jbd2_handle){+.+.?.}, at: [<ffffffff811e31db>] start_this_handle+0x4ca/0x555 [ 747.233733] {IN-RECLAIM_FS-W} state was registered at: [ 747.233735] [<ffffffff8107b8e9>] __lock_acquire+0x8da/0x117b [ 747.233738] [<ffffffff8107c950>] lock_acquire+0x10c/0x1a7 [ 747.233740] [<ffffffff811e323e>] start_this_handle+0x52d/0x555 [ 747.233742] [<ffffffff811e331a>] jbd2__journal_start+0xb4/0x237 [ 747.233744] [<ffffffff811cc6c7>] __ext4_journal_start_sb+0x108/0x17e [ 747.233748] [<ffffffff811a90bf>] ext4_dirty_inode+0x32/0x61 [ 747.233750] [<ffffffff8115f37e>] __mark_inode_dirty+0x16b/0x60c [ 747.233754] [<ffffffff81150ad6>] iput+0x11e/0x274 [ 747.233757] [<ffffffff8114bfbd>] __dentry_kill+0x148/0x1b8 [ 747.233759] [<ffffffff8114c9d9>] shrink_dentry_list+0x274/0x44a [ 747.233761] [<ffffffff8114d38a>] prune_dcache_sb+0x4a/0x55 [ 747.233763] [<ffffffff8113b1ad>] super_cache_scan+0xfc/0x176 [ 747.233767] [<ffffffff810fa089>] shrink_slab.part.14.constprop.25+0x2a2/0x4d3 [ 747.233770] [<ffffffff810fcccb>] shrink_zone+0x74/0x140 [ 747.233772] [<ffffffff810fd924>] kswapd+0x6b7/0x930 [ 747.233774] [<ffffffff81058887>] kthread+0x107/0x10f [ 747.233778] [<ffffffff814fadff>] ret_from_fork+0x3f/0x70 [ 747.233783] irq event stamp: 138297 [ 747.233784] hardirqs last enabled at (138297): [<ffffffff8107aff3>] debug_check_no_locks_freed+0x113/0x12f [ 747.233786] hardirqs last disabled at (138296): [<ffffffff8107af13>] debug_check_no_locks_freed+0x33/0x12f [ 747.233788] softirqs last enabled at (137818): [<ffffffff81040f89>] __do_softirq+0x2d3/0x3e9 [ 747.233792] softirqs last disabled at (137813): [<ffffffff81041292>] irq_exit+0x41/0x95 [ 747.233794] other info that might help us debug this: [ 747.233796] Possible unsafe locking scenario: [ 747.233797] CPU0 [ 747.233798] ---- [ 747.233799] lock(jbd2_handle); [ 747.233801] <Interrupt> [ 747.233801] lock(jbd2_handle); [ 747.233803] *** DEADLOCK *** [ 747.233805] 5 locks held by git/20158: [ 747.233806] #0: (sb_writers#7){.+.+.+}, at: [<ffffffff81155411>] mnt_want_write+0x24/0x4b [ 747.233811] #1: (&type->i_mutex_dir_key#2/1){+.+.+.}, at: [<ffffffff81145087>] lock_rename+0xd9/0xe3 [ 747.233817] #2: (&sb->s_type->i_mutex_key#11){+.+.+.}, at: [<ffffffff8114f8e2>] lock_two_nondirectories+0x3f/0x6b [ 747.233822] #3: (&sb->s_type->i_mutex_key#11/4){+.+.+.}, at: [<ffffffff8114f909>] lock_two_nondirectories+0x66/0x6b [ 747.233827] #4: (jbd2_handle){+.+.?.}, at: [<ffffffff811e31db>] start_this_handle+0x4ca/0x555 [ 747.233831] stack backtrace: [ 747.233834] CPU: 2 PID: 20158 Comm: git Not tainted 4.1.0-rc7-next-20150615-dbg-00016-g8bdf555-dirty #211 [ 747.233837] ffff8800a56cea40 ffff88010d0a75f8 ffffffff814f446d ffffffff81077036 [ 747.233840] ffffffff823a84b0 ffff88010d0a7638 ffffffff814f3849 0000000000000001 [ 747.233843] 000000000000000a ffff8800a56cf6f8 ffff8800a56cea40 ffffffff810795dd [ 747.233846] Call Trace: [ 747.233849] [<ffffffff814f446d>] dump_stack+0x4c/0x6e [ 747.233852] [<ffffffff81077036>] ? up+0x39/0x3e [ 747.233854] [<ffffffff814f3849>] print_usage_bug.part.23+0x25b/0x26a [ 747.233857] [<ffffffff810795dd>] ? print_shortest_lock_dependencies+0x182/0x182 [ 747.233859] [<ffffffff8107a9c9>] mark_lock+0x384/0x56d [ 747.233862] [<ffffffff8107ac11>] mark_held_locks+0x5f/0x76 [ 747.233865] [<ffffffffa023d2f3>] ? zcomp_strm_alloc+0x25/0x73 [zram] [ 747.233867] [<ffffffff8107d13b>] lockdep_trace_alloc+0xb2/0xb5 [ 747.233870] [<ffffffff8112bac7>] kmem_cache_alloc_trace+0x32/0x1e2 [ 747.233873] [<ffffffffa023d2f3>] zcomp_strm_alloc+0x25/0x73 [zram] [ 747.233876] [<ffffffffa023d428>] zcomp_strm_multi_find+0xe7/0x173 [zram] [ 747.233879] [<ffffffffa023d58b>] zcomp_strm_find+0xc/0xe [zram] [ 747.233881] [<ffffffffa023f292>] zram_bvec_rw+0x2ca/0x7e0 [zram] [ 747.233885] [<ffffffffa023fa8c>] zram_make_request+0x1fa/0x301 [zram] [ 747.233889] [<ffffffff812142f8>] generic_make_request+0x9c/0xdb [ 747.233891] [<ffffffff8121442e>] submit_bio+0xf7/0x120 [ 747.233895] [<ffffffff810f1c0c>] ? __test_set_page_writeback+0x1a0/0x1b8 [ 747.233897] [<ffffffff811a9d00>] ext4_io_submit+0x2e/0x43 [ 747.233899] [<ffffffff811a9efa>] ext4_bio_write_page+0x1b7/0x300 [ 747.233902] [<ffffffff811a2106>] mpage_submit_page+0x60/0x77 [ 747.233905] [<ffffffff811a25b0>] mpage_map_and_submit_buffers+0x10f/0x21d [ 747.233907] [<ffffffff811a6814>] ext4_writepages+0xc8c/0xe1b [ 747.233910] [<ffffffff810f3f77>] do_writepages+0x23/0x2c [ 747.233913] [<ffffffff810ea5d1>] __filemap_fdatawrite_range+0x84/0x8b [ 747.233915] [<ffffffff810ea657>] filemap_flush+0x1c/0x1e [ 747.233917] [<ffffffff811a3851>] ext4_alloc_da_blocks+0xb8/0x117 [ 747.233919] [<ffffffff811af52a>] ext4_rename+0x132/0x6dc [ 747.233921] [<ffffffff8107ac11>] ? mark_held_locks+0x5f/0x76 [ 747.233924] [<ffffffff811afafd>] ext4_rename2+0x29/0x2b [ 747.233926] [<ffffffff811427ea>] vfs_rename+0x540/0x636 [ 747.233928] [<ffffffff81146a01>] SyS_renameat2+0x359/0x44d [ 747.233931] [<ffffffff81146b26>] SyS_rename+0x1e/0x20 [ 747.233933] [<ffffffff814faa17>] entry_SYSCALL_64_fastpath+0x12/0x6f [[email protected]: add stable mark] Signed-off-by: Sergey Senozhatsky <[email protected]> Acked-by: Minchan Kim <[email protected]> Cc: Kyeongdon Kim <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Francisco Franco <[email protected]>
franciscofranco
pushed a commit
that referenced
this issue
Sep 16, 2018
We can end up allocating a new compression stream with GFP_KERNEL from within the IO path, which may result is nested (recursive) IO operations. That can introduce problems if the IO path in question is a reclaimer, holding some locks that will deadlock nested IOs. Allocate streams and working memory using GFP_NOIO flag, forbidding recursive IO and FS operations. An example: [ 747.233722] inconsistent {IN-RECLAIM_FS-W} -> {RECLAIM_FS-ON-W} usage. [ 747.233724] git/20158 [HC0[0]:SC0[0]:HE1:SE1] takes: [ 747.233725] (jbd2_handle){+.+.?.}, at: [<ffffffff811e31db>] start_this_handle+0x4ca/0x555 [ 747.233733] {IN-RECLAIM_FS-W} state was registered at: [ 747.233735] [<ffffffff8107b8e9>] __lock_acquire+0x8da/0x117b [ 747.233738] [<ffffffff8107c950>] lock_acquire+0x10c/0x1a7 [ 747.233740] [<ffffffff811e323e>] start_this_handle+0x52d/0x555 [ 747.233742] [<ffffffff811e331a>] jbd2__journal_start+0xb4/0x237 [ 747.233744] [<ffffffff811cc6c7>] __ext4_journal_start_sb+0x108/0x17e [ 747.233748] [<ffffffff811a90bf>] ext4_dirty_inode+0x32/0x61 [ 747.233750] [<ffffffff8115f37e>] __mark_inode_dirty+0x16b/0x60c [ 747.233754] [<ffffffff81150ad6>] iput+0x11e/0x274 [ 747.233757] [<ffffffff8114bfbd>] __dentry_kill+0x148/0x1b8 [ 747.233759] [<ffffffff8114c9d9>] shrink_dentry_list+0x274/0x44a [ 747.233761] [<ffffffff8114d38a>] prune_dcache_sb+0x4a/0x55 [ 747.233763] [<ffffffff8113b1ad>] super_cache_scan+0xfc/0x176 [ 747.233767] [<ffffffff810fa089>] shrink_slab.part.14.constprop.25+0x2a2/0x4d3 [ 747.233770] [<ffffffff810fcccb>] shrink_zone+0x74/0x140 [ 747.233772] [<ffffffff810fd924>] kswapd+0x6b7/0x930 [ 747.233774] [<ffffffff81058887>] kthread+0x107/0x10f [ 747.233778] [<ffffffff814fadff>] ret_from_fork+0x3f/0x70 [ 747.233783] irq event stamp: 138297 [ 747.233784] hardirqs last enabled at (138297): [<ffffffff8107aff3>] debug_check_no_locks_freed+0x113/0x12f [ 747.233786] hardirqs last disabled at (138296): [<ffffffff8107af13>] debug_check_no_locks_freed+0x33/0x12f [ 747.233788] softirqs last enabled at (137818): [<ffffffff81040f89>] __do_softirq+0x2d3/0x3e9 [ 747.233792] softirqs last disabled at (137813): [<ffffffff81041292>] irq_exit+0x41/0x95 [ 747.233794] other info that might help us debug this: [ 747.233796] Possible unsafe locking scenario: [ 747.233797] CPU0 [ 747.233798] ---- [ 747.233799] lock(jbd2_handle); [ 747.233801] <Interrupt> [ 747.233801] lock(jbd2_handle); [ 747.233803] *** DEADLOCK *** [ 747.233805] 5 locks held by git/20158: [ 747.233806] #0: (sb_writers#7){.+.+.+}, at: [<ffffffff81155411>] mnt_want_write+0x24/0x4b [ 747.233811] #1: (&type->i_mutex_dir_key#2/1){+.+.+.}, at: [<ffffffff81145087>] lock_rename+0xd9/0xe3 [ 747.233817] #2: (&sb->s_type->i_mutex_key#11){+.+.+.}, at: [<ffffffff8114f8e2>] lock_two_nondirectories+0x3f/0x6b [ 747.233822] #3: (&sb->s_type->i_mutex_key#11/4){+.+.+.}, at: [<ffffffff8114f909>] lock_two_nondirectories+0x66/0x6b [ 747.233827] #4: (jbd2_handle){+.+.?.}, at: [<ffffffff811e31db>] start_this_handle+0x4ca/0x555 [ 747.233831] stack backtrace: [ 747.233834] CPU: 2 PID: 20158 Comm: git Not tainted 4.1.0-rc7-next-20150615-dbg-00016-g8bdf555-dirty #211 [ 747.233837] ffff8800a56cea40 ffff88010d0a75f8 ffffffff814f446d ffffffff81077036 [ 747.233840] ffffffff823a84b0 ffff88010d0a7638 ffffffff814f3849 0000000000000001 [ 747.233843] 000000000000000a ffff8800a56cf6f8 ffff8800a56cea40 ffffffff810795dd [ 747.233846] Call Trace: [ 747.233849] [<ffffffff814f446d>] dump_stack+0x4c/0x6e [ 747.233852] [<ffffffff81077036>] ? up+0x39/0x3e [ 747.233854] [<ffffffff814f3849>] print_usage_bug.part.23+0x25b/0x26a [ 747.233857] [<ffffffff810795dd>] ? print_shortest_lock_dependencies+0x182/0x182 [ 747.233859] [<ffffffff8107a9c9>] mark_lock+0x384/0x56d [ 747.233862] [<ffffffff8107ac11>] mark_held_locks+0x5f/0x76 [ 747.233865] [<ffffffffa023d2f3>] ? zcomp_strm_alloc+0x25/0x73 [zram] [ 747.233867] [<ffffffff8107d13b>] lockdep_trace_alloc+0xb2/0xb5 [ 747.233870] [<ffffffff8112bac7>] kmem_cache_alloc_trace+0x32/0x1e2 [ 747.233873] [<ffffffffa023d2f3>] zcomp_strm_alloc+0x25/0x73 [zram] [ 747.233876] [<ffffffffa023d428>] zcomp_strm_multi_find+0xe7/0x173 [zram] [ 747.233879] [<ffffffffa023d58b>] zcomp_strm_find+0xc/0xe [zram] [ 747.233881] [<ffffffffa023f292>] zram_bvec_rw+0x2ca/0x7e0 [zram] [ 747.233885] [<ffffffffa023fa8c>] zram_make_request+0x1fa/0x301 [zram] [ 747.233889] [<ffffffff812142f8>] generic_make_request+0x9c/0xdb [ 747.233891] [<ffffffff8121442e>] submit_bio+0xf7/0x120 [ 747.233895] [<ffffffff810f1c0c>] ? __test_set_page_writeback+0x1a0/0x1b8 [ 747.233897] [<ffffffff811a9d00>] ext4_io_submit+0x2e/0x43 [ 747.233899] [<ffffffff811a9efa>] ext4_bio_write_page+0x1b7/0x300 [ 747.233902] [<ffffffff811a2106>] mpage_submit_page+0x60/0x77 [ 747.233905] [<ffffffff811a25b0>] mpage_map_and_submit_buffers+0x10f/0x21d [ 747.233907] [<ffffffff811a6814>] ext4_writepages+0xc8c/0xe1b [ 747.233910] [<ffffffff810f3f77>] do_writepages+0x23/0x2c [ 747.233913] [<ffffffff810ea5d1>] __filemap_fdatawrite_range+0x84/0x8b [ 747.233915] [<ffffffff810ea657>] filemap_flush+0x1c/0x1e [ 747.233917] [<ffffffff811a3851>] ext4_alloc_da_blocks+0xb8/0x117 [ 747.233919] [<ffffffff811af52a>] ext4_rename+0x132/0x6dc [ 747.233921] [<ffffffff8107ac11>] ? mark_held_locks+0x5f/0x76 [ 747.233924] [<ffffffff811afafd>] ext4_rename2+0x29/0x2b [ 747.233926] [<ffffffff811427ea>] vfs_rename+0x540/0x636 [ 747.233928] [<ffffffff81146a01>] SyS_renameat2+0x359/0x44d [ 747.233931] [<ffffffff81146b26>] SyS_rename+0x1e/0x20 [ 747.233933] [<ffffffff814faa17>] entry_SYSCALL_64_fastpath+0x12/0x6f [[email protected]: add stable mark] Signed-off-by: Sergey Senozhatsky <[email protected]> Acked-by: Minchan Kim <[email protected]> Cc: Kyeongdon Kim <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Francisco Franco <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
I can't get my Fiio E17 to play anything with r34, with r33 it worked (if you plugged the DAC while something was playing OR plugged the DAC, started playing something and plugged something to the 3.5mm jack and removed it).
I can't test USB OTG because I glued the OTG cable to another USB cable.
dmesg http://forum.xda-developers.com/attachment.php?attachmentid=3049935&d=1417816414
The text was updated successfully, but these errors were encountered: