Skip to content

Odroidx next#1

Merged
hardkernel merged 9 commits intohardkernel:odroidx-nextfrom
tobetter:odroidx-next
Aug 13, 2012
Merged

Odroidx next#1
hardkernel merged 9 commits intohardkernel:odroidx-nextfrom
tobetter:odroidx-next

Conversation

@tobetter
Copy link
Collaborator

eMMC 및 CPUFREQ 드라이버 관련 패치입니다.

Dongjin Kim and others added 9 commits August 13, 2012 07:11
Some platforms allow for clock gating and control of bus interface unit clock
and card interface unit clock. Add support for clock lookup of optional biu
and ciu clocks for clock gating and clock speed determination.

Signed-off-by: Abhilash Kesavan <a.kesavan@samsung.com>
Signed-off-by: Thomas Abraham <thomas.abraham@linaro.org>
'biu' and 'ciu' clock source is chagned to "dwmmc" and "sclk_dwmci", to use
Exynos speicific clock gating.

Signed-off-by: Dongjin Kim <dongjin.kim@agreeyamobility.net>
Defined the clocks for DWCI and gpio configuration.

Signed-off-by: Dongjin Kim <dongjin.kim@agreeyamobility.net>
Signed-off-by: Dongjin Kim <dongjin.kim@agreeyamobility.net>
"Synopsys DesignWare Memory Card Interface" is enabled.

Signed-off-by: Dongjin Kim <dongjin.kim@agreeyamobility.net>
Exynos4 cpufreq driver is enabled, and 'performance' is selected for its power
governor as default.

Signed-off-by: Dongjin Kim <dongjin.kim@agreeyamobility.net>
hardkernel added a commit that referenced this pull request Aug 13, 2012
@hardkernel hardkernel merged commit 5d0fd34 into hardkernel:odroidx-next Aug 13, 2012
hardkernel pushed a commit that referenced this pull request Aug 17, 2012
With the new i.MX clock infrastructure we need to request the dma clocks
seperately: ahb and ipg clocks.

This fixes the following kernel crash and make audio to be functional again:

root@freescale /home$ aplay audio48k16S.wav
Playing WAVE 'audio48k16S.wav' : Signed 16 bit Little Endian, Rate 48000 Hz, Stereo
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = c7b74000
[00000000] *pgd=a7bb5831, *pte=00000000, *ppte=00000000
Internal error: Oops: 17 [#1] PREEMPT ARM
Modules linked in:
CPU: 0    Not tainted  (3.5.0-rc5-next-20120702-00007-g3028b64 #1128)
PC is at snd_dmaengine_pcm_get_chan+0x8/0x10
LR is at snd_imx_pcm_hw_params+0x18/0xdc
pc : [<c02d3cf8>]    lr : [<c02e95ec>]    psr: a0000013
sp : c7b45e30  ip : ffffffff  fp : c7ae58e0
r10: 00000000  r9 : c7ae981c  r8 : c7b88800
r7 : c7ae5a60  r6 : c7ae5b20  r5 : c7ae9810  r4 : c7afa060
r3 : 00000000  r2 : 00000001  r1 : c7b88800  r0 : c7afa060
Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 0005317f  Table: a7b74000  DAC: 00000015
Process aplay (pid: 701, stack limit = 0xc7b44270)
Stack: (0xc7b45e30 to 0xc7b46000)
5e20:                                     00100000 00000029 c7b88800 c02db870
5e40: c7ae5a60 c02d4594 00000010 01ae5a60 c7ae5a60 c7ae9810 c7ae9810 c7afa060
5e60: c7ae5b20 c7ae5a60 c7b88800 c02e3ef0 c02e3e08 c7b1e400 c7afa060 c7b88800
5e80: 00000000 c0014da8 c7b44000 00000000 bec566ac c02cd400 c7afa060 c7afa060
5ea0: bec56800 c7b88800 c0014da8 c02cdd7c c04ee710 c04ee7b8 00000003 c005fc74
5ec0: 00000000 7fffffff c7b45f00 c7afa060 c7b67420 c7ba3070 00000004 c0014da8
5ee0: c7b44000 00000000 bec566ac c02ced88 c04e95f8 b6f5ab04 c7b45fb0 0145a468
5f00: 0145a600 bec566bc bec56800 c7b67420 c7ba3070 c00d499c c7b45f18 c7b45f18
5f20: 0000001a 00000004 00000001 c7b44000 c0527f40 00000009 00000008 00000000
5f40: c7b44000 c002c9ec 00000001 c04f0ab0 c04ebec0 00000101 00000000 0000000a
5f60: 60000093 c7b67420 bec56800 c25c4111 00000004 c0014da8 c7b44000 00000000
5f80: bec566ac c00d4f38 b6ffb658 00000000 c0522d80 0145a468 b6fd5000 0145a418
5fa0: 00000036 c0014c00 0145a468 b6fd5000 00000004 c25c4111 bec56800 00020001
5fc0: 0145a468 b6fd5000 0145a418 00000036 0145a468 0145a600 bec566bc bec566ac
5fe0: 0145a468 bec56388 b6f65ce4 b6dcebec 20000010 00000004 00000000 00000000
[<c02d3cf8>] (snd_dmaengine_pcm_get_chan+0x8/0x10) from [<c02e95ec>] (snd_imx_pcm_hw_params+0x18/0xdc)
[<c02e95ec>] (snd_imx_pcm_hw_params+0x18/0xdc) from [<c02e3ef0>] (soc_pcm_hw_params+0xe8/0x1f0)
[<c02e3ef0>] (soc_pcm_hw_params+0xe8/0x1f0) from [<c02cd400>] (snd_pcm_hw_params+0x124/0x474)
[<c02cd400>] (snd_pcm_hw_params+0x124/0x474) from [<c02cdd7c>] (snd_pcm_common_ioctl1+0x4b4/0xf74)
[<c02cdd7c>] (snd_pcm_common_ioctl1+0x4b4/0xf74) from [<c02ced88>] (snd_pcm_playback_ioctl1+0x30/0x510)
[<c02ced88>] (snd_pcm_playback_ioctl1+0x30/0x510) from [<c00d499c>] (do_vfs_ioctl+0x80/0x5e4)
[<c00d499c>] (do_vfs_ioctl+0x80/0x5e4) from [<c00d4f38>] (sys_ioctl+0x38/0x60)
[<c00d4f38>] (sys_ioctl+0x38/0x60) from [<c0014c00>] (ret_fast_syscall+0x0/0x2c)
Code: e593000c e12fff1e e59030a0 e59330bc (e5930000)
---[ end trace fa518c8ba3a74e97 ]--

Reported-by: Javier Martin <javier.martin@vista-silicon.com>
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
Cc: stable@vger.kernel.org
hardkernel pushed a commit that referenced this pull request Aug 17, 2012
commit 5d299f3 (net: ipv6: fix TCP early demux) added a
regression for ipv6_mapped case.

[   67.422369] SELinux: initialized (dev autofs, type autofs), uses
genfs_contexts
[   67.449678] SELinux: initialized (dev autofs, type autofs), uses
genfs_contexts
[   92.631060] BUG: unable to handle kernel NULL pointer dereference at
(null)
[   92.631435] IP: [<          (null)>]           (null)
[   92.631645] PGD 0
[   92.631846] Oops: 0010 [#1] SMP
[   92.632095] Modules linked in: autofs4 sunrpc ipv6 dm_mirror
dm_region_hash dm_log dm_multipath dm_mod video sbs sbshc battery ac lp
parport sg snd_hda_intel snd_hda_codec snd_seq_oss snd_seq_midi_event
snd_seq snd_seq_device pcspkr snd_pcm_oss snd_mixer_oss snd_pcm
snd_timer serio_raw button floppy snd i2c_i801 i2c_core soundcore
snd_page_alloc shpchp ide_cd_mod cdrom microcode ehci_hcd ohci_hcd
uhci_hcd
[   92.634294] CPU 0
[   92.634294] Pid: 4469, comm: sendmail Not tainted 3.6.0-rc1 #3
[   92.634294] RIP: 0010:[<0000000000000000>]  [<          (null)>]
(null)
[   92.634294] RSP: 0018:ffff880245fc7cb0  EFLAGS: 00010282
[   92.634294] RAX: ffffffffa01985f0 RBX: ffff88024827ad00 RCX:
0000000000000000
[   92.634294] RDX: 0000000000000218 RSI: ffff880254735380 RDI:
ffff88024827ad00
[   92.634294] RBP: ffff880245fc7cc8 R08: 0000000000000001 R09:
0000000000000000
[   92.634294] R10: 0000000000000000 R11: ffff880245fc7bf8 R12:
ffff880254735380
[   92.634294] R13: ffff880254735380 R14: 0000000000000000 R15:
7fffffffffff0218
[   92.634294] FS:  00007f4516ccd6f0(0000) GS:ffff880256600000(0000)
knlGS:0000000000000000
[   92.634294] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   92.634294] CR2: 0000000000000000 CR3: 0000000245ed1000 CR4:
00000000000007f0
[   92.634294] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[   92.634294] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[   92.634294] Process sendmail (pid: 4469, threadinfo ffff880245fc6000,
task ffff880254b8cac0)
[   92.634294] Stack:
[   92.634294]  ffffffff813837a7 ffff88024827ad00 ffff880254b6b0e8
ffff880245fc7d68
[   92.634294]  ffffffff81385083 00000000001d2680 ffff8802547353a8
ffff880245fc7d18
[   92.634294]  ffffffff8105903a ffff88024827ad60 0000000000000002
00000000000000ff
[   92.634294] Call Trace:
[   92.634294]  [<ffffffff813837a7>] ? tcp_finish_connect+0x2c/0xfa
[   92.634294]  [<ffffffff81385083>] tcp_rcv_state_process+0x2b6/0x9c6
[   92.634294]  [<ffffffff8105903a>] ? sched_clock_cpu+0xc3/0xd1
[   92.634294]  [<ffffffff81059073>] ? local_clock+0x2b/0x3c
[   92.634294]  [<ffffffff8138caf3>] tcp_v4_do_rcv+0x63a/0x670
[   92.634294]  [<ffffffff8133278e>] release_sock+0x128/0x1bd
[   92.634294]  [<ffffffff8139f060>] __inet_stream_connect+0x1b1/0x352
[   92.634294]  [<ffffffff813325f5>] ? lock_sock_nested+0x74/0x7f
[   92.634294]  [<ffffffff8104b333>] ? wake_up_bit+0x25/0x25
[   92.634294]  [<ffffffff813325f5>] ? lock_sock_nested+0x74/0x7f
[   92.634294]  [<ffffffff8139f223>] ? inet_stream_connect+0x22/0x4b
[   92.634294]  [<ffffffff8139f234>] inet_stream_connect+0x33/0x4b
[   92.634294]  [<ffffffff8132e8cf>] sys_connect+0x78/0x9e
[   92.634294]  [<ffffffff813fd407>] ? sysret_check+0x1b/0x56
[   92.634294]  [<ffffffff81088503>] ? __audit_syscall_entry+0x195/0x1c8
[   92.634294]  [<ffffffff811cc26e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[   92.634294]  [<ffffffff813fd3e2>] system_call_fastpath+0x16/0x1b
[   92.634294] Code:  Bad RIP value.
[   92.634294] RIP  [<          (null)>]           (null)
[   92.634294]  RSP <ffff880245fc7cb0>
[   92.634294] CR2: 0000000000000000
[   92.648982] ---[ end trace 24e2bed94314c8d9 ]---
[   92.649146] Kernel panic - not syncing: Fatal exception in interrupt

Fix this using inet_sk_rx_dst_set(), and export this function in case
IPv6 is modular.

Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
hardkernel pushed a commit that referenced this pull request Aug 23, 2012
sco_chan_del() only has conn != NULL when called from sco_conn_del() so
just move the code from it that deal with conn to sco_conn_del().

[  120.765529]
[  120.765529] ======================================================
[  120.766529] [ INFO: possible circular locking dependency detected ]
[  120.766529] 3.5.0-rc1-10292-g3701f94-dirty #70 Tainted: G        W
[  120.766529] -------------------------------------------------------
[  120.766529] kworker/u:3/1497 is trying to acquire lock:
[  120.766529]  (&(&conn->lock)->rlock#2){+.+...}, at:
[<ffffffffa00b7ecc>] sco_chan_del+0x4c/0x170 [bluetooth]
[  120.766529]
[  120.766529] but task is already holding lock:
[  120.766529]  (slock-AF_BLUETOOTH-BTPROTO_SCO){+.+...}, at:
[<ffffffffa00b8401>] sco_conn_del+0x61/0xe0 [bluetooth]
[  120.766529]
[  120.766529] which lock already depends on the new lock.
[  120.766529]
[  120.766529]
[  120.766529] the existing dependency chain (in reverse order) is:
[  120.766529]
[  120.766529] -> #1 (slock-AF_BLUETOOTH-BTPROTO_SCO){+.+...}:
[  120.766529]        [<ffffffff8107980e>] lock_acquire+0x8e/0xb0
[  120.766529]        [<ffffffff813c19e0>] _raw_spin_lock+0x40/0x80
[  120.766529]        [<ffffffffa00b85e9>] sco_connect_cfm+0x79/0x300
[bluetooth]
[  120.766529]        [<ffffffffa0094b13>]
hci_sync_conn_complete_evt.isra.90+0x343/0x400 [bluetooth]
[  120.766529]        [<ffffffffa009d447>] hci_event_packet+0x317/0xfb0
[bluetooth]
[  120.766529]        [<ffffffffa008aa68>] hci_rx_work+0x2c8/0x890
[bluetooth]
[  120.766529]        [<ffffffff81047db7>] process_one_work+0x197/0x460
[  120.766529]        [<ffffffff810489d6>] worker_thread+0x126/0x2d0
[  120.766529]        [<ffffffff8104ee4d>] kthread+0x9d/0xb0
[  120.766529]        [<ffffffff813c4294>] kernel_thread_helper+0x4/0x10
[  120.766529]
[  120.766529] -> #0 (&(&conn->lock)->rlock#2){+.+...}:
[  120.766529]        [<ffffffff81078a8a>] __lock_acquire+0x154a/0x1d30
[  120.766529]        [<ffffffff8107980e>] lock_acquire+0x8e/0xb0
[  120.766529]        [<ffffffff813c19e0>] _raw_spin_lock+0x40/0x80
[  120.766529]        [<ffffffffa00b7ecc>] sco_chan_del+0x4c/0x170
[bluetooth]
[  120.766529]        [<ffffffffa00b8414>] sco_conn_del+0x74/0xe0
[bluetooth]
[  120.766529]        [<ffffffffa00b88a2>] sco_disconn_cfm+0x32/0x60
[bluetooth]
[  120.766529]        [<ffffffffa0093a82>]
hci_disconn_complete_evt.isra.53+0x242/0x390 [bluetooth]
[  120.766529]        [<ffffffffa009d747>] hci_event_packet+0x617/0xfb0
[bluetooth]
[  120.766529]        [<ffffffffa008aa68>] hci_rx_work+0x2c8/0x890
[bluetooth]
[  120.766529]        [<ffffffff81047db7>] process_one_work+0x197/0x460
[  120.766529]        [<ffffffff810489d6>] worker_thread+0x126/0x2d0
[  120.766529]        [<ffffffff8104ee4d>] kthread+0x9d/0xb0
[  120.766529]        [<ffffffff813c4294>] kernel_thread_helper+0x4/0x10
[  120.766529]
[  120.766529] other info that might help us debug this:
[  120.766529]
[  120.766529]  Possible unsafe locking scenario:
[  120.766529]
[  120.766529]        CPU0                    CPU1
[  120.766529]        ----                    ----
[  120.766529]   lock(slock-AF_BLUETOOTH-BTPROTO_SCO);
[  120.766529]
lock(&(&conn->lock)->rlock#2);
[  120.766529]
lock(slock-AF_BLUETOOTH-BTPROTO_SCO);
[  120.766529]   lock(&(&conn->lock)->rlock#2);
[  120.766529]
[  120.766529]  *** DEADLOCK ***

Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
hardkernel pushed a commit that referenced this pull request Aug 23, 2012
Doing port specific cleanup in the .port_remove hook is a
lot simpler and safer than doing it in the USB driver
.release or .disconnect methods. The removal of the port
from the usb-serial bus will happen before the USB driver
cleanup, so we must be careful about accessing port specific
driver data from any USB driver functions.

This problem surfaced after the commit

 0998d06 device-core: Ensure drvdata = NULL when no driver is bound

which turned the previous unsafe access into a reliable NULL
pointer dereference.

Fixes the following Oops:

[  243.148471] BUG: unable to handle kernel NULL pointer dereference at           (null)
[  243.148508] IP: [<ffffffffa0468527>] stop_read_write_urbs+0x37/0x80 [usb_wwan]
[  243.148556] PGD 79d60067 PUD 79d61067 PMD 0
[  243.148590] Oops: 0000 [#1] SMP
[  243.148617] Modules linked in: sr_mod cdrom qmi_wwan usbnet option cdc_wdm usb_wwan usbserial usb_storage uas fuse af_packet ip6table_filter ip6_tables iptable_filter ip_tables x_tables tun edd
cpufreq_conservative cpufreq_userspace cpufreq_powersave snd_pcm_oss snd_mixer_oss acpi_cpufreq snd_seq mperf snd_seq_device coretemp arc4 sg hp_wmi sparse_keymap uvcvideo videobuf2_core
videodev videobuf2_vmalloc videobuf2_memops rtl8192ce rtl8192c_common rtlwifi joydev pcspkr microcode mac80211 i2c_i801 lpc_ich r8169 snd_hda_codec_idt cfg80211 snd_hda_intel snd_hda_codec rfkill
snd_hwdep snd_pcm wmi snd_timer ac snd soundcore snd_page_alloc battery uhci_hcd i915 drm_kms_helper drm i2c_algo_bit ehci_hcd thermal usbcore video usb_common button processor thermal_sys
[  243.149007] CPU 1
[  243.149027] Pid: 135, comm: khubd Not tainted 3.5.0-rc7-next-20120720-1-vanilla #1 Hewlett-Packard HP Mini 110-3700                /1584
[  243.149072] RIP: 0010:[<ffffffffa0468527>]  [<ffffffffa0468527>] stop_read_write_urbs+0x37/0x80 [usb_wwan]
[  243.149118] RSP: 0018:ffff880037e75b30  EFLAGS: 00010286
[  243.149133] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88005912aa28
[  243.149150] RDX: ffff88005e95f028 RSI: 0000000000000000 RDI: ffff88005f7c1a10
[  243.149166] RBP: ffff880037e75b60 R08: 0000000000000000 R09: ffffffff812cea90
[  243.149182] R10: 0000000000000000 R11: 0000000000000001 R12: ffff88006539b440
[  243.149198] R13: ffff88006539b440 R14: 0000000000000000 R15: 0000000000000000
[  243.149216] FS:  0000000000000000(0000) GS:ffff88007ee80000(0000) knlGS:0000000000000000
[  243.149233] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  243.149248] CR2: 0000000000000000 CR3: 0000000079fe0000 CR4: 00000000000007e0
[  243.149264] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  243.149280] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  243.149298] Process khubd (pid: 135, threadinfo ffff880037e74000, task ffff880037d40600)
[  243.149313] Stack:
[  243.149323]  ffff880037e75b40 ffff88006539b440 ffff8800799bc830 ffff88005f7c1800
[  243.149348]  0000000000000001 ffff88006539b448 ffff880037e75b70 ffffffffa04685e9
[  243.149371]  ffff880037e75bc0 ffffffffa0473765 ffff880037354988 ffff88007b594800
[  243.149395] Call Trace:
[  243.149419]  [<ffffffffa04685e9>] usb_wwan_disconnect+0x9/0x10 [usb_wwan]
[  243.149447]  [<ffffffffa0473765>] usb_serial_disconnect+0xd5/0x120 [usbserial]
[  243.149511]  [<ffffffffa0046b48>] usb_unbind_interface+0x58/0x1a0 [usbcore]
[  243.149545]  [<ffffffff8139ebd7>] __device_release_driver+0x77/0xe0
[  243.149567]  [<ffffffff8139ec67>] device_release_driver+0x27/0x40
[  243.149587]  [<ffffffff8139e5cf>] bus_remove_device+0xdf/0x150
[  243.149608]  [<ffffffff8139bc78>] device_del+0x118/0x1a0
[  243.149661]  [<ffffffffa0044590>] usb_disable_device+0xb0/0x280 [usbcore]
[  243.149718]  [<ffffffffa003c6fd>] usb_disconnect+0x9d/0x140 [usbcore]
[  243.149770]  [<ffffffffa003da7d>] hub_port_connect_change+0xad/0x8a0 [usbcore]
[  243.149825]  [<ffffffffa0043bf5>] ? usb_control_msg+0xe5/0x110 [usbcore]
[  243.149878]  [<ffffffffa003e6e3>] hub_events+0x473/0x760 [usbcore]
[  243.149931]  [<ffffffffa003ea05>] hub_thread+0x35/0x1d0 [usbcore]
[  243.149955]  [<ffffffff81061960>] ? add_wait_queue+0x60/0x60
[  243.150004]  [<ffffffffa003e9d0>] ? hub_events+0x760/0x760 [usbcore]
[  243.150026]  [<ffffffff8106133e>] kthread+0x8e/0xa0
[  243.150047]  [<ffffffff8157ec04>] kernel_thread_helper+0x4/0x10
[  243.150068]  [<ffffffff810612b0>] ? flush_kthread_work+0x120/0x120
[  243.150088]  [<ffffffff8157ec00>] ? gs_change+0xb/0xb
[  243.150101] Code: fd 41 54 53 48 83 ec 08 80 7f 1a 00 74 57 49 89 fc 31 db 90 49 8b 7c 24 20 45 31 f6 48 81 c7 10 02 00 00 e8 bc 64 f3 e0 49 89 c7 <4b> 8b 3c 37 49 83 c6 08 e8 4c a5 bd ff 49 83 fe 20
75 ed 45 30
[  243.150257] RIP  [<ffffffffa0468527>] stop_read_write_urbs+0x37/0x80 [usb_wwan]
[  243.150282]  RSP <ffff880037e75b30>
[  243.150294] CR2: 0000000000000000
[  243.177170] ---[ end trace fba433d9015ffb8c ]---

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reported-by: Thomas Schäfer <tschaefer@t-online.de>
Suggested-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
hardkernel pushed a commit that referenced this pull request Aug 23, 2012
On architectures where cputime_t is 64 bit type, is possible to trigger
divide by zero on do_div(temp, (__force u32) total) line, if total is a
non zero number but has lower 32 bit's zeroed. Removing casting is not
a good solution since some do_div() implementations do cast to u32
internally.

This problem can be triggered in practice on very long lived processes:

  PID: 2331   TASK: ffff880472814b00  CPU: 2   COMMAND: "oraagent.bin"
   #0 [ffff880472a51b70] machine_kexec at ffffffff8103214b
   #1 [ffff880472a51bd0] crash_kexec at ffffffff810b91c2
   #2 [ffff880472a51ca0] oops_end at ffffffff814f0b00
   #3 [ffff880472a51cd0] die at ffffffff8100f26b
   #4 [ffff880472a51d00] do_trap at ffffffff814f03f4
   #5 [ffff880472a51d60] do_divide_error at ffffffff8100cfff
   #6 [ffff880472a51e00] divide_error at ffffffff8100be7b
      [exception RIP: thread_group_times+0x56]
      RIP: ffffffff81056a16  RSP: ffff880472a51eb8  RFLAGS: 00010046
      RAX: bc3572c9fe12d194  RBX: ffff880874150800  RCX: 0000000110266fad
      RDX: 0000000000000000  RSI: ffff880472a51eb8  RDI: 001038ae7d9633dc
      RBP: ffff880472a51ef8   R8: 00000000b10a3a64   R9: ffff880874150800
      R10: 00007fcba27ab680  R11: 0000000000000202  R12: ffff880472a51f08
      R13: ffff880472a51f10  R14: 0000000000000000  R15: 0000000000000007
      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
   #7 [ffff880472a51f00] do_sys_times at ffffffff8108845d
   #8 [ffff880472a51f40] sys_times at ffffffff81088524
   #9 [ffff880472a51f80] system_call_fastpath at ffffffff8100b0f2
      RIP: 0000003808caac3a  RSP: 00007fcba27ab6d8  RFLAGS: 00000202
      RAX: 0000000000000064  RBX: ffffffff8100b0f2  RCX: 0000000000000000
      RDX: 00007fcba27ab6e0  RSI: 000000000076d58e  RDI: 00007fcba27ab6e0
      RBP: 00007fcba27ab700   R8: 0000000000000020   R9: 000000000000091b
      R10: 00007fcba27ab680  R11: 0000000000000202  R12: 00007fff9ca41940
      R13: 0000000000000000  R14: 00007fcba27ac9c0  R15: 00007fff9ca41940
      ORIG_RAX: 0000000000000064  CS: 0033  SS: 002b

Cc: stable@vger.kernel.org
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120808092714.GA3580@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
hardkernel pushed a commit that referenced this pull request Aug 23, 2012
…lock

Cong Wang reports that lockdep detected suspicious RCU usage while
enabling IPV6 forwarding:

 [ 1123.310275] ===============================
 [ 1123.442202] [ INFO: suspicious RCU usage. ]
 [ 1123.558207] 3.6.0-rc1+ #109 Not tainted
 [ 1123.665204] -------------------------------
 [ 1123.768254] include/linux/rcupdate.h:430 Illegal context switch in RCU read-side critical section!
 [ 1123.992320]
 [ 1123.992320] other info that might help us debug this:
 [ 1123.992320]
 [ 1124.307382]
 [ 1124.307382] rcu_scheduler_active = 1, debug_locks = 0
 [ 1124.522220] 2 locks held by sysctl/5710:
 [ 1124.648364]  #0:  (rtnl_mutex){+.+.+.}, at: [<ffffffff81768498>] rtnl_trylock+0x15/0x17
 [ 1124.882211]  #1:  (rcu_read_lock){.+.+.+}, at: [<ffffffff81871df8>] rcu_lock_acquire+0x0/0x29
 [ 1125.085209]
 [ 1125.085209] stack backtrace:
 [ 1125.332213] Pid: 5710, comm: sysctl Not tainted 3.6.0-rc1+ #109
 [ 1125.441291] Call Trace:
 [ 1125.545281]  [<ffffffff8109d915>] lockdep_rcu_suspicious+0x109/0x112
 [ 1125.667212]  [<ffffffff8107c240>] rcu_preempt_sleep_check+0x45/0x47
 [ 1125.781838]  [<ffffffff8107c260>] __might_sleep+0x1e/0x19b
[...]
 [ 1127.445223]  [<ffffffff81757ac5>] call_netdevice_notifiers+0x4a/0x4f
[...]
 [ 1127.772188]  [<ffffffff8175e125>] dev_disable_lro+0x32/0x6b
 [ 1127.885174]  [<ffffffff81872d26>] dev_forward_change+0x30/0xcb
 [ 1128.013214]  [<ffffffff818738c4>] addrconf_forward_change+0x85/0xc5
[...]

addrconf_forward_change() uses RCU iteration over the netdev list,
which is unnecessary since it already holds the RTNL lock.  We also
cannot reasonably require netdevice notifier functions not to sleep.

Reported-by: Cong Wang <amwang@redhat.com>
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
hardkernel pushed a commit that referenced this pull request Aug 23, 2012
Commit 6f458df (tcp: improve latencies of timer triggered events)
added bug leading to following trace :

[ 2866.131281] IPv4: Attempt to release TCP socket in state 1 ffff880019ec0000
[ 2866.131726]
[ 2866.132188] =========================
[ 2866.132281] [ BUG: held lock freed! ]
[ 2866.132281] 3.6.0-rc1+ #622 Not tainted
[ 2866.132281] -------------------------
[ 2866.132281] kworker/0:1/652 is freeing memory ffff880019ec0000-ffff880019ec0a1f, with a lock still held there!
[ 2866.132281]  (sk_lock-AF_INET-RPC){+.+...}, at: [<ffffffff81903619>] tcp_sendmsg+0x29/0xcc6
[ 2866.132281] 4 locks held by kworker/0:1/652:
[ 2866.132281]  #0:  (rpciod){.+.+.+}, at: [<ffffffff81083567>] process_one_work+0x1de/0x47f
[ 2866.132281]  #1:  ((&task->u.tk_work)){+.+.+.}, at: [<ffffffff81083567>] process_one_work+0x1de/0x47f
[ 2866.132281]  #2:  (sk_lock-AF_INET-RPC){+.+...}, at: [<ffffffff81903619>] tcp_sendmsg+0x29/0xcc6
[ 2866.132281]  #3:  (&icsk->icsk_retransmit_timer){+.-...}, at: [<ffffffff81078017>] run_timer_softirq+0x1ad/0x35f
[ 2866.132281]
[ 2866.132281] stack backtrace:
[ 2866.132281] Pid: 652, comm: kworker/0:1 Not tainted 3.6.0-rc1+ #622
[ 2866.132281] Call Trace:
[ 2866.132281]  <IRQ>  [<ffffffff810bc527>] debug_check_no_locks_freed+0x112/0x159
[ 2866.132281]  [<ffffffff818a0839>] ? __sk_free+0xfd/0x114
[ 2866.132281]  [<ffffffff811549fa>] kmem_cache_free+0x6b/0x13a
[ 2866.132281]  [<ffffffff818a0839>] __sk_free+0xfd/0x114
[ 2866.132281]  [<ffffffff818a08c0>] sk_free+0x1c/0x1e
[ 2866.132281]  [<ffffffff81911e1c>] tcp_write_timer+0x51/0x56
[ 2866.132281]  [<ffffffff81078082>] run_timer_softirq+0x218/0x35f
[ 2866.132281]  [<ffffffff81078017>] ? run_timer_softirq+0x1ad/0x35f
[ 2866.132281]  [<ffffffff810f5831>] ? rb_commit+0x58/0x85
[ 2866.132281]  [<ffffffff81911dcb>] ? tcp_write_timer_handler+0x148/0x148
[ 2866.132281]  [<ffffffff81070bd6>] __do_softirq+0xcb/0x1f9
[ 2866.132281]  [<ffffffff81a0a00c>] ? _raw_spin_unlock+0x29/0x2e
[ 2866.132281]  [<ffffffff81a1227c>] call_softirq+0x1c/0x30
[ 2866.132281]  [<ffffffff81039f38>] do_softirq+0x4a/0xa6
[ 2866.132281]  [<ffffffff81070f2b>] irq_exit+0x51/0xad
[ 2866.132281]  [<ffffffff81a129cd>] do_IRQ+0x9d/0xb4
[ 2866.132281]  [<ffffffff81a0a3ef>] common_interrupt+0x6f/0x6f
[ 2866.132281]  <EOI>  [<ffffffff8109d006>] ? sched_clock_cpu+0x58/0xd1
[ 2866.132281]  [<ffffffff81a0a172>] ? _raw_spin_unlock_irqrestore+0x4c/0x56
[ 2866.132281]  [<ffffffff81078692>] mod_timer+0x178/0x1a9
[ 2866.132281]  [<ffffffff818a00aa>] sk_reset_timer+0x19/0x26
[ 2866.132281]  [<ffffffff8190b2cc>] tcp_rearm_rto+0x99/0xa4
[ 2866.132281]  [<ffffffff8190dfba>] tcp_event_new_data_sent+0x6e/0x70
[ 2866.132281]  [<ffffffff8190f7ea>] tcp_write_xmit+0x7de/0x8e4
[ 2866.132281]  [<ffffffff818a565d>] ? __alloc_skb+0xa0/0x1a1
[ 2866.132281]  [<ffffffff8190f952>] __tcp_push_pending_frames+0x2e/0x8a
[ 2866.132281]  [<ffffffff81904122>] tcp_sendmsg+0xb32/0xcc6
[ 2866.132281]  [<ffffffff819229c2>] inet_sendmsg+0xaa/0xd5
[ 2866.132281]  [<ffffffff81922918>] ? inet_autobind+0x5f/0x5f
[ 2866.132281]  [<ffffffff810ee7f1>] ? trace_clock_local+0x9/0xb
[ 2866.132281]  [<ffffffff8189adab>] sock_sendmsg+0xa3/0xc4
[ 2866.132281]  [<ffffffff810f5de6>] ? rb_reserve_next_event+0x26f/0x2d5
[ 2866.132281]  [<ffffffff8103e6a9>] ? native_sched_clock+0x29/0x6f
[ 2866.132281]  [<ffffffff8103e6f8>] ? sched_clock+0x9/0xd
[ 2866.132281]  [<ffffffff810ee7f1>] ? trace_clock_local+0x9/0xb
[ 2866.132281]  [<ffffffff8189ae03>] kernel_sendmsg+0x37/0x43
[ 2866.132281]  [<ffffffff8199ce49>] xs_send_kvec+0x77/0x80
[ 2866.132281]  [<ffffffff8199cec1>] xs_sendpages+0x6f/0x1a0
[ 2866.132281]  [<ffffffff8107826d>] ? try_to_del_timer_sync+0x55/0x61
[ 2866.132281]  [<ffffffff8199d0d2>] xs_tcp_send_request+0x55/0xf1
[ 2866.132281]  [<ffffffff8199bb90>] xprt_transmit+0x89/0x1db
[ 2866.132281]  [<ffffffff81999bcd>] ? call_connect+0x3c/0x3c
[ 2866.132281]  [<ffffffff81999d92>] call_transmit+0x1c5/0x20e
[ 2866.132281]  [<ffffffff819a0d55>] __rpc_execute+0x6f/0x225
[ 2866.132281]  [<ffffffff81999bcd>] ? call_connect+0x3c/0x3c
[ 2866.132281]  [<ffffffff819a0f33>] rpc_async_schedule+0x28/0x34
[ 2866.132281]  [<ffffffff810835d6>] process_one_work+0x24d/0x47f
[ 2866.132281]  [<ffffffff81083567>] ? process_one_work+0x1de/0x47f
[ 2866.132281]  [<ffffffff819a0f0b>] ? __rpc_execute+0x225/0x225
[ 2866.132281]  [<ffffffff81083a6d>] worker_thread+0x236/0x317
[ 2866.132281]  [<ffffffff81083837>] ? process_scheduled_works+0x2f/0x2f
[ 2866.132281]  [<ffffffff8108b7b8>] kthread+0x9a/0xa2
[ 2866.132281]  [<ffffffff81a12184>] kernel_thread_helper+0x4/0x10
[ 2866.132281]  [<ffffffff81a0a4b0>] ? retint_restore_args+0x13/0x13
[ 2866.132281]  [<ffffffff8108b71e>] ? __init_kthread_worker+0x5a/0x5a
[ 2866.132281]  [<ffffffff81a12180>] ? gs_change+0x13/0x13
[ 2866.308506] IPv4: Attempt to release TCP socket in state 1 ffff880019ec0000
[ 2866.309689] =============================================================================
[ 2866.310254] BUG TCP (Not tainted): Object already free
[ 2866.310254] -----------------------------------------------------------------------------
[ 2866.310254]

The bug comes from the fact that timer set in sk_reset_timer() can run
before we actually do the sock_hold(). socket refcount reaches zero and
we free the socket too soon.

timer handler is not allowed to reduce socket refcnt if socket is owned
by the user, or we need to change sk_reset_timer() implementation.

We should take a reference on the socket in case TCP_DELACK_TIMER_DEFERRED
or TCP_DELACK_TIMER_DEFERRED bit are set in tsq_flags

Also fix a typo in tcp_delack_timer(), where TCP_WRITE_TIMER_DEFERRED
was used instead of TCP_DELACK_TIMER_DEFERRED.

For consistency, use same socket refcount change for TCP_MTU_REDUCED_DEFERRED,
even if not fired from a timer.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Tested-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
hardkernel pushed a commit that referenced this pull request Aug 23, 2012
Each page mapped in a process's address space must be correctly
accounted for in _mapcount.  Normally the rules for this are
straightforward but hugetlbfs page table sharing is different.  The page
table pages at the PMD level are reference counted while the mapcount
remains the same.

If this accounting is wrong, it causes bugs like this one reported by
Larry Woodman:

  kernel BUG at mm/filemap.c:135!
  invalid opcode: 0000 [#1] SMP
  CPU 22
  Modules linked in: bridge stp llc sunrpc binfmt_misc dcdbas microcode pcspkr acpi_pad acpi]
  Pid: 18001, comm: mpitest Tainted: G        W    3.3.0+ #4 Dell Inc. PowerEdge R620/07NDJ2
  RIP: 0010:[<ffffffff8112cfed>]  [<ffffffff8112cfed>] __delete_from_page_cache+0x15d/0x170
  Process mpitest (pid: 18001, threadinfo ffff880428972000, task ffff880428b5cc20)
  Call Trace:
    delete_from_page_cache+0x40/0x80
    truncate_hugepages+0x115/0x1f0
    hugetlbfs_evict_inode+0x18/0x30
    evict+0x9f/0x1b0
    iput_final+0xe3/0x1e0
    iput+0x3e/0x50
    d_kill+0xf8/0x110
    dput+0xe2/0x1b0
    __fput+0x162/0x240

During fork(), copy_hugetlb_page_range() detects if huge_pte_alloc()
shared page tables with the check dst_pte == src_pte.  The logic is if
the PMD page is the same, they must be shared.  This assumes that the
sharing is between the parent and child.  However, if the sharing is
with a different process entirely then this check fails as in this
diagram:

  parent
    |
    ------------>pmd
                 src_pte----------> data page
                                        ^
  other--------->pmd--------------------|
                  ^
  child-----------|
                 dst_pte

For this situation to occur, it must be possible for Parent and Other to
have faulted and failed to share page tables with each other.  This is
possible due to the following style of race.

  PROC A                                          PROC B
  copy_hugetlb_page_range                         copy_hugetlb_page_range
    src_pte == huge_pte_offset                      src_pte == huge_pte_offset
    !src_pte so no sharing                          !src_pte so no sharing

  (time passes)

  hugetlb_fault                                   hugetlb_fault
    huge_pte_alloc                                  huge_pte_alloc
      huge_pmd_share                                 huge_pmd_share
        LOCK(i_mmap_mutex)
        find nothing, no sharing
        UNLOCK(i_mmap_mutex)
                                                      LOCK(i_mmap_mutex)
                                                      find nothing, no sharing
                                                      UNLOCK(i_mmap_mutex)
      pmd_alloc                                       pmd_alloc
      LOCK(instantiation_mutex)
      fault
      UNLOCK(instantiation_mutex)
                                                  LOCK(instantiation_mutex)
                                                  fault
                                                  UNLOCK(instantiation_mutex)

These two processes are not poing to the same data page but are not
sharing page tables because the opportunity was missed.  When either
process later forks, the src_pte == dst pte is potentially insufficient.
As the check falls through, the wrong PTE information is copied in
(harmless but wrong) and the mapcount is bumped for a page mapped by a
shared page table leading to the BUG_ON.

This patch addresses the issue by moving pmd_alloc into huge_pmd_share
which guarantees that the shared pud is populated in the same critical
section as pmd.  This also means that huge_pte_offset test in
huge_pmd_share is serialized correctly now which in turn means that the
success of the sharing will be higher as the racing tasks see the pud
and pmd populated together.

Race identified and changelog written mostly by Mel Gorman.

{akpm@linux-foundation.org: attempt to make the huge_pmd_share() comment comprehensible, clean up coding style]
Reported-by: Larry Woodman <lwoodman@redhat.com>
Tested-by: Larry Woodman <lwoodman@redhat.com>
Reviewed-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Ken Chen <kenchen@google.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
hardkernel pushed a commit that referenced this pull request Sep 5, 2012
This fixes the following crash when a LogFS file system, created on a
encrypted LVM volume, was mounted

[  526.548034] BUG: unable to handle kernel NULL pointer dereference at
[  526.550106] IP: [<ffffffff8131ecab>] memcpy+0xb/0x120
[  526.551008] PGD bd60067 PUD 1778d067 PMD 0
[  526.551783] Oops: 0000 [#1] SMP

<d>Pid: 2043, comm: mount
<d>RIP: 0010:[<ffffffff8131ecab>]  [<ffffffff8131ecab>] memcpy+0xb/0x120
Call Trace:
	kcryptd_io_read+0xdb/0x100
	crypt_map+0xfd/0x190
	__map_bio+0x48/0x150
	__split_and_process_bio+0x51b/0x630
	dm_request+0x138/0x230
	generic_make_request+0xca/0x100
	submit_bio+0x87/0x110
	sync_request+0xdd/0x120 [logfs]
	bdev_readpage+0x2e/0x70 [logfs]
	do_read_cache_page+0x82/0x180
	logfs_mount+0x2ad/0x770 [logfs]
	mount_fs+0x47/0x1c0
	vfs_kern_mount+0x72/0x110
	do_kern_mount+0x54/0x110
	do_mount+0x520/0x7f0
	sys_mount+0x90/0xe0

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=42292
Reported-by: Witold Baryluk <baryluk@smp.if.uj.edu.pl>
Signed-off-by: Prasad Joshi <prasadjoshi.linux@gmail.com>
hardkernel pushed a commit that referenced this pull request Sep 5, 2012
…tation

We're hitting bug while trying to reinsert an already existing
expectation:

kernel BUG at kernel/timer.c:895!
invalid opcode: 0000 [#1] SMP
[...]
Call Trace:
 <IRQ>
 [<ffffffffa0069563>] nf_ct_expect_related_report+0x4a0/0x57a [nf_conntrack]
 [<ffffffff812d423a>] ? in4_pton+0x72/0x131
 [<ffffffffa00ca69e>] ip_nat_sdp_media+0xeb/0x185 [nf_nat_sip]
 [<ffffffffa00b5b9b>] set_expected_rtp_rtcp+0x32d/0x39b [nf_conntrack_sip]
 [<ffffffffa00b5f15>] process_sdp+0x30c/0x3ec [nf_conntrack_sip]
 [<ffffffff8103f1eb>] ? irq_exit+0x9a/0x9c
 [<ffffffffa00ca738>] ? ip_nat_sdp_media+0x185/0x185 [nf_nat_sip]

We have to remove the RTP expectation if the RTCP expectation hits EBUSY
since we keep trying with other ports until we succeed.

Reported-by: Rafal Fitt <rafalf@aplusc.com.pl>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
hardkernel pushed a commit that referenced this pull request Sep 5, 2012
Following lockdep splat was reported by Pavel Roskin :

[ 1570.586223] ===============================
[ 1570.586225] [ INFO: suspicious RCU usage. ]
[ 1570.586228] 3.6.0-rc3-wl-main #98 Not tainted
[ 1570.586229] -------------------------------
[ 1570.586231] /home/proski/src/linux/net/ipv4/route.c:645 suspicious rcu_dereference_check() usage!
[ 1570.586233]
[ 1570.586233] other info that might help us debug this:
[ 1570.586233]
[ 1570.586236]
[ 1570.586236] rcu_scheduler_active = 1, debug_locks = 0
[ 1570.586238] 2 locks held by Chrome_IOThread/4467:
[ 1570.586240]  #0:  (slock-AF_INET){+.-...}, at: [<ffffffff814f2c0c>] release_sock+0x2c/0xa0
[ 1570.586253]  #1:  (fnhe_lock){+.-...}, at: [<ffffffff815302fc>] update_or_create_fnhe+0x2c/0x270
[ 1570.586260]
[ 1570.586260] stack backtrace:
[ 1570.586263] Pid: 4467, comm: Chrome_IOThread Not tainted 3.6.0-rc3-wl-main #98
[ 1570.586265] Call Trace:
[ 1570.586271]  [<ffffffff810976ed>] lockdep_rcu_suspicious+0xfd/0x130
[ 1570.586275]  [<ffffffff8153042c>] update_or_create_fnhe+0x15c/0x270
[ 1570.586278]  [<ffffffff815305b3>] __ip_rt_update_pmtu+0x73/0xb0
[ 1570.586282]  [<ffffffff81530619>] ip_rt_update_pmtu+0x29/0x90
[ 1570.586285]  [<ffffffff815411dc>] inet_csk_update_pmtu+0x2c/0x80
[ 1570.586290]  [<ffffffff81558d1e>] tcp_v4_mtu_reduced+0x2e/0xc0
[ 1570.586293]  [<ffffffff81553bc4>] tcp_release_cb+0xa4/0xb0
[ 1570.586296]  [<ffffffff814f2c35>] release_sock+0x55/0xa0
[ 1570.586300]  [<ffffffff815442ef>] tcp_sendmsg+0x4af/0xf50
[ 1570.586305]  [<ffffffff8156fc60>] inet_sendmsg+0x120/0x230
[ 1570.586308]  [<ffffffff8156fb40>] ? inet_sk_rebuild_header+0x40/0x40
[ 1570.586312]  [<ffffffff814f4bdd>] ? sock_update_classid+0xbd/0x3b0
[ 1570.586315]  [<ffffffff814f4c50>] ? sock_update_classid+0x130/0x3b0
[ 1570.586320]  [<ffffffff814ec435>] do_sock_write+0xc5/0xe0
[ 1570.586323]  [<ffffffff814ec4a3>] sock_aio_write+0x53/0x80
[ 1570.586328]  [<ffffffff8114bc83>] do_sync_write+0xa3/0xe0
[ 1570.586332]  [<ffffffff8114c5a5>] vfs_write+0x165/0x180
[ 1570.586335]  [<ffffffff8114c805>] sys_write+0x45/0x90
[ 1570.586340]  [<ffffffff815d2722>] system_call_fastpath+0x16/0x1b

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
hardkernel pushed a commit that referenced this pull request Sep 6, 2012
After commit 26b8852 ("mmc:
omap_hsmmc: remove private DMA API implementation"), the Nokia N800
here stopped booting:

[    2.086181] Waiting for root device /dev/mmcblk0p1...
[    2.324066] Unhandled fault: imprecise external abort (0x406) at 0x00000000
[    2.331451] Internal error: : 406 [#1] ARM
[    2.335784] Modules linked in:
[    2.339050] CPU: 0    Not tainted  (3.6.0-rc3 #60)
[    2.344146] PC is at default_idle+0x28/0x30
[    2.348602] LR is at trace_hardirqs_on_caller+0x15c/0x1b0

...

This turned out to be due to memory corruption caused by long-broken
PIO code in drivers/mmc/host/omap.c.  (Previously, this driver had
been using DMA; but the above commit caused the MMC driver to fall
back to PIO mode with an unmodified Kconfig.)

The PIO code, added with the rest of the driver in commit
730c9b7 ("[MMC] Add OMAP MMC host
driver"), confused bytes with 16-bit words.  This bug caused memory
located after the PIO transfer buffer to be corrupted with transfers
larger than 32 bytes.  The driver also did not increment the buffer
pointer after the transfer occurred.  This bug resulted in data
corruption during any transfer larger than 64 bytes.

Signed-off-by: Paul Walmsley <paul@pwsan.com>
Reviewed-by: Felipe Balbi <balbi@ti.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
hardkernel pushed a commit that referenced this pull request Sep 17, 2012
Murali Nalajala reports a regression that ioremapping address zero
results in an oops dump:

Unable to handle kernel paging request at virtual address fa200000
pgd = d4f80000
[fa200000] *pgd=00000000
Internal error: Oops: 5 [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 0    Tainted: G        W (3.4.0-g3b5f728-00009-g638207a #13)
PC is at msm_pm_config_rst_vector_before_pc+0x8/0x30
LR is at msm_pm_boot_config_before_pc+0x18/0x20
pc : [<c0078f84>]    lr : [<c007903c>]    psr: a0000093
sp : c0837ef0  ip : cfe00000  fp : 0000000d
r10: da7efc17  r9 : 225c4278  r8 : 00000006
r7 : 0003c000  r6 : c085c824  r5 : 00000001  r4 : fa101000
r3 : fa200000  r2 : c095080c  r1 : 002250fc  r0 : 00000000
Flags: NzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM Segment kernel
Control: 10c5387d  Table: 25180059  DAC: 00000015
[<c0078f84>] (msm_pm_config_rst_vector_before_pc+0x8/0x30) from [<c007903c>] (msm_pm_boot_config_before_pc+0x18/0x20)
[<c007903c>] (msm_pm_boot_config_before_pc+0x18/0x20) from [<c007a55c>] (msm_pm_power_collapse+0x410/0xb04)
[<c007a55c>] (msm_pm_power_collapse+0x410/0xb04) from [<c007b17c>] (arch_idle+0x294/0x3e0)
[<c007b17c>] (arch_idle+0x294/0x3e0) from [<c000eed8>] (default_idle+0x18/0x2c)
[<c000eed8>] (default_idle+0x18/0x2c) from [<c000f254>] (cpu_idle+0x90/0xe4)
[<c000f254>] (cpu_idle+0x90/0xe4) from [<c057231c>] (rest_init+0x88/0xa0)
[<c057231c>] (rest_init+0x88/0xa0) from [<c07ff890>] (start_kernel+0x3a8/0x40c)
Code: c0704256 e12fff1e e59f2020 e5923000 (e5930000)

This is caused by the 'reserved' entries which we insert (see
19b52ab - ARM: 7438/1: fill possible PMD empty section gaps)
which get matched for physical address zero.

Resolve this by marking these reserved entries with a different flag.

Cc: <stable@vger.kernel.org>
Tested-by: Murali Nalajala <mnalajal@codeaurora.org>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
hardkernel pushed a commit that referenced this pull request Sep 17, 2012
Fixes following lockdep splat :

[ 1614.734896] =============================================
[ 1614.734898] [ INFO: possible recursive locking detected ]
[ 1614.734901] 3.6.0-rc3+ #782 Not tainted
[ 1614.734903] ---------------------------------------------
[ 1614.734905] swapper/11/0 is trying to acquire lock:
[ 1614.734907]  (slock-AF_INET){+.-...}, at: [<ffffffffa0209d72>] l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
[ 1614.734920]
[ 1614.734920] but task is already holding lock:
[ 1614.734922]  (slock-AF_INET){+.-...}, at: [<ffffffff815fce23>] tcp_v4_err+0x163/0x6b0
[ 1614.734932]
[ 1614.734932] other info that might help us debug this:
[ 1614.734935]  Possible unsafe locking scenario:
[ 1614.734935]
[ 1614.734937]        CPU0
[ 1614.734938]        ----
[ 1614.734940]   lock(slock-AF_INET);
[ 1614.734943]   lock(slock-AF_INET);
[ 1614.734946]
[ 1614.734946]  *** DEADLOCK ***
[ 1614.734946]
[ 1614.734949]  May be due to missing lock nesting notation
[ 1614.734949]
[ 1614.734952] 7 locks held by swapper/11/0:
[ 1614.734954]  #0:  (rcu_read_lock){.+.+..}, at: [<ffffffff81592801>] __netif_receive_skb+0x251/0xd00
[ 1614.734964]  #1:  (rcu_read_lock){.+.+..}, at: [<ffffffff815d319c>] ip_local_deliver_finish+0x4c/0x4e0
[ 1614.734972]  #2:  (rcu_read_lock){.+.+..}, at: [<ffffffff8160d116>] icmp_socket_deliver+0x46/0x230
[ 1614.734982]  #3:  (slock-AF_INET){+.-...}, at: [<ffffffff815fce23>] tcp_v4_err+0x163/0x6b0
[ 1614.734989]  #4:  (rcu_read_lock){.+.+..}, at: [<ffffffff815da240>] ip_queue_xmit+0x0/0x680
[ 1614.734997]  #5:  (rcu_read_lock_bh){.+....}, at: [<ffffffff815d9925>] ip_finish_output+0x135/0x890
[ 1614.735004]  #6:  (rcu_read_lock_bh){.+....}, at: [<ffffffff81595680>] dev_queue_xmit+0x0/0xe00
[ 1614.735012]
[ 1614.735012] stack backtrace:
[ 1614.735016] Pid: 0, comm: swapper/11 Not tainted 3.6.0-rc3+ #782
[ 1614.735018] Call Trace:
[ 1614.735020]  <IRQ>  [<ffffffff810a50ac>] __lock_acquire+0x144c/0x1b10
[ 1614.735033]  [<ffffffff810a334b>] ? check_usage+0x9b/0x4d0
[ 1614.735037]  [<ffffffff810a6762>] ? mark_held_locks+0x82/0x130
[ 1614.735042]  [<ffffffff810a5df0>] lock_acquire+0x90/0x200
[ 1614.735047]  [<ffffffffa0209d72>] ? l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
[ 1614.735051]  [<ffffffff810a69ad>] ? trace_hardirqs_on+0xd/0x10
[ 1614.735060]  [<ffffffff81749b31>] _raw_spin_lock+0x41/0x50
[ 1614.735065]  [<ffffffffa0209d72>] ? l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
[ 1614.735069]  [<ffffffffa0209d72>] l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
[ 1614.735075]  [<ffffffffa014f7f2>] l2tp_eth_dev_xmit+0x32/0x60 [l2tp_eth]
[ 1614.735079]  [<ffffffff81595112>] dev_hard_start_xmit+0x502/0xa70
[ 1614.735083]  [<ffffffff81594c6e>] ? dev_hard_start_xmit+0x5e/0xa70
[ 1614.735087]  [<ffffffff815957c1>] ? dev_queue_xmit+0x141/0xe00
[ 1614.735093]  [<ffffffff815b622e>] sch_direct_xmit+0xfe/0x290
[ 1614.735098]  [<ffffffff81595865>] dev_queue_xmit+0x1e5/0xe00
[ 1614.735102]  [<ffffffff81595680>] ? dev_hard_start_xmit+0xa70/0xa70
[ 1614.735106]  [<ffffffff815b4daa>] ? eth_header+0x3a/0xf0
[ 1614.735111]  [<ffffffff8161d33e>] ? fib_get_table+0x2e/0x280
[ 1614.735117]  [<ffffffff8160a7e2>] arp_xmit+0x22/0x60
[ 1614.735121]  [<ffffffff8160a863>] arp_send+0x43/0x50
[ 1614.735125]  [<ffffffff8160b82f>] arp_solicit+0x18f/0x450
[ 1614.735132]  [<ffffffff8159d9da>] neigh_probe+0x4a/0x70
[ 1614.735137]  [<ffffffff815a191a>] __neigh_event_send+0xea/0x300
[ 1614.735141]  [<ffffffff815a1c93>] neigh_resolve_output+0x163/0x260
[ 1614.735146]  [<ffffffff815d9cf5>] ip_finish_output+0x505/0x890
[ 1614.735150]  [<ffffffff815d9925>] ? ip_finish_output+0x135/0x890
[ 1614.735154]  [<ffffffff815dae79>] ip_output+0x59/0xf0
[ 1614.735158]  [<ffffffff815da1cd>] ip_local_out+0x2d/0xa0
[ 1614.735162]  [<ffffffff815da403>] ip_queue_xmit+0x1c3/0x680
[ 1614.735165]  [<ffffffff815da240>] ? ip_local_out+0xa0/0xa0
[ 1614.735172]  [<ffffffff815f4402>] tcp_transmit_skb+0x402/0xa60
[ 1614.735177]  [<ffffffff815f5a11>] tcp_retransmit_skb+0x1a1/0x620
[ 1614.735181]  [<ffffffff815f7e93>] tcp_retransmit_timer+0x393/0x960
[ 1614.735185]  [<ffffffff815fce23>] ? tcp_v4_err+0x163/0x6b0
[ 1614.735189]  [<ffffffff815fd317>] tcp_v4_err+0x657/0x6b0
[ 1614.735194]  [<ffffffff8160d116>] ? icmp_socket_deliver+0x46/0x230
[ 1614.735199]  [<ffffffff8160d19e>] icmp_socket_deliver+0xce/0x230
[ 1614.735203]  [<ffffffff8160d116>] ? icmp_socket_deliver+0x46/0x230
[ 1614.735208]  [<ffffffff8160d464>] icmp_unreach+0xe4/0x2c0
[ 1614.735213]  [<ffffffff8160e520>] icmp_rcv+0x350/0x4a0
[ 1614.735217]  [<ffffffff815d3285>] ip_local_deliver_finish+0x135/0x4e0
[ 1614.735221]  [<ffffffff815d319c>] ? ip_local_deliver_finish+0x4c/0x4e0
[ 1614.735225]  [<ffffffff815d3ffa>] ip_local_deliver+0x4a/0x90
[ 1614.735229]  [<ffffffff815d37b7>] ip_rcv_finish+0x187/0x730
[ 1614.735233]  [<ffffffff815d425d>] ip_rcv+0x21d/0x300
[ 1614.735237]  [<ffffffff81592a1b>] __netif_receive_skb+0x46b/0xd00
[ 1614.735241]  [<ffffffff81592801>] ? __netif_receive_skb+0x251/0xd00
[ 1614.735245]  [<ffffffff81593368>] process_backlog+0xb8/0x180
[ 1614.735249]  [<ffffffff81593cf9>] net_rx_action+0x159/0x330
[ 1614.735257]  [<ffffffff810491f0>] __do_softirq+0xd0/0x3e0
[ 1614.735264]  [<ffffffff8109ed24>] ? tick_program_event+0x24/0x30
[ 1614.735270]  [<ffffffff8175419c>] call_softirq+0x1c/0x30
[ 1614.735278]  [<ffffffff8100425d>] do_softirq+0x8d/0xc0
[ 1614.735282]  [<ffffffff8104983e>] irq_exit+0xae/0xe0
[ 1614.735287]  [<ffffffff8175494e>] smp_apic_timer_interrupt+0x6e/0x99
[ 1614.735291]  [<ffffffff81753a1c>] apic_timer_interrupt+0x6c/0x80
[ 1614.735293]  <EOI>  [<ffffffff810a14ad>] ? trace_hardirqs_off+0xd/0x10
[ 1614.735306]  [<ffffffff81336f85>] ? intel_idle+0xf5/0x150
[ 1614.735310]  [<ffffffff81336f7e>] ? intel_idle+0xee/0x150
[ 1614.735317]  [<ffffffff814e6ea9>] cpuidle_enter+0x19/0x20
[ 1614.735321]  [<ffffffff814e7538>] cpuidle_idle_call+0xa8/0x630
[ 1614.735327]  [<ffffffff8100c1ba>] cpu_idle+0x8a/0xe0
[ 1614.735333]  [<ffffffff8173762e>] start_secondary+0x220/0x222

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
hardkernel pushed a commit that referenced this pull request Sep 17, 2012
In znet_probe(), strncmp() may access beyond 0x100000 and
trigger the below oops in kvm.  Fix it by limiting the loop
under 0x100000-8. I suspect the limit could be further decreased
to 0x100000-sizeof(struct netidblk), however no datasheet at hand..

[    3.744312] BUG: unable to handle kernel paging request at 80100000
[    3.746145] IP: [<8119d12a>] strncmp+0xc/0x20
[    3.747446] *pde = 01d10067 *pte = 00100160
[    3.747493] Oops: 0000 [#1] DEBUG_PAGEALLOC
[    3.747493] Pid: 1, comm: swapper Not tainted 3.6.0-rc1-00018-g57bfc0a #73 Bochs Bochs
[    3.747493] EIP: 0060:[<8119d12a>] EFLAGS: 00010206 CPU: 0
[    3.747493] EIP is at strncmp+0xc/0x20
[    3.747493] EAX: 800fff4e EBX: 00000006 ECX: 00000006 EDX: 814d2bb9
[    3.747493] ESI: 80100000 EDI: 814d2bba EBP: 8e03dfa0 ESP: 8e03df98
[    3.747493]  DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068
[    3.747493] CR0: 8005003b CR2: 80100000 CR3: 016f7000 CR4: 00000690
[    3.747493] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[    3.747493] DR6: ffff0ff0 DR7: 00000400
[    3.747493] Process swapper (pid: 1, ti=8e03c000 task=8e040000 task.ti=8e03c000)
[    3.747493] Stack:
[    3.747493]  800fffff 00000000 8e03dfb4 816a1376 00000006 816a134a 00000000 8e03dfd0
[    3.747493]  816819b 816ed1c0 8e03dfe4 00000006 00000123 816ed604 8e03dfe4 81681b29
[    3.747493]  00000000 81681a5b 00000000 00000000 8134e542 00000000 00000000 00000000
[    3.747493] Call Trace:
[    3.747493]  [<816a1376>] znet_probe+0x2c/0x26b
[    3.747493]  [<816a134a>] ? dnet_driver_init+0xf/0xf
[    3.747493]  [<816819b5>] do_one_initcall+0x6a/0x110
[    3.747493]  [<81681b29>] kernel_init+0xce/0x14b

Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
hardkernel pushed a commit that referenced this pull request Oct 6, 2012
48d480b [[MIPS] Malta: Fix off by one bug in interrupt
handler.] did not take in account that irq_ffs() will also return 0 if for some reason
the set of pending interrupts happens to be empty.

This is trivial to trigger with a RM5261 CPU module running a 64-bit kernel and results
in something like the following:

CPU 0 Unable to handle kernel paging request at virtual address 0000000000000000, epc == ffffffff801772d0, ra == ffffffff8017ad24
Oops[#1]:
Cpu 0
$ 0   : 0000000000000000 ffffffff9000a4e0 ffffffff9000a4e0 ffffffff9000a4e0
$ 4   : ffffffff80592be0 0000000000000000 00000000000000d6 ffffffff80322ed0
$ 8   : ffffffff805fe538 0000000000000000 ffffffff9000a4e0 ffffffff80590000
$12   : 00000000000000d6 0000000000000000 ffffffff80600000 ffffffff805fe538
$16   : 0000000000000000 0000000000000010 ffffffff80592be0 0000000000000010
$20   : 0000000000000000 0000000000500001 0000000000000000 ffffffff8051e078
$24   : 0000000000000028 ffffffff803226e8
$28   : 9800000003828000 980000000382b900 ffffffff8051e060 ffffffff8017ad24
Hi    : 0000000000000000
Lo    : 0000006388974000
epc   : ffffffff801772d0 handle_irq_event_percpu+0x70/0x2f0
    Not tainted
ra    : ffffffff8017ad24 handle_percpu_irq+0x54/0x88
Status: 9000a4e2    KX SX UX KERNEL EXL
Cause : 00808008
BadVA : 0000000000000000
PrId  : 000028a0 (Nevada)
Modules linked in:
Process init (pid: 1, threadinfo=9800000003828000, task=9800000003827968, tls=0000000077087490)
Stack : ffffffff80592be0 ffffffff8058d248 0000000000000040 0000000000000000
        ffffffff80613340 0000000000500001 ffffffff805a0000 0000000000000882
        9800000003b89000 ffffffff8017ad24 00000000000000d5 0000000000000010
        ffffffff9000a4e1 ffffffff801769f4 ffffffff9000a4e0 ffffffff801037f8
        0000000000000000 ffffffff80101c44 0000000000000000 ffffffff9000a4e0
        0000000000000000 9000000018000000 90000000180003f9 0000000000000001
        0000000000000000 00000000000000ff 0000000000000018 0000000000000001
        0000000000000001 00000000003fffff 0000000000000020 ffffffff802cf7ac
        ffffffff80208918 000000007fdadf08 ffffffff80612d88 ffffffff9000a4e1
        0000000000000040 0000000000000000 ffffffff80613340 0000000000500001
        ...
Call Trace:
[<ffffffff801772d0>] handle_irq_event_percpu+0x70/0x2f0
[<ffffffff8017ad24>] handle_percpu_irq+0x54/0x88
[<ffffffff801769f4>] generic_handle_irq+0x44/0x60
[<ffffffff801037f8>] do_IRQ+0x48/0x70
[<ffffffff80101c44>] ret_from_irq+0x0/0x4
[<ffffffff80326170>] serial8250_startup+0x310/0x870
[<ffffffff8032175c>] uart_startup.part.7+0x9c/0x330
[<ffffffff80321b4c>] uart_open+0x15c/0x1b0
[<ffffffff80302034>] tty_open+0x1fc/0x720
[<ffffffff801bffac>] chrdev_open+0x7c/0x180
[<ffffffff801b9ab8>] do_dentry_open.isra.14+0x288/0x390
[<ffffffff801bac5c>] nameidata_to_filp+0x5c/0xc0
[<ffffffff801ca700>] do_last.isra.33+0x330/0x8f0
[<ffffffff801caf3c>] path_openat+0xbc/0x440
[<ffffffff801cb3c8>] do_filp_open+0x38/0xa8
[<ffffffff801bade4>] do_sys_open+0x124/0x218
[<ffffffff80110538>] handle_sys+0x118/0x13c

Code: 02d5a825  12800012  02a0b02d <de820000> de850008  0040f809  0220202d  0040a82d  40026000
---[ end trace 5d8e7b9a86badd2d ]---
Kernel panic - not syncing: Fatal exception in interrupt

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
hardkernel pushed a commit that referenced this pull request Oct 6, 2012
busy_worker_rebind_fn() didn't clear WORKER_REBIND if rebinding failed
(CPU is down again).  This used to be okay because the flag wasn't
used for anything else.

However, after 25511a4 "workqueue: reimplement CPU online rebinding
to handle idle workers", WORKER_REBIND is also used to command idle
workers to rebind.  If not cleared, the worker may confuse the next
CPU_UP cycle by having REBIND spuriously set or oops / get stuck by
prematurely calling idle_worker_rebind().

  WARNING: at /work/os/wq/kernel/workqueue.c:1323 worker_thread+0x4cd/0x5
 00()
  Hardware name: Bochs
  Modules linked in: test_wq(O-)
  Pid: 33, comm: kworker/1:1 Tainted: G           O 3.6.0-rc1-work+ #3
  Call Trace:
   [<ffffffff8109039f>] warn_slowpath_common+0x7f/0xc0
   [<ffffffff810903fa>] warn_slowpath_null+0x1a/0x20
   [<ffffffff810b3f1d>] worker_thread+0x4cd/0x500
   [<ffffffff810bc16e>] kthread+0xbe/0xd0
   [<ffffffff81bd2664>] kernel_thread_helper+0x4/0x10
  ---[ end trace e977cf20f4661968 ]---
  BUG: unable to handle kernel NULL pointer dereference at           (null)
  IP: [<ffffffff810b3db0>] worker_thread+0x360/0x500
  PGD 0
  Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
  Modules linked in: test_wq(O-)
  CPU 0
  Pid: 33, comm: kworker/1:1 Tainted: G        W  O 3.6.0-rc1-work+ #3 Bochs Bochs
  RIP: 0010:[<ffffffff810b3db0>]  [<ffffffff810b3db0>] worker_thread+0x360/0x500
  RSP: 0018:ffff88001e1c9de0  EFLAGS: 00010086
  RAX: 0000000000000000 RBX: ffff88001e633e00 RCX: 0000000000004140
  RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000009
  RBP: ffff88001e1c9ea0 R08: 0000000000000000 R09: 0000000000000001
  R10: 0000000000000002 R11: 0000000000000000 R12: ffff88001fc8d580
  R13: ffff88001fc8d590 R14: ffff88001e633e20 R15: ffff88001e1c6900
  FS:  0000000000000000(0000) GS:ffff88001fc00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
  CR2: 0000000000000000 CR3: 00000000130e8000 CR4: 00000000000006f0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
  Process kworker/1:1 (pid: 33, threadinfo ffff88001e1c8000, task ffff88001e1c6900)
  Stack:
   ffff880000000000 ffff88001e1c9e40 0000000000000001 ffff88001e1c8010
   ffff88001e519c78 ffff88001e1c9e58 ffff88001e1c6900 ffff88001e1c6900
   ffff88001e1c6900 ffff88001e1c6900 ffff88001fc8d340 ffff88001fc8d340
  Call Trace:
   [<ffffffff810bc16e>] kthread+0xbe/0xd0
   [<ffffffff81bd2664>] kernel_thread_helper+0x4/0x10
  Code: b1 00 f6 43 48 02 0f 85 91 01 00 00 48 8b 43 38 48 89 df 48 8b 00 48 89 45 90 e8 ac f0 ff ff 3c 01 0f 85 60 01 00 00 48 8b 53 50 <8b> 02 83 e8 01 85 c0 89 02 0f 84 3b 01 00 00 48 8b 43 38 48 8b
  RIP  [<ffffffff810b3db0>] worker_thread+0x360/0x500
   RSP <ffff88001e1c9de0>
  CR2: 0000000000000000

There was no reason to keep WORKER_REBIND on failure in the first
place - WORKER_UNBOUND is guaranteed to be set in such cases
preventing incorrectly activating concurrency management.  Always
clear WORKER_REBIND.

tj: Updated comment and description.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
hardkernel pushed a commit that referenced this pull request Oct 6, 2012
Cancel work of the xfs_sync_worker before teardown of the log in
xfs_unmountfs.  This prevents occasional crashes on unmount like so:

PID: 21602  TASK: ee9df060  CPU: 0   COMMAND: "kworker/0:3"
 #0 [c5377d28] crash_kexec at c0292c94
 #1 [c5377d80] oops_end at c07090c2
 #2 [c5377d98] no_context at c06f614
 #3 [c5377dbc] __bad_area_nosemaphore at c06f6281
 #4 [c5377df4] bad_area_nosemaphore at c06f629b
 #5 [c5377e00] do_page_fault at c070b0cb
 #6 [c5377e7c] error_code (via page_fault) at c070892c
    EAX: f300c6a8  EBX: f300c6a8  ECX: 000000c0  EDX: 000000c0  EBP: c5377ed0
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 00000001  GS:  ffffad20
    CS:  0060      EIP: c0481ad0  ERR: ffffffff  EFLAGS: 00010246
 #7 [c5377eb0] atomic64_read_cx8 at c0481ad0
 #8 [c5377ebc] xlog_assign_tail_lsn_locked at f7cc7c6e [xfs]
 #9 [c5377ed4] xfs_trans_ail_delete_bulk at f7ccd520 [xfs]
#10 [c5377f0c] xfs_buf_iodone at f7ccb602 [xfs]
#11 [c5377f24] xfs_buf_do_callbacks at f7cca524 [xfs]
#12 [c5377f30] xfs_buf_iodone_callbacks at f7cca5da [xfs]
#13 [c5377f4c] xfs_buf_iodone_work at f7c718d0 [xfs]
#14 [c5377f58] process_one_work at c024ee4c
#15 [c5377f98] worker_thread at c024f43d
#16 [c5377fb] kthread at c025326b
#17 [c5377fe8] kernel_thread_helper at c070e834

PID: 26653  TASK: e79143b0  CPU: 3   COMMAND: "umount"
 #0 [cde0fda0] __schedule at c0706595
 #1 [cde0fe28] schedule at c0706b89
 #2 [cde0fe30] schedule_timeout at c0705600
 #3 [cde0fe94] __down_common at c0706098
 #4 [cde0fec8] __down at c0706122
 #5 [cde0fed0] down at c025936f
 #6 [cde0fee0] xfs_buf_lock at f7c7131d [xfs]
 #7 [cde0ff00] xfs_freesb at f7cc2236 [xfs]
 #8 [cde0ff10] xfs_fs_put_super at f7c80f21 [xfs]
 #9 [cde0ff1c] generic_shutdown_super at c0333d7a
#10 [cde0ff38] kill_block_super at c0333e0f
#11 [cde0ff48] deactivate_locked_super at c0334218
#12 [cde0ff58] deactivate_super at c033495d
#13 [cde0ff68] mntput_no_expire at c034bc13
#14 [cde0ff7c] sys_umount at c034cc69
#15 [cde0ffa] sys_oldumount at c034ccd4
#16 [cde0ffb0] system_call at c0707e66

commit 11159a0 added this to xfs_log_unmount and needs to be cleaned up
at a later date.

Signed-off-by: Ben Myers <bpm@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
hardkernel pushed a commit that referenced this pull request Oct 6, 2012
When call_crda() is called we kick off a witch hunt search
for the same regulatory domain on our internal regulatory
database and that work gets kicked off on a workqueue, this
is done while the cfg80211_mutex is held. If that workqueue
kicks off it will first lock reg_regdb_search_mutex and
later cfg80211_mutex but to ensure two CPUs will not contend
against cfg80211_mutex the right thing to do is to have the
reg_regdb_search() wait until the cfg80211_mutex is let go.

The lockdep report is pasted below.

cfg80211: Calling CRDA to update world regulatory domain

======================================================
[ INFO: possible circular locking dependency detected ]
3.3.8 #3 Tainted: G           O
-------------------------------------------------------
kworker/0:1/235 is trying to acquire lock:
 (cfg80211_mutex){+.+...}, at: [<816468a4>] set_regdom+0x78c/0x808 [cfg80211]

but task is already holding lock:
 (reg_regdb_search_mutex){+.+...}, at: [<81646828>] set_regdom+0x710/0x808 [cfg80211]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (reg_regdb_search_mutex){+.+...}:
       [<800a8384>] lock_acquire+0x60/0x88
       [<802950a8>] mutex_lock_nested+0x54/0x31c
       [<81645778>] is_world_regdom+0x9f8/0xc74 [cfg80211]

-> #1 (reg_mutex#2){+.+...}:
       [<800a8384>] lock_acquire+0x60/0x88
       [<802950a8>] mutex_lock_nested+0x54/0x31c
       [<8164539c>] is_world_regdom+0x61c/0xc74 [cfg80211]

-> #0 (cfg80211_mutex){+.+...}:
       [<800a77b8>] __lock_acquire+0x10d4/0x17bc
       [<800a8384>] lock_acquire+0x60/0x88
       [<802950a8>] mutex_lock_nested+0x54/0x31c
       [<816468a4>] set_regdom+0x78c/0x808 [cfg80211]

other info that might help us debug this:

Chain exists of:
  cfg80211_mutex --> reg_mutex#2 --> reg_regdb_search_mutex

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(reg_regdb_search_mutex);
                               lock(reg_mutex#2);
                               lock(reg_regdb_search_mutex);
  lock(cfg80211_mutex);

 *** DEADLOCK ***

3 locks held by kworker/0:1/235:
 #0:  (events){.+.+..}, at: [<80089a00>] process_one_work+0x230/0x460
 #1:  (reg_regdb_work){+.+...}, at: [<80089a00>] process_one_work+0x230/0x460
 #2:  (reg_regdb_search_mutex){+.+...}, at: [<81646828>] set_regdom+0x710/0x808 [cfg80211]

stack backtrace:
Call Trace:
[<80290fd4>] dump_stack+0x8/0x34
[<80291bc4>] print_circular_bug+0x2ac/0x2d8
[<800a77b8>] __lock_acquire+0x10d4/0x17bc
[<800a8384>] lock_acquire+0x60/0x88
[<802950a8>] mutex_lock_nested+0x54/0x31c
[<816468a4>] set_regdom+0x78c/0x808 [cfg80211]

Reported-by: Felix Fietkau <nbd@openwrt.org>
Tested-by: Felix Fietkau <nbd@openwrt.org>
Cc: stable@vger.kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@do-not-panic.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
hardkernel pushed a commit that referenced this pull request Oct 26, 2012
commit c77d716 upstream.

starting an old X server causes a kernel BUG since commit 1b50247:

------------[ cut here ]------------
kernel BUG at drivers/gpu/drm/i915/i915_gem.c:3661!
invalid opcode: 0000 [#1] SMP
Modules linked in: snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss uvcvideo
+videobuf2_core videodev videobuf2_vmalloc videobuf2_memops uhci_hcd ath9k mac80211 snd_hda_codec_realtek ath9k_common microcode
+ath9k_hw psmouse serio_raw sg ath cfg80211 atl1c lpc_ich mfd_core ehci_hcd snd_hda_intel snd_hda_codec snd_hwdep snd_pcm rtc_cmos
+snd_timer snd evdev eeepc_laptop snd_page_alloc sparse_keymap

Pid: 2866, comm: X Not tainted 3.5.6-rc1-eeepc #1 ASUSTeK Computer INC. 1005HA/1005HA
EIP: 0060:[<c12dc291>] EFLAGS: 00013297 CPU: 0
EIP is at i915_gem_entervt_ioctl+0xf1/0x110
EAX: f5941df4 EBX: f5940000 ECX: 00000000 EDX: 00020000
ESI: f5835400 EDI: 00000000 EBP: f51d7e38 ESP: f51d7e20
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
CR0: 8005003b CR2: b760e0a0 CR3: 351b6000 CR4: 000007d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
Process X (pid: 2866, ti=f51d6000 task=f61af8d0 task.ti=f51d6000)
Stack:
 00000001 00000000 f5835414 f51d7e84 f5835400 f54f85c0 f51d7f10 c12b530b
 00000001 c151b139 c14751b6 c152e030 00000b32 00006459 00000059 0000e200
 00000001 00000000 00006459 c159ddd0 c12dc1a0 ffffffea 00000000 00000000
Call Trace:
 [<c12b530b>] drm_ioctl+0x2eb/0x440
 [<c12dc1a0>] ? i915_gem_init+0xe0/0xe0
 [<c1052b2b>] ? enqueue_hrtimer+0x1b/0x50
 [<c1053321>] ? __hrtimer_start_range_ns+0x161/0x330
 [<c10530b3>] ? lock_hrtimer_base+0x23/0x50
 [<c1053163>] ? hrtimer_try_to_cancel+0x33/0x70
 [<c12b5020>] ? drm_version+0x90/0x90
 [<c10ca171>] vfs_ioctl+0x31/0x50
 [<c10ca2e4>] do_vfs_ioctl+0x64/0x510
 [<c10535de>] ? hrtimer_nanosleep+0x8e/0x100
 [<c1052c20>] ? update_rmtp+0x80/0x80
 [<c10ca7c9>] sys_ioctl+0x39/0x60
 [<c1433949>] syscall_call+0x7/0xb
Code: 83 c4 0c 5b 5e 5f 5d c3 c7 44 24 04 2c 05 53 c1 c7 04 24 6f ef 47 c1 e8 6e e0 fd ff c7 83 38 1e 00 00 00 00 00 00 e9 3f ff ff
+ff <0f> 0b eb fe 0f 0b eb fe 8d b4 26 00 00 00 00 0f 0b eb fe 8d b6
EIP: [<c12dc291>] i915_gem_entervt_ioctl+0xf1/0x110 SS:ESP 0068:f51d7e20
---[ end trace dd332ec083cbd513 ]---

The crash happens here in i915_gem_entervt_ioctl() :

    3659          BUG_ON(!list_empty(&dev_priv->mm.active_list));
    3660          BUG_ON(!list_empty(&dev_priv->mm.flushing_list));
 -> 3661          BUG_ON(!list_empty(&dev_priv->mm.inactive_list));
    3662          mutex_unlock(&dev->struct_mutex);

Quoting Chris :
  "That BUG_ON there is silly and can simply be removed. The check is to
   verify that no batches were submitted to the kernel whilst the UMS/GEM
   client was suspended - to which the BUG_ONs are a crude approximation.
   Furthermore, the checks are too late, since it means we attempted to
   program the hardware whilst it was in an invalid state, the BUG_ONs are
   the least of your concerns at that point."

Note that this regression has been introduced in

commit 1b50247
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Apr 24 15:47:30 2012 +0100

    drm/i915: Remove the list of pinned inactive objects

Signed-off-by: Willy Tarreau <w@1wt.eu>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Added note about the regressing commit and cc: stable.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
hardkernel pushed a commit that referenced this pull request Oct 26, 2012
commit abce9ac upstream.

tpm_write calls tpm_transmit without checking the return value and
assigns the return value unconditionally to chip->pending_data, even if
it's an error value.
This causes three bugs.

So if we write to /dev/tpm0 with a tpm_param_size bigger than
TPM_BUFSIZE=0x1000 (e.g. 0x100a)
and a bufsize also bigger than TPM_BUFSIZE (e.g. 0x100a)
tpm_transmit returns -E2BIG which is assigned to chip->pending_data as
-7, but tpm_write returns that TPM_BUFSIZE bytes have been successfully
been written to the TPM, altough this is not true (bug #1).

As we did write more than than TPM_BUFSIZE bytes but tpm_write reports
that only TPM_BUFSIZE bytes have been written the vfs tries to write
the remaining bytes (in this case 10 bytes) to the tpm device driver via
tpm_write which then blocks at

 /* cannot perform a write until the read has cleared
 either via tpm_read or a user_read_timer timeout */
 while (atomic_read(&chip->data_pending) != 0)
	 msleep(TPM_TIMEOUT);

for 60 seconds, since data_pending is -7 and nobody is able to
read it (since tpm_read luckily checks if data_pending is greater than
0) (#bug 2).

After that the remaining bytes are written to the TPM which are
interpreted by the tpm as a normal command. (bug #3)
So if the last bytes of the command stream happen to be a e.g.
tpm_force_clear this gets accidentally sent to the TPM.

This patch fixes all three bugs, by propagating the error code of
tpm_write and returning -E2BIG if the input buffer is too big,
since the response from the tpm for a truncated value is bogus anyway.
Moreover it returns -EBUSY to userspace if there is a response ready to be
read.

Signed-off-by: Peter Huewe <peter.huewe@infineon.com>
Signed-off-by: Kent Yoder <key@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
hardkernel pushed a commit that referenced this pull request Oct 30, 2012
commit 6ff1f3d upstream.

On AM3517, tx and rx interrupt are detected together with
the disconnect event. This generates a kernel panic in musb_interrupt,
because rx / tx are handled after disconnect.
This issue was seen on a Technexion's TAM3517 SOM. Unplugging a device,
tx / rx interrupts together with disconnect are detected. This brings
to kernel panic like this:

[   68.526153] Unable to handle kernel NULL pointer dereference at virtual address 00000011
[   68.534698] pgd = c0004000
[   68.537536] [00000011] *pgd=00000000
[   68.541351] Internal error: Oops: 17 [#1] ARM
[   68.545928] Modules linked in:
[   68.549163] CPU: 0    Not tainted  (3.6.0-rc5-00020-g9e05905 #178)
[   68.555694] PC is at rxstate+0x8/0xdc
[   68.559539] LR is at musb_interrupt+0x98/0x858
[   68.564239] pc : [<c035cd88>]    lr : [<c035af1c>]    psr: 40000193
[   68.564239] sp : ce83fb40  ip : d0906410  fp : 00000000
[   68.576293] r10: 00000000  r9 : cf3b0e40  r8 : 00000002
[   68.581817] r7 : 00000019  r6 : 00000001  r5 : 00000001  r4 : 000000d4
[   68.588684] r3 : 00000000  r2 : 00000000  r1 : ffffffcc  r0 : cf23c108
[   68.595550] Flags: nZcv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment ke

Note: this behavior is not seen with a USB hub, while it is
easy to reproduce connecting a USB-pen directly to the USB-A of
the board.

Drop tx / rx interrupts if disconnect is detected.

Signed-off-by: Stefano Babic <sbabic@denx.de>
CC: Felipe Balbi <balbi@ti.com>
Tested-by: Dmitry Lifshitz <lifshitz@compulab.co.il>
Tested-by: Igor Grinberg <grinberg@compulab.co.il>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
hardkernel pushed a commit that referenced this pull request Nov 1, 2012
On AM3517, tx and rx interrupt are detected together with
the disconnect event. This generates a kernel panic in musb_interrupt,
because rx / tx are handled after disconnect.
This issue was seen on a Technexion's TAM3517 SOM. Unplugging a device,
tx / rx interrupts together with disconnect are detected. This brings
to kernel panic like this:

[   68.526153] Unable to handle kernel NULL pointer dereference at virtual address 00000011
[   68.534698] pgd = c0004000
[   68.537536] [00000011] *pgd=00000000
[   68.541351] Internal error: Oops: 17 [#1] ARM
[   68.545928] Modules linked in:
[   68.549163] CPU: 0    Not tainted  (3.6.0-rc5-00020-g9e05905 #178)
[   68.555694] PC is at rxstate+0x8/0xdc
[   68.559539] LR is at musb_interrupt+0x98/0x858
[   68.564239] pc : [<c035cd88>]    lr : [<c035af1c>]    psr: 40000193
[   68.564239] sp : ce83fb40  ip : d0906410  fp : 00000000
[   68.576293] r10: 00000000  r9 : cf3b0e40  r8 : 00000002
[   68.581817] r7 : 00000019  r6 : 00000001  r5 : 00000001  r4 : 000000d4
[   68.588684] r3 : 00000000  r2 : 00000000  r1 : ffffffcc  r0 : cf23c108
[   68.595550] Flags: nZcv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment ke

Note: this behavior is not seen with a USB hub, while it is
easy to reproduce connecting a USB-pen directly to the USB-A of
the board.

Drop tx / rx interrupts if disconnect is detected.

Signed-off-by: Stefano Babic <sbabic@denx.de>
CC: Felipe Balbi <balbi@ti.com>
Cc: stable@vger.kernel.org # 3.5 3.6
Tested-by: Dmitry Lifshitz <lifshitz@compulab.co.il>
Tested-by: Igor Grinberg <grinberg@compulab.co.il>
Signed-off-by: Felipe Balbi <balbi@ti.com>
hardkernel pushed a commit that referenced this pull request Nov 1, 2012
An interesting effect of using the generic version of linkage.h
is that the padding is defined in terms of x86 NOPs, which can have
even more interesting effects when the assembly code looks like this:

ENTRY(func1)
	mov	x0, xzr
ENDPROC(func1)
	// fall through
ENTRY(func2)
	mov	x0, #1
	ret
ENDPROC(func2)

Admittedly, the code is not very nice. But having code from another
architecture doesn't look completely sane either.

The fix is to add arm64's version of linkage.h, which causes the insertion
of proper AArch64 NOPs.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
hardkernel pushed a commit that referenced this pull request Nov 1, 2012
KVM_PV_REASON_PAGE_NOT_PRESENT kicks cpu out of idleness, but we haven't
marked that spot as an exit from idleness.

Not doing so can cause RCU warnings such as:

[  732.788386] ===============================
[  732.789803] [ INFO: suspicious RCU usage. ]
[  732.790032] 3.7.0-rc1-next-20121019-sasha-00002-g6d8d02d-dirty #63 Tainted: G        W
[  732.790032] -------------------------------
[  732.790032] include/linux/rcupdate.h:738 rcu_read_lock() used illegally while idle!
[  732.790032]
[  732.790032] other info that might help us debug this:
[  732.790032]
[  732.790032]
[  732.790032] RCU used illegally from idle CPU!
[  732.790032] rcu_scheduler_active = 1, debug_locks = 1
[  732.790032] RCU used illegally from extended quiescent state!
[  732.790032] 2 locks held by trinity-child31/8252:
[  732.790032]  #0:  (&rq->lock){-.-.-.}, at: [<ffffffff83a67528>] __schedule+0x178/0x8f0
[  732.790032]  #1:  (rcu_read_lock){.+.+..}, at: [<ffffffff81152bde>] cpuacct_charge+0xe/0x200
[  732.790032]
[  732.790032] stack backtrace:
[  732.790032] Pid: 8252, comm: trinity-child31 Tainted: G        W    3.7.0-rc1-next-20121019-sasha-00002-g6d8d02d-dirty #63
[  732.790032] Call Trace:
[  732.790032]  [<ffffffff8118266b>] lockdep_rcu_suspicious+0x10b/0x120
[  732.790032]  [<ffffffff81152c60>] cpuacct_charge+0x90/0x200
[  732.790032]  [<ffffffff81152bde>] ? cpuacct_charge+0xe/0x200
[  732.790032]  [<ffffffff81158093>] update_curr+0x1a3/0x270
[  732.790032]  [<ffffffff81158a6a>] dequeue_entity+0x2a/0x210
[  732.790032]  [<ffffffff81158ea5>] dequeue_task_fair+0x45/0x130
[  732.790032]  [<ffffffff8114ae29>] dequeue_task+0x89/0xa0
[  732.790032]  [<ffffffff8114bb9e>] deactivate_task+0x1e/0x20
[  732.790032]  [<ffffffff83a67c29>] __schedule+0x879/0x8f0
[  732.790032]  [<ffffffff8117e20d>] ? trace_hardirqs_off+0xd/0x10
[  732.790032]  [<ffffffff810a37a5>] ? kvm_async_pf_task_wait+0x1d5/0x2b0
[  732.790032]  [<ffffffff83a67cf5>] schedule+0x55/0x60
[  732.790032]  [<ffffffff810a37c4>] kvm_async_pf_task_wait+0x1f4/0x2b0
[  732.790032]  [<ffffffff81139e50>] ? abort_exclusive_wait+0xb0/0xb0
[  732.790032]  [<ffffffff81139c25>] ? prepare_to_wait+0x25/0x90
[  732.790032]  [<ffffffff810a3a66>] do_async_page_fault+0x56/0xa0
[  732.790032]  [<ffffffff83a6a6e8>] async_page_fault+0x28/0x30

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Acked-by: Gleb Natapov <gleb@redhat.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
hardkernel pushed a commit that referenced this pull request Nov 1, 2012
blk_put_rl() does not call blkg_put() for q->root_rl because we
don't take request list reference on q->root_blkg.
However, if root_blkg is once attached then detached (freed),
blk_put_rl() is confused by the bogus pointer in q->root_blkg.

For example, with !CONFIG_BLK_DEV_THROTTLING &&
CONFIG_CFQ_GROUP_IOSCHED,
switching IO scheduler from cfq to deadline will cause system stall
after the following warning with 3.6:

> WARNING: at /work/build/linux/block/blk-cgroup.h:250
> blk_put_rl+0x4d/0x95()
> Modules linked in: bridge stp llc sunrpc acpi_cpufreq freq_table mperf
> ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4
> Pid: 0, comm: swapper/0 Not tainted 3.6.0 #1
> Call Trace:
>  <IRQ>  [<ffffffff810453bd>] warn_slowpath_common+0x85/0x9d
>  [<ffffffff810453ef>] warn_slowpath_null+0x1a/0x1c
>  [<ffffffff811d5f8d>] blk_put_rl+0x4d/0x95
>  [<ffffffff811d614a>] __blk_put_request+0xc3/0xcb
>  [<ffffffff811d71a3>] blk_finish_request+0x232/0x23f
>  [<ffffffff811d76c3>] ? blk_end_bidi_request+0x34/0x5d
>  [<ffffffff811d76d1>] blk_end_bidi_request+0x42/0x5d
>  [<ffffffff811d7728>] blk_end_request+0x10/0x12
>  [<ffffffff812cdf16>] scsi_io_completion+0x207/0x4d5
>  [<ffffffff812c6fcf>] scsi_finish_command+0xfa/0x103
>  [<ffffffff812ce2f8>] scsi_softirq_done+0xff/0x108
>  [<ffffffff811dcea5>] blk_done_softirq+0x8d/0xa1
>  [<ffffffff810915d5>] ?
>  generic_smp_call_function_single_interrupt+0x9f/0xd7
>  [<ffffffff8104cf5b>] __do_softirq+0x102/0x213
>  [<ffffffff8108a5ec>] ? lock_release_holdtime+0xb6/0xbb
>  [<ffffffff8104d2b4>] ? raise_softirq_irqoff+0x9/0x3d
>  [<ffffffff81424dfc>] call_softirq+0x1c/0x30
>  [<ffffffff81011beb>] do_softirq+0x4b/0xa3
>  [<ffffffff8104cdb0>] irq_exit+0x53/0xd5
>  [<ffffffff8102d865>] smp_call_function_single_interrupt+0x34/0x36
>  [<ffffffff8142486f>] call_function_single_interrupt+0x6f/0x80
>  <EOI>  [<ffffffff8101800b>] ? mwait_idle+0x94/0xcd
>  [<ffffffff81018002>] ? mwait_idle+0x8b/0xcd
>  [<ffffffff81017811>] cpu_idle+0xbb/0x114
>  [<ffffffff81401fbd>] rest_init+0xc1/0xc8
>  [<ffffffff81401efc>] ? csum_partial_copy_generic+0x16c/0x16c
>  [<ffffffff81cdbd3d>] start_kernel+0x3d4/0x3e1
>  [<ffffffff81cdb79e>] ? kernel_init+0x1f7/0x1f7
>  [<ffffffff81cdb2dd>] x86_64_start_reservations+0xb8/0xbd
>  [<ffffffff81cdb3e3>] x86_64_start_kernel+0x101/0x110

This patch clears q->root_blkg and q->root_rl.blkg when root blkg
is destroyed.

Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
hardkernel pushed a commit that referenced this pull request Nov 1, 2012
Calling __pa() with an ioremap'd address is invalid. If we
encounter an efi_memory_desc_t without EFI_MEMORY_WB set in
->attribute we currently call set_memory_uc(), which in turn
calls __pa() on a potentially ioremap'd address.

On CONFIG_X86_32 this results in the following oops:

  BUG: unable to handle kernel paging request at f7f22280
  IP: [<c10257b9>] reserve_ram_pages_type+0x89/0x210
  *pdpt = 0000000001978001 *pde = 0000000001ffb067 *pte = 0000000000000000
  Oops: 0000 [#1] PREEMPT SMP
  Modules linked in:

  Pid: 0, comm: swapper Not tainted 3.0.0-acpi-efi-0805 #3
   EIP: 0060:[<c10257b9>] EFLAGS: 00010202 CPU: 0
   EIP is at reserve_ram_pages_type+0x89/0x210
   EAX: 0070e280 EBX: 38714000 ECX: f7814000 EDX: 00000000
   ESI: 00000000 EDI: 38715000 EBP: c189fef0 ESP: c189fea8
   DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
  Process swapper (pid: 0, ti=c189e000 task=c18bbe60 task.ti=c189e000)
  Stack:
   80000200 ff108000 00000000 c189ff00 00038714 00000000 00000000 c189fed0
   c104f8ca 00038714 00000000 00038715 00000000 00000000 00038715 00000000
   00000010 38715000 c189ff4 c1025aff 38715000 00000000 00000010 00000000
  Call Trace:
   [<c104f8ca>] ? page_is_ram+0x1a/0x40
   [<c1025aff>] reserve_memtype+0xdf/0x2f0
   [<c1024dc9>] set_memory_uc+0x49/0xa0
   [<c19334d0>] efi_enter_virtual_mode+0x1c2/0x3aa
   [<c19216d4>] start_kernel+0x291/0x2f2
   [<c19211c7>] ? loglevel+0x1b/0x1b
   [<c19210bf>] i386_start_kernel+0xbf/0xc8

The only time we can call set_memory_uc() for a memory region is
when it is part of the direct kernel mapping. For the case where
we ioremap a memory region we must leave it alone.

This patch reimplements the fix from e8c7106 ("x86, efi:
Calling __pa() with an ioremap()ed address is invalid") which
was reverted in e1ad783 because it caused a regression on
some MacBooks (they hung at boot). The regression was caused
because the commit only marked EFI_RUNTIME_SERVICES_DATA as
E820_RESERVED_EFI, when it should have marked all regions that
have the EFI_MEMORY_RUNTIME attribute.

Despite first impressions, it's not possible to use
ioremap_cache() to map all cached memory regions on
CONFIG_X86_64 because of the way that the memory map might be
configured as detailed in the following bug report,

	https://bugzilla.redhat.com/show_bug.cgi?id=748516

e.g. some of the EFI memory regions *need* to be mapped as part
of the direct kernel mapping.

Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Cc: Matthew Garrett <mjg@redhat.com>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Huang Ying <huang.ying.caritas@gmail.com>
Cc: Keith Packard <keithp@keithp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1350649546-23541-1-git-send-email-matt@console-pimps.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
hardkernel pushed a commit that referenced this pull request Nov 1, 2012
Commit ad67607 ("device_cgroup: convert device_cgroup internally to
policy + exceptions") removed rcu locks which are needed in
task_devcgroup called in this chain:

  devcgroup_inode_mknod OR __devcgroup_inode_permission ->
    __devcgroup_inode_permission ->
      task_devcgroup ->
        task_subsys_state ->
          task_subsys_state_check.

Change the code so that task_devcgroup is safely called with rcu read
lock held.

  ===============================
  [ INFO: suspicious RCU usage. ]
  3.6.0-rc5-next-20120913+ #42 Not tainted
  -------------------------------
  include/linux/cgroup.h:553 suspicious rcu_dereference_check() usage!

  other info that might help us debug this:

  rcu_scheduler_active = 1, debug_locks = 0
  2 locks held by kdevtmpfs/23:
   #0:  (sb_writers){.+.+.+}, at: [<ffffffff8116873f>]
  mnt_want_write+0x1f/0x50
   #1:  (&sb->s_type->i_mutex_key#3/1){+.+.+.}, at: [<ffffffff811558af>]
  kern_path_create+0x7f/0x170

  stack backtrace:
  Pid: 23, comm: kdevtmpfs Not tainted 3.6.0-rc5-next-20120913+ #42
  Call Trace:
    lockdep_rcu_suspicious+0xfd/0x130
    devcgroup_inode_mknod+0x19d/0x240
    vfs_mknod+0x71/0xf0
    handle_create.isra.2+0x72/0x200
    devtmpfsd+0x114/0x140
    ? handle_create.isra.2+0x200/0x200
    kthread+0xd6/0xe0
    kernel_thread_helper+0x4/0x10

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Dave Jones <davej@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: James Morris <jmorris@namei.org>
Cc: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
hardkernel pushed a commit that referenced this pull request Nov 6, 2012
commit 65635cb upstream.

blk_put_rl() does not call blkg_put() for q->root_rl because we
don't take request list reference on q->root_blkg.
However, if root_blkg is once attached then detached (freed),
blk_put_rl() is confused by the bogus pointer in q->root_blkg.

For example, with !CONFIG_BLK_DEV_THROTTLING &&
CONFIG_CFQ_GROUP_IOSCHED,
switching IO scheduler from cfq to deadline will cause system stall
after the following warning with 3.6:

> WARNING: at /work/build/linux/block/blk-cgroup.h:250
> blk_put_rl+0x4d/0x95()
> Modules linked in: bridge stp llc sunrpc acpi_cpufreq freq_table mperf
> ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4
> Pid: 0, comm: swapper/0 Not tainted 3.6.0 #1
> Call Trace:
>  <IRQ>  [<ffffffff810453bd>] warn_slowpath_common+0x85/0x9d
>  [<ffffffff810453ef>] warn_slowpath_null+0x1a/0x1c
>  [<ffffffff811d5f8d>] blk_put_rl+0x4d/0x95
>  [<ffffffff811d614a>] __blk_put_request+0xc3/0xcb
>  [<ffffffff811d71a3>] blk_finish_request+0x232/0x23f
>  [<ffffffff811d76c3>] ? blk_end_bidi_request+0x34/0x5d
>  [<ffffffff811d76d1>] blk_end_bidi_request+0x42/0x5d
>  [<ffffffff811d7728>] blk_end_request+0x10/0x12
>  [<ffffffff812cdf16>] scsi_io_completion+0x207/0x4d5
>  [<ffffffff812c6fcf>] scsi_finish_command+0xfa/0x103
>  [<ffffffff812ce2f8>] scsi_softirq_done+0xff/0x108
>  [<ffffffff811dcea5>] blk_done_softirq+0x8d/0xa1
>  [<ffffffff810915d5>] ?
>  generic_smp_call_function_single_interrupt+0x9f/0xd7
>  [<ffffffff8104cf5b>] __do_softirq+0x102/0x213
>  [<ffffffff8108a5ec>] ? lock_release_holdtime+0xb6/0xbb
>  [<ffffffff8104d2b4>] ? raise_softirq_irqoff+0x9/0x3d
>  [<ffffffff81424dfc>] call_softirq+0x1c/0x30
>  [<ffffffff81011beb>] do_softirq+0x4b/0xa3
>  [<ffffffff8104cdb0>] irq_exit+0x53/0xd5
>  [<ffffffff8102d865>] smp_call_function_single_interrupt+0x34/0x36
>  [<ffffffff8142486f>] call_function_single_interrupt+0x6f/0x80
>  <EOI>  [<ffffffff8101800b>] ? mwait_idle+0x94/0xcd
>  [<ffffffff81018002>] ? mwait_idle+0x8b/0xcd
>  [<ffffffff81017811>] cpu_idle+0xbb/0x114
>  [<ffffffff81401fbd>] rest_init+0xc1/0xc8
>  [<ffffffff81401efc>] ? csum_partial_copy_generic+0x16c/0x16c
>  [<ffffffff81cdbd3d>] start_kernel+0x3d4/0x3e1
>  [<ffffffff81cdb79e>] ? kernel_init+0x1f7/0x1f7
>  [<ffffffff81cdb2dd>] x86_64_start_reservations+0xb8/0xbd
>  [<ffffffff81cdb3e3>] x86_64_start_kernel+0x101/0x110

This patch clears q->root_blkg and q->root_rl.blkg when root blkg
is destroyed.

Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
hardkernel pushed a commit that referenced this pull request Nov 14, 2012
…CONFIG_VFPv3 set

After commit 846a136 ("ARM: vfp: fix
saving d16-d31 vfp registers on v6+ kernels"), the OMAP 2430SDP board
started crashing during boot with omap2plus_defconfig:

[    3.875122] mmcblk0: mmc0:e624 SD04G 3.69 GiB
[    3.915954]  mmcblk0: p1
[    4.086639] Internal error: Oops - undefined instruction: 0 [#1] SMP ARM
[    4.093719] Modules linked in:
[    4.096954] CPU: 0    Not tainted  (3.6.0-02232-g759e00b #570)
[    4.103149] PC is at vfp_reload_hw+0x1c/0x44
[    4.107666] LR is at __und_usr_fault_32+0x0/0x8

It turns out that the context save/restore fix unmasked a latent bug
in commit 5aaf254 ("ARM: 6203/1: Make
VFPv3 usable on ARMv6").  When CONFIG_VFPv3 is set, but the kernel is
booted on a pre-VFPv3 core, the code attempts to save and restore the
d16-d31 VFP registers.  These are only present on non-D16 VFPv3+, so
this results in an undefined instruction exception.  The code didn't
crash before commit 846a136 because the save and restore code was
only touching d0-d15, present on all VFP.

Fix by implementing a request from Russell King to add a new HWCAP
flag that affirmatively indicates the presence of the d16-d31
registers:

   http://marc.info/?l=linux-arm-kernel&m=135013547905283&w=2

and some feedback from Måns to clarify the name of the HWCAP flag.

Signed-off-by: Paul Walmsley <paul@pwsan.com>
Cc: Tony Lindgren <tony@atomide.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Martin <dave.martin@linaro.org>
Cc: Måns Rullgård <mans.rullgard@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
hardkernel pushed a commit that referenced this pull request Nov 14, 2012
After commit b3356bf (KVM: emulator: optimize "rep ins" handling),
the pieces of io data can be collected and write them to the guest memory
or MMIO together

Unfortunately, kvm splits the mmio access into 8 bytes and store them to
vcpu->mmio_fragments. If the guest uses "rep ins" to move large data, it
will cause vcpu->mmio_fragments overflow

The bug can be exposed by isapc (-M isapc):

[23154.818733] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
[ ......]
[23154.858083] Call Trace:
[23154.859874]  [<ffffffffa04f0e17>] kvm_get_cr8+0x1d/0x28 [kvm]
[23154.861677]  [<ffffffffa04fa6d4>] kvm_arch_vcpu_ioctl_run+0xcda/0xe45 [kvm]
[23154.863604]  [<ffffffffa04f5a1a>] ? kvm_arch_vcpu_load+0x17b/0x180 [kvm]

Actually, we can use one mmio_fragment to store a large mmio access then
split it when we pass the mmio-exit-info to userspace. After that, we only
need two entries to store mmio info for the cross-mmio pages access

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
hardkernel pushed a commit that referenced this pull request Nov 14, 2012
exit_idle() should be called after irq_enter(), otherwise it throws:

[ INFO: suspicious RCU usage. ]
3.6.5 #1 Not tainted
-------------------------------
include/linux/rcupdate.h:725 rcu_read_lock() used illegally while idle!

other info that might help us debug this:

RCU used illegally from idle CPU!
rcu_scheduler_active = 1, debug_locks = 1
RCU used illegally from extended quiescent state!
1 lock held by swapper/0/0:
 #0:  (rcu_read_lock){......}, at: [<ffffffff810e9fe0>] __atomic_notifier_call_chain+0x0/0x140

stack backtrace:
Pid: 0, comm: swapper/0 Not tainted 3.6.5 #1
Call Trace:
 <IRQ>  [<ffffffff811259a2>] lockdep_rcu_suspicious+0xe2/0x130
 [<ffffffff810ea10c>] __atomic_notifier_call_chain+0x12c/0x140
 [<ffffffff810e9fe0>] ? atomic_notifier_chain_unregister+0x90/0x90
 [<ffffffff811216cd>] ? trace_hardirqs_off+0xd/0x10
 [<ffffffff810ea136>] atomic_notifier_call_chain+0x16/0x20
 [<ffffffff810777c3>] exit_idle+0x43/0x50
 [<ffffffff81568865>] xen_evtchn_do_upcall+0x25/0x50
 [<ffffffff81aa690e>] xen_do_hypervisor_callback+0x1e/0x30
 <EOI>  [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1000
 [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1000
 [<ffffffff81061540>] ? xen_safe_halt+0x10/0x20
 [<ffffffff81075cfa>] ? default_idle+0xba/0x570
 [<ffffffff810778af>] ? cpu_idle+0xdf/0x140
 [<ffffffff81a4d881>] ? rest_init+0x135/0x144
 [<ffffffff81a4d74c>] ? csum_partial_copy_generic+0x16c/0x16c
 [<ffffffff82520c45>] ? start_kernel+0x3db/0x3e8
 [<ffffffff8252066a>] ? repair_env_string+0x5a/0x5a
 [<ffffffff82520356>] ? x86_64_start_reservations+0x131/0x135
 [<ffffffff82524aca>] ? xen_start_kernel+0x465/0x46

Git commit 98ad1cc
Author: Frederic Weisbecker <fweisbec@gmail.com>
Date:   Fri Oct 7 18:22:09 2011 +0200

    x86: Call idle notifier after irq_enter()

did this, but it missed the Xen code.

Signed-off-by: Mojiong Qiu <mjqiu@tencent.com>
Cc: stable@vger.kernel.org # from 3.3 and newer.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
hardkernel pushed a commit that referenced this pull request Nov 17, 2012
A number of Huawei 3G and LTE modems implement a CDC NCM function,
including the necessary functional descriptors, but using a non
standard interface layout and class/subclass/protocol codes.

These devices can be handled by this driver with only a minor
change to the probing logic, allowing a single combined control
and data interface.  This works because the devices
- include a CDC Union descriptor labelling the combined
  interface as both master and slave, and
- have an alternate setting #1 for the bulk endpoints on the
  combined interface.

The 3G/LTE network connection is managed by vendor specific AT
commands on a serial function in the same composite device.
Handling the managment function is out of the scope of this
driver.  It will be handled by an appropriate USB serial
driver.

Reported-and-Tested-by: Olof Ermis <olof.ermis@gmail.com>
Reported-and-Tested-by: Tommy Cheng <tommy7765@yahoo.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
hardkernel pushed a commit that referenced this pull request Nov 17, 2012
The 3215 console always has the RAW3215_FIXED flag set, which causes
raw3215_shutdown() not to wait for outstanding I/O requests if an attached
tty gets closed.
The flag however can be simply removed, so we can guarantee that all requests
belonging to the tty have been processed when the tty is closed.

However the tasklet that belongs to the 3215 device may be scheduled even if
there is no tty attached anymore, since we have a race between console and tty
processing.
Thefore unconditional tty_wakekup() in raw3215_wakeup() can cause the following
NULL pointer dereference:

3.465368 Unable to handle kernel pointer dereference at virtual kernel address (null)
3.465448 Oops: 0004 #1 SMP
3.465454 Modules linked in:
3.465459 CPU: 1 Not tainted 3.6.0 #1
3.465462 Process swapper/1 (pid: 0, task: 000000003ffa4428, ksp: 000000003ffb7ce0)
3.465466 Krnl PSW : 0404100180000000 0000000000162f86 (__wake_up+0x46/0xb8)
3.465480            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:0 CC:1 PM:0 EA:3
         Krnl GPRS: fffffffffffffffe 0000000000000000 0000000000000160 0000000000000001
3.465492            0000000000000001 0000000000000004 0000000000000004 000000000096b490
3.465499            0000000000000001 0000000000000100 0000000000000001 0000000000000001
3.465506            070000003fc87d60 0000000000000160 000000003fc87d68 000000003fc87d00
3.465526 Krnl Code: 0000000000162f76: e3c0f0a80004      lg      %r12,168(%r15)
                    0000000000162f7c: 58000370          l       %r0,880
                   #0000000000162f80: c007ffffffff00    xilf    %r0,4294967295
                   >0000000000162f86: ba102000          cs      %r1,%r0,0(%r2)
                    0000000000162f8a: 1211              ltr     %r1,%r1
                    0000000000162f8c: a774002f          brc     7,162fea
                    0000000000162f90: b904002d          lgr     %r2,%r13
                    0000000000162f94: b904003a          lgr     %r3,%r10
3.465597 Call Trace:
3.465599 (<0400000000000000> 0x400000000000000)
3.465602  <000000000048c77e> raw3215_wakeup+0x2e/0x40
3.465607  <0000000000134d66> tasklet_action+0x96/0x168
3.465612  <000000000013423c> __do_softirq+0xd8/0x21c
3.465615  <0000000000134678> irq_exit+0xa8/0xac
3.465617  <000000000046c232> do_IRQ+0x182/0x248
3.465621  <00000000005c8296> io_return+0x0/0x8
3.465625  <00000000005c7cac> vtime_stop_cpu+0x4c/0xb8
3.465629 (<0000000000194e06> tick_nohz_idle_enter+0x4e/0x74)
3.465633  <0000000000104760> cpu_idle+0x170/0x184
3.465636  <00000000005b5182> smp_start_secondary+0xd6/0xe0
3.465641  <00000000005c86be> restart_int_handler+0x56/0x6c
3.465643  <0000000000000000> 0x0
3.465645 Last Breaking-Event-Address:
3.465647  <0000000000403136> tty_wakeup+0x46/0x98
3.465652
3.465654 Kernel panic - not syncing: Fatal exception in interrupt
01: HCPGIR450W CP entered; disabled wait PSW 00020001 80000000 00000000 0010F63C

The easiest solution is simply to check if tty is NULL in the tasklet.
If it is NULL nothing is to do (no tty attached), otherwise tty_wakeup()
can be called, since we hold a reference to the tty.
This is not nice... but it is a small patch and it works.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
hardkernel pushed a commit that referenced this pull request Nov 17, 2012
Iterating over the vma->anon_vma_chain without anon_vma_lock may cause
NULL ptr deref in anon_vma_interval_tree_verify(), because the node in the
chain might have been removed.

  BUG: unable to handle kernel paging request at fffffffffffffff0
  IP: [<ffffffff8122c29c>] anon_vma_interval_tree_verify+0xc/0xa0
  PGD 4e28067 PUD 4e29067 PMD 0
  Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
  CPU 0
  Pid: 9050, comm: trinity-child64 Tainted: G        W    3.7.0-rc2-next-20121025-sasha-00001-g673f98e-dirty #77
  RIP: 0010: anon_vma_interval_tree_verify+0xc/0xa0
  Process trinity-child64 (pid: 9050, threadinfo ffff880045f80000, task ffff880048eb0000)
  Call Trace:
    validate_mm+0x58/0x1e0
    vma_adjust+0x635/0x6b0
    __split_vma.isra.22+0x161/0x220
    split_vma+0x24/0x30
    sys_madvise+0x5da/0x7b0
    tracesys+0xe1/0xe6
  RIP  anon_vma_interval_tree_verify+0xc/0xa0
  CR2: fffffffffffffff0

Figured out by Bob Liu.

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Bob Liu <lliubbo@gmail.com>
Signed-off-by: Michel Lespinasse <walken@google.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
hardkernel pushed a commit that referenced this pull request Nov 21, 2012
…el/git/jgarzik/libata-dev

Pull libata fixes from Jeff Garzik:
 "If you were going to shoot me for not sending these earlier, you would
  be right.  -rc6 beat me by ~2 hours it seems, and they really should
  have gone out long before that.

  These have been in libata-dev.git for a day or so (unfortunately
  linux-next is on vacation).  The main one is #1, with the others being
  minor bits.  #1 has multiple tested-by, and can be considered a
  regression fix IMO.

   1) Fix ACPI oops:

        https://bugzilla.kernel.org/show_bug.cgi?id=48211

   2) Temporary WARN_ONCE() debugging patch for further ACPI debugging.

      The code already oopses here, and so this merely gives slightly
      better info.  Related to

        https://bugzilla.kernel.org/show_bug.cgi?id=49151

      which has been bisected down to a patch that _exposes_ a latest
      bug, but said bisection target does not actually appear to be the
      root cause itself.

   3) sata_svw: fix longstanding error recovery bug, which was
      preventing kdump, by adding missing DMA-start bit check.  Core
      code was already checking DMA-start, but ancillary, less-used
      routines were not.  Fixed.

   4) sata_highbank: fix minor __init/__devinit warning

   5) Fix minor warning, if CONFIG_PM is set, but CONFIG_PM_SLEEP is not
      set

   6) pata_arasan: proper functioning requires clock setting"

* tag 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  [libata] PM callbacks should be conditionally compiled on CONFIG_PM_SLEEP
  sata_svw: check DMA start bit before reset
  libata debugging: Warn when unable to find timing descriptor based on xfer_mode
  sata_highbank: mark ahci_highbank_probe as __devinit
  pata_arasan: Initialize cf clock to 166MHz
  libata-acpi: Fix NULL ptr derference in ata_acpi_dev_handle
hardkernel pushed a commit that referenced this pull request Nov 26, 2012
In 32 bit the stack address provided by kernel_stack_pointer() may
point to an invalid range causing NULL pointer access or page faults
while in NMI (see trace below). This happens if called in softirq
context and if the stack is empty. The address at &regs->sp is then
out of range.

Fixing this by checking if regs and &regs->sp are in the same stack
context. Otherwise return the previous stack pointer stored in struct
thread_info. If that address is invalid too, return address of regs.

 BUG: unable to handle kernel NULL pointer dereference at 0000000a
 IP: [<c1004237>] print_context_stack+0x6e/0x8d
 *pde = 00000000
 Oops: 0000 [#1] SMP
 Modules linked in:
 Pid: 4434, comm: perl Not tainted 3.6.0-rc3-oprofile-i386-standard-g4411a05 #4 Hewlett-Packard HP xw9400 Workstation/0A1Ch
 EIP: 0060:[<c1004237>] EFLAGS: 00010093 CPU: 0
 EIP is at print_context_stack+0x6e/0x8d
 EAX: ffffe000 EBX: 0000000a ECX: f4435f94 EDX: 0000000a
 ESI: f4435f94 EDI: f4435f94 EBP: f5409ec0 ESP: f5409ea0
  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
 CR0: 8005003b CR2: 0000000a CR3: 34ac9000 CR4: 000007d0
 DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
 DR6: ffff0ff0 DR7: 00000400
 Process perl (pid: 4434, ti=f5408000 task=f5637850 task.ti=f4434000)
 Stack:
  000003e8 ffffe000 00001ffc f4e39b00 00000000 0000000a f4435f94 c155198c
  f5409ef0 c1003723 c155198c f5409f04 00000000 f5409edc 00000000 00000000
  f5409ee8 f4435f94 f5409fc4 00000001 f5409f1c c12dce1c 00000000 c155198c
 Call Trace:
  [<c1003723>] dump_trace+0x7b/0xa1
  [<c12dce1c>] x86_backtrace+0x40/0x88
  [<c12db712>] ? oprofile_add_sample+0x56/0x84
  [<c12db731>] oprofile_add_sample+0x75/0x84
  [<c12ddb5b>] op_amd_check_ctrs+0x46/0x260
  [<c12dd40d>] profile_exceptions_notify+0x23/0x4c
  [<c1395034>] nmi_handle+0x31/0x4a
  [<c1029dc5>] ? ftrace_define_fields_irq_handler_entry+0x45/0x45
  [<c13950ed>] do_nmi+0xa0/0x2ff
  [<c1029dc5>] ? ftrace_define_fields_irq_handler_entry+0x45/0x45
  [<c13949e5>] nmi_stack_correct+0x28/0x2d
  [<c1029dc5>] ? ftrace_define_fields_irq_handler_entry+0x45/0x45
  [<c1003603>] ? do_softirq+0x4b/0x7f
  <IRQ>
  [<c102a06f>] irq_exit+0x35/0x5b
  [<c1018f56>] smp_apic_timer_interrupt+0x6c/0x7a
  [<c1394746>] apic_timer_interrupt+0x2a/0x30
 Code: 89 fe eb 08 31 c9 8b 45 0c ff 55 ec 83 c3 04 83 7d 10 00 74 0c 3b 5d 10 73 26 3b 5d e4 73 0c eb 1f 3b 5d f0 76 1a 3b 5d e8 73 15 <8b> 13 89 d0 89 55 e0 e8 ad 42 03 00 85 c0 8b 55 e0 75 a6 eb cc
 EIP: [<c1004237>] print_context_stack+0x6e/0x8d SS:ESP 0068:f5409ea0
 CR2: 000000000000000a
 ---[ end trace 62afee3481b00012 ]---
 Kernel panic - not syncing: Fatal exception in interrupt

V2:
* add comments to kernel_stack_pointer()
* always return a valid stack address by falling back to the address
  of regs

Reported-by: Yang Wei <wei.yang@windriver.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Link: http://lkml.kernel.org/r/20120912135059.GZ8285@erda.amd.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Jun Zhang <jun.zhang@intel.com>
hardkernel pushed a commit that referenced this pull request Nov 26, 2012
We need to first destroy the floppy_wq workqueue before cleaning up
the queue. Otherwise we might race with still pending work with the
workqueue, but all the block queue already gone. This might lead to
various oopses, such as

 CPU 0
 Pid: 6, comm: kworker/u:0 Not tainted 3.7.0-rc4 #1 Bochs Bochs
 RIP: 0010:[<ffffffff8134eef5>]  [<ffffffff8134eef5>] blk_peek_request+0xd5/0x1c0
 RSP: 0000:ffff88000dc7dd88  EFLAGS: 00010092
 RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000000
 RDX: ffff88000f602688 RSI: ffffffff81fd95d8 RDI: 6b6b6b6b6b6b6b6b
 RBP: ffff88000dc7dd98 R08: ffffffff81fd95c8 R09: 0000000000000000
 R10: ffffffff81fd9480 R11: 0000000000000001 R12: 6b6b6b6b6b6b6b6b
 R13: ffff88000dc7dfd8 R14: ffff88000dc7dfd8 R15: 0000000000000000
 FS:  0000000000000000(0000) GS:ffffffff81e21000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 0000000000000000 CR3: 0000000001e11000 CR4: 00000000000006f0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process kworker/u:0 (pid: 6, threadinfo ffff88000dc7c000, task ffff88000dc5ecc0)
 Stack:
  0000000000000000 0000000000000000 ffff88000dc7ddb8 ffffffff8134efee
  ffff88000dc7ddb8 0000000000000000 ffff88000dc7dde8 ffffffff814aef3c
  ffffffff81e75d80 ffff88000dc0c640 ffff88000fbfb000 ffffffff814aed90
 Call Trace:
  [<ffffffff8134efee>] blk_fetch_request+0xe/0x30
  [<ffffffff814aef3c>] redo_fd_request+0x1ac/0x400
  [<ffffffff814aed90>] ? start_motor+0x130/0x130
  [<ffffffff8106b526>] process_one_work+0x136/0x450
  [<ffffffff8106af65>] ? manage_workers+0x205/0x2e0
  [<ffffffff8106bb6d>] worker_thread+0x14d/0x420
  [<ffffffff8106ba20>] ? rescuer_thread+0x1a0/0x1a0
  [<ffffffff8107075a>] kthread+0xba/0xc0
  [<ffffffff810706a0>] ? __kthread_parkme+0x80/0x80
  [<ffffffff818b553a>] ret_from_fork+0x7a/0xb0
  [<ffffffff810706a0>] ? __kthread_parkme+0x80/0x80
 Code: 0f 84 c0 00 00 00 83 f8 01 0f 85 e2 00 00 00 81 4b 40 00 00 80 00 48 89 df e8 58 f8 ff ff be fb ff ff ff
 fe ff ff <49> 8b 1c 24 49 39 dc 0f 85 2e ff ff ff 41 0f b6 84 24 28 04 00
 RIP  [<ffffffff8134eef5>] blk_peek_request+0xd5/0x1c0
  RSP <ffff88000dc7dd88>

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Tested-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
hardkernel pushed a commit that referenced this pull request Nov 27, 2012
commit 772aebc upstream.

exit_idle() should be called after irq_enter(), otherwise it throws:

[ INFO: suspicious RCU usage. ]
3.6.5 #1 Not tainted
-------------------------------
include/linux/rcupdate.h:725 rcu_read_lock() used illegally while idle!

other info that might help us debug this:

RCU used illegally from idle CPU!
rcu_scheduler_active = 1, debug_locks = 1
RCU used illegally from extended quiescent state!
1 lock held by swapper/0/0:
 #0:  (rcu_read_lock){......}, at: [<ffffffff810e9fe0>] __atomic_notifier_call_chain+0x0/0x140

stack backtrace:
Pid: 0, comm: swapper/0 Not tainted 3.6.5 #1
Call Trace:
 <IRQ>  [<ffffffff811259a2>] lockdep_rcu_suspicious+0xe2/0x130
 [<ffffffff810ea10c>] __atomic_notifier_call_chain+0x12c/0x140
 [<ffffffff810e9fe0>] ? atomic_notifier_chain_unregister+0x90/0x90
 [<ffffffff811216cd>] ? trace_hardirqs_off+0xd/0x10
 [<ffffffff810ea136>] atomic_notifier_call_chain+0x16/0x20
 [<ffffffff810777c3>] exit_idle+0x43/0x50
 [<ffffffff81568865>] xen_evtchn_do_upcall+0x25/0x50
 [<ffffffff81aa690e>] xen_do_hypervisor_callback+0x1e/0x30
 <EOI>  [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1000
 [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1000
 [<ffffffff81061540>] ? xen_safe_halt+0x10/0x20
 [<ffffffff81075cfa>] ? default_idle+0xba/0x570
 [<ffffffff810778af>] ? cpu_idle+0xdf/0x140
 [<ffffffff81a4d881>] ? rest_init+0x135/0x144
 [<ffffffff81a4d74c>] ? csum_partial_copy_generic+0x16c/0x16c
 [<ffffffff82520c45>] ? start_kernel+0x3db/0x3e8
 [<ffffffff8252066a>] ? repair_env_string+0x5a/0x5a
 [<ffffffff82520356>] ? x86_64_start_reservations+0x131/0x135
 [<ffffffff82524aca>] ? xen_start_kernel+0x465/0x46

Git commit 98ad1cc
Author: Frederic Weisbecker <fweisbec@gmail.com>
Date:   Fri Oct 7 18:22:09 2011 +0200

    x86: Call idle notifier after irq_enter()

did this, but it missed the Xen code.

Signed-off-by: Mojiong Qiu <mjqiu@tencent.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
hardkernel pushed a commit that referenced this pull request Dec 4, 2012
commit 1022623 upstream.

In 32 bit the stack address provided by kernel_stack_pointer() may
point to an invalid range causing NULL pointer access or page faults
while in NMI (see trace below). This happens if called in softirq
context and if the stack is empty. The address at &regs->sp is then
out of range.

Fixing this by checking if regs and &regs->sp are in the same stack
context. Otherwise return the previous stack pointer stored in struct
thread_info. If that address is invalid too, return address of regs.

 BUG: unable to handle kernel NULL pointer dereference at 0000000a
 IP: [<c1004237>] print_context_stack+0x6e/0x8d
 *pde = 00000000
 Oops: 0000 [#1] SMP
 Modules linked in:
 Pid: 4434, comm: perl Not tainted 3.6.0-rc3-oprofile-i386-standard-g4411a05 #4 Hewlett-Packard HP xw9400 Workstation/0A1Ch
 EIP: 0060:[<c1004237>] EFLAGS: 00010093 CPU: 0
 EIP is at print_context_stack+0x6e/0x8d
 EAX: ffffe000 EBX: 0000000a ECX: f4435f94 EDX: 0000000a
 ESI: f4435f94 EDI: f4435f94 EBP: f5409ec0 ESP: f5409ea0
  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
 CR0: 8005003b CR2: 0000000a CR3: 34ac9000 CR4: 000007d0
 DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
 DR6: ffff0ff0 DR7: 00000400
 Process perl (pid: 4434, ti=f5408000 task=f5637850 task.ti=f4434000)
 Stack:
  000003e8 ffffe000 00001ffc f4e39b00 00000000 0000000a f4435f94 c155198c
  f5409ef0 c1003723 c155198c f5409f04 00000000 f5409edc 00000000 00000000
  f5409ee8 f4435f94 f5409fc4 00000001 f5409f1c c12dce1c 00000000 c155198c
 Call Trace:
  [<c1003723>] dump_trace+0x7b/0xa1
  [<c12dce1c>] x86_backtrace+0x40/0x88
  [<c12db712>] ? oprofile_add_sample+0x56/0x84
  [<c12db731>] oprofile_add_sample+0x75/0x84
  [<c12ddb5b>] op_amd_check_ctrs+0x46/0x260
  [<c12dd40d>] profile_exceptions_notify+0x23/0x4c
  [<c1395034>] nmi_handle+0x31/0x4a
  [<c1029dc5>] ? ftrace_define_fields_irq_handler_entry+0x45/0x45
  [<c13950ed>] do_nmi+0xa0/0x2ff
  [<c1029dc5>] ? ftrace_define_fields_irq_handler_entry+0x45/0x45
  [<c13949e5>] nmi_stack_correct+0x28/0x2d
  [<c1029dc5>] ? ftrace_define_fields_irq_handler_entry+0x45/0x45
  [<c1003603>] ? do_softirq+0x4b/0x7f
  <IRQ>
  [<c102a06f>] irq_exit+0x35/0x5b
  [<c1018f56>] smp_apic_timer_interrupt+0x6c/0x7a
  [<c1394746>] apic_timer_interrupt+0x2a/0x30
 Code: 89 fe eb 08 31 c9 8b 45 0c ff 55 ec 83 c3 04 83 7d 10 00 74 0c 3b 5d 10 73 26 3b 5d e4 73 0c eb 1f 3b 5d f0 76 1a 3b 5d e8 73 15 <8b> 13 89 d0 89 55 e0 e8 ad 42 03 00 85 c0 8b 55 e0 75 a6 eb cc
 EIP: [<c1004237>] print_context_stack+0x6e/0x8d SS:ESP 0068:f5409ea0
 CR2: 000000000000000a
 ---[ end trace 62afee3481b00012 ]---
 Kernel panic - not syncing: Fatal exception in interrupt

V2:
* add comments to kernel_stack_pointer()
* always return a valid stack address by falling back to the address
  of regs

Reported-by: Yang Wei <wei.yang@windriver.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Link: http://lkml.kernel.org/r/20120912135059.GZ8285@erda.amd.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Jun Zhang <jun.zhang@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
hardkernel pushed a commit that referenced this pull request Dec 4, 2012
The driver has only 4 hardcoded labels, but allows much more memory.
Fix it by removing the hardcoded logic, using snprintf() instead.

[   19.833972] general protection fault: 0000 [#1] SMP
[   19.837733] Modules linked in: i82975x_edac(+) edac_core firewire_ohci firewire_core crc_itu_t nouveau mxm_wmi wmi video i2c_algo_bit drm_kms_helper ttm drm i2c_core
[   19.837733] CPU 0
[   19.837733] Pid: 390, comm: udevd Not tainted 3.6.1-1.fc17.x86_64.debug #1 Dell Inc.                 Precision WorkStation 390    /0MY510
[   19.837733] RIP: 0010:[<ffffffff813463a8>]  [<ffffffff813463a8>] strncpy+0x18/0x30
[   19.837733] RSP: 0018:ffff880078535b68  EFLAGS: 00010202
[   19.837733] RAX: ffff880069fa9708 RBX: ffff880078588000 RCX: ffff880069fa9708
[   19.837733] RDX: 000000000000001f RSI: 5f706f5f63616465 RDI: ffff880069fa9708
[   19.837733] RBP: ffff880078535b68 R08: ffff880069fa9727 R09: 000000000000fffe
[   19.837733] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000003
[   19.837733] R13: 0000000000000000 R14: ffff880069fa9290 R15: ffff880079624a80
[   19.837733] FS:  00007f3de01ee840(0000) GS:ffff88007c400000(0000) knlGS:0000000000000000
[   19.837733] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   19.837733] CR2: 00007f3de00b9000 CR3: 0000000078dbc000 CR4: 00000000000007f0
[   19.837733] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   19.837733] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[   19.837733] Process udevd (pid: 390, threadinfo ffff880078534000, task ffff880079642450)
[   19.837733] Stack:
[   19.837733]  ffff880078535c18 ffffffffa017c6b8 00040000816d627f ffff880079624a88
[   19.837733]  ffffc90004cd6000 ffff880079624520 ffff88007ac21148 0000000000000000
[   19.837733]  0000000000000000 0004000000000000 feda000078535bc8 ffffffff810d696d
[   19.837733] Call Trace:
[   19.837733]  [<ffffffffa017c6b8>] i82975x_init_one+0x2e6/0x3e6 [i82975x_edac]
...

Fix bug reported at:
	https://bugzilla.redhat.com/show_bug.cgi?id=848149
And, very likely:
	https://bbs.archlinux.org/viewtopic.php?id=148033
	https://bugzilla.kernel.org/show_bug.cgi?id=47171

Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
hardkernel pushed a commit that referenced this pull request Dec 4, 2012
I enable CONFIG_DEBUG_VIRTUAL and CONFIG_SPARSEMEM_VMEMMAP, when doing
memory hotremove, there is a kernel BUG at arch/x86/mm/physaddr.c:20.

It is caused by free_section_usemap()->virt_to_page(), virt_to_page() is
only used for kernel direct mapping address, but sparse-vmemmap uses
vmemmap address, so it is going wrong here.

  ------------[ cut here ]------------
  kernel BUG at arch/x86/mm/physaddr.c:20!
  invalid opcode: 0000 [#1] SMP
  Modules linked in: acpihp_drv acpihp_slot edd cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf fuse vfat fat loop dm_mod coretemp kvm crc32c_intel ipv6 ixgbe igb iTCO_wdt i7core_edac edac_core pcspkr iTCO_vendor_support ioatdma microcode joydev sr_mod i2c_i801 dca lpc_ich mfd_core mdio tpm_tis i2c_core hid_generic tpm cdrom sg tpm_bios rtc_cmos button ext3 jbd mbcache usbhid hid uhci_hcd ehci_hcd usbcore usb_common sd_mod crc_t10dif processor thermal_sys hwmon scsi_dh_alua scsi_dh_hp_sw scsi_dh_rdac scsi_dh_emc scsi_dh ata_generic ata_piix libata megaraid_sas scsi_mod
  CPU 39
  Pid: 6454, comm: sh Not tainted 3.7.0-rc1-acpihp-final+ #45 QCI QSSC-S4R/QSSC-S4R
  RIP: 0010:[<ffffffff8103c908>]  [<ffffffff8103c908>] __phys_addr+0x88/0x90
  RSP: 0018:ffff8804440d7c08  EFLAGS: 00010006
  RAX: 0000000000000006 RBX: ffffea0012000000 RCX: 000000000000002c
  ...

Signed-off-by: Jianguo Wu <wujianguo@huawei.com>
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Reviewd-by: Wen Congyang <wency@cn.fujitsu.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
@mdrjr mdrjr mentioned this pull request Feb 5, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants