How to read, understand, analyze, and debug a Linux kernel panic?

Question

Asked by 0x90

Consider the following Linux kernel stack trace dump (you can trigger a panic from the kernel source code by calling panic("debugging a linux kernel panic");):

[<001360ac>] (unwind_backtrace+0x0/0xf8) from [<00147b7c>] (warn_slowpath_common+0x50/0x60)
[<00147b7c>] (warn_slowpath_common+0x50/0x60) from [<00147c40>] (warn_slowpath_null+0x1c/0x24)
[<00147c40>] (warn_slowpath_null+0x1c/0x24) from [<0014de44>] (local_bh_enable_ip+0xa0/0xac)
[<0014de44>] (local_bh_enable_ip+0xa0/0xac) from [<0019594c>] (bdi_register+0xec/0x150)

In unwind_backtrace+0x0/0xf8, what does the +0x0/0xf8 stand for?
How can I see the C code of unwind_backtrace+0x0/0xf8?
How do I interpret the panic's content?

Answer 1

Accepted answer by iabdalkader

It's just an ordinary backtrace. The functions are listed in reverse call order: each one was called by the function on the line below it, and so on:

unwind_backtrace+0x0/0xf8
warn_slowpath_common+0x50/0x60
warn_slowpath_null+0x1c/0x24
local_bh_enable_ip+0xa0/0xac
bdi_register+0xec/0x150

bdi_register+0xec/0x150 is the symbol plus the offset/length. There is more information about that in Understanding a Kernel Oops and on how you can debug a kernel oops. There is also an excellent tutorial on Debugging the Kernel.
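
To make the symbol + offset/length layout concrete, here is a minimal sketch that splits one frame from the trace above into its three parts using plain shell parameter expansion. The variable names are only illustrative.

```shell
# One frame copied from the trace in the question.
frame='bdi_register+0xec/0x150'

symbol=${frame%%+*}   # text before '+'  -> function name
rest=${frame#*+}      # text after '+'   -> "0xec/0x150"
offset=${rest%/*}     # text before '/'  -> offset into the function
length=${rest#*/}     # text after '/'   -> total size of the function

echo "symbol=$symbol offset=$offset length=$length"
```

The same split works for any frame in the trace, since they all follow the symbol+offset/length pattern.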

Note: as suggested below by Eugene, you may want to try addr2line first. It still needs an image with debugging symbols, though. For example:

addr2line -e vmlinux_with_debug_info 0019594c(+offset)
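
As a rough sketch of how this could be applied to a whole trace, the snippet below pulls every bracketed address out of a saved trace line and hands them to addr2line in one call. The ./vmlinux path and the trace_addrs helper are assumptions for illustration; for a cross-compiled kernel you would likely need the addr2line from the target's toolchain instead of the host one.

```shell
# Hypothetical path: an uncompressed kernel image built with debug info.
VMLINUX=${VMLINUX:-./vmlinux}

# Hypothetical helper: print each bracketed address from a trace, one per line.
trace_addrs() {
    grep -o '\[<[0-9a-f]*>]' | tr -d '[<>]' | sort -u
}

trace='[<0014de44>] (local_bh_enable_ip+0xa0/0xac) from [<0019594c>] (bdi_register+0xec/0x150)'
addrs=$(printf '%s\n' "$trace" | trace_addrs)
echo "$addrs"

# Resolve the addresses only when both the image and the tool are available.
if [ -f "$VMLINUX" ] && command -v addr2line >/dev/null 2>&1; then
    # -f also prints the function name, -i expands inlined frames
    addr2line -f -i -e "$VMLINUX" $addrs
fi
```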

Answer 2

Answered by 0x90

Here are two alternatives to addr2line. Assuming you have the proper toolchain for the target, you can do one of the following:

Use objdump:


Locate your vmlinux or the .ko file under the kernel root directory, then disassemble the object file:
```
objdump -dS vmlinux > /tmp/kernel.s
```
Open the generated assembly file, /tmp/kernel.s, with a text editor such as vim. Go to unwind_backtrace+0x0/0xf8, i.e. search for the address of unwind_backtrace plus the offset. You have then located the problematic part of your source code.
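
Disassembling the whole kernel produces a very large file. One possible refinement, sketched below, is to work out the function's address range from a frame in the trace and pass it to objdump's --start-address and --stop-address options. The frame used is bdi_register from the question, the ./vmlinux path is an assumption, and the source interleaving from -S only appears if the image carries debug info.

```shell
VMLINUX=${VMLINUX:-./vmlinux}   # hypothetical path to the kernel image

# Taken from the frame: [<0019594c>] (bdi_register+0xec/0x150)
addr=0x0019594c
offset=0xec
length=0x150

# Function start = reported address - offset; function end = start + length.
start=$(printf '0x%08x' $(( $addr - $offset )))
stop=$(printf '0x%08x' $(( $start + $length )))
echo "range: $start .. $stop"

# Disassemble only that range when the image and the tool are available.
if [ -f "$VMLINUX" ] && command -v objdump >/dev/null 2>&1; then
    objdump -dS --start-address="$start" --stop-address="$stop" "$VMLINUX"
fi
```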


Use gdb:


IMO, an even more elegant option is to use the one and only gdb. Assuming you have a suitable toolchain on your host machine:

Run gdb <path-to-vmlinux>.
Execute at gdb's prompt: list *(unwind_backtrace+0x10).
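
The two steps above can also be run non-interactively. The sketch below assumes a ./vmlinux image with debug info and uses gdb's -batch and -ex options; the frame is bdi_register from the question, and the cmd variable is only illustrative.

```shell
VMLINUX=${VMLINUX:-./vmlinux}   # hypothetical path to the kernel image

frame='bdi_register+0xec/0x150'
location=${frame%/*}            # drop "/length" -> bdi_register+0xec
cmd="list *($location)"
echo "$cmd"

# Run the lookup only when both the image and gdb are available.
if [ -f "$VMLINUX" ] && command -v gdb >/dev/null 2>&1; then
    gdb -batch -ex "$cmd" "$VMLINUX"
fi
```

For a cross-compiled kernel, the gdb from the target's toolchain would probably be the one to call here.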

For additional information, you may check out the following:

Answer 3

Answered by mgalgs

In unwind_backtrace+0x0/0xf8, what does the +0x0/0xf8 stand for?

The first number (+0x0) is the offset from the beginning of the function (unwind_backtrace in this case). The second number (0xf8) is the total length of the function. Given these two pieces of information, if you already have a hunch about where the fault occurred, this might be enough to confirm your suspicion (you can tell roughly how far along in the function you were).
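
As a small worked example of "how far along", the arithmetic below uses the bdi_register+0xec/0x150 frame from the trace in the question. It is only a rough indicator, since offsets are in bytes of machine code rather than source lines.

```shell
# Values from the frame bdi_register+0xec/0x150.
offset=0xec
length=0x150

percent=$(( $offset * 100 / $length ))   # integer percentage of the function
remaining=$(( $length - $offset ))       # bytes left before the function ends

echo "${percent}% into the function, $remaining bytes before its end"
```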

To get the exact source line of the corresponding instruction (generally better than hunches), use addr2line or the other methods in the other answers.
