Linux: "grep" the offset of an ASCII string in a binary file

Question

Asked by mgilson

I'm generating binary data files that are simply a series of records concatenated together. Each record consists of a (binary) header followed by binary data. Within the binary header is an ascii string 80 characters long. Somewhere along the way, my process of writing the files got a little messed up and I'm trying to debug this problem by inspecting how long each record actually is.

This seems extremely related, but I don't understand Perl, so I haven't been able to get the accepted answer there to work. The other answer points to bgrep, which I've compiled, but it wants me to feed it a hex string. I'd rather have a tool that I can give the ASCII string and that will find it in the binary data, then print the string and the byte offset where it was found.

In other words, I'm looking for some tool which acts like this:

tool foobar filename

or

tool foobar < filename

and its output is something like this:

foobar:10
foobar:410
foobar:810
foobar:1210
...

i.e. the string that matched and the byte offset in the file where the match started. In this example, I can infer that each record is 400 bytes long.
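
The inference above is just the difference between consecutive offsets. As a minimal sketch (the function name `record_lengths` is made up for illustration, and the offsets are the ones from the example output):

```python
def record_lengths(offsets: list[int]) -> list[int]:
    """Return the distance between each pair of consecutive match offsets."""
    return [later - earlier for earlier, later in zip(offsets, offsets[1:])]


# Offsets taken from the example output in the question.
offsets = [10, 410, 810, 1210]
print(record_lengths(offsets))  # prints [400, 400, 400]
```

A record whose length differs from the others would show up as an outlier in that list.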

Other constraints:

The ability to search by regex would be nice, but I don't need it for this problem.
My binary files are big (3.5 GB), so I'd like to avoid reading the whole file into memory if possible.

Answer 1

Accepted answer by Thor

You could use strings for this:

strings -a -t x filename | grep foobar

Tested with GNU binutils.

For example, to find where --help occurs in /bin/ls:

strings -a -t x /bin/ls | grep -- --help

Output:

14938 Try `%s --help' for more information.
162f0       --help     display this help and exit
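
Note that with -t x, strings reports the hexadecimal offset at which each printable string starts, so a match in the middle of a string lies a few bytes further on (use -t d for decimal offsets). A rough sketch of turning one output line into the offset of the match itself is below. The helper name `match_offsets` is hypothetical, and it assumes the GNU layout of an optionally padded hex offset, one space, then the string, with one byte per character:

```python
def match_offsets(strings_line: str, pattern: str) -> list[int]:
    """Turn one line of `strings -t x` output into byte offsets of `pattern`."""
    # Assumed layout: "<hex offset> <string>", offset possibly left-padded with spaces.
    offset_hex, _, text = strings_line.lstrip().partition(" ")
    base = int(offset_hex, 16)

    offsets = []
    pos = text.find(pattern)
    while pos != -1:
        # The string starts at `base`, so the match starts `pos` bytes later.
        offsets.append(base + pos)
        pos = text.find(pattern, pos + 1)
    return offsets
```

For the first output line above, the match would be reported 8 bytes after 0x14938, because "--help" starts at index 8 of the printed string.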

Answer 2

Answer by Hari Menon

grep --byte-offset --only-matching --text foobar filename

The --byte-offset option prints the offset of each matching line.

The --only-matching option makes it print the offset of each matching instance instead of each matching line.

The --text option makes grep treat the binary file as a text file.

You can shorten it to:

grep -oba foobar filename

It works in the GNU version of grep, which comes with Linux by default. It won't work in BSD grep (which comes with Mac by default).
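
Where GNU grep is not available, the same kind of search can be sketched in Python. This is only an illustration, not part of the answer above: the function name `find_offsets` is made up, and it assumes a 64-bit system so that a multi-gigabyte file can be memory-mapped. Mapping lets the operating system page the file in on demand rather than reading it all into memory at once.

```python
import mmap


def find_offsets(path: str, needle: bytes):
    """Yield the byte offset of every occurrence of `needle` in the file at `path`."""
    # Note: mmap raises ValueError for an empty file.
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        pos = mm.find(needle)
        while pos != -1:
            yield pos
            # Resume one byte past the last hit so overlapping matches are found too.
            pos = mm.find(needle, pos + 1)
```

Looping over `find_offsets("filename", b"foobar")` and printing `f"foobar:{offset}"` for each result gives output in the form asked for in the question.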

Answer 3

Answer by caesun

I wanted to do the same task. Though strings | grep worked, I found gsar was the very tool I needed.

http://tjaberg.com/

The output looks like:

>gsar.exe -bic -sfoobar filename.bin
filename.bin: 0x34b5: AAA foobar BBB
filename.bin: 0x56a0: foobar DDD
filename.bin: 2 matches found
