Linux: How to extract a substring and numbers using only grep/sed

Question

Asked by Hooloovoo

I have a text file containing both text and numbers. I want to use grep to extract only the numbers I need. For example, given a file as follows:

miss rate 0.21  
ipc 222  
stalls n shdmem 112

Say I only want to extract the data for miss rate, which is 0.21. How do I do it with grep or sed? Also, I need more than one number, not only the one after miss rate. That is, I may want to get both 0.21 and 112. A sample output might look like this:

0.21 222 112

I need the data for plotting later.

Answer 1

Accepted answer by that other guy

Use awk instead:

awk '/^miss rate/ { print $3 }' yourfile

To do it with just grep, you need non-standard extensions. Here, for example, GNU grep uses PCRE (-P) with a positive lookbehind (?<=..) and prints only the matched part (-o):

grep -Po '(?<=miss rate ).*' yourfile
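
Since the question asks for all the numbers on one line, the awk approach can be stretched a little further. This is only a sketch: it assumes the number is always the last field on each line, and the temporary file is a stand-in for the real data.

```shell
# Temporary sample file mirroring the question's data (a stand-in for yourfile).
sample=$(mktemp)
printf '%s\n' 'miss rate 0.21' 'ipc 222' 'stalls n shdmem 112' > "$sample"

# Print the last field of every line, separated by single spaces.
# sep is empty for the first line and a space afterwards.
values=$(awk '{ printf "%s%s", sep, $NF; sep = " " } END { print "" }' "$sample")

echo "$values"   # prints: 0.21 222 112
rm -f "$sample"
```

Because awk splits on runs of whitespace by default, trailing spaces after the number do not change the result.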

Answer 2

Answered by kamituel

You can use:

grep -P "miss rate \d+(\.\d+)?" file.txt

or:

grep -E "miss rate [0-9]+(\.[0-9]+)?"

Both of those commands will print out miss rate 0.21. If you want to extract the number only, why not use Perl, Sed or Awk?

If you really want to avoid those, maybe this will work?

grep -E "miss rate [0-9]+(\.[0-9]+)?" g | xargs basename | tail -n 1

Answer 3

Answered by DanneJ

If you really want to use only grep for this, then you can try:

grep "miss rate" file | grep -oe '\([0-9.]*\)'

It will first find the line that matches, and then only output the digits.

Sed might be a bit more readable, though:

sed -n 's#miss rate ##p' file
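
To pull several values at once with sed, one substitution per label seems to work. A rough sketch, assuming each label starts its line and is followed by exactly one space; the temporary file stands in for the real one, and paste is used only to join the lines.

```shell
# Temporary sample file mirroring the question's data (a stand-in for file).
sample=$(mktemp)
printf '%s\n' 'miss rate 0.21' 'ipc 222' 'stalls n shdmem 112' > "$sample"

# -n suppresses default output; each s///p prints a line only if its label was stripped.
# paste -s then joins the printed values with single spaces.
sed_values=$(sed -n \
    -e 's/^miss rate //p' \
    -e 's/^stalls n shdmem //p' \
    "$sample" | paste -s -d ' ' -)

echo "$sed_values"   # prints: 0.21 112
rm -f "$sample"
```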

Answer 4

Answered by Gilles Quenot

Using the special lookaround regex trick \K with the PCRE engine in grep:

grep -oP 'miss rate \K.*' file.txt

or with perl:

perl -lne 'print $& if /miss rate \K.*/' file.txt
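
The \K trick also combines with alternation if you need the numbers after several labels. A sketch only: it assumes GNU grep built with PCRE support, and the temporary file is a stand-in for file.txt.

```shell
# Temporary sample file mirroring the question's data (a stand-in for file.txt).
sample=$(mktemp)
printf '%s\n' 'miss rate 0.21' 'ipc 222' 'stalls n shdmem 112' > "$sample"

# Match either label, then \K discards everything matched so far,
# so -o prints only the number that follows.
pcre_values=$(grep -oP '^(miss rate|stalls n shdmem) \K[0-9.]+' "$sample" | paste -s -d ' ' -)

echo "$pcre_values"   # prints: 0.21 112
rm -f "$sample"
```

Matching [0-9.]+ instead of .* keeps trailing spaces out of the result.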

Answer 5

Answered by mariux

The grep-and-cut solution would look like this.

To get the 3rd field of every line that grep matches, use:

grep "^miss rate " yourfile | cut -d ' ' -f 3

Or, to get the 3rd field and everything after it, use:

grep "^miss rate " yourfile | cut -d ' ' -f 3-

Or if you use bash and "miss rate" only occurs once in your file you can also just do:

a=( $(grep -m 1 "miss rate" yourfile) )
echo ${a[2]}

where ${a[2]} is your result.

If "miss rate" occurs more than once, you can loop over the grep output (in bash), reading only what you need.
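
Such a loop might look roughly like this. It is a sketch under assumptions: the second "miss rate" line and its value are made up for illustration, and w1, w2, value and rest are just placeholder variable names.

```shell
# Temporary sample file; the second "miss rate" line is invented for illustration.
sample=$(mktemp)
printf '%s\n' 'miss rate 0.21' 'ipc 222' 'miss rate 0.35' > "$sample"

# read splits each matching line into words; the third word is the number.
rates=$(grep '^miss rate ' "$sample" | while read -r w1 w2 value rest; do
    printf '%s\n' "$value"
done)

echo "$rates"   # prints 0.21 and 0.35 on separate lines
rm -f "$sample"
```

Wrapping the pipeline in $(...) collects the values, because variables set inside a piped while loop are lost when the loop ends.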

Answer 6

Answered by Daniel Williams

I believe

sed 's|[^0-9]*\([0-9\.]*\)|\1 |g' filename

will do the trick. However, every entry will be on its own line, if that is OK. I am sure there is a way for sed to produce a comma- or space-delimited list, but I am not a master of all things sed.
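
One possible way to get the space-delimited list, sketched here with grep and paste rather than sed alone. It assumes a grep that supports -o (GNU, BSD and busybox do), and the temporary file is a stand-in for the real one.

```shell
# Temporary sample file mirroring the question's data (a stand-in for filename).
sample=$(mktemp)
printf '%s\n' 'miss rate 0.21' 'ipc 222' 'stalls n shdmem 112' > "$sample"

# -o prints each number on its own line; paste -s joins them with single spaces.
all_values=$(grep -oE '[0-9]+(\.[0-9]+)?' "$sample" | paste -s -d ' ' -)

echo "$all_values"   # prints: 0.21 222 112
rm -f "$sample"
```

Changing the paste delimiter to ',' should give a comma-separated list instead.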
