Notice: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/15016970/

Time: 2020-08-06 19:06:07  Source: igfitidea

Grep all instances of strings that start with certain characters

linux bash grep cat

Asked by Stephopolis

I would like to grep out all instances of strings that start with the characters 'rs' (from just one file) and pipe the full string into a new file. I managed to get the count of the instances but I don't know how to get them into the new file:

grep -c rs < /home/Stephanie/this.txt
698572

An example of a line in the file is:

1203823    forward   efjdhgv   rs124054t8 dhdfhfhs
12045345    back   efjdkkjf   rs12445368 dhdfhfhs

I just want to grab the rs string and move it to a new file. Can someone help me out with the piping? I read around a bit, but what I found wasn't particularly helpful to me. Thanks.

Accepted answer by biophonc

I'd suggest something like this:

egrep -o "(\s(rs\S+))" data.txt | cut -d " " -f 2 > newfile.txt

\s matches a single whitespace character, so the match must begin with whitespace

(rs\S+) then matches a string that starts with "rs" and is followed by one or more non-whitespace characters

The results still contain the leading whitespace, which we don't want, so we "cut" it out before the content gets written to the new file.
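
As a minimal sketch of the same idea, the pipeline below rebuilds the two example lines from the question in a temporary file and extracts the rs tokens. The temporary file, the variable names sample and result, and the use of tr instead of cut are assumptions made for illustration, not part of the original answer. grep -o is supported by GNU and BSD grep but is not required by POSIX.

```shell
# Hypothetical sample file built from the two example lines in the question.
sample=$(mktemp)
printf '%s\n' \
  '1203823    forward   efjdhgv   rs124054t8 dhdfhfhs' \
  '12045345    back   efjdkkjf   rs12445368 dhdfhfhs' > "$sample"

# Match one whitespace character followed by an rs token, then delete the
# leading space or tab with tr (cut -d " " would only handle a space).
result=$(grep -oE '[[:space:]]rs[^[:space:]]+' "$sample" | tr -d ' \t')

printf '%s\n' "$result"
rm -f "$sample"
```

To write the tokens to a file as the question asks, redirect the pipeline output instead of capturing it, for example by appending > newfile.txt to the pipeline.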

Answer by perreal

Using Perl:

perl -lane 'print $1 while (/\b(rs\w+)/g)' input

Or using tr and grep:

tr '[ \t]' '[\n\n]' < input | grep '^rs'

Here ^ matches the start of a line. Because tr has put every token on its own line, that is also the start of each token.
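
A self-contained sketch of the tr and grep approach, assuming tokens are separated by spaces or tabs. The -s flag is an addition here, not part of the answer above; it squeezes runs of separators so that no empty lines are produced. The sample file and the variable names sample and tokens are hypothetical.

```shell
# Hypothetical sample file built from the two example lines in the question.
sample=$(mktemp)
printf '%s\n' \
  '1203823    forward   efjdhgv   rs124054t8 dhdfhfhs' \
  '12045345    back   efjdkkjf   rs12445368 dhdfhfhs' > "$sample"

# Turn every run of spaces or tabs into a single newline (one token per line),
# then keep only the lines that begin with rs.
tokens=$(tr -s ' \t' '\n\n' < "$sample" | grep '^rs')

printf '%s\n' "$tokens"
rm -f "$sample"
```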

Answer by Vijay

perl -F -lane '$a=$_;for(@F){if(/^rs/){print $a;last}}' your_file

or

perl -lne 'print if(/[\s]rs/ || /^rs/)' your_file

Answer by vara

Using Grep Command:

grep -w -o "rs[0-9a-z]*"

Answer by Chris Gleason

Super old, but I wanted to add to this. @kev: grep -c '^rs' would print a count of the lines that start with rs, and none of them do.

To do this relatively easily with most standard binaries, you could use:

cat text.file | awk '{print $4}' | grep '^rs'

This would cat the file, pull out the fourth field of each line, and keep only the fields that start with rs.
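
If the rs token is not guaranteed to sit in the fourth column, one possible variation is to let awk check every field itself. This is a sketch under that assumption rather than the answer's own method; the sample file and the variable names sample and rs_fields are hypothetical.

```shell
# Hypothetical sample file built from the two example lines in the question.
sample=$(mktemp)
printf '%s\n' \
  '1203823    forward   efjdhgv   rs124054t8 dhdfhfhs' \
  '12045345    back   efjdkkjf   rs12445368 dhdfhfhs' > "$sample"

# Loop over all whitespace-separated fields on each line and print
# those that begin with rs, whichever column they appear in.
rs_fields=$(awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^rs/) print $i }' "$sample")

printf '%s\n' "$rs_fields"
rm -f "$sample"
```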
