How to count the occurrences of words in a file with a shell script in Linux

Date: 2020-01-09 10:37:50   Source: igfitidea

Script to count the occurrences of words in a file

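We can solve this problem in several ways using awk's associative arrays.
A word is a run of alphabetic characters delimited by spaces or periods.
First we parse all the words in the given file, then we find the count of each word.
Words can be parsed with regular expressions using tools such as sed, awk or grep.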
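The two steps described above, splitting the text into words and then counting each word, can also be sketched in awk alone. This is a minimal sketch rather than the article's script: the function name word_freq_awk is hypothetical, and it assumes a word is a run of ASCII letters.

```shell
#!/bin/bash
# Hypothetical helper (not from the article): word frequencies using awk alone.
# Assumption: a word is a run of ASCII letters A-Z or a-z.
word_freq_awk() {
  awk '
    {
      # Split the line on every run of non-letters; empty pieces can appear
      # at the edges when the line starts or ends with a separator.
      n = split($0, words, /[^A-Za-z]+/)
      for (i = 1; i <= n; i++)
        if (words[i] != "")
          count[words[i]]++      # associative array keyed by the word
    }
    END {
      for (w in count)
        printf "%s %d\n", w, count[w]
    }
  ' "$@" | LC_ALL=C sort         # for-in order is unspecified, so sort the result
}

# Example: reads standard input when no file name is given.
printf 'count the words. count them\n' | word_freq_awk
```

Because awk does not guarantee any order when looping over an associative array, the sketch pipes the result through sort to make the output stable.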

Sample shell script

Below is a sample shell script that counts the occurrences of each word in a file and prints the count for every word found.

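# cat /tmp/count_words.sh
#!/bin/bash
#Desc: Find out frequency of words in a file
if [ $# -ne 1 ];
then
  echo "Usage: $0 filename";
  exit 1
fi
filename=$1
# Print every word on its own line, then count identical lines in awk
grep -Eo "\b[[:alpha:]]+\b" "$filename" | \
awk '{ count[$0]++ }
END {printf("%-14s%s\n","Word","Count") ;
for(ind in count)
{ printf("%-14s%d\n",ind,count[ind]); }
}'

Next, I create a "dummy_file.txt" with some content, which we will use to count all the words in the file.

# cat /tmp/dummy_file.txt
count occurrences of word in file linux
shell script to count number of words in a file
count occurrences of all words in file linux
shell script to count number of lines in a file without using wc command
shell script to counts number of lines and words in a file
find count of string in file linux
shell script to counting number of lines words and characters in a file
count number of lines in a file linux

Now we run the script against this file.
As we can see, the script prints every word in the file together with its number of occurrences.
The script also distinguishes similar words such as count, counts and counting.

# /tmp/count_words.sh /tmp/dummy_file.txt
Word          Count
script        4
linux         4
words         4
counts        1
counting      1
without       1
count         6
lines         4
of            8
and           2
using         1
a             5
to            4
characters    1
number        5
in            8
command       1
shell         4
file          8
find          1
wc            1
string        1
all           1
word          1
occurrences   2

One-line commands

We can also use various one-liners built on grep, sed, tr, python and similar tools to count the occurrences of a word in a file.
I will show a few examples here:

Using the grep command

With grep -E (egrep) we can count how often a word occurs in a file, for example the word count in /tmp/dummy_file.txt.
Note that -c counts matching lines; in this file the word appears at most once per line, so the result equals the number of occurrences.

# grep -E -c '\<count\>' /tmp/dummy_file.txt
6

Here '\<count\>' ensures that we match only the exact word: \< asserts the start of a word and \> asserts the end of a word.
If we use just the string count instead, check the output:

# grep -E -c 'count' /tmp/dummy_file.txt
8

This is because the pattern now also matches counts and counting in the file.

Using the tr command

Similar to grep, we can use the translate command to count the occurrences of a word in a file.
tr puts every space-separated word on its own line, so wc -l counts occurrences rather than lines of the original file.

# tr ' ' '\n' < /tmp/dummy_file.txt | grep '\<count\>' | wc -l
6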
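The tr approach above splits on spaces only, so a word followed by a period or written in capitals would be missed. A possible variation is sketched below: it splits on every non-letter and folds case before counting. The function name count_word_tr is hypothetical, and the word to search for is assumed to be given in lower case.

```shell
#!/bin/bash
# Hypothetical helper (not from the article): count one word with tr,
# ignoring punctuation and letter case.
# Usage: count_word_tr WORD FILE   (WORD in lower case)
count_word_tr() {
  tr -cs '[:alpha:]' '\n' < "$2" |   # turn each run of non-letters into one newline
    tr '[:upper:]' '[:lower:]' |     # fold everything to lower case
    grep -cxF -- "$1"                # count lines that equal WORD exactly
}

# Example with a throwaway file
sample=$(mktemp)
printf 'Count the words. count, counts and COUNT!\n' > "$sample"
count_word_tr count "$sample"
rm -f "$sample"
```

Matching whole lines with grep -x replaces the \< and \> anchors, since each line already holds exactly one word.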

We can also use sed and other tools to produce a word count for a file in Linux or Unix.
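As one illustration of the sed route, the sketch below splits the text into words with sed and lets sort and uniq do the counting. It is an assumption-laden sketch, not the article's method: the function name word_freq_sed is hypothetical, and it relies on GNU sed treating \n in the replacement as a newline.

```shell
#!/bin/bash
# Hypothetical helper (not from the article): word frequencies with sed, sort and uniq.
# Assumption: GNU sed, where \n in the replacement text inserts a newline.
word_freq_sed() {
  sed 's/[^[:alpha:]]\{1,\}/\n/g' "$@" |  # replace each run of non-letters with a newline
    sed '/^$/d' |                         # drop the empty lines this leaves behind
    sort |                                # group identical words together
    uniq -c |                             # prefix each distinct word with its count
    sort -k1,1nr -k2                      # highest count first, then by word
}

# Example: reads standard input when no file name is given.
printf 'a b a. c a b\n' | word_freq_sed
```

The second sed call is kept separate on purpose: inside a single sed invocation the embedded newlines are still part of one pattern space, so the empty pieces could not be deleted as lines.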
