Removing Windows newlines on Linux (sed vs. awk)
Original question: http://stackoverflow.com/questions/11680815/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Asked by kermatt
I have some delimited files with improperly placed newline characters in the middle of fields (not at line ends), appearing as ^M in Vim. They originate from freebcp (on CentOS 6) exports of an MSSQL database. Dumping the data in hex shows \r\n patterns:
$ xxd test.txt | grep 0d0a
0000190: 3932 3139 322d 3239 3836 0d0a 0d0a 7c43
I can remove them with awk, but am unable to do the same with sed.
This works in awk, removing the line breaks completely:
awk 'gsub(/\r/,""){printf $0;next}{print}'
But this in sed does not, leaving line feeds in place:
sed -i 's/\r//g'
where this appears to have no effect:
sed -i 's/\r\n//g'
Using ^M in the sed expression (ctrl+v, ctrl+m) also does not seem to work.
For this sort of task, sed is easier to grok, but I am working on learning more about both. Am I using sed improperly, or is there a limitation?
Accepted answer by chepner
I believe some versions of sed will not recognize \r as a character. However, you can use a bash feature to work around that limitation:
echo $string | sed $'s/\r//'
Here, you let bash replace '\r' with the actual carriage return character inside the $'...' construct before passing that to sed as its command. (Assuming you use bash; other shells should have a similar construct.)
Answer by kev
You can use the command line tool dos2unix:
dos2unix input
Or use the tr command:
tr -d '\r' <input >output
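To see what tr -d '\r' actually changes, here is a small sketch that pipes a sample through it and dumps the resulting bytes. The sample string and the strip_cr function name are made up for illustration. Only the carriage returns are deleted; the line feeds stay, so this converts line endings rather than rejoining fields that were split by an embedded \r\n.

```shell
# Hypothetical helper: delete every carriage return (0x0d) from stdin.
strip_cr() {
    tr -d '\r'
}

# Made-up sample with an embedded CRLF, similar to the hex dump in the question.
# od -c prints each byte so the remaining \n is visible.
printf '2986\r\n|C\n' | strip_cr | od -An -c
```

If the goal is to remove the whole \r\n pair, as in the question, the awk and sed approaches are the ones to look at.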
Actually, you can do the file-format switching in vim:
Method A:
:e ++ff=dos
:w ++ff=unix
:e!
Method B:
:e ++ff=dos
:set ff=unix
:w
EDIT
If you want to delete the \r\n sequences in the file, try these commands in vim:
:e ++ff=unix " <-- make sure open with UNIX format
:%s/\r\n//g " <-- remove all \r\n
:w " <-- save file
Your awk solution works fine. Here are two more sed solutions:
sed '1h;1!H;$!d;${g;s/\r\n//g}' input
sed ':A;/\r$/{N;bA};s/\r\n//g' input
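The reason a plain s/\r\n//g does nothing is that sed reads one line at a time and strips the trailing newline before running the script, so the pattern space never contains \n. Both sed solutions from kev's answer get around this by pulling more than one line into the pattern space first (the first collects the whole file in the hold space, the second uses N to append the next line). The sketch below illustrates the second idea. It assumes GNU sed, which understands \r and \n in regular expressions. The sample data and function names are made up, and the script is split into -e fragments only so that label parsing does not depend on the sed version.

```shell
# Made-up sample: a CRLF in the middle of a field, a plain LF at the line end.
sample() {
    printf 'ab\r\ncd|ef\n'
}

# Line-at-a-time: the pattern space holds "ab\r" without the newline,
# so \r\n never matches and the bytes pass through unchanged.
sample | sed 's/\r\n//g' | od -An -c

# Join first: while the pattern space ends in \r, N appends the next line
# (including the \n between them); only then is the \r\n pair removed.
join_crlf() {
    sed -e ':A' -e '/\r$/{N;bA' -e '}' -e 's/\r\n//g'
}

sample | join_crlf | od -An -c
```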
Answer by Steven Penny
Another method:
awk 1 RS='\r\n' ORS=
- set Record Separator to \r\n
- set Output Record Separator to empty string
- 1 is always true, and in the absence of an action block {print} is used
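As a rough illustration of the three points above, the following sketch runs the same awk program on a made-up sample and dumps the bytes. It assumes an awk that accepts a multi-character RS (gawk and mawk do). The function name join_crlf_awk is hypothetical. Because only \r\n separates records, ordinary \n line ends remain inside the records and are printed unchanged.

```shell
# Hypothetical wrapper around the one-liner from the answer; reads stdin.
join_crlf_awk() {
    awk 1 RS='\r\n' ORS=
}

# Made-up sample: CRLF inside the first field, plain LF at the real line ends.
printf 'ab\r\ncd|ef\ngh|ij\n' | join_crlf_awk | od -An -c
```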
Answer by Sergiy Dolnyy
sed -e 's/\r//g' input_file
This works for me. The difference is using -e instead of the -i option.
I also noticed that sed behaves differently on different platforms. Mine is:
sed --version
This is not GNU sed version 4.0
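Since sed implementations differ in whether they understand \r, and $'...' is not available in every shell, one workaround is to let printf produce the carriage return and pass the literal byte to sed. This is only a sketch under those assumptions: the variable name cr and the function name strip_cr_sed are made up. Like s/\r//g in the question, it removes the carriage returns and leaves the line feeds in place.

```shell
# Build a one-character string holding a literal carriage return.
# Command substitution strips only trailing newlines, so the CR survives.
cr=$(printf '\r')

# Hypothetical helper: remove every CR without relying on sed's \r escape.
strip_cr_sed() {
    sed "s/$cr//g"
}

# Made-up sample with CRLF line endings; od -c shows the bytes that remain.
printf 'ab\r\ncd\r\n' | strip_cr_sed | od -An -c
```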