
Why does the "script" command generate ^[ and ^M characters and how to remove them with vim search and replace?

Tags:

regex

linux

vim

On Linux, using the bash shell, when I use the script command, the generated file is called typescript. When I open that file with vim, each line contains the ^M character, and several lines (due to my colored command prompt) contain a ^[ character. I would like to replace these characters with nothing, effectively removing them from the generated file.

First, I tried :%s/^[//gc, :%s/\^[//gc, :%s/\^\[//gc, and a few other variants. None of them matched the ^[ character, so the search/replace didn't work.

I also tried all these variants for the ^M character with the same results. After some googling I discovered that the ^M character is really the carriage return "\r". So then I tried :%s/\r//gc and this worked for the ^M character!

I googled some more to try and figure out what the ^[ character is but have found nothing helpful.

2 questions:

1) What is the ^[ character, and what is the appropriate regex to use in vim to search and replace it?

2) Why, when using the script command on linux, does the generated script produce ^M at the end of the line? This makes me think the linux script command is generating CRLF eol characters rather than just LF eol characters.

axiopisty asked Nov 06 '13 18:11


4 Answers

^M and ^[ are control characters. As you correctly pointed out, each is a single character, not two. You can type them in vim by pressing Ctrl+V followed by Ctrl+[ to get ^[, or Ctrl+V followed by Ctrl+M to get ^M.

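So the replace command you're looking for is :%s/^[//gc, the same as what you tried, except that ^[ must be entered with Ctrl+V Ctrl+[ rather than typed literally as a caret and a bracket. Alternatively, vim's pattern atom \e matches the same character, so :%s/\e//gc works without the special key sequence.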

^M is a CR (carriage return character). There are commands like dos2unix to get rid of such characters. Also vim has some built in functions to get rid of them.
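If dos2unix is not installed, the carriage returns can also be deleted with tr. The following is a minimal sketch; the function name remove_cr and the sample input are made up for illustration.

```shell
#!/usr/bin/env bash
# remove_cr: hypothetical helper name (not a standard tool).
# Deletes every carriage return (0x0d) from standard input.
remove_cr() {
    tr -d '\r'
}

# Made-up sample input with CRLF line endings
printf 'first line\r\nsecond line\r\n' | remove_cr
```

To clean a real file, something like remove_cr < typescript > typescript.clean should work. Note that this deletes every CR, not only those at the end of a line.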

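^[, on the other hand, is the escape character (ESC, 0x1b), which begins the terminal escape sequences used for colors. In bash you see colored output; in vim you see the raw control character followed by the rest of the sequence (for example [01;32m). Removing only ^[ leaves that remainder in the file.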
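To remove whole color sequences rather than just the escape character, a sed substitution can match ESC, the opening bracket, the numeric parameters, and the final letter together. This is a rough sketch, assuming the file only contains sequences of the form ESC [ digits-and-semicolons letter; the function name strip_ansi and the sample prompt are invented for the example, and other kinds of escape sequences are not handled.

```shell
#!/usr/bin/env bash
# strip_ansi: hypothetical helper name (not a standard tool).
# Removes sequences of the form ESC [ <digits and semicolons> <letter>,
# which covers typical color codes such as ESC[01;32m and ESC[0m.
strip_ansi() {
    # Build a literal ESC byte so the sed pattern stays portable
    esc=$(printf '\033')
    sed "s/${esc}\[[0-9;]*[A-Za-z]//g"
}

# Made-up sample: a green "user@host" prompt as it might appear in a typescript
printf '\033[01;32muser@host\033[00m:~$ ls\n' | strip_ansi
```

On a real file this could be run as strip_ansi < typescript > typescript.clean.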

Indeed, I see the same control characters when using script. Others have pointed out that this behavior is expected. I couldn't find a straightforward way to circumvent it, so I wrote a wrapper script:

#!/usr/bin/env bash

### Default output file name used by script when none is given
typescript="typescript"
### If arguments were passed and the last one is not an option
### (does not start with "-"), treat it as the output file name
if (( $# > 0 )) && [[ "${!#}" != -* ]]; then
    typescript="${!#}"
fi
### Invoke /usr/bin/script with all arguments passed to the wrapper script
/usr/bin/script "$@"
### Once script has finished, call dos2unix to strip the carriage returns
### (this does not remove the ^[ escape sequences)
dos2unix "$typescript"

Write these lines into a file called script, make it executable, and put it in a directory that comes before /usr/bin in the $PATH variable (in my case that's ~/bin). If you now type type script, it should point to your wrapper script, not to /usr/bin/script. When you now type script, it will invoke the wrapper script, which in turn calls /usr/bin/script and then dos2unix.

pfnuesel answered Oct 06 '22 23:10


Why, when using the script command on linux, does the generated script produce ^M at the end of the line? This makes me think the linux script command is generating CRLF eol characters rather than just LF eol characters.

Because that's what the terminal driver inserts:

It is the terminal driver in canonical mode, "inside" the pseudo-terminal, that is expanding NLs … into CRNL pairs.

Roger Pate answered Oct 07 '22 01:10


I have found that files can be written with different line endings: Unix, DOS, and Mac. You can change how vim interprets them by re-opening the file with a specific file format. I found that ^M is displayed as a line break when the file is opened in the mac format, so run the following in vim. This is not really a search and replace, but some systems need a file to keep a particular line ending, so changing it may not be wise.

:e ++ff=mac

You will then be able to see how this file is supposed to look.

For other file formats it's similar:

:e ++ff=dos
:e ++ff=unix
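
Before picking a file format, it can help to check from the shell whether the file really has CR line endings. The following is a small sketch; the function name count_cr_lines and the sample file contents are made up for illustration.

```shell
#!/usr/bin/env bash
# count_cr_lines: hypothetical helper name (not a standard tool).
# Prints the number of lines in the given file that end with a carriage return.
count_cr_lines() {
    # Build a literal CR byte so the grep pattern stays portable
    cr=$(printf '\r')
    grep -c "${cr}\$" "$1"
}

# Made-up sample file: two lines end in CRLF, one ends in LF only
sample=$(mktemp)
printf 'one\r\ntwo\nthree\r\n' > "$sample"
count_cr_lines "$sample"
rm -f "$sample"
```

If every line of the file is counted, it uses DOS-style line endings throughout.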
Timbinous answered Oct 07 '22 01:10


The command

sed '/[[:cntrl:]].../s///g ; /[[:cntrl:]]/s///g' typescript > typescript2

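works fine for me. It first deletes each control character together with the three characters that follow it, then deletes any control characters that remain.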

Top Maths answered Oct 06 '22 23:10