Questions regarding this article should be directed to the author at walter@ccnet.com.
This installment of our Vi/Ex tutorial series is a diversion
from the subjects I promised at the end of the previous part --
the change is my fault, and yet it is necessary. When I blithely
suggested last time that the R command is just like the familiar r
command, except for a few differences I mentioned, I was leading
you astray.
There are several differences that can cause problems in
certain uses unless you understand those differences. And you
won't really comprehend the greatest of those differences until
you know about metacharacters in insert mode.
But as an encouragement to follow all this, consider that almost
all of what I say here about the R command also is valid with all
the other commands that put you into text insertion mode:
a A i I o O c s :a :i et cetera.
More to R than to r
The r command replaces whatever character is presently under the
cursor, so there must be some character under the cursor for it to
replace -- otherwise it just gives you an error beep. Not so with R.
You can give the R command on an empty line; whatever you type
after that, up to the next escape character, will take the place of
that empty line just as though you had typed past the end of an
existing line after giving an R command. (I was going to say ``just
as though you had given an a command'', but I'm now very leery of
making comparisons that are incomplete without paragraphs of
explanations.) You can even start entering text into a brand-new
file via the R command.
This property can be useful in various situations; I only have
space to mention one. At times I want to type new characters to
replace blank spaces in a place where some of the lines are empty.
The empty lines hold no blanks; no characters at all. But I do not
have to look at each line before I start typing on it, to see
whether I should use an R or an a command, because R will work in
either case.
The R command is more forgiving of your typing errors, too.
Whatever character you type after an r is final. If you
accidentally typed the wrong character, you can only put back what
was there by typing a u command, if the mistake was the last
editing command you typed, or put in the replacement you had in
mind by returning the cursor to the spot and running another, more
careful, r command.
But if you mistype during an R command, you can backspace over the
error with the backspace key. Then you can type in the character
(or characters; you can back up multiple spaces by repeating the
backspace key) you should have typed. And if you simply typed too
far, you'll be glad to know that backspacing doesn't just remove
the incorrect characters, it restores the characters that were
there, either right away or as soon as you hit the escape key. You
can even backspace over everything you've typed during this R
command before you type escape, because the editor does not object
to a replacement string length of zero.
One caveat here, though, lest my clarification turn out to need
a clarification of its own. With either of these commands it is
possible to break a line, just by typing the return key as a
replacement character, and with the R command this linebreaking can
be done either while actually replacing characters or when typing
on beyond the end of the existing line. With almost all versions of
the editor, it is not possible to backspace over an inserted
linebreak, even while you are still in R insertion mode.
The most important difference, though, is the handling of
metacharacters. Yes, text insertion utilizes metacharacters too,
quite apart from the ones that the replacement patterns in
:substitute commands use. The r command recognizes hardly any of
these metacharacters, and quoting those in as literal characters is
very simple. The R command, though, recognizes almost all of them,
and quoting characters in with R is rather complicated.
The phrase ``quoting in'' is standard terminology, but it is
rather misleading in the editor. Unlike Unix shells, the editor
does not use any of the ASCII quotation marks: ` ' " (backquote,
single and double quote) to quote characters into a file. Instead,
it uses the backslash (``\'') and control-V (``^V''); the latter is
what you send when you press the V key while holding the CONTROL or
CTRL key down. In either case, you quote a character in by typing the
quoting character just prior to the character you want to quote
in. So if @ is your line kill character, and you want to put
that character in the text you are typing in, you would have to
type either \@ or ^V@ to get it there. And if you want several
consecutive characters quoted in, you must quote each of them
individually. That is, if you want to put @@@ into a line, you
must type either ^V@^V@^V@ or \@\@\@ to put that string there.
But \ and ^V are not always interchangeable. In many cases
either will work; but sometimes you must choose the right one.
Which one to use depends both on what character you want to quote in
and whether you're using the r or R command.
One obvious use for quoting is to insert a character that normally
erases part or all of what you've just typed in. The ASCII backspace
character, control-H, must be quoted in, and so must your own line-kill
character (@ in the example above) and your own erase character if it
is not control-H. With the r command you quote in any of these with
a backslash; when using R you may quote any of these in using
either backslash or control-V.
A pause here, to answer a question that might be in the minds of people who know a little about Unix internals. Ordinarily it is the asynchronous serial terminal line (or TTY) driver that recognizes the erase and line-kill characters and edits the input line accordingly, without including these characters in the final result. How, then, can one enter these same input-line characters into the edit buffer if they don't get past the TTY driver? Because Vi/Ex places the TTY driver into a special ``raw'' mode that ignores the line-editing characters, passing them on to the editor instead. Otherwise you would not be able to quote these characters in. Also, the editor is set up to discover your erase and line-kill characters by querying your personal environment, and then interprets these characters as the line driver would have. A nifty feature -- but unfortunately, the editor has no way to let the user turn this feature off.
A little-known technique that can require quoting operates when you
are doing a type-text-in insertion using a screen-mode command,
including the R command. If the last such insertion you did
inserted no more than 128 characters, and if the first character in
your new insertion is @, the editor will immediately use the text
from your last insertion as the whole text for your new insertion,
discarding the @ character, and end the insertion right there by
putting you back in command mode. That is, if you've just finished
inserting the word ``loop'' on one line, and then go to another
place and type R@, it will have the same effect as if you had typed
Rloop followed by an escape.
This should be used more often than it is, because it is much
more flexible than typing a dot (.) to repeat the last command.
With the @ technique you can perform other commands between text
insertions, and if you are doing changes you can change various
phrases to one new phrase -- not possible with . repeats. (But
since this technique is so little known, it may have been removed
from your version of the editor without anyone hearing about it.
So test it before you try to use it.)
So when you do need to start inserted or replaced text with an @ character, type \@ or ^V@ to put it there. But, you may well ask, what happens when @ is also your line-kill character? Does a nonquoted @ then repeat the last inserted text or just kill the line? The answer is that the line-kill function prevails. In this case a nonquoted @ simply erases what you've typed so far on the first line of your new insert or replace -- namely, itself. So typing an @ there is a no-op. In short, a quoted @ in this case is itself, a nonquoted @ kills the line, and there is no way to use the repeat-inserted-text function when @ is your line-kill character.
At times you may want to use the beautify option to the set
command. This tells the editor to throw away most, but not all,
control characters you may try to type in -- the exceptions usually
are the tab (^I), newline (^J), and form feed (^L) -- in order to
keep you from inadvertently putting in invisible control characters
that will be hard to detect later. This option is normally off, but
you can type :se bf to turn it on.
But even when you want most control characters thrown out, there
will be occasions when one must go in. This is not possible using
an r command. The usual r technique of backslashing will usually
bite back in this case -- the editor will interpret the control
character by acting on its control meaning rather than inserting it
in the text. Using R, though, you can insert most control
characters by preceding each with ^V.
Even this may not be enough. Some systems are set up so that
when certain control characters are typed in, even though preceded by
^V, the system acts on them as control characters before the editor
ever sees them. To get around this problem, many implementations of
the editor, especially older ones, interpret an ordinary character
typed right after a ^V as a control character. That is, on these
systems, typing ^VF or ^Vf while running an R command inserts a ^F
in the file, just as typing ^V^F would on systems that don't have
this problem.
Here are the latest questions, and my solutions, from inquiring readers with problems you might face someday.
Hi Walter,
In moving files from Windows machines to UNIX, some of our users do binary transfers which result in ^M characters in the ASCII files. Usually they occur at the ends of individual lines and I do:
:1,$ s/^M//g
where ^M is generated by ^V^M and everything works fine to delete these characters. I now have a new problem: I found a file with ^M characters embedded in it, but the file is one long line. I need to replace them with Vi's line-end character to split this long line into multiple lines. But I can't because it's the same as pressing the ENTER or RETURN key in the middle of the substitution command. How can I replace the superfluous carriage return? We have several files like this and it's causing problems viewing them with Web browsers.
I tried substituting a newline with the character code and the octal code unsuccessfully, and tried the ^M as a last unsuccessful resort.
Things aren't as complicated as you make them seem, Tommy. First of all, Web browsers generally ignore carriage-return and/or linefeed characters while formatting text for display. If your browser is choking on these all-one-line files, it is probably because the lines are too long for your browser, or for some other cause not related to embedded ^M characters.
Now, as you have deduced, the difference between Microsoft and Unix text file formats is that Microsoft operating systems seem to favor carriage-return followed by linefeed (^J) as the line separator, while Unix systems use linefeed alone.
As you've discovered, you cannot directly quote a ^J into any
editor command. And yet, you put a ^J into your file every time
you hit return during text entry, although the return key on most
terminals sends a ^M character. That's the trick; the substitute
command regards a ^M in the replacement pattern as a signal to
insert a ^J and discard the ^M. So you only need to get that ^M
into the replacement pattern by typing in your command line like
this:
:1,$ s/^V^M/^V^M/g
You just have to overlook the appearance of futility in this command line, as though it were going to replace each ^M with itself. That first ^M is in the outgoing pattern, so it matches a real ^M. The second, in the replacement pattern, calls for a ^J as I explained above.
However, these all-one-line files may be too long for the Vi
editor, which cannot handle lines much more than a thousand
characters long in most common implementations, with shorter
limits in older versions. The editor will truncate lines that
exceed the limit, with only a minimal and rather cryptic warning.
In such cases, use the tr utility to replace the ^M characters
(which is a very straightforward job with that tool), before you
bring the file into the Vi editor.
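A minimal sketch of that tr step, assuming a POSIX shell. The
function name cr_to_lf and the file names in the usage note are made
up for illustration; they are not part of any standard tool.

```shell
# cr_to_lf: hypothetical helper that reads standard input and turns
# every carriage return (^M) into a linefeed (^J), so that an
# all-one-line file comes out as ordinary Unix lines.
cr_to_lf() {
    tr '\r' '\n'
}
```

You might run it as cr_to_lf < oneline.txt > fixed.txt and then edit
fixed.txt. For the more common case where each ^M sits just ahead of
a real linefeed, tr -d '\r' deletes the ^M characters instead of
converting them.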
You may wonder then, how you would use the substitute command to
put ^M characters into your file. The answer is to backslash the
quoted-in ^M. To add a ^M at the end of every line in your file,
so as to conform it to Microsoft practice, type this command:
:%s/$/\^V^M
(Note that it is important to type the \ first, then the ^V, followed by the ^M.) The ^V puts the immediately-following ^M into the command line, and the backslash tells the command that this ^M is to be considered a real one, not a metacharacter for ^J. In fact, these are the general principles for quoting characters almost everywhere except in typing-in-text mode: ^V gets a character into the command line, and a backslash ahead of it tells the command to take that character literally.
Finally, you can replace linefeed characters with something
else via line mode commands, but you must use two commands and
only one of them is the substitute command. Suppose you need to
change a short file's format from a
number of lines to the format Tommy encountered: a single line with
^M separators. That is, replace each ^J (except the last) with a
^M. (This had better be a fairly short file, because even newer
versions of the editor can't handle any lines longer than 1024
characters.)
Start by using a command similar to the one above to put ^M at the end of every line except the last. (Since these ^M characters are to separate lines, there's no use for one at the end of the last line.) Then use this command:
:%j!
to join all the lines into one. The ``j'' in this command line is
the shortest abbreviation for the line mode join command, and the
``!'' switch at the end of it tells the command not to insert blank
space between the lines it joins.
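For comparison, the same reshaping can be sketched outside the
editor with awk. This is only an illustration under the assumption
of a POSIX shell and awk; the function name lines_to_cr is invented
here.

```shell
# lines_to_cr: hypothetical helper. Writes a ^M ahead of every input
# line except the first, which leaves each line but the last followed
# by one ^M. The original linefeeds are dropped, and the single
# output line is ended with one linefeed.
lines_to_cr() {
    awk 'NR > 1 { printf "\r" } { printf "%s", $0 } END { printf "\n" }'
}
```

Like the two editor commands, this puts no ^M after the last line.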
Hi,
I have a question (rather simple, really) but no one seems able to know the answer. Not even the help desk (with all the Vi gurus :) ). I'm hoping you can help me with it.
I have a text file of unknown length. Each line of the file can be very short or very long (from 3 characters up to 1000 characters).
Within this file, I'm trying to locate (search) the nth occurrence of a word.
Here are a few things I've tried:
- The simple solution would be (from visual command mode): a /foobar command followed by the n command typed n-1 times. (But what if n is large, say 200 or greater?)
- :1,$ global /^/ /foobar/ (and its variations). Nothing useful...
Can you suggest a better way?
Yes, although it involves a slightly tricky procedure. Consider the following command string:
:$|/\<foobar\>/s//QQQ
The first command in this string takes us to the last line of our file and -- incidentally -- displays it on our screen, which is not important here. The second command searches forward for a line containing ``foobar'' as a word, and starting from the last line the search must wrap around and find the first instance in the file. Then that second command replaces the word ``foobar'' with ``QQQ'', leaving the cursor at the point where the substitution was made.
Now let us make an addition to the start of this command string:
:1,199g/^/$|/\<foobar\>/s//QQQ
This revised string repeats the procedure 199 times; each time the
first instance of ``foobar'' remaining in the file is the one replaced.
So we end up sitting on the ``QQQ'' string that replaced the 199th
instance of ``foobar''; simply typing n will bring us to the 200th
instance. And if we move off that 200th instance for
any reason, going to the top of the file and searching for ``foobar''
will bring us right back to it, because the first 199 are now gone.
When we are finished with that 200th ``foobar'', this command:
:%s/QQQ/foobar/g
will change those 199 ``QQQ'' strings back to ``foobar''. Of course, if there is any chance that ``QQQ'' might occur in the document as itself, we can choose another dummy string.
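If you would rather not alter the file at all, the line holding the
nth occurrence can also be looked up from the shell before you start
the editor. The following is a rough sketch, not part of the editor
technique above. It assumes a grep that supports the -o and -w
options (GNU and BSD grep do, but POSIX does not require them), and
the function name nth_word_line is made up for this example.

```shell
# nth_word_line: hypothetical helper. Prints the number of the line
# holding the Nth whole-word occurrence of WORD in FILE, or nothing
# if there are fewer than N occurrences. WORD is treated as a basic
# regular expression.
# usage: nth_word_line WORD N FILE
nth_word_line() {
    # -o prints each match on a line of its own, -n prefixes it with
    # its line number, and -w restricts matches to whole words, much
    # as \< and \> do in the editor.
    grep -n -o -w -e "$1" -- "$3" | sed -n "${2}p" | cut -d: -f1
}
```

The number it prints can then be typed as a line address in the
editor; a search for the word from the start of that line finishes
the job.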
And while I'm at it, I've got another question.
How do I delete all lines beginning with a certain string, say, !@#$ (or foobar for that matter). And a related question: how to delete lines containing the word foobar (anywhere within the line)?
The first command line following will solve your first problem, and the second will solve your second:
:g/^foobar/d
:g/\<foobar\>/d
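One wrinkle with the reader's other example string, !@#$: a $ at the
end of a search pattern is the end-of-line anchor, so it has to be
backslashed to match a literal dollar sign. The sketch below shows
the same two deletions done as shell filters, assuming a POSIX sed
and a grep with the -w option; the function names are invented for
illustration.

```shell
# delete_prefixed: drop lines that begin with the literal string !@#$
# The final $ is escaped so it is not read as the end-of-line anchor.
delete_prefixed() {
    sed '/^!@#\$/d'
}

# delete_word: drop lines containing foobar as a whole word.
delete_word() {
    grep -v -w -e 'foobar'
}
```

Both read standard input and write the surviving lines to standard
output. Inside the editor the same escaping applies, so the first
deletion would be typed as :g/^!@#\$/d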
To make room to answer two readers' questions, I had to skip
presenting three great Vi tools -- autoindent, abbreviate, and
map! -- and the effect their metacharacters have in text-insertion
mode. They'll be first up in the next part of this tutorial.
More answers to reader questions are coming, too. I have queries to answer about the semicolon address separator and about yanking within macros -- and if a few more significant problems arrive here, I'll try to fit them in, too.
And this time you won't have to wait and wait for the next tutorial part. As I write this paragraph, I'm already in the middle of creating the next part, so you should see it within two weeks after this part appears online.