(How) can I remove all newlines (\n) using sed?
I tried to remove all newlines in a pipe like this:
On Debian squeeze, this results in:
The trailing newline is not removed.
tr -d '\n', as in How do I remove newlines from a text file?, works just fine but isn’t sed.
6 Answers
Looks like sed will add back the \n if it is present as the last character.
If you want to remove the last \n you need an external utility, or use e.g. awk.
Compare:
(echo foo; echo bar) | sed -e :a -e N -e '$!ba' -e 's/\n/ /g' | hexdump -C
with
(echo foo; echo bar) | tr -d '\n' | hexdump -C
and then with
(echo foo; echo bar) | hexdump -C | sed -e :a -e N -e '$!ba' -e 's/\n/ /g'
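To illustrate the point about needing awk or an external utility for the last newline, here is a minimal sketch. The variable names are made up for the example, and counting bytes with wc -c is just one convenient way to check the result without a hex dump.

```shell
# awk strips each record's newline on input; printf "%s" writes the text
# back without adding one, so no newline survives, not even the last.
awk_bytes=$( (echo foo; echo bar) | awk '{ printf "%s", $0 }' | wc -c )

# tr deletes every newline byte, including the trailing one.
tr_bytes=$( (echo foo; echo bar) | tr -d '\n' | wc -c )

# Both counts are 6: the letters of "foobar" and nothing else.
echo "awk: $awk_bytes bytes, tr: $tr_bytes bytes"
```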
If foo and bar are not expected to contain newlines, then beware that
(echo foo; echo bar)
adds a newline after each echo, whereas
(echo -n foo; echo -n bar)
adds no newline at the end of the output. So you may not need sed to remove newlines at all, even if it did remove the trailing one.
sed fails to remove newline character
I’ve been using sed for quite some time, but here is a quirk I came across that I am not able to resolve.
Let me explain my problem with the actual case.
Scene#1
In the first command, I pipe printf output to xclip so that it gets copied to the clipboard. Now, printf, unlike echo, does not insert a newline at the end by default. So if I paste this content into a terminal, the copied ls command does not run automatically.
In the second, there is a newline at the end, so pasting the clipboard content also runs the command.
This is undesirable for me, so I wanted to remove the newline using sed, but it failed, as explained in the scene below.
Scene#2
The content in the clipboard still contains a newline. When I paste it into a terminal, the command runs automatically.
I also tried removing the carriage return character \r, but nada. It seems I am missing something very crucial/basic here.
How do I remove newlines from a text file?
I have the following data, and I need to put it all into one line.
None of these commands is working perfectly.
Most of them let the data look like this:
19 Answers
Edit:
If none of the commands posted here are working, then you have something other than a newline separating your fields. Possibly you have DOS/Windows line endings in the file (although I would expect the Perl solutions to work even in that case)?
If that doesn’t work then you’re going to have to inspect your file more closely (e.g. in a hex editor) to find out what characters are actually in there that you want to remove.
This page here has a bunch of other methods to remove newlines.
edited to remove feline abuse 🙂
to figure WHAT is the offending character. then use
You can edit the file in vim:
Nerd fact: use ASCII instead.
(Edited cause i didn’t see the friggin’ answer that had same solution, only difference was that mine had ASCII)
Use sed with POSIX classes
This will remove all lines containing only whitespace (spaces & tabs)
sed '/^[[:space:]]*$/d'
Just take whatever you are working with and pipe it to that.
Example:
cat filename | sed '/^[[:space:]]*$/d'
If the data is in file.txt, then:
The '$(' reads the file and gives the contents as a series of words, which echo then echoes with a space between them. The tr command then deletes any spaces:
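A rough sketch of that idea, for illustration only. The temporary file and its contents are made-up stand-ins for file.txt, and $(cat ...) is used here as the portable spelling of reading a file inside a command substitution.

```shell
# Hypothetical sample data standing in for file.txt.
tmpfile=$(mktemp)
printf '1;2;\n3;4;\n5;6;\n' > "$tmpfile"

# The unquoted substitution is split into words on whitespace (newlines
# included); echo rejoins the words with single spaces, tr deletes them.
one_line=$(echo $(cat "$tmpfile") | tr -d ' ')
echo "$one_line"

rm -f "$tmpfile"
```

Note that this also deletes any spaces that were part of the data, so it only suits input where spaces carry no meaning.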
xargs consumes newlines as well (but adds a final trailing newline):
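As a small sketch of the xargs behaviour, assuming made-up sample input and relying on the fact that xargs runs echo when no command is given:

```shell
# xargs with no command runs echo on the collected words, so the lines
# come out space-separated on one line, followed by a single newline.
xargs_out=$(printf '1;2;\n3;4;\n' | xargs)
echo "$xargs_out"

# Removing the separators afterwards leaves only that final newline.
compact=$(printf '1;2;\n3;4;\n' | xargs | tr -d ' ')
echo "$compact"
```

Keep in mind that xargs also interprets quotes and backslashes in its input, so this is only safe for simple data.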
Assuming you only want to keep the digits and the semicolons, the following should do the trick, provided there are no major encoding issues, though it will also remove the very last newline:
You can easily modify the above to include other characters, e.g. if you want to retain decimal points, commas, etc.
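One way to express "keep only these characters" is tr with a complemented set. This is a sketch on made-up sample input, not necessarily the exact command the answer had in mind:

```shell
# -c complements the set and -d deletes, so every byte that is not a
# digit or a semicolon is removed, newlines included.
kept=$(printf '12;34 \r\n56;78\n' | tr -cd '0-9;')
echo "$kept"

# Widening the set keeps more characters, e.g. decimal points and commas.
kept_more=$(printf '1.5;2,5\n3.0;\n' | tr -cd '0-9;.,')
echo "$kept_more"
```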
Using the gedit text editor (3.18.3)
- Click Search
- Click Find and Replace.
- Enter \n\s into Find field
- Leave Replace with blank (nothing)
- Check Regular expression box
- Click the Find button
Note: this doesn’t exactly address the OP’s original, 7-year-old problem, but should help some noob Linux users (like me) who find their way here from the SEs with similar "how do I get my text all on one line" questions.
Remove newline from unix variable
I have a variable whose value is found using sql query.
I want to remove the newline character from that variable since I want to concatenate this variable with another. Below is the code:
4 Answers
If you are using bash, you can use Parameter Expansion:
The following should work in /bin/sh as well:
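A minimal sketch of the portable form, assuming the variable is called dt and holds a made-up date string with one trailing newline. In bash the newline in the pattern could also be written as $'\n'.

```shell
# Hypothetical value standing in for the query result.
dt='2020-09-18
'

# A literal newline inside the pattern works in any POSIX shell;
# % strips one matching suffix from the end of the value.
dt="${dt%
}"
printf '[%s]\n' "$dt"
```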
Sounds like you need "tr", something like:
man tr for detail, as usual
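For illustration, a sketch of the tr approach applied to a variable. The names dt and combined and the sample value are assumptions made for the example.

```shell
# Hypothetical value standing in for the query result.
dt='2020-09-18
'

# Pipe the value through tr to delete every newline, then concatenate.
dt=$(printf '%s' "$dt" | tr -d '\n')
combined="report_${dt}.txt"
echo "$combined"
```

Unlike stripping a suffix, tr -d removes newlines anywhere in the value, not just at the end.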
This works on Linux (bash):
On Linux, or other systems with GNU’s date utility, this also works to get that value for dt (without involving Oracle):
SET PAGES[IZE] Sets the number of lines on each page of output. You can set PAGESIZE to zero to suppress all headings, page breaks, titles, the initial blank line, and other formatting information.
So add a set pagesize 0 to your script to avoid the initial blank line.
For most of my scripts I use the settings in the following piece of code:
site design / logo © 2020 Stack Exchange Inc; user contributions licensed under cc by-sa. rev 2020.9.18.37632
How to remove a newline from a string in Bash
I have the following variable.
How can I remove that first newline?
8 Answers
Under bash, there are some bashisms:
The tr command could be replaced by // bashism:
See Parameter Expansion and QUOTING in bash’s man page:
Further.
As asked by @AlexJordan, this will suppress all specified characters. So what if $COMMAND does contain spaces?
Note: bash needs the extglob option enabled (shopt -s extglob) in order to use the *(...) syntax.
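A bash-only sketch of how this could look when the value has spaces that must be kept inside the text but trimmed around it. The sample value and the CLEANED variable are assumptions made for illustration.

```shell
# Hypothetical value with a space inside and whitespace around the text.
COMMAND=$'   \n   RE BOOT   \r   \n'

shopt -s extglob                     # enables the *( ) pattern syntax

CLEANED=${COMMAND//[$'\t\r\n']}      # delete tabs, CRs and newlines anywhere
CLEANED=${CLEANED%%*( )}             # strip trailing spaces
CLEANED=${CLEANED##*( )}             # strip leading spaces
echo "|$CLEANED|"
```

The space between RE and BOOT survives because only the leading and trailing runs of spaces are matched by the anchored %% and ## expansions.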
will replace the newline (in POSIX/Unix it’s not a carriage return) with a space.
To be honest I would think about switching away from bash to something more sane though. Or avoiding generating this malformed data in the first place.
Hmmm, this seems like it could be a horrible security hole as well, depending on where the data is coming from.
Clean your variable by removing all the carriage returns:
(Note that the control character in this question is a ‘newline’ ( \n ), not a carriage return ( \r ); the latter would have output REBOOT| on a single line.)
Explanation
The pattern is expanded to produce a pattern just as in filename expansion. Parameter is expanded and the longest match of pattern against its value is replaced with string. [...] If string is null, matches of pattern are deleted and the / following pattern may be omitted.
Also uses the $'' ANSI-C quoting construct to specify a newline as $'\n'. Using a newline directly would work as well, though less pretty:
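Both spellings side by side, as a bash sketch with a made-up value (the variable names via_ansi_c and via_literal are only for the example):

```shell
COMMAND=$'\nREBOOT'          # hypothetical value with a leading newline

# ANSI-C quoting: $'\n' is a newline; an empty replacement deletes matches.
via_ansi_c=${COMMAND//$'\n'/}

# Same pattern with a literal newline typed inside the expansion.
via_literal="${COMMAND//
/}"

echo "|$via_ansi_c| |$via_literal|"
```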
Can sed replace new line characters?
Is there an issue with sed and new line character?
I have a file test.txt with the following contents
The following does not work:
sed -r -i 's/\n/,/g' test.txt
I know that I can use tr for this but my question is why it seems not possible with sed.
If this is a side effect of processing the file line by line I would be interested in why this happens. I think grep removes new lines. Does sed do the same?
9 Answers
With GNU sed and provided POSIXLY_CORRECT is not in the environment (for single-line input):
- create a label via :a
- append the current and next line to the pattern space via N
- if we are before the last line, branch to the created label with $!ba ($! means not to do it on the last line, as there should be one final newline).
- finally the substitution replaces every newline with a comma on the pattern space (which is the whole file).
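The steps above can be sketched as follows. This is an illustration on made-up, newline-terminated input, with each step passed as its own -e expression so that it does not depend on how a given sed parses text after a label.

```shell
# :a        define label a
# N         append the next input line to the pattern space
# $!ba      unless this is the last line, branch back to a
# s/\n/,/g  replace every embedded newline with a comma
joined=$(printf 'one\ntwo\nthree\n' | sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/,/g')
echo "$joined"
```

The final newline is untouched because it never enters the pattern space; sed adds it back on output.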
This works with GNU sed:
-z has been included since 4.2.2.
NB. -z changes the delimiter to null characters (\0). If your input does not contain any null characters, the whole input is treated as a single line. This can come with its limitations.
To avoid having the newline of the last line replaced, you can change it back:
(Which is GNU sed syntax again, but it doesn’t matter as the whole thing is GNU only)
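A sketch of the -z approach including the "change it back" step, assuming GNU sed 4.2.2 or later and made-up input without NUL bytes:

```shell
# Requires GNU sed (-z is not available in other implementations).
# With -z the whole input is one record, so s/\n/,/g sees every newline;
# s/,$/\n/ then turns the comma at the very end back into a newline.
joined=$(printf 'one\ntwo\nthree\n' | sed -z 's/\n/,/g;s/,$/\n/')
echo "$joined"
```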
sed always removes the trailing newline just before populating pattern space, and then appends one before writing out the results of its script. A newline can be had in pattern space by various means, but never if it is not the result of an edit. This is important: newlines in sed’s pattern space always reflect a change, and never occur in the input stream. Newlines are the only delimiter a sed user can count on with unknown input.
If you want to replace all newlines with commas and your file is not very large, then you can do:
That appends every input line to hold space (except the first, which instead overwrites hold space) following a newline character. It then deletes every line but the last ($!) from output. On the last line, hold and pattern spaces are exchanged and all newline characters are translated with y/// to commas.
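A sketch of a script matching that description, run on made-up input:

```shell
# H    append the current line to hold space after a newline
# 1h   on line 1, overwrite hold space instead (avoids a leading newline)
# $!d  on every line but the last, delete and start the next cycle
# x    on the last line, swap hold space into pattern space
# y    translate every newline to a comma
joined=$(printf 'one\ntwo\nthree\n' | sed 'H;1h;$!d;x;y/\n/,/')
echo "$joined"
```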
For large files this sort of thing is bound to cause problems: sed buffers on line boundaries, and that buffer can easily be overflowed by actions of this sort.
From Oracle’s web site:
The sed utility works by sequentially reading a file, line by line, into memory. It then performs all actions specified for the line and places the line back in memory to dump to the terminal with the requested changes made. After all actions have taken place to this one line, it reads the next line of the file and repeats the process until it is finished with the file.
Basically this means that because sed is reading line by line the newline character is not matched.
or, in a portable version (without ; concatenating after jump mark labels)
An explanation of how that works is provided on that page.
There are actually two questions in your post:
Can sed replace new line characters?
Yes. Absolutely yes. Any sed could do:
That will transform any newline (that got into the pattern space) into commas.
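The qualifier "that got into the pattern space" matters. A small sketch on made-up two-line input shows the difference:

```shell
# Line by line, no newline ever reaches the pattern space, so y/\n/,/
# has nothing to translate and the lines pass through unchanged.
unchanged=$(printf 'a\nb\n' | sed 'y/\n/,/')

# N pulls the next line in behind an embedded newline; y then translates it.
pair=$(printf 'a\nb\n' | sed 'N;y/\n/,/')

printf '%s\n---\n%s\n' "$unchanged" "$pair"
```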
Is there an issue with sed and new line character?
Yes, there are several issues with the newline character in sed:
- By default, sed will place in the pattern space a valid line. Some seds have limits on the length of a line and on accepting NUL bytes. A line ends on a newline. So, as soon as a newline is found on the input, the input gets split, then sed removes the newline and places what is left in the pattern space. So, most of the time, no newline gets into the pattern space.
- Only by an edit of the pattern space is a newline added/inserted/edited in.
- Almost always, a newline is appended to each consecutive output of sed.
- The GNU sed is able to avoid printing a trailing newline if the last line of the input is missing the newline.
- Only GNU sed is able to use another delimiter instead of newline (namely NUL bytes with the -z option).
All the above points make it difficult to «convert newlines» to anything.
And, if newlines are replaced with another text character, then sed must contain the whole text file in memory (whatever process was used to get there).
A couple of solutions that capture the whole file in memory in sed are:
A couple of fast solutions that don’t use much memory are:
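Two likely candidates, shown here as an assumption since the original commands are not reproduced: tr and paste both work on a stream and do not need to hold the whole file.

```shell
# tr streams bytes, so memory use is constant; the final newline also
# becomes a comma.
with_tr=$(printf 'one\ntwo\nthree\n' | tr '\n' ',')

# paste -s joins all lines of its input with the -d delimiter and ends
# the output with a single newline.
with_paste=$(printf 'one\ntwo\nthree\n' | paste -s -d ',' -)

echo "$with_tr"
echo "$with_paste"
```

The visible difference is the trailing comma that tr leaves behind, which paste avoids.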
1 From sed solutions: For every line, H adds the line to the hold space (except that the first line completely replaces the hold space, which avoids a leading newline), then the pattern space is erased with $!d (except on the last line). On that last line, which was not erased, the rest of the commands get executed: first, x retrieves all the lines captured in the hold space, and then y/\n/,/ replaces every newline with a comma.