
Linux count lines in file

count lines in a file

I’m sure there are many ways to do this: how can I count the number of lines in a text file?

6 Answers

The standard way is with wc, which takes arguments to specify what it should count (bytes, characters, words, etc.); -l is for lines:
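A minimal sketch of that basic usage. The wrapper name count_lines and the temporary demo file are assumptions made for illustration only:

```shell
#!/bin/sh
# count_lines is a hypothetical wrapper; it simply calls wc -l on one file.
count_lines() {
    wc -l "$1"
}

# Build a small three-line demo file in a temporary location.
demo=$(mktemp)
printf 'one\ntwo\nthree\n' > "$demo"

# Prints the line count followed by the file name.
count_lines "$demo"
```

When given a file name, wc prints that name after the count.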

As Michael said, wc -l is the way to go. But, just in case you inexplicably have bash, perl, or awk but not wc, here are a few more solutions:

Bash-only

Perl Solutions

and the far less readable:

Awk Solution

Steven D forgot GNU sed:

Also, if you want the count without outputting the filename and you’re using wc :
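One common way to get the bare number is to feed the file to wc on standard input, so that wc has no file name to print. A small sketch, in which the function name line_count and the demo file are hypothetical:

```shell
#!/bin/sh
# line_count is a hypothetical helper: redirecting the file into wc
# means wc never sees a file name, so only the number is printed.
line_count() {
    wc -l < "$1"
}

demo=$(mktemp)
printf 'alpha\nbeta\n' > "$demo"
line_count "$demo"
```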

Just for the heck of it:

A word of warning when using wc -l: because it works by counting \n characters, the count will be one lower than expected if the last line of your file does not end in a newline (hence the old convention of leaving a newline at the end of your file).

Since I can never be sure if any given file follows the convention of ending the last line with a newline or not, I recommend using any of these alternate commands which will include the last line in the count regardless of newline or not.
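The original commands are not shown here, so the following is only a sketch of two alternatives that are commonly used for this purpose (awk's record counter and sed's last-line number), compared against wc -l on a demo file whose last line is deliberately left unterminated:

```shell
#!/bin/sh
# Demo file with three lines; the last one has no trailing newline.
f=$(mktemp)
printf 'a\nb\nc' > "$f"

wc_count=$(wc -l < "$f")                   # counts newline characters only
awk_count=$(awk 'END { print NR }' "$f")   # counts records, including the unterminated one
sed_count=$(sed -n '$=' "$f")              # prints the number of the last line

echo "wc: $wc_count  awk: $awk_count  sed: $sed_count"
```

On this file wc -l reports 2, while awk and sed both report 3.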

In case you only have bash and absolutely no external tools available, you could also do the following:

Explanation: the loop reads standard input line by line (using read; since we do nothing with the input anyway, no variable is provided to store it in) and increases the variable count each time. Due to the redirection (< file.txt after done), standard input for the loop comes from file.txt.
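A sketch of such a loop using only shell built-ins. It departs from the description above in two labeled ways: it names a throwaway variable (line) so that it also runs under a plain POSIX sh, and it adds an extra test so that a final line without a trailing newline is still counted. The function name count_lines_read is hypothetical:

```shell
#!/bin/sh
# count_lines_read is a hypothetical helper that uses only shell built-ins.
count_lines_read() {
    count=0
    # read returns non-zero at end of file; the -n test catches a last
    # line that has content but no terminating newline.
    while IFS= read -r line || [ -n "$line" ]; do
        count=$((count + 1))
    done < "$1"
    echo "$count"
}

demo=$(mktemp)
printf 'a\nb\nc' > "$demo"
count_lines_read "$demo"
```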

You can always use the command grep as follows:

It will count all the actual rows of file.txt, whether or not its last row ends with an LF character.


site design / logo © 2020 Stack Exchange Inc; user contributions licensed under cc by-sa. rev 2020.9.18.37632


How to count lines in a document?

I have lines like these, and I want to know how many lines I actually have.

Is there a way to count them all using linux commands?

24 Answers

This will output the number of lines in <filename>:

Or, to omit the <filename> from the result, use wc -l < <filename>:

You can also pipe data to wc:

To count all lines use:

To filter and count only lines with pattern use:

Or use -v to invert match:

See the grep man page for the -e, -i, and -x options.
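The three grep variants above could look something like this. The demo file contents and the pattern "error" are made up for illustration, and the empty pattern is just one way to match every line:

```shell
#!/bin/sh
# Hypothetical demo log with four lines, two of which contain "error".
log=$(mktemp)
printf 'error: disk\nok\nerror: net\nok\n' > "$log"

all=$(grep -c '' "$log")             # empty pattern matches every line
matching=$(grep -c 'error' "$log")   # lines containing the pattern
others=$(grep -vc 'error' "$log")    # -v inverts the match

echo "all=$all matching=$matching others=$others"
```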

There are many ways. Using wc is one.

sed -n '$=' file (GNU sed)

The tool wc is the "word count" utility in UNIX and UNIX-like operating systems, but you can also use it to count lines in a file by adding the -l option.

wc -l foo will count the number of lines in foo. You can also pipe output from a program like this: ls -l | wc -l, which will tell you how many files are in the current directory (plus one, for the "total" line that ls -l prints first).

If you want to check the total number of lines across all the files in a directory, you can use find and wc:
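One possible way to combine the two, shown as a sketch rather than the answer's original command: find concatenates every regular file under a directory and a single wc -l counts the combined stream. The function name total_lines and the demo directory layout are assumptions:

```shell
#!/bin/sh
# total_lines is a hypothetical helper: concatenate every regular file
# under the given directory and count the newlines in the result.
total_lines() {
    find "$1" -type f -exec cat {} + | wc -l
}

# Demo tree: two files, one of them in a subdirectory.
dir=$(mktemp -d)
mkdir "$dir/sub"
printf '1\n2\n3\n' > "$dir/a.txt"
printf '4\n5\n' > "$dir/sub/b.txt"

total_lines "$dir"
```

Because the files are concatenated first, the result is a single total without per-file counts.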

If all you want is the number of lines (and not the number of lines and the stupid file name coming back):

As previously mentioned these also work (but are inferior for other reasons):

Write each FILE to standard output, with line numbers added. With no FILE, or when FILE is -, read standard input.

I’ve been using this:

I prefer it over the accepted answer because it does not print the filename, and you don’t have to use awk to fix that. Accepted answer:

But I think the best one is GGB667’s answer:

I will probably be using that from now on. It’s slightly shorter than my way. I am putting up my old way of doing it in case anyone prefers it. The output is the same with those two methods.

wc -l does not count lines.

Yes, this answer may be a bit late to the party, but I haven’t found anyone document a more robust solution in the answers yet.

Contrary to popular belief, POSIX does not require files to end with a newline character at all. Yes, the definition of a POSIX 3.206 Line is as follows:

A sequence of zero or more non-<newline> characters plus a terminating <newline> character.

However, what many people are not aware of is that POSIX also defines POSIX 3.195 Incomplete Line as:

A sequence of one or more non-<newline> characters at the end of the file.

Hence, files without a trailing LF are perfectly POSIX-compliant.

If you choose not to support both EOF types, your program is not POSIX-compliant.

As an example, let’s have a look at the following file.

No matter the EOF, I’m sure you would agree that there are two lines. You figured that out by looking at how many lines have been started, not by looking at how many lines have been terminated. In other words, as per POSIX, these two files both have the same number of lines:

The man page is relatively clear about wc counting newlines, with a newline just being a 0x0a character:

Hence, wc doesn’t even attempt to count what you might call a "line". Using wc to count lines can very well lead to miscounts, depending on the EOF of your input file.

POSIX-compliant solution

You can use grep to count lines just as in the example above. This solution is both more robust and precise, and it supports all the different flavors of what a line in your file could be:
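The exact command from the answer is not shown here. A sketch of the idea, assuming an empty pattern as the way to match every started line, compares two hypothetical two-line files that differ only in their final newline:

```shell
#!/bin/sh
# count_started_lines is a hypothetical helper: an empty grep pattern
# matches every line, whether or not it is terminated by a newline.
count_started_lines() {
    grep -c '' "$1"
}

with_nl=$(mktemp)
without_nl=$(mktemp)
printf 'first\nsecond\n' > "$with_nl"     # last line terminated
printf 'first\nsecond' > "$without_nl"    # last line incomplete

echo "grep: $(count_started_lines "$with_nl") and $(count_started_lines "$without_nl")"
echo "wc:   $(wc -l < "$with_nl") and $(wc -l < "$without_nl")"
```

grep reports 2 for both files, whereas wc -l reports 2 for the first and 1 for the second.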


Count lines in large files

I commonly work with text files of 20 GB in size, and I find myself counting the number of lines in a given file very often.

The way I do it now is just cat fname | wc -l, and it takes very long. Is there any solution that’d be much faster?

I work in a high performance cluster with Hadoop installed. I was wondering if a map reduce approach could help.

I’d like the solution to be as simple as one line run, like the wc -l solution, but not sure how feasible it is.

13 Answers

Try: sed -n '$=' filename

Also cat is unnecessary: wc -l filename is enough in your present way.

Your limiting factor is the I/O speed of your storage device, so switching between simple newline-counting or pattern-counting programs won’t help: any difference in execution speed between those programs is likely to be dwarfed by the much slower disk or other storage you are reading from.

But if you have the same file copied across disks/devices, or the file is distributed among those disks, you can certainly perform the operation in parallel. I don’t know specifically about Hadoop, but assuming you can read a 10 GB file from 4 different locations, you can run 4 different line-counting processes, each one on one part of the file, and sum their results:

Notice the & at the end of each command line, so that all of them run in parallel. dd works like cat here, but lets us specify how many bytes to read (count * bs bytes) and how many to skip at the beginning of the input (skip * bs bytes). It works in blocks, hence the need to specify bs as the block size. In this example, I’ve partitioned the 10 GB file into 4 equal chunks of 4 KB * 655360 = 2684354560 bytes = 2.5 GB, one given to each job; you may want to set up a script to do it for you based on the size of the file and the number of parallel jobs you will run. You also need to sum the results of the executions, which I haven’t done for lack of shell scripting ability.
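A rough sketch of how the chunked dd approach described above might be scripted, including the summing step. The function name parallel_count, the default of 4 jobs, and the 4096-byte block size are assumptions for illustration, and whether it is actually faster depends entirely on the storage underneath:

```shell
#!/bin/sh
# parallel_count is a hypothetical helper: it counts the newlines in FILE
# using JOBS background dd readers, one per block-aligned chunk.
parallel_count() {
    file=$1
    jobs=${2:-4}
    bs=4096
    size=$(wc -c < "$file")
    blocks=$(( (size + bs - 1) / bs ))          # total blocks, rounded up
    per_job=$(( (blocks + jobs - 1) / jobs ))   # blocks per job, rounded up
    [ "$per_job" -gt 0 ] || per_job=1
    tmpdir=$(mktemp -d)
    i=0
    while [ "$i" -lt "$jobs" ]; do
        # Each job reads per_job blocks starting at its own offset.
        dd if="$file" bs="$bs" skip=$(( i * per_job )) count="$per_job" 2>/dev/null \
            | wc -l > "$tmpdir/part.$i" &
        i=$(( i + 1 ))
    done
    wait                                        # let every background job finish
    total=0
    for part in "$tmpdir"/part.*; do
        total=$(( total + $(cat "$part") ))
    done
    rm -rf "$tmpdir"
    echo "$total"
}

# Demo: a 10000-line file counted with 4 jobs.
big=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 10000; i++) print i }' > "$big"
parallel_count "$big" 4
```

Splitting on block boundaries is safe because newline counts are additive: a line that straddles two chunks contributes its single newline to exactly one of them.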

If your filesystem is smart enough to split a big file among many devices, like a RAID or a distributed filesystem, and automatically parallelizes I/O requests that can be parallelized, you can do such a split, running many parallel jobs but using the same file path, and you may still see some speed gain.

EDIT: Another idea that occurred to me: if the lines inside the file all have the same size, you can get the exact number of lines by dividing the size of the file by the size of a line, both in bytes. You can do it almost instantaneously in a single job. If you know the mean line size and don’t need the exact line count, only an estimate, you can do the same operation and get a satisfactory result much faster than with the exact count.
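A sketch of that estimate, assuming the first lines of the file are representative of the rest. The function name estimate_lines and the default sample of 1000 lines are hypothetical choices:

```shell
#!/bin/sh
# estimate_lines is a hypothetical helper: it measures the mean line
# length over a sample taken from the start of the file and scales that
# up to the total file size.
estimate_lines() {
    file=$1
    sample=${2:-1000}
    size=$(wc -c < "$file")
    sample_bytes=$(head -n "$sample" "$file" | wc -c)
    sample_lines=$(head -n "$sample" "$file" | wc -l)
    # No complete line in the sample: nothing to base an estimate on.
    [ "$sample_lines" -gt 0 ] || { echo 0; return; }
    # size / (sample_bytes / sample_lines), reordered to stay in integers.
    echo $(( size * sample_lines / sample_bytes ))
}

# Demo: 500 fixed-width lines of 10 bytes each (9 characters plus newline).
fixed=$(mktemp)
awk 'BEGIN { for (i = 0; i < 500; i++) print "abcdefghi" }' > "$fixed"
estimate_lines "$fixed" 100
```

With fixed-width lines the estimate is exact; with variable-width lines it is only as good as the sample.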


How to count number of lines in a file on Linux

Sometimes you may need to count the number of lines in a file on the Linux command line or in a shell script. This tutorial will help you count the number of lines in a file on Linux.

Command Syntax:

Replace the file name placeholder with your actual file name, and the command will return the number of lines in the file as output.

Example

The following command counts the number of lines in the /etc/passwd file and prints it on the terminal. We can also use --lines in place of -l as the command-line switch.

You can also count the number of lines in piped output.

Count Total Characters in a File

Use the -m or --chars switch with the wc command to count the number of characters in a file and print it on screen.

Count Total Words in a File

Use the -w or --words switch with the wc command to count the number of words in a file and print it on screen.
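A short sketch of the three switches applied to piped input. The sample text is made up, and the short options are used because the long forms (--lines, --chars, --words) may not be available outside GNU wc:

```shell
#!/bin/sh
# Hypothetical two-line sample fed to wc through a pipe.
sample='hello world\nfoo bar baz\n'

lines=$(printf "$sample" | wc -l)   # number of newline characters
words=$(printf "$sample" | wc -w)   # number of whitespace-separated words
chars=$(printf "$sample" | wc -m)   # number of characters, newlines included

echo "lines=$lines words=$words chars=$chars"
```

For this sample the counts are 2 lines, 5 words, and 24 characters.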

