Count files by extension
TL;DR:
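A sketch of the full pipeline, assembled from the step-by-step explanation at the end of this post. The trailing `sort -nr` is the optional sort by count.

```shell
# Count files under the current directory, grouped by extension.
# Files without an extension (including hidden ones like .gitignore)
# are grouped under the empty extension.
find . -type f \
  | sed -e 's/.*\///' -e 's/^\.//' -e '/\./! s/$/./' -e 's/.*\.//' \
  | sort | uniq -c | sort -nr
```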
The most popular command you can find by googling appears to be this:
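The exact snippet varies between sources. A typical form, assumed here rather than quoted, strips everything up to the last dot and counts what remains:

```shell
# Assumed form of the commonly cited one-liner:
# delete everything up to and including the last dot, then count duplicates.
find . -type f | sed 's/.*\.//' | sort | uniq -c
```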
Here for example.
The problem is that it incorrectly counts files without extensions:
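A small illustration with made-up paths, assuming the command relies on `sed 's/.*\.//'`: when a file has no extension, the last dot in the line is the leading `.` that `find` prints, so most of the path survives as a bogus "extension".

```shell
# Sample paths are hypothetical; they mimic what `find . -type f` prints.
printf '%s\n' ./src/main.c ./src/Makefile ./README | sed 's/.*\.//'
# prints:
# c
# /src/Makefile
# /README
```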
So it needs an improvement. The first step is to cut the path and keep only the base name:
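Assuming the starting point is `find` piped through `sed 's/.*\.//'`, the basename step can be sketched as one more `sed` expression placed in front of the extension step:

```shell
# 's/.*\///' removes everything up to the last slash, leaving the base name;
# 's/.*\.//' then removes the file name, leaving the extension.
find . -type f | sed -e 's/.*\///' -e 's/.*\.//' | sort | uniq -c
```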
Now we no longer have paths:
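For illustration, with hypothetical sample paths in place of real `find` output, the two `sed` expressions behave like this. Extension-less files are still listed under their own names, which the following step addresses.

```shell
# Sample paths are hypothetical; they mimic what `find . -type f` prints.
printf '%s\n' ./src/main.c ./src/Makefile ./README \
  | sed -e 's/.*\///' -e 's/.*\.//'
# prints:
# c
# Makefile
# README
```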
The next step is to append a dot to file names that have no extension, so that the empty extension is counted as its own group:
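A sketch of the pipeline at this stage, using the `'/\./! s/$/./'` expression from the final explanation. It runs only on lines that contain no dot and appends one, so the extension step that follows leaves an empty string.

```shell
# '/\./!' selects lines WITHOUT a dot; 's/$/./' appends a dot to them.
find . -type f \
  | sed -e 's/.*\///' -e '/\./! s/$/./' -e 's/.*\.//' \
  | sort | uniq -c
```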
Now we have a useful output:
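As an illustration with hypothetical sample paths, files without an extension now collapse into a single group with an empty name:

```shell
# Sample paths are hypothetical; they mimic what `find . -type f` prints.
printf '%s\n' ./src/main.c ./src/Makefile ./README \
  | sed -e 's/.*\///' -e '/\./! s/$/./' -e 's/.*\.//' \
  | sort | uniq -c
# Two names (Makefile, README) are counted under the empty extension, one under c.
```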
The next step is a fix for hidden files without an extension (like .gitignore):
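A sketch of the fix, using the `'s/^\.//'` expression from the final explanation: the leading dot is removed right after the basename step, so it is no longer mistaken for an extension separator. The `printf` line is a check on hypothetical sample paths.

```shell
# 's/^\.//' drops the leading dot of hidden files before extensions are extracted.
find . -type f \
  | sed -e 's/.*\///' -e 's/^\.//' -e '/\./! s/$/./' -e 's/.*\.//' \
  | sort | uniq -c

# Check on made-up paths: .gitignore has no extension, the other two do.
printf '%s\n' ./.gitignore ./.config/app.json ./notes.txt \
  | sed -e 's/.*\///' -e 's/^\.//' -e '/\./! s/$/./' -e 's/.*\.//'
# prints an empty line, then:
# json
# txt
```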
Now we have a perfect list sorted by extension. To sort by count, you can add sort -n for ascending order or sort -nr for descending:
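Both orderings can be sketched with a small wrapper. The function name `count_ext` and its optional directory argument are hypothetical and only keep the example short.

```shell
# count_ext is a hypothetical helper name, not part of the original one-liner.
# It counts files by extension under the given directory (default: current directory).
count_ext() {
  find "${1:-.}" -type f \
    | sed -e 's/.*\///' -e 's/^\.//' -e '/\./! s/$/./' -e 's/.*\.//' \
    | sort | uniq -c
}

count_ext . | sort -n    # ascending: rarest extensions first
count_ext . | sort -nr   # descending: most common extensions first
```

The numeric sort works on the count that `uniq -c` puts at the start of each line.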
The final explanation:
- find . -type f: find every file in the current directory and its subdirectories
- sed:
  - 's/.*\///': cut the path, keeping only the basename
  - 's/^\.//': remove the leading dot, because it does not indicate an extension
  - '/\./! s/$/./': add a dot if there is no dot at all
  - 's/.*\.//': remove the file name, keeping only the extension
- sort: because uniq works only on sorted input
- uniq -c: find duplicates and count them
- sort -nr: optional part, sort extensions by number of entries
It does not handle common double extensions like .tar.gz, but it has always been enough for me.