Count files by extension
########################

:date: 2025-07-11 07:23:00 +0300
:category: Linux
:tags: find, sed, sort, uniq

TL;DR:

.. code-block:: bash

    find . -type f | sed 's/.*\///; s/^\.//; /\./! s/$/./; s/.*\.//' | sort | uniq -c | sort -nr

.. TEASER_END

The most popular command you can google appears to be this:

.. code-block:: bash

    find . -type f | sed 's/.*\.//' | sort | uniq -c

`Here `__ for example.

The problem is that it miscounts files without extensions: for those, the last dot is somewhere in the path, so part of the path ends up in the result:

.. code-block:: text

       1 css
       1 csv
       1 git/COMMIT_EDITMSG
       1 git/config
       1 git/description
       1 git/HEAD
       4 gitignore
    ...

So it needs an improvement. The first step is to cut the path and keep only the base name:

.. code-block:: bash

    find . -type f | sed 's/.*\///; s/.*\.//' | sort | uniq -c

Now we no longer have paths:

.. code-block:: text

       1 COMMIT_EDITMSG
       1 config
       1 css
       1 csv
    ...

The next step is to append a dot to names that contain no dot, so that files without an extension are counted together under an empty extension:

.. code-block:: bash

    find . -type f | sed 's/.*\///; /\./! s/$/./; s/.*\.//' | sort | uniq -c

Now we have a useful output:

.. code-block:: text

     263
       1 bash
       7 coffee
       1 css
       1 csv
       1 exe
       1 fish
       4 gitignore
       1 iml
    ...

The next step is to handle hidden files without an extension (like ``.gitignore``), whose leading dot does not mark an extension:

.. code-block:: bash

    find . -type f | sed 's/.*\///; s/^\.//; /\./! s/$/./; s/.*\.//' | sort | uniq -c

Now we have a perfect list sorted by extension. To sort by count instead, append ``sort -n`` for ascending order or ``sort -nr`` for descending:

.. code-block:: bash

    find . -type f | sed 's/.*\///; s/^\.//; /\./! s/$/./; s/.*\.//' | sort | uniq -c | sort -nr

.. code-block:: text

    1058 php
     268
      47 md
      36 json
    ...

The final explanation:

* ``find . -type f`` lists every regular file under the current directory, recursively
* ``sed``:

  * ``'s/.*\///'`` cuts the path, keeping only the base name
  * ``'s/^\.//'`` removes a leading dot, because it does not indicate an extension
  * ``'/\./! s/$/./'`` appends a dot if the name contains no dot at all
  * ``'s/.*\.//'`` removes everything up to the last dot, keeping only the extension

* ``sort`` is needed because ``uniq`` only collapses adjacent identical lines
* ``uniq -c`` collapses duplicate lines and prefixes each with its count
* ``sort -nr`` is optional and sorts the extensions by number of files, descending

It does not handle common double extensions like ``.tar.gz`` (such a file is counted as ``gz``), but it has always been enough for me.
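
A quick way to convince yourself that the ``sed`` expression behaves as described is to feed it a handful of made-up paths instead of real ``find`` output. The sketch below is only an illustration: the function name ``ext_of`` and the sample paths are hypothetical, and the ``sed`` script is the one from this post, unchanged.

.. code-block:: bash

    #!/usr/bin/env bash

    # Hypothetical helper (not part of the one-liner): reads file paths on
    # stdin and prints one extension per line. Files without an extension
    # produce an empty line.
    ext_of() {
        sed 's/.*\///; s/^\.//; /\./! s/$/./; s/.*\.//'
    }

    # Made-up sample paths covering the cases discussed above:
    # no extension, hidden file, regular extension, hidden file with an
    # extension, double extension, and an extension-less file in a dotted dir.
    printf '%s\n' \
        './README' \
        './.gitignore' \
        './src/main.php' \
        './src/.env.local' \
        './archive.tar.gz' \
        './.git/HEAD' |
        ext_of | sort | uniq -c

With these samples, ``README``, ``.gitignore`` and ``.git/HEAD`` should all land in the empty-extension bucket, while ``main.php``, ``.env.local`` and ``archive.tar.gz`` are counted once each as ``php``, ``local`` and ``gz``.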