2

I have a directory with too many files in it.

I want to compress the first 5,000 files in that directory into file.tar.gz, then files 5001-10000, and so on.

How can I do this?

3
  • Are the first five thousand named differently than the next?
    – wfoster
    Commented Dec 20, 2010 at 5:38
  • Yes, the names are very different.
    – Captain
    Commented Dec 20, 2010 at 5:49
  • Have you tried using a regular expression to match the first 5k? Then maybe a simple Perl or Python script could do the legwork.
    – wfoster
    Commented Dec 20, 2010 at 17:40

3 Answers

0

Use ls to generate the list of names and head and tail to filter them. Here's a one-liner that does it in a loop. You'll need to know the number of files in the directory (ls | wc -l will tell you).

for start in $(seq 1 5000 NUMBER_OF_FILES) ; do echo $start ; ls | tail -n +$start | head -n 5000 | tar -czvf ../ARCHIVE_FILE_$start.tar.gz -T - ; done

Replace the bits in capitals with what you want.
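If you'd rather not count the files by hand, the same idea can be wrapped in a small function that works out the total itself and also picks up a final batch of fewer than 5000 files. This is only a sketch: batch_tar_ls is a made-up name, the two parameters are my own choice, and it assumes filenames without newlines and a tar that accepts -T - (GNU tar and bsdtar both do).

```shell
# Sketch only: batch_tar_ls is a hypothetical helper, not a standard command.
# Run it from inside the directory to archive.
#   $1 = batch size, $2 = archive name prefix
# Archives are written to the parent directory so they never show up in ls.
batch_tar_ls() {
    size=$1
    prefix=$2
    total=$(ls | wc -l | tr -d ' ')      # number of entries in the directory
    start=1
    while [ "$start" -le "$total" ]; do
        # take entries start .. start+size-1 from the sorted listing
        ls | tail -n +"$start" | head -n "$size" |
            tar -czf "../${prefix}_${start}.tar.gz" -T -
        start=$((start + size))
    done
}
```

Calling batch_tar_ls 5000 ARCHIVE_FILE from inside the directory should leave ARCHIVE_FILE_1.tar.gz, ARCHIVE_FILE_5001.tar.gz and so on in the parent directory.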

1
  • Useless Use Of ls Award goes to...
    – Hello71
    Commented May 28, 2011 at 2:10
0

This script appends the files one at a time to numbered archives, starting a new archive every 5000 files. Replace ARCHIVE_NAME and '5000' with your own values.

$ COUNT_MOD=0; for i in *; do tar -r -f "ARCHIVE_NAME$((COUNT_MOD / 5000)).tar" "$i"; COUNT_MOD=$((COUNT_MOD + 1)) ; done

This script is not optimized, so there are a few rules:

  1. ARCHIVE_NAME# must not exist when the script starts, so if anything fails, do an 'rm ARCHIVE_NAME*' before rerunning.
  2. The script counts a directory as 1 entry, but 'tar' does not. Tar descends into the directory and adds all of its files recursively, so you might end up with more than 5000 files in an archive.
  3. Compressed archives cannot be updated, so I left out '-z', sorry :-)
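For what it's worth, rules 2 and 3 can be worked around with a slightly longer version: skip anything that isn't a regular file, and gzip the archives once the appending is finished. Treat this as a rough sketch rather than a tested tool. append_in_batches is a name I made up, and writing the archives to the parent directory is my own assumption to keep them out of the '*' glob.

```shell
# Sketch only: append_in_batches is a hypothetical helper, not a standard command.
# Run it from inside the directory to archive.
#   $1 = batch size, $2 = archive name prefix (archives go to the parent directory)
append_in_batches() {
    size=$1
    prefix=$2
    count=0
    for f in *; do
        [ -f "$f" ] || continue          # skip directories: one loop step = one archive entry
        # "./" keeps names that start with a dash from being read as options
        tar -rf "../${prefix}$((count / size)).tar" "./$f"
        count=$((count + 1))
    done
    # tar cannot append to a compressed archive, so compress only at the end
    if [ "$count" -gt 0 ]; then
        gzip ../"${prefix}"*.tar
    fi
}
```

Rule 1 still applies: append_in_batches 5000 ARCHIVE_NAME expects that no ../ARCHIVE_NAME*.tar files exist beforehand.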
1
  • You could have used for i in * instead.
    – Wuffers
    Commented May 28, 2011 at 2:48
0

You could build a set of list files, each containing 5000 filenames, and pass them to tar with the -T option. Something like this might work:

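# write the filenames, 5000 per file, to tarlistaa, tarlistab, ...
ls -1 | split -l 5000 - tarlist
count=0
for f in tarlist*
do
    # one compressed archive per list file
    tar -czf "save.$count.tar.gz" -T "$f"
    count=$((count + 1))
done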
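One thing to watch with the above is that the tarlist* files and the save.*.tar.gz archives land in the same directory, so a second run would sweep them up too. A possible variation keeps the list files in a temporary directory and writes the archives one level up. This is just a sketch under those assumptions. split_and_tar is a hypothetical name, and it relies on mktemp -d being available.

```shell
# Sketch only: split_and_tar is a hypothetical helper, not a standard command.
# Run it from inside the directory to archive.
#   $1 = batch size, $2 = archive name prefix (archives go to the parent directory)
split_and_tar() {
    size=$1
    prefix=$2
    listdir=$(mktemp -d)                 # list files live outside the archived directory
    ls | split -l "$size" - "$listdir/tarlist."
    count=0
    for list in "$listdir"/tarlist.*; do
        [ -s "$list" ] || continue       # nothing to do if the directory was empty
        tar -czf "../${prefix}.${count}.tar.gz" -T "$list"
        count=$((count + 1))
    done
    rm -r "$listdir"                     # clean up the temporary list files
}
```

For example, split_and_tar 5000 save would produce ../save.0.tar.gz, ../save.1.tar.gz and so on.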
