fill a (cd|dvd) up efficiently
by TOC » Sun, 02 Apr 2006 10:57:59 GMT
back in the old (dos) days, I used to have a utility which, given a
group of files to back up, would efficiently fill up a floppy disk.
I would like to put together something similar in bash for burning blank
single sided data DVDs.
so basically say I have a couple hundred files ranging from 10MB to 1000MB.
I would like to do something like the following:
./_find_files_that_will_fit_efficiantly_on_a_dvd.sh .
#the following files occupy 4.61 GB of disk space
#+and should fit on a single sided data dvd.
/home/myname/data/file234234.dat
/home/myname/data/file2342345344.dat
/home/myname/data/file234345234.dat
/home/myname/data/file2233454534234.dat
/home/myname/data/file2342534234.dat
/home/myname/data/file2342543234.dat
hope that makes sense... I will monitor this thread closely... so if I am
unclear, feel free to ask me questions.
thanks in advance.
Re: fill a (cd|dvd) up efficiently
by Chris F.A. Johnson » Sun, 02 Apr 2006 13:35:46 GMT
There is a script in Chapter 13 of my book (see my .sig) that does
that; the scripts are available through the Apress web site.
I posted a precursor to it (which dealt with directories rather
than files) in this group a few years ago (the book version runs
in any POSIX shell):
#! /bin/bash

max=700000  ## capacity in KB (the units of du -k); adjust to taste

do-it()
{
    ## do whatever you need to with the list here
    printf "\n = = = = = = = = = = =\nTOTAL: %d\n" "$total"
    printf "\t%s\n" "$@"
}

## Set variables NL and TAB
eval "$(printf "NL='\n' TAB='\t'")"

## If there's a directory on the command line, cd into it
if [ -d "$1" ]; then
    cd "$1" || exit 5
fi

## store directories and sizes in a sorted array
IFS=$NL
set -f  ## no globbing while the du output is split into array elements
## globbing is switched back on inside the substitution so that */ expands
dirlist=( $(set +f; du -ks */ | sort -rn) )
IFS=" $TAB$NL"  ## double quotes, so that the variables are expanded

unset toobig  ## array for directories that are too big

while [ ${#dirlist[@]} -gt 0 ]
do
    total=0
    unset list
    ndx=0
    dirs=${#dirlist[@]}
    while [ $ndx -lt $dirs ]  ## -lt, not -le: valid indices are 0..dirs-1
    do
        size=$(( ${dirlist[$ndx]%%$TAB*} ))
        if [ $size -gt $max ]; then
            ## If directory is too large, remove from list and store in toobig[]
            toobig[${#toobig[@]}]=${dirlist[$ndx]}
            unset "dirlist[$ndx]"
        elif [ $(( $size + $total )) -gt $max ]; then
            : ## If this directory would exceed the max, go on to the next one
        else
            ## Add directory to list
            list[${#list[@]}]=${dirlist[$ndx]#*$TAB}
            total=$(( $total + $size ))
            unset "dirlist[$ndx]"
        fi
        ndx=$(( $ndx + 1 ))
    done
    do-it "${list[@]}"
    dirlist=( "${dirlist[@]}" )  ## re-pack the array to close the gaps
done

## Report any directories that are too big
big=${#toobig[@]}
[ $big -gt 0 ] && {
    [ $big -eq 1 ] && x="y is" || x="ies are"
    printf "\n * * * * * * *\n$big director$x too big:\n"
    printf "\t%s\n" "${toobig[@]}"
}
--
Chris F.A. Johnson, author | < http://www.**--****.com/ >
Shell Scripting Recipes: | My code in this post, if any,
A Problem-Solution Approach | is released under the
2005, Apress | GNU General Public Licence
Re: fill a (cd|dvd) up efficiently
by Icarus Sparry » Sun, 02 Apr 2006 13:45:02 GMT
Getting an optimal packing is a hard problem. You can get a reasonable
solution by sorting the files into reverse size order and picking the
largest file that will still fit. So something like
ls -s | sort -rn | awk -v space=4000000 '$1 <= space { print $2 ; space -= $1 }'
You can add an END rule to the awk program to print out the remaining
space. You will need to see what units your ls program uses to report the
size of blocks, and set space to the free space on your cd/dvd in the same
units. This program does not account for the filesystem overhead either.
It will have problems if there are funny characters, such as spaces or
newlines in the filenames.
However it works pretty well for a single line.
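A minimal sketch of the same greedy idea that tolerates spaces in filenames and reports the leftover space, along the lines of the END rule mentioned above. It assumes input lines of the form size<TAB>name (for example from du -sk), and the function name pick_files and its capacity argument are made up for illustration. Filenames containing tabs or newlines would still break it, and filesystem overhead is still ignored.

```shell
#!/bin/bash
# pick_files CAPACITY: read "size<TAB>name" lines on stdin and greedily
# select files, largest first, that still fit in CAPACITY.
# (Hypothetical helper; sizes and CAPACITY must use the same units.)
pick_files() {
    sort -rn | awk -F '\t' -v space="$1" '
        # Take the file if it still fits, then shrink the free space.
        $1 + 0 <= space { print $2; space -= $1 }
        # Report what is left over once every line has been seen.
        END { print "remaining: " space }
    '
}
```

It could be driven with something like `du -sk -- * | pick_files 4400000`, where 4400000 is only a placeholder; the real figure depends on the disc and on how much room you want to leave for overhead.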
Re: fill a (cd|dvd) up efficiently
by Kurt Swanson » Sun, 02 Apr 2006 14:20:16 GMT
TOC < XXXX@XXXXX.COM > writes:
http://www.**--****.com/
--
©2006 Kurt Swanson AB
Re: fill a (cd|dvd) up efficiently
by Janis Papanagnou » Mon, 03 Apr 2006 13:15:57 GMT
For the background look for "Greedy Algorithm"
"Martello and Toth (1990) proposed a greedy version of the algorithm
to solve the knapsack problem. Their version sorts the essentials in
decreasing order and then proceeds to insert them into the sack,
starting from the first element (the greatest) until there is no
longer space in the sack for more. If k is the maximum possible number
of essentials that can fit into the sack, the greedy algorithm is
guaranteed to insert at least k/2 of them." [quoted from Wikipedia]
That is the basis of the solutions suggested upthread.
See also: http://www.**--****.com/
Janis
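To make the limits of the greedy approach concrete, here is a small sketch that compares it with an exhaustive search over all subsets. The function names greedy_fill and best_fill and the sample sizes are made up for illustration, and the exhaustive version is only practical for a couple dozen files, since it tries 2^n combinations.

```shell
#!/bin/bash
# greedy_fill CAPACITY SIZE...: total used when repeatedly taking the
# largest size that still fits (sizes are plain integers).
greedy_fill() {
    local space=$1 used=0 s
    shift
    for s in $(printf '%s\n' "$@" | sort -rn); do
        if [ "$s" -le "$space" ]; then
            used=$(( used + s ))
            space=$(( space - s ))
        fi
    done
    echo "$used"
}

# best_fill CAPACITY SIZE...: best total found by trying every subset.
best_fill() {
    local cap=$1 best=0 mask i sum n
    shift
    local sizes=( "$@" )
    n=$#
    for (( mask = 0; mask < (1 << n); mask++ )); do
        sum=0
        # Bit i of mask decides whether sizes[i] is in this subset.
        for (( i = 0; i < n; i++ )); do
            (( mask & (1 << i) )) && sum=$(( sum + sizes[i] ))
        done
        (( sum <= cap && sum > best )) && best=$sum
    done
    echo "$best"
}
```

With a capacity of 10 and sizes 6, 5 and 5, `greedy_fill 10 6 5 5` prints 6 (after taking the 6, neither 5 fits), while `best_fill 10 6 5 5` prints 10 (the two 5s together).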
Re: fill a (cd|dvd) up efficiently
by harryooopotter » Tue, 04 Apr 2006 08:06:17 GMT
TOC wrote...
Python:
http://www.**--****.com/
Perl:
http://www.**--****.com/