Comments: 
You've just described the knapsack problem: it's NP-complete.
A greedy algorithm (copy as many as will fit, in order, then move on to the next) works fairly well, but it's not optimal. Computing an optimal solution takes an amount of time that increases very rapidly with the number of files, and is impractical in most cases.
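The greedy approach described above (copy files in order, start a new disc when the next one won't fit) could be sketched roughly like this. This is just an illustration; the 4482 MB single-layer DVD capacity is my assumption, not something from the thread:

```ruby
# Greedy in-order packing: keep files in their given order and open a
# new disc whenever the next file would overflow the current one.
# Sizes are in MB; capacity is an assumed single-layer DVD figure.
DVD_CAPACITY = 4482

def greedy_pack(sizes, capacity = DVD_CAPACITY)
  discs = [[]]
  used = 0
  sizes.each do |s|
    if used + s > capacity && !discs.last.empty?
      discs << []   # current disc is full; start a new one
      used = 0
    end
    discs.last << s
    used += s
  end
  discs
end
```

Note this never reorders files, which is exactly why it can waste space: a small file later in the list might have filled a gap on an earlier disc.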
You're correct of course. I think it's a bad choice of words on my part. I guess what I meant is 'best as possible'?
Anything with "best" in it is going to imply "optimal". You probably mean something like "doing a reasonably good job of packing".
I'd guess that an optimal solution would require something like "tar -c --multi-volume" (which splits a monster tar archive over multiple physical volumes), though it would require extra effort to extract the files.
Yeah, you're right again! I'm being awfully sloppy, sorry!
You're right, the best (and probably easiest) solution would be to use an archival program that can span volumes. That way you use the full capacity of the media.
From: (Anonymous) 2006-09-08 03:22 pm (UTC)

Actually, this particular problem is called 'minimum bin packing'. In the knapsack problem there's only one DVD, and you want to pack as much stuff into it as possible, possibly by some measure other than size.
While the exact solution of this problem is intractable, the simple "first fit" method, which first sorts the files by decreasing size and then assigns each file to the first disk on which it fits, does provide an approximate solution that uses at most 1.5 m^* + 1 disks, where m^* is the number of disks an optimal solution would use. If the files don't have widely different sizes, with most episodes in a series having the same size and so on, it may be possible to accommodate the preferred ordering by varying the sort.
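The sort-then-first-fit method described above might be sketched like so in Ruby (a rough illustration only; the 4482 MB capacity is my own assumed figure for a single-layer DVD):

```ruby
# First fit with decreasing sizes: sort files largest-first, then place
# each one on the first disc that still has room, opening a new disc
# only when nothing fits. Sizes in MB; capacity is an assumed figure.
DVD_CAPACITY = 4482

def first_fit_decreasing(sizes, capacity = DVD_CAPACITY)
  bins = []  # each bin is [remaining_space, contents]
  sizes.sort.reverse.each do |s|
    bin = bins.find { |remaining, _| remaining >= s }
    if bin
      bin[0] -= s       # shrink the remaining space
      bin[1] << s
    else
      bins << [capacity - s, [s]]
    end
  end
  bins.map { |_, contents| contents }
end
```

Sorting largest-first matters: big files are awkward to place late, while small files slot neatly into leftover gaps.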
There's also an asymptotic polynomial-time approximation scheme for the problem, consisting of the following:
1. Remove all "small" files from the list, where "small" depends on a parameter.
2. Quantise the sizes of the remaining items into a constant (parameter) number of sizes.
3. Solve the resulting restricted bin packing problem, whose complexity does not depend on the input size, only on the parameters.
4. Insert the remaining small items using "first fit".
This algorithm is a bit complex, but it can achieve approximation with performance r m^* + 1 for any rational 1 < r < 2; see e.g. [Ausiello et al.: Complexity and Approximation].

Now, as for actually implementing even the first algorithm: it's too much work for my lazy ass, especially in sh (anything else would feel like cheating), since it would likely involve some very tedious sed'ing of lists as strings. So I'll leave it to someone else.
 tuomov
Thanks for the correction. Yes, first fit seems pretty straightforward. Like you, I'll let somebody else implement it.
(Deleted comment)
Yeah fair enough. I think the problem was perhaps a bit ambitious!
I hadn't realised the problem I suggested was so hairy (I suspected it might be, but I hadn't investigated). It makes me feel better about the totally naive solution I implemented in Ruby (which happened to be the scripting language I was sort of learning at the time):
#!/usr/bin/env ruby
def Toggle(files, total, desc)
  files = files.clone
  if files.size == 0
    printf("%05i", total / 1024)
    print ":" + desc + "\n"
  else
    file = files.pop
    # size = File.stat(file).size
    size = `du -k "#{file}"`       # size in KB
    size = Integer(size[/\d+/])
    Toggle(files, total, desc)                              # subsets without this file
    Toggle(files, total + size, " \"" + file + "\"" + desc) # subsets with it
  end
end
Toggle(ARGV, 0, "")
I run this as "dvdfitter.rb * | sort | less". I then skim down the list to find the highest value that is under the size of a DVD. This last part isn't very elegant, but I couldn't be bothered adding it to the Ruby.
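That last manual step (skimming the sorted list for the biggest total that fits) could be automated with something like the sketch below. It's the same exponential subset enumeration as Toggle, just keeping the best subset instead of printing them all; the capacity figure is an assumption, not from the post:

```ruby
# Enumerate every subset of the given sizes (as bitmasks) and keep the
# one with the largest total that still fits on the disc. Exponential,
# like Toggle above, so only sensible for a handful of files.
def best_fit(sizes, capacity)
  best = []
  (0...(1 << sizes.size)).each do |mask|
    subset = sizes.each_index.select { |i| mask[i] == 1 }.map { |i| sizes[i] }
    total = subset.sum
    best = subset if total <= capacity && total > best.sum
  end
  best
end
```

Hooking this up to the du-derived sizes instead of printing would replace the "sort | less" step entirely.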