[NTLUG:Discuss] Is there a disk defragmenter for Linux

Greg Edwards greg at nas-inet.com
Sun Dec 29 11:30:31 CST 2002


Steve Baker wrote:
> Greg Edwards wrote:
> 
>> I would speculate that 99% of the programs you normally use write data 
>> to disk as opposed to update the data already on disk.
> 
> 
> ...and those that do extend files will quite frequently have read
> the file beforehand - in which case it's probably all in RAM cache
> anyway.  Remember Linux uses all of free memory as disk cache.
> 
> In that case, I suppose shifting files to extend them wouldn't be all
> that costly...although I'm a little surprised to hear that's what it
> does.
> 

Let me back up just a little bit.  I may have misstated or misled with 
the 1M + 1 block statement, or at least clouded the waters.

Remember that disk technology has physical issues that cannot be 
ignored.  Because of the physical location of the read/write heads and 
the geometry of a disk platter, only one track can be read or written 
per rotation of the disk.  For performance reasons, any read or write 
is done as an entire track.  It makes no sense to try to start a read 
or write part way around a track, and you have to keep the head over 
the track for the full rotation anyway, so the cost is the same whether 
you read 1 byte or all of them.  The hardware requirements are also 
simpler when reading the entire track.

I said that a disk track was 8K, which is wrong.  It's been a long 
time since I've looked at the detailed specs, so I checked, and a track 
on a modern disk is generally about 32K, not 8K.  You can get the track 
size by multiplying the sectors per track (usually 63 on modern disks) 
by the bytes per sector (usually 512): 63 * 512 = 32256 bytes, or 
roughly 32K.

So the largest storage size that a *nix OS would use would be 32K and 
the smallest 512 (or the block size you specified when you did the 
format).  So (if my quick math is right) a 1M file would take 32 tracks 
to store on disk, and adding 100 bytes would cause another 512-byte 
block to be allocated on another track, causing the file to exist on 33 
tracks.
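
If you want the arithmetic spelled out, here it is as a quick C 
program.  I'm rounding the 63 * 512 = 32256 byte track to an even 32K, 
the same way I did above:

    #include <stdio.h>

    int main(void)
    {
        /* 63 sectors/track * 512 bytes/sector = 32256 bytes, rounded
           to an even 32K to match the arithmetic above. */
        long track_size = 32 * 1024;
        long file_size  = 1024 * 1024;              /* the 1M file */

        long tracks = (file_size + track_size - 1) / track_size;
        printf("1M file: %ld tracks\n", tracks);    /* prints 32 */

        file_size += 100;                           /* append 100 bytes */
        tracks = (file_size + track_size - 1) / track_size;
        printf("1M + 100 bytes: %ld tracks\n", tracks);  /* prints 33 */
        return 0;
    }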

The amount of data actually written to disk depends on the mode the 
file was opened in.  A file that is opened append-only would only 
write to the last allocation.  As long as a write does not overflow 
the current allocation, that same track continues to be used.  As soon 
as the size overflows the current allocation, the move I talked about 
before occurs.
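
For what it's worth, append-only access is what you get from open(2) 
with the O_APPEND flag.  A minimal sketch (the file name is just an 
example):

    #include <fcntl.h>
    #include <unistd.h>

    int main(void)
    {
        /* O_APPEND positions every write at the current end of the
           file, so only the last allocation is ever touched. */
        int fd = open("logfile.txt", O_WRONLY | O_APPEND | O_CREAT, 0644);
        if (fd < 0)
            return 1;
        write(fd, "one more line\n", 14);
        close(fd);
        return 0;
    }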

In append and read/write modes, a file generally would only have its 
data moved during writes that cause an overflow of the last allocation 
on disk.  In write-only modes, the OS will buffer data until a full 
track is filled (or a sync operation forces it out) to maximize 
performance.
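
You can watch roughly the same buffer-until-full-or-sync behavior from 
user space with stdio.  This is only an illustration, not what the 
kernel literally does; the 32K buffer just mirrors the track figure 
above:

    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        static char buf[32 * 1024];   /* one "track" worth of buffer */
        int i;

        FILE *fp = fopen("out.dat", "w");
        if (!fp)
            return 1;
        setvbuf(fp, buf, _IOFBF, sizeof buf);  /* fully buffered, 32K */

        for (i = 0; i < 100000; i++)
            fputs("some data\n", fp); /* accumulates until buf fills */

        fflush(fp);            /* push stdio's buffer to the kernel... */
        fsync(fileno(fp));     /* ...and ask the kernel to hit the disk */
        fclose(fp);
        return 0;
    }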

When I say that most of your daily programs do complete file writes, 
so that your 1M file is written as a whole to another location, that 
is mostly due to the design of the programs.  Most programs, like word 
processors, spreadsheets, etc., keep a backup of the original until a 
complete write of the new file is successful, so that you don't lose 
your data on a write error.
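
The usual shape of that write-then-swap trick looks something like 
this in C.  The file names are made up, and a real word processor 
would do more error checking (and create the temp file next to the 
original):

    #include <stdio.h>
    #include <unistd.h>

    /* Write the new contents to a temporary file first; the original
       survives untouched until the new copy is safely on disk. */
    int save_document(const char *path, const char *data, size_t len)
    {
        FILE *fp = fopen("document.tmp", "w");
        if (!fp)
            return -1;
        if (fwrite(data, 1, len, fp) != len) {
            fclose(fp);
            return -1;
        }
        fflush(fp);
        fsync(fileno(fp));  /* make sure the new copy is really on disk */
        fclose(fp);

        /* Only now replace the original.  rename(2) is atomic within
           a filesystem, so a crash leaves one complete file behind. */
        return rename("document.tmp", path);
    }

    int main(void)
    {
        const char msg[] = "the whole new document\n";
        return save_document("document.txt", msg, sizeof msg - 1);
    }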

I hope this clears the waters a little, and I hope I'm not getting too 
long winded ;)

-- 
Greg Edwards




