[NTLUG:Discuss] disk usage by file age

Hank Ivy hankivy at hot.rr.com
Fri Jul 2 00:06:51 CDT 2010


On Thursday 01 July 2010 11:51 pm Hank Ivy wrote:
> On Wednesday 30 June 2010 10:16 am Michael Barnes wrote:
> > Any simple ideas for this?
> 
> Sometimes getting exactly what you want is not simple.  It can be better
> to have a flexible framework that is easy to change until it does what you
> want.  Attached is a Perl script.  It is not short, but it scans many
> directories, does not follow symbolic links, never forks, runs fast, and
> is easy to adapt to other purposes.

Sorry, the list server stripped the attachment.  The script is included
inline below.

> > I'm trying to figure out how to gather some disk data.  What I need is
> > to find the disk usage by subfolder, but only by files older than a
> > certain time.  I can get a list of files with
> > find ./ftp/news -mtime +180
> > and get a list of everything older than six months.
> > I can do
> > du -sh ./ftp/news/*
> > and find the usage by subfolder.
> > What I need is to combine the two, so I get something like
> >
> > 15M     ftp/news/4-Dallas
> > 40M     ftp/news/4-Washington
> > 560M    ftp/news/House
> > 1.1G    ftp/news/Senate
> > 717M    ftp/news/White House
> > 69M     ftp/news/YIR 2009
> > 65M     ftp/news/stuff
> >
> > knowing that the size is for files over six months old.

#!/usr/bin/perl

# file : find_large_old_folders.pl
# Author : Hank Ivy (Henry Berry Ivy, Jr.)
# Date : July 1, 2010

use strict ;
use warnings ;

my $usage = "Usage: $0 [-A|--Age nn] [-S|--Size nn] Directories ...\n" .
        "Age is the minimum age in days of files to count.\n" .
        "   Default is 60 days.\n" .
        "Size is the minimum total size in megabytes of a folder to report.\n" .
        "   Default is 50 Megabytes.\n" ;

# Return the total size in bytes of all files under $direct that are at
# least $Age_minimum days old.  Recurses into subdirectories.
sub TotalDirSize {
        my ($direct, $Age_minimum) = @_ ;
        my $TotalSoFar = 0 ;
        my $DirSep = "/" ;
        unless (-r $direct) {
                # The directory is not readable, so skip it.
                return 0 ;
        }
        opendir(my $SubDir, $direct) or return 0 ;
        my @FilesInDir = readdir $SubDir ;
        closedir $SubDir ;
        foreach my $LocalFile (@FilesInDir) {
                # Skip "." (the directory itself) and ".." (its parent).
                next if ($LocalFile eq ".") ;
                next if ($LocalFile eq "..") ;
                my $FullPath = $direct . $DirSep . $LocalFile ;
                # Do not follow symbolic links.  They take little space, but
                # they can lead to other file systems, other partitions, or
                # back to this directory.
                next if (-l $FullPath) ;
                # "_" reuses the lstat data gathered by the -l test above.
                if (-d _) {
                        $TotalSoFar += &TotalDirSize($FullPath, $Age_minimum) ;
                } else {
                        # -M is the age in days since the last modification.
                        if ((int(-M _)) >= $Age_minimum) {
                                $TotalSoFar += -s _ ;
                        }
                }
        }
        return $TotalSoFar ;
}

my %FolderSize = () ;

my @FoldersSearch = () ;

# Parse Parameters - Collect Folders to Search, and conditions.

# Initialize option flag value, and parameters.
my ($OptionFlag) = "" ;                        
my ($Age_minimum) = 60 ; # days old.           
my ($Size_minimum) = 50 ; # Megabytes.         
my $parm ;                                     

foreach $parm (@ARGV) {
        if ($OptionFlag) {
                # The previous argument was an option flag; this is its value.
                if ($OptionFlag eq "-A" or $OptionFlag eq "--Age") {
                        $Age_minimum = $parm ;
                } elsif ($OptionFlag eq "-S" or $OptionFlag eq "--Size") {
                        $Size_minimum = $parm ;
                }
                # Add processing of other parameters here.
                $OptionFlag = "" ;
        } elsif ($parm =~ m/^(?:-A|--Age)(\S*)$/) {
                # The value is either attached (-A30) or the next argument.
                if ($1 ne "") {
                        $Age_minimum = $1 ;
                } else {
                        $OptionFlag = $parm ;
                }
        } elsif ($parm =~ m/^(?:-S|--Size)(\S*)$/) {
                # The value is either attached (-S50) or the next argument.
                if ($1 ne "") {
                        $Size_minimum = $1 ;
                } else {
                        $OptionFlag = $parm ;
                }
        } else {
                # Save folder values.
                push @FoldersSearch , $parm ;
        }
}

if ($OptionFlag) {
        warn "Option flag $OptionFlag given without a parameter.\n" ;
        die $usage ;                                                 
}                                                                    

unless ($Age_minimum =~ m/^\d+$/) {
        # $Age_minimum is not all numeric. It is invalid.
        warn "Age parameter value $Age_minimum is invalid.\n" ;
        die $usage ;
}
# $Age_minimum is all numeric, and valid.

unless ($Size_minimum =~ m/^\d+$/) {
        # $Size_minimum is not all numeric. It is invalid.
        warn "Size parameter value $Size_minimum is invalid.\n" ;
        die $usage ;
}
# $Size_minimum is all numeric, and valid.

my $direct ;
foreach $direct (@FoldersSearch) {
        # Each argument must be a directory and must be readable.
        unless (-d $direct and -r _) {
                warn "Parameter $direct is either not a directory or not readable.\n" ;
                die $usage ;
        }
}

foreach $direct (@FoldersSearch) {
        $FolderSize{$direct} = &TotalDirSize($direct, $Age_minimum) ;
}

foreach $direct (@FoldersSearch) {
        my $DSize = $FolderSize{$direct} ;
        # Convert bytes to megabytes (1 megabyte = 1,000,000 bytes).
        my $MegSize = int ( $DSize / 1000000) ;
        if ($MegSize >= $Size_minimum ) {
                print "$MegSize $direct\n" ;
        }
}
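
For the one-line-per-subfolder layout in the original question, here is a
minimal sketch that uses the core File::Find module in place of the
hand-written recursion above.  It is an alternative, not a replacement for
the script.  The sub name old_usage_by_subdir and the 180-day threshold in
the example loop are made up for illustration.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper, not part of the script above: for each immediate
# subdirectory of $parent, total the bytes held by regular files whose
# modification age is at least $min_age_days.
sub old_usage_by_subdir {
    my ($parent, $min_age_days) = @_;
    my %bytes;
    opendir(my $dh, $parent) or die "Cannot open $parent: $!\n";
    my @names = grep { $_ ne '.' and $_ ne '..' } readdir $dh;
    closedir $dh;
    foreach my $name (@names) {
        my $dir = "$parent/$name";
        next if -l $dir;        # skip symbolic links
        next unless -d $dir;    # skip plain files at the top level
        my $total = 0;
        find({
            no_chdir => 1,      # $_ holds the full path in wanted()
            wanted   => sub {
                return if -l $_;        # do not count symbolic links
                return unless -f _;     # regular files only
                $total += (-s _) if (-M _) >= $min_age_days;
            },
        }, $dir);
        $bytes{$dir} = $total;
    }
    return %bytes;
}

# Example loop: report megabytes per subfolder for each parent directory
# named on the command line, counting files at least 180 days old.
foreach my $parent (@ARGV) {
    my %usage = old_usage_by_subdir($parent, 180);
    foreach my $dir (sort keys %usage) {
        printf "%dM\t%s\n", int($usage{$dir} / 1_000_000), $dir;
    }
}
```

Saved under any name and run with ./ftp/news as its argument, this should
print one line per subfolder.  File::Find does not follow symbolic links
unless asked to, so it keeps that property of the longer script.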


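The hand-written option loop in the script could also be handed to the
core Getopt::Long module.  The following is only a sketch of that idea:
the sub name parse_args is hypothetical, and the defaults of 60 days and
50 megabytes are copied from the script.  The "bundling" setting is what
lets an attached value such as -A30 work alongside "-A 30".

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Accept bundled short options such as -A30 as well as "-A 30".
Getopt::Long::Configure('bundling');

# Hypothetical helper: parse an argument list and return the minimum age,
# the minimum size, and the remaining arguments (the directories).
sub parse_args {
    my @args = @_;
    my ($age, $size) = (60, 50);    # defaults: 60 days, 50 megabytes
    GetOptionsFromArray(\@args,
        'Age|A=i'  => \$age,
        'Size|S=i' => \$size,
    ) or die "Usage: $0 [-A|--Age nn] [-S|--Size nn] Directories ...\n";
    # Whatever is left in @args after option processing is a directory.
    return ($age, $size, @args);
}

my ($age, $size, @dirs) = parse_args(@ARGV);
print "age=$age size=$size dirs=@dirs\n";
```

The "=i" suffix makes Getopt::Long reject non-integer values itself, so
the separate numeric checks become optional with this approach.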

-- 
Hank Ivy

GPG Fingerprint:
1A0F E1CB 0160 0069 7C19 4B00 911C 92E8 F8B0 4C7C


