Sunday, December 16, 2007
Parsing an FTP Log File with Perl
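I have been running G6 FTP Server for several years. G6 ships with some built-in statistics and charts, but even the latest version does not let you define your own statistics. There are plug-ins, but they do not seem very usable.
So I decided to write my own parser for the G6 FTP log file, which is quicker than figuring out those awkward plug-ins -.-"
Java? That would work, but I "recently heard" that Perl is especially fast at text processing, so I wanted to try it out.
A raw log line looks like this: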
07/12/02 18:47:39, 300, 192.168.1.2, clotho, RETR \root\1.jpg from 0 to 734414848 in 00:02:46 at 4320.493 KBytes/s : ok
All I need is the username (clotho) and the transferred file size, given as "from 0 to 734414848" (from byte 0 to byte 734414848).
So the Perl looks like this:
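use strict;
use warnings;

my $source_file = $ARGV[0];
open(my $fh, '<', $source_file) or die "ERROR: cannot open $source_file: $!";
while (defined(my $line = <$fh>))
{
    chomp($line);
    # Captures: $1-$6 date and time, $7 code, $8-$11 IP octets,
    # $12 username, $13 command and path, $14 start byte, $15 end byte
    if ($line =~ /(\d+)\/(\d+)\/(\d+) (\d+)\:(\d+)\:(\d+), (\d+), (\d+)\.(\d+)\.(\d+)\.(\d+), (.*?), (.*?) from (\d+) to (\d+) .*/) {
        my $user_name   = $12;
        my $start_point = $14;
        my $end_point   = $15;
        print $user_name."\t".$start_point."\t".$end_point."\n";
    }
}
close($fh);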
This extracts the fields we need.
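As a quick sanity check, the extraction can be tried on the sample line alone. This is only a sketch: parse_line is a hypothetical helper that is not part of the original script, and its shorter pattern (written with the /x flag so it can carry comments) is an alternative to the 15-group regex, assuming the comma-separated layout of the sample line holds for every transfer entry.

```perl
use strict;
use warnings;

# Sample line copied from the log excerpt in this post.
my $sample = '07/12/02 18:47:39, 300, 192.168.1.2, clotho, '
           . 'RETR \root\1.jpg from 0 to 734414848 in 00:02:46 at 4320.493 KBytes/s : ok';

# parse_line is a hypothetical helper, not part of the original script.
# It returns (username, start byte, end byte), or an empty list
# when the line does not look like a transfer entry.
sub parse_line {
    my ($line) = @_;
    my @fields = $line =~ m{
        ^ \S+ \s \S+ , \s      # date and time
        \d+ , \s               # numeric code
        [\d.]+ , \s            # client IP address
        ([^,]+) , \s           # username
        .*? \s from \s (\d+) \s to \s (\d+) \s
    }x;
    return @fields;
}

my ($user, $start, $end) = parse_line($sample);
print join("\t", $user, $start, $end), "\n";
```

Capturing only the three values that are used keeps the variable numbering short ($1 to $3 instead of $12, $14 and $15).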
Next I need a hash table keyed by username with the transferred size as the value. After one pass over the whole file, the hash holds every username and its total size.
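The accumulation step on its own might look like the sketch below. The records in @records are made-up values for illustration, standing in for the fields extracted from each log line.

```perl
use strict;
use warnings;

# Hypothetical parsed records: [username, start byte, end byte].
my @records = (
    [ 'clotho', 0,    1000 ],
    [ 'kami',   0,    500  ],
    [ 'clotho', 1000, 4000 ],
);

my %users;
for my $rec (@records) {
    my ($name, $start, $end) = @$rec;
    # += creates the key on first use, so no exists() check is needed.
    $users{$name} += $end - $start;
}

# Print the per-user totals in alphabetical order of username.
print "$_\t$users{$_}\n" for sort keys %users;
```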
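But simply printing the hash contents with a for loop does not give an intuitive picture, so I added sorting by size: the keys are sorted by their hash values and then printed, which finally gives this result: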
plums 0.05MB
f1 0.25GB
kami 0.51GB
likanki 2.81GB
latte 5.66GB
cityplayer 6.00GB
ason 6.06GB
dmr 11.78GB
newtype 23.44GB
lab509 26.64GB
wjl 41.01GB
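The sorting and unit selection behind a listing like this can be sketched in isolation. The byte totals are invented for illustration, and format_size is a hypothetical helper that assumes the same rule as the script: report GB, and switch to MB when the GB figure is 0.001 or less.

```perl
use strict;
use warnings;

# Hypothetical byte totals per user, for illustration only.
my %users = (
    alice => 50_000,
    bob   => 3 * 1024**3,
    carol => 512 * 1024**2,
);

# format_size is a hypothetical helper: it reports GB, falling back
# to MB when the GB figure would be 0.001 or less.
sub format_size {
    my ($bytes) = @_;
    my $gb = $bytes / 1024**3;
    return $gb <= 0.001
        ? sprintf("%10.2fMB", $bytes / 1024**2)
        : sprintf("%10.2fGB", $gb);
}

# Sort usernames by their totals, smallest first.
my @ordered = sort { $users{$a} <=> $users{$b} } keys %users;
printf "%-12s%s\n", $_, format_size($users{$_}) for @ordered;
```

The comparison uses <=> rather than cmp because the values are numbers; cmp would order them as strings.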
So the final source code looks like this:
use strict;
use warnings;

my %users;
my $source_file = $ARGV[0];
open(my $fh, '<', $source_file) or die "ERROR: cannot open $source_file: $!";
while (defined(my $line = <$fh>))
{
    chomp($line);
    # Captures: $12 username, $14 start byte, $15 end byte
    if ($line =~ /(\d+)\/(\d+)\/(\d+) (\d+)\:(\d+)\:(\d+), (\d+), (\d+)\.(\d+)\.(\d+)\.(\d+), (.*?), (.*?) from (\d+) to (\d+) .*/)
    {
        my $user_name   = $12;
        my $start_point = $14;
        my $end_point   = $15;
        # print $user_name."\t".$start_point."\t".$end_point."\n";
        # Add this transfer's byte count to the user's running total.
        $users{$user_name} += $end_point - $start_point;
    }
}
close($fh);

# Debug pass: unsorted totals in MB (output left commented out).
foreach my $key (keys %users)
{
    my $volume = $users{$key}/(1024**2);
    my $formatted = sprintf "%-12s", $key;
    my $formattedv = sprintf "%10.2f", $volume;
    #print $key."\t\t".$volume."\n";
    #print $formatted.$formattedv."MB\n";
}
print "\n\n";

# Sort usernames by total bytes transferred, smallest first.
my @ordered = sort { $users{$a} <=> $users{$b} } keys %users;
for (@ordered)
{
    my $volume = $users{$_}/(1024**3);
    my $unit = "GB";
    # Totals of 0.001 GB or less are shown in MB instead.
    if ($volume <= 0.001) {
        $volume = $users{$_}/(1024**2);
        $unit = "MB";
    }
    my $formattedf = sprintf "%10.2f", $volume;
    my $formattedName = sprintf "%-12s", $_;
    print $formattedName.$formattedf.$unit."\n";
}