Sunday, December 16, 2007

Parsing an FTP Log File with Perl

I have been running G6 FTP Server for several years. G6 itself provides some statistics and charts, but even the latest version still does not let you define the statistics you actually want. There are a few plugins, but they do not seem very usable. So I decided that writing my own parser for G6's FTP log file would be quicker than figuring out a fiddly plugin -.-" Use Java? Sure, that would work, but I "recently heard" that Perl is especially fast at text processing, so I wanted to try it out.

A raw log line looks like this:

    07/12/02 18:47:39, 300, 192.168.1.2, clotho, RETR \root\1.jpg from 0 to 734414848 in 00:02:46 at 4320.493 KBytes/s : ok

All I care about is the username (clotho) and the transferred file size: from 0 to 734414848 (from byte 0 to byte 734414848).

So the Perl looks like this:

    open(FILE, $source_file) or die "ERROR";
    while (defined($line = <FILE>)) {
        chomp($line);
        if ($line =~ /(\d+)\/(\d+)\/(\d+) (\d+)\:(\d+)\:(\d+), (\d+), (\d+)\.(\d+)\.(\d+)\.(\d+), (.*?), (.*?) from (\d+) to (\d+) .*/) {
            my $user_name   = $12;   # 12th capture group: username
            my $start_point = $14;   # 14th: first byte transferred
            my $end_point   = $15;   # 15th: last byte transferred
            print $user_name."\t".$start_point."\t".$end_point."\n";
        }
    }

That gives me the fields I need. The next step is to add a hash table with the username as the key and the transferred size as the value. After one pass over the whole file, the hash table holds every username and its total size. Just printing the hash table in a for loop does not give an intuitive view, though, so I also sort by size: the keys are sorted by their values in the hash table and then printed. The final result looks like this:

    plums             0.05MB
    f1                0.25GB
    kami              0.51GB
    likanki           2.81GB
    latte             5.66GB
    cityplayer        6.00GB
    ason              6.06GB
    dmr              11.78GB
    newtype          23.44GB
    lab509           26.64GB
    wjl              41.01GB

So the final source code looks like this:

    my %users;
    my $source_file = $ARGV[0];

    open(FILE, $source_file) or die "ERROR";
    while (defined($line = <FILE>)) {
        chomp($line);
        if ($line =~ /(\d+)\/(\d+)\/(\d+) (\d+)\:(\d+)\:(\d+), (\d+), (\d+)\.(\d+)\.(\d+)\.(\d+), (.*?), (.*?) from (\d+) to (\d+) .*/) {
            my $user_name   = $12;
            my $start_point = $14;
            my $end_point   = $15;
            # print $user_name."\t".$start_point."\t".$end_point."\n";

            # Add this transfer's byte count to the user's running total.
            if (exists $users{$user_name}) {
                $users{$user_name} = $users{$user_name} + ($end_point-$start_point);
            } else {
                $users{$user_name} = ($end_point-$start_point);
            }
        }
    }
    close(FILE);

    # Unsorted dump in MB (output currently disabled).
    foreach $key (keys %users) {
        my $volume = $users{$key}/(1024**2);
        my $formatted = sprintf "%-12s", $key;
        my $formattedv = sprintf "%10.2f", $volume;
        #print $key."\t\t".$volume."\n";
        #print $formatted.$formattedv."MB\n";
    }
    print "\n\n";

    # Sort the usernames by total size, smallest first.
    my @ordered = sort { $users{$a} <=> $users{$b} } keys %users;
    for (@ordered) {
        my $volume = $users{$_}/(1024**3);
        my $unit = "GB";
        # Totals of at most 0.001 GB are shown in MB instead.
        if ($volume <= 0.001) {
            $volume = $users{$_}/(1024**2);
            $unit = "MB";
        }
        my $formattedf = sprintf "%10.2f", $volume;
        my $formattedName = sprintf "%-12s", $_;
        print $formattedName.$formattedf.$unit."\n";
    }
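For comparison, here is a minimal sketch of the same extract, sum, and sort steps using named captures instead of counting up to $12, $14 and $15. It assumes Perl 5.10 or later and assumes every line follows the layout of the sample line above. The names $LINE_RE, parse_line, format_size and @samples are placeholders of mine, not part of the script above, and only the first sample line comes from the real log; the others are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed layout, taken from the sample line in the post:
#   date time, code, ip, user, COMMAND path from START to END in ... : ok
my $LINE_RE = qr{
    ^\S+ \s+ \S+ ,\s*       # date and time
    \d+ ,\s*                # numeric code
    [\d.]+ ,\s*             # client IP address
    (?<user>[^,]+) ,\s*     # username
    .*? \s from \s (?<start>\d+) \s to \s (?<end>\d+) \s
}x;

# Return (username, bytes transferred) for a matching line,
# or an empty list when the line does not match.
sub parse_line {
    my ($line) = @_;
    return unless $line =~ $LINE_RE;
    return ($+{user}, $+{end} - $+{start});
}

# Format a byte count in GB, switching to MB for totals of at most
# 0.001 GB (the same threshold the script in the post uses).
sub format_size {
    my ($bytes) = @_;
    my $gb = $bytes / 1024**3;
    return sprintf('%10.2fGB', $gb) if $gb > 0.001;
    return sprintf('%10.2fMB', $bytes / 1024**2);
}

# Sample input: the first line is from the post, the rest are invented.
my @samples = (
    '07/12/02 18:47:39, 300, 192.168.1.2, clotho, RETR \root\1.jpg from 0 to 734414848 in 00:02:46 at 4320.493 KBytes/s : ok',
    '07/12/02 19:02:10, 300, 192.168.1.2, clotho, RETR \root\2.jpg from 100 to 1100 in 00:00:01 at 1.000 KBytes/s : ok',
    '07/12/02 19:05:30, 300, 192.168.1.3, plums, RETR \root\3.txt from 0 to 52429 in 00:00:01 at 51.200 KBytes/s : ok',
    'a line that does not match the pattern',
);

my %users;
for my $line (@samples) {
    my ($user, $bytes) = parse_line($line) or next;   # skip non-matching lines
    $users{$user} += $bytes;   # a missing key counts as 0, so no exists check
}

# Print the users sorted by total size, smallest first.
for my $user (sort { $users{$a} <=> $users{$b} } keys %users) {
    printf "%-12s%s\n", $user, format_size($users{$user});
}
```

Because `+=` on a hash entry that does not exist yet starts from zero without a warning, the if/else around the running total is not strictly needed. With a real log, the loop over @samples would become a `while (my $line = <$fh>)` loop over a file handle.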
