shell - Generate Summary report from logs : Perform additions on output of a command ( using AWK / SED or any other way) and formatting output
I am processing several files at a time. Each of them has summary stats. At the end of the process I want to create a summary file that adds up the stats. I know how to dig the stats out of the log files. I want to be able to add the numbers and echo them to a file. Here is what I use to dig out the times:
find . -iname "$srch1*" -exec grep "it took" {} \; -print
Output:
it took 0 hours, 11 minutes , 4 seconds process file.
./filepart000010-20140204-154923.dat.gz.log
it took 0 hours, 11 minutes , 56 seconds process file.
./filepart000007-20140204-154923.dat.gz.log
it took 0 hours, 29 minutes , 54 seconds process file.
./filepart000001-20140204-154923.dat.gz.log
it took 0 hours, 22 minutes , 33 seconds process file.
./filepart000004-20140204-154923.dat.gz.log
it took 0 hours, 59 minutes , 38 seconds process file.
./filepart000000-20140204-154923.dat.gz.log
it took 0 hours, 11 minutes , 50 seconds process file.
./filepart000005-20140204-154923.dat.gz.log
it took 0 hours, 22 minutes , 10 seconds process file.
./filepart000002-20140204-154923.dat.gz.log
it took 0 hours, 10 minutes , 39 seconds process file.
./filepart000008-20140204-154923.dat.gz.log
it took 0 hours, 12 minutes , 27 seconds process file.
./filepart000009-20140204-154923.dat.gz.log
it took 0 hours, 22 minutes , 36 seconds process file.
./filepart000003-20140204-154923.dat.gz.log
it took 0 hours, 11 minutes , 40 seconds process file.
./filepart000006-20140204-154923.dat.gz.log
What I want is a summary like this:
filepart000006-20140204-154923.dat.gz.log 0 hours, 11 minutes , 40 seconds
Then find the longest time among them and output a message:
total time taken = ____________
I am running these in parallel, so the total time taken is that of the longest one.
Then there are calculations like this:
find . -iname "$srch*" -exec grep "processed files" {} \; -print
processed files: 7936635
./filename-20131102-part000000-20140204-153310.dat.gz.log
processed files: 3264805
./filename-20131102-part000001-20140204-153310.dat.gz.log
processed files: 1607547
./filename-20131102-part000008-20140204-153310.dat.gz.log
processed files: 3180478
./filename-20131102-part000003-20140204-153310.dat.gz.log
processed files: 1595497
./filename-20131102-part000007-20140204-153310.dat.gz.log
processed files: 1568532
./filename-20131102-part000009-20140204-153310.dat.gz.log
processed files: 3259884
./filename-20131102-part000002-20140204-153310.dat.gz.log
processed files: 3141542
./filename-20131102-part000004-20140204-153310.dat.gz.log
processed files: 3124221
./filename-20131102-part000005-20140204-153310.dat.gz.log
processed files: 3136845
./filename-20131102-part000006-20140204-153310.dat.gz.log
And if I want only the metrics:
( find . -iname "dl-aster-full-20131102*" -exec grep "processed files" {} \;) | cut -d":" -f2
7936635
3264805
1607547
3180478
1595497
1568532
3259884
3141542
3124221
3136845
Based on the above two, I want to create a summary file:
filename processed files
filename-20131102-part000000-20140204-153310.dat.gz.log 7936635
....
with the numbers above added up:
( 7936635 + 3264805 + 1607547 + 3180478.....etc 1595497 1568532 3259884 3141542 3124221 3136845 )
total files = ____________
So overall, one file:
filename processed files
filename-20131102-part000000-20140204-153310.dat.gz.log 7936635
total files = ____________ ( sum of above )
All that needs to be done is to output each file name and its count on one line (the command above prints them on different lines), and to perform a summation of the numbers that are output.
My question:
-- How can I perform the addition above, using anything? I'd like to avoid Perl, since I am not sure it will be installed everywhere the shell script is run.
-- How can I format the output as above? I already know how to extract the output.
With the sed command below you can get the file name and the grep result on one line; the rest should be easy for you. (This assumes grep returns exactly one line for each file.)
find . -iname "$srch1*" -exec grep "it took" {} \; -print | sed -r 'N;s/(.*)\n(.*)/\2 \1/'
./filepart000010-20140204-154923.dat.gz.log it took 0 hours, 11 minutes , 4 seconds process file.
./filepart000007-20140204-154923.dat.gz.log it took 0 hours, 11 minutes , 56 seconds process file.
./filepart000001-20140204-154923.dat.gz.log it took 0 hours, 29 minutes , 54 seconds process file.
./filepart000004-20140204-154923.dat.gz.log it took 0 hours, 22 minutes , 33 seconds process file.
./filepart000000-20140204-154923.dat.gz.log it took 0 hours, 59 minutes , 38 seconds process file.
./filepart000005-20140204-154923.dat.gz.log it took 0 hours, 11 minutes , 50 seconds process file.
./filepart000002-20140204-154923.dat.gz.log it took 0 hours, 22 minutes , 10 seconds process file.
./filepart000008-20140204-154923.dat.gz.log it took 0 hours, 10 minutes , 39 seconds process file.
./filepart000009-20140204-154923.dat.gz.log it took 0 hours, 12 minutes , 27 seconds process file.
./filepart000003-20140204-154923.dat.gz.log it took 0 hours, 22 minutes , 36 seconds process file.
./filepart000006-20140204-154923.dat.gz.log it took 0 hours, 11 minutes , 40 seconds process file.
find . -iname "$srch*" -exec grep "processed files" {} \; -print | sed -r 'N;s/(.*)\n(.*)/\2 \1/'
./filename-20131102-part000000-20140204-153310.dat.gz.log processed files: 7936635
./filename-20131102-part000001-20140204-153310.dat.gz.log processed files: 3264805
./filename-20131102-part000008-20140204-153310.dat.gz.log processed files: 1607547
./filename-20131102-part000003-20140204-153310.dat.gz.log processed files: 3180478
./filename-20131102-part000007-20140204-153310.dat.gz.log processed files: 1595497
./filename-20131102-part000009-20140204-153310.dat.gz.log processed files: 1568532
./filename-20131102-part000002-20140204-153310.dat.gz.log processed files: 3259884
./filename-20131102-part000004-20140204-153310.dat.gz.log processed files: 3141542
./filename-20131102-part000005-20140204-153310.dat.gz.log processed files: 3124221
./filename-20131102-part000006-20140204-153310.dat.gz.log processed files: 3136845
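The -r option in the sed command above is a GNU extension, so it may not be available on every machine the script runs on. As a hedged alternative, the same pairing can be sketched in awk. The function name join_pairs is hypothetical, and the sketch assumes the input alternates strictly between one grep result line and one file name line, as in the find output shown in the question.

```shell
#!/bin/sh
# join_pairs (hypothetical name): reads alternating lines on stdin,
# grep result first and file name second, and prints "file-name grep-result".
# Assumption: exactly one grep result per file, so lines come in pairs.
join_pairs() {
    awk '
        NR % 2 == 1 { result = $0; next }   # odd line: remember the grep result
        { print $0, result }                # even line: file name, then result
    '
}
```

It would be used in place of the sed stage, for example: find . -iname "$srch1*" -exec grep "it took" {} \; -print | join_pairs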
If you need to calculate the longest time and the total time, use the script below (you should be fine formatting the output yourself).
find . -iname "$srch1*" -exec grep "it took" {} \; -print | sed -r 'N;s/(.*)\n(.*)/\2 \1/' > temp1
awk 'function s2t(x) { h=int(x/3600); m=int((x-h*3600)/60); s=x-h*3600-m*60 }
     { a=$4*3600+$6*60+$9; max=a>max?a:max; t+=a }
     END { s2t(max); print "max is",h,m,s; s2t(t); print "sum " ,h,m,s }' temp1
max is 0 59 38
sum  3 46 27
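The question asks for a message of the form "total time taken = ...", where the total is the longest single run because the files are processed in parallel. The following is a minimal sketch of that formatting step, reading the joined lines from a pipe rather than a temp file. The function name time_summary and the "sum of all times" label are illustrative choices, and the field positions ($4 hours, $6 minutes, $9 seconds) assume each line looks like "file-name it took H hours, M minutes , S seconds ...".

```shell
#!/bin/sh
# time_summary (hypothetical name): reads joined lines on stdin and reports
# the longest time and the sum of all times in hours, minutes and seconds.
# Assumption: hours are in field 4, minutes in field 6, seconds in field 9.
time_summary() {
    awk '
        # Convert a number of seconds into a readable string.
        function fmt(x,   h, m, s) {
            h = int(x / 3600)
            m = int((x % 3600) / 60)
            s = x % 60
            return h " hours, " m " minutes, " s " seconds"
        }
        {
            secs = $4 * 3600 + $6 * 60 + $9
            if (secs > max) max = secs      # track the longest run
            total += secs                   # accumulate the overall sum
        }
        END {
            print "total time taken = " fmt(max)
            print "sum of all times = " fmt(total)
        }
    '
}
```

Appending `| time_summary >> summary.txt` to the find and sed pipeline would write both lines to a summary file; summary.txt is only a placeholder name.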
For the second one:
find . -iname "$srch*" -exec grep "processed files" {} \; -print | sed -r 'N;s/(.*)\n(.*)/\2 \1/' > temp2
awk '{sum+=$NF} END{print "total files = ", sum}' temp2
total files =  31815986
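To get the complete summary layout from the question (a header, one line per file, then the total) in a single pass, something along these lines might work. The function name files_summary is hypothetical. The sketch assumes each input line has the file name as its first field and the count as its last field, and that file names contain no spaces.

```shell
#!/bin/sh
# files_summary (hypothetical name): reads lines such as
#   ./name.log processed files: 12345
# on stdin and prints a header, "name count" per file, and the total.
# Assumption: first field is the file name, last field is the count.
files_summary() {
    awk '
        BEGIN { print "filename processed files" }
        {
            name = $1
            sub(/^\.\//, "", name)          # drop the leading "./" added by find
            print name, $NF
            sum += $NF
        }
        END { print "total files = " (sum + 0) }
    '
}
```

For example: find . -iname "$srch*" -exec grep "processed files" {} \; -print | sed -r 'N;s/(.*)\n(.*)/\2 \1/' | files_summary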