shell - Generate summary report from logs: perform additions on the output of a command (using AWK / SED or any other way) and format the output


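I am processing several files at a time. Each of them has summary stats in its log. At the end of the process I want to create a summary file that adds up the stats. I know how to dig the stats out of the log files; I want to be able to add up the numbers and echo them to a file. Here is what I use to dig out the times: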

find . -iname "$srch1*" -exec grep "it took" {} \; -print 

Output:

    it took 0 hours, 11 minutes , 4 seconds to process the file.
    ./filepart000010-20140204-154923.dat.gz.log
    it took 0 hours, 11 minutes , 56 seconds to process the file.
    ./filepart000007-20140204-154923.dat.gz.log
    it took 0 hours, 29 minutes , 54 seconds to process the file.
    ./filepart000001-20140204-154923.dat.gz.log
    it took 0 hours, 22 minutes , 33 seconds to process the file.
    ./filepart000004-20140204-154923.dat.gz.log
    it took 0 hours, 59 minutes , 38 seconds to process the file.
    ./filepart000000-20140204-154923.dat.gz.log
    it took 0 hours, 11 minutes , 50 seconds to process the file.
    ./filepart000005-20140204-154923.dat.gz.log
    it took 0 hours, 22 minutes , 10 seconds to process the file.
    ./filepart000002-20140204-154923.dat.gz.log
    it took 0 hours, 10 minutes , 39 seconds to process the file.
    ./filepart000008-20140204-154923.dat.gz.log
    it took 0 hours, 12 minutes , 27 seconds to process the file.
    ./filepart000009-20140204-154923.dat.gz.log
    it took 0 hours, 22 minutes , 36 seconds to process the file.
    ./filepart000003-20140204-154923.dat.gz.log
    it took 0 hours, 11 minutes , 40 seconds to process the file.
    ./filepart000006-20140204-154923.dat.gz.log

What I want:

summary  filepart000006-20140204-154923.dat.gz.log  0 hours, 11 minutes , 40 seconds 

Then find out the longest time among them and output a message:

 total time taken =____________ 

I am running these in parallel, so the time taken is that of the longest one.

Then some calculations on this:

    find . -iname "$srch*" -exec grep "processed files" {} \; -print

    processed files:   7936635
    ./filename-20131102-part000000-20140204-153310.dat.gz.log
    processed files:   3264805
    ./filename-20131102-part000001-20140204-153310.dat.gz.log
    processed files:   1607547
    ./filename-20131102-part000008-20140204-153310.dat.gz.log
    processed files:   3180478
    ./filename-20131102-part000003-20140204-153310.dat.gz.log
    processed files:   1595497
    ./filename-20131102-part000007-20140204-153310.dat.gz.log
    processed files:   1568532
    ./filename-20131102-part000009-20140204-153310.dat.gz.log
    processed files:   3259884
    ./filename-20131102-part000002-20140204-153310.dat.gz.log
    processed files:   3141542
    ./filename-20131102-part000004-20140204-153310.dat.gz.log
    processed files:   3124221
    ./filename-20131102-part000005-20140204-153310.dat.gz.log
    processed files:   3136845
    ./filename-20131102-part000006-20140204-153310.dat.gz.log

And if I want just the metrics:

    ( find . -iname "dl-aster-full-20131102*" -exec grep "processed files" {} \;) | cut -d":" -f2

       7936635
       3264805
       1607547
       3180478
       1595497
       1568532
       3259884
       3141542
       3124221
       3136845

Based on the above two, I want to create a summary file:

    filename                                                  processed files
    filename-20131102-part000000-20140204-153310.dat.gz.log   7936635

... and so on, with a row like the one above added for each file.

    ( 7936635 + 3264805 + 1607547 + 3180478 + 1595497 + 1568532 + 3259884 + 3141542 + 3124221 + 3136845 )
    total files = ____________

So overall, one file:

    filename                                                  processed files
    filename-20131102-part000000-20140204-153310.dat.gz.log   7936635
    total files = ____________ ( sum of above )

All that needs to be done is to output it in this format:

    filename                                                  processed files
    filename-20131102-part000000-20140204-153310.dat.gz.log   7936635

Then, on a separate line, output the sum of the numbers printed by the above command.

My questions:

-- How can I perform the addition above, using anything? I'd like to avoid Perl, since I am not sure it would be installed everywhere the shell script will run.
-- How can I format the output as above? I already know how to extract the output.

With the sed command below you can get the output with the filename and the grep result on one line; the rest should be easy for you. (The grep result should be exactly one line for each file.)

    find . -iname "$srch1*" -exec grep "it took" {} \; -print | sed -r 'N;s/(.*)\n(.*)/\2 \1/'

    ./filepart000010-20140204-154923.dat.gz.log it took 0 hours, 11 minutes , 4 seconds to process the file.
    ./filepart000007-20140204-154923.dat.gz.log it took 0 hours, 11 minutes , 56 seconds to process the file.
    ./filepart000001-20140204-154923.dat.gz.log it took 0 hours, 29 minutes , 54 seconds to process the file.
    ./filepart000004-20140204-154923.dat.gz.log it took 0 hours, 22 minutes , 33 seconds to process the file.
    ./filepart000000-20140204-154923.dat.gz.log it took 0 hours, 59 minutes , 38 seconds to process the file.
    ./filepart000005-20140204-154923.dat.gz.log it took 0 hours, 11 minutes , 50 seconds to process the file.
    ./filepart000002-20140204-154923.dat.gz.log it took 0 hours, 22 minutes , 10 seconds to process the file.
    ./filepart000008-20140204-154923.dat.gz.log it took 0 hours, 10 minutes , 39 seconds to process the file.
    ./filepart000009-20140204-154923.dat.gz.log it took 0 hours, 12 minutes , 27 seconds to process the file.
    ./filepart000003-20140204-154923.dat.gz.log it took 0 hours, 22 minutes , 36 seconds to process the file.
    ./filepart000006-20140204-154923.dat.gz.log it took 0 hours, 11 minutes , 40 seconds to process the file.

    find . -iname "$srch*" -exec grep "processed files" {} \; -print | sed -r 'N;s/(.*)\n(.*)/\2 \1/'

    ./filename-20131102-part000000-20140204-153310.dat.gz.log         processed files:   7936635
    ./filename-20131102-part000001-20140204-153310.dat.gz.log         processed files:   3264805
    ./filename-20131102-part000008-20140204-153310.dat.gz.log         processed files:   1607547
    ./filename-20131102-part000003-20140204-153310.dat.gz.log         processed files:   3180478
    ./filename-20131102-part000007-20140204-153310.dat.gz.log         processed files:   1595497
    ./filename-20131102-part000009-20140204-153310.dat.gz.log         processed files:   1568532
    ./filename-20131102-part000002-20140204-153310.dat.gz.log         processed files:   3259884
    ./filename-20131102-part000004-20140204-153310.dat.gz.log         processed files:   3141542
    ./filename-20131102-part000005-20140204-153310.dat.gz.log         processed files:   3124221
    ./filename-20131102-part000006-20140204-153310.dat.gz.log         processed files:   3136845
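To see what the sed step does on its own, here is a small sketch that feeds it two hand-written pairs instead of real find output. The file names `./a.log` and `./b.log` and the counts are made up for illustration. It uses `-E`, which GNU sed treats the same as `-r` and which BSD sed also accepts.

```shell
# Two made-up pairs in the order find emits them:
# the grep match first, then the file name from -print.
joined=$(printf '%s\n' \
    'processed files:   10' './a.log' \
    'processed files:   32' './b.log' |
    sed -E 'N;s/(.*)\n(.*)/\2 \1/')   # N appends the next line, s/// swaps the two halves
printf '%s\n' "$joined"
```

`N` is what makes the embedded `\n` available to the substitution. With a lowercase `n` the first line would be printed as-is and the swap would never match.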

If you need to calculate the longest time and the total time, use the script below (you should be fine formatting the output yourself).

    find . -iname "$srch1*" -exec grep "it took" {} \; -print | sed -r 'N;s/(.*)\n(.*)/\2 \1/' > temp1
    # fields $4, $6 and $9 of each joined line are the hours, minutes and seconds
    awk 'function s2t(x) { h=int(x/3600); m=int((x-h*3600)/60); s=x-h*3600-m*60 }
         { a=$4*3600+$6*60+$9; max=a>max?a:max; t+=a }
         END { s2t(max); print "max is",h,m,s; s2t(t); print "sum ",h,m,s }' temp1

    max is 0 59 38
    sum  3 46 27
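If you want the formatted lines from the question rather than bare numbers, something along these lines might do. It is only a sketch: the three `part*.log` files are hypothetical and created just so it runs on its own, it assumes each log has exactly one line of the form `it took H hours, M minutes , S seconds ...`, and it assumes file paths contain no spaces. The pipeline skips the temp file and lets awk's `printf` do the formatting.

```shell
# Hypothetical sample logs, created only so the sketch is self-contained.
dir=$(mktemp -d)
echo 'it took 0 hours, 11 minutes , 4 seconds to process the file.'  > "$dir/part0.log"
echo 'it took 0 hours, 59 minutes , 38 seconds to process the file.' > "$dir/part1.log"
echo 'it took 0 hours, 22 minutes , 33 seconds to process the file.' > "$dir/part2.log"

report=$(find "$dir" -type f -iname 'part*' -exec grep "it took" {} \; -print |
    sed -E 'N;s/(.*)\n(.*)/\2 \1/' |
    awk '{
        n = split($1, p, "/")        # keep only the base name of the file
        printf "summary %s %d hours, %d minutes , %d seconds\n", p[n], $4, $6, $9
        a = $4*3600 + $6*60 + $9     # duration in seconds
        if (a > max) max = a         # parallel jobs: total time is the longest one
    }
    END {
        printf "total time taken = %d hours, %d minutes , %d seconds\n",
               int(max/3600), int(max%3600/60), max%60
    }')
printf '%s\n' "$report"
rm -rf "$dir"
```

The order of the `summary` lines follows whatever order find returns the files in; add a `sort` before awk if that matters to you.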

For the second one:

    find . -iname "$srch*" -exec grep "processed files" {} \; -print | sed -r 'N;s/(.*)\n(.*)/\2 \1/' > temp2
    awk '{sum+=$NF} END{print "total files = ", sum}' temp2

    total files =  31815986
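For the table layout asked for in the question, a sketch like the following may be enough. The two sample logs and their short names are hypothetical (the counts are the first two from the question), the column widths are arbitrary, and it again assumes one matching line per file and no spaces in paths.

```shell
# Hypothetical sample logs, created only so the sketch is self-contained.
dir=$(mktemp -d)
printf '        processed files:   7936635\n' > "$dir/part000000.log"
printf '        processed files:   3264805\n' > "$dir/part000001.log"

report=$(find "$dir" -type f -iname 'part*' -exec grep "processed files" {} \; -print |
    sed -E 'N;s/(.*)\n(.*)/\2 \1/' |
    sort |
    awk 'BEGIN { printf "%-20s %15s\n", "filename", "processed files" }
         {
             n = split($1, p, "/")              # base name of the log file
             printf "%-20s %15d\n", p[n], $NF   # the count is the last field
             sum += $NF
         }
         END { printf "total files = %d\n", sum }')
printf '%s\n' "$report"
rm -rf "$dir"
```

Redirect the final `printf` to a file to get the summary file. Widen `%-20s` to fit your real file names.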
