I have a file containing multiple tests, with the detailed actions written one beneath another. The test blocks are separated from one another by a new line. I want to extract only the first and last line of each block and put them on one line per block in a new file. Here is an example:
input.txt:
[test1]
duration
summary
code=
Results= PASS
[test2]
duration
summary=x
code=
Results=FAIL
.....
[testX]
duration
summary=x
code=
Results= PASS
output.txt should be something like this:
test1 PASS
test2 FAIL
...
testX PASS
eg2:
[Linux_MP3Enc_xffv.2_Con_37_003]
type = testcase
summary = MP3 encoder test
ActionGroup[Linux_Enc] = PASS
ActionGroup[Linux_Playb] = PASS
ActionGroup[Linux_Pause_Resume] = PASS
ActionGroup[Linux_Fast_Seek] = PASS
Duration = 230.607398987 s
Total_Result = PASS
[Composer__vtx_007]
type = testcase
summary = composer
Background[0xff000000] = PASS
Background[0xffFFFFFF] = PASS
Background[0xffFF0000] = PASS
Background[0xff00FF00] = PASS
Background[0xff0000FF] = PASS
Background[0xff00FFFF] = PASS
Background[0xffFFFF00] = PASS
Background[0xffFF00FF] = PASS
Duration = 28.3567230701 s
Total_Result = PASS
[Videox_Rotate_008]
type = testcase
summary = rotation
Rotation[0] = PASS
Rotation[1] = PASS
Rotation[2] = PASS
Rotation[3] = PASS
Duration = 14.0116529465 s
Total_Result = PASS
Thank you!
Short and simple, using GNU awk:
awk -F= -v RS='' '{print $1 $NF}' file
[Linux_MP3Enc_xffv.2_Con_37_003] PASS
[Composer__vtx_007] PASS
[Videox_Rotate_008] PASS
If you do not like the brackets:
awk -F'[]=[]' -v RS='' '{print $2 $NF}' file
Linux_MP3Enc_xffv.2_Con_37_003 PASS
Composer__vtx_007 PASS
Videox_Rotate_008 PASS
One way to solve this is using a regular expression such as:
(?<testId>test\d+)(?:.*\n){4}.*(?<outcome>PASS|FAIL)
The regex matches your sample input and stores the test id (e.g. "test1") in the capture group named "testId" and the outcome (e.g. "PASS") in the capture group named "outcome".
The regex can be used in any language with regex support, although the syntax for named groups varies. The code below shows how to do it in Python.
import re

# Read from input.txt
with open('input.txt', 'r') as f:
    indata = f.read()

# Modify the regex slightly to fit Python regex syntax
# (raw string, so that \d and \n reach the regex engine unchanged)
pattern = r'(?:.*)(?P<testId>test\d+)(?:.*\n){4}.*(?P<outcome>PASS|FAIL)'

# Get an iterator which yields all matches
matches = re.finditer(pattern, indata)

# Combine the matches to a list of strings
outputs = ['{} {}'.format(m.group('testId'), m.group('outcome')) for m in matches]

# Join all rows to one string
output = '\n'.join(outputs)

# Write to output.txt
with open('output.txt', 'w') as f:
    f.write(output)
Running the above script on an input.txt containing:
[test1]
duration
summary
code=
Results= PASS
[test2]
duration
summary=x
code=
Results=FAIL
[test444]
duration
summary=x
code=
Results= PASS
yields a file output.txt containing:
test1 PASS
test2 FAIL
test444 PASS
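The pattern above expects the result exactly four lines below the header and a test id of the form "test" plus digits, so it would not match the second example (eg2) in the question. A possible variant, offered only as a sketch: it assumes every header is a line of the form [name] and that the result line starts with either Results or Total_Result. The names PATTERN and extract are made up for this illustration.

```python
import re

# Hypothetical pattern: a bracketed header line, then as few lines as
# possible until a line that starts with one of the two result keys.
PATTERN = re.compile(
    r"^\[(?P<testId>[^\]\n]+)\]\n"       # header line, e.g. [test1]
    r"(?:.*\n)*?"                         # skip whole lines, lazily
    r"(?:Results|Total_Result)\s*=\s*"    # result key from either example
    r"(?P<outcome>PASS|FAIL)",
    re.MULTILINE,
)

def extract(text):
    """Return one 'name outcome' string per matched block."""
    return ["{} {}".format(m.group("testId"), m.group("outcome"))
            for m in PATTERN.finditer(text)]

if __name__ == "__main__":
    with open("input.txt") as f:
        print("\n".join(extract(f.read())))
```

Note that a block without any result line would be merged with the following block, so this is only suitable when every block ends with a result.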
Using sed, as tagged (although other tools would probably be more natural to use):
sed -nE '/^\[.*\]$/h;s/^Results= ?//;t r;b;:r;H;x;s/\n/ /;p'
Explanation :
/^\[.*\]$/h # matches the [...] lines, put them in the hold buffer
s/^Results= ?// # matches the Results= lines, discards the useless part
t r;b # on lines which matched, jump to label r;
# otherwise jump to the end (and start processing the next line)
:r;H;x;s/\n/ /;p # label r; append the pattern space (which contains the end of the Results= line)
# to the hold buffer. Switch Hold buffer and pattern space,
# replace the linefeed in the pattern space by a space and print it
You can try it here.
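For readers less familiar with sed's hold buffer, the same logic can be written as a small line-by-line loop. This is only a rough Python sketch of the idea, not a translation of every sed detail: the function name summarize_lines is made up, and unlike the sed script it ignores a Results= line that appears before any header.

```python
def summarize_lines(lines):
    """Pair each [header] line with the next Results= line."""
    held = None   # plays the role of sed's hold buffer
    out = []
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("[") and line.endswith("]"):
            # like /^\[.*\]$/h : remember the header
            held = line
        elif line.startswith("Results=") and held is not None:
            # like s/^Results= ?// : drop the key and one optional space
            result = line[len("Results="):]
            if result.startswith(" "):
                result = result[1:]
            # like H;x;s/\n/ /;p : join header and result with a space
            out.append(held + " " + result)
    return out

if __name__ == "__main__":
    with open("input.txt") as f:
        print("\n".join(summarize_lines(f)))
```

Like the sed command, this keeps the brackets around the test name and does not depend on blank lines between blocks.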
In order to print the first and last line from the block, how about:
awk -v RS="" '{
n = split($0, a, /\n/)
print a[1]
print a[n]
}' input.txt
Result for the second example (eg2):
[Linux_MP3Enc_xffv.2_Con_37_003]
Total_Result = PASS
[Composer__vtx_007]
Total_Result = PASS
[Videox_Rotate_008]
Total_Result = PASS
The awk man page says:
If RS is set to the null string, then records are separated by blank lines.
With this feature you can easily split the input into blocks at the blank lines.
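The awk command above prints the first and last line on separate lines. If they should end up on one line, as asked in the question, the same blank-line splitting could be sketched in Python as follows. The function name first_and_last is made up, and the sketch assumes that the last line of each block has the form key = value.

```python
import re

def first_and_last(text):
    """Return 'name result' for each blank-line-separated block."""
    rows = []
    # One or more empty lines separate the blocks, similar to RS="" in awk.
    for block in re.split(r"\n{2,}", text.strip("\n")):
        if not block:
            continue
        lines = block.split("\n")
        name = lines[0].strip("[]")                 # first line, brackets removed
        result = lines[-1].split("=")[-1].strip()   # text after the last "="
        rows.append(name + " " + result)
    return rows

if __name__ == "__main__":
    with open("input.txt") as f:
        print("\n".join(first_and_last(f.read())))
```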
Hope this helps.
User contributions licensed under CC BY-SA 3.0