I have some catchall log files in a format as follows:
timestamp event summary
foo details
account name: userA
bar more details
timestamp event summary
baz details
account name: userB
qux more details
timestamp etc.
I would like to search the log file for userB, and if found, echo from the preceding timestamp down to (but not including) the following timestamp. There will likely be several events matching my search. It would be nice to echo some sort of --- start --- and --- end --- surrounding each match.
This would be perfect for pcregrep -M, right? Problem is, GnuWin32's pcregrep crashes with multiline regexps searching large files, and these catch-all logs can be 100 megs or more.
What I've tried
My hackish workaround thus far involves using grep -B15 -A30 to find matching lines and print surrounding content, then piping the now more manageable chunk into pcregrep for polishing. The problem is that some events are shorter than ten lines while others are 30 lines or more, so the fixed context window pulls in neighbouring events, and I'm getting some unexpected results where the shorter events are encountered.
:parselog <username> <logfile>
set silent=1
set count=0
set deez=20\d\d-\d\d-\d\d \d\d:\d\d:\d\d
echo Searching %~2 for records containing %~1...
for /f "delims=" %%I in (
'grep -P -i -B15 -A30 ":\s+\b%~1\b(@mydomain\.ext)?$" "%~2" ^| pcregrep -M -i "^%deez%(.|\n)+?\b%~1\b(@mydomain\.ext|\r?\n)(.|\n)+?\n%deez%" 2^>NUL'
) do (
echo(%%I| findstr "^20[0-9][0-9]-[0-9][0-9]-[0-9][0-9].[0-9][0-9]:[0-9][0-9]:[0-9][0-9]" >NUL && (
if defined silent (
set silent=
set found=1
set /a "count+=1"
echo;
echo ---------------start of record !count!-------------
) else (
set silent=1
echo ----------------end of record !count!--------------
echo;
)
)
if not defined silent echo(%%I
)
goto :EOF
Is there a better way to do this? I've come across an awk command that looked interesting, something like:
awk "/start pattern/,/end pattern/" logfile
... but it would need to match a middle pattern as well. Unfortunately, I'm not that familiar with awk syntax. Any suggestions?
Ed Morton suggested that I supply some example logging and expected output.
Example catch-all
2013-03-25 08:02:32 Auth.Critical 169.254.8.110 Mar 25 08:02:32 dc3 MSWinEventLog 2 Security 11730158 Mon Mar 25 08:02:28 2013 529 Security NT AUTHORITY\SYSTEM N/A Audit Failure dc3 2 Logon Failure:
Reason: Unknown user name or bad password
User Name: user5f
Domain: MYDOMAIN
Logon Type: 3
Logon Process: Advapi
Authentication Package: Negotiate
Workstation Name: dc3
Caller User Name: dc3$
Caller Domain: MYDOMAIN
Caller Logon ID: (0x0,0x3E7)
Caller Process ID: 400
Transited Services: -
Source Network Address: 169.254.7.86
Source Port: 40838
2013-03-25 08:02:32 Auth.Critical 169.254.8.110 Mar 25 08:02:32 dc3 MSWinEventLog 2 Security 11730159 Mon Mar 25 08:02:29 2013 680 Security NT AUTHORITY\SYSTEM N/A Audit Failure dc3 9 Logon attempt by: MICROSOFT_AUTHENTICATION_PACKAGE_V1_0
Logon account: USER6Q
Source Workstation: dc3
Error Code: 0xC0000234
2013-03-25 08:02:32 Auth.Critical 169.254.8.110 Mar 25 08:02:32 dc3 MSWinEventLog 2 Security 11730160 Mon Mar 25 08:02:29 2013 539 Security NT AUTHORITY\SYSTEM N/A Audit Failure dc3 2 Logon Failure:
Reason: Account locked out
User Name: USER6Q@MYDOMAIN.TLD
Domain: MYDOMAIN
Logon Type: 3
Logon Process: Advapi
Authentication Package: Negotiate
Workstation Name: dc3
Caller User Name: dc3$
Caller Domain: MYDOMAIN
Caller Logon ID: (0x0,0x3E7)
Caller Process ID: 400
Transited Services: -
Source Network Address: 169.254.7.89
Source Port: 55314
2013-03-25 08:02:32 Auth.Notice 169.254.5.62 Mar 25 08:36:38 DC4.mydomain.tld MSWinEventLog 5 Security 201326798 Mon Mar 25 08:36:37 2013 4624 Microsoft-Windows-Security-Auditing N/A Audit Success DC4.mydomain.tld 12544 An account was successfully logged on.
Subject:
Security ID: S-1-0-0
Account Name: -
Account Domain: -
Logon ID: 0x0
Logon Type: 3
New Logon:
Security ID: S-1-5-21-606747145-1409082233-725345543-160838
Account Name: DEPTACCT16$
Account Domain: MYDOMAIN
Logon ID: 0x1158e6012c
Logon GUID: {BCC72986-82A0-4EE9-3729-847BA6FA3A98}
Process Information:
Process ID: 0x0
Process Name: -
Network Information:
Workstation Name:
Source Network Address: 169.254.114.62
Source Port: 42183
Detailed Authentication Information:
Logon Process: Kerberos
Authentication Package: Kerberos
Transited Services: -
Package Name (NTLM only): -
Key Length: 0
This event is generated when a logon session is created. It is generated on the computer that was accessed.
The subject fields indicate...
2013-03-25 08:02:32 Auth.Critical 169.254.8.110 Mar 25 08:02:32 dc3 MSWinEventLog 2 Security 11730162 Mon Mar 25 08:02:30 2013 675 Security NT AUTHORITY\SYSTEM N/A Audit Failure dc3 9 Pre-authentication failed:
User Name: USER8Y
User ID: %{S-1-5-21-606747145-1409082233-725345543-3904}
Service Name: krbtgt/MYDOMAIN
Pre-Authentication Type: 0x0
Failure Code: 0x19
Client Address: 169.254.87.158
2013-03-25 08:02:32 Auth.Critical etc.
Example command
call :parselog user6q \\path\to\catch-all.log
Expected result
---------------start of record 1-------------
2013-03-25 08:02:32 Auth.Critical 169.254.8.110 Mar 25 08:02:32 dc3 MSWinEventLog 2 Security 11730159 Mon Mar 25 08:02:29 2013 680 Security NT AUTHORITY\SYSTEM N/A Audit Failure dc3 9 Logon attempt by: MICROSOFT_AUTHENTICATION_PACKAGE_V1_0
Logon account: USER6Q
Source Workstation: dc3
Error Code: 0xC0000234
---------------end of record 1-------------
---------------start of record 2-------------
2013-03-25 08:02:32 Auth.Critical 169.254.8.110 Mar 25 08:02:32 dc3 MSWinEventLog 2 Security 11730160 Mon Mar 25 08:02:29 2013 539 Security NT AUTHORITY\SYSTEM N/A Audit Failure dc3 2 Logon Failure:
Reason: Account locked out
User Name: USER6Q@MYDOMAIN.TLD
Domain: MYDOMAIN
Logon Type: 3
Logon Process: Advapi
Authentication Package: Negotiate
Workstation Name: dc3
Caller User Name: dc3$
Caller Domain: MYDOMAIN
Caller Logon ID: (0x0,0x3E7)
Caller Process ID: 400
Transited Services: -
Source Network Address: 169.254.7.89
Source Port: 55314
---------------end of record 2-------------
This is all you need with GNU awk (for IGNORECASE):
$ cat tst.awk
function prtRecord() {
if (record ~ regexp) {
printf "-------- start of record %d --------%s", ++numRecords, ORS
printf "%s", record
printf "--------- end of record %d ---------%s%s", numRecords, ORS, ORS
}
record = ""
}
BEGIN{ IGNORECASE=1 }
/^[[:digit:]]+-[[:digit:]]+-[[:digit:]]+/ { prtRecord() }
{ record = record $0 ORS }
END { prtRecord() }
or with any awk:
$ cat tst.awk
function prtRecord() {
if (tolower(record) ~ tolower(regexp)) {
printf "-------- start of record %d --------%s", ++numRecords, ORS
printf "%s", record
printf "--------- end of record %d ---------%s%s", numRecords, ORS, ORS
}
record = ""
}
/^[[:digit:]]+-[[:digit:]]+-[[:digit:]]+/ { prtRecord() }
{ record = record $0 ORS }
END { prtRecord() }
Either way you'd run it on UNIX as:
$ awk -v regexp=user6q -f tst.awk file
I don't know the Windows syntax but I expect it's very similar if not identical.
Note the use of tolower() in the second script to make both sides of the comparison lower case so the match is case-insensitive. If you can instead pass in a search regexp that's already in the correct case, then you don't need to call tolower() on either side of the comparison. It's no big deal either way; dropping the calls might just speed the script up slightly.
$ awk -v regexp=user6q -f tst.awk file
-------- start of record 1 --------
2013-03-25 08:02:32 Auth.Critical 169.254.8.110 Mar 25 08:02:32 dc3 MSWinEventLog 2 Security
11730159 Mon Mar 25 08:02:29 2013 680 Security NT AUTHORITY\SYSTEM N/A Audit Failure
dc3 9 Logon attempt by: MICROSOFT_AUTHENTICATION_PACKAGE_V1_0
Logon account: USER6Q
Source Workstation: dc3
Error Code: 0xC0000234
--------- end of record 1 ---------
-------- start of record 2 --------
2013-03-25 08:02:32 Auth.Critical 169.254.8.110 Mar 25 08:02:32 dc3 MSWinEventLog 2 Security
11730160 Mon Mar 25 08:02:29 2013 539 Security NT AUTHORITY\SYSTEM N/A Audit Failure
dc3 2 Logon Failure:
Reason: Account locked out
User Name: USER6Q@MYDOMAIN.TLD
Domain: MYDOMAIN
Logon Type: 3
Logon Process: Advapi
Authentication Package: Negotiate
Workstation Name: dc3
Caller User Name: dc3$
Caller Domain: MYDOMAIN
Caller Logon ID: (0x0,0x3E7)
Caller Process ID: 400
Transited Services: -
Source Network Address: 169.254.7.89
Source Port: 55314
--------- end of record 2 ---------
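For readers who do not have awk available, the same buffering idea can be sketched in Python. This is only an illustration of the technique, not part of the answer above: the function names matching_records and format_records are made up for this example, and the record-start test is an assumed Python equivalent of the awk pattern /^[[:digit:]]+-[[:digit:]]+-[[:digit:]]+/.

```python
import re

# Assumption: a record starts on a line that begins with digits-digits-digits,
# mirroring the timestamp test used in tst.awk.
RECORD_START = re.compile(r"^\d+-\d+-\d+")


def matching_records(lines, regexp):
    """Yield each record (a list of lines) whose text matches regexp, ignoring case."""
    wanted = re.compile(regexp, re.IGNORECASE)
    record = []
    for line in lines:
        # A new timestamp line closes the record collected so far.
        if RECORD_START.match(line) and record:
            if wanted.search("\n".join(record)):
                yield record
            record = []
        record.append(line)
    # Flush the last record, as the END block does in the awk script.
    if record and wanted.search("\n".join(record)):
        yield record


def format_records(lines, regexp):
    """Return the matching records wrapped in start/end marker lines."""
    out = []
    for number, record in enumerate(matching_records(lines, regexp), start=1):
        out.append(f"-------- start of record {number} --------")
        out.extend(record)
        out.append(f"--------- end of record {number} ---------")
        out.append("")
    return out
```

Only one record is held in memory at a time, so for a large log you could pass an open file's lines (with the trailing newline stripped from each) instead of a list, and print each element of the result.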
Here's my effort:
@ECHO OFF
SETLOCAL
::
:: Target username
::
SET target=%1
CALL :zaplines
SET count=0
FOR /f "delims=" %%I IN (rojoslog.txt) DO (
ECHO.%%I| findstr /r "^20[0-9][0-9]-[0-9][0-9]-[0-9][0-9].[0-9][0-9]:[0-9][0-9]:[0-9][0-9]" >NUL
IF NOT ERRORLEVEL 1 (
IF DEFINED founduser CALL :report
CALL :zaplines
)
(SET stored=)
FOR /l %%L IN (1000,1,1200) DO IF NOT DEFINED stored IF NOT DEFINED line%%L (
SET line%%L=%%I
SET stored=Y
)
ECHO.%%I|FINDSTR /b /e /i /c:"account name: %target%" >NUL
IF NOT ERRORLEVEL 1 (SET founduser=Y)
)
IF DEFINED founduser CALL :report
GOTO :eof
::
:: remove all envvars starting 'line'
:: Set 'not found user' at same time
::
:zaplines
(SET founduser=)
FOR /f "delims==" %%L IN ('set line 2^>nul') DO (SET %%L=)
GOTO :eof
:report
IF NOT DEFINED line1000 GOTO :EOF
SET /a count+=1
ECHO.
ECHO.---------- START of record %count% ----------
FOR /l %%L IN (1000,1,1200) DO IF DEFINED line%%L CALL ECHO.%%line%%L%%
ECHO.----------- END of record %count% -----------
GOTO :eof
Below is a pure Batch solution that does not use grep. It locates timestamp lines by the word "summary", which must not appear in any other line; this word may be changed to another one if needed.
EDIT: I changed the word that identifies timestamp lines to "Auth."; I also changed the FINDSTR search to ignore case. This is the new version:
@echo off
setlocal EnableDelayedExpansion
:parselog <username> <logfile>
echo Searching %~2 for records containing %~1...
set n=0
set previousMatch=Auth.
for /F "tokens=1* delims=:" %%a in ('findstr /I /N "Auth\. %~1" %2') do (
set currentMatch=%%b
if "!previousMatch:Auth.=!" neq "!previousMatch!" (
if "!currentMatch:Auth.=!" equ "!currentMatch!" (
set /A n+=1
set /A skip[!n!]=!previousLine!-1
)
) else (
set /A end[!n!]=%%a-1
)
set previousLine=%%a
set previousMatch=%%b
)
if %n% equ 0 (
echo No records found
goto :EOF
)
if not defined end[%n%] set end[%n%]=-1
set i=1
:nextRecord
echo/
echo ---------------start of record %i%-------------
if !skip[%i%]! equ 0 (
set skip=
) else (
set skip=skip=!skip[%i%]!
)
set end=!end[%i%]!
for /F "%skip% tokens=1* delims=:" %%a in ('findstr /N "^" %2') do (
echo(%%b
if %%a equ %end% goto endOfRecord
)
:endOfRecord
echo ---------------end of record %i%-------------
set /A i+=1
if %i% leq %n% goto nextRecord
Example command:
C:\>test user6q catch-all.log
Result:
Searching catch-all.log for records containing user6q...
---------------start of record 1-------------
2013-03-25 08:02:32 Auth.Critical 169.254.8.110 Mar 25 08:02:32 dc3 MSWinEventLog 2 Security 11730159 Mon Mar 25 08:02:29 2013 680 Security NT AUTHORITY\SYSTEM N/A Audit Failure dc3 9 Logon attempt by: MICROSOFT_AUTHENTICATION_PACKAGE_V1_0
Logon account: USER6Q
Source Workstation: dc3
Error Code: 0xC0000234
---------------end of record 1-------------
---------------start of record 2-------------
2013-03-25 08:02:32 Auth.Critical 169.254.8.110 Mar 25 08:02:32 dc3 MSWinEventLog 2 Security 11730160 Mon Mar 25 08:02:29 2013 539 Security NT AUTHORITY\SYSTEM N/A Audit Failure dc3 2 Logon Failure:
Reason: Account locked out
User Name: USER6Q@MYDOMAIN.TLD
Domain: MYDOMAIN
Logon Type: 3
Logon Process: Advapi
Authentication Package: Negotiate
Workstation Name: dc3
Caller User Name: dc3$
Caller Domain: MYDOMAIN
Caller Logon ID: (0x0,0x3E7)
Caller Process ID: 400
Transited Services: -
Source Network Address: 169.254.7.89
Source Port: 55314
---------------end of record 2-------------
This method uses just one execution of the findstr command to locate all matching records, and then one additional findstr command to show each record. Note that the first for /F ... command works over the findstr "Auth. user.." results, and the second for /F command has a "skip=N" option and a GOTO that breaks the loop as soon as the record has been displayed. This means that the FOR commands do not slow down the program; the speed of this program depends on the speed of the FINDSTR command.
However, it is possible that the second for /F "%skip% ... in ('findstr /N "^" %2') command takes too long, because the whole FINDSTR output must be generated before the FOR starts to process it. If this happens, we could replace the second FOR with a faster method (an asynchronous pipe that is broken early, for example). Please report the result.
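The two-pass idea described above (first locate the line ranges of the matching records, then print only those ranges) can be sketched in Python as follows. This is an illustration under assumptions, not a translation of the Batch file: the function names are hypothetical, the marker is passed as a regular expression such as Auth\., and the second pass reads the input once for all records rather than once per record.

```python
import re


def find_record_ranges(lines, marker, username):
    """First pass: return 1-based (first_line, last_line) pairs of matching records."""
    marker_re = re.compile(marker)
    user_re = re.compile(re.escape(username), re.IGNORECASE)
    ranges = []
    start = None      # line number of the current record's timestamp line
    matched = False   # True once the username was seen in the current record
    last = 0
    for number, line in enumerate(lines, start=1):
        last = number
        if marker_re.search(line):
            # A timestamp line closes the previous record.
            if matched:
                ranges.append((start, number - 1))
            start, matched = number, False
        elif start is not None and user_re.search(line):
            matched = True
    if matched:
        ranges.append((start, last))
    return ranges


def lines_in_ranges(lines, ranges):
    """Second pass: collect the lines of each range, stopping after the last one."""
    records = [[] for _ in ranges]
    index = 0
    for number, line in enumerate(lines, start=1):
        if index == len(ranges):
            break  # nothing left to show, similar to the GOTO that leaves the FOR loop
        first, last = ranges[index]
        if number >= first:
            records[index].append(line)
        if number == last:
            index += 1
    return records
```

Because the input is traversed twice, lines must be something that can be iterated again, such as a list or a file that is reopened for the second pass.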
Antonio
I think awk is all you need:
awk "/---start of record---/,/---end of record---/ {print}" logfile
That's all you need if the first line indicator is:
---start of record---
and the last is:
---end of record---
Notice that there is no middle-pattern matching; the "," is just a separator between the two regexps.
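To make the behaviour of the range pattern concrete, here is a small Python sketch that imitates how awk selects lines for /start/,/end/. It is an approximation for illustration only: awk_range is a made-up name, and Python's re syntax is not identical to awk's regular expressions. It shows that every line from a start match through the next end match is selected regardless of its content, so filtering by a username would need a separate step.

```python
import re


def awk_range(lines, start_pattern, end_pattern):
    """Yield lines roughly the way awk's /start/,/end/ range pattern selects them."""
    start_re = re.compile(start_pattern)
    end_re = re.compile(end_pattern)
    inside = False
    for line in lines:
        # The range opens on a line matching the start pattern.
        if not inside and start_re.search(line):
            inside = True
        if inside:
            yield line
            # The range closes on a line matching the end pattern
            # (which may be the same line that opened it).
            if end_re.search(line):
                inside = False
```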