Parsing winerror.h & ntstatus.h statuses/definitions with regex

I'm trying to create a regex in Python to capture all the Last Error, HRESULT and NTSTATUS definitions from winerror.h & ntstatus.h.

For example, for this text:

//
// MessageId: NTE_BAD_PROV_TYPE
//
// MessageText:
//
// Invalid provider type specified.
// More data is avaiable on bla bla.
//
#define NTE_BAD_PROV_TYPE                _HRESULT_TYPEDEF_(0x80090014L)

I want to capture:

('Invalid provider type specified. More data is avaiable on bla bla.', 'NTE_BAD_PROV_TYPE', 0x80090014)


Same goes for ntstatus.h:

//
// MessageId: STATUS_NOT_ALL_ASSIGNED
//
// MessageText:
//
// Indicates not all privileges or groups referenced are assigned to the caller.
// This allows, for example, all privileges to be disabled without having to know exactly which privileges are assigned.
//
#define STATUS_NOT_ALL_ASSIGNED          ((NTSTATUS)0x00000106L)

Tags: python, c, regex, windows, header
asked on Stack Overflow Nov 8, 2017 by JJ By • edited Nov 12, 2017 by JJ By

2 Answers

I think this is close enough:

re.findall(r"(?<=// )(?:Message\w{2,4}:)? ?(.+)?\n", text) + list(re.search(r"(0x[0-9A-Fa-f]+)", text).groups())

['NTE_BAD_PROV_TYPE', '', 'Invalid provider type specified.', 'More data is avaiable on bla bla.', '0x80090014']
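
To get from that flat list to the tuple asked for in the question, something like the following might work. It is only a sketch for a single block: the variable names (parts, name, message, result) are made up for illustration, and it assumes the first capture is always the MessageId and every later non-empty capture is a line of message text.

```python
import re

text = """//
// MessageId: NTE_BAD_PROV_TYPE
//
// MessageText:
//
// Invalid provider type specified.
// More data is avaiable on bla bla.
//
#define NTE_BAD_PROV_TYPE                _HRESULT_TYPEDEF_(0x80090014L)
"""

# One capture per "// ..." comment line; the "MessageText:" line yields ''.
parts = re.findall(r"(?<=// )(?:Message\w{2,4}:)? ?(.+)?\n", text)

# First hex literal in the block (assumes there is exactly one #define).
value = re.search(r"0x[0-9A-Fa-f]+", text).group(0)

name = parts[0]                                # text after "MessageId:"
message = " ".join(p for p in parts[1:] if p)  # skip the empty capture
result = (message, name, int(value, 16))
print(result)
```

This would not work on a whole header at once, since findall would mix the lines of every block together.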

answered on Stack Overflow Nov 8, 2017 by MadeR

These are the regexes I ended up figuring out and using:

To capture all Last Errors & HRESULT definitions in winerror.h:

//\sMessageText:[\n\r]+//\s*[\n\r]+
(?P<message_text>//.*?//\n)*.*?
#define\s(?P<status_name>[A-Za-z0-9_]+)
\s+(?:_HRESULT_TYPEDEF_\(|NDIS_ERROR_TYPEDEF_\()?
(?P<status_value>(?:0[xX])?[A-Fa-f0-9]+)L

To capture all NTSTATUS definitions in ntstatus.h:

//\sMessageText:[\n\r]+//\s*[\n\r]+
(?P<message_text>//.*?//\n)*.*?
#define\s(?P<status_name>[A-Za-z0-9_]+)
\s+\(\(NTSTATUS\)(?P<status_value>0[xX]?[A-Fa-f0-9]+)L\)

I also format the description (message text) afterwards by replacing the // with newlines.
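
For anyone who wants a single runnable script, here is a rough sketch of the same idea. It does not use the exact regexes above: STATUS_RE, clean_message and parse_statuses are names invented for this example, the optional wrapper is matched generically (any MACRO( or ((TYPE) prefix), and it assumes the comment block is immediately followed by the #define line, as in the two samples from the question.

```python
import re

# Assumed pattern: one match per "MessageText: ... #define NAME value" block.
STATUS_RE = re.compile(
    r"//[ \t]*MessageText:[ \t]*\r?\n"       # the "// MessageText:" marker line
    r"(?P<message_text>(?://.*\r?\n)+)"      # comment lines up to the #define
    r"#define[ \t]+(?P<status_name>\w+)[ \t]+"
    r"(?:\w+\(|\(\(\w+\))?"                  # optional MACRO( or ((TYPE) wrapper
    r"(?P<status_value>0[xX][0-9A-Fa-f]+|\d+)L"
)


def clean_message(raw):
    """Drop the leading // from each line and join the non-empty lines."""
    lines = (line.strip()[2:].strip() for line in raw.splitlines())
    return " ".join(line for line in lines if line)


def parse_statuses(text):
    """Return a list of (message, name, value) tuples found in text."""
    results = []
    for m in STATUS_RE.finditer(text):
        value = m.group("status_value")
        number = int(value, 16) if value.lower().startswith("0x") else int(value)
        results.append(
            (clean_message(m.group("message_text")), m.group("status_name"), number)
        )
    return results


sample = """//
// MessageId: NTE_BAD_PROV_TYPE
//
// MessageText:
//
// Invalid provider type specified.
// More data is avaiable on bla bla.
//
#define NTE_BAD_PROV_TYPE                _HRESULT_TYPEDEF_(0x80090014L)

//
// MessageId: STATUS_NOT_ALL_ASSIGNED
//
// MessageText:
//
// Indicates not all privileges or groups referenced are assigned to the caller.
//
#define STATUS_NOT_ALL_ASSIGNED          ((NTSTATUS)0x00000106L)
"""

for entry in parse_statuses(sample):
    print(entry)
```

Here the message lines are joined with spaces to match the tuple format from the question; joining with "\n" instead would keep the original line breaks.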

answered on Stack Overflow Nov 12, 2017 by JJ By

User contributions licensed under CC BY-SA 3.0