I have been working on this code on and off for a few days. I noticed that my output text file has white space after every character. I have tried line.strip(), but this doesn't get rid of it.
Moving over to Linux, I ran file *.txt to list the data types of my text files. This showed that the original file I am moving the data from was listed as:
~res-x64.txt: data
But the two output files were listed as:
comma.txt: ASCII text, with CRLF line terminators
slash.txt: ASCII text, with CRLF line terminators
I concluded that, because the file came from Windows, the text might be stored as UTF-16. The additional "data" portion was the hex code I stripped thanks to the help of my last post. I tried several encoding tools in Python. Nothing is written to the new file, and I get this error in PowerShell:
Traceback (most recent call last):
  File "test.py", line 17, in <module>
    f1.write(line.decode('utf-16').encode('utf-8'))
  File "C:\Python27\lib\encodings\utf_16.py", line 16, in decode
    return codecs.utf_16_decode(input, errors, True)
UnicodeDecodeError: 'utf16' codec can't decode byte 0x00 in position 188: truncated data
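One likely reading of the "truncated data" message: in UTF-16 every ASCII character occupies two bytes, so a file read as bytes and split at the single `\n` byte yields lines with half a character missing. The NUL bytes in between would also explain the apparent white space after every character. A minimal reproduction, assuming little-endian UTF-16 input (the sample string and the `try_decode` helper are made up for illustration):

```python
# Hypothetical two-line sample, encoded as little-endian UTF-16.
raw = u"HKU\\Software: 0x01\r\nHKLM\\SOFTWARE: 0x02\r\n".encode("utf-16-le")

# Splitting on the single byte b"\n" mimics iterating a file opened in byte mode.
chunks = raw.split(b"\n")
# The "line" ends with b"\n" but lacks the b"\x00" that completes the character.
first = chunks[0] + b"\n"


def try_decode(data):
    """Return the decoded text, or the error reason if decoding fails."""
    try:
        return data.decode("utf-16-le")
    except UnicodeDecodeError as exc:
        return "error: " + exc.reason


line_result = try_decode(first)   # odd number of bytes, so decoding fails
whole_result = try_decode(raw)    # the intact buffer decodes cleanly
print(line_result)
print(repr(whole_result))
```

If this is what is happening, decoding has to take place before the data is split into lines, not after.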
My code:
import re
import sys
#This allows you to specify the text file
#in commandline start of program.
with open(sys.argv[1], 'r') as f:
    with open("complete.txt", "w") as f1:
        #Skip the first 7 lines of the input
        for i in xrange(7):
            f.next()
        for line in f:
            #process(line)
            if ':' in line:
                #Remove unneeded hex data
                line = line.split(':')[0]
            #Print every line with \ on new file
            if '\\' in line:
                f1.write(line.decode('utf-16').encode('utf-8'))
#Both files are closed automatically by the with blocks
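For comparison, here is a sketch of the same filtering with the decoding moved to the point where the file is opened, so that every line is already text. It rests on assumptions: that the input really is UTF-16 with a byte-order mark (without one, `'utf-16-le'` may be the better guess), and that one key per output line is wanted, which the code above does not do. The function name `extract_keys` and its parameters are hypothetical.

```python
import io


def extract_keys(src_path, dst_path, skip=7, encoding="utf-16"):
    """Write the text before the first ':' of each line to dst_path as UTF-8."""
    # io.open decodes while reading, so line boundaries fall on whole
    # characters instead of in the middle of a two-byte code unit.
    with io.open(src_path, "r", encoding=encoding) as src, \
            io.open(dst_path, "w", encoding="utf-8") as dst:
        for index, line in enumerate(src):
            if index < skip:
                continue  # header lines, as in the xrange(7) loop
            # Drop the value after ':' and any trailing line ending.
            key = line.split(":")[0].rstrip("\r\n")
            # Keep only lines that contain a backslash.
            if "\\" in key:
                dst.write(key + u"\n")
```

Calling `extract_keys(sys.argv[1], "complete.txt")` would mirror the script above. `io.open` is used because it accepts an `encoding` argument on both Python 2.7 and Python 3.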
Sample of the input text file:
HKU\S-1-5-21-230830461-2995936100-1910591732-1107\Software\Microsoft\Windows\CurrentVersion\ContentDeliveryManager\Subscriptions\280810\UpdateDrivenByExpiration: 0x00000001
HKU\S-1-5-21-230830461-2995936100-1910591732-1107\Software\Microsoft\Windows\CurrentVersion\ContentDeliveryManager\Subscriptions\280810\UpdateDrivenByExpiration: 0x00000000
HKU\S-1-5-21-230830461-2995936100-1910591732-1107\Software\Microsoft\Windows\CurrentVersion\Search\JumplistData\E7CF176E110C211B: D6 1E 9E B8 B3 B7 D3 01
HKU\S-1-5-21-230830461-2995936100-1910591732-1107\Software\Microsoft\Windows\CurrentVersion\Search\JumplistData\E7CF176E110C211B: 08 64 2B 00 B4 B7 D3 01
Sample of output text file:
H K L M \ S O F T W A R E \ N o r t o n \ { 0 C 5 5 C 0 9 6 - 0 F 1 D - 4 F 2 8 - A A A 2 - 8 5 E F 5 9 1 1 2 6 E 7 } \ S h a r e d D e f s \ S D S D e f s \ A P P _ I D _ S C A N N E R 4 H K L M \ S O F T W A R E \ M i c r o s o f t \ W i n d o w s N T \ C u r r e n t V e r s i o n \ N o t i f i c a t i o n s \ D a t a \ 4 1 8 A 0 7 3 A A 3 B C 3 4 7 5 H K L M \ S O F T W A R E \ M i c r o s o f t \ W i n d o w s N T \ C u r r e n t V e r s i o n \ N o t i f i c a t i o n s \ D a t a \ 4 1 8 A 0 7 3 A A 3 B C 3 4 7 5 H K L M \ S Y S T E M \ C o n t r o l S e t 0 0 1 \ S e r v i c e s \ b a m \ U s e r S e t t i n g s \ S - 1 - 5 - 2 1 - 2 3 0 8 3 0 4 6 1 - 2 9 9 5 9 3 6 1 0 0 - 1 9 1 0 5 9 1 7 3 2 - 1 1 0 7 \ \ D e v i c e \ H a r d d i s k V o l u m e 3 \ P r o g r a m F i l e s ( x 8 6 ) \ M o z i l l a F i r e f o x \ f i r e f o x . e x e H K L M \ S Y S T E M \ C o n t r o l S e t 0 0 1 \ S e r v i c e s \ b a m \ U s e r S e t t i n g s \ S - 1 - 5 - 2 1 - 2 3 0 8 3 0 4 6 1 - 2 9 9 5 9 3 6 1 0 0 - 1 9 1 0 5 9 1 7 3 2 - 1 1 0 7 \ \ D e v i c e \ H a r d d i s k V o l u m e 3 \ P r o g r a m F i l e s ( x 8 6 ) \ M o z i l l a F i r e f o x \ f i r e f o x . e x e H K U \ S - 1 - 5 - 2 1 - 2 3 0 8 3 0 4 6 1 - 2 9 9 5 9 3 6 1 0 0 - 1 9 1 0 5 9 1 7 3 2 - 1 1 0 7 \ S o f t w a r e \ M i c r o s o f t \ W i n d o w s \ C u r r e n t V e r s i o n \ C l o u d S t o r e \ S t o r e \ C a c h e \ D e f a u l t A c c o u n t \ $ $ w i n d o w s . d a t a . t a s k f l o w . s h e l l a c t i v i t i e s \ C u r r e n t \ D a t a
H K U \ S - 1 - 5 - 2 1 - 2 3 0 8 3 0 4 6 1 - 2 9 9 5 9 3 6 1 0 0 - 1 9 1 0 5 9 1 7 3 2 - 1 1 0 7 \ S o f t w a r e \ M i c r o s o f t \ W i n d o w s \ C u r r e n t V e r s i o n \ C l o u d S t o r e \ S t o r e \ C a c h e \ D e f a u l t A c c o u n t \ $ $ w i n d o w s . d a t a . t a s k f l o w . s h e l l a c t i v i t i e s \ C u r r e n t \ D a t a
How would I remove these white spaces?