I'm trying to read a .doc file with Ruby, using the win32ole library.
Here is my code:
require 'win32ole'

class DocParser
  def initialize
    @content = ''
  end

  def read_file(file_path)
    begin
      word = WIN32OLE.connect('Word.Application')
      doc = word.activedocument
    rescue
      word = WIN32OLE.new('Word.Application')
      doc = word.documents.open(file_path)
    end
    word.visible = false
    doc.sentences.each { |x| @content = @content + x.text }
    word.quit
    @content
  end
end
I kick off the reading with DocParser.new.read_file('path/file.doc')
When I run this from rails c, I don't have any problems; it works fine.
But when I run it inside the running Rails app (e.g. after a button click), it crashes once in a while (roughly every third or fourth time) with this error:
WIN32OLERuntimeError (failed to create WIN32OLE object from `Word.Application'
HRESULT error code:0x800401f0
CoInitialize has not been called.):
lib/file_parsers/doc_parser.rb:14:in `initialize'
lib/file_parsers/doc_parser.rb:14:in `new'
lib/file_parsers/doc_parser.rb:14:in `rescue in read_file'
lib/file_parsers/doc_parser.rb:10:in `read_file'
lib/search_engine.rb:10:in `block in search'
lib/search_engine.rb:43:in `block in each_file_in'
lib/search_engine.rb:42:in `each_file_in'
lib/search_engine.rb:8:in `search'
app/controllers/home_controller.rb:9:in `search'
Rendered c:/Ruby193/lib/ruby/gems/1.9.1/gems/actionpack-4.1.1/lib/action_dispatch/middleware/templates/rescues/_source.erb (0.0ms)
Rendered c:/Ruby193/lib/ruby/gems/1.9.1/gems/actionpack-4.1.1/lib/action_dispatch/middleware/templates/rescues/_trace.text.erb (2.0ms)
Rendered c:/Ruby193/lib/ruby/gems/1.9.1/gems/actionpack-4.1.1/lib/action_dispatch/middleware/templates/rescues/_request_and_response.text.erb (2.0ms)
Rendered c:/Ruby193/lib/ruby/gems/1.9.1/gems/actionpack-4.1.1/lib/action_dispatch/middleware/templates/rescues/diagnostics.erb (56.0ms)
Additionally, sometimes this code reads the doc file successfully, but Rails crashes after a few seconds: look at this gist
What is my problem? How can I fix it? Please, help!
I don't know what differs between rails c and the running Rails app, so I'll give some general advice.
First, it is a bad idea to run this in a web server: Word is started on the server for every request, so what happens when multiple users start using this at the same time?
You'd better convert your .doc files to another format first, such as .rtf or .docx (perhaps as a batch conversion), and then use other gems that don't require Word itself.
If you keep it like this, consider not quitting Word (remove the word.quit call) and only closing the document itself; the running instance will then be picked up the next time by WIN32OLE.connect.
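A minimal sketch of that suggestion, not a tested fix for the crash. The helper names word_application and read_doc_text are made up for illustration. It assumes Word's Document.Close accepts a SaveChanges argument and that false means "do not save".

```ruby
# win32ole ships only with Ruby on Windows, so load it conditionally.
require 'win32ole' if Gem.win_platform?

# Reuse a running Word instance if there is one, otherwise start a new one.
def word_application
  WIN32OLE.connect('Word.Application')
rescue WIN32OLERuntimeError
  WIN32OLE.new('Word.Application')
end

# Hypothetical helper: opens file_path, collects the text of every sentence,
# and closes only that document so that Word itself keeps running.
def read_doc_text(word, file_path)
  doc = word.documents.open(file_path)
  text = ''.dup
  begin
    doc.sentences.each { |sentence| text << sentence.text }
  ensure
    doc.close(false) # false = do not save changes (assumption, see above)
  end
  text
end
```

It would be called as read_doc_text(word_application, file_path). Because word.quit is never called, the next request should find the same instance through WIN32OLE.connect.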
While testing, you'd better keep Word visible so that you can see what is happening (errors?). I notice your path uses forward slashes, while backslashes are needed in this case, but since your code runs a few times before the error, I suppose that is not the problem.
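If you want to rule out the slashes anyway, the path can be converted before it is handed to Word. A small sketch; to_windows_path is a made-up helper name, and it only swaps the separators without making the path absolute.

```ruby
# Hypothetical helper: turns a forward-slash path into the backslash form.
def to_windows_path(path)
  path.tr('/', '\\')
end

puts to_windows_path('path/file.doc')
```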
Hope this helps.
I upgraded Ruby from 1.9.3 to 2.0.0.
Now Rails doesn't crash, and I have no problems with win32ole or with reading old-format MS Word documents.
I guess the problem was memory usage, because newer Ruby (2.0.0 and later) uses a new garbage collector.