How to get a block at an offset in the IO.foreach loop in ruby?

I'm using the IO.foreach loop to find a string using regular expressions. I want to append the next block (next line) to the file_names list. How can I do that?

file_names = [""]
IO.foreach("a.txt") { |block|
  if block =~ /^file_names*/
    dir = # get the next block
    file_names.append(dir)
  end
}


Actually my input looks like this:

file_names[174]:                                                                                                         
           name: "vector"                                                                                                
      dir_index: 1                                                                                                       
       mod_time: 0x00000000                                                                                              
         length: 0x00000000                                                                                              
file_names[175]:                                                                                                         
           name: "stl_bvector.h"                                                                                         
      dir_index: 2                                                                                                       
       mod_time: 0x00000000                                                                                              
         length: 0x00000000    

I have a list of file_names entries, and I want to capture the name, dir_index, mod_time and length properties of each one and store them in the file_names array at the index given in the text.

ruby
asked on Stack Overflow Oct 23, 2019 by zengod • edited Oct 23, 2019 by zengod

2 Answers

You can use #each_cons to walk the file in overlapping windows of 5 lines. Whenever a window starts with a file_names line, the next 4 rows hold its values:

files = IO.foreach("text.txt").each_cons(5).with_object([]) do |block, o|
  if block[0] =~ /file_names.*/
    o << block[1..4].map{|e| e.split(':')[1]}
  end
end

puts files
#=> "vector"                                                                                                
#    1                                                                                                       
#    0x00000000                                                                                              
#    0x00000000                                                                                              
#    "stl_bvector.h"                                                                                         
#    2                                                                                                       
#    0x00000000                                                                                              
#    0x00000000 

Keep in mind that the files array contains subarrays of 4 elements. If a : can also occur inside a value, later in the line, replace the third line of my code with this:

o << block[1..4].map{ |e| e.partition(':').last.strip}

I also added #strip to remove the whitespace around the values. With this line changed, the actual array will look something like this:

p files
#=> [["\"vector\"", "1", "0x00000000", "0x00000000"], ["\"stl_bvector.h\"", "2", "0x00000000", "0x00000000"]]

(The values don't contain the \ escape character; that's just how #p displays them.)
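
To illustrate why #partition is the safer choice, here is a minimal sketch using a made-up line whose value contains a colon (the path is hypothetical and does not come from the input above):

```ruby
# Hypothetical line whose value itself contains a colon.
line = "           name: \"C:/include/vector\"\n"

# split(':') cuts at every colon, so index 1 holds only a fragment of the value.
split_value = line.split(':')[1].strip

# partition(':') cuts at the first colon only, so the rest of the line stays intact.
partition_value = line.partition(':').last.strip

p split_value      # prints "\"C"
p partition_value  # prints "\"C:/include/vector\""
```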

Another option: if you know that the pattern of 1 file_names line followed by 4 value lines holds for the entire text file, and that the file always starts with a file_names line, you can replace #each_cons with #each_slice and remove the regex completely. This should also speed things up, because each line is then visited only once:

files = IO.foreach("text.txt").each_slice(5).with_object([]) do |block, o|
  o << block[1..4].map{ |e| e.partition(':').last.strip }
end
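
The difference between the two methods can be seen on a small in-memory array. This is only a sketch with placeholder strings (h1, a, b and so on are made up) and groups of 3 instead of 5:

```ruby
# Placeholder data: two "header" entries, each followed by two values.
lines = %w[h1 a b h2 c d]

# each_cons yields overlapping windows, advancing one element at a time.
windows = lines.each_cons(3).to_a

# each_slice yields consecutive, non-overlapping groups.
chunks = lines.each_slice(3).to_a

p windows.length  # prints 4
p chunks          # prints [["h1", "a", "b"], ["h2", "c", "d"]]
```

With #each_cons most windows do not start at a header, which is why that version needs the regex check, while #each_slice relies on the file layout being strictly regular.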

answered on Stack Overflow Oct 23, 2019 by Viktor • edited Oct 23, 2019 by Viktor

It's actually pretty easy to carve up a series of lines based on a pattern using slice_before:

File.readlines("data.txt").slice_before(/\Afile_names/).to_a

Now you have an array of arrays that looks like:

[
  [
    "file_names[174]:\n",
    "           name: \"vector\"\n",
    "      dir_index: 1\n",
    "       mod_time: 0x00000000\n",
    "         length: 0x00000000\n"
  ],
  [
    "file_names[175]:\n",
    "           name: \"stl_bvector.h\"\n",
    "      dir_index: 2\n",
    "       mod_time: 0x00000000\n",
    "         length: 0x00000000"
  ]
]

Each of these groups could be transformed further, for example into a Ruby Hash keyed by those field names.
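
One possible way to do that transformation is sketched below. It assumes every group starts with a file_names[N]: header line, and it reads the sample from an inline string rather than from data.txt so that it runs on its own. The names SAMPLE and entries are illustrative only.

```ruby
# Inline copy of the sample input. With a real file,
# File.readlines("data.txt") would supply the same lines.
SAMPLE = <<~TEXT
  file_names[174]:
             name: "vector"
        dir_index: 1
         mod_time: 0x00000000
           length: 0x00000000
  file_names[175]:
             name: "stl_bvector.h"
        dir_index: 2
         mod_time: 0x00000000
           length: 0x00000000
TEXT

# Group the lines at each header, then build one Hash per group,
# stored under the numeric index taken from the header line.
entries = SAMPLE.lines.slice_before(/\Afile_names/).each_with_object({}) do |(header, *fields), result|
  index = header[/\[(\d+)\]/, 1].to_i
  result[index] = fields.map do |field|
    key, _, value = field.partition(':')
    [key.strip.to_sym, value.strip]
  end.to_h
end

p entries[174][:name]       # prints "\"vector\""
p entries[175][:dir_index]  # prints "2"
```

The values are kept as the raw strings from the file, including the quotes around the name. Converting them to integers or stripping the quotes would be a further step. A Hash keyed by index is used here instead of an Array so that a starting index such as 174 does not leave a long run of nil entries.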

answered on Stack Overflow Oct 23, 2019 by tadman

User contributions licensed under CC BY-SA 3.0