Tesseract OCR string manipulation issue


I am hoping someone can help me with this issue. I am new to Tesseract OCR, so please bear with me. I wrote some simple code that extracts text from an image, and that part works fine.

Code below:

    private void button1_Click(object sender, EventArgs e)
    {
        List<string> list = new List<string>();
        StringBuilder concatenatedString = new StringBuilder();

        var image = new Bitmap("1.jpg");
        var ocr = new TesseractEngine("./tessdata", "eng", EngineMode.TesseractAndCube);
        var page = ocr.Process(image);
        list.Add(page.GetText());

        foreach (string how in list)
        {
            concatenatedString.Append(how);
            // Console.WriteLine(how);
        }
        Console.WriteLine(concatenatedString);
    }

I do get output; the trouble is that, no matter what I do, I can’t seem to merge it into one line. The output I get is below.

Output:

16:29:02 SIN: LE1005 Rec: 000375 User2001
07/01/2020 _ I 1463

  


The program '[30388] ocr_image.exe' has exited with code -1 (0xffffffff).

This is what my output looks like: multiple lines, including blank ones. I have tried things like Join and Concat, but they don’t seem to make any difference.
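A likely reason Join and Concat have no effect is that the line breaks are embedded inside the single string returned by page.GetText(), so combining list elements never touches them. One possible approach, sketched here and not tested against Tesseract itself, is to collapse every run of whitespace (including \r and \n) into a single space. The class name OcrText and the method name ToSingleLine are made up for illustration; Regex.Replace and Trim are standard .NET calls.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of the Tesseract wrapper): flattens
// multi-line OCR text into one line.
public static class OcrText
{
    public static string ToSingleLine(string text)
    {
        // Treat null or empty input as an empty result.
        if (string.IsNullOrEmpty(text))
        {
            return string.Empty;
        }

        // \s+ matches any run of whitespace: spaces, tabs, \r and \n.
        // Each run becomes one space; Trim drops the leading and trailing ones.
        return Regex.Replace(text, @"\s+", " ").Trim();
    }
}
```

Assuming the rest of the handler stays as it is, the call would be Console.WriteLine(OcrText.ToSingleLine(page.GetText())); and for the output shown above this should give a single line with the timestamp line and the date line separated by one space. If the spacing inside each line has to be preserved, splitting on '\r' and '\n' with StringSplitOptions.RemoveEmptyEntries and then calling string.Join(" ", ...) would be an alternative.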

c#
string
tesseract
asked on Stack Overflow Jul 2, 2020 by Vijay Yadav


