How to Extract Text From Part of an Image Using C# & VB.NET

This technical tip shows how to extract text from part of an image inside .NET applications. Aspose.OCR for .NET provides the OcrEngine class for extracting text from a specific part of an image document. OcrEngine requires a source image, at least one language, and a resource file for character recognition. The source image is the document on which OCR is performed; it can be a BMP, TIFF, JPEG, GIF or PNG file and is set through the OcrEngine.Image property.

One or more languages must be specified before performing OCR, because OcrEngine tries to recognize characters of the specified languages in the image. OcrEngine recognizes text word by word, and each recognized word is assigned a language, which may differ from the language of other words. Aspose.OCR for .NET also maintains a priority for each language: the language added first has the highest priority, each language added afterwards has a lower one, and the last language added has the lowest. This priority matters when OCR is performed.
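The priority rule described above (first added, highest priority) can be pictured with a small stand-alone model. This is only an illustrative sketch: LanguagePriorityList, PriorityOf and the "spanish" entry are hypothetical names chosen for the example, they are not part of Aspose.OCR, and the sketch does not show how the engine uses the priority internally.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for an ordered language collection; NOT an Aspose.OCR type.
public sealed class LanguagePriorityList
{
    private readonly List<string> _languages = new List<string>();

    // A new language is appended after all existing ones,
    // so it receives the lowest priority assigned so far.
    public void Add(string language)
    {
        if (string.IsNullOrWhiteSpace(language))
            throw new ArgumentException("Language name is required.", nameof(language));
        if (!_languages.Contains(language))
            _languages.Add(language);
    }

    // Returns 0 for the highest-priority language, or -1 if it was never added.
    public int PriorityOf(string language)
    {
        return _languages.IndexOf(language);
    }
}

public static class LanguagePriorityDemo
{
    public static void Main()
    {
        var languages = new LanguagePriorityList();
        languages.Add("english");   // added first: highest priority (rank 0)
        languages.Add("spanish");   // added second: lower priority (rank 1)

        Console.WriteLine(languages.PriorityOf("english")); // prints 0
        Console.WriteLine(languages.PriorityOf("spanish")); // prints 1
    }
}
```

The sample that follows adds only English, so priority does not come into play there; the ordering becomes relevant once AddLanguage is called more than once.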
//Namespaces this snippet relies on (assumed): System, System.IO, Aspose.OCR
const string resourceFileName = @"Aspose.OCR.Resources.zip";
try
{
    //Create an instance of OcrEngine and assign 
    //image, language and image settings
    OcrEngine ocrEngine = new OcrEngine();
    ocrEngine.Image = ImageStream.FromFile("Sample.bmp");
    ocrEngine.Languages.AddLanguage(Language.Load("english"));
    ocrEngine.Config.UseDefaultDictionaries = true;
    //Define the block in which to recognize text
    int startX = 0, startY = 0, width = 120, height = 100;
    //Clear recognition blocks
    ocrEngine.Config.ClearRecognitionBlocks();
    //Add one rectangular text block to the user defined recognition blocks
    ocrEngine.Config.AddRecognitionBlock(RecognitionBlock.CreateTextBlock(startX, startY, width, height));
    //Set the resource file name and extract OCR text
    using (ocrEngine.Resource = new FileStream(resourceFileName, FileMode.Open))
    {
        try
        {
            if (ocrEngine.Process())
            {
                //Retrieve the user defined blocks that determine the page layout
                var blocks = ocrEngine.Config.RecognitionBlocks;
                //Loop over the list of blocks
                foreach (var block in blocks)
                {
                    //Display if block is set to be recognized
                    Console.WriteLine(block.ToRecognize);
                    //Check if block has recognition data
                    if (block.RecognitionData == null)
                    {
                        Console.WriteLine("Null{0}", Environment.NewLine);
                        continue;
                    }
                    //Display dimension & size of rectangle that defines the recognition block
                    Console.WriteLine("Block: {0}", block.Rectangle);
                    //Display the recognition results
                    Console.WriteLine("Text: {0}{1}", block.RecognitionData.Text, Environment.NewLine);
                }
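            }
        }
        catch (Exception ex)
        {
            //Report any error raised while processing the image
            Console.WriteLine("Exception: " + ex.Message);
        }
    }
}
catch (Exception ex)
{
    //Report any error raised while setting up the engine or opening the resource file
    Console.WriteLine("Exception: " + ex.Message);
}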
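The sample hard-codes a 120 by 100 block starting at (0, 0). If the source image might be smaller than that, one option is to clamp the block to the image before passing it to CreateTextBlock. The helper below is a minimal sketch of that idea: BlockBounds and TryClamp are hypothetical names, the 100 by 80 image size is made up for the demonstration, and nothing here states how the engine itself treats a block that extends past the image.

```csharp
using System;

// Hypothetical helper for illustration; NOT part of Aspose.OCR.
public static class BlockBounds
{
    // Clamps the block (x, y, width, height) to an image of the given size.
    // Returns false, leaving the arguments untouched, if no part of the
    // block lies inside the image.
    public static bool TryClamp(int imageWidth, int imageHeight,
        ref int x, ref int y, ref int width, ref int height)
    {
        // long arithmetic avoids overflow when x + width exceeds int.MaxValue
        long left = Math.Max(x, 0);
        long top = Math.Max(y, 0);
        long right = Math.Min((long)x + width, imageWidth);
        long bottom = Math.Min((long)y + height, imageHeight);

        if (right <= left || bottom <= top)
            return false;

        x = (int)left;
        y = (int)top;
        width = (int)(right - left);
        height = (int)(bottom - top);
        return true;
    }
}

public static class BlockBoundsDemo
{
    public static void Main()
    {
        // Block values from the sample; the image size is an assumed example.
        int startX = 0, startY = 0, width = 120, height = 100;

        if (BlockBounds.TryClamp(100, 80, ref startX, ref startY, ref width, ref height))
            Console.WriteLine("{0},{1} {2}x{3}", startX, startY, width, height); // prints 0,0 100x80
        else
            Console.WriteLine("Block lies outside the image");
    }
}
```

The clamped values can then be passed to RecognitionBlock.CreateTextBlock exactly as in the sample.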

Url: http://www.aspose.com/docs/display/ocrnet/Extracting+Text+from+Part+of+an+Image

Language: C/C++ | User: Sheraz Khan | Created: Jul 16, 2014 | Tags: Extract Text from Part of Image, perform OCR on source image, set the source image, recognize characters in image, Extract Text from Specific Part of the Image, do OCR on a Specific Block of Image, .NET OCR Component