How to Search Text in a PDF Using a Regular Expression & Add a Hyperlink over It

This technical tip shows how .NET developers can search for text inside a PDF file using a regular expression and add hyperlinks over the matches in their .NET applications. To find a phrase and add a hyperlink over it, first pass the regular expression as a parameter to the TextFragmentAbsorber constructor, then assign a TextSearchOptions object that specifies whether the regular expression is used or not. After that, get the matching phrases from the absorber's TextFragments collection and loop through the matches to get their rectangular dimensions, change the foreground color to blue (optional, to make the text look like a hyperlink), and create a link using the PdfContentEditor class's CreateWebLink(..) method. Lastly, save the updated PDF using the Save method of the PdfContentEditor object.
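Before running the search against a PDF, it can help to check what the pattern matches on plain text. The sketch below uses only System.Text.RegularExpressions and the same pattern as the C# sample. The PatternCheck class, the FindMatches helper and the sample string are hypothetical and are not part of Aspose.Pdf. It also assumes the absorber interprets the pattern the same way the .NET Regex class does, and text extracted from a real PDF may be segmented differently from a plain string.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternCheck
{
    // Same pattern as the C# sample: "D", exactly seven lowercase letters, then a colon.
    public const string Pattern = "D[a-z]{7}:";

    // Hypothetical helper: returns every match of Pattern in the given text.
    public static string[] FindMatches(string text)
    {
        MatchCollection matches = Regex.Matches(text, Pattern);
        string[] results = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
        {
            results[i] = matches[i].Value;
        }
        return results;
    }

    public static void Main()
    {
        // Made-up sample text, not taken from any real PDF.
        string sample = "Document: report.pdf, Date: today, Download: later";
        foreach (string hit in FindMatches(sample))
        {
            Console.WriteLine(hit); // prints "Document:" and then "Download:"
        }
    }
}
```

"Date:" is skipped because only three lowercase letters follow the "D", while {7} requires exactly seven.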
The following code snippet shows how to search text inside a PDF file using a regular expression and add hyperlinks over the matches.

//C# Code Sample
using System;
using Aspose.Pdf.Facades;
using Aspose.Pdf.Text;

//create absorber object to find all instances of the input search phrase
TextFragmentAbsorber absorber = new TextFragmentAbsorber("D[a-z]{7}:");
//Enable regular expression search
absorber.TextSearchOptions = new TextSearchOptions(true);
//open document
PdfContentEditor editor = new PdfContentEditor();
// bind source PDF file
editor.BindPdf("c:/pdftest/25741-out.pdf");
//accept the absorber for the page
editor.Document.Pages[1].Accept(absorber);
int[] dashArray = { };
String[] LEArray = { };
System.Drawing.Color blue = System.Drawing.Color.Blue;
//loop through the fragments
foreach (TextFragment textFragment in absorber.TextFragments)
{
    //make the matched text look like a hyperlink
    textFragment.TextState.ForegroundColor = Aspose.Pdf.Color.Blue;
    //integer rectangle covering the match, padded slightly in width and height
    System.Drawing.Rectangle rect = new System.Drawing.Rectangle(
        (int)textFragment.Rectangle.LLX,
        (int)Math.Round(textFragment.Rectangle.LLY),
        (int)Math.Round(textFragment.Rectangle.Width + 2),
        (int)Math.Round(textFragment.Rectangle.Height + 1));
    Enum[] actionName = new Enum[2]
    {
        Aspose.Pdf.InteractiveFeatures.PredefinedAction.Document_AttachFile,
        Aspose.Pdf.InteractiveFeatures.PredefinedAction.Document_ExtractPages
    };
    //create the link over the match (the second argument is the target URL)
    editor.CreateWebLink(rect, "", 1, blue, actionName);
    //draw an underline one unit below the match
    editor.CreateLine(rect, "",
        (float)textFragment.Rectangle.LLX + 1, (float)textFragment.Rectangle.LLY - 1,
        (float)textFragment.Rectangle.URX, (float)textFragment.Rectangle.LLY - 1,
        1, 1, blue, "S", dashArray, LEArray);
}
//Save & Close the document
editor.Save("c:/pdftest/TextReplaced_with_Links.pdf");
editor.Close();

//VB.NET Code Sample
'create absorber object to find all instances of the input search phrase
Dim absorber As Aspose.Pdf.Text.TextFragmentAbsorber = New Aspose.Pdf.Text.TextFragmentAbsorber("D[a-z]{7}")
'Enable regular expression search
absorber.TextSearchOptions = New Aspose.Pdf.Text.TextSearchOptions(True)
'open document
Dim editor As Aspose.Pdf.Facades.PdfContentEditor = New Aspose.Pdf.Facades.PdfContentEditor()
' bind source PDF file
editor.BindPdf("c:/pdftest/25741-out.pdf")
'accept the absorber for the page
editor.Document.Pages(1).Accept(absorber)
Dim dashArray() As Integer = {}
Dim LEArray() As String = {}
Dim blue As System.Drawing.Color = System.Drawing.Color.Blue
'loop through the fragments
For Each textFragment As Aspose.Pdf.Text.TextFragment In absorber.TextFragments
    'make the matched text look like a hyperlink
    textFragment.TextState.ForegroundColor = Aspose.Pdf.Color.Blue
    'integer rectangle covering the match, padded slightly in width and height
    Dim rect As System.Drawing.Rectangle = New System.Drawing.Rectangle(
        CType(textFragment.Rectangle.LLX, Integer),
        CType(Math.Round(textFragment.Rectangle.LLY), Integer),
        CType(Math.Round(textFragment.Rectangle.Width + 2), Integer),
        CType(Math.Round(textFragment.Rectangle.Height + 1), Integer))
    Dim actionName() As System.Enum = New System.Enum() {
        Aspose.Pdf.InteractiveFeatures.PredefinedAction.Document_AttachFile,
        Aspose.Pdf.InteractiveFeatures.PredefinedAction.Document_ExtractPages}
    'create the link over the match (the second argument is the target URL)
    editor.CreateWebLink(rect, "", 1, blue, actionName)
    'draw an underline one unit below the match
    editor.CreateLine(rect, "",
        CType(textFragment.Rectangle.LLX + 1, Single), CType(textFragment.Rectangle.LLY - 1, Single),
        CType(textFragment.Rectangle.URX, Single), CType(textFragment.Rectangle.LLY - 1, Single),
        1, 1, blue, "S", dashArray, LEArray)
Next
'Save & Close the document
editor.Save("c:/pdftest/TextReplaced_with_Links.pdf")
editor.Close()
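The rectangle and underline arithmetic inside the loop can be pulled out and checked without a PDF at hand. The following is a minimal sketch of that arithmetic only. LinkGeometry, ToLinkRect and ToUnderline are hypothetical names and not part of Aspose.Pdf, and plain doubles stand in for the fragment's LLX, LLY, URX and URY values. Unlike the samples, which convert LLX directly to an integer, this sketch rounds all four values the same way, which is an assumption about the intended behaviour.

```csharp
using System;

public static class LinkGeometry
{
    // Integer (x, y, width, height) for the link area, mirroring the
    // arguments passed to System.Drawing.Rectangle in the samples:
    // width is padded by 2 and height by 1 before rounding.
    public static (int X, int Y, int Width, int Height) ToLinkRect(
        double llx, double lly, double urx, double ury)
    {
        double width = urx - llx;
        double height = ury - lly;
        return (
            (int)Math.Round(llx),
            (int)Math.Round(lly),
            (int)Math.Round(width + 2),
            (int)Math.Round(height + 1));
    }

    // Endpoints of the underline: it starts one unit right of the left edge,
    // ends at the right edge, and sits one unit below the bottom edge.
    public static (float X1, float Y1, float X2, float Y2) ToUnderline(
        double llx, double lly, double urx)
    {
        return ((float)llx + 1, (float)lly - 1, (float)urx, (float)lly - 1);
    }

    public static void Main()
    {
        // Made-up coordinates for illustration.
        var rect = ToLinkRect(100.4, 700.6, 160.7, 712.0);
        Console.WriteLine(rect); // prints (100, 701, 62, 12)
    }
}
```

Keeping the arithmetic in one place makes it easier to adjust the padding if the link area turns out too tight or too loose for a given font size.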


Language: C# | User: Sheraz Khan | Created: Feb 25, 2015 | Tags: Search Text in PDF file Search Text Based on Regex add Hyperlink over PDF Text Search Text Based regular expression PDF add hyperlink over a text open PDF file in .NET .NET PDF component