How can I find all the links in a PDF document?

Hyperlinks direct a user to a different part of the same PDF document or to an external web page. In PDF, a link is a type of annotation. So to compile a list of all the links in a PDF document, you would search for annotations of Type “Annot” and Subtype “Link.” For each link found, you then need to check whether it has a defined Action and, if so, whether that action points to a URI/URL, as opposed to a GoTo action that simply jumps to a destination elsewhere in the document.

Adobe PDF Library does not offer a method that lists all of the links in a PDF document in a single call. To obtain the links, you need to inspect each page of the document to see whether it has any annotations:

  • Acquire the page
  • Call PDPageGetNumAnnots to get the number of annotations on the page, and PDPageGetAnnot to retrieve each one
  • Inspect each annotation with PDAnnotGetSubtype to determine if an annotation is a link

The following sample code summarizes this sequence:

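/* Look up the atom and the annotation count once, outside the loop. */
ASAtom linkAtom = ASAtomFromString("Link");
ASInt32 numAnnots = PDPageGetNumAnnots(pdPage);

for (ASInt32 i = 0; i < numAnnots; i++)
{
    PDAnnot annot = PDPageGetAnnot(pdPage, i);
    /* Only valid annotations whose Subtype is "Link" are of interest. */
    if (PDAnnotIsValid(annot) &&
        PDAnnotGetSubtype(annot) == linkAtom)
    {
        ...
    }
}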

Then, among the link annotations, you need to identify those whose action is a URI action, which references a URL, rather than a GoTo action, which takes the reader to another location within the same document.
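The filtering rule described here can be sketched in plain C without the library. This is a minimal sketch only: LinkInfo, external_uri, and collect_uris are hypothetical names invented for illustration, not Adobe PDF Library types or calls. The sketch assumes you have already read each annotation's Subtype, its action type (the action's S entry), and its URI string into a simple record. In the library itself, that information would likely come from calls such as PDLinkAnnotGetAction and PDActionGetSubtype, which you should confirm against the API reference.

```c
#include <string.h>

/* Hypothetical stand-in for the values read from one annotation.
   These are NOT Adobe PDF Library types. */
typedef struct {
    const char *subtype;      /* annotation Subtype, e.g. "Link" */
    const char *action_type;  /* action S entry, e.g. "URI" or "GoTo"; NULL if no action */
    const char *uri;          /* target of a URI action; NULL otherwise */
} LinkInfo;

/* Returns the target URI if the annotation is a link with a URI action,
   otherwise NULL. */
static const char *external_uri(const LinkInfo *a)
{
    if (a->subtype == NULL || strcmp(a->subtype, "Link") != 0)
        return NULL;  /* not a link annotation */
    if (a->action_type == NULL || strcmp(a->action_type, "URI") != 0)
        return NULL;  /* no action, or an internal jump such as GoTo */
    return a->uri;
}

/* Stores the URIs of external links in out (at most max entries) and
   returns the number stored. */
static size_t collect_uris(const LinkInfo *annots, size_t n,
                           const char **out, size_t max)
{
    size_t found = 0;
    for (size_t i = 0; i < n && found < max; i++) {
        const char *uri = external_uri(&annots[i]);
        if (uri != NULL)
            out[found++] = uri;
    }
    return found;
}
```

Given an array holding one URI link, one GoTo link, and one non-link annotation, collect_uris stores only the URI link's target. Comparing strings keeps the sketch independent of the library; real code would compare ASAtom values, as in the loop shown earlier.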

See Section 12.6.4, “Action Types,” in the ISO 32000 Reference, page 417.

 

 

 
