
The Outlines List
The outline list shows any outlines (a.k.a. bookmarks) defined in the master document. This list is not available for imposed layouts. Clicking an entry in the list scrolls the content view to the entry’s target page. When editing outlines it may be handy to turn this linking behaviour off. To do so, turn on the Caps Lock key; while the key is active, the links of the outline list are inactive. Outlines are created by dragging pages from the page list and dropping them at the desired location in the outline.
Creating Outlines
Outlines allow interactive navigation of a document. An outline consists of a tree-structured hierarchy of outline items (also known as bookmarks), which serves as a visual table of contents that displays a document’s structure while allowing quick access to other parts of the document. Clicking the disclosure triangle to the left of a parent outline item opens or closes that item. Open items show their immediate children in the hierarchy, and each child in turn may be a parent of other children and may either be open or closed. Closed items hide all their descendants in the hierarchy. Clicking the text of a visible item activates its associated action. Usually this results in a jump to another part of the document, although it is possible to define external destinations for outline items.
Creating individual outlines
To create an outline item drag a page from the page list to the outline list at the desired location. You can drop the page on top of an existing item to make the new outline item a child of the existing item, or you can drop the page above or below an existing item to create a new sibling at that location. By default the new item’s title reflects the dragged page’s label (as it appears in the page list).
Creating outlines automatically from table of contents style entries
PDFClerk is able to automatically create outlines in the Outline List by analysing a Table of Contents. This feature will work for the majority of TOC entries, and allows for some flexibility in how the TOC is formatted. To create the outlines, you need to tell PDFClerk a few specifics about the formatting of the TOC and the desired formatting of the resulting outline entries. Choose the menu item Tools->Create Outline From TOC.

In the dialog you specify the page on which the TOC starts and the page on which the TOC ends. As you enter numbers in the input fields for the start and end pages, the preview will automatically scroll to the page with the number entered. (Note that these fields expect the number as an index to the desired page from the start of the document. This is not necessarily the same as the page label for the desired page, which isn’t even necessarily a number.)
Draw a selection around the section of the page currently shown that you want PDFClerk to analyse. This allows you to avoid entries that do not belong in a TOC, like page numbers, headers and footers. If the table of contents has a columnar layout, you can draw multiple boxes around the relevant areas (see picture above) after activating the caps-lock key. Make sure to create the boxes in the order you want PDFClerk to analyse them. For most western languages that should be left-most column first, then subsequent columns to the right. Note that you must select each column separately to get good results for columnar TOCs.
Next you specify whether PDFClerk tries to create a hierarchical outline structure by following the visual indentation of each TOC entry, or whether it should follow chapter and paragraph numbering (for instance a paragraph entry might look like this: “2.4.1 The General Structure of a Table of Contents”). Alternatively you can choose to forgo automatic indentation.
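To illustrate what following chapter and paragraph numbering means, here is a minimal Python sketch that derives a nesting depth from the leading number of a TOC entry. It is not PDFClerk's actual algorithm; the function name `numbering_depth` and the pattern are assumptions made for this example only.

```python
import re

# Hypothetical pattern: a leading number made of dot-separated parts,
# optionally followed by a final dot, then whitespace ("2.4.1 Title").
NUMBERING = re.compile(r"^(\d+(?:\.\d+)*)\.?\s+")

def numbering_depth(entry: str) -> int:
    """Return the number of dotted components, or 0 if the entry is unnumbered."""
    match = NUMBERING.match(entry)
    if match is None:
        return 0
    return len(match.group(1).split("."))

entries = [
    "2 Tables of Contents",
    "2.4 Structure",
    "2.4.1 The General Structure of a Table of Contents",
    "Appendix",
]
for entry in entries:
    print(numbering_depth(entry), entry)
```

An unnumbered entry such as "Appendix" yields depth 0 here, which is the kind of irregular entry the special terms described below are meant to handle.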
TOCs usually have a number of entries that do not follow the general pattern of the regular entries. For instance the first entry may refer to the TOC itself; an Appendix entry will not have a paragraph numbering; some entries may state a chapter number before the actual name of the chapter (like: “Chapter 4: The frugality of the means”), but you do not want the chapter indication to be included in the created outline. For such cases you can alert PDFClerk to special terms so that it can deal with them appropriately. Terms can be a single word or an expression consisting of multiple words. PDFClerk only looks for the first occurrence of a term in a TOC entry; subsequent occurrences of the special term will not be treated as special terms. You can specify the following characteristics and options for the terms:
1. Has Page Label: Use if the term includes a page label that doesn’t follow the normal pattern of the page label (which is either at the start or at the end of the TOC entry).
2. Hide Term: The term will not be included in the label text for the outline item created for this TOC entry.
3. Has Trailer: The term is followed by a token that qualifies the term (like the “2” in “Chapter 2”).
4. Hide Trailer: The trailer will not be included in the label text for the outline item created for this TOC entry.
5. Has Text Numbering: the term is preceded at the start of the line by a chapter/paragraph number. This is useful only when “Indentation” is set to “Follow Chapter/Paragraph Numbering”.
6. Link Line To Page It Appears On: This usually applies to the special terms “Contents” and “Table of Contents” which should be linked to the page they appear on.
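The effect of the Hide Term, Has Trailer and Hide Trailer options on an outline label can be modelled roughly as follows. This is a simplified Python sketch, not PDFClerk's implementation: the function `apply_term` is hypothetical, and treating the trailer as the next whitespace-delimited token is an assumption.

```python
def apply_term(entry: str, term: str, hide_term: bool = False,
               has_trailer: bool = False, hide_trailer: bool = False) -> str:
    """Hypothetical model of the Hide Term / Has Trailer / Hide Trailer options."""
    start = entry.find(term)  # only the first occurrence of the term is considered
    if start == -1:
        return entry
    before = entry[:start]
    after = entry[start + len(term):]
    trailer = ""
    if has_trailer:
        # Assumption: the trailer is the next whitespace-delimited token.
        parts = after.split(None, 1)
        if parts:
            trailer = parts[0]
            after = parts[1] if len(parts) > 1 else ""
    kept = []
    if not hide_term:
        kept.append(term)
    if trailer and not hide_trailer:
        kept.append(trailer)
    kept.append(after.strip())
    return (before + " ".join(part for part in kept if part)).strip()

print(apply_term("Chapter 4: The frugality of the means", "Chapter",
                 hide_term=True, has_trailer=True, hide_trailer=True))
```

With all three options enabled, the chapter indication is dropped and only the chapter name remains in the label.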
If a document is in a language other than English you will of course need to change/enter the terms to the equivalent for the TOC elements in the other language.
Attempt to locate targets tells PDFClerk to look for the first occurrence of the relevant text in the TOC on the target page and to make the outline link directly to that section of the page. If you switch this option off PDFClerk will always create the links to the page itself, rather than to a specific location on the page.
Some TOC pages are encoded rather badly in the PDF stream and yield bad results. In such cases holding the option key while clicking OK in the outline generation dialog disables smart TOC line detection, which may improve results.
Limitations: Automatic outline generation is a convenient feature that can save you hours of manual labour where you otherwise would have to manually create each outline entry. However, it is not a foolproof method. Some entries may not be correctly parsed by PDFClerk and other issues may result in outline entries that need further manual adjustment. It is therefore recommended to check the outline list created by PDFClerk for completeness and correctness:
- Check that an outline was created for each TOC entry that was not specifically defined to be ignored.
- Check for spurious entries. Most notably TOC entries that span more than one line will not be handled correctly by PDFClerk.
- Check that each entry links to the correct page (and location on the page).
- Check the label text of each outline for correctness and spurious characters. (Systemic issues that can occur with some TOCs, like a spurious dot at the end of some outline labels, can easily be corrected using the find and replace feature for outlines.)
Creating outlines automatically from selected links
If a document has no outline, but has a table of contents whose entries are linked, then you can create outlines automatically by selecting the links and control clicking, or right clicking, the content view. At the bottom of the contextual menu that pops up is an option to Create Outline From Selected Links. PDFClerk will create an outline item for each link. The outline list will be flat, but it is easy to further arrange the entries into a hierarchy as explained under the heading Editing and Arranging Outlines below.
Linking Table Of Contents Entries To Their Targets
If the entries in a table of contents are not hyperlinked to their target pages, you can let PDFClerk Pro attempt to quickly link each line through the Link TOC Lines To Target Pages option under the Tools menu. The dialog is virtually identical to the Create Outline From TOC option dialog. The only differences are that indentation has no meaning and is therefore not available, and that, when the lines in the table of contents are closely spaced, it may be necessary to inset the height of the generated links so they do not overlap their neighbouring lines. To this end a Link Height Inset field is available where you can enter an integer value that will reduce the size. The most likely useful values lie between 1 and 5.
Editing and Arranging Outlines
Arranging outlines is a simple matter of drag and drop. As you drag outline items, indicators will show where the drop will take place.
Alternatively you can adjust the outline hierarchy using the Tab key: Select the outline(s) you wish to indent and press the Tab key. The outline will become the last child of the current sibling above it. If you hold down the option key while pressing Tab the outline will become the first child of the current sibling above it. If an outline is the first child in a group of siblings it will not be moved. Shift Tab will move selected outlines up in the outline hierarchy.
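The Tab behaviour described above can be pictured as a simple operation on a tree. The following Python sketch is only an illustration under assumed names (`Item`, `indent`); it does not reflect how PDFClerk stores outlines internally.

```python
from dataclasses import dataclass, field

@dataclass
class Item:
    """Hypothetical outline item: a title plus an ordered list of children."""
    title: str
    children: list = field(default_factory=list)

def indent(siblings: list, index: int, as_first_child: bool = False) -> bool:
    """Move siblings[index] under the sibling directly above it.

    Returns False and changes nothing when there is no sibling above,
    mirroring the rule that a first child is not moved.
    """
    if index <= 0 or index >= len(siblings):
        return False
    item = siblings.pop(index)
    parent = siblings[index - 1]
    if as_first_child:
        parent.children.insert(0, item)   # option key held: become first child
    else:
        parent.children.append(item)      # plain Tab: become last child
    return True

top = [Item("A", [Item("A1")]), Item("B")]
indent(top, 1)
print([child.title for child in top[0].children])
```

Here "B" becomes the last child of "A", after the existing child "A1".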
The title of the outline item (the text string displayed in the list) can be edited by double clicking the outline item’s title.
To edit the destination of an outline item open the Dynamic Annotations Inspector to make the necessary changes.
Outline items need not be hyperlinks to other pages in the document at hand: they can also be hyperlinks to destinations in external PDF documents, and they can even be altogether different actions, although it is unusual to create such alternative outline items.
To arrange outline items simply drag items to the desired location in the list. Items can easily be organised into multiple levels this way.
Using Find and Replace For Outlines
The Outline List sports a dedicated Find And Replace function that is accessed by control-clicking, or right-clicking, the list and selecting the Find And Replace menu item. This brings up the Find And Replace dialog for outlines:

This dialog allows you to search for words/expressions in the list of outlines and to make individual and/or global substitutions. Searches and replacements can be either literal (the text entered in the input fields is used as is for searching and replacing) or regular expressions (the input is parsed and interpreted, which allows for very powerful searches and substitutions.)
You can also batch indent/outdent outlines based on the matches. To do so select the desired action from the bottom popup menu, enter an appropriate query in the Find field and click the Apply button.
The regular expression engine is based on ICU regular expressions:
Regular Expression Metacharacters
Character | Description |
---|---|
\a | Match a BELL, \u0007 |
\A | Match at the beginning of the input. Differs from ^ in that \A will not match after a new line within the input. |
\b, outside of a [Set] | Match if the current position is a word boundary. Boundaries occur at the transitions between word (\w) and non-word (\W) characters, with combining marks ignored. |
\b, within a [Set] | Match a BACKSPACE, \u0008. |
\B | Match if the current position is not a word boundary. |
\cX | Match a control-X character. |
\d | Match any character with the Unicode General Category of Nd (Number, Decimal Digit.) |
\D | Match any character that is not a decimal digit. |
\e | Match an ESCAPE, \u001B. |
\E | Terminates a \Q ... \E quoted sequence. |
\f | Match a FORM FEED, \u000C. |
\G | Match if the current position is at the end of the previous match. |
\n | Match a LINE FEED, \u000A. |
\N{UNICODE CHARACTER NAME} | Match the named character. |
\p{UNICODE PROPERTY NAME} | Match any character with the specified Unicode Property. |
\P{UNICODE PROPERTY NAME} | Match any character not having the specified Unicode Property. |
\Q | Quotes all following characters until \E. |
\r | Match a CARRIAGE RETURN, \u000D. |
\s | Match a white space character. White space is defined as [\t\n\f\r\p{Z}]. |
\S | Match a non-white space character. |
\t | Match a HORIZONTAL TABULATION, \u0009. |
\uhhhh | Match the character with the hex value hhhh. |
\Uhhhhhhhh | Match the character with the hex value hhhhhhhh. Exactly eight hex digits must be provided, even though the largest Unicode code point is \U0010ffff. |
\w | Match a word character. Word characters are [\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Nd}]. |
\W | Match a non-word character. |
\x{hhhh} | Match the character with hex value hhhh. From one to six hex digits may be supplied. |
\xhh | Match the character with two digit hex value hh. |
\X | Match a Grapheme Cluster. |
\Z | Match if the current position is at the end of input, but before the final line terminator, if one exists. |
\z | Match if the current position is at the end of input. |
\n | Back Reference. Match whatever the nth capturing group matched. n must be a number >= 1 and <= total number of capture groups in the pattern. Note: Octal escapes, such as \012, are not supported in ICU regular expressions. |
[pattern] | Match any one character from the set. See UnicodeSet for a full description of what may appear in the pattern |
. | Match any character. |
^ | Match at the beginning of a line. |
$ | Match at the end of a line. |
\ | Quotes the following character. Characters that must be quoted to be treated as literals are * ? + [ ( ) { } ^ $ | \ . / |
Regular Expression Operators
Operator | Description |
---|---|
| | Alternation. A|B matches either A or B. |
* | Match 0 or more times. Match as many times as possible. |
+ | Match 1 or more times. Match as many times as possible. |
? | Match zero or one times. Prefer one. |
{n} | Match exactly n times |
{n,} | Match at least n times. Match as many times as possible. |
{n,m} | Match between n and m times. Match as many times as possible, but not more than m. |
*? | Match 0 or more times. Match as few times as possible. |
+? | Match 1 or more times. Match as few times as possible. |
?? | Match zero or one times. Prefer zero. |
{n}? | Match exactly n times |
{n,}? | Match at least n times, but no more than required for an overall pattern match |
{n,m}? | Match between n and m times. Match as few times as possible, but not less than n. |
*+ | Match 0 or more times. Match as many times as possible when first encountered, do not retry with fewer even if overall match fails (Possessive Match) |
++ | Match 1 or more times. Possessive match. |
?+ | Match zero or one times. Possessive match. |
{n}+ | Match exactly n times |
{n,}+ | Match at least n times. Possessive Match. |
{n,m}+ | Match between n and m times. Possessive Match. |
( ... ) | Capturing parentheses. Range of input that matched the parenthesized subexpression is available after the match. |
(?: ... ) | Non-capturing parentheses. Groups the included pattern, but does not provide capturing of matching text. Somewhat more efficient than capturing parentheses. |
(?> ... ) | Atomic-match parentheses. First match of the parenthesized subexpression is the only one tried; if it does not lead to an overall pattern match, back up the search for a match to a position before the "(?>" |
(?# ... ) | Free-format comment (?# comment ). |
(?= ... ) | Look-ahead assertion. True if the parenthesized pattern matches at the current input position, but does not advance the input position. |
(?! ... ) | Negative look-ahead assertion. True if the parenthesized pattern does not match at the current input position. Does not advance the input position. |
(?<= ... ) | Look-behind assertion. True if the parenthesized pattern matches text preceding the current input position, with the last character of the match being the input character just before the current position. Does not alter the input position. The length of possible strings matched by the look-behind pattern must not be unbounded (no * or + operators.) |
(?<! ... ) | Negative Look-behind assertion. True if the parenthesized pattern does not match text preceding the current input position, with the last character of the match being the input character just before the current position. Does not alter the input position. The length of possible strings matched by the look-behind pattern must not be unbounded (no * or + operators.) |
(?ismwx-ismwx: ... ) | Flag settings. Evaluate the parenthesized expression with the specified flags enabled or -disabled. |
(?ismwx-ismwx) | Flag settings. Change the flag settings. Changes apply to the portion of the pattern following the setting. For example, (?i) changes to a case insensitive match. |
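The difference between the "as many as possible" and "as few as possible" operators is easiest to see on a concrete label. The sketch below uses Python's `re` module as a stand-in, on the assumption that it behaves the same as ICU for these two basic operators; the sample label is invented.

```python
import re

label = "Intro (draft) (v2)"

# Greedy: .* grabs as much as possible, so the match runs to the last ")".
greedy = re.search(r"\(.*\)", label).group()

# Lazy: .*? grabs as little as possible, so the match stops at the first ")".
lazy = re.search(r"\(.*?\)", label).group()

print(greedy)
print(lazy)
```

When removing a single parenthesised remark from outline labels, the lazy form is usually the safer choice.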
Replacement Text
The replacement text for find-and-replace operations may contain references to capture-group text from the find. References are of the form $n, where n is the number of the capture group.
Character | Descriptions |
---|---|
$n | The text of capture group n will be substituted for $n. n must be >= 0 and not greater than the number of capture groups. A $ not followed by a digit has no special meaning, and will appear in the substitution text as itself, a $. |
\ | Treat the following character as a literal, suppressing any special meaning. Backslash escaping in substitution text is only required for '$' and '\', but may be used on any other character without bad effects. |
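As a rough illustration of the replacement rules in this table, the following Python sketch expands `$n` references and backslash escapes against a match. It is a simplified model, not the ICU implementation: the helper `expand_template` is hypothetical and handles single-digit group numbers only.

```python
import re

def expand_template(template: str, match: re.Match) -> str:
    """Expand $n references and backslash escapes in a replacement template."""
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        if ch == "\\" and i + 1 < len(template):
            out.append(template[i + 1])        # escaped character is literal
            i += 2
        elif ch == "$" and i + 1 < len(template) and template[i + 1] in "0123456789":
            out.append(match.group(int(template[i + 1])) or "")
            i += 2
        else:
            out.append(ch)                     # includes a $ not followed by a digit
            i += 1
    return "".join(out)

# Move a leading item number to the end of the label.
find = r"^(\d+)\. (.*)$"
result = re.sub(find, lambda m: expand_template("$2 ($1)", m), "3. Installing")
print(result)
```

In the dialog itself you would simply enter the pattern in the Find field and `$2 ($1)` in the replacement field; the Python wrapper is only there to make the example runnable.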
Examples:
- To match a trailing dot at the end of an outline label: \.$
The \ is needed to escape the dot so that it is parsed literally (instead of as a dot operator). The $ specifies that a match is valid only if it occurs at the end of a line.
- To match a trailing exclamation mark at the end of an outline label: !$
The exclamation mark does not need to be escaped since it has no special meaning as an operator. The $ specifies that a match is valid only if it occurs at the end of a line.
- To remove dates from outline entries:
Find: '[0-9]{2}/[0-9]{2}/[0-9]{4} ' (Do not enter the quote marks, do note the space at the end.)
Replace: [leave blank]
- To match chapter/paragraph numbering like "2.4.1":
Find: ^[0-9]+\.?[0-9]?\.?[0-9]?\.?[0-9]?
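This will match numbering up to four levels deep. Because every level after the first is matched as a single optional digit, multi-digit sub-numbers may be matched only partially (for "1.10.12" the match is "1.10.1").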
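If you want to try these patterns outside the dialog first, they can be checked with a few lines of Python. This is only a convenience sketch: Python's `re` module is not the ICU engine, although these particular patterns should behave the same way in both, and the sample labels are invented.

```python
import re

# Trailing dot and trailing exclamation mark.
no_dot = re.sub(r"\.$", "", "Introduction.")
no_bang = re.sub(r"!$", "", "Warning!")

# Dates of the form dd/dd/dddd followed by a space.
no_date = re.sub(r"[0-9]{2}/[0-9]{2}/[0-9]{4} ", "", "12/03/2021 Meeting notes")

# Chapter/paragraph numbering at the start of a label.
numbering = re.match(r"^[0-9]+\.?[0-9]?\.?[0-9]?\.?[0-9]?",
                     "2.4.1 The General Structure").group()

print(no_dot, no_bang, no_date, numbering, sep="\n")
```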
Auto-Incrementing Number Replacement
Allows the insertion of numbers that increment/decrement by a fixed value on each subsequent outline item that matches the regular expression in the Find field. Although it looks like regular expression syntax this is not a regular expression, yet it is only available when regular expression is enabled in the dialog and the action is Replace All. Only the first appearance of this syntax will be used for the replacement value. If you enter this syntax multiple times it will insert the value of the first occurrence multiple times.
Use the syntax $[n1,n2] in the Replacement field to insert the numbers. n1 is the starting number, n2 is the increment value. To decrement use a negative value.
Examples:
- $[108, -2] will insert the number 108 at the first match, 106 at the second match, etc.
- To add a ‘line number’ at the start of each outline entry you would use “^” as the entry in the Find field, and “$[1,1] ” (note the trailing space) as the entry in the Replacement field.
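The numbering behaviour can be approximated in Python as follows. This is a sketch of the effect, not of PDFClerk's code: the function `replace_all_incrementing` is hypothetical, and it assumes that the counter advances once per matching outline item and that only the first match in each label is replaced.

```python
import itertools
import re

def replace_all_incrementing(labels, pattern, start, step, suffix=""):
    """Replace the first match in each matching label with an incrementing number."""
    counter = itertools.count(start, step)
    result = []
    for label in labels:
        if re.search(pattern, label):
            # count=1: one replacement, and therefore one counter step, per label.
            result.append(re.sub(pattern, lambda m: f"{next(counter)}{suffix}",
                                 label, count=1))
        else:
            result.append(label)   # non-matching labels do not advance the counter
    return result

# Equivalent of Find "^" with Replacement "$[1,1] " (note the trailing space).
print(replace_all_incrementing(["Intro", "Setup", "Usage"], r"^", 1, 1, " "))
```

Passing a start of 108 and a step of -2 reproduces the first example: the matching items receive 108, 106, 104 and so on.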