Find & Replace
Batch find and replace text across hundreds of PDFs
Content-stream editing with .NET regex, capture groups, per-pair page ranges, and CSV-driven multi-replace. Preserves fonts, layout, and digital signatures on the pages you do not touch.
Download Complimentary Trial
Under the hood
Why content-stream find and replace is different from search-and-replace in a viewer
A PDF reader's Find dialog walks the document, highlights matches, and asks the user to retype each one. That works for one file. It does not work for two hundred. PDF Batch Editor instead opens each PDF's content stream — the sequence of text-showing operators (Tj, TJ, ', ") that put glyphs on the page — locates every match, and rewrites the operators with new strings.
The result inherits the original font, size, and color. The replacement is written into the same content stream as the original, so the file is still a valid, signable, searchable PDF when you are done. Text-fitting modes adjust the replacement string when its rendered width differs from the original:
Adaptive combines font scaling and horizontal compression for the best visual fit.
Preserve Width keeps the original bounding box and adjusts character spacing.
Fit to Page rescales the font.
None writes at the original size and lets the layout shift.
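To make the operator rewrite described above concrete, here is a minimal Python sketch that replaces text inside the literal-string operand of a Tj operator. It is a toy illustration, not the product's implementation: the function name replace_in_tj and the sample stream are hypothetical, and the pattern deliberately ignores nested parentheses, hex strings, TJ arrays, and font encodings.

```python
import re

# Literal string operand followed by the Tj operator.
# Simplification: backslash escapes are tolerated, but nested parentheses,
# hex strings, TJ arrays and font-specific encodings are not handled.
TJ_LITERAL = re.compile(rb"\(((?:\\.|[^\\()])*)\)(\s*)Tj\b")


def replace_in_tj(stream: bytes, find: bytes, replacement: bytes) -> tuple[bytes, int]:
    """Return the rewritten stream and the number of replacements made.

    Assumes `replacement` contains no unescaped parentheses or backslashes.
    """
    count = 0

    def rewrite(match: re.Match) -> bytes:
        nonlocal count
        text = match.group(1)
        count += text.count(find)
        # Only the string operand changes; the operator and spacing are kept.
        return b"(" + text.replace(find, replacement) + b")" + match.group(2) + b"Tj"

    return TJ_LITERAL.sub(rewrite, stream), count


# Hypothetical content stream: select a font, position the cursor, show text.
SAMPLE_STREAM = b"BT\n/F1 12 Tf\n72 720 Td\n(Invoice 2025) Tj\nET\n"
new_stream, replaced = replace_in_tj(SAMPLE_STREAM, b"2025", b"2026")
```

Because only the string operand is touched, the surrounding Tf (font and size) and Td (position) operators are left as they were, which is why the replacement inherits the original appearance.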
When the file already carries a digital signature, PDF Batch Editor uses an incremental update by default: changes are appended to the file after the existing %%EOF, leaving the original revision intact. Signatures attached to pages you did not modify stay valid. Signatures attached to a page you did modify are invalidated — which is exactly what the PDF specification requires.
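The append-only nature of an incremental update can be checked with a few lines of Python. This is a rough sketch under stated assumptions: the helper names are hypothetical, the byte strings are stand-ins rather than valid PDFs, and counting %%EOF markers is only a heuristic because the marker could also occur inside stream data.

```python
def count_eof_markers(pdf_bytes: bytes) -> int:
    """Heuristic revision count: each saved revision normally ends with %%EOF."""
    return pdf_bytes.count(b"%%EOF")


def is_incremental_update(original: bytes, updated: bytes) -> bool:
    """True when the updated file is the original bytes plus appended data."""
    return len(updated) > len(original) and updated.startswith(original)


# Stand-in byte strings (not valid PDFs), used only to exercise the helpers.
ORIGINAL = b"%PDF-1.7\n1 0 obj\n<<>>\nendobj\ntrailer\n<<>>\n%%EOF\n"
UPDATED = ORIGINAL + b"2 0 obj\n<<>>\nendobj\ntrailer\n<<>>\n%%EOF\n"
REWRITTEN = b"%PDF-1.7\n2 0 obj\n<<>>\nendobj\ntrailer\n<<>>\n%%EOF\n"
```

A signature's covered bytes sit entirely inside the original prefix, which is why appending leaves the earlier revision byte-for-byte intact while a full rewrite does not.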
Pattern matching
Regex, case sensitivity, and whole-word matching
Each pair has independent regex, case-sensitivity, and whole-word toggles. The regex engine is .NET System.Text.RegularExpressions — full support for character classes, quantifiers, anchors, lookaheads, lookbehinds, and capture groups. Replacement strings reference captured groups with $1, $2, etc.
A few patterns that come up in real document sets:
Reformat US dates from MM/DD/YYYY to ISO YYYY-MM-DD
Find: \b(\d{2})/(\d{2})/(\d{4})\b
Replace: $3-$1-$2
Mask all but the last four digits of a 16-digit account number
Find: \b\d{12}(\d{4})\b
Replace: ************$1
Update a versioned product name across an entire library
Find: \bAcme Suite v\d+(\.\d+)*\b
Replace: Acme Suite 2026
Strip a draft watermark line from every page header
Find: ^DRAFT — do not distribute$
Replace: (empty)
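The patterns above can be tried outside the tool before a batch run. The sketch below uses Python's re module as a stand-in for the .NET engine, so treat it as an approximation: the two engines agree on these simple patterns, but Python writes group references as \g<1> rather than $1. The helper names dotnet_to_python and apply_pair are hypothetical, and only numbered references are converted (not ${name}, $$, or $&).

```python
import re


def dotnet_to_python(replacement: str) -> str:
    # Convert .NET-style numbered references ($1, $2, ...) to Python's \g<1> form.
    return re.sub(r"\$(\d+)", r"\\g<\1>", replacement)


def apply_pair(text: str, find: str, replacement: str) -> str:
    # Apply one regex find/replace pair written with .NET-style references.
    return re.sub(find, dotnet_to_python(replacement), text)


# Pairs copied from the examples above.
DATE_PAIR = (r"\b(\d{2})/(\d{2})/(\d{4})\b", "$3-$1-$2")
MASK_PAIR = (r"\b\d{12}(\d{4})\b", "************$1")
VERSION_PAIR = (r"\bAcme Suite v\d+(\.\d+)*\b", "Acme Suite 2026")

iso_date = apply_pair("Signed 03/15/2025", *DATE_PAIR)
masked = apply_pair("Acct 1234567890123456", *MASK_PAIR)
renamed = apply_pair("Acme Suite v3.4.1 release notes", *VERSION_PAIR)
```

Note that the masking pattern only matches runs of exactly sixteen digits; shorter or longer numbers are left untouched because of the word boundaries.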
Whole-word matching uses an IsLetterOrDigit boundary check on the characters before and after the match. Searching Smith with whole-word on matches Smith, but skips Smithfield and Smithy. Case-sensitive matching is on by default; flip it off to make Acme, ACME, and acme equivalent.
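The boundary check can be sketched in a few lines of Python. This is an illustration of the idea rather than the tool's code: whole_word_spans is a hypothetical name, and str.isalnum() is used as a close but not identical stand-in for .NET's IsLetterOrDigit (it also accepts some numeric characters that .NET does not treat as digits).

```python
import re


def whole_word_spans(text: str, word: str, case_sensitive: bool = True) -> list[tuple[int, int]]:
    """Return (start, end) spans where `word` is not adjacent to a letter or digit."""
    flags = 0 if case_sensitive else re.IGNORECASE
    spans = []
    for match in re.finditer(re.escape(word), text, flags):
        # An empty string stands for the start or end of the text; "".isalnum() is False.
        before = text[match.start() - 1] if match.start() > 0 else ""
        after = text[match.end()] if match.end() < len(text) else ""
        if not before.isalnum() and not after.isalnum():
            spans.append(match.span())
    return spans


SAMPLE = "Smith met Smithfield and Smithy; Smith."
matches = whole_word_spans(SAMPLE, "Smith")
```

In the sample, the first and last occurrences qualify because they are bounded by the start of the text, a space, or a period; the occurrences inside Smithfield and Smithy are followed by a letter and are skipped.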
Multi-replace
CSV-driven substitutions for hundreds of pairs
Annual updates are rarely one substitution. They are dates, addresses, contact names, internal codes, version strings — sometimes hundreds of pairs across a contract library. Build the change set in Excel or Google Sheets, save as CSV, and import:
find,replace
Q4 2025,Q1 2026
2025,2026
123 Old St.,500 New Ave.
"Acme Holdings","Acme Holdings, Inc."
v3.4,v3.5
Each row becomes one pair, applied in order. Pairs chain through temp files internally, so a later pair sees the result of an earlier one, which is useful when one substitution depends on another. Order therefore matters: list specific pairs before general ones. If 2025 → 2026 ran before Q4 2025 → Q1 2026, the year would already be rewritten and the second pair would find nothing to match. Mix imported pairs with hand-edited pairs, expand any row after import to override its options individually, and export the current pair list back to CSV for version control or sharing with colleagues.
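The effect of chaining can be previewed on plain text before importing a change set. The following Python sketch is an assumption-laden stand-in: it applies literal pairs to a string with str.replace rather than to PDF content streams, and load_pairs, apply_pairs, and the sample change set are hypothetical.

```python
import csv
import io

# Small hypothetical change set in the same two-column layout.
CHANGE_SET = '''find,replace
Q4 2025,Q1 2026
2025,2026
123 Old St.,500 New Ave.
"Acme Holdings","Acme Holdings, Inc."
'''


def load_pairs(csv_text: str) -> list[tuple[str, str]]:
    """Read (find, replace) pairs from CSV text with a find,replace header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row["find"], row["replace"]) for row in reader]


def apply_pairs(text: str, pairs: list[tuple[str, str]]) -> str:
    """Apply pairs in order; each pair sees the output of the previous one."""
    for find, replacement in pairs:
        text = text.replace(find, replacement)
    return text


pairs = load_pairs(CHANGE_SET)
source = "Report for Q4 2025, filed 2025 by Acme Holdings at 123 Old St."
updated = apply_pairs(source, pairs)

# Swapping the first two rows shows why order matters.
swapped = [pairs[1], pairs[0]] + pairs[2:]
quarter_lost = apply_pairs("Q4 2025 results", swapped)
```

With the rows swapped, the year is rewritten first and the quarter pair no longer finds its text, so the quarter stays at Q4. Running a change set through a preview like this is a cheap way to catch ordering mistakes before a batch run.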
Per-pair controls
Page ranges, signatures, formatting, and conditional skipping
Each find/replace pair has its own independent settings, accessible by expanding the pair row.
Page range
Restrict replacement to specific pages with comma-separated values and hyphenated ranges — e.g. 1-3, 5, 8-10. Pages outside the range are written through unchanged. Useful when a contract template re-uses the same boilerplate but the substitution should only land in the cover page or the signature page.
Incremental save
Default on. Appends changes to the file after the existing %%EOF rather than rewriting the document, which is what protects digital signatures attached to unmodified pages. Turn it off only when you need a single-revision, fully rewritten output, for example for archival or so that the pre-edit content cannot be recovered from an earlier revision.
Skip if no match
When set, files where this pair found no matches are not written to the output folder — no empty copy, no suffix collision. Combined with multi-pair processing, this lets you target a small substitution across a large library and only produce output for the files actually affected.
Bold, underline, strikethrough, color, highlight
Visual markup applied to the replacement text. Useful when the find-and-replace pass is itself a review step — replace each instance of "to be confirmed" with the same string but underlined and red, so legal can spot every remaining placeholder at a glance.
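Two of the per-pair controls above, page range and skip if no match, are easy to reason about with a small model. The Python sketch below is illustrative only: it treats a document as a list of page strings, the names parse_page_range and process are hypothetical, and the product's actual parsing rules (for example how it handles malformed input) are not documented here.

```python
def parse_page_range(spec: str):
    """Parse "1-3, 5, 8-10" into a set of page numbers; blank means all pages (None)."""
    spec = spec.strip()
    if not spec:
        return None
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start < 1 or start > end:
                raise ValueError(f"invalid range: {part!r}")
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    return pages


def process(pages, find, replacement, page_range="", skip_if_no_match=False):
    """Return the rewritten pages, or None when the output file should be skipped."""
    selected = parse_page_range(page_range)
    total_matches = 0
    output = []
    for number, text in enumerate(pages, start=1):
        if selected is None or number in selected:
            total_matches += text.count(find)
            text = text.replace(find, replacement)
        output.append(text)  # pages outside the range pass through unchanged
    if total_matches == 0 and skip_if_no_match:
        return None
    return output


DOCUMENT = ["Cover: Acme Ltd", "Body: Acme Ltd", "Signature: Acme Ltd"]
cover_and_signature = process(DOCUMENT, "Acme Ltd", "Acme Inc", page_range="1, 3")
```

Restricting the range to pages 1 and 3 leaves the body page as it was, and a pair that matches nothing returns None when skip if no match is set, which models a file that is never written to the output folder.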
Edge cases
Scanned PDFs, signed pages, encoding, and form fields
Scanned PDFs without an OCR layer. The tool reads the content stream, not rendered pixels. A flatbed scan that has not been run through OCR contains no text operators — just an image — and there is nothing to find. The preview will report zero matches and a "may be scanned/image-only" warning. Run OCR first, or replace text in the original Word/Excel source.
Signed PDFs. Incremental save (default on) preserves the prior revision, so signatures over pages you do not modify continue to verify. Signatures attached to a modified page are invalidated by design — that is the entire point of a signature. If you need to keep all signatures intact, restrict the pair's page range so it does not touch a signed page.
Custom encodings and Identity-H fonts. Most PDFs use a ToUnicode CMap that maps glyph IDs back to readable characters. PDF Batch Editor extracts text through this CMap, so well-formed PDFs work transparently. PDFs generated by older or hand-rolled tools sometimes ship without a CMap or with a broken one — in those cases the live preview's match count may under-report even though the underlying content stream still contains your search text. If a file's preview shows zero matches but you can clearly read the text on the page, the PDF likely lacks a usable text-extraction layer.
AcroForm fields. Find/Replace targets the visible page content stream. AcroForm field values live in a separate dictionary; once a form widget's appearance has been generated and embedded into the page (the usual case after a form is filled), the visible text is matched and replaced like any other content. Empty, unfilled forms are best handled by the dedicated batch form-fill module.
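The role of the ToUnicode CMap mentioned under custom encodings can be shown with a short Python sketch. It is a simplified model, not the tool's extractor: parse_bfchar and decode_hex_string are hypothetical names, only beginbfchar entries are read (beginbfrange is ignored), and two-byte character codes are assumed, as is typical for Identity-H fonts.

```python
import re


def parse_bfchar(cmap_text: str) -> dict[int, str]:
    """Map character codes to Unicode text from beginbfchar ... endbfchar blocks."""
    mapping = {}
    for block in re.findall(r"beginbfchar(.*?)endbfchar", cmap_text, re.S):
        for source, target in re.findall(r"<([0-9A-Fa-f]+)>\s*<([0-9A-Fa-f]+)>", block):
            # Destination values in a ToUnicode CMap are UTF-16BE code units.
            mapping[int(source, 16)] = bytes.fromhex(target).decode("utf-16-be")
    return mapping


def decode_hex_string(hex_text: str, mapping: dict[int, str], code_bytes: int = 2) -> str:
    """Decode a hex string operand; unmapped codes become U+FFFD."""
    raw = bytes.fromhex(hex_text)
    codes = [
        int.from_bytes(raw[i:i + code_bytes], "big")
        for i in range(0, len(raw), code_bytes)
    ]
    return "".join(mapping.get(code, "\ufffd") for code in codes)


# Hypothetical CMap fragment mapping four glyph codes to the letters of "Acme".
CMAP = """
4 beginbfchar
<0001> <0041>
<0002> <0063>
<0003> <006D>
<0004> <0065>
endbfchar
"""
mapping = parse_bfchar(CMAP)
decoded = decode_hex_string("0001000200030004", mapping)
```

When the CMap is missing or incomplete, the lookup falls back to U+FFFD and the search text never appears in the decoded string, which is the situation where a preview reports zero matches for text that is plainly visible on the page.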
Use Cases
Real-world batch text replacement
Legal & Contracts
A paralegal needs to replace a party name across 47 case documents. Load the entire matter folder, define one pair, run. Two minutes instead of four hours, and incremental save keeps the witness affidavits' existing signatures intact.
Annual Policy Updates
HR updates 300 policy documents every January — new dates, addresses, benefit numbers. The whole change set lives in a CSV; one import, one execute, the entire library is current. Skip-if-no-match means policies that don't reference the changed values are not touched.
Template Customization
A sales team maintains 50 proposal templates. When the company renames a product or moves a tagline, a single regex pair updates every template at once — capture groups preserve any version suffix the original copy carried.
Frequently Asked Questions
Does this work on scanned PDFs?
Only if the scan has an OCR text layer. PDF Batch Editor reads the PDF's content stream — the actual text drawn on each page. If a file is image-only (a flatbed scan with no recognized text), there is nothing to find. Run OCR first, or work from the original Word/Excel source.
Can I use capture groups like $1 and $2 in the replacement?
Yes. Enable the Replace regex toggle on the pair, and you can reference $1, $2, etc. in the replacement string. Combined with the Find regex toggle, this lets you reformat structured strings: capture a date with (\d{2})/(\d{2})/(\d{4}) and rewrite it as $3-$1-$2. The find pattern uses .NET Regex syntax.
Will replacing text invalidate digital signatures?
Only on the pages that change. PDF Batch Editor uses incremental updates (enabled by default per pair), which append modifications to the file rather than rewriting the whole document. Signatures attached to pages you do not touch stay valid; signatures attached to a modified page are invalidated — the correct PDF spec behavior.
How do I limit replacement to specific pages?
Each pair has a Page range field that accepts comma-separated pages and hyphenated ranges, e.g. 1-5, 8, 12-15. Pages outside the range are not modified. Leave the field blank to apply across the entire document.
What does Whole word matching actually do?
It checks that each match is bounded by a non-letter, non-digit character (or by the start/end of the document). Searching Smith with whole-word on matches Smith, but skips Smithfield and Smithy. The check uses .NET's IsLetterOrDigit, which classifies characters by Unicode category, so it works for Latin and most other scripts.
Can I import find/replace pairs from a spreadsheet?
Yes. Save the spreadsheet as a CSV with two columns — find,replace — and import. Each row becomes one pair. Per-pair options (regex, case sensitivity, page range, formatting) inherit from the defaults you set; expand any pair after import to override its options individually.
Stop editing PDFs one at a time
Replace text across your entire PDF library in seconds. Complimentary 14-day trial.