Watermarking script #6017

mjpost · 2025-09-18T13:19:55Z

For revised PDFs, we are currently unable to add the footer. This script enables that and has only Python dependencies. I used it to correct the PDF here: https://aclanthology.org/2025.acl-long.426/

This also updates the add_revision.py script to add a revision watermark.

Comments on the format etc. are welcome.

Edit: There is now a web service: https://aclanthology.org/watermark.html

github-actions · 2025-09-18T14:08:47Z

Build successful. Some useful links:

Complete site preview: https://preview.aclanthology.org/watermark

This preview will be removed when the branch is merged.

mjpost · 2025-09-18T16:46:17Z

On a whim, I vibe-coded this in about 30 minutes: https://aclanthology.org/watermark.html

(Thanks to VS Code and GPT 5!)

mbollmann · 2025-09-18T18:43:39Z

Haven't had too close of a look yet, but the first thing I noticed: why are page numbers below the footer? In the official proceedings, they're above the footer.

mbollmann · 2025-09-18T18:46:26Z

> On a whim, I vibe-coded this in about 30 minutes: aclanthology.org/watermark.html

The instructions say "Separate footer lines with the Enter key." but if I do that, a black square appears on the PDF. It only works correctly when entering "\n".
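For illustration, here is a minimal sketch of one way to accept both kinds of input. It assumes the black square is a stray line-break character (such as a carriage return from the form) being drawn as a glyph, which the thread does not confirm. `split_footer_lines` is a hypothetical helper, not a function from add_footer.py or the web service.

```python
def split_footer_lines(raw: str) -> list[str]:
    """Split user-entered footer text into clean, non-empty lines.

    Hypothetical helper: accepts real line breaks (Enter key) as well as
    a literally typed backslash-n sequence.
    """
    # Treat a typed backslash followed by "n" the same as a real line break.
    text = raw.replace("\\n", "\n")
    # str.splitlines() splits on \r\n, \r, and \n alike, so no carriage
    # return is left over to be rendered as a missing-glyph box.
    lines = [line.strip() for line in text.splitlines()]
    # Drop empty lines produced by trailing or doubled line breaks.
    return [line for line in lines if line]


if __name__ == "__main__":
    print(split_footer_lines("Line one\r\nLine two"))
    print(split_footer_lines("Line one\\nLine two"))
```

Each resulting line could then be drawn separately, so the drawing code never sees a control character.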

mjpost · 2025-09-18T19:39:35Z

Thanks—fixed.

mbollmann

Pretty cool, though I remain skeptical about vibe coding. :)

Most of add_footer.py is quite arcane (mostly due to reportlab.pdfgen's API) and not really commented so it's hard – though possible with some effort – to follow what's going on. I mostly focused on the CGI script now to check for security concerns and the like.

mbollmann · 2025-09-19T08:44:18Z

bin/add_footer.py

A general comment: I have been wondering about throwing all of our scripts into bin/, which has become a mixture of (i) core build scripts, (ii) data ingestion & modification scripts, (iii) one-off scripts that are probably outdated by now, and (iv) other miscellaneous stuff. It’s quite unclear which of these scripts are still useful and what for, unless you look into each of them.

I was wondering if we could start categorizing them into subfolders, or at least name them more explicitly (e.g. here I would prefer add_footer_to_pdf.py).

This is definitely long overdue for a reorg. bin/ itself isn't that great of a name. One suggestion is to use scripts/ instead, and then have some kind of minimal one-level nesting within it, following your taxonomy above: build, data, misc.

mbollmann · 2025-09-19T08:46:14Z

bin/add_footer.py

+                ).pages[0]
+            page.merge_page(pnum_cache[nkey])
+
+        # Footer only on first page; place it ABOVE the fixed page number


Just noting again for the record that this is not where *ACL proceedings currently place the footer.

Yeah....but the current ACL choice is ugly, and also (I suspect) just some random person's quick decision. Witness (from ACL 2025):

It's different even from ten years ago (source):

Maybe I shouldn't in turn just arbitrarily change it, but I think it looks better.

Agree that the second one looks better, but it still has the footer below the page number on the first page, which my (completely subjective) gut reaction finds more appealing :)

Regardless of subjective appeal, there is an argument though for making the footer of revisions consistent with the original.

The footstamp offset varies by conference and year. Our options are (a) come up with a good default, and ideally get ACL to consolidate on that or (b) provide more knobs in this user interface to allow users to fiddle and match the original. I guess we should do both.

Now I obviously haven't checked all conferences, but regarding the two examples you posted, it seems to me that the difference between them is not actually the placement of the footer, but the margins of the page content. In other words, I think the absolute placement of the footer may actually be the same there.
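To make the two placements concrete, here is a rough sketch of how the baselines could be computed from the page bottom. Everything in it is an assumption for illustration: `layout_first_page` and `FooterLayout` are hypothetical names, the parameters only mirror the `--bottom-margin`, `--footer-size`, and `--pagenum-size` flags from the usage example, the default values are taken from that example command rather than from the script's actual defaults, and the 1.2 leading factor is a guess.

```python
from dataclasses import dataclass


@dataclass
class FooterLayout:
    pagenum_y: float        # baseline of the page number, in points from the page bottom
    footer_ys: list[float]  # baselines of the footer lines, listed top to bottom


def layout_first_page(
    n_footer_lines: int,
    bottom_margin: float = 14.0,
    footer_size: float = 9.0,
    pagenum_size: float = 10.0,
    leading: float = 1.2,
    footer_above: bool = True,
) -> FooterLayout:
    """Compute baselines for the first page (hypothetical sketch).

    footer_above=True puts the footer above the page number, as the PR's
    code comment describes; footer_above=False puts the footer below the
    page number, as the thread says the official proceedings do.
    """
    footer_step = footer_size * leading
    if footer_above:
        # Page number sits on the bottom margin; footer stacks above it.
        pagenum_y = bottom_margin
        lowest_footer_y = bottom_margin + pagenum_size * leading
    else:
        # Footer sits on the bottom margin; page number goes above the block.
        lowest_footer_y = bottom_margin
        pagenum_y = bottom_margin + n_footer_lines * footer_step
    # The first footer line is the highest one, so count down to the lowest.
    footer_ys = [
        lowest_footer_y + i * footer_step
        for i in reversed(range(n_footer_lines))
    ]
    return FooterLayout(pagenum_y=pagenum_y, footer_ys=footer_ys)


if __name__ == "__main__":
    print(layout_first_page(2, footer_above=True))
    print(layout_first_page(2, footer_above=False))
```

With a helper along these lines, matching an original volume would come down to choosing `bottom_margin` and `footer_above` per conference, which fits the point that the absolute footer position may be the same while the content margins differ.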

mbollmann · 2025-09-19T08:48:21Z

bin/add_footer.py

+    python add_footer.py -p 199 in.pdf out.pdf "…"
+    python add_footer.py -p 199 --footer-size 9 --pagenum-size 10 --bottom-margin 14 in.pdf out.pdf "…"
+
+Copyright 2025, Matt Post


Not Apache license, like all the other scripts?

Just an oversight. This is mostly just a proof of concept.

mbollmann · 2025-09-19T08:50:14Z