Recently, we had the opportunity to work on a simple yet challenging task: create a PDF file programmatically and have it calculate some values dynamically as the user fills in its AcroForm.
The purpose of this article is to share the challenges we faced and what we did to overcome them. We’ll consider a simple single-page PDF file with 3 AcroForm fields, where the 3rd field is populated with the product of the other 2 fields.
ReportLab provides a canvas in its pdfgen package, and you can create the AcroForm fields directly on that canvas.
Below you’ll find sample code to create this simple PDF file.
For the sake of simplicity, we will create a basic AcroForm PDF that contains 3 input fields. The function creates the PDF file, names it using the current timestamp, saves it, and returns the file name.
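A minimal sketch of such a function follows, assuming ReportLab is installed. The helper name `timestamped_name`, the `form_` prefix and the field coordinates are illustrative assumptions; the field names ‘price’, ‘quantity’ and ‘total’ are the ones used in the rest of this article.

```python
from datetime import datetime


def timestamped_name(prefix="form", now=None):
    """Build a file name such as form_20240102_030405.pdf."""
    now = now or datetime.now()
    return f"{prefix}_{now:%Y%m%d_%H%M%S}.pdf"


def create_simple_form(file_name=None):
    """Write a one-page PDF with three AcroForm text fields; return its name."""
    # Imported here so the naming helper above works without reportlab installed.
    from reportlab.lib.pagesizes import letter
    from reportlab.pdfgen import canvas

    file_name = file_name or timestamped_name()
    pdf = canvas.Canvas(file_name, pagesize=letter)
    form = pdf.acroForm

    # (field name, visible label, y position); the layout values are arbitrary.
    rows = [("price", "Price", 700), ("quantity", "Quantity", 660), ("total", "Total", 620)]
    for name, label, y in rows:
        pdf.drawString(72, y + 6, label)
        form.textfield(name=name, tooltip=label, x=160, y=y, width=200, height=20,
                       borderStyle="inset", forceBorder=True)

    pdf.save()
    return file_name
```

Calling `create_simple_form()` with no argument writes a file named after the current time into the working directory and returns that name.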
Before applying the JavaScript to the PDF file, we save a JavaScript file that takes care of the calculation we need. Alternatively, you can pass the JavaScript string directly into your program.
This small snippet reads the input values from the text fields named ‘price’ and ‘quantity’ and sets the value of the text field named ‘total’ in the PDF file.
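One possible version of that script, kept as a Python string and saved to a file, might look like the sketch below. It relies on the `getField` method and `value` property of the Acrobat JavaScript API. The file name `calculate.js`, the `Number(...)` conversion and the helper name `save_script` are assumptions made for illustration.

```python
from pathlib import Path

# Acrobat JavaScript: read 'price' and 'quantity', write their product to 'total'.
CALCULATE_JS = """\
var price = Number(this.getField("price").value);
var quantity = Number(this.getField("quantity").value);
this.getField("total").value = price * quantity;
"""


def save_script(path="calculate.js", source=CALCULATE_JS):
    """Save the script to disk so it can be loaded later; return the path."""
    Path(path).write_text(source, encoding="utf-8")
    return path
```

Keeping the script in its own file makes it easy to edit without touching the Python code, while the string constant covers the case where you pass the JavaScript in directly.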
While many libraries let you include JavaScript in your PDF file, we got stuck on triggering these actions. All the articles we found dealt with JavaScript that was triggered only when the file was opened. This led us to dig deep into the PDF specification (ISO 32000-1:2008), which you can find at (https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf). Specifically, we studied the sections on JavaScript actions and interactive forms.
To begin with, a PDF file consists of many elements, and each element can use a different data structure (refer to Table 34). To work with JavaScript actions and other interactive actions, we need a ‘dictionary’ object, which is essentially a collection of key-value pairs.
The JavaScript action has 2 primary keys: ‘S’, which holds the type of the action (‘JavaScript’ for a JavaScript action; PDF names are case-sensitive), and ‘JS’, which contains the JavaScript text or stream you pass.
To create this in Python, we will use the pdfrw library, which supports the PDF object model.
We will create a function that takes the JavaScript text and returns the action dictionary.
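A minimal sketch of such a function, assuming pdfrw is installed, could look like this. The function and parameter names are illustrative, and wrapping the script with `PdfString.encode` is our own choice for making the PDF string explicit.

```python
def make_js_action(js_source):
    """Return a PDF action dictionary that runs js_source."""
    # Local import keeps this module importable where pdfrw is not installed.
    from pdfrw import PdfDict, PdfName, PdfString

    action = PdfDict()
    action.S = PdfName.JavaScript            # action type: /JavaScript
    action.JS = PdfString.encode(js_source)  # the script, stored as a PDF string
    return action
```

For example, `make_js_action("app.alert('hello');")` yields a dictionary with the two keys described above, ready to be attached to a trigger.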
Once we have the JavaScript and the PDF ready, we will integrate the JavaScript into the file and attach triggers to the fields that need them. For this, we use trigger events (section 12.6.3, Trigger Events, in the documentation). Each field on a PDF page is represented by an annotation object, and an annotation can carry an additional-actions dictionary (AA) that maps trigger events to action dictionaries. We add the JavaScript action produced above to this AA dictionary under the ‘Bl’ key, which triggers the action when the annotation loses the input focus. In essence, our JavaScript runs whenever the input focus leaves the field we were entering data in, and it then performs its calculation.
Additionally, we will run the JavaScript when the page is opened, using the ‘O’ key of the page’s own additional-actions dictionary.
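The two triggers described above can be sketched with pdfrw as follows. This is an illustration under assumptions rather than a definitive implementation: the names `js_action` and `attach_js` are hypothetical, the script is attached to every annotation on every page, and any existing AA entries are overwritten.

```python
def js_action(js_source):
    """Build a /JavaScript action dictionary for js_source."""
    from pdfrw import PdfDict, PdfName, PdfString
    return PdfDict(S=PdfName.JavaScript, JS=PdfString.encode(js_source))


def attach_js(pdf_path, js_source, out_path=None):
    """Attach js_source to page-open and field-blur triggers; return the output path."""
    # Local import keeps this module importable where pdfrw is not installed.
    from pdfrw import PdfDict, PdfReader, PdfWriter

    out_path = out_path or pdf_path  # overwrite the input unless told otherwise
    reader = PdfReader(pdf_path)
    for page in reader.pages:
        # /AA /O on the page: run the script when the page is opened.
        page.AA = PdfDict(O=js_action(js_source))
        # /AA /Bl on each annotation: run the script when it loses input focus.
        for annot in page.Annots or []:
            annot.AA = PdfDict(Bl=js_action(js_source))

    # Writing the reader back as the trailer keeps the rest of the document as it was.
    PdfWriter(out_path, trailer=reader).write()
    return out_path
```

A call such as `attach_js("form.pdf", "app.alert('hello');")` rewrites the file in place; pass `out_path` to keep the original untouched.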
We are now ready to generate the PDF file.
This happens in 2 separate steps.
First, we create an AcroForm PDF, and then we attach the JavaScript and its triggers to that file.
When we started working on the project, documentation and articles on the topic were very hard to find. We couldn’t find a single place that conclusively provided the core solution, i.e., the triggering of the JS. We couldn’t take no for an answer, so we dug deep into the PDF documentation, which eventually led us to believe in our ethos more than ever. We at TheCodeWork get things done. If you have some interesting ideas or a challenging problem, connect with us to bring it to life.