Dynamically changing PDF Acroforms with Python and Javascript

Recently, we had the opportunity to work on a task that looked simple but turned out to be challenging. The problem statement was to create a PDF file programmatically and to calculate some values dynamically while the AcroForm of the PDF file is being filled in.

The purpose of writing this article is to share the challenges faced and what we did to overcome them. In this article, we’ll consider a simple single-page PDF file with 3 AcroForm fields where the 3rd field gets populated by the multiplication of the other 2 fields.

Technologies & Libraries Used:

  • Python 3.x
  • Reportlab
  • pdfrw

Creating AcroForm PDF Files with Reportlab:

You can make use of the canvas to create the AcroForms within it. Reportlab provides the canvas from its pdfgen utility.
Below you’ll find sample code to create this simple pdf file.

from reportlab.lib.colors import white
from reportlab.pdfgen import canvas
import datetime

def create_pdf():
    # Prefix the file name with the current timestamp.
    file_name = datetime.datetime.strftime(
          datetime.datetime.now(), '%Y-%m-%d_%H-%M-%S') + 'sample.pdf'

    your_canvas = canvas.Canvas(file_name)
    your_form = your_canvas.acroForm
    your_canvas.drawString(25, 780, "Calculations")
    # Three text fields: price, quantity, and the computed total.
    your_form.textfield(x=25, y=700, borderStyle='underlined',
                   width=50, fillColor=white, fontSize=12, height=20,
                   name='price')
    your_form.textfield(x=200, y=700, borderStyle='underlined',
                   width=50, fillColor=white, fontSize=12, height=20,
                   name='quantity')
    your_form.textfield(x=375, y=700, borderStyle='underlined',
                   width=50, fillColor=white, fontSize=12, height=20,
                   name='total')
    your_canvas.save()
    return file_name

For the sake of simplicity, this creates a basic AcroForm PDF containing 3 input fields. The function saves the PDF under a name that starts with the current timestamp and returns that file name.
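To make the naming scheme concrete, here is a small sketch of the same timestamp logic pulled out into its own function. The name timestamped_name and the suffix parameter are ours for illustration and are not part of the original script.

```python
import datetime

def timestamped_name(now, suffix="sample.pdf"):
    # Same format string as in create_pdf(). Note that nothing separates
    # the timestamp from the suffix.
    return now.strftime("%Y-%m-%d_%H-%M-%S") + suffix

print(timestamped_name(datetime.datetime(2024, 1, 2, 3, 4, 5)))
# -> 2024-01-02_03-04-05sample.pdf
```

Passing the time in as an argument makes the result predictable, which is handy if you want to test the naming without depending on the clock.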

Javascript

Before applying the javascript to the pdf file, we save a javascript file which takes care of the calculation you need. Alternatively, you can also pass the javascript as a string directly in your program.

var price = this.getField('price').value;
var quantity = this.getField('quantity').value;
var total = parseFloat((parseFloat(price) * parseFloat(quantity) ||
          0).toFixed(2));
this.getField('total').value = total;

This small snippet reads the input values from the text fields named 'price' and 'quantity', multiplies them, rounds the result to 2 decimal places (falling back to 0 when either input is not a number), and writes it to the text field named 'total' in the pdf file.
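If you would rather keep the script inside your Python program than in a separate file, one possible arrangement is sketched below. The names CALC_JS and load_script are hypothetical helpers of ours. They only return the script text and do not touch the PDF.

```python
# The calculation script kept inline instead of in a separate .js file.
CALC_JS = """
var price = this.getField('price').value;
var quantity = this.getField('quantity').value;
var total = parseFloat((parseFloat(price) * parseFloat(quantity) || 0).toFixed(2));
this.getField('total').value = total;
"""

def load_script(path=None):
    """Return the script stored at `path`, or the inline default if no path is given."""
    if path is None:
        return CALC_JS
    with open(path, encoding="utf-8") as js_file:
        return js_file.read()
```

Calling load_script() with no argument gives you the inline script, while load_script('calc.js') would read it from a file of your choice.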

Javascript Actions in PDF

While many libraries give you the ability to include javascript in your pdf file, we got stuck on triggering these actions. All the articles we found dealt with javascript that only got triggered when the file was opened. This led us to dig deep into the pdf documentation, which you can find at (https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf). We focused on understanding the Javascript Actions and the Interactive Forms in particular.

To begin with, a PDF file consists of many elements, where each element can have a different data structure (refer to Table 34). To work with Javascript actions and other interactive actions, we need a 'dictionary' object, which basically contains key-value pairs.

The Javascript action consists of 2 primary keys: the key 'S' holds the type of the action ('JavaScript' for a Javascript Action) and the key 'JS' contains the javascript text or stream you pass.
To create this in python, we will use the pdfrw library, which supports the PDF object model.

We will create a function which would take in the javascript stream and return the action dictionary.

from pdfrw.objects.pdfdict import PdfDict
from pdfrw.objects.pdfname import PdfName

def make_js_action(js):
    action = PdfDict()
    action.S = PdfName.JavaScript
    action.JS = js
    return action
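For intuition, it can help to see roughly what such an action dictionary looks like once it is written into the PDF. The sketch below does not use pdfrw. It is a plain Python illustration, and the helper names pdf_literal_string and js_action_syntax are our own, assumed only for this example.

```python
def pdf_literal_string(text):
    # Escape backslashes first, then parentheses. Escaping parentheses is
    # always safe inside a PDF literal string, even when they are balanced.
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    return "(" + escaped + ")"

def js_action_syntax(js):
    # Hypothetical helper: renders the action dictionary as PDF source text,
    # with /S naming the action type and /JS carrying the script.
    return "<< /S /JavaScript /JS " + pdf_literal_string(js) + " >>"

print(js_action_syntax("app.alert('HOLA!');"))
# -> << /S /JavaScript /JS (app.alert\('HOLA!'\);) >>
```

In the real program pdfrw takes care of this serialization for us, so this is only meant to show the two keys side by side.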

Adding Javascript and Triggering Events

Once we have the javascript and the PDF ready, we will integrate the javascript into the file and attach triggers to the fields that need them. To do that, we will apply the trigger events (12.6.3 Trigger Events in the documentation) to the annotation objects. Each field on the PDF page is represented by an annotation object, which can hold multiple action dictionaries. We will use the additional-actions dictionary (AA) of the annotation to include the javascript action produced above. We will use the 'Bl' key, which triggers the action when the annotation loses the input focus. So in essence, our javascript gets called and performs its operation whenever the input focus leaves the field we enter the data in.
Additionally, we will also attach the javascript to the page itself so that it runs when the page is opened.

import sys

from pdfrw import PdfReader, PdfWriter
from pdfrw.objects.pdfdict import PdfDict
from pdfrw.objects.pdfname import PdfName

def append_js_to_pdf(file_name):
    pdf_writer = PdfWriter()
    pdf_reader = PdfReader(file_name)
    try:
        # Read the script from the file given as the first command-line argument.
        with open(sys.argv[1]) as js_file:
            js = js_file.read()
    except (IndexError, OSError):
        # No script file supplied or readable: fall back to a placeholder.
        js = "app.alert('HOLA!');"
    for page in pdf_reader.pages:
        page.Type = PdfName.Page
        for field in page.Annots or []:
            field_name = field.get('/T') or ''
            if 'price' in field_name or 'quantity' in field_name:
                # Run the script when the field loses the input focus.
                field.update(PdfDict(AA=PdfDict(Bl=make_js_action(js))))
        # Also run the script when the page is opened.
        page.AA = PdfDict()
        page.AA.O = make_js_action(js)
        pdf_writer.addpage(page)
    pdf_writer.write(file_name)
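'Bl' is only one of the trigger keys defined in the PDF documentation. As a quick reference, the sketch below lists the annotation and page triggers we are aware of as plain Python dictionaries. The helper additional_actions is hypothetical and builds an ordinary dict rather than a pdfrw object, so treat it as an illustration of the nesting, not as the code used above.

```python
# Trigger keys for an annotation's additional-actions (AA) dictionary.
ANNOTATION_TRIGGERS = {
    "E": "cursor enters the annotation's active area",
    "X": "cursor exits the annotation's active area",
    "D": "mouse button is pressed inside the active area",
    "U": "mouse button is released inside the active area",
    "Fo": "annotation receives the input focus",
    "Bl": "annotation loses the input focus",
}

# Trigger keys for a page's additional-actions (AA) dictionary.
PAGE_TRIGGERS = {
    "O": "page is opened",
    "C": "page is closed",
}

def additional_actions(trigger, action, allowed=ANNOTATION_TRIGGERS):
    # Hypothetical helper: shows the AA -> trigger -> action nesting
    # using plain dicts, rejecting trigger keys that are not listed.
    if trigger not in allowed:
        raise ValueError("unknown trigger key: " + trigger)
    return {"AA": {trigger: action}}

print(additional_actions("Bl", {"S": "JavaScript", "JS": "app.alert('HOLA!');"}))
```

Swapping 'Bl' for 'Fo', for example, would describe an action that runs when the field gains focus instead of when it loses it.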

We are now ready to generate the PDF file.

if __name__ == "__main__":
    file_name = create_pdf()
    javascript_added = append_js_to_pdf(file_name)

This will create the PDF file in 2 separate steps. First, it creates an AcroForm PDF, and then it attaches the Javascript to that file.
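As a crude sanity check after both steps have run, you can look for the action markers in the raw bytes of the output. This sketch assumes the writer emits the action dictionaries uncompressed, and has_js_action is a hypothetical helper of ours. It does not parse the PDF, so a match is a hint and not a proof.

```python
def has_js_action(pdf_path):
    # Read the whole file as bytes and look for the two names we expect:
    # the JavaScript action type and an additional-actions dictionary.
    with open(pdf_path, "rb") as pdf_file:
        data = pdf_file.read()
    return b"/JavaScript" in data and b"/AA" in data
```

Opening the file in a viewer that supports form scripts and tabbing out of the price field remains the real test.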

Development strategy at TheCodeWork

When we started working on the project, documentation and articles related to it were very hard to find. We couldn't find a single place that conclusively provided the core solution, i.e. the triggering of the JS. We couldn't take no for an answer, so we dug deep into the PDF documentation, which eventually led us to believe in our ethos more than ever. We at TheCodeWork get things done. If you have some interesting ideas or a challenging problem, connect with us to bring it to life.
