Loading Prepopulation Data into nTEGRATE

Optional Approach

Note, this is a informative article to understand the inner-workings of the prepopulation process. Please also see an alternative approach to loading the PREPOP_DATASET_* tables - Guide for "WRK Table" Approach For Prepopulation



Many customers would like to pre-fill a wide variety of information into forms in nFORM with data from their existing systems. For instance, customers with a form that is used to renew a certain kind of permit might like to have all of the details of  their permits filled into the form when their downstream customer begins to fill it out. This makes it easier for their downstream customer to correctly complete the form and provides an opportunity for that downstream customer to confirm or correct that data. It's a big feature that all our customers cherish.

This feature is not as straightforward as loading in lookup table data. For one thing the naming on the tables and columns can be challenging to keep straight. As someone who uses nFORM you are accustomed to thinking about forms, their section and then the various fields (first name, last name, date of birth, etc.) and collection of fields (address, name and title, etc.) that an external customer will fill out. The nTEGRATE data model is halfway between being all about nFORM and abstract enough to work with another system; the data that actually goes into the form (the rows in prepop_dataset_item) are clearly nFORM specific yet the other tables have surprisingly abstract names.

In any case, the process goes like this...

  • Create or update a set of data for a specific nFORM section
  • Create or update a context record that hints to nFORM when to prefill that section
  • Sometimes, for private data, create or update a "login" record to link that context to a specific nFORM account name
  • Create or update a handful of records that represent the actual data to be filled into the form

That's not so bad, is it?

In the diagram above, there are some key columns to take note of. As an example, in the PREPOP_DATASET table, the TAG column is indicative of a form section tag. The REP_SEC_SORT column is used to orders sections with the same tag for the same context type and context key.

Using a form with a repeating section as an example, where the CONTACTS section is the repeating section, there is one row for the FACILITY_INFO section in the PREPOP_DATASET table, and N rows for the amount of repeating sections in the PREPOP_DATESET table (in the example below, there are 2 repeated "sub" sections).

For non-repeating sections, the REP_SEC_SORT column can be left null, or set to 0. This column is used to determine the order in which repeated submission sections are displayed, so for repeating sections, the value in REP_SEC_SORT will be different per section within the repeater: 


TAGNAMECREATED_DATEREP_SEC_SORT
FACILITY_INFOFacility Information2021-08-17 07:04:36.976
CONTACTSContact Information2021-08-17 07:04:36.9761
CONTACTSContact Information2021-08-17 07:04:36.9762


To hook the PREPOP_DATASET records to a context, we would need to establish a relationship from PREPOP_DATASET to PREPOP_DATASET_CONTEXT using the PREPOP_DATASET_ID, which is a foreign key to the PREPOP_DATASET table. These table columns will be described more in-depth below. 

NOTE: the new prepopulation application is web-based and requires IIS to run. This new application runs every minute, so data will flow in a much more rapid process. Additionally, not shown on the diagram to the right, there are two new columns on the PREPOP_DATASET. IS_SEND and IS_DELETE.  IS_SEND = 1 will send the record up to nFORM and then come back to nTegrate and set the IS_SEND to 0.  IS_DELETE set to 1, will find the dataset in nFORM and delete the record and corresponding dataset_item records, and then come back to nTegrate to delete out the record.

Topics Covered on This Page

What is a "Context" in this Context?

We see a lot of people struggle with the idea of a "context" but it really boils down to one thing: a business key. The context is always the business key that you are using to link the form's prefill data with the customer actually filling out the form. For instance, if people have permits for outdoor recreational water equipment (like a hot tub or a pool) then the business key (or "context") might be the number on their permit. When designing the form there's a special field for this (referred to as an "Initial Context Identifier") where you can provide a "tag" for the context and provide a prompt to help the customer fill-in the correct value.

In the screenshot, you can see that the "tag" for this context is "PBS_PIN_NUM". When someone files a form and has filled in this field, nFORM will look in the "prepop_dataset_context" table for records with a matching "context_key" and a "context_type" that matched the "tag" (in this case, "PBS_PIN_NUM"). It will then look to the "prepop_dataset_id" column to find the set of prefill data that goes with this context. With that in hand it will grab all of the prefill data from the "prepop_dataset_item" table for the dataset and merge it into the new form submission. With that done, the customer will see a form that already has a bunch of data in it instead of being entirely empty.

If you're anything like me you're already wondering why each set of prefill data may have more than one context. The idea is that some things might have more than one business key. For instance, we might want to let people look up their outdoor recreational water permit by their permit number or by a unique identifier of the equipment itself (maybe by equipment serial number). Laying out the data in this manner lets us use either key at the expense of making it a little trickier for the external customer to enter the correct context value. This is an important thing to keep in mind: the cost of the customer typing in the wrong data and getting an entirely empty form is very high: it frustrates that customer who will needlessly fill in a lot more data then they need to, in turn causing a duplicate record that our customer will be cranky about merging into their target system.

Creating Your Data Sets

Since the data set is the topmost item in the hierarchy I find that the easiest place to start, after all, the database won't let you insert any data until you have your data set records established. For these examples we'll stick with our outdoor recreational water equipment example to try and keep things straight in our minds. There are three interesting fields in the "prepop_dataset" table.

  • TAG - The tag on an nFORM form section
  • NAME - This is for you to use as you like
  • REP_SEC_SORT - A number used to link a data set to a specific section of a repeating section

The "tag" column should be pretty clear, it's the "tag" on a form section in nFORM. The "name" fields really isn't used at all; some say it's only there by accident. Anyway, we're lucky it's there because it lets us link these new data set records to the context records we need to create in the next step. Lastly we have "rep_sec_sort" which is an index into a section of a repeating section starting with the number 1. The idea here is that if you need to prefill three repeating sections you can use this column to point to the particular repeating section that should hold the set of data. If you're section isn't repeating then you will leave it as "null".

In my opinion, the easiest way to do this is to create a new data set for each section of your form. If there's a section of your form that turns out to be used, verbatim, in another form then you can also re-use your prefill data, how handy is that? It will also make it easier to double-check your data, you can query section by section and context by context to make sure the data makes sense.

The Massachusetts HCD database has all of the permit data for these recreation outdoor water features, we'll query that to get our context data and populate our new data sets.

Create Data Set Records
MERGE nform.prepop_dataset AS dataset
USING (
	SELECT 'PERMIT_FEATURE'   AS tag
		permit.permit_number  AS name
    FROM ma_hcd.dbo.permit
	WHERE permit.expired is null) AS incoming
ON (dataset.tag = incoming.tag 
	AND dataset.name = incoming.name)
WHEN NOT MATCHED BY SOURCE AND tag = 'PERMIT_FEATURE'
	THEN DELETE
WHEN NOT MATCHED BY TARGET THEN
	INSERT (prepop_dataset_id, tag, name, created_date
			rep_sec_sort)
	VALUES (newid(), incoming.tag, incoming.name, 
			current_timestamp, 0);

What we do here is we merge all of the permit numbers that have not yet expired into our data set table. Inside the "using" statement we select all of the permit numbers that we want present, in this case all of the permits that haven't yet expired. We match that data with our existing data set records by checking to see if the "tag" and the "name" fields match (in this case the "tag" is the name of our section, "PERMIT_FEATURE" and the name is the permit's number). If we have a row in our data set (the source of the merge) that doesn't match up with a row in our incoming data (perhaps indicating the permit has expired or was invalid) then we delete that row from our data set. If we have an incoming row that doesn't line up with a row in our data set table then we insert a new row.

We don't have to handle the last situation, where a row in the data set matches a row in the our incoming data but has other fields that need to be updated. We don't have any data like that in our data set.

You may or may not use the SQL "merge" statement often but you might want to get familiar with it now. When we move data from nTEGRATE over to nFORM, that data will retain the unique identifiers from the nTEGRATE database. This lets us update the form data for our prefill data as the data in the source system (MA's HCD database) changes over time. To make this work we want to keep the identifiers on our data sets. The "merge" statements lets us do that; we're updating all of the data here but you might have some business logic in your project that lets you load in only the changed data.

Creating the Contexts for Your Data Sets

With your data sets in place our next task is to link them with their contexts. In our case we are only going to let external customers type in their permit number when they start the form, we will only accept that one business key. That makes this a little simpler since we will only need to create on context for each of our data sets.

Create Contexts
MERGE nform.prepop_dataset_context AS context
USING (
	SELECT prepop_dataset_id AS dataset_id,
		name                 AS context_key
	FROM nform.prepop_dataset
	WHERE tag = 'PERMIT_NUM') AS incoming
ON (context.prepop_dataset_id = incoming.dataset_id
	AND context.context_key = incoming.context_key
	AND context.context_type = 'PERMIT_NUM')
WHEN NOT MATCHED BY SOURCE
	AND prepop_dataset_id in (
		SELECT ds.prepop_dataset_id
		FROM nform.prepop_dataset ds
		WHERE ds.tag = 'PERMIT_FEATURE)
	THEN DELETE
WHEN NOT MATCHED BY TARGET THEN
	INSERT (prepop_dataset_context_id, prepop_dataset_id,
		context_type, context_key)
	VALUES (newid(), incoming.dataset_id, 'PERMIT_NUM', 
		incoming.context_key);

This is very similar to how we populated our data sets, in fact we build on those same records. Because we have the context key in the "name" field of our data set, we can select all of those data sets and use them to populate the context table. Just as before we use a "merge" statement to decide if we need to delete any orphaned records and then insert the rest. Easy peasy! (big grin)

Creating Your Data Items

The last step is to create the prefill data items that will be used to fill in the form. For each form field that you want to pre-fill we will have one item in the table. In the interests of keeping it simple, our form section will only have two controls in it...

  • Text Field - The name of the permitted feature
  • Text Field - The name of the permit owner

We're going to go over how to insert the data into the item table, we're going to save the discussion for how the different controls are for another section of the document. It's not that it's so complicated but that it's a little arbitrary what is named what. First we're going to delete any data that is currently out there for this form section. We could do a merge but we don't need to retain the unique identifiers for the individual field items and it's just as easy to fill them from scratch each time.

Delete Existing Data Items
DELETE FROM nform.prepop_dataset_item
WHERE prepop_dataset_id IN (
	SELECT prepop_dataset_id FROM nform.prepop_dataset
	WHERE tag = 'PERMIT_FEATURE');

Keep in mind that we don't want to delete all of the items, only the ones for our form section. Next we fill in the data from the source system.

Create Data Items
WITH input_values_all AS (
	SELECT permit.permit_num                      AS context_key,
		dataset.prepop_dataset_id                 AS dataset_id,
		cast(permit.feature_name AS varchar(255)) AS feature_name,
		cast(permit.owner_name AS varchar(255))   AS owner_name
	FROM ma_hcd.dbo.permit
		JOIN nform.prepop_dataset dataset
			ON dataset.name = permit.permit_num
				AND dataset.tag = 'PERMIT_FEATURE),
input_values AS (
	SELECT DISTINCT * FROM input_values_all
	UNPIVOT (value FOR tag IN (feature_name, owner_name)) 
	AS unpivoted),
items AS (
	SELECT newid()     AS prepop_dataset_item_id,
		dataset_id     AS prepop_dataset_id,
		incoming.tag   AS tag,
		'input'        AS control_type,
		'controlValue' AS field,
		value          AS value
		null           AS table_sort
	FORM input_values  AS incoming)
INSERT INTO nform.prepop_dataset_item
	(prepop_dataset_item, prepop_dataset_id, tag,
	control_type, field, value, table_sort)
SELECT * FROM item;

How about them apples? First, at the top, we selected all of the permit numbers in the source system, along with the data for our section, and joined that to our existing data set items by matching the permit number in the source system to the "name" in our data sets (as well as matching by the "tag" on the data set). We also named the columns we selected so that they matched the "tag" in the "PERMIT_FEATURE" of our form section.

Next we took that data and "UNPIVOTED" by our form tags, that is, we took each of those columns that were named after the "tag" in our form section ("feature_name" and "owner_name") and create new rows for them. When those new rows are created new columns called "value" and "tag" were also created. In this way we flatten the data into the format that nFORM required (one row for each field value).

Now that we have a row for each item of prefill form data we can re-arrange it to match the layout of the table in nTEGRATE. We generate a new unique identifier for each row, use the matching data set identifier and then fill in the tag for the form control (a control are the "fields" that the customer fills in, some are a collection of fields like "address"). We are only populating "input" controls so we can use a literal; the "field" is where we need to place our value, these controls have the same field called "controlValue" so that is a literal as well. Next we map in the "value" field with our data and set the table sort value to null.

With the data laid out as required we simply insert it all in to the item table.

Flag the Data to Be Sent to nFORM

Just like the lookup table data, nTEGRATE will not send your pre-fill data over to nFORM until the "is_send" flag on any related "prepop_dataset" item is set. When the nTEGRATE collects data to send over, it will pull every context related to a dataset, you only need to set the flag on one child context in order to flow the entire data set (it would make little sense to flow a context without it's parent data set).

We grab all of the distinct context keys and then flag one hundred of their parents to be flowed over to nFORM. Typically nTEGRATE will flow data every five minutes, you won't have to wait long for the data to migrate over. At the time of this writing nTEGRATE sends four items over at a time, you can monitor the progress of the data flowing with the following query.


Write Your Pre-Fill Data into a Stored Procedure

It may be tempting to right a series of queries that select your pre-fill data and then insert that data into nTEGRATE, but we advise against that. In almost all scenarios you will need to run the pre-fill process repeatedly. During development you will probably have to rebuild the data over and over as requirements change or you notice errors in your SQL. Once development has ended, it's likely that the client will want to update their prefill data on a schedule. Writing your pre-fill instructions into a stored procedure from the start will make this a lot easier in the long run. (wink)

Pre-Filling Different Form Controls

There are a variety of controls in nFORM and most of them are simple to pre-fill, you will need to populate the control tag, control type, field, and value.  Additionally, the advanced table control (grid) will allow you to sort the rows that you are prefilling, by using TABLE_SORT. More complex controls, like the "contact" and "address" controls, will have multiple entries in the prepop_dataset_item table, one for each element such as 'street', 'postalCode', etc. 'This section goes over all of the different controls and notes their "control_type" and their various fields.


The following script will provide you with the control types and field names: 

select ft.CONTROL_TYPE,
       ft.FIELD,
       ft.FIELD_DISPLAY_NAME
from nForm.CONTROL_FIELD_TYPE ft
where ft.CONTROL_TYPE not in ('attachment', 'grid', 'paragraph', 'supergrid', 'table')
order by ft.CONTROL_TYPE, ft.FIELD

Short Text

The "short text" control has the "control_type" of "input", is has one field named "controlValue.  This is the same concept for many other "simple" controls, such as calculated, date, email, number, hidden, phone, ssn, time, and url.

Item Column NameDescriptionNotes
control_typeinput

field

controlValueThe value of the field

Single and Multiple Selection

The single and multiple selection controls both have the same type, "select", and you set their value with the "select" field. The value should be the text description that appears in the selection menu (not the hidden code value).

When you need to prefill a multiple selection item, provide the selections separated by a comma (","). For instance, to select "fire man" and "police woman" you would use the value "Fire Man,Police Woman".

Item Column NameDescriptionNotes
control_typeselectBoth single and multiple selection menus use the same control type.
fieldselectUse the descriptive value you want to select, not the code.
fieldotherReasonIf the selected item is "Other", value for the "Please Describe" field

Address

The address control is made up of a collection of fields, one for each part of the address. This makes pre-filling the field a little more work since you will need to add a data item row for each section of the address. Aside from that it works similarly to the other controls.

Item Column NameDescriptionNotes
contol_typeaddress
fieldstreetThe first street address line
fieldstreet2The second street address line
fieldlocalityThe address' town
fieldareaCodeThe address' state
fieldpostalCodeThe ZIP code for the address
fieldcountyThe address' county.
fieldcountryThe address' country.

Contact

Like the address control, the contact control is made up of a collection of fields. Many of these are similar to the address fields but, unfortunately, they have slightly different names. Keep an eye on these as it is very easy to confuse them and get them wrong.

This control also has it's own little repeating section: a phone number can have multiple entries. The first phone number has it's fields suffixed with "_0" at the end, for instance "phoneTypeContact_0" or "extensionContact_0". The next phone number in the list will be suffixed with "_1" (i.e. "extensionContact_1") and so on.

Item Column NameDescriptionNotes
control_typecontact
fieldcompanyNameContactThe "company" name for the contact
fieldfirstNameContactThe first name of the contact
fieldlastNameContactThe last name of the contact
fieldtitleContactThe title of the contact
fieldemailContactEmail address for the contact
fieldstreetContactThe first street address line
fieldstreet2ContactThe second street address line
fieldlocalityContactThe city for the address
fieldareaCodeContactThe state for the address
fieldpostalCodeContactThe ZIP code for the address
fieldphoneTypeContact_0The type of phone for the first phone number (i.e. "Business" or "Mobile")
fieldphoneContact_0The first phone number
fieldextensionContact_0The extension for the first phone number
field
prefixNames
The name prefix (e.g., Mr., Ms., Dr., etc.)
fieldcountyThe county for the contact
fieldcountryThe country for the contact

Location

The "location" control has the "control_type" of "location", it has one field named "mapCoord".  For value you will load it as comma-separated, Latitude, Longitude. 

Item Column NameDescriptionNotes
control_typelocation

field

mapCoordThe lattidude, longitude of the control
value35.386389,-84.587778Add lat and long separated by a comma

Grid

A grid control, or Advanced Table Control, is prefilled a bit differently than the other more simple controls.  In this case you put the tag of the column control in the FIELD column of the PREPOP_DATASET_ITEM.  You can use TABLE_SORT to order the rows, starting with 0. Remember, the overall grid (advanced table) has a tag, and the individual columns (controls) have their own tags.

The new DataGrid control is prepopulated using the same format, however, the control_type is supergrid.

Item Column NameDescriptionTable_SortNotes
tagFACILITY_TAG
Tag of the table
control_typegrid
Use 'grid' for the control type.
fieldFACILITY_ID0Tag of the control within the table. First row in the table
fieldFACILITY_NAME0Tag of the control within the table. . First row in the table