Tokenizer Steps

When the Tokenizer runs, it makes several passes over the text to be tokenized.  The following passes are made:

Step

Purpose

Notes

1

Check for Alert boxes

Catch point screens often contain active script.  This step looks for Alert message boxes contained in embedded JavaScript.  The text within the Alert box is tokenized.

2

Check for Confirm boxes

Same as Step 1, except that it looks for JavaScript Confirm message boxes.
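As a rough illustration of Steps 1 and 2, the sketch below pulls the message text out of alert() and confirm() calls with a regular expression.  It is not the Tokenizer's actual implementation: the pattern, the function name find_dialog_text, and the sample script are all assumptions made for this example, and the sketch only handles a message that is a single quoted string literal.

```python
import re

# Hypothetical pattern: a call to alert(...) or confirm(...) whose first
# argument is a single- or double-quoted string literal. Backslash escapes
# inside the literal are skipped over so an escaped quote does not end it.
DIALOG_CALL = re.compile(
    r"""\b(alert|confirm)\s*\(\s*(["'])((?:\\.|(?!\2).)*)\2""",
    re.DOTALL,
)


def find_dialog_text(script: str, function_name: str) -> list[str]:
    """Return the literal message text of each call to function_name."""
    return [
        match.group(3)
        for match in DIALOG_CALL.finditer(script)
        if match.group(1) == function_name
    ]


sample = """
<script>
  alert("Please enter a name.");
  if (confirm('Delete this record?')) { removeRecord(); }
</script>
"""

alert_texts = find_dialog_text(sample, "alert")      # Step 1 pass
confirm_texts = find_dialog_text(sample, "confirm")  # Step 2 pass
```

Running the two passes separately mirrors the step order described above; a message built by string concatenation would need more than this single-literal pattern.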

3

Check for MsgBox

This step is for legacy support only.  Some older catch point screens contain VBScript.  MsgBox is a VBScript construct that behaves similarly to JavaScript's Alert and Confirm boxes.  This step looks for MsgBox() constructs within VBScript sections and tokenizes the text within.

NOTE:  MsgBox must be immediately followed by a left parenthesis to be matched and processed.  It must therefore always be used as a VBScript function and not as a VBScript sub.

NOTE:  VBScript is no longer used to create catch point screens, and ESI does not recommend its use.  VBScript is recognized only by Microsoft's Internet Explorer and will not work with Firefox, Chrome, or other browsers.  JavaScript is the recommended scripting language, as it is supported by all browsers.
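The left-parenthesis rule for Step 3 can be illustrated with a small sketch.  The pattern and the function name find_msgbox_text are assumptions for this example, not the Tokenizer's real code.  The sketch reads "immediately followed" strictly (no space before the parenthesis) and matches MsgBox in any letter case because VBScript itself is case-insensitive; neither detail is confirmed by this topic.

```python
import re

# Hypothetical pattern: MsgBox, then "(" with nothing in between, then a
# double-quoted VBScript string literal. Inside a VBScript string a doubled
# quote ("") stands for one literal quote, so it must not end the match.
MSGBOX_CALL = re.compile(r'\bMsgBox\(\s*"((?:[^"]|"")*)"', re.IGNORECASE)


def find_msgbox_text(script: str) -> list[str]:
    """Return the message text of each MsgBox( call in a VBScript section."""
    return [match.group(1) for match in MSGBOX_CALL.finditer(script)]


sample = '''
answer = MsgBox("Save changes?", vbYesNo)
MsgBox "Record saved."
MsgBox ("A space before the parenthesis")
'''

msgbox_texts = find_msgbox_text(sample)
```

Only the first call in the sample is picked up.  The second uses MsgBox as a sub (no parentheses), and the third has a space before the parenthesis, so under this strict reading neither would be tokenized.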

4

Check for ESI tags

ESI tags start with {% and end with %}.  This step looks for Input tags as well as Caption tags.  For Caption tags, if the tag does not currently specify an override caption, the default caption is looked up and added as an override.  The override caption is then tokenized.  For Input tags, if the tag contains an override caption (as may be the case with Input tags that are used to create buttons), the override caption is tokenized.  In either case, the effect is to permit alternate language translation of all captions that appear on the screen as well as any buttons that might be presented for clicking.

NOTE:  For buttons to be tokenized and therefore eligible for alternate language translation, the ESI tag that specifies the Input must have an override caption specified.  See the help on Merge tags for how to specify override captions.
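The caption rules for Step 4 can be summarized as a small decision function.  This is only a sketch: the EsiTag class, its fields, the default_captions lookup table, and the sample field names are hypothetical stand-ins, since this topic does not describe how tags are represented internally.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EsiTag:
    kind: str                               # "caption" or "input"
    field: str                              # hypothetical field name
    override_caption: Optional[str] = None  # None when no override is given


def caption_to_tokenize(tag: EsiTag, default_captions: dict[str, str]) -> Optional[str]:
    """Return the caption text that would be tokenized, or None if none applies."""
    if tag.kind == "caption":
        if tag.override_caption is None:
            # Promote the default caption to an override so it can be tokenized.
            tag.override_caption = default_captions.get(tag.field)
        return tag.override_caption
    if tag.kind == "input":
        # Input tags are tokenized only when they already carry an override.
        return tag.override_caption
    return None


defaults = {"CUSTNAME": "Customer Name"}
caption_tag = EsiTag("caption", "CUSTNAME")
button_with_override = EsiTag("input", "SUBMIT", override_caption="Save")
button_without_override = EsiTag("input", "CANCEL")

caption_result = caption_to_tokenize(caption_tag, defaults)
button_result = caption_to_tokenize(button_with_override, defaults)
skipped_result = caption_to_tokenize(button_without_override, defaults)
```

The last case shows why the NOTE above matters: an Input tag with no override caption yields nothing to tokenize, so the button would not be eligible for translation.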

5

Check for ESI tags (alternate delimiters)

At times ESI tags are delimited by [% and %] rather than {% and %}.  Such tags are used within the Gen 2 Software Request Portal, as the {% and %} delimiters are reserved for Gen 2 SRP-specific constructs.  This step is identical to Step 4, except that the alternate delimiters are used to identify potential Caption and Input tags.
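Because Steps 4 and 5 differ only in their delimiters, one scanning routine could serve both passes.  The sketch below is an assumption about how that might look, not the actual implementation; find_esi_tags is a hypothetical name, and TAG_A and TAG_B are placeholders rather than real ESI tag syntax.

```python
import re


def find_esi_tags(text: str, open_delim: str, close_delim: str) -> list[str]:
    """Return the trimmed contents of every tag between the given delimiters."""
    pattern = re.compile(
        re.escape(open_delim) + r"(.*?)" + re.escape(close_delim),
        re.DOTALL,
    )
    return [match.group(1).strip() for match in pattern.finditer(text)]


page = "Name: {% TAG_A %} <input value='[% TAG_B %]'>"

step4_tags = find_esi_tags(page, "{%", "%}")  # standard delimiters
step5_tags = find_esi_tags(page, "[%", "%]")  # alternate delimiters
```

Each pass sees only the tags written with its own delimiters, which is consistent with the two steps being run separately.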

6

Check for HTML

This step is performed only for those message types that are expected to be in HTML.  This step looks for the < and > delimiters used by HTML.  It isolates text that sits between HTML constructs, and tokenizes that text.  There are some special cases where HTML constructs will be included within the tokenized text.  Those cases include:

<b>, </b>, <u>, </u>, <i>, </i> as well as any of the HTML representations for various characters (such as &nbsp; for non-breaking space).

In the case of bold, italics, and/or underline, the start sequence will be moved, if necessary, to be with the text that is tokenized (in some instances, the <b> might be followed by <font> and <span> and other HTML sequences, all occurring before the text that the <b> is modifying).  The text will contain the start sequence, the text, and then the stop sequence.  A token will be assigned for everything up to and including the stop sequence.  Text that occurs after the stop sequence will be assigned a different token.

NOTE:  When embedding bold, italics, or underline within text, be sure to follow the end sequence with a non-breaking space sequence (&nbsp;) to preserve proper spacing when the tokens are filled in at run time.

Wrong:  this is a test of <i>italics</i> use.

Correct: this is a test of <i>italics</i>&nbsp;use.
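To make the token boundaries in Step 6 more concrete, the sketch below splits a fragment of HTML into the text runs that would each receive a token.  It is an interpretation of the description above, not the Tokenizer's code: the function names are hypothetical, the sketch assumes a token starts with the text preceding the bold, italics, or underline sequence and ends at the stop sequence, and it does not attempt the repositioning of a start sequence that is separated from its text by other tags.

```python
import re

INLINE_TAGS = {"b", "u", "i"}  # constructs kept inside the tokenized text
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>")


def has_text(fragment: str) -> bool:
    """True if the fragment contains anything besides tags and whitespace."""
    return bool(re.sub(r"<[^>]*>", "", fragment).strip())


def segment_html(html: str) -> list[str]:
    """Split html into the text runs that would each be assigned a token."""
    segments = []
    current = ""
    position = 0
    for match in TAG.finditer(html):
        current += html[position:match.start()]
        position = match.end()
        is_closing = match.group(1) == "/"
        if match.group(2).lower() in INLINE_TAGS:
            current += match.group(0)
            if is_closing:
                # The token ends at the stop sequence.
                segments.append(current)
                current = ""
        else:
            # Any other tag is a boundary between tokenized text runs.
            if has_text(current):
                segments.append(current)
            current = ""
    current += html[position:]
    if has_text(current):
        segments.append(current)
    return segments


wrong = segment_html("<p>this is a test of <i>italics</i> use.</p>")
correct = segment_html("<p>this is a test of <i>italics</i>&nbsp;use.</p>")
```

In both cases the text after the stop sequence lands in its own segment.  In the "Wrong" form that segment begins with an ordinary space, while in the "Correct" form the separator is an explicit &nbsp; entity that is part of the token text.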