XLS2XForm is a tool to simplify the creation of forms. Forms can be designed with Excel and XLS2XForm will convert it to an XForm that can be used with ODK tools.
This tutorial teaches you to build your forms using Excel. Please refer to the sample Excel file as you read through the tutorial.
Basic format
Each workbook must have two worksheets: survey and choices.
- The survey worksheet contains the structure and most of the content of the form. It contains the full list of questions and information about how they should be presented. Most rows represent a question; the rest of the rows specify control structures such as groups.
- The choices worksheet is used to specify the options for multiple choice questions. Each row represents a multiple choice option. Choices with the same list name are considered part of a related set of choices. This allows a set of choices to be reused in multiple select questions (for example, yes/no choices).
Each of these worksheets has a set of necessary columns that must be present for the .xls file to be valid. Additionally, each worksheet has a set of optional columns that allow further control over the behavior of each entry in the form, but are not essential to the specification. Every entry must have values for each necessary column, but any of the optional columns may be left blank from entry to entry. For example, in the survey worksheet, each question must have a name specified, but one question may have a numerical constraint specified while another has an image file and a third may have niether. These columns, necessary or optional, may appear in any order. Optional columns may be left out completely. Any number of rows may be left blank. All .xls file formatting is ignored so that dividing lines, shading, and font formatting can be used to make the specification more readable.
Survey Worksheet
The rows of the survey worksheet specify the structure of the form. Questions will appear in the order they are specified (of course, depending on skip logic). The various question types correspond to ODK Collect question types and begin with the word "add" and end with the word "prompt":
- add text prompt: A string reponse.
- add integer prompt: A whole number response.
- add decimal prompt: Any number response.
- add date prompt: A date response.
- add select multiple prompt : A multiple choice question where multiple answers are allowed. The list name of the set of choices must also be provided. An example of a tag entry for select would be: "add select multiple prompt using colors", where "colors" is the "list name" of a set of choices from the choices worksheet.
- add select one prompt : A multiple choice question where only one answer is allowed. It requires a list name for a set of choices just as select multiple does.
- add barcode prompt: Opens a barcode scanning app and captures a barcode response.
- add location prompt: Enables the phone's gps and captures a gps location response.
- add image prompt: Opens the camera app and captures a photo or allows one previously taken to be selected from the phone's photo gallery. It shows a preview of the photo taken or selected.
- add audio prompt: Records audio or allow selection of a previously recorded audio segment. It also allows the audio file saved to be reviewed.
- add video prompt: Records video or allow selection of a previously recorded video segment. It also allows the video file saved to be reviewed.
- add acknowledge prompt: A simple message the user must acknowledge by checking a box. The form will not advance to the next question unless the box is checked.
- add note prompt: A read only question, meaning that there is no response required from the user. For example, you might describe the following series of questions in a longer form or provide instructions.
There are also question types that load information from the phone without displaying a question to the user, these begin with the word "get":
- get device id: The device id.
- get start time: A time stamp indicating when the survey was started.
- get end time: A time stamp indicating when the survey was finished.
- get today: The date the survey was taken.
- get subscriber id: The subscriber id.
- get sim id: The sim id.
- get phone number: The device's phone number.
You can control the title and id of your form with the following commands starting with the word "set". They both default to the name of the file if left unspecified:
- set form title: Set the form title.
- set form id: Set the form id.
The three necessary columns are type, name, and label.
- The type column specifies the question type. A few examples include "add text prompt", "add integer prompt", and "get today". The type column can also specify the begin and end of groups (see next paragraph).
- The name column specifies the internal name of the question or group (its XPATH name in XForm-speak). When referring to a question somewhere else in the form, this is the name to use. When viewing completed data from Aggregate, this name will be used in the database table. This value must be unique and not contain any spaces. It may contain numbers and some special characters, but must start with a letter.
- The caption column specifies the caption text the user sees in the form. For a question this will be the question text. For a group this will be the group label that appears in the breadcrumb trail.
Questions can also be grouped so that they appear on the same screen in ODK Collect. To begin a group, enter "begin group" in the type column. To close a group, enter "end group" under the type column. Note that the "begin group" row must have values for all three necessary columns (specified below), but the "end group" row does not. Groups may also be repeated (that is, looped). To make a set of questions repeat, enter "begin looped group" and "end looped group" in the type column.
The following columns are optional on the survey worksheet.
- The hint column specifies the question's hint, which will appear smaller and less emphasized than the question's caption. A common use for the hint column is to describe the question's constraints.
- The default column specifies the default answer for the question. For an integer question you might fill in "42". For a select multiple question, include all of the desired choice names (not the captions) seperated by spaces: "red blue green".
- The appearance column specifies the appearance attribute for the question or group. This is only relevant for multiple choice questions and groups. For multiple choice questions you could enter "quick" for a quick advance widget or "minimal" for a spinner widget. A question in which the answers are all images could be set to display those answers as a matrix of images by providing an appearance value of "compact". By default, this matrix will will the width of the screen and then start a new row. If a maximum number of columns is desired, it can be indicated by appending a dash and the number to compact (e.g. "compact-2"). An auto-complete widget can be used to make a choice selection by typing in a few letters that appear in the choice name. The appearance tag for this type of select question is "autocomplete". By default it searches for letters that appear contiguously in the choice name but "autocomplete-chars" will search for matches that contain the user provided letters in any order. An appearance of value "list" is used to have a multiple choice question appear in a table (matrix) format when in a group of questions with identical choices. The appearance "list-nolabel" is used to suppress the choice labels in the table format. The first question in this last type of group has the appearance "label" and is used at the top of the columns of choices. When used with a group, the appearance column can have the value "field-list" to indicate that all members of this group should appear on a single screen. See example.xls for examples of each of these.
- The constraint column specifies constraints on the responses that can be provided for questions. For a numerical question this will be a numerical constraint, such as "0 <= . and . <= 100". The dot represents the value provided by the user; the example constraint specifies that the question must have a response between 0 and 100. Note that logical operators "and", "or", and "not" are accepted. For a string question, this column's value will be a regular expression, such as "regex(., '^\d*$')" to specify that a valid input must be all numbers. To constrain the input to 5 digits, the regular expression would be "regex(., '^[0-9]{5}$')". A more complex regular expression to check for a well-formed email address is "regex(., '[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}')" the latter element constraining the part after the "." to be a character string 2 to 4 characters long.
- The constraint message column specifies message to be displayed to the user if a constraint is violated.
- The relevance column specifies whether a question shoud be asked or not based on the answer to a previous question. For example, if a medical form has a section about gynecological information, it could have a question that asks the user's gender. It would then only display the gynecological section if that response is "female". To do this, for each question in the gynecological section, the relevance column would contain "${gender}='female'" where "gender" is the name of the question (not the caption) and "female" is the desired value. Numerical constraints and logical operators can also be used: "${gender}='female' and ${age_in_years}>=15 and ${age_in_years}<e;49"
- The disabled column specifies whether a question should appear in the form. If the value is yes, this question will be ignored during conversion and will not appear in the the Xform.
- The required column specifies whether a response to the question is required. If "yes" then the user cannot advance to the next question without providing a response. If "no" then the user does not need to provide a response. By default no questions are required.
- The read only specifies that a question does not require the user to provide a response but is merely used to provide the user information or a note. If "yes" is in this column then no response is required and the caption is used as the message to the user. By default all questions require responses so there is no need to enter "no" in the "read only" column for a question where input is solicited.
- The calculation column specifies that this question's response should be calculated rather than asked from the user. The user will never see a question for which this column is defined - it is an internal question to compute a new value - it can be displayed to the user as part of a message (see below). For example, if you had a question that asked the user's age you might calculate whether they were eligible for the draft. The value of this column would be as follows: "if(${age} >= '18', 'eligible', 'not eligible')".
- The media:image column specifies an image to attach to the question. Similarly, media:audio specifies an audio file to attach, and media:video specifies a video file to attache. Note that the media file must be present within the phone's sdcard in the correct location (in the forms directory in a subdirectory with the name "formname-media") or it will not appear in the form.
- The caption:<language> column specifies a translation of the caption column into <language>. A new instance of this column can be created for each translation language. For example, a question with caption value "Good Day" might have "caption:Spanish" value "Buenos Dias" and "caption:German" value "Guten Tag".
- The hint:<language> column specifies a translation of the hint column into <language> and works much the same as caption:<language>.
- The media:image:<language> column specifies a translation of the media:image column value. This method can be used to translate media:audio and media:video as well.
Choices Worksheet
The rows of the choices worksheet specify the multiple choice options for the "add select multiple prompt" and "add select one prompt" questions. Every multiple choice question must reference a list name specified here (after the word "prompt" to get "add select one prompt using yesno", for example). Usefully, each set of choices is reusable; multiple questions can reference the same list name. For example, a "yesno" list name with choices "yes" and "no" is likely to be used many times.
The three necessary columns are list name, name, and caption.
- The list name column specifies the name of the set of multiple choice responses. For example, set of choices "brown", "blue", "orange", "red" might share the list name "colors". Note that each choice in your set of choices will have the same list name.
- The name column specifies the internal name for the choice (much like the name column from the survey worksheet). This value must be unique and not contain any spaces. It may contain numbers and some special characters, but must start with a letter. When a choice is made, this name will be the value used for the answer to the question. For select multiple prompts, multiple values are used. To access these values in calculations, the "selected" directive is used. For example, to add a constraint that choices with names "c" and "d" can't both be selected, one would add the constraint "not(selected(., 'c') and selected (., 'd'))" in the constraint column for that question on the survey worksheet.
- The label column specifies the text the user sees for the choices.
The choices worksheet may also use the media and translation columns just as they are used in the survey worksheet.
