Data Access Filters - rev 206

The documentation on this page corresponds to rev 206 (August 2016) of the tools suite. See Older Versions for documentation on earlier releases of this suite.

Release Designation

The data-access filtering in rev 206 (August 2016) is designated Alpha software.

See Release Designations for the meaning of this designation.

Summary

Traditional access control frameworks provide strong protections for data and the management of which users can modify that data. The data-access filtering introduced in ODK 2.0 is weaker. When syncing devices with the server, all data rows for all data tables are sync'd and shared across all devices. Every device gets a full copy of all data. Data-access filtering enables a supervisor to restrict the visibility of that data and to manage who can modify or delete the data through the programmatic means provided by the ODK 2.0 tools.

This is weaker than traditional access control frameworks in that application designers can:

  • circumvent via software. There are specific ways in which application designers can write their applications to defeat these filters. When those mechanisms are not employed, data-access filtering provides equivalent policy enforcement to that of a traditional access control framework.
  • circumvent via external access. The data and attachments are stored as plaintext on the device. Anyone can copy this data off of the device and access it, or write their own apps and directly modify it.

It is important to understand these limitations when designing your applications.

Overview

By default, all tables can be altered by all users.

The ODK 2.0 data access filtering mechanism relies on 5 interacting features that have been added in rev 206:

  1. Verified user identities
  2. Verified user capabilities.
  3. Table-level security configuration (e.g., whether data in the table can be modified by unprivileged users).
  4. row-level access filters (e.g., to specify whether a row is visible to a given user.
  5. sync status of the individual row

Verified User Identities

Enforcing restrictions on who can see or modify data requires that the identity of the user has been verified.

Beginning with this release, when configuring the Server Settings, any changes to any of the settings (i.e., the server URL, type of credential (or anonymous access), username, password or gmail account) will clear any prior user identity and capability information and flag the user identity as unverified.

When leaving the Server Settings screen, a user-verification screen will then be presented:

Clicking the Verify User Permissions button on this screen will initiate a series of requests to the configured server. These requests verify that the server URL is correct, that the server works with this application name, and then verifies the username and password or gmail account that has been configured on the Server Settings page.

If no server sign-on credential was specified (i.e., anonymous access was selected), the user identity will remain unverified.

Otherwise, upon successful completion of this interaction, the user identity will become verified, and the system will internally record this user identity and use it when evaluating whether data is visible and/or modifiable.

Verified user Capabilities

As part of the user-verification process, once a user identity has been verified, the list of capabilities assigned to that user, on that server, is downloaded and internally recorded.

For the purposes of the data access filtering mechanism, there are 3 user capabilities of interest:

  • ROLE_USER -- an ordinary user on the system.
  • ROLE_SUPER_USER_TABLES -- a privileged user who can edit all rows, change how rows are visible, and change who has special permission to edit a given row.
  • ROLE_ADMINISTER_TABLES -- a privileged user who can Reset App Server and who can edit all rows, change how rows are visible, and change who has special permission to edit a given row.

Inside ODK Survey and ODK Tables web pages, the capabilities of the current verified user are available in Javascript via the API:

odkData.getRoles(function(result) {
     var roles = result.getRoles();
     // roles is an array of capabilities granted to the verified user. 
     // It will be null for anonymous and unverified users.
  }, function(errorMsg) {
     // error handler
  });

Finally, after the identity and capabilities of the user are established, the server is asked for the list of all configured users and their capabilities.

If no server sign-on credential was specified (i.e., anonymous access was selected), the server will return null in response to this request.

If the user identity does not have ROLE_SUPER_USER_TABLES or ROLE_ADMINISTER_TABLES capabilities, the returned list will contain a single entry for this one verified user (ordinary users have no ability to change filters and therefore have no use for a full list; returning the full list to an ordinary user can be a security breach).

Otherwise, the full list of all configured users and their capabilities will be returned. This list will contain entries of the form:

{
  user_id: "verified_identity_token",
  full_name: "content of the Full Name field on the server",
  roles: [...]
}

The Full Name field on the server (on the Site Admin / Permissions sub-tab) is provided here to allow super-users and administrators to select people by name. user_id is used internally when configuring row access filters. The list of roles is provided to allow super-users and administrators to choose users based upon their capabilities.

Inside ODK Survey and ODK Tables web pages, the list of all configured users is available in Javascript via the API:

odkData.getUsers(function(result) {
     var users= result.getUsers();
     // users is an array of the above objects.
     // It will be null for anonymous and unverified users.
     // It will be a singleton list if the user lacks permissions.
  }, function(errorMsg) {
     // error handler
  });

Table-level Security Configuration

As mentioned earlier, by default, all tables can be altered by all users.

Data access filtering introduces the notion of a "locked" table. Only super-users and administrators can create and delete rows in locked tables. Anonymous, unverified, or ordinary users are unable to do so.

A table property is used to specify that a table as "locked."

Two other table properties control the creation of a row. The first property specifies whether an anonymous or unverified user can create a row in the table (this only applies if a table is not "locked;" it has no effect if the table is "locked"). The second property specifies the type of row-level access filter to assign to this newly-created row. Row-level access settings are covered more completely in the following section.

These 3 table properties can be specified in the properties sheet of the XLSX file via, e.g., if they are not specified, the default values for these 3 properties are:

partition aspect key type value
Table security locked boolean false
Table security unverifiedUserCanCreate boolean true
Table security filterTypeOnCreation string DEFAULT

Row-level Access Filters

Control of who can see, modify, or delete an individual row is governed by the row-level access filter of that row and that row's sync status. That filter is stored in the row itself under the _filter_type and _filter_value metadata columns. The sync status of the row is also stored in the row itself under the _sync_state metadata column.

The rules for the row-level access filter are as follows (stop at the first rule that applies):

  1. Super-users and administrators have full read/write/delete (rwd) capabilities on all rows, regardless of their row-level access filters and independent of the table's "locked" status. These privileged users also have the ability to change the row-level access filter column values (ordinary users cannot).
    User Capability unlocked table "locked" table
    ROLE_SUPER_USER_TABLES rwd rwd
    ROLE_ADMINISTER_TABLES rwd rwd
  2. If a row has not yet been sync'd to the server, the current user has full read/write/delete (rwd) capabilities on that row. This includes the anonymous and unverified users and is independent of the table's "locked" status.
    _sync_state unlocked table "locked" table
    new_row rwd rwd
  3. If the _filter_value column contain the user_id of the current user, then this user has full read/write/delete (rwd) capability on this row or, for "locked" tables, can modify the row (but cannot delete it).
    _filter_value unlocked table "locked" table
    user_id of current verified user rwd rw
  4. Otherwise, row-level access is governed by the _filter_type column and whether or not the table is locked, as follows:
    _filter_type unlocked table "locked" table
    DEFAULT rwd r
    MODIFY rw r
    READ_ONLY r r
    HIDDEN not visible not visible

Note that _filter_value can be null or any arbitrary placeholder string. If you use placeholder strings, it is recommended that they not begin with "username:" or "mailto:" or be "anonymous" to prevent any possible collisions with existing usernames. Placeholder strings might be useful in workflows to designate queues of unassigned-work.

Super-users and administrators can update the row-level access filters via the Javascript API:

odkData.changeAccessFilterOfRow(tableId, filterType, filterValue, rowId,
  function(result) {
    // success outcome
    // result holds the result set: SELECT * FROM tableId WHERE _id = "rowId"
  },
  function(error) {
    // error handler
  });

Alternatively, super-users and administrators can also use the updateRow API.

Ordinary users will receive a not-authorized error if they attempt to set any of these metadata fields (even if the values they set are unchanged from the current values of those fields).

Implementation of the HIDDEN filter on queries

When a SQL query is processed inside the ODK Services layer, it is first examined to see if the result set contains the columns _sync_state, _filter_type, and _filter_value. If it contains all 3 columns, then the query is wrapped with a where clause to exclude hidden rows and that, in turn, is wrapped by whatever limit and offset you have specified for the query.

Therefore, if you issue a query that omits one or more of these 3 columns from the result set, then no HIDDEN filtering will be applied. This is one way to circumvent data access filtering in software -- by crafting queries that omit one or more of these fields.

For example, queries that return the maximum value in a field, e.g.,

    SELECT MAX(crop_height) as max_height FROM crop_plantings

Would return the maximum crop height across all crop planting -- even if the current user only had access to the crop height data for their own plantings (and the crop information from other farms was hidden from them).

If you want to restrict such calculations to just the data visible to the current user, you must manually construct the query to do so. E.g., this would be the revised query:

    SELECT MAX(crop_height) as max_height from crop_plantings WHERE _filter_type != ? or _filter_value = ?

    bind parameters = [ "HIDDEN", odkCommon.getActiveUser() ]

Effective Access

As mentioned above, when a SQL query is processed inside the ODK Services layer, it is first examined to see if the result set contains the columns _sync_state, _filter_type, and _filter_value. If it contains all 3 columns, then a synthesized column, _effective_access is added to the result set. That column returns one of "r", "rw" or "rwd" to indicate the level of access the current user has on the rows in the result set.

Additionally, once a result set is returned for a given table, you can determine whether the current user can create new rows on the table by calling getCanCreateRow. I.e.,

odkData.query(tableId, whereClause, sqlBindParams, groupBy, having,
            orderByElementKey, orderByDirection, limit, offset, includeKVS, 
  function(result) {
    // success outcome
    // result holds the result set. Assume this has at least one row.
    // obtain the effective access for the first row in the result set
    // this will be one of "r", "rw", or "rwd"
    var effectiveAccess = result.getData(0, "_effective_access");
    // obtain the boolean indicating whether the current user can
    // create new rows in this tableId.
    var ableToCreate = result.getCanCreateRow();
  },
  function(error) {
    // error handler
  });

Usages within Applications

Consider a workflow application where a first group of field agents create work requests, those requests are then sent to a supervisor who assigns them to a different set of field agents for processing.

In this case, you might configure a work_requests table to create rows with a HIDDEN filter type (via filterTypeOnCreation). Then create a form for opening work requests.

The first group of agents (ordinary users) uses that form to create new work requests. Each agent would only see the work requests they themselves create because all other rows in that table would be hidden due to the _filter_type being HIDDEN and due to their being ordinary users.

After the field worker in the first group syncs to the server, and the supervisors sync to the server, the set of work requests the field worker created will have become available on the supervisors' devices. The supervisor (a super-user or administrator) can then see and change the _filter_value on each work request to one of the field agents in the second group.

When the supervisor syncs to the server, and then the field agent in the second group (another ordinary user) syncs to the server, that field agent will see the work items that have been assigned to them (and they will not see any other work items because they are ordinary users of the system).

When the agent in the first group next syncs, their created work item will disappear from their view because it is HIDDEN and the _filter_value no longer matches this field agent's verified user id (it was assigned to the second agent).

Upon completion of the task and after syncing to the server, after the supervisor next syncs, the supervisor could then change the _filter_value to null or to a special placeholder value to remove it from the second agent's list of work items (and that removal would occur when that second agent next syncs with the server after the supervisor syncs his _filter_value change).

Example Application

The app designer has a row-level access demo using the geoweather and geoweather_conditions tables and forms. This demo only works on the device. To install the demo on the device:

  1. force close all the apps
  2. delete the /sdcard/opendatakit/default/ directory on the device
  3. from the app designer, execute grunt adbpush-tables-rowlevelaccessdemo
  4. start ODK Survey and exit it
  5. start ODK Tables.

You will be presented with a demo launch screen.

At this point, all the rows in all the tables have a _sync_state of new_row and are fully editable and deletable. The demo will not become interesting until you set up and sync with a server.

Set up an ODK Aggregate 1.4.12 server with 2 ordinary users, 1 super-user and 1 tables administrator. Reset App Server to push the configuration and data up to the server.

You are now an administrator (you needed to be in order to reset the server). You can choose "Change Row-Level Access Filters" to view and perhaps modify the filter type and filter value of one or more rows. All rows in all tables are fully editable and deletable.

Now, change your Server Settings to one of the ordinary users (a gmail account or a username other than "msundt"). Notice that the list of conditions from the geoweather_conditions table no longer contains the "Light Rain" option. That was hidden and will only be visible to a username of "msundt" or a super-user or administrator.

Use the table display on the "Change Row-Level Access Filters" page to examine what the _effective_access for each row is in the various tables and verify that those settings are enforced.

Change your Server Settings to different users to see how their effective accesses change.