SSO and Identity Management with metaphactory


This blog post is co-authored by Andreas Schwarte, Principal Software Engineer, and Wolfgang Schell, Principal Software Architect at metaphacts.

In a previous blog post we provided a high-level overview of security-related topics for using metaphactory in an enterprise environment. This post dives deeper into authentication and authorization using single sign-on and also covers how the authentication process can be integrated with databases or external services such as third-party REST endpoints.

Single Sign-On (SSO)

User Identity plays a big role in enterprise environments: employees, contractors, and external partners work with many different applications and services in their daily work, where each application requires authentication and authorization.

Maintaining user identities and accounts individually in each of the applications and systems does not scale. To simplify both user administration and the enforcement of security policies as well as daily work for users, the user identity is managed in a centralized system such as an Active Directory. This centralized system is then used by all applications and services for authentication. When logging into a system, the authentication check is delegated to the central authority. Single sign-on (SSO) allows the user to log in once and access different connected services without re-entering authentication factors (e.g., credentials).

SSO Authentication Schemes and Protocols

Single Sign-On is a high-level authentication scheme, which is implemented using different protocols. The most commonly used protocols are OpenID Connect (OIDC), OAuth, Security Assertion Markup Language (SAML), and JSON Web Token (JWT), all of which are supported by metaphactory. Details on each of them are provided in the documentation on Authentication & Authorization Providers.

Our recommended method for SSO authentication is OIDC because it allows integration with all kinds of backends and supports authentication both for interactive users (i.e., with a browser) and for machine-to-machine communication (i.e., calling Web Services like Query-as-a-Service (QaaS)).

Identity, Authentication, Authorization

The user's identity is managed in a central system, the Identity Provider. The main purpose of such a central identity is that it can be reused across applications (i.e., to avoid the use of local accounts). Identity here is the concept of representing a given user and their attributes (e.g., name, email address, etc.). Additionally, the user identity is used for assigning roles or groups in the context of a given application.

Authentication is the process of verifying the identity of a user. For authentication, the user typically provides credentials (or, in a more abstract manner, any kind of identity information) that are potentially accompanied by additional authentication factors. Depending on the authentication scheme, the information is passed to the identity provider for performing the validation.

Authorization goes a step beyond and is the process of determining the access permissions (or rights) for a given resource. Typically, authorization means associating the permissions granted through roles or groups with the user's profile in the context of a given application.

The result of authentication and authorization is a verified user profile (representing the identity) with attached information on roles or groups.

Using groups or application roles for role mapping

Let's assume that a user is a member of a business unit which is responsible for modeling product specifications and bills of materials, specifically to ensure that the same terminology and concepts are used consistently across the organization. In the context of metaphactory, such a user would be assigned the roles ontology-edit and vocabulary-admin (accompanied by the system-base role for granting base access). What is desired in this scenario is that all users in that business unit (or group) obtain the metaphactory permissions through a defined role mapping.

Assigning application groups or roles and corresponding mappings happens in two stages:

  1. Basic groups (or roles) are assigned to the user identity for the context of the application.
  2. Afterwards, these groups or roles are mapped to metaphactory-specific roles through the authentication provider configuration (see the scopeRolesMap setting in the example below).

In metaphactory, a role groups the individual permissions required for a given functionality or service, e.g., in the simplest case reading from the default database. A number of such pre-defined roles are available out-of-the-box and can be mapped to users through the respective role mappings.

Example of using OIDC with metaphactory

In the following we demonstrate how a concrete configuration of OIDC looks in metaphactory.

As a first step, the application needs to be registered in the identity provider; in our example below we use Microsoft Azure AD. Our application is registered as "metaphactory-sandbox" and we have created a client secret and defined app roles. The following screenshots display the overview (where all relevant settings can be accessed) and the app roles overview.

Settings overview for metaphactory-sandbox on Microsoft Azure
App roles overview for metaphactory-sandbox on Microsoft Azure

The next step is to apply the concrete configuration to metaphactory, as depicted in the listing below. Please note that we use placeholders for TENANTID, CLIENT_ID and CLIENT_SECRET, which need to be populated with the values from the actual Azure account and the app registration. Additionally, the callback URL needs to be adjusted to the external URL of the metaphactory system. Finally, the application-specific roles can be mapped to metaphactory roles. Once completed, the configuration is stored in metaphactory in the file shiro-sso-oidc-params.ini. For further details, please have a look at the metaphactory documentation on authentication and authorization with OIDC.

[main]
discoveryURI.value = https://login.microsoftonline.com/<TENANTID>/v2.0/.well-known/openid-configuration
callbackUrl.value = https://metaphactory.example.com/sso/callback?client_name=OidcClient
clientId.value = CLIENT_ID
clientSecret.value = CLIENT_SECRET
scope.value = openid email profile offline_access
principalNameAttribute.value = preferred_username
defaultRole.value = guest
rolesClaimAttribute.value = roles
scopeRolesMap.value = "admin":"admin,root,repository-admin","reader":"guest","writer":"ontology-edit,vocabulary-admin,system-base"

tokenAttributeName.value = user.token
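
To make the two-stage role mapping more tangible, the following Python sketch parses the scopeRolesMap value from the listing above and maps the roles delivered in the token's roles claim to metaphactory roles. It is an illustration only and not metaphactory's actual implementation: the function names are hypothetical, and the fallback to the default role when no mapping matches is an assumption about how defaultRole behaves.

```python
import re

# Value copied from the scopeRolesMap.value line in the listing above.
SCOPE_ROLES_MAP = (
    '"admin":"admin,root,repository-admin",'
    '"reader":"guest",'
    '"writer":"ontology-edit,vocabulary-admin,system-base"'
)


def parse_scope_roles_map(value):
    """Parse '"key":"role1,role2",...' into {key: [role1, role2]}."""
    pairs = re.findall(r'"([^"]+)"\s*:\s*"([^"]*)"', value)
    return {
        key: [role.strip() for role in roles.split(",") if role.strip()]
        for key, roles in pairs
    }


def map_roles(claim_roles, roles_map, default_role="guest"):
    """Map identity-provider roles to platform roles (hypothetical helper)."""
    mapped = []
    for claim_role in claim_roles:
        for role in roles_map.get(claim_role, []):
            if role not in mapped:
                mapped.append(role)
    # Assumption: the default role applies when no mapping matched.
    return mapped or [default_role]


roles_map = parse_scope_roles_map(SCOPE_ROLES_MAP)
print(map_roles(["writer"], roles_map))
```

With this sketch, a user whose token carries the app role "writer" ends up with the roles ontology-edit, vocabulary-admin and system-base, which matches the business unit scenario described earlier.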

Authentication with Databases and External Services

Most systems in today's enterprise environments are not running in isolation but are interconnected with other applications. This includes storing data in a database as well as integrating external systems using web services, e.g., using REST-based service calls.

Users working with metaphactory are authenticated using SSO as described above. But metaphactory also connects to other systems like an RDF database or REST services.

Service Accounts and User Identity Push-Down

In many cases, authentication for external systems is handled using a Service Account. A service account does not represent an individual user but rather a specific application instance.

When using a service account for database access, all users authenticate with metaphactory using their individual user account, but database access is performed using the same technical account, hence providing the same level of data access for all users.

Similarly, calling web services typically requires authentication using a service account with the right set of permissions for the intended functionality.

For some use cases, it is required to push down the user identity to the database or external systems, e.g., for auditing purposes or to grant individuals access permissions at the database level. metaphactory supports forwarding the user identity to some databases like GraphDB and Stardog which implement authentication using OIDC or JWT.

This approach has two big advantages:

  • From the Data Governance point-of-view, data access can be enforced directly in the database on a per-user level. The data owner does not need to trust metaphactory to properly enforce data access.
  • No service account is required. Creating a service account entails complying with operational and security requirements such as proper secret management and frequent password rotation. Removing the need for service accounts greatly simplifies operations and DevOps processes.

Forwarding User Identity in metaphactory

metaphactory supports forwarding User Identity to external systems when using OIDC or JWT for Single Sign-On for user authentication.

When enabled, the user token (JWT) passed to metaphactory is stored in memory in a per-user session store. This is integrated with our Externalized Secrets mechanism and can then be used to provide the access or ID token when authenticating with the database or external services like REST endpoints.

Token Refresh

One of the key properties of OIDC and JWT is that the authentication tokens have a limited lifetime, i.e., they expire after a certain period. A typical expiration time is between 5 and 15 minutes, which limits the window in which an intercepted token can be reused.

When logging into metaphactory using OIDC or JWT, the token provided during SSO is validated only once, at login time.

When storing and passing this token to other systems, lifetime becomes relevant: when the token expires after a short period of time, it will not be accepted by the external system anymore.

When using OIDC, there are actually multiple tokens:

  • An ID token in JWT format providing user identity and information such as email, groups, etc.
  • An access token which is used for authentication checks
  • A refresh token which can be used to obtain a new ID token and access token once the current ones have expired

Before using an ID or access token for authentication in external systems, metaphactory checks if the token is still valid and automatically performs a refresh when it has expired.
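
The check-then-refresh step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not metaphactory's code: the token bundle is modeled as a plain dictionary, refresh is a hypothetical callable that would exchange the refresh token at the identity provider's token endpoint, and the 30 second leeway is an arbitrary choice. The claims are decoded without signature verification, which is only acceptable here because we merely read the exp claim of a token that was already validated at login.

```python
import base64
import json
import time


def decode_jwt_claims(token):
    """Return the claims of a JWT WITHOUT verifying its signature."""
    payload_b64 = token.split(".")[1]
    # base64url segments in a JWT omit padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def is_expired(token, now=None, leeway_seconds=30):
    """True if the token's exp claim is in the past (or within the leeway)."""
    now = time.time() if now is None else now
    return decode_jwt_claims(token)["exp"] <= now + leeway_seconds


def ensure_fresh(tokens, refresh, now=None):
    """Return a token bundle whose ID token is still valid.

    tokens:  dict with 'id_token' and 'refresh_token' (hypothetical layout)
    refresh: callable taking a refresh token and returning a new bundle
    """
    if is_expired(tokens["id_token"], now):
        tokens = refresh(tokens["refresh_token"])
    return tokens


def make_unsigned_token(claims):
    """Build an unsigned demo token; for illustration and testing only."""
    def b64(obj):
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()
    return f"{b64({'alg': 'none'})}.{b64(claims)}."
```

In a real deployment the refresh callable would send a request with grant_type=refresh_token to the token endpoint advertised in the discovery document.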

Client Identity and Token Validation

When forwarding the user's ID token, the target application validates the signature of the token as well as the issuer (iss field within the JWT) and target audience (aud field within the JWT).

The target audience of a token represents the client application as registered with the ID provider. The token that is forwarded to the external system was originally meant for the metaphactory application. This means that the audience field in the JWT contains the client ID registered for this metaphactory instance. When another application – in our example the database – receives this token, it may need to be configured to use the same client ID in order to accept the forwarded token.
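
The issuer and audience check performed by the receiving application can be illustrated with a few lines of Python. The sketch operates on already decoded claims and deliberately leaves out signature verification, which a real validator must perform first. The function name, issuer URL, and client IDs are hypothetical.

```python
def check_issuer_and_audience(claims, expected_issuer, expected_client_id):
    """Check the iss and aud claims of an already decoded token.

    Signature verification is out of scope for this sketch and has to
    happen before the claims are trusted.
    """
    if claims.get("iss") != expected_issuer:
        return False
    audience = claims.get("aud")
    # Per the JWT specification, "aud" is either a single string or a list.
    audiences = [audience] if isinstance(audience, str) else list(audience or [])
    return expected_client_id in audiences


# Hypothetical claims of a token that was issued for metaphactory.
claims = {
    "iss": "https://idp.example.com/tenant",
    "aud": "metaphactory-client-id",
}

# A database configured with the same client ID accepts the token ...
print(check_issuer_and_audience(
    claims, "https://idp.example.com/tenant", "metaphactory-client-id"))
# ... while one registered under its own client ID rejects it.
print(check_issuer_and_audience(
    claims, "https://idp.example.com/tenant", "database-client-id"))
```

This is why the database may need to share the client ID with metaphactory: otherwise the audience of the forwarded token does not match what the database expects.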

Example of database access using OIDC

Some databases like GraphDB and Stardog support token-based authentication. metaphactory supports forwarding user identity for data access with these databases. The following example shows authentication with GraphDB.

Repository Configuration in metaphactory

The repository configuration contains all details on how to connect to the database from metaphactory. Here, we use token-based authentication when connecting to GraphDB and specifically the per-user authentication token as captured during login (see below).

@prefix http: <http://www.openrdf.org/config/repository/http#> .
@prefix rep: <http://www.openrdf.org/config/repository#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix mph: <http://www.metaphacts.com/ontologies/platform/repository#> .

[] a rep:Repository;
  rep:repositoryID "default";
  rdfs:label "GraphDB repository";
  rep:repositoryImpl [
      rep:repositoryType "metaphactory:GraphDBRepository";
      http:repositoryURL <http://graphdb.example.com:7200/repositories/mydb>;
      mph:authenticationToken "${user.token.idToken}"
    ] .

Single Sign-On Configuration in metaphactory

The file config/shiro-sso-oidc-params.ini contains the OIDC configuration settings for metaphactory. See section "Example of using OIDC with metaphactory" above for an example of the configuration file.

The setting relevant for forwarding the user identity to the database is tokenAttributeName.value. It defines the name under which the user token can later be referenced in the repository configuration through the Externalized Secrets mechanism (see above).
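
Conceptually, the interplay between tokenAttributeName and the ${user.token.idToken} placeholder in the repository configuration can be pictured as a lookup in the per-user session store. The Python sketch below is a simplified model of that idea and not the Externalized Secrets implementation: the session store is a plain dictionary, and the function name and token value are hypothetical.

```python
import re

# Matches placeholders of the form ${some.name}.
PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def resolve_placeholders(text, session_values):
    """Replace ${name} placeholders with values from a per-user session."""
    def lookup(match):
        key = match.group(1)
        if key not in session_values:
            raise KeyError(f"no session value for placeholder {key!r}")
        return session_values[key]
    return PLACEHOLDER.sub(lookup, text)


# Name configured via tokenAttributeName.value in the OIDC settings.
token_attribute_name = "user.token"

# Hypothetical session content captured at login for one user.
session = {f"{token_attribute_name}.idToken": "eyJhbGciOi.example.token"}

config_line = 'mph:authenticationToken "${user.token.idToken}"'
print(resolve_placeholders(config_line, session))
```

Because the lookup happens per session, each user's requests to the database carry that user's own token rather than a shared credential.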

GraphDB configuration

The following example shows the graphdb.properties configuration file for GraphDB containing all relevant settings to enable OIDC authentication and authorization in the database. As above, this example uses Azure Active Directory as identity provider. See the GraphDB OIDC documentation for further details.

When using our docker-compose setup, this file can be injected into the GraphDB container using a volume mount to /opt/graphdb/home/conf/graphdb.properties.

Please note that when setting up GraphDB from scratch, security needs to be explicitly switched on in the GraphDB Workbench!

With this setup, authentication is handled by Azure AD. Authorization, i.e., granting access permissions for data access or administrative purposes, is handled by GraphDB directly based on the provided role information. These roles need to be contained in the ID token issued by the identity provider. For Azure AD, the registered application can be configured with custom user groups which are used as roles. Please refer to the documentation of your identity provider for details on role and group mapping as well as to the GraphDB documentation. The example below provides access permissions via a default role which applies to all authenticated users.

###### OPENID AUTHENTICATION + OAUTH AUTHORIZATION ######
# see https://graphdb.ontotext.com/documentation/10.0/access-control.html#openid-oauth

graphdb.auth.methods = openid
graphdb.auth.openid.issuer = https://login.microsoftonline.com/<TENANTID>/
graphdb.auth.openid.client_id = CLIENT_ID
graphdb.auth.openid.username_claim = preferred_username
graphdb.auth.openid.well_known_config_url = https://login.microsoftonline.com/<TENANTID>/v2.0/.well-known/openid-configuration
graphdb.auth.openid.auth_flow = code
graphdb.auth.openid.token_type = id

###### OAUTH AUTHORIZATION
graphdb.auth.database = oauth
# OAuth roles claim. The field from the JWT token that will provide the GraphDB roles.
graphdb.auth.oauth.roles_claim = groups
# OAuth default roles to assign. It may be convenient to always assign certain roles without listing them in the roles claim.
graphdb.auth.oauth.default_roles = ROLE_USER, READ_REPO_*, WRITE_REPO_*
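
The effect of the roles claim and the default roles can be pictured with a short Python sketch. It assumes that the roles granted to a user are the union of the configured default roles and the values found in the roles claim of the token; this mirrors the comments in the listing above but is not taken from GraphDB's source code. The function name and the example group value are hypothetical.

```python
# Values taken from the graphdb.properties listing above.
ROLES_CLAIM = "groups"
DEFAULT_ROLES = ("ROLE_USER", "READ_REPO_*", "WRITE_REPO_*")


def effective_roles(claims, roles_claim=ROLES_CLAIM, default_roles=DEFAULT_ROLES):
    """Combine default roles with the roles listed in the token claim."""
    from_token = claims.get(roles_claim, [])
    if isinstance(from_token, str):
        from_token = [from_token]
    roles = list(default_roles)
    for role in from_token:
        if role not in roles:
            roles.append(role)
    return roles


# A token without a groups claim still receives the default roles.
print(effective_roles({"preferred_username": "jane@example.com"}))
# A hypothetical group value from the token is added on top.
print(effective_roles({"groups": ["ROLE_ADMIN"]}))
```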

Please note: As described above, in this example, metaphactory and GraphDB share the client identity, so the token used for logging into metaphactory can simply be passed on to GraphDB. The application registration in the ID provider typically restricts the possible callback URLs after successful authentication, so both the callback URLs for metaphactory as well as for the GraphDB Workbench need to be configured as part of the application registration. Only then will a user be able to log into both metaphactory and GraphDB Workbench using a web browser.

Summary & Outlook

Security in software deployments has many facets - as depicted in the image below. In a previous blog post, we tackled the middle layer and discussed network and environment topics to ensure a secure metaphactory in enterprise environments. This blog post covered topics relevant for the top layer and provided an overview of authentication and authorization using single sign-on, role-based access control, and assignment of roles based on groups.

Security consists of different aspects layered on top of each other

In a next blog post, we plan to address topics relevant to the bottom layer and discuss best practices for secure software development and deployment processes, incl. vulnerability scans, updates and patches. So make sure to subscribe to our newsletter or use the RSS feed to learn more about these topics in the future.

As a Principal Software Architect at metaphacts, Wolfgang works with the software engineering team to translate customer needs into sustainable features and implement these in a holistic architecture. As an enthusiastic software developer, he is also involved with the Mannheim Java User Group (Majug) and the JugendHackt Lab Mannheim.

As a Principal Software Engineer at metaphacts and a specialist in semantic technologies, Linked Data, SPARQL and federated query processing, Andreas leads our software engineering team in developing, documenting, and testing metaphactory to ensure that the platform meets our customers' needs and helps them achieve their business goals.