1 | 1 | = Security |
2 | 2 | :description: Secure Apache Hive with Kerberos authentication in Kubernetes. Configure Kerberos server, SecretClass, and access Hive securely with provided guides. |
| 3 | +:opa-rego-docs: https://www.openpolicyagent.org/docs/latest/#rego |
3 | 4 | |
4 | 5 | == Authentication |
5 | 6 | Currently, the only supported authentication mechanism is Kerberos, which is disabled by default. |
@@ -45,3 +46,118 @@ The `kerberos.secretClass` is used to give Hive the possibility to request keyta |
45 | 46 | === 5. Access Hive |
46 | 47 | To access Hive, it is recommended to start a client Pod that connects to Hive rather than shelling into the master. |
47 | 48 | We have an https://github.com/stackabletech/hive-operator/blob/main/tests/templates/kuttl/kerberos/70-install-access-hive.yaml.j2[integration test] for this exact purpose, where you can see how to connect and get a valid keytab. |
| 49 | + |
| 50 | + |
| 51 | +== Authorization |
| 52 | +The Stackable Operator for Apache Hive supports the following authorization methods. |
| 53 | + |
| 54 | +=== Open Policy Agent (OPA) |
| 55 | +The Apache Hive metastore can be configured to delegate authorization decisions to an Open Policy Agent (OPA) instance. |
| 56 | +More information on the setup and configuration of OPA can be found in the xref:opa:index.adoc[OPA Operator documentation]. |
| 57 | +To enable OPA authorization for a Hive cluster, add this section to its configuration: |
| 58 | + |
| 59 | +[source,yaml] |
| 60 | +---- |
| 61 | +spec: |
| 62 | + clusterConfig: |
| 63 | + authorization: |
| 64 | + opa: |
| 65 | + configMapName: opa # <1> |
| 66 | + package: hms # <2> |
| 67 | +---- |
| 68 | +<1> The name of your OPA Stacklet (`opa` in this case) |
| 69 | +<2> The rego rule package to use for policy decisions. |
| 70 | +This is optional and defaults to the name of the Hive Stacklet. |
| 71 | + |
| 72 | +==== Defining rego rules |
| 73 | +For a general explanation of how rules are written, please refer to the {opa-rego-docs}[OPA documentation]. |
| 74 | +Authorization with OPA is done using the https://github.com/boschglobal/hive-metastore-opa-authorizer[hive-metastore-opa-authorizer] plugin. |
| 75 | + |
| 76 | +===== OPA Inputs |
| 77 | +The payload that Hive sends to OPA with each request, which is accessible within the Rego rules as `input`, has the following structure: |
| 78 | + |
| 79 | +[source,json] |
| 80 | +---- |
| 81 | +{ |
| 82 | + "identity": { |
| 83 | + "username": "<user>", |
| 84 | + "groups": ["<group1>", "<group2>"] |
| 85 | + }, |
| 86 | + "resources": { |
| 87 | + "database": null, |
| 88 | + "table": null, |
| 89 | + "partition": null, |
| 90 | + "columns": ["col1", "col2"] |
| 91 | + }, |
| 92 | + "privileges": { |
| 93 | + "readRequiredPriv": [], |
| 94 | + "writeRequiredPriv": [], |
| 95 | + "inputs": null, |
| 96 | + "outputs": null |
| 97 | + } |
| 98 | +} |
| 99 | +---- |
| 100 | +* `identity`: Contains user information. |
| 101 | +** `username`: The name of the user. |
| 102 | +** `groups`: A list of groups the user belongs to. |
| 103 | +* `resources`: Specifies the resources involved in the request. |
| 104 | +** `database`: The database object. |
| 105 | +** `table`: The table object. |
| 106 | +** `partition`: The partition object. |
| 107 | +** `columns`: A list of column names involved in the request. |
| 108 | +* `privileges`: Details the privileges required for the request. |
| 109 | +** `readRequiredPriv`: A list of required read privileges. |
| 110 | +** `writeRequiredPriv`: A list of required write privileges. |
| 111 | +** `inputs`: Input tables for the request. |
| 112 | +** `outputs`: Output tables for the request. |
| 113 | + |
| 114 | +===== Example OPA Rego Rule |
| 115 | +Below is a basic Rego rule that demonstrates how to handle the input dictionary sent from the Hive authorizer to OPA: |
| 116 | + |
| 117 | +[source,rego] |
| 118 | +---- |
| 119 | +package hms |
| 120 | + |
| 121 | +default database_allow = false |
| 122 | +default table_allow = false |
| 123 | +default column_allow = false |
| 124 | +default partition_allow = false |
| 125 | +default user_allow = false |
| 126 | + |
| 127 | +database_allow if { |
| 128 | + input.identity.username == "stackable" |
| 129 | + input.resources.database.name == "test_db" |
| 130 | +} |
| 131 | +
|
| 132 | +table_allow if { |
| 133 | + input.identity.username == "stackable" |
| 134 | + input.resources.table.dbName == "test_db" |
| 135 | + input.resources.table.tableName == "test_table" |
| 136 | + input.privileges.readRequiredPriv[0].priv == "SELECT" |
| 137 | +} |
| 138 | +
|
| 139 | +table_allow if { |
| 140 | + input.identity.username == "stackable" |
| 141 | + input.resources.table.dbName == "test_db" |
| 142 | + input.privileges.writeRequiredPriv[0].priv == "CREATE" |
| 143 | +} |
| 144 | +---- |
| 145 | +* `database_allow` grants access if the user is `stackable` and the database is `test_db`. |
| 146 | +* `table_allow` grants access if the user is `stackable`, the database is `test_db` and: |
| 147 | +** the table is `test_table` and the required read privilege is `SELECT`. |
| 148 | +** the required write privilege is `CREATE` without any table restriction. |
| 149 | + |
| 150 | +==== Configuring policy URLs |
| 151 | + |
| 152 | +The policy URLs for `database_allow`, `table_allow`, `column_allow`, `partition_allow`, and `user_allow` can be changed with xref:usage-guide/overrides.adoc#_configuration_properties[configuration overrides] by setting the following properties in `hive-site.xml`: |
| 153 | + |
| 154 | +* `com.bosch.bdps.opa.authorization.policy.url.database` |
| 155 | +* `com.bosch.bdps.opa.authorization.policy.url.table` |
| 156 | +* `com.bosch.bdps.opa.authorization.policy.url.column` |
| 157 | +* `com.bosch.bdps.opa.authorization.policy.url.partition` |
| 158 | +* `com.bosch.bdps.opa.authorization.policy.url.user` |
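
The following is a minimal sketch of such an override, not a definitive configuration.
It assumes the override is placed at role level under `spec.metastore.configOverrides`.
The host name `opa.example.svc.cluster.local`, the port `8081`, and the package `my_package` are placeholders chosen for illustration; the path follows the OPA Data API convention `/v1/data/<package>/<rule>`.
Check the URL format against the plugin documentation and your OPA deployment before use.

[source,yaml]
----
spec:
  metastore:
    configOverrides:
      hive-site.xml:
        # Placeholder URL (assumption): point the table policy at a rule in a different package.
        com.bosch.bdps.opa.authorization.policy.url.table: "http://opa.example.svc.cluster.local:8081/v1/data/my_package/table_allow"
----

Properties that are not overridden keep the values generated by the operator.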
| 159 | + |
| 160 | +==== TLS secured OPA cluster |
| 161 | + |
| 162 | +Stackable OPA clusters secured via TLS are supported, and no further configuration is required. |
| 163 | +The Stackable Hive operator automatically adds the certificate from the SecretClass used to secure the OPA cluster to its truststore. |