Merged
71 commits
6eb2c5f
feat: per-index tier access middleware
ZakirG Nov 30, 2020
7601534
-
ZakirG Dec 1, 2020
052ebbf
Merge branch 'master' of https://github.com/uc-cdis/guppy into feat/p…
ZakirG Dec 2, 2020
e98689c
reorganize code
ZakirG Dec 3, 2020
1d802e0
fix unit tests
ZakirG Dec 3, 2020
d862064
fix unit tests
ZakirG Dec 3, 2020
06f2251
fix unit tests
ZakirG Dec 3, 2020
77373b1
draft of new middleware structure
ZakirG Dec 3, 2020
c2c6a59
adjust manifest validation
ZakirG Dec 3, 2020
3c4c8c9
breaking type-wide middlewares into field-scoped middlewares
ZakirG Dec 7, 2020
a8985db
eslint
ZakirG Dec 7, 2020
12ea6b2
add logs
ZakirG Dec 7, 2020
28a0ebe
switched to index-scope querySchema
ZakirG Jan 19, 2021
944435f
Merge branch 'master' of https://github.com/uc-cdis/guppy into feat/p…
ZakirG Jan 19, 2021
dfe7856
eslint
ZakirG Jan 20, 2021
e0d6b73
eslint
ZakirG Jan 20, 2021
6a342d6
undo schema changes
ZakirG Jan 21, 2021
6a0c5bb
fix places where i confused ES type with GQL type
ZakirG Jan 21, 2021
b6a8b92
enhance auth middleware assertion
ZakirG Jan 21, 2021
14bcb67
adjusting new resolver logics
ZakirG Jan 21, 2021
0609a75
adjusting new resolver logics
ZakirG Jan 21, 2021
461d1e5
eslint
ZakirG Jan 21, 2021
af157a3
fixing bug in manifest var logivs
ZakirG Jan 22, 2021
83b897f
clarify manifest logic
ZakirG Jan 22, 2021
7a9e2ee
Merge branch 'master' of https://github.com/uc-cdis/guppy into feat/p…
ZakirG Jan 22, 2021
137b673
clarify manifest logic
ZakirG Jan 22, 2021
3b85d4e
histogram schema
ZakirG Jan 25, 2021
5154bcc
histogram schema
ZakirG Jan 25, 2021
eed451e
typo fixes
ZakirG Jan 25, 2021
5c8527e
add to schema
ZakirG Jan 25, 2021
8e8091c
add to schema
ZakirG Jan 25, 2021
a4985b5
fix client-side error
ZakirG Jan 26, 2021
677fe99
fix client-side error
ZakirG Jan 26, 2021
30dc094
Merge branch 'master' of https://github.com/uc-cdis/guppy into feat/p…
ZakirG Jan 29, 2021
0768a54
fix logic problem
ZakirG Jan 29, 2021
ff9d749
adjust comment
ZakirG Jan 29, 2021
73dba72
eslint
ZakirG Jan 29, 2021
d8fc09a
feat: add doc
ZakirG Feb 1, 2021
09c4164
feat: add doc
ZakirG Feb 1, 2021
9700a00
feat: add doc
ZakirG Feb 1, 2021
c081c6a
PR feedback: config validation
ZakirG Feb 1, 2021
6ade7bb
PR feedback: add RegularAccessHistograms only on-demand
ZakirG Feb 1, 2021
859155f
PR feedback: remove redundant check from resolver
ZakirG Feb 1, 2021
0a6b870
PR feedback: remove extra console log
ZakirG Feb 1, 2021
d967b56
PR feedback: histogram RegularAccess ternary operator
ZakirG Feb 1, 2021
e167e36
PR feedback: add unit test
ZakirG Feb 1, 2021
54048c8
PR feedback: add unit test
ZakirG Feb 1, 2021
16c9307
fix logic error
ZakirG Feb 1, 2021
397048f
fix logic error
ZakirG Feb 1, 2021
55b19c8
PR feedback: addd more unit tests
ZakirG Feb 1, 2021
05f1d33
fix travis
ZakirG Feb 1, 2021
14189ca
fix travis
ZakirG Feb 1, 2021
fd4ad50
feat: enhance unit test for tierAccessLimit
ZakirG Feb 2, 2021
3ad6a7a
feat: enhance unit test for tierAccessLimit
ZakirG Feb 2, 2021
2b12abc
PR feedback: fix indentation
ZakirG Feb 2, 2021
78be005
PR feedback: remove unnecessary else-if
ZakirG Feb 2, 2021
f6460fe
PR feedbacks
ZakirG Feb 2, 2021
8314686
PR feedback: JSON file newlines
ZakirG Feb 2, 2021
7eff1fc
PR feedback: update README
ZakirG Feb 2, 2021
2a31f42
adjust whitespace
ZakirG Feb 2, 2021
3e2da94
PR feedback: update README
ZakirG Feb 2, 2021
606b31d
fix logic for download endpoint
ZakirG Feb 2, 2021
9eefb46
fix logic for download endpoint
ZakirG Feb 3, 2021
44b5ccf
debugging download enddpoint
ZakirG Feb 3, 2021
b3f916b
debugging download enddpoint
ZakirG Feb 3, 2021
808a7c4
debugging download enddpoint
ZakirG Feb 3, 2021
125998d
remove extra prints
ZakirG Feb 3, 2021
600165b
fix legacy comments
ZakirG Feb 3, 2021
6e229ca
cleaning up
ZakirG Feb 3, 2021
9e25c8b
adjust README
ZakirG Feb 3, 2021
672b33c
PR feedback: change variable name
ZakirG Feb 3, 2021
20 changes: 13 additions & 7 deletions README.md
@@ -20,11 +20,13 @@ You could put following as your config files:
"indices": [
{
"index": "${ES_INDEX_1}",
"type": "${ES_DOC_TYPE_1}"
"type": "${ES_DOC_TYPE_1}",
"tier_access_level": "${ES_TIER_ACCESS_LEVEL_1}" // optional, set this if there is no global tierAccessLevel
},
{
"index": "${ES_INDEX_2}",
"type": "${ES_DOC_TYPE_2}"
"type": "${ES_DOC_TYPE_2}",
"tier_access_level": "${ES_TIER_ACCESS_LEVEL_2}" // optional, set this if there is no global tierAccessLevel
},
...
],
@@ -35,6 +37,8 @@ You could put following as your config files:
}
```

Note: Guppy expects that either all indices in the guppy config block will have a tier_access_level set OR that a site-wide TIER_ACCESS_LEVEL is set as an environment variable (or in the global block of a commons' manifest). Guppy will throw an error if the config settings do not meet one of these two expectations. See `doc/index_scoped_tiered_access.md` for more information.

The following script will start the server on port 3000, using the config file `example_config.json`:

```
@@ -58,17 +62,17 @@ behavior for local test without Arborist, just set `INTERNAL_LOCAL_TEST=true`. P
look into `/src/server/auth/utils.js` for more details.

### Tiered Access:
Guppy also support 3 different levels of tier access, by setting `TIER_ACCESS_LEVEL`:
The tiered-access setting is configured through either the `TIER_ACCESS_LEVEL` environment variable or the `tier_access_level` properties on individual indices in the esConfig. Guppy supports 3 different levels of tiered access:
- `private` by default: only allows access to authorized resources
- `regular`: allows all kinds of aggregation (with limitations for unauthorized resources), but forbids access to raw data without authorization
- `libre`: access to all data

For `regular` level, there's another configuration environment variable `TIER_ACCESS_LIMIT`, which is the minimum visible count for aggregation results.
For the `regular` level, there's another configuration environment variable `TIER_ACCESS_LIMIT`, which is the minimum visible count for aggregation results.

`regular` level commons could also take in a whitelist of values that won't be encrypted. It is set by `config.encrypt_whitelist`.
`regular` level commons can also take in a whitelist of values that won't be encrypted. It is set by `config.encrypt_whitelist`.
By default the whitelist contains missing values: ['\_\_missing\_\_', 'unknown', 'not reported', 'no data'].
The whitelist is disabled by default for security reasons. To enable it, set `enable_encrypt_whitelist: true` in your config.
For example `regular` leveled commons with config looks like this will skip encrypting value `do-not-encrypt-me` even if its count is less than `TIER_ACCESS_LIMIT`:
For example, a `regular` leveled commons with config that looks like this will skip encrypting the value `do-not-encrypt-me` even if its count is less than `TIER_ACCESS_LIMIT`:

```
{
@@ -89,14 +93,16 @@ For example `regular` leveled commons with config looks like this will skip encr
}
```
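
The whitelist rule described above can be sketched roughly as follows. This is only an illustration, not Guppy's actual implementation: `shouldEncryptValue` and its option names are hypothetical, and the sketch assumes that a count at or above `TIER_ACCESS_LIMIT` is treated as visible.

```javascript
// Illustrative sketch only: `shouldEncryptValue` and its options are
// hypothetical names, not part of Guppy's API.
const DEFAULT_WHITELIST = ['__missing__', 'unknown', 'not reported', 'no data'];

function shouldEncryptValue(value, count, options) {
  const {
    tierAccessLimit,
    enableWhitelist = false, // the whitelist is disabled by default
    whitelist = DEFAULT_WHITELIST,
  } = options;
  // Assumption: counts at or above the limit are safe to show as-is.
  if (count >= tierAccessLimit) return false;
  // Whitelisted values are skipped only when the whitelist is enabled.
  if (enableWhitelist && whitelist.includes(value)) return false;
  return true;
}
```

With `enableWhitelist: true` and a whitelist containing `do-not-encrypt-me`, the function returns `false` for that value even when its count is below the limit; with the whitelist disabled, the same value would be encrypted.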

For example following script will start a Guppy server with `regular` tier access level, and minimum visible count set to 100:
The following script will start a Guppy server with a site-wide `regular` tier access level, and minimum visible count set to 100:

```
export TIER_ACCESS_LEVEL=regular
export TIER_ACCESS_LIMIT=100
npm start
```

To learn how to configure Guppy's tiered-access system with per-index scoping, and which use cases might warrant such a configuration, see `doc/index_scoped_tiered_access.md`.

> #### Tier Access Sensitive Record Exclusion
> It is possible to configure Guppy to hide some records from being returned in `_aggregation` queries when Tiered Access is enabled (tierAccessLevel: "regular").
> The purpose of this is to "hide" information about certain sensitive resources, essentially making this an escape hatch from Tiered Access.
41 changes: 41 additions & 0 deletions doc/index_scoped_tiered_access.md
@@ -0,0 +1,41 @@
# Index-scoped Tiered-Access

Most commons use a site-wide tiered access configuration that applies across indices. However, some use cases require index-scoped permissioning. One example is the case of an open-access study viewer where studies have a mix of public properties and controlled-access properties. Another example is a Data Explorer that presents data types with different permission requirements meant to serve a variety of audiences. For these use cases, tiered-access settings can be specified at the index-level rather than the site-wide level.

Guppy expects that either all indices in the guppy config block will have a tiered-access level set OR that a site-wide tiered-access level is set in the global block of the manifest. Guppy will throw an error if the config settings do not meet one of these two expectations.
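
As a rough illustration of this rule, the check can be written as a small standalone function. The sketch below mirrors the logic added to `src/server/config.js` in this change, but the function name `resolveSiteWideLevel` and its signature are hypothetical and chosen only for this example.

```javascript
const ALLOWED_LEVELS = ['private', 'regular', 'libre'];

// Hypothetical helper: validates the indices and returns the site-wide level
// that stays in effect, or undefined when every index has its own setting.
function resolveSiteWideLevel(indices, siteWideLevel) {
  let allIndicesHaveLevel = true;
  indices.forEach((item) => {
    // Neither an index-scoped level nor a site-wide fallback: reject.
    if (!item.tier_access_level && !siteWideLevel) {
      throw new Error('Either set all index-scoped tiered-access levels or a site-wide tiered-access level.');
    }
    // An index-scoped level must be one of the three allowed values.
    if (item.tier_access_level && !ALLOWED_LEVELS.includes(item.tier_access_level)) {
      throw new Error(`tier_access_level invalid for index ${item.type}.`);
    }
    if (!item.tier_access_level) {
      allIndicesHaveLevel = false;
    }
  });
  // When all indices are configured, the site-wide level is dropped.
  return allIndicesHaveLevel ? undefined : siteWideLevel;
}
```

Returning `undefined` when all indices carry a level corresponds to the `delete config.tierAccessLevel` step in the real config code.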

You can set index-scoped tiered-access levels using the `tier_access_level` property of each index in the guppy block of a commons' `manifest.json`. Note that the `tier_access_limit` setting is still site-wide and is configured in the manifest's `global` block.
```
...
"guppy": {
"indices": [
{
"index": "subject_regular",
"type": "subject",
"tier_access_level": "regular"
},
{
"index": "subject_private",
"type": "subject_private",
"tier_access_level": "private"
},
{
"index": "file_private",
"type": "file",
"tier_access_level": "private"
},
{
"index": "studies_open",
"type": "studies_open",
"tier_access_level": "libre"
},
{
"index": "studies_controlled_access",
"type": "studies_controlled_access",
"tier_access_level": "private"
}
],
"auth_filter_field": "auth_resource_path",
...
},
```
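
Given a config like the one above, the level that applies to a request can be looked up by type. The following is a minimal sketch under stated assumptions: `getTierAccessLevelForType` is a hypothetical helper, and the precedence (a site-wide level, if still present, wins over the index-scoped one) follows the lookup used in `src/server/download.js` in this change.

```javascript
// A trimmed copy of the example indices, used only for illustration.
const exampleIndices = [
  { index: 'subject_regular', type: 'subject', tier_access_level: 'regular' },
  { index: 'file_private', type: 'file', tier_access_level: 'private' },
  { index: 'studies_open', type: 'studies_open', tier_access_level: 'libre' },
];

// Hypothetical helper: find the index config for a type, then pick the
// site-wide level when one is set, otherwise the index-scoped level.
function getTierAccessLevelForType(esType, indices, siteWideLevel) {
  const indexConfig = indices.find((i) => i.type === esType);
  if (!indexConfig) {
    throw new Error(`Invalid es type: "${esType}"`);
  }
  return siteWideLevel || indexConfig.tier_access_level;
}
```

For example, `getTierAccessLevelForType('studies_open', exampleIndices)` yields `'libre'`, while passing a site-wide level of `'private'` as the third argument yields `'private'` for every type.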
20 changes: 20 additions & 0 deletions src/server/__tests__/config.test.js
@@ -37,6 +37,26 @@ describe('config', () => {
expect(() => (require('../config'))).toThrow(new Error(`Invalid TIER_ACCESS_LEVEL "${process.env.TIER_ACCESS_LEVEL}"`));
});

test('should show error if invalid tier access level in guppy block', async () => {
process.env.TIER_ACCESS_LEVEL = null;
const fileName = './testConfigFiles/test-invalid-index-scoped-tier-access.json';
process.env.GUPPY_CONFIG_FILEPATH = `${__dirname}/${fileName}`;
const invalidItemType = 'subject_private';
expect(() => (require('../config'))).toThrow(new Error(`tier_access_level invalid for index ${invalidItemType}.`));
});

test('clears out site-wide default tiered-access setting if index-scoped levels set', async () => {
process.env.TIER_ACCESS_LEVEL = null;
process.env.TIER_ACCESS_LIMIT = 50;
const fileName = './testConfigFiles/test-index-scoped-tier-access.json';
process.env.GUPPY_CONFIG_FILEPATH = `${__dirname}/${fileName}`;
const config = require('../config').default;
const { indices } = require(fileName);
expect(config.tierAccessLevel).toBeUndefined();
expect(config.tierAccessLimit).toEqual(50);
expect(JSON.stringify(config.esConfig.indices)).toEqual(JSON.stringify(indices));
});

/* --------------- For whitelist --------------- */
test('could disable whitelist', async () => {
const config = require('../config').default;
33 changes: 33 additions & 0 deletions src/server/__tests__/schema.test.js
@@ -6,6 +6,7 @@ import {
getAggregationSchema,
getAggregationSchemaForEachType,
getMappingSchema,
getHistogramSchemas,
} from '../schema';
import esInstance from '../es/index';
import config from '../config';
@@ -143,4 +144,36 @@ describe('Schema', () => {
expect(removeSpacesNewlinesAndDes(mappingSchema))
.toEqual(removeSpacesAndNewlines(expectedMappingSchema));
});

const expectedHistogramSchemas = `
type HistogramForString {
histogram: [BucketsForNestedStringAgg]
}
type RegularAccessHistogramForString {
histogram: [BucketsForNestedStringAgg]
}
type HistogramForNumber {
histogram(
rangeStart: Int,
rangeEnd: Int,
rangeStep: Int,
binCount: Int,
): [BucketsForNestedNumberAgg],
asTextHistogram: [BucketsForNestedStringAgg]
}
type RegularAccessHistogramForNumber {
histogram(
rangeStart: Int,
rangeEnd: Int,
rangeStep: Int,
binCount: Int,
): [BucketsForNestedNumberAgg],
asTextHistogram: [BucketsForNestedStringAgg]
}`;
test('could create histogram schemas for each type', async () => {
await esInstance.initialize();
const histogramSchemas = getHistogramSchemas();
expect(removeSpacesNewlinesAndDes(histogramSchemas))
.toEqual(removeSpacesAndNewlines(expectedHistogramSchemas));
});
});
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
{
"indices": [
{
"index": "subject_regular",
"type": "subject",
"tier_access_level": "regular"
},
{
"index": "subject_private",
"type": "subject_private",
"tier_access_level": "private"
},
{
"index": "file_private",
"type": "file",
"tier_access_level": "private"
},
{
"index": "studies_open",
"type": "studies_open",
"tier_access_level": "libre"
},
{
"index": "studies_controlled_access",
"type": "studies_controlled_access",
"tier_access_level": "private"
}
]
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
{
"indices": [
{
"index": "subject_regular",
"type": "subject",
"tier_access_level": "regular"
},
{
"index": "subject_private",
"type": "subject_private",
"tier_access_level": "private____typo"
}
]
}
36 changes: 28 additions & 8 deletions src/server/config.js
@@ -26,7 +26,6 @@ const config = {
aggregationIncludeMissingData: typeof inputConfig.aggs_include_missing_data === 'undefined' ? true : inputConfig.aggs_include_missing_data,
missingDataAlias: inputConfig.missing_data_alias || 'no data',
},

port: 80,
path: '/graphql',
arboristEndpoint: 'http://arborist-service',
@@ -56,6 +55,15 @@ if (process.env.GUPPY_PORT) {
config.port = process.env.GUPPY_PORT;
}

const allowedTierAccessLevels = ['private', 'regular', 'libre'];

if (process.env.TIER_ACCESS_LEVEL) {
if (!allowedTierAccessLevels.includes(process.env.TIER_ACCESS_LEVEL)) {
throw new Error(`Invalid TIER_ACCESS_LEVEL "${process.env.TIER_ACCESS_LEVEL}"`);
}
config.tierAccessLevel = process.env.TIER_ACCESS_LEVEL;
}

if (process.env.TIER_ACCESS_LIMIT) {
config.tierAccessLimit = process.env.TIER_ACCESS_LIMIT;
}
@@ -72,14 +80,26 @@ if (process.env.ANALYZED_TEXT_FIELD_SUFFIX) {
config.analyzedTextFieldSuffix = process.env.ANALYZED_TEXT_FIELD_SUFFIX;
}

// only three options for tier access level: 'private' (default), 'regular', and 'libre'
if (process.env.TIER_ACCESS_LEVEL) {
if (process.env.TIER_ACCESS_LEVEL !== 'private'
&& process.env.TIER_ACCESS_LEVEL !== 'regular'
&& process.env.TIER_ACCESS_LEVEL !== 'libre') {
throw new Error(`Invalid TIER_ACCESS_LEVEL "${process.env.TIER_ACCESS_LEVEL}"`);
// Either all indices should have explicit index-scoped tiered-access values or
// the manifest should have a site-wide TIER_ACCESS_LEVEL value.
// This approach is backwards-compatible with commons configured for past versions of tiered-access.
let allIndicesHaveTierAccessSettings = true;
config.esConfig.indices.forEach((item) => {
if (!item.tier_access_level && !config.tierAccessLevel) {
throw new Error('Either set all index-scoped tiered-access levels or a site-wide tiered-access level.');
}
config.tierAccessLevel = process.env.TIER_ACCESS_LEVEL;
if (item.tier_access_level && !allowedTierAccessLevels.includes(item.tier_access_level)) {
throw new Error(`tier_access_level invalid for index ${item.type}.`);
}
if (!item.tier_access_level) {
allIndicesHaveTierAccessSettings = false;
}
});

// If the indices all have settings, empty out the default
// site-wide TIER_ACCESS_LEVEL from the config.
if (allIndicesHaveTierAccessSettings) {
delete config.tierAccessLevel;
}

// check whitelist is enabled
21 changes: 12 additions & 9 deletions src/server/download.js
@@ -11,21 +11,23 @@ const downloadRouter = async (req, res, next) => {
} = req.body;

log.debug('[download] ', JSON.stringify(req.body, null, 4));
const esIndex = esInstance.getESIndexByType(type);
const esIndexConfig = esInstance.getESIndexConfigByType(type);
const tierAccessLevel = (config.tierAccessLevel
? config.tierAccessLevel : esIndexConfig.tier_access_level);
const jwt = headerParser.parseJWT(req);
const authHelper = await getAuthHelperInstance(jwt);

try {
let appliedFilter;
/**
* Tier acces strategy for download endpoint:
* 1. if data commons is secure, add auth filter layer onto filter
* 2. if data commons is regular:
* Tier access strategy for download endpoint:
* 1. if the data commons or the index is private, add auth filter layer onto filter
* 2. if the data commons or the index is regular:
* a. if request contains out-of-access resource, return 401
* b. if request contains only accessible resources, return response
* 3. if data commons is private, always return response without any auth check
* 3. if the data commons or the index is libre, always return response without any auth check
*/
switch (config.tierAccessLevel) {
switch (tierAccessLevel) {
case 'private': {
appliedFilter = authHelper.applyAccessibleFilter(filter);
break;
@@ -36,7 +38,7 @@ const downloadRouter = async (req, res, next) => {
appliedFilter = authHelper.applyAccessibleFilter(filter);
} else {
const outOfScopeResourceList = await authHelper.getOutOfScopeResourceList(
esIndex, type, filter,
esIndexConfig.index, type, filter,
);
// if requesting resources > allowed resources, return 401,
if (outOfScopeResourceList.length > 0) {
@@ -54,13 +56,14 @@ const downloadRouter = async (req, res, next) => {
break;
}
default:
throw new Error(`Invalid TIER_ACCESS_LEVEL "${config.tierAccessLevel}"`);
throw new Error(`Invalid TIER_ACCESS_LEVEL "${tierAccessLevel}"`);
}
const data = await esInstance.downloadData({
esIndex, esType: type, filter: appliedFilter, sort, fields,
esIndex: esIndexConfig.index, esType: type, filter: appliedFilter, sort, fields,
});
res.send(data);
} catch (err) {
log.error(err);
next(err);
}
return 0;
28 changes: 28 additions & 0 deletions src/server/es/index.js
@@ -318,6 +318,34 @@ class ES {
);
}

/**
* Get es indexConfig by es type
* Throw 400 error if there's no existing es type
* @param {string} esType
*/
getESIndexConfigByType(esType) {
const index = this.config.indices.find((i) => i.type === esType);
if (index) return index;
throw new CodedError(
400,
`Invalid es type: "${esType}"`,
);
}

/**
* Get es index config by es index name
* Throw 400 error if there's no existing es index of that name
* @param {string} esIndexName
*/
getESIndexConfigByName(esIndexName) {
const indexConfig = this.config.indices.find((i) => i.index === esIndexName);
if (indexConfig) return indexConfig;
throw new CodedError(
400,
`Invalid es index name: "${esIndexName}"`,
);
}

/**
* Get all es indices and their alias
*/