Hello everyone,
We're evaluating RML for implementing an ETL pipeline that operates on a large amount of JSON data, stored in files that are regularly updated. Since there are millions of files with the same structure, it would be very inefficient for us to generate an explicit RML rule set for every single file a mapping is applied to. From the examples documented here, it looks like the file name must be specified in the logical source like this:
<#PersonMapping>
  rml:logicalSource [
    rml:source "People.json";
    ...
This limits the applicability of the rule set to a single file only. Is there a way to add wildcard operators to rml:source so that it can be applied to more than one file based on a pattern? Something like this:
<#PersonMapping>
  rml:logicalSource [
    rml:source "persons/*.json";
What do you think about this? Are there any reasons why this is a bad idea?
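To make the intended semantics concrete, here is a minimal Python sketch of the expansion I have in mind: the pattern is resolved against a base directory, and the same mapping would then be applied to each matching file in turn. This is only an illustration under my own assumptions. `expand_source_pattern` and `base_dir` are hypothetical names, and nothing here reflects how an existing RML processor resolves `rml:source`.

```python
import glob
import os


def expand_source_pattern(pattern, base_dir="."):
    """Return the files under base_dir that match a glob pattern.

    Hypothetical helper: it mimics what a wildcard in rml:source could
    expand to. Paths are returned relative to base_dir, sorted, and with
    forward slashes so they look like the values used in the mapping.
    """
    matches = glob.glob(os.path.join(base_dir, pattern))
    files = [m for m in matches if os.path.isfile(m)]  # skip directories
    relative = [os.path.relpath(f, base_dir).replace(os.sep, "/") for f in files]
    return sorted(relative)


if __name__ == "__main__":
    # Each printed line is one concrete source the single mapping would cover.
    for path in expand_source_pattern("persons/*.json"):
        print('rml:source "%s";' % path)
```

With standard glob semantics, `*` does not cross directory boundaries, so `persons/*.json` would pick up `persons/a.json` but not `persons/archive/a.json`. Whether a recursive form such as `**` should also be supported is a separate question.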