Input all categorical data via YAML files #164
Merged
As for error logging: unfortunately, because the foreign keys are deferred, it is currently not possible to log the exact file in which an FK violation occurred.
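A minimal sketch of why this happens, assuming a deferred foreign key in SQLite and a seed step that inserts rows from several files in one transaction. The table names, column names, and the `seed` helper are hypothetical and are not taken from this PR.

```python
import sqlite3

# Hypothetical two-table schema; the real schema of this project is not shown here.
DEMO_SCHEMA = """
CREATE TABLE properties (id TEXT PRIMARY KEY);
CREATE TABLE category_properties (
    category_id TEXT NOT NULL,
    property_id TEXT NOT NULL
        REFERENCES properties (id) DEFERRABLE INITIALLY DEFERRED
);
"""


def seed(rows_by_file):
    """Insert rows from several (hypothetical) files in one transaction.

    Returns the error message raised at COMMIT, or None on success.
    """
    # isolation_level=None lets us issue BEGIN/COMMIT ourselves.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(DEMO_SCHEMA)
    conn.execute("BEGIN")
    try:
        for filename, rows in rows_by_file.items():
            # `filename` is known here, but a dangling property_id raises
            # nothing at this point: the check is deferred until COMMIT.
            conn.executemany(
                "INSERT INTO category_properties VALUES (?, ?)", rows
            )
        try:
            conn.execute("COMMIT")
        except sqlite3.IntegrityError as error:
            # The failure surfaces only now, after all files were processed.
            conn.execute("ROLLBACK")
            return str(error)
        return None
    finally:
        conn.close()
```

Calling `seed({"Ab.yaml": [("Ab", "abelian")], "Set.yaml": [("Set", "finite")]})` fails in this sketch because the `properties` table is empty. The error is raised by the `COMMIT`, not by either insert, and SQLite's message names neither the row nor the file.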
All categorical data is now authored in YAML files, so contributors no longer need to touch SQL seed files. YAML is more readable, more structured, and independent of the database layer.
To support this change, the existing seed data has been converted into YAML in a one-time migration. From now on, YAML is the single source of truth for seeded data. The SQL seed files have been deleted, but the SQLite database is kept as before: it is still used for storage, for querying, and for running the deduction scripts. The schema is still defined in SQL files.
A major improvement is that each category now has its own YAML file. These files contain not only the basic metadata but also tags, related categories, satisfied and unsatisfied properties of the category, and more (see the example below). This information is no longer distributed across many SQL files, which makes it much easier to add new categories and maintain existing ones.
The same changes have been made for functors. In addition, each property (of a category or a functor) now has its own YAML file. The implications are still grouped by topic, but the topics have been refined, so each file is now more focused.
Previously, the SQL seed files did two things at once: they defined the data and also implemented the logic to insert and normalize it into multiple tables. This PR separates those concerns. The insertion logic is now handled explicitly in the new seed script, while YAML only defines the data itself. This does introduce an extra layer (YAML parsing + import step), but that layer already existed implicitly in the SQL seed files.
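The separation can be sketched roughly as follows. This is an illustration only, not the seed script from this PR: the schema, the field names (`id`, `name`, `tags`), and the `insert_category` helper are assumptions, and the `record` dict stands in for the result of parsing one YAML file with a YAML library.

```python
import sqlite3

# Hypothetical schema and field names, chosen only for illustration.
SEED_SCHEMA = """
CREATE TABLE categories (id TEXT PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE category_tags (
    category_id TEXT NOT NULL REFERENCES categories (id),
    tag TEXT NOT NULL
);
"""


def insert_category(conn, data):
    """Spread one parsed category record over two normalized tables."""
    conn.execute(
        "INSERT INTO categories (id, name) VALUES (?, ?)",
        (data["id"], data["name"]),
    )
    conn.executemany(
        "INSERT INTO category_tags (category_id, tag) VALUES (?, ?)",
        [(data["id"], tag) for tag in data.get("tags", [])],
    )


# Stand-in for one parsed YAML file: pure data, no insertion logic.
record = {"id": "Ab", "name": "category of abelian groups", "tags": ["algebra"]}

conn = sqlite3.connect(":memory:")
conn.executescript(SEED_SCHEMA)
insert_category(conn, record)
conn.commit()

tags = [row[0] for row in conn.execute("SELECT tag FROM category_tags")]
```

In this sketch the data file knows nothing about tables or joins. Only `insert_category` decides how one record is split across `categories` and `category_tags`.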
A challenge was finding a good and consistent way to format long strings used as YAML values. This will likely need further improvement in the future. It was also tested whether Markdown could be used inside fields, but this caused more problems than it solved.
This PR resolves #163 (but with YAML instead of Markdown).
Example
Here is the current version of Ab.yaml. All the information about this category is in one file.