Merged
32 commits
2aaa5cb
Merge pull request #22 from KnowledgeCaptureAndDiscovery/dev
juanjemdIos Jan 19, 2026
64839f8
Google compliant option with -gc. Reference, Logos. Fixes #775, #594…
juanjemdIos Feb 2, 2026
83d289b
poetry lock
juanjemdIos Feb 2, 2026
fc224f1
messages about invalid token or warnings. Fixes #627
juanjemdIos Feb 3, 2026
855513d
Parse codeowners. New flag additional info -ai. Fixes #723
juanjemdIos Feb 5, 2026
9a65480
parser, docs and test for conda environment
juanjemdIos Feb 6, 2026
8133a77
small corrections in conda environment md. Fixes #489
juanjemdIos Feb 6, 2026
2693179
scikit learn to 1.5. Pyproject.toml and poetry.lock. Fixes #692
juanjemdIos Feb 9, 2026
3d3bd1f
After upgrading scikit-learn to 1.5, the test was updated to avoid fa…
juanjemdIos Feb 9, 2026
d5212ec
removing duplicate information. Fixes #833
juanjemdIos Feb 11, 2026
362ca3a
try test failed in last PR.
juanjemdIos Feb 11, 2026
5be314f
codemeta keywords as array instead of string
juanjemdIos Feb 20, 2026
3597acb
solve problem merging structured and non structured requirements in c…
juanjemdIos Feb 24, 2026
66cddc8
describe with local repo. Solve problem with li folders. Fixes #894
juanjemdIos Feb 25, 2026
c2dfcb9
solve bugs with readmes with no parseable content. Fixes #897
juanjemdIos Feb 26, 2026
0b81ff4
again bug with no headers.
juanjemdIos Feb 26, 2026
5d9b379
Merge dev into KnowledgeCaptureAndDiscovery-dev (resolved poetry.lock)
juanjemdIos Feb 26, 2026
903e0ab
Regenerate poetry.lock after merge
juanjemdIos Feb 26, 2026
dfddea7
Update pyproject.toml
juanjemdIos Feb 26, 2026
05082da
better flag. -ra and --reconcile_authors instead of -ai and --additio…
juanjemdIos Feb 26, 2026
939ee24
models regenerated with scikit learn 1.5. Rolf models still with a ol…
juanjemdIos Feb 27, 2026
05cc817
rolf test. Action test before PR in python 3.11 and 3.12
juanjemdIos Mar 2, 2026
fbc6386
some test skipped in CI in order to reduce request and because they a…
juanjemdIos Mar 2, 2026
b906617
bug when broken link in badge. Fixes #903
juanjemdIos Mar 2, 2026
6814d16
dictionary with scheme org properties used in google_codemeta_out to …
juanjemdIos Mar 2, 2026
5141eb6
Update src/somef/somef_cli.py
dgarijo Mar 2, 2026
7d803c1
Merge pull request #24 from KnowledgeCaptureAndDiscovery/dev
juanjemdIos Mar 4, 2026
aff1edc
new rolfs models. Solve comments by Dani.
juanjemdIos Mar 4, 2026
02cc120
Merge branch 'master' of https://github.com/juanjemdIos/somef
juanjemdIos Mar 4, 2026
6f7b976
test failed because case sensitive in the name of a readme
juanjemdIos Mar 4, 2026
41e76e5
best configuration of pyproject.toml and poetry.lock until morph allo…
juanjemdIos Mar 5, 2026
1233e95
ignore some warnings until new version of textblob.
juanjemdIos Mar 6, 2026
7 changes: 5 additions & 2 deletions .github/workflows/action-test-before-PR.yml
@@ -8,15 +8,18 @@ on:
 jobs:
   test:
     runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        python-version: ["3.11", "3.12"]

     steps:
       - name: Checkout code
         uses: actions/checkout@v2

       - name: Set up Python
         uses: actions/setup-python@v2
-        with:
-          python-version: "3.11"
+        with:
+          python-version: ${{ matrix.python-version }}

       - name: Install Poetry
         run: curl -sSL https://install.python-poetry.org | python3 -
15 changes: 13 additions & 2 deletions README.md
@@ -55,6 +55,7 @@ Given a readme file (or a GitHub/Gitlab repository) SOMEF will extract the follo
- **Keywords**: set of terms used to commonly identify a software component
- **License**: License and usage terms of a software component
- **Logo**: Main logo used to represent the target software component
- **Maintainer**: Individuals or teams responsible for maintaining the software component, extracted from the CODEOWNERS file
- **Name**: Name identifying a software component
- **Ontologies**: URL and path to the ontology files present in the repository
- **Owner**: Name and type of the user or organization in charge of the repository
@@ -290,17 +291,21 @@ Options:
-d, --doc_src PATH Path to the README file source
-i, --in_file PATH A file of newline separated links to GitHub/
Gitlab repositories
-l, --local_repo PATH Path to the local repository source. No APIs will be used

Output: [required_any]
-o, --output PATH Path to the output file. If supplied, the
output will be in JSON

-c, --codemeta_out PATH Path to an output codemeta file
-g, --graph_out PATH Path to the output Knowledge Graph export
file. If supplied, the output will be a
Knowledge Graph, in the format given in the
--format option chosen (turtle, json-ld)

-gc, --google_codemeta_out PATH Path to a Google-compliant Codemeta JSON-LD
file. This output transforms the standard
Codemeta to follow Google’s expected JSON-LD
structure.

-f, --graph_format [turtle|json-ld]
If the --graph_out option is given, this is
the format that the graph will be stored in
@@ -325,6 +330,12 @@ Options:
-v, --requirements_v Export only requirements from structured
sources (pom.xml, requirements.txt, etc.)


-ra, --reconcile_authors SOMEF will extract additional information
from certain files like CODEOWNERS.
This may require extra API
requests and increase execution time

-h, --help Show this message and exit.
```
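
To illustrate the kind of information the `-ra` / `--reconcile_authors` option draws on, here is a minimal sketch of reading owners out of a CODEOWNERS file. It is not SOMEF's implementation: the function name `parse_codeowners` and the sample content are hypothetical, and the sketch assumes the common CODEOWNERS layout (a path pattern followed by one or more owners, with `#` starting a comment).

```python
def parse_codeowners(text):
    """Return the unique owners listed in CODEOWNERS text, in order of appearance.

    Hypothetical helper for illustration; not part of SOMEF.
    """
    owners = []
    for raw_line in text.splitlines():
        # Drop comments (everything after '#') and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        # The first token is the path pattern; the remaining tokens are owners
        # (user handles, team handles, or e-mail addresses).
        for token in line.split()[1:]:
            if token not in owners:
                owners.append(token)
    return owners


sample = """
# Global owners
*       @alice @org/core-team
/docs/  @bob  # documentation
*.py    @alice dev@example.org
"""

print(parse_codeowners(sample))
```

Resolving each handle to a full name or profile is the step that would need extra API requests, which is consistent with the warning in the option's help text.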

5 changes: 2 additions & 3 deletions docs/bower.md
@@ -13,10 +13,9 @@ These fields are defined in the [Bower specification](https://github.com/bower/s
| requirements - value | requirements[i].result.value | "dependencies": {"paq":"version"} -> paq: version *(1)* |
| requirements - name | requirements[i].result.name | "dependencies": {"paq":"version"} -> paq |
| requirements - version | requirements[i].result.version | "dependencies": {"paq":"version"} -> version |
| requirements - dependency type | requirements[i].result.dependency_type | dependencies -> runtime , devDependencies -> dev |
| version | version[i].result.value | version |


<!-- | requirements - dependency type | requirements[i].result.dependency_type | dependencies -> runtime , devDependencies -> dev | -->
---

*(1)*
@@ -36,4 +35,4 @@ These fields are defined in the [Bower specification](https://github.com/bower/s
- Result value: "jquery: ^3.1.1"
- Result name: "jquery"
- Result version: "^3.1.1"
- Result dependency_type": "runtime" because it is "dependencies"s
<!-- - Result dependency_type": "runtime" because it is "dependencies"s -->
3 changes: 2 additions & 1 deletion docs/composer.md
@@ -20,6 +20,7 @@ These fields are defined in the [Composer.json specification](https://getcompose
| requirements - value | requirements[i].result.value | require.name require.version or require-dev.name require-dev.version |
| requirements - name | requirements[i].result.name | require.name or require-dev.name |
| requirements - version | requirements[i].result.version | require.version or require-dev.version |
| requirements - dependency type | requirements[i].result.dependency_type | require = runtime or require-dev = dev |
| version - value | version[i].result.value | version |
| version - tag | version[i].result.tag | version |

<!-- | requirements - dependency type | requirements[i].result.dependency_type | require = runtime or require-dev = dev | -->
47 changes: 47 additions & 0 deletions docs/condaenvironment.md
@@ -0,0 +1,47 @@
The following metadata fields can be extracted from a Conda `environment.yml` or `environment.yaml` file.
This file format is part of the Conda environment specification and is commonly used to declare software dependencies for reproducible environments.

Only dependency information is mapped, since it is the only part of the Conda environment specification with a CodeMeta counterpart (`softwareRequirements`).

---

## Extracted metadata fields

| Software metadata category | SOMEF metadata JSON path | ENVIRONMENT.YML metadata file field |
|-----------------------------|---------------------------------------|------------------------------|
| has_package_file | has_package_file[i].result.value | URL of the `environment.yml` file |
| requirements - value | requirements[i].result.value | dependencies |
| requirements - name | requirements[i].result.name | name extracted from each `dependencies` entry |
| requirements - version | requirements[i].result.version | version extracted from each `dependencies` entry |
<!-- | requirements - dependency type | requirements[i].result.dependency_type | conda if dependencies or pip if dependencies/pip *(1)* | -->


---
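
The mapping in the table above can be sketched as follows. This is a simplified illustration, not SOMEF's parser: the helper names `split_dependency` and `extract_requirements` are hypothetical, the input is assumed to be the `dependencies` list already loaded from `environment.yml` (for example with a YAML library), and only plain `name`, `name=version` and `name==version` style entries are handled.

```python
import re

# Package name, optionally followed by a version constraint such as
# "=3.8.5", "==0.4.3" or ">=1.20". Channel prefixes and build strings
# are deliberately out of scope for this sketch.
_SPEC = re.compile(r"^\s*([A-Za-z0-9_.\-]+)\s*([=<>!~]=?.*)?$")


def split_dependency(spec):
    """Split one dependency string into value, name and version."""
    match = _SPEC.match(spec)
    if match is None:
        # Unrecognised syntax: keep the raw string and report no version.
        return {"value": spec, "name": spec.strip(), "version": None}
    name, constraint = match.group(1), match.group(2)
    # "=3.8.5" and "==0.4.3" both become a bare version; ">=1.20" is kept as is.
    version = constraint.lstrip("=").strip() if constraint else None
    return {"value": spec, "name": name, "version": version}


def extract_requirements(dependencies):
    """Flatten conda entries and the nested `pip:` list into one result list."""
    results = []
    for entry in dependencies:
        if isinstance(entry, str):
            results.append(split_dependency(entry))
        elif isinstance(entry, dict):
            for pip_spec in entry.get("pip", []):
                results.append(split_dependency(pip_spec))
    return results


deps = ["python=3.8.5", {"pip": ["albumentations==0.4.3"]}]
for requirement in extract_requirements(deps):
    print(requirement)
```

Entries without a version (for example a bare `pip`) yield `version: None` in this sketch.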

<!--
*(1)*
- Example of a dependency conda and a dependency pip:
```
name: ldm
dependencies:
- python=3.8.5
- pip:
- albumentations==0.4.3
```
- Result:
```
"result": {
"value": "python=3.8.5",
"name": "python",
"version": "3.8.5",
"type": "Software_application",
"dependency_type": "conda"
},
"result": {
"value": "albumentations==0.4.3",
"name": "albumentations",
"version": "0.4.3",
"type": "Software_application",
"dependency_type": "pip"
},
-->
11 changes: 6 additions & 5 deletions docs/gemspec.md
@@ -14,8 +14,8 @@ These fields are defined in the [Ruby Gems specification](https://guides.rubygem
| requirements - value | requirements[i].result.value | requirements/add_dependency/add_development_dependency name:version *(6)* |
| requirements - name | requirements[i].result.name | requirements/add_dependency/add_development_dependency name *(6)* |
| requirements - version | requirements[i].result.version | requirements/add_dependency/add_development_dependency version *(6)* |
| requirements - development type | requirements[i].result.development_type | add_dependency -> runtime *(6)* |
| requirements - development type | requirements[i].result.development_type | add_development_dependency -> dev *(6)* |
<!-- | requirements - dependency type | requirements[i].result.development_type | add_dependency -> runtime *(6)* |
| requirements - dependency type | requirements[i].result.development_type | add_development_dependency -> dev *(6)* | -->

---

@@ -57,7 +57,7 @@ These fields are defined in the [Ruby Gems specification](https://guides.rubygem
- Example: `gem.name = "bootstrap-datepicker-rails"`
- Result: `bootstrap-datepicker-rails`

*(5)*
*(6)*
- Regex1: `r'gem\.requirements\s*=\s*(\[.*?\])'`
- Example:
```
@@ -75,12 +75,13 @@ spec.requirements = [
gem.add_dependency "railties", ">= 3.0"
gem.add_development_dependency "bundler", ">= 1.0"
```
Result: add_depency --> type runtime; add_development_dependencyd --> type dev
<!--
Result: add_dependency -> type runtime; add_development_dependency -> type dev
```
[{'result': {'value': 'railties: >= 3.0', 'name': 'railties', 'version': '>= 3.0', 'type': 'Software_application', 'dependency_type': 'runtime'}, 'confidence': 1, 'technique': 'code_parser', 'source': 'https://example.org/bootstrap-datepicker-rails.gemspec'}, {'result': {'value': 'bundler: >= 1.0', 'name': 'bundler', 'version': '>= 1.0', 'type': 'Software_application', 'dependency_type': 'dev'}, 'confidence': 1, 'technique': 'code_parser', 'source': 'https://example.org/bootstrap-datepicker-rails.gemspec'}]
```


-->
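
As an illustration of how `add_dependency` and `add_development_dependency` lines can be turned into the `value`, `name` and `version` fields listed in the table, here is a small regex-based sketch. The pattern and the function name `parse_gemspec_dependencies` are assumptions made for this example and are not the expressions used by SOMEF.

```python
import re

# Matches e.g.  gem.add_dependency "railties", ">= 3.0"
# The version argument is optional; parentheses around the arguments are tolerated.
_DEPENDENCY = re.compile(
    r'\.add_(?:development_)?dependency\s*\(?\s*["\']([^"\']+)["\']'
    r'(?:\s*,\s*["\']([^"\']+)["\'])?'
)


def parse_gemspec_dependencies(text):
    """Return value/name/version dictionaries for each dependency line."""
    results = []
    for name, version in _DEPENDENCY.findall(text):
        # findall yields an empty string when the optional version is absent.
        value = f"{name}: {version}" if version else name
        results.append({"value": value, "name": name, "version": version or None})
    return results


gemspec = '''
gem.add_dependency "railties", ">= 3.0"
gem.add_development_dependency "bundler", ">= 1.0"
'''

for dependency in parse_gemspec_dependencies(gemspec):
    print(dependency)
```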



1 change: 1 addition & 0 deletions docs/output.md
@@ -98,6 +98,7 @@ SOMEF aims to recognize the following categories (in alphabetical order):
- `keywords`: set of terms used to commonly identify a software component
- `license`: License and usage terms of a software component
- `logo`: Main logo used to represent the target software component.
- `maintainer`: Individuals or teams responsible for maintaining the software component, extracted from the CODEOWNERS file
- `name`: Name identifying a software component
- `ontologies`: URL and path to the ontology files present in the repository.
- `owner`: Name of the user or organization in charge of the repository
12 changes: 11 additions & 1 deletion docs/publiccode.md
@@ -121,13 +121,23 @@ dependsOn:

- Result PostgreSQL:
```
"result": {
"value": "PostgreSQL>=14.0",
"name": "PostgreSQL",
"version": ">=14.0",
"type": "Software_application"

},
```
<!-- ```
"result": {
"value": "PostgreSQL>=14.0",
"name": "PostgreSQL",
"version": ">=14.0",
"type": "Software_application",
"dependency_type": "runtime"
},
```
``` -->



2 changes: 2 additions & 0 deletions docs/supported_metadata_files.md
@@ -26,6 +26,8 @@ SOMEF can extract metadata from a wide range of files commonly found in software
| `*.cabal` | Haskell | Manifest file serving as the package descriptor for Haskell projects.| <div align="center">[🔍](./cabal.md)</div> | [📄](https://cabal.readthedocs.io/en/3.10/cabal-package.html)| |[Example](https://github.com/haskell/cabal/blob/master/Cabal/Cabal.cabal) |
| `dockerfile` | Dockerfile | Build specification file for container images that can include software metadata via LABEL instructions (OCI specification).| <div align="center">[🔍](./dockerfiledoc.md)</div> | [📄](https://docs.docker.com/reference/dockerfile/)| |[Example](https://github.com/FairwindsOps/nova/blob/master/Dockerfile) |
| `publiccode.yml` | YAML | YAML metadata file for public sector software projects| <div align="center">[🔍](./publiccode.md)</div> | [📄](https://yml.publiccode.tools//)| |[Example](https://github.com/maykinmedia/objects-api/blob/master/publiccode.yaml) |
| `environment.yml` | YAML | Conda environment specification file declaring software dependencies for reproducible environments| <div align="center">[🔍](./condaenvironment.md)</div> | | |[Example](https://github.com/CompVis/stable-diffusion/blob/main/environment.yaml) |


> **Note:** The general principles behind metadata mapping in SOMEF are based on the [CodeMeta crosswalk](https://github.com/codemeta/codemeta/blob/master/crosswalk.csv) and the [CodeMeta JSON-LD context](https://github.com/codemeta/codemeta/blob/master/codemeta.jsonld).
> However, each supported file type may have specific characteristics and field interpretations.