Use a short SPDX license header for LLM-centered files #1489
Dev-iL wants to merge 3 commits into apache:main
# Copyright(c) Open Law Library. All rights reserved. #
# See ThirdPartyNotices.txt in the project root for additional notices. #
Removing this might be incorrect in this case. Is this here because the code was vendored in from pygls?
Yeah, I don't recall. So maybe revert?
Can this be moved to a NOTICE file if the code in question is licensed under ALv2 too? CC: @potiuk
Yes. It should be placed in the NOTICE file; see https://infra.apache.org/licensing-howto.html
Yeah, so I think this one should stay, right?
You don't move license info to the NOTICE. If you use 3rd party code that has a NOTICE then its NOTICE contents must be included in your NOTICE.
This whole issue is caused by the possibility that there is 3rd party code. @Dev-iL and I have looked at that file and if there ever was 3rd party code, it now appears to have been removed from the file.
Whether there is 3rd party code in that file is the most important decision here, and we need to make that call first. Everything else depends on that starting point.
Now I see this is another file. Not the conftest.py file. This does look like it has 3rd party source. The source header needs to be retained and our LICENSE needs to mention this file.
Can we move all of this license mess into its own issues and PRs and not try to deal with everything in one PR? It is really messy to have all this happening in one PR.
@pjfanning Indeed, this PR has accumulated more discussion than I'd have liked. That said, I think we're actually at the finish line now. The only "license mess" was the two LSP test files (conftest.py and ls_setup.py), and that's been resolved thanks to @skrawcz's analysis. Splitting this into separate PRs at this point would mean re-doing the pre-commit hook configuration and exclusion logic across multiple branches, which is likely more churn than just landing it as-is.
Let's use this as a learning experience for the future.
- Mark .github as export-ignore in .gitattributes
- Add short and long license templates for pre-commit hooks
- Add pre-commit hooks to enforce license headers (replaces CI scripts)
- Delete scripts/add_license_headers.py and scripts/check_license_headers.py
- Remove CI license check step from hamilton-lsp workflow
- Fix inconsistent license header indentation in several files
- Add missing license headers to PR templates
- Add vendored code attributions (Open Law Library, Palantir) to NOTICE file
- Exclude contrib/docs/ from markdown license hook (Docusaurus frontmatter)
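As a rough illustration of what a header-enforcing hook with an exclusion rule does, here is a minimal Python sketch. It is not the hook configuration from this PR: the function names, the 20-line search window, and the marker strings are assumptions made for the example. Only the contrib/docs/ exclusion comes from the commit message above.

```python
# Hypothetical sketch of a license-header check with path exclusions.
# Names and the 20-line window are illustrative, not taken from the PR.

SPDX_LINE = "SPDX-License-Identifier: Apache-2.0"
FULL_HEADER_MARKER = "Licensed to the Apache Software Foundation (ASF)"

# The commit message excludes contrib/docs/ (Docusaurus frontmatter).
EXCLUDED_PREFIXES = ("contrib/docs/",)


def has_license_header(text, max_lines=20):
    """Return True if a short or full header appears near the top of the file."""
    head = "\n".join(text.splitlines()[:max_lines])
    return SPDX_LINE in head or FULL_HEADER_MARKER in head


def files_missing_header(files, excluded_prefixes=EXCLUDED_PREFIXES):
    """Given {path: file_text}, return the sorted paths that lack a header."""
    missing = []
    for path, text in files.items():
        if path.startswith(excluded_prefixes):
            continue  # excluded paths are never reported
        if not has_license_header(text):
            missing.append(path)
    return sorted(missing)
```

A real hook would read the files named on its command line and exit non-zero when the returned list is non-empty; the sketch takes a dictionary so it can be exercised without touching the filesystem.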
Following the approach from apache/airflow#62073 and apache/airflow#62145, files intended for LLM/agent consumption (not distributed in releases) now use a minimal SPDX license identifier instead of the full Apache 2.0 header - for LLM token efficiency.
See also:
https://lists.apache.org/thread/j1tn63r2lf13v3d1tnnqff8fkcl4nx53
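The short-versus-full header choice described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the templates or hooks added by this PR: the LLM_PATTERNS list is hypothetical (the description does not enumerate which files count as LLM-centered), and header_for and as_comment are names invented for the example. The full header text is the standard ASF source header.

```python
from fnmatch import fnmatch

# Minimal SPDX identifier used for LLM/agent-facing files.
SHORT_HEADER = "SPDX-License-Identifier: Apache-2.0"

# Standard ASF source header used everywhere else.
FULL_HEADER = """Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements.  See the NOTICE file
distributed with this work for additional information
regarding copyright ownership.  The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License.  You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied.  See the License for the
specific language governing permissions and limitations
under the License."""

# Hypothetical patterns for LLM-centered files; the PR does not list them here.
LLM_PATTERNS = ("AGENTS.md", "CLAUDE.md", ".claude/*")


def header_for(path, llm_patterns=LLM_PATTERNS):
    """Pick the short header for LLM-centered files, the full one otherwise."""
    if any(fnmatch(path, pattern) for pattern in llm_patterns):
        return SHORT_HEADER
    return FULL_HEADER


def as_comment(header, prefix="# "):
    """Render a header as line comments, without trailing whitespace."""
    return "\n".join((prefix + line).rstrip() for line in header.splitlines())
```

The short form is a single line where the full form is sixteen, which is where the token saving comes from. The as_comment helper only covers languages with line comments; Markdown files would need the header wrapped in an HTML comment instead.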
Changes
Mark .github folder as export-ignore.
How I tested this
Notes
Checklist