Implement JEPA-based memory compression framework #785
Open
Pshyam17 wants to merge 22 commits into Factory-AI:main from
Conversation
Defines the interface every encoder in jeval must satisfy. EPEComputer depends only on this abstraction, never on a concrete model class — makes encoder swapping free for ablations.
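The dependency-inversion idea can be sketched as follows. `EncoderBase` and `ToyEncoder` are illustrative names for this sketch, not jeval's actual classes:

```python
from abc import ABC, abstractmethod

import numpy as np


class EncoderBase(ABC):
    """Abstract contract: everything EPEComputer needs from an encoder."""

    @abstractmethod
    def encode(self, texts: list[str]) -> np.ndarray:
        """Return one unit-normalized embedding row per input text."""


class ToyEncoder(EncoderBase):
    """Minimal concrete encoder: hashed bag-of-words, unit-normalized.

    Any real model (e.g. a SentenceTransformer wrapper) satisfies the
    same contract, so swapping encoders never touches EPEComputer.
    """

    def __init__(self, dim: int = 16):
        self.dim = dim

    def encode(self, texts: list[str]) -> np.ndarray:
        vecs = np.zeros((len(texts), self.dim))
        for i, text in enumerate(texts):
            for word in text.split():
                vecs[i, hash(word) % self.dim] += 1.0
        norms = np.linalg.norm(vecs, axis=1, keepdims=True)
        return vecs / np.clip(norms, 1e-9, None)
```

Because EPEComputer only sees `EncoderBase.encode`, an ablation that swaps the target encoder is a one-line change at construction time.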
Wraps all-mpnet-base-v2 as the JEPA target encoder. All parameters frozen unconditionally — a moving target makes EPE uncalibrated. encode_chunked handles long memory files without OOM.
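The chunked-encoding idea is simple batching; a minimal sketch, where `encode_fn` and `batch_size` are illustrative placeholders rather than jeval's actual signature:

```python
import numpy as np


def encode_chunked(texts, encode_fn, batch_size=32):
    """Encode `texts` in fixed-size batches so peak memory stays bounded.

    `encode_fn` is any callable mapping a list of strings to an (n, d)
    array; `batch_size` here is illustrative, not jeval's actual value.
    """
    parts = [
        encode_fn(texts[i:i + batch_size])
        for i in range(0, len(texts), batch_size)
    ]
    return np.concatenate(parts, axis=0)
```

In PyTorch, unconditional freezing is the standard `for p in model.parameters(): p.requires_grad_(False)` plus `model.eval()`, which is what keeps the EPE target stationary across training.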
Implements EPE = MSE(predictor(enc(C)), enc(T)) / 4. Dividing by 4 normalizes the score to [0, 1] on the unit sphere. Separate training_loss() and compute() paths make the inference-time oracle pattern explicit.
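Under the reading that MSE here is the squared Euclidean distance between the two embeddings (the reading under which /4 yields a [0, 1] bound on the unit sphere), the score can be sketched as:

```python
import numpy as np


def epe_score(pred: np.ndarray, target: np.ndarray) -> float:
    """Embedding Prediction Error for unit-norm embeddings.

    For unit vectors, ||p - t||^2 ranges over [0, 4]: 0 when identical,
    4 when antipodal. Dividing by 4 maps the score onto [0, 1].
    """
    return float(np.sum((pred - target) ** 2) / 4.0)
```

Orthogonal embeddings land exactly at 0.5, which is a convenient midpoint sanity check for any predictor checkpoint.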
Buckets segment-level EPE by Strata content type. weighted_risk = sum(weight_t * mean_epe_t) is the scalar the budget allocator acts on. align_abstractive() handles non-1:1 mappings via nearest-neighbor cosine matching.
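The aggregation can be sketched as below; the weight values are illustrative placeholders, not jeval's actual RISK_WEIGHTS:

```python
from collections import defaultdict

# Illustrative weights only; jeval's actual RISK_WEIGHTS live in epe.weights.
RISK_WEIGHTS = {"decision": 0.4, "artifact": 0.3, "code": 0.2, "prose": 0.1}


def weighted_risk(scored_segments):
    """scored_segments: iterable of (content_type, epe) pairs.

    Computes sum over types t of weight_t * mean_epe_t -- the single
    scalar the budget allocator acts on.
    """
    by_type = defaultdict(list)
    for ctype, epe in scored_segments:
        by_type[ctype].append(epe)
    return sum(
        RISK_WEIGHTS.get(ctype, 0.0) * (sum(vals) / len(vals))
        for ctype, vals in by_type.items()
    )
```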
Routes memory segments to 6 semantic classes using DeBERTa-v3-large. No labeled training data required. NLI hypotheses are explicitly documented as a tunable hyperparameter — classification accuracy is measurable and improvable independently of EPE.
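The routing shape can be sketched as follows. In jeval the per-class score comes from DeBERTa-v3-large NLI entailment over the documented hypotheses; here a keyword heuristic stands in as the scorer so the routing logic is runnable without model weights, and the class names are illustrative:

```python
# Stand-in scorer: the real classifier scores each class via NLI entailment
# against a tunable hypothesis; a keyword count plays that role here.
HYPOTHESIS_KEYWORDS = {
    "decision": ["decided", "chose", "will use"],
    "artifact": ["path", ".py", "token", "endpoint"],
    "prose": [],
}


def keyword_score(segment: str, cls: str) -> float:
    return float(sum(kw in segment.lower() for kw in HYPOTHESIS_KEYWORDS[cls]))


def classify(segment: str, score_fn=keyword_score) -> str:
    """Route a segment to the highest-scoring class; ties fall to 'prose'."""
    best, best_score = "prose", 0.0
    for cls in HYPOTHESIS_KEYWORDS:
        s = score_fn(segment, cls)
        if s > best_score:
            best, best_score = cls, s
    return best
```

Injecting `score_fn` keeps the "classification accuracy is improvable independently of EPE" property: swapping hypotheses or the NLI model changes only the scorer.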
Maps weighted_risk → 3-tier compression budget per segment. Thresholds calibrated conservatively as priors — empirical calibration from probe eval results is the intended upgrade path.
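The tier mapping is a threshold ladder; a sketch with illustrative prior thresholds, not the calibrated values:

```python
def allocate_budget(risk: float) -> float:
    """Map weighted_risk to a 3-tier retention budget.

    Thresholds are illustrative conservative priors; the intended
    upgrade is to calibrate them empirically from probe-eval results.
    """
    if risk >= 0.5:
        return 1.0   # high risk of semantic loss: keep verbatim
    if risk >= 0.2:
        return 0.5   # medium risk: moderate compression
    return 0.2       # low risk: aggressive compression
```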
Full pre-hoc compression pipeline: segment → classify → EPE estimate → budget → apply. _apply_budget is a word-truncation placeholder with a clean interface so any LLM summarizer can be swapped in as the backend.
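The end-to-end shape, with the word-truncation placeholder, can be sketched as follows; function names and the injected callables are illustrative, not jeval's actual API:

```python
def apply_budget(text: str, budget: float) -> str:
    """Word-truncation placeholder. The interface is (text, budget) -> text,
    so an LLM summarizer can replace the body without touching callers."""
    words = text.split()
    keep = max(1, int(len(words) * budget))
    return " ".join(words[:keep])


def compress(segments, classify_fn, risk_fn, budget_fn):
    """Pre-hoc pipeline: segment -> classify -> EPE risk -> budget -> apply.

    Each stage is injected so it can be swapped independently.
    """
    out = []
    for seg in segments:
        ctype = classify_fn(seg)
        budget = budget_fn(risk_fn(seg, ctype))
        out.append(apply_budget(seg, budget))
    return out
```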
Intercepts Droid's PreCompact event, runs jeval adaptive compression, writes back verified memories.md before Droid sees it. Gracefully degrades to uncalibrated EPE if no trained predictor checkpoint is found. Appends structured JSONL audit log per compression event.
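The JSONL audit append is one JSON object per line in append mode; a sketch with illustrative field names:

```python
import json
import time


def append_audit(log_path: str, event: dict) -> None:
    """Append one structured audit record per compression event as JSONL.

    Field names are illustrative; append mode preserves prior events,
    and one-object-per-line keeps the log streamable and grep-friendly.
    """
    record = {"ts": time.time(), **event}
    with open(log_path, "a") as f:
        f.write(json.dumps(record) + "\n")
```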
Reimplements Factory's Dec 2025 probe-based evaluation methodology (recall, artifact, continuation, decision probes, 6-dimension LLM judge, 0-5 scale). Results are directly comparable to their published baselines: Factory 2.45 / Anthropic 2.33 / OpenAI 2.19 on artifact tracking.
README leads with the artifact tracking gap and shows the full pipeline in ASCII. pyproject scoped to the example folder so it installs independently of the main repo.
…decomposer was imported from strata, and strata/budget was imported from decomposer — circular. RISK_WEIGHTS now lives in its own module that both can import safely.
sed replaced only the first line of RISK_WEIGHTS dict, leaving the remaining lines as dangling IndentationError. Rewrote file cleanly with correct imports from epe.weights.
- Swap word truncation for Mistral via NVIDIA NIM as the compression backend.
- Add _llm_compress() with graceful fallback to word truncation if no API key is set.
- Replace hardcoded EPE thresholds with z-score normalization: thresholds are now relative to the session EPE distribution, self-calibrating across any predictor version.
- Add artifact pattern detection in BudgetAllocator: entries containing file paths, JWTs, Redis, or API endpoints always get budget=1.0 regardless of EPE — fixes 'low EPE != low importance' for predictable content.
- Add trained predictor checkpoint (30 epochs, 5000 pairs, A100).
- Add eval/train.py with a synthetic Droid memory training-data generator.
- Add test_data/real_session.md, a synthetic session for benchmark testing.
- Verify 10/10 critical artifacts survive 3 iterative compression rounds.
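The artifact-override rule above can be sketched with regular expressions; the patterns below are illustrative stand-ins for whatever the real detector matches:

```python
import re

# Illustrative patterns for "predictable but critical" content; the
# actual BudgetAllocator pattern set may differ.
ARTIFACT_PATTERNS = [
    re.compile(r"[\w./-]+\.(?:py|md|json|yaml|toml)\b"),  # file paths
    re.compile(r"\beyJ[A-Za-z0-9_-]{8,}"),                # JWT-shaped tokens
    re.compile(r"\bredis://\S+"),                          # Redis URLs
    re.compile(r"https?://\S+"),                           # API endpoints
]


def budget_with_artifact_override(entry: str, epe_budget: float) -> float:
    """Entries matching any artifact pattern always keep budget 1.0,
    regardless of EPE; otherwise the EPE-derived budget applies."""
    if any(p.search(entry) for p in ARTIFACT_PATTERNS):
        return 1.0
    return epe_budget
```

This directly encodes the 'low EPE != low importance' fix: an easily predictable file path has low EPE but must never be compressed away.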
Introduce a framework for JEPA-based semantic fidelity evaluation: an abstract encoder interface, a frozen SentenceTransformer target encoder, and an EPE computation engine. The framework supports adaptive compression, content classification, and budget allocation for Droid memory management. Documentation and packaging files are included, and fixes resolve the circular imports introduced earlier in the branch.