
Shadows and Least General Generalization (LGG)

This section begins the Learning layer, following Shadows through Paths, Hidden Context Prototypes, clustering feedback, and related transforms that adapt the system over time.

Shadows are a cornerstone of ProtoScript’s graph-based ontology and its primary mechanism for learning and categorization. Through Least General Generalization (LGG), Shadows create ad-hoc subtypes that capture the structure common to two or more Prototypes, enabling unsupervised learning without gradient descent: Compare/LGG constructs explicit Prototypes from bounded graph comparisons, and feedback-driven pruning keeps those generalizations focused. This section unpacks Shadows and LGG, explaining their purpose, mechanics, and significance with clear analogies, step-by-step examples, and practical applications. We’ll focus on how Shadows generalize Prototypes, categorize instances, and power ProtoScript’s dynamic reasoning, keeping the material accessible to developers familiar with C# or JavaScript.

Why Shadows Are Foundational

Shadows are the heart of ProtoScript’s learning capability, allowing the system to:

  • Generalize Patterns: Identify shared structures across Prototypes (e.g., finding that two variable declarations share a common type).
  • Create Subtypes Dynamically: Form ad-hoc categories (e.g., “initialized integer variables”) without predefined schemas.
  • Enable Unsupervised Learning: Discover patterns in data without labeled training sets, unlike supervised machine learning.
  • Scale Efficiently: Operate on graph structures with bounded comparison sets and pruning policies rather than open-ended numeric optimization.

Learning Role in ProtoScript:

  • Structural Learning Path: Shadows, via LGG, provide the runtime pathway for learning new categories and relationships by materializing shared structure as explicit Prototypes.
  • Graph-Focused Alternative: This learning process is symbolic: it derives graph artifacts (Shadows, Paths) from structural comparisons instead of adjusting numeric weights, making it suited to sparse or evolving ontology data.

Shadows generalize structures dynamically through LGG, enabling unsupervised learning and scalable comparisons without relying on the static class hierarchies and external inference engines used in traditional ontologies.

Analogy to Familiar Concepts

For C# developers:

  • Shadows are like finding the common interface or base class between two objects by comparing their properties, but done dynamically at runtime without predefined types.
  • Think of LGG as a LINQ query that extracts the shared structure of two objects, creating a new “type” on the fly.

For JavaScript developers:

  • Shadows resemble finding the common properties of two JSON objects to form a prototype, but organized in a graph for reasoning.
  • LGG is like merging objects to keep only shared keys and values, automatically generating a reusable template.

For database developers:

  • Shadows are like discovering a common schema for two database records by comparing their fields, creating a new table definition.
  • LGG is akin to a graph query that finds the intersection of two node structures.

What Are Shadows?

A Shadow is a Prototype generated by applying Least General Generalization (LGG) to two or more Prototypes, capturing their most specific common structure. It acts as an ad-hoc subtype, representing the shared properties and relationships of the input Prototypes while omitting their differences. Shadows enable ProtoScript to:

  • Categorize: Group Prototypes under a common subtype (e.g., “initialized variables”).
  • Learn: Discover patterns without supervision by generalizing instances.
  • Reason: Query and transform data based on generalized structures.

LGG Defined: LGG finds the least general (most specific) Prototype that subsumes two or more input Prototypes, retaining only their common properties, types, and relationships. Rather than finding a shared instance of the inputs, it constructs their most specific shared abstraction.

Key Characteristics

  1. Structural Generalization

    • LGG compares Prototype graphs, keeping shared properties and generalizing differences (e.g., different variable names become a generic string).
    • Example: int i = 0 and int j = -1 generalize to int _ = _.
  2. Ad-Hoc Subtypes

    • Shadows form temporary or persistent subtypes, categorizing Prototypes dynamically.
    • Example: A Shadow for “parents” groups Homer and Marge.
  3. Unsupervised Learning

    • Shadows learn by comparing instances, requiring no labeled data.
    • Example: Generalizing two SQL queries to a common query structure.
  4. Graph-Based

    • Operates on the graph’s nodes and edges, ensuring scalability and interpretability.
    • Example: Traversing properties to find commonalities.

How Shadows Work

Shadows are created using the compare operator, which applies LGG to two Prototypes. The process involves:

  1. Comparison: Analyze properties, types, and relationships of the input Prototypes.
  2. Retention: Keep identical properties and values (e.g., same type).
  3. Generalization: Replace differing values with their common type or a wildcard (e.g., _ for names).
  4. Output: Produce a new Prototype (the Shadow) representing the shared structure.

Note: The following snippet is conceptual pseudocode; the actual runtime API is similar in shape and equivalent in behavior. Treat it as an explanatory representation of the operation.

Syntax (Conceptual, executed by runtime):

Compare(prototype1, prototype2) // Returns a Shadow Prototype

C# Analogy: Like a method that compares two objects’ fields and returns a new object with only their common properties, but operating on graph nodes.

LGG Rules

  1. Exact Match: Identical properties or values are retained (e.g., TypeName = "int" in both Prototypes).
  2. Type Generalization: Differing values of the same type generalize to the type (e.g., "i" and "j" become string).
  3. Structural Generalization: Differing substructures generalize to their common parent (e.g., different expressions become Expression).
  4. Omission: Properties present in one but not the other are excluded unless structurally required.
  5. Annotations: Comparison metadata (e.g., Compare.Exact, Compare.StartsWith) may annotate the Shadow to indicate match precision.
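To make rules 1, 2, and 4 concrete, here is a minimal Python sketch (not the ProtoScript runtime) that applies them to flat dictionary “prototypes”: exact matches are retained, same-type differences generalize to a type wildcard, and one-sided properties are omitted. The dict representation and function name are illustrative assumptions.

```python
def lgg(proto1, proto2):
    """Illustrative LGG over flat dict prototypes (not the real runtime)."""
    shadow = {}
    for key in proto1.keys() & proto2.keys():  # Rule 4: one-sided keys are omitted
        a, b = proto1[key], proto2[key]
        if a == b:
            shadow[key] = a                    # Rule 1: exact match is retained
        elif type(a) is type(b):
            shadow[key] = type(a).__name__     # Rule 2: generalize to the shared type
        # Values of unrelated types contribute nothing in this flat sketch
    return shadow

decl_i = {"TypeName": "int", "IsNullable": False, "VariableName": "i"}
decl_j = {"TypeName": "int", "IsNullable": False, "VariableName": "j"}
shadow = lgg(decl_i, decl_j)
# TypeName and IsNullable are retained; VariableName generalizes to "str"
```

Rule 3 (structural generalization over nested substructures) would require recursing into graph-valued properties, which this flat sketch deliberately omits.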

Example 1: Generalizing C# Variable Declarations

Scenario: Create a Shadow for int i = 0 and int j = -1.

Input Prototypes:

prototype CSharp_VariableDeclaration {
  CSharp_Type Type = new CSharp_Type();
  string VariableName = "";
  CSharp_Expression Initializer = new CSharp_Expression();
}
prototype CSharp_Type {
  string TypeName = "";
  bool IsNullable = false;
}
prototype CSharp_Expression {
  string Value = "";
}
prototype Int_Declaration_I : CSharp_VariableDeclaration {
  Type.TypeName = "int";
  VariableName = "i";
  Initializer = IntegerLiteral_0;
}
prototype IntegerLiteral_0 : CSharp_Expression {
  Value = "0";
}
prototype Int_Declaration_J : CSharp_VariableDeclaration {
  Type.TypeName = "int";
  VariableName = "j";
  Initializer = UnaryExpression_Minus1;
}
prototype UnaryExpression_Minus1 : CSharp_Expression {
  string Operator = "-";
  string Value = "1";
}

Shadow Creation:

  • Comparison:
    • Type.TypeName: Both "int" (exact match, retained).
    • IsNullable: Both false (exact match, retained).
    • VariableName: "i" vs. "j" (differ, generalize to string).
    • Initializer: IntegerLiteral_0 vs. UnaryExpression_Minus1 (differ, generalize to CSharp_Expression).
  • Resulting Shadow:

prototype InitializedIntVariable : CSharp_VariableDeclaration {
  Type.TypeName = "int";
  Type.IsNullable = false;
  VariableName = "";
  Initializer = new CSharp_Expression();
}

C# Visualization: int _ = _;

What’s Happening?

  • The Shadow captures the common structure: both are non-nullable int declarations with an initializer.
  • VariableName generalizes to an empty string (a wildcard), and Initializer to CSharp_Expression.
  • Graph View: The Shadow is a node with edges to "int", false, and a generic CSharp_Expression node.
  • Learning Outcome: The Shadow defines a new subtype, “initialized integer variables,” categorizing both inputs.

Unlike OWL, which requires predefined classes, Shadows dynamically learn this subtype from instance comparisons, enabling unsupervised categorization.
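The structural-generalization step in Example 1 — where the differing initializers IntegerLiteral_0 and UnaryExpression_Minus1 generalize to CSharp_Expression — can be sketched in Python. The class hierarchy mirrors the Prototypes above, and the MRO walk stands in for graph traversal; this is an illustrative assumption, not the ProtoScript runtime.

```python
class CSharpExpression:
    pass

class IntegerLiteral(CSharpExpression):
    pass

class UnaryExpression(CSharpExpression):
    pass

def common_parent(a, b):
    """Walk a's ancestry and return the first type that b also inherits from."""
    for cls in type(a).__mro__:
        if cls in type(b).__mro__:
            return cls
    return object

lit0 = IntegerLiteral()      # stands in for IntegerLiteral_0 (value "0")
minus1 = UnaryExpression()   # stands in for UnaryExpression_Minus1 (value "-1")
# Differing substructures generalize to their nearest shared parent type
```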

Example 2: Generalizing Simpsons Characters

Scenario: Create a Shadow for Homer and Marge to identify shared traits.

Input Prototypes (from Simpsons example):

prototype Person {
  string Name = "";
  string Gender = "";
  Location Location = new Location();
  Collection ParentOf = new Collection();
  Person Spouse = new Person();
  int Age = 0;
}
prototype Location {
  string Name = "";
}
prototype SimpsonsHouse : Location {
  Name = "Simpsons House";
}
prototype Homer : Person {
  Name = "Homer Simpson";
  Gender = "Male";
  Location = SimpsonsHouse;
  ParentOf = [Bart, Lisa, Maggie];
  Spouse = Marge;
  Age = 39;
}
prototype Marge : Person {
  Name = "Marge Simpson";
  Gender = "Female";
  Location = SimpsonsHouse;
  ParentOf = [Bart, Lisa, Maggie];
  Spouse = Homer;
  Age = 36;
}

Shadow Creation:

  • Comparison:
    • Name: "Homer Simpson" vs. "Marge Simpson" (differ, generalize to string).
    • Gender: "Male" vs. "Female" (differ, generalize to string).
    • Location: Both SimpsonsHouse (exact match, retained).
    • ParentOf: Both [Bart, Lisa, Maggie] (exact match, retained).
    • Spouse: Marge vs. Homer (differ, generalize to Person).
    • Age: 39 vs. 36 (differ, generalize to int).
  • Resulting Shadow:

prototype SimpsonsHouseParent : Person {
  Name = "";
  Gender = "";
  Location = SimpsonsHouse;
  ParentOf = [Bart, Lisa, Maggie];
  Spouse = new Person();
  Age = 0;
}

What’s Happening?

  • The Shadow defines a subtype for “parents living in the Simpsons’ house with children Bart, Lisa, and Maggie.”
  • Differing properties (Name, Gender, Age) generalize to their types; Spouse to a generic Person.
  • Graph View: The Shadow links to SimpsonsHouse and [Bart, Lisa, Maggie] nodes, with placeholder edges for Name, Gender, Spouse, and Age.
  • Learning Outcome: Categorizes Homer and Marge as instances of this subtype, learned without supervision.

Shadows enable ProtoScript to learn family structures dynamically, unlike OWL’s need for predefined family classes.

Example 3: Categorizing with Shadows

Scenario: Use the Shadow to categorize a new Prototype, Ned.

New Prototype:

prototype Ned : Person {
  Name = "Ned Flanders";
  Gender = "Male";
  Location = FlandersHouse;
  ParentOf = [Rod, Todd];
  Spouse = Maude;
  Age = 40;
}
prototype FlandersHouse : Location {
  Name = "Flanders House";
}

Categorization:

function IsSimpsonsHouseParent(Person person) : bool {
  return person -> SimpsonsHouseParent {
    this.Location.Name == "Simpsons House" &&
    this.ParentOf.Count == 3
  };
}

What’s Happening?

  • IsSimpsonsHouseParent uses the SimpsonsHouseParent Shadow to check if Ned fits the subtype.
  • Ned fails because Location = FlandersHouse, not SimpsonsHouse, and ParentOf has two children, not three.
  • Graph View: The -> operator traverses Ned’s properties, comparing them to the Shadow’s structure.
  • Learning Outcome: The Shadow categorizes Homer and Marge but excludes Ned, demonstrating learned discrimination.

This unsupervised categorization is more flexible than OWL’s static class membership, adapting to new instances dynamically.
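The categorization check above can be sketched in Python: an instance belongs to the Shadow’s subtype only if it satisfies every value the Shadow retained. The dict encoding and helper name are hypothetical stand-ins for the graph traversal performed by the -> operator.

```python
def matches_shadow(instance, shadow_constraints):
    """Does the instance satisfy each property value the Shadow retained?"""
    return all(instance.get(key) == value
               for key, value in shadow_constraints.items())

# Retained structure of the SimpsonsHouseParent Shadow (generalized
# properties like Name, Gender, and Age impose no constraint here)
shadow = {"Location": "Simpsons House", "ParentOf": ("Bart", "Lisa", "Maggie")}

homer = {"Name": "Homer Simpson", "Location": "Simpsons House",
         "ParentOf": ("Bart", "Lisa", "Maggie")}
ned = {"Name": "Ned Flanders", "Location": "Flanders House",
       "ParentOf": ("Rod", "Todd")}
# Homer fits the Shadow's retained structure; Ned fails on both constraints
```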

Mechanics of Shadows

Shadows are generated by the ProtoScript runtime:

  1. Comparison Operator: Compare(prototype1, prototype2) triggers LGG, analyzing graph structures.
  2. Graph Traversal: Examines nodes, properties, and edges, applying LGG rules.
  3. Node Creation: Produces a new Prototype node (the Shadow) with generalized properties.
  4. Categorization: The Shadow’s structure defines a subtype, tested via the -> operator.

Scalability:

  • Naïvely, LGG could require many pairwise comparisons; runtime stays bounded by drawing candidates from indexed shortlists and feedback-ranked matches.
  • Pruning (e.g., removing trivial Shadows) and clustering (grouping similar Shadows) manage complexity using configurable feedback thresholds.
  • LGG is deterministic structural comparison that produces graph artifacts, not iterative numeric optimization.

Non-Supervised Learning:

  • Shadows learn by structural similarity, not labeled data, making them ideal for sparse or evolving ontologies.
  • Example: Generalizing new characters without predefined categories.

Scalability levers (implementation-dependent)

Shadows rely on explicit controls to stay tractable:

  • Candidate Prototypes come from indexes or shortlists rather than full graph scans.
  • LGG traversals honor hop limits set by transform templates, bounding how far comparisons spread.
  • Feedback scores drive pruning or merging of low-value Shadows and can split broad ones when precision drops.
  • Hidden Context Prototypes store deltas instead of full copies, reducing traversal and storage cost during learning.
  • Clustering groups similar Shadows so follow-up comparisons reuse prior matches.
  • Caching materialized graph fragments is optional and can keep frequently used Shadows or Paths ready for transforms.

These levers are implementation-dependent; operators choose thresholds to balance fidelity and cost. The learning runtime uses the same controls when applying Paths and Subtypes, so downstream transforms reuse the bounded candidate sets.
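Candidate shortlisting and feedback-driven pruning can be sketched as a simple top-k filter over feedback-ranked Shadows. The scores, threshold, and names below are illustrative assumptions, not runtime defaults; the point is that only a bounded survivor set reaches the pairwise LGG stage.

```python
import heapq

def shortlist_candidates(scored_candidates, k=3, min_score=0.2):
    """Prune candidates below the feedback threshold, then keep the top k.
    Threshold and k are illustrative; real values are operator-tuned."""
    survivors = [(name, score) for name, score in scored_candidates
                 if score >= min_score]                    # pruning pass
    return heapq.nlargest(k, survivors,
                          key=lambda pair: pair[1])        # bounded shortlist

scored = [("ShadowA", 0.9), ("ShadowB", 0.1), ("ShadowC", 0.5),
          ("ShadowD", 0.7), ("ShadowE", 0.3)]
top = shortlist_candidates(scored)
# ShadowB falls below the threshold; only the 3 best survivors proceed to LGG
```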

Why Shadows Support Learning

Shadows anchor the learning pipeline in ProtoScript’s ontology, offering:

  • Unsupervised Learning: Discover subtypes without training data, unlike supervised ontology tools.
  • Scalability Controls: Candidate shortlisting and pruning keep pairwise LGG comparisons focused on promising structures.
  • Interpretability: Shadows are explicit Prototypes, traceable via graph paths, unlike neural networks’ black boxes.
  • Dynamic Reasoning: Ad-hoc subtypes enable flexible querying and transformation.

Structural Contrast with Gradient Descent:

  • Gradient descent updates numeric weights; Shadows rely on explicit structural comparisons and feedback scores, which suits sparse knowledge graphs.
  • Shadows use structural LGG, learning from few examples with deterministic results, ideal for knowledge representation.

Example 4: Cross-Domain Generalization

Scenario: Generalize a C# variable and a database column.

Input Prototypes:

prototype Database_Column {
  string ColumnName = "";
  string DataType = "";
}
prototype ID_Column : Database_Column {
  ColumnName = "ID";
  DataType = "int";
}

Shadow Creation (with Int_Declaration_I from Example 1):

  • Comparison:
    • TypeName ("int") vs. DataType ("int"): Values match exactly, so "int" is retained.
    • VariableName ("i") vs. ColumnName ("ID"): Differ, generalize to string.
    • Other properties (e.g., Initializer, IsNullable): Absent in one, omitted.
  • Resulting Shadow:

prototype IntDataElement {
  string Name = "";
  string Type = "int";
}

What’s Happening?

  • The Shadow defines a subtype for “integer data elements” (e.g., variables or columns).
  • Graph View: Links to "int" and a generic string for Name.
  • Learning Outcome: Categorizes Int_Declaration_I and ID_Column, revealing cross-domain type consistency.
  • Unifies code and database domains, unlike OWL’s separate ontologies.

Moving Forward

Shadows and LGG are ProtoScript’s core learning mechanism, enabling unsupervised categorization through indexed pairwise comparisons and pruning-based generalization controls.


Previous: Relationships in ProtoScript | Next: Prototype Paths and Parameterization