- Implements Adagrad (Adaptive Gradient) using pure NumPy
- Adapts learning rate individually for each parameter
- Includes comprehensive docstrings and type hints
- Adds doctests for validation
- Provides usage example demonstrating convergence
- Follows PEP8 coding standards
- Part of issue TheAlgorithms#13662
- Implements Adam (Adaptive Moment Estimation) optimizer
- Implements Nesterov Accelerated Gradient (NAG) optimizer
- Both use pure NumPy without deep learning frameworks
- Includes comprehensive docstrings and type hints
- Adds doctests for validation
- Provides usage examples demonstrating convergence
- Follows PEP8 coding standards
- Part of issue TheAlgorithms#13662
- Implements Muon optimizer for hidden layer weight matrices
- Uses Newton-Schulz orthogonalization iterations
- Provides matrix-aware gradient updates with spectral constraints
- Includes comprehensive docstrings and type hints
- Adds doctests for validation
- Provides usage example demonstrating optimization
- Follows PEP8 coding standards
- Pure NumPy implementation without frameworks
- Part of issue TheAlgorithms#13662
Multiple Pull Requests Detected

@Adhithya-Laxman, we are extremely excited that you want to submit multiple algorithms to this repository, but we have a limit on how many pull requests a user can keep open at a time. This is to make sure all maintainers and users focus on a limited number of pull requests at a time and the quality of the code is maintained. This pull request is being closed because the user already has an open pull request. Please focus on your previous pull request before opening another one. Thank you for your cooperation.

User opened pull requests (including this one): #13721, #13718, #13681, #13680, #13646
Author

My other PRs are closed now. Kindly help me open a PR for this branch.
Description
This PR implements the Muon optimizer using pure NumPy, completing the neural network optimizers module for the repository.
Muon is a cutting-edge optimizer designed specifically for the hidden layer weight matrices of neural networks; it uses Newton-Schulz matrix orthogonalization iterations for improved convergence and computational efficiency.
This PR addresses part of issue #13662 - Add neural network optimizers module to enhance training capabilities
What does this PR do?
Implementation Details
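As a rough illustration, a minimal pure-NumPy sketch of Newton-Schulz orthogonalization might look like the following. This is not the PR's actual code: the function name is hypothetical, and the cubic polynomial shown here stands in for whatever coefficients the implementation uses (published Muon implementations typically use a tuned quintic).

```python
import numpy as np


def newton_schulz_orthogonalize(grad: np.ndarray, steps: int = 5) -> np.ndarray:
    """Approximate the nearest semi-orthogonal matrix to ``grad``.

    Iteratively drives the singular values of the normalized input
    toward 1 without computing an explicit SVD.
    """
    # Frobenius normalization bounds the spectral norm by 1, keeping the
    # cubic iteration below in its region of convergence.
    x = grad / (np.linalg.norm(grad) + 1e-7)
    transposed = x.shape[0] > x.shape[1]
    if transposed:
        x = x.T  # work in the wide orientation so x @ x.T is the smaller Gram matrix
    for _ in range(steps):
        # Cubic Newton-Schulz step: each singular value sigma maps to
        # 1.5 * sigma - 0.5 * sigma**3, whose fixed point is sigma = 1.
        x = 1.5 * x - 0.5 * (x @ x.T) @ x
    return x.T if transposed else x
```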
Why Muon?
Muon represents state-of-the-art optimizer research, offering matrix-aware gradient updates with spectral constraints and improved convergence for hidden layer weight matrices.
Features
✅ Complete docstrings with parameter descriptions
✅ Type hints for all function parameters and return values
✅ Doctests for correctness validation
✅ Usage example demonstrating optimizer on matrix optimization
✅ PEP8 compliant code formatting
✅ Newton-Schulz orthogonalization implementation
✅ Configurable hyperparameters (learning rate, momentum, iteration steps); see the update-step sketch after this list
✅ Pure NumPy - no external deep learning frameworks
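To show how these features could fit together, here is a minimal sketch of a single Muon update for a 2D weight matrix. The names are hypothetical and it reuses the newton_schulz_orthogonalize sketch above; the shape-based rescaling follows the commonly published Muon recipe and is an assumption about this PR's code.

```python
import numpy as np


def muon_update(
    weight: np.ndarray,
    grad: np.ndarray,
    momentum_buf: np.ndarray,
    lr: float = 0.02,
    beta: float = 0.95,
    ns_steps: int = 5,
) -> tuple[np.ndarray, np.ndarray]:
    """One Muon step; returns the updated weight and momentum buffer."""
    # Heavy-ball momentum accumulated on the raw gradient.
    momentum_buf = beta * momentum_buf + grad
    # Orthogonalize the momentum via Newton-Schulz (see the sketch above).
    update = newton_schulz_orthogonalize(momentum_buf, steps=ns_steps)
    # Rescale so tall matrices receive updates of comparable magnitude.
    update *= max(1.0, weight.shape[0] / weight.shape[1]) ** 0.5
    return weight - lr * update, momentum_buf
```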
Testing
All doctests pass.
Linting passes.
Example output demonstrates proper optimization behavior on matrix parameters.
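For reference, a toy convergence demonstration in the spirit of the example described here could look like the following, again reusing the hypothetical sketches above rather than the PR's actual code.

```python
import numpy as np

rng = np.random.default_rng(0)
target = rng.standard_normal((4, 3))
weight = np.zeros((4, 3))
buf = np.zeros_like(weight)
for _ in range(100):
    grad = weight - target  # gradient of 0.5 * ||weight - target||_F**2
    weight, buf = muon_update(weight, grad, buf, lr=0.05)
# The Frobenius error should drop well below its starting value of roughly 3.5.
print(float(np.linalg.norm(weight - target)))
```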
References
Relation to Issue #13662
This PR completes the optimizer sequence outlined in #13662.
With this PR, the neural network optimizers module is now complete with 6 fundamental optimizers covering classical to cutting-edge optimization techniques.
Use Cases
Muon is particularly effective for the hidden layer weight matrices of neural networks, where its matrix-aware gradient updates with spectral constraints apply.
Checklist
Summary
This PR marks the completion of the neural network optimizers module, providing educators and learners with a comprehensive collection of optimization algorithms from fundamental SGD to cutting-edge Muon. The module now serves as a complete educational resource for understanding neural network training optimization.
This PR, along with the author's related optimizer PRs, collectively fixes issue #13662.

Fixes #13662 (this PR completes the neural network optimizers module)