src/main/java/org/beehive/gpullama3/inference/InferenceCore.java
Lines changed: 12 additions & 21 deletions
@@ -28,15 +28,6 @@
  * This class provides core computational operations such as RMS normalization and forward passes through model layers. It supports both CPU and GPU implementations.
  * </p>
  *
- * <p>
- * Specifically, it implements:
- * <ul>
- * <li>{@code rmsnorm} – applies Root Mean Square Layer Normalization to input vectors</li>
- * <li>{@code forwardJava} – executes a Forward pass for LLaMA and Mistral models on CPU</li>
- * <li>{@code forwardJavaQwen3} – executes a Forward pass for Qwen3 models on CPU</li>
- * <li>{@code forwardTornadoVM} – executes a Forward pass using TornadoVM for GPU acceleration</li>
- * </ul>
- * </p>
  */
 
 public final class InferenceCore {
@@ -643,10 +634,10 @@ public static FloatTensor forwardJavaPhi3(Model model, Phi3State state, int toke
  * Granite uses the same transformer architecture as Llama but with maximal update parameterization (µP)
  * scaling factors applied at specific points:
  * <ul>
- * <li>Embedding scaling: multiply embeddings after lookup</li>
- * <li>Attention scaling: use custom multiplier instead of 1/sqrt(headDim)</li>
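
For reviewers unfamiliar with the two scaling points named in the Javadoc lines above, here is a minimal sketch of what they describe. It is not the code in `InferenceCore`; the class name `GraniteScalingSketch`, the method names, and the parameters `embeddingMultiplier` and `attentionMultiplier` are hypothetical and chosen only for illustration.

```java
// Illustrative sketch only; all names here are hypothetical, not taken from the PR.
public final class GraniteScalingSketch {

    private GraniteScalingSketch() {
    }

    // Embedding scaling: multiply the embedding vector in place after lookup.
    static void scaleEmbedding(float[] embedding, float embeddingMultiplier) {
        for (int i = 0; i < embedding.length; i++) {
            embedding[i] *= embeddingMultiplier;
        }
    }

    // Attention scaling: scale the query-key dot product by a configured
    // multiplier rather than the usual 1/sqrt(headDim).
    static float attentionScore(float[] q, float[] k, float attentionMultiplier) {
        float dot = 0f;
        for (int i = 0; i < q.length; i++) {
            dot += q[i] * k[i];
        }
        return dot * attentionMultiplier;
    }

    // Conventional scaling, shown for comparison: multiplier = 1/sqrt(headDim).
    static float standardAttentionScore(float[] q, float[] k) {
        float scale = (float) (1.0 / Math.sqrt(q.length));
        return attentionScore(q, k, scale);
    }
}
```

With a head dimension of 4, the conventional multiplier is 0.5; passing a different `attentionMultiplier` changes only the scale of the score, not how the dot product is computed.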