Skip to content

Add i18n and ai rules - batch 1#74

Merged
heliocodacy merged 2 commits intomasterfrom
add_custom_rules
Dec 3, 2025
Merged

Add i18n and ai rules - batch 1#74
heliocodacy merged 2 commits intomasterfrom
add_custom_rules

Conversation

@heliocodacy
Copy link
Contributor

No description provided.

Copilot AI review requested due to automatic review settings December 3, 2025 13:43
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds support for parameterized Semgrep rules by introducing i18n (internationalization) and AI-related custom rules. The implementation enables rules to use HTML comment placeholders (e.g., <!-- MODEL_REGEX -->) that can be replaced with actual parameter values at runtime.

  • Adds parameter extraction and replacement logic for Semgrep rules with HTML comment placeholders
  • Introduces two new rule sets: i18n rules for Java internationalization checks and AI rules for detecting insecure LLM model usage
  • Includes comprehensive test fixtures demonstrating i18n anti-patterns across multiple languages

Reviewed changes

Copilot reviewed 21 out of 21 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
internal/tool/configuration.go Adds parameter placeholder replacement logic and regex pattern matching for HTML comments during configuration file generation
internal/docgen/pattern-with-explanation.go Extends pattern structure to include parameters field for parameterized rules
internal/docgen/parsing.go Implements parameter extraction from rule YAML files, parsing HTML comment placeholders from regex patterns
docs/multiple-tests/i18n/src/i18n.js Test fixture demonstrating i18n configuration and usage patterns in JavaScript/React
docs/multiple-tests/i18n/src/UILayer.java Test fixture showing hardcoded strings vs. localized messages in Java UI layer
docs/multiple-tests/i18n/src/PaymentService.java Test fixture demonstrating i18n anti-patterns in Java payment processing
docs/multiple-tests/i18n/src/Orderlist.jsx Test fixture showing hardcoded strings and locale-specific formatting issues in React
docs/multiple-tests/i18n/src/OrderService.java Test fixture demonstrating string concatenation anti-patterns in Java
docs/multiple-tests/i18n/src/OrderList.js Test fixture showing hardcoded UI labels in React components
docs/multiple-tests/i18n/src/OrderController.java Test fixture demonstrating hardcoded API response messages in Spring controllers
docs/multiple-tests/i18n/src/OrderApp.java Test fixture showing mixed usage of localized and hardcoded strings
docs/multiple-tests/i18n/src/Order.cpp Test fixture demonstrating i18n issues in C++ including hardcoded messages and locale-unaware formatting
docs/multiple-tests/i18n/src/Messages_fr.properties French translation resource bundle for test fixtures
docs/multiple-tests/i18n/src/Messages_en.properties English translation resource bundle for test fixtures
docs/multiple-tests/i18n/results.xml Expected test results for i18n rule violations
docs/multiple-tests/i18n/patterns.xml Pattern configuration for i18n tests
docs/multiple-tests/ai/src/cs/GeminiExample.cs Test fixture demonstrating insecure LLM model usage in C#
docs/multiple-tests/ai/results.xml Expected test results for AI rule violations
docs/multiple-tests/ai/patterns.xml Pattern configuration with parameterized modelRegex for AI tests
docs/codacy-rules-i18n.yaml New Semgrep rule definition for detecting hardcoded strings in Java output statements
docs/codacy-rules-ai.yaml New Semgrep rule definition for detecting insecure LLM model usage with parameterizable regex

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +151 to +206
// replaceParameterPlaceholders replaces HTML comment placeholders (e.g., <!-- MODEL_REGEX -->)
// with the corresponding parameter values from the pattern
func replaceParameterPlaceholders(line string, pattern *codacy.Pattern) string {
if pattern == nil || len(pattern.Parameters) == 0 {
return line
}

// Check if line contains an HTML comment placeholder
if !htmlCommentRegex.MatchString(line) {
return line
}

// Replace each HTML comment with the corresponding parameter value
result := htmlCommentRegex.ReplaceAllStringFunc(line, func(match string) string {
matches := htmlCommentRegex.FindStringSubmatch(match)
if len(matches) > 1 {
paramName := matches[1]
// Convert UPPER_CASE to camelCase to match parameter name format
formattedParamName := formatParameterName(paramName)

// Find the parameter in the pattern
for _, param := range pattern.Parameters {
if param.Name == formattedParamName {
// Use Value if set, otherwise use Default
value := param.Value
if value == nil {
value = param.Default
}
if value != nil {
return fmt.Sprintf("%v", value)
}
}
}
}
// If no parameter found, keep the original placeholder
return match
})

return result
}

// formatParameterName converts UPPER_CASE to camelCase
func formatParameterName(name string) string {
parts := strings.Split(strings.ToLower(name), "_")
if len(parts) == 0 {
return name
}

result := parts[0]
for i := 1; i < len(parts); i++ {
if len(parts[i]) > 0 {
result += strings.ToUpper(string(parts[i][0])) + parts[i][1:]
}
}
return result
}
Copy link

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new functions replaceParameterPlaceholders (lines 153-190) and formatParameterName (lines 193-206) lack test coverage. Given that the codebase has comprehensive test coverage for similar utility functions (see configuration_test.go), these new functions should also have unit tests.

Consider adding tests that cover:

  • replaceParameterPlaceholders: Test with nil pattern, empty parameters, valid placeholder replacement, multiple placeholders, and Value vs Default selection
  • formatParameterName: Test with single word, multiple underscores, empty string, and edge cases

Copilot uses AI. Check for mistakes.
Comment on lines +297 to +353
// extractParametersFromRule recursively searches for regex fields with HTML comment placeholders
// and creates PatternParameters for each one found
func extractParametersFromRule(ruleMap map[string]interface{}) []codacy.PatternParameter {
var parameters []codacy.PatternParameter
seenParams := make(map[string]bool)

var searchForRegex func(obj interface{})
searchForRegex = func(obj interface{}) {
switch v := obj.(type) {
case map[string]interface{}:
for key, value := range v {
if key == "regex" {
if regexStr, ok := value.(string); ok {
if matches := htmlCommentRegex.FindStringSubmatch(regexStr); len(matches) > 1 {
paramName := matches[1]
if !seenParams[paramName] {
seenParams[paramName] = true
// Convert to proper case (e.g., MODEL_REGEX -> modelRegex)
formattedName := formatParameterName(paramName)
parameters = append(parameters, codacy.PatternParameter{
Name: formattedName,
Description: fmt.Sprintf("Regular expression pattern for %s", strings.ToLower(strings.ReplaceAll(paramName, "_", " "))),
Default: ".*",
})
}
}
}
} else {
searchForRegex(value)
}
}
case []interface{}:
for _, item := range v {
searchForRegex(item)
}
}
}

searchForRegex(ruleMap)
return parameters
}

// formatParameterName converts UPPER_CASE to camelCase
func formatParameterName(name string) string {
parts := strings.Split(strings.ToLower(name), "_")
if len(parts) == 0 {
return name
}

result := parts[0]
for i := 1; i < len(parts); i++ {
if len(parts[i]) > 0 {
result += strings.ToUpper(string(parts[i][0])) + parts[i][1:]
}
}
return result
}
Copy link

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new functions extractParametersFromRule (lines 299-337) and formatParameterName (lines 340-353) lack test coverage. Given that the codebase has comprehensive test coverage for similar utility functions (see parsing_test.go), these new functions should also have unit tests.

Consider adding tests that cover:

  • extractParametersFromRule: Test with rules containing HTML comment placeholders, nested structures, multiple parameters, and rules without placeholders
  • formatParameterName: Test with single word, multiple underscores, empty string, and edge cases

Copilot uses AI. Check for mistakes.
Comment on lines +130 to +142
func defaultRuleIsConfiguredWithPattern(line string, patterns *[]codacy.Pattern, idIsPresent bool, currentPattern *codacy.Pattern) (bool, *codacy.Pattern) {
if strings.Contains(line, "- id:") {
id := strings.TrimSpace(strings.Split(line, ":")[1])
pattern, found := lo.Find(*patterns, func(item codacy.Pattern) bool {
return item.ID == id
})
if found {
return true, &pattern
}
return false, nil
}
return idIsPresent, currentPattern
}
Copy link

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new function defaultRuleIsConfiguredWithPattern (lines 130-142) lacks test coverage. This function is a critical part of the configuration logic but doesn't have dedicated unit tests, while the older defaultRuleIsConfigured function has tests.

Consider adding tests for defaultRuleIsConfiguredWithPattern that verify:

  • Correct pattern extraction when ID is found
  • Proper return values when ID is not found
  • Behavior with different currentPattern states

Copilot uses AI. Check for mistakes.
"codacy-rules-ai.yaml",
}
for _, file := range customRulesFiles {
filePath, _ := filepath.Abs(path.Join(docsDir, file))
Copy link

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error from filepath.Abs() is being silently ignored on line 126 using the blank identifier _. While filepath.Abs() rarely fails, ignoring the error can lead to subtle bugs if it does occur (e.g., when the current directory cannot be determined).

Consider handling the error explicitly:

filePath, err := filepath.Abs(path.Join(docsDir, file))
if err != nil {
    return nil, err
}
Suggested change
filePath, _ := filepath.Abs(path.Join(docsDir, file))
filePath, err := filepath.Abs(path.Join(docsDir, file))
if err != nil {
return nil, err
}

Copilot uses AI. Check for mistakes.
Comment on lines +192 to +206
// formatParameterName converts UPPER_CASE to camelCase
func formatParameterName(name string) string {
parts := strings.Split(strings.ToLower(name), "_")
if len(parts) == 0 {
return name
}

result := parts[0]
for i := 1; i < len(parts); i++ {
if len(parts[i]) > 0 {
result += strings.ToUpper(string(parts[i][0])) + parts[i][1:]
}
}
return result
}
Copy link

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The formatParameterName function is duplicated in both internal/tool/configuration.go (lines 193-206) and internal/docgen/parsing.go (lines 340-353). This duplication can lead to maintenance issues if the logic needs to be updated in the future.

Consider extracting this function to a shared utility package or having one package import it from the other if there's a clear dependency direction.

Copilot uses AI. Check for mistakes.
Comment on lines 13 to +17
)

var htmlCommentRegex = regexp.MustCompile(`<!--\s*([A-Z_]+)\s*-->`)
Copy link

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The htmlCommentRegex variable is duplicated in both internal/tool/configuration.go (line 17) and internal/docgen/parsing.go (line 18). This duplication can lead to maintenance issues and inconsistencies if the regex pattern needs to be modified.

Consider extracting this regex to a shared utility package or constants file that both packages can import.

Suggested change
)
var htmlCommentRegex = regexp.MustCompile(`<!--\s*([A-Z_]+)\s*-->`)
"internal/shared" // Import the shared regex package
// var htmlCommentRegex = regexp.MustCompile(`<!--\s*([A-Z_]+)\s*-->`)
// Use shared.HtmlCommentRegex instead (defined in internal/shared/regex.go)

Copilot uses AI. Check for mistakes.
metadata:
category: security
subcategory: ai
description: Detects usage of insecure LLM models in C# codebases to prevent potential
Copy link

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The description field on line 19 appears to be incomplete. It reads "Detects usage of insecure LLM models in C# codebases to prevent potential" but the sentence is cut off.

Please complete the description with what the rule prevents (e.g., "to prevent potential security vulnerabilities" or "to prevent potential data leaks").

Suggested change
description: Detects usage of insecure LLM models in C# codebases to prevent potential
description: Detects usage of insecure LLM models in C# codebases to prevent potential security vulnerabilities.

Copilot uses AI. Check for mistakes.

export default function OrderList() {
const { t, i18n } = useTranslation();
const [orders, setOrders] = useState([
Copy link

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused variable setOrders.

Suggested change
const [orders, setOrders] = useState([
const [orders] = useState([

Copilot uses AI. Check for mistakes.
@heliocodacy heliocodacy enabled auto-merge (squash) December 3, 2025 14:09
Copilot AI review requested due to automatic review settings December 3, 2025 14:10
var parameters []codacy.PatternParameter
seenParams := make(map[string]bool)

var searchForRegex func(obj interface{})

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@heliocodacy heliocodacy merged commit 82fd3bb into master Dec 3, 2025
14 checks passed
@heliocodacy heliocodacy deleted the add_custom_rules branch December 3, 2025 14:12
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 21 out of 21 changed files in this pull request and generated 7 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +8 to +9
</checkstyle>

No newline at end of file
Copy link

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The XML file contains trailing whitespace on line 9. This should be removed to maintain clean code standards.

Suggested change
</checkstyle>
</checkstyle>

Copilot uses AI. Check for mistakes.
- pattern-not: System.out.println()
- pattern-not: System.err.println()
message: >-
Use localized messages instead of hardcoded strings.
Copy link

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The message field contains trailing whitespace after "hardcoded strings." on line 31. This should be removed for consistency and cleanliness.

Suggested change
Use localized messages instead of hardcoded strings.
Use localized messages instead of hardcoded strings.

Copilot uses AI. Check for mistakes.
Comment on lines +42 to +43
likelihood: HIGH

No newline at end of file
Copy link

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The file contains a trailing blank line (line 43). While this is sometimes a style preference, it's inconsistent with the other YAML files in the codebase and should be removed for consistency.

Suggested change
likelihood: HIGH
likelihood: HIGH

Copilot uses AI. Check for mistakes.
using Google.GenAI.Types;

public class GenerateContentSimpleText {
public static async Task main() {
Copy link

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method name main should follow C# naming conventions and be capitalized as Main. In C#, method names should use PascalCase, and the entry point method is specifically named Main.

Suggested change
public static async Task main() {
public static async Task Main() {

Copilot uses AI. Check for mistakes.
"codacy-rules-ai.yaml",
}
for _, file := range customRulesFiles {
filePath, _ := filepath.Abs(path.Join(docsDir, file))
Copy link

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error from filepath.Abs() is being silently ignored. This could lead to unexpected behavior if the path cannot be resolved to an absolute path. The error should be checked and handled appropriately.

Suggested change
filePath, _ := filepath.Abs(path.Join(docsDir, file))
filePath, err := filepath.Abs(path.Join(docsDir, file))
if err != nil {
return nil, fmt.Errorf("failed to resolve absolute path for %s: %w", file, err)
}

Copilot uses AI. Check for mistakes.
Comment on lines +32 to +33
</checkstyle>

No newline at end of file
Copy link

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The XML file contains trailing whitespace on line 33. This should be removed to maintain clean code standards.

Suggested change
</checkstyle>
</checkstyle>

Copilot uses AI. Check for mistakes.

export default function OrderList() {
const { t, i18n } = useTranslation();
const [orders, setOrders] = useState([
Copy link

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused variable setOrders.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants