Skip to content

[refactor] Semantic Function Clustering Analysis — 2026-05-23 #6368

@github-actions

Description

@github-actions

Automated semantic function clustering analysis of all 129 non-test Go source files across 23 packages (815 functions/methods catalogued).

Executive Summary

The codebase is well-organized overall. Most packages follow clean single-responsibility principles with good naming conventions. Two concrete improvements were identified: one semantic outlier (GitHub-specific tool logic placed in a generic protocol package) and one over-fragmented utility package (4 single-function files in strutil). Additionally, a scattered context-key management pattern is noted as a low-priority improvement.

Function Inventory

Package Files Functions Primary Purpose
internal/auth 1 8 Auth header parsing and validation
internal/cmd 9 ~30 CLI command registration (Cobra)
internal/config 11 ~65 Config parsing, validation, expansion
internal/difc 8 ~70 DIFC labels, evaluator, agent registry
internal/envutil 4 11 Environment variable utilities
internal/guard 10 ~45 Security guards (WASM, noop, write-sink)
internal/httputil 3 6 Shared HTTP response helpers
internal/launcher 4 ~25 Backend process management
internal/logger 14 ~75 Multi-sink logging framework
internal/mcp 9 ~55 MCP protocol connection + GitHub-specific outlier
internal/middleware 1 ~8 jq schema middleware
internal/oidc 1 ~5 OIDC token provider
internal/proxy 6 ~30 GitHub API filtering proxy
internal/server 16 ~85 HTTP server, routing, session management
internal/strutil 4 7 Over-fragmented string utilities
internal/syncutil 1 ~5 Concurrency cache
internal/sys 2 ~5 System/Docker detection
internal/tracing 4 ~15 OpenTelemetry tracing
internal/tty 1 ~3 Terminal detection
internal/version 1 ~3 Version management

Identified Issues

1. Outlier: GitHub-Specific Tool Logic in Generic MCP Protocol Package

Severity: Medium
File: internal/mcp/collaborator_permission.go
Functions:

  • ParseCollaboratorPermissionArgs(argsMap map[string]interface{}) (owner, repo, username string, err error)
  • LogAndWrapCollaboratorPermission(body []byte, owner, repo, username string, statusCode int, logPrintf func(format string, args ...interface{})) interface{}
  • FetchCollaboratorPermission(ctx context.Context, ...) (interface{}, error)

Issue: The mcp package is described as containing "MCP protocol types with enhanced error logging." However, collaborator_permission.go implements GitHub-specific business logic for a single tool (get_collaborator_permission):

  • It constructs a GitHub REST API path (/repos/{owner}/{repo}/collaborators/{username}/permission)
  • It parses GitHub-specific JSON response fields ("permission")
  • It uses httputil.DoGitHubGET and httputil.ApplyGitHubAPIHeaders

This is application-layer GitHub integration code, not generic MCP protocol code. It was placed in mcp/ as a neutral shared location for server/unified.go and proxy/proxy.go, but that logic belongs in the GitHub HTTP utilities layer.

Current call sites:

  • internal/proxy/proxy.go:266,291mcp.ParseCollaboratorPermissionArgs, mcp.FetchCollaboratorPermission
  • internal/server/unified.go:281,294mcp.ParseCollaboratorPermissionArgs, mcp.FetchCollaboratorPermission

Recommendation: Move collaborator_permission.go to internal/httputil/ alongside github_http.go, which already contains ApplyGitHubAPIHeaders, DoGitHubGET, and ParseRateLimitResetHeader. Rename the file to httputil/collaborator_permission.go and update the package declaration to httputil. Update import paths in proxy/proxy.go and server/unified.go.

Estimated effort: 30–60 minutes
Benefits: Clearer package semantics — mcp stays generic protocol, httputil owns all GitHub HTTP interaction helpers


2. Over-Fragmented strutil Package

Severity: Low
Files: internal/strutil/deduplicate.go, internal/strutil/format_duration.go, internal/strutil/random_hex.go, internal/strutil/truncate.go

Issue: The strutil package contains 7 public functions spread across 4 files. Each file is 29–49 lines. There is no logical boundary that justifies four separate files:

File Lines Functions
deduplicate.go 29 DeduplicateStrings
format_duration.go 37 FormatFutureTime, FormatDuration
random_hex.go 49 randomHexFromReader, RandomHex, RandomHexWithFallback
truncate.go 46 Truncate, TruncateWithSuffix, TruncateRunes

All functions are independent general-purpose string/value utilities with no shared state or imports between them. The package already has a clear name (strutil), so one or two files would suffice.

Recommendation: Consolidate into two files:

  • strutil.goTruncate*, DeduplicateStrings, Format* (pure string/time utilities)
  • random.goRandomHex* functions (crypto-dependent, keeps crypto/rand import isolated)

Or simply consolidate all into strutil.go if preferred.

Note: Each function file has a corresponding *_test.go — test files should stay as-is or be merged into strutil_test.go.

Estimated effort: 1 hour (including test file consolidation)
Benefits: Simpler package navigation, fewer files to open when looking for a string utility


3. Context Key Management Scattered Across Packages (Low Priority)

Severity: Low
Issue: Context key getter/setter pairs follow the same pattern but are split across three packages, making it harder to find all context keys used in a request lifecycle:

Package Function Context Key
guard GetAgentIDFromContext / SetAgentIDInContext "difc-agent-id"
guard GetRequestStateFromContext / SetRequestStateInContext "difc-request-state"
server SessionIDFromContext / injectSessionContext sessionIDContextKey
mcp GetAgentTagsSnapshotFromContext agentTagsSnapshotKey

Note: This scatter is largely unavoidable in Go due to import cycle constraints (e.g., server imports guard, so guard cannot import server). However, the inconsistency in key naming ("difc-*" strings vs typed constants) is worth normalizing.

Recommendation: Ensure all context keys use typed constants (not raw strings) to prevent accidental key collisions. The guard package already uses type ContextKey string; ensure server and mcp follow the same typed-key pattern.

Estimated effort: 1–2 hours
Benefits: Reduced risk of context key collisions, consistent pattern

Clustering Results

Well-Organized Clusters ✅

  • Configuration validation (config/validation*.go): Split by concern (standard, env, schema, tracing) — clean
  • Guard WASM implementation (guard/wasm*.go): Split into wasm, wasm_parse, wasm_payload, wasm_validate — well separated
  • Logger sinks (logger/*.go): Each sink type in its own file with shared utilities in common.go — intentional and documented
  • CLI flags (cmd/flags*.go): Split by feature area (core, difc, launch, logging, tls) — good
  • Auth/HMAC middleware (server/auth.go, server/hmac.go): Both delegate to applyIfConfigured in server/middleware.go — no duplication
  • Response writer inheritance (server/response_writer.go embeds httputil.BaseResponseWriter) — correctly layered

Notable Absences

No significant duplicate function implementations were found. The logger package's "quad-function pattern" (4 log-level functions repeated across 3 logger types) is explicitly documented as intentional in common.go with rationale.

Refactoring Recommendations

Priority 1: Medium Impact

Move mcp/collaborator_permission.gohttputil/collaborator_permission.go

  • Files affected: internal/mcp/collaborator_permission.go (move), internal/proxy/proxy.go (update import), internal/server/unified.go (update import)
  • No logic changes needed, only package/import updates

Priority 2: Low Impact

Consolidate strutil files

  • Merge 4 source files into 1–2 files
  • Update/consolidate 4 test files accordingly
  • Zero functional changes

Normalize context key types

  • Ensure server and mcp packages use typed context key constants (like guard.ContextKey)

Implementation Checklist

  • Review mcp/collaborator_permission.go placement and decide on target package (httputil or a new internal/github/ package)
  • Move collaborator permission helpers and update import paths in proxy and server
  • Consolidate strutil files (optional, cosmetic)
  • Audit server and mcp context key definitions for type safety

Analysis Metadata

  • Total Go Files Analyzed: 129 (non-test, internal/ directory)
  • Total Functions/Methods Catalogued: 815
  • Packages Analyzed: 23
  • Outliers Found: 1 (semantic mismatch)
  • Over-Fragmented Packages: 1 (strutil)
  • Duplicate Implementations Detected: 0 significant
  • Detection Method: Naming pattern analysis + function grouping + cross-package usage analysis
  • Analysis Date: 2026-05-23

Generated by Semantic Function Refactoring · ● 1.9M ·

Metadata

Metadata

Assignees

No one assigned

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions