Skip to content

cputime optimize for arm64#3247

Open
dwh110 wants to merge 1 commit intoapache:masterfrom
dwh110:cputime_opt
Open

cputime optimize for arm64#3247
dwh110 wants to merge 1 commit intoapache:masterfrom
dwh110:cputime_opt

Conversation

@dwh110
Copy link

@dwh110 dwh110 commented Mar 23, 2026

What problem does this PR solve?

Issue Number: n/a

Problem Summary: cpugettime optimize for ARM64

What is changed and the side effects?

Changed:
detail::invariant_cpu_freq value get by CNTFRQ_EL0

Side effects:

  • Performance effects:

  • Breaking backward compatibility: 无


Check List:

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to optimize cpuwide_time_ns() on ARM64 by using the architectural counter frequency (CNTFRQ_EL0) to populate detail::invariant_cpu_freq, shifting initialization away from per-call logic.

Changes:

  • On __aarch64__, route cpuwide_time_ns() to use the cycle-counter path instead of the clock_gettime path.
  • Move initialization of detail::invariant_cpu_freq into a pre-main initializer, using CNTFRQ_EL0 on ARM64.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File Description
src/butil/time.h Changes cpuwide_time_ns() preprocessor gating to include ARM64 in the invariant-counter fast path and removes lazy init recursion.
src/butil/time.cpp Adds a pre-main initializer to set detail::invariant_cpu_freq, reading CNTFRQ_EL0 on ARM64.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@@ -301,11 +301,6 @@ inline int64_t cpuwide_time_ns() {
} else if (!cpu_freq) {
// Lack of necessary features, return system-wide monotonic time instead.
return monotonic_time_ns();
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cpuwide_time_ns() can fall through without returning a value when detail::invariant_cpu_freq is negative (e.g., still -1 before init_invariant_cpu_freq() runs, or if initialization fails and leaves it < 0). This is undefined behavior. Please ensure the function always returns a value for the < 0 case (e.g., treat it like the 0 case and return monotonic_time_ns(), or perform a safe one-time initialization here).

Suggested change
return monotonic_time_ns();
return monotonic_time_ns();
} else {
// Negative frequency (e.g. uninitialized or failed initialization),
// fall back to system-wide monotonic time to avoid undefined behavior.
return monotonic_time_ns();

Copilot uses AI. Check for mistakes.
Comment on lines 155 to +167
int64_t invariant_cpu_freq = -1;

static void __attribute__((constructor)) init_invariant_cpu_freq() {
int64_t baseFreq = -1;
#if defined(__aarch64__)
__asm__ __volatile__("mrs %0, CNTFRQ_EL0" : "=r"(baseFreq));
#else
baseFreq = detail::read_invariant_cpu_frequency();
#endif

detail::invariant_cpu_freq = baseFreq;
}

Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using __attribute__((constructor)) in a C++ TU is compiler-specific and can be problematic for portability/toolchains. This repo already has BAIDU_GLOBAL_INIT() (which avoids the attribute in C++) and many places use pthread_once/function-local statics for one-time init; consider switching to an established mechanism so initialization is reliable across supported compilers.

Suggested change
int64_t invariant_cpu_freq = -1;
static void __attribute__((constructor)) init_invariant_cpu_freq() {
int64_t baseFreq = -1;
#if defined(__aarch64__)
__asm__ __volatile__("mrs %0, CNTFRQ_EL0" : "=r"(baseFreq));
#else
baseFreq = detail::read_invariant_cpu_frequency();
#endif
detail::invariant_cpu_freq = baseFreq;
}
static int64_t init_invariant_cpu_freq() {
int64_t baseFreq = -1;
#if defined(__aarch64__)
__asm__ __volatile__("mrs %0, CNTFRQ_EL0" : "=r"(baseFreq));
#else
baseFreq = detail::read_invariant_cpu_frequency();
#endif
return baseFreq;
}
int64_t invariant_cpu_freq = init_invariant_cpu_freq();

Copilot uses AI. Check for mistakes.
Comment on lines +161 to +166
#else
baseFreq = detail::read_invariant_cpu_frequency();
#endif

detail::invariant_cpu_freq = baseFreq;
}
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

init_invariant_cpu_freq() runs unconditionally at load time and, on non-aarch64, reads /proc/cpuinfo even in builds where cpuwide_time_ns() uses clock_gettime and never consults detail::invariant_cpu_freq (i.e., !BAIDU_INTERNAL && !__aarch64__). This can add avoidable startup I/O/latency. Consider gating initialization behind the same preprocessor conditions as the fast-path usage, or lazily initializing on first use.

Copilot uses AI. Check for mistakes.
Comment on lines +158 to +165
int64_t baseFreq = -1;
#if defined(__aarch64__)
__asm__ __volatile__("mrs %0, CNTFRQ_EL0" : "=r"(baseFreq));
#else
baseFreq = detail::read_invariant_cpu_frequency();
#endif

detail::invariant_cpu_freq = baseFreq;
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Local variable name baseFreq doesn’t match the surrounding naming style in this file (which predominantly uses snake_case like invariant_tsc, seen_decpoint, endp). Renaming to something like base_freq (and ideally indicating units, e.g. _hz) would improve consistency and readability.

Suggested change
int64_t baseFreq = -1;
#if defined(__aarch64__)
__asm__ __volatile__("mrs %0, CNTFRQ_EL0" : "=r"(baseFreq));
#else
baseFreq = detail::read_invariant_cpu_frequency();
#endif
detail::invariant_cpu_freq = baseFreq;
int64_t base_freq_hz = -1;
#if defined(__aarch64__)
__asm__ __volatile__("mrs %0, CNTFRQ_EL0" : "=r"(base_freq_hz));
#else
base_freq_hz = detail::read_invariant_cpu_frequency();
#endif
detail::invariant_cpu_freq = base_freq_hz;

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants