feat(anr): Profile main thread when ANR and report ANR profiles to Sentry#4899
feat(anr): Profile main thread when ANR and report ANR profiles to Sentry#4899
Conversation
|
🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
sentry-android-core/src/main/java/io/sentry/android/core/AnrV2Integration.java
Outdated
Show resolved
Hide resolved
sentry-android-core/src/main/java/io/sentry/android/core/anr/AnrProfileManager.java
Show resolved
Hide resolved
sentry-android-core/src/main/java/io/sentry/android/core/anr/AnrCulpritIdentifier.java
Show resolved
Hide resolved
sentry-android-core/src/main/java/io/sentry/android/core/anr/AggregatedStackTrace.java
Outdated
Show resolved
Hide resolved
Performance metrics 🚀
|
| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| a416a65 | 333.78 ms | 410.37 ms | 76.59 ms |
| d15471f | 342.08 ms | 415.44 ms | 73.35 ms |
| e59e22a | 329.74 ms | 383.31 ms | 53.57 ms |
| d15471f | 315.20 ms | 370.22 ms | 55.02 ms |
| d364ace | 382.77 ms | 443.21 ms | 60.44 ms |
| b3d8889 | 371.33 ms | 426.24 ms | 54.92 ms |
| a416a65 | 316.52 ms | 359.67 ms | 43.15 ms |
| b03edbb | 352.20 ms | 423.69 ms | 71.49 ms |
| d15471f | 304.55 ms | 408.43 ms | 103.87 ms |
| 319f256 | 315.96 ms | 372.96 ms | 57.00 ms |
App size
| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| a416a65 | 1.58 MiB | 2.12 MiB | 555.26 KiB |
| d15471f | 1.58 MiB | 2.13 MiB | 559.54 KiB |
| e59e22a | 1.58 MiB | 2.20 MiB | 635.34 KiB |
| d15471f | 1.58 MiB | 2.13 MiB | 559.54 KiB |
| d364ace | 1.58 MiB | 2.11 MiB | 539.75 KiB |
| b3d8889 | 1.58 MiB | 2.10 MiB | 535.07 KiB |
| a416a65 | 1.58 MiB | 2.12 MiB | 555.26 KiB |
| b03edbb | 1.58 MiB | 2.13 MiB | 557.32 KiB |
| d15471f | 1.58 MiB | 2.13 MiB | 559.54 KiB |
| 319f256 | 1.58 MiB | 2.19 MiB | 619.79 KiB |
Previous results on branch: markushi/feat/anr-profiling
Startup times
| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| 00299fd | 359.87 ms | 424.85 ms | 64.98 ms |
| c10e603 | 367.92 ms | 393.50 ms | 25.58 ms |
| eb7143a | 347.66 ms | 408.54 ms | 60.88 ms |
| 2cee1ab | 318.29 ms | 361.00 ms | 42.71 ms |
| fa76e86 | 274.32 ms | 349.63 ms | 75.31 ms |
| 4c0ffee | 314.94 ms | 377.79 ms | 62.86 ms |
| 83a9ec4 | 333.84 ms | 390.30 ms | 56.47 ms |
| fca8df8 | 326.79 ms | 379.69 ms | 52.90 ms |
| 31581b9 | 350.00 ms | 420.63 ms | 70.63 ms |
App size
| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| 00299fd | 1.58 MiB | 2.29 MiB | 723.50 KiB |
| c10e603 | 1.58 MiB | 2.29 MiB | 723.72 KiB |
| eb7143a | 1.58 MiB | 2.29 MiB | 724.12 KiB |
| 2cee1ab | 1.58 MiB | 2.29 MiB | 723.68 KiB |
| fa76e86 | 1.58 MiB | 2.29 MiB | 724.06 KiB |
| 4c0ffee | 1.58 MiB | 2.29 MiB | 723.67 KiB |
| 83a9ec4 | 1.58 MiB | 2.29 MiB | 723.99 KiB |
| fca8df8 | 1.58 MiB | 2.29 MiB | 723.68 KiB |
| 31581b9 | 1.58 MiB | 2.19 MiB | 624.94 KiB |
sentry-android-core/src/main/java/io/sentry/android/core/anr/AnrCulpritIdentifier.java
Show resolved
Hide resolved
…ntry-java into markushi/feat/anr-profiling
sentry-android-core/src/main/java/io/sentry/android/core/AnrV2Integration.java
Outdated
Show resolved
Hide resolved
sentry-android-core/src/main/java/io/sentry/android/core/AnrV2Integration.java
Outdated
Show resolved
Hide resolved
sentry-android-core/src/main/java/io/sentry/android/core/AnrV2Integration.java
Outdated
Show resolved
Hide resolved
sentry-android-core/src/main/java/io/sentry/android/core/anr/AnrStackTrace.java
Outdated
Show resolved
Hide resolved
sentry-android-core/src/main/java/io/sentry/android/core/anr/AnrProfilingIntegration.java
Show resolved
Hide resolved
|
@sentry review |
sentry-android-core/src/test/java/io/sentry/android/core/anr/AnrProfileRotationHelperTest.kt
Outdated
Show resolved
Hide resolved
sentry-android-core/src/test/java/io/sentry/android/core/anr/AnrProfileManagerTest.kt
Outdated
Show resolved
Hide resolved
sentry-android-core/src/main/java/io/sentry/android/core/AndroidOptionsInitializer.java
Show resolved
Hide resolved
| return; | ||
| } | ||
|
|
||
| if (options.isEnableAnrProfiling() && hasOnlySystemFrames(event)) { |
There was a problem hiding this comment.
wondering whether we should guard this behind isEnableAnrProfiling or actually just do it for all ANRs going forward? I think AEI also gives quite a lot of noise with just system frames, right?
There was a problem hiding this comment.
Yes, exactly - we should do this everywhere in the long run. I still would guard this right now - otherwise we'll have a breaking change in default behavior. I can create a follow up ticket for the next major.
There was a problem hiding this comment.
I think we've done this quite a lot in the past (breaking behaviour change that affects grouping) so I'm not opposed to doing this now and not wait for the next major 😅 If it improves things I think I'd rather do it sooner. But your call here (we could also wait to get some adoption and see how it performs before doing this for everyone)
| final StackTraceElement stackTraceElement = stack[0]; | ||
| final String message = | ||
| stackTraceElement.getClassName() + "." + stackTraceElement.getMethodName(); | ||
| final AnrException exception = new AnrException(message); |
There was a problem hiding this comment.
is the new AnrException type intentional here? Because it's a new type it will create new groups (even if the profile-derived stacktrace matches the one from AEI). Not sure if we should keep it as ApplicationNotResponding?
There was a problem hiding this comment.
Yes, I think I looked into this early on, but ApplicationNotResponding requires a non-null Thread object, which seemed to be widely used and hard to refactor. But that's a good point, let me check that again.
sentry-android-core/src/main/java/io/sentry/android/core/ApplicationExitInfoEventProcessor.java
Outdated
Show resolved
Hide resolved
| options); | ||
| chunk.setSentryProfile(profile); | ||
|
|
||
| final SentryId profilerId = Sentry.getCurrentScopes().captureProfileChunk(chunk); |
There was a problem hiding this comment.
just curious, but is there a way to send both the ANR event and the profile chunk in the same envelope?
There was a problem hiding this comment.
claude says yes, this should be valid for the backend😅
Yes, this works. The client can send both in the same envelope.
During split_envelope() at relay-server/src/services/processor.rs:357-365, Relay splits them apart before any event-type routing:
- ProfileChunk items are extracted into their own ProcessingGroup::ProfileChunk envelope
- The remaining error event goes into ProcessingGroup::Error
- Both are processed independently through their respective pipelines
We'd need to attach the profile directly to the event (or via hint) and then do something similar as we do for e.g. attachments (
sentry-java/sentry/src/main/java/io/sentry/SentryClient.java
Lines 422 to 432 in abf451a
There was a problem hiding this comment.
yeah, or replay_recording, too.
I guess this will be better in terms of not sending one without the other, but I don't have a strong preference right now, we can probably create a followup issue if you don't feel like doing it now.
sentry-android-core/src/main/java/io/sentry/android/core/anr/AnrProfilingIntegration.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.
sentry-android-core/src/main/java/io/sentry/android/core/anr/AnrProfilingIntegration.java
Outdated
Show resolved
Hide resolved
sentry-android-core/src/main/java/io/sentry/android/core/ApplicationExitInfoEventProcessor.java
Show resolved
Hide resolved
| threadMetadata.setName(MAIN_THREAD_NAME); | ||
| threadMetadata.setPriority(Thread.NORM_PRIORITY); | ||
|
|
||
| final @NotNull Map<String, SentryThreadMetadata> threadMetadataMap = new HashMap<>(); |
There was a problem hiding this comment.
I guess this could've been Collections.singletonMap()
sentry-android-core/src/main/java/io/sentry/android/core/anr/AnrProfileRotationHelper.java
Show resolved
Hide resolved
sentry-android-core/src/main/java/io/sentry/android/core/anr/AnrProfileRotationHelper.java
Outdated
Show resolved
Hide resolved
sentry-android-core/src/main/java/io/sentry/android/core/anr/AnrProfile.java
Outdated
Show resolved
Hide resolved
sentry-android-core/src/main/java/io/sentry/android/core/anr/AnrProfileManager.java
Show resolved
Hide resolved
sentry-android-core/src/main/java/io/sentry/android/core/anr/AnrProfileManager.java
Outdated
Show resolved
Hide resolved
sentry-android-core/src/main/java/io/sentry/android/core/anr/AnrProfileManager.java
Show resolved
Hide resolved
| // common Java and Android packages who are less relevant for being the actual culprit | ||
| private static final List<String> systemAndFrameworkPackages = new ArrayList<>(9); | ||
|
|
||
| static { |
There was a problem hiding this comment.
not sure if it's worth adding some kotlin frames in here as well?
| "SentryAndroidOptions is required"); | ||
| this.logger = options.getLogger(); | ||
|
|
||
| if (((SentryAndroidOptions) options).isEnableAnrProfiling()) { |
There was a problem hiding this comment.
| if (((SentryAndroidOptions) options).isEnableAnrProfiling()) { | |
| if (this.options.isEnableAnrProfiling()) { |
since we already cast it above
| try { | ||
| currentFile.renameTo(oldFile); | ||
| } catch (Throwable e) { | ||
| // ignored | ||
| } |
There was a problem hiding this comment.
Bug: The return value of File.renameTo() is not checked, causing silent failures in ANR profile rotation and leading to a permanently inconsistent state.
Severity: MEDIUM
Suggested Fix
Check the boolean return value of currentFile.renameTo(oldFile). If it returns false, log a warning and do not set shouldRotate to false. This will allow the rotation to be re-attempted on the next trigger, preventing a permanently broken state.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location:
sentry-android-core/src/main/java/io/sentry/android/core/anr/AnrProfileRotationHelper.java#L44-L48
Potential issue: The `renameTo()` method is used to rotate ANR profile files, but its
boolean return value, which indicates success or failure, is ignored. The code is
wrapped in a `try-catch (Throwable)` block that also silences any potential issues.
After the call, `shouldRotate` is unconditionally set to `false`. If `renameTo()` fails
for any reason (e.g., file permissions, cross-filesystem move), the rotation will
silently fail, but the system will act as if it succeeded. This leads to a permanently
inconsistent state where future rotations are blocked, potentially causing ANR profile
data loss or corruption as both profiling integrations might access the same file.
| try (final @NotNull ISentryLifecycleToken ignored = profileManagerLock.acquire()) { | ||
| final @Nullable AnrProfileManager p = profileManager; | ||
| if (p != null) { | ||
| p.close(); |
There was a problem hiding this comment.
hm, I think this one does I/O under the hood right (raf.close())? We should probably move it out of the lock here (since we capture it in the p variable) and also offload to a bg thread perhaps?
| oldThread.interrupt(); | ||
| } | ||
|
|
||
| final @NotNull Thread profilingThread = new Thread(this, "AnrProfilingIntegration"); |
There was a problem hiding this comment.
can we actually reuse the thread as opposed to creating a new one on every foreground? like checking the inForeground flag inside the run method in addition to isInterrupted?
also not sure if it makes any difference on android as to whether to make it isDaemon or not
| // get main thread Handler so we can post messages | ||
| final Looper mainLooper = Looper.getMainLooper(); | ||
| final Thread mainThread = mainLooper.getThread(); | ||
| final Handler mainHandler = new Handler(mainLooper); |
There was a problem hiding this comment.
I guess we could memoize this and capture it in register if anr profiling is enabled? Would avoid allocating a new handler on every run
| static final int MAX_NUM_STACKS = (int) (10_000 / POLLING_INTERVAL_MS); | ||
|
|
||
| private final AtomicBoolean enabled = new AtomicBoolean(true); | ||
| private final Runnable updater = () -> lastMainThreadExecutionTime = SystemClock.uptimeMillis(); |
There was a problem hiding this comment.
I realized we use AndroidCourrentDateProvider instead of SystemClock.uptimeMillis directly, but I'm fine with using systemclock directly since you can override it in tests easily
| final long duration = SystemClock.uptimeMillis() - start; | ||
| if (logger.isEnabled(SentryLevel.DEBUG)) { |
There was a problem hiding this comment.
| final long duration = SystemClock.uptimeMillis() - start; | |
| if (logger.isEnabled(SentryLevel.DEBUG)) { | |
| if (logger.isEnabled(SentryLevel.DEBUG)) { | |
| final long duration = SystemClock.uptimeMillis() - start; |
There was a problem hiding this comment.
btw, does mainThread.getStackTrace() affects the main thread? or is this log just for debugging purposes?
| if (cacheDirPath == null) { | ||
| throw new IllegalStateException("cacheDirPath is required for ANR profiling"); | ||
| } | ||
| final @NotNull File currentFile = |
There was a problem hiding this comment.
alright, I think I found a way this will cause ANRs (similar to replay) - if close() is called from the main thread, we'll be holding the lock, awaiting on the operations here to finish, which can take a while due to I/O in getFileForRecording and AnrProfileManager.<init>.
I guess to make it non-blocking we'd still have to make QueueFile instantiation lazy and move getFileForRecording inside the lazy block, wdyt?
romtsn
left a comment
There was a problem hiding this comment.
Great work already! I believe there are some things to address but I can check once more after that
|
|
||
| final Mechanism mechanism = new Mechanism(); | ||
| mechanism.setType("ANR"); | ||
| final ExceptionMechanismException error = |
There was a problem hiding this comment.
I think this one turns to be unused, shall we pass it to getSentryExceptions?
📜 Description
Adds ANR (Application Not Responding) profiling integration that profiles the main thread when an ANR is detected and reports the captured profiles to Sentry.
Key Changes:
AnrProfilingIntegrationto capture profiles during ANR eventsAnrV2Integrationnow takes care of matching and capturing the profile on the next start💡 Motivation and Context
This feature enables better ANR diagnostics by capturing profiling data at the time of ANR detection, allowing developers to identify performance bottlenecks and problematic code paths causing application hangs.
Example event: https://sentry-sdks.sentry.io/issues/7229210096/events/4598ff6fcc0f402d8ecca615005e7f64/
💚 How did you test it?
📝 Checklist
sendDefaultPIIis enabled.🔮 Next steps