ipc4: add counter to mtrace buffer status notification #10564
@@ -1641,6 +1641,8 @@ __cold void ipc_send_panic_notification(void)
 
 #ifdef CONFIG_LOG_BACKEND_ADSP_MTRACE
 
+static atomic_t mtrace_notify_counter;
+
 static bool is_notification_queued(struct ipc_msg *msg)
 {
 	struct ipc *ipc = ipc_get();

@@ -1663,7 +1665,8 @@ void ipc_send_buffer_status_notify(void)
 		return;
 
 	msg_notify.header = SOF_IPC4_NOTIF_HEADER(SOF_IPC4_NOTIFY_LOG_BUFFER_STATUS);
-	msg_notify.extension = 0;
+	atomic_add(&mtrace_notify_counter, 1);
+	msg_notify.extension = atomic_read(&mtrace_notify_counter);
Comment on lines +1668 to +1669

Suggested change:
-	atomic_add(&mtrace_notify_counter, 1);
-	msg_notify.extension = atomic_read(&mtrace_notify_counter);
+	msg_notify.extension = atomic_inc_return(&mtrace_notify_counter);
Ack, atomic_add() will return the current value, which we should use in the extension.
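The difference between the two forms discussed above can be sketched as follows. This is a minimal illustration using C11 `<stdatomic.h>` as a stand-in for the firmware's `atomic_t` API; the names `notify_counter`, `next_count_two_step()` and `next_count_single_step()` are hypothetical and not part of the patch. Whether a given `atomic_add()` returns the old value, the new value, or nothing depends on the implementation, so that is worth checking before relying on it.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Stand-in for the firmware counter (hypothetical name). Objects with
 * static storage duration start at zero. */
static atomic_uint notify_counter;

/* Two-step pattern, as in the patch: increment, then read separately.
 * Another context may increment between the two calls, so two callers
 * could end up reporting the same value. */
static uint32_t next_count_two_step(void)
{
	atomic_fetch_add(&notify_counter, 1);
	return atomic_load(&notify_counter);
}

/* Single-step pattern: use the result of the read-modify-write itself.
 * C11 atomic_fetch_add() returns the previous value, so add 1 to get
 * the value produced by this caller. */
static uint32_t next_count_single_step(void)
{
	return atomic_fetch_add(&notify_counter, 1) + 1;
}
```

In single-threaded use both helpers return the same sequence; they only differ when increments from different contexts interleave.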
Why does the host need this? Has there been a situation where it received the same notification more than once?
I checked the documentation and there is no such iterator in this message. I think in the case of all notifications the extension field remains empty.
The host does not need this as such, but it helps with debugging:
Recently we again started to see IPC timeouts where there is a LOG_BUFFER_STATUS notification in the target IPC register. The same notification is present in the first timed-out message and in subsequent ones as well.
There is no way to tell whether we have received more than one notification or the FW is really locked up hard without enabling notification printing in the kernel, which is noisy.
I thought that it would be great sideband info to have in the failure logs.
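How such a counter could be used when reading failure logs can be sketched as below. This is only an illustration under assumptions: `struct notif_snapshot` and `notifications_between()` are hypothetical names, and the sketch assumes the full 32-bit extension word carries the counter, which the thread does not confirm.

```c
#include <stdint.h>

/* Hypothetical snapshot of the last notification seen in the target
 * IPC register at the time of a timeout. */
struct notif_snapshot {
	uint32_t header;    /* primary word of the notification */
	uint32_t extension; /* assumed to carry the firmware counter */
};

/* Number of notifications the firmware sent between two snapshots.
 * Unsigned subtraction keeps the result correct across a counter wrap.
 * A result of 0 means the same notification is still sitting there. */
static uint32_t notifications_between(const struct notif_snapshot *prev,
				      const struct notif_snapshot *cur)
{
	return (uint32_t)(cur->extension - prev->extension);
}
```

A result of 0 across consecutive timeouts would point towards the firmware no longer sending notifications, while a non-zero result would show that it was still producing log notifications in between.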
> The same notification is present in the first timed-out message and in subsequent ones as well.

Correct me if I misunderstood: HOST sends an IPC to DSP and doesn't get an ACK, and in the registers you can see that the last message from DSP is a LOG_BUFFER_STATUS notification. In the next test (I assume DSP/FW was reset) the situation repeats: again no ACK for the message sent from HOST to DSP, and in the registers you can see that the last message from DSP is a notification.
In such a situation, the state of the registers is important. Did DSP confirm receipt of the IPC through an ACK, and did HOST correctly receive the message (in this case the notification) from DSP?

> There is no way to tell whether we have received more than one notification or the FW is really locked up hard without enabling notification printing in the kernel, which is noisy.

For debugging purposes, one could count received notifications of a given type and print the counter at the end.
In FW there is an option to enable a counter of received and sent IPC messages (CONFIG_DEBUG_IPC_COUNTERS), but this requires recompilation.

> I thought that it would be great sideband info to have in the failure logs.

I understand why this could be helpful, but as @abonislawski wrote, when changing the structure of existing IPCs we must remember to maintain compatibility/compliance.
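The alternative suggested in this comment, counting received notifications per type on the receiving side and printing the totals at the end, could look roughly like the sketch below. All names here (`NOTIF_TYPE_MAX`, `notif_rx_count`, `notif_count_rx()`, `notif_dump_counts()`) are hypothetical; this is not the kernel driver's code and not the firmware's CONFIG_DEBUG_IPC_COUNTERS implementation.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define NOTIF_TYPE_MAX 16 /* hypothetical upper bound on notification types */

/* One receive counter per notification type. */
static uint32_t notif_rx_count[NOTIF_TYPE_MAX];

/* Call once for every notification received; unknown types are ignored. */
static void notif_count_rx(uint32_t type)
{
	if (type < NOTIF_TYPE_MAX)
		notif_rx_count[type]++;
}

/* Print only the types that were seen, e.g. at the end of a test run. */
static void notif_dump_counts(FILE *out)
{
	for (uint32_t t = 0; t < NOTIF_TYPE_MAX; t++) {
		if (notif_rx_count[t])
			fprintf(out, "notification type %" PRIu32 ": %" PRIu32 " received\n",
				t, notif_rx_count[t]);
	}
}
```

Compared with a counter carried in the message itself, this keeps the IPC layout unchanged but only reports what the receiver actually processed.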
> > The same notification is present in the first timed-out message and in subsequent ones as well.
>
> Correct me if I misunderstood: HOST sends an IPC to DSP and doesn't get an ACK, and in the registers you can see that the last message from DSP is a LOG_BUFFER_STATUS notification. In the next test (I assume DSP/FW was reset) the situation repeats: again no ACK for the message sent from HOST to DSP, and in the registers you can see that the last message from DSP is a notification.

We have seen cases where the firmware acked the message but did not reply, or did not ack it (and did not reply), and we had LOG_BUFFER_STATUS in the target IPC register. We don't know whether the firmware sent more log notifications or not, as we would need to recompile the firmware and/or reload the audio drivers to print the received log notifications (printing is disabled due to extensive noise).
Reproducing these is rare and can take thousands of iterations of a test run (which generates random use patterns).

> In such a situation, the state of the registers is important. Did DSP confirm receipt of the IPC through an ACK, and did HOST correctly receive the message (in this case the notification) from DSP?
>
> > There is no way to tell whether we have received more than one notification or the FW is really locked up hard without enabling notification printing in the kernel, which is noisy.
>
> For debugging purposes, one could count received notifications of a given type and print the counter at the end.
> In FW there is an option to enable a counter of received and sent IPC messages (CONFIG_DEBUG_IPC_COUNTERS), but this requires recompilation.

I see, it writes to the FW_REGISTERS area of the shared memory. I don't think we have this enabled in any configuration.

> > I thought that it would be great sideband info to have in the failure logs.
>
> I understand why this could be helpful, but as @abonislawski wrote, when changing the structure of existing IPCs we must remember to maintain compatibility/compliance.

Sure.
The atomic counter should be initialized explicitly to ensure a known starting state. Consider using ATOMIC_INIT(0) or an initialization function to set the initial value to 0.
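For context on the comment above: in C, objects with static storage duration are zero-initialized, so a file-scope `static` counter already starts at 0, and an explicit initializer mainly documents the intent. A minimal sketch, using C11 atomics as a stand-in for the firmware's `atomic_t` and `ATOMIC_INIT()`; the variable names are hypothetical:

```c
#include <stdatomic.h>

/* No initializer: static storage duration guarantees a zero start value. */
static atomic_uint implicit_counter;

/* Same start value, stated explicitly for the reader. */
static atomic_uint explicit_counter = 0;
```

Both counters read as 0 before the first increment.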