Skip to content
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 4 additions & 1 deletion src/ipc/ipc4/handler.c
Original file line number Diff line number Diff line change
Expand Up @@ -1641,6 +1641,8 @@ __cold void ipc_send_panic_notification(void)

#ifdef CONFIG_LOG_BACKEND_ADSP_MTRACE

static atomic_t mtrace_notify_counter;
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The atomic counter should be initialized explicitly to ensure a known starting state. Consider using ATOMIC_INIT(0) or an initialization function to set the initial value to 0.

Suggested change
static atomic_t mtrace_notify_counter;
static atomic_t mtrace_notify_counter = ATOMIC_INIT(0);

Copilot uses AI. Check for mistakes.

static bool is_notification_queued(struct ipc_msg *msg)
{
struct ipc *ipc = ipc_get();
Expand All @@ -1663,7 +1665,8 @@ void ipc_send_buffer_status_notify(void)
return;

msg_notify.header = SOF_IPC4_NOTIF_HEADER(SOF_IPC4_NOTIFY_LOG_BUFFER_STATUS);
msg_notify.extension = 0;
atomic_add(&mtrace_notify_counter, 1);
msg_notify.extension = atomic_read(&mtrace_notify_counter);
Comment on lines +1668 to +1669
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using separate atomic_add() and atomic_read() operations creates a potential race condition. Between these two operations, another thread could modify the counter, causing the extension field to contain a different value than what was incremented. Use atomic_inc_return() or atomic_fetch_add() instead to atomically increment and read the value in a single operation.

Suggested change
atomic_add(&mtrace_notify_counter, 1);
msg_notify.extension = atomic_read(&mtrace_notify_counter);
msg_notify.extension = atomic_inc_return(&mtrace_notify_counter);

Copilot uses AI. Check for mistakes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack atomic_add() will return the current value which we should use in the extension.

Comment on lines 1667 to +1669
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does the host need this? Has there been a situation where it received the same notification more than once?
I checked the documentation and there is no such iterator in this message. I think in the case of all notifications the extension field remains empty.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The host does not need this as such, but helps with debugging:
Recently we again started to see IPC timeouts where there is a LOG_BUFFER_STATUS notification in the target IPC register. The same notification present in the first timed out message and in consequent ones as well.
There is no way to say that we have received more than one notification or the fw is really locked up hard without enabling printing for notification in kernel, which is noisy.
I thought that it might be a great sideband info to see this in fail logs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same notification present in the first timed out message and in consequent ones as well.

Correct me if I misunderstood: HOST sends IPC to DSP and doesn't get ACK, in the registers you can see that the last message from DSP is LOG_BUFFER_STATUS notification. In the next test (I assume DSP/FW was reset) the situation repeats, again no ACK for the message sent from HOST to DSP and in the registers you can see that the last message from DSP is a notification.

In such a situation, the state of registers is important. Did DSP confirm receipt of IPC through ACK and did HOST correctly receive the message (in this case notification) from DSP.

There is no way to say that we have received more than one notification or the fw is really locked up hard without enabling printing for notification in kernel, which is noisy.

For debugging purposes, one could count received notifications of a given type and print the counter at the end.

In FW there is an option to enable a counter of received and sent IPC messages (CONFIG_DEBUG_IPC_COUNTERS), but this requires recompilation.

I thought that it might be a great sideband info to see this in fail logs.

I understand why this could be helpful, but as @abonislawski wrote, in the case of changes in the structure of existing IPCs we must remember to maintain compatibility/compliance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same notification present in the first timed out message and in consequent ones as well.

Correct me if I misunderstood: HOST sends IPC to DSP and doesn't get ACK, in the registers you can see that the last message from DSP is LOG_BUFFER_STATUS notification. In the next test (I assume DSP/FW was reset) the situation repeats, again no ACK for the message sent from HOST to DSP and in the registers you can see that the last message from DSP is a notification.

We have seen cases when firmware acked the message, but did not replied or not acked it (and did not replied) and we had LOG_BUFFER_STATUS in target IPC register. We don't know if firmware sent more log notification or not as we would need to recompile the firmware and/or reload the audio drivers to print the received log notifications (it is disabled due to extensive noise).
The reproduction of these are rare and can take thousands of iterations of a test run (which is generating random use patterns).

In such a situation, the state of registers is important. Did DSP confirm receipt of IPC through ACK and did HOST correctly receive the message (in this case notification) from DSP.

There is no way to say that we have received more than one notification or the fw is really locked up hard without enabling printing for notification in kernel, which is noisy.

For debugging purposes, one could count received notifications of a given type and print the counter at the end.

In FW there is an option to enable a counter of received and sent IPC messages (CONFIG_DEBUG_IPC_COUNTERS), but this requires recompilation.

I see, it is writing to FW_REGISTERS area of the shared memory, I don't think we have this enabled in any configuration.

I thought that it might be a great sideband info to see this in fail logs.

I understand why this could be helpful, but as @abonislawski wrote, in the case of changes in the structure of existing IPCs we must remember to maintain compatibility/compliance.

Sure.

msg_notify.tx_size = 0;

tr_dbg(&ipc_tr, "tx-notify\t: %#x|%#x", msg_notify.header, msg_notify.extension);
Expand Down
Loading