Conversation

@namanjain7
Contributor

code cleanup for ais25ba sensor

@allen-kim-sec allen-kim-sec left a comment

sample_cc is very important information about the sample position.
Please keep it.

@allen-kim-sec

@zhongnuo-tang

Hi. In my tests, I could not catch the "0000" write with a write breakpoint in Trace32.
Naman and I can reproduce the same fault with the previous TDM PR: #7026.

Please also try to find what writes the 0000 data (the result is the same: 0000 is written over 64 bytes (4 x 2 bytes x 8 samples)).

Two interrupts are asserted, and the interval between them is very short. I guess the TDM data is received continuously:
the Prime GDMA page completes first and the Extension GDMA page next.

  1. 0000 Corruption Session
    Q#1: Is it correct that the Prime GDMA page completes when 6 channels of data have been received, and the Extension page completes when 8 channels have been received?
    If so, the Prime GDMA interrupt occurs first and the Extension interrupt next. The interval between the two is only (2/8) * 62.5 usec = 15.625 usec.

  2. Stability Session
    Q#2: Is it safe?
    The OS masks further interrupts once it reaches the Prime GDMA interrupt. The current code uses the same interrupt handler for both interrupts.
    If the first interrupt is being handled and another interrupt arrives before the masking of further interrupts is complete,
    a handler shared by both interrupts must be safe against re-entry. However, the current structure is not safe.
    Therefore, I recommend changing the approach so that each interrupt uses its own handler.
    The current structure is also not suitable for a multi-core environment.

Q#3: Another idea
If more time is required for complete BSP support of TDM, I have another suggestion.
First, it is essential to thoroughly understand the TDM protocol, particularly how channel 1 is determined.
Channel 1 begins at the rising edge of Frame Sync.
Fortunately, although AIS25BA uses 8-channel TDM, only channels 1 to 3 carry valid information.
Therefore, if the TDM channel count is set to 4 instead of 8,
a single GDMA page (with no Ext GDMA required) will complete when 4 channels are finished,
and the interrupt will be executed stably. This is expected to make the current BSP interrupt handler stable.
Additionally,
I expect that when Frame Sync rises again, the operation will restart from channel 1.
What do you think about this approach? I would like to hear your opinion.
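
For reference, the arithmetic behind Q#1 and the 64-byte figure can be checked with a small sketch. The function names are hypothetical, and the 6-channel and 8-channel completion points are the assumption being asked about in Q#1, not a confirmed hardware behavior. A 62.5 usec frame period corresponds to a 16 kHz frame rate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Gap in microseconds between two DMA-done events inside one TDM frame.
 * frame_us:    frame period (62.5 usec in the comment above)
 * slots:       slots per frame (8)
 * first_done:  slot count at which the first page completes (assumed 6)
 * second_done: slot count at which the second page completes (assumed 8) */
double tdm_irq_gap_us(double frame_us, int slots, int first_done, int second_done)
{
    assert(slots > 0 && first_done <= second_done && second_done <= slots);
    return frame_us * (double)(second_done - first_done) / (double)slots;
}

/* Bytes covered by `channels` 16-bit slots over `samples` frames. */
size_t tdm_span_bytes(size_t channels, size_t samples)
{
    return channels * sizeof(uint16_t) * samples;
}
```

With the values from the comment, `tdm_irq_gap_us(62.5, 8, 6, 8)` gives the 15.625 usec gap and `tdm_span_bytes(4, 8)` gives the 64 bytes.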

@zhongnuo-tang
Contributor

Hi, I think it would be best if we could use 4-channel TDM, which eliminates the complexity of INT & EXT DMA. However, I need to check why 8 channels were used in the first place. Let me check before replying to you.

@allen-kim-sec

Hmm... 4ch is not suitable because it needs acquisition time for the next sample.

@zhongnuo-tang
Contributor

I see. I am thinking of using only the EXT DMA done ISR as the callback trigger point.
The last valid data is inside the EXT DMA page, and the callback handler processes both DMA pages at the same time, so the DMA page sequence stays in sync. To prevent the callback from being triggered twice, we use only EXT DMA done as the trigger point. What do you think?

I just checked: if 8ch is required, it is mandatory to have both Prime and EXT DMA due to data interleaving. Using a single DMA for 8ch will cause data loss.
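
The "EXT DMA done as the only trigger" idea can be sketched as follows. This is a minimal illustration under assumptions, not the Realtek BSP code: all names are hypothetical, and the `callbacks` counter stands in for scheduling the real driver callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical receive state; not taken from the BSP. */
struct tdm_rx_state {
    bool prime_ready;      /* a Prime page completed and is waiting for EXT */
    unsigned callbacks;    /* times the driver callback would have been run */
    unsigned out_of_sync;  /* EXT-done events seen without a Prime page */
};

/* Prime GDMA done: only record that the page is ready, never notify. */
void tdm_prime_done(struct tdm_rx_state *st)
{
    assert(st != NULL);
    st->prime_ready = true;
}

/* EXT GDMA done: the last valid data has arrived, so this is the single
 * point where both pages are handed to the driver together.
 * Returns true when the callback was triggered. */
bool tdm_ext_done(struct tdm_rx_state *st)
{
    assert(st != NULL);
    if (!st->prime_ready) {
        st->out_of_sync++;  /* skip instead of pairing with stale data */
        return false;
    }
    st->prime_ready = false;
    st->callbacks++;
    return true;
}
```

With this shape, one Prime-done followed by one EXT-done yields exactly one callback, and a stray EXT-done is counted instead of triggering a second one. It does not by itself address the shared-handler re-entry concern raised in Q#2.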

@allen-kim-sec

allen-kim-sec commented Nov 4, 2025

@namanjain7
please enhance the trailing-zero detection logic

int last_fault = -1;

/* Print each frame and flag runs of frames that look like zero corruption. */
static void print_sensor_data(sensor_data_s *data)
{
        for (int i = 0; i < AIS25BA_DMA_BUFF_SAMPLE_NUMBER; i++) {
                if (data[i].samples[3] == 0xffff) {
                        /* samples[3] is ffff: count only if samples[4] and samples[5] are zero */
                        if (data[i].samples[4] == 0 && data[i].samples[5] == 0) {
                                trailing_zeroes++;
                        } else {
                                trailing_zeroes = 0;
                        }
                } else if (data[i].samples[3] == 0x0) {
                        /* samples[3] is zero: count only if samples[0] and samples[1] are zero */
                        if (data[i].samples[0] == 0 && data[i].samples[1] == 0) {
                                trailing_zeroes++;
                        } else {
                                trailing_zeroes = 0;
                        }
                } else {
                        /* any other value in samples[3] also counts toward the run */
                        trailing_zeroes++;
                }

                /* Five suspicious frames in a row are reported as one fault */
                if (trailing_zeroes >= 5) {
                        total_number_of_continuous_zeroes++;
                        printf("Continuous trailing zeroes found!!!!! count: %d\n", total_number_of_continuous_zeroes);
                        last_fault = sample_cc;
                        trailing_zeroes = 0;
                }

                printf("%04d: %04x %04x %04x %04x %04x %04x %04x %04x\n", sample_cc,
                       data[i].samples[0], data[i].samples[1], data[i].samples[2], data[i].samples[3],
                       data[i].samples[4], data[i].samples[5], data[i].samples[6], data[i].samples[7]);

                sample_cc++;
        }
}
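
To make the run detection testable apart from the printing, the counters can be kept in a struct instead of globals. The sketch below is a simplified variant, not the exact logic of the snippet above: it assumes a frame is suspicious only when every slot after the three valid channels is zero, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TDM_CHANNELS   8  /* slots per frame, as in the logs */
#define VALID_CHANNELS 3  /* channels 1 to 3 carry sensor data */
#define ZERO_RUN_LIMIT 5  /* threshold used in the snippet above */

struct zero_run_detector {
    int run;          /* consecutive frames whose trailing slots are all zero */
    int faults;       /* number of times the run reached ZERO_RUN_LIMIT */
    int last_fault;   /* frame index of the most recent fault, -1 if none */
    int frame_index;  /* running frame counter (sample_cc in the snippet) */
};

void zero_run_init(struct zero_run_detector *d)
{
    assert(d != 0);
    d->run = 0;
    d->faults = 0;
    d->last_fault = -1;
    d->frame_index = 0;
}

/* Feed one frame; returns true when this frame completes a faulty run. */
bool zero_run_feed(struct zero_run_detector *d, const uint16_t frame[TDM_CHANNELS])
{
    bool all_zero = true;
    for (int ch = VALID_CHANNELS; ch < TDM_CHANNELS; ch++) {
        if (frame[ch] != 0) {
            all_zero = false;
            break;
        }
    }
    d->run = all_zero ? d->run + 1 : 0;

    bool fault = false;
    if (d->run >= ZERO_RUN_LIMIT) {
        d->faults++;
        d->last_fault = d->frame_index;
        d->run = 0;
        fault = true;
    }
    d->frame_index++;
    return fault;
}
```

Short runs of zero frames, such as the pairs visible in the logs, do not trip the detector; only five in a row do.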

@namanjain7 namanjain7 force-pushed the mems_sem_wait_fix_1 branch 2 times, most recently from d6e9dc6 to d869294 Compare November 4, 2025 07:03
@namanjain7
Contributor Author

@zhongnuo-tang Hello.
When I run
sensor recordsamples 100 --> multiple times, e.g. 10 times,
the I2S callback from the BSP to the driver stops being called, and it does not appear later in the logs either. Do you know why this happens?

0078: 0005 fec7 1f2e 0000 0000 0000 0000 0000
0079: 0006 fec6 1f2b ffff ffff ffff ffff ffff
0080: 0005 fec5 1f29 ffff ffff ffff ffff ffff
i2s_rx_schedule: Work is available                <<---- The three function should be executed like this.
i2s_rx_worker: Calling ais25ba driver callback <<--- 2
ais25ba_i2s_callback: callback is called 123      <<--- 3
0081: 0004 feca 1f26 0000 0000 0000 0000 0000
0082: 0004 fece 1f2e 0000 0000 0000 0000 0000
0083: 0007 fed2 1f3d ffff ffff ffff ffff ffff
0084: 0009 fed5 1f4b ffff ffff ffff ffff ffff
0085: 0009 fed3 1f53 ffff ffff ffff ffff ffff
0086: 0009 fecb 1f4b ffff ffff ffff ffff ffff
0087: 0008 fec5 1f40 0000 0000 0000 0000 0000
0088: 0005 fec2 1f35 ffff ffff ffff ffff ffff
0089: 0003 febf 1f2f ffff ffff ffff ffff ffff
0090: 0002 fec3 1f2e 0000 0000 0000 0000 0000
0091: 0003 fec6 1f2e 0000 0000 0000 0000 0000
0092: 0004 fec8 1f37 ffff ffff ffff ffff ffff
0093: 0006 fec7 1f46 0000 0000 0000 0000 0000
0094: 0006 fec6 1f4f ffff ffff ffff ffff ffff
i2s_rx_schedule: Work is available                   <<<----- The work is scheduled, but call #2 is not executed in bsp.
0095: 000a fec7 1f53 ffff ffff ffff ffff ffff
0096: 000b feca 1f4e 0000 0000 0000 0000 0000
0097: 000d fed2 1f41 ffff ffff ffff ffff ffff
0098: 000b fed5 1f3c 0000 0000 0000 0000 0000

@zhongnuo-tang
Contributor

sensor recordsamples 100

I think it might be because your printing takes some time.

I modified your code based on this PR: instead of printing all data, it prints '-' every 100 samples. With that change I can see the three steps
i2s_rx_schedule: Work is available
i2s_rx_worker: Calling ais25ba driver callback
ais25ba_i2s_callback: callback is called 123
even with many 'sensor recordsamples 100' commands spammed.

@namanjain7
Contributor Author

I got it. But even without the print, the application will spend that time processing something else; it also needs time to process the data. So if the I2S callback from the BSP to the driver does not arrive, what is the solution?

@zhongnuo-tang
Contributor

I think the callback to the app is actually made, but because there are too many logs you missed that log line. How about replacing the printing with a slight delay to simulate the data processing?
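
A minimal sketch of that experiment, assuming a POSIX `usleep()` is available: each sample costs a small configurable delay in place of the per-sample print, and a single '-' is printed every 100 samples. The function name, parameters, and interval macro are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

#define PROGRESS_INTERVAL 100  /* one '-' per 100 samples */

/* Consume `count` samples with a simulated processing delay.
 * total_seen persists across calls so marks stay aligned to every
 * 100th sample overall. Returns the number of marks printed. */
int consume_samples(int count, int *total_seen, unsigned int per_sample_delay_us)
{
    int marks = 0;
    assert(total_seen != NULL);
    for (int i = 0; i < count; i++) {
        if (per_sample_delay_us > 0) {
            usleep(per_sample_delay_us);  /* stands in for real processing */
        }
        (*total_seen)++;
        if (*total_seen % PROGRESS_INTERVAL == 0) {
            putchar('-');
            marks++;
        }
    }
    fflush(stdout);
    return marks;
}
```

Varying `per_sample_delay_us` then shows how much application-side processing time the callback path tolerates before work items start to be missed.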

@allen-kim-sec allen-kim-sec changed the title [DO NOT MERGE] [TDM] os/: Code cleanup, remove memory leaks [TDM] os/: Code cleanup, remove memory leaks Nov 5, 2025
@allen-kim-sec

@namanjain7 #7031 was merged; please change your base from the TDM branch.

@allen-kim-sec

@namanjain7,
I cannot find any mismatched or missing frame data.

  • Test: ran "sensor recordsamples 10240" and compared the LA values with the console log.
    How can this issue be reproduced?

10234: 0266 0004 1ea3 ffff ffff ffff ffff ffff
10235: 026a 000a 1ea8 0000 0000 0000 0000 0000
10236: 0275 0007 1eb0 0000 0000 0000 0000 0000
10237: 027c fffb 1eb5 ffff ffff ffff ffff ffff
10238: 0280 ffee 1eb6 0000 0000 0000 0000 0000
10239: 027c ffe9 1eb3 ffff ffff ffff ffff ffff
Total Count of trailing zeroes found 0

send_count++;
} while (status == -1 && get_errno() == EINTR);

lldbg("**** send count in mq_send: %d\n", send_count);
Contributor Author

Added this log to check send_count.
The value is sometimes 2.
But I am assuming that if status != OK, the buffer push failed and the buffer was not pushed to the mq.
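
A send count of 2 is what the retry loop produces when the first attempt is interrupted and the second succeeds. The sketch below shows this with a stub transport; the function-pointer type, the stub, and all names are hypothetical and only mirror the `do { ... } while (status == -1 && errno == EINTR)` shape from the diff.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical send hook: returns 0 on success, -1 with errno set on failure. */
typedef int (*send_fn_t)(void *ctx, const void *msg, size_t len);

/* Retry while the send is interrupted by a signal.
 * Returns the final status and stores the number of attempts. */
int send_retry_eintr(send_fn_t send, void *ctx, const void *msg, size_t len, int *attempts)
{
    int status;
    int count = 0;
    assert(send != NULL && attempts != NULL);
    do {
        status = send(ctx, msg, len);
        count++;
    } while (status == -1 && errno == EINTR);
    *attempts = count;
    return status;
}

/* Stub queue for illustration: the first call is interrupted, later calls succeed. */
struct stub_queue {
    int calls;
    int delivered;
};

int stub_send(void *ctx, const void *msg, size_t len)
{
    struct stub_queue *q = ctx;
    (void)msg;
    (void)len;
    if (q->calls++ == 0) {
        errno = EINTR;
        return -1;
    }
    q->delivered++;
    return 0;
}
```

In this model the interrupted attempt delivers nothing and the retry delivers the buffer once, so whether the buffer reached the queue is decided by the final status, not by the attempt count.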

// Clean up
ret = ioctl(mems_fd, SENSOR_STOP, NULL);
/* Clean up */
mq_unlink(MEMS_SENSOR_PATH);
Contributor Author

Added mq_unlink so that the driver no longer pushes buffers to the mq.
After this call, the application only removes buffers from the mq.

}
usleep(10000);
is_sensor_prepared = false;
while (mems_mq_receive(g_mems_mq, (FAR char *)&msg, sizeof(msg), &prio) == OK);
Contributor Author

We remove the buffers from the mq until it is empty,
then we close it.

@allen-kim-sec

allen-kim-sec commented Nov 17, 2025

`[
  {
    "file_path": "apps/examples/sensor_test/sensor_main.c",
    "code_link": "https://github.com/Samsung/TizenRT/blob/224221da4932356fbb4a27ca93d38bb2f923a9be/apps/examples/sensor_test/sensor_main.c#L312-L319",
    "comment": "The `collected_samples` memory may not be freed when errors occur in `mems_sensor_record_data()` function. When `SENSOR_START` fails, the `free(collected_samples)` call is missing, which can cause memory leaks.",
    "recommendation": "Add `free(collected_samples)` calls in all error paths to prevent memory leaks. For example:\n```c\nret = ioctl(mems_fd, SENSOR_START, NULL);\nif (ret != OK) {\n    printf(\"ERROR: MEMS sensor start failed, errno: %d\\n\", errno);\n    mems_teardown();\n    free(collected_samples);  // Add this\n    return ERROR;\n}\n```"
  },
  {
    "file_path": "os/drivers/sensors/ais25ba.c",
    "code_link": "https://github.com/Samsung/TizenRT/blob/224221da4932356fbb4a27ca93d38bb2f923a9be/os/drivers/sensors/ais25ba.c#L347-L350",
    "comment": "The `ais25ba_read()` function lacks timeout for semaphore wait, creating potential for infinite wait. In embedded systems, infinite waits can compromise system responsiveness.",
    "recommendation": "Implement timeout handling using `sem_tickwait()` as noted in TODO comments:\n```c\nret = sem_tickwait(&ctrl->read_sem, clock_systimer(), MSEC2TICK(405));\nif (ret == ETIMEDOUT) {\n    lldbg(\"Timeout for read_sem semaphore, previous read not finished\\n\");\n    goto error;\n}\n```"
  },
  {
    "file_path": "apps/examples/sensor_test/sensor_main.c",
    "code_link": "https://github.com/Samsung/TizenRT/blob/224221da4932356fbb4a27ca93d38bb2f923a9be/apps/examples/sensor_test/sensor_main.c#L448-L465",
    "comment": "The `sensor_read()` function opens file descriptor after `fcntl()` check but misses cleanup on failure. When `SENSOR_SET_SAMPRATE` fails, the `data` memory is not freed.",
    "recommendation": "Apply consistent error handling using `goto cleanup` pattern:\n```c\nret = ioctl(mems_fd, SENSOR_SET_SAMPRATE, AIS25BA_SAMPLE_RATE);\nif (ret != OK) {\n    printf(\"ERROR: Sensor Set sample rate failed\\n\");\n    goto cleanup;  // Use goto\n}\n// ... at end of function ...\ncleanup:\nfree(data);\nclose(mems_fd);\nreturn OK;\n```"
  }
]

Add trailing zeroes check, clean code and remove memory leaks
@namanjain7
Contributor Author

namanjain7 commented Nov 17, 2025

Fixed the issues.
The second point is not an issue: the wait without a timeout is kept intentionally to reproduce the callback issue from Realtek.
If I add sem_tickwait(), we will not be able to detect the issue easily and find a possible solution.
The first and third points are fixed; the second is kept intentionally and will be removed after Realtek fixes the issue.

@allen-kim-sec

Consider sem_tickwait() to handle the timeout case:

  • it prevents the deadlock case and allows inserting more checking routines
  • 1000 msec is enough
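
The suggestion names TizenRT's sem_tickwait(). As a portable sketch of the same idea, assuming a POSIX environment rather than the TizenRT kernel API, a bounded wait can be written with sem_timedwait(). The helper name `wait_with_timeout` is hypothetical; the driver would pass 1000 for the timeout suggested above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <semaphore.h>
#include <time.h>

/* Wait on `sem` for at most `timeout_ms` milliseconds.
 * Returns 0 when the semaphore was taken, ETIMEDOUT when the deadline
 * passed, or another errno value on failure. */
int wait_with_timeout(sem_t *sem, long timeout_ms)
{
    struct timespec deadline;
    assert(sem != NULL && timeout_ms >= 0);

    /* sem_timedwait() takes an absolute CLOCK_REALTIME deadline */
    if (clock_gettime(CLOCK_REALTIME, &deadline) != 0) {
        return errno;
    }
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    while (sem_timedwait(sem, &deadline) != 0) {
        if (errno == EINTR) {
            continue;  /* interrupted by a signal: retry with the same deadline */
        }
        return errno;  /* ETIMEDOUT or a real error */
    }
    return 0;
}
```

On timeout the caller can log the condition and run extra checks instead of blocking forever, which is the deadlock-prevention point made above.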
