Possible race during OP-TEE kernel module device probing

Shf Chen (陳少甫)

27 Mar 2026

9:01 a.m.

Hi,

We found a possible race condition while the OP-TEE kernel driver is probing the device. A NULL pointer dereference can occur when another kernel driver opens an OP-TEE context with tee_client_open_context() and then makes an SMC call to OP-TEE. Below is the exception:

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
Mem abort info:
  ESR = 0x0000000096000005
  EC = 0x25: DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
  FSC = 0x05: level 1 translation fault
Data abort info:
  ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
  CM = 0, WnR = 0, TnD = 0, TagAccess = 0
  GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
user pgtable: 4k pages, 39-bit VAs, pgdp=00000001026bb000
[0000000000000000] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
Workqueue: events_unbound deferred_probe_work_func
pstate: 03400005 (nzcv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
pc : optee_cq_wait_init+0x78/0x124 [optee]
lr : optee_cq_wait_init+0x60/0x124 [optee]
sp : ffffffc081fcb7f0
x29: ffffffc081fcb7f0 x28: 0000000000000000 x27: 0000000000001000
x26: ffffff8080e42c60 x25: ffffff8084d46040 x24: 0000000000000000
x23: 0000000000000000 x22: ffffffc081fcb8c0 x21: ffffffc081fcb8a8
x20: 0000000000000000 x19: ffffff8082741570 x18: ffffffe572f8ca00
x17: 00000000fa28650f x16: 00000000fa28650f x15: ffffff8084d47000
x14: 0000000000000000 x13: 0000000000000000 x12: 0000000084d47000
x11: 0000000000000000 x10: 0000000032000012 x9 : 04d4600000000001
x8 : ffffffc081fcb8c8 x7 : 0000000000000000 x6 : 000000000000003f
x5 : ffffff83c86649e0 x4 : 0000000000000008 x3 : 0000000000000000
x2 : ffffff80827415a0 x1 : 0000000000000000 x0 : ffffffc081fcb8c0
Call trace:
 optee_cq_wait_init+0x78/0x124 [optee f6dbc35f8d96acbe1f3c329d72018151d796208d]
 optee_smc_do_call_with_arg+0x12c/0x95c [optee f6dbc35f8d96acbe1f3c329d72018151d796208d]
 optee_shm_register+0x284/0x360 [optee f6dbc35f8d96acbe1f3c329d72018151d796208d]
 register_shm_helper+0x1a4/0x2f4 [tee cdd8a0d077d984bda1f31f9e8903836edbe46603]
 tee_shm_register_kernel_buf+0x60/0x90 [tee cdd8a0d077d984bda1f31f9e8903836edbe46603]
 cmdq_sec_allocate_wsm+0x58/0xc4 [mtk_cmdq_sec_mbox ac82dc958e32252a21c6ce55bb8839ebb288387a]
 cmdq_sec_probe+0x80/0x4a0 [mtk_cmdq_sec_mbox ac82dc958e32252a21c6ce55bb8839ebb288387a]

We found that optee->call_queue had not been initialized when our driver called into OP-TEE. There may be a problem with the initialization order of the device data structures in optee_probe() (drivers/tee/optee/smc_abi.c):

---
	rc = tee_device_register(optee->teedev); // <----- TEE device register here
	if (rc)
		goto err_unreg_supp_teedev;

	rc = tee_device_register(optee->supp_teedev);
	if (rc)
		goto err_unreg_supp_teedev;

	optee_cq_init(&optee->call_queue, thread_count); // <----- Some data structures are initialized afterwards
	optee_supp_init(&optee->supp);
	optee->smc.memremaped_shm = memremaped_shm;
	optee->pool = pool;
	optee_shm_arg_cache_init(optee, arg_cache_flags);
	mutex_init(&optee->rpmb_dev_mutex);
---

Should the data structure initialization be done before the TEE device registration?

Best regards, Shao-Fu Chen

Sumit Garg

27 Mar 2026

10:01 a.m.

On Fri, Mar 27, 2026 at 09:01:15AM +0000, Shf Chen (陳少甫) via OP-TEE wrote:

> Hi,
>
> We found a possible race condition while the OP-TEE kernel driver is probing the device. A NULL pointer dereference can occur when another kernel driver opens an OP-TEE context with tee_client_open_context() and then makes an SMC call to OP-TEE. Below is the exception:

How is your kernel driver being probed? Have you registered it as a proper TEE bus driver? Any custom ways to invoke TEE kernel client APIs are surely susceptible to races like the one you mentioned below.

-Sumit

Shf Chen (陳少甫)

30 Mar 2026

5:06 a.m.

On Fri, 2026-03-27 at 15:31 +0530, Sumit Garg wrote:

> On Fri, Mar 27, 2026 at 09:01:15AM +0000, Shf Chen (陳少甫) via OP-TEE wrote:
>
> > Hi,
> >
> > We found a possible race condition while the OP-TEE kernel driver is probing the device. A NULL pointer dereference can occur when another kernel driver opens an OP-TEE context with tee_client_open_context() and then makes an SMC call to OP-TEE. Below is the exception:
>
> How is your kernel driver being probed? Have you registered it as a proper TEE bus driver? Any custom ways to invoke TEE kernel client APIs are surely susceptible to races like the one you mentioned below.
>
> -Sumit

Hi,

Our kernel driver is not registered as a TEE bus driver, for various reasons, so we have to use another API to make sure the OP-TEE driver has been probed in order to work around this issue. However, we think this is error-prone for module developers who need to call into OP-TEE.

Although this is not our case, I think that once tee_device_register() has been called, the OP-TEE device appears under `/dev/`. There is a very small chance that a userspace program could trigger this race condition, and it has no way of knowing whether the device has been fully probed yet.

Best regards, Shao-Fu Chen

Jens Wiklander

10:16 a.m.

Hi,

On Mon, Mar 30, 2026 at 6:07 AM Shf Chen (陳少甫) via OP-TEE op-tee@lists.trustedfirmware.org wrote:

> On Fri, 2026-03-27 at 15:31 +0530, Sumit Garg wrote:
>
> > On Fri, Mar 27, 2026 at 09:01:15AM +0000, Shf Chen (陳少甫) via OP-TEE wrote:
> >
> > > Hi,
> > >
> > > We found a possible race condition while the OP-TEE kernel driver is probing the device. A NULL pointer dereference can occur when another kernel driver opens an OP-TEE context with tee_client_open_context() and then makes an SMC call to OP-TEE. Below is the exception:
> >
> > How is your kernel driver being probed? Have you registered it as a proper TEE bus driver? Any custom ways to invoke TEE kernel client APIs are surely susceptible to races like the one you mentioned below.
> >
> > -Sumit
>
> Hi,
>
> Our kernel driver is not registered as a TEE bus driver, for various reasons, so we have to use another API to make sure the OP-TEE driver has been probed in order to work around this issue. However, we think this is error-prone for module developers who need to call into OP-TEE.
>
> Although this is not our case, I think that once tee_device_register() has been called, the OP-TEE device appears under `/dev/`. There is a very small chance that a userspace program could trigger this race condition, and it has no way of knowing whether the device has been fully probed yet.

I think you're on to something. After calling tee_device_register(), the driver must be able to handle requests, at least without crashing. optee_enumerate_devices() requires the devices to be registered, so that must happen after those calls. But other than that, I think the calls to tee_device_register() should be the last thing in the probe function.

Sumit, do you agree?

Cheers, Jens
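
[Editorial sketch, not part of the original message.] The ordering argument above can be illustrated with a small user-space model. It is not kernel code and not the actual fix: every `model_*` name is hypothetical, the model is single-threaded and deterministic, and the "client call" is simply placed by hand in the window that the two probe orderings leave open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's call queue. */
struct model_call_queue {
	bool initialized;
	int free_threads;
};

/* Hypothetical stand-in for the per-device driver state. */
struct model_optee {
	struct model_call_queue call_queue;
	bool registered; /* true once clients can reach the device */
};

static void model_cq_init(struct model_call_queue *cq, int thread_count)
{
	assert(cq != NULL);
	cq->free_threads = thread_count;
	cq->initialized = true;
}

/* Models registration: from this point on, clients may call in. */
static void model_register(struct model_optee *optee)
{
	assert(optee != NULL);
	optee->registered = true;
}

/*
 * Models a client request.
 * Returns 0 on success, -1 if the device is not visible yet, and -2 if
 * the request reached uninitialized state (the situation in which the
 * reported oops occurred).
 */
static int model_client_call(const struct model_optee *optee)
{
	if (!optee->registered)
		return -1;
	if (!optee->call_queue.initialized)
		return -2;
	return 0;
}

/* Ordering from the report: register first, initialize afterwards. */
static int model_probe_register_first(struct model_optee *optee, int threads)
{
	int during;

	model_register(optee);
	during = model_client_call(optee); /* client arrives in the window */
	model_cq_init(&optee->call_queue, threads);
	return during;
}

/* Ordering suggested in the thread: initialize first, register last. */
static int model_probe_register_last(struct model_optee *optee, int threads)
{
	int during;

	model_cq_init(&optee->call_queue, threads);
	during = model_client_call(optee); /* client arrives in the window */
	model_register(optee);
	return during;
}
```

In this model, a client that arrives mid-probe sees uninitialized state only with the register-first ordering. With registration last it is simply told that the device is not there yet. As noted above, optee_enumerate_devices() would still have to run after registration; the model leaves that step out.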

Sumit Garg

10:51 a.m.

On Mon, Mar 30, 2026 at 11:16:59AM +0200, Jens Wiklander wrote:

> Hi,
>
> On Mon, Mar 30, 2026 at 6:07 AM Shf Chen (陳少甫) via OP-TEE op-tee@lists.trustedfirmware.org wrote:
>
> > On Fri, 2026-03-27 at 15:31 +0530, Sumit Garg wrote:
> >
> > > On Fri, Mar 27, 2026 at 09:01:15AM +0000, Shf Chen (陳少甫) via OP-TEE wrote:
> > >
> > > > Hi,
> > > >
> > > > We found a possible race condition while the OP-TEE kernel driver is probing the device. A NULL pointer dereference can occur when another kernel driver opens an OP-TEE context with tee_client_open_context() and then makes an SMC call to OP-TEE. Below is the exception:
> > >
> > > How is your kernel driver being probed? Have you registered it as a proper TEE bus driver? Any custom ways to invoke TEE kernel client APIs are surely susceptible to races like the one you mentioned below.
> > >
> > > -Sumit
> >
> > Hi,
> >
> > Our kernel driver is not registered as a TEE bus driver, for various reasons, so we have to use another API to make sure the OP-TEE driver has been probed in order to work around this issue. However, we think this is error-prone for module developers who need to call into OP-TEE.

The kernel module API for OP-TEE is available via the TEE bus only.

> > Although this is not our case, I think that once tee_device_register() has been called, the OP-TEE device appears under `/dev/`. There is a very small chance that a userspace program could trigger this race condition, and it has no way of knowing whether the device has been fully probed yet.
>
> I think you're on to something. After calling tee_device_register(), the driver must be able to handle requests, at least without crashing. optee_enumerate_devices() requires the devices to be registered, so that must happen after those calls. But other than that, I think the calls to tee_device_register() should be the last thing in the probe function.
>
> Sumit, do you agree?

Sounds reasonable to me if it's something that can be reproduced via the user-space client.

-Sumit

op-tee@lists.trustedfirmware.org
