Hello,
Thanks for the patch and the detailed commit message. For ST boards, we usually handle this kind of issue with a watchdog. I admit, though, that some watchdog patches are still missing upstream. I'll try to send them soon.
We are discussing internally to see if this particular use-case could be done as you proposed. I'll get back to you next week about this, when one of my colleagues comes back. Maybe this could be selected with a compilation flag.
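For illustration only, a compilation flag of this kind could look roughly like the sketch below. The flag name STM32MP_MMC_INIT_FAIL_RESET, the helper mmc_init_fail_policy() and the two fake_* callbacks are hypothetical and do not exist in TF-A; the callbacks merely count calls so the sketch can be built and run on a host, whereas on target they would be stm32mp_system_reset() and panic().

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical build flag (not an existing TF-A option). In a real build it
 * would come from the platform makefile; the default keeps panic() only.
 */
#ifndef STM32MP_MMC_INIT_FAIL_RESET
#define STM32MP_MMC_INIT_FAIL_RESET 0
#endif

/* Host-side stand-ins that only count calls, so the sketch runs on a PC. */
static unsigned int reset_calls;
static unsigned int panic_calls;

static void fake_system_reset(void) { reset_calls++; }
static void fake_panic(void) { panic_calls++; }

/*
 * Failure policy for the MMC init error path. With the flag set, a system
 * reset is attempted first; the halt callback always follows as a fallback,
 * mirroring the reset-then-panic order of the proposed patch.
 */
static void mmc_init_fail_policy(void (*reset_fn)(void), void (*halt_fn)(void))
{
	assert(reset_fn != NULL);
	assert(halt_fn != NULL);

#if STM32MP_MMC_INIT_FAIL_RESET
	reset_fn();
#else
	(void)reset_fn; /* unused when the flag is off */
#endif
	halt_fn();
}
```

Building with -DSTM32MP_MMC_INIT_FAIL_RESET=1 would select the reset path, while the default build keeps today's panic-only behaviour.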
About the patch submission: TF-A uses Gerrit, and we do not merge patches from the mailing list as Linux does. Please check this page: https://trustedfirmware-a.readthedocs.io/en/latest/process/contributing.html
Best regards,
Yann
On 4/29/26 05:21, Chanhong Jung wrote:
A failed eMMC initialization in BL2's boot_mmc() currently calls panic(), leaving the system spinning forever and forcing an external power cycle to recover. In production deployments where the eMMC is the on-board boot medium, transient init failures (power-rail ramp timing, bus-line noise just after eMMC fast-boot mode entry, RCC clock-domain settling jitter, etc.) are far more common than hard failures, and the recovery path for all of them is "boot again from cold."
Invoke stm32mp_system_reset() before panic() in the stm32_sdmmc2_mmc_init() failure path so the SoC restarts and BootROM re-runs the entire boot chain from scratch. Transient failures that clear themselves between cold boots are then resolved automatically without operator intervention.
stm32mp_system_reset() carries the __dead2 (no-return) attribute, so control never reaches the following panic() in normal operation. The panic() call is intentionally retained for two reasons:
1. Defensive fallback should the reset circuit / power sequencer fail to actually issue a reset; staying in a tight panic() loop is then still preferable to executing past the failure point.
2. As an explicit "this branch must not continue" signal to static analyzers and future readers, so a subsequent edit cannot accidentally drop the reset call and silently revert the policy.
This change has been in production on a downstream STM32MP153D board running a TF-A v2.4 backport for over a year. Only the central panic-to-reset change is sent here; related debug NOTICE() prints and an MMC retry-count bump that lived alongside the downstream patch are intentionally not included, as the unconditional reset fallback already covers the recovery cases the retry bump targeted.
Signed-off-by: Chanhong Jung <happycpu@gmail.com>
---
 plat/st/common/bl2_io_storage.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/plat/st/common/bl2_io_storage.c b/plat/st/common/bl2_io_storage.c
index c478b497c..e6a73e0d3 100644
--- a/plat/st/common/bl2_io_storage.c
+++ b/plat/st/common/bl2_io_storage.c
@@ -28,6 +28,7 @@
 #include <drivers/st/stm32_fmc2_nand.h>
 #include <drivers/st/stm32_qspi.h>
 #include <drivers/st/stm32_sdmmc2.h>
+#include <drivers/st/stm32mp_reset.h>
 #include <drivers/usb_device.h>
 #include <lib/fconf/fconf.h>
 #include <lib/mmio.h>
@@ -255,7 +256,19 @@ static void boot_mmc(enum mmc_device_type mmc_dev_type,
 	params.device_info = &mmc_info;
 	if (stm32_sdmmc2_mmc_init(&params) != 0) {
-		ERROR("SDMMC%u init failed\n", boot_interface_instance);
+		ERROR("SDMMC%u init failed - resetting system\n",
+		      boot_interface_instance);
+		/*
+		 * eMMC init failures here are usually transient (rail-ramp
+		 * timing, bus-line noise on fast-boot entry, RCC clock-domain
+		 * settling jitter). panic() leaves the SoC frozen and forces
+		 * an external power cycle; a system reset lets BootROM re-run
+		 * the entire boot path, which most transient failures survive.
+		 * stm32mp_system_reset() is __dead2, so panic() below is a
+		 * defensive fallback if the reset circuit is itself wedged,
+		 * and a no-return marker for analyzers.
+		 */
+		stm32mp_system_reset();
 		panic();
 	}

base-commit: de387341ee73d99446fbbf6a7053d7b759b8b3a6