Hi Quentin,

Sorry for the delay, I’ve spent the last few hours looking into this.

So, the semantics of the inline function specifier in C99 are as follows:

> A function declared with an inline function specifier is an inline function. The function specifier may appear more than
> once; the behavior is the same as if it appeared only once. Making a function an inline function suggests that calls to the
> function be as fast as possible. The extent to which such suggestions are effective is implementation-defined.

To truly *enforce* inlining you need to use a GCC/Clang compiler extension like `__attribute__((always_inline))`.

In this specific case, I’m not quite sure I understand why the function is marked as `inline` at all - it is quite large. If the function is only ever used once then the compiler can (and almost always will) automatically inline it regardless of whether you use the `inline` hint, although because its address is taken elsewhere in the file the compiler must always generate a non-inlined version as well.

Additionally, the loop in the inline assembly does make it a bit harder for the compiler here – the compiler usually cannot inspect inline assembly very well, so it doesn’t automatically understand the internal infinite loop. I’ve posted a patch to resolve these issues in one swoop here: https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/33175

> You're right, there probably isn't. I was thinking more along the lines of using the make_helpers/build_macros.mk helpers but that probably won't work either. Would propagating the TF_FLAGS be possible? Chris should probably weigh in on this.

I presume we’re talking about the RK3399 Cortex-M0 toolchain? The build macros are not set up to handle this sort of situation, I’m afraid. We can’t propagate `TF_CFLAGS` either because many of the flags are not compatible across the AArch64 and AArch32 toolchains, so it’s a manual process.

Hope that helps,

Chris

From: Quentin Schulz via TF-A <tf-a@lists.trustedfirmware.org>
Date: Thursday, 31 October 2024 at 13:53
To: Boyan Karatotev <Boyan.Karatotev@arm.com>, tf-a@lists.trustedfirmware.org <tf-a@lists.trustedfirmware.org>
Cc: nb@arm.com <nb@arm.com>
Subject: [TF-A] Re: TF-A: Compilation issues for Rockchip platforms

Hi Boyan,

On 10/31/24 11:02 AM, Boyan Karatotev wrote:
> [You don't often get email from boyan.karatotev@arm.com. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ]
>
> Hi Quentin,
>
>>> [You don't often get email from boyan.karatotev@arm.com. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ]
>>>
>>> Hi Quentin,
>>>
>>> I dug around a bit and I think I have answers for most.
>>>
>>
>> Thanks for looking into this :)
>
> No worries!
>
>>>> 1) RK3588 symbol redefinition in assembly
>>>>
>>>> This only happens for clang and I'm a bit confused as to why. Applying the above patch fixing that issue but inspecting the assembly code reveals something really odd.
>>>>
>>>> **With** that patch, with gcc:
>>>>      25c8c:       d510149f        msr     dbgprcr_el1, xzr
>>>>      25c90:       d538f2e0        mrs     x0, s3_0_c15_c2_7
>>>>      25c94:       b2400000        orr     x0, x0, #0x1
>>>>      25c98:       d518f2e0        msr     s3_0_c15_c2_7, x0
>>>>      25c9c:       58000020        ldr     x0, 0x25ca0
>>>>      25ca0:       d5033fdf        isb
>>>>      25ca4:       d503207f        wfi
>>>>      25ca8:       d61f0000        br      x0
>>>>
>>>> **with** that patch, with clang:
>>>>      26d2c:       d510149f        msr     dbgprcr_el1, xzr
>>>>      26d30:       d538f2e0        mrs     x0, s3_0_c15_c2_7
>>>>      26d34:       b2400000        orr     x0, x0, #0x1
>>>>      26d38:       d518f2e0        msr     s3_0_c15_c2_7, x0
>>>>      26d3c:       58000020        ldr     x0, 0x26d40        <--- "my" ldr
>>>>      26d40:       d5033fdf        isb
>>>>      26d44:       d503207f        wfi
>>>>      26d48:       d61f0000        br      x0                 <--- "my" br
>>>>      26d4c:       94000e76        bl      0x2a724
>>>>      26d50:       d510149f        msr     dbgprcr_el1, xzr
>>>>      26d54:       d538f2e0        mrs     x0, s3_0_c15_c2_7
>>>>      26d58:       b2400000        orr     x0, x0, #0x1
>>>>      26d5c:       d518f2e0        msr     s3_0_c15_c2_7, x0
>>>>      26d60:       58000020        ldr     x0, 0x26d64        <--- "my" ldr
>>>>      26d64:       d5033fdf        isb
>>>>      26d68:       d503207f        wfi
>>>>      26d6c:       d61f0000        br      x0                 <--- "my" br
>>>>
>>>> So I think the above patch is just a band-aid on what the actually is but I have no idea what could be going on.
>>>>
>>>> If I use this patch instead, to match the other assembly instructions in this file
>>>>
>>>> =========================================================================
>>>> diff --git a/plat/rockchip/rk3588/drivers/pmu/pmu.c b/plat/rockchip/rk3588/drivers/pmu/pmu.c
>>>> index f693dbdf0..a4128b214 100644
>>>> --- a/plat/rockchip/rk3588/drivers/pmu/pmu.c
>>>> +++ b/plat/rockchip/rk3588/drivers/pmu/pmu.c
>>>> @@ -760,10 +760,10 @@ static inline void cpus_pd_req_enter_wfi(void)
>>>>                  "mrs    x0, S3_0_C15_C2_7\n"
>>>>                  "orr    x0, x0, #0x1\n"
>>>>                  "msr    S3_0_C15_C2_7, x0\n"
>>>> -              "wfi_loop:\n"
>>>> +              "1:\n"
>>>>                  "isb\n"
>>>>                  "wfi\n"
>>>> -              "b wfi_loop\n");
>>>> +              "b 1b\n");
>>>>    }
>>>>
>>>>    static void nonboot_cpus_off(void)
>>>> =========================================================================
>>>>
>>>> It happily compiles, though the binary still seems to contain this loop twice.
>>>>
>>>> Using __attribute__((noinline)) the code is not duplicated, but I assume this is expected as it would now simply be a function.
>>>
>>>
>>> So, the function only runs once, but it has its address taken elsewhere. For the actual call, it does get inlined, as expected. But the address taking bit, it is kept as a standalone function which happens to be places right after. Disassembling with symbols shows this a bit better. My guess would be that clang either isn't smart enough to deduplicate this, or it's doing it on purpose. Gcc must decide to not inline.
>>>
>>
>> But the function currently has the inline marker, doesn't it force the compiler to inline it?
>
> Apparently, it doesn't have to: https://www.kernel.org/doc/local/inline.html
>
>> Are you suggesting anything we could do here?
>>
>> Would any of the three above suggestions be reasonably fit for upstreaming?
>
> Well the numbered labels should let it compile and we already merged another patch for that. We do have __always_inline, ALWAYS_INLINE, or just __attribute__((__always_inline__)) which should force the inlining. But if there are two uses of the function, I'd imagine the named label would produce similar errors.
>

Bingo!

If I force the inlining on GCC, I get the same error as in clang.

Using a local label works for GCC and clang, so will submit something
like that instead. Note that for some reason I can only use numbered
local labels even though named labels are supposedly supported by the
Aarch64 ASM, but not GAS? Anyway, good enough for me, too far from my
comfort zone already :)

Thanks for the hint!

I'm wondering now if it really makes sense to have inline functions if
GCC just doesn't care about those? Should we remove the inline part or
introduce the same macro as the kernel so that we force inlining even on
GCC? I am not planning on submitting anything for that, just being
curious where this puts us.

>>>> 2) RK3399 warning array subscript 0 is outside array bounds of 'volatile unsigned int[0]' [-Warray-bounds=]
>>>>
>>>> """"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
>>>> In file included from src/dram.c:12:
>>>> src/dram.c: In function 'm0_main':
>>>> include/rk3399_mcu.h:15:34: warning: array subscript 0 is outside array bounds of 'volatile unsigned int[0]' [-Warray-bounds=]
>>>>      15 |                                 (*(volatile unsigned int *)(c)); __v; })
>>>>         |                                 ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>> include/rk3399_mcu.h:16:69: note: in definition of macro 'mmio_write_32'
>>>>      16 | #define mmio_write_32(c, v)     ((*(volatile unsigned int *)(c)) = (v))
>>>>         |     ^
>>>> src/dram.c:67:23: note: in expansion of macro 'mmio_read_32'
>>>>      67 |                       mmio_read_32(PARAM_ADDR + PARAM_FREQ_SELECT));
>>>>         |                       ^~~~~~~~~~~~
>>>>
>>>> This function is used in many places, but only in very specific places is it triggering a warning, when PARAM_ADDR is used, but only when accessed from m0 C files.
>>>>
>>>> PARAM_ADDR is set to 0xc0. From !m0 files, access to PARAM_ADDR seems to be done through the M0_PARAM_ADDR which is M0_BINCODE_BASE + PARAM_ADDR. For m0 files, it's directly accessing PARAM_ADDR. I honestly don't really know why it's complaining for m0 files but not the !m0 ones.
>>>
>>> This seems identical to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105523. The top-level makefile has a workaround for this and it sets --param=min-pagesize=0. However, this driver has a very peculiar way of building and it does not inherit most of tf-a's flags. Adding that flag to its own internal makefile fixes the issue, but I feel like it's bound to repeat. Can't it be built by the standard tfa build?
>>>
>>
>> The only thing I know about TF-A on RK3399 is that it is indeed peculiar :)
>>
>> As far as I know, some part of the binary runs on the Cortex-A core(s) and another part on one of the Cortex-M0, hence why we need an additional toolchain and different drivers and all that.
>
> Ah, I hadn't though of that.
>
>> Not sure if there exists any platform today in TF-A that has the same behavior and is "built by the standard tfa build"? How would one go forward with that?
>
> You're right, there probably isn't. I was thinking more along the lines of using the make_helpers/build_macros.mk helpers but that probably won't work either. Would propagating the TF_FLAGS be possible? Chris should probably weigh in on this.
>

Except if Chris's mail address is nb at arm.com, they weren't explicitly
Cc'ed in your mail (if that was your intention :) ).

Cheers,
Quentin
--
TF-A mailing list -- tf-a@lists.trustedfirmware.org
To unsubscribe send an email to tf-a-leave@lists.trustedfirmware.org