First of all, I am new to FreeRTOS, so I might be completely wrong in my approach. Please bear with me. Now to the issue. We are working on a project based on an ARM Cortex-M0 controller. We have an external flash connected to the controller over a QSPI interface. Internal RAM size is 190 KB. Our software is based on FreeRTOS. Since QSPI is slow, we can’t run the code from flash directly. We have a ROM bootloader which loads the code from external flash into RAM. Our application is big enough that it will exceed the available RAM size, hence we are looking for a solution to this issue. As we are not very familiar with RTOS customization, we do not know whether a solution that uses dynamic code loading is feasible. Can anyone please suggest a solution to this issue and comment on the feasibility of dynamic code loading? Note: as we are using an ARM Cortex-M0 based controller, it doesn’t have an MMU. My assumption is that it will be pretty difficult to implement dynamic code loading without an MMU; correct me if I am wrong. Any help will be appreciated. Thanks & Regards Jitheesh Surendran
Unless I misunderstand, I think it would be more accurate to say you need paging, rather than dynamic code loading. Paging is normally a scheme whereby code that is too large to fit in an executable region is swapped into and out of that region (RAM in your case) as it is needed. Normally that does require an MMU, in order to get the virtual memory space. It is also very slow for a real-time system if the QSPI is slow. As you don’t have an MMU, it might be possible to do this using a combination of dynamic loading and paging. In that case you would need to build each swappable image into the same address space, so you would have multiple separate build images, and swap those in and out. To be honest, I think this will be very complex on a Cortex-M0.
Hi, Thanks for your quick reply. I understand that the solution you suggested will be very complex to implement (as you said). Still, what is the estimated time required for this activity (at least to give it a try), and where can I find some documentation on it? (I could not find any supporting documents, as this is either not a very common issue or a hard technology to get into.) Can you suggest any other, simpler solution to tackle this issue? Any help will be appreciated. Thanks & Regards Jitheesh Surendran
There are several ways to deal with not having enough memory. ‘Paging’ (also called virtual memory) generally requires an MMU to implement, so it is not really an option on your machine. A second method, normally called ‘overlays’, might be doable on your machine. FreeRTOS doesn’t have any direct support for it, but neither does it do anything in particular to prevent it. Implementing overlays is mostly a matter of compiler/linker support (it was somewhat common in ancient days when many machines had limited RAM; it is less commonly supported now). Overlays are based on dividing your code into segments: one that will always be in memory (and FreeRTOS itself should probably be in that segment), and then (possibly multiple) classes of segments, where only one segment in each class is needed at a given time. Calls between segments go through a ‘thunk’ that checks whether it needs to load a new segment into a segment class. If your compiler/linker supports overlays, they aren’t THAT hard to implement. The biggest challenge is that normally YOU need to figure out which routines need to be placed into each segment/segment class. Multiple threads make this a bit more of a challenge, as you normally don’t want a context switch to change which segment is loaded into a class (that would slow things down a lot and would require support from FreeRTOS), so generally you would need to have either one task that uses multiple segments depending on the mode it is in, or a cluster of tasks that you start/stop as part of mode switching. As I said, the availability of this is highly dependent on compiler/linker support. If it is there, it is just a bit of work to partition your program into segments.
If it isn’t supported, it can often be built yourself with a bit of discipline and by manually creating the thunks, but this tends to require a lot of work, including assembly language to let you force the thunks to be consistent in all the segments, as well as good knowledge of some of the internals of the compiler/linker.
Hi Richard, Thanks for your quick reply; this seems to be a valid and feasible option. I have a few more questions on it. Is there enough documentation available on the internet for this ‘overlay’ mechanism? I hope you noticed my first explanation of the actual issue, so I am worried whether simply building the code into different segments will be enough on its own, or whether it will need some customization of FreeRTOS. We are using the ARM GCC toolchain for our platform. Does it support the ‘overlay’ mechanism? Thanks & Regards Jitheesh Surendran
I haven’t had to use overlays for about 20 years. Doing a quick search, GCC does support the idea of overlays in the linker (so you can link multiple segments of the program at the same address). Sometimes chip manufacturers use a modified version of GCC for their chips; I’m not sure how common that is with the ARM version (it is probably less likely to have variants). I don’t know if it has built-in support for the thunking, or if you would need to do this yourself. With linker support, it isn’t that hard to write overlay thunking code (though it isn’t quite as transparent/pretty as automatically generated thunks). As I thought I said, FreeRTOS doesn’t have any code that supports this, but it doesn’t really need to. It really shouldn’t have anything that gets in the way of doing it either. One key point is that FreeRTOS should NOT have any of its code placed in an overlay, but it is very small anyway. Basically, you would have either one large task that uses overlays for different phases of its operation, or several sets of tasks of which only one set is active at a time, with each set in its own overlay and the overlays flipped by a controller that stops the old tasks, loads the new code, and starts the new tasks.
Hi Richard, Thanks for your reply. So, if I understood correctly, ‘thunking’ is a mechanism used to load different segments/overlays (of code built with the overlay mechanism) from ROM/flash to RAM as and when they are required in the current context. Correct me if I am wrong. Thanks & Regards Jitheesh Surendran
Thunking is a layer of code between the caller and the ‘real’ subroutine that makes sure the proper overlay is in memory before calling the routine. Typically it will check a flag variable which indicates which overlay is loaded, and if it is the wrong one, load the right overlay. With a system that transparently supports overlays, you just call the real subroutine and the linker will rewrite the call to go to the thunk. With non-transparent overlay support (which may be what you end up with), when you call a function in the overlay from outside, you call the thunk instead, and it does its work and then forwards to the real function.
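To make the flag-check idea concrete, here is a minimal sketch in C of what hand-written (non-transparent) thunks could look like. It is a host-side simulation only: all names (`overlay_id`, `overlay_load`, `func_a_thunk`, `func_a_impl`, and so on) are hypothetical, and `overlay_load` merely records the change, where real loader code on a target would copy the overlay image from external flash to its run address in RAM.

```c
#include <assert.h>

/* Hypothetical overlay identifiers (illustrative names only). */
enum overlay_id { OVL_NONE = 0, OVL_A = 1, OVL_B = 2 };

/* Flag variable: which overlay currently occupies the shared RAM region. */
static enum overlay_id loaded_overlay = OVL_NONE;

/* Counts how many (slow) loads happened, for demonstration purposes. */
static int load_count = 0;

/* Stand-in for the real loader. On a target this is where the overlay
 * image would be copied from flash into the shared RAM region. */
static void overlay_load(enum overlay_id id)
{
    load_count++;
    loaded_overlay = id;
}

/* Make sure the requested overlay is resident; load it only if needed. */
static void overlay_ensure(enum overlay_id id)
{
    assert(id != OVL_NONE);
    if (loaded_overlay != id) {
        overlay_load(id);
    }
}

/* The "real" routines, as they would be defined inside the overlays. */
static int func_a_impl(int x) { return x + 1; }  /* would live in overlay A */
static int func_b_impl(int x) { return x * 2; }  /* would live in overlay B */

/* Thunks: always-resident code that callers outside the overlay use. */
int func_a_thunk(int x) { overlay_ensure(OVL_A); return func_a_impl(x); }
int func_b_thunk(int x) { overlay_ensure(OVL_B); return func_b_impl(x); }

/* Accessor so a caller can observe how often a load was triggered. */
int overlay_load_count(void) { return load_count; }
```

Calling `func_a_thunk()` twice in a row triggers only one load, while a following call to `func_b_thunk()` triggers a second one. That is the cost to watch when deciding which routines share an overlay.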
Hi Richard, How do we check whether a particular toolchain transparently supports overlays? Thanks & Regards Jitheesh Surendran
Perhaps by reading its manual?
As RTE says, read the manual. The problem you may run into is that the need for overlays is much less prevalent than it was decades ago. Back then, the issue was that programs were growing bigger than the address space of the processors, and there really weren’t alternative bigger processors available (to get something bigger you went from a desktop system on the order of $1,000 to a room-filling system on the order of $100,000 to $1,000,000), so there was a lot of incentive to provide software solutions. Now we have systems with multiple GB of memory available (and these support virtual memory, so the 64-bit address space is bigger than we can easily imagine), so hardware solutions are generally available. Rather than trying to shoehorn a program into a too-small system, it is normally a lot more productive to use a better system. In your case, an M0 processor with 190 KB of built-in memory is fairly hefty for that level of processor (many have much less memory than that), as they are aimed at fairly simple tasks. I suspect most people trying to do what you are doing (I’m not sure of your final application) would target a somewhat bigger processor to get more memory, or at least add some external memory to get enough space. I realize that, where you are in the project, this might not be an option, but the fact that overlays are much less needed is a reason you may find less support than you might want. Less demand causes less support.
Hi Richard, Thanks for your valuable information. As of now we really don’t have the option of moving to a better processor, so we will try to use this ‘overlay’ mechanism to resolve our issue. But so far we have not been able to find any good documentation for the overlay mechanism. I have seen one explanation, but it is for armcc (not for gcc-arm-none-eabi-4_9-2015q3) and uses a scatter file rather than an ld file. Can you please help us find some good documentation on overlays? Thanks & Regards Jitheesh Surendran
With a quick search, I find https://sourceware.org/binutils/docs/ld/Overlay-Description.html which talks about using the gcc linker ld to create an overlay. I have no idea if ld is used in your suite. The section also talks about how you could build overlays without the special command.
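As a rough illustration of what that page describes, an OVERLAY block in an ld script might look like the fragment below. This is a sketch adapted from the example in the ld manual; the addresses 0x1000 and 0x4000 and the o1/ and o2/ object directories are placeholders, not values for any particular board.

```
SECTIONS
{
  /* ... sections that always stay resident go here ... */

  /* Both .text0 and .text1 are linked to run at address 0x1000, but are
     stored one after the other starting at load address 0x4000. */
  OVERLAY 0x1000 : AT (0x4000)
  {
    .text0 { o1/*.o(.text) }
    .text1 { o2/*.o(.text) }
  }

  /* ... remaining sections ... */
}
```

For each section inside the OVERLAY, ld defines the symbols `__load_start_<name>` and `__load_stop_<name>` (here `__load_start_text0`, `__load_stop_text0`, and likewise for text1), which loader code can use to copy the chosen image from its load address to the shared run address.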
Thanks Richard. We are using ld in our suite, so that part is clearer now. But we are still left with the thunking mechanism that is used to swap the overlays in and out of the executable region. We were trying to adapt one of the examples provided (but unfortunately the example is for ARMLink, not for the GNU toolchain), and we were not able to find exact GNU equivalents for some of the commands in that example, especially IMPORT, EXPORT, $Sub$$, $Super$$, DCD etc. Please find below the URL I was referring to: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/17302.html Thanks & Regards Jitheesh Surendran
The ARM package looks like it provides more support for overlays than the GCC package does, so their examples may not be that useful. In particular, it provides automatic remapping of calls to the thunks, with calls to func() being mapped to calls to $Sub$$func() and calls to $Super$$func() mapped to func(), allowing you to write transparent stubs. I don’t see an equivalent feature in GCC, so it looks like you need to explicitly call the thunk when making a call from outside the overlay. For example, instead of just calling func() outside the overlay, you call funcstub(), and in the stub and in the overlay you call funcimpl() (which is how the function is defined). (I am renaming the function everywhere so that I must look at every call to get it right; disaster happens if a call outside the overlay goes directly to func.) IMPORT, EXPORT, and DCD are assembler statements: IMPORT pulls in linker symbols, EXPORT defines symbols for external reference, and DCD defines a constant value in memory. Those should be supported (or have an equivalent) in the GCC assembly package; in GNU as the usual counterparts are .extern, .global, and .word.
Hi Richard, Thanks for your valuable information and support. I will try this out. Thanks & Regards Jitheesh Surendran
Hi Richard, We were able to successfully implement overlays in our system. Once again, thanks for your time and effort. We have a few more questions related to the linker script. In our project, one of the overlay modules is going to be a lib.a file. Hence we need to know how it can be excluded from normal linking using the EXCLUDE_FILE() option. We tried EXCLUDE_FILE the same way we use it for an object file, but it didn’t work. We also need to know how this lib.a can be placed in an overlay section. I tried searching, but unfortunately nothing I found works for me. The following link suggests something, but it doesn’t work for me: https://gcc.gnu.org/ml/gcc-help/2010-12/msg00176.html Thanks & Regards Jitheesh Surendran