About macro: listGET_OWNER_OF_NEXT_ENTRY

Posted by BraveBull on July 18, 2007
Position:

In function void vTaskSwitchContext( void ), FreeRTOS 4.3.1, at line 1864:
listGET_OWNER_OF_NEXT_ENTRY( pxCurrentTCB, &( pxReadyTasksLists[ uxTopReadyPriority ] ) )


Problem:

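Because listGET_OWNER_OF_NEXT_ENTRY is defined as a macro, the compiler can rarely optimize its parameters: each one is simply substituted as text wherever the macro body uses it, so the array-address expression is recomputed at every use. I tested this as shown below.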

A partial Solution:

change " listGET_OWNER_OF_NEXT_ENTRY( pxCurrentTCB, &( pxReadyTasksLists[ uxTopReadyPriority ] ) );"
to "
xList* ppp=&( pxReadyTasksLists[ uxTopReadyPriority ] );
listGET_OWNER_OF_NEXT_ENTRY( pxCurrentTCB, ppp );
"

Test:

Conditions: ATMega323, WinAVR20070525, AVR Studio 4.13 in simulation mode, frequency = 4 MHz, makefile: OPT = s, configUSE_TRACE_FACILITY == 0.
Result:
(1) listGET_OWNER_OF_NEXT_ENTRY( pxCurrentTCB, &( pxReadyTasksLists[ uxTopReadyPriority ] ) ):
asm:
1420: listGET_OWNER_OF_NEXT_ENTRY( pxCurrentTCB, &( pxReadyTasksLists[ uxTopReadyPriority ] ) );
+00000100: 91200079 LDS R18,0x0079 Load direct from data space
+00000102: 91800079 LDS R24,0x0079 Load direct from data space
+00000104: 2799 CLR R25 Clear Register
+00000105: 01FC MOVW R30,R24 Copy register pair
+00000106: E0A3 LDI R26,0x03 Load immediate
+00000107: 0FEE LSL R30 Logical Shift Left
+00000108: 1FFF ROL R31 Rotate Left Through Carry
+00000109: 95AA DEC R26 Decrement
+0000010A: F7E1 BRNE PC-0x03 Branch if not equal
+0000010B: 0FE8 ADD R30,R24 Add without carry
+0000010C: 1FF9 ADC R31,R25 Add with carry
+0000010D: 58E1 SUBI R30,0x81 Subtract immediate
+0000010E: 4FFF SBCI R31,0xFF Subtract immediate with carry
+0000010F: 8001 LDD R0,Z+1 Load indirect with displacement
+00000110: 81F2 LDD R31,Z+2 Load indirect with displacement
+00000111: 2DE0 MOV R30,R0 Copy register
+00000112: 8182 LDD R24,Z+2 Load indirect with displacement
+00000113: 8193 LDD R25,Z+3 Load indirect with displacement
+00000114: 2733 CLR R19 Clear Register
+00000115: 01F9 MOVW R30,R18 Copy register pair
+00000116: E073 LDI R23,0x03 Load immediate
+00000117: 0FEE LSL R30 Logical Shift Left
+00000118: 1FFF ROL R31 Rotate Left Through Carry
+00000119: 957A DEC R23 Decrement
+0000011A: F7E1 BRNE PC-0x03 Branch if not equal
+0000011B: 0FE2 ADD R30,R18 Add without carry
+0000011C: 1FF3 ADC R31,R19 Add with carry
+0000011D: 58E1 SUBI R30,0x81 Subtract immediate
+0000011E: 4FFF SBCI R31,0xFF Subtract immediate with carry
+0000011F: 8392 STD Z+2,R25 Store indirect with displacement
+00000120: 8381 STD Z+1,R24 Store indirect with displacement
+00000121: 91800079 LDS R24,0x0079 Load direct from data space
+00000123: 91200079 LDS R18,0x0079 Load direct from data space
+00000125: 9F24 MUL R18,R20 Multiply unsigned
+00000126: 0190 MOVW R18,R0 Copy register pair
+00000127: 2411 CLR R1 Clear Register
+00000128: 572E SUBI R18,0x7E Subtract immediate
+00000129: 4F3F SBCI R19,0xFF Subtract immediate with carry
+0000012A: 2799 CLR R25 Clear Register
+0000012B: 01FC MOVW R30,R24 Copy register pair
+0000012C: E063 LDI R22,0x03 Load immediate
+0000012D: 0FEE LSL R30 Logical Shift Left
+0000012E: 1FFF ROL R31 Rotate Left Through Carry
+0000012F: 956A DEC R22 Decrement
+00000130: F7E1 BRNE PC-0x03 Branch if not equal
+00000131: 0FE8 ADD R30,R24 Add without carry
+00000132: 1FF9 ADC R31,R25 Add with carry
+00000133: 58E1 SUBI R30,0x81 Subtract immediate
+00000134: 4FFF SBCI R31,0xFF Subtract immediate with carry
+00000135: 8181 LDD R24,Z+1 Load indirect with displacement
+00000136: 8192 LDD R25,Z+2 Load indirect with displacement
+00000137: 1782 CP R24,R18 Compare
+00000138: 0793 CPC R25,R19 Compare with carry
+00000139: F509 BRNE PC+0x22 Branch if not equal
+0000013A: 91200079 LDS R18,0x0079 Load direct from data space
+0000013C: 91800079 LDS R24,0x0079 Load direct from data space
+0000013E: 2799 CLR R25 Clear Register
+0000013F: 01FC MOVW R30,R24 Copy register pair
+00000140: E053 LDI R21,0x03 Load immediate
+00000141: 0FEE LSL R30 Logical Shift Left
+00000142: 1FFF ROL R31 Rotate Left Through Carry
+00000143: 955A DEC R21 Decrement
+00000144: F7E1 BRNE PC-0x03 Branch if not equal
+00000145: 0FE8 ADD R30,R24 Add without carry
+00000146: 1FF9 ADC R31,R25 Add with carry
+00000147: 58E1 SUBI R30,0x81 Subtract immediate
+00000148: 4FFF SBCI R31,0xFF Subtract immediate with carry
+00000149: 8001 LDD R0,Z+1 Load indirect with displacement
+0000014A: 81F2 LDD R31,Z+2 Load indirect with displacement
+0000014B: 2DE0 MOV R30,R0 Copy register
+0000014C: 8182 LDD R24,Z+2 Load indirect with displacement
+0000014D: 8193 LDD R25,Z+3 Load indirect with displacement
+0000014E: 2733 CLR R19 Clear Register
+0000014F: 01F9 MOVW R30,R18 Copy register pair
+00000150: E043 LDI R20,0x03 Load immediate
+00000151: 0FEE LSL R30 Logical Shift Left
+00000152: 1FFF ROL R31 Rotate Left Through Carry
+00000153: 954A DEC R20 Decrement
+00000154: F7E1 BRNE PC-0x03 Branch if not equal
+00000155: 0FE2 ADD R30,R18 Add without carry
+00000156: 1FF3 ADC R31,R19 Add with carry
+00000157: 58E1 SUBI R30,0x81 Subtract immediate
+00000158: 4FFF SBCI R31,0xFF Subtract immediate with carry
+00000159: 8392 STD Z+2,R25 Store indirect with displacement
+0000015A: 8381 STD Z+1,R24 Store indirect with displacement
+0000015B: 91800079 LDS R24,0x0079 Load direct from data space
+0000015D: 2799 CLR R25 Clear Register
+0000015E: 01FC MOVW R30,R24 Copy register pair
+0000015F: E023 LDI R18,0x03 Load immediate
+00000160: 0FEE LSL R30 Logical Shift Left
+00000161: 1FFF ROL R31 Rotate Left Through Carry
+00000162: 952A DEC R18 Decrement
+00000163: F7E1 BRNE PC-0x03 Branch if not equal
+00000164: 0FE8 ADD R30,R24 Add without carry
+00000165: 1FF9 ADC R31,R25 Add with carry
+00000166: 58E1 SUBI R30,0x81 Subtract immediate
+00000167: 4FFF SBCI R31,0xFF Subtract immediate with carry
+00000168: 8001 LDD R0,Z+1 Load indirect with displacement
+00000169: 81F2 LDD R31,Z+2 Load indirect with displacement
+0000016A: 2DE0 MOV R30,R0 Copy register
+0000016B: 8186 LDD R24,Z+6 Load indirect with displacement
+0000016C: 8197 LDD R25,Z+7 Load indirect with displacement
+0000016D: 93900073 STS 0x0073,R25 Store direct to data space
+0000016F: 93800072 STS 0x0072,R24 Store direct to data space
+00000171: 9508 RET Subroutine return
time: ~= 49us


(2) xList* ppp=&( pxReadyTasksLists[ uxTopReadyPriority ] );
listGET_OWNER_OF_NEXT_ENTRY( pxCurrentTCB, ppp );

1422: xList* ppp=&( pxReadyTasksLists[ uxTopReadyPriority ] );
+00000102: 91800079 LDS R24,0x0079 Load direct from data space
+00000104: 9F89 MUL R24,R25 Multiply unsigned
+00000105: 01D0 MOVW R26,R0 Copy register pair
+00000106: 2411 CLR R1 Clear Register
+00000107: 58A1 SUBI R26,0x81 Subtract immediate
+00000108: 4FBF SBCI R27,0xFF Subtract immediate with carry
1423: listGET_OWNER_OF_NEXT_ENTRY( pxCurrentTCB, ppp );
+00000109: 01ED MOVW R28,R26 Copy register pair
+0000010A: 81E9 LDD R30,Y+1 Load indirect with displacement
+0000010B: 81FA LDD R31,Y+2 Load indirect with displacement
+0000010C: 8002 LDD R0,Z+2 Load indirect with displacement
+0000010D: 81F3 LDD R31,Z+3 Load indirect with displacement
+0000010E: 2DE0 MOV R30,R0 Copy register
+0000010F: 83FA STD Y+2,R31 Store indirect with displacement
+00000110: 83E9 STD Y+1,R30 Store indirect with displacement
+00000111: 01CD MOVW R24,R26 Copy register pair
+00000112: 9603 ADIW R24,0x03 Add immediate to word
+00000113: 17E8 CP R30,R24 Compare
+00000114: 07F9 CPC R31,R25 Compare with carry
+00000115: F421 BRNE PC+0x05 Branch if not equal
+00000116: 8182 LDD R24,Z+2 Load indirect with displacement
+00000117: 8193 LDD R25,Z+3 Load indirect with displacement
+00000118: 839A STD Y+2,R25 Store indirect with displacement
+00000119: 8389 STD Y+1,R24 Store indirect with displacement
+0000011A: 01ED MOVW R28,R26 Copy register pair
+0000011B: 81E9 LDD R30,Y+1 Load indirect with displacement
+0000011C: 81FA LDD R31,Y+2 Load indirect with displacement
+0000011D: 8186 LDD R24,Z+6 Load indirect with displacement
+0000011E: 8197 LDD R25,Z+7 Load indirect with displacement
+0000011F: 93900073 STS 0x0073,R25 Store direct to data space
+00000121: 93800072 STS 0x0072,R24 Store direct to data space
+00000123: 91DF POP R29 Pop register from stack
+00000124: 91CF POP R28 Pop register from stack
+00000125: 9508 RET Subroutine return
time: ~= 14us

End:

Be careful when using a function-like macro defined with #define. In this case it may be better to implement it as a true function. There are other places with the same flaw; please correct them in the next version.
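As an illustration of the "true function" idea, here is a minimal sketch using a C99 static inline function. The names and types are hypothetical and simplified, not the FreeRTOS definitions. A function argument is evaluated once at the call site, regardless of how often the body uses it.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel's list types. */
typedef struct Node { struct Node *next; void *owner; } Node;
typedef struct List { Node *index; Node end; } List;

/* Function form: 'list' is evaluated once, by the caller. */
static inline void *get_owner_of_next_entry(List *list)
{
    list->index = list->index->next;
    if (list->index == &list->end) {
        list->index = list->index->next;   /* skip the end marker */
    }
    return list->index->owner;
}

/* Walk a two-task ring three times. Returns 1 if the order is B, A, B. */
int round_robin_ok(void)
{
    List l; Node a, b; int tcbA = 0, tcbB = 0;
    a.owner = &tcbA; a.next = &b;
    b.owner = &tcbB; b.next = &l.end;
    l.end.owner = NULL; l.end.next = &a;
    l.index = &a;
    return get_owner_of_next_entry(&l) == &tcbB
        && get_owner_of_next_entry(&l) == &tcbA
        && get_owner_of_next_entry(&l) == &tcbB;
}
```

Whether the call is actually inlined is up to the compiler; if it is not, the call overhead may offset part of the gain on a small MCU.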

B.R.

RE: About macro: listGET_OWNER_OF_NEXT_ENTRY

Posted by Dave on July 18, 2007
Very good point, I agree. But in part it seems you have a very weak optimizer. Here is the ARM IAR equivalent with and without optimization.

Old method no optimization.

000018E0 E59F0638 LDR R0, [PC, #+1592]
000018E4 E5900000 LDR R0, [R0, #+0]
000018E8 E3A0101C MOV R1, #0x1C
000018EC E59F264C LDR R2, [PC, #+1612]
000018F0 E0202091 MLA R0, R1, R0, R2
000018F4 E59F1624 LDR R1, [PC, #+1572]
000018F8 E5911000 LDR R1, [R1, #+0]
000018FC E3A0201C MOV R2, #0x1C
00001900 E59F3638 LDR R3, [PC, #+1592]
00001904 E0213192 MLA R1, R2, R1, R3
00001908 E5911004 LDR R1, [R1, #+4]
0000190C E5911004 LDR R1, [R1, #+4]
00001910 E5801004 STR R1, [R0, #+4]
00001914 E59F0604 LDR R0, [PC, #+1540]
00001918 E5900000 LDR R0, [R0, #+0]
0000191C E3A0101C MOV R1, #0x1C
00001920 E59F2618 LDR R2, [PC, #+1560]
00001924 E0202091 MLA R0, R1, R0, R2
00001928 E5900004 LDR R0, [R0, #+4]
0000192C E59F15EC LDR R1, [PC, #+1516]
00001930 E5911000 LDR R1, [R1, #+0]
00001934 E3A0201C MOV R2, #0x1C
00001938 E59F3600 LDR R3, [PC, #+1536]
0000193C E0213192 MLA R1, R2, R1, R3
00001940 E2911008 ADDS R1, R1, #0x8
00001944 E1500001 CMP R0, R1
00001948 1A00000C BNE 0x001980
0000194C E59F05CC LDR R0, [PC, #+1484]
00001950 E5900000 LDR R0, [R0, #+0]
00001954 E3A0101C MOV R1, #0x1C
00001958 E59F25E0 LDR R2, [PC, #+1504]
0000195C E0202091 MLA R0, R1, R0, R2
00001960 E59F15B8 LDR R1, [PC, #+1464]
00001964 E5911000 LDR R1, [R1, #+0]
00001968 E3A0201C MOV R2, #0x1C
0000196C E59F35CC LDR R3, [PC, #+1484]
00001970 E0213192 MLA R1, R2, R1, R3
00001974 E5911004 LDR R1, [R1, #+4]
00001978 E5911004 LDR R1, [R1, #+4]
0000197C E5801004 STR R1, [R0, #+4]
00001980 E59F0170 LDR R0, [PC, #+368]
00001984 E59F1594 LDR R1, [PC, #+1428]
00001988 E5911000 LDR R1, [R1, #+0]
0000198C E3A0201C MOV R2, #0x1C
00001990 E59F35A8 LDR R3, [PC, #+1448]
00001994 E0213192 MLA R1, R2, R1, R3
00001998 E5911004 LDR R1, [R1, #+4]
0000199C E591100C LDR R1, [R1, #+12]
000019A0 E5801000 STR R1, [R0, #+0]
000019A4 E59F014C LDR R0, [PC, #+332]
000019A8 E5900000 LDR R0, [R0, #+0]
000019AC E5900004 LDR R0, [R0, #+4]
000019B0 E59F1140 LDR R1, [PC, #+320]



Old method with optimization.

00001558 E5902000 LDR R2, [R0, #+0]
0000155C E0221293 MLA R2, R3, R2, R1
00001560 E5903000 LDR R3, [R0, #+0]
00001564 E3A0C01C MOV R12, #0x1C
00001568 E023139C MLA R3, R12, R3, R1
0000156C E5933004 LDR R3, [R3, #+4]
00001570 E5933004 LDR R3, [R3, #+4]
00001574 E5823004 STR R3, [R2, #+4]
00001578 E5902000 LDR R2, [R0, #+0]
0000157C E1A0300C MOV R3, R12
00001580 E0221293 MLA R2, R3, R2, R1
00001584 E5922004 LDR R2, [R2, #+4]
00001588 E5903000 LDR R3, [R0, #+0]
0000158C E023139C MLA R3, R12, R3, R1
00001590 E2833008 ADD R3, R3, #0x8
00001594 E1520003 CMP R2, R3
00001598 1A000007 BNE 0x0015BC
0000159C E5902000 LDR R2, [R0, #+0]
000015A0 E1A0300C MOV R3, R12
000015A4 E0221293 MLA R2, R3, R2, R1
000015A8 E5903000 LDR R3, [R0, #+0]
000015AC E023139C MLA R3, R12, R3, R1
000015B0 E5933004 LDR R3, [R3, #+4]
000015B4 E5933004 LDR R3, [R3, #+4]
000015B8 E5823004 STR R3, [R2, #+4]
000015BC E59F34B0 LDR R3, [PC, #+1200]
000015C0 E5900000 LDR R0, [R0, #+0]
000015C4 E1A0200C MOV R2, R12
000015C8 E0201092 MLA R0, R2, R0, R1
000015CC E5900004 LDR R0, [R0, #+4]
000015D0 E590000C LDR R0, [R0, #+12]
000015D4 E5830000 STR R0, [R3, #+0]



New method no optimisation:

000018E0 E59F05C4 LDR R0, [PC, #+1476]
000018E4 E5900000 LDR R0, [R0, #+0]
000018E8 E3A0101C MOV R1, #0x1C
000018EC E59F25D8 LDR R2, [PC, #+1496]
000018F0 E0202091 MLA R0, R1, R0, R2
000018F4 E1B04000 MOVS R4, R0
000018F8 E5940004 LDR R0, [R4, #+4]
000018FC E5900004 LDR R0, [R0, #+4]
00001900 E5840004 STR R0, [R4, #+4]
00001904 E5940004 LDR R0, [R4, #+4]
00001908 E2941008 ADDS R1, R4, #0x8
0000190C E1500001 CMP R0, R1
00001910 1A000002 BNE 0x001920
00001914 E5940004 LDR R0, [R4, #+4]
00001918 E5900004 LDR R0, [R0, #+4]
0000191C E5840004 STR R0, [R4, #+4]
00001920 E59F02F4 LDR R0, [PC, #+756]
00001924 E5941004 LDR R1, [R4, #+4]
00001928 E591100C LDR R1, [R1, #+12]
0000192C E5801000 STR R1, [R0, #+0]
00001930 E59F02E4 LDR R0, [PC, #+740]
00001934 E5900000 LDR R0, [R0, #+0]
00001938 E5900004 LDR R0, [R0, #+4]
0000193C E59F12D8 LDR R1, [PC, #+728]
00001940 E5911000 LDR R1, [R1, #+0]



New method with optimisation:

00001550 E5900000 LDR R0, [R0, #+0]
00001554 E1A02003 MOV R2, R3
00001558 E0201092 MLA R0, R2, R0, R1
0000155C E5901004 LDR R1, [R0, #+4]
00001560 E5911004 LDR R1, [R1, #+4]
00001564 E5801004 STR R1, [R0, #+4]
00001568 E2802008 ADD R2, R0, #0x8
0000156C E1510002 CMP R1, R2
00001570 1A000001 BNE 0x00157C
00001574 E5911004 LDR R1, [R1, #+4]
00001578 E5801004 STR R1, [R0, #+4]
0000157C E59F34A4 LDR R3, [PC, #+1188]
00001580 E5900004 LDR R0, [R0, #+4]
00001584 E590000C LDR R0, [R0, #+12]
00001588 E5830000 STR R0, [R3, #+0]


With the optimizer the difference is nowhere near as marked as in the AVR code you supplied.

RE: About macro: listGET_OWNER_OF_NEXT_ENTRY

Posted by Richard on July 27, 2007
I have updated the macro in SVN, and will include this in the next release (within the next few days). This seems to be the only place where this can be achieved without actually increasing the compiled code size.

In the change history within the file I have credited the change to B.R. Let me know your full name if you want it included in the project history file.

Thanks for your contribution.

Regards.

RE: About macro: listGET_OWNER_OF_NEXT_ENTRY

Posted by BraveBull on July 30, 2007
I feel highly honoured that my effort can be acknowledged.
My name : Niu Yong
Tianjin University
a Chinese man

RE: About macro: listGET_OWNER_OF_NEXT_ENTRY

Posted by David Hawks on August 3, 2007
Ouch! I just updated to V4.4.0 and got this "fix". Not only does it increase my compiled code size, but it also breaks the use of this macro in vTaskSwitchContext() for my compiler.

I am using Keil's C51 compiler on a derivative of the Cygnal 8051 port. C51 uses a compiled stack to compensate for the lack of stack space in the 8051. Prior to this change, vTaskSwitchContext() did not use any compiled stack space and was, therefore, intrinsically reentrant. This change causes vTaskSwitchContext() to use three bytes of the compiled stack for the new pointer copy, causing the function to no longer be reentrant.

I am forced to either add the compiler's inefficient "reentrant" attribute to the function (which would move the three bytes to a software run-time stack) or revert back to the previous version of the macro. I dislike both options.

Clearly, you can't please everyone (or every compiler) all of the time.

RE: About macro: listGET_OWNER_OF_NEXT_ENTRY

Posted by Richard on August 3, 2007
That's a bit of a bugger. I tried it on several ports and in each case the code size was smaller. I'm surprised this is not the case for Keil, where they make lots of claims on their code efficiency. Is the same true when you turn optimization on?

Regards.

p.s. It's not a 'fix' but an 'improvement', not in your case though :-(

RE: About macro: listGET_OWNER_OF_NEXT_ENTRY

Posted by Richard on August 3, 2007
Does vTaskSwitchContext() need to be reentrant? It is only used with interrupts disabled.

Regards.

RE: About macro: listGET_OWNER_OF_NEXT_ENTRY

Posted by David Hawks on August 6, 2007
Good point. As long as vTaskSwitchContext() is only called with interrupts disabled, it does not need to be tagged as reentrant. Currently vTaskSwitchContext() is called from the timer tick ISR and vPortYield(). I don't imagine that will change, so I could just remove the function from the overlay analysis. I like that this would only affect the makefile and not the source code.

P.S. Optimization was at maximum.

RE: About macro: listGET_OWNER_OF_NEXT_ENTRY

Posted by David Hawks on August 6, 2007
Update: I removed vTaskSwitchContext() from the overlay analysis and the resultant image uses one more byte of code space and three more bytes of RAM. I can live with that.

Thanks for the suggestion Richard!

