diff --git a/Documentation/platforms/avr/avrdx/docs/up_udelay.rst b/Documentation/platforms/avr/avrdx/docs/up_udelay.rst new file mode 100644 index 0000000000000..a89f954ff00cd --- /dev/null +++ b/Documentation/platforms/avr/avrdx/docs/up_udelay.rst @@ -0,0 +1,69 @@ +============================================= +AVR DA/DB family ``up_udelay`` implementation +============================================= + +NuttX provides functions for busy sleep; these are documented +:doc:`here `. These functions +use the ``BOARD_LOOPSPERMSEC`` configuration value to determine how many loops need +to be done to cause the requested delay. The creator of the board code is supposed +to calibrate this value to make the delay as precise as possible. + +This does not match well with AVR DA/DB microcontrollers because the CPU clock +frequency is configurable and unless the user chooses one and sticks with it, +any value in ``BOARD_LOOPSPERMSEC`` is going to be incorrect. + +Instead of using the default ``up_udelay`` function, a custom one is provided. +When called, it determines the current clock settings and infers the required loop count +from that. + +Configuration +============= + +This implementation is enabled automatically and can be turned off using +:menuselection:`System Type --> Use AVR DA/DB implementation of up_udelay` +configuration option. If enabled, there are two additional options: + +:menuselection:`External clock is not supported in up_udelay` and +:menuselection:`32.768kHz oscillator is not supported in up_udelay`. + +The latter simply excludes the code that checks if any 32.768kHz clock source +is in use. + +The former does the same thing for external clock but also has an additional +effect: when not selected, it does not set ``ARCH_HAVE_DYNAMIC_UDELAY`` +configuration option. 
This in turn means that ``BOARD_LOOPSPERMSEC`` will +need to be configured +in :menuselection:`System Type --> Delay loops per millisecond`. + +The reason for this is that with the external clock, there is no way of knowing +what the current CPU clock frequency is and it is therefore impossible +to calculate the loop count. It needs to be provided by the board code's author +to match the external clock the board is using. If the main clock prescaler is +active, the loop value is recalculated to take that into consideration. + +Note that ``BOARD_LOOPSPERMSEC`` also needs to be specified if this ``up_udelay`` +implementation is disabled altogether. + +The other functions - ``up_mdelay`` and ``up_ndelay`` - are unchanged. These +simply multiply or divide their time parameter by 1000 and pass the result +to ``up_udelay``. + +Precision +========= + +The loop count calculation takes time and that time depends on current settings +and requested wait time. For example, calculation for wait time below +180 microseconds when using high frequency oscillator can be done using 16-bit +arithmetic and without expensive division - unless the compiler is set +to optimize for code size. + +On the other hand - whenever the main clock prescaler is in effect, +division is unavoidable. + +The function attempts to account for this by guessing how much time all +the processing took and reducing the loop count accordingly. Nevertheless, +the wait may be considerably longer than requested for shorter delays +and lower clock speeds. + +(For example - 32.768kHz clock with prescaler of 2 needs to do two +divisions, taking roughly 42 milliseconds total.) 
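The figures quoted in the Precision section can be cross-checked with a small host-side sketch. It is not part of the patch: the helper names below are hypothetical, and the constants (6 cycles per loop pass, roughly 350 cycles per division) are taken from the comments in ``avrdx_delay.c``. It uses 64-bit arithmetic for clarity, which the real function avoids on purpose.

```c
#include <stdint.h>

/* Constants taken from the patch comments; helper names are hypothetical. */

#define SKETCH_CYCLES_PER_LOOP     6u    /* one pass of the delay loop */
#define SKETCH_CYCLES_PER_DIVISION 350u  /* approximate cost of one division */

/* Loop passes needed for the requested delay: (f * usec) / 6e6 */

static uint32_t sketch_loops_for_delay(uint32_t f_cpu_hz, uint32_t usec)
{
  uint64_t product = (uint64_t)f_cpu_hz * usec;

  return (uint32_t)(product / (SKETCH_CYCLES_PER_LOOP * 1000000u));
}

/* Time in microseconds consumed by the given number of divisions */

static uint32_t sketch_division_overhead_us(uint32_t f_cpu_hz,
                                            uint32_t divisions)
{
  uint64_t cycles = (uint64_t)divisions * SKETCH_CYCLES_PER_DIVISION;

  return (uint32_t)((cycles * 1000000u) / f_cpu_hz);
}
```

With a 32.768kHz clock and a prescaler of 2 (16384 Hz), two divisions cost about 42.7 ms under these assumptions, which matches the figure above. At 24 MHz, a 100 microsecond delay corresponds to 400 loop passes.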
diff --git a/Documentation/platforms/avr/avrdx/index.rst b/Documentation/platforms/avr/avrdx/index.rst index 059413978ca86..96b528b17e97a 100644 --- a/Documentation/platforms/avr/avrdx/index.rst +++ b/Documentation/platforms/avr/avrdx/index.rst @@ -46,6 +46,9 @@ in :menuselection:`RTOS Features --> Clocks and Timers --> Support tick-less OS` changed to value of at least 300. Higher value is recommended though, 300us is not going to be precise at all. +Architecture code for this CPU family provides custom ``up_udelay`` function. +More information can be found in :doc:`docs/up_udelay` document + Peripheral Support ================== diff --git a/Documentation/reference/os/sleep.rst b/Documentation/reference/os/sleep.rst index ce7f06ff5eba3..393cee5fe00a3 100644 --- a/Documentation/reference/os/sleep.rst +++ b/Documentation/reference/os/sleep.rst @@ -100,7 +100,8 @@ Busy Sleep Interfaces --------------------- Spin in a loop for the requested duration and never yield the CPU. The delay accuracy depends on -``CONFIG_BOARD_LOOPSPERMSEC``. +``CONFIG_BOARD_LOOPSPERMSEC`` (unless the architecture/board code replaces these +function(s) with an implementation that does not use the value.) .. c:function:: void up_mdelay(unsigned int milliseconds) diff --git a/arch/Kconfig b/arch/Kconfig index 7cd6ec4f10d41..0085b50d1ad9b 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -560,6 +560,36 @@ config ARCH_LDST_16BIT_NOT_ATOMIC (The same applies for non-interrupt context write and interrupt context read.) +config ARCH_HAVE_UDELAY + bool + default n + ---help--- + This configuration option is set for architectures that define + their own version of up_udelay. That function is defined as weak + but due to linker behaviour with regards to static libraries, + it was found that the overriding definition is not included + in the resulting binary, the default (weak) one is still used. + + (Found with GCC 14.2 on AVR and Risc-V but likely present + elsewhere too. 
Manifests when up_udelay override is the only + function in the source file.) + + If set, the default up_udelay function is excluded from build. + +config ARCH_HAVE_DYNAMIC_UDELAY + bool + default n + depends on ARCH_HAVE_UDELAY + ---help--- + This configuration option is set if architecture-specific up_udelay + function does not use CONFIG_BOARD_LOOPSPERMSEC. This should be used + for architectures/boards where the user has the option to choose + clock frequency of the CPU. + + (Keep in mind that up_udelay function may also be used in situations + after a failure - the implementation should always re-read current + hardware settings as anything stored in memory may be corrupted.) + config ARCH_HAVE_TESTSET bool default n @@ -1366,6 +1396,7 @@ comment "Board Settings" config BOARD_LOOPSPERMSEC int "Delay loops per millisecond" + depends on !ARCH_HAVE_DYNAMIC_UDELAY default -1 ---help--- Simple delay loops are used by some logic, especially during boot-up, diff --git a/arch/avr/src/avrdx/Kconfig b/arch/avr/src/avrdx/Kconfig index 3025f919cf61c..dcfe1e674d81c 100644 --- a/arch/avr/src/avrdx/Kconfig +++ b/arch/avr/src/avrdx/Kconfig @@ -583,4 +583,63 @@ config AVRDX_HFO_CLOCK_FREQ default 20000000 if AVRDX_HFO_CLOCK_20MHZ default 24000000 if AVRDX_HFO_CLOCK_24MHZ -endif +config AVRDX_ARCH_UDELAY + bool "Use AVR DA/DB implementation of up_udelay" + default y + select ARCH_HAVE_UDELAY + ---help--- + This option enables architecture-specific implementation + of up_udelay. + + The implementation is still a busy wait that runs a loop + as many times as needed for requested delay period. It is + "dynamic" meaning that it calculates number of loops needed + from current CPU frequency. + + If disabled, BOARD_LOOPSPERMSEC needs to be configured. 
+ +config AVRDX_ARCH_UDELAY_NO_EXTCLK + bool "External clock is not supported in up_udelay" + default y + depends on AVRDX_ARCH_UDELAY + select ARCH_HAVE_DYNAMIC_UDELAY + ---help--- + Selecting this option indicates that the board does not use + external clock source (EXTCLK setting in CLKCTRL.MCLKCTRLA.) + + The implementation of up_udelay is able to determine CPU + clock frequency of the other sources but not this one. + If external clock is in use, the implementation needs to fall + back to the value of BOARD_LOOPSPERMSEC which must + be specified. + + If external clock is not in use (and this option is set), + BOARD_LOOPSPERMSEC must be set to its default value of -1. + + Note that this option does not apply to the XOSC32K setting. + We are able to determine CPU clock frequency rather easily + when driven by external 32.768 kHz crystal oscillator. + + Also note that if this option is enabled and the board uses + external clock (EXTCLK), up_udelay will fail and wait for + maximum amount of time. + + Say Y if your board does not use external clock. + +config AVRDX_ARCH_UDELAY_NO_OSC32K + bool "32.768kHz oscillator is not supported in up_udelay" + default y + depends on AVRDX_ARCH_UDELAY + ---help--- + Selecting this option indicates that the board does not use + either of OSC32K and XOSC32K settings in CLKCTRL.MCLKCTRLA. + In other words - neither source of 32.768kHz clock is used. + + If this is set, the implementation of up_udelay will skip + support for these frequencies to save flash space. Note that + if one of these clock sources is used anyway, up_udelay + will fail and wait for maximum amount of time. + + Say Y if your board does not use 32.768kHz clock. 
+ +endif # if ARCH_CHIP_AVRDX diff --git a/arch/avr/src/avrdx/Make.defs b/arch/avr/src/avrdx/Make.defs index 1b92a37994b4e..203390efb4ef9 100644 --- a/arch/avr/src/avrdx/Make.defs +++ b/arch/avr/src/avrdx/Make.defs @@ -31,11 +31,15 @@ CMN_CSRCS = avr_allocateheap.c avr_copystate.c avr_createstack.c CMN_CSRCS += avr_doirq.c avr_exit.c avr_idle.c avr_initialize.c CMN_CSRCS += avr_initialstate.c avr_irq.c avr_lowputs.c CMN_CSRCS += avr_nputs.c avr_releasestack.c avr_registerdump.c -CMN_CSRCS += avr_schedulesigaction.c avr_sigdeliver.c avr_getintstack.c +CMN_CSRCS += avr_getintstack.c CMN_CSRCS += avr_stackframe.c avr_switchcontext.c avr_usestack.c # Configuration-dependent common files +ifeq ($(CONFIG_ENABLE_ALL_SIGNALS),y) +CMN_CSRCS += avr_schedulesigaction.c avr_sigdeliver.c +endif + ifeq ($(CONFIG_AVR_SPI),y) CMN_CSRCS += avr_spi.c endif @@ -55,6 +59,8 @@ CHIP_CSRCS = avrdx_lowconsole.c avrdx_lowinit.c avrdx_init.c CHIP_CSRCS += avrdx_serial.c avrdx_serial_early.c CHIP_CSRCS += avrdx_peripherals.c CHIP_CSRCS += avrdx_twi.c +CHIP_ASRCS += avrdx_delay_loop.S +CHIP_CSRCS += avrdx_delay.c # Configuration-dependent files diff --git a/arch/avr/src/avrdx/avrdx.h b/arch/avr/src/avrdx/avrdx.h index 84efae45da9d4..5b53c33337b89 100644 --- a/arch/avr/src/avrdx/avrdx.h +++ b/arch/avr/src/avrdx/avrdx.h @@ -90,19 +90,53 @@ extern "C" void up_clkinitialize(void); +/**************************************************************************** + * Name: avrdx_current_freq_main_prescaler + * + * Description: + * Reduces given frequency by main clock prescaler. (Note - this is also + * used for non-frequency values. Implementation of up_udelay uses this + * function to reduce number of needed loops when external clock is used.) 
+ * + * Input Parameters: + * frequency - input frequency + * + * Return value: output frequency in Hz + */ + +uint32_t avrdx_current_freq_main_prescaler(uint32_t frequency); + /**************************************************************************** * Name: avrdx_current_freq_per * * Description: - * Calculate and return current f_per + * Calculate and return current f_per (peripheral clock frequency) + * + * Returned Value: frequency in Hz. * - * Assumptions: - * Main clock source is internal oscillator + * Assumptions/Limitations: + * Main clock must not be driven by external clock. * ****************************************************************************/ uint32_t avrdx_current_freq_per(void); +/**************************************************************************** + * Name: avrdx_current_freq_cpu + * + * Description: + * Calculate and return current f_cpu (CPU frequency). Returns value + * of avrdx_current_freq_per because both clocks are identical. + * + * Returned Value: frequency in Hz + * + * Assumptions/Limitations: + * Main clock must not be driven by external clock. + * + ****************************************************************************/ + +uint32_t avrdx_current_freq_cpu(void); + /**************************************************************************** * Name: up_consoleinit * diff --git a/arch/avr/src/avrdx/avrdx_delay.c b/arch/avr/src/avrdx/avrdx_delay.c new file mode 100644 index 0000000000000..836276ee8da6b --- /dev/null +++ b/arch/avr/src/avrdx/avrdx_delay.c @@ -0,0 +1,350 @@ +/**************************************************************************** + * arch/avr/src/avrdx/avrdx_delay.c + * Custom implementation of up_udelay + * + * SPDX-License-Identifier: Apache-2.0 + * + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. 
The + * ASF licenses this file to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance with the + * License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT + * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the + * License for the specific language governing permissions and limitations + * under the License. + * + ****************************************************************************/ + +/**************************************************************************** + * Included Files + ****************************************************************************/ + +#include +#include +#include + +#include "avrdx.h" +#include "avrdx_delay_loop.h" + +/**************************************************************************** + * Pre-processor Definitions + ****************************************************************************/ + +/* Division takes some 350 clock cycles and the delay loop needs 6 cycles + * per loop. Each division therefore takes as much time as this number + * of loops. + */ + +#define DIVISION_LOOP_INCREMENT 55 + +/* Multiplication is not that difficult */ + +#define MULTIPLICATION_LOOP_INCREMENT 5 + +/* Detect if divisions are optimized for size - that means the program + * will actually call the division function even when dividing by constant + * (which can be done without the actual division.) 
+ */ + +#if (defined(CONFIG_ARCH_TOOLCHAIN_GCC) && defined(__OPTIMIZE_SIZE__)) +# define DIVISION_OPTIMIZED_FOR_SIZE +#endif + +/**************************************************************************** + * Private Types + ****************************************************************************/ + +/**************************************************************************** + * Private Function Prototypes + ****************************************************************************/ + +/**************************************************************************** + * Private Data + ****************************************************************************/ + +/**************************************************************************** + * Public Data + ****************************************************************************/ + +/**************************************************************************** + * Private Functions + ****************************************************************************/ + +/**************************************************************************** + * Public Functions + ****************************************************************************/ + +/**************************************************************************** + * Name: up_udelay + * + * Description: + * Delay execution for requested number of microseconds. + * + * This is a replacement for core implementation (which is defined + * as weak) specific for AVR DA/DB (AVRxt) cores. The main difference + * is the means used to calculate the number of cycles. The core function + * uses CONFIG_BOARD_LOOPSPERMSEC, a value configured by either the user, + * or the author of board code. + * + * AVRxt chips can run with various clock sources and those can be further + * configured and prescaled during runtime. As such, there is no single + * universal value of CONFIG_BOARD_LOOPSPERMSEC (unless the board uses + * external clock source.) 
+ * + * This implementation therefore attempts to determine current running + * frequency of the MCU and calculate needed loop count from that. + * + * Note - same as core implementation of up_udelay, the function waits + * in a busy-wait loop. As such: + * + * * It is *** NOT multi-tasking friendly *** + * * If interrupted (by interrupt handler or preempted by another task), + * the total wait time is increased by the duration of the outside code. + * + * Input Parameters: + * microseconds - requested wait time + * + * Returned Value: none + * + * Assumptions/Limitations: + * Conversion from microseconds is done in run-time and can take + * a lot of cycles itself. + * + ****************************************************************************/ + +void up_udelay(useconds_t microseconds) +{ + /* Determined frequency of the CPU. 16bit variable is used + * to force math with smaller data types. + */ + + uint32_t f_cpu; + uint16_t f_cpu_shifted; + + /* Variable for determined loop count. The counter accumulates + * number of loops that were already "done" by computations + * in this function. + */ + + uint32_t loops; + uint16_t loop_like_counter = 0; + + /* This is added to loop_like_counter whenever this function + * does something that divides clock frequency by main prescaler + * if the main prescaler is enabled. + */ + + uint8_t main_clock_prescaler = 0; + + /* Other helper variables. */ + +#ifndef DIVISION_OPTIMIZED_FOR_SIZE + uint8_t microseconds_u8; +#endif + uint8_t mclkctrla; + + if (microseconds == 0) + { + return; + } + + if (CLKCTRL.MCLKCTRLB & CLKCTRL_PEN_bm) + { + /* Main prescaler is enabled. Frequency is divided + * by prescaling ratio, loop count calculation is slowed + * by that. + */ + + main_clock_prescaler = DIVISION_LOOP_INCREMENT; + } + + /* We need to convert duration in microseconds to loop count. + * + * - one clock cycle takes 1/f seconds (where f is CPU frequency.) + * - one loop of avrdx_delay_loop takes 6 clock cycles. 
Duration + * of one loop pass is therefore 6/f seconds. + * - we receive microseconds as a parameter, duration of one loop + * pass is 6e6/f microseconds + * - total number of required loops is therefore: + * usec/(6e6/f) == (f * usec) / 6e6 + */ + + f_cpu = avrdx_current_freq_cpu(); + + /* Increment the counter if the main clock prescaler is enabled, + * ie. if the frequency obtained from the hardware got divided. + */ + + loop_like_counter += main_clock_prescaler; + + mclkctrla = CLKCTRL.MCLKCTRLA & CLKCTRL_CLKSEL_GM; + if (mclkctrla == CLKCTRL_CLKSEL_OSCHF_GC) + { + /* All of this works with 32 bit operands and the calculation + * will involve some divisions and those are expensive. We can + * attempt to alleviate that by forcing 16 bit operation instead. + * That is possible but some conditions need to be met: + * + * 1. f_cpu is more than 1MHz (that's minimum for high frequency + * oscillator but it can be prescaled + * 2. microseconds is less than 180 + * 3. compiler is not optimizing for size + * + * The first condition allows us to bitshift the frequency by 16 + * without losing too much accuracy. + * + * The second condition makes sure that the result fits into 16 bit + * value even with maximum f_cpu (24MHz) + * + * The third condition accounts for the fact that the compiler + * is able to avoid division when dividing by a compile-time + * constant but will not do it if optimizing for size. + */ + +#ifndef DIVISION_OPTIMIZED_FOR_SIZE + if ((microseconds < 180) && (f_cpu >= 1000000)) + { + /* Condition above allows us to bitshift the frequency 16 bits + * to the right, we will bitshift the denominator as well + * to compensate + */ + + f_cpu_shifted = f_cpu >> 16; + microseconds_u8 = microseconds; + loops = microseconds_u8 * f_cpu_shifted / (6000000 >> 16); + } + else +#endif + { + /* Cannot do 16 bit math above for some reason. 
We still need + * to bitshift the inputs to the multiplication though, + * otherwise any delay (value of microseconds) greater than 178 + * overflows even 32 bit math. + */ + + if (f_cpu >= 1000000) + { + f_cpu_shifted = f_cpu >> 16; + } + else + { + /* This is safe to do - f_cpu is at least 41666, which + * is 1MHz / 24 from prescaler. + */ + + f_cpu_shifted = f_cpu >> 8; + + /* Still need to shift something by 8 to the right. Can't do + * microseconds blindly though - the compiler may know + * (from the test above) that the value is less than 180 + * and use that knowledge here to determine that bitshifted + * value is 0. Calculated value of loops will be 0 too and + * the function terminates because value of 0 will always + * be less than loop_like_counter which is non-zero. + * + * However - since we know frequency is less than 1MHz, + * we also know that the prescaler is applied. Reading f_cpu + * therefore did a division for 350 clock cycles. Highest + * value of f_cpu can therefore be 833333Hz (20MHz oscillator + * divided by 24 prescaler.) That is 1.2us per cycle. + * + * The division therefore took 420us. If the requested delay + * value is less than 256us, we already achieved that. + * + * So in the end - we CAN do the bitshift blindly. + */ + + microseconds = microseconds >> 8; + } + + /* This compiles into 2x16bit to 32bit multiplication, followed + * by division. Some 350 cycles for the division, + * 40 for multiplication, this function takes some, let's say + * 420 clock cycles total. This means that this function already + * did 70 loops by itself, without entering the actual loop. 
+ */ + + loops = microseconds * f_cpu_shifted / (6000000 >> 16); + loop_like_counter += (DIVISION_LOOP_INCREMENT + \ + MULTIPLICATION_LOOP_INCREMENT); + } + } + +#ifndef CONFIG_AVRDX_ARCH_UDELAY_NO_OSC32K + else if ((mclkctrla == CLKCTRL_CLKSEL_OSC32K_GC) || \ + (mclkctrla == CLKCTRL_CLKSEL_XOSC32K_GC)) + { + /* Unlike with the high frequency oscillator, we can't bitshift + * the CPU frequency by 16 bits, that would erase it to zero. + * We can only do bitshift by 8 bits (and we have to because + * otherwise the multiplication between f_cpu and microseconds + * overflows for 2^17 (131072) microseconds or more. + * + * Also cannot assume it's actually 32.768k - main clock prescaler + * may be in effect. We will check for that one though, that could + * save us a lot of time. + */ + + if (main_clock_prescaler) + { + /* Main prescaler in use, we are out of luck and need + * to do the computation. Considering that the CPU clock + * is slow and we need to divide (and that obtaining + * f_cpu also did a division), the accuracy of this + * function will most likely be way off.) + */ + + f_cpu_shifted = f_cpu >> 8; + loops = f_cpu_shifted * microseconds / (6000000 >> 8); + loop_like_counter += DIVISION_LOOP_INCREMENT; + } + else + { + /* Main prescaler not in use, frequency is a known constant. 
+ * + */ + + loops = (32768 >> 8) * microseconds / (6000000 >> 8); + + /* Does no division unless optimizing for size */ + +# ifdef DIVISION_OPTIMIZED_FOR_SIZE + loop_like_counter += DIVISION_LOOP_INCREMENT; +# endif + } + } +#endif + +#ifndef CONFIG_AVRDX_ARCH_UDELAY_NO_EXTCLK + else if (mclkctrla == CLKCTRL_CLKSEL_EXTCLK_GC) + { + /* BOARD_LOOPSPERMSEC is in loops per millisecond, scale it + * to the requested number of microseconds. Then reduce + * the result by the main clock prescaler (if enabled.) + */ + + loops = (microseconds * CONFIG_BOARD_LOOPSPERMSEC) / USEC_PER_MSEC; + loops = avrdx_current_freq_main_prescaler(loops); + loop_like_counter += main_clock_prescaler; + } +#endif + + else + { + /* Unsupported clock, wait as long as possible */ + + loops = UINT32_MAX; + } + + if (loops <= loop_like_counter) + { + /* This function already took at least as much time as it was + * supposed to. Note - must not pass 0 to avrdx_delay_loop. + */ + + return; + } + + avrdx_delay_loop(loops - loop_like_counter); +} diff --git a/arch/avr/src/avrdx/avrdx_delay_loop.S b/arch/avr/src/avrdx/avrdx_delay_loop.S new file mode 100644 index 0000000000000..0883075e4667c --- /dev/null +++ b/arch/avr/src/avrdx/avrdx_delay_loop.S @@ -0,0 +1,81 @@ +/**************************************************************************** + * arch/avr/src/avrdx/avrdx_delay_loop.S + * + * SPDX-License-Identifier: Apache-2.0 + * + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. The + * ASF licenses this file to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance with the + * License. 
You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT + * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
See the + * License for the specific language governing permissions and limitations + * under the License. + * + ****************************************************************************/ + +/**************************************************************************** + * Included Files + ****************************************************************************/ + +#include + +/**************************************************************************** + * External Symbols + ****************************************************************************/ + + .file "avrdx_delay_loop.S" + +/**************************************************************************** + * Name: avrdx_delay_loop + * + * Description: + * Spins in a loop to cause a delay. On chips with AVRxt cores, it takes: + * (3 + count * 6 + call) cycles where count is the parameter and call + * is the number of cycles needed for instruction that jumps into this + * function (2 cycles for RCALL or 3 cycles for CALL, depends on what + * the compiler/linker does.) + * + * Input Parameters: + * Loop count - uint32_t (registers r22:r25) + * + * Return value: none + * + * Assumptions/Limitations: + * - only called from up_udelay + * - count is not zero + * + ****************************************************************************/ + + .global avrdx_delay_loop + +avrdx_delay_loop: + + ; Because: + ; - all of the registers that contain loop count are call-clobbered. + ; - up_udelay does not call this function with count set to zero + ; we can start counting right away + + ; Note - r22 cannot be used with SBIW. SBIW takes two clock cycles + ; though so we can achieve the same timing with SUBI/SBC + +1: + subi r22, 1 ; subtract immediate, 1 cycle + sbc r23, r1 ; subtract zero register from r23 with carry + ; 1 cycle + sbc r24, r1 ; 1 cycle + sbc r25, r1 ; 1 cycle + brne 1b ; 2 cycles if the condition is true (ie. 
if branching) + ; 1 cycle otherwise + + ; 1 pass through the loop takes 6 cycles total (except for last pass + ; which takes 5 cycles) + +2: + ret ; 4 cycles diff --git a/arch/avr/src/avrdx/avrdx_delay_loop.h b/arch/avr/src/avrdx/avrdx_delay_loop.h new file mode 100644 index 0000000000000..bc894323a59f8 --- /dev/null +++ b/arch/avr/src/avrdx/avrdx_delay_loop.h @@ -0,0 +1,92 @@ +/**************************************************************************** + * arch/avr/src/avrdx/avrdx_delay_loop.h + * + * SPDX-License-Identifier: Apache-2.0 + * + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. The + * ASF licenses this file to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance with the + * License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT + * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the + * License for the specific language governing permissions and limitations + * under the License. 
+ * + ****************************************************************************/ + +#ifndef __ARCH_AVR_SRC_AVRDX_AVRDX_DELAY_LOOP_H +#define __ARCH_AVR_SRC_AVRDX_AVRDX_DELAY_LOOP_H + +/**************************************************************************** + * Included Files + ****************************************************************************/ + +#include + +/**************************************************************************** + * Pre-processor Definitions + ****************************************************************************/ + +/**************************************************************************** + * Public Types + ****************************************************************************/ + +#ifndef __ASSEMBLY__ + +/**************************************************************************** + * Public Data + ****************************************************************************/ + +#ifdef __cplusplus +#define EXTERN extern "C" +extern "C" +{ +#else +#define EXTERN extern +#endif + +/**************************************************************************** + * Inline Functions + ****************************************************************************/ + +/**************************************************************************** + * Public Function Prototypes + ****************************************************************************/ + +/**************************************************************************** + * Name: avrdx_delay_loop + * + * Description: + * Spins in a loop to cause a delay. On chips with AVRxt cores, it takes: + * (3 + count * 6 + call) cycles where count is the parameter and call + * is the number of cycles needed for instruction that jumps into this + * function (2 cycles for RCALL or 3 cycles for CALL, depends on what + * the compiler/linker does.) 
+ * + * Input Parameters: + * Loop count - uint32_t (registers r22:r25) + * + * Return value: none + * + * Assumptions/Limitations: + * - only called from up_udelay + * - count is not zero + * + ****************************************************************************/ + +void avrdx_delay_loop(uint32_t count); + +#undef EXTERN +#ifdef __cplusplus +} +#endif + +#endif /* __ASSEMBLY__ */ + +#endif diff --git a/arch/avr/src/avrdx/avrdx_peripherals.c b/arch/avr/src/avrdx/avrdx_peripherals.c index a7442280f3910..2cc260b9440bb 100644 --- a/arch/avr/src/avrdx/avrdx_peripherals.c +++ b/arch/avr/src/avrdx/avrdx_peripherals.c @@ -301,11 +301,7 @@ const IOBJ uint8_t avrdx_usart_portmux_masks[] = ****************************************************************************/ /**************************************************************************** - * Public Functions - ****************************************************************************/ - -/**************************************************************************** - * Name: avrdx_current_freq_per + * Name: avrdx_current_freq_per_oschf * * Description: * Calculate and return current f_per (peripheral clock frequency) @@ -315,15 +311,13 @@ const IOBJ uint8_t avrdx_usart_portmux_masks[] = * ****************************************************************************/ -uint32_t avrdx_current_freq_per() +static uint32_t avrdx_current_freq_per_oschf(void) { uint32_t f_per; /* Shortcut variables */ uint8_t frqsel; - uint8_t pdiv; - uint8_t mclkctrlb; /* Calculate frequency in MHz, then divide it by main prescaler, * if set. 
@@ -332,23 +326,104 @@ uint32_t avrdx_current_freq_per() frqsel = (CLKCTRL.OSCHFCTRLA & CLKCTRL_FRQSEL_GM) >> CLKCTRL_FRQSEL_GP; f_per = 1000000UL * avrdx_frqsel_mhz[frqsel]; + return avrdx_current_freq_main_prescaler(f_per); +} + +/**************************************************************************** + * Public Functions + ****************************************************************************/ + +/**************************************************************************** + * Name: avrdx_current_freq_main_prescaler + * + * Description: + * Reduces given frequency by main clock prescaler. (Note - this is also + * used for non-frequency values. Implementation of up_udelay uses this + * function to reduce number of needed loops when external clock is used.) + * + * Input Parameters: + * frequency - input frequency + * + * Return value: output frequency in Hz + */ + +uint32_t avrdx_current_freq_main_prescaler(uint32_t frequency) +{ + uint8_t mclkctrlb; + uint8_t pdiv; + /* Read this once, no point in re-reading */ mclkctrlb = CLKCTRL.MCLKCTRLB; if (mclkctrlb & CLKCTRL_PEN_bm) { pdiv = (mclkctrlb & CLKCTRL_PDIV_GM) >> CLKCTRL_PDIV_GP; - f_per /= avrdx_main_pdiv[pdiv]; + frequency /= avrdx_main_pdiv[pdiv]; + } + + return frequency; +} + +/**************************************************************************** + * Name: avrdx_current_freq_per + * + * Description: + * Calculate and return current f_per (peripheral clock frequency) + * + * Returned Value: frequency in Hz. + * + * Assumptions/Limitations: + * Main clock must not be driven by external clock. 
+ * + ****************************************************************************/ + +uint32_t avrdx_current_freq_per(void) +{ + uint8_t mclkctrla; + + mclkctrla = CLKCTRL.MCLKCTRLA & CLKCTRL_CLKSEL_GM; + + /* Using if - else to deal with the most likely case first */ + + if (mclkctrla == CLKCTRL_CLKSEL_OSCHF_GC) + { + /* Internal high frequency oscillator */ + + return avrdx_current_freq_per_oschf(); + } + else if ((mclkctrla == CLKCTRL_CLKSEL_OSC32K_GC) || \ + (mclkctrla == CLKCTRL_CLKSEL_XOSC32K_GC)) + { + /* 32768 oscillator or external oscillator/crystal. + * Only apply the prescaler. + */ + + return avrdx_current_freq_main_prescaler(32768); } - /* Currently, rest of the code only supports internal oscillator - * and its frequency is pre-set using Kconfig. Nevertheless, that - * can change at some point and this function accounts for some - * of that. - * - * It doesn't account for the chip being clocked by external source - * though, that's to be done. + /* External clock. This is not supported and we can not determine + * the frequency. This will likely cause a failure (unless the caller + * has this case handled using other means.) */ - return f_per; + return 0; +} + +/**************************************************************************** + * Name: avrdx_current_freq_cpu + * + * Description: + * Calculate and return current f_cpu (CPU frequency). Returns value + * of avrdx_current_freq_per because both clocks are identical. + * + * Returned Value: frequency in Hz + * + * Assumptions/Limitations: + * Main clock must not be driven by external clock. 
+ * + ****************************************************************************/ + +uint32_t avrdx_current_freq_cpu(void) +{ + return avrdx_current_freq_per(); } diff --git a/arch/avr/src/avrdx/iodefs/avr128da28.h b/arch/avr/src/avrdx/iodefs/avr128da28.h index 6dbc143bc35a4..abb16bcd5ce0e 100644 --- a/arch/avr/src/avrdx/iodefs/avr128da28.h +++ b/arch/avr/src/avrdx/iodefs/avr128da28.h @@ -64,6 +64,16 @@ #define PORT_ISC_BOTHEDGES_GC ( PORT_ISC_0_bm ) +/* CLKCTRL.MCLKCTRLA */ + +#define CLKCTRL_CLKSEL_GM ( CLKCTRL_CLKSEL_0_bm | CLKCTRL_CLKSEL_1_bm | \ + CLKCTRL_CLKSEL_2_bm ) + +#define CLKCTRL_CLKSEL_OSCHF_GC (0) +#define CLKCTRL_CLKSEL_OSC32K_GC ( CLKCTRL_CLKSEL_0_bm ) +#define CLKCTRL_CLKSEL_XOSC32K_GC ( CLKCTRL_CLKSEL_1_bm ) +#define CLKCTRL_CLKSEL_EXTCLK_GC ( CLKCTRL_CLKSEL_0_bm | CLKCTRL_CLKSEL_1_bm ) + /* CLKCTRL.MCLKCTRLB */ #define CLKCTRL_PDIV_GM ( CLKCTRL_PDIV_0_bm | CLKCTRL_PDIV_1_bm | \ diff --git a/arch/avr/src/avrdx/iodefs/avr128da64.h b/arch/avr/src/avrdx/iodefs/avr128da64.h index f00fef5bbce7a..8ce2e251e07f0 100644 --- a/arch/avr/src/avrdx/iodefs/avr128da64.h +++ b/arch/avr/src/avrdx/iodefs/avr128da64.h @@ -89,6 +89,16 @@ #define PORT_ISC_BOTHEDGES_GC ( PORT_ISC_0_bm ) +/* CLKCTRL.MCLKCTRLA */ + +#define CLKCTRL_CLKSEL_GM ( CLKCTRL_CLKSEL_0_bm | CLKCTRL_CLKSEL_1_bm | \ + CLKCTRL_CLKSEL_2_bm ) + +#define CLKCTRL_CLKSEL_OSCHF_GC (0) +#define CLKCTRL_CLKSEL_OSC32K_GC ( CLKCTRL_CLKSEL_0_bm ) +#define CLKCTRL_CLKSEL_XOSC32K_GC ( CLKCTRL_CLKSEL_1_bm ) +#define CLKCTRL_CLKSEL_EXTCLK_GC ( CLKCTRL_CLKSEL_0_bm | CLKCTRL_CLKSEL_1_bm ) + /* CLKCTRL.MCLKCTRLB */ #define CLKCTRL_PDIV_GM ( CLKCTRL_PDIV_0_bm | CLKCTRL_PDIV_1_bm | \ diff --git a/arch/avr/src/avrdx/iodefs/avr128db64.h b/arch/avr/src/avrdx/iodefs/avr128db64.h index 6bd0456f6cabd..7fa48d800504a 100644 --- a/arch/avr/src/avrdx/iodefs/avr128db64.h +++ b/arch/avr/src/avrdx/iodefs/avr128db64.h @@ -89,6 +89,16 @@ #define PORT_ISC_BOTHEDGES_GC ( PORT_ISC_0_bm ) +/* CLKCTRL.MCLKCTRLA */ + +#define 
CLKCTRL_CLKSEL_GM ( CLKCTRL_CLKSEL_0_bm | CLKCTRL_CLKSEL_1_bm | \ + CLKCTRL_CLKSEL_2_bm ) + +#define CLKCTRL_CLKSEL_OSCHF_GC (0) +#define CLKCTRL_CLKSEL_OSC32K_GC ( CLKCTRL_CLKSEL_0_bm ) +#define CLKCTRL_CLKSEL_XOSC32K_GC ( CLKCTRL_CLKSEL_1_bm ) +#define CLKCTRL_CLKSEL_EXTCLK_GC ( CLKCTRL_CLKSEL_0_bm | CLKCTRL_CLKSEL_1_bm ) + /* CLKCTRL.MCLKCTRLB */ #define CLKCTRL_PDIV_GM ( CLKCTRL_PDIV_0_bm | CLKCTRL_PDIV_1_bm | \ diff --git a/boards/avr/avrdx/breadxavr/src/Makefile b/boards/avr/avrdx/breadxavr/src/Makefile index f5ef01580aa89..da62b5457ca11 100644 --- a/boards/avr/avrdx/breadxavr/src/Makefile +++ b/boards/avr/avrdx/breadxavr/src/Makefile @@ -22,16 +22,12 @@ include $(TOPDIR)/Make.defs -CSRCS = avrdx_boot.c +CSRCS = avrdx_boot.c avrdx_init.c ifeq ($(CONFIG_ARCH_LEDS),y) CSRCS += avr_leds.c endif -ifeq ($(CONFIG_BOARD_EARLY_INITIALIZE),y) -CSRCS += avrdx_init.c -endif - ifeq ($(CONFIG_BREADXAVR_BUTTONS_DRIVER),y) CSRCS += avrdx_buttons.c endif diff --git a/boards/avr/avrdx/breadxavr/src/avrdx_init.c b/boards/avr/avrdx/breadxavr/src/avrdx_init.c index 6e09599e38f5c..096c1f9730b47 100644 --- a/boards/avr/avrdx/breadxavr/src/avrdx_init.c +++ b/boards/avr/avrdx/breadxavr/src/avrdx_init.c @@ -89,3 +89,32 @@ void board_early_initialize(void) } #endif /* CONFIG_BOARD_EARLY_INITIALIZE */ + +/**************************************************************************** + * Name: board_late_initialize + * + * Description: + * Excerpt from Kconfig: + * + * If CONFIG_BOARD_LATE_INITIALIZE is set, board_late_initialize() is + * called after up_initialize() just before the main application is + * started. It can be used to initialize more complex, board-specific + * device drivers. + * + * Function runs on a temporary, internal kernel thread and can therefore + * wait for events. + * + * Currently, the board has no need for this function but the Kconfig + * option is enabled by default so the function either needs to be defined, + * or default configuration needs to be changed. 
This is why this function + * exists but is empty. + * + ****************************************************************************/ + +#ifdef CONFIG_BOARD_LATE_INITIALIZE + +void board_late_initialize(void) +{ +} + +#endif /* CONFIG_BOARD_LATE_INITIALIZE */ diff --git a/drivers/timers/arch_alarm.c b/drivers/timers/arch_alarm.c index 557a326f36e10..b8eb29bf8afc1 100644 --- a/drivers/timers/arch_alarm.c +++ b/drivers/timers/arch_alarm.c @@ -39,8 +39,16 @@ /* If no value is given, we proceed with 0 since a one-shot timer is used for * accurate delays. A runtime DEBUGASSERT catches the case where the one-shot * timer lower-half isn't registered in time. + * + * If ARCH_HAVE_DYNAMIC_UDELAY is set, BOARD_LOOPSPERMSEC is unset. + * Considering the above, it should not be used. Set a default value of -1, + * turning this case into an already handled one. */ +#ifndef CONFIG_BOARD_LOOPSPERMSEC +# define CONFIG_BOARD_LOOPSPERMSEC -1 +#endif + #if CONFIG_BOARD_LOOPSPERMSEC == -1 # undef CONFIG_BOARD_LOOPSPERMSEC # define CONFIG_BOARD_LOOPSPERMSEC 0 @@ -181,13 +189,21 @@ void weak_function up_mdelay(unsigned int milliseconds) * Delay inline for the requested number of microseconds. * WARNING: NOT multi-tasking friendly * + * This function is both compiled optionally based on ARCH_HAVE_UDELAY + * and declared with weak attribute. See comment of up_udelay + * implementation in sched/clock/clock_delay.c for explanation. 
+ * ****************************************************************************/ +#ifndef CONFIG_ARCH_HAVE_UDELAY + void weak_function up_udelay(useconds_t microseconds) { up_ndelay(NSEC_PER_USEC * microseconds); } +#endif + /**************************************************************************** * Name: up_ndelay * diff --git a/drivers/timers/arch_timer.c b/drivers/timers/arch_timer.c index c3cf4f83d060d..5ee0da8c1de03 100644 --- a/drivers/timers/arch_timer.c +++ b/drivers/timers/arch_timer.c @@ -39,16 +39,22 @@ /* If no value is given, we proceed with 0 since a timer is used for accurate * delays. A runtime DEBUGASSERT catches the case where the timer lower-half * isn't registered in time. + * + * Value is unset if ARCH_HAVE_DYNAMIC_UDELAY is set. In that case, + * ARCH_HAVE_UDELAY is also set and the only user of these values + * (udelay_coarse) is excluded from the build. */ -#if CONFIG_BOARD_LOOPSPERMSEC == -1 -# undef CONFIG_BOARD_LOOPSPERMSEC -# define CONFIG_BOARD_LOOPSPERMSEC 0 -#endif +#ifndef CONFIG_ARCH_HAVE_UDELAY +# if CONFIG_BOARD_LOOPSPERMSEC == -1 +# undef CONFIG_BOARD_LOOPSPERMSEC +# define CONFIG_BOARD_LOOPSPERMSEC 0 +# endif -#define CONFIG_BOARD_LOOPSPER100USEC ((CONFIG_BOARD_LOOPSPERMSEC+5)/10) -#define CONFIG_BOARD_LOOPSPER10USEC ((CONFIG_BOARD_LOOPSPERMSEC+50)/100) -#define CONFIG_BOARD_LOOPSPERUSEC ((CONFIG_BOARD_LOOPSPERMSEC+500)/1000) +# define CONFIG_BOARD_LOOPSPER100USEC ((CONFIG_BOARD_LOOPSPERMSEC+5)/10) +# define CONFIG_BOARD_LOOPSPER10USEC ((CONFIG_BOARD_LOOPSPERMSEC+50)/100) +# define CONFIG_BOARD_LOOPSPERUSEC ((CONFIG_BOARD_LOOPSPERMSEC+500)/1000) +#endif /**************************************************************************** * Private Types @@ -127,6 +133,18 @@ static void udelay_accurate(useconds_t microseconds) } } +#ifndef CONFIG_ARCH_HAVE_UDELAY + +/**************************************************************************** + * Name: udelay_coarse + * + * Description: + * Wait loop called (only) by up_udelay if 
udelay_accurate + * is not available. (Excluded from the build if up_udelay is also + * excluded from the build.) + * + ****************************************************************************/ + static void udelay_coarse(useconds_t microseconds) { volatile int i; @@ -176,6 +194,8 @@ static void udelay_coarse(useconds_t microseconds) } } +#endif /* ifndef CONFIG_ARCH_HAVE_UDELAY */ + static bool timer_callback(FAR uint32_t *next_interval, FAR void *arg) { #ifdef CONFIG_SCHED_TICKLESS @@ -461,8 +481,14 @@ void weak_function up_mdelay(unsigned int milliseconds) * * *** NOT multi-tasking friendly *** * + * This function is both compiled optionally based on ARCH_HAVE_UDELAY + * and declared with weak attribute. See comment of up_udelay + * implementation in sched/clock/clock_delay.c for explanation. + * ****************************************************************************/ +#ifndef CONFIG_ARCH_HAVE_UDELAY + void weak_function up_udelay(useconds_t microseconds) { if (g_timer.lower != NULL) @@ -475,6 +501,8 @@ void weak_function up_udelay(useconds_t microseconds) } } +#endif + /**************************************************************************** * Name: up_ndelay * diff --git a/sched/clock/clock_delay.c b/sched/clock/clock_delay.c index cf3e2f0c902d7..f217477505a3b 100644 --- a/sched/clock/clock_delay.c +++ b/sched/clock/clock_delay.c @@ -67,17 +67,27 @@ "new value to apache/nuttx." #endif +#ifndef CONFIG_ARCH_HAVE_DYNAMIC_UDELAY static_assert(CONFIG_BOARD_LOOPSPERMSEC != -1, "Configure BOARD_LOOPSPERMSEC to non-default value."); -#define CONFIG_BOARD_LOOPSPER100USEC ((CONFIG_BOARD_LOOPSPERMSEC+5)/10) -#define CONFIG_BOARD_LOOPSPER10USEC ((CONFIG_BOARD_LOOPSPERMSEC+50)/100) -#define CONFIG_BOARD_LOOPSPERUSEC ((CONFIG_BOARD_LOOPSPERMSEC+500)/1000) +/* If ARCH_HAVE_DYNAMIC_UDELAY is set, BOARD_LOOPSPERMSEC is unset, + * calculate these optionally. (Only used in udelay_coarse and + * that function is excluded from the build if these end up undefined.) 
+ */ + +# define CONFIG_BOARD_LOOPSPER100USEC ((CONFIG_BOARD_LOOPSPERMSEC+5)/10) +# define CONFIG_BOARD_LOOPSPER10USEC ((CONFIG_BOARD_LOOPSPERMSEC+50)/100) +# define CONFIG_BOARD_LOOPSPERUSEC ((CONFIG_BOARD_LOOPSPERMSEC+500)/1000) + +#endif /**************************************************************************** * Private Functions ****************************************************************************/ +#ifndef CONFIG_ARCH_HAVE_UDELAY + static void udelay_coarse(useconds_t microseconds) { volatile int i; @@ -125,6 +135,8 @@ static void udelay_coarse(useconds_t microseconds) } } +#endif + /**************************************************************************** * Public Functions ****************************************************************************/ @@ -151,13 +163,38 @@ void weak_function up_mdelay(unsigned int milliseconds) * * *** NOT multi-tasking friendly *** * + * Note that this method is both marked weak and compiled optionally + * based on ARCH_HAVE_UDELAY option. This is redundant but kept + * as a temporary measure. + * + * It was discovered that sometimes the weak attribute is not enough + * to have the method replaced with other, non-weak implementation. + * (Described in more detail in help text for the ARCH_HAVE_UDELAY + * Kconfig option.) The ARCH_HAVE_UDELAY option is meant to remedy + * this but the author of the change does not own all the hardware + * that implements its own up_udelay and is affected by this problem. + * + * Enabling ARCH_HAVE_UDELAY for every such chip is therefore impossible, + * the change would be untested. Instead, the weak attribute is kept, + * preserving previous behaviour of current code. If, at some future + * time, the change is tested for other chips that implement up_udelay, + * the weak attribute and this comment can be removed. + * + * The same also applies to up_udelay implementations + * in drivers/timers/arch_alarm.c and drivers/timers/arch_timer.c + * and this comment is referenced there. 
+ * ****************************************************************************/ +#ifndef CONFIG_ARCH_HAVE_UDELAY + void weak_function up_udelay(useconds_t microseconds) { udelay_coarse(microseconds); } +#endif + /**************************************************************************** * Name: up_ndelay *