mirror of https://github.com/esp8266/Arduino.git synced 2025-04-25 20:02:37 +03:00

Waveform: fix significant jitter, that stresses servos and is clearly audible in Tone output (#7022)

* Allow 100% high or low periods.

Let output remain at current level on stopping instead of always turning to low.

* Fix serious jitter issues in previous versions.

* Use ESP.getCycleCount() just like everyone else.

* Highest timer rate at which this runs stable appears to be 2µs (500kHz).

* Guard for zero period length undefined waveforms.

Fix for zero duty or off cycles and expiring from them.

* Cycle precision for expiry instead of special treatment for 0 value.

* Give expiry proper precedence over updating a waveform

* Important comment

* Refactored, identical behavior.

* Use plural for bit arrays.

* Fix for the full-duty or all-off cycle period case.

* Expiration is explicitly relative to service time.

* Comment updated, here it's about cycles not usecs.

* Revert misconception of how waveformToEnable/Disable communicates with the NMI handler.

* Rewrite to keep phase in sync if period remains same during duty cycle change.

Refactor identifiers to distinguish CPU clock cycle from waveform cycle.

* Rather iterate even if full-duty or no-duty cycle in period, than too many calculations in NMI handler.

* Must fire timer early to reach waveform deadlines, otherwise under some load aggressive jitter occurs.

* Schedule expiry explicitly, too.

Needed to keep track of next timer ccy in each iteration, not just when changing level.

* Quick change lets analogWrite keep phase for any duty cycle (including 0% and 100%).

* Set duration to multiple of period, so tone stops on LOW pin output.

* Improve phase timing

* Error causing next Timer IRQ to fail busy-to-off cycle transitions.

* Regression fix, don't reset timer if pending shortly.

* Rather reschedule ISR instead of busy looping during permitted maximum time.

* Lead time improved for ISR

* Reduce number of cycle calculations.

* Reactivate the gcc optimize pragmas.

* Simplify calculation.

* Handle overshoot where an updated period is shorter than the previous duty cycle.

* Misleading code: there must only ever be one bit set at a time; start and stop block until the ISR has handled and reset the token.

* Prevent missing a duty cycle unless it is overshot already.

* Continuously remove distant pending waveform edges from the loop, continuously update now.

* Replace volatile for one-way exchange into ISR with memory fence.

* Remove redundant stack object.

* Revert pending waveform removal from loop - corrupts continuous next event computation.

* Reduce if/do ... while to while

* Convert relative timings to absolute.

* Relax waveform start to possibly cluster phases into same IRQ interval.

* max 12us in ISR seems to work best for servo/fan/led/tone combo test.

* Restructured code in ISR for expiration, this saves 36 byte IRAM, and improves PWM resolution.

* Simplified overshot detection and 0% / 100% duty cycle.

* Leave ISR early if rescheduling is more promising than busy-waiting until next edge.

* Stabilized timings.

* Prevent WDT under load.

* Use clock cycle resolution instead of us for analogWrite.

* Reduce idle calculations in ISR.

* Optimize in-ISR time.

* Support starting new waveform in phase with another running waveform.

* Align phase for analogWrite PWMs.

* Tune preshoot, add lost period fast forward.

* Adapt phase sync code from analogWrite to Servo

* Fix for going off 100% duty cycle period.

* Eschew obfuscation.

* Fixed logic for zero duty cycle.

* Determine generator quantum during same IRQ - this is better than timer resolution, but non-zero.

* Tune timings, fix write barriers and overshoot logic.

* Migrate Tone to waveform with CPU cycle precision

* Can do 60kHz PWM.

* Recalibrated timings after performance optimizations.

Initialize GPIO if needed.

* Fix regression for waveform runtime.

* Test cycle duration values for signed arithmetic safety.

* Performance tuning.

* Performance tweak, in-ISR quantum is now 1.12µs.

* Round up duration instead of down - possibly to zero, which means forever.

* Extend phase alignment with optional phase offset.

* Slightly better in-ISR quantum approximation for steadier increments.

* Waveform stopped by runtime limit in ISR doesn't deinit the timer, but stopWaveform refuses to do anything if the waveform was stopped by runtime, either.

* Improved quantum correction code.

* Fix broken multi-wave generation.

* Aggregate GPIO output across inner loop. True phase sync, and now better performance.

* IRQ latency can be reduced from 2 to 1 us now, no WDT etc.

* Improved handling of complete idle cycle miss, progress directly into duty cycle.

* Recalibrated after latest changes and reverts.

* Overshoot compensation for duty cycle results in PWM milestone.

* Adjustments to duty/idle cycle to mitigate effects of floating duty cycle logic.

* Remove implicit condition from loop guard and fix timer restart duration

* Host all static globals in an anonymous static struct.

* Busy wait directly for next pending event and go to that pin.

* Record nextEventCcy in waveform struct to save a few cycles.

* Adapt duty cycle modification to only fix full duty and all idle cases.

* Remember next pin to operate between IRQs.

* Don't set pinMode each time on already running PWM or Tone.

* Remove quantum, correct IRQ latency from testing, reuse ISR timeout from master et al.

* Move updating "now" out of inner loop, prevents float between pins that are in phase lock.

* Merge init loop with action loop again.

* Adaptive PWM frequency and floating duty cycle.

* Predictive static frequency scaling.

* Dynamic frequency down-scaling

* Frequency scaling is only for PWM-like applications, anything needing real time duty cycles or frequency must be able to fail on overload.

* Conserve IRAM cache, resort to best effort.

* Directly scale frequency for all duty/all idle waves to reasonable maximum, reduces thrashing.

* Getting the math right beats permanently reducing PWM frequency.

* Rename identifier to help think about the problem.

* AutoPwm correction moved to correct location - after overshoot recalc - and allow limited duty floating

* Finish overshoot math fixes.

* First set pin mode, then digital write.

* Simplify calculations, fix non-autoPwm for servo use, where exact duty is needed, idle is elastic.

* Move wave initialization and modification outside the inner loop.

* Some optimizing.

* Updating "now" in the inner loop should lessen interference

* Finally get rid of volatile and use atomic thread fence memory barriers, great for ISR performance.

* Improved idle cycle overshoot mitigation.

* Improved duty cycle overshoot mitigation.

Case for investigation: 3% (shl 5) vs. 1.5% (shl 6), either less fuzz, but a few marked stray spots, or more fuzz, but no bumps in counter-PWM travel test.

* Move startPin etc. into common static struct

* Persist next event cycle across ISR invocations, like initPin was before.

* Recalibrated DELTAIRQ and IRQLATENCY. Tested @ 3x 40kHz PWM + 440Hz Tone

* CPU clock to Timer1 ccy correction must be dynamic even when BSP is compiled for fixed CPU clock.

* Corrected use of Timer1 registers and add rationale to Timer1 use in comment.

Recalibrate for improved frequency downscaling @ 80MHz and 160MHz.

* Let duty cycle overshoot correction depend on relative impact compared to both period and duty.

* 80MHz/160MHz specific code can be compile-time selected in general, only NMI is affected by apparent CPU frequency scaling in SDK code.

* Seems that removing the redundant resetting of edge interrupt mode shaves 0.5us off rearm latency.

* Recalibrated delta irq ccys.

* Off-by-one in 100% duty overshoot correction.

* Simple register writes.

* Memory fences checked and joining events into same loop iteration that are close to one another.

* Shorten progression when going off 100% duty.

* Code simplifications.

* Dynamically map pins out from in-ISR handling based on next event timing.

Major performance boost.

* Reverting maximum IRQ period to 10ms. This sets the wave reprogramming rate to 100Hz max.

* Revert recent change that is the most likely cause of reported PWM frequency drop regression.

* Much simplified overshoot mitigation code.

* Fixing overshoot mitigation, 3x 880Hz, 256 states now.

* Increase resolution by keeping reference time moving forward earlier.

* Mitigation logic for ESP8266 SDK boosting to 160MHz during some WiFi ops.

* Event timestamps are all recorded for compile-time CPU frequency, the timer ticks conversion must be set at compile-time also. The SDK WiFi 160MHz boost mitigation temporarily handles the CPU clock running twice as fast.

* Expired pins must not be checked for next event.

* Recalibrate after latest changes.

* Save a few bytes code.

* Guards are in place, so xor rather than and bitwise not.

* Reduce memory use.

* SDK boost to 160MHz may last across multiple ISR invocations, therefore adjust target ccy instead of ccount.

* Overshoot mitigation w/o PWM frequency change.

* New PWM overshoot mitigation code keeps frequency. Averages duty between consecutive periods.

* Small refactoring, remove code path that is never taken even at 3x25kHz/1023 PWM.

* Don't ever skip off duty, no matter if late or infinitely short.

* Shed speed-up code that didn't speed up things.

* Must always recompute new waveform.nextEventCcy if there is any busy pin.

* Break out of ISR if timespan to next event allows, instead of busy waiting and stealing CPU cycles from userland.

* Minor code simplification.

* Improve code efficiency.

* Improved performance of loop.

* Recalibrated.

* No positive effect of lead time inclusion was found during testing, remove this code.

Maximum period duration limit is implicit to timer, consider it documented constraint, don't runtime check in ISR.

* Fix WDT when at 160MHz CPU clock the Timer1 is set below 1µs.

* Consolidate 160MHz constexpr check, finish 1µs minimum for Timer1 fix.

* Test for non-zero before subtract should improve performance.

* Reviewers/testers noted they were seeing WDT, and this change appeared to fix that.

* More expressive use of parentheses and alias CPU2X for reduced code size.

* Bug fix: at 160MHz compiled, don't force minimum Timer1 latency to 2µs.

* Alternate CPU frequency scaling mitigation.

* Handle time-of-flight in the right spot.

* Remove _toneMap from Tone.cpp

Co-authored-by: david gauchard <gauchard@laas.fr>
Dirk O. Kaar 2020-11-19 22:12:06 +01:00 committed by GitHub
parent 8fe80f1630
commit 0e735e386d
GPG Key ID: 4AEE18F83AFDEB23
5 changed files with 358 additions and 226 deletions
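
The commit message mentions replacing volatile with memory fences for the one-way exchange into the ISR, and a token with only one bit set at a time that the ISR resets once handled. The following is a minimal host-side sketch of that handshake, not the code from this commit. The names Mailbox, postStart and serviceToken are hypothetical. Unlike the original, which uses plain variables on a single core, the token here is a std::atomic with relaxed accesses so that the fences are well-defined in portable C++.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for the shared waveform state (illustration only).
struct Mailbox {
    uint32_t periodCcys = 0;             // payload, written before the token
    std::atomic<uint32_t> toSetBits{0};  // token: at most one bit set at a time
    uint32_t enabled = 0;                // modified by the ISR side only
};

// Main-code side: write the payload, then publish the token.
// The release fence keeps the payload write ordered before the token store.
inline void postStart(Mailbox& m, unsigned pin, uint32_t periodCcys) {
    m.periodCcys = periodCcys;
    std::atomic_thread_fence(std::memory_order_release);
    m.toSetBits.store(uint32_t{1} << pin, std::memory_order_relaxed);
}

// ISR side: consume the token if one is pending, then reset it so the
// poster (which would spin on toSetBits in real code) can continue.
inline bool serviceToken(Mailbox& m) {
    const uint32_t token = m.toSetBits.load(std::memory_order_relaxed);
    if (!token) {
        return false;
    }
    // The acquire fence orders the token load before later payload reads.
    std::atomic_thread_fence(std::memory_order_acquire);
    m.enabled |= token;
    m.toSetBits.store(0, std::memory_order_relaxed);
    return true;
}
```

In this sketch the poster does not block. The real startWaveformClockCycles() waits until the token has been cleared, which is what guarantees that only one request is in flight.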

View File

@ -25,10 +25,6 @@
#include "core_esp8266_waveform.h"
#include "user_interface.h"
// Which pins have a tone running on them?
static uint32_t _toneMap = 0;
static void _startTone(uint8_t _pin, uint32_t high, uint32_t low, uint32_t duration) {
if (_pin > 16) {
return;
@ -42,9 +38,7 @@ static void _startTone(uint8_t _pin, uint32_t high, uint32_t low, uint32_t durat
duration = microsecondsToClockCycles(duration * 1000UL);
duration += high + low - 1;
duration -= duration % (high + low);
if (startWaveformClockCycles(_pin, high, low, duration)) {
_toneMap |= 1 << _pin;
}
startWaveformClockCycles(_pin, high, low, duration);
}
@ -86,6 +80,5 @@ void noTone(uint8_t _pin) {
return;
}
stopWaveform(_pin);
_toneMap &= ~(1 << _pin);
digitalWrite(_pin, 0);
}
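
The two duration statements in _startTone() above round the run time up to a whole number of periods, so the tone ends after a complete low phase. A small sketch of that arithmetic in isolation follows. The helper name roundUpToPeriod is hypothetical, and it assumes a non-zero period and no 32-bit overflow in the addition.

```cpp
#include <cstdint>

// Round a run time (in CPU clock cycles) up to the next whole multiple of
// the waveform period (high + low), using the same two steps as _startTone().
// Assumes high + low > 0 and that duration + period - 1 fits in 32 bits.
inline uint32_t roundUpToPeriod(uint32_t duration, uint32_t high, uint32_t low) {
    const uint32_t period = high + low;
    duration += period - 1;               // push past the next multiple unless already on one
    return duration - duration % period;  // drop the remainder
}
```

A duration that is already a multiple of the period is returned unchanged, and a zero duration stays zero.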

View File

@ -3,6 +3,7 @@
supporting outputs on all pins in parallel.
Copyright (c) 2018 Earle F. Philhower, III. All rights reserved.
Copyright (c) 2020 Dirk O. Kaar.
The core idea is to have a programmable waveform generator with a unique
high and low period (defined in microseconds or CPU clock cycles). TIMER1 is
@ -19,8 +20,8 @@
This replaces older tone(), analogWrite(), and the Servo classes.
Everywhere in the code where "cycles" is used, it means ESP.getCycleCount()
clock cycle count, or an interval measured in CPU clock cycles, but not TIMER1
Everywhere in the code where "ccy" or "ccys" is used, it means ESP.getCycleCount()
clock cycle time, or an interval measured in clock cycles, but not TIMER1
cycles (which may be 2 CPU clock cycles @ 160MHz).
This library is free software; you can redistribute it and/or
@ -38,275 +39,398 @@
Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
*/
#include "core_esp8266_waveform.h"
#include <Arduino.h>
#include "ets_sys.h"
#include "core_esp8266_waveform.h"
#include <atomic>
extern "C" {
// Timer is 80MHz fixed. 160MHz CPU frequency need scaling.
constexpr bool ISCPUFREQ160MHZ = clockCyclesPerMicrosecond() == 160;
// Maximum delay between IRQs, Timer1, <= 2^23 / 80MHz
constexpr int32_t MAXIRQTICKSCCYS = microsecondsToClockCycles(10000);
// Maximum servicing time for any single IRQ
constexpr uint32_t ISRTIMEOUTCCYS = microsecondsToClockCycles(18);
// The latency between in-ISR rearming of the timer and the earliest firing
constexpr int32_t IRQLATENCYCCYS = microsecondsToClockCycles(2);
// The SDK and hardware take some time to actually get to our NMI code
constexpr int32_t DELTAIRQCCYS = ISCPUFREQ160MHZ ?
microsecondsToClockCycles(2) >> 1 : microsecondsToClockCycles(2);
// Maximum delay between IRQs
#define MAXIRQUS (10000)
// Set/clear GPIO 0-15 by bitmask
#define SetGPIO(a) do { GPOS = a; } while (0)
#define ClearGPIO(a) do { GPOC = a; } while (0)
// for INFINITE, the NMI proceeds on the waveform without expiry deadline.
// for EXPIRES, the NMI expires the waveform automatically on the expiry ccy.
// for UPDATEEXPIRY, the NMI recomputes the exact expiry ccy and transitions to EXPIRES.
// for INIT, the NMI initializes nextPeriodCcy, and if expiryCcy != 0 includes UPDATEEXPIRY.
enum class WaveformMode : uint8_t {INFINITE = 0, EXPIRES = 1, UPDATEEXPIRY = 2, INIT = 3};
// Waveform generator can create tones, PWM, and servos
typedef struct {
uint32_t nextServiceCycle; // ESP cycle timer when a transition required
uint32_t expiryCycle; // For time-limited waveform, the cycle when this waveform must stop
uint32_t nextTimeHighCycles; // Copy over low->high to keep smooth waveform
uint32_t nextTimeLowCycles; // Copy over high->low to keep smooth waveform
uint32_t nextPeriodCcy; // ESP clock cycle when a period begins. If WaveformMode::INIT, temporarily holds positive phase offset ccy count
uint32_t endDutyCcy; // ESP clock cycle when going from duty to off
int32_t dutyCcys; // Set next off cycle at low->high to maintain phase
int32_t adjDutyCcys; // Temporary correction for next period
int32_t periodCcys; // Set next phase cycle at low->high to maintain phase
uint32_t expiryCcy; // For time-limited waveform, the CPU clock cycle when this waveform must stop. If WaveformMode::UPDATEEXPIRY, temporarily holds relative ccy count
WaveformMode mode;
int8_t alignPhase; // < 0 no phase alignment, otherwise starts waveform in relative phase offset to given pin
bool autoPwm; // perform PWM duty to idle cycle ratio correction under high load at the expense of precise timings
} Waveform;
static Waveform waveform[17]; // State of all possible pins
static volatile uint32_t waveformState = 0; // Is the pin high or low, updated in NMI so no access outside the NMI code
static volatile uint32_t waveformEnabled = 0; // Is it actively running, updated in NMI so no access outside the NMI code
namespace {
// Enable lock-free by only allowing updates to waveformState and waveformEnabled from IRQ service routine
static volatile uint32_t waveformToEnable = 0; // Message to the NMI handler to start a waveform on a inactive pin
static volatile uint32_t waveformToDisable = 0; // Message to the NMI handler to disable a pin from waveform generation
static struct {
Waveform pins[17]; // State of all possible pins
uint32_t states = 0; // Is the pin high or low, updated in NMI so no access outside the NMI code
uint32_t enabled = 0; // Is it actively running, updated in NMI so no access outside the NMI code
static uint32_t (*timer1CB)() = NULL;
// Enable lock-free by only allowing updates to waveform.states and waveform.enabled from IRQ service routine
int32_t toSetBits = 0; // Message to the NMI handler to start/modify exactly one waveform
int32_t toDisableBits = 0; // Message to the NMI handler to disable exactly one pin from waveform generation
uint32_t(*timer1CB)() = nullptr;
// Non-speed critical bits
#pragma GCC optimize ("Os")
bool timer1Running = false;
uint32_t nextEventCcy;
} waveform;
static inline ICACHE_RAM_ATTR uint32_t GetCycleCount() {
uint32_t ccount;
__asm__ __volatile__("esync; rsr %0,ccount":"=a"(ccount));
return ccount;
}
// Interrupt on/off control
static ICACHE_RAM_ATTR void timer1Interrupt();
static bool timerRunning = false;
// Non-speed critical bits
#pragma GCC optimize ("Os")
static void initTimer() {
timer1_disable();
ETS_FRC_TIMER1_INTR_ATTACH(NULL, NULL);
ETS_FRC_TIMER1_NMI_INTR_ATTACH(timer1Interrupt);
timer1_enable(TIM_DIV1, TIM_EDGE, TIM_SINGLE);
timerRunning = true;
waveform.timer1Running = true;
timer1_write(IRQLATENCYCCYS); // Cause an interrupt post-haste
}
static void ICACHE_RAM_ATTR deinitTimer() {
ETS_FRC_TIMER1_NMI_INTR_ATTACH(NULL);
timer1_disable();
timer1_isr_init();
timerRunning = false;
waveform.timer1Running = false;
}
extern "C" {
// Set a callback. Pass in NULL to stop it
void setTimer1Callback(uint32_t (*fn)()) {
timer1CB = fn;
if (!timerRunning && fn) {
waveform.timer1CB = fn;
std::atomic_thread_fence(std::memory_order_acq_rel);
if (!waveform.timer1Running && fn) {
initTimer();
timer1_write(microsecondsToClockCycles(1)); // Cause an interrupt post-haste
} else if (timerRunning && !fn && !waveformEnabled) {
} else if (waveform.timer1Running && !fn && !waveform.enabled) {
deinitTimer();
}
}
int startWaveform(uint8_t pin, uint32_t highUS, uint32_t lowUS,
uint32_t runTimeUS, int8_t alignPhase, uint32_t phaseOffsetUS, bool autoPwm) {
return startWaveformClockCycles(pin,
microsecondsToClockCycles(highUS), microsecondsToClockCycles(lowUS),
microsecondsToClockCycles(runTimeUS), alignPhase, microsecondsToClockCycles(phaseOffsetUS), autoPwm);
}
// Start up a waveform on a pin, or change the current one. Will change to the new
// waveform smoothly on next low->high transition. For immediate change, stopWaveform()
// first, then it will immediately begin.
int startWaveform(uint8_t pin, uint32_t timeHighUS, uint32_t timeLowUS, uint32_t runTimeUS) {
return startWaveformClockCycles(pin, microsecondsToClockCycles(timeHighUS), microsecondsToClockCycles(timeLowUS), microsecondsToClockCycles(runTimeUS));
}
int startWaveformClockCycles(uint8_t pin, uint32_t timeHighCycles, uint32_t timeLowCycles, uint32_t runTimeCycles) {
if ((pin > 16) || isFlashInterfacePin(pin)) {
int startWaveformClockCycles(uint8_t pin, uint32_t highCcys, uint32_t lowCcys,
uint32_t runTimeCcys, int8_t alignPhase, uint32_t phaseOffsetCcys, bool autoPwm) {
uint32_t periodCcys = highCcys + lowCcys;
if (periodCcys < MAXIRQTICKSCCYS) {
if (!highCcys) {
periodCcys = (MAXIRQTICKSCCYS / periodCcys) * periodCcys;
}
else if (!lowCcys) {
highCcys = periodCcys = (MAXIRQTICKSCCYS / periodCcys) * periodCcys;
}
}
// sanity checks, including mixed signed/unsigned arithmetic safety
if ((pin > 16) || isFlashInterfacePin(pin) || (alignPhase > 16) ||
static_cast<int32_t>(periodCcys) <= 0 ||
static_cast<int32_t>(highCcys) < 0 || static_cast<int32_t>(lowCcys) < 0) {
return false;
}
Waveform *wave = &waveform[pin];
// Adjust to shave off some of the IRQ time, approximately
wave->nextTimeHighCycles = timeHighCycles;
wave->nextTimeLowCycles = timeLowCycles;
wave->expiryCycle = runTimeCycles ? GetCycleCount() + runTimeCycles : 0;
if (runTimeCycles && !wave->expiryCycle) {
wave->expiryCycle = 1; // expiryCycle==0 means no timeout, so avoid setting it
}
Waveform& wave = waveform.pins[pin];
wave.dutyCcys = highCcys;
wave.adjDutyCcys = 0;
wave.periodCcys = periodCcys;
wave.autoPwm = autoPwm;
uint32_t mask = 1<<pin;
if (!(waveformEnabled & mask)) {
// Actually set the pin high or low in the IRQ service to guarantee times
wave->nextServiceCycle = GetCycleCount() + microsecondsToClockCycles(1);
waveformToEnable |= mask;
if (!timerRunning) {
initTimer();
timer1_write(microsecondsToClockCycles(10));
} else {
// Ensure timely service....
if (T1L > microsecondsToClockCycles(10)) {
timer1_write(microsecondsToClockCycles(10));
std::atomic_thread_fence(std::memory_order_acquire);
const uint32_t pinBit = 1UL << pin;
if (!(waveform.enabled & pinBit)) {
// wave.nextPeriodCcy and wave.endDutyCcy are initialized by the ISR
wave.nextPeriodCcy = phaseOffsetCcys;
wave.expiryCcy = runTimeCcys; // in WaveformMode::INIT, temporarily hold relative cycle count
wave.mode = WaveformMode::INIT;
wave.alignPhase = (alignPhase < 0) ? -1 : alignPhase;
if (!wave.dutyCcys) {
// If initially at zero duty cycle, force GPIO off
if (pin == 16) {
GP16O = 0;
}
else {
GPOC = pinBit;
}
}
while (waveformToEnable) {
delay(0); // Wait for waveform to update
std::atomic_thread_fence(std::memory_order_release);
waveform.toSetBits = 1UL << pin;
std::atomic_thread_fence(std::memory_order_release);
if (!waveform.timer1Running) {
initTimer();
}
else if (T1V > IRQLATENCYCCYS) {
// Must not interfere if Timer is due shortly
timer1_write(IRQLATENCYCCYS);
}
}
return true;
}
// Speed critical bits
#pragma GCC optimize ("O2")
// Normally would not want two copies like this, but due to different
// optimization levels the inline attribute gets lost if we try the
// other version.
static inline ICACHE_RAM_ATTR uint32_t GetCycleCountIRQ() {
uint32_t ccount;
__asm__ __volatile__("rsr %0,ccount":"=a"(ccount));
return ccount;
}
static inline ICACHE_RAM_ATTR uint32_t min_u32(uint32_t a, uint32_t b) {
if (a < b) {
return a;
else {
wave.mode = WaveformMode::INFINITE; // turn off possible expiry to make update atomic from NMI
std::atomic_thread_fence(std::memory_order_release);
wave.expiryCcy = runTimeCcys; // in WaveformMode::UPDATEEXPIRY, temporarily hold relative cycle count
if (runTimeCcys) {
wave.mode = WaveformMode::UPDATEEXPIRY;
std::atomic_thread_fence(std::memory_order_release);
waveform.toSetBits = 1UL << pin;
}
}
return b;
std::atomic_thread_fence(std::memory_order_acq_rel);
while (waveform.toSetBits) {
delay(0); // Wait for waveform to update
std::atomic_thread_fence(std::memory_order_acquire);
}
return true;
}
// Stops a waveform on a pin
int ICACHE_RAM_ATTR stopWaveform(uint8_t pin) {
// Can't possibly need to stop anything if there is no timer active
if (!timerRunning) {
if (!waveform.timer1Running) {
return false;
}
// If user sends in a pin >16 but <32, this will always point to a 0 bit
// If they send >=32, then the shift will result in 0 and it will also return false
if (waveformEnabled & (1UL << pin)) {
waveformToDisable = 1UL << pin;
std::atomic_thread_fence(std::memory_order_acquire);
const uint32_t pinBit = 1UL << pin;
if (waveform.enabled & pinBit) {
waveform.toDisableBits = 1UL << pin;
std::atomic_thread_fence(std::memory_order_release);
// Must not interfere if Timer is due shortly
if (T1L > microsecondsToClockCycles(10)) {
timer1_write(microsecondsToClockCycles(10));
if (T1V > IRQLATENCYCCYS) {
timer1_write(IRQLATENCYCCYS);
}
while (waveformToDisable) {
while (waveform.toDisableBits) {
/* no-op */ // Can't delay() since stopWaveform may be called from an IRQ
std::atomic_thread_fence(std::memory_order_acquire);
}
}
if (!waveformEnabled && !timer1CB) {
if (!waveform.enabled && !waveform.timer1CB) {
deinitTimer();
}
return true;
}
// The SDK and hardware take some time to actually get to our NMI code, so
// decrement the next IRQ's timer value by a bit so we can actually catch the
// real CPU cycle counter we want for the waveforms.
#if F_CPU == 80000000
#define DELTAIRQ (microsecondsToClockCycles(3))
#else
#define DELTAIRQ (microsecondsToClockCycles(2))
#endif
};
// Speed critical bits
#pragma GCC optimize ("O2")
static ICACHE_RAM_ATTR void timer1Interrupt() {
// Optimize the NMI inner loop by keeping track of the min and max GPIO that we
// are generating. In the common case (1 PWM) these may be the same pin and
// we can avoid looking at the other pins.
static int startPin = 0;
static int endPin = 0;
uint32_t nextEventCycles = microsecondsToClockCycles(MAXIRQUS);
uint32_t timeoutCycle = GetCycleCountIRQ() + microsecondsToClockCycles(14);
if (waveformToEnable || waveformToDisable) {
// Handle enable/disable requests from main app.
waveformEnabled = (waveformEnabled & ~waveformToDisable) | waveformToEnable; // Set the requested waveforms on/off
waveformState &= ~waveformToEnable; // And clear the state of any just started
waveformToEnable = 0;
waveformToDisable = 0;
// Find the first GPIO being generated by checking GCC's find-first-set (returns 1 + the bit of the first 1 in an int32_t)
startPin = __builtin_ffs(waveformEnabled) - 1;
// Find the last bit by subtracting off GCC's count-leading-zeros (no offset in this one)
endPin = 32 - __builtin_clz(waveformEnabled);
// For dynamic CPU clock frequency switch in loop the scaling logic would have to be adapted.
// Using constexpr makes sure that the CPU clock frequency is compile-time fixed.
static inline ICACHE_RAM_ATTR int32_t scaleCcys(const int32_t ccys, const bool isCPU2X) {
if (ISCPUFREQ160MHZ) {
return isCPU2X ? ccys : (ccys >> 1);
}
bool done = false;
if (waveformEnabled) {
do {
nextEventCycles = microsecondsToClockCycles(MAXIRQUS);
for (int i = startPin; i <= endPin; i++) {
uint32_t mask = 1<<i;
// If it's not on, ignore!
if (!(waveformEnabled & mask)) {
continue;
}
Waveform *wave = &waveform[i];
uint32_t now = GetCycleCountIRQ();
// Disable any waveforms that are done
if (wave->expiryCycle) {
int32_t expiryToGo = wave->expiryCycle - now;
if (expiryToGo < 0) {
// Done, remove!
waveformEnabled &= ~mask;
if (i == 16) {
GP16O &= ~1;
} else {
ClearGPIO(mask);
}
continue;
}
}
// Check for toggles
int32_t cyclesToGo = wave->nextServiceCycle - now;
if (cyclesToGo < 0) {
waveformState ^= mask;
if (waveformState & mask) {
if (i == 16) {
GP16O |= 1; // GPIO16 write slow as it's RMW
} else {
SetGPIO(mask);
}
wave->nextServiceCycle = now + wave->nextTimeHighCycles;
nextEventCycles = min_u32(nextEventCycles, wave->nextTimeHighCycles);
} else {
if (i == 16) {
GP16O &= ~1; // GPIO16 write slow as it's RMW
} else {
ClearGPIO(mask);
}
wave->nextServiceCycle = now + wave->nextTimeLowCycles;
nextEventCycles = min_u32(nextEventCycles, wave->nextTimeLowCycles);
}
} else {
uint32_t deltaCycles = wave->nextServiceCycle - now;
nextEventCycles = min_u32(nextEventCycles, deltaCycles);
}
}
// Exit the loop if we've hit the fixed runtime limit or the next event is known to be after that timeout would occur
uint32_t now = GetCycleCountIRQ();
int32_t cycleDeltaNextEvent = timeoutCycle - (now + nextEventCycles);
int32_t cyclesLeftTimeout = timeoutCycle - now;
done = (cycleDeltaNextEvent < 0) || (cyclesLeftTimeout < 0);
} while (!done);
} // if (waveformEnabled)
if (timer1CB) {
nextEventCycles = min_u32(nextEventCycles, timer1CB());
else {
return isCPU2X ? (ccys << 1) : ccys;
}
if (nextEventCycles < microsecondsToClockCycles(10)) {
nextEventCycles = microsecondsToClockCycles(10);
}
nextEventCycles -= DELTAIRQ;
// Do it here instead of global function to save time and because we know it's edge-IRQ
#if F_CPU == 160000000
T1L = nextEventCycles >> 1; // Already know we're in range by MAXIRQUS
#else
T1L = nextEventCycles; // Already know we're in range by MAXIRQUS
#endif
TEIE |= TEIE1; // Edge int enable
}
};
static ICACHE_RAM_ATTR void timer1Interrupt() {
const uint32_t isrStartCcy = ESP.getCycleCount();
int32_t clockDrift = isrStartCcy - waveform.nextEventCcy;
const bool isCPU2X = CPU2X & 1;
if ((waveform.toSetBits && !(waveform.enabled & waveform.toSetBits)) || waveform.toDisableBits) {
// Handle enable/disable requests from main app.
waveform.enabled = (waveform.enabled & ~waveform.toDisableBits) | waveform.toSetBits; // Set the requested waveforms on/off
// Find the first GPIO being generated by checking GCC's find-first-set (returns 1 + the bit of the first 1 in an int32_t)
waveform.toDisableBits = 0;
}
if (waveform.toSetBits) {
const int toSetPin = __builtin_ffs(waveform.toSetBits) - 1;
Waveform& wave = waveform.pins[toSetPin];
switch (wave.mode) {
case WaveformMode::INIT:
waveform.states &= ~waveform.toSetBits; // Clear the state of any just started
if (wave.alignPhase >= 0 && waveform.enabled & (1UL << wave.alignPhase)) {
wave.nextPeriodCcy = waveform.pins[wave.alignPhase].nextPeriodCcy + wave.nextPeriodCcy;
}
else {
wave.nextPeriodCcy = waveform.nextEventCcy;
}
if (!wave.expiryCcy) {
wave.mode = WaveformMode::INFINITE;
break;
}
// fall through
case WaveformMode::UPDATEEXPIRY:
// in WaveformMode::UPDATEEXPIRY, expiryCcy temporarily holds relative CPU cycle count
wave.expiryCcy = wave.nextPeriodCcy + scaleCcys(wave.expiryCcy, isCPU2X);
wave.mode = WaveformMode::EXPIRES;
break;
default:
break;
}
waveform.toSetBits = 0;
}
// Exit the loop if the next event, if any, is sufficiently distant.
const uint32_t isrTimeoutCcy = isrStartCcy + ISRTIMEOUTCCYS;
uint32_t busyPins = waveform.enabled;
waveform.nextEventCcy = isrStartCcy + MAXIRQTICKSCCYS;
uint32_t now = ESP.getCycleCount();
uint32_t isrNextEventCcy = now;
while (busyPins) {
if (static_cast<int32_t>(isrNextEventCcy - now) > IRQLATENCYCCYS) {
waveform.nextEventCcy = isrNextEventCcy;
break;
}
isrNextEventCcy = waveform.nextEventCcy;
uint32_t loopPins = busyPins;
while (loopPins) {
const int pin = __builtin_ffsl(loopPins) - 1;
const uint32_t pinBit = 1UL << pin;
loopPins ^= pinBit;
Waveform& wave = waveform.pins[pin];
if (clockDrift) {
wave.endDutyCcy += clockDrift;
wave.nextPeriodCcy += clockDrift;
wave.expiryCcy += clockDrift;
}
uint32_t waveNextEventCcy = (waveform.states & pinBit) ? wave.endDutyCcy : wave.nextPeriodCcy;
if (WaveformMode::EXPIRES == wave.mode &&
static_cast<int32_t>(waveNextEventCcy - wave.expiryCcy) >= 0 &&
static_cast<int32_t>(now - wave.expiryCcy) >= 0) {
// Disable any waveforms that are done
waveform.enabled ^= pinBit;
busyPins ^= pinBit;
}
else {
const int32_t overshootCcys = now - waveNextEventCcy;
if (overshootCcys >= 0) {
const int32_t periodCcys = scaleCcys(wave.periodCcys, isCPU2X);
if (waveform.states & pinBit) {
// active configuration and forward are 100% duty
if (wave.periodCcys == wave.dutyCcys) {
wave.nextPeriodCcy += periodCcys;
wave.endDutyCcy = wave.nextPeriodCcy;
}
else {
if (wave.autoPwm) {
wave.adjDutyCcys += overshootCcys;
}
waveform.states ^= pinBit;
if (16 == pin) {
GP16O = 0;
}
else {
GPOC = pinBit;
}
}
waveNextEventCcy = wave.nextPeriodCcy;
}
else {
wave.nextPeriodCcy += periodCcys;
if (!wave.dutyCcys) {
wave.endDutyCcy = wave.nextPeriodCcy;
}
else {
int32_t dutyCcys = scaleCcys(wave.dutyCcys, isCPU2X);
if (dutyCcys <= wave.adjDutyCcys) {
dutyCcys >>= 1;
wave.adjDutyCcys -= dutyCcys;
}
else if (wave.adjDutyCcys) {
dutyCcys -= wave.adjDutyCcys;
wave.adjDutyCcys = 0;
}
wave.endDutyCcy = now + dutyCcys;
if (static_cast<int32_t>(wave.endDutyCcy - wave.nextPeriodCcy) > 0) {
wave.endDutyCcy = wave.nextPeriodCcy;
}
waveform.states |= pinBit;
if (16 == pin) {
GP16O = 1;
}
else {
GPOS = pinBit;
}
}
waveNextEventCcy = wave.endDutyCcy;
}
if (WaveformMode::EXPIRES == wave.mode && static_cast<int32_t>(waveNextEventCcy - wave.expiryCcy) > 0) {
waveNextEventCcy = wave.expiryCcy;
}
}
if (static_cast<int32_t>(waveNextEventCcy - isrTimeoutCcy) >= 0) {
busyPins ^= pinBit;
if (static_cast<int32_t>(waveform.nextEventCcy - waveNextEventCcy) > 0) {
waveform.nextEventCcy = waveNextEventCcy;
}
}
else if (static_cast<int32_t>(isrNextEventCcy - waveNextEventCcy) > 0) {
isrNextEventCcy = waveNextEventCcy;
}
}
now = ESP.getCycleCount();
}
clockDrift = 0;
}
int32_t callbackCcys = 0;
if (waveform.timer1CB) {
callbackCcys = scaleCcys(microsecondsToClockCycles(waveform.timer1CB()), isCPU2X);
}
now = ESP.getCycleCount();
int32_t nextEventCcys = waveform.nextEventCcy - now;
// Account for unknown duration of timer1CB().
if (waveform.timer1CB && nextEventCcys > callbackCcys) {
waveform.nextEventCcy = now + callbackCcys;
nextEventCcys = callbackCcys;
}
// Timer is fixed at 80MHz. At 160MHz CPU frequency, cycle counts need scaling.
int32_t deltaIrqCcys = DELTAIRQCCYS;
int32_t irqLatencyCcys = IRQLATENCYCCYS;
if (isCPU2X) {
nextEventCcys >>= 1;
deltaIrqCcys >>= 1;
irqLatencyCcys >>= 1;
}
// If the timer fires too soon, the NMI occurs before the ISR has returned.
if (nextEventCcys < irqLatencyCcys + deltaIrqCcys) {
waveform.nextEventCcy = now + IRQLATENCYCCYS + DELTAIRQCCYS;
nextEventCcys = irqLatencyCcys;
}
else {
nextEventCcys -= deltaIrqCcys;
}
// Register access is fast and edge IRQ was configured before.
T1L = nextEventCcys;
}
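
The tail of the ISR above converts the interval to the next event from CPU clock cycles into TIMER1 ticks and fires early by a fixed lead time. The following is a minimal standalone sketch of that arithmetic, not the handler itself. `kDeltaIrqCcys` and `kIrqLatencyCcys` are hypothetical placeholder values standing in for `DELTAIRQCCYS` and `IRQLATENCYCCYS`, whose real definitions are not part of this excerpt. `reachedCcy`, `applyLeadTime` and `timerLoad` are illustrative names only.

```cpp
#include <cstdint>

// Hypothetical placeholder values. The real DELTAIRQCCYS and IRQLATENCYCCYS
// are defined elsewhere in the source and are not shown in this excerpt.
constexpr int32_t kDeltaIrqCcys = 200;
constexpr int32_t kIrqLatencyCcys = 100;

// Wrap-safe "has `now` reached `deadline`?" test on the 32-bit cycle counter,
// the same signed-difference idiom the ISR uses for its comparisons.
// Valid as long as the two values are less than 2^31 cycles apart.
constexpr bool reachedCcy(uint32_t now, uint32_t deadline) {
    return static_cast<int32_t>(now - deadline) >= 0;
}

// Apply the lead-time rule to values already expressed in TIMER1 ticks:
// intervals too short to schedule safely are clamped to the IRQ latency,
// all others are shortened by the lead time so the timer fires early.
constexpr int32_t applyLeadTime(int32_t ticks, int32_t latency, int32_t delta) {
    return (ticks < latency + delta) ? latency : ticks - delta;
}

// Convert a CPU-cycle interval into a TIMER1 load value. TIMER1 runs at a
// fixed 80MHz, so at 160MHz CPU frequency every quantity is halved first.
constexpr int32_t timerLoad(int32_t nextEventCcys, bool isCPU2X) {
    return isCPU2X
        ? applyLeadTime(nextEventCcys >> 1, kIrqLatencyCcys >> 1, kDeltaIrqCcys >> 1)
        : applyLeadTime(nextEventCcys, kIrqLatencyCcys, kDeltaIrqCcys);
}

// A deadline just before the counter wraps has been reached shortly after it.
static_assert(reachedCcy(5u, 0xFFFFFFF0u), "deadline before wrap has passed");
static_assert(!reachedCcy(0xFFFFFFF0u, 5u), "deadline after wrap is still ahead");
```

A plain `now >= deadline` comparison would give the wrong answer in both wrap-around cases checked at the end, which is why the signed difference is used throughout the handler.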


@@ -3,6 +3,7 @@
supporting outputs on all pins in parallel.
Copyright (c) 2018 Earle F. Philhower, III. All rights reserved.
Copyright (c) 2020 Dirk O. Kaar.
The core idea is to have a programmable waveform generator with a unique
high and low period (defined in microseconds or CPU clock cycles). TIMER1 is
@@ -19,7 +20,7 @@
This replaces older tone(), analogWrite(), and the Servo classes.
-Everywhere in the code where "cycles" is used, it means ESP.getCycleCount()
+Everywhere in the code where "ccy" or "ccys" is used, it means ESP.getCycleCount()
clock cycle count, or an interval measured in CPU clock cycles, but not TIMER1
cycles (which may be 2 CPU clock cycles @ 160MHz).
@@ -48,13 +49,25 @@ extern "C" {
#endif
// Start or change a waveform of the specified high and low times on specific pin.
-// If runtimeUS > 0 then automatically stop it after that many usecs.
+// If runTimeUS > 0 then automatically stop it after that many usecs, relative to the next
+// full period.
+// If no waveform is running on pin yet, and a waveform is running on the pin given by alignPhase,
+// the new waveform is started at a phase offset of phaseOffsetUS microseconds relative to that one.
+// Setting autoPwm to true allows the wave generator to maintain the PWM duty to idle cycle ratio
+// under load. For applications where frequency or duty cycle must not change, leave it false.
// Returns true or false on success or failure.
-int startWaveform(uint8_t pin, uint32_t timeHighUS, uint32_t timeLowUS, uint32_t runTimeUS);
+int startWaveform(uint8_t pin, uint32_t timeHighUS, uint32_t timeLowUS,
+uint32_t runTimeUS = 0, int8_t alignPhase = -1, uint32_t phaseOffsetUS = 0, bool autoPwm = false);
// Start or change a waveform of the specified high and low CPU clock cycles on specific pin.
-// If runtimeCycles > 0 then automatically stop it after that many CPU clock cycles.
+// If runTimeCcys > 0 then automatically stop it after that many CPU clock cycles, relative to the next
+// full period.
+// If no waveform is running on pin yet, and a waveform is running on the pin given by alignPhase,
+// the new waveform is started at a phase offset of phaseOffsetCcys CPU clock cycles relative to that one.
+// Setting autoPwm to true allows the wave generator to maintain the PWM duty to idle cycle ratio
+// under load. For applications where frequency or duty cycle must not change, leave it false.
// Returns true or false on success or failure.
-int startWaveformClockCycles(uint8_t pin, uint32_t timeHighCycles, uint32_t timeLowCycles, uint32_t runTimeCycles);
+int startWaveformClockCycles(uint8_t pin, uint32_t timeHighCcys, uint32_t timeLowCcys,
+uint32_t runTimeCcys = 0, int8_t alignPhase = -1, uint32_t phaseOffsetCcys = 0, bool autoPwm = false);
// Stop a waveform, if any, on the specified pin.
// Returns true or false on success or failure.
int stopWaveform(uint8_t pin);


@@ -45,8 +45,8 @@ extern void __analogWriteResolution(int res) {
extern void __analogWriteFreq(uint32_t freq) {
if (freq < 100) {
analogFreq = 100;
-} else if (freq > 40000) {
-analogFreq = 40000;
+} else if (freq > 60000) {
+analogFreq = 60000;
} else {
analogFreq = freq;
}
@@ -63,22 +63,22 @@ extern void __analogWrite(uint8_t pin, int val) {
val = analogScale;
}
+if (analogMap & 1UL << pin) {
// Per the Arduino docs at https://www.arduino.cc/reference/en/language/functions/analog-io/analogwrite/
// val: the duty cycle: between 0 (always off) and 255 (always on).
// So if val = 0 we have digitalWrite(LOW), if we have val==range we have digitalWrite(HIGH)
-analogMap &= ~(1 << pin);
+analogMap &= ~(1 << pin);
+}
+else {
+pinMode(pin, OUTPUT);
+}
uint32_t high = (analogPeriod * val) / analogScale;
uint32_t low = analogPeriod - high;
-pinMode(pin, OUTPUT);
-if (low == 0) {
-digitalWrite(pin, HIGH);
-} else if (high == 0) {
-digitalWrite(pin, LOW);
-} else {
-if (startWaveformClockCycles(pin, high, low, 0)) {
-analogMap |= (1 << pin);
-}
+// Find the first GPIO being generated by checking GCC's find-first-set (returns 1 + the bit of the first 1 in an int32_t)
+int phaseReference = __builtin_ffs(analogMap) - 1;
+if (startWaveformClockCycles(pin, high, low, 0, phaseReference, 0, true)) {
+analogMap |= (1 << pin);
}
}
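
The hunk above splits one PWM period into high and low times using integer arithmetic, and the updated call hands both values to `startWaveformClockCycles` for every duty cycle, including 0% and 100%. A minimal standalone sketch of that split follows. `PwmSplit` and `splitPeriod` are illustrative names that do not appear in the source, and the period value in the checks is an arbitrary example rather than a value taken from the core.

```cpp
#include <cstdint>

// Illustrative result type: high and low time of one PWM period, in CPU cycles.
struct PwmSplit {
    uint32_t high;
    uint32_t low;
};

// Mirror of the arithmetic in __analogWrite:
//   high = (analogPeriod * val) / analogScale;  low = analogPeriod - high;
// Integer division rounds the high time down, so high + low always equals
// the period exactly. Assumes periodCcys * val fits into 32 bits.
constexpr PwmSplit splitPeriod(uint32_t periodCcys, uint32_t val, uint32_t scale) {
    return PwmSplit{ (periodCcys * val) / scale,
                     periodCcys - (periodCcys * val) / scale };
}

// Example period of 80000 cycles with an 8-bit scale (0..255).
static_assert(splitPeriod(80000u, 51u, 255u).high == 16000u, "20% duty, high time");
static_assert(splitPeriod(80000u, 51u, 255u).low == 64000u, "20% duty, low time");
static_assert(splitPeriod(80000u, 0u, 255u).high == 0u, "0% duty has no high time");
static_assert(splitPeriod(80000u, 255u, 255u).low == 0u, "100% duty has no low time");
```

The two edge cases yield a zero high time or a zero low time, which correspond to the `!wave.dutyCcys` and `wave.periodCcys == wave.dutyCcys` branches handled in the ISR earlier in this diff.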


@@ -69,8 +69,8 @@ uint8_t Servo::attach(int pin, uint16_t minUs, uint16_t maxUs)
uint8_t Servo::attach(int pin, uint16_t minUs, uint16_t maxUs, int value)
{
if (!_attached) {
-digitalWrite(pin, LOW);
pinMode(pin, OUTPUT);
+digitalWrite(pin, LOW);
_pin = pin;
_attached = true;
}
@@ -115,7 +115,9 @@ void Servo::writeMicroseconds(int value)
_valueUs = value;
if (_attached) {
_servoMap &= ~(1 << _pin);
-if (startWaveform(_pin, _valueUs, REFRESH_INTERVAL - _valueUs, 0)) {
+// Find the first GPIO being generated by checking GCC's find-first-set (returns 1 + the bit of the first 1 in an int32_t)
+int phaseReference = __builtin_ffs(_servoMap) - 1;
+if (startWaveform(_pin, _valueUs, REFRESH_INTERVAL - _valueUs, 0, phaseReference)) {
_servoMap |= (1 << _pin);
}
}
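
Both the servo and the analogWrite hunks pick a phase reference by taking the lowest set bit of a pin bitmap, after first clearing the bit of the pin being updated. The sketch below isolates that step as a rough illustration. `phaseReferencePin` is an illustrative name that does not appear in the source, and `__builtin_ffs` is a GCC/Clang builtin, so the example assumes one of those compilers.

```cpp
#include <cstdint>

// Return the lowest-numbered pin whose bit is set in activeMap, or -1 if the
// map is empty. __builtin_ffs returns 1 + the index of the least significant
// set bit, and 0 when no bit is set, hence the subtraction.
constexpr int phaseReferencePin(uint32_t activeMap) {
    return __builtin_ffs(static_cast<int>(activeMap)) - 1;
}

// With no other waveform active the result is -1, which matches the default
// value of the alignPhase parameter in the header diff.
static_assert(phaseReferencePin(0u) == -1, "empty map yields no reference");
// Pins 2 and 4 active: the lowest one, pin 2, becomes the reference.
static_assert(phaseReferencePin((1u << 2) | (1u << 4)) == 2, "lowest set bit wins");
```

Because the pin's own bit is cleared before the lookup, a pin is never chosen as its own phase reference.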