Lecture #12
Time And Counters;
Watchdog Timers
18-348 Embedded System Engineering
Philip Koopman
Monday, 22-Feb-2016
Electrical & Computer Engineering
© Copyright 2006-2016, Philip Koopman, All Rights Reserved
https://www.youtube.com/watch?v=-5wpm-gesOY
Where Are We Now?
Where we’ve been:
• Part 1 of course – lots of general topics that you’ll need
• DON’T FORGET TO:
– Look at feedback from TAs on your labs
Where we’re going today:
• Time and counters – a bit more nitty-gritty
• Look for using previous concepts (e.g., fixed point math)
Where we’re going next:
• Test #1 in class Wednesday Feb 24, 2016
• Interrupts, concurrency, and scheduling
• Analog and other I/O
• Test #2 on Wednesday April 20, 2016
• Final project is more self-directed; a bit more time to work on it
Preview
Time of day
• Accuracy, drift
• How computers really measure time
Hardware timer operation
• Setting up a timer, including frequency calculations
• Converting a hardware timer to time of day
• Classic timer mistakes
Watchdog timer operation
• How and why to use a watchdog timer
• How not to use a watchdog timer
How Do You Know What Time It Really Is?
www.time.gov
Other good sources:
• GPS
• NIST radio broadcast (WWV radio)
• Cell phone system
• Internet time servers
• But you have to know what time zone you’re in!
– (What about mobile systems?)
Daylight Saving Time & Time Zones
Daylight saving time switches on particular dates
• Which are set by Congress and have been known to change
– WW II had war-time daylight saving time to save energy
– “Energy Crisis” in the 1970s resulted in year-round daylight saving time
– Within Arizona, only the Navajo Nation observes DST (not the rest of the state; not the Hopi Reservation)
• http://www.energy.ca.gov/daylightsaving.html
– Beginning in 2007, Daylight Saving Time extended:
– 2 a.m. on the Second Sunday in March to
2 a.m. on the First Sunday of November.
– This does not correspond to European dates!
www.time.gov
F-22 Raptor Date Line Incident [Wikipedia]
February 2007
• A flight of six F-22 Raptor fighters attempts
to deploy to Japan
• $360 million per aircraft
– (Perhaps $120M RE, rest is NRE)
• Crossing the International Date Line, computers crash
– No navigation
– No communications
– No fuel management
– Almost everything gone!
– Escorted to Hawaii by tankers
– If bad weather, might have
caused loss of aircraft [DoD]
• Cause: “It was a computer glitch in the
millions of lines of code, somebody made an
error in a couple lines of the code and
everything goes.”
2013: NASA Declares End to Deep Impact Comet Mission
http://apod.nasa.gov/apod/image/0505/art1_deepimpact.jpg
http://news.nationalgeographic.com/news/2013/09/130920-deep-impact-ends-comet-mission-nasa-jpl/
Note: Unix epoch is 00:00:00 UTC on 1 January 1970.
ISO 8601 date format: 1970-01-01T00:00:00Z
Problems With Time in the Real World
Coordinated Universal Time (UTC; world time standard)
• Is not a continuous function due to leap seconds
(and is only monotonic by putting 61 seconds in a minute just before midnight)
• Leap year also causes discontinuities, although they’re more predictable
Time zones
• Not just on hourly boundaries – Venezuela is UTC/GMT -4:30 hours; no DST
• TV auto-time-set might sync to channel from wrong time zone via cable feed
DST changeover date changes fairly often
• With little warning compared to a 10-20 year embedded system lifetime
“Y2K”
• “99” → “00” on Jan 1, 2000 (there were many failures, but the world did not end)
• The GPS 1024 week time rollover (a ship got lost at sea…)
• And Unix rollover problem (January 19, 2038 03:14:07 GMT, when a signed 32-bit count of seconds overflows)
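The Unix rollover date can be checked on a host machine. This is a minimal sketch, assuming a hosted C library with `gmtime` and `strftime`; the helper names `format_utc` and `last_int32_second` are hypothetical. It formats a count of seconds since the Unix epoch as ISO 8601 UTC text, so the last second a signed 32-bit counter can hold can be printed directly.

```c
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Format a Unix timestamp as ISO 8601 UTC text.
   Returns 0 on success, -1 on failure. (Hypothetical helper.) */
int format_utc(int64_t unix_seconds, char *out, size_t out_size)
{
    time_t t = (time_t)unix_seconds;
    struct tm *utc = gmtime(&t);          /* standard C; not thread-safe */
    if (utc == NULL) {
        return -1;
    }
    return strftime(out, out_size, "%Y-%m-%dT%H:%M:%SZ", utc) > 0 ? 0 : -1;
}

/* Last second that a signed 32-bit count of seconds since the epoch can hold. */
int64_t last_int32_second(void)
{
    return (int64_t)INT32_MAX;
}
```

Calling `format_utc(last_int32_second(), buf, sizeof buf)` fills `buf` with the final representable second; one second later a 32-bit signed counter wraps negative.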
Internationalization
The Moral Of The Time Stories:
• Keep time in GMT or UTC, not local time
• Keeping time is tricky (rollover, time zones, etc.); kids don’t try this at home
• And … it’s more than just time keeping
What day is 02/03/16?
• In the US: Feb 3, 2016
• In Europe: 2 March 2016
• Don’t forget: AM / PM vs. 24 hour clock
Other internationalization issues:
• English vs. Metric (F vs. C; ft vs. meters; speed limit in mph + distance in km)
• Many, many complications on translation
– Singular v. plural; Gender
– Currency signs & conversion
– ASCII vs. 16-bit Unicode
– …
Time and Computers
Computers are digital (and therefore discrete) devices
• Can count up things (for example, seconds)
• But, can’t actually represent exact analog values
Time is an analog value
• Time flows smoothly as far as we’re concerned, not in big chunks
• How do we get from a smooth, continuous flow to a countable number?
Basic source of timing information – the system clock
• A clock provides discrete time chunks at some operating speed
• Not only clocks the CPU registers, but also provides a basis for counting time!
• Basis of time in computers is – no surprise – counting “clock” cycles
Physical Clock – What’s The Basis For Time?
Typical source: oscillator circuit, perhaps augmented
with GPS time signal
• R/C timing circuit; somewhat stable (e.g., 1% resistor gives a lot of drift!)
• Commodity crystal oscillator; perhaps 10^-6 /sec stability (14-pin DIP size)
• Oven-controlled for wireless communications; perhaps 10^-11 /sec stability
• Micro-rubidium atomic oscillator
– perhaps 10^-11 /month stability
– 0.7 kg weight
– 0.3 liter volume
Can You Run Faster Than The Oscillator?
Course board uses a 4 MHz crystal oscillator
• Divided to 2 MHz to get accurate 50% duty cycle
• Specs from module documentation are: 4 MHz +/- 30 ppm
– 30 ppm = 3 * 10^-5 = 0.003% => +/- 120 Hz
We want to run at 8 MHz
• CPU can handle up to 25 MHz
• Old modules ran at 8 MHz and this avoids many
potential bugs in course software infrastructure
• Any guesses as to why new module is at 2 MHz?
Running faster than the oscillator:
• Turn on the PLL (Phase Locked Loop)
• Set PLL multiplier
• Hardware automatically generates
faster clock that tracks input oscillator edges
• What is the drift rate of this faster oscillator?
Simple Real-World Drift Example
A gizmo has a crystal oscillator running at 32,768 Hz +/- 0.002%
• 32,768 is a standard watch crystal frequency (15-bit divider gives you 1 Hz)
– (0.002% is a 2 * 10^-5 drift rate, i.e., 20 ppm)
• The product specification requires accuracy of 2 seconds/day
• Assume perfect software counting of oscillator clock cycles
• Will the oscillator meet the specification?
(0.00002 sec/sec drift rate) * (60 sec * 60 min * 24 hr)
= 1.728 sec drift per day (so it meets the spec.)
• How far will it drift over a 2-year battery life?
1.728 sec/day * (365.25 days * 2 years) = 21 minutes drift over 2 years
Observations:
• 10^-6 or 10^-7 is probably desirable for consumer products that keep time
– Is the course computer good enough to be a clock?
• There are a lot of seconds in a year (31.6 million ~= 2^25 of them)
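The drift arithmetic above can be sketched with integer math only. This is an illustrative helper, not course code: the name `drift_ms` is hypothetical, and it assumes the oscillator sits at the worst-case edge of its tolerance for the whole interval. A tolerance of 0.002% is 20 ppm.

```c
#include <stdint.h>

/* Worst-case accumulated drift, in milliseconds, for an oscillator that is
   off by drift_ppm parts per million after elapsed_s seconds.
   (Hypothetical helper; assumes constant worst-case error.) */
uint64_t drift_ms(uint32_t drift_ppm, uint64_t elapsed_s)
{
    /* ppm * seconds gives microseconds of drift; divide by 1000 for ms */
    return ((uint64_t)drift_ppm * elapsed_s) / 1000u;
}
```

For example, `drift_ms(20, 86400)` gives the per-day drift and `drift_ms(20, 63115200)` gives the drift over two 365.25-day years.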
Counting The Clocks
Time is an integer count of some number of clock “ticks”
• One year @ 2 MHz takes about 46 bits to represent as an integer – too big to be
useful for most embedded applications
• But, most applications don’t need time to the nearest 1/2,000,000 second
• So, we want time with a bigger granularity than this
Thus, the concept of the timer
• Increment a “timer” once every N CPU clocks (this is a clock “tick”)
– Potentially, tell the CPU to update its software-maintained clock on every timer
increment; maybe a 32-bit integer
• Example: Original IBM PC updated time of day 18.2 times/second
– Windows Forms timer is still that speed (55 msec)
• Many Unix systems have base timers that run at 30 or 60 times/second
– Why this frequency?
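To see why tick granularity matters for counter width, the bit count can be computed directly. This is a rough sketch with hypothetical helper names (`bits_needed`, `ticks_per_year`), assuming a 365.25-day year.

```c
#include <stdint.h>

/* Number of bits needed to represent value as an unsigned integer. */
unsigned bits_needed(uint64_t value)
{
    unsigned bits = 0;
    while (value != 0) {
        bits++;
        value >>= 1;
    }
    return bits;
}

/* Ticks counted in one 365.25-day year at tick_hz ticks per second. */
uint64_t ticks_per_year(uint64_t tick_hz)
{
    return tick_hz * 31557600u;   /* 365.25 * 24 * 60 * 60 seconds */
}
```

Comparing `bits_needed(ticks_per_year(2000000))` with `bits_needed(ticks_per_year(18))` shows how a slow timer tick lets a year of time fit in a 32-bit software counter.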
Course Chip Timing Support
Ignore for today
[Freescale]
Hardware Timer Operation
“Channels” and “IOC” items are for pulse inputs/outputs
• Not relevant to this lecture
Prescaler
• Divide system clock by an integer value as input to timer
– System clock is 8 MHz for course HW; defaults to some other speed in simulator
• PR[2:0] controls prescale amount
– Divide bus clock by: 1, 2, 4, 8, 16, 32, 64, 128
16-bit Counter -- TCNT
• “up” counter – always increments
• Clocked by prescaled bus – one increment every 1, 2, 4, 8, … , 128 bus clocks
[Freescale]
Reading The Hardware Timer
// set TEN = 1: Timer Enable is TSCR1 bit 7
TSCR1 |= 0x80;
// set PR[2:0]: timer prescale is the bottom 3 bits of TSCR2
TSCR2 = (TSCR2 & 0xF8) | 0x04; // 0x04 => bus clock / 16
for (;;)
{ timer_val = TCNT; // update timer_val forever
}
[Freescale]
How Can We Use This To Measure Time?
Every time TCNT rolls over to zero, increment a software time counter
• This is really inefficient!!! – but demonstrates the general idea
int time_count = 0;
// set TEN = 1: Timer Enable is TSCR1 bit 7
TSCR1 |= 0x80;
// set PR[2:0]: timer prescale is the bottom 3 bits of TSCR2
TSCR2 = (TSCR2 & 0xF8) | 0x07; // 0x07 => bus clock / 128
// Only works if the loop runs faster than the timer increments!
for (;;)
{ // increment time_count whenever TCNT reaches zero
  if (TCNT == 0)
  { time_count++;
    while (TCNT == 0) { } // wait for TCNT to change again
  }
}
How Fast Does That Time Counter Increment?
Analytically:
• 8 MHz module
• 16-bit counter rolls over every 65536 counts
• Prescale by 128, so roll over happens 128 times slower
time = 65536 * 128 / 8,000,000 = 1.048576 seconds
• (How long for 2 MHz Module?)
Experimentally (via simulator set for 8 MHz):
• Set breakpoint at: time_count++;
• First breakpoint: 8,372,411
• Second breakpoint: 16,761,026
• Elapsed time: 8388615 clocks = 1.048577 seconds
– (accurate within less than time to execute the loop testing for zero)
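The analytic calculation above generalizes to any bus clock and prescale setting. This is a small sketch under the assumptions already stated in these slides (16-bit TCNT, prescale divides by 2^PR); the function name `rollover_us` is hypothetical.

```c
#include <stdint.h>

/* Microseconds between TCNT rollovers for a 16-bit up counter.
   bus_hz: bus clock in Hz; pr: PR[2:0] field, 0..7 (divide by 2^pr).
   Result is truncated to whole microseconds. (Hypothetical helper.) */
uint64_t rollover_us(uint32_t bus_hz, unsigned pr)
{
    uint64_t bus_clocks = 65536ull << (pr & 7u);   /* bus clocks per rollover */
    return (bus_clocks * 1000000ull) / bus_hz;
}
```

`rollover_us(8000000, 7)` reproduces the 8 MHz case, and `rollover_us(2000000, 7)` answers the 2 MHz question.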
Accuracy
What if we wanted to display time with seconds?
• This hardware won’t make that easy!
• Can’t get exactly 1 second tick values from hardware
• Can do better by updating a lot more frequently than every second
For example, to display time in seconds…
• Find a divider value for which TCNT rolls over every 0.025 to 0.10 seconds
– This is how the IBM PC got 0.055 second ticks – it was an “easy” divider value
• Update a software counter on every TCNT rollover
• Whenever that software counter exceeds 1 second of value,
update the seconds count
• This still won’t display exact seconds….
– Accurate to within TCNT rollover period plus sampling jitter
– But for a clock the human eye can only “see” about 0.05 to 0.1 seconds anyway
Design To Track Seconds
Keep state machine to track rollover
• Only needs to sample TCNT a few times per
rollover to avoid missing one
• Accuracy improved with sampling speed
• Do other application stuff in both “TCNT high bit
set” and “TCNT high bit clear” states
• But how do we handle the rollover?
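One possible shape for the two-state machine described above is sketched below. It is an illustration, not the course solution: the type `rollover_tracker` and function `tracker_sample` are hypothetical names. It assumes the caller samples TCNT at least once while the high bit is set and once while it is clear during every rollover period.

```c
#include <stdint.h>

typedef struct {
    uint8_t  high_was_set;   /* 1 if the previous sample had TCNT bit 15 set */
    uint32_t rollovers;      /* rollovers detected so far */
} rollover_tracker;

/* Feed one sample of the 16-bit counter.
   Returns 1 if this sample shows that TCNT rolled over since the last one. */
int tracker_sample(rollover_tracker *t, uint16_t tcnt)
{
    uint8_t high_now = (uint8_t)((tcnt & 0x8000u) != 0);
    int rolled = (t->high_was_set && !high_now);   /* "high set" -> "high clear" */
    if (rolled) {
        t->rollovers++;
    }
    t->high_was_set = high_now;
    return rolled;
}
```

The transition from "high bit set" to "high bit clear" is the rollover event; the opposite transition just moves the state machine to the other state.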
Design Example – Don’t Lose Fractions
Assume bus clock divide by 64; 25 MHz board
• 65536 * 64 / 25,000,000 => TCNT rollover every 0.167772 seconds
• (Need to sample TCNT every 0.08 seconds to catch the rollover event)
• If we want to keep seconds, then increment seconds every 5 or 6 rollovers
How do we track fractional seconds without floating point?
• Answer: 16.16 fixed point! – a 32-bit fixed point integer
– unsigned long current_time;
– Top 16 bits are integer seconds
– Bottom 16 bits are fractional seconds
(each integer “count” = 1/65536 seconds = 0.00001525878906 seconds)
• For each TCNT rollover, add 0.167772 / 0.00001525878906
= 10995 fractional-second counts (10995.116, truncated)
• TCNT rollover becomes: current_time += 10995;
• Seconds are in: (current_time >> 16) & 0xFFFF;
Time Accuracy Calculation
An approximation makes life easy, but how far off is it?
In 10,000 seconds, TCNT will roll over:
• 10,000 * 25,000,000 / (65536 * 64) = 59,605 times
• That’s 10995 counts added to the 32-bit time counter on each rollover:
10995 * 59,605 = 655,356,975 => $270F F42F
• Top 16 bits are $270F => 9999 whole seconds, plus a fraction of 0.95 (instead of 10,000)
• Truncating 10995.116 to 10995 loses 0.116 / 10995.116 => about 0.001% (10.6 ppm slow)
• How could we be better?
Is this good enough?
• Crystal oscillator is 4 MHz +/- 0.003% (30 ppm), which is larger than this truncation error
• Error is: 0.00106% * 31536000 seconds/year = about 333 seconds (5.6 minutes) per year;
about 0.9 seconds/day
• NOTE: Our time counter rolls over every 64K seconds = 18.2 hours
– What this really means is you want 32.32 fixed point time for longer operation
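The 16.16 bookkeeping can be tried on a host before moving it to the board. The sketch below is illustrative only; the function names are hypothetical. It derives the per-rollover increment from the bus clock and prescale divider rather than hard-coding it, so the truncation in that one division is the only approximation.

```c
#include <stdint.h>

/* 16.16 fixed-point counts to add per TCNT rollover (truncated).
   One rollover lasts 65536 * prescale / bus_hz seconds,
   and one fixed-point count is 1/65536 second. */
uint32_t increment_16_16(uint32_t bus_hz, uint32_t prescale)
{
    return (uint32_t)((65536ull * prescale * 65536ull) / bus_hz);
}

/* Advance the 16.16 time value by one rollover. */
uint32_t add_rollover(uint32_t current_time, uint32_t increment)
{
    return current_time + increment;
}

/* Whole seconds live in the top 16 bits. */
uint16_t whole_seconds(uint32_t current_time)
{
    return (uint16_t)(current_time >> 16);
}
```

Calling `add_rollover` once per detected rollover and reading `whole_seconds` gives the displayed seconds. Widening the accumulator to 64 bits (32.32) would use the same pattern with a larger shift.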
Why Are Timers Such A Big Deal?
No more counting NOPs in loops
• NOP-delay loops are a pain to build and get right
• And they break every time you change the oscillator speed or CPU clocks/instr!
Lets processor do other useful work while keeping time
• Can check timer once in a while to see if top bit of TCNT rolled over
• Combined with “interrupts” (next lecture), processor doesn’t have to check time
periodically – is just notified on every rollover of TCNT
Time values independent of software execution
• Not sensitive to variations in instruction timing
• Still works if software inside loop has multiple “if/else” paths…
because it is not based on how long software takes to run
• Still works at different clock speed (need to adjust the prescale value)
BUT, it’s a bit of work getting accurate time-of-day values
• Have to take into account exactly how often HW timer ticks and rolls over!
Classic Timer Mistakes – “Nanosecond” Time
• [http://www.gnu.org/software/libc/manual/html_node/Elapsed-Time.html#Elapsed-Time]
• Data Type: struct timespec
The struct timespec structure represents an elapsed time. It is
declared in time.h and has the following members:
• long int tv_sec
– This represents the number of whole seconds of elapsed time.
• long int tv_nsec
– This is the rest of the elapsed time (a fraction of a second),
represented as the number of nanoseconds. It is always less
than one billion.
This value reports time in nanoseconds
• That means it is a number of nanoseconds
• That does NOT mean it is the nearest nanosecond
• The underlying hardware has a timer that only increments once in a while!
• Classic mistake is to ignore quantization error in the timers
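Quantization error is easy to model. In this sketch (hypothetical helper names, with the tick period passed in as an assumption about the underlying hardware), a value reported "in nanoseconds" only ever changes in steps of one timer tick.

```c
#include <stdint.h>

/* Value a timer with tick_ns resolution would report for true time true_ns. */
uint64_t reported_ns(uint64_t true_ns, uint64_t tick_ns)
{
    return (true_ns / tick_ns) * tick_ns;   /* rounds down to the last tick */
}

/* Quantization error: how stale the reported value is (0 .. tick_ns - 1). */
uint64_t quantization_error_ns(uint64_t true_ns, uint64_t tick_ns)
{
    return true_ns - reported_ns(true_ns, tick_ns);
}
```

With a 55 msec tick (`tick_ns = 55000000`), every reported value is a multiple of 55,000,000 ns even though the field is in units of nanoseconds.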
Classic Timer Mistakes – Non-Atomic Access
[Freescale]
What happens if you use two 8-bit reads (LDAA) instead of LDD?
• 16-bit fetch locks the value as it is being read; gives correct result
• Timer hardware might increment between two byte-sized reads
• AND, that increment might include a carry from low 8 to high 8 bits
• $03FF → $0400: read hi then lo gives $03 … $00 => $0300!
• This is an absolutely classic timer bug – don’t let it happen to you!!!
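The torn read can be reproduced on a host with a simulated counter. Everything below is a hypothetical model, not the Freescale hardware: `sim_tcnt` advances by one count on every byte access to force the worst case. `safe_read` shows one common workaround for the case where only byte-wide reads are available: read high, low, then high again, and retry if the high byte changed.

```c
#include <stdint.h>

/* Simulated 16-bit timer that advances by one count on every byte access. */
uint16_t sim_tcnt;

uint8_t read_hi(void) { uint8_t v = (uint8_t)(sim_tcnt >> 8);    sim_tcnt++; return v; }
uint8_t read_lo(void) { uint8_t v = (uint8_t)(sim_tcnt & 0xFFu); sim_tcnt++; return v; }

/* Naive two-byte read: the counter can change between the two accesses. */
uint16_t torn_read(void)
{
    uint8_t hi = read_hi();
    uint8_t lo = read_lo();
    return (uint16_t)((hi << 8) | lo);
}

/* Retry until the high byte is the same before and after the low-byte read. */
uint16_t safe_read(void)
{
    uint8_t hi, lo, hi_again;
    do {
        hi = read_hi();
        lo = read_lo();
        hi_again = read_hi();
    } while (hi != hi_again);
    return (uint16_t)((hi << 8) | lo);
}
```

Starting from `sim_tcnt = 0x03FF`, `torn_read` returns a value the counter never held, while `safe_read` returns a value from shortly after the carry. On the real part, a single 16-bit read is the simpler fix.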
Classic Timer Mistakes – Rollover
http://www.nytimes.com/2015/05/01/business/faa-orders-fix-for-possible-power-loss-in-boeing-787.html
Eventually integer timers roll over
• Assume time kept in 100ths of a second as a signed 32-bit integer (wrong type!)
• 0x7FFFFFFF = 2147483647; 2147483647 / (24 * 60 * 60 * 100) = 248.55 days to overflow
• (Note: unsigned int would roll over after 497 days)
http://rgl.faa.gov/Regulatory_and_Guidance_Library/rgad.nsf/0/584c7ee3b270fa3086257e38004d0f3e/$FILE/2015-09-07.pdf
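Two small helpers illustrate the rollover arithmetic. They are sketches with hypothetical names. `elapsed_ticks` relies on the fact that unsigned subtraction in C wraps modulo 2^32, so an interval measured on a free-running unsigned counter stays correct across a single rollover, provided the interval itself is shorter than one full counter period.

```c
#include <stdint.h>

/* Ticks elapsed from start to now on a free-running unsigned 32-bit counter.
   Correct across one rollover because unsigned subtraction wraps modulo 2^32. */
uint32_t elapsed_ticks(uint32_t now, uint32_t start)
{
    return now - start;
}

/* Whole days until a counter of 1/100-second ticks reaches max_count. */
uint32_t days_to_overflow(uint32_t max_count)
{
    return max_count / (24u * 60u * 60u * 100u);
}
```

`days_to_overflow(0x7FFFFFFF)` and `days_to_overflow(0xFFFFFFFF)` give the signed and unsigned lifetimes in whole days. The wrap-safe subtraction does not help if the software stores absolute time in a type that is too small.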
Watchdog Timers – Detecting Software “Hangs”
A common symptom of a software problem – system hang
• Could be an infinite loop
• Could be continually chasing a “wild” pointer around
• Could be corrupted data
• … but often systems “lock up” or “hang”
Good general-purpose remedy – reboot system if it hangs
• But, there is no person around to press “ctl-alt-delete”
• So, let the watchdog timer do it instead
• BUT realize this doesn’t solve all problems
– just some that are nice to address
Basic watchdog idea:
• Have a hardware timer running all the time (count-down timer)
• When timer reaches zero, it resets the system
• Software periodically “kicks” (or “pets”) the watchdog, restarting the count
• If software has “kicked” the watchdog often enough, no reset takes place
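The basic idea can be modeled in a few lines of host code. This is a software simulation for illustration only (the `watchdog` type and `wd_*` functions are hypothetical); a real watchdog is separate hardware that the CPU cannot stall.

```c
#include <stdint.h>

typedef struct {
    uint32_t reload;   /* count loaded at start and on every kick */
    uint32_t count;    /* current countdown value */
} watchdog;

/* System reset starts the watchdog at its initial value. */
void wd_start(watchdog *wd, uint32_t reload)
{
    wd->reload = reload;
    wd->count = reload;
}

/* Software "kick": restart the countdown. */
void wd_kick(watchdog *wd)
{
    wd->count = wd->reload;
}

/* Advance one clock. Returns 1 if the count reached zero (system reset). */
int wd_clock(watchdog *wd)
{
    if (wd->count > 0) {
        wd->count--;
    }
    if (wd->count == 0) {
        wd->count = wd->reload;   /* the reset restarts the watchdog */
        return 1;
    }
    return 0;
}
```

As long as `wd_kick` is called more often than every `reload` clocks, `wd_clock` never returns 1.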
Watchdog General Block Diagram
System reset starts the watchdog initially
• Clock is used to count-down the watchdog timer
• Kick restarts the watchdog
• Watchdog resets CPU when it reaches zero
[Figure: watchdog block diagram. CLOCK and KICK are inputs to the WATCHDOG TIMER; its RESET output drives the CPU in the microcontroller.]
Course MCU Watchdog “COP”
See chapter 9 of data sheet – “Clocks and Reset Generator” (CRGV4)
• COP = “Computer Operating Properly”, the Freescale name for a watchdog
[Freescale]
When To Kick
Kick periodically
• Often enough to avoid reset
Kick only when doing so means the
system is really alive
• Between major subroutine calls
• Only in the main program loop
• NEVER within individual task loops
– Except if you are sure they will
terminate (e.g., fixed integer loop
bounds)
– And even then, probably only in the
main program loop
These are basic rules
• Advanced topic: with multitasking
system, every task should participate in a
consensus-based watchdog reset
operation
Watchdog Timer Select
Set watchdog so that it is fast enough to catch problems quickly
• But not so fast that correctly running software can’t kick it in time
• Requires estimate of program execution speed between kicks
[Freescale]
Petting The Watchdog (Kicking the COP)
NOTE – multi-operation “kick” to reduce chance of random code kicking it
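A multi-operation kick can be modeled as a tiny sequence detector. This sketch is a host-side model with hypothetical names. The key values 0x55 followed by 0xAA are an assumption based on how the Freescale COP arming register is commonly documented, so confirm them against the data sheet. In this model a wrong value only breaks the sequence; real hardware may respond more harshly, for example by resetting.

```c
#include <stdint.h>

typedef struct {
    uint8_t  armed;   /* 1 after the first key value has been written */
    uint32_t kicks;   /* completed kick sequences */
} kick_port;

/* Model one write to the kick register.
   Returns 1 only when the second key value completes a valid sequence. */
int kick_write(kick_port *p, uint8_t value)
{
    if (value == 0x55u) {            /* first key (assumed value) */
        p->armed = 1;
        return 0;
    }
    if (value == 0xAAu && p->armed) {   /* second key (assumed value) */
        p->armed = 0;
        p->kicks++;
        return 1;
    }
    p->armed = 0;                    /* anything else breaks the sequence */
    return 0;
}
```

Runaway code that writes a single random byte is unlikely to produce both values in order, which is the point of requiring a sequence.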
[Freescale]
Bad Watchdog Use
Kicking inside a single task loop
• OK, so that loop is alive, but what about other tasks?
Kicking in a great many places in the code
• Only kick in the main loop; as few places as possible
• What if you make a mistake and kick inside a loop?
Hooking up a timer interrupt to kick the watchdog
• Every time timer rolls over, kick the watchdog
• Only proves the timer is working, not the main tasks!
• (There are very special exceptions for multitasking)
Watchdog can be defeated by software
• HW should prevent watchdog turning off once on
• HW should prevent masking/disabling the watchdog
reset once enabled
• Watchdog should require sequence of values to “kick”
• Some systems forget to turn on the watchdog
Watchdog Margin
Let’s say you set the watchdog where you think it should be
• You compute expected task execution time
• In the lab, you never see a watchdog trip
– Hopefully you don’t blame one on something else – make sure they are
unmistakable!
• In the field, the watchdog trips – what happened?
– Well, obviously something you didn’t test
– Maybe you set the watchdog too close to the edge!
Testing watchdog margin
• Change the watchdog divider until it trips
– Does it trip where you expect? (If not, you don’t understand something)
• Add some time-wasting nop-loops in your code
– Does it trip where you expect? (If not, you don’t understand something)
Multi-Tasking Watchdog
Consider a preemptive tasking system
• (We’ll talk more about preemption later – we just mean “multi-tasking” here)
• Assume there is a watchdog timer (a COP timer)
• kick() restarts the watchdog timer at its initial value
void task0(void) { .. Do stuff..; kick(); …more… ;}
void task1(void) { .. Do stuff..; kick(); …more… ;}
void task2(void) { .. Do stuff..; kick(); …more… ;}
void task3(void) { .. Do stuff..; kick(); …more… ;}
• What’s wrong with the above approach?
• (Murphy00 supplemental reading also talks about this issue)
Effective Multi-Tasking Watchdog Approach
void task0(void) { .. Do stuff..; Alive(0x1); }
void task1(void) { .. Do stuff..; Alive(0x2); }
void task2(void) { .. Do stuff..; Alive(0x4); }
void task3(void) { .. Do stuff..; Alive(0x8); }
Main idea – each task sets a bit indicating it has run
• Separate watchdog monitor task kicks watchdog only when every task has reported in
• Needs to be modified to account for task periods, but this is the basic idea
static volatile uint16 watch_flag = 0; // shared by all tasks
void Alive(uint16 x) // set task’s “I’m Alive” bit
{ SEI(); // disable interrupts
  watch_flag |= x;
  CLI(); // enable interrupts
}
void taskw(void) // run periodically
{ if (watch_flag == 0x0F) // if all tasks alive
  { kick(); // kick watchdog
    watch_flag = 0; // erase flags
  }
}
Review
Time of day
• Accuracy – time measurement and quantization
• Drift – due to oscillator speed AND software inaccuracies
• Converting a hardware timer to time of day
Hardware timer operation
• Setting up a timer, including frequency calculations
• Classic timer mistakes
Watchdog timer operation
• Setting up the watchdog, including frequency calculations
• How to ensure a watchdog timer is set properly
• Rules for good and bad watchdog use
• Multi-tasking watchdog
Lab Skills
Counter/timer
• Be able to set, read, and generate time of day from a hardware timer
Watchdog timer
• Be able to set up and measure effects of watchdog timer