Embedded Things to Consider #
- Extremely useful article - https://www.embedded.com/design/prototyping-and-development/4006649/Real-time-debugging-101
Tools #
-
Source-level debugger
- Lets you step through code, stop it, and examine memory contents and program variables
-
Simple printf statement
- Flexible but primitive tool
- Clumsy to use
-
In-circuit emulators (ICE) and JTAG debuggers
-
Allows you to carefully control the execution of the running chip
- Indispensable during early development
-
Can be used to debug from the moment the processor turns on
-
Can be used to debug code in ROM (nonvolatile memory)
-
Once the system is up, may not be sufficient because multiple processes are running
-
-
Data monitor
-
Operating system monitors
- Display events (e.g., task switches, semaphore activity, interrupts)
- Help visualize the relationship and timing between system events
- Easily reveal semaphore priority inversion, deadlocks, and interrupt latency
-
Profilers
- Measure where the system spends its CPU cycles
- Tell you where the bottleneck is, so you can pinpoint where to optimize
-
Memory testers
- find leaks, fragmentation, and corruption
- First line of attack for unpredictable or intermittent problems
-
Execution tracers
- Show which routines were executed, who called them, what the parameters were, and when they were called
- very useful when finding rare events in a huge event stream
-
Coverage Testers
- Show what code is being executed
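As a rough illustration of the execution-tracer idea above, and a lighter-weight alternative to printf, the sketch below logs events into a small ring buffer that can be read back later. Everything here is hypothetical: the names (`trace_log`, `trace_get`), the capacity of 8, and the entry layout are assumptions for illustration, and the code is not interrupt-safe as written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TRACE_CAPACITY 8u  /* hypothetical size; a power of two keeps the modulo cheap */

typedef struct {
    uint32_t timestamp;  /* caller-supplied tick count */
    uint16_t event_id;   /* which routine or event fired */
    uint32_t arg;        /* one parameter worth recording */
} trace_entry_t;

typedef struct {
    trace_entry_t entries[TRACE_CAPACITY];
    uint32_t      total;  /* events ever logged; total % capacity is the next slot */
} trace_buf_t;

void trace_init(trace_buf_t *tb)
{
    assert(tb != NULL);
    tb->total = 0;
}

/* Record one event, overwriting the oldest entry once the buffer is full. */
void trace_log(trace_buf_t *tb, uint32_t timestamp, uint16_t event_id, uint32_t arg)
{
    assert(tb != NULL);
    trace_entry_t *e = &tb->entries[tb->total % TRACE_CAPACITY];
    e->timestamp = timestamp;
    e->event_id  = event_id;
    e->arg       = arg;
    tb->total++;
}

/* Number of entries currently retained (at most TRACE_CAPACITY). */
size_t trace_count(const trace_buf_t *tb)
{
    assert(tb != NULL);
    return tb->total < TRACE_CAPACITY ? (size_t)tb->total : (size_t)TRACE_CAPACITY;
}

/* Copy out the i-th oldest retained entry (0 = oldest).
 * Returns 0 on success, -1 if i is out of range. */
int trace_get(const trace_buf_t *tb, size_t i, trace_entry_t *out)
{
    assert(tb != NULL && out != NULL);
    size_t n = trace_count(tb);
    if (i >= n) {
        return -1;
    }
    uint32_t first = tb->total - (uint32_t)n;  /* sequence number of the oldest entry */
    *out = tb->entries[(first + (uint32_t)i) % TRACE_CAPACITY];
    return 0;
}
```

Writing a few words to RAM is usually far cheaper than formatting a string, which is why a buffer like this tends to disturb timing less than printf. A real system would also need to protect `trace_log` against concurrent calls from ISRs and tasks.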
Potential Issues with Embedded #
-
Find memory problems early
-
Leaks - allocating memory without releasing it
- May build up over a very long period of time. The system eventually fails, and the failure then gets blamed on the hardware, etc.
-
Fragmentation - a side effect of memory being repeatedly allocated and released
- Allocated blocks end up scattered throughout memory, leaving only small free regions from which a large new block cannot be carved
- Fragmentation is a fact of life for dynamically allocated memory
-
Memory corruption
-
Can occur in any language that allows pointers
-
Ways that it can occur:
- Writing off the end of an array
- Writing to freed memory
- Bugs in pointer arithmetic
- Dangling pointers
- Writing off the end of a task stack
-
Classic example: allocate memory -> provide it to a library or OS as a buffer -> then free it and forget about it
-
“Protected memory” keeps processes from corrupting each other, but a process is still capable of corrupting its own memory
-
How do you stop it? Through the language you choose to use, or through testing
-
-
-
Optimize through understanding
-
Profiling is simple and powerful -> but profiling a real-time system is tricky
- You typically profile when you are short on CPU, but enabling the profiler uses even more CPU
-
KNOW HOW YOUR CPU IS EXECUTING your code; IT IS the only path to efficiency
-
-
Trace tools
- Scan for error returns from all OS function calls and common application programming interfaces. Does malloc() fail? Is something timing out?
-
Isolate a problem, reproduce it reliably
- REPRODUCE it; that’s half the problem solved
- Separate and isolate it! Turn features off and on
-
Know where you’ve been -> that means tracking your code with revision control.
-
Make sure the code is completely tested. Use a coverage tool to check that it is.
-
Pursue quality to save time
- Developers spend about 80% of their time debugging, so spend the time to write good code
-
See, understand and make it work
- Analogous to a running car: a mechanic cannot stop the car to see what’s going on when the driver reports that the steering wheel shakes
- So a real-time monitor is your answer to dynamic performance questions and problems
-
HAVE TO’s:
- Test for memory leaks and corruption
- Scan for performance bottlenecks
- Collect variable traces of key sequences
- Ensure sufficient CPU bandwidth
- Evaluate test-coverage effectiveness
- Trace operating system resource usage
- Record execution sequences (semaphores, process interactions) of key activities
- Check for error returns from common application programming interfaces
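One way to act on the first item in the list above is to wrap the allocator so that overruns and leaks show up during testing. The sketch below is a minimal illustration, not a production allocator: the function names (`guarded_malloc`, `guarded_check`, `guarded_free`), the guard pattern, and the header layout are all assumptions made for this example, and it is not thread-safe.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_WORD 0xDEADBEEFu  /* arbitrary pattern, chosen for illustration */

typedef struct {
    size_t   size;   /* bytes requested by the caller */
    uint32_t guard;  /* guard word placed just before the user data */
} guard_header_t;

static long live_allocations = 0;  /* crude leak counter */

/* Allocate size bytes with a guard word on each side of the user data. */
void *guarded_malloc(size_t size)
{
    if (size > SIZE_MAX - sizeof(guard_header_t) - sizeof(uint32_t)) {
        return NULL;  /* total size would overflow */
    }
    guard_header_t *h = malloc(sizeof *h + size + sizeof(uint32_t));
    if (h == NULL) {
        return NULL;
    }
    h->size  = size;
    h->guard = GUARD_WORD;
    uint32_t tail = GUARD_WORD;
    memcpy((unsigned char *)(h + 1) + size, &tail, sizeof tail);
    live_allocations++;
    return h + 1;
}

/* Returns 1 if both guard words are intact, 0 if something overwrote one. */
int guarded_check(const void *p)
{
    assert(p != NULL);
    const guard_header_t *h = (const guard_header_t *)p - 1;
    uint32_t tail;
    memcpy(&tail, (const unsigned char *)p + h->size, sizeof tail);
    return h->guard == GUARD_WORD && tail == GUARD_WORD;
}

/* Free the block and report whether it was still intact (1) or corrupted (0). */
int guarded_free(void *p)
{
    if (p == NULL) {
        return 1;
    }
    int intact = guarded_check(p);
    free((guard_header_t *)p - 1);
    live_allocations--;
    return intact;
}

/* Blocks allocated but not yet freed; nonzero at shutdown hints at a leak. */
long guarded_live_count(void)
{
    return live_allocations;
}
```

Checking the guards at free time only catches writes just past either end of a block. Writes to freed memory and dangling pointers need other techniques, which is where dedicated memory testers earn their keep.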
Memory Management #
-
Run-time memory allocation is… SLOW
-
Most embedded systems use a flat memory model without
- When enabled, it will reduce performance
-
Stack-overflowing
- Temporary run-time stack for function execution; params in, values out
-
A processor register, the stack pointer (SP), keeps track of the current stack memory address
- For a high-level language like C
- The C compiler generates prologue and epilogue code to create and destroy a stack frame for every function call
-
Memory allocation
-
Typically takes place at run-time
-
Stand-alone applications
- The stack is set up during run-time initialization
-
If managed by an operating system (e.g. RTOS)
- Each thread/process will get its own stack
- Stack management is then done by the operating system
- RTOS
- For performance reasons, an RTOS has predefined stack sizes
-
Stack overflow
-
Occurs when stack pointer crosses its allocated boundary
-
When a LARGE variable gets pushed onto the stack, overflow may occur
- IF there is no way around it, use malloc() or declare a global variable
-
To prevent overflow, and for performance reasons, pass variables by location; this is probably part of why C and C++ are fast and efficient
- The location (address) of a variable is used, rather than pushing the entire variable onto the local stack
-
Debugging is kind of hard
- Applications can register exception handlers to catch these during debugging
- Some RTOSes can monitor stack usage and request a dynamic increase in stack size
-
-
-
ISR
-
Async by nature, easily corrupted -> very hard to track down
-
A poorly written ISR can crash the entire system
-
In Assembly
- e.g., registers that were in use when the ISR interrupted are not all saved before and restored after the ISR
-
In C
- The compiler will place any local variable in the ISR on the current stack.
- This works well, but a poorly written ISR that does not account for a large local variable can overflow the currently running stack
-
-
Solutions to possible ISR register corruption
- For C
- Every stack should have enough space to handle interrupt and/or nested-interrupt stack requirements
- An ISR should be short and simple, deferring further processing to a thread, a low-priority interrupt, or a deferred callback
- During development, add diagnostic functions at the start and end of interrupts that compare the registers used in the ISR, to ensure the state of the system is maintained
-
Inline assembly code is often used to manipulate memory-mapped registers and improve performance
- For example, masking an interrupt using assembly is more efficient than calling equivalent functions
-
Simple operations such as atomic (uninterruptible) increment/decrement are commonly written in Assembly
-
-
MUST be aware of compiler run-time semantics before MIXING (Assembly and C)
-
Compiler Optimization
- Although the optimized code should always be logically correct, the compiler’s reordering of code execution may produce invalid results
- If the code fails only with optimization enabled, look for shuffled instructions that have been “optimized by the compiler”
-
Exceptions
-
Used to raise system events (cache miss, stack overflow, hardware trace buffer full, ...)
-
Most processors have their own set of exceptions
- The operating system will typically handle these exceptions
-
Typically the error reported will be “corruption of instruction memory”
- The root cause will be hard to debug because, in a real-time system, a long chain of events may lead up to the corruption
-
-
Solution:
-
Examine the exception context; most processors track the address of the offending instruction
- Identifying the execution path that led to the failure is still troublesome
-
Some processors have HARDWARE-LEVEL tracing, which lets you see the history of the instructions that were executed most recently
-
Watch points
- Similar to breakpoints; they monitor the processor’s internal data bus and raise an exception if a match is found
- Useful if a particular memory location is getting corrupted consistently and you can’t pinpoint the instruction that causes it
-
-
Most debuggers allow you to modify memory and registers directly
-
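Because an RTOS stack has a predefined size, it helps to measure how much of it is actually used. A common technique is to fill ("paint") the stack with a known pattern before the task runs and later count how much of the pattern survives. The sketch below assumes a descending stack (index 0 is the stack limit) and uses hypothetical names and a made-up fill value; it operates on a plain array so it can be tried on a host machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xA5A5A5A5u  /* arbitrary fill pattern, chosen for illustration */

/* Fill a stack region with the pattern before the task starts running. */
void stack_paint(uint32_t *stack, size_t words)
{
    assert(stack != NULL);
    for (size_t i = 0; i < words; i++) {
        stack[i] = STACK_FILL;
    }
}

/* Count words that still hold the pattern, scanning up from the lowest address.
 * Assumes a descending stack: index 0 is the limit, the last index is the
 * initial top of stack. */
size_t stack_unused_words(const uint32_t *stack, size_t words)
{
    assert(stack != NULL);
    size_t n = 0;
    while (n < words && stack[n] == STACK_FILL) {
        n++;
    }
    return n;
}

/* Peak usage so far, in words. A result equal to `words` means the stack was
 * filled completely and has probably overflowed. */
size_t stack_high_water(const uint32_t *stack, size_t words)
{
    return words - stack_unused_words(stack, words);
}
```

The measurement can be fooled if the task happens to write the fill value itself, so treat the result as an estimate and leave some margin for interrupts.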
Mutex vs Semaphores #
- Mutex: can be owned by only one execution entity at any point
- Semaphore: a resource can be shared among a finite number of execution entities
Debugging Embedded Systems - #
- Simultaneous access to shared data by threads or interrupts
- Improper configuration of thread priorities
Examples #
Semaphore example: a message pipe that can handle only a finite number of messages
- Associate a counting semaphore with the message pipe
- Set its initial value to the pipe’s limit, say 10
- An executing entity that wants to place a message on the pipe must first acquire the semaphore, then place the message
- Acquiring decrements the semaphore count; if the count is zero, the entity is blocked until another entity releases the semaphore
- A release can happen once a message is READ from the pipe: the reading entity then releases the semaphore
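The message-pipe example can be sketched in C as follows. This is a single-threaded model meant only to show the counting: where a real RTOS semaphore would block the caller, `csem_try_acquire` simply returns false. All names (`counting_sem_t`, `msgpipe_write`, `msgpipe_read`) are hypothetical, and nothing here is safe to call from multiple threads or ISRs without added locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PIPE_SLOTS 10  /* the limit of 10 used in the example */

typedef struct {
    int count;  /* number of free slots still available */
} counting_sem_t;

/* Take one unit if available. A real RTOS would block here instead of failing. */
bool csem_try_acquire(counting_sem_t *s)
{
    assert(s != NULL);
    if (s->count == 0) {
        return false;
    }
    s->count--;
    return true;
}

/* Give one unit back. */
void csem_release(counting_sem_t *s)
{
    assert(s != NULL);
    s->count++;
}

typedef struct {
    int            slots[PIPE_SLOTS];
    size_t         head;        /* next slot to read */
    size_t         tail;        /* next slot to write */
    counting_sem_t free_slots;  /* starts at PIPE_SLOTS */
} msg_pipe_t;

void msgpipe_init(msg_pipe_t *p)
{
    assert(p != NULL);
    p->head = 0;
    p->tail = 0;
    p->free_slots.count = PIPE_SLOTS;
}

/* Writer: acquire the semaphore first, then place the message. */
bool msgpipe_write(msg_pipe_t *p, int msg)
{
    assert(p != NULL);
    if (!csem_try_acquire(&p->free_slots)) {
        return false;  /* pipe full; a blocking writer would wait here */
    }
    p->slots[p->tail] = msg;
    p->tail = (p->tail + 1) % PIPE_SLOTS;
    return true;
}

/* Reader: take a message, then release the semaphore to free the slot. */
bool msgpipe_read(msg_pipe_t *p, int *msg)
{
    assert(p != NULL && msg != NULL);
    if (p->free_slots.count == PIPE_SLOTS) {
        return false;  /* every slot is free, so the pipe is empty */
    }
    *msg = p->slots[p->head];
    p->head = (p->head + 1) % PIPE_SLOTS;
    csem_release(&p->free_slots);
    return true;
}
```

In a real system a second semaphore (or the RTOS queue primitive itself) usually lets readers block on an empty pipe as well; this sketch only models the writer side described above.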