Embedded Scratchpad
Aug 16, 2022
[Huy Le]
7 minute read

Embedded Things to Consider #

Tools #

  • Source-level debugger

    • Lets you step through code, stop it, and examine memory contents and program variables
  • Simple printf statements

    • Flexible but primitive tool
    • Clumsy to use
  • In-circuit emulators (ICE) and JTAG debuggers

    • Allow you to carefully control the execution of the running chip

      • Indispensable for early development
    • Can be used to debug from the second the processor turns on

    • Can be used to debug code in ROM (nonvolatile memory)

    • Once the system is up, they may not be sufficient, because multiple processes are running

  • Data monitor

  • Operating system monitors

    • Display events (e.g., task switches, semaphore activity, interrupts)
    • Help visualize the relationships and timing between system events
    • Easily reveal semaphore priority inversion, deadlocks, and interrupt latency
  • Profilers

    • Measure where the system spends its CPU cycles
    • Tell you where the bottleneck is, so you can pinpoint where to optimize
  • Memory testers

    • Find leaks, fragmentation, and corruption
    • First line of attack for unpredictable or intermittent problems
  • Execution tracers

    • Show which routines were executed, who called them, what the parameters were, and when they were called
    • very useful when finding rare events in a huge event stream
  • Coverage Testers

    • Show what code is being executed
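
As a rough illustration of the execution-tracer idea above, here is a minimal sketch of an in-memory trace buffer that remembers the last few "routine X was called with argument Y" events. Everything in it is an assumption made for the example: the names `trace_record`, `trace_get`, `trace_size`, the numeric routine IDs, and the tiny depth of 4. A real tracer would also add a timestamp and protect the buffer against interrupts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TRACE_DEPTH 4u   /* hypothetical depth, kept tiny for the example */
static_assert(TRACE_DEPTH > 0u, "trace buffer needs at least one slot");

typedef struct {
    uint32_t seq;         /* order in which the event was recorded */
    uint16_t routine_id;  /* hypothetical numeric ID of the routine */
    uint32_t arg;         /* one parameter worth remembering */
} trace_event;

static trace_event trace_buf[TRACE_DEPTH];
static uint32_t trace_count;  /* total number of events ever recorded */

/* Record one event, overwriting the oldest one when the buffer is full. */
void trace_record(uint16_t routine_id, uint32_t arg) {
    trace_event *e = &trace_buf[trace_count % TRACE_DEPTH];
    e->seq = trace_count;
    e->routine_id = routine_id;
    e->arg = arg;
    trace_count++;
}

/* Number of events currently held (at most TRACE_DEPTH). */
size_t trace_size(void) {
    return trace_count < TRACE_DEPTH ? trace_count : TRACE_DEPTH;
}

/* Fetch the i-th held event, where i = 0 is the oldest; NULL if out of range. */
const trace_event *trace_get(size_t i) {
    if (i >= trace_size()) {
        return NULL;
    }
    uint32_t first = trace_count - (uint32_t)trace_size();
    return &trace_buf[(first + i) % TRACE_DEPTH];
}
```

A call such as `trace_record(7, len)` at the top of a routine costs only a few stores, so it can stay enabled where printf would be too slow. After a failure, the buffer can be dumped from the debugger.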

Potential Issues with Embedded #

  • Find memory problems early

    • Leaks - Allocating memory without releasing it

      • May build up over a very long period of time. The system eventually fails, and then the hardware gets the blame, etc.
    • Fragmentation - the result of memory being repeatedly allocated and released

      • Allocated blocks end up scattered throughout memory, leaving only small free gaps from which a large new block can’t be carved
      • Fragmentation is a fact of life for dynamically allocated memory
    • Memory corruption

      • Can occur in any language that allows pointers

      • Ways that it can occur:

        • Writing off the end of an array
        • Writing to freed memory
        • Bugs in pointer arithmetic
        • Dangling pointers
        • Writing off the end of a task stack
      • Classic example: memory is allocated, handed to a library or the OS as a buffer, and then freed and forgotten while the library or OS still uses it

      • “Protected memory” keeps processes from corrupting each other, but a process can still corrupt its own memory

      • How do you stop it? Through the language you choose, or through testing

  • Optimize through understanding

    • Profiling is simple and powerful, but profiling a real-time system is tricky

      • You typically profile when you are short on CPU, but enabling the profiler uses even more CPU
    • KNOW HOW YOUR CPU IS EXECUTING your code; it is the only path to efficiency

  • Trace tools

    • Scan for error returns from all OS function calls and common application programming interfaces
      • Is malloc() failing? Is a call timing out?
  • Isolate a problem, reproduce it reliably

    • REPRODUCE it; that’s half the problem solved
    • Separate and isolate it! Turn features off and on
  • Know where you’ve been: track your code with revision control.

  • Make sure the code is completely tested. Use a coverage tool to see if it is.

  • Pursue quality to save time

    • Developers spend about 80% of their time debugging, so spend the time to write good code
  • See, understand, and make it work

    • Analogous to a running car: the mechanic cannot stop the car to see what’s going on when the driver reports that the steering wheel shakes
    • So a real-time monitor is your answer to dynamic performance questions and problems
  • HAVE TO’s:

    • Test for memory leaks and corruption
    • Scan for performance bottlenecks
    • Collect a variable trace of key sequences
    • Ensure sufficient CPU bandwidth
    • Evaluate test-coverage effectiveness
    • Trace operating system resource usage
    • Record execution sequences (semaphores, process interactions) of key activities
    • Check for error returns from common application programming interfaces
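
To make the "test for memory leaks and corruption" item concrete, here is a minimal sketch of a wrapper around malloc()/free() that counts live blocks (a crude leak indicator) and places guard bytes after each block to catch writes off the end of a buffer. The names `guard_alloc`, `guard_check`, `guard_free`, and `guard_live_blocks`, the 8 guard bytes, and the 0xCD pattern are all assumptions chosen for this example. It is not thread-safe, and it only detects overruns that actually land in the guard area.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_BYTES   8u
#define GUARD_PATTERN 0xCDu   /* arbitrary marker value */

/* Header stored in front of the user block; the union keeps the
 * user pointer suitably aligned for any object type. */
typedef union {
    size_t size;
    max_align_t align;
} guard_header;

static long live_blocks;   /* allocations not yet freed */

void *guard_alloc(size_t size) {
    uint8_t *raw = malloc(sizeof(guard_header) + size + GUARD_BYTES);
    if (raw == NULL) {
        return NULL;   /* always check: malloc() can fail */
    }
    ((guard_header *)raw)->size = size;
    /* Paint the guard area that sits right after the user block. */
    memset(raw + sizeof(guard_header) + size, GUARD_PATTERN, GUARD_BYTES);
    live_blocks++;
    return raw + sizeof(guard_header);
}

/* Returns true while the guard bytes after the block are untouched. */
bool guard_check(const void *ptr) {
    assert(ptr != NULL);
    const uint8_t *user = ptr;
    const guard_header *hdr =
        (const guard_header *)(user - sizeof(guard_header));
    const uint8_t *guard = user + hdr->size;
    for (size_t i = 0; i < GUARD_BYTES; i++) {
        if (guard[i] != GUARD_PATTERN) {
            return false;
        }
    }
    return true;
}

void guard_free(void *ptr) {
    if (ptr == NULL) {
        return;
    }
    live_blocks--;
    free((uint8_t *)ptr - sizeof(guard_header));
}

/* A count that keeps growing over a long run suggests a leak. */
long guard_live_blocks(void) {
    return live_blocks;
}
```

Calling `guard_check()` just before `guard_free()`, or periodically from an idle task, moves the failure report closer to the code that caused the corruption.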

Memory Management #

  • Run-time memory allocation is… SLOW

  • Most embedded systems use a flat memory model without

    • When enabled, will reduce performance
  • Stack overflows

    • The run-time stack is temporary storage for function execution: parameters in, return values out
  • A processor register, the stack pointer (SP), keeps track of the current stack address

    • For a high-level language like C
      • The C compiler generates prologue and epilogue code that creates and destroys the stack frame for every function call
  • Memory allocation

    • Typically takes place at run time

    • Stand-alone applications

      • The stack is set up during run-time initialization
    • If managed by an operating system (e.g. RTOS)

      • Each thread/process gets its own stack
      • Stack management is then done by the operating system
      • RTOS
        • For performance reasons, an RTOS typically uses predefined stack sizes
    • Stack overflow

      • Occurs when the stack pointer crosses its allocated boundary

      • When a LARGE variable gets pushed onto the stack, overflow may occur

        • If there is no way around it, use malloc() or declare a global variable
      • To prevent overflow, and for performance, large variables are passed by address; this is probably part of why C and C++ are fast and efficient

        • The location of a variable is used, rather than pushing the entire variable onto the local stack
      • Debugging is kind of hard

        • Applications can register exception handlers to catch overflows during debugging
        • Some RTOSes can monitor stack usage and request a dynamic increase in stack size
  • ISR

    • Asynchronous by nature, so corruption creeps in easily and is very hard to track down

    • A poorly written ISR can crash the entire system

      • In Assembly

        • e.g. the registers in use when the interrupt fires are not all saved before the ISR and restored after it
      • In C

        • The compiler places any local variables of the ISR on the current stack.
          • This works well, but a poorly written ISR that does not account for large local variables can overflow the stack of whatever was running
    • Solutions to possible ISR register corruption

      • For C
        • Every stack should have enough space to handle interrupt and/or nested-interrupt stack requirements
        • An ISR should be short and simple, deferring processing to a thread, a low-priority interrupt, or a deferred callback
          • During development, add diagnostic functions at the start and end of interrupts to compare the registers used in the ISR, to ensure the state of the system is maintained
    • Inline assembly code is often used to manipulate memory-mapped registers and improve performance

      • For example, masking an interrupt using assembly is more efficient than calling equivalent functions
    • Simple operations like atomic (uninterruptible) increment/decrement are commonly written in Assembly

  • MUST be aware of the compiler’s run-time semantics before MIXING Assembly and C

  • Compiler Optimization

    • Although the optimized code may always be logically correct, the reordered execution may still produce invalid results
    • If the code fails only when optimization is enabled, look for instructions that have been shuffled, i.e. “optimized by the compiler”
  • Exceptions

    • Used to raise system events (cache miss, stack overflow, hardware trace buffer full, ...)

      • Most processors have their own set of exceptions

        • The operating system will typically handle these exceptions
      • Typically the error that is reported will be “corruption of instruction memory”

        • The root cause will be hard to debug, because in a real-time system a long chain of events may lead up to the corruption
    • Solution:

      • Examine the exception context; most processors track the address of the offending instruction

        • Identifying the execution path that led to the failure is troublesome
      • Some processors have HARDWARE-LEVEL tracing, which lets you see the history of the most recently executed instructions

      • Watchpoints

        • Similar to breakpoints; a watchpoint monitors the processor’s internal data bus and raises an exception if a match is found
        • Useful if a particular memory location is getting corrupted consistently and you can’t pinpoint the instruction that causes it
    • Most debuggers allow you to modify memory and registers directly
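
For the stack-overflow notes in this section, one common way to see how close a task gets to its limit is "stack painting": fill the stack with a known pattern before the task runs, then later count how much of the pattern survives. The sketch below simulates that on a plain byte array. The names `stack_paint` and `stack_unused_bytes` and the 0xA5 fill value are assumptions for the example, and it assumes a stack that grows downward from the end of the region toward its start. On a real target the region would come from the linker script or the RTOS task control block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xA5u   /* arbitrary pattern, unlikely to match real data */

/* Paint the whole stack region before the task starts running. */
void stack_paint(uint8_t *stack, size_t size) {
    assert(stack != NULL);
    for (size_t i = 0; i < size; i++) {
        stack[i] = STACK_FILL;
    }
}

/* Count the bytes at the low end of the region that still hold the
 * pattern. With a downward-growing stack these are the bytes the task
 * never reached. A result of 0 means the stack was used all the way to
 * its boundary, so an overflow is likely. */
size_t stack_unused_bytes(const uint8_t *stack, size_t size) {
    assert(stack != NULL);
    size_t unused = 0;
    while (unused < size && stack[unused] == STACK_FILL) {
        unused++;
    }
    return unused;
}
```

This is a heuristic: a task that happens to store the fill value at the edge of its used area makes the stack look slightly less used than it is. It is still cheap enough to leave in during development, which helps when sizing the predefined RTOS stacks mentioned above.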

Mutex vs Semaphores #

  • Mutex: can be owned by only one execution entity at any point in time
  • Semaphore: a resource can be shared among a finite number of execution entities

Debugging Embedded Systems #

Link here

  • Simultaneous access to shared data by threads or interrupts
  • Improper configuration of thread priorities

Examples #

Semaphore example: A message pipe - it can handle only a finite number of messages

Have a counting semaphore that is associated with the message pipe:

  • Set its initial value to the limit, say 10
  • If an executing entity wants to place a message on the pipe, it needs to acquire the semaphore first and then place the message
  • Acquiring decrements the semaphore count. If the count is already zero, the entity is blocked until another entity releases the semaphore
  • Releasing can happen once a message is READ from the pipe: the reading entity then releases the semaphore
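
The message-pipe protocol can be sketched on a host machine as a small single-threaded model. This is only an illustration of the counting, not a real semaphore: the count is a plain `int`, and where an RTOS would block the sender, `pipe_try_send` simply returns `false`. The names `msg_pipe`, `pipe_init`, `pipe_try_send`, and `pipe_try_receive`, and the use of `int` messages, are assumptions for the example. On a real system the acquire and release steps would be the kernel's own semaphore calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PIPE_LIMIT 10   /* the "say 10" limit from the notes */

typedef struct {
    int permits;             /* semaphore count: free slots remaining */
    int slots[PIPE_LIMIT];   /* message storage */
    size_t head;             /* next slot to write */
    size_t tail;             /* next slot to read */
} msg_pipe;

void pipe_init(msg_pipe *p) {
    assert(p != NULL);
    p->permits = PIPE_LIMIT;   /* initial value is the pipe limit */
    p->head = 0;
    p->tail = 0;
}

/* Sender: acquire the semaphore, then place the message. */
bool pipe_try_send(msg_pipe *p, int msg) {
    if (p->permits == 0) {
        return false;          /* a real RTOS would block the sender here */
    }
    p->permits--;              /* acquire: decrement the count */
    p->slots[p->head] = msg;
    p->head = (p->head + 1) % PIPE_LIMIT;
    return true;
}

/* Reader: take a message, then release the semaphore. */
bool pipe_try_receive(msg_pipe *p, int *msg) {
    if (p->permits == PIPE_LIMIT) {
        return false;          /* every slot is free, so the pipe is empty */
    }
    *msg = p->slots[p->tail];
    p->tail = (p->tail + 1) % PIPE_LIMIT;
    p->permits++;              /* release: a blocked sender could now proceed */
    return true;
}
```

After ten successful sends the eleventh is refused, and a single receive frees one slot again. That is the behavior the counting semaphore enforces, with the blocking left out.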

Resources #