Searching for an easy, widely usable tool to study and develop with ARM NEON, I found ghaflims' work on bare-metal programming for the Cortex-A9 using QEMU.
I am building on his effort to start my own study and development with ARM NEON. I was also helped by a wonderful online tool: NEVADA, a NEON visualizer. It helps me a lot: I use it to test instructions and syntax before writing something down and compiling it.
This is still a work in progress; here I describe some of the steps I follow. I hope it helps.
I use this code to develop NEON assembler code for an emulated Cortex-A9 processor on Windows, using Linaro GCC and QEMU.
To debug the software I use GDB. See the file gdb_cmd.txt for the run-time GDB options I use.
I use qemu-w32-setup-20170420 and gcc-linaro-6.3.1-2017.05-i686-mingw32_arm-linux-gnueabihf.tar.
After installation I added the GCC and QEMU paths to the PATH environment variable.
To compile and run QEMU I use the Bash shell that Octave installed on my PC. I modified the Makefile accordingly.
To run the software I use the command make qemu, which executes: qemu-system-arm -M vexpress-a9 -serial mon:stdio -kernel bin/kernel.elf.
To debug I use make dqemu and, in a second Bash shell, run: arm-linux-gnueabihf-gdb -se bin/kernel.elf -x gdb_cmd.txt.
At this point I set a breakpoint with the b 77 command and issue the c command to continue. Then I use the si command to step one instruction at a time. GDB is configured to show the assembler line; see the content of the file gdb_cmd.txt.
Nothing special! I wrote a function to enable the NEON unit, EnableNEON_asm, and some functions to split the RGB channels, SplitRGB_asm, and to merge them back into RGB, MergeRGB888_asm. This code comes from an ARM example. I also found a color-to-gray routine on the Internet, which I reuse in modified form as Color2Gray888_asm.
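To make clear what the split and merge routines are expected to compute, here is a minimal scalar C sketch. It is not the code in this repository: the names split_rgb_ref and merge_rgb_ref and their signatures are hypothetical, and the packed R, G, B byte order is an assumption. It can serve as a reference to compare the assembler output against.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference: de-interleave packed RGB888 into three
 * separate planes. A NEON structure load (VLD3) does the same job for
 * several pixels per instruction. Assumes R, G, B byte order. */
void split_rgb_ref(const uint8_t *rgb, uint8_t *r, uint8_t *g, uint8_t *b,
                   size_t pixels)
{
    assert(rgb && r && g && b);
    for (size_t i = 0; i < pixels; i++) {
        r[i] = rgb[3 * i + 0];
        g[i] = rgb[3 * i + 1];
        b[i] = rgb[3 * i + 2];
    }
}

/* Hypothetical scalar reference: interleave three planes back into
 * packed RGB888, the inverse of split_rgb_ref (the VST3 direction). */
void merge_rgb_ref(const uint8_t *r, const uint8_t *g, const uint8_t *b,
                   uint8_t *rgb, size_t pixels)
{
    assert(rgb && r && g && b);
    for (size_t i = 0; i < pixels; i++) {
        rgb[3 * i + 0] = r[i];
        rgb[3 * i + 1] = g[i];
        rgb[3 * i + 2] = b[i];
    }
}
```

Splitting an image and merging it again should give back the original bytes, which makes a quick sanity check for the assembler versions.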
I update the file ugui.c with a couple of functions to draw RGB.
Inside the file kernel.c I wrote some code to apply geometric transformations to images. Some of the code comes from one of my own projects; the rest, which works very well, comes from Image Processing in C by Dwayne Phillips.
Mostly it is a work in progress. I plan to recode the geometric functions using NEON intrinsics.
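As an idea of what code written with NEON intrinsics looks like, here is a hedged sketch of a color-to-gray conversion. It is not the Color2Gray888_asm routine from this project: the function name color2gray_sketch is hypothetical, and the weights 77, 151 and 28 (which sum to 256) are an assumed, commonly used integer approximation of the luma coefficients. The intrinsics path is compiled only when the compiler reports NEON support (with GCC for 32-bit ARM this typically needs -mfpu=neon); otherwise only the scalar loop is used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON_SKETCH 1
#endif

/* Hypothetical sketch: packed RGB888 -> 8-bit gray.
 * gray = (77*R + 151*G + 28*B) >> 8, weights are an assumption. */
void color2gray_sketch(const uint8_t *rgb, uint8_t *gray, size_t pixels)
{
    size_t i = 0;
    assert(rgb && gray);

#ifdef HAVE_NEON_SKETCH
    const uint8x8_t wr = vdup_n_u8(77);
    const uint8x8_t wg = vdup_n_u8(151);
    const uint8x8_t wb = vdup_n_u8(28);

    /* Process eight pixels per iteration. */
    for (; i + 8 <= pixels; i += 8) {
        uint8x8x3_t px = vld3_u8(rgb + 3 * i);      /* de-interleave R, G, B */
        uint16x8_t acc = vmull_u8(px.val[0], wr);   /* 77 * R, widened to 16 bit */
        acc = vmlal_u8(acc, px.val[1], wg);         /* + 151 * G */
        acc = vmlal_u8(acc, px.val[2], wb);         /* + 28 * B */
        vst1_u8(gray + i, vshrn_n_u16(acc, 8));     /* >> 8, narrow to 8 bit */
    }
#endif

    /* Scalar loop: handles the leftover pixels, or all of them without NEON. */
    for (; i < pixels; i++) {
        const uint8_t *p = rgb + 3 * i;
        gray[i] = (uint8_t)((77u * p[0] + 151u * p[1] + 28u * p[2]) >> 8);
    }
}
```

Because the weights sum to 256, the 16-bit accumulator cannot overflow (the maximum is 255 * 256), and the scalar tail gives the same result as the vector path for any pixel count.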
The images I use are 160x120, and RGB888.
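With this format, one image occupies 160 * 120 * 3 = 57600 bytes. As an illustration of a simple geometric operation on such a buffer, here is a sketch of a horizontal mirror. It is not taken from kernel.c: the names IMG_W, IMG_H, IMG_BPP, pixel_offset and mirror_horizontal are hypothetical, and a row-major layout without padding between rows is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: 160x120 pixels, 3 bytes per pixel, row-major, no padding. */
enum { IMG_W = 160, IMG_H = 120, IMG_BPP = 3 };

/* Byte offset of pixel (x, y) inside the packed RGB888 buffer. */
size_t pixel_offset(int x, int y)
{
    assert(x >= 0 && x < IMG_W && y >= 0 && y < IMG_H);
    return ((size_t)y * IMG_W + (size_t)x) * IMG_BPP;
}

/* Hypothetical example: mirror the image left to right.
 * src and dst must be distinct buffers of IMG_W * IMG_H * IMG_BPP bytes. */
void mirror_horizontal(const uint8_t *src, uint8_t *dst)
{
    assert(src && dst && src != dst);
    for (int y = 0; y < IMG_H; y++) {
        for (int x = 0; x < IMG_W; x++) {
            const uint8_t *s = src + pixel_offset(x, y);
            uint8_t *d = dst + pixel_offset(IMG_W - 1 - x, y);
            d[0] = s[0];  /* R */
            d[1] = s[1];  /* G */
            d[2] = s[2];  /* B */
        }
    }
}
```

The same pixel_offset helper can be reused for other transformations such as a vertical flip or a 90 degree rotation, where only the destination coordinates change.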
ARM Coding for NEON Part 1: Load and Stores
ARM Coding for NEON Part 2: Dealing With Leftovers
ARM Coding for NEON Part 3: Matrix Multiplication
ARM Coding for NEON Part 4: Shifting Left and Right
ARM Coding for NEON Part 5: Rearranging Vectors
ARM NEON 1.0 Programmer's Guide
ARM NEON Optimization. An Example
Using gdb for Assembly Language Debugging