[llvm-objdump] Add the --visualize-jumps option #74858

Status: Open (9 commits to be merged into base: main)

ostannard
Collaborator

This adds a feature that GNU objdump already has: printing a control-flow graph of all the direct branches alongside the disassembly.

This should work for any architecture which implements the MCInstrAnalysis class; I've tested it on ARM, Thumb and AArch64.

There are a few (known) differences between this and the GNU objdump version:

  • This has the option to draw the lines using Unicode line-drawing characters instead of ASCII. This is on by default because I find the connected lines much easier to read. I've included an option to revert to ASCII for terminals or fonts which don't support the right bits of Unicode.
  • I haven't yet implemented the extended-color mode. With GNU objdump this often results in some very similar colors, so we might want to carefully pick a palette which minimises that.
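
The fixtures in the patch below suggest the layout rules: branches to the same target share one line, shorter lines sit further to the right, and the ASCII and Unicode modes differ only in the glyph table. The following is a minimal C++ sketch of such a layout, not the code from this patch (the relevant part of the diff is truncated). `Edge`, `Glyphs`, `Group` and `renderJumps` are hypothetical names, rows are instruction indices rather than addresses, and the greedy column assignment is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// One direct branch: the row it sits on and the row it targets (hypothetical type).
struct Edge { int From; int To; };

// Glyph tables mirroring the characters seen in the ASCII and Unicode fixtures.
struct Glyphs { std::string Top, Bottom, Tee, Vert, Horiz, Arrow; };
const Glyphs Ascii{"/", "\\", "+", "|", "-", ">"};
const Glyphs Unicode{"╭", "╰", "├", "│", "─", ">"};

// All branches to one target are drawn as a single line spanning Top..Bottom.
struct Group { int Target; std::vector<int> Sources; int Top; int Bottom; int Column; };

std::vector<std::string> renderJumps(int NumRows, const std::vector<Edge> &Edges,
                                     const Glyphs &G) {
  const std::string Space = " ";

  // Merge edges by target; a branch to itself gets no line.
  std::map<int, Group> ByTarget;
  for (const Edge &E : Edges) {
    if (E.From == E.To)
      continue;
    Group &Grp = ByTarget.insert({E.To, Group{E.To, {}, E.To, E.To, -1}}).first->second;
    Grp.Sources.push_back(E.From);
    Grp.Top = std::min(Grp.Top, E.From);
    Grp.Bottom = std::max(Grp.Bottom, E.From);
  }
  std::vector<Group> Groups;
  for (const auto &KV : ByTarget)
    Groups.push_back(KV.second);

  // Shortest spans first, so they claim the rightmost column (column 0).
  std::stable_sort(Groups.begin(), Groups.end(), [](const Group &A, const Group &B) {
    return A.Bottom - A.Top < B.Bottom - B.Top;
  });
  int NumCols = 0;
  for (std::size_t I = 0; I < Groups.size(); ++I) {
    int Col = 0;
    bool Clash = true;
    while (Clash) {
      Clash = false;
      for (std::size_t J = 0; J < I; ++J)
        if (Groups[J].Column == Col && Groups[J].Top <= Groups[I].Bottom &&
            Groups[I].Top <= Groups[J].Bottom) {
          Clash = true; // column taken by an overlapping line: move one left
          ++Col;
          break;
        }
    }
    Groups[I].Column = Col;
    NumCols = std::max(NumCols, Col + 1);
  }

  // Draw each row from the leftmost column to the rightmost, two cells per column.
  std::vector<std::string> Lines(NumRows);
  for (int Row = 0; Row < NumRows; ++Row) {
    std::string &Line = Lines[Row];
    bool Horizontal = false, IsTarget = false;
    for (int Col = NumCols - 1; Col >= 0; --Col) {
      const Group *Here = nullptr;
      for (const Group &Grp : Groups)
        if (Grp.Column == Col && Grp.Top <= Row && Row <= Grp.Bottom)
          Here = &Grp;
      if (!Here) {
        Line += Horizontal ? G.Horiz : Space;
      } else {
        bool IsSource = std::find(Here->Sources.begin(), Here->Sources.end(), Row) !=
                        Here->Sources.end();
        if (!IsSource && Row != Here->Target) {
          Line += G.Vert; // line passes through this row
        } else {
          Line += Row == Here->Top ? G.Top : Row == Here->Bottom ? G.Bottom : G.Tee;
          Horizontal = true;
          IsTarget = IsTarget || Row == Here->Target;
        }
      }
      Line += Horizontal ? G.Horiz : Space;
    }
    Line += !Horizontal ? Space : IsTarget ? G.Arrow : G.Horiz;
  }
  return Lines;
}
```

For three nested forward branches, `renderJumps(6, {{0, 5}, {1, 4}, {2, 3}}, Ascii)` gives the same seven-character graph prefixes as rows `24` to `38` of the AArch64 ASCII fixture. The sketch only uses as many columns as it needs instead of padding to a fixed width, and it draws a branch whose target is itself a branch (the TODO case in the tests) differently from the fixtures.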

Related issue: #60172

Demo: Screenshot from 2023-12-08 16-11-24

@llvmbot
Member

llvmbot commented Dec 8, 2023

@llvm/pr-subscribers-llvm-binary-utilities

Author: None (ostannard)


Patch is 60.76 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/74858.diff

19 Files Affected:

  • (modified) llvm/lib/Support/FormattedStream.cpp (+1)
  • (added) llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-aarch64-ascii.txt (+31)
  • (added) llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-aarch64-unicode-color.txt (+31)
  • (added) llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-aarch64-unicode-relocs.txt (+34)
  • (added) llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-aarch64-unicode.txt (+31)
  • (added) llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-arm-ascii.txt (+27)
  • (added) llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-arm-unicode.txt (+27)
  • (added) llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-thumb-ascii.txt (+34)
  • (added) llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-thumb-unicode.txt (+34)
  • (added) llvm/test/tools/llvm-objdump/visualize-jumps-aarch64.s (+69)
  • (added) llvm/test/tools/llvm-objdump/visualize-jumps-arm.s (+54)
  • (added) llvm/test/tools/llvm-objdump/visualize-jumps-thumb.s (+63)
  • (modified) llvm/tools/llvm-objdump/ObjdumpOpts.td (+7)
  • (modified) llvm/tools/llvm-objdump/SourcePrinter.cpp (+194-2)
  • (modified) llvm/tools/llvm-objdump/SourcePrinter.h (+112-1)
  • (modified) llvm/tools/llvm-objdump/XCOFFDump.cpp (+1-1)
  • (modified) llvm/tools/llvm-objdump/llvm-objdump.cpp (+235-80)
  • (modified) llvm/tools/llvm-objdump/llvm-objdump.h (+13)
  • (modified) llvm/unittests/Support/formatted_raw_ostream_test.cpp (+4-5)
diff --git a/llvm/lib/Support/FormattedStream.cpp b/llvm/lib/Support/FormattedStream.cpp
index c0d28435099570..48b02889c6d3e5 100644
--- a/llvm/lib/Support/FormattedStream.cpp
+++ b/llvm/lib/Support/FormattedStream.cpp
@@ -45,6 +45,7 @@ void formatted_raw_ostream::UpdatePosition(const char *Ptr, size_t Size) {
       break;
     case '\t':
       // Assumes tab stop = 8 characters.
+      Column += 1;
       Column += (8 - (Column & 0x7)) & 0x7;
       break;
     }
diff --git a/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-aarch64-ascii.txt b/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-aarch64-ascii.txt
new file mode 100644
index 00000000000000..592e50d98f8ff7
--- /dev/null
+++ b/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-aarch64-ascii.txt
@@ -0,0 +1,31 @@
+
+<stdin>:	file format elf64-littleaarch64
+
+Disassembly of section .text:
+
+0000000000000000 <test_func>:
+       0:           94000000   	bl	0x0 <test_func>
+       4:           14000000   	b	0x4 <test_func+0x4>
+       8:       /-- 14000001   	b	0xc <test_func+0xc>
+       c:       +-> d503201f   	nop
+      10:       \-- 17ffffff   	b	0xc <test_func+0xc>
+      14:           14000000   	b	0x14 <test_func+0x14>
+      18:       /-- 54000040   	b.eq	0x20 <test_func+0x20>
+      1c:       +-- b4000020   	cbz	x0, 0x20 <test_func+0x20>
+      20:       \-> d503201f   	nop
+      24:   /------ 14000005   	b	0x38 <test_func+0x38>
+      28:   | /---- 14000003   	b	0x34 <test_func+0x34>
+      2c:   | | /-- 14000001   	b	0x30 <test_func+0x30>
+      30:   | | \-> d503201f   	nop
+      34:   | \---> d503201f   	nop
+      38:   \-----> d503201f   	nop
+      3c:       /-- 14000002   	b	0x44 <test_func+0x44>
+      40:     /-|-- 14000002   	b	0x48 <test_func+0x48>
+      44:     | \-> d503201f   	nop
+      48:     \---> d503201f   	nop
+      4c:       /-- 14000001   	b	0x50 <test_func+0x50>
+      50:     /-|-- 14000001   	b	0x54 <test_func+0x54>
+      54:     \---> d503201f   	nop
+      58:       /-- 14000002   	b	0x60 <test_func+0x60>
+      5c:       |   94000000   	bl	0x5c <test_func+0x5c>
+      60:       \-> d503201f   	nop
diff --git a/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-aarch64-unicode-color.txt b/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-aarch64-unicode-color.txt
new file mode 100644
index 00000000000000..925c5ef52a8d22
--- /dev/null
+++ b/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-aarch64-unicode-color.txt
@@ -0,0 +1,31 @@
+
+<stdin>:	file format elf64-littleaarch64
+
+Disassembly of section .text:
+
+0000000000000000 <test_func>:
+       0:          �[0m 94000000   	bl	0x0 <test_func>
+       4:          �[0m 14000000   	b	0x4 <test_func+0x4>
+       8:      �[0;31m ╭�[0;31m──�[0m 14000001   	b	0xc <test_func+0xc>
+       c:      �[0;31m ├�[0;31m─>�[0m d503201f   	nop
+      10:      �[0;31m ╰�[0;31m──�[0m 17ffffff   	b	0xc <test_func+0xc>
+      14:          �[0m 14000000   	b	0x14 <test_func+0x14>
+      18:      �[0;32m ╭�[0;32m──�[0m 54000040   	b.eq	0x20 <test_func+0x20>
+      1c:      �[0;32m ├�[0;32m──�[0m b4000020   	cbz	x0, 0x20 <test_func+0x20>
+      20:      �[0;32m ╰�[0;32m─>�[0m d503201f   	nop
+      24:  �[0;33m ╭�[0;33m──�[0;33m──�[0;33m──�[0m 14000005   	b	0x38 <test_func+0x38>
+      28:  �[0;33m │�[0;34m ╭�[0;34m──�[0;34m──�[0m 14000003   	b	0x34 <test_func+0x34>
+      2c:  �[0;33m │�[0;34m │�[0;35m ╭�[0;35m──�[0m 14000001   	b	0x30 <test_func+0x30>
+      30:  �[0;33m │�[0;34m │�[0;35m ╰�[0;35m─>�[0m d503201f   	nop
+      34:  �[0;33m │�[0;34m ╰�[0;34m──�[0;34m─>�[0m d503201f   	nop
+      38:  �[0;33m ╰�[0;33m──�[0;33m──�[0;33m─>�[0m d503201f   	nop
+      3c:      �[0;36m ╭�[0;36m──�[0m 14000002   	b	0x44 <test_func+0x44>
+      40:    �[0;31m ╭�[0;31m─�[0;36m│�[0;31m──�[0m 14000002   	b	0x48 <test_func+0x48>
+      44:    �[0;31m │�[0;36m ╰�[0;36m─>�[0m d503201f   	nop
+      48:    �[0;31m ╰�[0;31m──�[0;31m─>�[0m d503201f   	nop
+      4c:      �[0;32m ╭�[0;32m──�[0m 14000001   	b	0x50 <test_func+0x50>
+      50:    �[0;33m ╭�[0;33m─�[0;32m│�[0;33m──�[0m 14000001   	b	0x54 <test_func+0x54>
+      54:    �[0;33m ╰�[0;33m──�[0;33m─>�[0m d503201f   	nop
+      58:      �[0;34m ╭�[0;34m──�[0m 14000002   	b	0x60 <test_func+0x60>
+      5c:      �[0;34m │  �[0m 94000000   	bl	0x5c <test_func+0x5c>
+      60:      �[0;34m ╰�[0;34m─>�[0m d503201f   	nop
diff --git a/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-aarch64-unicode-relocs.txt b/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-aarch64-unicode-relocs.txt
new file mode 100644
index 00000000000000..75ec5741fdaae2
--- /dev/null
+++ b/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-aarch64-unicode-relocs.txt
@@ -0,0 +1,34 @@
+
+<stdin>:	file format elf64-littleaarch64
+
+Disassembly of section .text:
+
+0000000000000000 <test_func>:
+       0:           94000000   	bl	0x0 <test_func>
+                   		0000000000000000:  R_AARCH64_CALL26	extern_func
+       4:           14000000   	b	0x4 <test_func+0x4>
+                   		0000000000000004:  R_AARCH64_JUMP26	extern_func
+       8:       ╭── 14000001   	b	0xc <test_func+0xc>
+       c:       ├─> d503201f   	nop
+      10:       ╰── 17ffffff   	b	0xc <test_func+0xc>
+      14:           14000000   	b	0x14 <test_func+0x14>
+      18:       ╭── 54000040   	b.eq	0x20 <test_func+0x20>
+      1c:       ├── b4000020   	cbz	x0, 0x20 <test_func+0x20>
+      20:       ╰─> d503201f   	nop
+      24:   ╭────── 14000005   	b	0x38 <test_func+0x38>
+      28:   │ ╭──── 14000003   	b	0x34 <test_func+0x34>
+      2c:   │ │ ╭── 14000001   	b	0x30 <test_func+0x30>
+      30:   │ │ ╰─> d503201f   	nop
+      34:   │ ╰───> d503201f   	nop
+      38:   ╰─────> d503201f   	nop
+      3c:       ╭── 14000002   	b	0x44 <test_func+0x44>
+      40:     ╭─│── 14000002   	b	0x48 <test_func+0x48>
+      44:     │ ╰─> d503201f   	nop
+      48:     ╰───> d503201f   	nop
+      4c:       ╭── 14000001   	b	0x50 <test_func+0x50>
+      50:     ╭─│── 14000001   	b	0x54 <test_func+0x54>
+      54:     ╰───> d503201f   	nop
+      58:       ╭── 14000002   	b	0x60 <test_func+0x60>
+      5c:       │   94000000   	bl	0x5c <test_func+0x5c>
+                │  		000000000000005c:  R_AARCH64_CALL26	extern_func
+      60:       ╰─> d503201f   	nop
diff --git a/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-aarch64-unicode.txt b/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-aarch64-unicode.txt
new file mode 100644
index 00000000000000..4cabdc92d61c1d
--- /dev/null
+++ b/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-aarch64-unicode.txt
@@ -0,0 +1,31 @@
+
+<stdin>:	file format elf64-littleaarch64
+
+Disassembly of section .text:
+
+0000000000000000 <test_func>:
+       0:           94000000   	bl	0x0 <test_func>
+       4:           14000000   	b	0x4 <test_func+0x4>
+       8:       ╭── 14000001   	b	0xc <test_func+0xc>
+       c:       ├─> d503201f   	nop
+      10:       ╰── 17ffffff   	b	0xc <test_func+0xc>
+      14:           14000000   	b	0x14 <test_func+0x14>
+      18:       ╭── 54000040   	b.eq	0x20 <test_func+0x20>
+      1c:       ├── b4000020   	cbz	x0, 0x20 <test_func+0x20>
+      20:       ╰─> d503201f   	nop
+      24:   ╭────── 14000005   	b	0x38 <test_func+0x38>
+      28:   │ ╭──── 14000003   	b	0x34 <test_func+0x34>
+      2c:   │ │ ╭── 14000001   	b	0x30 <test_func+0x30>
+      30:   │ │ ╰─> d503201f   	nop
+      34:   │ ╰───> d503201f   	nop
+      38:   ╰─────> d503201f   	nop
+      3c:       ╭── 14000002   	b	0x44 <test_func+0x44>
+      40:     ╭─│── 14000002   	b	0x48 <test_func+0x48>
+      44:     │ ╰─> d503201f   	nop
+      48:     ╰───> d503201f   	nop
+      4c:       ╭── 14000001   	b	0x50 <test_func+0x50>
+      50:     ╭─│── 14000001   	b	0x54 <test_func+0x54>
+      54:     ╰───> d503201f   	nop
+      58:       ╭── 14000002   	b	0x60 <test_func+0x60>
+      5c:       │   94000000   	bl	0x5c <test_func+0x5c>
+      60:       ╰─> d503201f   	nop
diff --git a/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-arm-ascii.txt b/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-arm-ascii.txt
new file mode 100644
index 00000000000000..7e006841980534
--- /dev/null
+++ b/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-arm-ascii.txt
@@ -0,0 +1,27 @@
+
+<stdin>:	file format elf32-littlearm
+
+Disassembly of section .text:
+
+00000000 <test_func>:
+       0:           ebfffffe   	bl	0x0 <test_func>         @ imm = #-0x8
+       4:           eafffffe   	b	0x4 <test_func+0x4>     @ imm = #-0x8
+       8:       /-- eaffffff   	b	0xc <test_func+0xc>     @ imm = #-0x4
+       c:       +-> e320f000   	nop
+      10:       \-- eafffffd   	b	0xc <test_func+0xc>     @ imm = #-0xc
+      14:           eafffffe   	b	0x14 <test_func+0x14>   @ imm = #-0x8
+      18:       /-- 0affffff   	beq	0x1c <test_func+0x1c>   @ imm = #-0x4
+      1c:       \-> e320f000   	nop
+      20:   /------ ea000003   	b	0x34 <test_func+0x34>   @ imm = #0xc
+      24:   | /---- ea000001   	b	0x30 <test_func+0x30>   @ imm = #0x4
+      28:   | | /-- eaffffff   	b	0x2c <test_func+0x2c>   @ imm = #-0x4
+      2c:   | | \-> e320f000   	nop
+      30:   | \---> e320f000   	nop
+      34:   \-----> e320f000   	nop
+      38:       /-- ea000000   	b	0x40 <test_func+0x40>   @ imm = #0x0
+      3c:     /-|-- ea000000   	b	0x44 <test_func+0x44>   @ imm = #0x0
+      40:     | \-> e320f000   	nop
+      44:     \---> e320f000   	nop
+      48:     /---- eaffffff   	b	0x4c <test_func+0x4c>   @ imm = #-0x4
+      4c:     \-|-> eaffffff   	b	0x50 <test_func+0x50>   @ imm = #-0x4
+      50:       \-> e320f000   	nop
diff --git a/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-arm-unicode.txt b/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-arm-unicode.txt
new file mode 100644
index 00000000000000..62194b4b33c172
--- /dev/null
+++ b/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-arm-unicode.txt
@@ -0,0 +1,27 @@
+
+<stdin>:	file format elf32-littlearm
+
+Disassembly of section .text:
+
+00000000 <test_func>:
+       0:           ebfffffe   	bl	0x0 <test_func>         @ imm = #-0x8
+       4:           eafffffe   	b	0x4 <test_func+0x4>     @ imm = #-0x8
+       8:       ╭── eaffffff   	b	0xc <test_func+0xc>     @ imm = #-0x4
+       c:       ├─> e320f000   	nop
+      10:       ╰── eafffffd   	b	0xc <test_func+0xc>     @ imm = #-0xc
+      14:           eafffffe   	b	0x14 <test_func+0x14>   @ imm = #-0x8
+      18:       ╭── 0affffff   	beq	0x1c <test_func+0x1c>   @ imm = #-0x4
+      1c:       ╰─> e320f000   	nop
+      20:   ╭────── ea000003   	b	0x34 <test_func+0x34>   @ imm = #0xc
+      24:   │ ╭──── ea000001   	b	0x30 <test_func+0x30>   @ imm = #0x4
+      28:   │ │ ╭── eaffffff   	b	0x2c <test_func+0x2c>   @ imm = #-0x4
+      2c:   │ │ ╰─> e320f000   	nop
+      30:   │ ╰───> e320f000   	nop
+      34:   ╰─────> e320f000   	nop
+      38:       ╭── ea000000   	b	0x40 <test_func+0x40>   @ imm = #0x0
+      3c:     ╭─│── ea000000   	b	0x44 <test_func+0x44>   @ imm = #0x0
+      40:     │ ╰─> e320f000   	nop
+      44:     ╰───> e320f000   	nop
+      48:     ╭──── eaffffff   	b	0x4c <test_func+0x4c>   @ imm = #-0x4
+      4c:     ╰─│─> eaffffff   	b	0x50 <test_func+0x50>   @ imm = #-0x4
+      50:       ╰─> e320f000   	nop
diff --git a/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-thumb-ascii.txt b/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-thumb-ascii.txt
new file mode 100644
index 00000000000000..f18062167dd97d
--- /dev/null
+++ b/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-thumb-ascii.txt
@@ -0,0 +1,34 @@
+
+<stdin>:	file format elf32-littlearm
+
+Disassembly of section .text:
+
+00000000 <test_func>:
+       0:           f7ff fffe  	bl	0x0 <test_func>         @ imm = #-0x4
+       4:           f7ff bffe  	b.w	0x4 <test_func+0x4>     @ imm = #-0x4
+       8:       /-- e7ff       	b	0xa <test_func+0xa>     @ imm = #-0x2
+       a:       +-> bf00       	nop
+       c:       \-- e7fd       	b	0xa <test_func+0xa>     @ imm = #-0x6
+       e:           e7fe       	b	0xe <test_func+0xe>     @ imm = #-0x4
+      10:       /-- f000 b807  	b.w	0x22 <test_func+0x22>   @ imm = #0xe
+      14:       +-- d005       	beq	0x22 <test_func+0x22>   @ imm = #0xa
+      16:       +-- f040 8004  	bne.w	0x22 <test_func+0x22>   @ imm = #0x8
+      1a:       +-- b110       	cbz	r0, 0x22 <test_func+0x22> @ imm = #0x4
+      1c:       |   bfd8       	it	le
+      1e:       +-- e000       	ble	0x22 <test_func+0x22>   @ imm = #0x0
+      20:       |   bf00       	nop
+      22:       \-> bf00       	nop
+      24:           bf00       	nop
+      26:   /------ e003       	b	0x30 <test_func+0x30>   @ imm = #0x6
+      28:   | /---- e001       	b	0x2e <test_func+0x2e>   @ imm = #0x2
+      2a:   | | /-- e7ff       	b	0x2c <test_func+0x2c>   @ imm = #-0x2
+      2c:   | | \-> bf00       	nop
+      2e:   | \---> bf00       	nop
+      30:   \-----> bf00       	nop
+      32:       /-- e000       	b	0x36 <test_func+0x36>   @ imm = #0x0
+      34:     /-|-- e000       	b	0x38 <test_func+0x38>   @ imm = #0x0
+      36:     | \-> bf00       	nop
+      38:     \---> bf00       	nop
+      3a:       /-- e7ff       	b	0x3c <test_func+0x3c>   @ imm = #-0x2
+      3c:     /-|-- e7ff       	b	0x3e <test_func+0x3e>   @ imm = #-0x2
+      3e:     \---> bf00       	nop
diff --git a/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-thumb-unicode.txt b/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-thumb-unicode.txt
new file mode 100644
index 00000000000000..b171fb1c5d28c2
--- /dev/null
+++ b/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-thumb-unicode.txt
@@ -0,0 +1,34 @@
+
+<stdin>:	file format elf32-littlearm
+
+Disassembly of section .text:
+
+00000000 <test_func>:
+       0:           f7ff fffe  	bl	0x0 <test_func>         @ imm = #-0x4
+       4:           f7ff bffe  	b.w	0x4 <test_func+0x4>     @ imm = #-0x4
+       8:       ╭── e7ff       	b	0xa <test_func+0xa>     @ imm = #-0x2
+       a:       ├─> bf00       	nop
+       c:       ╰── e7fd       	b	0xa <test_func+0xa>     @ imm = #-0x6
+       e:           e7fe       	b	0xe <test_func+0xe>     @ imm = #-0x4
+      10:       ╭── f000 b807  	b.w	0x22 <test_func+0x22>   @ imm = #0xe
+      14:       ├── d005       	beq	0x22 <test_func+0x22>   @ imm = #0xa
+      16:       ├── f040 8004  	bne.w	0x22 <test_func+0x22>   @ imm = #0x8
+      1a:       ├── b110       	cbz	r0, 0x22 <test_func+0x22> @ imm = #0x4
+      1c:       │   bfd8       	it	le
+      1e:       ├── e000       	ble	0x22 <test_func+0x22>   @ imm = #0x0
+      20:       │   bf00       	nop
+      22:       ╰─> bf00       	nop
+      24:           bf00       	nop
+      26:   ╭────── e003       	b	0x30 <test_func+0x30>   @ imm = #0x6
+      28:   │ ╭──── e001       	b	0x2e <test_func+0x2e>   @ imm = #0x2
+      2a:   │ │ ╭── e7ff       	b	0x2c <test_func+0x2c>   @ imm = #-0x2
+      2c:   │ │ ╰─> bf00       	nop
+      2e:   │ ╰───> bf00       	nop
+      30:   ╰─────> bf00       	nop
+      32:       ╭── e000       	b	0x36 <test_func+0x36>   @ imm = #0x0
+      34:     ╭─│── e000       	b	0x38 <test_func+0x38>   @ imm = #0x0
+      36:     │ ╰─> bf00       	nop
+      38:     ╰───> bf00       	nop
+      3a:       ╭── e7ff       	b	0x3c <test_func+0x3c>   @ imm = #-0x2
+      3c:     ╭─│── e7ff       	b	0x3e <test_func+0x3e>   @ imm = #-0x2
+      3e:     ╰───> bf00       	nop
diff --git a/llvm/test/tools/llvm-objdump/visualize-jumps-aarch64.s b/llvm/test/tools/llvm-objdump/visualize-jumps-aarch64.s
new file mode 100644
index 00000000000000..f78b3e27975556
--- /dev/null
+++ b/llvm/test/tools/llvm-objdump/visualize-jumps-aarch64.s
@@ -0,0 +1,69 @@
+// RUN: llvm-mc < %s -triple aarch64 -filetype=obj | \
+// RUN:   llvm-objdump --triple aarch64 -d --visualize-jumps=unicode - | \
+// RUN:   diff - %p/Inputs/visualize-jumps-aarch64-unicode.txt
+
+// RUN: llvm-mc < %s -triple aarch64 -filetype=obj | \
+// RUN:   llvm-objdump --triple aarch64 -d --visualize-jumps=ascii - | \
+// RUN:   diff - %p/Inputs/visualize-jumps-aarch64-ascii.txt
+
+// RUN: llvm-mc < %s -triple aarch64 -filetype=obj | \
+// RUN:   llvm-objdump --triple aarch64 -d --visualize-jumps=unicode,color - | \
+// RUN:   diff - %p/Inputs/visualize-jumps-aarch64-unicode-color.txt
+
+// RUN: llvm-mc < %s -triple aarch64 -filetype=obj | \
+// RUN:   llvm-objdump --triple aarch64 -d --visualize-jumps=unicode --reloc - | \
+// RUN:   diff - %p/Inputs/visualize-jumps-aarch64-unicode-relocs.txt
+
+test_func:
+  // Relocated instructions don't get control-flow edges.
+  bl extern_func
+  b extern_func
+  
+  // Two branches to the same label, one forward and one backward.
+  b .Llabel1
+.Llabel1:
+  nop
+  b .Llabel1
+
+  // Branch to self, no CFG edge shown
+  b .
+
+  // Conditional branches
+  b.eq .Llabel2
+  cbz x0, .Llabel2
+.Llabel2:
+  nop
+
+  // Branches are sorted with shorter ones to the right, to reduce number of
+  // crossings, and keep the lines for short branches short themselves.
+  b .Llabel5
+  b .Llabel4
+  b .Llabel3
+.Llabel3:
+  nop
+.Llabel4:
+  nop
+.Llabel5:
+  nop
+
+  // Sometimes crossings can't be avoided.
+  b .Llabel6
+  b .Llabel7
+.Llabel6:
+  nop
+.Llabel7:
+  nop
+
+  // TODO If a branch goes to another branch instruction, we don't have a way
+  // to represent that. Can we improve on this?
+  b .Llabel8
+.Llabel8:
+  b .Llabel9
+.Llabel9:
+  nop
+
+  // Graph lines need to be drawn on the same output line as relocations.
+  b .Llabel10
+  bl extern_func
+.Llabel10:
+  nop
diff --git a/llvm/test/tools/llvm-objdump/visualize-jumps-arm.s b/llvm/test/tools/llvm-objdump/visualize-jumps-arm.s
new file mode 100644
index 00000000000000..6855e6ff84e32a
--- /dev/null
+++ b/llvm/test/tools/llvm-objdump/visualize-jumps-arm.s
@@ -0,0 +1,54 @@
+// RUN: llvm-mc < %s -triple armv8a -filetype=obj | \
+// RUN:   llvm-objdump --triple armv8a -d --visualize-jumps=unicode - | \
+// RUN:   diff - %p/Inputs/visualize-jumps-arm-unicode.txt
+
+// RUN: llvm-mc < %s -triple armv8a -filetype=obj | \
+// RUN:   llvm-objdump --triple armv8a -d --visualize-jumps=ascii - | \
+// RUN:   diff - %p/Inputs/visualize-jumps-arm-ascii.txt
+
+test_func:
+  // Relocated instructions don't get control-flow edges.
+  bl extern_func
+  b extern_func
+  
+  // Two branches to the same label, one forward and one backward.
+  b .Llabel1
+.Llabel1:
+  nop
+  b .Llabel1
+
+  // Branch to self, no CFG edge shown
+  b .
+
+  // Conditional branches
+  beq .Llabel2
+.Llabel2:
+  nop
+
+  // Branches are sorted with shorter ones to the right, to reduce number of
+  // crossings, and keep the lines for short branches short themselves.
+  b .Llabel5
+  b .Llabel4
+  b .Llabel3
+.Llabel3:
+  nop
+.Llabel4:
+  nop
+.Llabel5:
+  nop
+
+  // Sometimes crossings can't be avoided.
+  b .Llabel6
+  b .Llabel7
+.Llabel6:
+  nop
+.Llabel7:
+  nop
+
+  // TODO If a branch goes to another branch instruction, we don't have a way
+  // to represent that. Can we improve on this?
+  b .Llabel8
+.Llabel8:
+  b .Llabel9
+.Llabel9:
+  nop
diff --git a/llvm/test/tools/llvm-objdump/visualize-jumps-thumb.s b/llvm/test/tools/llvm-objdump/visualize-jumps-thumb.s
new file mode 100644
index 00000000000000..8be73d3aa4c946
--- /dev/null
+++ b/llvm/test/tools/llvm-objdump/visualize-jumps-thumb.s
@@ -0,0 +1,63 @@
+// RUN: llvm-mc < %s -triple thumbv8a -filetype=obj | \
+// RUN:   llvm-objdump --triple thumbv8a -d --visualize-jumps=unicode - | \
+// RUN:   diff - %p/Inputs/visualize-jumps-thumb-unicode.txt
+
+// RUN: llvm-mc < %s -triple thumbv8a -filetype=obj | \
+// RUN:   llvm-objdump --triple thumbv8a -d --visualize-jumps=ascii - | \
+// RUN:   diff - %p/Inputs/visualize-jumps-thumb-ascii.txt
+
+test_func:
+  // Relocated instructions don't get control-flow edges.
+  bl extern_func
+  b extern_func
+  
+  // Two branches to the same label, one forward and one backward.
+  b .Llabel1
+.Llabel1:
+  nop
+  b .Llabel1
+
+  // Branch to self, no CFG edge shown
+  b .
+
+  // Different branch instructions
+  b.w .Llabel2
+  beq .Llabel2
+  bne.w .Llabel2
+  cbz r0, .Llabel2
+  it le
+  ble .Llabel2
+  nop
+.Llabel2:
+  nop
+.Llabel2.1:
+  nop
+
+  // Branches are sorted with shorter ones to the right, to reduce number of
+  // crossings, and keep the lines for short branches short themselves.
+  b .Llabel5
+  b .Llabel4
+  b .Llabel3
+.Llabel3:
+  nop
+.Llabel4:
+  nop
+.Llabel5:
+  nop
+
+  // Sometimes crossings can't be avoided.
+  b ....
[truncated]
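
The one-line change to `formatted_raw_ostream::UpdatePosition` at the top of the patch appears to make a tab always consume at least one column before rounding up to the next tab stop. A small sketch of the arithmetic before and after that line follows; `tabBefore` and `tabAfter` are hypothetical helper names used only for illustration, not functions in LLVM.

```cpp
#include <cassert>

// Column after a tab with the pre-patch arithmetic: round up to the next
// multiple of 8, which changes nothing when Column is already a multiple of 8.
constexpr unsigned tabBefore(unsigned Column) {
  return Column + ((8 - (Column & 0x7)) & 0x7);
}

// Column after a tab with the added `Column += 1`: step past the tab
// character first, then round up to the next multiple of 8.
constexpr unsigned tabAfter(unsigned Column) {
  return (Column + 1) + ((8 - ((Column + 1) & 0x7)) & 0x7);
}

// Away from a tab stop both versions agree.
static_assert(tabBefore(3) == 8 && tabAfter(3) == 8, "mid-stop tab");
// On a tab stop the old arithmetic does not move the column at all.
static_assert(tabBefore(8) == 8, "pre-patch: no advance");
// With the extra increment the column moves to the following tab stop.
static_assert(tabAfter(8) == 16, "patched: advances to next stop");
```

The difference only shows up when the tab is written at a column that is already a multiple of 8, which matters once the graph lines have to stay aligned with the tab-separated operand columns.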

@llvmbot
Member

llvmbot commented Dec 8, 2023

@llvm/pr-subscribers-llvm-support

Author: None (ostannard)

+       8:       ╭── 14000001   	b	0xc <test_func+0xc>
+       c:       ├─> d503201f   	nop
+      10:       ╰── 17ffffff   	b	0xc <test_func+0xc>
+      14:           14000000   	b	0x14 <test_func+0x14>
+      18:       ╭── 54000040   	b.eq	0x20 <test_func+0x20>
+      1c:       ├── b4000020   	cbz	x0, 0x20 <test_func+0x20>
+      20:       ╰─> d503201f   	nop
+      24:   ╭────── 14000005   	b	0x38 <test_func+0x38>
+      28:   │ ╭──── 14000003   	b	0x34 <test_func+0x34>
+      2c:   │ │ ╭── 14000001   	b	0x30 <test_func+0x30>
+      30:   │ │ ╰─> d503201f   	nop
+      34:   │ ╰───> d503201f   	nop
+      38:   ╰─────> d503201f   	nop
+      3c:       ╭── 14000002   	b	0x44 <test_func+0x44>
+      40:     ╭─│── 14000002   	b	0x48 <test_func+0x48>
+      44:     │ ╰─> d503201f   	nop
+      48:     ╰───> d503201f   	nop
+      4c:       ╭── 14000001   	b	0x50 <test_func+0x50>
+      50:     ╭─│── 14000001   	b	0x54 <test_func+0x54>
+      54:     ╰───> d503201f   	nop
+      58:       ╭── 14000002   	b	0x60 <test_func+0x60>
+      5c:       │   94000000   	bl	0x5c <test_func+0x5c>
+      60:       ╰─> d503201f   	nop
diff --git a/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-arm-ascii.txt b/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-arm-ascii.txt
new file mode 100644
index 00000000000000..7e006841980534
--- /dev/null
+++ b/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-arm-ascii.txt
@@ -0,0 +1,27 @@
+
+<stdin>:	file format elf32-littlearm
+
+Disassembly of section .text:
+
+00000000 <test_func>:
+       0:           ebfffffe   	bl	0x0 <test_func>         @ imm = #-0x8
+       4:           eafffffe   	b	0x4 <test_func+0x4>     @ imm = #-0x8
+       8:       /-- eaffffff   	b	0xc <test_func+0xc>     @ imm = #-0x4
+       c:       +-> e320f000   	nop
+      10:       \-- eafffffd   	b	0xc <test_func+0xc>     @ imm = #-0xc
+      14:           eafffffe   	b	0x14 <test_func+0x14>   @ imm = #-0x8
+      18:       /-- 0affffff   	beq	0x1c <test_func+0x1c>   @ imm = #-0x4
+      1c:       \-> e320f000   	nop
+      20:   /------ ea000003   	b	0x34 <test_func+0x34>   @ imm = #0xc
+      24:   | /---- ea000001   	b	0x30 <test_func+0x30>   @ imm = #0x4
+      28:   | | /-- eaffffff   	b	0x2c <test_func+0x2c>   @ imm = #-0x4
+      2c:   | | \-> e320f000   	nop
+      30:   | \---> e320f000   	nop
+      34:   \-----> e320f000   	nop
+      38:       /-- ea000000   	b	0x40 <test_func+0x40>   @ imm = #0x0
+      3c:     /-|-- ea000000   	b	0x44 <test_func+0x44>   @ imm = #0x0
+      40:     | \-> e320f000   	nop
+      44:     \---> e320f000   	nop
+      48:     /---- eaffffff   	b	0x4c <test_func+0x4c>   @ imm = #-0x4
+      4c:     \-|-> eaffffff   	b	0x50 <test_func+0x50>   @ imm = #-0x4
+      50:       \-> e320f000   	nop
diff --git a/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-arm-unicode.txt b/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-arm-unicode.txt
new file mode 100644
index 00000000000000..62194b4b33c172
--- /dev/null
+++ b/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-arm-unicode.txt
@@ -0,0 +1,27 @@
+
+<stdin>:	file format elf32-littlearm
+
+Disassembly of section .text:
+
+00000000 <test_func>:
+       0:           ebfffffe   	bl	0x0 <test_func>         @ imm = #-0x8
+       4:           eafffffe   	b	0x4 <test_func+0x4>     @ imm = #-0x8
+       8:       ╭── eaffffff   	b	0xc <test_func+0xc>     @ imm = #-0x4
+       c:       ├─> e320f000   	nop
+      10:       ╰── eafffffd   	b	0xc <test_func+0xc>     @ imm = #-0xc
+      14:           eafffffe   	b	0x14 <test_func+0x14>   @ imm = #-0x8
+      18:       ╭── 0affffff   	beq	0x1c <test_func+0x1c>   @ imm = #-0x4
+      1c:       ╰─> e320f000   	nop
+      20:   ╭────── ea000003   	b	0x34 <test_func+0x34>   @ imm = #0xc
+      24:   │ ╭──── ea000001   	b	0x30 <test_func+0x30>   @ imm = #0x4
+      28:   │ │ ╭── eaffffff   	b	0x2c <test_func+0x2c>   @ imm = #-0x4
+      2c:   │ │ ╰─> e320f000   	nop
+      30:   │ ╰───> e320f000   	nop
+      34:   ╰─────> e320f000   	nop
+      38:       ╭── ea000000   	b	0x40 <test_func+0x40>   @ imm = #0x0
+      3c:     ╭─│── ea000000   	b	0x44 <test_func+0x44>   @ imm = #0x0
+      40:     │ ╰─> e320f000   	nop
+      44:     ╰───> e320f000   	nop
+      48:     ╭──── eaffffff   	b	0x4c <test_func+0x4c>   @ imm = #-0x4
+      4c:     ╰─│─> eaffffff   	b	0x50 <test_func+0x50>   @ imm = #-0x4
+      50:       ╰─> e320f000   	nop
diff --git a/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-thumb-ascii.txt b/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-thumb-ascii.txt
new file mode 100644
index 00000000000000..f18062167dd97d
--- /dev/null
+++ b/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-thumb-ascii.txt
@@ -0,0 +1,34 @@
+
+<stdin>:	file format elf32-littlearm
+
+Disassembly of section .text:
+
+00000000 <test_func>:
+       0:           f7ff fffe  	bl	0x0 <test_func>         @ imm = #-0x4
+       4:           f7ff bffe  	b.w	0x4 <test_func+0x4>     @ imm = #-0x4
+       8:       /-- e7ff       	b	0xa <test_func+0xa>     @ imm = #-0x2
+       a:       +-> bf00       	nop
+       c:       \-- e7fd       	b	0xa <test_func+0xa>     @ imm = #-0x6
+       e:           e7fe       	b	0xe <test_func+0xe>     @ imm = #-0x4
+      10:       /-- f000 b807  	b.w	0x22 <test_func+0x22>   @ imm = #0xe
+      14:       +-- d005       	beq	0x22 <test_func+0x22>   @ imm = #0xa
+      16:       +-- f040 8004  	bne.w	0x22 <test_func+0x22>   @ imm = #0x8
+      1a:       +-- b110       	cbz	r0, 0x22 <test_func+0x22> @ imm = #0x4
+      1c:       |   bfd8       	it	le
+      1e:       +-- e000       	ble	0x22 <test_func+0x22>   @ imm = #0x0
+      20:       |   bf00       	nop
+      22:       \-> bf00       	nop
+      24:           bf00       	nop
+      26:   /------ e003       	b	0x30 <test_func+0x30>   @ imm = #0x6
+      28:   | /---- e001       	b	0x2e <test_func+0x2e>   @ imm = #0x2
+      2a:   | | /-- e7ff       	b	0x2c <test_func+0x2c>   @ imm = #-0x2
+      2c:   | | \-> bf00       	nop
+      2e:   | \---> bf00       	nop
+      30:   \-----> bf00       	nop
+      32:       /-- e000       	b	0x36 <test_func+0x36>   @ imm = #0x0
+      34:     /-|-- e000       	b	0x38 <test_func+0x38>   @ imm = #0x0
+      36:     | \-> bf00       	nop
+      38:     \---> bf00       	nop
+      3a:       /-- e7ff       	b	0x3c <test_func+0x3c>   @ imm = #-0x2
+      3c:     /-|-- e7ff       	b	0x3e <test_func+0x3e>   @ imm = #-0x2
+      3e:     \---> bf00       	nop
diff --git a/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-thumb-unicode.txt b/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-thumb-unicode.txt
new file mode 100644
index 00000000000000..b171fb1c5d28c2
--- /dev/null
+++ b/llvm/test/tools/llvm-objdump/Inputs/visualize-jumps-thumb-unicode.txt
@@ -0,0 +1,34 @@
+
+<stdin>:	file format elf32-littlearm
+
+Disassembly of section .text:
+
+00000000 <test_func>:
+       0:           f7ff fffe  	bl	0x0 <test_func>         @ imm = #-0x4
+       4:           f7ff bffe  	b.w	0x4 <test_func+0x4>     @ imm = #-0x4
+       8:       ╭── e7ff       	b	0xa <test_func+0xa>     @ imm = #-0x2
+       a:       ├─> bf00       	nop
+       c:       ╰── e7fd       	b	0xa <test_func+0xa>     @ imm = #-0x6
+       e:           e7fe       	b	0xe <test_func+0xe>     @ imm = #-0x4
+      10:       ╭── f000 b807  	b.w	0x22 <test_func+0x22>   @ imm = #0xe
+      14:       ├── d005       	beq	0x22 <test_func+0x22>   @ imm = #0xa
+      16:       ├── f040 8004  	bne.w	0x22 <test_func+0x22>   @ imm = #0x8
+      1a:       ├── b110       	cbz	r0, 0x22 <test_func+0x22> @ imm = #0x4
+      1c:       │   bfd8       	it	le
+      1e:       ├── e000       	ble	0x22 <test_func+0x22>   @ imm = #0x0
+      20:       │   bf00       	nop
+      22:       ╰─> bf00       	nop
+      24:           bf00       	nop
+      26:   ╭────── e003       	b	0x30 <test_func+0x30>   @ imm = #0x6
+      28:   │ ╭──── e001       	b	0x2e <test_func+0x2e>   @ imm = #0x2
+      2a:   │ │ ╭── e7ff       	b	0x2c <test_func+0x2c>   @ imm = #-0x2
+      2c:   │ │ ╰─> bf00       	nop
+      2e:   │ ╰───> bf00       	nop
+      30:   ╰─────> bf00       	nop
+      32:       ╭── e000       	b	0x36 <test_func+0x36>   @ imm = #0x0
+      34:     ╭─│── e000       	b	0x38 <test_func+0x38>   @ imm = #0x0
+      36:     │ ╰─> bf00       	nop
+      38:     ╰───> bf00       	nop
+      3a:       ╭── e7ff       	b	0x3c <test_func+0x3c>   @ imm = #-0x2
+      3c:     ╭─│── e7ff       	b	0x3e <test_func+0x3e>   @ imm = #-0x2
+      3e:     ╰───> bf00       	nop
diff --git a/llvm/test/tools/llvm-objdump/visualize-jumps-aarch64.s b/llvm/test/tools/llvm-objdump/visualize-jumps-aarch64.s
new file mode 100644
index 00000000000000..f78b3e27975556
--- /dev/null
+++ b/llvm/test/tools/llvm-objdump/visualize-jumps-aarch64.s
@@ -0,0 +1,69 @@
+// RUN: llvm-mc < %s -triple aarch64 -filetype=obj | \
+// RUN:   llvm-objdump --triple aarch64 -d --visualize-jumps=unicode - | \
+// RUN:   diff - %p/Inputs/visualize-jumps-aarch64-unicode.txt
+
+// RUN: llvm-mc < %s -triple aarch64 -filetype=obj | \
+// RUN:   llvm-objdump --triple aarch64 -d --visualize-jumps=ascii - | \
+// RUN:   diff - %p/Inputs/visualize-jumps-aarch64-ascii.txt
+
+// RUN: llvm-mc < %s -triple aarch64 -filetype=obj | \
+// RUN:   llvm-objdump --triple aarch64 -d --visualize-jumps=unicode,color - | \
+// RUN:   diff - %p/Inputs/visualize-jumps-aarch64-unicode-color.txt
+
+// RUN: llvm-mc < %s -triple aarch64 -filetype=obj | \
+// RUN:   llvm-objdump --triple aarch64 -d --visualize-jumps=unicode --reloc - | \
+// RUN:   diff - %p/Inputs/visualize-jumps-aarch64-unicode-relocs.txt
+
+test_func:
+  // Relocated instructions don't get control-flow edges.
+  bl extern_func
+  b extern_func
+  
+  // Two branches to the same label, one forward and one backward.
+  b .Llabel1
+.Llabel1:
+  nop
+  b .Llabel1
+
+  // Branch to self, no CFG edge shown
+  b .
+
+  // Conditional branches
+  b.eq .Llabel2
+  cbz x0, .Llabel2
+.Llabel2:
+  nop
+
+  // Branches are sorted with shorter ones to the right, to reduce number of
+  // crossings, and keep the lines for short branches short themselves.
+  b .Llabel5
+  b .Llabel4
+  b .Llabel3
+.Llabel3:
+  nop
+.Llabel4:
+  nop
+.Llabel5:
+  nop
+
+  // Sometimes crossings can't be avoided.
+  b .Llabel6
+  b .Llabel7
+.Llabel6:
+  nop
+.Llabel7:
+  nop
+
+  // TODO If a branch goes to another branch instruction, we don't have a way
+  // to represent that. Can we improve on this?
+  b .Llabel8
+.Llabel8:
+  b .Llabel9
+.Llabel9:
+  nop
+
+  // Graph lines need to be drawn on the same output line as relocations.
+  b .Llabel10
+  bl extern_func
+.Llabel10:
+  nop
diff --git a/llvm/test/tools/llvm-objdump/visualize-jumps-arm.s b/llvm/test/tools/llvm-objdump/visualize-jumps-arm.s
new file mode 100644
index 00000000000000..6855e6ff84e32a
--- /dev/null
+++ b/llvm/test/tools/llvm-objdump/visualize-jumps-arm.s
@@ -0,0 +1,54 @@
+// RUN: llvm-mc < %s -triple armv8a -filetype=obj | \
+// RUN:   llvm-objdump --triple armv8a -d --visualize-jumps=unicode - | \
+// RUN:   diff - %p/Inputs/visualize-jumps-arm-unicode.txt
+
+// RUN: llvm-mc < %s -triple armv8a -filetype=obj | \
+// RUN:   llvm-objdump --triple armv8a -d --visualize-jumps=ascii - | \
+// RUN:   diff - %p/Inputs/visualize-jumps-arm-ascii.txt
+
+test_func:
+  // Relocated instructions don't get control-flow edges.
+  bl extern_func
+  b extern_func
+  
+  // Two branches to the same label, one forward and one backward.
+  b .Llabel1
+.Llabel1:
+  nop
+  b .Llabel1
+
+  // Branch to self, no CFG edge shown
+  b .
+
+  // Conditional branches
+  beq .Llabel2
+.Llabel2:
+  nop
+
+  // Branches are sorted with shorter ones to the right, to reduce number of
+  // crossings, and keep the lines for short branches short themselves.
+  b .Llabel5
+  b .Llabel4
+  b .Llabel3
+.Llabel3:
+  nop
+.Llabel4:
+  nop
+.Llabel5:
+  nop
+
+  // Sometimes crossings can't be avoided.
+  b .Llabel6
+  b .Llabel7
+.Llabel6:
+  nop
+.Llabel7:
+  nop
+
+  // TODO If a branch goes to another branch instruction, we don't have a way
+  // to represent that. Can we improve on this?
+  b .Llabel8
+.Llabel8:
+  b .Llabel9
+.Llabel9:
+  nop
diff --git a/llvm/test/tools/llvm-objdump/visualize-jumps-thumb.s b/llvm/test/tools/llvm-objdump/visualize-jumps-thumb.s
new file mode 100644
index 00000000000000..8be73d3aa4c946
--- /dev/null
+++ b/llvm/test/tools/llvm-objdump/visualize-jumps-thumb.s
@@ -0,0 +1,63 @@
+// RUN: llvm-mc < %s -triple thumbv8a -filetype=obj | \
+// RUN:   llvm-objdump --triple thumbv8a -d --visualize-jumps=unicode - | \
+// RUN:   diff - %p/Inputs/visualize-jumps-thumb-unicode.txt
+
+// RUN: llvm-mc < %s -triple thumbv8a -filetype=obj | \
+// RUN:   llvm-objdump --triple thumbv8a -d --visualize-jumps=ascii - | \
+// RUN:   diff - %p/Inputs/visualize-jumps-thumb-ascii.txt
+
+test_func:
+  // Relocated instructions don't get control-flow edges.
+  bl extern_func
+  b extern_func
+  
+  // Two branches to the same label, one forward and one backward.
+  b .Llabel1
+.Llabel1:
+  nop
+  b .Llabel1
+
+  // Branch to self, no CFG edge shown
+  b .
+
+  // Different branch instructions
+  b.w .Llabel2
+  beq .Llabel2
+  bne.w .Llabel2
+  cbz r0, .Llabel2
+  it le
+  ble .Llabel2
+  nop
+.Llabel2:
+  nop
+.Llabel2.1:
+  nop
+
+  // Branches are sorted with shorter ones to the right, to reduce number of
+  // crossings, and keep the lines for short branches short themselves.
+  b .Llabel5
+  b .Llabel4
+  b .Llabel3
+.Llabel3:
+  nop
+.Llabel4:
+  nop
+.Llabel5:
+  nop
+
+  // Sometimes crossings can't be avoided.
+  b ....
[truncated]

@jthackray jthackray self-requested a review December 8, 2023 16:24

github-actions bot commented Dec 8, 2023

:white_check_mark: With the latest revision this PR passed the C/C++ code formatter.

@nickdesaulniers
Member

omg yassss!!!!

T->Column = Column;
MaxColumn = std::max(MaxColumn, Column);

#if 1
Member

remove

Collaborator Author

Done


IndentToColumn(STI, OS, DisassemblyColumn::ControlFlow);

// TODO: What happens if an instruction has both incoming and outgoing edges?
Member

so, a branch to a branch? isn't that what the Tee is used for?

Collaborator Author

The tee is used when there are multiple branches to the same destination, which get one ControlFlowTarget object and one line, visually joined together. The case this doesn't handle correctly at the moment is where the destination of a branch is itself a branch instruction, so one disassembly line needs both an outgoing line and an incoming line, which should be unconnected and in different colors. I haven't yet thought of a good way to represent that with either ASCII or box-drawing characters. There's an example of the current output for this in the test files.
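
To make the grouping described here concrete, the following is a minimal sketch, not the PR's implementation. It assumes a hypothetical `Target` struct standing in for ControlFlowTarget, groups branches by destination (so several sources share one line, which is where a tee would be drawn), skips branches to self, and then assigns columns greedily with the shortest lines nearest the disassembly, as the test comments describe. The greedy placement is an assumption chosen because it reproduces the nesting in the expected output. The names `layoutTargets` and `columnOf` are illustrative only.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for the PR's ControlFlowTarget: one destination
// address plus every branch instruction that jumps to it.
struct Target {
  uint64_t Dest = 0;
  std::vector<uint64_t> Sources;
  uint64_t Lo = 0, Hi = 0; // Inclusive address range the line must cover.
  int Column = -1;         // 0 is the column nearest the disassembly.
};

// Group (source, destination) pairs by destination, then pick a column for
// each resulting line.
std::vector<Target>
layoutTargets(const std::vector<std::pair<uint64_t, uint64_t>> &Branches) {
  std::map<uint64_t, Target> ByDest;
  for (const auto &[Src, Dst] : Branches) {
    if (Src == Dst)
      continue; // A branch to itself gets no edge.
    Target &T = ByDest[Dst];
    T.Dest = Dst;
    T.Sources.push_back(Src);
  }

  std::vector<Target> Targets;
  for (auto &[Dst, T] : ByDest) {
    auto [MinIt, MaxIt] =
        std::minmax_element(T.Sources.begin(), T.Sources.end());
    T.Lo = std::min(*MinIt, Dst);
    T.Hi = std::max(*MaxIt, Dst);
    Targets.push_back(T);
  }

  // Shorter lines first, so they end up in the rightmost columns.
  std::stable_sort(Targets.begin(), Targets.end(),
                   [](const Target &A, const Target &B) {
                     return A.Hi - A.Lo < B.Hi - B.Lo;
                   });

  // Give each line the lowest column not used by an overlapping line.
  for (std::size_t I = 0; I < Targets.size(); ++I) {
    int Column = 0;
    bool Clash = true;
    while (Clash) {
      Clash = false;
      for (std::size_t J = 0; J < I; ++J) {
        const Target &P = Targets[J];
        if (P.Column == Column && P.Lo <= Targets[I].Hi &&
            Targets[I].Lo <= P.Hi) {
          Clash = true;
          ++Column;
          break;
        }
      }
    }
    Targets[I].Column = Column;
  }
  return Targets;
}

// Look up the column chosen for a destination, or -1 if it has no line.
int columnOf(const std::vector<Target> &Targets, uint64_t Dest) {
  for (const Target &T : Targets)
    if (T.Dest == Dest)
      return T.Column;
  return -1;
}
```

With the three nested branches from the AArch64 test (0x24 to 0x38, 0x28 to 0x34, 0x2c to 0x30), this sketch places the shortest line in column 0 and the longest in column 2. In the branch-to-branch case, the address that is both a source and a destination belongs to two different `Target` ranges, which is why the two lines land in separate columns and cannot be joined on that row.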

Comment on lines 568 to 569
bool IsASCII =
(OutputMode & VisualizeJumpsMode::CharsMask) == VisualizeJumpsMode::CharsASCII;
Member

nearly every reference to OutputMode is verbose, and a few of the lambdas it's captured in look redundant. Consider making helper methods for these comparisons. I suspect you can avoid the lambdas with such methods.

Collaborator Author

Done, I've made this a struct of two enums instead of a single bit-packed enum.
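
For readers following the thread, here is a rough sketch of what "a struct of two enums" with helper methods could look like. It is an illustration under assumptions, not the code in the PR: the type `JumpsMode`, its enumerator names, and `parseJumpsMode` are hypothetical, and the accepted keywords are taken from the help text and documentation quoted elsewhere in this review (`off`, `ascii`, `unicode`, `auto`, `nocolor`, `color`). Where the PR reports a command-line error, the sketch simply returns `std::nullopt`.

```cpp
#include <optional>
#include <sstream>
#include <string>

// Hypothetical replacement for a bit-packed enum: two independent settings.
struct JumpsMode {
  enum class CharSet { Off, ASCII, Unicode };
  enum class ColorMode { Auto, Off, On };

  CharSet Chars = CharSet::Unicode;  // Default when no mode is given.
  ColorMode Color = ColorMode::Auto; // Default when no mode is given.

  // Helper methods keep call sites short and avoid mask comparisons.
  bool enabled() const { return Chars != CharSet::Off; }
  bool isASCII() const { return Chars == CharSet::ASCII; }
  bool colorDisabled() const { return Color == ColorMode::Off; }
};

// Parse a comma-separated mode list such as "ascii,color".
// Returns std::nullopt for an unrecognised part.
std::optional<JumpsMode> parseJumpsMode(const std::string &Modes) {
  JumpsMode M;
  std::istringstream In(Modes);
  std::string Part;
  while (std::getline(In, Part, ',')) {
    if (Part == "off")
      M.Chars = JumpsMode::CharSet::Off;
    else if (Part == "ascii")
      M.Chars = JumpsMode::CharSet::ASCII;
    else if (Part == "unicode")
      M.Chars = JumpsMode::CharSet::Unicode;
    else if (Part == "auto")
      M.Color = JumpsMode::ColorMode::Auto;
    else if (Part == "nocolor")
      M.Color = JumpsMode::ColorMode::Off;
    else if (Part == "color")
      M.Color = JumpsMode::ColorMode::On;
    else
      return std::nullopt;
  }
  return M;
}
```

Keeping the two settings in separate fields means a call site can write `Mode.isASCII()` directly, which is the kind of simplification the review comment asks for.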

Comment on lines 243 to 250
raw_ostream::Colors PickColor() {
if ((OutputMode & VisualizeJumpsMode::ColorMask) ==
VisualizeJumpsMode::Off)
return raw_ostream::RESET;
auto Ret = LineColors[NextColorIdx];
NextColorIdx = (NextColorIdx + 1) % (sizeof(LineColors) / sizeof(LineColors[0]));
return Ret;
}
Member

consider moving this definition and a few of the larger definitions above to the .cpp file.

Member

at that point, you might not even need to declare LineColors in this header.

Collaborator Author

Done
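
As an illustration of the suggestion above, the sketch below keeps only a small class declaration visible and hides the palette in what would be the .cpp file. The names `LineColor` and `ColorPicker` are hypothetical stand-ins (the PR uses `raw_ostream::Colors`), and the six-colour red-to-cyan order is inferred from the expected colour output in the test inputs, so treat it as an assumption.

```cpp
#include <array>
#include <cstddef>

// Hypothetical colour type standing in for raw_ostream::Colors.
enum class LineColor { Reset, Red, Green, Yellow, Blue, Magenta, Cyan };

// This part would live in the header: no palette is exposed.
class ColorPicker {
public:
  explicit ColorPicker(bool Enabled) : Enabled(Enabled) {}
  LineColor pick();

private:
  bool Enabled;
  std::size_t Next = 0;
};

// This part would live in the .cpp file.
namespace {
constexpr std::array<LineColor, 6> Palette = {
    LineColor::Red,  LineColor::Green,   LineColor::Yellow,
    LineColor::Blue, LineColor::Magenta, LineColor::Cyan};
} // namespace

LineColor ColorPicker::pick() {
  if (!Enabled)
    return LineColor::Reset;
  LineColor C = Palette[Next];
  Next = (Next + 1) % Palette.size(); // Wrap around after the last colour.
  return C;
}
```

Using `std::array` also removes the `sizeof(LineColors) / sizeof(LineColors[0])` expression, since `Palette.size()` gives the element count directly.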

Comment on lines 517 to 519
if (Col == DisassemblyColumn::Assembly) {
return Indent;
}
Member

remove braces for single statement ifs.

Collaborator Author

Done

Comment on lines 1379 to 1380
std::vector<RelocationRef>::const_iterator RelCur = Relocs.begin();
std::vector<RelocationRef>::const_iterator RelEnd = Relocs.end();
Member

Feel free to use const auto here.

Collaborator Author

Done

Comment on lines 3400 to 3401
VisualizeJumps = (VisualizeJumpsMode)(VisualizeJumpsMode::CharsUnicode |
VisualizeJumpsMode::ColorAuto);
Member

Is the cast necessary?

Collaborator Author

Done

Collaborator

@jh7370 jh7370 left a comment

Please update the Command Guide documentation with the new option.

@@ -0,0 +1,69 @@
// RUN: llvm-mc < %s -triple aarch64 -filetype=obj | \
// RUN: llvm-objdump --triple aarch64 -d --visualize-jumps=unicode - | \
// RUN: diff - %p/Inputs/visualize-jumps-aarch64-unicode.txt
Collaborator

Rather than have the expected files in a separate location, please include them within this file. You can use split-file to divide a test file into multiple components (e.g. assembly and expected output).

That being said, a better method of testing would be to use FileCheck, with the expected input being the check patterns used by FileCheck.

Collaborator Author

I used FileCheck for testing similar output with --debug-vars and was never very happy with that, because the disassembly contains tabs, so the output ends up getting misaligned in the text editor. I did it this way so that column 0 of the expected output is at column 0 of the test file, so the test looks exactly like the output.

Thanks for pointing out split-file, I didn't know about that. That looks like the best solution to this, so I'll switch the tests over to using it.

Collaborator

because the disassembly contains tabs, so the output ends up getting misaligned in the text editor. I did it this way so that column 0 of the expected output is at column 0 of the test file, so the test looks exactly like the output.

Assuming you have --strict-whitespace and --match-full-lines on, I'm not sure I understand where this can go wrong, if you're careful with how you line up the colon following the check prefixes? I have a soft preference for FileCheck, as it has better diagnostics when things don't match, and it's easier to omit stuff irrelevant to what is under test.

This moves all of the logic for deciding the widths of different parts
of the disassembly and indenting to them into one place. This will make
it easier to add new columns. It also simplifies some things, because we
were already using formatted_raw_ostream almost everywhere, which tracks
the current column number for us.
This is a feature which GNU objdump currently has, which prints a
control-flow graph showing all of the direct branches alongside the
disassembly.

This should work for any architecture which implements the
MCInstrAnalysis class, I've tested it on ARM, Thumb and AArch64.

There are a few (known) differences between this and the GNU objdump
version:
* This has the option to draw the lines using unicode line-drawing
  characters, instead of ASCII. This is on by default because I find the
  connected lines much easier to read.
* I haven't implemented the extended-color mode yet. With GNU
  objdump this often results in some very similar colors, so we might
  want to carefully pick a palette which minimises that.
@nickdesaulniers
Member

This should work for any architecture which implements the MCInstrAnalysis class, I've tested it on ARM, Thumb and AArch64.

I gave this a quick shot for RISCV and it didn't work, though RISCV does subclass MCInstrAnalysis. Is there more to this than just implementing that class? Can you update the description to better document what else might be missing for other architectures?

I also tested this on some linux kernel code; test arm64, arm (32b), and x86 all look great.

Thumb2 code wasn't working. Can I send you an object file?

@@ -291,6 +291,36 @@ OPTIONS

Target triple to disassemble for, see ``--version`` for available targets.

.. option:: --visualize-jumps=<modes>

Display a control-flow graph which shows the targets of branch instructions to the left of disassembly. ``modes`` is a comma-separated list of options, which configure the character set and colors used to print the graph.
Collaborator

Probably should be wrapped at a sensible width.


.. option:: off

Disable control-flow graph
Collaborator

Nit: all of these should end with "."


If ``modes`` is omitted, the default is ``unicode,auto``

.. option:: off
Collaborator

Have you built and inspected the documentation? I don't have a currently working doc build, so I can't myself, but I'd be slightly concerned that this would look like a general command, rather than an argument of the --visualize-jumps option. --disassembler-color uses a different format. Perhaps that should be what you follow for consistency?

Oh, hang on, looking downwards, I see the --x86-asm-syntax=<style> is formatted this way. I personally prefer the bullet points of --disassembler-color (plus it's used in another place too), so I still think you should change.

@@ -45,6 +45,7 @@ void formatted_raw_ostream::UpdatePosition(const char *Ptr, size_t Size) {
break;
case '\t':
// Assumes tab stop = 8 characters.
+ Column += 1;
Collaborator

I'm concerned this will have a wider impact on tools than just llvm-objdump. What's it actually doing?
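
For context on the question, the sketch below shows one way column tracking with 8-character tab stops can be written so that a tab always advances at least one column, including when the cursor already sits on a tab stop. This is a standalone illustration with a hypothetical function name, `columnAfter`, and it is an assumption that this is the behaviour the added line is after. It does not reproduce the expression used in `formatted_raw_ostream`, and it counts bytes, so it is only accurate for ASCII text.

```cpp
#include <string_view>

constexpr unsigned TabStop = 8; // Same assumption as the comment in the hunk.

// Return the column the cursor ends up in after printing Text, starting
// from StartColumn. A tab moves to the next multiple of TabStop and always
// advances by at least one column.
unsigned columnAfter(std::string_view Text, unsigned StartColumn = 0) {
  unsigned Column = StartColumn;
  for (char Ch : Text) {
    if (Ch == '\n' || Ch == '\r')
      Column = 0;
    else if (Ch == '\t')
      Column = (Column / TabStop + 1) * TabStop;
    else
      ++Column; // Counts bytes, not display width.
  }
  return Column;
}
```

The case that matters for alignment is a tab printed when the column is already a multiple of 8: a terminal moves the cursor a full 8 columns there, so a tracker that advances by zero in that case would drift out of sync with what is displayed.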

nop
b .Llabel1

// Branch to self, no CFG edge shown
Collaborator

Nit: missing full stop, here and in other comments.

// Check for relocations which apply to this instruction.
bool Relocated = false;
while (RelCur != RelEnd) {
// FIXME RelAdjustment for executables & shared objects
Collaborator

Similar comment to the earlier TODOs/FIXME: will this be actioned in a follow-up PR?

@@ -2209,7 +2320,7 @@ disassembleObject(ObjectFile &Obj, const ObjectFile &DbgObj,
printBTFRelocation(FOS, *BTF, {Index, Section.getIndex()}, LVP);

// Hexagon does this in pretty printer
- if (Obj.getArch() != Triple::hexagon) {
+ if (Obj.getArch() != Triple::hexagon && InlineRelocs) {
Collaborator

This seems unrelated?

} else if (Part == "unicode") {
Chars = VisualizeJumpsMode::Unicode;
} else {
reportCmdLineError("'" + Part + "' is not a valid value for '" +
Collaborator

Test case?

Comment on lines +156 to +158
unsigned GetColumnIndent(MCSubtargetInfo const &STI, DisassemblyColumn Col);
void IndentToColumn(MCSubtargetInfo const &STI, formatted_raw_ostream &OS,
DisassemblyColumn Col);
Collaborator

Suggested change
- unsigned GetColumnIndent(MCSubtargetInfo const &STI, DisassemblyColumn Col);
- void IndentToColumn(MCSubtargetInfo const &STI, formatted_raw_ostream &OS,
- DisassemblyColumn Col);
+ unsigned getColumnIndent(MCSubtargetInfo const &STI, DisassemblyColumn Col);
+ void indentToColumn(MCSubtargetInfo const &STI, formatted_raw_ostream &OS,
+ DisassemblyColumn Col);

@ostannard
Collaborator Author

I gave this a quick shot for RISCV and it didn't work, though RISCV does subclass MCInstrAnalysis. Is there more to this than just implementing that class? Can you update the description to better document what else might be missing for other architectures?

I think that's the only target-specific code I've relied on. This is all in llvm-objdump.cpp:collectLocalBranchTargets, which is shared with the --symbolize-operands option, does that work with RISCV?

Thumb2 code wasn't working. Can I send you an object file?

One of the test files is Thumb2, but I'll look at your object file and see if there's something which prevents this working.

Contributor

@jthackray jthackray left a comment

LGTM, look forward to this being merged.

case LineChar::Tee:
return IsASCII ? "+" : u8"\u251c";
case LineChar::Arrow:
return ">";
Contributor

Or u8"\u2192" (aka Unicode "RIGHTWARDS ARROW")?

llvm_unreachable("Unhandled LineChar enum");
}

#define C(id) getLineChar(LineChar::id)
Contributor

Nit: would prefer not to have a #define if possible
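
One possible macro-free shape is sketched below. Only `LineChar::Tee` and `LineChar::Arrow` appear in the quoted code; the other enumerator names, the free function `lineChar`, and the helpers `branchStart` and `branchEnd` are hypothetical. The Unicode glyphs are written as explicit UTF-8 byte escapes so the sketch compiles the same way under C++17 and C++20, where the type of `u8` literals differs.

```cpp
#include <string>
#include <string_view>

// Tee and Arrow are named in the PR; the remaining enumerators are
// illustrative guesses.
enum class LineChar { TopCorner, BottomCorner, Vertical, Horizontal, Tee, Arrow };

// Map a logical line piece to ASCII or to UTF-8 box-drawing characters.
std::string_view lineChar(LineChar C, bool IsASCII) {
  switch (C) {
  case LineChar::TopCorner:
    return IsASCII ? "/" : "\xe2\x95\xad"; // U+256D
  case LineChar::BottomCorner:
    return IsASCII ? "\\" : "\xe2\x95\xb0"; // U+2570
  case LineChar::Vertical:
    return IsASCII ? "|" : "\xe2\x94\x82"; // U+2502
  case LineChar::Horizontal:
    return IsASCII ? "-" : "\xe2\x94\x80"; // U+2500
  case LineChar::Tee:
    return IsASCII ? "+" : "\xe2\x94\x9c"; // U+251C
  case LineChar::Arrow:
    return ">";
  }
  return "?"; // Not reached for valid enumerators.
}

// Instead of a C(id) macro, small helpers compose the pieces.
std::string branchStart(bool IsASCII) {
  std::string S;
  S += lineChar(LineChar::TopCorner, IsASCII);
  S += lineChar(LineChar::Horizontal, IsASCII);
  S += lineChar(LineChar::Horizontal, IsASCII);
  return S;
}

std::string branchEnd(bool IsASCII) {
  std::string S;
  S += lineChar(LineChar::BottomCorner, IsASCII);
  S += lineChar(LineChar::Horizontal, IsASCII);
  S += lineChar(LineChar::Arrow, IsASCII);
  return S;
}
```

In ASCII mode `branchStart` yields `/--` and `branchEnd` yields `\->`, matching the shapes in the ASCII expected-output files. A local lambda such as `auto C = [&](LineChar Id) { return lineChar(Id, IsASCII); };` would give call sites the same brevity as the macro without leaking a one-letter name into the preprocessor.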

@antmox
Contributor

antmox commented Jul 11, 2024

Hi @ostannard ,
I just tried your branch on some AArch64 code without any problems.
Like the GNU version, this feature is very useful.
What's stopping you from merging?

@@ -0,0 +1,204 @@
# RUN: rm -rf %t && split-file %s %t && cd %t

// RUN: llvm-mc < input.s -triple aarch64 -filetype=obj | \
Member

Omit < for llvm-mc

REQUIRES: aarch64-registered-target is needed unless you place it in ELF/AArch64

MetaVarName<"mode,...">,
HelpText<"Print a control flow graph along side disassembly. "
"Color modes: auto (default), nocolor, color. "
"Character modes: unicode (default), ascii.">;
Member

We omit the last period.

@MaskRay
Member

MaskRay commented Jul 15, 2024

Hi @ostannard , I just tried your branch on some AArch64 code without any problems. Like the GNU version, this feature is very useful. What's stopping you from merging?

Quite a few comments from @jh7370 and @jthackray need to be addressed first. I just left some comments, too.


Display a control-flow graph which shows the targets of branch instructions to the left of disassembly. ``modes`` is a comma-separated list of options, which configure the character set and colors used to print the graph.

If ``modes`` is omitted, the default is ``unicode,auto``
Member

Unlike .td help messages, the documentation descriptions should have a trailing period.

@jroelofs
Contributor

jroelofs commented Oct 9, 2024

@ostannard this seems very useful. I'd love to see it get merged.

@MaskRay
Member

MaskRay commented May 10, 2025

@ostannard this seems very useful. I'd love to see it get merged.

+1 in 2025:) This PR needs a rebase. A bunch of @jh7370 's comments need to be addressed
