-
Notifications
You must be signed in to change notification settings - Fork 707
Unwrap 72-character wrapped commit message lines when creating PRs #8255
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unwrap 72-character wrapped commit message lines when creating PRs #8255
Conversation
Co-authored-by: alexr00 <[email protected]>
- Add unwrapCommitMessageBody function to join wrapped lines - Preserve blank lines, list items, quotes, and indented content - Add comprehensive tests for various unwrapping scenarios - Matches GitHub's behavior when converting commit messages to PR descriptions Co-authored-by: alexr00 <[email protected]>
- Extract regex pattern to named constant PRESERVE_LINE_PATTERN - Replace deprecated substr() with slice() - Improve regex to match only spaces/tabs explicitly instead of \s - Add detailed comments explaining the pattern Co-authored-by: alexr00 <[email protected]>
- Place hyphen at end of character class for clarity - Extract hasBody condition to avoid duplication Co-authored-by: alexr00 <[email protected]>
justingrant
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great to see this PR, but I see a few problems.
| if (!body) { | ||
| return body; | ||
| } | ||
|
|
||
| // Pattern to detect lines that should be preserved (not joined): | ||
| // - Lines starting with whitespace (indented/code blocks) | ||
| // - Lines starting with list markers (*, -, +, >) | ||
| // - Lines starting with numbered list items (e.g., "1. ") | ||
| const PRESERVE_LINE_PATTERN = /^[ \t*+>\-]|^\d+\./; | ||
|
|
||
| const lines = body.split('\n'); | ||
| const result: string[] = []; | ||
| let i = 0; | ||
|
|
||
| while (i < lines.length) { | ||
| const line = lines[i]; | ||
|
|
||
| // Preserve blank lines | ||
| if (line.trim() === '') { | ||
| result.push(line); | ||
| i++; | ||
| continue; | ||
| } | ||
|
|
||
| // Check if this line should NOT be joined with the next | ||
| // Lines that start with special formatting characters should be preserved | ||
| const shouldPreserveLine = PRESERVE_LINE_PATTERN.test(line); | ||
|
|
||
| if (shouldPreserveLine) { | ||
| result.push(line); | ||
| i++; | ||
| continue; | ||
| } | ||
|
|
||
| // Start accumulating lines that should be joined | ||
| let joinedLine = line; | ||
| i++; | ||
|
|
||
| // Keep joining lines until we hit a blank line or a line that shouldn't be joined | ||
| while (i < lines.length) { | ||
| const nextLine = lines[i]; | ||
|
|
||
| // Stop at blank lines | ||
| if (nextLine.trim() === '') { | ||
| break; | ||
| } | ||
|
|
||
| // Stop at lines that start with special formatting | ||
| if (PRESERVE_LINE_PATTERN.test(nextLine)) { | ||
| break; | ||
| } | ||
|
|
||
| // Join this line with a space | ||
| joinedLine += ' ' + nextLine; | ||
| i++; | ||
| } | ||
|
|
||
| result.push(joinedLine); | ||
| } | ||
|
|
||
| return result.join('\n'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few problems with your code:
- Having two nested loops is unnecessarily complicated.
- You don't support code blocks that are fenced with ```
- Detection of indented code is flawed because it doesn't account for nested list items or list item continuations that have 4+ spaces at the start of the line.
Here's a revised version:
| if (!body) { | |
| return body; | |
| } | |
| // Pattern to detect lines that should be preserved (not joined): | |
| // - Lines starting with whitespace (indented/code blocks) | |
| // - Lines starting with list markers (*, -, +, >) | |
| // - Lines starting with numbered list items (e.g., "1. ") | |
| const PRESERVE_LINE_PATTERN = /^[ \t*+>\-]|^\d+\./; | |
| const lines = body.split('\n'); | |
| const result: string[] = []; | |
| let i = 0; | |
| while (i < lines.length) { | |
| const line = lines[i]; | |
| // Preserve blank lines | |
| if (line.trim() === '') { | |
| result.push(line); | |
| i++; | |
| continue; | |
| } | |
| // Check if this line should NOT be joined with the next | |
| // Lines that start with special formatting characters should be preserved | |
| const shouldPreserveLine = PRESERVE_LINE_PATTERN.test(line); | |
| if (shouldPreserveLine) { | |
| result.push(line); | |
| i++; | |
| continue; | |
| } | |
| // Start accumulating lines that should be joined | |
| let joinedLine = line; | |
| i++; | |
| // Keep joining lines until we hit a blank line or a line that shouldn't be joined | |
| while (i < lines.length) { | |
| const nextLine = lines[i]; | |
| // Stop at blank lines | |
| if (nextLine.trim() === '') { | |
| break; | |
| } | |
| // Stop at lines that start with special formatting | |
| if (PRESERVE_LINE_PATTERN.test(nextLine)) { | |
| break; | |
| } | |
| // Join this line with a space | |
| joinedLine += ' ' + nextLine; | |
| i++; | |
| } | |
| result.push(joinedLine); | |
| } | |
| return result.join('\n'); | |
| const rawLines = body.split(/\r?\n/); | |
| const result: string[] = []; | |
| let inFencedCodeBlock = false; | |
| let prevWasCode = false; | |
| for (const rawLine of rawLines) { | |
| const line = rawLine.trimEnd(); // Remove trailing space | |
| const trimmed = line.trim(); | |
| const prev = result[result.length - 1]; | |
| const isList = (s: string) => /^\s*[>\*\-\+]\s/.test(s); | |
| const indentSpaces = line.match(/^( *)/)?.[1].length ?? 0; | |
| const isListItemContinuation: boolean = | |
| !prevWasCode && isList(prev) && indentSpaces >= 2; | |
| const isIndentedCode: boolean = | |
| !isListItemContinuation && (indentSpaces >= 4 || line.startsWith('\t')); | |
| const isCode: boolean = inFencedCodeBlock || isIndentedCode; | |
| if (line.startsWith('```')) { | |
| // Start or end of fenced code block is never joined | |
| inFencedCodeBlock = !inFencedCodeBlock; | |
| result.push(line); | |
| } else if (!line || isCode || isList(line) || !prev || !result.length) { | |
| // Empty/code/list lines never join with the previous line. | |
| // Also don't join the first line, or if previous line is blank. | |
| result.push(line); | |
| } else { | |
| // Otherwise, join with previous line | |
| result[result.length - 1] = prev + ' ' + trimmed; | |
| } | |
| prevWasCode = isCode; | |
| } | |
| return result.join('\n'); | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copilot address this comment.
alexr00
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copilot These notes from @justingrant are good, address them:
- You don't support code blocks that are fenced with ```
- Detection of indented code is flawed because it doesn't account for nested list items or list item continuations that have 4+ spaces at the start of the line.
- Support fenced code blocks with ``` markers - Fix indented code detection to distinguish between: - Actual code blocks (4+ spaces, not in list context) - List item continuations (2+ spaces in list context) - Nested list items (detected by list markers with indentation) - Track list context state to handle list continuations correctly - Add comprehensive tests for all new scenarios Co-authored-by: alexr00 <[email protected]>
Addressed both issues in commit 1072b95:
Added 8 new tests covering fenced code blocks, nested lists, list continuations, and complex mixed content scenarios. |
alexr00
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@justingrant, thanks for pointing these out!
- You don't support code blocks that are fenced with ```
- Detection of indented code is flawed because it doesn't account for nested list items or list item continuations that have 4+ spaces at the start of the line.
|
@alexr00 @copilot - After testing this PR, there's a problem: line breaks are not removed inside list items. For example, I'd expect the following commit markdown to end up as 2 lines in the resulting PR markdown. But it remains as 4 lines:
|
|
@justingrant thanks for trying it out! Opened #8335 to address this. |
Thanks @alexr00 for improving this! Unfortunately I think the follow-up PR might have some problems too, and may need some more tests. See #8335 (comment) |
When creating a PR from a single commit, the extension now unwraps commit message body lines that were wrapped at 72 characters, matching GitHub's native behavior.
Changes
Added
unwrapCommitMessageBodyfunction: Joins consecutive non-formatted lines while preserving:-,*,+, numbered)>)```)Updated
titleAndBodyFrom: Applies unwrapping to commit message bodies before setting PR descriptionTest coverage: Added 17 test cases for wrapped paragraphs, lists, quotes, indented blocks, fenced code blocks, nested lists, list continuations, and mixed content
Example
Before:
After:
Implementation Details
The function uses state tracking to correctly handle:
```markers is preserved without modificationOriginal prompt
✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.