Thanks to visit codestin.com
Credit goes to github.com

Skip to content

[Console]Improve formatter for double-width character #10714

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 7 commits into from

Conversation

denkiryokuhatsuden
Copy link
Contributor

Q A
Bug fix? yes
New feature? no
BC breaks? yes for one that expecting current broken output
Deprecations? no
Fixed tickets
Tests pass? yes
License MIT
Doc PR

EDIT: fixed the table above not to remove irrelevant line

As mb_strlen just returns "how many chars in the string",
formatting with double-width character is bit broken.

The test I add is skipped when mbstring extension is not loaded.
I'm afraid if some of you cannot properly display japanese string.
(表示するテキスト just means "Some text to display")

`$formatter->formatBlock()` formats a broken block
when some double-width character is set.
Some character has doubled-width so `mb_strlen` does not return actual string width (just returns count of chars).
By using `mb_strwidth` (http://www.php.net/manual/en/function.mb-strwidth.php),
it can property measure the actual width of string.
When `Application` renderes Exception, it also writes broken block when
message contains double-width (more generally, multi-byted) character.

This commit fixes above.
@denkiryokuhatsuden
Copy link
Contributor Author

I found that rendering error uses logic independent of FormatterHelper, and it has similar problem.
15e136d is commit to fix them.

// str_split is not suitable for multi-byte characters, we should use preg_split to get char array properly.
// additionally, array_slice() is not enough as some character has doubled width.
// we need a function to split string not by character count but by string width
$str_split_width = function ($string, $width) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please make it a private method rather than a closure which need to be recreated by PHP for each call to renderException

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

above $str_split_width, $strlen also exists as closure. Shall I move $strlen as private method either?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and could you please give advice for method naming,

  • $strlen
    -> private strlen for consistency (Helper has a function with same name)
    -> private strwidth as this does not return length of string but width
  • $str_split_width
    -> how about private splitStringByWidth()?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed about splitStringByWidth rather than $str_split_width.

and for $strlen, it could indeed be renamed to a private method stringWidth as it does not compute the length anymore after your change

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for advice.
Shall I leave the name of Helper#strlen unchanged (which I changed its behavior, too) unlike Application's, as changing that name introduces BC break?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, it cannot be renamed as it is protected

}
$originalEncoding = mb_detect_encoding($string);

$convertedString = false === $originalEncoding ? $string : mb_convert_encoding($string, 'utf8', $originalEncoding);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be consistent with stringWidth, you should fallback to str_split if mb_string does not support the encoding

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I've overlooked the code in stringWidth. I've changed variable's name, too.

@fabpot
Copy link
Member

fabpot commented May 12, 2014

Thank you @denkiryokuhatsuden.

fabpot added a commit that referenced this pull request May 12, 2014
…kiryokuhatsuden)

This PR was squashed before being merged into the 2.3 branch (closes #10714).

Discussion
----------

[Console]Improve formatter for double-width character

| Q             | A
| ------------- | ---
| Bug fix?      | yes
| New feature?  | no
| BC breaks?    | yes for one that expecting current broken output
| Deprecations? | no
| Fixed tickets |
| Tests pass?   | yes
| License       | MIT
| Doc PR        |

EDIT: fixed the table above not to remove irrelevant line

As mb_strlen just returns "how many chars in the string",
formatting with double-width character is bit broken.

The test I add is skipped when mbstring extension is not loaded.
I'm afraid if some of you cannot properly display japanese string.
(表示するテキスト just means "Some text to display")

Commits
-------

a52f41d [Console]Improve formatter for double-width character
@fabpot fabpot closed this May 12, 2014
@jakzal
Copy link
Contributor

jakzal commented May 13, 2014

@denkiryokuhatsuden on which PHP version have you been testing this?

@denkiryokuhatsuden
Copy link
Contributor Author

@jakzal tested on php5.5. Are there some tests failing?

@jakzal
Copy link
Contributor

jakzal commented May 14, 2014

@denkiryokuhatsuden yes, they were never passing on PHP 5.5. See #10897.

fabpot added a commit that referenced this pull request May 14, 2014
This PR was merged into the 2.3 branch.

Discussion
----------

[Console] Fix a console test

| Q             | A
| ------------- | ---
| Bug fix?      | yes
| New feature?  | no
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | -
| License       | MIT
| Doc PR        | -

re #10714

Every wide character in the fixture file is actually 3 ansi-characters long.

Commits
-------

61108b9 Disable 5.6 until it is stable again.
8cadb49 Update the fixtures.
fabpot added a commit that referenced this pull request May 14, 2014
This PR was merged into the 2.3 branch.

Discussion
----------

Explicitly define the encoding.

| Q             | A
| ------------- | ---
| Bug fix?      | yes
| New feature?  | no
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | -
| License       | MIT
| Doc PR        | -

Credits for discovering it go to @nicolas-grekas. Cheers!

Travis for PHP 5.6 cannot be enabled yet as there's one more test failing.

re #10714 #10714

Commits
-------

619ff58 Explicitly define the encoding.
@jakzal jakzal mentioned this pull request May 15, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants