[DOC] Tweaks for String#byteslice #13737

Merged 1 commit on Jun 30, 2025
54 changes: 54 additions & 0 deletions doc/string/byteslice.rdoc
@@ -0,0 +1,54 @@
Returns a substring of +self+, or +nil+ if the substring cannot be constructed.

With integer arguments +offset+ and +length+ given,
returns the substring beginning at the given +offset+
and of the given +length+ (as available):

  s = '0123456789'   # => "0123456789"
  s.byteslice(2)     # => "2"
  s.byteslice(200)   # => nil
  s.byteslice(4, 3)  # => "456"
  s.byteslice(4, 30) # => "456789"

Returns +nil+ if +length+ is negative or +offset+ falls outside of +self+:

  s.byteslice(4, -1) # => nil
  s.byteslice(40, 2) # => nil
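
As a rough illustration of the rules above (not how string.c implements them), the integer form behaves much like slicing the string's byte array and repacking the result. The helper name byteslice_sketch is hypothetical and exists only for this comparison:

```ruby
# Hypothetical helper, for illustration only: approximates
# String#byteslice(offset, length) by going through the byte array.
def byteslice_sketch(str, offset, length = 1)
  # Array#slice returns nil for a negative length or an out-of-range offset,
  # and clamps the length to the bytes that are actually available.
  bytes = str.bytes.slice(offset, length)
  return nil if bytes.nil?

  # Repack the bytes into a string and restore the receiver's encoding.
  bytes.pack('C*').force_encoding(str.encoding)
end

s = '0123456789'
[[2, 1], [4, 3], [4, 30], [4, -1], [40, 2]].each do |offset, length|
  # Each line prints the arguments and whether the sketch agrees with byteslice.
  p [offset, length, byteslice_sketch(s, offset, length) == s.byteslice(offset, length)]
end
```

The sketch allocates an intermediate array, so it is only useful for reasoning about the return values, not as a replacement.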

Counts backwards from the end of +self+
if +offset+ is negative:

  s = '0123456789'   # => "0123456789"
  s.byteslice(-4)    # => "6"
  s.byteslice(-4, 3) # => "678"
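
A short sketch of the negative-offset rule on a multibyte string, assuming a UTF-8 source file. The point it tries to show is that the offset counts bytes from the end, so a negative offset is equivalent to adding bytesize to it:

```ruby
s = 'こんにちは'                        # 5 characters, 3 bytes each in UTF-8
tail = s.byteslice(-3, 3)               # last 3 bytes
same = s.byteslice(s.bytesize - 3, 3)   # same span, written as a non-negative offset

p s.bytesize    # => 15
p tail == same  # => true
p tail          # => "は"
```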

With Range argument +range+ given, returns
<tt>byteslice(range.begin, range.size)</tt>:

  s = '0123456789'    # => "0123456789"
  s.byteslice(4..6)   # => "456"
  s.byteslice(-6..-4) # => "456"
  s.byteslice(5..2)   # => "" # range.size is zero.
  s.byteslice(40..42) # => nil
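
A few more range forms, as a sketch rather than an exhaustive list. It suggests that the <tt>range.begin, range.size</tt> description is best read as shorthand: when the endpoints have different signs, the negative one is resolved against the byte length before the span is computed.

```ruby
s = '0123456789'

p s.byteslice(4...6)  # => "45"      (exclusive end)
p s.byteslice(4..)    # => "456789"  (endless range, Ruby 2.6 or later)
p s.byteslice(4..-1)  # => "456789"  (-1 resolves to the last byte)
p (4..-1).size        # => 0         (so range.size alone does not explain the line above)
```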

The starting and ending offsets need not be on character boundaries:

  s = 'こんにちは'
  s.byteslice(0, 3) # => "こ"
  s.byteslice(1, 3) # => "\x81\x93\xE3"

The encodings of +self+ and the returned substring
are always the same:

  s.encoding                 # => #<Encoding:UTF-8>
  s.byteslice(0, 3).encoding # => #<Encoding:UTF-8>
  s.byteslice(1, 3).encoding # => #<Encoding:UTF-8>

But, depending on the character boundaries,
the encoding of the returned substring may not be valid:

  s.valid_encoding?                 # => true
  s.byteslice(0, 3).valid_encoding? # => true
  s.byteslice(1, 3).valid_encoding? # => false
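
One way to act on the validity check is to trim a byte-limited prefix back to the nearest character boundary. This is a minimal sketch under two assumptions: the helper name truncate_bytes is hypothetical, and max_bytes is non-negative (a negative value would make byteslice return nil).

```ruby
# Hypothetical helper: returns the longest prefix of str that fits in
# max_bytes bytes and does not end in a split character.
def truncate_bytes(str, max_bytes)
  piece = str.byteslice(0, max_bytes)
  # Drop one trailing byte at a time until the prefix is valid again.
  # The empty string is always valid, so the loop terminates.
  piece = piece.byteslice(0, piece.bytesize - 1) until piece.valid_encoding?
  piece
end

s = 'こんにちは'
p truncate_bytes(s, 4)  # => "こ"
p truncate_bytes(s, 6)  # => "こん"
p truncate_bytes(s, 2)  # => ""
```

For a valid UTF-8 receiver the loop removes at most three bytes, since only the final character of the prefix can be incomplete.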

Related: see {Converting to New String}[rdoc-ref:String@Converting+to+New+String].
41 changes: 3 additions & 38 deletions string.c
@@ -6870,45 +6870,10 @@ str_byte_aref(VALUE str, VALUE indx)

/*
* call-seq:
* byteslice(index, length = 1) -> string or nil
* byteslice(range) -> string or nil
*
* Returns a substring of +self+, or +nil+ if the substring cannot be constructed.
*
* With integer arguments +index+ and +length+ given,
* returns the substring beginning at the given +index+
* of the given +length+ (if possible),
* or +nil+ if +length+ is negative or +index+ falls outside of +self+:
*
* s = '0123456789' # => "0123456789"
* s.byteslice(2) # => "2"
* s.byteslice(200) # => nil
* s.byteslice(4, 3) # => "456"
* s.byteslice(4, 30) # => "456789"
* s.byteslice(4, -1) # => nil
* s.byteslice(40, 2) # => nil
*
* In either case above, counts backwards from the end of +self+
* if +index+ is negative:
*
* s = '0123456789' # => "0123456789"
* s.byteslice(-4) # => "6"
* s.byteslice(-4, 3) # => "678"
*
* With Range argument +range+ given, returns
* <tt>byteslice(range.begin, range.size)</tt>:
*
* s = '0123456789' # => "0123456789"
* s.byteslice(4..6) # => "456"
* s.byteslice(-6..-4) # => "456"
* s.byteslice(5..2) # => "" # range.size is zero.
* s.byteslice(40..42) # => nil
*
* In all cases, a returned string has the same encoding as +self+:
*
* s.encoding # => #<Encoding:UTF-8>
* s.byteslice(4).encoding # => #<Encoding:UTF-8>
* byteslice(offset, length = 1) -> string or nil
* byteslice(range) -> string or nil
*
* :include: doc/string/byteslice.rdoc
*/

static VALUE