rust string slice negative index

Why doesn’t Rust have reverse indexing / indexing from end for arrays?

stackoverflow.com › questions › 77673540 › why-doesn-t-rust-have-reverse-indexing-indexing-from-end-for-arrays

I can think of several reasons:

If you calculate the index, you can easily calculate a negative index by mistake. This would not be a memory-safety issue in Rust, but bounds-checking would no longer work reliably. This would make a common class of bugs impossible to detect at run-time.
It would have a run-time cost (it needs an addition, at least, compared to bounds-checking). It would be generating machine code that isn't needed in most cases, hurting performance.
Arrays may have a known size at compile-time, reducing the cost, but slices do not. Surely you'd want indexing to mean the same for both slices and arrays.
arr[-1] implies that the index is signed, so if you have a pointer you can use indexing only to access half of the possible memory range. This can be a problem in systems programming where the highest memory bit may have a special meaning.
Also, in C++ mixing signed and unsigned integers has a long history of pain and bugs. I'm not sure if this reasoning applies here, but at least many would feel better to avoid mixing signed and unsigned types whenever possible. (E.g. it requires care to compare those types with correct overflow handling.)

Answer from maxy on Stack Overflow
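The usual unsigned alternatives to `arr[-1]` can be sketched as follows. This is a minimal illustration, not part of the quoted answer; `from_end` is a hypothetical helper name, and it relies only on the standard `last`, `get`, `checked_sub` and `rev` methods.

```rust
// Hypothetical helper (not from the quoted answer): the n-th element
// counted from the end, where n = 0 is the last element.
fn from_end<T>(items: &[T], n: usize) -> Option<&T> {
    // checked_sub yields None on underflow instead of panicking or wrapping,
    // so a miscalculated offset is still detected at run time.
    let idx = items.len().checked_sub(n)?.checked_sub(1)?;
    items.get(idx)
}

fn main() {
    let arr = [1, 2, 3, 4, 5];

    // Last element without any index arithmetic.
    assert_eq!(arr.last(), Some(&5));

    // Explicit arithmetic: panics if the array has fewer than 2 elements.
    assert_eq!(arr[arr.len() - 2], 4);

    // Checked variant: out-of-range offsets become None.
    assert_eq!(from_end(&arr, 1), Some(&4));
    assert_eq!(from_end(&arr, 5), None);

    // Iterating from the back is another option.
    assert_eq!(arr.iter().rev().nth(2), Some(&3));
}
```

The checked helper keeps the property the answer cares about: an accidentally out-of-range offset is reported rather than silently reinterpreted as a position from the other end.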

The Rust Programming Language

doc.rust-lang.org › book › ch04-03-slices.html

The Slice Type - The Rust Programming Language

Internally, the slice data structure stores the starting position and the length of the slice, which corresponds to ending_index minus starting_index. So, in the case of let world = &s[6..11];, world would be a slice that contains a pointer to the byte at index 6 of s with a length value of 5. Figure 4-7 shows this in a diagram. Figure 4-7: A string slice referring to part of a String
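The pointer-plus-length description above can be checked with a short sketch. The helper name `byte_offset` is an assumption made for illustration; it compares addresses as plain integers and assumes `sub` was sliced from `parent`.

```rust
// Hypothetical helper: how many bytes into `parent` the slice `sub` begins.
// Assumes `sub` was obtained by slicing `parent`.
fn byte_offset(parent: &str, sub: &str) -> usize {
    sub.as_ptr() as usize - parent.as_ptr() as usize
}

fn main() {
    let s = String::from("hello world");
    let world = &s[6..11];

    assert_eq!(world, "world");
    // The stored length is ending_index minus starting_index.
    assert_eq!(world.len(), 11 - 6);
    // The stored pointer refers to the byte at index 6 of `s`.
    assert_eq!(byte_offset(&s, world), 6);
}
```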

GitHub

github.com › rust-lang › rfcs › issues › 2249

Signed integer indexing and slicing · Issue #2249 · rust-lang/rfcs

December 14, 2017 - And just to clarify, negative indexes like -1 and -2 get added to the length of the slice to become real indices, like len - 1 and len - 2.
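The rule described in the issue (a negative index is added to the length) can be written as a small free function. This is a sketch under that stated rule; `resolve_index` is a hypothetical name and nothing like it exists in the standard library.

```rust
// Hypothetical helper: maps a Python-style signed index onto a real index.
// Negative values are added to `len`; anything out of range yields None.
fn resolve_index(index: isize, len: usize) -> Option<usize> {
    let resolved = if index < 0 {
        // -1 becomes len - 1, -2 becomes len - 2, and so on.
        len.checked_sub(index.unsigned_abs())?
    } else {
        index as usize
    };
    if resolved < len {
        Some(resolved)
    } else {
        None
    }
}

fn main() {
    let v = vec!['a', 'b', 'c', 'd', 'e'];
    let last = resolve_index(-1, v.len()).map(|i| v[i]);
    let too_far = resolve_index(-6, v.len()).map(|i| v[i]);
    assert_eq!(last, Some('e'));
    assert_eq!(too_far, None);
}
```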

Published Dec 14, 2017

Rust

docs.rs › slicestring

slicestring - Rust

slicestring is a crate for slicing Strings. It provides the slice() method for String and &str. It takes the index-range as an argument, whereby also a negative value can be passed for the second index.

Rust Internals

internals.rust-lang.org › libs

Allowing slice indexing with non-usize integer types - libs - Rust Internals

December 23, 2020 - In code that stores and manipulates indexes into slices (custom hash maps, some compression code, string searching/indexing, etc) it is common to manipulate and store these indexes in types that are not usize. It is unne…

Stack Overflow

stackoverflow.com › questions › 77673540 › why-doesn-t-rust-have-reverse-indexing-indexing-from-end-for-arrays

Why doesn’t Rust have reverse indexing / indexing from end for arrays? - Stack Overflow

2 of 4

You can easily implement it yourself if you want to.

use std::ops::{Index, IndexMut};

pub struct ReverseIndex(pub usize);

impl<T> Index<ReverseIndex> for [T] {
    type Output = T;
    
    fn index(&self, idx: ReverseIndex)->&T{
        let len = self.len();
        &self[len - 1 - idx.0]
    }
}

impl<T> IndexMut<ReverseIndex> for [T] {
    fn index_mut(&mut self, idx: ReverseIndex)->&mut T{
        let len = self.len();
        &mut self[len - 1 - idx.0]
    }
}


fn main(){
    let g = [1, 2, 3, 4, 5];
    assert_eq!(g[ReverseIndex(1)], 4);
}

GitHub

github.com › rust-lang › rust › issues › 18662

Slice notation with negative indices · Issue #18662 · rust-lang/rust

August 25, 2014 - fn main() { let a = "haha"; println!("Nope. {}", a[1..-1]); } Compilation succeeds without any warning. -1 is silently converted to an unsigned int, and I get an executable that panics with: index 1 and/or 18446744073709551615 in `haha` do not lie on character boundary
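What the issue's `a[1..-1]` was presumably meant to express (drop the first and last byte) can be spelled out with explicit unsigned arithmetic. The sketch below is an illustration rather than the resolution of the issue; `trim_ends` is a hypothetical name, and it works on byte offsets, so it returns `None` when either offset falls inside a multi-byte character.

```rust
// Hypothetical helper: the string without its first and last byte.
// Returns None for strings that are too short or when an offset would
// split a multi-byte UTF-8 character.
fn trim_ends(s: &str) -> Option<&str> {
    // The end offset is computed from the length, never from a negative literal.
    let end = s.len().checked_sub(1)?;
    // `get` performs the bounds and char-boundary checks without panicking.
    s.get(1..end)
}

fn main() {
    let a = "haha";
    match trim_ends(a) {
        Some(inner) => println!("inner = {}", inner),
        None => println!("cannot trim {:?}", a),
    }
}
```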

Published Nov 05, 2014

Stack Overflow

stackoverflow.com › questions › 65347982 › how-to-address-an-array-index-by-a-negative-offset

rust - How to address an array index by a negative offset? - Stack Overflow

Top answer

1 of 2

Like @Aplet123 answered, you may use casts. But if b is always negative, you could store its absolute value and just subtract it instead:

let b = 1;

return a[SIZE/2 - b];

2 of 2

Cast it to an isize (a signed integer with the same size as a usize) first:

a[((SIZE / 2) as isize + b) as usize]
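The cast above wraps silently if the sum goes negative, so the failure only shows up later as an out-of-bounds panic with a huge index. A checked alternative, sketched here with the standard `usize::checked_add_signed` method (available in recent stable Rust), makes that case explicit; `offset_index` is a hypothetical helper name.

```rust
// Hypothetical helper: applies a signed offset to an unsigned base index
// and returns None if the result is negative, overflows, or is >= len.
fn offset_index(base: usize, offset: isize, len: usize) -> Option<usize> {
    base.checked_add_signed(offset).filter(|&i| i < len)
}

fn main() {
    const SIZE: usize = 8;
    let a: [i32; SIZE] = [10, 11, 12, 13, 14, 15, 16, 17];
    let b: isize = -1;

    match offset_index(SIZE / 2, b, a.len()) {
        Some(i) => println!("a[{}] = {}", i, a[i]),
        None => println!("offset {} is out of range", b),
    }
}
```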

Rust Programming Language

users.rust-lang.org › help

Can slice but can't index an str - help - The Rust Programming Language Forum

April 22, 2021 - Hi, I was writing some code in the playground to experiment with strs when I found something strange: str can't be indexed but can be sliced (from a range). Here's an example:

let a = "Some cool stuff";
let b = "Another awesome str";
let index: usize = 4;
// let a = &a[index]; Error!
let b = &b[index..index+3];
println!("a = {}, b = {}", a, b);

(Rust Playground) I read some issues about why a str can't be indexed but I don't understand why it can be sliced.
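The difference the poster ran into can be sketched like this: a range always produces another `&str`, whereas a single position has no unambiguous result type (byte or char), so the caller has to choose. The helper name `safe_slice` is an assumption for illustration; it wraps the standard `str::get`, which returns `None` instead of panicking when an offset is out of bounds or not on a character boundary.

```rust
// Hypothetical wrapper around str::get: a non-panicking range slice.
fn safe_slice(s: &str, start: usize, end: usize) -> Option<&str> {
    s.get(start..end)
}

fn main() {
    let a = "Some cool stuff";
    let index: usize = 4;

    // Single position: pick explicitly between a byte and a char.
    let byte: u8 = a.as_bytes()[index];
    let ch: Option<char> = a.chars().nth(index);
    assert_eq!(byte, b' ');
    assert_eq!(ch, Some(' '));

    // Range: the result is again a str, so slicing is allowed.
    assert_eq!(&a[index..index + 3], " co");

    // Byte ranges must fall on character boundaries; 'é' occupies bytes 1..3.
    let accented = "héllo";
    assert_eq!(safe_slice(accented, 0, 2), None);
    assert_eq!(safe_slice(accented, 0, 3), Some("hé"));
}
```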

MIT

web.mit.edu › rust-lang_v1.25 › arch › amd64_ubuntu1404 › share › doc › rust › html › book › second-edition › ch04-03-slices.html

Slices - The Rust Programming Language

Internally, the slice data structure stores the starting position and the length of the slice, which corresponds to ending_index minus starting_index. So in the case of let world = &s[6..11];, world would be a slice that contains a pointer to the 6th byte of s and a length value of 5. Figure 4-6 shows this in a diagram. Figure 4-6: String slice referring to part of a String · With Rust’s ..

Hacker News

news.ycombinator.com › item

The Rust approach of having all indexing being unsigned has been extremely annoy... | Hacker News

May 11, 2021 - I think unsignedness on array indices is one of those places where the “make invalid states unrepresentable” mantra has gone too far: yes it’s nice that theoretically the whole 64-bit index space is addressable from a byte array based at 0, but in reality -1 is just as stupid and invalid ...

reddit.com › r/rust › what’s the logic behind not being able to index with numeric types other than usize

r/rust on Reddit: What’s the logic behind not being able to index with numeric types other than usize

November 14, 2023 -

For example

fn main() {
    let x: u16 = 0;
    let mut v: Vec<u16> = Vec::new();
    v.push(1);
    let y = v[x];
    println!("{y}");
}

Doesn’t run but

fn main() {
    let x: u16 = 0;
    let mut v: Vec<u16> = Vec::new();
    v.push(1);
    let y = v[x as usize];
    println!("{y}");
}

Does

Top answer

1 of 12

497

usize compiles to the same size of a pointer on the target platform. This just makes it explicit what kind of memory constraints your program will have; it will be based on your target. It might be helpful to know that all other languages also require usize equivalents, but they make that conversion implicitly. Requiring the cast is just keeping with the Rust theme of being explicit.

2 of 12

Vector indexing really is just an addition (and a read): you add some offset (the index) to a pointer to the first element, and therefore the type you're adding must match. Since pointer size is platform dependent, Rust has a type that means "an int the size of the platform's pointer size" (I know, very convenient), and this is usize. In other languages, it is just int, which can cause some confusion as your integer's type changes size depending on where you run your code. So, a pointer is a usize, and the offset (or index) needs to be too. You could implicitly convert integer types, but this is not guaranteed to work (as the source integer could be larger than the usize max).
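The conversions discussed in the thread can also be written without `as`. This is a sketch using the standard `From` and `TryFrom` implementations for `usize`; the helper names `get_u16` and `get_u64` are hypothetical.

```rust
use std::convert::TryFrom;

// u16 always fits in usize, so the conversion is lossless and infallible.
fn get_u16(v: &[u16], index: u16) -> Option<u16> {
    v.get(usize::from(index)).copied()
}

// u64 may not fit in usize on every target, so the conversion is fallible.
fn get_u64(v: &[u16], index: u64) -> Option<u16> {
    let i = usize::try_from(index).ok()?;
    v.get(i).copied()
}

fn main() {
    let x: u16 = 0;
    let mut v: Vec<u16> = Vec::new();
    v.push(1);

    assert_eq!(get_u16(&v, x), Some(1));
    assert_eq!(get_u64(&v, 0), Some(1));
    // Out-of-range indices yield None instead of panicking.
    assert_eq!(get_u64(&v, 1), None);
}
```

Compared with `x as usize`, `try_from` reports a value that does not fit instead of truncating it.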

Rust

docs.rs › ndarray › latest › ndarray › struct.Slice.html

Slice in ndarray - Rust

Slice::new(a, None, -1) is every element, from a until the end, in reverse order. It can also be created with Slice::from(a..).step_by(-1). The Python equivalent is [a::-1]. ... end index; negative are counted from the back of the axis; when not present the default is the full length of the axis.

Rust Programming Language

users.rust-lang.org › t › negative-tuple-indexes › 93568

Negative tuple indexes - The Rust Programming Language Forum

Top answer

1 of 1

Indeed, there’s no such thing as negative tuple-struct indices. You can provide the index of the field as an argument to the macro. Or consider named fields.

Rust Programming Language

users.rust-lang.org › t › indexing-vec-with-i32-is-necessary › 79378

Indexing Vec with i32 is necessary - The Rust Programming Language Forum

August 4, 2022 - Hi. I'm new to rust. It's an interesting language and I learned a lot from it ^-^. Recently, I found that rust restricts users to index vectors with unsigned numbers only. However, it's not plausible at all. During programming, we often need to add an index with a negative offset.

Stack Overflow

stackoverflow.com › questions › 65250876 › how-do-i-go-beyond-the-starting-index-of-a-string-slice

rust - How do I go beyond the starting index of a string slice? - Stack Overflow

Top answer

1 of 1

it's a pointer that can point to any character of a String, not necessarily the 0-th or 1st one, right?

Yes.

is there a way to go back N characters and refer to -N character of an original String? Not to -Nth character of a slice, but an original String.

The slice itself retains no knowledge of whatever originally created it (which might not even be a String), it only has a pointer and a length.

Provided that I know that it may cause memory segmentation error and am fine with it.

You can always do that with unsafe and raw pointers but as the name notes this is wildly unsafe, not just because there's no guarantees whatsoever as to where's what, but because the other segments might be mutably borrowed or any other nonsense, which land you straight into UB. A segfault is the least of your worries.
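If the original string is still in scope, no `unsafe` is needed: the slice's position can be recovered from the two pointers and the original re-sliced with a wider range. The sketch below assumes `sub` really was sliced from `parent`; `widen_left` is a hypothetical helper, and `n` counts bytes, not characters.

```rust
// Hypothetical helper: re-slices `parent` so the result starts `n` bytes
// before `sub` and ends where `sub` ends. Returns None if `sub` does not
// lie inside `parent`, if there are fewer than `n` bytes before it, or if
// the new start is not on a character boundary.
fn widen_left<'a>(parent: &'a str, sub: &str, n: usize) -> Option<&'a str> {
    // Addresses are compared as plain integers; nothing is dereferenced.
    let base = parent.as_ptr() as usize;
    let start = (sub.as_ptr() as usize).checked_sub(base)?;
    let end = start.checked_add(sub.len())?;
    if end > parent.len() {
        return None;
    }
    let new_start = start.checked_sub(n)?;
    parent.get(new_start..end)
}

fn main() {
    let s = String::from("hello world");
    let world = &s[6..11];

    assert_eq!(widen_left(&s, world, 2), Some("o world"));
    // Going back further than the start of the original is refused.
    assert_eq!(widen_left(&s, world, 7), None);
}
```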

What are you actually trying to achieve?

Lib.rs

lib.rs › crates › rev_slice

RevSlice — Rust library // Lib.rs

October 23, 2018 - A newtype for operating on a reversed view of a slice, by Scott McMurray. Uses old Rust 2015. MIT license.

Stack Overflow

stackoverflow.com › questions › 24542115 › how-to-index-a-string-in-rust

indexing - How to index a String in Rust - Stack Overflow

Top answer

1 of 10

217

Yes, indexing into a string is not available in Rust. The reason for this is that Rust strings are saved in a contiguous UTF-8 encoded buffer internally, so the concept of indexing itself would be ambiguous, and people would misuse it: byte indexing is fast, but almost always incorrect (when your text contains non-ASCII symbols, byte indexing may leave you inside a character / unicode code point, which is really bad if you need text processing), while code point indexing is not free because UTF-8 is a variable-length encoding, so you have to traverse the entire string buffer to find the required code point.

If you are certain that your strings contain ASCII characters only, you can use the as_bytes() method on &str which returns a byte slice, and then index into this slice:

let num_string = num.to_string();

// ...

let b: u8 = num_string.as_bytes()[i];
let c: char = b as char;  // if you need to get the character as a unicode code point

If you do need to index code points, you have to use the chars() iterator:

num_string.chars().nth(i).unwrap()

As I said above, this would require traversing the entire iterator up to the ith code element.

Finally, in many cases of text processing, it is actually necessary to work with grapheme clusters rather than with code points or bytes. For example, many emojis are composed of multiple code points, but are perceived as one "character". With the help of the unicode-segmentation crate, you can index into grapheme clusters as well:

use unicode_segmentation::UnicodeSegmentation;

let string: String = ...;
UnicodeSegmentation::graphemes(&string, true).nth(i).unwrap()

Naturally, grapheme cluster indexing into the contiguous UTF-8 buffer has the same requirement of traversing the entire string as indexing into code points.
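The same iterators also cover indexing from the end of a string, since `chars()` and `char_indices()` can be reversed. This is a sketch at code-point granularity (not grapheme clusters); `char_from_end` and `last_n_chars` are hypothetical helper names.

```rust
// Hypothetical helper: the n-th code point from the end (n = 0 is the last).
fn char_from_end(s: &str, n: usize) -> Option<char> {
    s.chars().rev().nth(n)
}

// Hypothetical helper: the suffix holding the last `n` code points,
// or the whole string if it has fewer than `n`.
fn last_n_chars(s: &str, n: usize) -> &str {
    if n == 0 {
        return "";
    }
    // char_indices yields byte offsets that are valid slice boundaries.
    match s.char_indices().rev().nth(n - 1) {
        Some((byte_idx, _)) => &s[byte_idx..],
        None => s,
    }
}

fn main() {
    let word = "héllo";
    assert_eq!(char_from_end(word, 0), Some('o'));
    assert_eq!(char_from_end(word, 3), Some('é'));
    assert_eq!(last_n_chars(word, 4), "éllo");
}
```

Both helpers walk the string from the back, so the cost grows with `n` rather than with the full length.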

2 of 10

The correct approach to doing this sort of thing in Rust is not indexing but iteration. The main problem here is that Rust's strings are encoded in UTF-8, a variable-length encoding for Unicode characters. Being variable in length, the memory position of the nth character can't be determined without looking at the string. This also means that accessing the nth character has a runtime of O(n)!

In this special case, you can iterate over the bytes, because your string is known to only contain the characters 0–9 (iterating over the characters is the more general solution but is a little less efficient).

Here is some idiomatic code to achieve this (playground):

fn is_palindrome(num: u64) -> bool {
    let num_string = num.to_string();
    let half = num_string.len() / 2;

    num_string.bytes().take(half).eq(num_string.bytes().rev().take(half))
}

We go through the bytes in the string both forwards (num_string.bytes().take(half)) and backwards (num_string.bytes().rev().take(half)) simultaneously; the .take(half) part is there to halve the amount of work done. We then simply compare one iterator to the other one to ensure at each step that the nth and nth last bytes are equivalent; if they are, it returns true; if not, false.

Rustlabs

rustlabs.github.io › home › rust - quick start › slicing a string

Slicing a String | Learn Rust

January 24, 2024 - What Is Slicing? Slicing is used to get a portion of a string value. Syntax: the general syntax is let slice = &string[start_index..end_index], where start_index and end_index are the starting and ending positions within the original string.

Wduquette

wduquette.github.io › parsing-strings-into-slices

Parsing Rust Strings into Slices

Given a character index into a slice, and the character at that index, you can get the index of the next character by adding the first character’s length in bytes using the len_utf8 method. Here’s a partial implementation of a type that wraps a Peekable<Chars> iterator and uses len_utf8 to keep track of the current index. /// The Tokenizer type. #[derive(Clone,Debug)] pub struct Tokenizer<'a> { // The string being parsed.

crates.io

crates.io › crates › at › 0.1.1

at - crates.io: Rust Package Registry

These methods offer a few benefits over standard indexing: They work for any integer type, rather than just usize · They support Pythonesque negative indices; for example, nums.at(-1) returns the last element