But when you have unicode combining characters, and things like Hangul composable forms, what use does indexing by unicode code point actually have which would justify the complexity to which your refer? Iterating by whole grapheme cluster might possibly be useful but indexing could not be constant time, and I believe that the Julia language has such a thing, in its ‘graphemes’ function; but I don’t think python provides that.