There seems to be a lot overlap between the tokenizers of Llama 3 and GPT-4. How similar are they?
Large language models (predictably) learn to represent the semantic meaning of sentences.