It’s not one AI doing it in a big blob.
You ask ChatGPT something. It builds a web query. Another program runs the search and returns the results. ChatGPT then parses the list of results and chooses one to visit. The same program fetches that page and returns its content. ChatGPT parses that, and so on.
If the program that handles the queries and returns content (which is not an AI) is set to respect robots.txt, it simply won't return a disallowed page's content to ChatGPT to be parsed.
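To make that concrete, here's a rough sketch of what such a non-AI fetcher could look like. This is not how OpenAI's actual tooling works (I don't know its internals); it's just an illustration using Python's standard `urllib.robotparser`. The names `make_fetcher`, `ExampleBot`, `ROBOTS` and `PAGES` are made up for the example, and the "web" is a dict so it runs without a network connection.

```python
from urllib.robotparser import RobotFileParser


def make_fetcher(robots_txt, pages, user_agent="ExampleBot"):
    """Build a fetch(url) function that honours the given robots.txt text.

    `pages` is a stand-in for the web: a dict mapping URL -> page content.
    """
    parser = RobotFileParser()
    # Parse the robots.txt rules from memory instead of downloading them.
    parser.parse(robots_txt.splitlines())

    def fetch(url):
        # The check happens here, in plain code, before the model sees anything.
        if not parser.can_fetch(user_agent, url):
            return None  # disallowed: nothing is handed back to be parsed
        return pages.get(url)

    return fetch


# Hypothetical robots.txt and site content, for illustration only.
ROBOTS = """User-agent: ExampleBot
Disallow: /private/
"""

PAGES = {
    "https://example.com/public/a": "public text",
    "https://example.com/private/b": "secret text",
}

fetch = make_fetcher(ROBOTS, PAGES)
```

With this, `fetch("https://example.com/public/a")` hands back the page text, while `fetch("https://example.com/private/b")` hands back `None`, so the model never gets that page's content in the first place.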
Scan and Go is becoming very widespread in Denmark. It’s lovely! It cuts the time for a quick shopping trip on the way home from work to less than half.