• 0 Posts
  • 14 Comments
Joined 12 days ago
Cake day: March 10th, 2025

  • Okay, so you can’t conceive of the idea of an email that it’s important that you don’t miss.

    Let’s go with what Apple sold Apple Intelligence on, shall we? You say to Siri “what time do I need to pick my mother up from the airport?” and Siri combs through your messages for the flight time, checks the time of arrival from the airline’s website, accesses maps to get journey time accounting for local traffic, and tells you when you need to leave.

    With LLMs, absolutely none of those steps can be trusted. You have to check each one yourself. Because if they’re wrong, then the output is wrong. And it’s important that the output is right. And if you have to check the input of every step, then what do you save by having Siri do it in the first place? It’s actually taking you more time than it would have to do everything yourself.

    AI assistants are being sold as saving you time and taking meaningless busywork away from you. In some applications, like writing easy, boring code, or crunching more data than a human could in a very short time frame, they are. But for the applications they’re being sold on for phones? Not without being reliable. Which they can’t be, because of their architecture.





  • I’m not sure we’re disagreeing very much, really.

    My main point WRT “kinda” is that there are a tonne of applications that 99% isn’t good enough for.

    For example, one use that all the big players in the phone world seem to be pushing ATM is that of sorting your emails for you. If you rely on that and it classifies an important email as unimportant so you miss it, then that’s actually a useless feature. Either you have to check all your emails manually yourself, in which case it’s quicker to just do that in the first place and the AI offers no value, or you rely on it and end up missing something that it was important you didn’t miss.

    And it doesn’t matter if it gets it wrong one time in a hundred, that one time is enough to completely negate all potential positives of the feature.

    As you say, 100% isn’t really possible.

    I think where it’s useful is for things like analysing medical data and helping coders who know what they’re doing with their work. In terms of search it’s also good at “what’s the name of that thing that’s kinda like this?”-type queries. Kind of the opposite of traditional search engines where you’re trying to find out information about a specific thing, where I think non-Google engines are still better.


  • I’m not saying they don’t have applications. But the idea of them being a one size fits all solution to everything is something being sold to VC investors and shareholders.

    As you say - the issue is accuracy. And, as you also say - that’s not what these things do, and instead they make predictions about what comes next and present that confidently. Hallucinations aren’t errors, they’re what they were built to do.

    If you want something which can set an alarm for you or find search results then something that responds to set inputs correctly 100% of the time is better than something more natural-seeming which is right 99% of the time.

    Maybe along the line there will be a new approach, but what is currently branded as AI is never going to be what it’s being sold as.


  • If you follow AI news you should know that it’s basically out of training data, that extra training is inversely exponential and so extra training data would only have limited impact anyway, that companies are starting to train AI on AI-generated data - both intentionally and unintentionally - and that hallucinations and unreliability are baked into the technology.

    You also shouldn’t take improvements at face value. The latest ChatGPT is better than the previous version, for sure. But its achievements are exaggerated (for example, it already knew the answers ahead of time for the specific maths questions that it was demoed answering, and isn’t better than before or than other LLMs at solving maths problems that it doesn’t already have the answers to), and the way it operates is to have a second LLM check its outputs. Which means it takes, IIRC, 4-5 times the energy (and therefore cost) for each answer, for a marginal improvement in functionality.

    The idea that “they’ve come on in leaps and bounds over the last 3 years, therefore they will continue to improve at that rate” isn’t really supported by the evidence.








  • Someone has died due to a touchscreen. A woman had a Tesla which you put in park, forward or reverse with a touchscreen. She’d always had trouble with it, got it wrong and reversed into a pond. That meant the power went out so she couldn’t open the door. To get to the emergency escape handle you have to remove the speakers in the doors. So she drowned.

    The kicker? Her husband was a millionaire and he immediately put out a statement absolving Tesla and Musk from any wrongdoing.