🇮🇹 🇪🇪 🖥

  • 0 Posts
  • 100 Comments
Joined 9 months ago
cake
Cake day: March 19th, 2024

help-circle

  • There is a bunch of research showing that model improvement is marginal compared to energy demand and/or amount of training data. OpenAI itself ~1 month ago mentioned that they are seeing a smaller improvements in Orion (I believe) vs GPT4 than there was between GPT 4 and 3. We are also running out of quality data to use for training.

    Essentially what I mean is that the big improvements we have seen in the past seem to be over, now improving a little cost a lot. Considering that the costs are exorbitant and the difference small enough, it’s not impossible to imagine that companies will eventually give up if they can’t monetize this stuff.



  • That is my experience, it’s generally quite decent for small and simple stuff (as I said, distillation of documentation). I use it for rust, where I am sure the training material was much smaller than other languages. It’s not a matter a prompting though, it’s not my prompt that makes it hallucinate functions that don’t exist in libraries or make it write code that doesn’t compile, it’s a feature of the technology itself.

    GPTs are statistical text generators after all, they don’t “understand” the problem.


  • I hardly see it changed to be honest. I work in the field too and I can imagine LLMs being good at producing decent boilerplate straight out of documentation, but nothing more complex than that.

    I often use LLMs to work on my personal projects and - for example - often Claude or ChatGPT 4o spit out programs that don’t compile, use inexistent functions, are bloated etc. Possibly for languages with more training (like Python) they do better, but I can’t see it as a “radical change” and more like a well configured snippet plugin and auto complete feature.

    LLMs can’t count, can’t analyze novel problems (by definition) and provide innovative solutions…why would they radically change programming?



  • What job could possibly replace…? If you can replace a job with LLMs it means either that the job is not needed on the first place (bullshit job) or that you can replace it with a dice (e.g., decision-making processes), since LLMs-output will depend essentially just on what is in the training material -which we don’t know (I.e., the answer is essentially random).


  • Models are not improving, companies are still largely (massively) unprofitable, the tech has a very high environmental impact (and demand) and not a solid business case has been found so far (despite very large investments) after 2 years.

    That AI isn’t going anywhere is possible, but LLM-based tools might also simply follow crypto, VR, metaverses and the other tech “revolutions” that were just hyped and that ended nowhere. I can’t say it will go one way or another, but I disagree with you about “adjustment period”. I think generative AI is cool and fun, but it’s a toy. If companies don’t make money with it, they will eventually stop investing into it.

    Also

    Today’s hype will have lasting effects that constrain tomorrow’s possibilities

    Is absolutely true. Wasting capital (human and economic) on something means that it won’t be used for something else instead. This is especially true now that it’s so hard to get investments for any other business. If all the money right now goes into AI, and IF this turns out to be just hype, we just collectively lost 2, 4, 10 years of research and investments on other areas (for example, environment protection). I am really curious about what makes you think that that sentence is false and stupid.






  • I like the idea of canaries in documents, I think is a good point but obviously it only applies to certain types of data. Still a good idea.

    Looking at OP, they seem a small shop, with a limited budget. Seriously the best recommendation I think is to use some kind of remote storage for data (works as long as the employee complies) and to make sure the access control is done in a decent way (reducing the blast of employee behaving maliciously). Anything else is probably out of reach for a small company without a security department.

    Maybe I sounded too harsh, that’s just because in this post I have seen all kinds of comments who completely missed the point (IMHO) and suggested super complicated technical implementations that show how disconnected some people can be from real technical operations, despite the good tech skills.


  • DLP solutions are honestly a joke. 99% of the case they only cost you a fortune and prevent nothing. DLP is literally a corporate religion.

    What you mentioned also makes sense if you are windows shop running AD. If you are not, setting it up to lock 1 workstation is insane.

    Also, the moment the data gets put on the workstation you failed. Blocking USB is still a good idea, but does very little (network exfiltration is trivial, including with DLP solutions). So the idea to use remotely a machine is a decent control, and all efforts and resources should be put in place to prevent data leaving that machine. Obviously even this is imperfect, because if I can see the data on my screen I can take a picture and OCR it. So the effort needs to go in ensuring the data is accessed on a need basis.


  • Jamf doesn’t do anything for this problem, besides costing you a fortune in both license and maintenance/operation. Especially if you are not a Mac shop.

    MDM at most can be used as a reactive tool to do something on the machine - as long as the one with the machine in their hand leaves the network connection on.

    There are much cheaper solution to do that for 1 machine, and -as others correctly pointed out- the only solution (partial) here is not storing the data on a machine you don’t control. Period.




  • Your ability to SSH in the machine depends on the network connectivity. Knowing the IP does nothing if the SSH port is not forwarded by the router or if you don’t establish a reverse tunnel yourself with a public host. As a company you can do changes to the client device, but you can’t do them on the employee’s network (and they might not even be connected there). So the only option is to have the machine establish a reverse tunnel, and this removes even the need for dynamic DNS (which also might not work in certain ISPs).

    The no-sudo is also easier said than done, that means you will need to assist every time the employee needs a new package installed, you need to set unattended upgrades and of course help with debugging should something break. Depending on the job type, this might be possible.

    I still think this approach (lock laptop) is an old, ineffective approach (vs zero-trust + remote data).



  • This is honestly an extremely expensive (in terms of skills, maintenance, chance of messing up) solution for a small shop that doesn’t mitigate at all the threats posed.

    You said correctly, the employee has the final word on what happens to the data appearing on their screen. Especially in the case of client data (I.e., few and sensitive pieces of data), it might even be possible to take pictures of the screen (or type it manually) and all the time invested in (imperfect) solutions to restrict drives and network (essentially impossible unless you have a whitelist of IPs/URLs) goes out the window too.

    To me it seems this problemi is simply approached from the wrong angle: once the data is on a machine you don’t trust, it’s gone. It’s not just the employee, it’s anybody who compromises that workstation or accesses it while left unlocked. The only approach to solving the issue OP is having is simply avoiding for the data to be stored on the machine in the first place, and making sure that the access is only for the data actually needed.

    Data should be stored in the company-controlled infrastructure (be in cloud storage, object storage, a privileged-access workstation, etc.) and controls should be applied there (I.e., monitor for data transfers, network controls, etc.). This solves both the availability concerns (what if the laptop gets stolen, or breaks) and some of the security concerns. The employee will need to authenticate each time with a short-lived token to access the data, which means revoking access is also easy.

    This still does not solve the fundamental problem: if the employee can see the data, they can take it. There is nothing that can be done about this, besides ensuring that the data is minimised and the employee has only access to what’s strictly needed.