@sudneo

sudneo@lemm.ee · 13 hours ago

I really can’t see this being done by any sane person. Why would you have a generator of text reviewing stuff (besides grammar)? Do you have any reference of some companies doing this, perhaps?

sudneo@lemm.ee · 13 hours ago

There is a bunch of research showing that model improvement is marginal compared to energy demand and/or amount of training data. OpenAI itself ~1 month ago mentioned that they are seeing smaller improvements in Orion (I believe) vs GPT-4 than there were between GPT-4 and GPT-3. We are also running out of quality data to use for training.

Essentially what I mean is that the big improvements we have seen in the past seem to be over; now improving a little costs a lot. Considering that the costs are exorbitant and the difference small enough, it’s not impossible to imagine that companies will eventually give up if they can’t monetize this stuff.

sudneo@lemm.ee · 14 hours ago

Oh boy…what can possibly go wrong for documents where small minutiae like wording can make a huge difference?

sudneo@lemm.ee · 14 hours ago

That is my experience: it’s generally quite decent for small and simple stuff (as I said, distillation of documentation). I use it for Rust, where I am sure the training material was much smaller than for other languages. It’s not a matter of prompting though: it’s not my prompt that makes it hallucinate functions that don’t exist in libraries or makes it write code that doesn’t compile, it’s a feature of the technology itself.

GPTs are statistical text generators after all, they don’t “understand” the problem.

sudneo@lemm.ee · 16 hours ago

I hardly see it changed to be honest. I work in the field too and I can imagine LLMs being good at producing decent boilerplate straight out of documentation, but nothing more complex than that.

I often use LLMs to work on my personal projects and - for example - Claude or ChatGPT 4o often spit out programs that don’t compile, use nonexistent functions, are bloated etc. Possibly for languages with more training data (like Python) they do better, but I can’t see it as a “radical change”; it’s more like a well configured snippet plugin and autocomplete feature.

LLMs can’t count, can’t analyze novel problems (by definition) and provide innovative solutions…why would they radically change programming?

sudneo@lemm.ee · 17 hours ago

Even if they plateaued in place where they are right now it would lead to major shakeups in humanity’s current workflow

Like which one? Because we have had ChatGPT for 2 years now, and already quite a lot of (good?) models. Which shakeup do you think is happening or going to happen?

sudneo@lemm.ee · 18 hours ago

What job could possibly replace…? If you can replace a job with LLMs it means either that the job is not needed in the first place (bullshit job) or that you can replace it with a die (e.g., decision-making processes), since LLM output will depend essentially just on what is in the training material - which we don’t know (i.e., the answer is essentially random).

sudneo@lemm.ee · 18 hours ago

Models are not improving, companies are still largely (massively) unprofitable, the tech has a very high environmental impact (and demand) and no solid business case has been found so far (despite very large investments) after 2 years.

That AI isn’t going anywhere is possible, but LLM-based tools might also simply follow crypto, VR, metaverses and the other tech “revolutions” that were just hyped and that ended nowhere. I can’t say it will go one way or another, but I disagree with you about the “adjustment period”. I think generative AI is cool and fun, but it’s a toy. If companies don’t make money with it, they will eventually stop investing in it.

Also

Today’s hype will have lasting effects that constrain tomorrow’s possibilities

Is absolutely true. Wasting capital (human and economic) on something means that it won’t be used for something else instead. This is especially true now that it’s so hard to get investments for any other business. If all the money right now goes into AI, and IF this turns out to be just hype, we just collectively lost 2, 4, 10 years of research and investments in other areas (for example, environmental protection). I am really curious about what makes you think that that sentence is false and stupid.

sudneo@lemm.ee · 4 days ago

The rapist kept videos and pictures, I believe. There was tons of proof for everything he (and the others) did. I can see why this was relatively straightforward (in addition to having a big - deserved - amount of attention).

sudneo@lemm.ee · 4 days ago

That’s not how the justice system works in most of (all of?) Europe. Crimes are not a point system where you redeem “prizes”, and the sentence is based on the particular crime committed, not the sum of all individual counts.

Many countries also very rarely give life sentences because they generally have very little point (no possibility to re-enter society = no rehabilitation possible), in addition to creating other problems (like a complete disincentive to good behavior in prison, since literally nothing worse can happen to you).

sudneo@lemm.ee · 19 days ago

I wish this could be blamed on current or recent (or right wing) governments. The progressive demolition and/or privatization of welfare (from healthcare to social safety nets) is a process that has been going on for at least 20 years now, carried out by all the main parties.

sudneo@lemm.ee · 19 days ago

But bringing it down is 1) illegal, 2) costly (DDoS costs money), 3) not guaranteed (CDNs can be very resilient) and 4) doesn’t show the collective support that not visiting the site does.

sudneo@lemm.ee · 25 days ago

I like the idea of canaries in documents. I think it is a good point, but obviously it only applies to certain types of data. Still a good idea.

Looking at OP, they seem to be a small shop with a limited budget. Seriously, the best recommendation I think is to use some kind of remote storage for data (works as long as the employee complies) and to make sure the access control is done in a decent way (reducing the blast radius of an employee behaving maliciously). Anything else is probably out of reach for a small company without a security department.

Maybe I sounded too harsh, that’s just because in this post I have seen all kinds of comments that completely missed the point (IMHO) and suggested super complicated technical implementations that show how disconnected some people can be from real technical operations, despite their good tech skills.
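
For what it's worth, the canary idea mentioned above could look roughly like the sketch below. This is only an illustration under assumptions: the key, the `REF-` marker format and the function names (`canary_for`, `tag_document`, `find_leaker`) are all made up for the example, and a visible marker like this is trivial to strip if the employee knows about it.

```python
import hashlib
import hmac

# Hypothetical secret, kept only on company-controlled infrastructure.
CANARY_KEY = b"replace-with-a-real-secret"


def canary_for(recipient: str, document_id: str) -> str:
    """Derive a marker that is unique per (document, recipient) pair."""
    message = f"{document_id}:{recipient}".encode()
    digest = hmac.new(CANARY_KEY, message, hashlib.sha256).hexdigest()
    return "REF-" + digest[:12].upper()


def tag_document(text: str, recipient: str, document_id: str) -> str:
    """Return a copy of the document carrying the recipient's marker."""
    return f"{text}\nInternal reference: {canary_for(recipient, document_id)}\n"


def find_leaker(leaked_text: str, recipients: list[str], document_id: str):
    """Return the recipient whose marker appears in the leaked copy, if any."""
    for recipient in recipients:
        if canary_for(recipient, document_id) in leaked_text:
            return recipient
    return None
```

Each person gets their own tagged copy via `tag_document`, and if a copy shows up somewhere it shouldn't, `find_leaker` tells you whose copy it was. It detects, it doesn't prevent.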

sudneo@lemm.ee · 25 days ago

DLP solutions are honestly a joke. 99% of the time they only cost you a fortune and prevent nothing. DLP is literally a corporate religion.

What you mentioned also makes sense if you are a Windows shop running AD. If you are not, setting it up to lock 1 workstation is insane.

Also, the moment the data gets put on the workstation you have failed. Blocking USB is still a good idea, but does very little (network exfiltration is trivial, even with DLP solutions). So the idea of using a machine remotely is a decent control, and all efforts and resources should be put in place to prevent data leaving that machine. Obviously even this is imperfect, because if I can see the data on my screen I can take a picture and OCR it. So the effort needs to go into ensuring the data is accessed on a need-to-know basis.

sudneo@lemm.ee · 25 days ago

Jamf doesn’t do anything for this problem, besides costing you a fortune in both license and maintenance/operation. Especially if you are not a Mac shop.

MDM at most can be used as a reactive tool to do something on the machine - as long as the one with the machine in their hand leaves the network connection on.

There are much cheaper solutions to do that for 1 machine, and - as others correctly pointed out - the only (partial) solution here is not storing the data on a machine you don’t control. Period.

sudneo@lemm.ee · 26 days ago

Yeah, that’s what I wrote too, but that is still a very fragile way. For one, you depend on a network connection, or on the local firewall not blocking you, etc.

Reactive, on-demand SSH is something you can do for tech support, not for security imho.

sudneo@lemm.ee · 26 days ago

Disk encryption is a control against a lost or stolen device and malicious physical access (kinda). Storing the data elsewhere is more a control (or the basis for controls) against malicious insiders.

sudneo@lemm.ee · 26 days ago

Your ability to SSH into the machine depends on the network connectivity. Knowing the IP does nothing if the SSH port is not forwarded by the router or if you don’t establish a reverse tunnel yourself with a public host. As a company you can make changes to the client device, but you can’t make them on the employee’s network (and they might not even be connected there). So the only option is to have the machine establish a reverse tunnel, and this removes even the need for dynamic DNS (which also might not work with certain ISPs).

The no-sudo part is also easier said than done: it means you will need to assist every time the employee needs a new package installed, you need to set up unattended upgrades and of course help with debugging should something break. Depending on the job type, this might be possible.

I still think this approach (lock laptop) is an old, ineffective approach (vs zero-trust + remote data).

sudneo@lemm.ee · 26 days ago

Useful for standardized management of fleets, but it requires personnel to maintain and configure it, and I don’t think it’s very effective (or feasible - I doubt they will even join the call for a 1-device contract) for what OP needs.

sudneo@lemm.ee · 26 days ago

This is honestly an extremely expensive (in terms of skills, maintenance, chance of messing up) solution for a small shop that doesn’t mitigate at all the threats posed.

You said it correctly: the employee has the final word on what happens to the data appearing on their screen. Especially in the case of client data (i.e., few and sensitive pieces of data), it might even be possible to take pictures of the screen (or type it out manually), and all the time invested in (imperfect) solutions to restrict drives and network (essentially impossible unless you have a whitelist of IPs/URLs) goes out the window too.

To me it seems this problem is simply approached from the wrong angle: once the data is on a machine you don’t trust, it’s gone. It’s not just the employee, it’s anybody who compromises that workstation or accesses it while left unlocked. The only approach to solving the issue OP is having is to avoid storing the data on the machine in the first place, and to make sure that access is granted only to the data actually needed.

Data should be stored in company-controlled infrastructure (be it cloud storage, object storage, a privileged-access workstation, etc.) and controls should be applied there (i.e., monitoring for data transfers, network controls, etc.). This solves both the availability concerns (what if the laptop gets stolen, or breaks) and some of the security concerns. The employee will need to authenticate each time with a short-lived token to access the data, which means revoking access is also easy.

This still does not solve the fundamental problem: if the employee can see the data, they can take it. There is nothing that can be done about this, besides ensuring that the data is minimised and the employee only has access to what’s strictly needed.
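
To make the short-lived token point above a bit more concrete, here is a minimal sketch of how issuing, checking and revoking could work. Everything in it is an assumption for illustration (the secret, the 15-minute lifetime, the names `issue_token`, `verify_token` and `REVOKED_USERS`); in practice you would rely on whatever your storage or identity provider already offers rather than roll your own.

```python
import base64
import hashlib
import hmac
import time

# Hypothetical values: the key lives only on the company-controlled server.
SECRET_KEY = b"replace-with-a-real-secret"
TOKEN_TTL_SECONDS = 15 * 60
REVOKED_USERS = set()  # adding a user here cuts off access immediately


def issue_token(user, now=None):
    """Return a signed token for `user` that expires after TOKEN_TTL_SECONDS."""
    now = time.time() if now is None else now
    expires = int(now) + TOKEN_TTL_SECONDS
    payload = f"{user}:{expires}".encode()
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(payload).decode() + "." + signature


def verify_token(token, now=None):
    """Return the user name if the token is valid, otherwise None."""
    now = time.time() if now is None else now
    try:
        encoded, signature = token.split(".", 1)
        payload = base64.urlsafe_b64decode(encoded.encode())
        user, expires_text = payload.decode().rsplit(":", 1)
        expires = int(expires_text)
    except (ValueError, UnicodeDecodeError):
        return None  # malformed token
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return None  # tampered or forged
    if now >= expires or user in REVOKED_USERS:
        return None  # expired or revoked
    return user
```

The server would call `verify_token` on every data request. Even if you forget to revoke someone, a copied token stops working on its own once the lifetime runs out.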