r00ty

joined 2 years ago
[–] r00ty@kbin.life 2 points 3 months ago (1 children)

This already happens right now. If you have 22 open, your firewall is getting hammered with bots trying to get in, regardless of what cipher you're using, trying to exploit known weaknesses.

I know, except they're only ever trying lame user/password pairs that only an idiot would have on their luggage. Same as on asterisk, and the bots trying decades-old exploits against wordpress etc., regardless of whether the site you host is even remotely like wordpress.

I'm not sure how you'd achieve this. If you have a mechanism to change cipher modes then there would be part of the codebase and handshake that validates settings in some way, which adds a potential attack vector.

Doesn't need to change the handshake. If the server is mine, and run by me, and I decide I want to change, say, just the key exchange part of the process, it could be changed without negotiation. I just need to make sure all clients are configured the same way. My point being there wouldn't be a negotiation. If you try to connect to wireguard on my server, you'd need to have the key exchange set up in the same way, with the same parameters too. Yes, it should be entirely optional and require specific configuration changes on both client and server to achieve. So long as server and client are configured with the same parameters there's no negotiation to make. The channel can be set up, and if the configuration is wrong it just won't work.
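
A rough sketch of the idea in Python. It is a toy and not wireguard's actual handshake: a pre-shared secret and HMAC stand in for the real key exchange, and `Suite`, `client_hello` and `server_accept` are made-up names. The only point it shows is that the suite lives in local config on both ends and is never named on the wire, so a mismatch simply fails.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Suite:
    """Locally configured parameters (hypothetical). Never sent over the wire."""
    kdf_hash: str   # hashlib algorithm name, e.g. "sha256"
    mac_hash: str   # hashlib algorithm name, e.g. "sha512"
    label: bytes    # context string both ends must share


def session_key(suite: Suite, shared_secret: bytes, nonce: bytes) -> bytes:
    # Toy key derivation: HMAC over nonce and label, keyed with the secret.
    return hmac.new(shared_secret, nonce + suite.label, suite.kdf_hash).digest()


def client_hello(suite: Suite, shared_secret: bytes) -> bytes:
    # The message holds a nonce and a tag, and nothing that names the suite.
    nonce = os.urandom(16)
    key = session_key(suite, shared_secret, nonce)
    tag = hmac.new(key, b"hello", suite.mac_hash).digest()
    return nonce + tag


def server_accept(suite: Suite, shared_secret: bytes, message: bytes) -> Optional[bytes]:
    # The server only ever tries its own configured suite. No fallback, no offer.
    nonce, tag = message[:16], message[16:]
    key = session_key(suite, shared_secret, nonce)
    expected = hmac.new(key, b"hello", suite.mac_hash).digest()
    return key if hmac.compare_digest(tag, expected) else None
```

With the same `Suite` on both ends `server_accept` returns a key. With a different one it returns `None`, and the client learns nothing about what the server expected.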

[–] r00ty@kbin.life 3 points 3 months ago (2 children)

And their "AI tool" looks just like the hundreds of AI scraping bots. And I've already said the answer is easy. They need to differentiate themselves enough to convince cloudflare to make an exception for them.

Until then, they're "just another AI company scraping data"

[–] r00ty@kbin.life 5 points 3 months ago (4 children)

Yes, but my point is I cannot tell the difference. If they can convince cloudflare they deserve special treatment and exemption then they can probably get it.

I would argue there being a difference "depends" though. There's two problems I see. They are only potentially not guilty of one.

The first problem is that AI crawlers are a true DDoS, and this is I think the main reason most (including myself) do not want them. They cause performance issues by essentially speed running the collection of every unique piece of data from your site. If they're dynamic as the article says then they are potentially not doing this. I cannot say for sure here.

The second problem is, many sites are monetized from advert revenue or otherwise motivated by actual organic traffic. In this case, I would bet some money that this company is taking the data from these sites, not providing ad revenue or organic traffic and serving it to the querying user with their own ads included. In which case, this is also very very bad.

So, their beef is only potentially partially valid. Like I say, if they can convince cloudflare, and people like me to add exceptions for them, then great. So far though, I'm not convinced. AI scrapers have a bad reputation in general, and it's deserved. They need to do a LOT to escape that stigma.

[–] r00ty@kbin.life 24 points 3 months ago (8 children)

Well. Try running a web server and you'll find quite quickly that you get hit quick and hard by AI crawlers that do not respect server operators. Unlike web crawlers of old, these will hit a site over and over with sometimes 100s, even 1000s of requests per second to strip mine all the content they can find, as quickly as possible.

When you try to block them by user agent, they start faking real client user agents.

When you block the AS Numbers involved, traffic starts to go down. But there's still a large number of non-organic requests coming from, well, frankly everywhere. Cellular networks in Brazil, cable internet in the USA, other non-business subscribers in other countries around the world.

How do I know they're not organic? Turn on cloudflare managed challenge and they all go away.

So, personally that's my biggest beef against them. Yes, ripping off data without permission is bad already, but this level of effort to bypass any clear sign that we do not want them is far worse.
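
For illustration, the escalation above can be sketched as a small decision function in Python. Everything here is hypothetical: the user-agent substrings are invented, the AS numbers are from the range reserved for documentation, and `passed_challenge` stands in for whatever the challenge provider reports.

```python
from dataclasses import dataclass

# Placeholder values for illustration only.
BLOCKED_UA_SUBSTRINGS = ("examplebot", "examplecrawler")
BLOCKED_ASNS = {64500, 64501}  # documentation-range AS numbers


@dataclass
class Request:
    user_agent: str
    asn: int
    passed_challenge: bool


def decide(req: Request) -> str:
    ua = req.user_agent.lower()
    if any(s in ua for s in BLOCKED_UA_SUBSTRINGS):
        return "block"      # step 1: bots that still announce themselves
    if req.asn in BLOCKED_ASNS:
        return "block"      # step 2: networks known to host the crawlers
    if not req.passed_challenge:
        return "challenge"  # step 3: faked UA coming from a residential network
    return "allow"
```

The first two steps only catch crawlers that are honest about who or where they are. The third is the one that deals with the faked user agents on consumer connections.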

[–] r00ty@kbin.life 2 points 3 months ago (3 children)

Well, I did think the "security through obscurity" line would come up. But that's really something that should be reserved for people making their own "triple XOR" crypto implementations closed source and hoping that protects them.

The "obscurity", if that's the term we want to use here, isn't in my use case about hiding behind closed source to provide a perception of security. It's just giving a choice of crypto, but without adding negotiation to the protocol.

My thinking is this, and we'll look at say ssh. We can choose between multiple key types and lengths for that. Now let's say for example ed25519 is compromised (in real terms I think the only likely compromise for any of the ssh key based auth options would be deriving a private key from the public key, so the "scanning" I talk about is a fantasy. But I'm going with it!). For ssh, there will for sure be bots hunting the internet for vulnerable ssh servers very soon after. Automating the process of getting in, installing whatever nefarious tools they want and moving on. But, crucially they will only get those that have used ed25519 for their auth key login. However they might well get every single wireguard vpn.

I'm really just advocating for the same option. The option to not use the same as everyone else. With no reduction in security for anyone else and no need to negotiate, the onus would entirely be on the operator to ensure the same stack is configured on client and server. Of course with the understanding that using any other stack is at your own risk. E.g. "triple XOR" security might not be the best :P

Oh and as I said, I doubt I would use it. I use wireguard as it is, I like wireguard as it is. But, I feel like having options is not a bad thing, provided the default is the "best" option currently known.

[–] r00ty@kbin.life 2 points 3 months ago

Well the posts to inbox are generally for incoming info. Yes, there's endpoints for fetching objects. But, they don't work for indexing, at least not on mbin/kbin. If you have a link, you can use activitypub to traverse upwards from that object to the root post. But you cannot iterate down to child comments from any point.

The purpose is that say I receive an "event" from your instance. You click like on a post I don't have on my instance. Then the like event has a link to the object for that on activitypub. If I fetch that object it will have a link to the comment, if I fetch the comment it will have the comment it was in reply to, or the post. It's not intended to be used to backfill.

So they do it the old fashioned way, traversing the human side links. Which is essentially what I lock down with the managed challenge. And this is all on the free tier too.
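
To illustrate the upward-only traversal, here is a minimal Python sketch. The URLs and objects are invented, `OBJECTS.get` stands in for an HTTP fetch, and it assumes `inReplyTo` is a single URL string, which real objects don't guarantee.

```python
from typing import Callable, Dict, List, Optional

# Invented objects standing in for what a remote instance would return.
OBJECTS: Dict[str, dict] = {
    "https://a.example/post/1": {"type": "Page"},
    "https://a.example/comment/2": {
        "type": "Note",
        "inReplyTo": "https://a.example/post/1",
    },
    "https://a.example/comment/3": {
        "type": "Note",
        "inReplyTo": "https://a.example/comment/2",
    },
}

# An incoming "event" that points at an object we don't have yet.
LIKE = {"type": "Like", "object": "https://a.example/comment/3"}


def chain_to_root(start_id: str,
                  fetch: Callable[[str], Optional[dict]]) -> List[str]:
    """Follow inReplyTo links upwards until an object has no parent."""
    chain: List[str] = []
    seen = set()
    current: Optional[str] = start_id
    while current and current not in seen:  # guard against reply loops
        seen.add(current)
        obj = fetch(current)
        if obj is None:
            break
        chain.append(current)
        current = obj.get("inReplyTo")
    return chain
```

`chain_to_root(LIKE["object"], OBJECTS.get)` gives comment 3, then comment 2, then post 1. Nothing in those objects lets you go the other way, from post 1 down to its comments.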

[–] r00ty@kbin.life 14 points 3 months ago

It's the usual enshittification tactic. Make AI cheap so companies fire tech workers. Keep it cheap long enough that we all have established careers as McDonald's branch managers, then whack up the prices once they're locked in.

[–] r00ty@kbin.life 13 points 3 months ago (4 children)

For mbin I managed to kill the attack of the scrapers using only cloudflare managed challenge. The exceptions are POST requests to the fediverse endpoints, and requests from fediverse user agents on certain GET endpoints. Managed challenge on everything else.

So far, they've not gotten past it. But, a matter of time.
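
The logic of that rule, sketched in Python. The path patterns and user-agent markers below are placeholders and not the real mbin routes, and the actual rule is written in cloudflare's own rule language rather than Python.

```python
# Placeholder patterns for illustration, not the real mbin routes.
FEDI_INBOX_SUFFIX = "/inbox"
FEDI_GET_PREFIXES = ("/u/", "/m/", "/.well-known/webfinger")
FEDI_UA_MARKERS = ("mbin", "lemmy", "mastodon")


def needs_challenge(method: str, path: str, user_agent: str) -> bool:
    # Exception 1: other instances delivering activities by POST.
    if method == "POST" and path.endswith(FEDI_INBOX_SUFFIX):
        return False
    # Exception 2: fediverse software fetching objects on selected GET paths.
    ua = user_agent.lower()
    if (method == "GET"
            and path.startswith(FEDI_GET_PREFIXES)
            and any(marker in ua for marker in FEDI_UA_MARKERS)):
        return False
    # Everything else gets the managed challenge.
    return True
```

The user-agent check is trivially spoofable, which is part of why I say it's a matter of time.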

[–] r00ty@kbin.life 8 points 3 months ago (1 children)

While I don't doubt that's part of the reason, I would assume the aim is ensuring only the microsoft key was used, to create a trusted boot path to a clean windows install. At that point during the boot process these invasive anti-cheat engines take over and are then watching everything that loads, which makes it a bit harder to cheat.

But I think there's a lot of hardware options available that could still remain invisible here. Maybe it makes software options close to impossible though. Not too sure, there's always inventive workarounds people come up with.

I always find it amusing the lengths people will go to in order to cheat. Just short of learning to play the game better.

[–] r00ty@kbin.life 15 points 3 months ago (2 children)

Yep. I entirely agree about the good points. I am just always wary about removing options like this, regardless of intention.

I'd be fine if for example I'm running my own wireguard implementation, I could choose the suite to use, not negotiate anything and ensure my client has the same configuration.

I'd probably not use it, but I like the option, and knowing that anyone that wants to try to break this now also needs to guess what options I'm running.

[–] r00ty@kbin.life 23 points 3 months ago (9 children)

I only have one problem with this. When they say wireguard being crypto opinionated is a good thing, I am wary of agreeing with that statement entirely.

While it is good for stability (only one stack to support and get right, and to be secure and efficient) I do wonder about overall and future security. Saying "You must use this specific cipher suite because we think it's the best" is a bit of a dangerous road to take.

I say this just because Curve25519 is considered a very secure elliptic curve, to the best of my very limited knowledge on this subject. But some time ago a certain dual elliptic curve pseudo random number generator was pushed as "best practice" (NIST backed), and that didn't turn out so well. Even omitting possible conspiracy scenarios, it had known weaknesses even before it was recommended. [1]

Since then I've generally not been a huge fan of being given one option as "the right way" when it comes to cryptography. Even if it is the "best" it gives one target to try to find a weakness in, rather than many.

I say all this as a wireguard user, it's a great, fast and reliable VPN. I just have concerns when the choice of using other algorithms and especially putting my own chosen chain together is taken away. Because it puts the exact same target to break on every one of us, rather than having to work out how to break multiple methods and algorithms and multiple combinations.

[1] https://en.wikipedia.org/wiki/Dual_EC_DRBG

[–] r00ty@kbin.life 12 points 3 months ago (1 children)

I think it's a real shame because all three of those things you mention are useful. The problem is that once they become a buzzword, then everything needs to be done using that buzzword.

Cloud has been misused to hell and back, and I have no doubt AI will too.
