Kissaki

joined 1 year ago
[–] Kissaki@lemmy.dbzer0.com 2 points 15 hours ago* (last edited 15 hours ago)

Depending on what you want to scrape, that's a lot of overkill and overcomplication. Full website testing frameworks may not be necessary to scrape. Python with its tooling and package management may not be necessary.

I've recently extracted and downloaded stuff via Nushell.

  1. Requirement: Knowledge of CSS Selectors
  2. Inspect the website's DOM in the web browser's developer tools
    1. Identify structure
    2. Identify adequate selectors; testable in the browser dev tools console via document.querySelectorAll()
  3. Get and query data
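
Step 2.2 can be tried out in the browser console before writing any scraping script. A minimal sketch: `probeSelectors` is a hypothetical helper name, and the selectors are just the ones from my Nushell example, so swap in your own. It only relies on the standard `querySelectorAll()` and `textContent`.

```javascript
// Hypothetical helper: for each CSS selector, report how many elements match
// and the trimmed text of the first match, to judge whether the selector is adequate.
function probeSelectors(root, selectors) {
  return selectors.map((selector) => {
    const matches = Array.from(root.querySelectorAll(selector));
    return {
      selector,
      count: matches.length,
      firstText: matches.length > 0 ? (matches[0].textContent || '').trim() : null,
    };
  });
}

// In the browser dev tools console, run it against the current page.
// The guard only keeps the snippet from failing outside a browser.
if (typeof document !== 'undefined') {
  console.table(probeSelectors(document, ['#infobox .title', '#infobox .tags', 'main img']));
}
```

A count of 0 means the selector is wrong or the content is loaded later by script; a count much higher than expected means the selector is too broad.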

My shell and scripting language of choice is Nushell:

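# Fetch the page once and reuse the HTML
let html = http get 'https://example.org/'
# Query both selectors, then build a record from the first text of each match
let meta = $html | query web --query '#infobox .title, #infobox .tags' | { title: $in.0.0 tags: $in.1.0 }
# Collect the data-src attribute of every image inside main
let content = $html | query web --query 'main img' --attribute data-src
$meta | save meta.json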

or

1..30 | each {|x| http get $'https://example.org/img/($x).jpg' | save $'($x).jpg'; sleep 100ms }

Depending on the tools you use, it'll be quite similar or very different.

Selenium is an entire web-browser driver, meaning it does a lot more and has a more extensive interface because of it. You can talk to it through different interfaces and languages.

[–] Kissaki@lemmy.dbzer0.com 3 points 15 hours ago (1 children)

You don't even need a VPN to use a different DNS server.

[–] Kissaki@lemmy.dbzer0.com 4 points 16 hours ago

Injecting a malicious undisclosed firmware/software update. Very private and secure. /s

[–] Kissaki@lemmy.dbzer0.com 2 points 1 day ago

That's bullshit. There's no reason to limit CPU core usage or to target a specific, non-maximum value.

That would only make sense to evade hardware faults or cooling issues. Never as a general guideline.

[–] Kissaki@lemmy.dbzer0.com 4 points 2 days ago* (last edited 2 days ago)

YouTube channels can be terminated for both repeated copyright infringement and community guideline violations. In these cases, revenues are often withheld as well. It’s possible, however, that linked AdSense accounts are treated differently.

AdSense policies can be confusing, but based on additional information provided by Google’s AI, YouTube copyright bans are most likely to result in AdSense terminations too.

This is the first time I've seen an AI cited as a source for an article.

[–] Kissaki@lemmy.dbzer0.com 2 points 6 days ago (1 children)

I don't see any free leech information in their announcement forums, and the news page is empty (looks broken).

[–] Kissaki@lemmy.dbzer0.com 3 points 6 days ago (1 children)

MusicBee has Tools -> Manage Duplicates


[–] Kissaki@lemmy.dbzer0.com 2 points 2 weeks ago

For reference, the source file is background.js

URLs at the top, init calls at the bottom, and above that the event registering stuff (tab nav and nav).

[–] Kissaki@lemmy.dbzer0.com 13 points 2 weeks ago* (last edited 2 weeks ago)

Notably, 5.0.1 was released three days ago. So a fix is available.

The first patched release is version 5.0.1, released 2 days ago.

[–] Kissaki@lemmy.dbzer0.com 9 points 3 weeks ago

With Ollama you can install and use various free AI models.

[–] Kissaki@lemmy.dbzer0.com 4 points 3 weeks ago

What do you mean by "Grammarly costs a lot of money"? It has a free tier, which is quite generous.

[–] Kissaki@lemmy.dbzer0.com 1 points 1 month ago

Seems strange that the dev seems to be keeping quiet on this, no?

Which one? The repo owner certainly doesn't seem very active in general.
