pe1uca

joined 1 year ago
[–] pe1uca@lemmy.pe1uca.dev 9 points 4 months ago (4 children)

I've used it to summarize long articles, news posts, or videos when the title/thumbnail looks interesting but I'm not sure it's worth the 10+ minutes to read/watch.
There are other solutions, like dedicated summarizers, but I've looked into them and they only extract exact quotes from the original text; an LLM can also paraphrase, which makes the summary a bit more informative IMO.
(For example, one article included a quote from an expert talking about a company. The summarizer only extracted the quote, and the flow of the summary made me believe the company itself had said it, but the LLM properly attributed the quote to the expert.)

This project https://github.com/goniszewski/grimoire has on its roadmap a way to connect to an AI to summarize the bookmarks you make and generate 3 tags.
I've looked at the code, but I don't remember the exact status of the integration.


Also, I have a few models dedicated to coding, so I've also asked for a few pieces of code and configuration just to get started on a project, nothing too complicated.

[–] pe1uca@lemmy.pe1uca.dev 1 points 4 months ago

Ah, that makes sense!
Yes, a DB would let you build this. But the catch is in the word "build": you need to think about what's needed, in which format, how to set up all the relationships to keep the data consistent yet flexible, etc.
For example, you might implement the tags as a text field, but then you still have the same issues with adding, removing, and reordering them. One fix could be a many-tags-to-one-task table. Then you have the problem of mistyping a tag: you might add TODO having forgotten you already had todo, which isn't an issue if the field is case insensitive, but what about to-do?
So there's still a lot you might overlook, and it will come up and sidetrack you from creating and doing your tasks, even if you abstract all of this into a script.
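To make that concrete, here's a minimal sqlite sketch of the many-tags-to-one-task idea (all table and column names are made up for illustration). A NOCASE collation on the tag name collapses TODO and todo into one tag, but as noted above, to-do still slips through:

```shell
# Illustrative schema only; names are invented for this sketch.
db=$(mktemp)
sqlite3 "$db" <<'SQL'
CREATE TABLE tasks (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- NOCASE makes 'TODO' and 'todo' the same tag; 'to-do' is still a different string
CREATE TABLE tags  (id INTEGER PRIMARY KEY, name TEXT NOT NULL COLLATE NOCASE UNIQUE);
CREATE TABLE task_tags (
  task_id INTEGER REFERENCES tasks(id),
  tag_id  INTEGER REFERENCES tags(id),
  PRIMARY KEY (task_id, tag_id)
);
INSERT OR IGNORE INTO tags (name) VALUES ('TODO'), ('todo'), ('to-do');
SQL
tag_count=$(sqlite3 "$db" "SELECT COUNT(*) FROM tags;")
echo "$tag_count"   # 2: 'TODO'/'todo' collapse into one row, 'to-do' stays separate
rm -f "$db"
```

So even with the "fixed" design you're already maintaining collation rules and a join table just for tags.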

Specifically for todo lists I selfhost https://vikunja.io/
It has an OpenAPI spec (OAS), so you can easily generate a client library in any language and build a CLI on top.
Each task has a lot of attributes, including the ones you want: relations between tasks, labels, due dates, assignees.

Maybe you can have a project for your book list, but it might be overkill.

For links and articles to read I'd say a simple bookmark software could be enough, even the ones in your browser.
If you want to go a bit beyond that I'm using https://github.com/goniszewski/grimoire
I like it because it has nested categories plus tags, most other bookmark projects only have simple categories or only tags.
It also has a basic API, which is enough for most use cases.
Another option could be an RSS reader if you want to get all articles from a site. I'm using https://github.com/FreshRSS/FreshRSS which has the option to retrieve data from sites using XPath in case they don't offer RSS.


If you still want to go the DB route, then as others have mentioned, since it'll be local and single user, sqlite is the best option.
I'd still encourage you to use an existing project, and if it's open source you can easily contribute the code you would have written anyway, improving it for the next person with your exact needs.

(Just paid attention to your username :P
I also love matcha, not an addict tho haha)

[–] pe1uca@lemmy.pe1uca.dev 10 points 4 months ago (2 children)

I can't imagine this flow working with any DB without a UI to manage it.
How are you going to store all of that in a way that's easy yet flexible enough to handle everything with SQL?

A table for notes?
What fields would it have? Probably just a text field.
Creating it is simple: insert "initial note"... But how are you going to update it? A simple UPDATE by ID won't work, since you'd replace all the content; you'd need to query the note, copy it into a text editor, and then copy it back into a query (don't forget to escape it).
Then probably you want to know which is your oldest note, so you need to include created_at and updated_at fields.
Maybe a title per note is a nice addition, so a new field to add title.

What about the todo lists? Will they be stored in the same notes table?
If so, then the same problem, how are you going to update them? Include new items, mark items as done, remove them, reorder them.
Maybe a dedicated table, well, two tables: list metadata and list items.
The metadata table has almost the same fields as notes, but a description instead of the text. The list items will have a status and text.
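As a rough sqlite sketch of everything described above (all names are illustrative, not a recommendation):

```shell
# Illustrative only: the notes table plus the two todo-list tables described above.
db=$(mktemp)
sqlite3 "$db" <<'SQL'
CREATE TABLE notes (
  id         INTEGER PRIMARY KEY,
  title      TEXT,
  body       TEXT NOT NULL,
  created_at TEXT DEFAULT (datetime('now')),
  updated_at TEXT DEFAULT (datetime('now'))
);
CREATE TABLE lists (             -- list metadata
  id          INTEGER PRIMARY KEY,
  title       TEXT,
  description TEXT,
  created_at  TEXT DEFAULT (datetime('now')),
  updated_at  TEXT DEFAULT (datetime('now'))
);
CREATE TABLE list_items (        -- the items themselves
  id       INTEGER PRIMARY KEY,
  list_id  INTEGER NOT NULL REFERENCES lists(id),
  position INTEGER,              -- reordering is still on you to manage
  status   TEXT NOT NULL DEFAULT 'open',
  text     TEXT NOT NULL
);
SQL
table_count=$(sqlite3 "$db" "SELECT COUNT(*) FROM sqlite_master WHERE type='table';")
echo "$table_count"   # 3 tables already, before tags or books enter the picture
rm -f "$db"
```

And that's the schema alone, before writing any of the insert/update/reorder plumbing around it.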

Maybe you can reuse the todo tables for your book list and links/articles to read.

> so that I can script its commands to create simpler abstractions, rather than writing out the full queries every time.

This already exists: several note-taking apps wrap either the filesystem or a DB, so you only have to worry about writing your ideas into them.
I'd suggest not reinventing the wheel unless nothing out there satisfies you.

What are the pros of using a DB directly for your use case?
What are the cons of using a note taking app which will provide a text editor?

If you really really want to use a DB maybe look into https://github.com/zadam/trilium
It uses sqlite to store the notes, so you can check the code and get an idea of whether replicating all of that manually would be complicated for you.
If not, I'd also recommend Obsidian: it stores the notes in md files, so you can open them with any software you want and they'll have a standard syntax.

[–] pe1uca@lemmy.pe1uca.dev 1 points 4 months ago

I was juggling like that: I had most of my files on NTFS so I could read them from Windows, even files only ever read by Linux programs.
Most programs were able to read from any part of the file system, but for those with strict paths I used symlinks.

But I haven't had any use for Windows lately, so I decided to delete all but one NTFS partition, and that last one is only 256GB with 100GB free.
The rest of the data I moved to ext4 and btrfs partitions.

[–] pe1uca@lemmy.pe1uca.dev 2 points 5 months ago

In that case I'd recommend you use immich-go to upload them, and still back up only immich instead of your original folder, since if something happens to your immich library you'd have to manually recreate it; immich doesn't update its DB from the file system.
There was a discussion on GitHub about worries that immich compresses the data, but it was clarified that uploaded files are saved as-is and only copies are modified, so you can safely back up its library.

I'm not familiar with RAID, but yeah, I've also read it's mostly about uptime.

I'd also recommend you look at restic and Duplicati.
Both are backup tools; restic is a CLI and Duplicati is a service with a UI.
So if you want to create the crons yourself, go for restic.
Though if you want to be able to read your backups manually, check how the data is stored: I'm using Duplicati and it saves backups in files that have to be read back through Duplicati. I'm not sure I could just go and open them, unlike data copied with rsync.

[–] pe1uca@lemmy.pe1uca.dev 2 points 5 months ago (1 children)

Unless they've changed how it works, I can confirm.
Some months ago I was testing Lemmy locally and used the same URL to create a new post; it never showed up in the UI, because Lemmy treated it as a crosspost and hid it under the older one.
At that time it only counted as a crosspost if the URL was the same; I'm not so sure about the title, but the body could be different.

The thing would be to verify whether this grouping is done by the UI or by the server, which might explain why some UIs show duplicated posts.

[–] pe1uca@lemmy.pe1uca.dev 6 points 5 months ago* (last edited 5 months ago) (2 children)

For local backups I use this command

$ rsync --update -ahr --no-i-r --info=progress2 /source /dest

You could first compress them, but since I have the space for the important stuff, this is the only command I need.

Recently I also made a migration similar to yours.

I've read jellyfin is hard to migrate, so I just reinstalled it and manually recreated the libraries; I didn't care about the watch history and other stuff.
IIRC there's a post or GitHub repo with a script that tries to migrate jellyfin.

For immich you just have to copy its database files with the same command above and that's it (of course with the stack down, you don't want to copy DB files while the database is running).
For the library I already had it on an external drive with a symlink, so I just had to mount the drive on the new machine and create a similar symlink.
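That last step is just recreating the link; a sketch with made-up stand-in paths (temp dirs here, real mount points in practice):

```shell
# Made-up paths: $drive stands in for the external drive mount,
# $appdir for wherever immich expects its library.
drive=$(mktemp -d)
appdir=$(mktemp -d)
mkdir -p "$drive/photos"
ln -s "$drive/photos" "$appdir/library"
target=$(readlink "$appdir/library")   # points at the drive, wherever it's mounted
echo "$target"
rm -rf "$drive" "$appdir"
```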

I don't run any *arr so I don't know how they'd be handled.
But I did migrate syncthing and duplicati.
For syncthing I just had to find the config path and I copied it with the same command above.
(You might need to run chown in the new machine).

For duplicati it was easier since it provides a way to export and import the configurations.

So depending on how the *arr programs handle their files, it can be as easy as finding their root directory and rsyncing it.
Maybe this could also be done for jellyfin.
Of course be sure to look for all config folders they need, some programs might split them into their working directory, into ~/.config, or ./.local, or /etc, or any other custom path.

EDIT: for jellyfin data, evaluate how hard it would be to find again. It might be difficult, but if it's re-findable it doesn't need the same level of backups as your immich data, because immich normally holds data you created that can't be found anywhere else.

Most series I keep only on the main jellyfin drive.
But immich is backed up 3-2-1: 3 copies of the data (I actually have 4), on at least 2 types of media (HDD and SSD), with 1 offsite (rclone-encrypted into an e2 drive).

[–] pe1uca@lemmy.pe1uca.dev 1 points 5 months ago

Just tried it and seems too complicated haha. With traccar I just had to deploy a single service and use either the official app or previously gpslogger sending the data to an endpoint.

With owntracks the main documentation assumes you deploy it onto the base system; docker is kind of hidden.
And with docker you need to deploy at least 3 services: the recorder, Mosquitto, and the frontend.
The app doesn't tell you what's expected in the fields for connecting to the backend. I tried with https but haven't been able to make it work.

To be fair, this has been just today. But as long as a service has a docker compose I've always been able to deploy it in less than 10 minutes, and the rest of the day is just customizing the service.

[–] pe1uca@lemmy.pe1uca.dev 6 points 5 months ago (2 children)

It looks amazing!

How well fitted would this be for a Google maps timeline replacement?

I see you mention we need to upload the files which maybe could be obtained from an app like https://github.com/mendhak/gpslogger
I already had a flow to have them on my server with syncthing, so I could easily use your api to process them.

The thing would be to have each trail marked per day and a way of showing them nicely (I haven't tested everything in the demo hehe).

Is there a plan to be able to process any GPS standard to automatically generate the trails?

I'm currently using traccar, but it feels more like fleet management than something to remember where you've been.

[–] pe1uca@lemmy.pe1uca.dev 7 points 5 months ago

I can share a bit of my journey and setup so you can make a better-informed decision.

About point 1:

In Vultr, with the second-smallest shared CPU (1 vCPU, 2GB RAM), several of my services have been running fine for years now:
invidious, squid proxy, TODO app (vikunja), bookmarks (grimoire), key-value storage (kinto), git forge (forgejo) with CI/CD (forgejo actions), freshrss, archival (archive-box), GPS tracker (traccar), notes (trilium), authentication (authelia), monitoring (munin).
The thing is, since I'm the only one using them, usually only one or two services receive considerable usage at a time, and I'm patient, so if something takes 1 minute instead of 10 seconds I'm fine with it. That's rare anyway; maybe only forgejo actions or the archival hit it.

In my main pc I was hosting some stuff too: immich, jellyfin, syncthing, and duplicati.

Just recently bought this minipc https://aoostar.com/products/aoostar-r7-2-bay-nas-amd-ryzen-7-5700u-mini-pc8c-16t-up-to-4-3ghz-with-w11-pro-ddr4-16gb-ram-512gb-nvme-ssd
(Although I bought it from Amazon so I didn't have to handle the import.)

Haven't moved anything off of the VPS yet, but given the VPS's specs I think this minipc will be enough for a lot of the stuff I have there.
What I have moved are the services from my main PC.
Transcoding for jellyfin is not an issue since I already preprocessed my library to the formats my devices accept, so only immich could cause issues when uploading my photos.

Right now the VPS sits at around 0.3 CPU, 1.1/1.92GB RAM, 2.26/4.8GB swap.
The minipc is at around 2.0 CPU (most likely because duplicati is running right now), 3/16GB RAM, no swap.

There are several options for minipc even with potential to upgrade ram and storage like the one I bought.
Here's a spreadsheet I found with very good data on different options so you can easily compare them and find something that matches your needs https://docs.google.com/spreadsheets/d/1SWqLJ6tGmYHzqGaa4RZs54iw7C1uLcTU_rLTRHTOzaA/edit
(Here's the original post where I found it https://www.reddit.com/r/MiniPCs/comments/1afzkt5/2024_general_mini_pc_guide_usa/ )

For storage I don't have any comments, since I'm still using a 512GB NVMe and a 1TB external HDD; the minipc is basically my starter setup for a NAS, which I plan to fill with drives when I find any on sale (I even bought it without RAM and storage since I had spares).

But I do have some huge files around; they're in https://www.idrive.com/s3-storage-e2/
Using rclone I can easily have it mounted like any other drive, and there's no need to worry about the data being in the cloud since rclone has an encrypt option.
Of course this is a temporary solution, since buying a drive is cheaper in the long term (I also use it for my backups, though).

About point 2:

If you go the Linux-only route, sshfs is very easy to use: I can connect from the file manager or mount it via fstab. And for permissions you can easily manage everything with a new user and ACLs.
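For reference, a hedged sketch of both options; host, user, and paths here are made up, and this obviously needs a reachable SSH server, so treat it as a config fragment rather than something to paste blindly:

```shell
# One-off mount (made-up host and paths):
sshfs user@server:/srv/share /mnt/share -o reconnect,idmap=user

# Or persist it in /etc/fstab so systemd mounts it on first access:
# user@server:/srv/share  /mnt/share  fuse.sshfs  noauto,x-systemd.automount,reconnect,idmap=user  0 0
```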

If you need to access it from Windows, I think your best bet will be Samba. There are several services for this; I was using OpenMediaVault since it was the only one compatible with ARM back when I used a Raspberry Pi, but on install it takes over all your network interfaces and disables wifi, so you have to connect via ethernet to re-enable it.

About point 3:

On the VPS I also had pihole and searxng, but I had to move those to a separate instance, since whenever something ate up the resources, browsing the internet was a pain hehe.

Probably my most critical services will remain on the VPS (pihole, searxng, authelia, squid proxy, GPS tracker), since there I don't have to worry about my power or internet going down, or about my minipc being so overloaded with tasks that browsing the internet slows to a crawl (especially since I also run stuff like whisper.cpp and llama.cpp, which basically make the CPU unusable for a bit :P).

About point 4:

To access everything I use tailscale and I was able to close all my ports while still being able to easily access everything in my main or mini pc without changing anything in my router.

If you need to give someone access, I'd advise sharing your pihole node and the machine running the service.
In their account, split DNS can be set up so that only your domains are resolved by your pihole; everything else stays with their own DNS.

If this isn't possible and you need your service open on the internet, I'd suggest a VPS with a reverse proxy running tailscale, so it can communicate with your service when it receives requests while still not opening your LAN to the internet.
Another option is tailscale funnel, but I think you're bound to the domain they give you; I haven't tried it, so you'd need to confirm.

[–] pe1uca@lemmy.pe1uca.dev 16 points 5 months ago

I use https://lemmyverse.net/
You can search all communities across instances, or click on a specific instance.

https://lemmyverse.net/instance/programming.dev/communities

[–] pe1uca@lemmy.pe1uca.dev 1 points 5 months ago

A note-taking app can be turned into a diary app if you just create a note for each day.
Even better if you later want to expand a section of a diary entry without actually modifying it or jumping between apps.

Obsidian can easily help you tag and link each note and theme/topic in each of them.
There are several plugins for creating daily notes which will be your diary entries.
Also it's local-only; you can pair it with any sync service: the one Obsidian provides, git, any cloud storage, or ones that work directly with the files, like syncthing.

Just curious, what are the special features you expect from a diary service/app which a note taking one doesn't have?
