lautan

joined 1 year ago
 

Sustainable open source will stay a dream

 

Self-hosted pastebin powered by Git, open-source alternative to Github Gist. - thomiceli/opengist

 

At a recent all-hands meeting, Google search head Prabhakar Raghavan told employees that the world is changing and they have to adjust.

 

A new report has shown that Amazon's "Just Walk Out" AI checkout process is actually processed by 1,000 staff in India. Tech companies are under pressure to d...

[–] lautan@lemmy.ca 2 points 11 months ago

I’m reaching out to get their thoughts. But there are limitations to what they can index.

[–] lautan@lemmy.ca 8 points 11 months ago

Good point. They should know they are making public comments. If you want it private then send a private message.

[–] lautan@lemmy.ca 1 points 11 months ago

I heard it will send out requests as fast as possible, essentially creating a DoS attack, but I could be wrong. Also, the UI could use some improvement.

[–] lautan@lemmy.ca 2 points 11 months ago

It will be very little if it isn't downloading full HTML pages.

[–] lautan@lemmy.ca 5 points 11 months ago

It's limited to only Peertube and it's not the most intuitive. I want to work with them on expanding this.

[–] lautan@lemmy.ca 4 points 11 months ago

Yep, the idea is to simulate the type of results you get from Google. People trust Lemmy answers more than spam sites nowadays.

[–] lautan@lemmy.ca 4 points 11 months ago (2 children)

I mean, they are posting on the public internet, so they should know that it can be read by anyone. I like the idea of users opting out.

[–] lautan@lemmy.ca 3 points 11 months ago

The fediverse is a few thousand servers running Mastodon, Lemmy, etc. I can't say how many posts there are, but it's a lot.

So on the more technical side, I plan on using a lightweight, fast search engine called Sonic (it's written in Rust). I have already used it in other projects and it can handle billions of messages / posts. But it has a cost: it doesn't have faceted search, for example if you want to exclude certain texts from the results. I think this is a fair trade-off. The other solution would be to use something more mature like Elasticsearch, but it'll be expensive (I'm assuming not much money will be made from this, and I'm talking about donations).
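To illustrate the trade-off: if the search backend can't exclude terms itself, the hits could be filtered in application code afterwards. This is only a rough sketch, not the actual implementation. The function name and the shape of the results (dicts with `id` and `text`) are made up for the example and are not part of Sonic's API.

```python
def exclude_terms(results, excluded):
    """Drop results whose text contains any excluded term (case-insensitive)."""
    lowered = [term.lower() for term in excluded]
    return [
        r for r in results
        if not any(term in r["text"].lower() for term in lowered)
    ]


# Hypothetical hits, as if already fetched from the search backend.
results = [
    {"id": "a", "text": "PeerTube install guide"},
    {"id": "b", "text": "Crypto giveaway spam"},
]
filtered = exclude_terms(results, ["spam"])
```

The downside is that filtering after the fact can leave fewer results per page than requested, so the caller may need to fetch extra hits.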

For scanning sites, there are premade lists to start with, and it'll be possible to scan new sites discovered through other instances. So a bit of both.
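The "bit of both" idea could look roughly like the sketch below: start from a seed list, then queue any instance that a scanned instance points to. It makes no network requests. The host names and the `linked_instances` mapping are hypothetical stand-ins for whatever a real scanner would discover.

```python
from collections import deque


def crawl_order(seeds, linked_instances, limit=100):
    """Breadth-first walk: seeds first, then newly discovered instances."""
    unique_seeds = list(dict.fromkeys(seeds))  # de-duplicate, keep order
    queue = deque(unique_seeds)
    seen = set(unique_seeds)
    order = []
    while queue and len(order) < limit:
        host = queue.popleft()
        order.append(host)
        # Queue every instance this host links to that we haven't seen yet.
        for peer in linked_instances.get(host, []):
            if peer not in seen:
                seen.add(peer)
                queue.append(peer)
    return order


# Hypothetical data: which instances each host links to.
linked = {
    "video.example": ["tube.example", "clips.example"],
    "tube.example": ["video.example", "films.example"],
}
order = crawl_order(["video.example"], linked)
```

The `limit` parameter caps how many instances get scanned in one run, which keeps a first pass small.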

[–] lautan@lemmy.ca 2 points 11 months ago (2 children)

I heard it’s not optimized well but I’ll take a look at it.

[–] lautan@lemmy.ca 13 points 11 months ago (2 children)

Well that’s why I’m asking for input. And I won’t launch this on every instance without letting them know. Baby steps.

[–] lautan@lemmy.ca 14 points 11 months ago (2 children)

Yeah that would be the case.

 

Hey everyone,

This isn't an announcement, just wanted people's thoughts on this.

I think everyone knows searching the fediverse could be better. Googling doesn't work too well, etc. So I wanted to do my part and help out.

Indexing all posts, etc. is quite a lot to handle, so I wanted to start small and just focus on video search. I've started indexing videos from Peertube and other video websites (even YouTube, but this could be removed to just focus on independent sites).

I know Peertube has their own search engine for videos. I will be reaching out to them. Compared to theirs, I'm planning for my site to have other video sources and be easier to use.

So that leads to feedback from you guys.

  • What do you think about indexing videos posted on the fediverse and other independent platforms?
  • Are there similar services?
  • Am I just wasting my time?
 

cheap ≠ free

Making nice things is difficult and time-consuming.

If we want people to make nice things for us, we have to pay for their rent and grocery bills and raw materials.

If you are spending less than $1 per hour on your entertainment (podcasts, videos, articles, games, books, etc.), consider finding ways to support creators and the infrastructure that supports them.
