
sudneo

@sudneo@lemmy.world

sudneo ,

I went to look for the video and somehow it was worse than I had imagined.

sudneo ,

I personally really like the gazillion bangs available, the personal up/downranking/blocking of websites, and their quick answer, which is often fairly good (I mostly use it for documentation lookup). The lenses are definitely the best feature though, especially coupled with bangs.
I even converted my wife, who really loves it.

sudneo ,

Thanks (grazie?)! I was looking for something similar, and kanidm looks great feature-wise and simple to deploy!

sudneo ,

I struggled with this for a long time, and then I just decided to use synology photos.

It has albums, tagging, geolocation, sharing. It has phone picture backup, it is inherently a backup as it's on my NAS and I back that data up again.

I want to keep the thing that I really care about the most friction free and also not too dependent on myself so that I can still experiment.

I didn't try PiGallery2 though, maybe I will have a look!

sudneo ,

Super cool project, thanks for sharing! I think I will try to integrate it with my static sites.

sudneo ,

Cgroups have the ability to limit TCP and total network bandwidth. I don't know off the top of my head whether this can be configured at runtime (i.e. via docker run), but you can specify the cgroup parent to use at runtime. This means you can pre-create the cgroup, set the limits and start the container with that parent cgroup.

You can also run some hook script after launch that adds the PID to a cgroup every time the container is launched, or possibly use tc.

I am not aware of the ability to only limit uplink bandwidth, but I have not researched this.

sudneo , (edited )

Yeah, ultimately every container has its own veth interface, so you can do shaping using tc on those.

Edit:
I had a look at docker-tc. It does what you want, BUT. Unless your use case is complex, I would really think twice about running a tool written in bash which has access to the docker socket (i.e. trivial node escape) and runs with NET_ADMIN capability.

That's a lot of power to do something you can also do with a few lines of code executed after you start the container. Again, provided that your use case is not complex.
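
As a rough illustration of the "few lines of code" idea, the sketch below finds the host-side veth of a container and prints a tc command for it. The container name web, the interface eth0, the 10mbit rate and the helper function names are all assumptions made for the example. Note that a root qdisc on the host-side veth shapes traffic flowing toward the container, so limiting what the container sends would need a different hook.

```shell
#!/bin/sh
# Sketch only: helper names, container name and rates are hypothetical.

# Read `ip -o link show eth0` output on stdin (e.g. "6: eth0@if7: <...>")
# and print the peer interface index (7 in that example).
peer_index() {
    sed -n 's/^[0-9]*: [^@]*@if\([0-9][0-9]*\):.*/\1/p'
}

# Print the name of the interface whose ifindex equals $1.
# $2 is the sysfs net directory (default /sys/class/net); it is a parameter
# only so the lookup can be tried against a fake directory tree.
find_host_veth() {
    want_index=$1
    net_dir=${2:-/sys/class/net}
    for dev in "$net_dir"/*; do
        [ -r "$dev/ifindex" ] || continue
        if [ "$(cat "$dev/ifindex")" = "$want_index" ]; then
            basename "$dev"
            return 0
        fi
    done
    return 1
}

# Print (do not run) a tc command that attaches a token bucket filter
# to device $1 with rate $2.
tc_limit_cmd() {
    printf 'tc qdisc replace dev %s root tbf rate %s burst 32kbit latency 400ms\n' "$1" "$2"
}

# Usage on a real host, as root (not run here):
#   pid=$(docker inspect -f '{{.State.Pid}}' web)
#   idx=$(nsenter -t "$pid" -n ip -o link show eth0 | peer_index)
#   veth=$(find_host_veth "$idx")
#   tc_limit_cmd "$veth" 10mbit | sh
```

Printing the tc command instead of running it makes it easy to review what would be applied before piping it to a shell.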

sudneo ,

Or rustic! It is compatible with restic but has some nice additions, for example the fact that it supports config files. It makes operations a bit easier IMHO (I am currently using both).

sudneo ,

I would say Docker. There is no substantial benefit in running podman, while docker is a widely adopted tool (which means more tooling in the ecosystem, easier to find answers to questions etc.). The difference is not huge tbh, and some time ago the biggest advantage for podman was being able to run rootless, while docker was stuck with a root daemon. This is not the case anymore (docker can run rootless), so I would say unless you have some specific argument to use podman, stick with docker.

sudneo ,

Docker can run rootless too, see https://docs.docker.com/engine/security/rootless/

sudneo ,

You have a bunch of options:

kubectl run $NAME --image=$IMAGE

this just creates a pod running the specific image. If you kill the pod, or it terminates, it won't be run again.
In general though, you probably want to do some customization before running (maybe you need volumes, secrets, env, ports, labels, securityContext, etc.) and for that you can let kubectl generate the boilerplate YAML and then make some edits:

kubectl run $NAME --image=$IMAGE --dry-run=client -o yaml > mypod.yaml
# edit mypod.yaml
kubectl create -f mypod.yaml

You can do the same with a deployment or statefulset:

kubectl create deployment $NAME -n $NAMESPACE [...] --dry-run=client -o yaml > deployment.yaml

In case you don't need anything fancy, the kubectl create subcommand allows you to create simple workloads, so that's probably the answer to your question.

sudneo ,

Because the lxc way is inherently different from the docker/podman way. It's aimed at running full systems, rather than single-process containers. It has its use cases, but they are not as common IMHO.

sudneo ,

I think k8s is a different beast, that requires way more domain specific knowledge besides server/Linux basic administration.
I do run it, but it's an evolution of a need, specifically when you want to manage a fleet of machines running containers.

sudneo ,

I really thought swarm was dead :)

To be honest, some kubernetes distributions make the cluster operations minimal (I use k0s managed via ansible)!

Either way, the moment you go from N containers on one box to N containers on M boxes you need to start considering how to handle stateful applications, load balancing, etc. And that in general requires knowledge on a domain which is different from having simply applications wrapped in containers locally.

sudneo ,

Did it sound cold? Because I didn't mean that, I just meant to actually answer the question from my PoV. Just for the record, I also did not down vote you.

So yeah, use whatever footgun you prefer, I don't judge :)

sudneo ,

One thing I have never understood and keep repeating in this context: Beehaw has a >7k$ balance. If they really have a few issues that would solve 90% of the problems, why not put a 500/1000/2000$ bounty on those features?

sudneo ,

I am curious about the details of that conversation, because I remember reading Dev's comments in some post on Lemmy where they mentioned this option.

sudneo ,

What vendor lock-in are you talking about?

I can take my domain, customize DNS records and in a couple of minutes I am using a new provider. They also allow exporting email content, which means I obviously don't lose anything.

With a free email account, you are anyway locked-in as with every provider, because you are using their domain. You can set automatic forwarding in that case.

Vendor lock exists when you invest a substantial amount of work to build tools around a specific platform (say, AWS), or where you have no way to easily take the data from one platform out and use something else to do the same thing (say, Meta).

The fact that you can't use SMTP, which is a protocol that requires data on the server, is not a vendor lock-in in any sense of the word. It's a decision that depends on having that content e2e encrypted, because the two things are simply incompatible.

Also the code for all Proton clients and the bridge is open source, and the bridge is essentially a client that emulates being a server so that you can use your preferred tools to access the emails.
Even in this scenario, there is no vendor lock and all it takes is changing the configuration of your tool from the local bridge address to whatever SMTP server you want to use elsewhere.

Can you please describe in which way you are actually locked-in, to show that you have a clue about what the word means?

sudneo ,

FYI, that was one of the things I disliked the most about Proton (email client slow). They finally released the newly rewritten app a few weeks ago, and it's working great.

sudneo , (edited )

There are lots of ways to do e2e encryption on e-mail over SMTP (OpenPGP, S/MIME etc.)

Yes, and that requires using a client. The JS code of the webclient and the bridge are clients for PGP.

SMTP itself also supports TLS for secure server-to-server communications (or server-to-client in submission contexts) as well as header minimization options to prevent metadata leakage.

TLS is completely pointless in this conversation. TLS is a point-to-point protocol, and it's not e2e when the "ends" are defined as the message sender and recipient (i.e., their client applications). It only protects the transport from your client to the server; then the server terminates the connection and has access to the plaintext data. Proton also uses TLS, but again, it has no use whatsoever for e2ee.

And Proton decided NOT to use any of those proven solutions and go for some obscure propriety thing instead because it fits their business better and makes development faster.

They didn't do anything obscure, they have opensource clients that do PGP encryption similar to how your web client would do. Doing encryption on the client is the only way to ensure the server can't have access to the content of the emails. It just happens that the client is called "proton bridge" or "proton web" instead of OpenPGP.

only exists until they allow it to exist.

It's their official product, and anyway it's not a blocker for anything. They stop giving you the bridge? You move in less than 1h to another provider.

You don’t know if there are rate limits on the bridge usage and other small details that may restrict your ability to move large amounts of email around.

Do you know that there are, or are we arguing on hypotheticals?

Decent providers will give you an export option that will export all your e-mail using industry standard formats such as mbox and maildir.

True. You can still get the data out, whether or not they do it in a "best practice" way. It's not vendor lock.

Proton mail is so closed that you can’t even sync your Proton mail contacts / calendars with iOS or Android - you can only use their closed source mail client to access that data or the webui.

https://github.com/ProtonMail. All the mail clients are opensource.

Also, WebDAV, CardDAV, CalDAV do not support e2ee. You need once again a client that extends it, which is what Proton also does!

So the question is very simple: do you prefer e2ee or you prefer native plain caldav/webdav/carddav? If the answer for you is the latter, Proton is simply a product that is not for you. If you prefer the former, then Proton does it.
Either way, again, this is not vendor lock. They allow you to export contacts and calendar in a standard format, and you can move to a new provider.

Proton doesn’t respect the open internet by not basing their services on those protocols and then they feed miss-information (like the thing about e2e encryption being impossible on SMTP) and by using it you’re just contributing to a less open Internet.

SMTP does not allow e2ee by definition. I am not sure whether you don't understand SMTP or how e2ee works, but SMTP is a protocol based on the server having access to the content. The only way you can do e2ee is using a client that encrypts the content, like PGP (which is what Proton uses), before sending it to the server. This is exactly what happens with Proton: the webclients use SMTP to talk to the Proton server, but before that they do client-side encryption (using PGP), exactly like you would do with any other client (see https://github.com/search?q=repo%3AProtonMail%2FWebClients%20smtp&type=code).
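
To make the client-side encryption point concrete, here is a minimal sketch using generic tools rather than Proton's own code. It assumes gpg and curl are installed and that the recipient's public key is already in the local keyring; the addresses, the host name and the build_message helper are hypothetical. The body is encrypted before it leaves the client, while the headers the server needs for delivery stay readable.

```shell
#!/bin/sh
# Sketch only: addresses, host and helper name are made up for illustration.

# Wrap whatever arrives on stdin (intended: ASCII-armored ciphertext)
# in a minimal mail message. $1 = sender, $2 = recipient, $3 = subject.
# The headers stay in the clear, because the server needs them to deliver.
build_message() {
    printf 'From: %s\nTo: %s\nSubject: %s\n\n' "$1" "$2" "$3"
    cat
}

# Usage (not run here):
#   gpg --armor --encrypt --recipient bob@example.org < note.txt \
#     | build_message alice@example.org bob@example.org 'hello' > mail.txt
#   curl --ssl-reqd --crlf --url 'smtp://mail.example.org:587' \
#        --user 'alice@example.org' \
#        --mail-from 'alice@example.org' --mail-rcpt 'bob@example.org' \
#        --upload-file mail.txt
```

In this setup TLS (--ssl-reqd) only protects the hop to the server; what keeps the server from reading the body is the gpg step that runs before the upload.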


Now, you made a claim, which is that Proton vendor locks you:

  • Mail can be moved easily. Change DNS record (or set forwarding) and export previous email.
  • Calendar can be moved easily, export ics -> import in new calendar
  • Contacts can be moved easily, export vcf -> import in new contact

So your claim that you are vendor locked is simply false, deal with it.

You made some additional claims about Proton not using plain standard protocols. That's true. None of those protocols support e2ee, so they wrote clients that extend those protocols. All clients are opensourced, including the bridge. This has anyway nothing to do with being vendor locked, which in fact you completely did not explain. You talked about interoperability at most, which is not related to vendor lock.

You also made additional uninformed or false claims:

  • TLS being helpful for e2ee. It is not, in the context of email.
  • You failed to understand why using native Cal/Web/CardDAV is incompatible with e2ee.
  • You called it a "closed source mail client", when all the email clients are opensource.

sudneo ,

So tell me, why is it that Proton simply didn’t go for OpenPGP + SMTP + IMAP and simply build a nice web / desktop client for it and kept it compatible with all the other generic solutions out there? What’s the point?

How is this relevant? I don't know and I don't care why they picked this technical solution.

It isn’t and I hope I’m never proven right about Proton for your own sake.

It is, and you have been proven wrong. Either that, or you completely misuse or worse misunderstand what vendor lock is.

Yes you move if you can.

It's not if. You can.

It has all to do with vendor lock-in and it is already explained.

Yes, you explained interoperability that has nothing to do with vendor lock. They are two. different. things.

If a company uses shady stuff that restricts interoperability by definition it is a form of vendor lock-in as you won’t be able to move to another provider as quickly and fast as you might with others.

False. Again.
Interoperability is a property that has to do with using the application. Interoperable applications can still totally vendor lock you. Lemmy interoperates with Mastodon, but vendor locks you because you cannot export everything and port all your content away.
Your definition is wrong. Just admit you misused the term and move on, there is no need to double down.

And it is, because there’s specific metadata that might get leaked on email headers if not minimized by other techniques that will get protected with TLS between servers.

They use TLS. TLS is useful for transport security. Proton uses TLS.
TLS doesn't have anything to do with e2ee in the context of emails because TLS is always terminated by the server. Therefore it is by definition not an e2ee protocol in this context.
It is in the context of web, because there the two "ends" are your browser and the web server. It's not in the context of messaging where the other "end" is another client.

It isn’t perfect

This has nothing to do with perfection, you are simply misunderstanding fundamentally what e2ee is in this context.

it’s better than having email servers delivering PGP mail over plain text connections

And in fact Proton doesn't do that.

You should be ashamed of yourself for even suggesting that there’s no usefulness whatsoever for this use case.

I am not ashamed because I understand TLS, and I understand that it's useless in the context of email e2ee. You simply don't understand the topic but feel brave enough to evangelize on the internet about something you don't fully understand.

here’s how decent companies deal with that: they encrypt the data in transit (using TLS) and when stored on their servers by using key store and that is itself encrypted with your login password / hash / derivation or some similar technique.

JFC.
Proton uses TLS for transit connections.
E2EE means that the server does not have access to the data. If the server has the key, in whatever form, and can perform a decryption, it's not e2ee. The only way to have e2ee for these protocols is that the client(s), and only the clients, do the encryption/decryption operations. This is exactly what Proton clients do. They use DAV protocols, but they extend them by implementing encryption on the client side. Therefore, naturally, by design, they are not compatible with servers which instead expect unencrypted data and serve it to other clients protected only by TLS (which, again, is a transport protocol and has nothing to do with application data).

Ironically, when saying what "decent companies" do, you have described what Proton does: they use your client key to encrypt data on the client side. Then they transfer this data via a secure channel (TLS). The server has no keys and sees only encrypted data, and serves such data to other clients (Proton web, android etc.) that do the decryption/encryption operation back. Underneath, it's still CalDAV/WebDAV.

You clearly have bought into their propaganda but oh well think whatever you want, you’re the one using the service and it’s your data that will be hostage after all.

I don't need to buy propaganda, I am a security professional and do this stuff for a living. I also understand what vendor lock is because all the companies I ever worked with had forms of vendor lock, and I am aware of Proton features instead.

Maybe you should really stop, reflect and evaluate if you really have the competence to make certain claims on the internet. I understand nobody is there keeping score and there are no consequences, but you are honestly embarrassing yourself and spreading false information due to the clear lack of understanding about concepts such as e2ee, transport security, vendor locking, etc.

sudneo ,

Your just focusing on strict technical definitions and completely ignoring the context of things. I described before how TLS is useful in the context of some SMTP e2e encrypted solution.

Yes, mentioning things that have nothing to do with e2ee.
Anything that is encrypted with TLS is not e2ee in the context of emails. You talked about metadata, but the server has access to those because it terminates the connection, therefore, they are not e2ee. It's a protection against leakage between you and the server (and between server and other server, and between server and the destination of your email), not between you and the destination, hence, irrelevant in the context of e2ee. Metadata such as destination can obviously never be e2ee, otherwise the server wouldn't know where to send the email, and since it needs access to it, it's not e2ee, whether you use TLS or not.
TLS in this context doesn't contribute at all to end to end encryption.
Your definition is wrong, e2ee is a technical definition, is not an abstract thing: e2ee means that only the two ends of a conversation have access to the data encrypted. TLS is by definition between you and your mail server, hence it doesn't provide any benefit in the context of e2ee. It is useful, but for other reasons that have nothing to do with e2ee.

never questioned that. With SMTP you can do true e2ee using PGP and friends

Exactly, and this is what Proton does. You simply don't accept that Proton decided to write another client that is tightly coupled with their mail service, which is absolutely nothing malicious or vendor-locky, compared to using an already made client. Proton is simply PGP + SMTP.

with CalDAV/WebDAV you’ll need to have some kind of middle man doing the encryption and decryption - it’s a fair trade off for a specific problem.

Yes, and this middle-man is the Proton client, which sits on the client's side. I am glad you understood how the only way to have e2ee with *DAV automatically, technically prevents you from using "whatever server". If anybody else but the client does the encryption/decryption, you lose the end-to-end part. I am not saying e2ee in this context is absolutely necessary; you might not care, and value more the possibility to plug in other *DAV servers, good. Proton is not for you in that case.

but about SMTP where you can have true e2ee

Yes, you can, using a PGP client, like OpenPGP, Proton webmail, or Proton bridge. You need stuff on top of SMTP.

Your previous comment of “SMTP requires server to access the data” is simply wrong.

Nope, you are simply misinterpreting it. In SMTP the server requires access to the data because it's the one delivering it. PGP is built so that the data is ciphertext and not plaintext, so that the server can't see the actual content of the mail, but it needs to have the data and ship it, in contrast for example to a p2p protocol. PGP is however on top of SMTP and requires a client doing it for you. OpenPGP or Proton do exactly this. There is no way to support SMTP "natively" and offer e2ee. You would like Proton not to do e2ee and leave the responsibility for the PGP part to the client, with the freedom of picking whatever client you want? Well, that's exactly the opposite of their business model, since what they aim for is to make PGP de facto transparent to the users so that it's available even to people who are not advanced users.

Do you have any proof that they use CalDAV/CardDAV?

https://github.com/search?q=org%3AProtonMail+CalDAV&type=code you can dig yourself into the code if you are curious to understand.

In the same way they don’t do IMAP/SMTP.

I already sent you a GitHub search of their clients for SMTP, look for yourself in the code. Do you think it makes any sense at all for them to reinvent the wheel and come up with ad-hoc protocols when all they need is a client?
You can also have a look at the job offers they post: https://boards.eu.greenhouse.io/proton/jobs/4294852101
You can see SMTP mentioned and experience with Postfix in production. It's very likely that they are running that in the background.

Get over yourself and your purist approaches, when a company provides a service that is standardized in a specific set of protocols and they decide to go ahead and implement their own stuff it is, at least, a subtle, form of vendor lock-in. End of story.

No it's not. Vendor lock means:

In economics, vendor lock-in, also known as proprietary lock-in or customer lock-in, makes a customer dependent on a vendor for products, unable to use another vendor without substantial switching costs.

Proton uses open standards, and just builds clients that wrap them. This means, emails are in a format that can easily be imported elsewhere, same for Calendar and Contacts.
You are now watering down the definition of vendor lock to try to make your claim less wrong, but it is wrong. I repeat, and you are welcome to prove me wrong:

  • There is no substantial cost in any form to migrate from Proton to an equivalent service elsewhere:
    • You can migrate email simply, by changing domain DNS records and exporting/importing the data.
    • You can export your Calendar in a standard format
    • You can export your Contacts in a standard format

This means that I can change vendor easily without significant cost, hence I am not locked-in.

What you actually mean is that while using Proton you can't interoperate easily with other tools, and this is a by-design compromise to have e2ee done in the way they wanted to make it, which is available to the mainstream population. You disagree with their approach? Absolutely legitimate. You prefer to use OpenPGP, handle keys and everything yourself? Then for sure, Proton is not worth it for you, as you can choose the tools you want if that's important for you.
But there is no vendor-lock, they simply bundled together the email client with the PGP client, so that you don't have the full flexibility of separating the two.

You disagree with this definition of vendor lock? Awesome, give me your definition and link a source that uses that definition. Because if you keep moving the goalposts and redefining what vendor lock means, there is no point in discussing.

sudneo ,

Yes, an exploitative thing that mostly consists of free labour for big orgs.

sudneo ,

Technical measures are impossible in this particular case.
However, I would say that the complete lack of benefits or incentives makes it very unlikely.
Doing so could be illegal and collecting data which is otherwise useless is only a liability and a waste of resources.
Basically, I would say the admin's own self-interest is what's stopping them. That said, if someone is individually afraid due to a bad relationship with an admin, then personal motives could void the above, in which case they should probably change instance, or at least use a VPN.

sudneo ,

I personally package the files in a scratch or distroless image and use https://github.com/static-web-server/static-web-server, which is a Rust server, quite tiny. This is very similar to nginx or httpd, but the static nature of the binary removes clutter, reduces attack surface (because you can use smaller images) and reduces the size of the image.

sudneo ,

Containers are a perfectly suitable use-case for serving static sites. You get isolation and versioning at the absolutely negligible cost of duplicating a binary (the webserver, which in the case of the one I linked in my comment is 5MB of space). Also, you get autostart of the server if you use compose, which is equivalent to what you would do with a systemd unit, I suppose.

You can then use a reverse-proxy to simply route to the different containers.

sudneo ,

It really depends, if your setup is docker based (as OP's seems to be), adding something outside is not a good solution. I am talking for example about traefik or caddy with docker plugin.

By versioning I meant that when you do a push to master, you can have a release which produces a new image. This makes it IMHO simpler than having just git and local files.

I really don't see the added complexity. I do gain isolation (sure, static sites have tiny attack surfaces), easy portability (if I want to move machine it's one command), neat organization (essentially no local fs paths to manage), and the overhead is a 3-line Dockerfile and a couple of MB needed to duplicate a webserver binary.
Of course it is a matter of preference, but I don't see the cons honestly.
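
For reference, a sketch of what such a small Dockerfile could look like, written out from a shell helper. The base image name joseluisq/static-web-server:2 and the /public root are assumptions about how that project publishes its image, so check its documentation before relying on them; ./site, the mysite tag and the write_dockerfile helper are placeholders.

```shell
#!/bin/sh
# Sketch only: image name, paths and tag are assumptions, not verified config.

# Write a minimal Dockerfile into directory $1 that copies ./site
# (the generated static files) into the image's web root.
write_dockerfile() {
    cat > "$1/Dockerfile" <<'EOF'
FROM joseluisq/static-web-server:2
COPY ./site /public
EOF
}

# Usage (not run here; assumes the server listens on port 80 in the container):
#   write_dockerfile .
#   docker build -t mysite:latest .
#   docker run --rm -p 8080:80 mysite:latest
```

Building this in CI on every push to master is one way to get the "one image per release" versioning mentioned above.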

sudneo ,

I would consider the lack of a shell a benefit in this scenario. You really don't want the extra attack surface and tooling.

Considering you also manage the host, if you want to see what's going on inside the container (which, for such a simple image, is more likely done once while building it the first time), you can use nsenter to spawn a bash process in the container namespaces (e.g., nsenter -t PID -m -p [...] bash, or something like this - I am going by memory).

sudneo ,

If there is already another reverse proxy, doing this IMHO is worse than just running a container and adding one more rule in the proxy (if needed, with traefik it's not for example). I also build all my servers with IaC and a repeatable setup, so installing stuff manually breaks the model (I want to be able to migrate server with minimal manual action, as I had to do it already twice...).

The job is simple either way, I would say it mostly depends on which ecosystem someone is buying into and what secondary requirements one has.

sudneo ,

I guess a bunch of things, as they are specialized apps:

  • proper auth. I think with Firefox you can have a password, but a password manager will have multiple options for 2fa including security keys, and on phone fingerprint unlock etc. In general, password managers are more resistant to malicious users gaining access to your device.
  • store all kinds of stuff. Not everything happens in the browser, and it's just convenient to have an app just for credentials. Many password managers allow storing and autofilling credit cards too, for example.
  • on the fly generation of aliases. Password managers have external integrations. For example proton and bitwarden can integrate with simplelogin.io to generate email aliases when you choose to generate a new username.
  • org-like features. Password managers can also be convenient for sharing with family (for example). I manage a Bitwarden organization used by all my immediate family, which means I can share credentials easily with any of them. Besides the sharing, I can also ensure my (not tech-savvy) mom won't lock herself out (emergency breakglass access is configurable) and technically enforce policies on password strength etc.
  • as banal as it is, self-managing. I like to run my own services and running my own password manager with my own backups gives me peace of mind.
  • another perhaps obvious point. More compatibility? I can use my password manager on whatever device, whatever browser. For some, it might not change anything, but it's a convenient feature.

As a personal addition, I would say that I simply want the cornerstone of my online security to be a product from a company that is specialized in doing that. I have no idea how much effort goes into the password manager from Mozilla, for example.

sudneo ,

Yep, I know and it's very convenient. I discovered recently that bitwarden also has integration, but requires manually provisioning an API key. Not as convenient but quite nice as well.

sudneo ,

They had a security audit, they have a canary on their website, they have a privacy policy which is legally binding, and they have a business incentive.

If you strongly suspect that they do collect searches and associate them with accounts (something which they claim they don't do), you can make a report to the relevant data protection authority, which can then audit them.

As someone else also commented, you can use an alias email and pay in crypto if you really wish to not associate your account with your searches. Just be advised that between IP addresses and browser fingerprinting it might always be possible to associate your searches together (even if not to you as an individual with name and surname), and this is something that big CDNs like cloudflare or imperva also provide for you. So you still rely in most cases on what the company says and what their business model is to determine whether you trust them or not.

So far kagi has both a good policy (great policy actually) and a business model that doesn't suggest any interest for them to illegally collect data to sell them.

sudneo ,

https://kagi.com/privacy

Kagi only stores the information about the client that you explicitly provide by using your account, as laid out in our interface. This includes:

Your email to facilitate account access and support contact (ex: password reset)
Your account settings (ex: theme, search region, selected language)

And nothing else.

sudneo ,

And I am saying that there are tools to increase this trust.

I also want to stress that you really have no tools to verify. Open source code is useless, audits are also partially useless. I have done audits myself (as the tech contact for the audited party) and the reality is that they are extremely easy to game and anyway are just point-in-time snapshots. There is nothing preventing the company from deploying a change tomorrow that invalidates what was audited. The biggest tools we have are legal protection (I mean, most companies that collect all kinds of data disclose that they do nowadays) and economic incentive. Kagi seems to provide good reason to trust them from both these angles.

Obviously, if that's not enough for you, fair enough, but if you are considering a company to be intentionally malicious or deceptive, then even the guarantees you suggest do not guarantee anything, so at this point I really wonder if or how you trust anybody, starting from your ISP, your DNS provider, your browser etc.

sudneo ,

It means that they are open sourcing an increasing number of components? In the very same page they are linked: browser extensions, libraries they use for their AI features.

sudneo ,

I am not understanding something then.

The basics in this case are a legally binding document saying they don't do x and y.
Them doing x or y means that they would be doing something illegal, and they are being intentionally deceptive (because they say they don't do it).

So, the way I see it, the risk you are trying to mitigate is a company which actively tries to deceive you. I completely agree that this can happen, but I think this is quite rare and unfortunately a problem with everything, one that does not have a solution generally (or to be more specific, one that what you consider basics - open source code and an audit - do not mitigate).

Other than that, I consider a legally binding privacy policy a much stronger "basic" compared to open source code which is much harder to review and to keep track of changes.

Again, I get your point and whatever your threshold of trust is, that's up to you, but I disagree with the weight of what you consider "the basics" when it comes to privacy guarantees to build trust. And I believe that in your risk mapping your mitigations do not match properly with the threat actors.

sudneo ,

Sure, but if you are considering a malicious party in the kagi case, your steps don't help. What you propose can totally work if you are considering good faith parties.

In other words: assume you use searXNG. If you now want to consider a malicious party running an instance, what guarantees do you have? The source code is useless, as the instance owner could have modified it.
I don't see a privacy policy for example on https://searxng.site/searxng/ and I don't see any infrastructure audit that confirms they are running an unmodified version of the code, which - let's assume - has been verified to respect your privacy.

How do you trust them?

I am curious, what do you use as your search engine?

sudneo ,

OK guarantee was too strong of a word, I meant more like "assurance" or "elements to believe".

Either way, my point stands: you did not audit the code you are running, even if open source (let's be honest). I am a selfhoster myself and I don't do it either.

You are simply trusting the software author and contributors not to screw you up, and in general, you are right. And that's because people are assholes for a gain, usually, and because there is a chance that someone else might find out the bad code in the project (far from a guarantee).
That's why I quoted both the policy and the business model as reasons for kagi not to screw me over. Not only would it be illegal, it would also be completely devastating for their business if they were to be caught.

But yeah, generally hosting yourself, looking at the code, building controls around the code (like namespaces, network policies, DNS filtering) is a stronger guarantee that no funny business is going on compared to a legal compliance and I agree.
That said, despite being a selfhoster myself, I do have a problem with the open source ecosystem and the inherent dependency on free labour, so I understand the idea of proprietary code. Ultimately this is what allowed kagi to build features that make kagi much more powerful than searXNG for example.

sudneo ,

In reality I did not read anywhere that they intend to create a profile on you. What I read is some fuzzy idea about a future in which AIs could be customised to the individual level. So far, Kagi's attitude has been to offer features to do such customisations, rather than doing them on behalf of users, so I don't see why you are reading that and jumping to the conclusion that they want to build a profile on you, rather than giving you the tools to create that profile.
It's still "data" given to them, but it's a voluntary action which is much different from data collection in the negative sense we all mean it.

sudneo ,

They don't, but a company built on that premise (private search) that does otherwise would be playing with fire. It caters to users that specifically look for that. I would quit in an instant if that were the case, for example.

Seriously though even if they don’t track you an adversary could compromise them

This is true about pretty much anything. Unless you host and write the code yourself, this is a risk. It is a risk with searXNG (malicious instance, malicious PR/code change that gets approved etc.), with email providers, with DNS providers, etc.

What solution do you propose for this, that can actually scale?

sudneo ,

It’s still data given to them, no scare quotes needed.

It is if you decide to give it to them. If it's a voluntary feature and not pure data collection, that's the difference. Which means if you don't want to take the risk, you don't provide that data. I am sure you understand the difference between this and the data collection as a necessary condition to provide the service.

And if that data includes your political alignment, like they say in their manifesto, a data breach would be catastrophic.

Which means you will simply decide not to use that feature and not give them that data?

And even if there isn’t one, using their manifesto to promise a dystopia where you are nestled in a political echo chamber sounds like a nightmare

It depends, really. When you choose which articles and newspapers you consider reputable, you consider that an echo chamber? I don't. This is different from using profiling and data collection to provide you, without your knowledge or input, with content that matches your preference. Curating the content that I want to find online is different from Meta pushing only posts that statistically resonate with me based on the behavioral analysis they have done on top of the data collected, all behind the scenes.
I don't see where the dystopia is if I can curate my own content through tools. This is very different from megacorps curating our content for their own profit.

sudneo ,

… Because based on their manifesto, that’s exactly what Kagi wants to do with you as a search engine; show you the things it thinks you want to see.

No, based on your interpretation of the manifesto. I already mentioned that the direction that kagi has taken so far is to give the user the option to customize the tools they use. So it's not kagi that shows you the things you want to see, but you, who tell kagi the things you want to see. I imagine a future where you can tune the AI to be your personal assistant, not the company's.

Every giant corporation has a privacy policy

It is not having a policy that matters, obviously, it's what is inside it that does. Facebook's privacy policy is exactly what you would expect, in fact.

sudneo ,

I’ve been quoting the Kagi Corp manifesto.

Yes, but you have drawn conclusions that are not in the quotes.

Let me quote:

But there will also be search companions with different abilities offered at different price points. Depending on your budget and tolerance, you will be able to buy beginner, intermediate, or expert AIs. They’ll come with character traits like tact and wit or certain pedigrees, interests, and even adjustable bias. You could customize an AI to be conservative or liberal, sweet or sassy!

In the future, instead of everyone sharing the same search engine, you’ll have your completely individual, personalized Mike or Julia or Jarvis - the AI. Instead of being scared to share information with it, you will volunteer your data, knowing its incentives align with yours. The more you tell your assistant, the better it can help you, so when you ask it to recommend a good restaurant nearby, it’ll provide options based on what you like to eat and how far you want to drive. Ask it for a good coffee maker, and it’ll recommend choices within your budget from your favorite brands with only your best interests in mind. The search will be personal and contextual and excitingly so!

There is nothing here that says "we will collect information and build the thing for you". The message seems pretty clearly to be what I am claiming instead: "You tell the AI what you want". Even if we take this as "something that is going to happen" (which is not necessarily the case), it clearly talks about tools to which we can input data, not tools that collect data. The difference is substantial, because data collection (à la Facebook) is a passive activity that is built into the functionality of the tool (which I can't use without it). Providing data to have functionalities that you want is a voluntary act that you as a user can do when you want and only for the category of data that you want, and does not preclude your use of the service (in fact, if you pay for a service and don't even use the features, it's a net positive for the company if that's how they make money!).

even accusing eyewitnesses of the CEO’s bad behavior of being liars.

What I witnessed is the ranting of a person in bad faith. You are giving credit to it simply because it fits your preconception. I criticized it based on elements within their own arguments, and concluded that for me that's not believable. If that's your only proof of "bad behavior" and that's enough for you, good for you.

What you say is bad for Facebook, is what Kagi Corp wants to do.

Let me reiterate on the above:

you will volunteer your data, knowing its incentives align with yours

Now, let's be clear because I have absolutely no intention of spending my evening repeating the same argument.
Do you see the difference between the following:

  • I use a service to connect with people, share thoughts, read thoughts from others, and the service passively collects data about me so that it can serve me content that helps the company behind it maximize its profits, and
  • I use a service that I can customize and provide data to in order to customize what I see and what is displayed to me, which has no financial incentive to do anything else with that data because I - the user - am the paying customer.

?

If you don't see the difference between the two scenarios above, there is no point for me to continue this conversation, we fundamentally disagree. If you do see the difference, then you have to appreciate that the nature of the data collection moves the agency from the company to the user, and a different system of incentives creates an environment in which the company doesn't have to screw you over in order to earn money.

sudneo ,

It’s pretty clear that you only draw your conclusions from a predetermined trust in Kagi, a brand loyalty.

As I said before, I also draw this conclusion based on the direction that they have currently taken. Like the features that actually exist right now, you know.
You started this whole thing about dystopian future when talking about lenses, a feature in which the user chooses to uprank/downrank websites based on their voluntary decision. I am specifically telling that this has been the general attitude, providing tools so that users can customize stuff, and therefore I am looking at that vision with this additional element in mind.
You instead use only your own interpretation of that manifesto.

Kagi Corp is good, so feeding data to it is done in a good way, but Facebook Corp is bad so feeding data to it is done in a bad way.

You are just throwing the cards up.
If you can't see the difference between me having the ability to submit data, when I want, what I want, and Facebook collecting data, there are only two options: you don't understand how this works, or you are arguing in bad faith. Which one is it?

sudneo ,

The “lens” feature isn’t mentioned in either Kagi manifesto.

So? It exists, unlike the vision in the manifesto. Since the manifesto can be interpreted in many ways (despite what you might claim), I think this feature can be helpful to show Kagi's intentions, since they invested work into it, no? They could have built data collection and automated ranking based on your clicks, they didn't.

People just submitted it. I don’t know why. They “trust me”. Dumb fucks.

Not sure what the argument is. The fact that people voluntarily give data (for completely different reasons that do not benefit those users directly, but under the implicit blackmail of needing it to use the service)? I have no objections anyway against Facebook collecting the data that users submit voluntarily and that is disclosed by the policy. The problem is in the data inferred, in the behavioral data collected, which are much more sneaky, and in the data collected about non-users (shadow profiles through the pixel etc.). You putting Facebook and an imaginary future Kagi in the same pot, in my opinion, is completely out of place.

sudneo ,

The manifesto is actually a future vision. And again, you are interpreting it in your own way.

At the same time, you are completely ignoring:

  • what the product already does
  • the features they actually invested to build
  • their documentation in which they stress and emphasize privacy as a core value
  • their privacy policy in which they legally bind themselves to such commitment.

Because obviously who cares about facts, right? You have your own interpretation of a sentence which starts with "in the future we will have" and that counts more than anything.

Also, can you please share with me the quote where I say that I need to blindly trust the privacy policy? Thanks.

Because I remember saying in various comments that the privacy policy is a legally binding document, and that I can make a report to a data protection authority if I suspect they are violating it, so that they will be audited.
Also, guess what! The manifesto is not a legally binding document that they need to answer for, the privacy policy is. Nobody can hold them accountable if "in the future there will not be" all the stuff that is mentioned in the manifesto, but they are accountable already today for what they put in the privacy policy.

Do you see the difference?

sudneo ,

You are really moving the goalposts, eh

Developing AI features does not mean anything in itself. None of the AI features they built do anything at all in a personalized way. For sure they seem very invested in integrating AI into their product, but so far no user data is used, and all the AI features are simply summarizers and research assistants. What is this supposed to prove?

I will make it simpler anyway:

What they wrote in a manifesto is a vague expression of what will happen in a non-specified future. If the whole AI fad fades in a year, it won't happen. In addition, we have no idea what specifically they are going to build, what the impact on privacy is, which specific implementation choices they will make, and many other things. Without all of this, your dystopian interpretation is purely arbitrary.

And this is rather ironic too:

Ironic how? Saying that a document is binding doesn't mean blindly trusting it, it means that I know the power it holds, and it means it gives the power to get their ass audited and potentially fined on that basis if anybody doesn't trust them.

Your attempt to mess with the meaning of my sentences is honestly gross. Being aware of the fact that a company is accountable has nothing to do with blind trust.


Just to sum it up, your arguments so far are that:

  • they mention a "future" in which AI will be personalized and can act as our personal assistant, using data, in the manifesto.
  • they integrated AI features in the current offering

This somehow leads you to the conclusion that they are building some dystopian nightmare in which they get your data and build a bubble around you.

My arguments are that:

  • the current AI features are completely stateless and don't depend on user data in any way (this capability is not developed in general and they use external models).
  • the current features are very user-centric and the users have complete agency in what they can customize, hence we can only assume that similar agency will be implemented in AI features (in opposition to data being collected passively).
  • to strengthen the point above, their privacy policy is not only great, but it's also extremely clear in terms of implications of data collected. We can expect that if "personalized" AI features come up, they will maintain the same standard in terms of clarity, so that users are informed exactly on the implications of disclosing their data. This differentiates the situation from Facebook, where the privacy policy is a book.
  • the company business model also gives hope. With no other customer to serve than the users, there is no substantial incentive for kagi to somehow get data for anything else. If they can be profitable just by having users paying, then there is no economic advantage in screwing the users (in fact, the opposite). This is also clearly written in their doc, and the emphasis on the business model and incentive is also present in the manifesto.

The reality is: we don't know. It might be that they will build something like you say, but the current track record doesn't give me any reason to think they will. I, and I am sure a substantial percentage of their user base, use their product specifically because they are good and because they are user-centric and privacy focused. If they change posture, I would dump them in a second, and a search engine is not inherently something that locks you in (like an email).
At the moment they deliver, and I am all-in for supporting businesses that use revenue models that are in opposition to ad-driven models and don't rely on free labor.
I do believe that economic and systemic incentives are the major reasons why companies are destroying user privacy, I don't think there is any inherent evil. That's why I can't really understand how a business which depends on users paying (kagi) can be compared to one that depends on advertisers paying (meta), where users (their data) are just a part of a product.

Like, even if we assume that what's written in the manifesto comes to life, if the data is collected by the company and only, exclusively, used to customize the AI in the way I want (not to tune it to sell me shit I don't need), within the scope I need, with the data I choose to give, with full awareness of the implications, where is the problem? This is not a dystopia. The dystopia is if google builds the same tool and tunes it automatically so that it benefits whoever pays google (not users, but the ones who want to sell you shit). If a tool is truly serving my own interests and the company's interest is simply that I find the tool useful, without additional goals (ad impressions, visits to pages, products sold), then that's completely acceptable in my view.

And now I will conclude this conversation, because I said what I had to, and I don't see progress.

sudneo ,

To be clear, you want a venture capital corporation to keep you in your filter bubble regarding your political beliefs, your corporate brand choices, your political beliefs, your philosophical beliefs, etc?

Thankfully, kagi is not a VC-funded corp. The latest investment round was for 670k, pennies, from 42 investors, which means an average of less than 20k/investor (they also mention that most are kagi users too but who knows).

Also, it depends on what "being kept in a filter bubble" means. If I build my own bubble according to my own criteria (I don't want to see blogs filled with trackers, I want articles from reputable sources - i.e. what I consider reputable, if I am searching for code I only want rust because that's what I am using right now, etc.) and I have the option to choose when to look outside, then yes, I think it's OK. We all already do that anyway: if I see an article from fox news I won't even open it if I see something on the same topic from somewhere else. That said, there are times when I can choose to read fox news specifically to see what conservatives think.

The crux of it all is: who is in charge? And what happens with that data? If the answers are "me" and "nothing", then it's something I consider acceptable. It doesn't mean I would use it or that I would use it for everything.

evangelize that kind of product on a privacy forum?

First, I am not evangelizing anything. That product doesn't even exist, I am simply speculating on its existence and the potential scenarios.

Second: privacy means that the data is not accessed or used by unintended parties and is not misused by the intended ones. Focus on unintended.
Privacy does not mean that no data is gathered in any case, even though this is often the best way to ensure there is no misuse. This is also completely compatible with the idea that if I can choose which data to give, and whether I want to give it at all (and of course whether to delete it), and that data is not used for anything other than what I want it to be used for, then my privacy is completely protected.
