Our Protonmail Adventure - A Five Act Drama
After many years of running our mail/groupware setup on our own infrastructure we decided in early 2021 to switch to a hosted solution. The main reason was time: Everyone in the team is busy with writing or auditing code and the time left for petting mail servers went close to zero.
From the very beginning ProtonMail was on the shortlist because we liked the idea of having all our mails encrypted on the ProtonMail servers. Side note: Yes, we knew that ProtonMail can in theory capture our passwords and decrypt everything. Some team members had been using it privately too and were quite happy. Since ProtonMail now also supports groupware features such as calendar and contacts we finally decided to give it a try. At that time we already knew about some limitations, for example no ProtonMail Calendar app for iOS and no support for shared calendars. ProtonMail pre-sales assured us that they were working on these features and that they would be available soon. Since initial tests worked well we purchased a ProtonMail Visionary subscription and hoped that the missing features would materialize sooner or later.
2. Rising Action
Aliases with multiple recipients
Right after purchasing a subscription we experienced the first setback. Since the very beginning of our company history we have had mail aliases such as office@ or accounting@ which target multiple recipients. While ProtonMail supports aliases, it has no support for simple distribution lists. So having office@ point to multiple mailboxes is not possible. ProtonMail support confirmed that this feature is not available but might be implemented at some point in the future. Teeth-gnashingly we worked around the issue by assigning each alias to only one recipient and started forwarding mails manually.
Some of our mailboxes are rather huge; they contain up to 100k mails. While importing them via IMAP worked well, accessing them for the first time did not. Especially the ProtonMail App for Android does not like lots of new mail or mails with a new state. After the initial import the Android App stopped working for more than two days before it was finally able to display mails. The same problem was also observable after marking a few thousand mails as read. Another obstacle with the Android App was that frequent changes between WiFi and cellular network caused an invalid state and the app was unable to fetch new mail. Erasing all app data resolved the problem in most cases; sometimes only re-installing the whole app helped.
On our workstations we deployed ProtonMail Bridge so that everyone was able to use their least hated mail client. The bridge connects to ProtonMail services and acts as an IMAP service on localhost. Soon we experienced high CPU usage and stalls of the bridge process. The CPU usage problem was highly correlated with the mail client in use. Most problems were observed when KDE Kmail was used. Our best guess is that Kmail runs many IMAP requests in parallel and triggers various scaling issues within the bridge. ProtonMail support was unable to help us. As a workaround some of us started using getmail to fetch mail via IMAP into a local Maildir and gave Kmail access to it. With getmail being relatively stupid and strictly single threaded the bridge consumed much less CPU and showed no stalls. It turned out that this was the least of our problems with ProtonMail Bridge.
We’re heavy users of PGP mail and believe in end-to-end encryption. While ProtonMail can do PGP for us on the server side, none of us would ever upload their kernel.org PGP private key into ProtonMail. The server side PGP feature of ProtonMail is nice, as all mails stored on Proton’s servers are encrypted, but we’re still using PGP with our own local keys by encrypting mails with GnuPG or similar. As a consequence mails are double PGP encrypted:
- First by our local PGP client
- Later by ProtonMail on the server side
A side effect of this is that the MIME content type in mails is changed from Content-Type: multipart/encrypted to a different content type, which caused problems with various PGP capable mail clients.
To our astonishment we were told by ProtonMail support that using PGP locally with ProtonMail is not supported and currently works just by chance.
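Why the content type matters can be sketched in a few lines. PGP/MIME (RFC 3156) marks an encrypted mail with a top-level Content-Type of multipart/encrypted and protocol="application/pgp-encrypted", and a client typically keys its decryption logic on that header. The following is only an illustration using Python's standard email module, not what any particular mail client does; the function name and the sample message are made up for this example.

```python
from email import message_from_string


def is_pgp_mime_encrypted(raw_message: str) -> bool:
    """Return True if the top-level MIME type marks the mail as PGP/MIME encrypted."""
    msg = message_from_string(raw_message)
    return (
        msg.get_content_type() == "multipart/encrypted"
        and msg.get_param("protocol") == "application/pgp-encrypted"
    )


# Hypothetical, shortened PGP/MIME message as a local PGP client would send it.
SAMPLE = (
    "From: alice@example.org\n"
    "To: bob@example.org\n"
    "Subject: test\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/encrypted; protocol="application/pgp-encrypted";'
    ' boundary="b1"\n'
    "\n"
    "--b1\n"
    "Content-Type: application/pgp-encrypted\n"
    "\n"
    "Version: 1\n"
    "--b1\n"
    "Content-Type: application/octet-stream\n"
    "\n"
    "-----BEGIN PGP MESSAGE-----\n"
    "...\n"
    "-----END PGP MESSAGE-----\n"
    "--b1--\n"
)

if __name__ == "__main__":
    print(is_pgp_mime_encrypted(SAMPLE))
```

Once the top-level type is anything other than multipart/encrypted, a check like this fails and the client shows the mail as an ordinary message with attachments instead of decrypting it.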
The Changing Point
At this point most of the initial excitement for ProtonMail had turned into depression. The only hope was that at least some of the problems or missing features we’d encountered would get addressed soon. But our expectations were low, especially since most answers from ProtonMail support fell into two categories:
- We don’t support this use-case.
- Did you try turning it off and on again?
With all our workarounds in place we learned to somehow deal with ProtonMail and it seemed to work rather well. Features were still missing but we hoped that, with some luck, some of them would be implemented soon. From time to time one or the other of us noticed something strange: an IMAP capable mail client decided to re-download the whole mailbox or started to re-index everything. Most of the time we saw that with mbsync and Kmail.
On 29.09.2021 the situation changed significantly. Around 15:00 that day various team members at the company saw strangeness happening with the ProtonMail Bridge. Kmail all of a sudden re-downloaded a mailbox, mbsync did a mis-sync and removed local mail, getmail did not download new mail despite new mail being present, etc. At this point our fear grew that something wonky was happening at the bridge.
The UID problem of ProtonMail bridge
Around the same time Richard was debugging a problem with getmail on IMAP level and later found that ProtonMail bridge violates the IMAP specification.1
Due to this bug hunt Richard had many IMAP protocol logs of his mailbox and started to investigate why getmail on his workstation no longer found new mail after 29.09.2021.
At IMAP level every mail has a unique and immutable identifier: the message UID. getmail keeps a list of downloaded UIDs to know which messages qualify for downloading. So getmail basically asks the IMAP server for a list of UIDs of a specific mailbox, compares it to its local list and fetches all mail whose UID is new. Richard found that getmail did not see new mail because the new mail on the ProtonMail server side had UIDs which were already in getmail’s list! By comparing with the protocol logs from before 29.09.2021 we had proof that the very same message all of a sudden had a different UID. This explained all the problems we had with various mail clients regarding mail sync. We immediately reported the problem to the ProtonMail bridge folks and support.2
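The comparison described above boils down to a set difference. The following is a minimal sketch of that idea, not getmail's actual code; the function name and the UID values are invented for illustration. It shows why a client of this kind goes blind as soon as the server hands out UIDs it has already seen.

```python
def new_uids(server_uids, seen_uids):
    """Return the UIDs listed by the server that are not in the local list yet."""
    return sorted(set(server_uids) - set(seen_uids))


if __name__ == "__main__":
    # Local state: everything up to UID 103 has been downloaded.
    seen = {101, 102, 103}

    # Normal case: a new mail arrives and gets a fresh UID.
    print(new_uids([101, 102, 103, 104], seen))

    # Broken case: the server renumbered the mailbox, so a new mail now
    # carries a UID that is already in the local list. Nothing looks new.
    print(new_uids([101, 102, 103], seen))
```

In the second call the result is empty although the mailbox content changed, which matches the symptom of new mail never being fetched.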
To be very sure Richard added a hack to his local getmail.3 In addition to the message UID getmail now also stored the message size in its local state. This allowed detecting UID changes immediately.
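The idea behind that hack can be sketched as follows. This is not Richard's actual patch, only an illustration under the assumption that the local state maps each UID to the message size reported by the server; all names and numbers are hypothetical.

```python
def check_uid_state(server, state):
    """Compare the server's view with the local state.

    Both arguments map UID -> message size in bytes.
    Returns (new, changed): UIDs unknown locally, and known UIDs whose
    size differs, which means the UID now points at a different message.
    """
    new = sorted(uid for uid in server if uid not in state)
    changed = sorted(
        uid for uid in server if uid in state and state[uid] != server[uid]
    )
    return new, changed


if __name__ == "__main__":
    state = {101: 2048, 102: 512}
    server = {101: 2048, 102: 7301, 103: 900}
    new, changed = check_uid_state(server, state)
    print("new:", new)
    print("UID reused for a different message:", changed)
```

The check is only a heuristic: if a UID is reassigned to a message of exactly the same size, the change goes unnoticed. Comparing a hash of the headers would be stricter but costs an extra fetch per message.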
A few weeks later the problem happened again: getmail detected a UID change.4 At this point it was crystal clear to us that ProtonMail bridge is not trustworthy and can cause major trouble leading to data loss.
4. Falling Action
One would expect that when such a fatal error is reported, developers would react in a reasonable way, especially when the affected product is something you pay for. But nothing happened for a long time. In January 2022 ProtonMail Bridge developers kind of agreed that this bug is real and that they’re basically rewriting it.5 In the meantime other users realized that they’re also affected and that it’s not a bug in their mail clients.
Around March 2022 there was still no progress with the UID bug nor with any of the missing features.
We had no iOS calendar app, no shared calendar, no distribution lists, nothing.
All we had was a huge pile of wasted time spent debugging and reading the source or, put more kindly, new knowledge and a deeper understanding of IMAP and various mail clients, which didn’t make us more confident in using ProtonMail Bridge.
That said, not all problems were rooted in ProtonMail Bridge; we also fixed one in getmail.6
The original plan was to save time by using a hosted service. The move to ProtonMail caused the opposite.
Finally we decided to bite the bullet and migrate away from ProtonMail. We had to accept that it is not a perfect fit for our use-cases. For others it might work well, but not for us. Maybe it would have been a less painful experience if we didn’t depend on the bridge. But finally we had to give up and move on.