Encrypting Facebook Messenger with Jon Millican and Timothy Buck

Facebook Messenger has finally been end-to-end encrypted, a couple of years after Mark Zuckerberg announced it! Plus Instagram DMs are trialing ephemeral E2EE DMs too! We invited on Jon Millican and Timothy Buck from Meta to discuss this major cross-platform endeavor, and how David Bowie fits into their personal Labyrinth.

This transcript has been edited for length and clarity.

Deirdre: Hello. Welcome to Security Cryptography Whatever. I’m Deirdre.

David: I’m David.

Thomas: Chicago is further away from New Zealand than you are from Guam.

Deirdre: Who are you?

Jon: Good to know.

Thomas: This is not the longest, most geographically diverse one we’ve done so far.

Jon: Disappointing.

Deirdre: Thanks, Thomas. By the way, we have two special guests today. We have Jon Millican. Hi, Jon.

Jon: Hello.

Deirdre: Hi. And we have Timothy Buck. Hi, Tim.

Timothy: Hi.

Deirdre: Our special guests are from Meta, and we invited them on today because Facebook Messenger and Instagram DMs have just released end-to-end encrypted messaging by default, especially for Facebook Messenger, on all of your platforms, all of your services that Facebook Messenger is generally supported on. Is that correct?

Jon: Kind of. So we’ve just announced one to one default end-to-end encryption is rolling out on Facebook Messenger at this point, and that is across all platforms, although the rollout will take a little while to reach completion. On Instagram, it’s just ephemeral messages so far, but that’s coming later.

Deirdre: That’s pretty cool. And I want to get to that in a little bit. But let’s go to the big one first, which is, there was news like a couple of years ago about trying to make all of Meta’s messaging channels, which is Facebook Messenger, Instagram DMs, and of course WhatsApp messaging, and I think, is there another one? You have Threads now, but let’s not talk about Threads. They don’t have DMs. They just are Instagram, but whatever.

Timothy: Correct.

Deirdre: And try to make all of that end-to-end encrypted. And it seems like that was a big lift. And you’ve kind of gotten over the biggest hump of that lift, which is Facebook Messenger, mostly because Facebook Messenger is supported everywhere where Facebook is supported, which is on mobile and on web, and it’s been on web the longest, I think. And so the fact that you’ve gotten it rolled out, period, is a big accomplishment. Can you kind of give us a high-level of what you had to address to actually get this rolled out?

Timothy: Well, I think we can maybe do a two-part answer. I can give some context from the product perspective, and then Jon can get into a little bit more details of the technical hurdles that we had. So, as you said, Facebook Messenger is all over the place, right? We have several different web places: facebook.com, messenger.com. We have the mobile apps and, you know, there’s the Messenger mobile apps on iOS and Android, and there’s also the Facebook mobile apps on iOS and Android. And then there are desktop apps as well, Windows and Mac. It’s available in, like, Oculus and others. It’s all over the place, right.

A whole bunch of different apps that exist. And a very large number of people use these, even things that, I personally am not a huge facebook.com user. There’s a very large number of people who use facebook.com and message on it all the time. And this is not just messaging, but it’s also calling, that is end-to-end encrypted. And there’s a large number of features that all have to be rebuilt within that. So people have come to expect a large number of features in Messenger, across messaging and calling, well over 100. And each of those had to be redesigned and rebuilt. Obviously, originally they were very server-centric. A lot of the logic was on the server and that doesn’t work anymore.

Jon: Right.

Timothy: Because we don’t have access to that information. And so those needed to be rebuilt on every single surface and done in a way that had the quality that users expect. So it’s a very, very large effort and that’s just rebuilding the things that people expect. But encryption obviously also comes with things like backups and keys and other encryption-specific things that we needed to build and make easy to use and educate people about across all of those services, across all of those languages and across all of those cultures. So, yes, a very large project, but we’re really excited that this is now going out as the default for those billions of people.

Deirdre: Hell yeah. So this is definitely in the billions. I think WhatsApp is in 2 billion, 3 billion user accounts and many more-

Timothy: Yeah, I don’t know what numbers we can say, but they’re very large.

Jon: Yeah, billions.

Deirdre: Meta once again, with either the biggest or second biggest end-to-end encryption rollout all in one fell swoop, whichever it ranks. I’m pretty sure WhatsApp, when it turned on end-to-end encryption by default, was smaller than whatever the numbers are for this rollout.

Jon: That’s very possible.

Deirdre: So that’s huge.

Timothy: It would also have been less complicated, right, because it was mostly just-

Jon: Yeah, yeah. And actually beyond that, because WhatsApp was always designed sort of with its features working primarily in a device-to-device function, device-device mechanism. Whereas with Messenger, there’s so much server side functionality; like, the bytes you upload are not the bytes which are received. In WhatsApp, it largely was. And it makes a huge difference.

Deirdre: Absolutely. One thing that strikes me as something that you’ll have had to adapt: this is using Signal protocol under the hood for device-to-device, or at least one to one, handling. I know that there’s descriptions of groups in your white paper, but the standard Signal protocol kind of assumes that your identity is bound to a device-bound identity key, whereas Facebook Messenger definitely was not designed that way. You still log in with Facebook or just Facebook Messenger username and password, or however you authenticate with accounts.facebook.com or whatever. And then once you’ve done that, you can do whatever you want. Can you talk a little bit about how you adapted for that sort of identity management system?

Jon: Totally, yeah. So with the sort of constraint as you described it, and this desire that comes with most encrypted systems, to be able to identify the endpoints in question at any given moment, we basically had to treat each endpoint as its own almost identity. So you have an identity key per device and they’re added and removed, and users can see as they’re being added and removed and check the whole list. And, yeah, as you say, you can log in anywhere. You can do pretty much anything apart from read any messages that were sent to you before that moment where you logged in.

Deirdre: Interesting.

Jon: Right. And that’s where our new model of message storage, some call it a backup, some call it secure storage, but our encrypted storage system essentially comes in.

Deirdre: And this is Labyrinth.

Jon: This is Labyrinth. Exactly.

Deirdre: Cool.

Jon: Yeah. And so the context is that historically, Messenger has always allowed you to log in anywhere and just access everything. You can scroll back through all of your history, you can send and receive new messages, etc. But being able to guarantee that someone can do that wherever their Facebook account is available essentially means that Facebook has to have access to that message. And that’s obviously counter to the goal of Facebook not having access to your messages, which is sort of our approximate high level statement around end-to-end encryption, not quite sure the exact wording we generally use. So that essentially means we now need to store the messages for the user in an encrypted manner if they are to access them on another device or store them off of our own properties. And I’ll address that part in a sec.

But, yeah. So if we’re storing them in an encrypted manner, then much like with a lot of cryptographic solutions, what we’ve done is we’ve transformed one problem into a key management problem. And so it’s now the user has to have a way to transfer those keys when they access a new device. And so that’s really what Labyrinth does. It gives us a way to store it all for the user, but then we have our recovery mechanisms, which we need in place. To briefly address that sort of point I mentioned about, well, why not store it off of our own platforms, which is what some other systems do, most notably WhatsApp, or at least most notably to me. Maybe that’s just as a Meta employee. But the difference here is really the constraints that we’re working in, where with WhatsApp, you’re storing all of your messages locally on your device and just backing them up periodically, and you are primarily located on a single device, and then their multi-device model sort of extends out from there.

With us, we don’t have a history of requiring users to store a large amount of data on their messaging clients. This is, I guess, particularly notable in the case of web ones where it may not even be possible. But even on mobile devices, if we suddenly ask someone to store an additional few gigabytes, that’s not always going to go down well. So that was really where this came from. And this is why we don’t really call it a backup, we call it a secure storage system, because the idea is that you only need to keep a very limited amount of data on the device and you can page it in as and when it’s necessary.

Deirdre: So this is interesting because when I think of, oh, I’m logging in on a new device, or say I have an identity system, I’m like, oh, I have a new phone, I’m going to log in to Facebook with my new phone and I want to get, I don’t know, whatever the most recent cache of encrypted, unreadable-by-Facebook-servers, Facebook Messenger messages that are addressed to me or addressed to my account id, not necessarily my device id. I could see a system that basically lets me get the last day’s, you know, kind of cached-in-the-pipeline messages to be delivered, and then I get the last day’s worth of messages up to a certain number or a certain size, blah blah blah blah. And I have enough context to just sort of get up to speed on my new device, and then new messages that are addressed to my id, including my newly enrolled device to my id, will also be sent to my device. I can understand that at Facebook scale this might not work, because even if you cache, like, a max cap of encrypted messages for every device and for every user id, that’s literally billions and billions and billions and billions of just blobs sitting there waiting to be downloaded and it just doesn’t work. Can you tell us a little bit more about how Labyrinth works and how that both helps with maintaining message history across devices and also helps do it at Facebook Messenger scale?

Thomas: I’m going to butt in. Before you do that, I want to make sure I follow where we’re at, right. I get roughly what Labyrinth is going to do, is aiming to do here. You’re still running Signal protocol for device-to-device stuff. So for direct messaging, you’re still running like a direct secure messaging protocol. And Labyrinth is a high fidelity record of all the messages that have been sent. So you can bring up new devices and have a complete message history. And you’re doing that because it’s the UX expectation for everybody that uses Facebook Messenger. And I am one of those people. I have two questions, right. So first of all, this is the simpler question: are you going to Labyrinth all the previous messages that were sent?

Jon: That is an interesting question. I don’t think we have a clear plan on that one yet, but it would be a lot of work to do. Yeah.

Thomas: Okay. A broader question I have for you, right, is you have a high fidelity, quick to look up, sort of kind of forward ratcheting message store; why have the Signal protocol part?

Jon: Yeah, that’s a really good question, actually. And this is perhaps partly due to precedent and partly due to the complexity of just getting it right. So, I mean, the precedent is we were already using Signal protocol in Messenger in Secret Conversations, albeit a small scale, and in WhatsApp, obviously at very large scale. On the other hand, developing a new protocol for end-to-end encryption and storage is actually a lot more complex. I mean, we’ve seen this exercise of developing a new protocol just for end-to-end encryption play out in IETF recently. We were involved in that with MLS, and it’s a lot of work and it takes a lot to make sure you’ve got it right, whereas Signal’s out there, we’ve already been using it, and we have proofs from academia about a bunch of properties that it achieves.

Timothy: I think there’s an interesting user control question in there as well, which is we don’t require you to have the secure storage turned on. If you don’t want to have this turned on, you can just not set it up, basically. Awesome. So users have more control and then they can just use the Signal protocol experience.

Thomas: That’s a great answer. Is that surfaced in the system or are you guys planning to surface it so that both counterparties have to agree to have their messages stored?

Jon: It’s stored per-user.

Timothy: Yeah. So the way that this works is when you are basically going to be transitioned to encryption, you’ll have an introduction and as part of this introduction, you’re asked to set up secure storage. And so what you need to do in that process is to choose a key that Meta doesn’t have access to. And there’s several different options that we give people. One is a PIN. When you set that PIN, Jon can get into more details of how we do PINs with HSMs and a bunch of really interesting stuff behind the scenes there to make sure we don’t actually have access to that PIN. But you can create a PIN that you can then reuse across devices to get back in. You can also do things like store the key with a third party.

If you want to store your key in your Apple backups or in your Google Drive or some other place, you can choose to do that. Or if you are really hardcore about it, you can just literally have a 40 digit code and you write it down somewhere or memorize it and there’s no PIN, no anything else. You’re using a 40 digit code to get back into that message history. Or you can say, no, I don’t want this to be on. And for your messages, they will not be stored in secure storage.
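To give a sense of what a 40 digit code buys you, here is a minimal Python sketch that generates a uniformly random decimal code and computes its entropy. How Messenger actually generates, formats, or consumes its recovery code is not described in the conversation; the function names and the use of Python's `secrets` module are assumptions for illustration only.

```python
import math
import secrets

CODE_DIGITS = 40  # the length mentioned in the conversation


def generate_recovery_code(digits: int = CODE_DIGITS) -> str:
    """Return a uniformly random decimal code drawn from the OS CSPRNG.

    Hypothetical helper: this is not Messenger's actual code format.
    """
    return "".join(str(secrets.randbelow(10)) for _ in range(digits))


def entropy_bits(digits: int = CODE_DIGITS) -> float:
    """Entropy of a uniformly random decimal string of the given length."""
    return digits * math.log2(10)


code = generate_recovery_code()
print(code, f"(~{entropy_bits():.1f} bits)")
```

Forty uniformly random decimal digits carry a little under 133 bits, which is why a code like this can stand in for a key directly, whereas a short PIN needs extra protection on the server side.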

Thomas: So you’re already describing what’s probably the most cryptographically kind of complex user experience of any major thing that users use. It’s not just the largest deployment. You’re also doing more with kind of the cryptographic user experience than I’ve ever heard of before.

Jon: Yeah, we’re trying to simplify it so it doesn’t feel like you’re doing key management, but yeah, under the hood you basically are.

Thomas: I mean, we’re happy about it.

Deirdre: Right.

Thomas: That’s not a criticism. You should go further in the other direction.

Jon: Yeah, bring back PGP, right.

Deirdre: Related to new complicated cryptography, part of Labyrinth involves a new kind of primitive, under the hood, oblivious revocable functions, which was very cool because I think I saw the eprint just float across my radar a couple of weeks ago and I was like, oh, what are they doing with that? I’m not really sure. And it’s basically a new way to provide better unlinkability between the encrypted attachments in your message bodies from Facebook’s servers. Can you tell us a little bit about that?

Jon: Sure, yeah. So when we came up with this and when we were designing Labyrinth, we essentially realized that there were certain properties which were just going to be a bit stronger in terms of privacy if Meta was not able to make the link between certain things in storage, but we still need the ability to index into them or authenticate access to them so that when someone actually does try to read them, we know what’s happening and can verify at that moment in time. And so we were sort of spitballing this, and like, well, we could have a service that performs this mapping or whatever. And then someone was just like, could we do this with fun cryptography, Jon? And a couple of days later, I’m like, I feel like you might be able to. Then some other, far better cryptographers came and actually helped make it secure. So it was really just, we had a need. And specifically with attachments, because when these are transmitted, they’re uploaded once and then shared between the members of that thread. So that provides very direct linkage between the content and multiple people’s mailboxes.

Deirdre: Right.

Jon: And if we were able to not have that, that just felt slightly better.

Deirdre: Yeah.

Jon: And this was essentially where that came from.

Deirdre: This is very cool. I need to read a little bit more about this, but it’s a way of setting up sort of like a PRF, but with the ability for two different parties to be able to come up with the same outputs, but in a non-obvious way, unless you know the secret or something like that, from the outside.

Jon: Yeah. I mean, at its heart what you have is, see, the input goes into a PRF, it’s then used in an exponentiation, elliptic curve exponentiation, and then goes through a PRF again. And that exponent is consistent between all pairs running an ORF, but each party and each pair can split up that exponent in different ways, essentially. So it’s like there’s some weird Diffie-Hellman going on in the middle and taking advantage of the properties of the math here that allow us to just tweak it each time.
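The PRF, exponentiation, PRF pipeline Jon describes can be sketched in a few lines of Python. This is a toy under loud assumptions: it uses a tiny multiplicative group modulo a safe prime instead of an elliptic curve, HMAC-SHA256 as the PRF, and made-up names (`orf_direct`, `orf_split`, `key_in`, `key_out`); who holds which PRF key and in what order the shares are applied in the real protocol is not stated in the conversation. It requires Python 3.8+ for the modular inverse via `pow`.

```python
import hashlib
import hmac

# Toy parameters (assumption): P = 2*Q + 1 is a safe prime, so the squares
# modulo P form a subgroup of prime order Q. Far too small for real use.
P = 2039
Q = 1019


def prf(key: bytes, data: bytes) -> bytes:
    """HMAC-SHA256 standing in for the PRF."""
    return hmac.new(key, data, hashlib.sha256).digest()


def to_group(digest: bytes) -> int:
    """Map a digest into the order-Q subgroup by squaring a nonzero residue."""
    h = 1 + int.from_bytes(digest, "big") % (P - 1)  # h is in [1, P-1]
    return pow(h, 2, P)


def orf_direct(key_in: bytes, key_out: bytes, secret_exp: int, item: bytes) -> bytes:
    """Reference evaluation with the whole exponent in one place."""
    point = to_group(prf(key_in, item))
    return prf(key_out, pow(point, secret_exp, P).to_bytes(2, "big"))


def orf_split(key_in: bytes, key_out: bytes, client_share: int,
              server_share: int, item: bytes) -> bytes:
    """Same function, with the exponent split into two multiplicative shares."""
    point = to_group(prf(key_in, item))
    partial = pow(point, client_share, P)   # one party's half
    full = pow(partial, server_share, P)    # the other party's half
    return prf(key_out, full.to_bytes(2, "big"))


key_in, key_out = b"in-key", b"out-key"
secret_exp = 777
client_share = 123
# The server share is chosen so that client_share * server_share = secret_exp (mod Q).
server_share = secret_exp * pow(client_share, -1, Q) % Q

print(orf_split(key_in, key_out, client_share, server_share, b"attachment-1").hex())
```

Any pair of shares whose product is the same exponent modulo the group order yields the same output, which is the property that lets each client and server pair hold a different split.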

Deirdre: Cute. I saw a note somewhere deep in Labyrinth that was the client oblivious revocable functions are currently accessible to the server. They’re only used today for strong attachment unlinkability and so on and so on. Any plans of splitting these up somehow? Is there sort of like a migration path forward to get even the even betterer properties there?

Jon: Yeah. Great questions. So, essentially, the history of this. And I gave a talk on Labyrinth, actually, at a workshop in crypto in August last year, and that was a much more ambitious version, actually, which was using the ORF in a bunch more places for greater unlinkability. And what we realized was that this thing was so difficult to debug, and we didn’t yet know the properties of the system, that if we were going to try and go down that path, we would really struggle to build a good experience that would actually be appropriate to replace the existing one. So there was a moment where we said, right, I think we’re going to have to basically break the secrecy of this ORF at the moment, check that all of our use cases that we’re using it for are okay just for that particular primitive. And we weren’t using it for protecting any content or anything, and then make a plan to hopefully reintroduce it in certain places in future. So particularly for attachments, we are hoping to bring back that secrecy, but that’s going to involve a complex migration.

Some client is going to have to generate a new set of secrets, distribute it across all the other clients in Labyrinth, and then somehow make that sufficiently reliable. I’m sure we’ll get there.

Thomas: Can I take a stab at trying to understand maybe one more iota of ORF than I do right now? So I read the preprint or whatever. I got to the point where it’s- I skimmed and I got to the point where there was like a game and an adversary. It’s like, this is the part of the paper that I normally skip until I can find the actual construction. And I didn’t find it and I fell off the end of the paper. So my understanding of what you’re going for here is if you chose a simpler setting for this construction to be in, you could think of like a file server, right? And you’ve got clients of the file server. They’re uploading files. The server is just there to give the bytes back, right?

Jon: Yeah.

Thomas: Clients push a file up and they want the file back. The server has got no business knowing what the file name is, right? So the idea here is that when we’re talking about linkability, we’re talking about essentially the file name. And what we’ve got is a client and a server who are somehow agreeing on a label for a file that is cryptographically random. Right. But there isn’t a single static mapping between the file name and the label that we’re using on the server. Both because- you can stop me when I become wrong here- but the two complicating things you have here are, number one, you’ve got multiple devices, and they don’t all share a key. This would be trivial if you just, like, the file name is the HMAC of a client secret. Right? You can’t do that here.
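The "trivial" single-key version Thomas mentions looks roughly like this in Python. The function name, the example secrets, and the file name are hypothetical; the point is only that a label derived with HMAC is stable for one secret and unrelated across secrets, which is exactly why it breaks down once devices do not share a key.

```python
import hashlib
import hmac


def file_label(client_secret: bytes, file_name: str) -> str:
    """Derive an opaque server-side label from a file name with HMAC-SHA256."""
    mac = hmac.new(client_secret, file_name.encode("utf-8"), hashlib.sha256)
    return mac.hexdigest()


# Hypothetical per-device secrets, for illustration only.
secret_a = b"\x01" * 32
secret_b = b"\x02" * 32

label_a = file_label(secret_a, "holiday.jpg")
label_b = file_label(secret_b, "holiday.jpg")

# Same secret gives the same label; a different secret gives an unrelated one.
print(label_a)
print(label_b)
```

With one shared secret every device lands on the same label, but rotating or revoking that secret changes every label at once, which is the second complication raised in the conversation.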

Timothy: Right.

Thomas: And the other problem is you want to revoke or roll or ratchet that secret forward in some way.

Jon: Yes.

Thomas: And the part that breaks my brain, like, I can sort of work out. I have a sort of intuition for how Diffie-Hellman magic happens here, of how we would do the former part, where we just have a bunch of people agreeing on a secret label. The part that breaks my brain is you can somehow ratchet this forward and land at the same label?

Jon: Yeah, yeah. Okay, this is a really good point, and I should have mentioned before one of the literal three words in the name of this thing, revocable. I really only spoke to the oblivious function part, but yes. So the intuition here is that, I guess, p to the xy equals p to the xz times y over z. And that’s all we do. So when we add a new client, what we have is the existing client, which knows x, and the server knows that y corresponds to that client. This client computes a new z and reports that to the server; the server computes y over z and stores that; the client then transmits xz over to the new one.

And then that new client does the same process again with... I shouldn’t have started with x, y, and z. It generates a q. So that new client then ends up with xzq and the server ends up with y over zq. And at that point, the multiplier and divisor cancel each other out within the exponent. And so overall, they’re all just computing p to the xy. Does that make sense?
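Written out, the bookkeeping Jon describes is the identity below. The symbols follow the conversation; reading "y over z" as multiplication by the modular inverse of z modulo the group order is an assumption about the setting, since the conversation does not spell it out.

```latex
p^{xy} \;=\; \left(p^{xz}\right)^{y/z} \;=\; \left(p^{xzq}\right)^{y/(zq)}
```

Each enrolled device holds the left factor of one such pair and the server holds the matching right factor, so every pair multiplies back to the same exponent xy.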

Thomas: Sort of like in my head right now. I’ve got a kind of similar vibe to blinding, right?

Jon: I think so, yeah.

Thomas: Okay, we’ll edit this, and before you started talking there, we’ll just have a thing where everyone get out your piece of paper and pencil and just start writing.

Jon: Sounds great. Yeah. And I guess the other thing I should mention here is, because we’re doing all of this math, what’s the point? The setting that Messenger works in here is very highly multi-device: we’re adding different devices all the time, and you can remove a device and revoke one at any time. The idea is that at the moment one of these devices is removed, assuming the server is instantaneously, I guess, honest but curious, the server will delete its y corresponding to that client, whether that be y or y over z or y over zq or whatever we land on. And at that point, whatever secret the client stores is completely meaningless. And it can’t be combined with any of the other ones, because if q’s disappeared from the entire universe, then it’s not useful anymore.
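A toy Python model of enrollment and revocation, following the x, y, z description from the conversation. Everything concrete here is an assumption for illustration: the tiny group modulo a safe prime (instead of an elliptic curve), the `Server` class, the device names, and the starting shares. It requires Python 3.8+ for `pow(z, -1, Q)`.

```python
import secrets

# Toy group (assumption): squares modulo the safe prime P have prime order Q.
P, Q = 2039, 1019


class Server:
    """Holds one exponent share per enrolled device (hypothetical model)."""

    def __init__(self) -> None:
        self.shares = {}  # device id -> server-side share

    def enroll(self, existing_id: str, new_id: str, z: int) -> None:
        # The server stores y / z for the new device (division modulo Q).
        self.shares[new_id] = self.shares[existing_id] * pow(z, -1, Q) % Q

    def finish(self, device_id: str, partial: int) -> int:
        # Raises KeyError if the device has been revoked.
        return pow(partial, self.shares[device_id], P)

    def revoke(self, device_id: str) -> None:
        del self.shares[device_id]


def delegate(existing_share: int):
    """Existing device picks z and derives the share x * z for a new device."""
    z = secrets.randbelow(Q - 1) + 1  # uniform in [1, Q-1], so invertible
    return z, existing_share * z % Q


def evaluate(server: Server, device_id: str, client_share: int, point: int) -> int:
    """Client applies its share, then the server applies the matching one."""
    return server.finish(device_id, pow(point, client_share, P))


server = Server()
server.shares["laptop"] = 7          # y for the first device
laptop_share = 5                     # x held by the first device
z, phone_share = delegate(laptop_share)
server.enroll("laptop", "phone", z)

point = 4                            # a square mod P, so it lies in the subgroup
expected = pow(point, 5 * 7, P)      # p^(xy)
laptop_before = evaluate(server, "laptop", laptop_share, point)
phone_result = evaluate(server, "phone", phone_share, point)

server.revoke("laptop")              # the laptop's share is now useless
```

After the revocation, the laptop still holds its client share, but without the server's matching share there is nothing to combine it with, while the phone keeps computing the same value.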

Deirdre: And this is like logging in on a new browser on facebook.com is considered a new device enrollment. So you need to be able to handle this case. Like you could basically scale it up to every single time someone uses Facebook Messenger, they’re enrolling a new device and they need to be able to revoke and enroll and set this all up and be able to pull down these more unlinkable message attachments and all that sort of stuff.

Jon: Right, yeah. And this is just in the case.

Timothy: Where they use quite every single time.

Deirdre: Yeah, no, but I could conceivably be that user that only does facebook.com logins. And I have a fresh, you know, incognito mode every time.

David: Phone only used once. Deirdre over here, only opening up facebook.com on an incognito browser.

Timothy: No, but you’re right, that is a use case that is handled by this for sure.

Deirdre: Okay, cool.

David: And then if you do that, though, you still do get the history in the web browser, assuming that you’ve opted into Labyrinth by creating a PIN or memorizing a 40 digit code.

Timothy: Yeah. So you’d have to do two things. You need to log in, and then after you log in, you need to provide your key, whether that be the PIN or that be some other key that you have set up. And once those two things happen, then you also get this.

Thomas: Which is like a super interesting product decision. Right. This is what I was thinking about before when I was saying how cryptographically sophisticated this is. You’d think like a normal product manager input to someone like this would be like the user logs in, they get everything. Like the login is the thing that authenticates it here. And you’ve deliberately separated those things. And, yeah, I imagine that of the x billion people that use Facebook, y billion of them are going to be mad at you about this.

Jon: Hopefully, y will be smaller than one in that instance.

Timothy: There will likely be a subset.

Jon: I’m sure there’ll be a subset.

Timothy: Yeah. Jon, you should talk about your triangle here.

Jon: Yeah, exactly. You’ve got three desirable properties, and I keep on writing this down and then forgetting exactly what they all are, but essentially one of them is you can log into the network wherever and use most of the functionality of the network. Two is that wherever you can use all of the functionality of the network, you can also send and receive messages. And three is if you are able to send and receive messages, you also always have your history available to you. And it’s just one of these classic ‘pick two’ situations once the network does not have the messages for you. And for us, the decision we took was that it was non negotiable that you can continue logging into Facebook whenever you want to with your username and password, and that if you do so, messaging will function, at least for new messages. And then that just naturally meant we have to sacrifice that guarantee that your message history is always available.

Timothy: We have done quite a bit to make the message history experience less painful and easier to use for users. That’s why there are a series of different options, like storing the key with iCloud or Google, or having a PIN that’s easily remembered and things like that.

Thomas: I think it sounds like I’m dunking on you. And it’s the opposite thing, right? The cynical view of all this stuff, is, like, you’re a big tech company, and whatever encryption, whatever end-to-end stuff you’re doing is all performative, right? And I think people should pay more attention to this. But one sign that you might be looking at something performative is if it’s extraordinarily usable under all possible circumstances. And here you’ve surfaced a trade off, right? Like, there’s an explicit concession to this, and you wouldn’t have done it unless, I don’t know, the counter cynic in me would say you wouldn’t have exposed that, that you wouldn’t have that somewhat rough edge there if there wasn’t an actual privacy reason for it, which I think is pretty neat to see that happening.

Timothy: Yeah.

Jon: Thank you. Certainly lots of things we’ve done for privacy reasons here. Definitely.

Thomas: Can we talk about epochs? Explain epochs.

Jon: Sure. Okay, so this is all going back to that notion of device revocation in Labyrinth. Also, it kind of relates to your earlier question around, well, are we going to retroactively apply Labyrinth to all historical content? So the ideal situation is that any of your message history is only available to the devices which you currently have authorized on your account. But the reality is, I mean, we can prevent it being available, but decryptable, shall we say. But that trade off is that whenever a device is removed from your account, you’re then going to have to re-encrypt everything, and that could be vast amounts of data. And so the trade off we went for is saying what we want to make sure is that messages are only decryptable by devices which at one point had access to them. And of course, there’s other engineering going into making sure that devices shouldn’t be keeping around the keys and shouldn’t have access, like access control doesn’t go out of the window just because you’re encrypting things. But yes.

So what’s happening in Labyrinth is that means, and a lot of the complexity in the protocol comes from this, that when you remove a device, we will be generating new entropy that gets mixed into the secrecy. Essentially, it’s almost like a hash ratchet within the center of the protocol that then gets distributed to all other clients and then they can use that moving forward to encrypt future messages stored into Labyrinth. We chose to use essentially an HPKE for distributing that epoch entropy. And in a world where everything is ratcheted, that could seem like an odd choice. But essentially the justification here was really around reliability, in that we need this to work when, say, you may have generated a recovery code, which within the protocol acts exactly like a device, that might not be used for a year or more, and then you use it to recover. And if you have a hash ratchet which is broken at some point you’re kind of stuck. So we needed to use something that we knew was going to work in one shot. And so we went with HPKE there.
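The "fresh entropy mixed into a hash ratchet" idea can be sketched with nothing but HMAC. This is a minimal illustration, not Labyrinth's actual key schedule: the function names, labels, and key sizes are assumptions, and the HPKE step that would deliver the fresh entropy to each remaining device is left out entirely.

```python
import hashlib
import hmac
import secrets


def next_epoch_secret(current_secret: bytes, fresh_entropy: bytes) -> bytes:
    """Mix fresh entropy into the current epoch secret (hypothetical schedule)."""
    return hmac.new(current_secret, b"epoch-rotation" + fresh_entropy,
                    hashlib.sha256).digest()


def message_key(epoch_secret: bytes, message_id: bytes) -> bytes:
    """Derive a per-message storage key from an epoch secret (hypothetical)."""
    return hmac.new(epoch_secret, b"message-key" + message_id,
                    hashlib.sha256).digest()


# Epoch 0: every enrolled device knows this secret.
epoch0 = secrets.token_bytes(32)

# A device is removed: a remaining device samples fresh entropy, which would
# then be sent to the other remaining devices (via HPKE in the conversation).
fresh = secrets.token_bytes(32)
epoch1 = next_epoch_secret(epoch0, fresh)

# The removed device still knows epoch0, so it can derive old message keys,
# but without `fresh` it cannot derive epoch1 or any key under it.
old_key = message_key(epoch0, b"msg-41")
new_key = message_key(epoch1, b"msg-42")
```

This matches the trade off described above: old messages stay decryptable by devices that once had access, while anything stored after the rotation depends on entropy the removed device never saw.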

Deirdre: What’s very interesting about this is basically all within Labyrinth right now. This is like after basically you have things getting delivered to enrolled devices via the Signal protocol and the fact that you have epochs of enrolled device keys and root entropy and HPKE, I’m just, hmm, this is looking a lot more like MLS and TreeKEM and a lot of the things that you might see in that collection of documents. Is this sort of like a happy accident or sort of mutual pollination of ideas? Or sort of, what do you think?

Jon: I’d say mutual pollination of ideas here. I mean, I was closely involved in particularly the early stages of MLS. And so the notion of, hey, something’s changed in the devices, we’re in a new epoch, that terminology maps around directly into Labyrinth. HPKE less so. That was more just, it was the right tool for the job. But there’s a hell of a lot of cross pollination within end-to-end encryption.

Deirdre: Like explicitly MLS is trying to be scalable for very large groups. It is very large in that you have many, many accounts, you may have many devices per account, but not the same way that MLS is trying to scale to tens of thousands of members in a single group and ratcheting forward every add, every remove, every rekey, you know, blah, blah, blah. That’s how they do epochs and stuff like that.

Jon: Right.

Deirdre: But do you kind of see it seems like there’s a convergent evolution of multiple devices, incoming, outgoing, ratcheting, and that sort of thing. So even if it’s like slightly different domain applications, there seems to be a convergence of how these things are getting designed, even if it’s a sort of storage versus full on management of the actual message delivery which MLS is trying to.

Jon: I mean, I think a lot of the problems that they’re solving are quite similar. I think some of it is honestly coincidental, but similar solutions were needed.

Deirdre: Yeah.

I wanted to ask, I saw a single line that said, ‘we found a thing during formal verification of Labyrinth.’ And I grepped and I grepped and I grepped and I couldn’t find any more information about formal verification of Labyrinth. What did you do? What did you find? Tell me more, please.

Jon: Sure. Yeah, so this was actually more on the earlier iteration of Labyrinth than the later one. But we were careful when we removed certain aspects of it, so that stuff should still map over. We worked with Karthik Bhargavan to basically get some, I think it was symbolic model proofs around our claims. And I think we originally had five or six claims we were aiming for in Labyrinth. I think we’ve reduced that down to two or three security claims. I can’t remember the exact results, but we tried to highlight certain things in the white paper.

For example, the equivalence with protecting everything just under a single symmetric key. That was like our top line. If we can’t say that, then we’ve got a big problem. And then we got proofs around what was happening in epochs. Epoch rotation subject to certain conditions, such as if a collusion occurs at a certain point and you restore from an earlier epoch and roll forward, that creates more attack risks than if you’re restoring from the absolute latest epoch. But yeah, no, it was great to have that formal verification. And again, if you’re looking at sort of cross pollination, having worked on MLS and all the formal verification going on there, it was certainly an obvious thing to look at.

Deirdre: I love that. Did you do some of that work before you had designed the oblivious revocable function stuff or did you have to model that? The symbolic thing?

Jon: We did this after the oblivious revocable function. Man, I can’t even say it. If I remember correctly, Karthik had to make certain assumptions about that because our proofs, I believe, were in the computational model when we did the ORF. So it didn’t directly map. But yeah, under those assumptions that he made, it seemed to work okay.

Deirdre: I guess you can kind of hand wave over the specific niceties of their- you will compute the same thing but they’re revocable. Yeah. You can just kind of hand wave that they will always agree and things like that, and then you can get all these other things to be proven or something like that. Cool. All right.

So one innovation that Facebook Messenger Secret Conversation specifically brought to encrypted messaging was message franking. It looks like that, that’s fully present. Any changes there? I didn’t see anything in the white paper that indicated that anything changed there. But do you want to talk about that a little bit?

Jon: Sure. I mean, that one will be fairly quick. I’m pretty sure we didn’t really change it much. We re-implemented it for the new stack. But aside from that, it’s essentially the same idea, exactly the same motivations. We want to make sure that people are able to report messages and that it’s very hard to actually spoof a report and particularly with certain types of content, the impact of just being accused of having shared it can be very great. So we wanted to be relatively confident when reports come in that they’re actually authentic to some degree.

Deirdre: Is message franking all on the delivery layer or does anything with hooking in Labyrinth impact- I’m thinking mostly from like a security analysis perspective of the whole systems. Like, “oh, we’ve designed franking without Labyrinth in mind.” Does bringing in Labyrinth have an impact about how you think about either franking or reporting or anything like that for kind of Facebook messaging as a whole? This is kind of generic, but, yeah.

Jon: That’s a really good question. I mean, one of the impacts was we had to make sure that we were storing our payloads in the same format that we were transmitting them, which isn’t necessarily the obvious choice, and that comes with trade-offs itself. But yeah, if you’re franking it, you don’t want to be deserializing it and then having to reserialize, because that’s a recipe for lossiness or stuff going slightly wrong. So that was definitely a big impact. And then, let me think. There was another one we had to think about and I’m having to page it back into my cache. You know what? I’ve lost it. But there was another interesting one.
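The reserialization hazard Jon describes can be shown with a small sketch. Everything here is an assumption for illustration: the JSON wire format, the fixed key, and the helper name `tag` are hypothetical and are not Meta's actual payload format or franking construction. The point is only that a MAC covers exact bytes, so a parse-and-reserialize round trip can invalidate it even when the data is logically unchanged.

```python
import hashlib
import hmac
import json

# Hypothetical key and payload. JSON is used purely for illustration.
franking_key = b"\x01" * 32
wire_bytes = b'{"text": "hi",  "ts": 1}'  # double space: valid JSON, not canonical


def tag(key: bytes, payload: bytes) -> bytes:
    """MAC over the exact payload bytes."""
    return hmac.new(key, payload, hashlib.sha256).digest()


original_tag = tag(franking_key, wire_bytes)

# Round-tripping through a parser gives equivalent data but different bytes.
reserialized = json.dumps(json.loads(wire_bytes)).encode()

# Storing the transmitted bytes verbatim keeps the tag checkable.
stored_verbatim_ok = hmac.compare_digest(original_tag, tag(franking_key, wire_bytes))

# Re-encoding before storage silently breaks verification.
reserialized_ok = hmac.compare_digest(original_tag, tag(franking_key, reserialized))
```

Here `stored_verbatim_ok` is true and `reserialized_ok` is false, which is why keeping the stored payload byte-identical to the transmitted one matters once a tag has been computed over it.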

Deirdre: No, but that’s a really good point because things that were just sort of easy decisions to make, which is like you’re going to MAC over the entire ciphertext as it was on the wire because that’s all you had at the time. And now you’re like, oh, wait, we can’t change this now because if you had something franked as the Signal ciphertext and then someone tried to bring it up from Labyrinth, and it was completely different, it was like decrypted, re-encrypted, or re-encoded, and then re-encrypted for Labyrinth, and then you’re like, oh, wait, I want to show you something that only exists in Labyrinth now. And you’re just sort of like. You basically lose franking if you didn’t keep all of your records about the message you had franked in the delivery mechanism. You’re sort of, kind of screwed. You’re just sort of like, big shrug emoji.

I’m hoping that there weren’t any decisions about franking over the ciphertext that you regretted when you were deploying Labyrinth or designing Labyrinth.

Jon: So there’s an interesting point there, which is that this is actually a really good reason to frank the plain text rather than the ciphertext.

Deirdre: Oh, okay. Yeah.

Jon: So you have your franking key, which is just random or pseudorandom, but is hidden from Meta, which means that when you frank the plain text, you’re not really revealing anything. You’re just revealing 32 bytes of randomness, or however long our franking tag is. And we don’t want to keep Signal ciphertexts around, because Signal ciphertexts are not useful to persist: the keys are constantly changing. It would just add an entire new layer of complexity.
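A minimal sketch of franking over the plaintext, in the general shape Jon outlines: a random per-message franking key produces a tag over the plaintext, the server binds that tag to delivery context with its own key, and a report reveals the plaintext and franking key so both can be rechecked. The function names, the context encoding, and the choice of HMAC-SHA256 are assumptions made for this sketch, not a statement of what Messenger actually implements.

```python
import hashlib
import hmac
import secrets


def hmac_sha256(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()


def sender_frank(plaintext: bytes) -> tuple[bytes, bytes]:
    """Sender side: fresh random franking key and a tag over the plaintext.

    The key travels inside the encrypted message; only the tag is visible
    to the server, and it looks like 32 random bytes.
    """
    franking_key = secrets.token_bytes(32)
    franking_tag = hmac_sha256(franking_key, plaintext)
    return franking_key, franking_tag


def server_stamp(server_key: bytes, franking_tag: bytes, context: bytes) -> bytes:
    """Server side: bind the opaque tag to delivery context (hypothetical encoding).

    The tag has a fixed length, so concatenating it with context is unambiguous.
    """
    return hmac_sha256(server_key, franking_tag + context)


def verify_report(server_key: bytes, plaintext: bytes, franking_key: bytes,
                  franking_tag: bytes, context: bytes, stamp: bytes) -> bool:
    """On report: recheck the tag against the revealed plaintext, then the stamp."""
    tag_ok = hmac.compare_digest(hmac_sha256(franking_key, plaintext), franking_tag)
    stamp_ok = hmac.compare_digest(server_stamp(server_key, franking_tag, context), stamp)
    return tag_ok and stamp_ok
```

Because the tag is computed over the plaintext, nothing in the check depends on a Signal ciphertext still existing at report time.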

Deirdre: Yeah, good.

Jon: And I also remembered the other point, actually, with bringing in Labyrinth into the system, which is now that we have online storage for some messages, if we’re saying we need to report the last 30 messages, or whatever it is, whatever that reporting window is.

Deirdre: That’s right.

Jon: You want to make sure that the client already knows which messages it’s reporting before it does that. And so that there’s a bit of an interaction there with the cache, the local cache, to make sure that the server can’t just send down a different set of messages from the ones you may have thought you were revealing at the point that it pages them in. So we did have to think about that, but we figured it out.

Deirdre: I keep forgetting that franking is thinking about the context, not just a single message. It’s thinking about the messages before, and possibly after, when you’re actually making a report. I’m always like, you MAC a message. And it’s like, no, context is also very valid, because you can make a joke, and in context it’s fine, but out of context it seems very bad or something like that. And then you report it to Meta and you get your account deactivated for a month or something like that. Things that you forget about.

Okay, one last thing before we pivot. Deploying end-to-end encrypted anything on the web is just a harder thing than end-to-end encrypted mobile apps or even desktop apps to a degree. And I know that you kind of did what WhatsApp has kind of done here is leveraging the Code Verify- I think it’s a browser extension to help-

Jon: It is, yeah.

Deirdre: Out-of-band, check that, the version of the web app software that’s getting served to you to be one of the ends of your end-to-end encrypted messenger, so the web app version of Facebook Messenger that’s now supporting end-to-end encryption is what you are expecting it to be. Can you talk a little bit about that and especially how you got that working in the big blue facebook.com app?

Jon: Sure, I can talk a little bit about it, but as I’m sure you can imagine, we had an entire web team looking at getting this working, who I’m sure could wax lyrical about this for days on end if we asked. But yeah, it’s certainly this constraint of, we need to sort of know in advance what code either will or might run in the browser so that you can attest to it and have it covered by the Code Verify extension. That was just very difficult, particularly in the context of facebook.com. It was already hard for messenger.com, but yeah, we had to essentially look at all of the frameworks that we’re using, all of the developer efficiency tooling that’s been built, some of it established for years. And while certain aspects of the site’s architecture have, I think, actually moved closer to what’s useful for this in more recent years, there was a bit of that push and pull between where are we tweaking the architecture of the site and where are we just saying, you know what, the Code Verify extension is a lot more complicated than one naively may think: it attests to manifests of JavaScript code which might run, and then it has multiple manifests because we don’t want to be paging in megabytes at a time, and most people don’t need that. So there’s the most common manifest, which is quick to load, and the long tail manifest, which covers everything else. I have a lot of respect for the perseverance of that team who did that. They did amazing work on that.
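The two-manifest idea can be sketched as a lookup against a small, always-loaded set of script hashes, with a larger set fetched only on a miss. This is a hedged illustration, not the Code Verify extension's real manifest format or verification flow: the use of SHA-256, the function names, and the `load_long_tail` callback are all assumptions.

```python
import hashlib
from typing import Callable, FrozenSet, Iterable


def sha256_hex(source: bytes) -> str:
    return hashlib.sha256(source).hexdigest()


def build_manifest(scripts: Iterable[bytes]) -> FrozenSet[str]:
    """A manifest here is just the set of hashes of scripts allowed to run."""
    return frozenset(sha256_hex(s) for s in scripts)


def check_script(source: bytes,
                 common_manifest: FrozenSet[str],
                 load_long_tail: Callable[[], FrozenSet[str]]) -> str:
    """Check a script against the small manifest first, then the long tail.

    load_long_tail is a hypothetical callback that fetches the larger manifest;
    it is only invoked when the common manifest misses, so most page loads
    never pay for it.
    """
    digest = sha256_hex(source)
    if digest in common_manifest:
        return "common"
    if digest in load_long_tail():
        return "long_tail"
    return "unverified"
```

A caller would treat `"unverified"` as the signal to warn the user, since the script matches nothing that was attested in advance.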

Deirdre: I can imagine. Very cool. From the web perspective, I know that you’ve done a ton of work to make this workable, and having an identity management system where you can Facebook login, and then treating all these clients as new devices and managing them partially with Labyrinth, makes deploying to web like this plausible. Because if you just treated your web client as like, here’s an identity key, I hope you don’t lose it, that’s your whole identity, that doesn’t quite work. So having all of this infrastructure in place and including Code Verify in it makes this possible. The fact that it’s possible means that other people can look to you and say that it’s possible. Cool. All right.

Timothy: Exactly.

Jon: Yeah.

Timothy: One cool thing about Code Verify before we move on is that it works across all of Meta’s apps, right? So it helps verify WhatsApp, facebook.com, messenger.com, Instagram. It’s sort of built in a way where there’s one plugin that helps you verify the encrypted messaging on the web across all of those different services.

Deirdre: That’s so useful. And I know that other people have talked about either trying to fork it or emulate it or something like that, and try to get like Code Verify, but completely different other set of web apps or something like that.

Timothy: It is open source, right?

Jon: We’d love to see people looking at it. Yeah, it’s open source.

Timothy: Yeah. Okay.

Jon: Yeah. I think in a certain way it would be great sort of in the long run to see something along these lines kind of being standardized, some sort of like web binary transparency or et cetera. But yeah, we had to start somewhere.

Deirdre: And really quick; we mentioned Instagram DMs in the beginning, and this kind of dovetails with all the work that had to be done for persistent history across newly logged-in devices and all sort of stuff. Instagram DMs are having ephemeral, one on one encrypted DMs in preview. Now, I completely understand why you just sort of like cool, ephemeral, non-persistent history, they only live for a little bit of a time, like, easy mode. This is like the easy mode of deploying end-to-end encryption. Is that basically it?

Timothy: For now, yeah. So I guess the story there is, that’s one step forward, right? The end goal here is to bring default end-to-end encryption, persistent storage, all that stuff to Instagram as well. But it’s another very large system with its own complexities, its own set of features that have to be rebuilt, its own set of apps that have to be supported. And so we’re kind of taking one step at a time, and it does already have optional end-to-end encryption similar to what we had on Messenger for many years. So if you want a persistent thread that’s not disappearing, you can create that. Or, as you mentioned, the disappearing messages feature is sort of a way to get that into people’s normal threads and allow us to make sure a lot of the pieces of the puzzle work without having to plug in the message history and the key setup and pins and all of that stuff all at the same time. We will get there, but one step at a time.

Deirdre: Yeah, that’s awesome. All right: David.

David: Yeah, so you mentioned the triangle earlier and you had to introduce this additional pin or backup system, and I’m curious what that conversation was like with the rest of the organization. How did all of this come about? And how receptive were the teams that are, usually, their entire goal is to reduce the number of clicks it takes to log in? Receptive to being like, what if we added more clicks to log in? If you’re willing to talk about it.

Timothy: To clarify a little bit, right, if you’re on, let’s say, facebook.com, the primary reason you’re there is probably not messaging, right? Like, messaging is one feature among many in which people are using Facebook. And so you can continue to access the vast majority of public Facebook without this additional step, right? You can use Facebook groups and all this stuff just logging in. And then you can also do messaging again without having to re-verify anything, add an additional pin or anything.

You can send new messages, and sometimes that’s all you need. You log in on a web browser on, I don’t know, let’s say some computer at a library or a friend’s computer or whatever it may be. You don’t need the message history. In that case, you have full access to Facebook. You have full access to messaging the person you need to message. But if you do want that additional message history, then we’re continuing to reduce the friction to make it easy for you to get access back there. It was obviously a difficult conversation for a lot of different people with competing goals, but it is helpful when the founder of the company has publicly stated that this is going to happen as a forcing function for those conversations.

David: I think the highest ranked person that-

Timothy: Yeah, Mark wants encryption, right. And so that really comes down to it is, we all do, right? We want to improve the privacy of your messaging and your calling. We haven’t really talked about calling, but we also have encrypted calling for all of this as well. And when there’s sort of been that top level buy-in, we’ve got to figure out how to make this major technical hurdle happen so that we can improve the privacy of billions of people. There are a lot of trade-off conversations and difficult conversations, but at the end of the day, we want to be able to make this privacy claim and legitimately improve people’s privacy in this dramatic way.

And so at some points there are lines in the sand where you say, either we’re doing this or we’re not, right? Like if we’re doing it, then this is what we are doing. And if not, then why is this whole project even happening, right? And so coming back to that conversation and having very, very senior people who were very committed to making this happen was critical, right? It’s crucial in a company of the size of Meta.

Deirdre: It definitely, definitely helps when the tippy top of your very large org has said on the record, hey, we’re going to do it. And you could just be like, see previous statement, you’re going to help achieve Mark’s goal?

Timothy: The number of decks that started with, “Mark said.”

Jon: So many sweet baby rays.

Deirdre: Oh my gosh, I feel like I want to go like earworm Mark Zuckerberg now and be like, okay, now we’re going to do post-quantum end-to-end encryption!

Thomas: I think something like that has happened. There’s been like one monumental aligning security statement at some point at every single one of the major techs. Like, it happened at Microsoft after the summer of worms. It happened at Google after they got owned up by the whatever thing. It happened with Mark with a particularly potent hit of barbecue sauce. I don’t know when Apple did it, but it happened, right.

Thomas: It’s just interesting that that works.

Deirdre: It’s not me. I was too young for some of those. To pivot back:

Poor Jon. I say post-quantum and he just looks exhausted. But the Signal Foundation released, a couple of weeks ago, their first stab at making at least the Signal protocol post-quantum, and maybe they’ll eventually get to the rest of the features that Signal the service supports cryptographically. They changed the first handshake for a pairwise conversation: the original was called Triple Diffie-Hellman, or Extended Triple Diffie-Hellman, and their post-quantum variant is a hybrid called PQXDH. You add some Kyber keys in there and you shove them in the KDF and yada, yada, yada, and you get a little bit of formal analysis from Karthik et al., and then you make some fixes and it makes it even better.

Is there any discussion of piloting that in end-to-end encrypted Meta products?
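The "shove them in the KDF" step of a hybrid handshake can be sketched as follows: the classical Diffie-Hellman outputs and the KEM shared secret are concatenated and fed through one key derivation, so an attacker has to break both to recover the session key. This is an illustration under loud assumptions: the inputs are placeholder byte strings rather than real X25519 or Kyber outputs, the salt and `info` label are made up, and the exact input ordering and encoding in PQXDH should be taken from Signal's specification, not from this sketch. The HKDF itself follows RFC 5869.

```python
import hashlib
import hmac
from typing import Sequence


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF extract-then-expand with SHA-256, per RFC 5869."""
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def hybrid_session_key(dh_outputs: Sequence[bytes], kem_shared_secret: bytes) -> bytes:
    """Derive one session key from classical DH outputs plus a KEM secret.

    Assumes every input is fixed-length, so plain concatenation is unambiguous.
    The salt and info values are hypothetical labels chosen for this sketch.
    """
    ikm = b"".join(dh_outputs) + kem_shared_secret
    return hkdf_sha256(ikm, salt=b"\x00" * 32, info=b"hypothetical-hybrid-handshake")
```

Changing either a Diffie-Hellman output or the KEM secret changes the derived key, which is the property that makes the construction hybrid.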

Jon: So, yes, obviously we saw that and have been looking at it with interest. Interestingly, I think in our case, something like Labyrinth is actually a much more natural starting point in terms of the post-quantum threat. Because what Signal built protects against the harvest now, decrypt later attack, and the dependency of that is the harvesting. In the case of Labyrinth, that harvesting is done by the design of the product, whereas obviously Signal ciphertexts are designed to be ephemeral. And actually going back to that question of, oh, why didn’t we just design our own protocol which handles delivery and storage in one? It wasn’t the primary reason, but one actual benefit we get from our design is that Labyrinth, if you sort of ignore all of the epoch rotation and treat it as a protocol that allows you to add but not remove devices, is, we think, post-quantum secure in that threat model, because it’s got that basis of symmetric cryptography. So you’ve got this one key, and you need to use it to go forward or back. Well, it’s not one key as a whole, it’s rotating. But if you get one of them, you can go forward and back. You can crack all the asymmetric cryptography that you like, but you’re still not going to get into the core of the Labyrinth protocol there.

So that was sort of where we were thinking. But I mean, I say you can crack all the asymmetric cryptography you like. We’ve got a few pieces of Labyrinth-adjacent or internal asymmetric crypto. We’ve got the HPKE, which is potentially a natural target for thinking about that. You then also have two of the ways in which you can add a device in Labyrinth, one of them being using these HSMs. And so we’d need the negotiation with the HSM itself to be post-quantum secure. Currently we’re using OPAQUE, so that wouldn’t be. And then also if you’re doing a direct device add, then we use the CPace key exchange, which again is based on classical primitives.

So I think experimenting with Signal post-quantum is definitely interesting. I’m thinking more about Labyrinth at the moment for that reason. But I’m sure we will be looking across the board, as all big tech companies are over the next however many years.

Deirdre: I do think that we’re seeing even more deployments of PAKEs like OPAQUE, and I think you said CPace, than we ever have. And as far as I know, there aren’t really any post-quantum variants of those. Maybe there are some fully-homomorphic-encryption-based ones that are very expensive and very heavy, and maybe you can do something like that with lattices, but no one wants to use them. So research in that area, making something that’s hybrid secure or post-quantum secure, is probably well motivated, because you’re shifting the attack, the deliciousness, from Diffie-Hellman and something like Extended Triple Diffie-Hellman to: if we crack the OPAQUE sign-up enrollment for a new device, we get everything. Or we get a lot. You stored all the goodies for us, so we’ll just take that, thank you. I know some cryptographers listen to our podcast. If you want a good research topic: efficient, useful, post-quantum or hybrid PAKEs, I think, will be very useful to some people.

Jon: Absolutely.

Deirdre: Jon, Tim, thank you so much. We’ll be linking both of your white papers, and there’s just a lot of cool stuff in here. There’s the Labyrinth stuff, there’s the ORF primitive, which is really cool new stuff, and all the other little details of getting this deployed. Thank you so much for talking with us.

Jon: Thank you very much. It’s been really great joining you.

Timothy: Yeah, thanks for having me. Thanks.

Security Cryptography Whatever is a side project from Deirdre Connolly, Thomas Ptacek and David Adrian. Our editor is Nettie Smith. You can find the podcast on BlueSky @scwpod and the hosts on BlueSky @durumcrustulum, @tqbf, and @davidcadrian. You can buy merchandise at https://merch.securitycryptographywhatever.com. Thank you for listening.
