Security threats from the Chrome Web Store with Sheryl Hsu

August 13, 2024

Have you ever wondered how secure your browser extensions really are? On our latest Secure Talk episode, join us for a discussion with Sheryl Hsu, a brilliant researcher from the Stanford Empirical Security Research Group, to uncover fascinating insights from her recent study, "What is in the Chrome Web Store? Investigating Security-Noteworthy Browser Extensions."

Discover how these small add-ons pose a significant security vulnerability to a very large population of internet users. We also delve into the perverse incentives that confront organizations like Google in managing their "app" ecosystems. Don't miss out on this eye-opening discussion!


Secure Talk - Sheryl Hsu

Justin Beals:  Hi, everyone, and welcome back to SecureTalk. This is your host, Justin Beals. I'm certainly happy to have you joining us today. We have an excellent guest on the podcast today. I'm very excited to talk with her. Sheryl Hsu is currently a student at Stanford University. She spent time as a researcher for the Stanford Empirical Security Research Group, where she recently published a paper titled “What is in the Chrome Web Store? Investigating Security Noteworthy Browser Extensions”, along with her co-authors, Manda Tran and Aurore Fass. Sheryl, thanks for joining the podcast today. We're really glad to have you on it. 

Sheryl Hsu: Thank you. It's an honor.

Justin Beals: All right. Excellent. Sheryl, am I right that you recently gave a presentation on this paper at a conference in Singapore? Is that correct?

Sheryl Hsu: Yeah, literally just last week. 

Justin Beals: Wow. That's amazing.

Sheryl Hsu: The ASIA CCS conference. Yeah.

Justin Beals: How was it received? Did you enjoy the conference? 

Sheryl Hsu: Yeah, I think the conference was really good. You know, a lot of really interesting work. And I think one of the nice things about our paper is that it's kind of a broad overview, so it can impact a lot of people, and it's something that a lot of people can use. It's quite a broad study.

Justin Beals: Well, after reading your paper, I was certainly looking through all the Chrome extensions that I've installed, thinking about, you know, largely my team and how we're managing this. It is certainly an interesting way to inject malware or other types of issues. And as you describe it, I love the definition: "security-noteworthy extensions" was a great description. Before we get too deep into the research work that you've been doing recently, I'd love to just learn a little bit about your background and what inspired you to enter the field of computer science.

Sheryl Hsu: Yeah, for sure. So, I'm originally from the Bay Area. So obviously, like, kind of growing up in the heart of Silicon Valley, a lot of tech. But I think from a young age, I just really loved building things. Like, I have photos of me building a fishing rod out of, like, markers and tape. And so I think engineering, and just being able to create things, was a really natural fit. I just love the process of dreaming of something, imagining something, and then slowly seeing it come to life through all of the work you put in, assembling each piece. And then computer science is just really cool because you can build and develop so quickly compared to other forms of engineering. In one day you can literally have something that's really useful to yourself or useful to others. That's just really exciting.

Justin Beals: Yeah, I think that's why I got interested in it as well. Both from two aspects. I like taking things apart and putting them together. So I wanted to know how software worked and computers work. But then I, I love, like you mentioned, this idea that I could kind of rapidly create an interesting idea and put it out in the market and see how it does.

How long have you been studying at Stanford now? 

Sheryl Hsu: Two years.  Halfway through. 

Justin Beals: Yeah, that's awesome. Well, halfway through is downhill, right? Or it gets harder at the end, one of the two. 

Sheryl Hsu: You want to dream bigger. 

Justin Beals: Yes, that's right. And it seems like you've got a deep focus on the machine learning side in the education work that you're doing.

Do you think that's the momentum for a lot of students today, a lot of your peers? Is that just where computer science is going, where the innovation is at?

Sheryl Hsu: Yeah, definitely. It feels like a lot of people are very interested in machine learning. I think there's definitely just, like, so much hype and excitement in the air.

You can't really go to one social event without talking about AGI. So it's an exciting time to be alive.

Justin Beals: Yeah, that's awesome. I've certainly had this interest for a while now. You know, I was really interested in natural language processing about 15 years ago. I started reading a lot about it and including machine learning or data science as kind of a software layer in the applications that we were building.

And I feel like broadly now the rest of the computer science industry is catching up with, you know, what was possible. I guess the hype engine has hit.

Sheryl Hsu: Yeah. I don't know that people are all the way caught up, even.

Justin Beals: Well, that means we're really excited to see what you build in the future, Sheryl.

So I'm curious about the topic area for your paper on the Chrome Web Store. You know, what attracted you and your team to researching this particular space and producing the paper that you did on it?

Sheryl Hsu: Yeah, for sure. So I think, like, cybersecurity is just always really interesting because it's so needed. Like, no matter how good AI gets, no matter if, like, half of the jobs in the world are replaced, you know, you're still going to need to be able to defend your systems and keep everything secure. So, in terms of the analogy of, like, build the mining supplies, don't be a gold miner, I think it fits well there. And specifically, browser extensions are something that, as you mentioned, everyone has used before.

And it happened to be an area that Aurore had done a lot of work in. Specifically, in her previous work, she focused on analyzing code to find vulnerabilities. At the same time, you know, we were also working on some projects kind of down that scope, in terms of building an extension that could monitor other extensions.

But I think at the end, we kind of realized that there was a big need, in that no one really had clear statistics or clear data on the contents of the Chrome Web Store. You know, even basic things, like, how many vulnerable extensions are there? Basic things like, how are the reviews?

What's the average review? What's the average number of users? Like, literally no one really knew. So we kind of felt like there was a big need for this that could go far beyond just developing one extension.

Justin Beals: Yeah, I certainly have a couple of extensions in my system, and sometimes I go searching in the Chrome Web Store, and it is a little iffy sometimes. I'm kind of like, how can I trust this extension? Did y'all feel the same way in your own utilization?

Sheryl Hsu: Yeah, like, I don't use Chrome extensions anymore.

Justin Beals:   No, you, you've sworn off them. Yeah. 

Sheryl Hsu: Yeah. 

Justin Beals: Well, now you have the skills, Sheryl, to write your own extensions, I think. So, maybe you could just only do Sheryl trusted extensions.

Sheryl Hsu: That's true. 

Justin Beals: Yeah. So, one of the things that I thought was really interesting, philosophically, about your paper is that you have this idea of "security noteworthy." You know, is an extension security noteworthy? What is the definition to you of something like security noteworthy? How should we think about that?

Sheryl Hsu: Yeah, so I think the definition in the paper was quite technical or clear cut, in that they fall into three categories. They're either malicious, policy-violating, or vulnerable. So as far as malicious and policy-violating, the way we got this is that the Chrome Web Store actually releases the reason an extension was removed in their API.

So, like, if you query an extension, it will say it was removed for malware or it was removed for policy violations. So those are very clear. It's literally just what the Chrome Web Store said. But as far as their definitions, you know, I think we're pretty familiar with what malware is.

As far as policy violating, that's things like, oh, they didn't like, write a privacy policy, or their data policy is like, not consistent with what the Chrome Web Store wants, or they're requesting too many permissions that they don't need. They're like, failing to include some disclaimer that they're like, using the data for advertising, that sort of thing.

And then the last category was vulnerable extensions. And, you know, it's overall hard to find vulnerable extensions. So this was mainly a set of extensions that had been collected by Aurore in her previous work on static analysis of code for vulnerabilities.

Justin Beals: Yeah, so we have containing malware. And I think what I read in some of the data is that malware is going to steal user-sensitive data, track users, spy on them, or propagate malware. This seems like the most critical of the security risks, right? Those would be the extensions we're most concerned with. Correct?

Sheryl Hsu: Yeah, definitely.

Justin Beals: Yeah. The next one that you mentioned is the policies. So this is more about extensions that didn't meet Google's requirements, from a policy perspective, to be in the Chrome Web Store. But it could be just lack of maturity. Like, they put together a quick extension but didn't include all the information they need, or they got, you know, real loose with the permission set that they might be asking for.

Sheryl Hsu: Yeah, although Google does vet extensions before they're added the first time. So these are extensions that were removed afterward. So hopefully it isn't completely immature extensions. Like, those should never have been added in the first place.

Justin Beals: Yeah, I mean, then finally containing vulnerabilities and you mentioned that's the hardest one to kind of, to identify mostly because maybe the extension is trying to do what it's supposed to do. But the developer didn't write very secure code for that particular extension. So, it's representing a new vulnerability. 

Sheryl Hsu: Yeah, like, for example, a lot of extensions will, like, they'll need to pass messages between, like, different pieces of code.

So, ideally, you're going to check where the message came from before you do anything with it, right? But some extensions will not check. And the next thing you know, literally anyone could be sending them a message, and they're just executing whatever is sent. They're executing arbitrary code.
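The sender check described here can be illustrated with a small sketch. Real extension code would be JavaScript; this is a language-neutral illustration in Python, and `TRUSTED_ORIGINS`, `ALLOWED_ACTIONS`, `Message`, and `handle_message` are hypothetical names rather than part of any browser API. The idea is to verify the sender first, and to map messages onto a fixed set of actions instead of executing whatever text arrives.

```python
from dataclasses import dataclass

# Hypothetical allowlist of origins this handler is willing to trust.
TRUSTED_ORIGINS = frozenset({"https://example.com"})

# Hypothetical fixed set of actions. Message text is never evaluated as code.
ALLOWED_ACTIONS = {
    "ping": lambda payload: "pong",
    "echo": lambda payload: payload,
}


@dataclass(frozen=True)
class Message:
    origin: str   # where the message came from
    action: str   # name of the requested action
    payload: str  # data for the action


def handle_message(msg: Message) -> str:
    """Check the sender before acting, then dispatch to a known action."""
    if msg.origin not in TRUSTED_ORIGINS:
        raise PermissionError(f"untrusted sender: {msg.origin}")
    handler = ALLOWED_ACTIONS.get(msg.action)
    if handler is None:
        raise ValueError(f"unknown action: {msg.action}")
    return handler(msg.payload)
```

A handler that skipped the first `if` and passed the payload to something like `eval` would be the vulnerable pattern discussed in the conversation.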

Justin Beals: Yeah. That's the scariest. Speaking of which. I'm curious a little bit about the extensions themselves and just the technology feature set that they provide. Is there a typical language? Are these written in JavaScript? Or are there different choices that people can make in coding up an extension? 

Sheryl Hsu: Yeah, so it's almost entirely written in JavaScript.

There'll be like some HTML, CSS for like the UI part. But it's basically all in JavaScript, and then people can just, like, query their back end or whatever they want as with like a normal page. 

Justin Beals: Okay. So a RESTful API, typically, or whatever API integration to the back-end database. And then, from a JavaScript perspective, you know, JavaScript is very package dependent.

So I assume that, in developing an extension, you could embed whatever packages you want from third-party developers as well.

Sheryl Hsu: Yeah, so, like, stuff like jQuery, Angular, that sort of thing. It's quite interesting because we actually found, like, a lot of use of outdated libraries in terms of, like, versions of jQuery or Angular that have, like, known vulnerabilities that are patched, but, like, developers just don't update. So, everyone's kind of using this, like, old, vulnerable version of common libraries. 
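A check for outdated bundled libraries could look roughly like the following Python sketch. It is not the method used in the paper. The `MIN_SAFE_VERSIONS` table holds placeholder values for illustration only, not real advisory data, and `outdated_libraries` is a hypothetical helper that assumes plain dotted numeric version strings.

```python
from typing import Dict, List, Tuple

# Placeholder minimum versions for illustration only, NOT real advisory data.
MIN_SAFE_VERSIONS: Dict[str, Tuple[int, ...]] = {
    "jquery": (3, 5, 0),
    "angular": (1, 8, 0),
}


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '3.4.1' into (3, 4, 1)."""
    return tuple(int(part) for part in text.split("."))


def outdated_libraries(bundled: Dict[str, str]) -> List[str]:
    """Return the bundled libraries that are older than the listed minimum."""
    flagged = []
    for name, version in bundled.items():
        minimum = MIN_SAFE_VERSIONS.get(name.lower())
        if minimum is not None and parse_version(version) < minimum:
            flagged.append(name)
    return sorted(flagged)
```

Libraries that are missing from the table are simply skipped, so the result is only as good as the version data behind it.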

Justin Beals: Yeah. So, I had a couple of questions I wanted to ask you about what's possible if I wrote a browser extension, and I'm curious if you could help me say yes or no to each. For example, let's say that I created a browser extension, and I wanted to look for someone logging into a bank in their Chrome browser. Could I, on detecting that URL, overlay the actual bank login interface and collect information?

Sheryl Hsu: You know, so, Chrome extensions can insert code into the DOM. So this allows you to do so many things, right? You could add your own HTML, you could add JavaScript event listeners, and all of that. So that just gives you a lot of flexibility.

Justin Beals: Yeah, that's both powerful, depending on the extension I want to build, and a little scary about what's possible, because we're so thin-client-oriented from a computing perspective these days that we're going into third-party systems for a lot of our data.

Sheryl Hsu: Yeah. The extension has to list what sites it's going to work on, but there is an option to just request all sites. The user accepts that, and then the extension is on everywhere. Like, your ad blocker probably works on all sites, right?

Justin Beals: Yep, I think mine does. Yeah. Could I, can I log like key presses?  So, if someone were in the browser and typing into the interface, can I log keys that are hit? 

Sheryl Hsu: Yes, you can, like, add an event listener and a button pressed is an event. 

Justin Beals: Yeah. 

Sheryl Hsu: So there you are. 

Justin Beals: And I'm imagining then that I could track and send browsing data to a remote server, right? Like, if I have a centralized server and the extension is running, I know what the person is doing in the website and I can send that data back to my remote central server.

Sheryl Hsu: Yeah. I mean, like, you have all the keyboard presses. You can also intercept requests. Yes. So there's a way to kind of, like, hook into every request that's made, and so you can, like, modify it. You could, like, discard it. You could log it, track it, whatever you want. 

Justin Beals: Yeah, and that was my next question.

Actually, can I control the extension remotely, in a way where, if I saw a certain interaction come from the extension to my central server, I could push back to that extension an activity that I wanted it to do, in a very personalized way? Yeah. So, I mean, I think there would be a side of the coin where we'd say this is a powerful set of features for developing really interesting solutions, but it's also a little scary given the ease with which we adopt and install these things in our browsers.

The browser is so central to deeply private work and work that needs security, but these extensions are just living in there, and we get to make decisions about them. Okay. That's awesome. So my next question for you, and you cover this a little bit in the paper, is how pervasive extensions are in Chrome. So, for example, how many users and how many extensions are we seeing utilized?

Sheryl Hsu: Yeah. So in regards to the number of users, the vast majority of extensions have very few users. 64% of them have fewer than 100 users, and then around an additional 18% are in the 100 to 1,000 range. So that leaves maybe 14% of extensions that have over 1,000 users. But overall, a small number of extensions have the majority of the users. Like, 1% of extensions have between 100,000 and a million users.

And then just 0.27% of extensions have over a million users. And then Chrome doesn't give information past, like, a million, past half a million.

Justin Beals: Gotcha. So there's a huge number of extensions that are used by very few users, and that's the bulk of them. But there are still a lot of them out there, and they do very redundant things, right?

Like, it's easy to pick the wrong extension, essentially, when you're looking for the best one, right?

Sheryl Hsu: Yeah. And I mean, you would hope that ratings and stuff help take care of some of that. I think the barrier to submitting a Chrome extension is very small. Like, I've messed around and submitted one before, and it's like $5, compared to Apple, for example, where they want to charge you like a hundred a year, which I think might help. And as a result, I can definitely see people just messing around, building one in half an hour, and it could be on the web store in just a minute.

Justin Beals: And Google does have a group, you mentioned it, that's supposed to review the extensions. How does that process work? It almost seems more approval-focused, so they can get as many extensions in as possible, than necessarily security-minded. Would you agree?

Sheryl Hsu: Well, I don't know, because I don't know that they get money from having more extensions in the store, necessarily. And I think people are kind of concerned about the security, as we've seen from the media.

But once again, I'm not Google. So, I mean, my hypothesis is probably that they do some amount of static code analysis, where they just look through the code with some tools. And then they can also flag extensions based on the permissions, right? Because extensions have to request permissions, for example, being able to operate on all sites, or being able to see all your browsing data.

So they can flag extensions that request more permissions and power, and vet those more thoroughly. I wouldn't be surprised if they were also running extensions in a sandbox of some sort. I know there have been papers where they use a honeypot or a honey page, which is a fake page where they try to see what the extension does to the page and determine if there's anything malicious through that.

But I think it's also the cost-benefit thing, right? All these things take time to run. They take money, in terms of compute and everything, to be running these simulations, these tests. So I don't really know to what level they're doing this and for what level of permissions.
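The permission-based flagging hypothesized above might be sketched as follows. This is an assumption about how such triage could work, not a description of Google's actual review process. The `permissions` and `host_permissions` manifest keys and the `<all_urls>` match pattern come from the extension manifest format, while the choice of which permissions count as sensitive, and the `review_flags` helper itself, are illustrative.

```python
import json
from typing import List

# Host patterns that effectively let an extension run on every site.
BROAD_HOST_PATTERNS = {"<all_urls>", "*://*/*", "http://*/*", "https://*/*"}

# Illustrative choice of permissions that might deserve a closer look.
SENSITIVE_PERMISSIONS = {"cookies", "history", "tabs", "webRequest"}


def review_flags(manifest_text: str) -> List[str]:
    """Return human-readable reasons to review an extension more thoroughly."""
    manifest = json.loads(manifest_text)
    declared = manifest.get("permissions", []) + manifest.get("host_permissions", [])
    # Keep only plain string entries so unusual manifests do not crash the check.
    requested = {entry for entry in declared if isinstance(entry, str)}

    flags = []
    if requested & BROAD_HOST_PATTERNS:
        flags.append("runs on all sites")
    sensitive = sorted(requested & SENSITIVE_PERMISSIONS)
    if sensitive:
        flags.append("sensitive permissions: " + ", ".join(sensitive))
    return flags
```

A check like this is cheap compared to sandboxing, which fits the cost-benefit point: it can decide which extensions get the more expensive analysis.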

Justin Beals: Yeah. And I think I read in your paper that the total number of extensions in the Chrome Web Store fluctuates a little bit, between about 125,000 and 150,000 extensions that people can pick from. And that's a huge amount, of course.

Sheryl Hsu: Yeah. 

Justin Beals: I think that there's a catch-22 for Google here. And neither of us is Google, that's a given, but it's a little hard, right? I can go through the thought process here, where Google is like: we cannot build every feature that Chrome could possibly have with our own development team. So we're going to create this marketplace so that other people can contribute, kind of like we see in the open-source marketplace for packages, where, you know, I wrote this code so you don't have to write it again. Or I developed this, and they charge for some of them. So, just like app stores, maybe Google does make a little money off of each app that is paid for, like there's a percent they get. But then to have to go in and check each one for security, over revision after revision, I just don't know that there is enough staff in the world, right?

Even at 150,000 extensions, the number of things you'd have to test is going to get really difficult really fast.

Sheryl Hsu: Yeah, and it also seems to me like extensions very much break a lot of the web security model. Like, you know, there are things that are supposed to be handled by, like, CSRF tokens, and an extension could be stealing stuff that is supposed to be protected by all of these things that we've already implemented.

I think extensions break a lot of that.

Justin Beals: Yeah, I'm curious. In your paper, were you able to calculate the intersection between the security-noteworthy extensions and how many users have adopted them? And I guess what I might be driving at, at the end of the day, is: is there a way to get a probability that any one user has installed an extension that is security noteworthy?

Sheryl Hsu: Yeah. I mean, when we actually went and calculated it, I think there's close to 350 million people who have used a security-noteworthy extension over the last three years, and that's very high, right?

There's only a couple billion people who are using Chrome, so that's quite a high probability.

Justin Beals: Yeah. Okay. I'm going to do a little math here. First off, 350 million people have installed a security-noteworthy extension at some point, out of a billion total users. That means, you know, you have about a 1 in 3 chance of installing a security-noteworthy extension, if 30% of a billion users have already had that experience.

That's a huge number. That's not small, right? 
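For the back-of-the-envelope math in this exchange, the ratio is easy to work out exactly. The Python sketch below uses the 350 million figure mentioned in the conversation with two denominators: the one billion used in the quick estimate, and two billion as one assumed reading of "a couple billion." Both denominators are assumptions, and the result is a rough share of users, not a rigorous per-user probability.

```python
def install_share(affected_users: float, total_users: float) -> float:
    """Fraction of the user base that has used a security-noteworthy extension."""
    return affected_users / total_users


AFFECTED = 350e6  # figure quoted in the conversation

share_of_one_billion = install_share(AFFECTED, 1e9)
share_of_two_billion = install_share(AFFECTED, 2e9)

print(f"{share_of_one_billion:.1%} of 1 billion users")
print(f"{share_of_two_billion:.1%} of 2 billion users")
```

Against one billion users the share is 35%, close to the "1 in 3" estimate. Against two billion it is 17.5%, or roughly 1 in 6, which is still a large fraction.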

Sheryl Hsu: There have been malware-containing extensions that have had over 100,000 downloads, right? Like, some of these really popular extensions are actually not that safe.

Justin Beals: Yeah, this thing with the Chrome extensions reminded me of something that we've gotten a lot with, um, anti-malware tools themselves, right?

Like, I have long had this complaint that antivirus and anti-malware tools, you know, the ones that are supposed to detect and protect us, become viruses and malware themselves over time. It's really frustrating. You can't trust them. It, you know, becomes its own enemy in a way.

Sheryl Hsu: Yeah, very ironic. I guess you have to give all of those tools so much permission and so much access so they can monitor. That's just kind of the danger in and of itself. 

Justin Beals: Yeah. In your research, have you learned anything about the developers that are building these extensions security noteworthy or otherwise? You know, what are some of the trends in the developer community? 

Sheryl Hsu: Yeah, I don't know that there's a super good sense of who these developers are.

Like, where are they from? All that. There are some interesting things, which are that, in the past, people would actually sell and buy extensions. For example, you build an extension, it gets kind of popular, and you can sell it to someone else for some amount of money. And then that person might start introducing ads, or they'll start introducing malware, just to try to make more money off of it.

I think another interesting thing is that the Chrome Web Store doesn't actually track developers, in that, you know how for Apple, you can click on the developer and it'll show you all the other apps? On the Chrome Web Store, all you have is the developer's name or their website. So it's possible that you have developers who are all using the same name, and they're not actually the same developer.

Which I think kind of makes it hard for people to see how good the developer is, and I don't really know why they wouldn't just change it.

Justin Beals: Yeah, I have a quote from your paper here that I thought was really interesting. You state in the paper, for example, that a developer having published one malicious extension, like a malware extension, publishes on average 3.6 benign, 4.9 malware-containing, 1.4 policy-violating, and 0.00093 vulnerable extensions. So you've found instances of developers having many malicious extensions. And I think that speaks to what you're saying there, which is, at the end of the day, someone is writing this extension and being willing to make it security noteworthy, to use the definition, and they're doing a lot of that. But, you know, Google is trying to track the extension, not the developer, and isn't requiring identity up front to track who's writing the thing.

Sheryl Hsu: Yeah, I think this kind of goes back to something else that we found in the paper, which is that there's a lot of very similar code. So for a lot of these developers behind that statistic that you just pointed out, right, there were developers who had many hundreds of extensions.

And there were clusters of, like, 800 identical malicious extensions. So, I mean, it kind of makes sense. If you have an extension that's doing well, just copy and paste it, change it a little, and you have another extension that's doing well and making you more money, or whatever your ultimate goal is.

Justin Beals: Yeah. The other thing I thought about this data, and you can tell me if I'm misassuming, was that if, let's say, I'm creating a malware-embedded extension, or there's data I want to exfiltrate from people using a browser, it would behoove me to write a bunch of very benign extensions, so that the one that is actually injecting malware and exfiltrating the data I want is kind of hiding behind all the benign ones.

Sheryl Hsu: Oh yeah, definitely.

Justin Beals: Yeah. Tell me, Sheryl, what other changes would you want to see Google make in how they manage the Chrome Web Store? We mentioned one: you know, we'd like clear identity on the developers that are writing extensions. What are the opportunities for improvement, and not just for Google?

There's a lot of these marketplaces being adopted, right? Like we could look at app stores as well. We could include a lot of these scenarios. We could look at the package management solutions for javascript alone and say, what are the decisions that we could make about being more secure?

You know, would you have any recommendations having done this research? 

Sheryl Hsu: Yeah, I think just from being in the space, one concern for Google is definitely fake reviews, especially because Google accounts aren't tied to, like, phone numbers at all, right? So it's very easy for bots to be creating accounts and commenting.

So there are actually quite a few concerns about fake reviews, and that seems like something that's easy enough to solve by requiring additional verification before commenting on anything or reviewing. In addition, all Chrome extension code is actually open source right now.

So when you like, download an extension, it just downloads a zip file and you unpack that, you have the code. Which firstly seems like, okay, now it's really easy for people to exploit vulnerabilities if they find them. 

But secondly, it also makes it really easy for someone to copy an extension, because you literally have all the code right there. Now just put your ads or your malware in there. So I think those are two things. And also, it seems quite obvious, but being able to just identify duplicate extensions doesn't seem like it should be that hard for Google to do.
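Identifying exact duplicates, at least, can be sketched in a few lines. The Python example below is not the clustering method from the paper and not anything Google is known to run. It assumes each extension is available as a mapping from file path to file contents, and `fingerprint` and `duplicate_clusters` are hypothetical helpers.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def fingerprint(files: Dict[str, bytes]) -> str:
    """Hash an extension's file paths and contents in a stable order."""
    digest = hashlib.sha256()
    for path in sorted(files):
        digest.update(path.encode("utf-8"))
        digest.update(b"\0")  # separator between the path and the content hash
        digest.update(hashlib.sha256(files[path]).digest())
    return digest.hexdigest()


def duplicate_clusters(extensions: Dict[str, Dict[str, bytes]]) -> List[List[str]]:
    """Group extension IDs whose unpacked files are byte-for-byte identical."""
    groups = defaultdict(list)
    for extension_id, files in extensions.items():
        groups[fingerprint(files)].append(extension_id)
    return sorted(sorted(ids) for ids in groups.values() if len(ids) > 1)
```

This only catches byte-for-byte copies. An extension that was copied and then changed a little would need a fuzzier similarity measure, which is a harder problem than this sketch covers.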

Justin Beals: That's interesting. I did not know that the actual code in the extension is open in a way. Like, so if you, if you download a Chrome extension, I could find where on my computer that extension is stored in my directory and pull up the JavaScript and take a look at it. 

Sheryl Hsu: Yeah, that's how I did a lot of this research.

Justin Beals: Wow. 

Sheryl Hsu: I mean, it's great for researchers. Yeah, that's amazing. For the general public. 

Justin Beals: Well, I don't know. You know, it's a catch-22, right, Sheryl? Look, I've used a lot of open-source tools in my career to build software that I wouldn't have been able to build if I had had to start from zero. I wouldn't have had the time or the opportunity. And there's also, I think, a fair argument that when we share code in plain text, we write better code, because we get an opportunity to review each other's code and make recommendations.

And that kind of transparency can also help us build better security, because we may get recommendations, or someone might be looking for vulnerabilities. But in this case, it's got another outcome, which is that I can download code from an extension that a lot of people are using, copy that code, inject my malware or ads or whatever violation I want, and upload something that seems to be doing the exact same thing as someone else's extension.

But with my extra flavor on top of it, and users not really knowing the difference.

Sheryl Hsu: Yeah, I think one distinction is that, unlike in the open-source case, we're not building off of each other's extensions. No one's supposed to be looking through the code. Like, there's no reason anyone should be, unless you kind of have ill intent.

Justin Beals: Yeah, so Google could elect to have these things packaged in some way where they're not readable. Yeah, absolutely. And that would be an easy change that wouldn't require a lot of people. It would require a technology change, you know, to implement that.

Sheryl Hsu: Yeah. 

Justin Beals: Yeah. Um, is the extension space continuing to grow?

Like, are we seeing more and more extensions being built as people use browsers more? Um, or do you think we've kind of reached full saturation on what types of extensions people want to implement? 

Sheryl Hsu: Yeah, so since 2020, I mean, there's been kind of a pretty big decline. A lot of that might have been triggered by Google announcing changes, like, oh, you need to move to Manifest version 3 from Manifest version 2, or just new requirements, which might have prompted people who kind of forgot about their extensions, or didn't really care that much, to just take them out of the store.

As far as users, the number is kind of all over the place. Google has a very strange definition of the number of users. When we contacted them to ask them about this, they basically said, the number of users displayed on the Chrome Web Store for a given extension corresponds to the number of Chromes, or I guess just like Chrome softwares, with the extension installed that are pinging the update servers over the previous seven days.

So, if you have Chrome on your phone, computer, iPad, whatever, that's going to count as, like, three separate users. And if you don't use your computer for seven days because you're on vacation or something, it wouldn't count. So it's just a very strange definition. So the trend is all over the place, and I think it's pretty hard to tell if, overall, there are more people using Chrome extensions.
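The user-count definition described here can be made concrete with a small sketch. This is only an illustration of the definition as relayed in the conversation, not Google's implementation. The ping records, install IDs, and the `displayed_user_count` helper are all hypothetical.

```python
from datetime import datetime, timedelta
from typing import Iterable, Tuple


def displayed_user_count(pings: Iterable[Tuple[str, datetime]], now: datetime) -> int:
    """Count distinct Chrome installs that pinged for updates in the last 7 days."""
    window_start = now - timedelta(days=7)
    active_installs = {
        install_id
        for install_id, seen_at in pings
        if window_start <= seen_at <= now
    }
    return len(active_installs)
```

Under this definition, one person with the extension on a phone, a laptop, and a tablet counts three times, while an install that has been idle for more than seven days is not counted at all.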

Justin Beals: What are the manifests? Actually, I did read about that in your paper, and Google has this version 2, version 3. Can you describe what the manifest means to an extension developer? 

Sheryl Hsu: Yeah, I mean, I kind of think of it a little bit as, like, a package.json in other forms of software, but essentially it's just like a file where you're going to write, like, this is where my code is. This is where, like, my other code is. This is, like, where the HTML and CSS that needs to be bundled is. And these are all the extensions that I'm requesting, or all the permissions I'm requesting. For example, I want the permissions to be able to access the path, the permissions to access the cookie, the permissions to be able to access all the API calls that are being made. And so on. And then also, like, these are all the sites I want to be able to work on. 
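To make that description concrete, here is a minimal sketch of a hypothetical manifest version 3 file and a small helper that lists what it declares. The extension name, file names, and site pattern are invented for illustration. The keys (`manifest_version`, `background`, `content_scripts`, `permissions`, `host_permissions`) are standard manifest keys, and `cookies` and `webRequest` are real permission names chosen as examples.

```python
import json

# Hypothetical manifest.json contents for an imaginary extension.
MANIFEST_JSON = """
{
  "manifest_version": 3,
  "name": "Example Extension",
  "version": "1.0",
  "background": {"service_worker": "background.js"},
  "content_scripts": [
    {"matches": ["https://*.example.com/*"], "js": ["content.js"], "css": ["style.css"]}
  ],
  "action": {"default_popup": "popup.html"},
  "permissions": ["cookies", "webRequest"],
  "host_permissions": ["https://*.example.com/*"]
}
"""


def summarize_manifest(manifest):
    """Collect the code files, permissions, and sites a manifest declares."""
    scripts = []
    background = manifest.get("background", {})
    if "service_worker" in background:
        scripts.append(background["service_worker"])
    for entry in manifest.get("content_scripts", []):
        scripts.extend(entry.get("js", []))
    return {
        "version": manifest.get("manifest_version"),
        "scripts": scripts,
        "permissions": manifest.get("permissions", []),
        "sites": manifest.get("host_permissions", []),
    }


summary = summarize_manifest(json.loads(MANIFEST_JSON))
print(summary)
```

A reviewer or a measurement study can read the same fields to see, before running any code, what an extension is asking for and where it wants to run.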

Justin Beals: Yeah, and so Google has, they had a manifest version 2 and then most recently a manifest version 3.

Can you help, you know, what is the difference in your mind? What was Google driving at in the manifest v3 work?

Sheryl Hsu: Yes, I think they rearranged a couple of things in terms of how the code is structured. None of that seems super significant to me from a security standpoint. They're claiming that it'll, like, improve speed, which is probably true.

But from a security standpoint, I think the most interesting thing is that they're disallowing remote code execution. Which is pretty big because previously they were allowing that, which obviously, like, I don't even know how you're going to be able to vet extensions if you're going to allow them to, like, grab code from anywhere, execute them anyways.

So I think, like, that's probably the biggest change. I think they're also introducing rules that an extension is kind of limited in terms of the websites that it can intercept requests from. Instead of like being able to intercept requests from every single site, they need to like, they're limited to like a list of like 50 or something, but I don't know.

I don't think that's been like the key point. I think the key point has probably been the remote code execution. As far as timeline, this has been, like, in the works for quite a few years. You know, they keep talking about it and they keep pushing it back, and they were supposed to have started to disable all extensions that use manifest version two last month.

Whether or not it happened is like a very good question and they're like, we're going to slowly roll out the rest of it, but, you know. 

Justin Beals: Yeah. All right. I'm imagining a conference room in Google headquarters, and they're all sitting around the table, and one person is like, V2 is really insecure, and it's making our Chrome extensions look bad.

And we need to force people to go to V3. And the other person on the other side of the table is like, yeah, if we do, we'll lose 90 percent of the extensions.

Oh, it is so frustrating for them. Man, my head would be spinning. I'd be like, how can I get these developers to update all these old extensions that have this long tail of users? And it's just impossible, right? Because, to your point, they weren't tracking their identities very discretely.

Sheryl Hsu: Yeah. Like, it's also like, it's a Google account. Like, people have like millions of those. 

Justin Beals: Yeah. Well, I'm just curious also, Sheryl, as we're wrapping up a little bit. First, I'm deeply intrigued by the paper. I really appreciate you and your co-authors digging into this area. I think it's very helpful. You know, as you look forward in your work, you know, certainly your degree and your interests, where do you think you're going?

You know, what's next for you, Sheryl, and your computer science career?

Sheryl Hsu: Yeah, I think, you know, I'm definitely, like, very interested in machine learning. I've been doing a lot of, like, NLP work the last couple of months. You know, I'm also really excited about robotics. I'm hopeful that robotics will take the next step within our lifetime for sure.

And I think, like, that's something that the people of my generation can play a big role in. You know, it feels like the generation above us got to see and be a part of NLP really transforming the world, and I'm hopeful that my generation will get to, you know, build AIs and robots, um, and hopefully in, like, a safe way that benefits us all.

But beyond that, I think, like, there's always so much room for security, especially like, you know, trying to figure out how to keep models safe, how to make models. So, I think there's a lot of really interesting stuff. 

Justin Beals: Yeah, I really like that area. You know, in my work in machine learning, one of the things that I complained about a lot is, like, security around the models. And for me, a lot of the challenge, like the first challenge that people seem to forget about, is running true accuracy measures on their model predictions, right? Like, we're happy to get some qualitative answer that seems aligned, but we don't do the hard work of, like, really testing and publishing, you know, how many false positives or true negatives I actually get out of a model.
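The kind of measurement Justin describes can be sketched in a few lines: tally true and false positives and negatives against labeled data, then report rates rather than a qualitative impression. The labels and predictions below are made up for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Tally true/false positives and negatives for binary labels (1 or 0)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred and truth:
            counts["tp"] += 1
        elif pred and not truth:
            counts["fp"] += 1
        elif not pred and not truth:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


def rates(counts):
    """Derive common rates, returning 0.0 when a denominator is empty."""
    def safe_div(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    tp, fp, tn, fn = counts["tp"], counts["fp"], counts["tn"], counts["fn"]
    return {
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "precision": safe_div(tp, tp + fp),
        "recall": safe_div(tp, tp + fn),
        "false_positive_rate": safe_div(fp, fp + tn),
    }


# Hypothetical ground truth and model predictions.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
metrics = rates(counts)
print(counts)   # {'tp': 3, 'fp': 1, 'tn': 3, 'fn': 1}
print(metrics)
```

Publishing the four counts alongside accuracy shows where a model is wrong, not just how often.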

Sheryl Hsu: Yeah. And like, even beyond that, in terms of, like, stealing models, right? Like, if we build these models and they're worth billions of dollars, you want to make sure that people aren't stealing them. 

Justin Beals: Yeah. Okay. I'm going to get a little science fiction here, but I find this topic absolutely intriguing. If you had a model out in the wild and you could ping it with enough data and see its responses, you could almost brute force the model's learning on the other side, right? Like, I could steal a model by essentially asking it enough questions to understand reductively how it responds and recreate it. 
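A toy version of that idea, under heavy assumptions: the "model" here is a one-dimensional threshold classifier with a hidden cutoff, and the attacker only sees yes/no answers. Real models are vastly more complex, so this sketch only shows the query-and-recreate pattern, not a practical attack.

```python
def make_black_box(secret_threshold):
    """Return a classifier that exposes only its predictions, not its parameter."""
    def predict(x):
        return x >= secret_threshold
    return predict


def extract_threshold(predict, low, high, tolerance=1e-6):
    """Recover the decision boundary by binary search over query responses.

    Assumes predict(low) is False and predict(high) is True.
    """
    queries = 0
    while high - low > tolerance:
        mid = (low + high) / 2
        queries += 1
        if predict(mid):
            high = mid  # boundary is at or below mid
        else:
            low = mid   # boundary is above mid
    return high, queries


black_box = make_black_box(0.3721)  # hypothetical secret parameter
estimate, queries = extract_threshold(black_box, 0.0, 1.0)
surrogate = make_black_box(estimate)  # the attacker's recreated model

print(f"estimated threshold ~ {estimate:.6f} after {queries} queries")
```

The surrogate agrees with the original everywhere except in a sliver narrower than the tolerance, and the attacker never saw the secret value directly.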

Sheryl Hsu: Yeah, likely. 

Justin Beals: Yeah. 

Sheryl Hsu: But you know, are you going to have the money to be able to train that model? 

Justin Beals: Yeah, that's a good question. Well, you know, what's interesting is I worked on some AI tooling a while ago, and we found this really interesting thing: we needed a fair bit of server resources to train our models for the data, and we were working for a high accuracy level. I think we wanted around 80 to 85% accuracy on the model predictions. And we were using, uh, a large model method. And we decided to just test out a simple Bayesian model, you know, something that was much, much easier to build, a lot less expensive to run and operate.

And I think we were off by like two points on our accuracy. And we were like, man, we spent all this money. We used all these GPUs to build this massive model, and we got two points of extra accuracy out of it. Does anybody really even care? So I do think that there's a balancing act in it sometimes. Well, Sheryl, we really appreciate you joining the podcast today.

Grateful for your paper and your continued work. You know, keep us apprised as you move forward. And thanks so much for joining SecureTalk. 

Sheryl Hsu: Yeah. Thanks so much, Justin. 

Justin Beals: It's been great.

 

About our guest

Sheryl Hsu, Computer Science Student, Stanford University

She is currently an undergrad at Stanford University. She is broadly interested in machine learning and machine learning systems. Currently, she is working on reinforcement learning and LLMs at IRIS lab and ML systems at Foundry.
