The Hacker Mind Podcast: The Rise Of Bots (Why You Didn't Get Taylor Swift Tickets)
Bots are automated scripts that can slow your day to day business, be enlisted in denial of service attacks, or even keep you from getting those Taylor Swift tickets you desperately want. Antoine Vastel from DataDome explains how it's an arms race: the better we get at detecting them, the more the bots evolve to evade detection.
The Hacker Mind Podcast is available on all the major podcast platforms. Subscribe today.
[Heads Up: This transcription was autogenerated, so there may be errors.]
VAMOSI: Securing your home is important. For many, that means having NVR systems. NVR stands for Network Video Recorder. Basically it's a hard drive that captures terabytes of data from cameras the homeowner installs around their property. One NVR vendor had a zero day. Naturally, criminal hackers found it and didn't report it. Instead, they built a botnet.
A botnet is a web of infected devices. This includes internet of things devices such as security cameras. The botnet can be aimed at a single site to create a distributed denial of service attack. Or, recently, it can be used to exfiltrate data.
Researchers at Akamai found and nicknamed the botnet “InfectedSlurs”. They did so because researchers discovered racial epithets and offensive language within the naming conventions used for the command-and-control domains associated with the botnet. Nice.
Botnets are often the only way we hear about bots. But there are actually good uses of bots. Either way -- good or bad -- bots, which are really just scripts, are here to stay. So that means we need a way to handle them. And in a moment I'll talk to someone who is doing just that. I hope you'll stick around.
[music]
VAMOSI: Welcome to The Hacker Mind, an original podcast from the makers of Mayhem Security. It's about challenging our expectations about the people who hack for a living. I'm Robert Vamosi and in this episode I'm discussing the rise of bots, automated scripts that can slow your day to day business, be enlisted in denial of service attacks, or even keep you from getting those Taylor Swift tickets you desperately want.
[music]
First, I want to thank everyone for listening to The Hacker Mind. This episode coincides with The Hacker Mind achieving 500,000 unique downloads. That’s half a million total downloads. That is quite a milestone. Whether this is your first episode (Hi!) or your 85th (how’s it going?), or perhaps you go back and listen again to one you’ve heard before (thank you, thank you!), this accomplishment wouldn’t be possible without you, my loyal listeners. So if you like this podcast, please subscribe on your favorite podcast platform. And also continue to share episodes with your friends. And if you could also write a review on your favorite podcast platform, I’d be humbled. I truly want to thank you, my listeners in over 100 countries worldwide. I want to pledge future improvements to the show as we together climb toward one million downloads-- hopefully in 2024.
Alright, so let’s get started with Episode 85. Around the time of Black Hat USA I was pitched a story about how bots could be both good and evil. Intrigued, I reached out to an expert.
VASTEL: Hi, I'm Antoine Vastel from DataDome. DataDome is a security company that protects online businesses against bad bot attacks, like credential stuffing, DDoS attacks, payment fraud, and so on. And I'm leading the threat research team. So the goal of my team is really to improve the quality of our bot detection engine.
VAMOSI: Let's start with a definition. What are we calling a bot?
VASTEL: Good question. What are bots? Essentially, a bot is a program that you can use to automate tasks on the web. And yeah, we are not talking about, you know, chatbots; we are talking about programs that try, you know, to interact with websites or mobile applications. Most of the time I would just say websites, but basically it can be applied to mobile applications as well. So basically a bot is a program that will try, you know, to load a product page, add a product to the cart, make a login attempt, etc. And so it can be used for good purposes. For example, companies like Google use bots to index the web and make the content of websites searchable, which is good for users. But you can also use bots to scale attacks, to, you know, make thousands of login attempts and steal user accounts.
VAMOSI: So I just want to get into the nuances of that. There are botnets and so forth. We'll get into that in a moment. There are good uses of bots. For example, Google's using them to index the web. I'm wondering how that is different from the spiders that used to crawl the web?
VASTEL: Yeah, so what we call spiders crawling the web, that's scraping. Basically, scraping is a type of, not necessarily going to say attack, you know, it's a bit blurry and depends on the perspective. Basically, what it means to do scraping is that you made a bot that's going to load a lot of pages on a website, get the content, and index it, or make it available through an API, or store it in a database. Then, if you're a company and you're providing value to the website that you're scraping, no one is going to complain, which is the case with Google, because they are, you know, driving real human users to those websites to buy products. So in this case, most people would agree that, you know, Google is running good bots. But if you're scraping your competitors, you know, doing thousands of requests every minute to get all the prices in real time and adjust your pricing strategy, most people would agree that it's bad for them, and so they would like to block those bots.
VAMOSI: Okay, so in this definition, a bot is actually doing something. It’s not just scraping information, it’s acting.
VASTEL: Yeah, so the bot is doing something, but you know, bots are developed by human users and they create these bots for a reason. So fundamentally we discussed, you know, scraping, but it can be anything that you want to automate.
VAMOSI: Okay. So with a botnet, that would be using a kind of compromised computer to launch a distributed denial of service attack. One example.
VASTEL: Yeah. So for example, let's say you want to make a DDoS attack. You need to have access to a lot of IP addresses and devices, because you want to distribute it, to make it look like it's coming from a lot of different devices, to avoid the traditional, you know, blocking techniques like rate limiting or geo-blocking. So what you can do is use, you know, a botnet at your disposal, or if you can pay for it, then you might be able, you know, to use this botnet to throw a lot of requests at a single website. Probably you want to target an API, you know, that is really costly for the website, so maybe something like an add to cart, or something that generates a lot of reads or a lot of writes or updates in the backend. Yeah.
[MUSIC]
VAMOSI: Hmm. This sounds familiar. I was one of those people who was sitting there on the Ticketmaster site for Taylor Swift, waiting for my chance to purchase, then seeing all the tickets go away in seconds. I had no chance because I'm a human being and I was up against these bots. I’m wondering if we can dissect that a little bit, understand what was actually going on behind the scenes that I couldn't see.
VASTEL: Yeah, so when we talk about Ticketmaster and all kinds of, you know, ticketing websites, we are talking about scalping bots. So scalping is basically, you know, about aiming to buy limited edition products or items. And actually, you know, the oldest scalpers were already targeting, you know, ticketing websites. Nowadays they target a lot of products, like GPUs during, you know, the craziness of cryptocurrencies, or limited edition NFTs. But so, what scalper bots will do is, you develop a bot and it will monitor, you know, every second if a product is available. So here we are talking about, you know, concert tickets, but that's not all. Before that, you may want to create a lot of fake accounts, because I think in the case of Taylor Swift and Ticketmaster, they probably implemented some sort of purchasing limit per account, something like maybe you can't purchase more than two or four tickets at once. So what you want to do as a broker, if you want to buy a lot of tickets, the first thing you would do is you would create a lot of fake accounts on the platform, maybe a few days, a few weeks or months before, you know, to anticipate, to stay under the radar and avoid your accounts being banned. That's not that easy, because if you want to create lots of fake accounts, you need to be, you know, subtle in the way you do it. You want to use good quality Gmail accounts, maybe use different kinds of patterns. Maybe you will need different kinds of phone numbers that you need to validate. So it's not that easy. And then on the day, you know, okay, you know that today, you know, the tickets are going to be sold. You will start to monitor every second. You need to know the moment the tickets go on sale. And then whenever it's ready, you're going to throw all kinds of sophisticated bots that, you know, studied the website, the APIs for weeks before, and that are going to make, you know, the perfect sequence of API calls to add tickets to the cart, and they will be able to buy, you know, thousands of tickets. They will automatically, you know, validate and complete the transaction with, you know, whatever they are using to pay, like PayPal. And as a human user, you know, maybe you're like, okay, I'm using my computer and my smartphone or my tablet, okay, I'll have a chance either way. You want to increase your chances, but you're not even playing the same game. You're competing with a developer that's creating thousands of fake accounts, and, you know, that can operate in a few seconds.
VAMOSI: One area is cryptocurrency, how would this play out in the crypto world?
VASTEL: So you know, at the time, when crypto was booming, which is not really the case anymore, but at the time, people were trying to buy, you know, limited edition NFTs on marketplaces. And so, you know, they were using scraper bots to monitor when an NFT would go on sale and then buy it with a bot, using, you know, the same kinds of techniques.
VAMOSI: So these bots can use social media profiles, creating fake accounts or using dormant accounts. How are bots being used in social media accounts?
VASTEL: So, in the case of social media, it's more difficult for us to say. It's more like supposition, or what we observe from other fields, because we are not protecting, you know, social media. But basically, what you can do as an attacker is you can create lots of fake accounts and, you know, use them to provide services that offer followers, likes, views. Basically, if you search for, you know, fake views, fake likes for Instagram, or fake views for TikTok, or fake followers for whatever social media you want, you will see a lot of services where you can buy, I don't remember the price exactly, but you can buy likes for, I don't know, dozens of dollars. And if you want to do these kinds of things, you need to use bots to automate these kinds of services at scale.
VAMOSI: So what I'm hearing is they can like and then promote something higher, but can they respond? Can they actually post?
VASTEL: Yeah, basically bots can do everything you want. If you look at the main services provided, most of the time you will find services for likes, follows, these kinds of things, but you have a lot of people that are using bots to conduct scams, automatically send messages to people, you know, click on this link, or promote, you know, drop shipping products on, you know, a lot of groups at once. So they are basically exploiting the APIs of these social media platforms to scale their operations.
[MUSIC]
VAMOSI: So, we briefly touched at the beginning on some positive uses. There are some positive uses for bots.
VASTEL: Yeah, so this is a good question. Bots are becoming more and more sophisticated, and not necessarily because developers are smarter than they used to be. What happened is that there has been a lot of development around bots. Like in the past, you know, it was really difficult to make a bot look perfect, but, you know, today you can use headless browsers, you can use automation frameworks that provide you with, you know, a browser that is almost the same as what a real human user would use. And the most used one is headless Chrome, and it's being developed by Google.
VAMOSI: If you're a developer and you don't yet know about headless Chrome, it's a version of Chrome that runs without a graphical interface, so you can script it to see how your websites and applications work. Check it out.
VASTEL: So a few years ago, Google developed the first version of headless Chrome. And it was like a revolution. Suddenly, you know, you could create a bot using a real browser that, you know, was headless, so it's less costly to run. You can run it on any operating system, and at the same time they provided a framework to, you know, instrument the browser using a high-level API. It was a dream. Like, you know, okay, I want to load this page, I want to intercept the requests. All of a sudden, it became really easy. Over time they added all the features that were, you know, normally missing, so, you know, you could browse any website with your bot and it wouldn't break because of compatibility issues. And, a few months ago, they released a new headless Chrome. The former version was slightly different than the real Chrome, but it was almost the same, and the latest version is the same. It's the same codebase, on the same branch, in the same Chromium code repository. And people were debating on Hacker News, like, why is Google providing this kind of weapon? And, you know, it's like with knives, you know, you can use it for bad purposes, but you also need it to cut your food, and so it's really useful for a lot of people to test their websites. That's the main use case. The reason Google offers these kinds of features for free is because they want people to keep developing websites that work on any kind of device. And if you want to develop a lot of new features, and you have a lot of developers, then you need to be able to test your website with the same browser that your users use. And, you know, that's the main reason they developed headless Chrome. It's not because they wanted people, you know, to make bad bots.
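[For illustration, not from the episode: a minimal sketch of the legitimate testing use case Antoine describes, driving headless Chromium through the Playwright automation framework. The URL is a placeholder, and this assumes Playwright is installed (pip install playwright, then playwright install chromium).]

```python
# Headless Chrome for testing: no visible window, cheaper to run,
# scriptable through a high-level API, as described above.
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)   # headless Chromium
    page = browser.new_page()
    page.goto("https://example.com")             # placeholder page under test
    print(page.title())                          # a real test would assert on content
    page.screenshot(path="homepage.png")         # capture rendering for review
    browser.close()
```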
VAMOSI: Automation. I want to be clear with the audience like when I use automation I'm using like Zapier, I'm doing if/then statements and so forth, would that be what a bot would be doing?
VASTEL: That's a good question. There is a blurry line, you know, between good bots and bad ones. Like, you're using Zapier, and probably the websites, you know, they have an API that integrates with Zapier, so that you can, you know, chain things: okay, if I receive a message containing this word in Gmail, then automatically create, I don't know, an event in my calendar. And you can do it easily with Zapier because there is an API that makes it possible. So some people will use bots for good purposes, you know, personal purposes, bots that are not trying to break your website or steal something. I mean, let's say tomorrow I want to automate something on my bank account. But at least in Europe, it's quite difficult to obtain access to an API; it's provided only, you know, to huge financial companies. So I could be tempted to make a bot to, I don't know, make a small transfer, using a feature that's not available through an API. It's not bad. But probably, you know, the bank would try to block these kinds of things, because the company doesn't know the intent of the person until it's too late. So not all bots that are doing automation are bad. I would say it depends more on the intent. What are we trying to do?
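[For illustration, not from the episode: a minimal sketch of the personal if/then automation Antoine describes, where a message containing a keyword triggers a calendar entry. The mail server, credentials, keyword, and webhook URL are all hypothetical placeholders, not a real integration.]

```python
import email
import imaplib
import requests

IMAP_HOST = "imap.example.com"                     # placeholder mail server
WEBHOOK_URL = "https://example.com/calendar-hook"  # hypothetical calendar endpoint
KEYWORD = "invoice"

mail = imaplib.IMAP4_SSL(IMAP_HOST)
mail.login("user@example.com", "app-password")     # placeholder credentials
mail.select("INBOX")

# "If I receive a message containing this word..."
_, data = mail.search(None, "UNSEEN")
for num in data[0].split():
    _, msg_data = mail.fetch(num, "(RFC822)")
    msg = email.message_from_bytes(msg_data[0][1])
    if KEYWORD in (msg["Subject"] or "").lower():
        # "...then automatically create an event in my calendar."
        requests.post(WEBHOOK_URL, json={"title": msg["Subject"]}, timeout=10)

mail.logout()
```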
VAMOSI: APIs, and that's another piece here. It's like, if a company has an API, a bot could do certain things. If a company didn't have an API, a bot couldn't do something.
VASTEL: Okay, so talking about APIs, let's take the example of Twitter, or X, we still refer to it as Twitter probably. Basically, what happened is that they were like, oh, there are too many bots, we'll make the API private, or first I think they wanted to make it paid. So you had to pay. People don't want to pay for something that used to be free. So what happened is that people started to make scrapers, you know, to get, you know, tweets or interact with the website without paying for the API. So even if there is an API, if you have to pay for it, some people will still prefer, you know, to use the free solution, which is to use their own bot. And then when we talk about APIs, we need to keep in mind that there are APIs that are, you know, expected to be used by HTTP clients. Say tomorrow I'm creating an API for anyone, you know, let's say you have a server and from your own server you want to retrieve the latest news. You might create an API and generate a private token so that you can control or regulate what these people are doing with your API. But then on your website or mobile application, most websites and mobile applications nowadays, there are developers that build, you know, the front end, and the way the front end displays dynamic content is that it makes lots of small API requests to your server and, you know, dynamically updates subsets of your page. But this API, you may not want everyone to access it. You know, you probably just want real human users going on your website, interacting with it, you know, the normal way. But the thing is that because it's a public API, because, I mean, that's the way it works if you want your website to be able to make a request to your API, anyone could look in the dev tools or use a proxy, intercept the traffic and, you know, replay the requests or modify them. And so in this case, it's more blurry, because, like, you didn't intend this API to be used by bots or by HTTP clients. You only want human users to use it from their browsers.
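[For illustration, not from the episode: a minimal sketch of the point about front-end APIs. Once a request is visible in the browser dev tools or an intercepting proxy, any HTTP client can replay it. The endpoint, headers, and parameters below are hypothetical.]

```python
import requests

# Values copied from a request observed in the dev tools or a proxy.
url = "https://shop.example.com/api/v1/products"   # hypothetical front-end API
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",  # mimic a browser
    "Accept": "application/json",
}
params = {"category": "tickets", "page": 1}

# The site intended this call to come from its own front end, but nothing in
# plain HTTP stops a script from making the same request.
resp = requests.get(url, headers=headers, params=params, timeout=10)
resp.raise_for_status()
print(resp.json())
```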
[MUSIC]
VAMOSI: So we've defined some good uses and some bad uses of bots. How can we manage bots, how can we prevent some of this bad stuff? What solutions are out there?
VASTEL: I will tell you how not to do it first, because that is mostly how people are doing it. When you read a discussion on Hacker News or on Reddit, most people are like, ah, this is so easy, this is a data center IP, just block it. Ah, this is a foreign IP, just block it. Just use this CAPTCHA, or just use a hidden link. You know, bots are not stupid. They will not click on your hidden link. Okay, so the traditional techniques that websites or mobile applications are using to try to block bots are things like static signatures, or IP-based rate limiting: you're making more than 50 requests per minute, and we block you.
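[For illustration, not from the episode: a minimal sketch of the naive, per-IP rate limiting Antoine is describing ("more than 50 requests per minute and we block you"), written with Flask. The framework and threshold are illustrative. As he explains below, a bot rotating residential proxies gets a fresh counter with every new IP, so this alone is easy to evade.]

```python
import time
from collections import defaultdict, deque
from flask import Flask, abort, request

app = Flask(__name__)

WINDOW_SECONDS = 60
MAX_REQUESTS = 50
hits = defaultdict(deque)            # source IP -> timestamps of recent requests

@app.before_request
def naive_rate_limit():
    ip = request.remote_addr
    now = time.time()
    window = hits[ip]
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()             # drop timestamps outside the window
    window.append(now)
    if len(window) > MAX_REQUESTS:
        abort(429)                   # Too Many Requests

@app.route("/api/products")
def products():
    return {"items": []}
```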
VAMOSI: CAPTCHA is that annoying popup with multiple boxes that says select all the cats, only you can't clearly see the image presented, so it gives you another one. Yeah, CAPTCHA, even reCAPTCHA, doesn't really work. There, I said it. It doesn't work, and Antoine explains why.
VASTEL: They are using CAPTCHAs, traditional CAPTCHAs like reCAPTCHA, for example, the one everyone knows where you have to select the fire hydrants, and this does not hold up. Even CAPTCHA doesn't work. A lot of studies have been published lately about the fact that bots are better at passing CAPTCHAs than humans, because attackers are using, you know, image or audio recognition techniques, sometimes even using the APIs provided by Google itself to solve Google's reCAPTCHA. You know, every two or three years you have a presentation like this at Black Hat. Or they are using CAPTCHA farms. So basically you have an API that enables you to ask someone, somewhere on Earth, to solve the CAPTCHA on behalf of your bot, and it will respond with a token that you can use to make it look like you solved the CAPTCHA. So these techniques look, you know, effective, but that's not the case, and they must be painful for human users, because if you're using a VPN, you know, you will get blocked. If you're, you know, refreshing your page too much, you will get blocked. If you're coming from another country, maybe you will get blocked. But bots don't care, because bots can use proxies, for example, and nowadays it's very easy to use residential proxies, so proxies with IPs that belong to an ISP like Comcast, AT&T, Verizon. So, you know, if you want to avoid rate limiting, you just use several proxies. Okay, so the threshold is 50 requests per minute? Okay, when I reach 49, I use another IP by using a proxy. So it's not even painful for them. You know, it's one line of code. Same for CAPTCHAs. I mean, you have to integrate with a CAPTCHA-solving service, but these services handle that themselves, and if you use a CAPTCHA farm service, it can be something like $3 to solve 1,000 reCAPTCHAs. So it's not that much if you're trying to steal user accounts, for example, or conduct scams, if you have a way to monetize your bot. Okay, so how can we actually protect against bots? I said all the other techniques do not work that well. And so the first thing is to acknowledge that, you know, attackers are way more skilled than, you know, the techniques you read about in blogs assume, and so there is no silver bullet.
VAMOSI: So one of my favorite topics is digital forensics. Did you know that when you visit a website, that site can see a lot of information, such as which browser you are using, the dimensions of the screen, the operating system, and so forth? As I discuss in The Art of Invisibility, a book I co-wrote with Kevin Mitnick, there are ways to spoof these settings, but there are also ways to detect that, too.
VASTEL: You know, when it comes to bot detection, a bot is basically an HTTP client. You know, it could be a browser, or a Python script, or a Golang script making requests to your server. And it will try to hide. It will try to spoof any kind of, you know, signature, fingerprints, browser fingerprints, TLS fingerprints. It will try to generate fake mouse movements, and it will try to use clean IPs. It will try everything, you know, to appear more human. So, on the bot detection side, you need to leverage all the signals possible. So in general, when it comes to detection, you have the fingerprints. You want to collect sophisticated fingerprints client-side, with a JS tag or mobile SDK, to detect side effects of automated browsers. So if you're using headless Chrome, or if you're using Puppeteer or Selenium, you will have side effects that can be observed and that are not present if you're browsing the web as a human, and so you want to develop specific checks that can detect that, you know, someone is, you know, automating their browser or using a headless browser. You may want to detect virtual machines, for example, which some bots use; it may be an interesting signal. Like, if somebody is pretending to be on Windows, but when you ask the browser to draw some shapes, for example, geometrical shapes, you notice that, you know, this kind of signature is linked to Linux, that kind of inconsistency can be helpful. Then, of course, you want to study the user behavior. So you want to leverage the behavior of the user to know if it's a real human user with a real device on your website. When we talk about behavior, most of the time people are thinking about mouse movements, scrolling, typing on the keyboard. Yes, that is really helpful. But then you also want to study, you know, the sequence of HTTP requests, the browsing patterns, the, you know, the graph of requests. Is it consistent with what a human user would do? So for example, are you doing an add to cart before even viewing the product? This kind of thing, but more complex, with more dimensions, of course. And then, I would say, contextual signals. You want to use a lot of context. These are a lot of weak signals, like the time of the request, the country of the request, the age of the session, the type of IP, whether it was used as a proxy recently. And it's easier in our case, you know, we are protecting a lot of websites and mobile applications, so we have a global view of the web, so we can better understand, you know, what an IP is doing on different kinds of services, and that helps us fine-tune the aggressiveness of the detection. So I would say, then, you need to combine all of this, because attackers will try to lie about all of these signals; you know, we know that's the case. And so the idea is to have different layers of detection that leverage different kinds of signals, using different techniques, different timeframes, different aggregations. So for example, you may want to study the behavior per IP, per session, on the whole website, to ensure that, you know, you are looking at the data in every dimension, because attackers are trying to lie about, you know, everything, so it's important to do it.
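[For illustration, not from the episode: a minimal sketch of the idea of combining many weak signals into one decision. This is not DataDome's actual engine; the signal names, weights, and thresholds are invented for illustration.]

```python
from dataclasses import dataclass

@dataclass
class RequestSignals:
    headless_fingerprint: bool   # side effects of an automated browser observed
    os_inconsistency: bool       # e.g. claims Windows but renders like Linux
    no_interaction: bool         # no human-like mouse, scroll, or key events
    ip_seen_as_proxy: bool       # IP recently behaved like a proxy elsewhere
    requests_per_minute: int     # request rate for this session

WEIGHTS = {
    "headless_fingerprint": 0.4,
    "os_inconsistency": 0.2,
    "no_interaction": 0.15,
    "ip_seen_as_proxy": 0.15,
}

def bot_score(s: RequestSignals) -> float:
    """Aggregate weak signals into a 0..1 score; higher means more bot-like."""
    score = sum(w for name, w in WEIGHTS.items() if getattr(s, name))
    if s.requests_per_minute > 50:   # simple behavioral layer on top
        score += 0.1
    return min(score, 1.0)

# A session that looks automated on several independent layers.
print(bot_score(RequestSignals(True, True, True, False, 120)))  # -> 0.85
```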
VAMOSI: That sounds like a lot of data. I’m wondering if this is done in real time.
VASTEL: So in our case, yes, we do it in real time. That's the challenging part, because in real time you need to decide, okay, do I want to block, challenge with a CAPTCHA, or let the request through, and you have something like two seconds to make a decision. Yeah.
VAMOSI: How does this play with enterprises? I'm curious how one might convince them that they should be concerned about this. What sort of risks does a large enterprise face?
VASTEL: So sometimes it's easy, you know, they feel the pain, they can see it, and, you know, they need to do something because otherwise they can't even operate. So, for example, if we talk about scalpers for ticketing, what happens is that the sites are already selling tickets to so many human users that, you know, the company has to add more servers, so it's costing them money, okay. But the thing is that once you start adding bots on top of that, it's like a sort of DDoS attack. Suddenly you have spikes of traffic made by really sophisticated bots from all around the world. You have no simple way to block them. And if you don't do anything, your website can't even operate. We're not talking about, you know, losing tickets to scalpers instead of human users. We are talking about keeping your website running. And in some situations, if you don't do anything about the bots, they will just, you know, completely slow down your website or make it unavailable. So even people that are not interested in Taylor Swift tickets won't even be able to buy anything. Or maybe you're going to a concert, you need to load the ticket on your smartphone, and you can't even load it, because the website is unavailable. So in this kind of situation, customers, or prospects at the time, are aware of the problem, and the idea is to fix it. But sometimes companies are not fully aware, you know. Like, you may not know that you have bots on your website, because, you know, they may not be too aggressive, or maybe you don't have the right metrics. And so here is what we propose. When it comes to detection, in security in general, all vendors will tell you, we have the best detection, we have the best everything, and, you know, as a buyer you're like, okay, anyway, let's try the product, I want to see it for myself. So that's what we propose. They can really easily integrate our server-side module into their CDN, their load balancer, their server, and then for one month, for free, they can monitor what's going on on their website. So they can see all the traffic, and we label it as bot or human. We don't interact with it, we aren't blocking anything, but they can get an idea of, you know, is it scrapers, are they bots, you know, testing stolen credit cards, these kinds of things.
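[For illustration, not from the episode: a minimal sketch of the "monitor first" mode Antoine describes, where traffic is labeled bot or human and only logged, never blocked. The classify_request() function is a stand-in for a real detection engine and is invented here.]

```python
import logging
from flask import Flask, g, request

app = Flask(__name__)
logging.basicConfig(level=logging.INFO)

def classify_request(req) -> str:
    """Placeholder classifier: flags only obvious non-browser clients."""
    ua = req.headers.get("User-Agent", "").lower()
    return "bot" if not ua or "python-requests" in ua or "curl" in ua else "human"

@app.before_request
def label_traffic():
    g.verdict = classify_request(request)

@app.after_request
def log_verdict(response):
    # Observation mode: record the label, never block.
    logging.info("%s %s labeled=%s", request.method, request.path, g.verdict)
    return response

@app.route("/")
def index():
    return "ok"
```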
VAMOSI: So, the question here is, have they slowed down the bots you mentioned? Like if they're doing things out of sequence, or they're doing things too perfectly, that gets flagged?
VASTEL: Yeah, yeah. This is a really good question. So as you can imagine, you know, bot developers know that we are selling bot blockers. It's not a secret, I can say it publicly. And we subscribe to different kinds of services. And what we noticed on some bots as a service, so what we call a bot as a service, basically, it's a service that you can use to make a bot using an API. So let's say I want to scrape the content of a page, I will contact the API of the bot-as-a-service business, and then it would make, you know, the requests on my behalf. So it will use proxies, it will forge fingerprints, it could try to solve CAPTCHAs, but as the end user, you know, I just need to make an API call, and I only pay if it's successful. So it's really convenient. So we like to study these kinds of services, because, you know, these are professional, you know, attackers, let's say, and so they adapt quickly, because their business is to bypass protection.
VAMOSI: Is there a conscious effort by some of the bot creators to make it look more human and to maybe do a few random things here and there to trick you guys?
VASTEL: So we studied some of them, and what we observed is that on some of them, for example, let's say you were trying to, okay, I want to scrape this page on a real estate website, I want to get information about the price of this house. They wouldn't go directly to the URL you want. Instead, they would slow down. They would, you know, go on the homepage, and then they would browse random things; they would go on the terms of service pages. And, you know, it was very suspicious, because no one reads the terms of service. And so we did several tests, and we noticed that there were, you know, suspicious patterns of visitors, you know, going from the terms of service pages to real estate pages. And so we know that they try to disguise themselves, because it may seem like the smartest thing to do, instead of going straight to what you really want. So yeah, they are trying things like this.
[MUSIC]
VAMOSI: One thing we haven’t talked about is the elephant in the room. AI, or what people think is AI today. How might bots be used with AI?
VASTEL: Maybe we could talk about, I mean, it's interesting, you know. Everyone is talking about AI, ChatGPT, these kinds of things, and how bots come into the equation. Sure. So everyone is looking for human-generated content, basically, to train, you know, these large language models. So it started with ChatGPT version three, you know, the one where everyone was amazed, like, wow, it's going to be a revolution. And at the time, you know, they explained how they were collecting data. They were using mostly open source datasets; 60% of the training data came from Common Crawl, and this is a bot scraping the web and making, you know, a huge archive available as S3 buckets. And then came ChatGPT 3.5, ChatGPT 4; in parallel you have Google developing Bard, and, you know, the more competition there is between, you know, these providers of large language models, the less transparent they are about, you know, how they obtain the data, because at some point everyone wants high quality, human-generated data. So what we started to do was to play with plugins, ChatGPT plugins, because one of the main limitations of ChatGPT is that, you know, it can't get live data. You know, I think with the recent update, the latest information it has is 2022, but before that it was 2021. So let's say you want information about the latest movie or the latest basketball game, it's not possible. So people started to develop plugins to, uh, you know, get live information on the web and inject that directly into, you know, ChatGPT. And so we studied some of the most popular plugins that, you know, provide these kinds of features, and some of them are already trying, you know, to be less obvious about the fact that they are bots; some of them that just came out, you know, start changing their user agent, start adding, you know, proper headers to their requests. And I think it will be even worse for training data, because we see a lot of companies raising crazy amounts of money, and they need to get their hands on high value data. Some websites, like Twitter, like Reddit, start to understand that there is money to be made with their, you know, user data, so they want to provide APIs to monetize it. But as I told you before, people are not necessarily going to pay; there is a simpler way to get this data, and a lot of them would prefer to use bots to get this data, so it's free. If you want to monetize your content through an API, it's really important that you protect it. Otherwise people are not going to pay for it. They're going to go around the official API with bots, and that makes it worse for you.
VAMOSI: But with an API, how would you protect it, necessarily? You've basically opened up certain parts of the data and said, I'm sharing this out.
VASTEL: So basically, what you will do is you will have several paid plans, like, you know, you pay, I don't know, $100, that means you can make 1 million requests, and then, you know, the price increases depending on the number of requests you want to make. And because you can authenticate the user with a token, then you can control what they can access. But people won't necessarily stay within that. That's what we saw on Twitter. They didn't want to pay, so they started making lots of bots. And it was even worse because they didn't care; they were not even using the API. They were using lots of bots making a lot of requests from different IPs and slowed down the website so much that at some point, you know, Elon Musk said, okay, we're going to rate limit everyone, and I think the threshold was something like 500 tweets a day if you don't have a verified Twitter account, or maybe, I don't know. But that's, you know, a bot problem being handled by rate limiting humans. You know, it significantly impacts the user experience, and it didn't really solve the bot problem on Twitter. So.
VAMOSI: Right, so rate limiting would be one way of restricting that access.
VASTEL: Yeah, that's a way, but what I said before is that if you want to rate limit, you can rate limit per IP address, but then bots are going to use another IP address. If you rate limit per user account, then bots are going to create a lot of fake accounts, so then you have another problem. So finding the right way to, you know, rate limit someone is really challenging. And on top of that, if you're too aggressive, you will also, you know, rate limit your real human users. And as a company like Twitter, you know, your business is to show ads. And, you know, if people can't see more content, they will leave your platform and come back later, or not, and then you can't show more ads, so you're losing revenue. So you need to find the right balance between, you know, controlling bots and, you know, making money by showing your ads, in the case of Twitter.
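[For illustration, not from the episode: a minimal sketch of per-token quotas, the approach discussed above for monetized APIs. The plan sizes and tokens are placeholders, and a real service would persist counters in Redis or a database rather than in memory. As Antoine notes, this only controls clients who actually use the API; it does nothing about bots that scrape the site directly or register many fake accounts.]

```python
import datetime
from collections import defaultdict
from flask import Flask, abort, jsonify, request

app = Flask(__name__)

PLANS = {"free-token": 500, "paid-token": 1_000_000}   # requests allowed per day
usage = defaultdict(int)                               # (token, date) -> count

@app.before_request
def enforce_quota():
    token = request.headers.get("X-Api-Key")
    if token not in PLANS:
        abort(401)                                     # unknown or missing key
    key = (token, datetime.date.today().isoformat())
    usage[key] += 1
    if usage[key] > PLANS[token]:
        abort(429)                                     # daily quota exhausted

@app.route("/api/tweets")
def tweets():
    return jsonify(items=[])
```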
VAMOSI: So, I come from an experience of the virus community, and there was a lot of back and forth. You'd have the antivirus companies developing something, but then the virus community would answer that, and back and forth. It sounds like the bot community is doing much the same: the protections you put in place, they come back with something else that changes it slightly. Do you feel like the tide might be turning in your favor, that we're getting on top of it, or is this just going to be a persistent problem like, you know, viruses and worms?
VASTEL: I think it's going to be a persistent problem, because as long as there is an incentive, you know, for the people, you know, attacking the websites. You know, let's say you're a company and, you know, you rely on data that is publicly available. I'm not talking about stealing private user data, but maybe you want to scrape a website every day to adjust your prices in real time. As long as it costs you less to scrape the website, by paying for proxies and developers, than what you win by having a better pricing strategy, you're probably going to continue. Or as an attacker, let's say you want to, you know, steal a lot of user accounts and conduct scams, or try to sell them. As long as, you know, it's profitable, you will probably continue to do so, and they will continue to adapt. And we will continue to adapt on our side, because I think the people in front of us are not willing to stop.