WEBVTT

00:00:00.000 --> 00:00:09.570
JACK: [MUSIC] There’s a big list of all known security vulnerabilities for computers. You

00:00:09.570 --> 00:00:13.800
want to know what the oldest known computer vulnerability is? The oldest I could find

00:00:13.800 --> 00:00:20.910
is weak default passwords. This has been a known vulnerability since 1969. Specifically,

00:00:20.910 --> 00:00:27.120
computers sometimes have the username admin with the password also admin. Then the computer

00:00:27.120 --> 00:00:31.560
doesn’t ask you to change it when you buy it, so it can stay that way for a long time,

00:00:31.560 --> 00:00:37.980
years. Many computers after that also use admin/admin as the default username and

00:00:37.980 --> 00:00:43.800
password. Over the years many hackers have been able to get into many systems that they didn’t own

00:00:43.800 --> 00:00:50.340
using this basic username and password. Now it’s been forty years since we became aware of this

00:00:50.340 --> 00:00:56.610
security weakness. Surely by now this weakness has been resolved, right? There aren’t any

00:00:56.610 --> 00:01:03.960
computers in the world that have this username and password anymore, right? Right? I sure hope so.

00:01:03.960 --> 00:01:11.850
JACK (INTRO): [INTRO MUSIC] This is Darknet Diaries, true stories

00:01:11.850 --> 00:01:17.870
from the dark side of the internet. I’m Jack Rhysider. [INTRO MUSIC ENDS]

00:01:17.870 --> 00:01:25.790
JACK: In 2012 a security researcher began scanning the internet to see what computers are still

00:01:25.790 --> 00:01:32.360
running Telnet. Telnet is a way to log into a computer remotely but it doesn’t have encryption,

00:01:32.360 --> 00:01:38.540
so when you log into a computer using Telnet you send your username and password in clear text for

00:01:38.540 --> 00:01:44.180
anyone to see on the internet. The alternative is to use SSH which does the same job but it’s

00:01:44.180 --> 00:01:50.030
encrypted. SSH has been around since the 90s so there’s really no excuse to run Telnet anymore,

00:01:50.030 --> 00:01:54.200
but while this security researcher was scanning the internet trying to see how

00:01:54.200 --> 00:01:58.340
many systems were running Telnet, they also wanted to see how many systems are

00:01:58.340 --> 00:02:03.380
using those default passwords. They used the following four username/password combinations:

00:02:03.380 --> 00:02:10.820
admin/admin, admin with no password, root/root, and root with no password. They took these four

00:02:10.820 --> 00:02:14.870
username/password combinations and started scanning the internet to see if any systems

00:02:14.870 --> 00:02:19.190
would let them log in using Telnet. They were finding unsecured systems

00:02:19.190 --> 00:02:25.580
pretty quickly. [BLIPPING] But it took them over sixteen hours just to scan 100,000 IPs.

00:02:25.580 --> 00:02:30.230
The internet had almost four billion IPs so scanning the whole thing poses a big challenge.

00:02:30.230 --> 00:02:36.260
If they were to scan ten IPs a second it would take them ten years to complete the scan. The

00:02:36.260 --> 00:02:41.330
researcher thought if they had two scanners it would go twice as fast, and one hundred scanners

00:02:41.330 --> 00:02:45.830
would go one hundred times faster. Since the researcher was finding all these systems on the

00:02:45.830 --> 00:02:51.860
internet that they could log into as admin, then why not put those systems to work to help scan

00:02:51.860 --> 00:02:57.410
the internet? The researcher created a program that would scan and find unprotected systems and

00:02:57.410 --> 00:03:03.140
then upload that same program to the systems it found, and then put that system to work scanning

00:03:03.140 --> 00:03:09.830
for more systems. They were creating a botnet. A botnet is a program running on many computers that

00:03:09.830 --> 00:03:14.840
are all working together to do the same task, but the botnet creator doesn’t have permission

00:03:14.840 --> 00:03:21.020
to use any of those computers. Actually just logging into one computer as admin that they

00:03:21.020 --> 00:03:27.260
didn’t own was illegal. Of course, it was very illegal to do it to thousands of computers.

00:03:27.260 --> 00:03:32.090
The researcher knew this was illegal and had to stay anonymous and not get caught.

00:03:32.090 --> 00:03:38.850
They let this program run and propagate all over the internet all night long.

00:03:38.850 --> 00:03:46.800
The next day the botnet had spread to 30,000 computers and wasn’t even close to finishing

00:03:46.800 --> 00:03:52.590
the full scan. After some tweaks and more testing and more scans, the botnet finished

00:03:52.590 --> 00:03:58.110
the scan of the internet looking for all devices running Telnet that had those default passwords.

00:03:58.110 --> 00:04:05.970
[MUSIC] The botnet discovered 1.2 million of these kinds of devices. Many of these vulnerable devices

00:04:05.970 --> 00:04:10.470
shouldn’t even be on the internet. They were TVs and industrial control systems,

00:04:10.470 --> 00:04:16.380
cameras, water sprinklers, none of which should be accessible from the internet. Out of those

00:04:16.380 --> 00:04:24.780
1.2 million vulnerable devices, the botnet got installed in 420,000 hosts. Not all the systems

00:04:24.780 --> 00:04:28.620
could run the program and they didn’t want to install it on any industrial control systems.

00:04:28.620 --> 00:04:35.670
Controlling 420,000 machines all at once was a complicated task. The researcher had to set up

00:04:35.670 --> 00:04:40.980
an elaborate system which included middle nodes and end nodes and each system had to be controlled

00:04:40.980 --> 00:04:46.230
individually to perform a different task. Some systems would get rebooted and their IPs would

00:04:46.230 --> 00:04:54.090
change. It was a constantly changing environment. What would you do if you had control of 420,000

00:04:54.090 --> 00:04:58.830
computers? With that many computers you could do a massive denial-of-service attack against

00:04:58.830 --> 00:05:04.200
your enemies or try to infect the world with a terrible virus. But this person had no evil

00:05:04.200 --> 00:05:08.070
intentions as far as we can tell. They were just a security researcher that was willing

00:05:08.070 --> 00:05:12.330
to break a few laws to try to understand the internet further. They now had a new

00:05:12.330 --> 00:05:18.240
mission which was to get a detailed scan of the entire internet. The first mission

00:05:18.240 --> 00:05:21.810
was just to see how many devices were running Telnet with default passwords.

00:05:21.810 --> 00:05:27.000
This new mission was to use those vulnerable devices to do a full scan of the internet;

00:05:27.000 --> 00:05:33.330
not just checking for Telnet but pinging every IP and checking the top 100 ports.

00:05:33.330 --> 00:05:38.010
In 2012 there really wasn’t that much data of people scanning the entire internet,

00:05:38.010 --> 00:05:42.990
partially because it just takes so long. If you were to scan ten IPs at a time it would

00:05:42.990 --> 00:05:48.090
take you over ten years to complete it. There were over 3.6 billion IPs allocated at the time,

00:05:48.090 --> 00:05:53.640
so scanning the whole internet required a lot of storage for the results. It also required a

00:05:53.640 --> 00:05:57.990
lot of time to complete the scan. They wanted to use this botnet to try to quickly scan the

00:05:57.990 --> 00:06:02.910
internet and see what’s out there. The scan they decided to do did numerous checks to see

00:06:02.910 --> 00:06:07.590
if the IP is alive, they would ping it and map it, and test to see if the top

00:06:07.590 --> 00:06:12.210
100 ports were open on it. Even though the internet had almost four billion IPs on it,

00:06:12.210 --> 00:06:17.250
the scan would really need to make over 60 billion probes to test all these different things. But

00:06:17.250 --> 00:06:23.310
with the help of 420,000 systems they calculated they could scan the whole internet in an hour.

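NOTE
Editor's sketch: the scan-time figures quoted above can be checked with quick arithmetic.
This is not the researcher's own calculation. The function names are made up for
illustration, and the inputs are the round numbers from the narration (3.6 billion
allocated IPs, ten IPs a second, 60 billion probes, 420,000 devices, one hour).
```python
SECONDS_PER_YEAR = 365 * 24 * 3600
def years_to_scan(total_ips: int, ips_per_second: float, scanners: int = 1) -> float:
    """Years needed to touch every IP once, assuming perfectly parallel scanners."""
    return total_ips / (ips_per_second * scanners) / SECONDS_PER_YEAR
def probes_per_device_per_second(total_probes: int, devices: int, seconds: int) -> float:
    """Probe rate each device must sustain to finish within the time budget."""
    return total_probes / (devices * seconds)
if __name__ == "__main__":
    # One scanner at ten IPs a second, as in the narration.
    print(f"one scanner: {years_to_scan(3_600_000_000, 10):.1f} years")
    # One hundred scanners divide the time by one hundred.
    print(f"100 scanners: {years_to_scan(3_600_000_000, 10, 100) * 365:.0f} days")
    # Rate each of the 420,000 devices would need for a one-hour full scan.
    rate = probes_per_device_per_second(60_000_000_000, 420_000, 3600)
    print(f"per-device rate: {rate:.1f} probes/s")
```
At ten IPs a second a single scanner needs a bit over eleven years, which matches the "over ten years" figure in the narration.
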
00:06:23.310 --> 00:06:29.610
But this creates a new problem; storing that many scan results creates a major logistics

00:06:29.610 --> 00:06:34.500
issue. We’re talking about having the ability to receive over one million events per second

00:06:34.500 --> 00:06:40.950
of data coming back from the scan. The researcher built a web application using Python and PHP and

00:06:40.950 --> 00:06:48.240
used Hadoop as a database. At this point the botnet was now fully built and ready to conduct

00:06:48.240 --> 00:06:53.160
a full scan of the internet. [PIANO] The researcher looked at this creation and decided to call it

00:06:53.160 --> 00:07:00.030
the Carna Botnet. Botnets are sometimes named after Roman or Greek gods and Carna was the

00:07:00.030 --> 00:07:05.070
Roman goddess known to protect the vital organs of the physical body. [ELECTRONIC] While the

00:07:05.070 --> 00:07:09.660
researcher was setting up the botnet, they noticed something strange. They were finding

00:07:09.660 --> 00:07:15.210
someone else was also building a botnet and using the exact same vulnerabilities. They

00:07:15.210 --> 00:07:19.470
were finding this other botnet on the same computers that the Carna Botnet was installed

00:07:19.470 --> 00:07:25.260
on. It was known as the Aidra Botnet. But the Aidra Botnet had malicious intent.

00:07:25.260 --> 00:07:30.720
It was being used to take down computers and do bad things. The researcher was able to

00:07:30.720 --> 00:07:36.750
detect that Aidra had infected over 30,000 of the same computers as the Carna Botnet. Being

00:07:36.750 --> 00:07:41.130
in this unique position, the researcher decided to block the Aidra Botnet from

00:07:41.130 --> 00:07:45.180
accessing devices. They were able to remove Aidra from the system and block

00:07:45.180 --> 00:07:51.740
that IP so Aidra wouldn’t come back. So Aidra started losing numerous nodes because of this.

00:07:51.740 --> 00:07:54.890
It fascinates me to think about these two botnets out there in the world,

00:07:54.890 --> 00:08:01.910
battling each other. After the Carna Botnet was built and more tests were done, it was time to

00:08:01.910 --> 00:08:09.020
conduct the full scan. The researcher gave the command for all 420,000 systems to scan

00:08:09.020 --> 00:08:15.770
the entire internet and it worked. All public IPs in the world were scanned and the data was

00:08:15.770 --> 00:08:21.620
collected on the results but to the researcher, that wasn’t enough. After building this massive

00:08:21.620 --> 00:08:28.550
botnet and an incredible infrastructure to support it, a single scan just wasn’t satisfying enough.

00:08:28.550 --> 00:08:35.390
They decided to scan a second time, and a third, and a fourth. In fact, they continued to scan the

00:08:35.390 --> 00:08:40.010
entire internet over and over, repeating it again and again, week after week,

00:08:40.010 --> 00:08:46.130
month after month. Because hour by hour and day by day, the internet changes. So conducting

00:08:46.130 --> 00:08:50.510
numerous scans of the entire internet would be the only way to understand exactly what’s out

00:08:50.510 --> 00:08:56.030
there. After six weeks of continually scanning the internet and collecting all the data, the

00:08:56.030 --> 00:09:01.820
researcher shut down the botnet. All the programs that were on the infected hosts quietly deleted

00:09:01.820 --> 00:09:06.200
themselves and all systems were returned just to how they were before the botnet was installed.

00:09:06.200 --> 00:09:11.300
That’s the end of the story for the Carna Botnet. Now begins the story of the internet

00:09:11.300 --> 00:09:17.150
census. [MUSIC] With all these billions of probes and data points collected from the Carna Botnet,

00:09:17.150 --> 00:09:20.600
it was now time for the researcher to pore through all this data and try to

00:09:20.600 --> 00:09:25.370
make sense of it. The researcher called this project the Internet Census of 2012.

00:09:25.370 --> 00:09:30.170
Because there was so much data it was not easy to figure out what to do with it. The researcher

00:09:30.170 --> 00:09:35.150
analyzed and calculated and reviewed the data in numerous ways. Now, I think what this researcher

00:09:35.150 --> 00:09:39.980
did next was absolutely brilliant. Yes, the work they did up to this point was brilliant as well,

00:09:39.980 --> 00:09:45.890
but if they just published this data in a big spreadsheet and a 40-page report, it probably

00:09:45.890 --> 00:09:50.450
would have gone unnoticed. All this data that’s in the database is interesting but it’s boring

00:09:50.450 --> 00:09:55.160
to read. It’s like reading a really dry technical book that’s just too long. Regions of the world

00:09:55.160 --> 00:10:00.890
were assigned a range of IP addresses. Africa gets one block, US gets another, and so forth.

00:10:00.890 --> 00:10:07.370
But even more specific states and cities are also given IP address ranges. The researcher started

00:10:07.370 --> 00:10:13.430
adding geographic locations to all the data they collected. GeoIP lookups were done on every IP

00:10:13.430 --> 00:10:18.470
address to determine where that computer was in the world. Eventually the data started to tell a

00:10:18.470 --> 00:10:24.500
story. The data was showing which IPs were online and where they were in the world. The researcher

00:10:24.500 --> 00:10:31.790
compiled all of this location data and placed it over a map of the world. This had amazing results.

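NOTE
Editor's sketch: the GeoIP step described above amounts to finding which known address
range contains each IP, then tallying hosts per region. The version below is a minimal
illustration, not the researcher's tooling. The range table and the names lookup_region
and hosts_per_region are hypothetical, and the ranges are reserved documentation
networks, not real regional allocations.
```python
import bisect
import ipaddress
from collections import Counter
from typing import Iterable, Optional
# Hypothetical table of (first address, last address, region), sorted by first address.
RANGES = sorted(
    (int(net.network_address), int(net.broadcast_address), region)
    for net, region in [
        (ipaddress.ip_network("192.0.2.0/24"), "Region A"),
        (ipaddress.ip_network("198.51.100.0/24"), "Region B"),
        (ipaddress.ip_network("203.0.113.0/24"), "Region C"),
    ]
)
STARTS = [start for start, _, _ in RANGES]
def lookup_region(ip: str) -> Optional[str]:
    """Return the region whose range contains ip, or None if no range matches."""
    value = int(ipaddress.ip_address(ip))
    # Binary search for the last range that starts at or before this address.
    index = bisect.bisect_right(STARTS, value) - 1
    if index >= 0 and value <= RANGES[index][1]:
        return RANGES[index][2]
    return None
def hosts_per_region(ips: Iterable[str]) -> Counter:
    """Count responding hosts per region; unmatched addresses go under 'unknown'."""
    return Counter(lookup_region(ip) or "unknown" for ip in ips)
```
Per-region counts like these are the kind of input a density map could be coloured from.
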
00:10:31.790 --> 00:10:39.560
[MUSIC] The security researcher compiled all the data and published it anonymously for the world

00:10:39.560 --> 00:10:45.110
to see. This included a lot of details on how the Carna Botnet was created as well as how all the

00:10:45.110 --> 00:10:50.540
data was collected, and of course the map of all the computers in the world. You know what,

00:10:50.540 --> 00:10:54.500
you’ve gotta see this map for yourself. If you can, right now, stop what you’re doing,

00:10:54.500 --> 00:10:58.400
go to darknetdiaries.com, find the Carna episode, and let’s take a look at this

00:10:58.400 --> 00:11:14.360
map together. I’ll pause for a minute for you to load it. [HUMMING] Okay,

00:11:14.360 --> 00:11:20.570
on the map you’ll see lots of dots. There’s a dot on the map for every computer and location

00:11:20.570 --> 00:11:27.860
that was in the database. There are billions of dots. It’s hard for me to describe it. It’s truly

00:11:27.860 --> 00:11:33.200
a case where the data is beautiful and brilliant and magical, but that probably doesn’t describe

00:11:33.200 --> 00:11:38.120
anything so I took a trip to my local hacker space and asked some friends to describe it.

00:11:38.120 --> 00:11:39.500
TED: It’s pretty.

00:11:39.500 --> 00:11:41.120
GREGORY: It’s true, it is pretty.

00:11:41.120 --> 00:11:42.260
BARRY: That’s insane.

00:11:42.260 --> 00:11:44.210
KURT: Pretty cool, pretty cool.

00:11:44.210 --> 00:11:52.970
CURTIS: Wow. There’s a lot of dark areas. No real surprise. This is impressive.

00:11:52.970 --> 00:11:54.050
GREGORY: Pretty colors.

00:11:54.050 --> 00:11:58.730
JEN: It looks remarkably, densely internetty.

00:11:58.730 --> 00:12:04.460
ALLEN: The amount of technology that is on this planet, just at a glimpse is insane.

00:12:04.460 --> 00:12:09.586
ZACK: I didn’t expect Brazil to have that much. Brazil is a lot denser than I expected it to be.

00:12:09.586 --> 00:12:09.612
MICHAEL: Yeah, absolutely.

00:12:09.612 --> 00:12:16.520
CARLOS: Seems kind of surprising that Europe seems to have a greater concentration than the

00:12:16.520 --> 00:12:20.630
United States. Yeah, I would have expected the United States to be a lot more red than that.

00:12:20.630 --> 00:12:26.330
JACK: The map they’re looking at has billions of dots all over the globe. Regions that have a

00:12:26.330 --> 00:12:31.550
high concentration of computers online show up red and very bright, and regions that have

00:12:31.550 --> 00:12:36.980
a low number of computers show up blue. Areas that have no computers are completely dark.

00:12:36.980 --> 00:12:41.480
ZACK: The brightness of America doesn’t surprise me at all.

00:12:41.480 --> 00:12:48.380
BARRY: Australia is like – the coast lit up and I know that Australia is really barren in the

00:12:48.380 --> 00:12:54.710
middle. It’s mostly desert but it’s still nice to see how tightly packed it is towards the water.

00:12:54.710 --> 00:12:59.180
KURT: New Zealand is amazing. It’s like, the whole thing.

00:12:59.180 --> 00:13:05.930
CARLOS: Look at the islands in the Caribbean. They look almost like

00:13:05.930 --> 00:13:11.360
they’re forming a continuous line of lands all the way up to Florida.

00:13:11.360 --> 00:13:17.480
JEN: I’m looking at bright spots in the middle of the water and thinking about what that means.

00:13:17.480 --> 00:13:23.660
JACK: But this map is even more amazing than just dots on the world. The researcher had so

00:13:23.660 --> 00:13:28.070
much data from scanning the internet over and over and over that they were able to create an

00:13:28.070 --> 00:13:34.130
animated map showing the daytime/nighttime cycle. Along with this animation we can see

00:13:34.130 --> 00:13:39.980
what hour of the day different regions of the world come online and go offline.

00:13:39.980 --> 00:13:46.310
JEN: I’m watching the sun shadow pass over the lights and matching that up.

00:13:46.310 --> 00:13:51.050
TED: What I’m seeing is – when you look at it you’re seeing it how it lights up.

00:13:51.050 --> 00:13:57.140
It lights up in almost a cascade. It goes from bottom to top in a wave, basically,

00:13:57.140 --> 00:14:00.200
in how it lights up. Really interesting.

00:14:00.200 --> 00:14:09.790
GREGORY: Italy goes full load earlier than the rest of Europe. It’s almost like Italy is a

00:14:09.790 --> 00:14:17.950
couple hours ahead of the rest of Europe ‘cause it also – it surges earlier and it drops off earlier.

00:14:17.950 --> 00:14:22.420
ALLEN: Middle of Australia, there’s a huge area where there’s no computers turning on and off.

00:14:22.420 --> 00:14:25.840
ZACK: Notice how also bright that India is.

00:14:25.840 --> 00:14:29.530
KURT: I love that when you go way up north, like North Pole,

00:14:29.530 --> 00:14:33.624
Greenland and stuff, you still see activity like way, way out.

00:14:33.624 --> 00:14:33.664
SHAUNTY: Can we zoom in? Can we zoom in?

00:14:33.664 --> 00:14:40.480
JACK: Everybody who I showed this map to marveled at the magnificence of the data

00:14:40.480 --> 00:14:44.980
they were looking at. Some people noticed Los Angeles comes online about the same time as

00:14:44.980 --> 00:14:50.110
New York. Some people noticed it’s completely dark in North Korea, and other people saw that Canada,

00:14:50.110 --> 00:14:56.140
Russia, and the northern parts were all dark except Scandinavia. Even at extreme

00:14:56.140 --> 00:15:01.390
northern latitudes, it’s lit up. Because the security researcher created such a beautiful

00:15:01.390 --> 00:15:06.430
map to display the data collected, this map went viral and spread across the world.

00:15:06.430 --> 00:15:11.350
Everyone got to marvel at how big the internet was. This is the first map of the internet and

00:15:11.350 --> 00:15:16.990
it amazed us all. Now, a half decade later, I still see this map pop up in my social feeds

00:15:16.990 --> 00:15:22.330
from time to time with someone new discovering it and swooning over its beauty. Most people see this

00:15:22.330 --> 00:15:26.830
map and have no idea what it took to create it. But because of how beautiful the map is,

00:15:26.830 --> 00:15:33.070
to them it doesn’t matter how it was created. It’s still marvelous and worth spending a minute

00:15:33.070 --> 00:15:37.870
to look at. The creator of this botnet remained anonymous and nobody ever openly

00:15:37.870 --> 00:15:42.280
took credit for this. This is because even though the Carna Botnet had good intentions,

00:15:42.280 --> 00:15:47.170
it was still illegal since it uploaded and ran programs on machines that weren’t owned by the

00:15:47.170 --> 00:15:51.820
researcher. The botnet creator had to stay hidden and anonymous after publishing the

00:15:51.820 --> 00:15:57.370
data. This story probably would have ended right here if it wasn’t for one person.

00:15:57.370 --> 00:16:02.230
PARTH: My name is Parth Shukla. I’m currently a security engineer at Google here in Switzerland.

00:16:02.230 --> 00:16:08.500
Previously before Google I used to work for AusCERT, the Australian Computer Emergency

00:16:08.500 --> 00:16:14.800
Response Team based in Brisbane in Australia. When I first read about this I had just started working

00:16:14.800 --> 00:16:22.540
for AusCERT. It was my first month. This is my first IT security job ever. I was still studying

00:16:22.540 --> 00:16:30.040
at the time, I still hadn’t graduated. I was the newbie in – I read this thing, I went well,

00:16:30.040 --> 00:16:34.150
this is interesting. I don’t know what we’re supposed to do. I’m looking for guidance from

00:16:34.150 --> 00:16:40.600
the senior people because I’m not sure what the standard response procedure is within the company.

00:16:40.600 --> 00:16:44.620
I think someone suggested just e-mail the guy. I went what? They’re like yeah,

00:16:44.620 --> 00:16:48.760
just e-mail him. Maybe he’ll give you something. I think it was actually in jest. They made a joke,

00:16:48.760 --> 00:16:52.660
it was like yeah, as if you’re gonna hear back. I’m like okay,

00:16:52.660 --> 00:16:58.150
I guess I can do that. So I found the e-mail that was, I think on the GitHub page already

00:16:58.150 --> 00:17:04.450
and I sent him an encrypted e-mail saying hey, can you give us since we’re AusCERT,

00:17:04.450 --> 00:17:09.250
we’re supposed to look after the Australian interest. Can you give us the compromised IPs that

00:17:09.250 --> 00:17:19.150
you used for the botnet scan for Australia only? I got back a response that said actually, you’re the

00:17:19.150 --> 00:17:29.570
first person to contact me and here is everything. I was pretty shocked. That’s how that started.

00:17:29.570 --> 00:17:34.070
JACK: When he read about the Carna Botnet, there was one thing that stood out to him. Those

00:17:34.070 --> 00:17:40.280
1.2 million systems that were on the internet running Telnet and using default passwords. He

00:17:40.280 --> 00:17:45.170
thought there should be no reason for this many unsecured devices to be out there. He wanted to

00:17:45.170 --> 00:17:50.600
understand that problem further. When he asked the botnet creator for just the vulnerable devices in

00:17:50.600 --> 00:17:56.690
Australia, the researcher gave Parth the full list of all 1.2 million vulnerable devices.

00:17:56.690 --> 00:18:06.860
PARTH: The data itself was about 882 MB. It was a big text file that was formatted with

00:18:06.860 --> 00:18:11.810
tabs and it just basically contained MAC addresses, manufacturers, RAM, hostname,

00:18:11.810 --> 00:18:18.050
CPU info, IPs, country codes of all the devices. Approximately 1.2 - 1.3 million.

00:18:18.050 --> 00:18:22.670
JACK: Parth got busy trying to make sense of the data. First he did everything he

00:18:22.670 --> 00:18:26.630
legally could do to verify the data. He organized the data in different ways,

00:18:26.630 --> 00:18:31.070
figuring out which countries had the most vulnerable systems and which manufacturers

00:18:31.070 --> 00:18:33.770
were responsible for creating the most vulnerable devices.

00:18:33.770 --> 00:18:38.960
PARTH: To me, these kind of – this indicated, for example, the manufacturer indicated this

00:18:38.960 --> 00:18:44.600
was a systemic issue. They were building and shipping devices that were vulnerable

00:18:44.600 --> 00:18:49.430
from the factory and they were shipping them en masse and that’s why they were – this one

00:18:49.430 --> 00:18:52.700
or two manufacturers – I think there were like, three really big manufacturers that

00:18:52.700 --> 00:18:59.540
were over-represented in the data. For the IPs it was a little harder because certain

00:18:59.540 --> 00:19:05.900
countries were over-represented but they also had more devices allocated to them globally anyway.

00:19:05.900 --> 00:19:12.140
Percentage-wise, they were not that bad. Actually, one of the things that I did in my research paper

00:19:12.140 --> 00:19:22.250
is, I tried to figure out how easy it is – it would have been to find a vulnerable device.

00:19:22.250 --> 00:19:28.250
If you started scanning a random IP range in a particular country of interest, how long would

00:19:28.250 --> 00:19:35.510
it take you? I published a table as part of my paper of the number of seconds it would take you

00:19:35.510 --> 00:19:41.660
to find a vulnerable device given the statistics we have. We know from all the internet registries,

00:19:41.660 --> 00:19:47.170
all the allocated devices – sorry, all the allocated IP ranges for each of the countries. We

00:19:47.170 --> 00:19:54.550
know from Carna Botnet all the number of devices in each country. We can do some simple maths to

00:19:54.550 --> 00:20:02.650
figure out percentages and likelihoods. I think, for example, the device – I think for Australia,

00:20:02.650 --> 00:20:07.120
for example, what I was interested in, if you started scanning randomly within

00:20:07.120 --> 00:20:12.640
just the Australian IP address range it would take you about an hour on average to find one

00:20:12.640 --> 00:20:17.950
vulnerable device. Whereas in China it would take you an average of about twenty seconds.

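NOTE
Editor's sketch: the "simple maths" described above can be read as follows. If a fraction
p of a country's allocated addresses is vulnerable, a random probe hits one with
probability p, so about 1/p probes are needed on average. This is an illustration under
that assumption, not the method from Parth's paper. The function name is hypothetical
and the numbers in the example are invented placeholders, not figures from the data.
```python
def expected_seconds_to_hit(allocated_ips: int, vulnerable_devices: int, probes_per_second: float) -> float:
    """Mean time until a uniformly random probe lands on a vulnerable address."""
    if vulnerable_devices <= 0:
        raise ValueError("no vulnerable devices to find")
    hit_probability = vulnerable_devices / allocated_ips
    # Mean number of independent trials before the first success is 1 / p.
    expected_probes = 1 / hit_probability
    return expected_probes / probes_per_second
if __name__ == "__main__":
    # Invented example: 1,000,000 allocated IPs, 100 vulnerable, 10 probes a second.
    print(f"{expected_seconds_to_hit(1_000_000, 100, 10):.0f} seconds on average")
```
A country with more vulnerable devices per allocated address gives a shorter expected time, which is the effect behind the Australia and China comparison.
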
00:20:17.950 --> 00:20:21.160
JACK: When Parth started realizing how vulnerable the internet was,

00:20:21.160 --> 00:20:23.350
he decided to do something about it.

00:20:23.350 --> 00:20:30.310
PARTH: The end result was that I talked to over twenty CERTs from different countries. I notified

00:20:30.310 --> 00:20:35.620
all the CERTs that had more than ten thousand devices in their countries. I actually e-mailed

00:20:35.620 --> 00:20:40.450
them a copy of the relevant data. For example, for the US I would have sent them a copy of all

00:20:40.450 --> 00:20:45.760
the US compromised devices. For China, I sent them all the Chinese compromised devices. The intention

00:20:45.760 --> 00:20:50.320
there is, this is kind of the job the CERTs are trying to coordinate with other national agencies

00:20:50.320 --> 00:20:55.420
who would know better how to handle the situation in their local country. The Chinese would know

00:20:55.420 --> 00:20:59.470
okay, which manufacturers or which carriers they should go talk to and they have their own

00:20:59.470 --> 00:21:06.010
national contacts. To me, the responsibility here lies more-so with the manufacturer because they

00:21:06.010 --> 00:21:12.730
sold you a device with certain promises. From the manufacturer sides, I actually contacted the IEEE.

00:21:12.730 --> 00:21:18.310
JACK: The IEEE is an organization that creates standards for electronic components. They are

00:21:18.310 --> 00:21:22.660
the authority figure for which manufacturer can use which MAC address. The MAC address

00:21:22.660 --> 00:21:27.670
is a local designator assigned to every network interface on every device in the world. Parth had

00:21:27.670 --> 00:21:32.320
a list of 1.2 million MAC addresses as part of the data he got from the botnet creator.

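NOTE
Editor's sketch: the first three octets of a MAC address (the OUI) identify the registered
manufacturer, which is how a list of MAC addresses can be grouped by vendor. The code
below illustrates that grouping and is not necessarily how Parth processed the data.
The OUI table, the sample vendor names, and the function names are invented placeholders.
```python
from collections import Counter
from typing import Iterable
# Hypothetical OUI-to-vendor table; the real registry is maintained by the IEEE.
OUI_VENDORS = {"00:11:22": "Example Vendor A", "AA:BB:CC": "Example Vendor B"}
def oui(mac: str) -> str:
    """Normalise a MAC address and return its first three octets."""
    octets = mac.replace("-", ":").upper().split(":")
    if len(octets) != 6:
        raise ValueError(f"not a MAC address: {mac!r}")
    return ":".join(octets[:3])
def count_by_vendor(macs: Iterable[str]) -> Counter:
    """Tally devices per vendor; prefixes missing from the table count as 'unknown'."""
    return Counter(OUI_VENDORS.get(oui(mac), "unknown") for mac in macs)
```
Sorting such a tally by count would show which manufacturers are over-represented.
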
00:21:32.320 --> 00:21:37.120
PARTH: I went to the IEEE and said these are the manufacturers we have derived. These are the

00:21:37.120 --> 00:21:43.120
top ten or twenty. I can’t remember the exact number. Can you give us their contact details?

00:21:43.120 --> 00:21:49.630
Because I want to contact them. You should have the authoritative information on this;

00:21:49.630 --> 00:21:53.680
I don’t want to just go on ‘cause a lot of corporations can share the same name or

00:21:53.680 --> 00:21:59.560
have similar names. I want the authoritative info from you and if I remember correctly,

00:21:59.560 --> 00:22:05.830
they denied the request. They said they can’t share it for privacy reasons. But they said

00:22:05.830 --> 00:22:11.050
if I had something to pass on, they would pass it on. I remember writing quite a terse letter

00:22:11.050 --> 00:22:17.950
with my contact details and saying please reach out to me; I have something to share with you.

00:22:17.950 --> 00:22:25.870
I know that ten or fifteen manufacturers I reached out to via the IEEE, only one replied

00:22:25.870 --> 00:22:31.660
and that was one of the Turkish manufacturers that was quite well represented for Turkey.

00:22:31.660 --> 00:22:36.460
They contacted me asking for more details, then I contacted them back. I think we did some phone

00:22:36.460 --> 00:22:43.540
calls to make sure authenticity was good. Then I sent them an anonymized version of the data,

00:22:43.540 --> 00:22:49.270
so I removed basically the IP addresses but I sent them just the devices that had them

00:22:49.270 --> 00:22:54.040
as the manufacturer to help them figure out which of their particular devices are

00:22:54.040 --> 00:22:59.350
actually vulnerable. I’m hoping the Turkish one ended somewhere. I haven’t heard from

00:22:59.350 --> 00:23:07.050
them since. I gave the data and fingers crossed they did something good with it.

00:23:07.050 --> 00:23:11.070
JACK: With the data Parth collected from the Carna Botnet, he made it his mission to try

00:23:11.070 --> 00:23:16.260
to resolve this problem of so many vulnerable systems being online. He thought by contacting

00:23:16.260 --> 00:23:20.850
CERTs in other countries he could help clean up the vulnerable devices out there. By contacting

00:23:20.850 --> 00:23:24.390
the device manufacturers he could stop them from creating vulnerable devices,

00:23:24.390 --> 00:23:28.890
but it didn’t seem like very many CERTs or manufacturers were interested in helping

00:23:28.890 --> 00:23:33.180
solve the problem. Parth was having a hard time getting organizations to

00:23:33.180 --> 00:23:38.340
pay attention to this problem. But there were some people who were paying attention

00:23:38.340 --> 00:23:43.710
to this data. [MUSIC] Hackers with malicious intent were seeing how the Carna Botnet was

00:23:43.710 --> 00:23:47.910
created and started making their own botnets using the exact same methods.

00:23:47.910 --> 00:23:55.080
PARTH: There’s been multiple – and I’m sure there’s hundreds of them running right now.

00:23:55.080 --> 00:23:59.910
The tool called Lightaidra exploited the exact same vulnerability. It was released

00:23:59.910 --> 00:24:04.140
in parallel, I think, just a little earlier before the Carna Botnet data was released. I

00:24:04.140 --> 00:24:08.220
think it was independently discovered. It’s not a complicated issue to be discovered,

00:24:08.220 --> 00:24:15.420
right. That led other people in the community to go hey, this is so simple. I just click and like

00:24:15.420 --> 00:24:19.800
I said, on average about, depending on where you point, on average anywhere from ten seconds to

00:24:19.800 --> 00:24:26.070
180 seconds you’ll definitely find an IP address that’s vulnerable. That’s a really good hit rate.

00:24:26.070 --> 00:24:36.510
[MUSIC] What I really like about the Carna Botnet data, in hindsight, is it came before this became

00:24:36.510 --> 00:24:42.570
a big thing, before many of these botnets started forming, exploiting the same vulnerability over

00:24:42.570 --> 00:24:53.910
and over again. I feel like we have the largest, most accurate data before other botnets took over

00:24:53.910 --> 00:24:59.850
and started shutting down the port, Telnet port, which would stop further investigation. This,

00:24:59.850 --> 00:25:05.340
to me, seemed like a really nice imprint, the 1.3 million devices vulnerable worldwide,

00:25:05.340 --> 00:25:11.220
quite accurate at the time because he did it multiple times over a course of months.

00:25:11.220 --> 00:25:15.780
I’m referring to the anonymous researcher as he. I don’t actually know if that’s true. For

00:25:15.780 --> 00:25:19.470
the whole year – I worked for AusCERT for about a year and a half and out of that,

00:25:19.470 --> 00:25:25.380
for a whole year, I was working on just this. I was very lucky, very lucky that AusCERT allowed

00:25:25.380 --> 00:25:28.830
me to spend that kind of time on something that wasn’t actually related to Australia.

00:25:28.830 --> 00:25:31.980
JACK: There were a few people in the security community that condemned the

00:25:31.980 --> 00:25:34.950
data that came out of the Carna Botnet, saying that because the

00:25:34.950 --> 00:25:38.280
data was illegally obtained we should not use it for any legitimate research.

00:25:38.280 --> 00:25:44.910
PARTH: I agree it’s an illegal botnet. There’s no way I can disagree with that statement.

00:25:44.910 --> 00:25:51.420
The use of the data, I guess obviously my position’s been clear since you can see how

00:25:51.420 --> 00:25:58.080
I’ve used it. I haven’t really had any big, ethical qualms about it but in my opinion,

00:25:58.080 --> 00:26:03.030
the reason I think the researcher even bothered to give us this data is that he also wanted this

00:26:03.030 --> 00:26:08.130
problem fixed. That’s very clear by the multiple e-mails. I sent them quite a lot of questions

00:26:08.130 --> 00:26:14.370
and he continued answering them. When I did my first presentation at the AusCERT conference,

00:26:14.370 --> 00:26:20.400
I sent him the slides and he replied he was happy with that outcome.

00:26:20.400 --> 00:26:25.650
Since then he’s stopped communicating. My conclusion from those events is he

00:26:25.650 --> 00:26:30.180
got what he wanted. He wanted publicity. He wanted a proper analysis done from someone

00:26:30.180 --> 00:26:36.360
that has a good reputation, as AusCERT does in Australia. Once he got all of those,

00:26:36.360 --> 00:26:40.410
he was happy. I see that the reason he went into this effort to provide this data,

00:26:40.410 --> 00:26:46.440
to answer these questions, is that he didn’t create this botnet because he wanted to own

00:26:46.440 --> 00:26:51.630
the world and destroy things and make a profit. He realized it’s a problem and he wanted it fixed.

00:26:51.630 --> 00:26:55.050
JACK: Parth, do you think you were the only person to contact the botnet creator?

00:26:55.050 --> 00:27:03.090
PARTH: Yes. The creator said so. The last communication I had with the creator – we

00:27:03.090 --> 00:27:07.320
exchanged two or three e-mails and the last one I just checked. I think it was a few

00:27:07.320 --> 00:27:12.630
months after our initial contact. I said hey, has anyone contacted you yet? The response was no,

00:27:12.630 --> 00:27:16.350
you’re still the only one. Then I haven’t had any contact with the researcher since.

00:27:16.350 --> 00:27:18.780
JACK: Still to this day, the creator remains

00:27:18.780 --> 00:27:21.180
anonymous but do you have any thoughts on who it might be?

00:27:21.180 --> 00:27:27.420
PARTH: At the time in 2012, storage of nine terabytes of data was not cheap. He had to

00:27:27.420 --> 00:27:36.630
store it and he had to compress it using ZPAQ which is incredibly CPU expensive. Actually,

00:27:36.630 --> 00:27:43.020
related to this, for the internet census part, the public data, I did my undergraduate thesis

00:27:43.020 --> 00:27:49.200
on it. I had to decompress that data to be able to access the raw data so I could index the raw

00:27:49.200 --> 00:27:56.760
data and then do some analysis on it. That took me – the university had a high performance computing

00:27:56.760 --> 00:28:03.960
cluster of I think, four hundred machines, I think six hundred CPUs and even on that it took a day to

00:28:03.960 --> 00:28:10.860
decompress all this data. It went from 500 GB to nine terabytes once you decompressed it.

00:28:10.860 --> 00:28:15.570
That decompression took me a day on a high performance computing cluster with

00:28:15.570 --> 00:28:22.050
three hundred CPUs. In my mind I just went, whoever this is obviously has a lot of money

00:28:22.050 --> 00:28:28.320
because the claim was he did it on an Amazon cluster. This would cost ridiculous – back in

00:28:28.320 --> 00:28:37.650
2012 with Amazon prices try storing nine terabytes there for more than six months,

00:28:37.650 --> 00:28:43.620
then continuous collection, and then CPU crunch to compress it so you can upload the 500 gigabytes. I

00:28:43.620 --> 00:28:49.710
just see that there’s a lot of layers here, where the only conclusion I could come up with was this

00:28:49.710 --> 00:28:56.010
was probably an already established researcher who was doing some private home research and

00:28:56.010 --> 00:29:03.030
didn’t want the data associated with his public identity. That’s the best I could come up with.

00:29:03.030 --> 00:29:04.890
JACK: Why did you stop working on this data?

00:29:04.890 --> 00:29:12.000
PARTH: This is a battle that seemed like we should be able to win but I made no

00:29:12.000 --> 00:29:19.020
progress. My focus has now personally shifted towards focusing on problems that I can fix at

00:29:19.020 --> 00:29:24.660
hand. Whenever industry-wide impacts like these are necessary, you actually have to

00:29:24.660 --> 00:29:32.880
propose a solution at a specification level. For example, MAC addresses are controlled by

00:29:32.880 --> 00:29:39.210
IEEE. If IEEE made a mandate on something then these manufacturers would be forced

00:29:39.210 --> 00:29:45.240
to follow it. Currently the IEEE is not in the business of making mandates on security. That

00:29:45.240 --> 00:29:50.520
would be an uphill battle but that’s a battle that now you actually know a specific person,

00:29:50.520 --> 00:29:56.220
a specific entity that you can get involved with. There’s [inaudible] that are open to participation

00:29:56.220 --> 00:30:02.340
to a certain level of people, then now you have some hope of how you can address this systemic

00:30:02.340 --> 00:30:08.670
issue through one – by catching yourself with the core problem. For this particular example,

00:30:08.670 --> 00:30:11.880
I don’t have a solution but I’m just giving an example. IEEE is an example.

00:30:11.880 --> 00:30:17.640
One of the reasons I dropped working on this a lot, I left AusCERT, that’s for one thing,

00:30:17.640 --> 00:30:23.520
but I also haven’t spent any significant time chasing this up because I think this

00:30:23.520 --> 00:30:32.220
is a dead end. Trying to get manufacturers to pay attention through the public face is a nightmare

00:30:32.220 --> 00:30:37.140
because what matters to manufacturers most is maintaining good PR. If that’s

00:30:37.140 --> 00:30:41.760
how you attack them, then they’re going to be defensive. The way to get the problem fixed,

00:30:41.760 --> 00:30:45.870
if that’s what you really care about, is to go through the back channels. Find the engineers,

00:30:45.870 --> 00:30:53.250
find the people who know what it is, who actually make these designs. A lot of times what I find,

00:30:53.250 --> 00:30:57.690
there is a tendency in security to go – look at these developers. They don’t know what they’re

00:30:57.690 --> 00:31:02.460
doing. They’re idiots, right? But what I find a lot of times, is if you talk to these engineers

00:31:02.460 --> 00:31:07.080
who actually made these products, who decided to leave Telnet open with default credentials,

00:31:07.080 --> 00:31:12.900
you realize that given the circumstances they were in, it was not a stupid decision.

00:31:12.900 --> 00:31:17.730
They had deadline crunch, they had all these other things that are happening. They had to

00:31:17.730 --> 00:31:23.010
leave the default creds open in case the device wasn’t set up correctly, so Help Desk can dial

00:31:23.010 --> 00:31:28.350
in remotely and make sure that everything works properly for the layman consumer. There’s all

00:31:28.350 --> 00:31:32.610
these requirements that are imposed on these engineers. They try their best to convey them

00:31:32.610 --> 00:31:37.350
and sometimes they don’t have security backgrounds nor are they trained to be aware of these security

00:31:37.350 --> 00:31:41.820
problems. When you actually sit down and have a chat with them or convince them, a) I think it’s

00:31:41.820 --> 00:31:46.710
a lot easier to convince them because they can see problems because they have the same mindset

00:31:46.710 --> 00:31:52.560
and then once they see them, they will start looking for a proper solution themselves. Then

00:31:52.560 --> 00:31:56.400
you can exclude yourself from this problem now because you’ve just made the correct people aware

00:31:56.400 --> 00:32:02.940
what the problem is. That’s the lessons I’ve taken away. This was a brutal entrance to the

00:32:02.940 --> 00:32:10.440
security industry for me. It’s my first thing I did in this job. There’s nothing more I could do,

00:32:10.440 --> 00:32:16.472
right; I tried my best. It’s time to move on to something that’s not as soul-crushing.

00:32:16.472 --> 00:32:27.750
JACK (OUTRO): [OUTRO MUSIC] You’ve been listening to Darknet Diaries. For show notes and links,

00:32:27.750 --> 00:32:32.940
check out darknetdiaries.com. There you’ll find the full animated map of the internet census as

00:32:32.940 --> 00:32:37.680
well as all the research that Parth has published including his full presentation. A very special

00:32:37.680 --> 00:33:17.580
thanks to everyone who had comments about the map. This show is made entirely by me, Jack Rhysider.
