Who Stole My Stuff? Finding Out Who Is Behind A Website

The last blog post I wrote about Fake Twitter profiles is by far the most widely-read post I’ve ever published since I started this blog. It’s nice when people read your work but there was a little sour note with this post after it was pointed out to me that someone had been posting it on Reddit and claiming it as their own work from a site called “Learnworthy”. It’s one thing to forget to credit someone when you quote their work but copying it verbatim and claiming you wrote it yourself is dishonest. Content-stealing for SEO manipulation is part of life on the internet of course, but it doesn’t mean it has to be tolerated. Plagiarism is stealing, and I want to know who stole my stuff. In this post I’ll look at a few different ways of digging into a website before showing how it was possible to identify who actually runs it. I was greatly assisted in this by Jeff Lomas aka @bleubloodhound.

 

The Fake Author

My article was copied by the website Learnworthy.net. The author was “Alicia Newman”. You can see that “Alicia” has made quite a few contributions to Learnworthy. She’s quite the prolific author (I’m just going to post screenshots of the site so that I don’t have to link to it):

There’s only one problem with Alicia – she isn’t a real person. Her profile picture returns no reverse image hits at all, which is a little unusual. Look at her profile picture a little more closely and it becomes clear why:

Ironically the face of the person who stole my article on AI-generated faces is itself an AI-generated face. The eye and mouth position are a slight giveaway – but the biggest clue is the face of the person next to her with the weird-shaped eye in the wrong position. Alicia’s creator must have got a little suspicious (perhaps after reading the Twitter thread about this) because as I was preparing this article her profile suddenly changed:

There are some innocent parties that feature in this mini-investigation so to avoid harming their reputations I am not mentioning their names or linking to their websites. A reverse image search on this image will bring up the true identity of the person in the photo. She is a real person who is not called Alicia Newman and I’m sure she has no idea that she’s now the face of Learnworthy.net.

“Alicia” is about as real as the Tooth Fairy, but what else can we learn about her site and who might be behind it? There are a number of different OSINT techniques that can be employed for digging into websites, so let’s go through them and see what they show.

Wayback Machine

Where better to start than at the beginning? We can load the site into the Internet Archive and see that once upon a time it used to look like this:

The site was created and run by a professional educator who used the site to write about technological innovation in education. You can see that the username was “clong” – this is an important detail that I’ll come back to later on. We’ll see that this domain has changed hands over time – and this is really important in internet investigations. Just because someone was once affiliated to a website, domain, or IP address doesn’t mean that they are now. If you only ever did a limited Google search into this site you’d conclude that “clong” is the person responsible for the site because that’s what the initial search results indicate – but you’d be wrong. The most obvious answer is not always the right one.

Browsing through the archived web pages shows that Learnworthy was in this format for several years, but between the captures on 15th January 2019 and 17th August 2019 there was a big change in the site’s appearance and content. It now looked like this:

At some point “clong” disappeared and “Alicia Newman” took over. It’s very easy to find out who the real “clong” is and that he’s based in California, USA, but Alicia is a bit more of a mystery. Learning a little more about the sudden change in content might be key to finding out who “Alicia” really is. We can look at a few other web artefacts that will give us a little bit more information.
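To narrow down when the site changed, the list of captures between two dates can be pulled from the Internet Archive's CDX API rather than clicked through by hand. This is a minimal sketch, not the method used for this post: the helper names are my own, `example.com` is a placeholder domain, and the column layout assumed in `parse_cdx` is the default one the CDX server returns when asked for JSON.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"


def build_cdx_url(domain, start, end):
    """Build a CDX query for captures of `domain` between two YYYYMMDD dates."""
    params = {
        "url": domain,
        "from": start,
        "to": end,
        "output": "json",
        "collapse": "digest",  # drop consecutive captures with identical content
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def parse_cdx(rows):
    """Turn CDX JSON (one header row, then data rows) into a list of dicts."""
    if not rows:
        return []
    header, *data = rows
    return [dict(zip(header, row)) for row in data]


def snapshot_url(capture):
    """Address of the archived copy for one capture."""
    return "https://web.archive.org/web/{timestamp}/{original}".format(**capture)


def fetch_captures(domain, start, end):
    """Network call: list the captures in the window. Not run on import."""
    with urlopen(build_cdx_url(domain, start, end), timeout=30) as resp:
        return parse_cdx(json.load(resp))
```

Calling `fetch_captures("example.com", "20190115", "20190817")` with the real domain substituted in would list every distinct capture in the window where the redesign happened, and `snapshot_url` turns each one into a link that can be opened and compared.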

Whois

Whois lookups are rarely so generous as to contain the real name and contact details of a website owner, and Learnworthy is no exception. I’ve written about ways around this in the past, but just because the registrant details are not visible does not mean that we can’t learn anything from the Whois records.

For Learnworthy we can see it was first registered on 25th May 2013, but that there was a recent update on 29th November 2019. Using RiskIQ’s archived Whois data it’s possible to see that the registration was renewed regularly on 18th May (i.e. one week before registration expiry) every year – until there was an out of schedule change on 30th May 2019, followed by another on 7th July 2019:

In the update for 7th July 2019 there’s an interesting little tidbit in the new Whois details:

We still have no idea who the registrant is – but their location details are now the state of Kosovo in Albania. (I’m fully aware of the political sensitivities around Kosovo and Albanian integration, but for these Whois purposes, Kosovo is a state in Albania.) The change of registrant on 7th July 2019 is significant because it indicates a shift from the USA to Kosovo – but why did this happen, and can we find out any more about it?
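Spotting an out-of-schedule Whois change is simple enough to automate once the update dates are known. The sketch below is only an illustration of that reasoning: the function name is hypothetical, and the list of renewal years is an assumption reconstructed from the description above (annual renewals on 18th May, then two changes in 2019), not an export from RiskIQ.

```python
from collections import Counter
from datetime import date


def out_of_schedule(updates):
    """Return the update dates that do not fall on the most common month/day."""
    if not updates:
        return []
    usual, _ = Counter((d.month, d.day) for d in updates).most_common(1)[0]
    return [d for d in sorted(updates) if (d.month, d.day) != usual]


# Assumed history: a renewal every 18th May from 2014 to 2018, followed by the
# two unscheduled changes described in the text.
updates = [date(year, 5, 18) for year in range(2014, 2019)] + [
    date(2019, 5, 30),
    date(2019, 7, 7),
]

for changed in out_of_schedule(updates):
    print(changed.isoformat())
```

With these dates the two 2019 changes are the only ones flagged. On a real record the "usual" day can drift by a day or two, so a tolerance window would be a sensible refinement.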

IP Address & DNS

In this particular example knowing the IP address of the website is not hugely useful in terms of finding out who is behind it. It’s possible to host a website almost anywhere in the world and there is no requirement for a website to be hosted in the country the owner lives in. Using historic hosting data from RiskIQ shows that the domain has moved round quite a bit since it was created in 2013:

If you don’t have access to RiskIQ then SecurityTrails offers a lot of similar information. Notice that there are hosting changes around the 7th July and 29th November 2019 – the same times as there were corresponding domain registration changes. The current host is the one I’m most interested in, and the IP addresses 194[.]1[.]147[.]9 and 194[.]1[.]147[.]95 both belong to WPXHosting. The DNS records also indicate that the site is now using WPXHosting’s nameservers. The IP and DNS data don’t tell us who Alicia Newman really is, but the fact that WPXHosting is a dedicated WordPress hosting company tells us that Alicia’s site is almost certainly powered by WordPress, and WordPress is often an OSINT goldmine.
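IP addresses in posts like this are usually written defanged, with `[.]` in place of each dot, so they have to be restored before any tool will accept them. Here is a small sketch using Python's standard `ipaddress` module to do that and to check whether the two addresses sit in the same block. The function names are my own, and the /24 prefix is an assumption chosen for illustration: sharing a /24 is a hint that two addresses have the same host, not proof of it.

```python
import ipaddress


def refang(indicator):
    """Undo the common '[.]' defanging so the value can be parsed."""
    return indicator.replace("[.]", ".")


def same_network(a, b, prefix=24):
    """True if both IPv4 addresses fall inside the same /prefix block."""
    network = ipaddress.ip_network(f"{refang(a)}/{prefix}", strict=False)
    return ipaddress.ip_address(refang(b)) in network


print(same_network("194[.]1[.]147[.]9", "194[.]1[.]147[.]95"))
```

For the two addresses above this prints `True`. A Whois lookup on the range is still needed to confirm who the block is actually allocated to.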

Certificates

Before digging into the WordPress setup I had a look at the SSL certificate history with crt.sh:

Notice anything unusual? On 8th July 2019 the certificate changes so that it no longer covers the subdomain mail[.]learnworthy[.]net. This helps support the theory that control of the domain changed on 7th/8th July 2019, just as we’ve seen with the Whois, Internet Archive, and IP/DNS changes. This is an important lesson from a recon point of view because if we’d only relied on information found by Google and other search engines we’d have concluded that “clong” (the original owner) was still responsible for the site, because that’s what the most easily accessible information tells us. However digging into some of the underlying web infrastructure brings out a lot more detail that means we can be a lot more specific with our conclusions. It seems more likely that the domain changed hands on or around 7th July 2019 and came under the control of someone from Kosovo. There’s plenty more digging to be done first though.
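The same certificate history can be fetched as JSON from crt.sh and searched for the point at which a hostname drops off the certificates. The following is a sketch under a few assumptions: the helper names are hypothetical, `example.com` stands in for the real domain, and it relies on the `name_value` and `not_before` fields that crt.sh includes in its JSON output.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def crtsh_url(domain):
    """Address of the crt.sh JSON listing for a domain."""
    return "https://crt.sh/?" + urlencode({"q": domain, "output": "json"})


def names_on(cert):
    """Hostnames on one crt.sh entry (name_value holds one name per line)."""
    return set(cert["name_value"].lower().split())


def last_seen(certs, hostname):
    """Latest not_before among certificates that list `hostname`, or None."""
    # not_before is an ISO-style timestamp, so string comparison sorts by date.
    dates = [c["not_before"] for c in certs if hostname.lower() in names_on(c)]
    return max(dates) if dates else None


def fetch_certs(domain):
    """Network call: download the certificate list. Not run on import."""
    with urlopen(crtsh_url(domain), timeout=60) as resp:
        return json.load(resp)
```

Comparing `last_seen(certs, "mail.example.com")` with `last_seen(certs, "example.com")` shows whether a subdomain stopped being covered while the main domain carried on, which is the pattern described above.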

WordPress

I’ve written about OSINT tips for WordPress before, and this was a good chance to put them into practice. We’ve already got a good idea that the site is powered by WordPress because it’s on a server belonging to a specialist WordPress hosting company, but we can use Builtwith to learn about what software is used to power the site. Sure enough, Learnworthy is powered by WordPress. Notice how much of the site software was only seen for the first time in July 2019:

 

There was also a website theme change in July 2019:

This all supports the hypothesis that the website underwent significant changes in July 2019, but who are the new owners?

WordPress User Enumeration

There are a few different ways to find out who the contributors to a WordPress site are, but WPScan is one of the most useful tools. It also represents a step up from passive to active recon – everything has been passive so far but this is a little more direct.

Most WordPress sites (which account for about 35% of sites on the internet) follow common URL structures, so the location of an author page on a WordPress site is predictable. The format goes:

www.somesite.com/author/authorname

WPScan looks for this and other common paths to find the authors who contribute to a site. Here’s what it found for Learnworthy:

If you’re not confident with WPScan, it’s possible to query the WordPress site JSON that contains the author information. It’s found at www.somesite.com/wp-json/wp/v2/users. You can view this directly in your web browser (I recommend using Firefox because of its built-in JSON prettifier.)

Sector035 helpfully did this for me and pointed out that “Alicia Newman” has the user ID number 1, which means she was the first account created on the site. Unsurprisingly all the other “contributors” are fake too. Here’s “Jessica Bingham”:

A quick reverse image search shows that Jessica is a stock photo:

The other authors are too:

 

There’s one author “Koki” who seems to have disappeared though. Where did he get to? WPScan found him, but there’s no image of him on the site. Let’s have another look at the user JSON to try and find where “Koki” might be:

“Alicia” and “Koki” are the same user. The Koki author page points directly to Alicia’s profile. Why is this? I think the most likely explanation is that “Koki” was the name of the admin account used to create and set up the original WordPress site (hence user ID “1”), and when it went live Koki changed his or her profile name to “Alicia Newman” for posting content. The screen name has changed, but the author page URL has not. The predictability of WordPress URLs also means we could run a Google Dork like inurl:/author/koki to see if he might be linked to any other WordPress projects, but it’s beyond the scope of this article to look into that.
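The mismatch between a display name and an author URL can be picked out of the users JSON automatically. This is a rough sketch rather than a replacement for WPScan: the function names are my own, the comparison rule is a crude heuristic, and it assumes the `id`, `name` and `slug` fields that the WordPress REST API returns for each user.

```python
import json
from urllib.request import Request, urlopen


def users_url(site):
    """Address of the WordPress REST API user listing for a site."""
    return site.rstrip("/") + "/wp-json/wp/v2/users"


def _squash(text):
    """Lower-case a string and keep only letters and digits."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def renamed_accounts(users):
    """Users whose display name no longer resembles their author URL slug."""
    flagged = []
    for user in users:
        name, slug = _squash(user["name"]), _squash(user["slug"])
        if slug not in name and name not in slug:
            flagged.append((user["id"], user["slug"], user["name"]))
    return flagged


def fetch_users(site):
    """Network call: this is active recon, the target will see the request."""
    request = Request(users_url(site), headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=30) as resp:
        return json.load(resp)
```

Given a user with the slug `koki` and the name `Alicia Newman`, `renamed_accounts` returns that entry, while an account whose slug is just its name with hyphens is left alone. Many sites disable or restrict this endpoint, so an error or an empty list does not mean there are no users.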

We can also use this technique to verify that the original Learnworthy owner “clong” no longer has an account on the site. Attempting to visit .../author/clong brings back a 404 error. I’m certain beyond all doubt that he has absolutely nothing to do with the site in its post-July 2019 guise.

Facebook

So we have a load of fake user profiles who publish stolen content and pass it off as their own – but we want real people. Content doesn’t steal itself, so someone has to be running the site and keeping it updated. Next stop will be Learnworthy’s Facebook page.

Facebook’s new page transparency feature is quite useful for finding out who is behind a page. It doesn’t give the Admin names in this case, but it does help tell us where the site is run from and when it was created:

Two things stand out here. The 8th July 2019 date crops up yet again, and more importantly we learn that the people who manage the page are based in Kosovo. This ties in nicely with the changes we saw at the start around the Whois registration and hosting information. Ordinarily a good place to start with a page like this would be to look at the earliest posts and see who liked and shared them. There are a few common names that pop up, but since I cannot show they are linked to the plagiarism of my blog post, I haven’t listed them here because they’re almost certainly not involved. The only thing they had in common is that they are all Kosovan, and most of them attend the same university.

Before page transparency came along, it was much harder to establish when a page was created. As a rule of thumb it is often possible to look at when the user uploaded the very first profile picture and make a reasonable guess from there. Sure enough the first two profile pictures for Learnworthy were uploaded on July 8th 2019. The one in current use was uploaded on 17th August 2019. Even the logo has been plagiarised.

Getting Closer With a Little Social Engineering

We can be reasonably certain at this point that the site is run by someone in Kosovo and that most people who have liked the content of the Facebook page are younger people from Kosovo. The Facebook posts and memes also suggest someone with an interest in technology and programming. Can we do any better than that though? At this point we start to reach the limits of OSINT – but not the limits of investigation. I am very grateful to Jeff Lomas aka @BleuBloodHound who helped out with a little social engineering to take things forward.

The Learnworthy website features a handy way in for social engineers:

Learnworthy want writers to contribute to their site. Technically I’ve already contributed one free article, but Jeff posed as a budding writer who was interested in making a contribution. He got an e-mail back pretty quickly from this guy, who was only too keen to help:

Now we have a name, e-mail address, and a profile picture. Thanks Jeff! A quick reverse image search on Korab’s profile picture brings back this match from a company website:

Fortunately Korab has footprints all over the internet so it was quite easy to find out more about him (at least until he reads this and then flips all the privacy switches). The aim of this post isn’t to dox him. It is for me and all the other authors he has plagiarised to know who stole our work and passed it off as their own, so I’m not going to be more intrusive than absolutely necessary.

Starting with Twitter, we can find a couple of his accounts. There’s an older one here @Korabdragidella:

Notice his nickname is “Kokinho” – I’d wager that this alias is also the source of the “Koki” username originally used by the Learnworthy WordPress admin. He has a more recent Twitter account too @korabdr:

Sure enough, his feed is used to promote content from Learnworthy:

His tweet about the article “Five Best Features On The Galaxy S20 Ultra” is another “Alicia Newman” post. A quick Google search on the first paragraph shows that it is also stolen, this time from CNET:

Korab’s other online profiles show he has a little background in “SEO specialism”, so he knows full well how to drive traffic to his site. In the case of Learnworthy it is by plagiarising content from other sites to try and piggyback on the Google search results. His Facebook bio also confirms the Kosovo connection:

 

So far we know that Korab is the person who answers e-mails when you contact Learnworthy and he promotes their content on his social media. He’s also known as “Kokinho”, and if he’s “Koki” then he’s “Alicia” too – but can we link him to “Alicia Newman” in any other ways? Possibly. Here’s his blog on Dev.to, talking about how inspired he was by an article he read on Learnworthy. The article he was pushing on Dev.to was another “Alicia Newman” special written just twelve hours before.

In fact as far as I can see Korab is pretty much the only person who promotes and shares Alicia’s posts with any kind of consistency. I’m happy at this point that he’s the brains behind the new incarnation of Learnworthy and that he created “Alicia Newman” as part of a crude scheme to create content for a website. He lives in the right place, answers the e-mails, is probably “Koki”, and actively promotes the site content. Unfortunately this means that Korab is also probably the person who stole my blog content and passed it off as his own. As I was writing this post I noticed that my stolen article was actually removed from Learnworthy – but what about all the other people who have had their work stolen and reused? When is that going to be taken down? Cheats, scammers, dodgy marketers and liars are the curse of the internet and they shouldn’t be allowed to operate unchallenged. You can also apply most of the techniques in this article to go hunting for them too if you want…

 

 
