OnionRunner, ElasticSearch & Maltego

Last week Justin Seitz over at automatingosint.com released OnionRunner which is basically a python wrapper (because Python is awesome) for the OnionScan tool (https://github.com/s-rah/onionscan).

At the bottom of Justin’s blog post he wrote this:

For bonus points you can also push those JSON files into Elasticsearch (or modify onionrunner.py to do so on the fly) and analyze the results using Kibana!

Always being up for a challenge I’ve done just that. The onionrunner.py script outputs each scan result as a JSON file, and you have two options for loading these into ElasticSearch: load your results after you’ve run a scan, or load them as a scan runs. Now this might sound scary but it’s not, so let’s tackle each option separately.

On the fly:

To send the results to ElasticSearch (and to a file), you just need to make some small changes to the onionrunner.py script (10 lines of code). Firstly we need to import a couple of extra python libraries. At the top of onionrunner.py add the following lines of code.

from elasticsearch import Elasticsearch
import datetime

If you haven’t used the elasticsearch python library before you will need to install it which you can easily do using:

pip install elasticsearch
sudo pip install elasticsearch

The elasticsearch library is what we are going to use to load the results into ElasticSearch. As the results are nicely formatted as JSON, ElasticSearch happily accepts the data without any formatting changes. The datetime library is used to create the correct timestamp format for ElasticSearch so we can track when the data was imported (important if you are running multiple scans).

The second (and equally easy part) is to create a python function to send the results where they need to go. Create this function towards the bottom of the script (I added it after the add_new_onions function):


def send_to_elastic(data):
    es = Elasticsearch()
    data['timestamp'] = datetime.datetime.utcnow().strftime('%Y-%m-%dT%H:%M:%S.%fZ')
    es.index(index='osint', doc_type='onion', body=data)

You will need to make some slight tweaks to the code to match your ElasticSearch environment. The first part of the function is where you define your ElasticSearch instance to connect to, the default is localhost:9200. If your server is called ‘bob’ then you need to change the line of code to:

es = Elasticsearch('bob')

The next line of code adds the timestamp we mentioned earlier. You don’t need to change this, but if you use Kibana to visualise the data you will need to select this field as the time field for the index. The next step is to change the index and doc_type arguments to meet your requirements. So for example if you wanted to store the results in your ‘research’ index, with a doc_type of ‘darkweb’, you would simply change the code to this:

es.index(index='research', doc_type='darkweb', body=data)

The final step is to add the line of code into the script to send the data once it’s been collected. To do this you need to find the def process_results function, and then after this segment of code:

# look for additional .onion domains to add to our scan list
scan_result = ur"%s" % json_response.decode("utf8")
scan_result = json.loads(scan_result)

You just need to add this line of code:

send_to_elastic(scan_result)

And that’s it, job done. Now when you run the script the results will get saved to a json file and also added into ElasticSearch (fingers crossed).

From results files:

Adding existing scan results into ElasticSearch is also just as easy, well in fact it’s easier as I’ve written the code for you. Head over to https://github.com/catalyst256/MyJunk/blob/master/loadresults.py and you will find a pre-made script all ready for you to use. Well actually you will need to change the ElasticSearch parameters at the top of the script but otherwise it’s good to go.

To run the script all you need to do is specify the results folder as an argument when you run it. So for example if you take the default output location for the onionrunner.py script you would just run the following.

./loadresults.py onionscan_results/ (NOTE: you need to have the trailing ‘/’ in there)

This will then load each file and send it to ElasticSearch, now how simple was that.
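If you would rather see the shape of such a loader than download one, here is a minimal sketch. It is not the loadresults.py script linked above, just my guess at the general idea, written for Python 3. The function name load_results and the index_doc callback are made up for illustration; the callback is where the real ElasticSearch call would go.

```python
import datetime
import glob
import json
import os


def load_results(folder, index_doc):
    """Read every *.json file in `folder` and pass each document to `index_doc`.

    `index_doc` is a hypothetical callback (any callable taking one dict);
    in real use it would wrap the Elasticsearch client's index call.
    """
    loaded = 0
    # os.path.join copes with or without a trailing slash on the folder name.
    for path in sorted(glob.glob(os.path.join(folder, "*.json"))):
        with open(path, "r") as handle:
            try:
                doc = json.load(handle)
            except ValueError:
                # Not valid JSON (for example a scan that failed part way), skip it.
                continue
        if not isinstance(doc, dict):
            continue
        # Same timestamp format as the on-the-fly function earlier in the post.
        now = datetime.datetime.now(datetime.timezone.utc)
        doc["timestamp"] = now.strftime("%Y-%m-%dT%H:%M:%S.%fZ")
        index_doc(doc)
        loaded += 1
    return loaded
```

With the elasticsearch library installed you could call it as load_results("onionscan_results/", lambda doc: es.index(index="osint", doc_type="onion", body=doc)), which mirrors the es.index call used earlier. Newer versions of the client may not accept doc_type, so treat that part as an assumption about your environment.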

Maltego all the things:

I’ve been messing around with ElasticSearch a lot lately for various things and one thing I really like about it is how easy it is to plug Maltego (via transforms) into it and get some lovely visualisations out. So with that in mind I spent a bit of time writing some transforms for the data from OnionRunner.

The awesome thing (well one of them) about ElasticSearch is that it is free text searching, so essentially type some words in and it will return any record that matches, or you can be more precise and specify certain key: value searches to run. If we look at the data collected by OnionScan/OnionRunner we can start to work out what we want to search for. For example if you want to find all servers that run Apache you can search one of two ways.

1. “Apache”
2. serverVersion: Apache

Both will give you the results you need, the second option will only return matches to that specific “key” whereas the first one will find any references to apache in any field.
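If you want to run those two searches from code rather than from Kibana, they can both be expressed as an ElasticSearch query_string query. The sketch below only builds the request body as a plain dict; build_query is a name I made up, and it does no escaping of special characters, so treat it as a starting point.

```python
def build_query(text, field=None, size=100):
    """Build an Elasticsearch request body for a free-text or key: value search.

    With no field the text is searched as a quoted phrase across all fields;
    with a field it becomes a `field:text` search on that key only.
    """
    if field:
        query = "%s:%s" % (field, text)
    else:
        query = '"%s"' % text
    return {"size": size, "query": {"query_string": {"query": query}}}


# Option 1 from the list above: any reference to Apache in any field.
anywhere = build_query("Apache")

# Option 2: only matches in the serverVersion key.
server_only = build_query("Apache", field="serverVersion")
```

Either dict can then be passed as the body of a search call against whichever index you loaded the results into.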

Using this search method it’s easy to create some Maltego transforms. The flow of the ones I (quickly) created works like this.

1. Create a phrase with your search query
2. Transform returns hiddenServer addresses
3. Search for open ports
4. Search for SSH keys
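To give a feel for steps 1 and 2, here is a rough sketch of the output side of a local transform: take some ElasticSearch search hits and print the XML that Maltego expects back on stdout. This is not the code of my transforms. The hiddenService field name and the maltego.Domain entity type are assumptions, so swap in whatever your data and entity palette use.

```python
from xml.sax.saxutils import escape, quoteattr


def hidden_services(hits, field="hiddenService"):
    """Pull unique addresses out of Elasticsearch-style search hits.

    `field` is an assumed key name in the stored scan results.
    """
    seen = []
    for hit in hits:
        value = hit.get("_source", {}).get(field)
        if value and value not in seen:
            seen.append(value)
    return seen


def transform_response(values, entity_type="maltego.Domain"):
    """Wrap values in the XML message a Maltego local transform writes to stdout."""
    parts = ["<MaltegoMessage>", "<MaltegoTransformResponseMessage>", "<Entities>"]
    for value in values:
        parts.append(
            "<Entity Type=%s><Value>%s</Value><Weight>100</Weight></Entity>"
            % (quoteattr(entity_type), escape(value))
        )
    parts += ["</Entities>", "<UIMessages/>",
              "</MaltegoTransformResponseMessage>", "</MaltegoMessage>"]
    return "\n".join(parts)


# Made-up hits, shaped like the "hits" list in a search response.
example_hits = [{"_source": {"hiddenService": "exampleaddress.onion"}}]
print(transform_response(hidden_services(example_hits)))
```

A real local transform would read the phrase from its command line arguments, run the search, and then print the response in the same way.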

These are just some quick examples, and here are some screenshots.

Screen Shot 2016-08-01 at 09.01.01

Screen Shot 2016-08-01 at 09.03.34

Screen Shot 2016-08-01 at 09.07.57

The Maltego transforms aren’t quite ready for release as I need to tweak them and make them production ready but once they are I will release them. They are all local transforms so you will need to install them into Maltego yourself (I will provide instructions) but it’s a painless process.

Massive thanks to Justin for creating OnionRunner and if you want to learn Python I’ve heard great things about the Python courses he provides.

PS Sorry about the rubbish code blocks, WordPress hosted doesn’t like Python…😦

Beginners Guide to OSINT – Chapter 1

DISCLAIMER: I’m not an Open Source Intelligence (OSINT) professional (not even close). I’ve done some courses (got a cert), written some code and spent far too much time using Maltego. OSINT is a subject I enjoy, it’s like doing a jigsaw puzzle with most of the pieces missing. This blog series is MY interpretation of how I do (and view) OSINT related things. You are more than welcome to disagree or ignore what I say.

The first chapter in the OSINT journey is going to cover the subject of “What is OSINT and what can we use it for”, sorry it’s the non technical one but I promise not to make it too long or boring.

What is OSINT??

OSINT is defined by Wikipedia as:

“Open-source intelligence (OSINT) is intelligence collected from publicly available sources. In the intelligence community (IC), the term “open” refers to overt, publicly available sources (as opposed to covert or clandestine sources); it is not related to open-source software or public intelligence.” (source)

For the purpose of this blog series we are going to be talking about OSINT from online sources, so basically anything you can find on the internet. The key point for me about OSINT is that it (in my opinion) only relates to information you can find for free. Having to pay to get access to information such as an API or raw data isn’t really OSINT material. You are essentially paying someone else to collect the data (that’s the OSINT part) and then just accessing their data. I’m not saying that’s wrong or should be a reason not to use data from paid sources, it’s just (and again just my opinion) not really OSINT in its truest form.

Pitfalls of OSINT

Before we go any further I just wanted to clarify something about collecting data via OSINT. This is something that I often talk to people about and opinions on it vary. When you collect some data via OSINT methods it’s important to remember that the data is only as good as the source you collect it from. The simple rule is “Don’t trust the source, don’t use it”.

You also need to consider the way that the data is collected. Let me explain a bit more with this scenario (totally made up).

You spot someone (within a corporate environment) called Ronny emailing a file called “secretinformation.docx” to an external email address of ronnythespy@madeupemail.com. You decide to do some “OSINT” to work out if the two Ronnies are the same person. Using a tool or chunk of code (in a language you don’t know) you decide that you have enough information to link the two Ronnies together.

Corporate Ronny takes you to court to claim unfair dismissal, and during the court proceedings you are asked (as the expert witness) how the information was collected. Now you can explain the process you followed (run code or click on the tool) but can you explain how the tool or chunk of code provided you with that information or the methods it used to collect it (were they lawful, for example)?

For me, that’s the biggest consideration when using OSINT material if you want to use it to provide true value to what you are trying to accomplish. Being able to collect the information is one thing, validating the methods or techniques on how it was collected is another. Again this is a conversation I have many, many times and I work on this simple principle, “if in doubt, create it yourself” which basically means I have to/get to write some code or build a tool.

This quote essentially sums up everything I just said, “In OSINT, the chief difficulty is in identifying relevant, reliable sources from the vast amount of publicly available information.” (source)

What is OSINT good for?

Absolutely everything!! Well ok nearly everything, but there are a lot of ways that OSINT can be used for fun or within your current job. Here are some examples;

  • Company Due Diligence
  • Recruitment
  • Threat Intelligence
  • Fraud & Theft
  • Marketing
  • Missing Persons

What are we going to cover??

At the moment I’ve got a few topics in mind to cover in this blog series, I am open to suggestions or ideas so if you have anything let me know and I will see what I can do. Here are the topics I’ve come up with so far (which is subject to change).

  • Image Searching
  • Social Media
  • Internet Infrastructure
  • Companies
  • Websites

Hopefully you found this blog post of use (or interesting), leave a comment if you want me to cover another subject or have any questions/queries/concerns/complaints.

Building your own Whois API Server

So it’s been a while since I’ve blogged anything not because I haven’t been busy (I’ve actually been really busy), but more because a lot of the things I work on now I can’t share (sorry). However every now and again I end up coding something useful (well I think it is) that I can share.

I’ve been looking at Domain Squatting recently and needed a way to codify whois lookups for domains. There are loads of APIs out there but you have to pay and I didn’t want to, so I wrote my own.

It’s a lightweight Flask application that accepts a domain, does a whois lookup and then returns a nice JSON response. Nothing fancy, but it will run quite happily on a low spec AWS instance or on a server in your internal environment. I built it to get around having to fudge whois on a Windows server (let’s not go there).
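To show the general shape of "domain in, JSON out", here is a small sketch using only the Python standard library (WSGI) rather than Flask. It is not the code from whois-server.py: the URL layout (/<domain>), the make_whois_app name and the fake_lookup function are all invented for illustration, and the real whois call is replaced with a stand-in.

```python
import json


def make_whois_app(lookup):
    """Build a WSGI app that answers GET /<domain> with a JSON document.

    `lookup` is any callable that takes a domain and returns a dict;
    a real server would call a whois library at this point.
    """
    def app(environ, start_response):
        domain = environ.get("PATH_INFO", "").strip("/").lower()
        if not domain or "." not in domain:
            status, payload = "400 Bad Request", {"error": "expected /<domain>"}
        else:
            try:
                status, payload = "200 OK", {"domain": domain, "whois": lookup(domain)}
            except Exception as exc:
                status, payload = "502 Bad Gateway", {"domain": domain, "error": str(exc)}
        body = json.dumps(payload, default=str).encode("utf-8")
        start_response(status, [("Content-Type", "application/json"),
                                ("Content-Length", str(len(body)))])
        return [body]
    return app


def fake_lookup(domain):
    # Hypothetical stand-in for a real whois query, returns made-up data.
    return {"registrar": "Example Registrar", "nameservers": ["ns1." + domain]}


app = make_whois_app(fake_lookup)
# To serve it locally you could use the standard library's wsgiref:
#   from wsgiref.simple_server import make_server
#   make_server("127.0.0.1", 9119, app).serve_forever()
```

Keeping the lookup as a plain function you pass in makes it easy to swap the whois backend or to test the JSON handling without touching the network.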

In order to run the Flask application you need the following Python libraries (everything else is standard Python libraries).

  • Flask
  • pythonwhois (this is needed for the pwhois command that is used in the code)

To run the server just download the code (link at the bottom of page) and then run.

python whois-server.py

The server runs on port 9119 (you can change this) and you can submit a query like this:


You will get a response like the picture below:

From here you can either roll it into your own tool set or just use it for fun (not sure what sort of fun you are into but..).

You can find the code in my GitHub repo MyJunk or if you just want this code it’s here.

There may well be some bugs but I haven’t found any yet, it runs best on Linux (or Mac OSX).

Any questions, queries etc. etc. you know where to find me.

Pandora – Maltego Graph Thingy

I talk to a lot of different people about Maltego, whether it’s financial institutions, law enforcement, security professionals or plain old stalkers (only kidding), and the question I usually end up asking them is this:

What do you want to do with the data once it’s in Maltego?

The reporting features in Maltego are good, but sometimes you want something a little bit “different”, usually because you want to add the data you collect in Maltego to another tool you may have, or you want to share the information with others (who don’t have Maltego), or just because you want to spin your own reports.

A few weeks ago on my OSINT course we talked about classifying open source intelligence against the National Intelligence Model (5x5x5), so I decided to see if I could write a tool that would take a Maltego graph and do just that. In addition (more as a by-product) you can now export Maltego graphs (including link information) to a JSON file.

I would like to thank Nadeem Douba (@ndouba) for the inspiration (and some of the code) which is part of the Canari Framework and originally allowed you to export a Maltego Graph to CSV format.

Pandora is a simple lightweight tool that has two main uses. The first is a simple command line interface (pandora.py) that allows you to specify a Maltego graph and it just spits out a JSON file.

The second usage has the same functionality but via a simple web interface (webserver.py), you can export your Maltego graph to JSON and then get a table based view of the entities held within, you can also click on a link that shows all the outgoing and incoming linked entity types.
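To make the "outgoing and incoming linked entity types" idea a bit more concrete, here is a sketch that counts them from an exported graph. The JSON layout used here (an entities list and a links list with id, type, source and target keys) is invented for the example and is not necessarily what Pandora writes out.

```python
from collections import Counter


def link_summary(graph, entity_id):
    """Count the entity types linked to and from one entity.

    `graph` uses a hypothetical layout:
    {"entities": [{"id", "type", "value"}], "links": [{"source", "target"}]}
    """
    types = {entity["id"]: entity["type"] for entity in graph["entities"]}
    outgoing = Counter(types[link["target"]] for link in graph["links"]
                       if link["source"] == entity_id)
    incoming = Counter(types[link["source"]] for link in graph["links"]
                       if link["target"] == entity_id)
    return {"outgoing": dict(outgoing), "incoming": dict(incoming)}


# Made-up graph: a phrase linked to a domain, which resolves to two IP addresses.
example_graph = {
    "entities": [
        {"id": "n0", "type": "maltego.Phrase", "value": "example"},
        {"id": "n1", "type": "maltego.Domain", "value": "example.com"},
        {"id": "n2", "type": "maltego.IPv4Address", "value": "192.0.2.1"},
        {"id": "n3", "type": "maltego.IPv4Address", "value": "192.0.2.2"},
    ],
    "links": [
        {"source": "n0", "target": "n1"},
        {"source": "n1", "target": "n2"},
        {"source": "n1", "target": "n3"},
    ],
}
summary = link_summary(example_graph, "n1")
```

Once the graph is plain JSON like this, the table view and link view are just different ways of slicing the same two lists.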

This is still a BETA at the moment, the JSON stuff works but the web interface has a few quirks to it. Over the next few weeks I will be adding extra stuff like reporting, the ability to send the JSON to ElasticSearch, Splunk (which has a new HTTP listener available) and some other cool stuff.

You can find Pandora HERE and some screenshots are below:

Maltego Graph (example)


Pandora – Web Interface


Pandora – Web interface with imported graph


Pandora – Graph Information


Pandora – Link Information


As always any questions, issues etc etc please let me know.

Open Source Cyber Intelligence – Course Review

DISCLAIMER: This review is based on my own experience attending the course, and is in no way affiliated with the training provider or my current employer. All opinions stated in the review are my personal views and should be treated as such.

A couple of weeks ago (might be more) I spent the week in London on a training course (well actually it was two but..). The courses were run by QA and are part of their Cyber security range. Details of the courses are below (just in case you want to go on them).

Open Source Cyber Intelligence – Introduction (3 days)

Open Source Cyber Intelligence – Advanced (2 days)

Now it’s important to note at this point that the courses are focused on “Cyber” based Open Source intelligence techniques rather than the more generic Open Source Intelligence which in my mind is more about stalking people, sorry I mean tracking people.

Below is a brief outline of what the courses contained (taken from the QA website).

Open Source Cyber Intelligence – Introduction

Module 1 – History of the Internet and the World Wide Web
Module 2 – How devices communicate
Module 3 – Internet Infrastructure
Module 4 – Search Engines
Module 5 – Companies and people
Module 6 – Analysing the code
Module 7 – The Deep Web
Module 8 – Social Media
Module 9 – Protecting your digital footprint
Module 10 – Internet Communities and Culture
Module 11 – Cyber Threat
Module 12 – Tools for investigators
Module 13 – Legislation

Open Source Cyber Intelligence – Advanced

Module 1 – Advanced search and Google hacking
Module 2 – Mobile devices; threats and opportunities
Module 3 – Protecting your online footprint and spoofing
Module 4 – Advanced software
Module 5 – Hacking forums and dumping websites
Module 6 – Encryption and anonymity tools
Module 7 – Tor, Dark Web and Tor Hidden Services (THS)
Module 8 – Bitcoin and Virtual Currencies
Module 9 – Other Dark Webs and Darknets
Module 10 – Advanced evidential capture

NOTE: The courses are designed for people of any skill level which is why when you look at some of the module titles and think “Why are they teaching networking basics” it may seem a little bit random.

It’s also important to point out that the prerequisite for the advanced course is that you have completed the introduction course first.

Course Review:

The size of the class (for both courses) was smaller than I expected but that wasn’t a bad thing as it gave us the chance to ask questions and provide a steer on the direction of the conversations without feeling like we were stopping lots of people from learning (and getting their money’s worth).

You get a preconfigured workstation that has all the tools you need for the course as well as a Kali virtual machine, multiple browsers, plugins etc. and the workstation has plenty of grunt (CPU & Memory) to not become bogged down when running lots of things.

The instructor for both courses was a guy called Max Vetter who has loads of experience in this area and made sure we understood the content of the course and that it was also fun (check out “If Google was a Guy” on Youtube).

Now I’m no OSINT expert, but I have worked in IT for nearly 20 years and I am a bit of an OSINT wannabe, so for me the introduction course was a bit slow. Don’t get me wrong, the content and the way it was delivered were awesome, but if you know about networking and how to do whois lookups or view source code in websites, you may find the introduction course not to your liking.

If however you know how to do all of the above, but have never done it within an OSINT type scenario, then the course will be really useful (and fun) as it will enable you to understand how to use the information you collect in order to track and trace “cyber bad guys”.

For example if you find a “bad” domain, you can query whois to find out who registered it, then using a bit of Google-fu (Google hacking is covered in the introduction course) see if you can use the details from the whois information to find any other domains the suspect might have registered.

Let me show you, looking at the whois information for qa.com (the training provider) you will see the following:

For more information on Whois status codes, please visit
Domain Name: qa.com
Registry Domain ID: 113160_DOMAIN_COM-VRSN
Registrar WHOIS Server: whois.1and1.com
Registrar URL: http://1and1.com
Updated Date: 2013-05-24T22:04:57Z
Creation Date: 1994-10-25T04:00:00Z
Registrar Registration Expiration Date: 2015-10-24T04:00:00Z
Registrar: 1&1 Internet AG
Registrar IANA ID: 83
Registrar Abuse Contact Email: abuse@1and1.com
Registrar Abuse Contact Phone: +1.8774612631
Domain Status: clientTransferProhibited https://www.icann.org/epp#clientTransferProhibited
Registry Registrant ID:
Registrant Name: Alexandra Kubicka
Registrant Organization: QA-IQ Ltd
Registrant Street: 80 Cannon Street
Registrant Street: 4th Floor
Registrant City: London
Registrant State/Province: ABE
Registrant Postal Code: EC4N 6HL
Registrant Country: GB
Registrant Phone: +44.8450559501
Registrant Phone Ext:
Registrant Fax: +44.8450559502
Registrant Fax Ext:
Registrant Email: IT@QA.com
Registry Admin ID:
Admin Name: Alexandra Kubicka
Admin Organization: QA-IQ Ltd
Admin Street: 80 Cannon Street
Admin Street: 4th Floor
Admin City: London
Admin State/Province: ABE
Admin Postal Code: EC4N 6HL
Admin Country: GB
Admin Phone: +44.8450559501
Admin Phone Ext:
Admin Fax: +44.8450559502
Admin Fax Ext:
Admin Email: IT@QA.com
Registry Tech ID:
Tech Name: Hostmaster ONEANDONE
Tech Organization: 1&1 Internet Ltd.
Tech Street: 10-14 Bath Road
Tech Street: Aquasulis House
Tech City: Slough
Tech State/Province: BRK
Tech Postal Code: SL1 3SA
Tech Country: GB
Tech Phone: +44.8716412121
Tech Phone Ext:
Tech Fax: +49.72191374215
Tech Fax Ext:
Tech Email: hostmaster@1and1.co.uk
Nameserver: ns1.dnsexit.com
Nameserver: ns2.dnsexit.com
Nameserver: ns3.dnsexit.com
Nameserver: ns4.dnsexit.com
DNSSEC: Unsigned

Then using one of the values from the whois (telephone number, email address, contact name(s)) you can get Google to find other domains that have the same whois information.

Within the Google search bar enter the following query:

site:whois.domaintools.com +44.8450559501

You will get back all the registered domains (stored in whois.domaintools.com) that have that phone number in the whois information. If like me you are a Python addict (it’s ok to admit it) you can automate this kind of process and write the information to something (flat file, database etc etc) but that’s not covered in the course.
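As a taste of what that automation might look like, here is a sketch that turns raw whois text into a dictionary, picks out a value and builds the same style of Google query as above. It only builds the query and a search URL; it does not scrape any results. The function names are mine and were not part of the course.

```python
from urllib.parse import quote_plus


def parse_whois(text):
    """Turn 'Key: value' whois lines into a dict of lists, skipping empty values."""
    record = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if not sep or not key or not value:
            continue
        # Keys such as "Registrant Street" repeat, so keep every value.
        record.setdefault(key, []).append(value)
    return record


def dork_for(value, site="whois.domaintools.com"):
    """Build the site-restricted search string for one whois value."""
    return "site:%s %s" % (site, value)


def search_url(query):
    """URL-encode the query so it can be opened in a browser."""
    return "https://www.google.com/search?q=" + quote_plus(query)


sample = """Registrant Street: 80 Cannon Street
Registrant Street: 4th Floor
Registrant Phone: +44.8450559501
Registrant Phone Ext:
"""
record = parse_whois(sample)
query = dork_for(record["Registrant Phone"][0])
print(query)
print(search_url(query))
```

From there it is a short step to looping over several whois values (phone, email, contact name) and writing the queries to a flat file for later.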

The last two days of training were the fun stuff: the advanced course covered all the good stuff like Tor, “Darknet” forums, bitcoin and other grey online communities. Again I had played around with most of this (and read books etc), and while the advanced course is also aimed at all skill levels there was a good mixture of theory and practice.

Over the two days you get to mess around (within legal guidelines) with different “Darknets” such as Tor, I2P, Freenet and GNUnet which is quite fun and interesting to see how mature some of the others are in comparison to Tor. You also learn about the history of Bitcoin (which in itself is quite interesting) and look at the different ways to track bitcoin transactions and all the other various “alt coins” that have sprung up (did you know there was a Bieber coin (BRC)??).


For me I really enjoyed both courses, yes I knew a lot of the stuff already (not being big-headed) but it’s always good to have an expert validate what you know, especially when you have taught yourself most of it (and now I have certificates). It also introduced me to lots of new OSINT resources (websites, tools etc) as well as helped focus the “flow” of the data you can collect, and some better ways of processing it and reusing it for other purposes. Plus it also opened up lots of other opportunities for new Maltego transforms (I bet you thought this was a Maltego free blog post).


  • Excellent instructor
  • Good facilities (lots of free coffee and biscuits)
  • Good course content which was delivered well
  • Fun (that’s important)


  • Might not be the course for you if you already do OSINT related work or have a deep technical background around “Cyber”

If you have any questions or queries just give me a shout.


Maltego: Email/Person/Alias to Skype ID

So ages ago the guys at Paterva (the makers of Maltego) challenged me to write a public Maltego transform that would perform a lookup on an email address and return the matching Skype user account. I can’t quite remember when they set the challenge but today, after much research and a lot of trial and error, I can announce that I’ve finished my Skype transforms.

Currently there are just two transforms available (I need to tweak the other), which take an email address or an alias (Maltego entity); in the end there will be three. The final set of transforms will be:

1. Email to Skype (available)
2. Alias to Skype (available)
3. Person to Skype (coming soon)

All three of these transforms are available as part of the Media Monkey package which you can find more details out about HERE.

The transforms are called:

mmEmail2Skype (takes an email address entity)
mmPerson2Skype (takes a person entity)
mmAlias2Skype (takes an alias entity)

Here is a nice screenshot of what it looks like in action.


DISCLAIMER: This transform does not in any way use a modified Skype client and only makes use of legitimate APIs provided by Microsoft and Skype.

Maltego Magic comes to BSides London

I’m a big fan of BSides London, it was the first security conference I ever went to, and this will be my fourth year attending. The last couple of years I’ve been a “crew” member for the event, working in the background to help make the event what we all know and love. Last year I stepped in last-minute to run a Scapy workshop, this year I’ve decided to submit one, on my other favourite thing Maltego.

Below you will find a brief description of the workshop and the things you will need to bring with you if you are planning on attending.

Maltego Magic – Creating transforms & other stuff

In this workshop I will teach people how to write their own Maltego transforms. Using simple to understand examples (and pictures, everyone likes pictures) I will lead the participants through the process of creating local and remote transforms using just a pen and paper (ok a laptop is needed as well).

A basic knowledge of Maltego & Python is needed but the workshop will be aimed so that anyone can benefit from the magic that is Maltego even if they haven’t coded anything before.

Requirements for the day

  • Laptop (Mac OSX, Windows or Linux)
  • Python installation (2.7 or above, not version 3 though)
  • The Python Requests library (sudo pip install requests)
  • Maltego (CE edition is ok)
  • A text editor that’s Python friendly or a Python IDE (Sublime Text, PyCharm etc)
  • Your imagination (borrow someone else’s if necessary)

The workshop information for the day is below:

Date: June 3rd 2015
Workshop: 3
Track No: 2
Duration: 2 hours
Schedule for: 14:00 – 16:00

If you are interested in writing Maltego transforms come along to the workshop. If you can’t make it I will be wandering around the con all day, so feel free to stop me and we can have a chat about Maltego Magic.