Harpoon: an OSINT / Threat Intelligence tool


Harpoon is a tool to automate threat intelligence and open source intelligence tasks. It is written in Python 3 and organised in plugins, the idea being to have one plugin per platform or task. The code is on GitHub, feel free to open issues and propose pull requests.

Install and config:

pip install git+ssh://git@github.com/Te-k/harpoon  --process-dependency-links
npm install -g phantomjs
harpoon config -u
harpoon config

Then check how to use every module with harpoon help MODULE

Harpoon ?

For the past year and a half I have been pretty busy doing threat intelligence and open source intelligence on several malware operations. Threat Intelligence mainly relies on passive DNS/malware databases on one side, and databases on malicious activity on the other side. The objective is to map an attack’s infrastructure and if possible link it with other malicious activities. Some threat intelligence platforms are accessible to everyone (like OTX or RobTex) while others are commercial with or without free access (like VirusTotal or PassiveTotal). In the end, a large part of this activity is about looking for information in different platforms. Many people have tried to create a platform centralizing information from other platforms, but we always end up having yet another platform to consider during the research.

New standards problem totally applies to Threat Intelligence (xkcd 927)

Open Source Intelligence (aka OSINT), on the other hand, is way more diverse. The objective is to get as much information as possible on someone or a group of people using any data source openly accessible on the Internet. Of course there are some interesting platforms (like SpyOnWeb), but you may also have to do research in social media, cache platforms and many other random tasks depending on where your investigation leads you.

In all that, a large number of tasks are completely manual and it sucks. At first, I tried to create random Python scripts to automate some of these tasks, but it quickly became a mess : too many scripts, some in Python 2 and some in Python 3, some using config files and some getting API keys as parameters… After a while, I started to organize these scripts as modules for a tool called Harpoon, and after some months of using it, I think it is time to make it open source to see if it can help other people.

Some notes on the principles behind this tool :

  • Python 3 only : Python 2 is dead and I am not even trying to support it.
  • Many OSINT tools try to gather as much information as possible from an indicator (domain or email) without really any interest in where it comes from. Harpoon does not follow this philosophy. It mostly lets you perform a single task per command (with a couple of more general commands using several tools). I think it is really important during an investigation to understand where a piece of information comes from and how reliable it is.
  • In many cases, I also wanted to explore APIs to see what was possible with them, and I ended up rewriting some libraries (like SpyOnWeb) just because I wanted to understand exactly what they did and how. So I reinvented the wheel many times and I am totally fine with it.
  • Harpoon is organized into subcommands that are easy to implement; these commands rely on internal or external libraries. They also use a single configuration file that you need to complete manually when an API key is needed.
  • This tool is not perfect : it only does what I needed to automate, it is likely buggy and it has a long list of things I would like to implement one day (but it may never happen). Feel free to open issues or propose pull requests.

This post will be boring, mainly because I try to be as exhaustive as I can to complement the limited existing documentation about the tool. Feel free to jump to the sections you are interested in.



Boring yet mandatory part: how to install Harpoon. I have tried to package it as much as possible through pip (with the great help of @cybersteez) so everything should be installable just with pip install git+ssh://git@github.com/Te-k/harpoon --process-dependency-links

The main challenge with the packaging is that many libraries I have written are hosted on GitHub and not on PyPI (yet?), so the package should install them at the same time. Alternatively, you can install everything from requirements.txt after cloning the repository (pip install -r requirements.txt).

If you want to use the screenshot module (to take a screenshot of a website), you also need to install phantomjs through npm : npm install -g phantomjs

Harpoon should now be installed; you can check that it works with harpoon help ip, for instance. Next, you need to install the files Harpoon needs and configure it. To install the files (for now, mainly the MaxMind GeoIP database), just run harpoon config -u and wait. Finally, configure the tool, mainly by providing the API keys of the platforms you can or want to use : run harpoon config, which copies the empty config file and opens it with vim so that you can fill in the keys. If you don’t have a key, just leave it empty and harpoon will avoid using that platform when possible. You can see the list of configured modules with harpoon config -c. On my current system, it gives :

Configuration check:
-hibp            -> OK
-twitter         -> OK
-misp            -> FAILED
-robtex          -> OK
-totalhash       -> OK
-pt              -> OK
-asn             -> OK
-otx             -> OK
-bitly           -> OK
-vt              -> OK
-screenshot      -> OK
-dns             -> OK
-safebrowsing    -> OK
-threatgrid      -> OK
-help            -> OK
-shodan          -> OK
-greynoise       -> OK
-crtsh           -> OK
-domain          -> OK
-pgp             -> OK
-github          -> OK
-malshare        -> OK
-config          -> OK
-hunter          -> OK
-hybrid          -> OK
-cache           -> OK
-spyonweb        -> OK
-telegram        -> FAILED
-fullcontact     -> OK
-ip              -> OK
-censys          -> OK
-googl           -> OK

All the files needed by Harpoon (including config file) are installed in ~/.config/harpoon.
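
To illustrate the idea behind a check like harpoon config -c, here is a minimal sketch that reads an INI-style file and reports which sections have every key filled in. The INI layout, the section and key names, and the helper configured_modules are all assumptions made for illustration; this is not Harpoon's actual config format or code.

```python
import configparser

# Hypothetical config content; real section and key names may differ.
SAMPLE = """
[Shodan]
key = abcdef

[Misp]
url =
key =
"""


def configured_modules(text):
    """Map each section to True when every key in it has a non-empty value."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        section: all(value.strip() for value in parser[section].values())
        for section in parser.sections()
    }


if __name__ == "__main__":
    for name, ok in configured_modules(SAMPLE).items():
        print("-{:<16} -> {}".format(name.lower(), "OK" if ok else "FAILED"))
```

A section with an empty key is reported as FAILED, which mirrors the behaviour described above where a platform without a key is simply skipped.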


It is hard to describe the features without just listing the modules, because I created a new command for almost every task I needed to automate. Let’s try to organize them into categories with examples :

$ harpoon otx -s cdnverify.net
No analysis on this file
Listed in 1 pulses
	-Sofacy targeting Romanian Embassy
		Sofacy targeting the embassy of Romania in Moscow -  Email Subject: Upcoming Defense events February 2018
		Created: 2018-02-08T11:50:07.652000
		References: https://twitter.com/ClearskySec/status/960924755355369472
		id: 5a7c396f6db26d7636273c44
URL list:
	[2018-02-06T19:12:50] https://cdnverify.net/ on IP
	[2018-02-02T18:44:09] http://cdnverify.net/ on IP
  • On top of that, I have implemented some higher level commands to gather information from all these platforms with ip and domain. These commands search for interesting information from almost all the configured plugins:
$ harpoon domain intel cdnverify.net
###################### cdnverify.net ###################
[+] Downloading OTX information....
[+] Downloading Robtex information....
[+] Downloading Passive Total information....
[+] Downloading VT information....
----------------- Intelligence Report
 -Sofacy targeting Romanian Embassy (2018-02-08 - https://otx.alienvault.com/pulse/5a7c396f6db26d7636273c44)
PT: Nothing found!
----------------- Malware
[PT (Emerging Threats (Proofpoint))] 36524c90ca1fac2102e7653dfadb31b2 2018-02-04
----------------- Urls
[VT] http://cdnverify.net/ -  2018-02-15
[VT] https://cdnverify.net/ -  2018-02-09
[OTX] https://cdnverify.net/ - 2018-02-06
[OTX] http://cdnverify.net/ - 2018-02-02
----------------- Passive DNS
[+]                            (2018-02-07 -> 2018-02-07)(PT)
[+]                            (2018-02-04 -> 2018-02-04)(VT)
[+]                            (2018-02-02 -> 2018-02-06)(Robtex)
[+]                            (2018-01-31 -> 2018-01-31)(PT)
  • Network information : the commands ip, dns and asn provide basic information about an IP, a domain or an ASN (location, DNS resolutions or information on the ASN). Nothing fancy but it helps all the time:
$ harpoon ip info
MaxMind: Located in Roubaix, France
MaxMind: ASN16276, OVH SAS
ASN 16276 - OVH, FR (range

Censys:		https://censys.io/ipv4/
Shodan:		https://www.shodan.io/host/
IP Info:	http://ipinfo.io/
BGP HE:		https://bgp.he.net/ip/
IP Location:	https://www.iplocation.net/?query=
  • Social media : when researching in social media platforms (not sure the API is the best way to do that), I find that quickly saving everything from a social media account is really helpful during an investigation (e.g. before Twitter deletes Russian trolls). Same idea with the screenshot command, which takes a screenshot of a website. For now, only Twitter and Telegram exist:
$ harpoon twitter -s realDonaldTrump > @realDonaldTrump
  • URL shorteners : I have also implemented commands for bit.ly and goo.gl url shorteners, the idea is to be able to get as much data as possible from the API:
$ harpoon bitly -H 2oh6Nrj
-------------------- Bit.ly Link infos -------------------
Link: http://bit.ly/2oh6Nrj		Metrics: http://bit.ly/2oh6Nrj+
Expanded url: https://ooni.torproject.org/post/mining-ooni-data/
Creation Date: 2018-02-19 00:15:03
Aggregate link: http://bit.ly/2E6V2dF
2 bitly redirect to this url

original_url: https://ooni.torproject.org/post/mining-ooni-data/
canonical_url: https://ooni.torproject.org/post/mining-ooni-data/
html_title: OONI - I have hands, how can I mine OONI data?
aggregate_link: http://bit.ly/2E6V2dF
indexed: 1519017306

User: 2oh6Nrj
Invalid user!

0 clicks on this link


  • And then, there are other commands I implemented because I wanted to see what I could get with some APIs. For instance, there is a github command to search in GitHub repos, and a pgp command to search for keys. One command I particularly like is the cache command, which checks for the existence of a web page in different cache platforms.
$ harpoon cache https://citizenlab.ca/2016/11/parliament-keyboy/
Google: FOUND https://webcache.googleusercontent.com/search?q=cache%3Ahttps%3A%2F%2Fcitizenlab.ca%2F2016%2F11%2Fparliament-keyboy%2F&strip=0&num=1&vwsrc=1 (2018-02-05 20:02:18+00:00)
Yandex: FOUND https://hghltd.yandex.net/yandbtm?fmode=inject&url=https%3A%2F%2Fcitizenlab.ca%2F2016%2F11%2Fparliament-keyboy%2F&tld=ru&lang=en&la=1518660992&tm=1519019381&text=https%3A%2F%2Fcitizenlab.ca%2F2016%2F11%2Fparliament-keyboy%2F&l10n=ru&mime=html&sign=ef543d285bc848b89e51b5a654f7f6aa&keyno=0
Archive.is: NOT FOUND
Archive.org: FOUND
-2018-02-03 10:05:03: http://web.archive.org/web/20180203100503/https://citizenlab.ca/2016/11/parliament-keyboy/
Bing: FOUND http://cc.bingj.com/cache.aspx?d=5023416941477933&w=p_fS69zzGSfsYoCCryqQAHXJ09tpPdBB (2016-11-17 00:00:00)
  • Last but not least, the help command gives (hopefully) detailed information on any command :
$ harpoon help ip
# IP

Gathers information on an IP address

Get information on an IP:

harpoon ip info
MaxMind: Located in None, United States
MaxMind: ASN21928, T-Mobile USA, Inc.
ASN 21928 - T-MOBILE-AS21928 - T-Mobile USA, Inc., US (range

Censys:     https://censys.io/ipv4/
Shodan:     https://www.shodan.io/host/
IP Info:    http://ipinfo.io/
BGP HE:     https://bgp.he.net/ip/
IP Location:    https://www.iplocation.net/?query=

* Get intelligence information on an IP: harpoon ip intel IP
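
As a side note on the cache output shown earlier, the Archive.org snapshot URL embeds the capture time as a 14-digit timestamp (20180203100503 for 2018-02-03 10:05:03). Here is a minimal sketch for splitting such a URL back into its capture time and original URL. The helper name parse_wayback_url is hypothetical, and treating the timestamp as UTC is an assumption.

```python
from datetime import datetime, timezone


def parse_wayback_url(url):
    """Split a Wayback Machine snapshot URL into (capture time, original URL)."""
    _prefix, sep, rest = url.partition("/web/")
    if not sep:
        raise ValueError("not a Wayback snapshot URL")
    stamp, _, original = rest.partition("/")
    # Assumption: the 14-digit timestamp is expressed in UTC.
    captured = datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    return captured, original


if __name__ == "__main__":
    captured, original = parse_wayback_url(
        "http://web.archive.org/web/20180203100503/"
        "https://citizenlab.ca/2016/11/parliament-keyboy/"
    )
    print(captured.isoformat(), original)
```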


At some point, I got sick of using harpoon XXX for some commands I was using all the time. So I created a repository harpoontools that installs commands using Harpoon features. You can install it through pip install git+ssh://git@github.com/Te-k/harpoontools

For now I have only implemented ipinfo, asninfo and dns :

$ cat ips | ipinfo
; ASN15169 ; Google LLC ; Mountain View ; United States
; ASN6939 ; Hurricane Electric, Inc. ; Fremont ; United States
; ASN6939 ; Hurricane Electric, Inc. ; Salt Lake City ; United States
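
Because this output is semicolon-separated, it is easy to post-process. Below is a minimal sketch of parsing one such line in Python. It assumes that each line has five fields and that the first one (missing from the sample above) is the IP address; the IpInfo type, the parse_ipinfo_line helper and the 192.0.2.1 placeholder address are hypothetical and not part of harpoontools.

```python
from typing import NamedTuple


class IpInfo(NamedTuple):
    ip: str
    asn: str
    owner: str
    city: str
    country: str


def parse_ipinfo_line(line):
    """Split one ';'-separated line into its five fields."""
    fields = [field.strip() for field in line.split(";")]
    if len(fields) != 5:
        raise ValueError("expected 5 fields, got {}".format(len(fields)))
    return IpInfo(*fields)


if __name__ == "__main__":
    # 192.0.2.1 is a documentation placeholder, not an address from the post.
    row = parse_ipinfo_line(
        "192.0.2.1 ; ASN15169 ; Google LLC ; Mountain View ; United States"
    )
    print(row.asn, row.country)
```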


It is hard to give a real-life example of Harpoon as it is helpful for different types of investigations, but let’s try with the recent Palo Alto report about Quasar RAT and focus on the domain akamaicdn[.]ru.

First let’s check what the current DNS resolution is :

$ harpoon  dns akamaicdn[.]ru
# A
No A entry

No AAAA entry configured

# NS
ns2.reg.ru. - - ASN197695 Domain names registrar REG.RU, Ltd - None Russia
ns1.reg.ru. - - ASN197695 Domain names registrar REG.RU, Ltd - None Russia

# MX:
No MX entry configured

NS: ns1.reg.ru.
Owner: hostmaster.ns1@reg.ru

# TXT:
No TXT entry configured

No A entry, I guess the operation stopped. Let’s see if we can get any older IP with robtex:

$ harpoon robtex domain akamaicdn.ru
Passive DNS info:
[+] A	(2017-03-16T14:55:12 -> 2017-03-16T14:55:12)
[+] NS	ns1.expired.r01.ru	(2017-03-16T14:55:12 -> 2017-03-16T14:55:12)
[+] A	(2017-03-16T14:55:12 -> 2017-03-16T14:55:12)
[+] NS	ns2.expired.r01.ru	(2017-03-16T14:55:12 -> 2017-03-16T14:55:12)
[+] MX	nomail.nic.ru	(2017-03-16T14:55:12 -> 2017-03-16T14:55:12)

Let’s check where these IPs are:

$ ipinfo
; ASN48287 ; Jsc ru-center ; Moscow ; Russia
; ASN48287 ; Jsc ru-center ; None ; Russia

We can check if certificates were created for this domain with crt.sh :

$ harpoon crtsh -d akamaicdn.ru
sni11878.cloudflaressl.com	2017-03-02T00:00:00+00:00	2017-09-03T23:59:59+00:00	B05CB0F1425FBFA7E9407C777C6B4DC0E3F7F1B6
sni11878.cloudflaressl.com	2017-02-21T00:00:00+00:00	2017-08-06T23:59:59+00:00	7B9F1F8A2F7211C332C60EBFDB6CF739DF7D2A3A
sni11878.cloudflaressl.com	2017-01-22T00:00:00+00:00	2017-07-30T23:59:59+00:00	D372B140802DA627BD0745B447A9E3A48B2FBD15
sni11878.cloudflaressl.com	2017-01-19T00:00:00+00:00	2017-07-23T23:59:59+00:00	3868C466BC8D131B2EB6B65CD7B20E7FFB255C51
sni11878.cloudflaressl.com	2016-12-05T00:00:00+00:00	2017-06-04T23:59:59+00:00	BBEBA7914A4287C8BDDCD81510A327D33E6476F5
sni11878.cloudflaressl.com	2016-12-05T00:00:00+00:00	2017-06-04T23:59:59+00:00	F0F0A0D02A8E16B3A261382D75B8C96393A16264
sni11878.cloudflaressl.com	2016-11-27T00:00:00+00:00	2017-06-04T23:59:59+00:00	E285191C82EA0F5FD23EF4688A62E5772F4584D4
sni11878.cloudflaressl.com	2016-11-23T00:00:00+00:00	2017-05-28T23:59:59+00:00	7F40B0D369700BFC27C2AD2EB858D8DF4955624D
sni11878.cloudflaressl.com	2016-11-21T00:00:00+00:00	2017-05-28T23:59:59+00:00	EDE5454F23BBC7BFBA17F2E293D7FBDD1266B260
sni11878.cloudflaressl.com	2016-10-23T00:00:00+00:00	2017-04-30T23:59:59+00:00	68D504FAEB6AF1DDA50062B16CBFB46AAD490171
sni11878.cloudflaressl.com	2016-10-23T00:00:00+00:00	2017-04-30T23:59:59+00:00	2F2231766F8432343B579DB21ECF829CB171E481
sni11878.cloudflaressl.com	2016-10-23T00:00:00+00:00	2017-04-30T23:59:59+00:00	B11FCBEAD0A2D174C661D8095A0693955FC62A99
sni11878.cloudflaressl.com	2016-09-12T00:00:00+00:00	2017-03-19T23:59:59+00:00	62578BABE0AFFE15ABE3FBD68A6EE8EF76AB556A
sni11878.cloudflaressl.com	2016-05-18T00:00:00+00:00	2016-11-20T23:59:59+00:00	5630B82083D14B0D5202FEAC7566971ECA41BBDC
sni11878.cloudflaressl.com	2016-05-18T00:00:00+00:00	2016-11-20T23:59:59+00:00	D5A7B4CC6DF2340E8F547E3CC0A17163A81FD51A
sni11878.cloudflaressl.com	2016-05-06T00:00:00+00:00	2016-11-06T23:59:59+00:00	500DA087F038AEB5A37D9F60638332DBA8368BA2
sni11878.cloudflaressl.com	2016-05-06T00:00:00+00:00	2016-11-06T23:59:59+00:00	C54AEC0DA67F1B12A134C6B997EE93DFA0EEE4F2
sni11878.cloudflaressl.com	2016-04-11T00:00:00+00:00	2016-10-16T23:59:59+00:00	E746CD3581198237D3D26F8A80FF71BAD88D1544
sni11878.cloudflaressl.com	2016-02-18T00:00:00+00:00	2016-08-21T23:59:59+00:00	451608653F741F079CC52569F7FAFB8C5B1F8855
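
To turn a listing like this into a rough timeline, one can compute the overall validity window of the certificates. Here is a minimal sketch, assuming the output is tab-separated and that the four columns are the certificate name, the not-before date, the not-after date and a SHA-1 fingerprint (the column meanings are inferred from the values, not documented in the post). The helper validity_window is hypothetical.

```python
from datetime import datetime

# Three rows copied from the output above, tab-separated.
SAMPLE = "\n".join([
    "sni11878.cloudflaressl.com\t2017-03-02T00:00:00+00:00\t2017-09-03T23:59:59+00:00\tB05CB0F1425FBFA7E9407C777C6B4DC0E3F7F1B6",
    "sni11878.cloudflaressl.com\t2016-09-12T00:00:00+00:00\t2017-03-19T23:59:59+00:00\t62578BABE0AFFE15ABE3FBD68A6EE8EF76AB556A",
    "sni11878.cloudflaressl.com\t2016-02-18T00:00:00+00:00\t2016-08-21T23:59:59+00:00\t451608653F741F079CC52569F7FAFB8C5B1F8855",
])


def validity_window(text):
    """Return (earliest not-before, latest not-after) across all rows."""
    starts, ends = [], []
    for line in text.splitlines():
        if not line.strip():
            continue
        _name, not_before, not_after, _sha1 = line.split("\t")
        starts.append(datetime.fromisoformat(not_before))
        ends.append(datetime.fromisoformat(not_after))
    return min(starts), max(ends)


if __name__ == "__main__":
    first, last = validity_window(SAMPLE)
    print("Certificates cover", first.date(), "to", last.date())
```

On the full listing above this gives a window from 2016-02-18 to 2017-09-03, which bounds the period during which certificates were issued for the domain.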

So apparently they used Cloudflare to host this domain. We likely have only a partial view of passive DNS information due to Robtex limitations, but let’s check if OTX knows anything about the IP address :

$ harpoon otx -s
No analysis on this file
Listed in 5 pulses
	-BadRabbit-Ransomware - A Modified Version of NotPetya

		Created: 2017-11-03T17:58:14.502000
		id: 59fcae36f0c4a216de3560ea
	-Blueliv Chasing cybercrime: Vawtrak v2 IOCs
		Vawtrak is a serious threat for the finance sector and is predicted to be the next major banking Trojan. Blueliv's investigation into Vawtrak v2 has revealed new information to piece together a more complete view of the Vawtrak banking Trojan and the cybercriminal groups behind it than we've seen before. The report also provides real infection data and Indicators of Compromise (IoCs) that readers can input into their existing security solutions to enhance their protection. Here is the full list of Vawtrak and Moskalvzapoe IOCs discovered as part of the Blueliv analysis.
		Created: 2016-09-12T17:36:18.734000
		References: https://community.blueliv.com/#/s/57d6d33d82df41127d7a6ca4, https://www.sophos.com/en-us/medialibrary/PDFs/technical%20papers/sophos-vawtrak-v2-sahin-wyke.pdf?la=en
		id: 57d6e794aa954c115b68a85f
	-Ursnif CnC

		Created: 2016-02-17T00:35:04.809000
		id: 56c3c03867db8c12501745c6
	-Angler EK Network IOC
		Angler EK Network IOC observed in the past year.
		Created: 2016-02-17T15:25:35.814000
		id: 56c490f067db8c1250175b9d
	-Chinese Government Website Compromised, Leads to Angler
		Despite a recent takedown targeting the Angler Exploit Kit (EK), it's back to business as usual for kit operators. On 30-October-2015, ThreatLabZ noticed a compromised Chinese government website that led to the Angler Exploit Kit with an end payload of Cryptowall 3.0. This compromise does not appear targeted and the compromised site was cleaned up within 24 hours. We have noticed some recent changes to Angler, as well as the inclusion of newer Flash exploits. A set of indicators for this compromise is at the end of this post.
		Created: 2015-11-03T19:21:57.947000
		References: http://research.zscaler.com/2015/11/chinese-government-website-compromised.html
		id: 563909554637f2388aaf2311
Passive DNS:

Create a new command

Harpoon is a plugin-based tool, so it is pretty easy to add new features just by creating new plugins. To do so, you need to create a new file in harpoon/commands and implement a class inheriting the Command class. Let’s say we would like to implement a ping command (which would not be really helpful), we could create the following ping.py file :

import subprocess
from harpoon.commands.base import Command

class CommandPing(Command):
    """
    # Ping

    Here put the help in markdown format
    """
    name = "ping"
    description = "Ping command"

    def add_arguments(self, parser):
        # Here add arguments to the parser (which is an argparse parser)
        parser.add_argument('IP', help='IP to ping')
        # It is nice to save the parser to call help later if needed
        self.parser = parser

    def run(self, conf, args, plugins):
        # Here, implement the actual task
        # args contains the arguments received from the parser
        # Passing a list avoids running user input through a shell
        subprocess.run(["ping", "-c", "1", args.IP])
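
To show how classes like this can be turned into subcommands, here is a minimal, self-contained sketch of a plugin dispatcher built on argparse. It is not Harpoon's actual loader: the stand-in Command base class, the CommandEcho plugin, and the build_parser and main helpers are all hypothetical and only mirror the add_arguments / run interface shown above.

```python
import argparse


class Command:
    """Stand-in for a plugin base class (hypothetical, not Harpoon's own)."""
    name = ""
    description = ""

    def add_arguments(self, parser):
        pass

    def run(self, conf, args, plugins):
        raise NotImplementedError


class CommandEcho(Command):
    name = "echo"
    description = "Print the given word"

    def add_arguments(self, parser):
        parser.add_argument("WORD", help="Word to print")

    def run(self, conf, args, plugins):
        print(args.WORD)
        return args.WORD


def build_parser(plugins):
    """Create one argparse subcommand per plugin."""
    parser = argparse.ArgumentParser(prog="tool")
    subparsers = parser.add_subparsers(dest="command", required=True)
    for name, plugin in plugins.items():
        sub = subparsers.add_parser(name, help=plugin.description)
        plugin.add_arguments(sub)
        # Remember which plugin handles this subcommand.
        sub.set_defaults(plugin=plugin)
    return parser


def main(argv):
    # Instantiate every direct subclass of Command and index it by name.
    plugins = {cls.name: cls() for cls in Command.__subclasses__()}
    args = build_parser(plugins).parse_args(argv)
    # An empty dict stands in for the parsed configuration.
    return args.plugin.run({}, args, plugins)


if __name__ == "__main__":
    main(["echo", "hello"])
```

With this design, adding a feature only requires defining a new subclass; the dispatcher picks it up without any other change.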

That’s all folks, feel free to play with it and open issues and submit pull requests. You can also ping me on Twitter.

Thanks to Starcat for the feedback on the blog post! Most of this text was written while listening to Amanda Palmer singing No Surprises.