Rapid7 Labs

Open Data API

Datasets: 8     Files: 41,748     Total size: 47.9 TB

We offer an API for listing studies and retrieving the files offered on Open Data. API access requires an Open Data account. See the About page for information on requesting access.

Once you have an account, go to the API management website and create a new User Key. Store the key in a safe place. It can be viewed only once, right after creation, and anyone who has access to it can use the API in your name.

Using the API

The API is REST-based and read-only, so it consists solely of GET requests. Responses are JSON-encoded and contain all the information you can find on the website itself.

For authentication, supply your API key on every request in the X-API-Key HTTP header. HTTP header names are case-insensitive, so the X-Api-Key spelling used in the curl examples is equivalent.
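As a rough illustration of the authentication scheme, the following Python sketch attaches the key to a GET request using only the standard library. The helper names build_request and get_json are made up for this example and are not part of any Rapid7 client; the base URL is the one used in the curl examples on this page.

```python
import json
import urllib.request

# Base URL as it appears in the curl examples on this page.
BASE_URL = "https://us.api.insight.rapid7.com/opendata"


def build_request(path: str, api_key: str) -> urllib.request.Request:
    """Return a GET request for `path` with the API key header attached.

    Hypothetical helper: `path` is a relative path such as "studies".
    """
    url = f"{BASE_URL}/{path.strip('/')}/"
    return urllib.request.Request(
        url, headers={"X-Api-Key": api_key}, method="GET"
    )


def get_json(path: str, api_key: str):
    """Send the request and decode the JSON body (performs a network call)."""
    with urllib.request.urlopen(build_request(path, api_key)) as resp:
        return json.load(resp)
```

With a valid key, get_json("studies", key) would correspond to the first curl call in the examples.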

The following examples show how to use the API.

Listing studies

To list studies, execute the following request:

curl -H "X-Api-Key: <your API key>" "https://us.api.insight.rapid7.com/opendata/studies/"

The API returns a list of all known studies, including their files (output shortened):

[
  {
    "uniqid": "sonar.fdns_v2",
    "name": "Forward DNS (FDNS)",
    "short_desc": "DNS 'ANY', 'A', 'AAAA', 'TXT' and 'CNAME' responses for known forward DNS names",
    "long_desc": "This dataset contains the responses to DNS requests for all forward DNS names known by Rapid7's Project Sonar.  Until early November 2017, all of these were for the 'ANY' record with a fallback A and AAAA request if neccessary.  After that, the ANY study represents only the responses to ANY requests, and dedicated studies were created for the A, AAAA, CNAME and TXT record lookups with appropriately named files.  The file is a GZIP compressed file containing the name, type, value and timestamp of any returned records for a given name in JSON format.",
    "study_url": "https://sonar.labs.rapid7.com",
    "study_name": "Forward DNS (FDNS)",
    "study_venue": "Project Sonar",
    "study_bibtext": "",
    "contact_name": "Rapid7 Labs",
    "contact_email": "research@rapid7.com",
    "organization_name": "Rapid7",
    "organization_website": "http://www.rapid7.com/",
    "created_at": "2018-03-20",
    "updated_at": "2018-03-20",
    "sonarfile_set": [
      "2018-06-15-1529049662-fdns_aaaa.json.gz",
      "2018-06-15-1529049601-fdns_a.json.gz",
      "2018-06-08-1528499062-fdns_any.json.gz",
      ...
    ]
  },
  ...
]
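Once the response is decoded, the listing is an ordinary list of dictionaries. The sketch below counts the files per study; summarize_studies is a hypothetical helper, and the sample is a trimmed stand-in for the response shown above rather than real API output.

```python
import json


def summarize_studies(studies: list[dict]) -> dict[str, int]:
    """Map each study's uniqid to the number of files it lists."""
    return {s["uniqid"]: len(s.get("sonarfile_set", [])) for s in studies}


# Trimmed stand-in for the response body shown above.
sample = json.loads("""
[
  {
    "uniqid": "sonar.fdns_v2",
    "name": "Forward DNS (FDNS)",
    "sonarfile_set": [
      "2018-06-15-1529049662-fdns_aaaa.json.gz",
      "2018-06-15-1529049601-fdns_a.json.gz",
      "2018-06-08-1528499062-fdns_any.json.gz"
    ]
  }
]
""")

print(summarize_studies(sample))  # {'sonar.fdns_v2': 3}
```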

Showing study details

To show the details of a single study, append its unique ID (the uniqid field) to the URL:

curl -H "X-Api-Key: <your API key>" "https://us.api.insight.rapid7.com/opendata/studies/sonar.fdns_v2/"

The response has the same structure as an entry in the study list, but covers only the study you requested:

{
  "uniqid": "sonar.fdns_v2",
  "name": "Forward DNS (FDNS)",
  "short_desc": "DNS 'ANY', 'A', 'AAAA', 'TXT' and 'CNAME' responses for known forward DNS names",
  "long_desc": "This dataset contains the responses to DNS requests for all forward DNS names known by Rapid7's Project Sonar.  Until early November 2017, all of these were for the 'ANY' record with a fallback A and AAAA request if neccessary.  After that, the ANY study represents only the responses to ANY requests, and dedicated studies were created for the A, AAAA, CNAME and TXT record lookups with appropriately named files.  The file is a GZIP compressed file containing the name, type, value and timestamp of any returned records for a given name in JSON format.",
  "study_url": "https://sonar.labs.rapid7.com",
  "study_name": "Forward DNS (FDNS)",
  "study_venue": "Project Sonar",
  "study_bibtext": "",
  "contact_name": "Rapid7 Labs",
  "contact_email": "research@rapid7.com",
  "organization_name": "Rapid7",
  "organization_website": "http://www.rapid7.com/",
  "created_at": "2018-03-20",
  "updated_at": "2018-03-20",
  "sonarfile_set": [
    "2018-06-15-1529049662-fdns_aaaa.json.gz",
    "2018-06-15-1529049601-fdns_a.json.gz",
    "2018-06-08-1528499062-fdns_any.json.gz",
    ...
  ]
}
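The file names in sonarfile_set appear to follow a date, Unix timestamp, label pattern. This is inferred from the three sample names above and is not a documented guarantee, so treat the following as a sketch under that assumption. It parses the names and picks the newest file per label; parse_file_name and latest_per_label are hypothetical helpers.

```python
import re
from datetime import datetime, timezone
from typing import NamedTuple

# ASSUMED pattern: YYYY-MM-DD-<unix epoch seconds>-<label>.json.gz
FILE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2})-(\d+)-(.+)\.json\.gz$")


class SonarFile(NamedTuple):
    date: str            # date prefix as written in the name
    taken_at: datetime   # epoch part, converted to UTC
    label: str           # e.g. "fdns_a" or "fdns_aaaa"
    name: str            # the original file name


def parse_file_name(name: str) -> SonarFile:
    """Split a file name into its parts, or raise ValueError."""
    m = FILE_RE.match(name)
    if m is None:
        raise ValueError(f"unexpected file name: {name!r}")
    date, epoch, label = m.groups()
    taken_at = datetime.fromtimestamp(int(epoch), tz=timezone.utc)
    return SonarFile(date, taken_at, label, name)


def latest_per_label(names: list[str]) -> dict[str, str]:
    """Return the newest file name for each label."""
    latest: dict[str, SonarFile] = {}
    for f in map(parse_file_name, names):
        if f.label not in latest or f.taken_at > latest[f.label].taken_at:
            latest[f.label] = f
    return {label: f.name for label, f in latest.items()}
```

Calling latest_per_label(study["sonarfile_set"]) on a decoded study would then give one file name per record type, which can be passed to the file endpoints described on this page.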

Showing file information

To get information about a specific file in a study, append the file name to the study URL:

curl -H "X-Api-Key: <your API key>" "https://us.api.insight.rapid7.com/opendata/studies/sonar.fdns_v2/2018-06-15-1529049662-fdns_aaaa.json.gz/"

The output looks like this:

{
  "name": "2018-06-15-1529049662-fdns_aaaa.json.gz",
  "fingerprint": "9056f555bed59640ee5ffeadbeac8175e5873712",
  "size": 570536141,
  "updated_at": "2018-06-17"
}
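The size and fingerprint fields can be used to check a downloaded file. The fingerprint in the sample is 40 hexadecimal characters long, which matches the length of a SHA-1 digest, but the hash algorithm is not stated on this page. The sketch below therefore assumes SHA-1; matches_file_info is a hypothetical helper.

```python
import hashlib
import os


def matches_file_info(path: str, info: dict, chunk_size: int = 1 << 20) -> bool:
    """Compare a local file against the `size` and `fingerprint` fields.

    ASSUMPTION: `fingerprint` is the hex SHA-1 digest of the file contents.
    """
    # Cheap check first: a size mismatch means the file is wrong or incomplete.
    if os.path.getsize(path) != info["size"]:
        return False
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        # Read in chunks so large files never have to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == info["fingerprint"].lower()
```

If the check fails on a file that downloaded completely, the SHA-1 assumption itself may be wrong, so it is worth confirming the algorithm before discarding data.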

Generating a download URL for a file

To download a file, request a download link with the following API call. The link remains valid only for a limited time:

curl -H "X-Api-Key: <your API key>" "https://us.api.insight.rapid7.com/opendata/studies/sonar.fdns_v2/2018-06-15-1529049662-fdns_aaaa.json.gz/download/"

Please note that every request for a download URL counts towards your quota, which defaults to 30 requests in a 24-hour window. If you exceed the quota, you will receive an error message when you try to request another URL. The error response contains a Retry-After header that tells you how long you need to wait before a new URL can be requested. If the quota applied to your account is prohibitively low, please contact opendata@rapid7.com.

The output looks like this. Note that the URL is not always a Backblaze URL:

{
  "url": "https://f002.backblazeb2.com/file/rapid7-opendata/sonar.fdns_v2/2018-06-15-1529049662-fdns_aaaa.json.gz?Authorization=3_20180626154115_2b4d96ec8e01568ccfb2e3a9_38dccb73bfd3a4e8df553577bf96a3b1d3c1accd_002_20180626164115_0053_dnld"
}
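The two steps around a download can be sketched in Python as follows: interpreting a Retry-After value after a quota error, and streaming the file from the returned URL. HTTP allows Retry-After to be either a number of seconds or an HTTP date, and this page does not say which form the API sends, so the sketch handles both. It also assumes that the returned link is pre-signed and needs no X-Api-Key header. Both function names are hypothetical.

```python
import json
import shutil
import urllib.request
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(value: str, now: Optional[datetime] = None) -> float:
    """Convert a Retry-After header value into seconds to wait."""
    value = value.strip()
    if value.isdigit():
        # Delay given directly as a number of seconds.
        return float(value)
    # Otherwise treat it as an HTTP date such as
    # "Wed, 21 Oct 2015 07:28:00 GMT".
    now = now or datetime.now(timezone.utc)
    target = parsedate_to_datetime(value)
    return max(0.0, (target - now).total_seconds())


def fetch_to_file(download_body: str, dest: str) -> str:
    """Stream the file named in a download response body to `dest`.

    ASSUMPTION: the "url" value is pre-signed, so no API key is sent.
    """
    url = json.loads(download_body)["url"]
    with urllib.request.urlopen(url) as resp, open(dest, "wb") as out:
        # Copy in 1 MiB chunks; the files can be hundreds of megabytes.
        shutil.copyfileobj(resp, out, length=1 << 20)
    return dest
```

With urllib, an error status surfaces as urllib.error.HTTPError. Its headers attribute holds the response headers, so exc.headers.get("Retry-After") can be passed to retry_after_seconds when the value is present.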

Getting information about your quota

To retrieve information about your quota, execute the following API call:

curl -H "X-Api-Key: <your API key>" "https://us.api.insight.rapid7.com/opendata/quota/"

The output looks like this:

{
    "quota_allowed":30,
    "quota_timespan":86400,
    "quota_used":1,
    "quota_left":29,
    "oldest_action_expires_in":86026
}

quota_allowed is the number of download URL requests you can make per quota_timespan seconds.

quota_used is the number of download URLs you have requested in the last quota_timespan seconds.

quota_left is the number of download URLs you can still request before you exhaust your quota.

oldest_action_expires_in is the number of seconds until the oldest URL you requested is quota_timespan seconds old. At that point it no longer counts against your quota, and one more download URL request becomes available. NOTE: If you haven't requested any URLs in the last quota_timespan seconds, this field is not present.
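Putting the four fields together, a client can decide whether to request a download URL now or wait. The following is a minimal sketch based only on the field descriptions above; seconds_until_next_download is a hypothetical helper, and the sample values are the ones from the quota response shown earlier.

```python
from typing import Optional


def seconds_until_next_download(quota: dict) -> Optional[int]:
    """Return how long to wait before requesting another download URL.

    Returns 0 if a request is available now, and None if the quota is
    exhausted but the response gives no expiry information.
    """
    if quota["quota_left"] > 0:
        return 0
    # With nothing left, the next slot opens when the oldest request
    # ages out of the window. The field is absent if there were no
    # recent requests, hence .get().
    return quota.get("oldest_action_expires_in")


sample = {
    "quota_allowed": 30,
    "quota_timespan": 86400,
    "quota_used": 1,
    "quota_left": 29,
    "oldest_action_expires_in": 86026,
}

print(seconds_until_next_download(sample))  # 0
```

Checking the quota endpoint before each download request avoids spending a call that would only return a quota error.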

Request Access

The policies for accessing this data changed on Feb 10, 2022. Please see the About page for more information.