Cypher Language Reference
Graph schema, Cypher syntax, functions, procedures, threat intelligence, and best practices for querying WhisperGraph.
Cypher Language Reference Documentation
Schema guide
Node labels
| Label | Description | Example values |
|---|---|---|
| HOSTNAME | Fully-qualified domain names, subdomains, mail server names | www.google.com, ns1.cloudflare.com |
| IPV4 | IPv4 addresses | 1.1.1.1, 142.250.64.100 |
| IPV6 | IPv6 addresses | 2606:4700::6810:84e5 |
| PREFIX | IP CIDR blocks | 142.250.64.0/24 |
| REGISTERED_PREFIX | RIR-allocated IP blocks (virtual, resolved at query time) | 1.1.1.0/24 |
| ANNOUNCED_PREFIX | BGP-announced prefixes (virtual, resolved at query time) | 104.16.128.0/20 |
| ASN | Autonomous system numbers | AS13335, AS15169 |
| ASN_NAME | Human-readable AS organization names | CLOUDFLARENET - Cloudflare, Inc. |
| TLD | Top-level domains | com, net, org, io |
| TLD_OPERATOR | TLD registry operators | VeriSign, Inc. |
| REGISTRAR | Domain registrars (IANA ID format) | iana:292 (MarkMonitor) |
| WHOIS contact email addresses | domains@cloudflare.com | |
| PHONE | WHOIS contact phone numbers (E.164) | +14158675825 |
| ORGANIZATION | Organizations from WHOIS records | cloudflare hostmaster |
| CITY | GeoIP city with country code | Mountain View, US |
| COUNTRY | ISO 3166-1 alpha-2 country codes | US, DE, AU |
| RIR | Regional Internet Registries | ARIN, RIPENCC, APNIC, LACNIC, AFRINIC |
| DNSSEC_ALGORITHM | DNSSEC signing algorithms | ECDSAP256SHA256, RSASHA256 |
| FEED_SOURCE | Threat intelligence feed sources (virtual) | Spamhaus DROP, Feodo Tracker |
| CATEGORY | Threat feed categories (virtual) | C2 Servers, Phishing |
Edge types
DNS resolution
| Edge type | From | To | Description |
|---|---|---|---|
| RESOLVES_TO | HOSTNAME | IPV4/IPV6 | DNS A/AAAA records |
| CHILD_OF | HOSTNAME | HOSTNAME/TLD | Domain hierarchy (sub.example.com -> example.com -> com) |
| ALIAS_OF | HOSTNAME | HOSTNAME | CNAME records |
| NAMESERVER_FOR | HOSTNAME | HOSTNAME | NS delegation (nameserver serves the target domain) |
| MAIL_FOR | HOSTNAME | HOSTNAME | MX records (mail server handles mail for the target domain) |
| SIGNED_WITH | HOSTNAME | DNSSEC_ALGORITHM | DNSSEC signing algorithm |
BGP and routing
| Edge type | From | To | Description |
|---|---|---|---|
| ANNOUNCED_BY | IPV4/PREFIX | ANNOUNCED_PREFIX | BGP announcement (virtual, resolved at query time) |
| ROUTES | ASN | ANNOUNCED_PREFIX | ASN routes this prefix (virtual) |
| BELONGS_TO | IPV4 | PREFIX/REGISTERED_PREFIX/ANNOUNCED_PREFIX | IP membership in a prefix block |
| PEERS_WITH | ASN | ASN | BGP peering session (virtual) |
| HAS_NAME | ASN | ASN_NAME | Network operator name (virtual) |
| HAS_COUNTRY | ASN/PREFIX | COUNTRY | Country assignment |
WHOIS and registration
| Edge type | From | To | Description |
|---|---|---|---|
| HAS_REGISTRAR | HOSTNAME | REGISTRAR | Current domain registrar |
| PREV_REGISTRAR | HOSTNAME | REGISTRAR | Previous domain registrar |
| REGISTERED_BY | HOSTNAME/REGISTERED_PREFIX | ORGANIZATION | WHOIS registrant organization |
| HAS_EMAIL | HOSTNAME | WHOIS contact email | |
| HAS_PHONE | HOSTNAME | PHONE | WHOIS contact phone |
GeoIP
| Edge type | From | To | Description |
|---|---|---|---|
| LOCATED_IN | IPV4 | CITY | GeoIP city location |
| LOCATED_IN | CITY | COUNTRY | City to country mapping |
Threat intelligence
| Edge type | From | To | Description |
|---|---|---|---|
| LISTED_IN | IPV4/HOSTNAME | FEED_SOURCE | IP or hostname appears in this threat feed (virtual) |
| BELONGS_TO | FEED_SOURCE | CATEGORY | Feed classified under this category |
Web
| Edge type | From | To | Description |
|---|---|---|---|
| LINKS_TO | HOSTNAME | HOSTNAME | Hyperlink between hostnames (from web crawl data) |
SPF
| Edge type | From | To | Description |
|---|---|---|---|
| SPF_INCLUDE | HOSTNAME | HOSTNAME | SPF include: mechanism |
| SPF_IP | HOSTNAME | PREFIX | SPF ip4: / ip6: mechanism |
| SPF_A | HOSTNAME | HOSTNAME | SPF a: mechanism |
| SPF_MX | HOSTNAME | HOSTNAME | SPF mx: mechanism |
| SPF_REDIRECT | HOSTNAME | HOSTNAME | SPF redirect= modifier |
| SPF_EXISTS | HOSTNAME | HOSTNAME | SPF exists: mechanism |
Other
| Edge type | From | To | Description |
|---|---|---|---|
| OPERATES | TLD_OPERATOR | TLD | Registry operator manages this TLD (virtual) |
Entity relationship diagram
Entity Relationship Diagram
Solid lines are physical edges stored on disk. Dashed lines are virtual edges computed at query time from live infrastructure and threat intelligence data.
Solid lines are physical edges stored on disk. Dashed lines are virtual edges computed at query time from live infrastructure and threat intelligence data.
Multi-hop path patterns
These are the most common traversal chains through the graph.
Domain to network owner:
HOSTNAME -[:RESOLVES_TO]-> IPV4 -[:ANNOUNCED_BY]-> ANNOUNCED_PREFIX -[:ROUTES]-> ASN -[:HAS_NAME]-> ASN_NAME
Domain to nameservers:
HOSTNAME(ns) -[:NAMESERVER_FOR]-> HOSTNAME(domain)
Domain to mail servers:
HOSTNAME(mx) -[:MAIL_FOR]-> HOSTNAME(domain)
IP to GeoIP location:
IPV4 -[:LOCATED_IN]-> CITY -[:LOCATED_IN]-> COUNTRY
IP to threat feeds:
IPV4 -[:LISTED_IN]-> FEED_SOURCE -[:BELONGS_TO]-> CATEGORY
Domain WHOIS chain:
HOSTNAME -[:HAS_REGISTRAR]-> REGISTRAR
HOSTNAME -[:HAS_EMAIL]-> EMAIL
HOSTNAME -[:REGISTERED_BY]-> ORGANIZATION
DNS hierarchy:
HOSTNAME -[:CHILD_OF]-> HOSTNAME(parent) -[:CHILD_OF]-> TLD
RIR allocation chain:
IPV4 -[:BELONGS_TO]-> REGISTERED_PREFIX -[:REGISTERED_BY]-> ORGANIZATION
Language reference
MATCH
Basic pattern matching. Always anchor your starting node with {name: "value"} on large labels like HOSTNAME and IPV4.
MATCH (n:HOSTNAME {name: "www.google.com"}) RETURN n.name
MATCH (h:HOSTNAME {name: "www.google.com"})-[:RESOLVES_TO]->(ip:IPV4)
RETURN h.name, ip.name LIMIT 10
OPTIONAL MATCH
Returns null for unmatched patterns instead of dropping the row. Use this for WHOIS fields and other sparse data where not every node has every relationship.
MATCH (h:HOSTNAME {name: "www.google.com"})
OPTIONAL MATCH (h)-[:HAS_CERTIFICATE]->(c)
RETURN h.name, c.name
This returns one row with c.name as null, rather than zero rows.
WHERE clause
Comparison operators:
=, <>, <, >, <=, >=
MATCH (n:TLD) WHERE n.name <> "com" RETURN n.name LIMIT 3
String predicates:
-- Prefix search (fast, uses index)
MATCH (n:HOSTNAME) WHERE n.name STARTS WITH "www.googl" RETURN n.name LIMIT 5
-- Substring search (fast)
MATCH (n:HOSTNAME) WHERE n.name CONTAINS "cloudflare" RETURN n.name LIMIT 5
-- Suffix search (fast on HOSTNAME and TLD, slow on other labels)
MATCH (n:HOSTNAME) WHERE n.name ENDS WITH ".google.com" RETURN n.name LIMIT 5
Regular expressions:
MATCH (n:HOSTNAME) WHERE n.name =~ "www\\.google\\.com" RETURN n.name LIMIT 1
Regex uses full-match semantics (not substring). Nested quantifiers are rejected for safety. Maximum pattern length is 1000 characters. Prefer STARTS WITH, ENDS WITH, or CONTAINS instead of regex when possible.
Logical operators: AND, OR, NOT, XOR
WITH 5 AS x WHERE x > 3 AND x < 10 RETURN x
NULL checks: IS NULL, IS NOT NULL
WITH null AS x RETURN x IS NULL AS result
List membership: IN
MATCH (n:ASN) WHERE n.name IN ["AS13335", "AS15169"] RETURN n.name
RETURN
Projects columns from matched patterns. Supports aliases with AS and DISTINCT.
MATCH (h:HOSTNAME {name: "www.google.com"})-[:RESOLVES_TO]->(ip)
RETURN DISTINCT ip.name AS address
MATCH (h:HOSTNAME {name: "www.google.com"})-[:RESOLVES_TO]->(ip)
WITH h.name AS host, collect(ip.name) AS ips
RETURN host, ips
WITH
Pipes results between query stages. You can aggregate, filter, and reshape data mid-query.
MATCH (h:HOSTNAME {name: "www.google.com"})-[:RESOLVES_TO]->(ip)
WITH h.name AS host, count(ip) AS ipCount
RETURN host, ipCount
MATCH (sub:HOSTNAME)-[:CHILD_OF]->(h:HOSTNAME {name: "github.com"})
WITH sub LIMIT 5000
MATCH (sub)-[:RESOLVES_TO]->(ip:IPV4)
RETURN DISTINCT ip.name LIMIT 30
ORDER BY, LIMIT, SKIP
Sort results and paginate. Always use LIMIT on queries against large labels.
MATCH (n:TLD) RETURN n.name ORDER BY n.name ASC LIMIT 5
MATCH (n:COUNTRY) RETURN n.name ORDER BY n.name SKIP 2 LIMIT 3
Both SKIP N LIMIT M and LIMIT M SKIP N orderings are supported.
UNWIND
Turns a list into individual rows. This is the batch lookup pattern.
UNWIND ["google.com", "cloudflare.com", "microsoft.com"] AS domain
MATCH (h:HOSTNAME {name: domain})
OPTIONAL MATCH (h)-[:HAS_REGISTRAR]->(r:REGISTRAR)
RETURN domain, collect(DISTINCT r.name) AS registrars
You can pass dozens or hundreds of indicators in a single UNWIND list.
UNION
Combines results from multiple MATCH clauses. UNION deduplicates; UNION ALL keeps duplicates.
MATCH (n:TLD {name: "com"}) RETURN n.name
UNION
MATCH (n:TLD {name: "net"}) RETURN n.name
CALL subqueries
Correlated subqueries import variables with WITH.
MATCH (h:HOSTNAME {name: "www.google.com"})
CALL { WITH h MATCH (h)-[:RESOLVES_TO]->(ip) RETURN count(ip) AS ipCount }
RETURN h.name, ipCount
CALL is also used to invoke procedures:
CALL explain("185.220.101.1")
CALL whisper.history("cloudflare.com")
CALL whisper.quota()
CALL db.labels()
CALL db.relationshipTypes()
EXPLAIN
Shows the query plan without executing the query.
EXPLAIN MATCH (h:HOSTNAME {name: "www.google.com"})-[:RESOLVES_TO]->(ip) RETURN ip.name
ProduceResult([ip.name])
Project([ip.name])
Expand(h-[RESOLVES_TO]->ip)
NodeLookup(h={name:'www.google.com'}:HOSTNAME)
Use EXPLAIN to verify the engine is using an indexed lookup rather than a label scan. PROFILE returns the same plan plus row counts per operator.
CASE / WHEN
Conditional expressions.
RETURN CASE WHEN 1 > 0 THEN "positive" ELSE "negative" END AS result
MATCH (h:HOSTNAME {name: "cloudflare.com"})-[:RESOLVES_TO]->(ip:IPV4)
OPTIONAL MATCH (ip)-[:LISTED_IN]->(f:FEED_SOURCE)
RETURN ip.name,
CASE WHEN f IS NOT NULL THEN "listed" ELSE "clean" END AS status
LIMIT 5
Nested CASE expressions are supported.
shortestPath
Finds the minimum-hop path between two nodes.
MATCH (a:HOSTNAME {name: "cloudflare.com"}), (b:HOSTNAME {name: "google.com"})
MATCH p = shortestPath((a)-[*1..6]-(b))
RETURN length(p) AS hops, [n IN nodes(p) | n.name] AS path
Always specify a bounded path length like [*1..6]. Unbounded paths may be slow or time out. allShortestPaths() is also supported and returns all paths of the shortest length.
EXISTS subqueries
MATCH (h:HOSTNAME {name: "www.google.com"})
WHERE EXISTS { (h)-[:RESOLVES_TO]->() }
RETURN h.name
The function-style EXISTS(expr) predicate is also supported. It returns true if the expression is non-null:
MATCH (h:HOSTNAME {name: "google.com"}) RETURN EXISTS(h.name) AS has_name
COUNT subqueries
MATCH (h:HOSTNAME {name: "www.google.com"})
RETURN h.name, COUNT { (h)-[:RESOLVES_TO]->() } AS ipCount
Note: COUNT{}, COLLECT{}, and EXISTS{} subqueries cannot be used directly in ORDER BY. Pre-compute in a WITH clause, then sort by the alias.
COLLECT subqueries
MATCH (h:HOSTNAME {name: "www.google.com"})
RETURN h.name, COLLECT { MATCH (h)-[:RESOLVES_TO]->(ip) RETURN ip.name } AS ips
List comprehensions
RETURN [x IN range(1, 10) WHERE x % 2 = 0] AS evens
WITH [1, 2, 3, 4, 5] AS nums
RETURN [x IN nums WHERE x > 2 | x * 10] AS filtered
Pattern comprehensions
MATCH (h:HOSTNAME {name: "www.google.com"})
RETURN h.name, [(h)-[:RESOLVES_TO]->(ip) | ip.name] AS ips
Function reference
Aggregation
| Function | Description | Example |
|---|---|---|
count(*) | Count rows | MATCH (n:TLD) RETURN count(*) |
count(DISTINCT x) | Count unique values | MATCH (n:COUNTRY) RETURN count(DISTINCT n.name) |
collect(x) | Collect values into a list | MATCH (n:RIR) RETURN collect(n.name) |
sum(x) | Sum numeric values | UNWIND [1,2,3] AS x RETURN sum(x) |
avg(x) | Average | UNWIND [1,2,3] AS x RETURN avg(x) |
min(x) | Minimum | UNWIND [3,1,4] AS x RETURN min(x) |
max(x) | Maximum | UNWIND [3,1,4] AS x RETURN max(x) |
stdev(x) | Sample standard deviation | UNWIND [1,2,3,4,5] AS x RETURN stdev(x) |
stdevp(x) | Population standard deviation | UNWIND [1,2,3,4,5] AS x RETURN stdevp(x) |
percentileDisc(x, p) | Discrete percentile | UNWIND [1,2,3,4,5] AS x RETURN percentileDisc(x, 0.5) |
percentileCont(x, p) | Continuous percentile | UNWIND [1,2,3,4,5] AS x RETURN percentileCont(x, 0.9) |
Note: min() and max() work on both aggregated rows and list literals.
String
| Function | Example | Result |
|---|---|---|
toUpper("hello") | "HELLO" | |
toLower("HELLO") | "hello" | |
trim(" hello ") | "hello" | |
lTrim(" hello") | "hello" | |
rTrim("hello ") | "hello" | |
replace("hello world", "world", "cypher") | "hello cypher" | |
substring("hello world", 6) | "world" | |
substring("hello world", 0, 5) | "hello" | |
split("a.b.c", ".") | ["a", "b", "c"] | |
left("hello", 3) | "hel" | |
right("hello", 3) | "llo" | |
reverse("hello") | "olleh" | |
size("hello") | 5 |
Numeric
| Function | Example | Result |
|---|---|---|
abs(-42) | 42 | |
ceil(3.14) | 4.0 | |
floor(3.99) | 3.0 | |
round(3.5) | 4 | |
sign(-42) | -1 | |
rand() | Random float 0..1 | |
log(e()) | 1.0 | |
log10(100) | 2.0 | |
exp(1) | 2.718... | |
sqrt(144) | 12.0 | |
e() | 2.718... | |
pi() | 3.14159... |
Arithmetic operators: +, -, *, /, %, ^ (exponent), + (string concatenation).
Trigonometric
sin(), cos(), tan(), asin(), acos(), atan(), atan2(), degrees(), radians()
All accept and return radians, except degrees() which converts radians to degrees, and radians() which converts degrees to radians.
Geospatial
| Function | Description |
|---|---|
point({latitude: 37.77, longitude: -122.42}) | Create a geographic point |
point.distance(p1, p2) | Haversine distance in meters between two points |
Collection
| Function | Example | Result |
|---|---|---|
head([1,2,3]) | 1 | |
last([1,2,3]) | 3 | |
tail([1,2,3]) | [2,3] | |
size([1,2,3]) | 3 | |
reverse([1,2,3]) | [3,2,1] | |
range(1, 5) | [1,2,3,4,5] | |
range(0, 10, 2) | [0,2,4,6,8,10] | |
keys({a: 1, b: 2}) | ["a", "b"] |
List predicates
| Function | Example | Result |
|---|---|---|
all(x IN list WHERE pred) | all(x IN [2,4,6] WHERE x % 2 = 0) | true |
any(x IN list WHERE pred) | any(x IN [1,2,3] WHERE x > 2) | true |
none(x IN list WHERE pred) | none(x IN [1,2,3] WHERE x > 5) | true |
single(x IN list WHERE pred) | single(x IN [1,2,3] WHERE x = 2) | true |
reduce(acc=init, x IN list | expr) | reduce(t=0, x IN [1,2,3,4,5] | t+x) | 15 |
Node and relationship
| Function | Description |
|---|---|
labels(n) | Returns the list of labels on a node, e.g. ["HOSTNAME"] |
type(r) | Returns edge type name, e.g. "RESOLVES_TO" |
id(n) | Returns node ID |
properties(n) | Returns all properties as a map |
startNode(r) | Source node of a relationship |
endNode(r) | Target node of a relationship |
nodes(p) | All nodes in a path |
relationships(p) | All relationships in a path |
length(p) | Number of edges in a path |
Type conversion
| Function | Example | Result |
|---|---|---|
toString(42) | "42" | |
toInteger("42") | 42 | |
toFloat("3.14") | 3.14 | |
toBoolean("true") | true | |
toIntegerList(["1","2"]) | [1, 2] | |
toFloatList(["1.1","2.2"]) | [1.1, 2.2] | |
toStringList([1,2]) | ["1", "2"] | |
toBooleanList(["true","false"]) | [true, false] | |
isEmpty("") | true | |
coalesce(null, null, 42) | 42 | |
randomUUID() | UUID string |
Date and time
| Function | Description |
|---|---|
datetime() | Current UTC datetime |
date() | Current date |
timestamp() | Current epoch milliseconds |
datetime("2024-01-15T10:30:00Z") | Parse an ISO 8601 datetime string |
duration("P1Y2M3D") | Create a duration from ISO 8601 |
duration.between(dt1, dt2) | Duration between two datetimes |
Property access on datetime values: .year, .month, .day, .hour, .minute, .second, .epochMillis, .epochSeconds.
Features
Threat intelligence
Whisper Graph indexes 40+ threat intelligence feeds across 18 categories. This data comes from whisper-feeds, a companion service that aggregates live threat feed data from public and commercial sources, with hourly incremental and daily full refresh cycles.
The threat intelligence layer works like this:
- IPs and hostnames that appear in threat feeds are connected to FEED_SOURCE nodes via
LISTED_INvirtual edges. - Each FEED_SOURCE belongs to one or more CATEGORYs (e.g., "C2 Servers", "Phishing", "General Blacklists").
- ASN nodes may carry threat properties like
maxThreatScoreandoverallThreatLevel. - The
explain()procedure computes a composite score that factors in feed count, feed weights, recency, and network density.
Querying threat feeds directly:
MATCH (ip:IPV4 {name: "185.220.101.1"})-[:LISTED_IN]->(f:FEED_SOURCE)
RETURN ip.name, f.name
[
{"ip.name": "185.220.101.1", "f.name": "Dan Tor Exit"},
{"ip.name": "185.220.101.1", "f.name": "FireHOL Level 2"},
{"ip.name": "185.220.101.1", "f.name": "Spamhaus DROP"}
]
Scored threat assessment with explain():
explain() accepts IPs, domains, ASNs, and CIDR ranges.
CALL explain("185.220.101.1")
[{
"indicator": "185.220.101.1",
"type": "ip",
"found": true,
"score": 5.28,
"level": "INFO",
"explanation": "185.220.101.1 is listed in 3 threat feed(s). Score 5.3 (Informational - minimal risk).",
"factors": [
"Listed in 3 source(s) with combined weight 2.20",
"Base score: 2.20 x log2(3 + 1) = 4.40",
"Recency boost: x1.2 (last seen just now)"
]
}]
CALL explain("cloudflare.com")
[{
"indicator": "cloudflare.com",
"type": "domain",
"found": true,
"score": 0.0,
"level": "NONE",
"explanation": "Not listed in any threat intelligence feed"
}]
CALL explain("AS60729")
[{
"indicator": "AS60729",
"type": "asn",
"found": true,
"score": 0.0,
"level": "NONE",
"explanation": "AS60729 (TORSERVERS-NET, DE) has a reputation score of 48.5 (Suspicious)."
}]
CALL explain("185.220.101.0/24")
[{
"indicator": "185.220.101.0/24",
"type": "network",
"found": true,
"score": 0.0,
"level": "MEDIUM",
"explanation": "Network 185.220.101.0/24 contains 167 listed IPs across multiple threat feeds."
}]
The level field returns one of: NONE, INFO, LOW, MEDIUM, HIGH, CRITICAL.
Feed sources (all 40):
| Feed | Category |
|---|---|
| AlienVault Reputation | Reputation |
| Binary Defense Banlist | General Blacklists |
| Blocklist.de All | General Blacklists |
| Blocklist.de Mail | Spam |
| Blocklist.de SSH | Brute Force |
| Botvrij Domains | Malicious Domains |
| Botvrij Dst IPs | C2 Servers |
| Brute Force Blocker | Brute Force |
| C2 Intel 30d | C2 Servers |
| C2 Tracker | C2 Servers |
| CERT.pl Domains | Malicious Domains |
| CINS Score | General Blacklists |
| Cloudflare Radar Top 1M | Popularity/Trust |
| DNS RD Abuse | General Blacklists |
| Dan Tor Exit | TOR Network |
| ET Compromised IPs | General Blacklists |
| Feodo Tracker | C2 Servers |
| FireHOL Abusers 1d | General Blacklists |
| FireHOL Anonymous | Proxies |
| FireHOL Level 1 | General Blacklists |
| FireHOL Level 2 | General Blacklists |
| FireHOL Level 3 | General Blacklists |
| FireHOL WebClient | General Blacklists |
| GreenSnow Blacklist | General Blacklists |
| Hagezi Light | Ad/Tracking Blocklists |
| Hagezi Pro | Ad/Tracking Blocklists |
| IPsum | General Blacklists |
| InterServer RBL | General Blacklists |
| MalwareBazaar Recent | Malware Distribution |
| OpenPhish Feed | Phishing |
| SSH Client Attacks | Brute Force |
| SSH Password Auth | Brute Force |
| SSL IP Blacklist | General Blacklists |
| Spamhaus DROP | General Blacklists |
| Spamhaus EDROP | General Blacklists |
| StevenBlack Hosts | Ad/Tracking Blocklists |
| ThreatFox IOCs | C2 Servers |
| Tor Exit Nodes | TOR Network |
| Tranco Top 1M | Popularity/Trust |
| URLhaus Recent | Malware Distribution |
Categories (all 18):
Ad/Tracking Blocklists, Anonymization Infrastructure, Attack Sources, Brute Force, C2 Servers, General Blacklists, Malicious Domains, Malicious Infrastructure, Malware Distribution, Phishing, Popularity/Trust, Proxies, Reference Data, Reputation, Spam, TOR Network, Threat Intelligence, VPNs.
Domain history
The whisper.history() procedure returns timestamped WHOIS and BGP snapshots for an indicator.
CALL whisper.history("cloudflare.com")
[
{
"indicator": "cloudflare.com",
"type": "domain",
"queryTime": "2024-06-13 18:31:16",
"createDate": "2009-02-17",
"updateDate": "2024-01-09",
"expiryDate": "2033-02-17",
"registrar": "CloudFlare, Inc.",
"nameServers": "..."
},
{
"indicator": "cloudflare.com",
"type": "domain",
"queryTime": "2020-05-01 07:03:56",
"createDate": "2009-02-17",
"registrar": "Cloudflare, Inc."
}
]
Supported indicator types: domain, IP, ASN, CIDR, hash.
History works on registrable domains (e.g., google.com), not subdomains (www.google.com). Subdomains return empty results.
Quota and plan info
CALL whisper.quota()
Returns your current plan tier, hourly query limit, usage count, maximum traversal depth, and concurrent query count.
Schema introspection
Discover what is in the graph without knowing the schema up front.
-- All node labels with counts
CALL db.labels()
Returns every label in the graph with its count, including virtual labels (REGISTERED_PREFIX, ANNOUNCED_PREFIX, FEED_SOURCE, CATEGORY).
-- All edge types with counts
CALL db.relationshipTypes()
Returns every edge type with its count, including virtual edge types (LISTED_IN, ANNOUNCED_BY, OPERATES).
-- All property keys
CALL db.propertyKeys()
Returns all known property keys in the graph.
-- Full schema summary
CALL db.schema()
Returns labels, relationship types, and property keys in a single response.
These are all instant lookups, regardless of graph size.
Best practices
General rules
- Anchor your starting node.
MATCH (n:HOSTNAME {name: "example.com"})does an indexed lookup.MATCH (n:HOSTNAME)scans billions of nodes. - Always use LIMIT. Especially on traversals that could fan out (LINKS_TO, RESOLVES_TO on CDN IPs). Start small and increase if needed.
- Use OPTIONAL MATCH for WHOIS fields. Not every domain has a registrar, email, or phone. A required MATCH on a missing field returns zero rows and hides other results.
- Use count() before pulling large result sets. Check cardinality first to avoid unexpectedly large responses.
- Use UNWIND for batch lookups. Pass lists of indicators in a single request rather than making one request per indicator.
- Specify edge types explicitly.
[:RESOLVES_TO]is faster than[r]because the engine does not need to check all edge types. - Use ANNOUNCED_BY for current BGP routing. Use BELONGS_TO for the registered RIR allocation. They return different prefix types and may give different results.
- Anchor LINKS_TO queries. The web link graph is one of the largest datasets. Queries without an anchored starting node will time out.
- Avoid CONTAINS with special characters. Characters like
&cause slow full-text scans. Use STARTS WITH or exact match when possible.
Do this / not that
| Do this | Not that | Why |
|---|---|---|
MATCH (h:HOSTNAME {name: "example.com"}) | MATCH (h:HOSTNAME) WHERE h.name = "example.com" | Inline property gets an indexed lookup |
Always add LIMIT | Open-ended traversals | Prevents timeout on billion-scale labels |
OPTIONAL MATCH for WHOIS fields | MATCH for sparse relationships | Avoids losing rows when fields are missing |
MATCH (sub)-[:CHILD_OF]->(h {name: "x.com"}) | WHERE h.name ENDS WITH ".x.com" | CHILD_OF uses an indexed edge; ENDS WITH scans |
STARTS WITH "www.goo" | =~ "www\\.goo.*" | STARTS WITH uses the FST index; regex does not |
| Pre-compute in WITH, then ORDER BY alias | COUNT{} or COLLECT{} in ORDER BY | Subquery expressions in ORDER BY are not supported |