If you’ve been around OSINT or even Infosec, both red and blue, for a while, you’ve probably heard of Google Dorking or Google Hacking. There is no difference in the terms, aside from some audiences being more apt to adopt the techniques if the word hacking is omitted.
The technique, pioneered by Johnny Long, Bill Gardner, Alrik van Eijkelenborg, Ed Skoudis, and Justin Brown, was the subject of three books, the last published in 2015. The dorks were also collected in a web-based database, which Johnny ran until he turned over control to Exploit-DB when he went on a year-long mission trip. It now exists here. Johnny also presented the topic at Black Hat EU 2005.
What exactly is Google Dorking?
As the subtitle suggests, it merely consists of two things:
Speaking the language of the search engine
Asking the question in the most favorable way to the search engine (not WHAT you ask, but HOW you ask it)
Speaking the Language
Search engines, whether Google, Bing, DuckDuckGo, or Yandex, are nothing more than the implementation of an algorithm. The intention of the algorithm is to provide the most relevant and engaging results to the user in the fastest possible time. The motives and monetization structure of the company running the search engine may also affect the algorithm, but it largely comes down to the ability to comb through the index quickly and accurately. This is why many search engine companies invest heavily in Data Science, specifically Natural Language Processing (NLP).
Think of it this way: Speaking English or Arabic to order food in France may work for you, especially in some parts of Paris. In some of the more remote places, not as much. Taking the time to learn part of the language, and at least attempting to use it, can pay massive dividends.
How to Ask the Questions
I typically equate this to asking a toddler what they want for lunch. Ask an open-ended question and there are no boundaries as to what they are going to say. Limit the choice to a peanut butter sandwich or a tuna sandwich and you’ll get better results, although still some wild combinations.
For this, each search engine has its own “language.” Using Google’s language is known as Google Hacking or Google Dorking. Intel0logist has curated a nice StartPage of Search Engine resources, including dorks and syntax for many search engines available here.
For the purpose of this post, I will share some of my favorite Google dorks and explanations of how to use them.
intext:
This is my universal favorite. I use it to search for people, Google tracking codes, cryptocurrency addresses, and more!
ext: or filetype:
I like to use this to find files that probably shouldn’t be on the internet in addition to looking for NMAP scan results (filetype:nmap or filetype:gnmap) and Google Earth KML files (ext:KML).
inurl: and intitle:
I like to use these in enumerating login pages and sensitive directories.
related:
Logic would have this dork enumerating subdomains and other related properties. Nope. It is very useful in enumerating competitors.
site:
Limits the scope of the search to a particular site or domain.
link:
Finds links to a page. Useful in finding fraudulent and cloned pages.
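To make the operator syntax concrete, here is a minimal Python sketch that assembles operator:value terms into a query and URL-encodes it. The helper names dork and search_url are hypothetical, example.com is a placeholder domain, and the https://www.google.com/search?q= URL format is an assumption about how you would submit the query. Treat it as an illustration, not a tool.

```python
from urllib.parse import quote_plus


def dork(operator: str, value: str) -> str:
    """Return one operator:value term, quoting values that contain spaces."""
    if " " in value:
        value = f'"{value}"'
    return f"{operator}:{value}"


def search_url(*terms: str) -> str:
    """Join terms with spaces and URL-encode them into a search URL.

    The base URL is an assumption; swap it for the engine you are using.
    """
    query = " ".join(terms)
    return "https://www.google.com/search?q=" + quote_plus(query)


# example.com is a placeholder domain, not a real target.
terms = [dork("site", "example.com"), dork("filetype", "gnmap")]
print(" ".join(terms))
print(search_url(*terms))
```

Chaining a scope operator like site: with a content operator like filetype: or intext: is usually where the results get interesting, since one narrows where you look and the other narrows what you look for.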
These are amplified through the use of Boolean Logic:
AND (+)
OR (|)
NOT (-)
* (wildcard)
“” (precise phrase)
Examples:
“Peanut Butter” AND Jelly
“butter” AND jelly -peanut
(Peanut | Almond) Butter AND jelly
“Peanut Butter Sandwich” OR “Tuna Sandwich”
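The same examples can be built up programmatically. The following is a rough Python sketch, assuming the operator symbols listed above. The helper names (phrase, any_of, exclude, all_of) are made up for illustration and are not part of any library.

```python
def phrase(text: str) -> str:
    """Wrap text in double quotes so it is matched as a precise phrase."""
    return f'"{text}"'


def any_of(*terms: str) -> str:
    """Group alternatives with the OR operator (|)."""
    return "(" + " | ".join(terms) + ")"


def exclude(term: str) -> str:
    """Prefix a term with the NOT operator (-)."""
    return f"-{term}"


def all_of(*terms: str) -> str:
    """Join terms with an explicit AND."""
    return " AND ".join(terms)


# Rebuild two of the sandwich examples from the list above.
print(all_of(any_of("Peanut", "Almond") + " Butter", "jelly"))
print(all_of(phrase("butter"), "jelly") + " " + exclude("peanut"))
```

Building queries from small pieces like this makes it easier to swap one term at a time and see how the results change, which is most of the technique.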
Conclusion
This is the tip of the iceberg. There are many ways you can go about employing Dorking. Take a look at Exploit-DB’s GHDB to develop your own ways to chain dorks together to find what you’re looking for. As with anything related to OSINT and Intelligence, keep in mind that these operators could change at any time with little to no notice. Focus on developing the techniques before touching any tools. Tools may come and go, often faster than techniques become obsolete. If you want to learn more about search engine intelligence, consider taking The OSINTion’s Alternative and Advanced Search Engine Intelligence (AASEI) course.