Pagodo - Automate Google Hacking Database Scraping And Searching


News category: IT Security News
Source: feedproxy.google.com


The goal of this project is to develop a passive Google dork script that collects potentially vulnerable web pages and applications on the Internet. There are two parts: ghdb_scraper.py, which retrieves the Google dorks, and pagodo.py, which leverages the information gathered by ghdb_scraper.py.

What are Google Dorks?
The awesome folks at Offensive Security maintain the Google Hacking Database (GHDB) found here: https://www.exploit-db.com/google-hacking-database. It is a collection of Google searches, called dorks, that can be used to find potentially vulnerable boxes or other juicy info that is picked up by Google's search bots.

Installation
Scripts are written for Python 3.6+. Clone the git repository and install the requirements.
git clone https://github.com/opsdisk/pagodo.git
cd pagodo
virtualenv -p python3 .venv # If using a virtual environment.
source .venv/bin/activate # If using a virtual environment.
pip install -r requirements.txt

Google is blocking me!
If you start getting HTTP 503 errors, Google has rightfully detected you as a bot and will block your IP for a set period of time. The solution is to use proxychains and a bank of proxies to round robin the lookups.
Install proxychains4
apt install proxychains4 -y
Edit the /etc/proxychains4.conf configuration file to round robin the lookups through different proxy servers. In the example below, two different dynamic socks proxies have been set up with different local listening ports (9050 and 9051). Don't know how to utilize SSH and dynamic socks proxies? Do yourself a favor and pick up a copy of The Cyber Plumber's Handbook to learn all about Secure Shell (SSH) tunneling, port redirection, and bending traffic like a boss.
vim /etc/proxychains4.conf
round_robin
chain_len = 1
proxy_dns
remote_dns_subnet 224
tcp_read_time_out 15000
tcp_connect_time_out 8000
[ProxyList]
socks4 127.0.0.1 9050
socks4 127.0.0.1 9051
Throw proxychains4 in front of the Python script and each lookup will go through a different proxy (and thus source from a different IP). You could even tune down the -e delay time because you will be leveraging different proxy boxes.
proxychains4 python3 pagodo.py -g ALL_dorks.txt -s -e 17.0 -l 700 -j 1.1

ghdb_scraper.py
To start off, pagodo.py needs a list of all the current Google dorks. A date-timestamped file with all the Google dorks, plus files for the individual dork categories, are also provided in the repo. Fortunately, the entire database can be pulled back with one GET request using ghdb_scraper.py. You can dump all dorks to a file, the individual dork categories to separate dork files, or the entire JSON blob if you want more contextual data about each dork.
To retrieve all dorks:
python3 ghdb_scraper.py -j -s
To retrieve all dorks and write them to individual categories:
python3 ghdb_scraper.py -i
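Under the hood, the one-shot retrieval amounts to a single GET request returning a JSON blob of dorks, which then gets parsed into plain dork strings. A minimal parsing sketch follows; the `data`/`url_title` field layout here is an assumption for illustration, not copied from ghdb_scraper.py:

```python
def extract_dorks(ghdb_json):
    """Pull raw dork strings out of a GHDB-style JSON blob.

    Assumes a {"data": [{"url_title": "<a ...>dork</a>", ...}, ...]} layout;
    the real field names in the live response may differ.
    """
    dorks = []
    for entry in ghdb_json.get("data", []):
        title = entry.get("url_title", "")
        # Strip a wrapping <a ...>...</a> tag if present.
        if ">" in title and "</a>" in title:
            title = title.split(">", 1)[1].rsplit("</a>", 1)[0]
        if title:
            dorks.append(title)
    return dorks

# Example with a tiny in-memory blob instead of a live request:
sample = {"data": [{"url_title": '<a href="/ghdb/1">intitle:"index of" passwd</a>'}]}
print(extract_dorks(sample))  # ['intitle:"index of" passwd']
```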
Dork categories:
categories = {
    1: "Footholds",
    2: "File Containing Usernames",
    3: "Sensitive Directories",
    4: "Web Server Detection",
    5: "Vulnerable Files",
    6: "Vulnerable Servers",
    7: "Error Messages",
    8: "File Containing Juicy Info",
    9: "File Containing Passwords",
    10: "Sensitive Online Shopping Info",
    11: "Network or Vulnerability Data",
    12: "Pages Containing Login Portals",
    13: "Various Online Devices",
    14: "Advisories and Vulnerabilities",
}

pagodo.py
Now that a file with the most recent Google dorks exists, it can be fed into pagodo.py using the -g switch to start collecting potentially vulnerable public applications. pagodo.py leverages the google python library to search Google for sites with the Google dork, such as:
intitle:"ListMail Login" admin -demo  
The -d switch can be used to specify a domain and functions as the Google search operator:
site:example.com  
Performing ~4600 search requests to Google as fast as possible will simply not work. Google will rightfully detect it as a bot and block your IP for a set period of time. In order to make the search queries appear more human, a couple of enhancements have been made. A pull request was made and accepted by the maintainer of the Python google module to allow for User-Agent randomization in the Google search queries. This feature is available in 1.9.3 and allows you to randomize the different user agents used for each search. This emulates the different browsers used in a large corporate environment.
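Picking a different user agent per query can be approximated as below; the user-agent strings are illustrative placeholders, not the google module's actual list:

```python
import random

# A small pool of browser user agents. The google module ships a much
# larger list; these strings are placeholders for illustration only.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36",
    "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
]

def random_headers():
    """Return request headers with a user agent picked at random per query."""
    return {"User-Agent": random.choice(USER_AGENTS)}

print(random_headers()["User-Agent"] in USER_AGENTS)  # True
```

Rotating the user agent on every search makes the traffic resemble a mix of browsers rather than one scripted client.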
The second enhancement focuses on randomizing the time between search queries. A minimum delay is specified using the -e option and a jitter factor is used to add time on to the minimum delay number. A list of 50 jitter times is created and one is randomly appended to the minimum delay time for each Google dork search.
Later in the script, a random time is selected from the jitter array and added to the delay.
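The delay-plus-jitter logic described above looks roughly like this; the variable names and the exact jitter formula are assumptions for illustration, not copied from pagodo.py:

```python
import random

DELAY = 17.0          # minimum delay between queries (the -e option)
JITTER_FACTOR = 1.1   # scaling factor for jitter (the -j option)

# Build a list of 50 random jitter values derived from the minimum delay.
jitter = [DELAY * JITTER_FACTOR * random.random() for _ in range(50)]

def pause_time():
    """Minimum delay plus one randomly chosen jitter value."""
    return DELAY + random.choice(jitter)

# Every search waits at least DELAY seconds, plus some random extra time.
print(pause_time() >= DELAY)  # True
```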
Experiment with the values, but the defaults successfully worked without Google blocking my IP. Note that a full run could take a few days (3 on average), so be sure you have the time.
To run it:
python3 pagodo.py -g ALL_dorks.txt -s -e 17.0 -l 700 -j 1.1

Conclusion
Comments, suggestions, and improvements are always welcome. Be sure to follow @opsdisk on Twitter for the latest updates.

