10.3/3.1 How to Identify What Bots May Be Hitting Your WebTrac Server and Causing Numerous Writes to the SASession/WBMaster Table
Problem
How to identify what Bots may be hitting your WebTrac Server and causing numerous writes to the SASession/WBMaster Table.
Solution
Complete the following steps:
- Turn off the weblive (if RecTrac 3.1) or wsrtlive (if RecTrac 10.3) WebSpeed broker on your RecTrac Transaction Server in OpenEdge Explorer.
- Open IIS on the WebTrac Server and select the website.
- Go to the Logging icon and copy the path displayed in the "Directory:" field. It should look something like this: %SystemDrive%\inetpub\logs\LogFiles
- Open Windows Explorer and paste that path into the address bar.
- Open the folder whose modified date is today, sort its contents by Date Modified (in Details view), and locate today's log file.
- Open the file using Notepad.
- Find the IP address(es) of the entries identifying themselves as "bots" and use your firewall software to blacklist each IP by wildcarding the fourth octet of the IP address. For example: 128.128.192.*
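Scanning the log file for bot user-agents can be automated instead of done by eye in Notepad. The following is a minimal sketch in Python; the sample log lines and the exact field layout are illustrative assumptions (IIS W3C logs list their columns in the #Fields: header, so the code reads the layout from there):

```python
# Sketch: scan an IIS W3C log for "bot" user-agents and tally the client IPs.
# The sample log text below is hypothetical; real logs may have more fields,
# which is why the column positions are read from the #Fields: header line.
from collections import Counter

SAMPLE_LOG = """#Software: Microsoft Internet Information Services 8.5
#Fields: date time c-ip cs-method cs-uri-stem cs(User-Agent) sc-status
2016-01-05 10:00:01 128.128.192.14 GET /wbwsc/webtrac.wsc/search.html Mozilla/5.0+(compatible;+SemrushBot/1.0) 200
2016-01-05 10:00:02 10.0.0.5 GET /wbwsc/webtrac.wsc/splash.html Mozilla/5.0+(Windows+NT+6.1) 200
2016-01-05 10:00:03 128.128.192.77 GET /wbwsc/webtrac.wsc/search.html Mozilla/5.0+(compatible;+bingbot/2.0) 200
"""

def bot_ips(log_text):
    """Return a Counter of client IPs whose user-agent contains 'bot'."""
    fields = []
    hits = Counter()
    for line in log_text.splitlines():
        if line.startswith("#Fields:"):
            fields = line.split()[1:]   # column names for the data rows
            continue
        if line.startswith("#") or not line.strip():
            continue                    # skip other header/blank lines
        row = dict(zip(fields, line.split()))
        ua = row.get("cs(User-Agent)", "").lower()
        if "bot" in ua:
            hits[row["c-ip"]] += 1
    return hits

print(bot_ips(SAMPLE_LOG))
```

The IPs it reports are candidates for the firewall blacklist step above; review the user-agents before blocking, since some legitimate traffic may also contain "bot" in its user-agent string.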
Once the entries have been made in your firewall software:
- If RecTrac 10.3, then stop the rtliveTIMER and rtliveEVENT in OpenEdge Explorer, and then delete the webwack.www in <x>:\vsi\logs. Restart rtliveTIMER and rtliveEVENT.
- If RecTrac 3.1, then restart the EVENTLive in OpenEdge Explorer.
- Remain in OpenEdge Explorer and start the weblive (if 3.1)/wsrtlive (if 10.3) WebSpeed broker.
Put a robots.txt file at the root level of your WebTrac site:
- <x>:\vsi3\rectrac\webserver\web, if RecTrac 3.1.
- <x>:\vsi\webtrac103\, if RecTrac 10.3.
If RecTrac 10.3, then this should be the contents of your robots.txt file:
# Every bot that might possibly read and respect this file.
User-agent: Googlebot
User-agent: Bingbot
User-agent: Slurp
User-agent: msnbot
User-agent: Mediapartners-Google*
User-agent: Googlebot-Image
User-agent: Yahoo-MMCrawler
Allow: /
# Crawl delay for any bots that may listen.
Crawl-delay: 15
#Semrush is not very friendly to our pages, so we disallow them explicitly in hopes they will listen.
User-agent: SemrushBot-SA
Disallow: /
User-agent: SemrushBot
Disallow: /
# Disallow access to site for all other bots than those listed above.
User-agent: *
Disallow: /
If RecTrac 3.1, then this should be the contents of your robots.txt file:
# Every bot that might possibly read and respect this file.
User-agent: Googlebot
User-agent: Bingbot
User-agent: Slurp
User-agent: msnbot
User-agent: Mediapartners-Google*
User-agent: Googlebot-Image
User-agent: Yahoo-MMCrawler
Allow: /
# Crawl delay for any bots that may listen.
Crawl-delay: 15
Disallow: /account.html
Disallow: /addtocart.html
Disallow: /autodebit.html
Disallow: /billingupdate.html
Disallow: /cart.html
Disallow: /checkout.html
Disallow: /confirmation.html
Disallow: /credit.html
Disallow: /documents.html
Disallow: /evaluation.html
Disallow: /history.html
Disallow: /household.html
Disallow: /login.html
Disallow: /logout.html
Disallow: /manualentry.html
Disallow: /paymenterror.html
Disallow: /paymentsuccess.html
Disallow: /questioninfo.html
Disallow: /renewal.html
Disallow: /report.html
Disallow: /reprint.html
Disallow: /sessioncheck.html
Disallow: /singlesignon.html
Disallow: /teetimecancel.html
Disallow: /unsubscribe.html
Disallow: /wishlist.html
Disallow: /index.html
Disallow: /js
Disallow: /css
Disallow: /GUI
Disallow: /gui
#Semrush is not very friendly to our pages, so we disallow them explicitly in hopes they will listen.
User-agent: SemrushBot-SA
Disallow: /
User-agent: SemrushBot
Disallow: /
# Disallow access to site for all other bots than those listed above.
User-agent: *
Disallow: /
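After placing the file, you can sanity-check that the rules behave as intended with Python's standard urllib.robotparser module. The sketch below embeds the 10.3 file from above as an example (the same approach works for the 3.1 file); the bot names tested are taken from the file itself:

```python
# Sketch: verify the robots.txt rules using the standard library parser.
from urllib.robotparser import RobotFileParser

# Contents of the 10.3 robots.txt shown above.
ROBOTS_TXT = """\
# Every bot that might possibly read and respect this file.
User-agent: Googlebot
User-agent: Bingbot
User-agent: Slurp
User-agent: msnbot
User-agent: Mediapartners-Google*
User-agent: Googlebot-Image
User-agent: Yahoo-MMCrawler
Allow: /
# Crawl delay for any bots that may listen.
Crawl-delay: 15
#Semrush is not very friendly to our pages, so we disallow them explicitly.
User-agent: SemrushBot-SA
Disallow: /
User-agent: SemrushBot
Disallow: /
# Disallow access to site for all other bots than those listed above.
User-agent: *
Disallow: /
"""

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

print(rp.can_fetch("Googlebot", "/"))     # listed bots may crawl
print(rp.can_fetch("SemrushBot", "/"))    # Semrush is disallowed
print(rp.can_fetch("SomeOtherBot", "/"))  # all other bots are disallowed
print(rp.crawl_delay("Googlebot"))        # the requested crawl delay
```

Keep in mind that robots.txt is advisory: well-behaved crawlers honor it, but the firewall blacklist above remains the enforcement mechanism for bots that ignore it.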