Google and Yahoo Robot IPs
Apr 28, 2009
Do you know which search engine robot IPs are the most important?
I need to add all search engine robot IPs to my firewall whitelist.
I found myself writing some filters to monitor spider activity across various sites.  Unfortunately, it appears that an increasing number of time wasters are setting their User-Agent to Googlebot and the like.
Googlebot is pretty easy to filter: after some research, it appears that any legit Googlebot IP has a valid .googlebot.com PTR record that resolves back to the original IP.  It does not appear to be as straightforward with other spiders, with msnbot topping the hilarity chart with IPs resolving to meaningless .phx.gbl names.
Is anyone aware of any recommendations or trustworthy references that can be used as guidelines to reliably tell legitimate spider requests from fakes?
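The Googlebot check described above (look up the PTR record, then resolve that name forward and require that it returns the original IP) is commonly called forward-confirmed reverse DNS. Below is a minimal Python sketch of it, not a definitive implementation. The function name `is_verified_crawler` and the injectable `reverse`/`forward` parameters are made up for illustration, and the only trusted suffix is the `.googlebot.com` one mentioned in the post. Suffixes for other spiders would have to come from each engine's own documentation.

```python
import socket
from typing import Callable, List, Sequence


def default_reverse(ip: str) -> str:
    """Return the PTR hostname for ip (raises OSError if there is none)."""
    return socket.gethostbyaddr(ip)[0]


def default_forward(hostname: str) -> List[str]:
    """Return the IPv4 addresses the hostname resolves to (IPv4 only)."""
    return socket.gethostbyname_ex(hostname)[2]


def is_verified_crawler(
    ip: str,
    allowed_suffixes: Sequence[str] = (".googlebot.com",),  # assumption: trust only this suffix
    reverse: Callable[[str], str] = default_reverse,
    forward: Callable[[str], List[str]] = default_forward,
) -> bool:
    """Forward-confirmed reverse DNS check for a client IP."""
    try:
        hostname = reverse(ip).rstrip(".").lower()
    except OSError:
        return False  # no PTR record at all
    # The suffix must end the name; "x.googlebot.com.evil.example" fails here.
    if not hostname.endswith(tuple(allowed_suffixes)):
        return False
    try:
        # The PTR name must resolve back to the IP that made the request.
        return ip in forward(hostname)
    except OSError:
        return False
```

Calling `is_verified_crawler(client_ip)` performs two live DNS lookups, so results are worth caching per IP. The resolver parameters exist only so the logic can be exercised with fake lookup tables instead of the network.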
OK, I am wondering how so many hosting companies can offer more in credits than the hosting fee itself. They must be getting the credits really cheap, or for free, in return for referring customers to PPC. My question is: how can a company with a registered trademark go about contacting Google, Yahoo and the like to arrange this, and does anyone know where I can find the information?
I noticed in the apache error logs that Yahoo and Google (mostly Yahoo) have been crawling one of my clients' sites for warez, cracks and serials, but these files don't exist (and hence they're ending up with 404 errors in the error logs).
The IPs that are doing this are legit (as far as I can tell)... 74.6.20.32, 74.6.74.50, 66.249.65.178
It's systematically hitting the server once every second or two, looking for files such as "free.serial.no.of.flash.professional.8.jsp", "trainer.SimCity.4.million.jsp", "Paragon.Partition.Manager.v.7.crack.jsp"
and this has been happening for at least a day now.
Now here's the thing.  The site went through a redesign about 4 months ago, and the previous site was running an old version of PHP Nuke that was filled with spam (from trackbacks), and a photo gallery manager which was probably not installed securely.  The new site utilizes Joomla, and all the previous elements are gone (in fact the site was migrated to my servers after the redesign, so my servers never hosted their old PHP Nuke or photo gallery).
These mysterious crawls from Yahoo and Google all seem to hit the photo gallery URLs, such as http://www.domain.com/albums/album01...v3.1.patch.jsp
The other weird part is, all the links end with .jsp.  The site's previous server did not have JSP support, and neither does mine.  
Also before we moved the site over to my server, I archived their entire website on their old server, and none of these files exist there.
Doing a search on Google on their site (site:www.domain.com) doesn't show any of these files.  Searching Google for links that point to the site doesn't show anything out of the ordinary either.
This has me baffled... Does anyone have any idea what is going on?
I cannot figure this out.. I have tried EVERYTHING.. 
I am running a php script using the mail() function and sending an email..
I have had reverse DNS pointed to the domain
I set an SPF record
My IP is not blacklisted.. I have also had the dedicated server for 2 years now
I modified a few things in the sendmail files.. I am stuck..
I am running FreeBSD.. My buddy has his server set up with all of the same sendmail settings.. and his emails don't get flagged..
I'm having an issue with my current robots.txt file, which is not blocking access to the content properly. What I want is to allow only bots such as Google, Yahoo, MSN, Bing and Alexa ranking, and to block all other bots. My current robots.txt file is below:
Code:
User-agent: Googlebot
Allow: / 
User-agent: googlebot-image
Allow: / 
User-agent: googlebot-mobile
Allow: / 
User-agent: MSNBot
Allow: / 
User-agent: Slurp
Allow: /  
User-agent: Teoma
Allow: / 
User-agent: twiceler
Allow: / 
User-agent: Gigabot
Allow: /  
User-agent: Scrubby
Allow: / 
User-agent: Robozilla
Allow: / 
User-agent: Nutch
Allow: / 
User-agent: ia_archiver
Allow: /  
User-agent: baiduspider
Allow: / 
User-agent: naverbot
Allow: / 
User-agent: yeti
Allow: / 
User-agent: yahoo-mmcrawler
Allow: / 
User-agent: psbot
Allow: /  
User-agent: asterias
Disallow: 
User-agent: yahoo-blogs
Allow: /
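One thing worth checking here: a file made up only of `Allow: /` groups never tells any other bot to stay out, because a bot with no matching group is allowed by default. A catch-all `User-agent: *` group with `Disallow: /` is what does the blocking, and only for crawlers that honour robots.txt at all. Below is a rough sketch that tests such a file offline with Python's standard `urllib.robotparser`. The shortened bot list, the `SomeOtherBot` name and the `www.example.com` URL are placeholders, not a recommendation.

```python
from urllib.robotparser import RobotFileParser

# Allow-list layout: named bots share one group, everyone else hits the
# catch-all group at the bottom.
ALLOW_LIST_ROBOTS = """\
User-agent: Googlebot
User-agent: Slurp
User-agent: MSNBot
Allow: /

User-agent: *
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ALLOW_LIST_ROBOTS)
for agent in ("Googlebot", "Slurp", "SomeOtherBot"):
    allowed = parser.can_fetch(agent, "http://www.example.com/page.html")
    print(agent, "allowed" if allowed else "blocked")
```

With this layout the two named bots should come out as allowed and the unlisted one as blocked. Badly behaved bots ignore the file entirely, so anything that must be enforced still needs a server-side rule.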
Googlebot is eating up about 30GB bandwidth per month.
I want to prevent him (and other robots) from spidering:
[url]
and all derivatives thereof (e.g. [url])
Is this possible with robots.txt?  How would the code look?
Will this work?
User-agent: *
Disallow: /directory1/page1.php
or do I need to somehow specify a wildcard, e.g.
User-agent: *
Disallow: /directory1/page1.php?*
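As far as I know, `Disallow` values are matched as path prefixes, so the first form should already cover the same page with any query string appended. The `*` wildcard is an extension that the big engines understand, but not every parser does. Python's standard `urllib.robotparser` does plain prefix matching, which makes it handy for a quick offline check. In this sketch the `id=5` query string, the `page2.php` path and the host name are invented for the test.

```python
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Disallow: /directory1/page1.php
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())


def blocked(path: str, agent: str = "Googlebot") -> bool:
    """True if the parsed rules forbid the agent from fetching the path."""
    return not rp.can_fetch(agent, "http://www.example.com" + path)


print(blocked("/directory1/page1.php"))       # the page itself
print(blocked("/directory1/page1.php?id=5"))  # a query-string derivative
print(blocked("/directory1/page2.php"))       # a different page
```

The first two calls should report the URL as blocked and the third as not blocked, which suggests the plain prefix rule is enough for "all derivatives" of the page.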
1 Do I really need a robots.txt file?
2 Don't misbehaved spiders simply ignore them?
3 For 'disallow', shouldn't I only include urls which are linked from public pages - and not those which I use for testing and which aren't linked-to from any public pages?
4 If I include such urls in 'disallow', aren't I simply alerting spiders (and anyone else who wants to see what sections of my server I don't want known) to stuff they'd otherwise not discover?
I use Outpost Firewall to view active connections to my server. If I don't restart the httpd service on a regular basis my server will grind to a halt from being flooded by robots. 
I currently have the service set up to restart at Midnight and Noon every day. Sometimes that's enough, lately it's not. For example, I checked an hour ago and I had 385 connections to httpd. At least 50% of the connections were robots - tons of the same IP addresses and they're just crawling the site.
Almost all of the connections show less than 1 KB received and 0 bytes sent per connection.
I already have a good 20 connections by these robots and the connection time shows as 11 minutes... I just browsed to a web gallery page on my site figuring that'd be mildly "intensive" on connections with all the thumbnails and my connections aren't lasting more than one minute.
So, what's with all these connections that are lasting 10+ minutes? I've even got one connection that has an Uptime of 30 minutes, bytes sent 65811, bytes received 180. It seems like something with these robots doesn't terminate correctly...
What can I do so these connections quit jamming my server up? It's like a very, very slow DoS...
I hope some of you are using Google Apps and can help me to find an answer to the following question:
I own two different and independent domain names (e.g. domain1.com and domain2.com).
I'd like to use the Google Apps (Standard, free edition) with them to create two different and totally independent mailboxes (e.g. abc@domain1.com and xyz@domain2.com).
But how many Google accounts do I need to do this? Can I manage two (or more) independent and fully functional domains using one Google account?
P.S.
The Help section describes aliases for multiple domains, but those are just pointers or shortcuts, not fully functional mailboxes, so that solution isn't what I'm looking for.
There is probably a simple explanation for this, but in our Google Analytics stats one of the most popular pages is 
/?wcw=google
Can anyone explain exactly what this is?
For several months the Yahoo bot had been limited to one visit every 40 seconds by a robots.txt file. Today that is no longer working, and the bot is generating several page impressions per minute.
Has anyone seen any new advice on how to control this pesky creature?
This is the file content I have been using:
User-agent: Slurp
Crawl-delay: 40
User-agent: *
Disallow:
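`Crawl-delay` is a non-standard directive, so whether a crawler honours it is entirely up to that crawler. What can be checked locally is that the file parses the way it was meant to. Here is a small sketch with Python's standard `urllib.robotparser` (its `crawl_delay()` method needs Python 3.6 or later), using the file content quoted above. The `www.example.com` URL is a placeholder.

```python
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: Slurp
Crawl-delay: 40
User-agent: *
Disallow:
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# Delay requested from Slurp, and from a bot that only matches the "*" group.
slurp_delay = rp.crawl_delay("Slurp")
other_delay = rp.crawl_delay("Googlebot")
print("Slurp:", slurp_delay, "other bots:", other_delay)

# The empty Disallow line means nothing is off limits.
print(rp.can_fetch("Slurp", "http://www.example.com/"))
```

If the parsed delay for Slurp is the expected 40 and the crawler still comes faster, the file itself is probably not the problem. The delay is usually applied per crawler host, so many crawler IPs working in parallel can add up to several hits per minute.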
My server is being pounded by Yahoo bots, close to 500 or more at a time. They are causing my load to skyrocket at various times of the day, crashing the MySQL server, etc. They have been on my site for about 4 days now, and I just don't see why that many bots are needed to crawl a single site for several days.
Is there any way around this without banning their ips? Any way to limit them?
I want to preview a file in my browser, but so far it's not possible. I am using Yahoo SiteBuilder 2.5, and when I try to preview the website an error appears saying that I haven't saved my site in the right place. When I do that, it says that I still haven't saved my site in the right place. I spent about a week of work on it, and I also can't publish it. Is there any way to see the HTML code or do something else?
I have this strange problem with SmarterMail: it can send perfectly to Hotmail and Gmail, but when it tries sending to Yahoo accounts there is always a delay and the email ends up in the bulk folder.
Such problems do not occur with my qmail or IceWarp Merak.  Any idea how to solve this?  I have already emailed Yahoo hoping to get my IP whitelisted.  Besides installing DomainKeys, are there any other ways?
Has anyone ever used Yahoo to host their website?  I have heard some good and more bad comments about them.  I am a rookie at this so my site is going to be pretty basic.  How do they rank vs. other hosting companies?
Yahoo has an “easy site creator” tool to make a site with a template like process. Do sites that use these rank as well in search engines as sites built without a template?
Lately Yahoo has made improvements to their hosting plan, and I want to choose the better of Yahoo and Bluehost. Could anyone give me an opinion?
Yahoo is offering unlimited bandwidth and space at a very low cost, around $50 (2000 INR in our currency). Is it reliable or not?
I ask because I have a wallpaper site that requires some server resources, but not that many.
Any suggestions? Should I take this one?
If I'm not mistaken we have reverse DNS setup and no blacklist entries. Is there any reason why welcome emails from our custom CP are being filtered directly into the Yahoo SPAM box?
Here are the full headers from my personal email. This is the exact message a client would be sent.
From HostVentrilo.com Sun Apr 20 20:38:17 2008
Return-Path: <nobody@web.teamspeakhost.com>
Authentication-Results: mta105.rog.mail.scd.yahoo.com  from=hostventrilo.com; domainkeys=neutral (no sig)
Received: from 69.93.229.114  (EHLO web.teamspeakhost.com) (69.93.229.114)
  by mta105.rog.mail.scd.yahoo.com with SMTP; Sun, 20 Apr 2008 20:38:18 -0700
Received: from nobody by web.teamspeakhost.com with local (Exim 4.68)
(envelope-from <nobody@web.teamspeakhost.com>)
id 1JnmrB-0007AT-P3
for xxxxx@rogers.com; Sun, 20 Apr 2008 23:38:17 -0400
Received: from phpmailer ([67.204.23.77]) 
by www.hostventrilo.com with HTTP (PHPMailer);
Sun, 20 Apr 2008 23:38:17 -0400
Date: Sun, 20 Apr 2008 23:38:17 -0400
To: Jeff Piper <xxxxx@rogers.com>
From: "HostVentrilo.com" <info@hostventrilo.com>
Subject: Ventrilo Server Information
Message-ID: <edee9a956937977bcb723593ae6938a5@www.hostventrilo.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset="utf-8"
Content-Length: 567
My only guess would be the PHP programming of the mailer code?
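Two things in those headers can be checked mechanically: Yahoo reports `domainkeys=neutral (no sig)`, and the envelope sender domain in Return-Path is not the same as the From domain. Whether either of these is what sends the message to the spam box is a guess, not something the headers prove. The sketch below uses Python's standard `email` package on a trimmed copy of the headers above. The helper name `domain_of` is made up for illustration.

```python
from email import message_from_string
from email.utils import parseaddr

# Trimmed copy of the headers quoted above, followed by an empty body.
RAW_HEADERS = """\
Return-Path: <nobody@web.teamspeakhost.com>
Authentication-Results: mta105.rog.mail.scd.yahoo.com  from=hostventrilo.com; domainkeys=neutral (no sig)
From: "HostVentrilo.com" <info@hostventrilo.com>
Subject: Ventrilo Server Information

"""


def domain_of(header_value: str) -> str:
    """Return the lower-cased domain part of the address in a header value."""
    address = parseaddr(header_value)[1]
    return address.rpartition("@")[2].lower()


msg = message_from_string(RAW_HEADERS)
envelope_domain = domain_of(msg["Return-Path"])
from_domain = domain_of(msg["From"])
unsigned = "domainkeys=neutral" in msg["Authentication-Results"]

print("envelope sender domain:", envelope_domain)
print("From domain:           ", from_domain)
print("domains match:         ", envelope_domain == from_domain)
print("no DomainKeys signature:", unsigned)
```

If the mismatch matters, the usual place to look is how the mailer sets the envelope sender (here it is the web server's `nobody` user) rather than the message body.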
Some of the emails I send to clients who use Yahoo's email end up in their "Bulk" folder, so Yahoo is treating me as a spammer although I'm not. Do you know how I can fix it? Do I need to contact Yahoo about this matter?
Nowadays sending mail to Hotmail and Yahoo is everyone's headache, I guess. I am having the same problem.

I added the DomainKey to my domains, but when I send mail to Yahoo it drops into spam.

I haven't tried it with Hotmail, but the reason for landing in spam at Yahoo is probably the missing SPF record.

I am having a bit of trouble setting up SPF. I read some documents and my mind is messed up; I can't figure out what to do.

My server's IP address is: xx.xx.xx.xx
My server's hostname is: server.x.com

I have domains like y.com, z.com, f.com and so on that need to send mail to Hotmail.

Those sites use different IPs, but each IP is shared by more than one site.

I read that I need to set up reverse DNS, but one article says I should do it for my hostname, and another says for the mail server: mail.x.com.

So, first question: how should I set up reverse DNS?

Second question: it says I need to add an SPF record for my domain. Does that mean my hostname, or the domain I want to send mail from?

v=spf1 a mx ptr ~all
And then I should fill in the form at https://support.msn.com/eform.aspx?p...rid&ct=eformts

and then wait.

So I will be glad if someone can answer my questions and tell me if there are any missing parts.
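On the second question: SPF is evaluated against the domain in the envelope sender (the MAIL FROM address), so each domain that sends mail needs its own TXT record listing the IPs allowed to send for it. Below is a minimal sketch that builds such records in Python. The function name `spf_record` and the `192.0.2.x` addresses are placeholders (the real IPs are masked in the post), and the `~all` ending is simply carried over from the record quoted above.

```python
import ipaddress
from typing import Dict, Iterable, List


def spf_record(sending_ips: Iterable[str], policy: str = "~all") -> str:
    """Build an SPF TXT value that authorises the given sending IPs."""
    mechanisms: List[str] = []
    for raw in sending_ips:
        ip = ipaddress.ip_address(raw)  # raises ValueError on a malformed IP
        prefix = "ip4" if ip.version == 4 else "ip6"
        mechanisms.append(f"{prefix}:{ip}")
    return " ".join(["v=spf1", *mechanisms, policy])


# Hypothetical mapping of sending domain -> IPs its mail leaves from.
domain_ips: Dict[str, List[str]] = {
    "y.com": ["192.0.2.10"],
    "z.com": ["192.0.2.10", "192.0.2.11"],
}

records = {domain: spf_record(ips) for domain, ips in domain_ips.items()}
for domain, txt in records.items():
    print(f'{domain}. IN TXT "{txt}"')
```

Listing the IPs explicitly is one option among several. The `a` and `mx` mechanisms from the original record also work when the sending IP really is the domain's A or MX host.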
I am looking for an expert who can help me configure my dedicated server to ensure my email gets delivered to Yahoo.
I have an autoresponder script hosted on the server. The script was previously hosted with another host and I got excellent deliverability to Yahoo, Gmail, Hotmail and everyone else.
Unfortunately the volume of mail was too large for that host, so now I have a dedicated server. Since I moved to the dedicated server (two weeks ago), my mail does not get delivered to Yahoo. It doesn't even go into the junk mail folder. It just disappears.
I am very strict about using confirmed opt-in for my lists, so it's not a blacklist issue. Right from day one on the new server, my mail did not reach Yahoo. So it must be the way my server is set up.
If there is someone who knows how to resolve this, I am willing to pay you to solve the problem for me. Or, if there is anyone who can give me some advice, that would be much appreciated.
I'm having a problem with Yahoo web hosting. None of the .php pages are working; they all give a "403 Forbidden" error.
I can't even access my domain's web hosting control panel.
I called Yahoo Customer Service and was told it will take 2-3 days to fix. Sad to say, my website has been down for almost 1 week.
Can anyone tell me what's really going on with Yahoo web hosting? Do they care about their customers?
I have noticed that Yahoo is refusing emails coming from the IP of my server. I have written them and I got this email, has this happened to you too? Any recommendations? ...
I have a problem with Yahoo deferring email from my servers for weeks!
I've submitted to [url]
from
[url]
MANY TIMES! This is getting me frustrated. What's more, I tried changing my SMTP to a new server (a fresh IP within the same network in the same DC) and it is ALSO deferred. How do I check whether this has something to do with a whole block of IPs being listed in some spam engine or something?
How can I ban Yahoo! Slurp and its IPs using .htaccess?
Like so many others I am having the problem in which Hotmail and Yahoo are rejecting emails being sent from my server. I recently changed servers and this is most likely the reason.
While Hotmail hasn't been fixed, they have responded swiftly, usually within 6 hours. 
I need to contact Yahoo about it but I can't find any information or forms to fill out. Could you please direct me to the correct URL - their site is a complete maze.
Unfortunately Yahoo blocked our server IP, so our customers can't send any email to Yahoo!
How can we fix this issue? Does Yahoo have an email address or ticket system for contacting them about it?
business is just getting out of hand. I applied several weeks ago for Whitelist status, and my issues finally went away for a little over a week (though I never received a response to my Postmaster requests). But then today -bam- 100% deferrals for going on 18 hours now, not a single message has gone through. And naturally no two Yahoo servers give me the same error message.
So...
At this point I'm ready to contract out my Outbound mail to Yahoo through a whitelisted 3rd party until I can get this resolved on my end. Would this be reasonable? Is anyone else doing this? I worked with an outsourced SMTP provider in another life for an internal company mailing list with good success.
It seems that Yahoo does web hosting, and they offer WordPress as one of their one-click install kinds of things. Given the absolutely absurd responsiveness of yahoo.com, I was curious if anyone actually used them as a webhost?
Their pricing is decent, but I'm wondering about performance? I'm currently with Dreamhost, and I'm okay with them, except that WP performance isn't great (not awful, though, either), despite having wp-cache installed and operational.
So how good is Yahoo's small business webhosting? Their TOS seems pretty generous as well.
[url]
if there is much i can do to get the mail that gets sent from my server to show up in the inbox of users using a free mail service.