Googlebot has been absolutely ripping through my bandwidth. This has been ongoing for many months now, and each month the damage has gotten worse. I have posted about this problem here but with no luck:
I recently found out that the Google robot spidered my site and used up 5.74 GB in bandwidth! The hits were 4498 and I even have it scheduled to return every 2 weeks.
Now my site is down for the rest of the month unless I up my bandwidth. This is the second time this has happened to me in the past year. What is going on?
So I checked Google Webmaster Tools, which now has a nice graph saying that Googlebot has hit my site 30 times/day for the last month or so. HOWEVER, when I look at my logs (I log Googlebot) I have over 17000 page views in the last three days. I am already using the user agent to strip images and some of the more expensive queries which are too old to be in the cache (Googlebot looks at pages on my site which are very old).
What other things should I be doing...? I appreciate that Google is indexing my site, but this is more of a scrape than an index. Just for this month alone, looking at my stats, I see the following:
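One way to cross-check the Webmaster Tools graph against the raw logs is to tally Googlebot hits and bytes per day. This is only a rough sketch, assuming the server writes the Apache "combined" log format; the function name `googlebot_usage` and the regular expression are made up for illustration and would need adjusting for any other log layout.

```python
import re
from collections import defaultdict

# Assumed layout (Apache "combined" format):
# host ident user [day:time zone] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def googlebot_usage(lines):
    """Return {day: [hits, bytes]} for requests whose user agent mentions Googlebot."""
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or "googlebot" not in match.group("agent").lower():
            continue  # unparseable line or a different client
        size = match.group("size")
        day = match.group("day")
        totals[day][0] += 1
        totals[day][1] += 0 if size == "-" else int(size)
    return dict(totals)
```

Feeding it an open log file (`googlebot_usage(open(path))`, with `path` pointing at your own access log) gives per-day totals to compare with the graph. Note that the user agent string can be faked, so a large count here does not prove the traffic really came from Google.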
I am just wondering whether my idea will work for the google search engine.
Basically, I have my official VPS root (home/admin/public_html/) and this is where my main website will be hosted. However, since my VPS will be used for additional websites, I will direct additional domains to it.
My second site hosted on this VPS may have a document root of (home/admin/public_html/advertising/), and my domain will then be set up to have that as its document root.
However, when Google crawls my second site (e.g. advertising.com), will it go 'below' the domain root? For example, will it also crawl the files under /public_html/ for this domain, even though the domain's root is /public_html/advertising/?
edit: Or do people host multiple sites differently? Is this an appropriate method?
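To make the question concrete, here is a toy model of how a web server maps a host name plus URL path onto a file. It is only a sketch under assumptions: `main-site.example` is a made-up name for the main domain, `resolve` is a hypothetical helper, and a real server such as Apache does this internally with more rules than shown here.

```python
import posixpath

# Hypothetical virtual-host table; "main-site.example" stands in for the main domain.
DOC_ROOTS = {
    "main-site.example": "/home/admin/public_html",
    "advertising.com": "/home/admin/public_html/advertising",
}


def resolve(host, url_path):
    """Map a request to a file path, never climbing above the host's document root."""
    root = DOC_ROOTS[host]
    # normpath collapses ".." segments; at "/" there is nothing above to climb to.
    clean = posixpath.normpath("/" + url_path.lstrip("/"))
    return root + clean


# A crawler asking advertising.com for "/../index.html" still lands inside its root:
print(resolve("advertising.com", "/../index.html"))
# The same file is, however, reachable through the main domain as a subdirectory:
print(resolve("main-site.example", "/advertising/index.html"))
```

In this model a crawler only ever sees URLs, so it cannot reach above a domain's document root. The reverse is what deserves attention: everything under /advertising/ is also served by the main domain, so the same pages could end up indexed under two addresses.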
Yes, it’s quite true. DreamHost representatives are asking their clients to block Googlebot through the .htaccess file, because their websites were “hammered by GoogleBot”.
DreamHost representatives are also guiding their clients to make all their websites “unsearchable and uncrawlable by search engine robots”, because they cause “high memory usage and load on the server”.
PS: This is NOT a rumor. Multiple people are already complaining about DreamHost asking clients to block search engine spiders. A post in QuickOnlineTips confirms this too.
If your server is blocking Googlebot from finding your robots.txt file, how do you configure your firewall to unblock it?
I've searched through Google and I've seen many people just say your firewall is blocking it, but none mention how to actually stop it from doing that. Does Google have an IP it uses, and if so, what is the IP you should whitelist for your server?
I keep getting the message "Network unreachable: robots.txt unreachable" and I'm sure it's due to a firewall issue; I just have no idea how to fix it.
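On the "which IP" part: rather than trusting one fixed address, Google's documented way to confirm that a visitor is Googlebot is a reverse DNS lookup on the IP followed by a forward lookup on the resulting host name. Below is a minimal sketch of that check, useful for deciding which of the blocked addresses in a firewall log are worth whitelisting. The function name `is_googlebot` and the `reverse`/`forward` parameters are my own additions (the latter so the lookups can be swapped out for testing); by default it uses the standard `socket` module.

```python
import socket


def is_googlebot(ip, reverse=None, forward=None):
    """Reverse-resolve ip, check the domain, then forward-resolve back to the same ip."""
    reverse = reverse or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward = forward or (lambda name: socket.gethostbyname_ex(name)[2])
    try:
        host = reverse(ip)
        # Genuine crawler host names end in googlebot.com or google.com.
        if not host.endswith((".googlebot.com", ".google.com")):
            return False
        # The forward lookup must return the original address, or the PTR record is spoofed.
        return ip in forward(host)
    except OSError:
        # Covers socket.herror / socket.gaierror when a lookup fails.
        return False
```

Calling `is_googlebot(address)` on an IP taken from the firewall log does the live lookups. This does not fix the firewall by itself; it only tells you whether a dropped connection was the real crawler.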
One of my customers uses Webmaster Tools from Google. Looking at which pages Googlebot has indexed, he found that 180 pages are giving a "DNS lookup timeout" error. I tried searching for help on Google and the only thing I found is "We received a timeout on DNS lookup."
DNS is OK, as is the zone file, and everything is responding OK. I don't know what the issue can be. Any ideas?
If I type google.com in my address bar, it forwards me to www.google.com. This is not happening for my website right now. I think it's a good idea to do this, since then search engines will have only one main URL for the website to index.
My question is:
How do I implement this? I think this may involve mucking with CNAME settings...
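For what it's worth, this kind of forwarding is normally an HTTP redirect issued by the web server rather than something a CNAME record does by itself. The logic is small enough to sketch. This is an illustration only, assuming the canonical name is `www.example.com` (a placeholder); `canonical_redirect` is a hypothetical helper, and in practice the same rule usually lives in the server configuration.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_redirect(url, canonical_host="www.example.com"):
    """Return the URL to 301-redirect to, or None if no redirect is needed."""
    parts = urlsplit(url)
    host = (parts.hostname or "").rstrip(".")
    # The bare domain is the canonical host without its "www." prefix.
    bare = canonical_host[4:] if canonical_host.startswith("www.") else canonical_host
    if host != bare:
        return None  # already canonical, or not one of our host names
    netloc = canonical_host if parts.port is None else f"{canonical_host}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path or "/", parts.query, parts.fragment))


print(canonical_redirect("http://example.com/page?x=1"))
print(canonical_redirect("http://www.example.com/"))
```

The important design choice is that the path and query string are carried over, so every old URL maps to exactly one www URL instead of everything landing on the home page.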
I use Ian Lloyd's book and that's where I found out about this forum. Looks like a great forum.
I downloaded FileZilla FTP and I'm trying to transfer files from my computer onto an Angelfire web site.
FileZilla asks for a server address and I put in the URL that I registered with Angelfire. It then asks me for an administration password, and I put in my password to the Angelfire site. I keep getting: Error: Connection to server lost...
Does anyone know what I'm doing wrong here? I would like to use FileZilla to upload my files (web pages) to the Angelfire site.
I have a website which is currently hosted with streamline.net on their shared msql 11 server.
We have had several issues with them over the last few weeks where someone is using most of the server and slowing everyone else's sites down so much that they crash. This week and weekend are my busiest time of the year (I sell fancy dress) and my site is totally unusable.
We have phoned them and they have done nothing except ask us for a log, which we have provided for short periods of time.
The downtime has now got so bad that I have had only 2 sales today. I estimate I am losing approx 400 per day at the moment due to this problem.
Is there anything that I can do urgently to prevent my business from being killed by someone else?
So I'm interviewing with a company and when I typed in the URL to their website, I was met with a nasty surprise: a "hacked by so and so" message! However, after looking closer, I see that I had accidentally appended a period (".") to the end of the domain name, for example: http://www.example.com./
When I removed the period, the site appeared as normal. I don't know anything about the server other than it's IIS. Is there anything I can suggest to them when I go in to interview? I'd like to point this out to them; it may even help my chances at landing the job! (It's not related to networking, though.)
Is it possible to buy a dedicated server from, e.g., Dell and host your own website on it from home, i.e. with a www prefix to the URL? Do you know any good tutorial on it? Would 20 Mbit of bandwidth be enough for a fairly busy PHP/MySQL site? I'm completely new to this.
I have a domain name, websiteexample.com, which is hosted with somebody. I have built a new website under a new domain .co.uk on a new server and need to redirect my .com to the .co.uk. What is the best way to handle this with search engines in mind? I don't want to lose my .com listing if possible. Is a framed forward a good way?
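For comparison with a framed forward: a permanent (301) redirect sends the visitor and the crawler to the new address outright, whereas a frame keeps the old URL in the address bar and shows the new site inside it. Here is a rough sketch of the redirect as a tiny WSGI application that could sit on the old .com host. `www.example.co.uk` is a placeholder because the post does not give the new domain, and `redirect_app` is a hypothetical name.

```python
def redirect_app(environ, start_response, new_host="www.example.co.uk"):
    """WSGI app: answer every request with a 301 to the same path on new_host."""
    path = environ.get("PATH_INFO", "/") or "/"
    query = environ.get("QUERY_STRING", "")
    # Keep path and query so each old page points at its new counterpart.
    location = f"http://{new_host}{path}" + (f"?{query}" if query else "")
    start_response(
        "301 Moved Permanently",
        [("Location", location), ("Content-Length", "0")],
    )
    return [b""]
```

To try it locally, something like `wsgiref.simple_server.make_server("", 8000, redirect_app).serve_forever()` from the standard library would do. On shared hosting the equivalent is usually a redirect rule in the server configuration rather than a Python process.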
I want to have my email and website going to different servers:
At the moment I have it set up so that the nameserver records point to the IP of the website. That means that both site and mail go to the server where I host my site. I have then set up an MX record to point the mail to another server...
However, can I make this split at the start, i.e. from the domain registrar? Can I set up an A record to point to the website and an MX record for the mail, instead of pointing them both to the website and then redirecting the mail to another server?
Just this week, I believe one of my sites has been hacked... or potentially my whole server! When accessing the website (a vBulletin forum), instead of going to the main page, we get a screen that looks like Windows' "My Computer" and there is a scan running. Firefox has blocked the site as suspicious.
I am stumped. Where to begin? I have full SSH access to my server (after rebooting it). Thank you in advance.
I have a WordPress site, and am trying to find someone good and reasonably priced to design a template for the website, so I can still use WordPress but have access to and control over everything. I am going to buy a domain as well to hook WordPress up to it so I can edit themes etc. I was looking at maybe using mmhosting.
Also, I have a forum, but I'm not sure what type of forum I need to be able to arrange it and add and use features I can't now, so basically upgrade.
My site, as I said, is WordPress, but I want it to look like the forums I am using.
I'm thinking of starting a website that pokes fun at news stories and photos, but I'm a complete rookie at this. All the RSS feeds I found say they are free for non-commercial use. I want to be able to sell t-shirts and caps on my site, or at least advertising. Any ways to use current news and photos that are free or inexpensive?
I've discovered that a site of mine has been completely copied.
The sites are in a different country:
[url] [url]
The original site, that is still currently under construction with a lot of copy, rewriting and pages still have
[url]
Now, I don't mind someone taking inspiration or borrowing ideas. I'll admit that as a web designer I have thousands of bookmarks of sites that I've liked and will look at for ideas and inspiration, but I don't just download them and put them up.
As the sites are in a separate country, I don't know yet if it is worth all the hassle, since our customer bases are separate, but I presume that having identical content means that my Google ranking could be damaged?
I registered my domain with domain.com several weeks ago. Then I tried to get web hosting services, but I had the worst experience I have had over the internet with a professional internet company.
Anyway, that is not the issue. I did lose my money with this company, but I did manage to purchase web hosting services with another web hosting company (webhostingpad.com).
However, now I am stuck on how to get my web address at (domain.com) directed to my website at (webhostingpad.com).
I tried both companies' support with no response at all.
So, my question is:
What do I need to do now?
How do I activate my web address over the internet?
How do I make this address point to my website at webhostingpad.com?
How do I make my address show up in all search engines?