SE question



davidundalicia
11-04-2006, 02:14 AM
If a site has some pages driven by PHP and a MySQL database, will these be indexed by the SEs?

Bethers
11-04-2006, 04:10 AM
Hi David, and the answer is yes. A URL can carry so many dynamic parameters that it becomes a problem for indexing, but I don't think that's what you're talking about, so yes.

davidundalicia
11-04-2006, 12:30 PM
Thanks Beth, does that mean that the SEs read PHP files?

ez-ez
11-04-2006, 02:41 PM
G'day David. Even though I'm not an expert on this, I think SEs read everything and will index everything. This is evident in the fact that if you search for a file name, for example 'hello.php', Google might spit out a URL like http://www.greetings.com/hello.php. I have seen WMVs, MP3s and sometimes even TXT files.

I don't know if you use robots.txt on your website, but it can tell some SEs to buzz off certain folders and pages.
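As a rough illustration of how robots.txt rules are read, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules, the `/private/` folder and the `view-data.php` file name are made up for the example, and it only shows what a well-behaved crawler would do with such rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the folder and file names are made up.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /view-data.php
"""


def is_allowed(url, user_agent="*"):
    """Return True if the rules above let the given crawler fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.example.com/index.php",
        "http://www.example.com/view-data.php",
        "http://www.example.com/private/notes.txt",
    ):
        print(url, "->", "allowed" if is_allowed(url) else "blocked")
```

The file itself would sit at the root of the site. It is a request to crawlers rather than a lock, so it keeps pages out of search results but does not protect them.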

davidundalicia
11-04-2006, 03:09 PM
Hi Carlos, thanks for your reply.
I have recently added a form to one of my websites which collects data and then stores it in a database.
If someone then wishes to view that data, the PHP code gets the data from the database and displays it on another page (otherwise empty, header and footer only).
The question I have in my mind is that, if Googlebot can read my PHP file, then so can other bots.
So, how secure then are all my other PHP files (contact forms etc.)?

ez-ez
11-04-2006, 03:28 PM
I guess if you feel that bots reading your pages makes them insecure, then the bad news is that yes, they are insecure. Bots will read everything unless you tell them to go away, and that is the whole purpose of robots.txt files. But here is the thing: what are the chances of anyone finding those files on a search engine if they don't know their names? PHP files don't have keywords. Yes, they may contain words like $name, $email and so on, but how good is that as content for a search engine to produce results? I wouldn't be too concerned (please, please verify this, I'm not an expert).

The information I have is that SEs hardly ever have anything to do with PHP and text hacks. Rather, I would be more concerned about hackers adding tags to your domain name which would then give them your root directory content. Of course it won't be accessible, but they will know what's on it. From there, you know, it's a matter of calling the page in the browser and checking it out. There are now HTML editors that will open files from external locations like a URL. Ultimately, it really isn't an issue until something triggers it.

This is what I would do: I would give SEs time to read and index all of my pages, then I would search for the files I don't want indexed. If I find them, then I would consider using a robots.txt to tell SEs to f*** off those folders and files.

The best way to see if those files are indexed by SEs is to use one of those free online page analyzers. They also have tools where you enter a URL and they give you a list of popular search engines and how that page URL ranks on each one of them.

davidundalicia
11-04-2006, 03:51 PM
Thanks Carlos. If you're ever down my way (Spain) I will buy you several beers...

ez-ez
11-04-2006, 04:00 PM
Thank you David. I'll keep that in mind when I go to visit my family in Barcelona. It would be really good to catch up.

davidundalicia
11-04-2006, 04:06 PM
Tell me when you're coming and I'll come to see you...

Later, my friend.

ez-ez
11-04-2006, 04:27 PM
Sounds good, it's a deal.
Later, my friend.

Bethers
11-04-2006, 07:42 PM
If the info in the form etc. needs to be secure, you need to use SSL.

If you just don't want the SEs to follow those pages, use robots.txt to tell them NOT to go to those pages, and well-behaved crawlers will listen.

navaldesign
11-05-2006, 05:59 AM
Sorry to cut in. SE spiders, crawlers etc. will FIND anything on your site. READING it is a totally different issue: they CAN'T. They can only read the output that the PHP code sends to the browser, so it is just as if a visitor had visited that page. So, no worries about your content. Your only worry should be that they might reveal links and pages that you would rather not expose, so your solution would be to create a robots.txt file telling them NOT to spider the pages you don't want them to.
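To make the point about output versus source concrete, here is a small sketch in Python rather than PHP, offered only as an analogy. A tiny local server builds a page from a pretend database, and a client fetches it the way a crawler would. `DB_PASSWORD`, `load_rows`, `PageHandler` and `fetch_like_a_crawler` are hypothetical names invented for the example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Stand-in for server-side details a visitor should never see (made-up value).
DB_PASSWORD = "s3cret-example"


def load_rows():
    """Pretend database query; a real script would use DB_PASSWORD to connect."""
    return ["Alice", "Bob"]


class PageHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # The page is generated on the server; only this HTML is sent out.
        items = "".join(f"<li>{name}</li>" for name in load_rows())
        body = f"<html><body><ul>{items}</ul></body></html>".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_like_a_crawler():
    """Serve the page on a free local port, fetch it once, return its text."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request("GET", "/view-data")
        text = conn.getresponse().read().decode("utf-8")
        conn.close()
        return text
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    html = fetch_like_a_crawler()
    print(html)
    print("password visible to the client:", DB_PASSWORD in html)
```

The client receives the generated list items but none of the code or the password. The same holds for a PHP page as long as the web server actually executes the script instead of serving the file as plain text.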

navaldesign
11-05-2006, 06:07 AM
The information I have is that SEs hardly ever have anything to do with PHP and text hacks. Rather, I would be more concerned about hackers adding tags to your domain name which would then give them your root directory content. Of course it won't be accessible, but they will know what's on it. From there, you know, it's a matter of calling the page in the browser and checking it out. There are now HTML editors that will open files from external locations like a URL. Ultimately, it really isn't an issue until something triggers it.

ALL HTML editors that can import HTML code from an internet page do exactly this. Even BlueVoda does it when importing HTML pages from the net. Even I do it, with a PHP script of my own, for the next version of ABVFP, which will be capable of reading the form directly from the site before creating the processing script.

I wouldn't worry too much about this, because they can only read the HTML output of these pages, not the PHP code, and still less the database content.