

SportsShooter.com: Member Message Board

Keywording (PhotoShelter and/or PhotoMechanic)
 
Delane B. Rouse, Photographer, Photo Editor
 |
Washington, DC & Seattle | WA | US | Posted: 1:40 PM on 11.23.05 |
->> I'm trying to figure out the best way to keyword images...let's say I have images of Bobby Knight...do I keyword it as:
bobby, knight, basketball, coach, angry
or
bobby knight, basketball, coach, angry
or
Bobby Knight, basketball, coach, angry,
I just want my clients to be able to find the images they need...with the least amount of effort.
Thanks in advance!!!
Delane |
|
 
Melissa Wade, Photographer, Student/Intern
 |
New York | NY | USA | Posted: 2:22 PM on 11.23.05 |
->> I wouldn't separate Bobby and Knight. I don't think capitalization matters so I would use what is proper.
FYI, a general search on PhotoShelter returns any image where the search term appears as a keyword OR merely in the caption. If you know that someone needs to pull up some shots, it might be preferable to do the advanced search specifying keyword and then send them the URL of the result. I've been trying to fully caption my shots, which means the names of players who only have their backs in the frame show up in the caption. The regular search returns those images as well as the ones I've keyworded for the player. |
|
 
Allen Murabayashi, Photographer
 |
New York | NY | USA | Posted: 2:25 PM on 11.23.05 |
->> the answer is "it depends."
databases that deal with blocks of text (as opposed to limited columns of well-defined data, e.g. a part number) fall under the category of "full-text" databases.
they index each word, position, and frequency of a word in a document. each database has its own scoring algorithm to return relevant results. for example, if you search for "zebra," it will likely return a higher score because it's a relatively uncommon word.
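
A minimal Python sketch of the kind of index described above, assuming a toy set of captions. The function names (`build_index`, `idf`) and the sample captions are made up for illustration, and the scoring is a plain inverse-document-frequency weight, not any particular database's algorithm.

```python
import math
import re
from collections import defaultdict


def build_index(docs):
    """Map each word to {doc_id: [positions]} for a dict of doc_id -> text."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        tokens = re.findall(r"[a-z0-9']+", text.lower())
        for position, word in enumerate(tokens):
            index[word].setdefault(doc_id, []).append(position)
    return index


def idf(index, word, total_docs):
    """Inverse document frequency: words found in fewer documents weigh more."""
    docs_with_word = len(index.get(word, {}))
    if docs_with_word == 0:
        return 0.0
    return math.log(total_docs / docs_with_word) + 1.0


# Hypothetical captions, for illustration only.
captions = {
    1: "Bobby Knight, angry basketball coach",
    2: "basketball coach at the zoo with a zebra",
    3: "high school basketball coach",
}
index = build_index(captions)

# Where "coach" occurs: document id -> word positions.
print(index["coach"])
# "zebra" is rarer than "coach", so it carries more weight.
print(idf(index, "zebra", len(captions)) > idf(index, "coach", len(captions)))
```

The length of each position list is the word's frequency in that document, so the one structure covers word, position, and frequency.
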
in your examples, the comma is used to delineate "tokens" in the database. so each comma-delimited word or phrase has its own little node in the db index. you can think of the db index like the index in the back of a book.
in your case, example #2 is preferable over #1. #3 is the same as #2 because the database is case insensitive.
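
To make the comparison concrete, here is a small Python sketch that splits the three keyword fields from the original question on commas and folds case. `keyword_tokens` is a hypothetical helper, and a real database's tokenizer may behave differently.

```python
def keyword_tokens(field):
    """Split a keyword field on commas, trimming whitespace and folding case."""
    return [part.strip().lower() for part in field.split(",") if part.strip()]


example_1 = keyword_tokens("bobby, knight, basketball, coach, angry")
example_2 = keyword_tokens("bobby knight, basketball, coach, angry")
example_3 = keyword_tokens("Bobby Knight, basketball, coach, angry,")

print(example_1)  # "bobby" and "knight" are separate tokens
print(example_2)  # "bobby knight" stays together as one token
print(example_2 == example_3)  # case and the trailing comma make no difference
```

Under these assumptions, #2 and #3 produce the same tokens, while #1 loses the link between the first and last name.
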
as far as the searching is concerned...many databases (e.g. google, photoshelter) employ OR searching in order to return the largest possible set. so if the user searches for:
bobby knight
you will get results that include the word "bobby" or "knight". like "bobby deutsch" or "knights football."
the "correct," but perhaps less intuitive way is to quote the search:
"bobby knight"
this ensures that the phrase "bobby knight" is found, where the search terms are adjacent to one another. or to ensure that both words are included in the document you could do:
+bobby +knight
it's worth noting that a comparison of google vs. a "normal" text-search database isn't quite fair. remember, google results are based on link frequency. the more links that are attached to a web page, the more relevant it's considered. so their search algorithm isn't simply word-frequency based.
the bottom line is that searching large sets of data is inherently complex, which is why it's so hard to find stuff that you want sometimes. although the database structures are similar, everyone tweaks their db differently, and we often have to guess what is important to the user (is date more relevant than word frequency?). so what works in one database may not work in the next.
more confused now? i feel ya. makes you wanna toss a chair or something. |
|
 
Greg Ferguson, Photographer
 |
Scottsdale | Az | USA | Posted: 9:26 PM on 11.23.05 |
->> There's more to it than an internet search engine just grabbing the keywords on the page or a database search. Commas between the words only mark the logical separation between phrases in the meta-keyword tag. They have no relationship to the operators used in a search phrase by a user. The back-end databases are a lot more sophisticated in their use of the keywords than that.
The engines consider linguistics and relevancy - words that relate to the subject and are included in the text of the page, and WHERE they are included on the page. They also look for words in the "alt" tag of the image tag in the page, and the "title" tag of any anchors pointing to the page.
Having a keyword list with no body content will do very little good. Also, image pages have very poor hits because they don't relate well to the context of the site - rich content is real important, and images don't fall into that bucket.
bobby, knight, bobby knight, coach, basketball coach, angry, angry coach
would be where I'd start. You could also reverse the words in phrases...
knight bobby, coach basketball, coach angry
though those will probably have little added benefit. They'd just serve to fill out the list of keywords if you can't think of anything else that applies.
You could also look at Yahoo and Google's keyword hit lists to find what words are related to your subject and you might find other synonyms, misspellings, and homonyms to consider. If I remember right, you can have up to 12 keywords before they're ignored. Stuffing a bajillion in there will do no good. Ignoring additional ones when you have a few slots empty is almost as bad because you miss opportunities to direct searches to you.
What's worse is that the big search engines have some secret heuristics that are used to determine relevancy based on the content of a page, and those change over time and differ at any given time depending on which spider is crawling. Google and Yahoo have their published spiders, and they have stealth spiders that are used to determine whether a page is being stuffed to try to raise its relevance to the keywords, and they'll blackball the page if it triggers certain rules.
The best tactic is to include some good verbiage with the image - write a good cut-line, and select the pertinent words from that as your keywords. And, don't overlook the title, anchor, image and meta-description tags as they've got some bearing on the page weighting too.
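
As a rough illustration of that tactic, the Python sketch below writes one cut-line into the title, meta-description, meta-keywords, and image alt text of a page. The function name, the file path, and the sample cut-line are hypothetical, and how much weight any search engine gives each tag is not something this sketch can show.

```python
from html import escape


def image_page(title, cutline, image_url, keywords):
    """Return a minimal HTML page that repeats the cut-line in the relevant tags."""
    keyword_list = ", ".join(keywords)
    return "\n".join([
        "<html><head>",
        f"<title>{escape(title)}</title>",
        f'<meta name="description" content="{escape(cutline)}">',
        f'<meta name="keywords" content="{escape(keyword_list)}">',
        "</head><body>",
        f'<img src="{escape(image_url)}" alt="{escape(cutline)}">',
        f"<p>{escape(cutline)}</p>",
        "</body></html>",
    ])


# Hypothetical values, for illustration only.
page = image_page(
    title="Angry Bobby Knight, basketball coach",
    cutline="Basketball coach Bobby Knight reacts angrily to a call.",
    image_url="images/knight-001.jpg",
    keywords=["bobby knight", "basketball", "coach", "angry"],
)
print(page)
```

The cut-line also appears as visible body text, which matters given the point above that a keyword list with no body content does little good.
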
IPTC/EXIF fields in images and Flash files are death for page searches because they are ignored by the engines. The spiders only want text, though they will index postscript and pdf files because those are largely text. |
|
 
Pablo Galvez, Photographer
 |
Calgary | AB | CANADA | Posted: 1:07 AM on 11.24.05 |
->> Doesn't photoshelter search within your caption as well? You could just caption your image with "Bobby Knight is an angry coach" and a search would still pull up that photo, right???
-Pablo |
|
 
Greg Ferguson, Photographer
 |
Scottsdale | Az | USA | Posted: 1:54 AM on 11.24.05 |
->> Photoshelter, or a dedicated image browsing site or app can do it. The problem is, you want to reach more than those people you've given a disc and app to, or those who are savvy enough to know to go to a specific site.
You want to reach the unknown potential customers who you haven't identified and who haven't already found you. That takes the help of the big generic search engines.
The search engines are very picky and don't like a lot of different types of web-pages. Pages that rely on Javascript for their navigation and content don't get indexed. And especially any pages that take some sort of human intervention, like search forms that need to be filled out.
When a spider runs into a search page, it'll usually just bail out, skip the page, and continue on unless it has pre-knowledge of that site and how to fill in the form, submit it, and then scan returned pages. The big sites like Google and Yahoo won't bother.
If that search page is the only way to find information about images you've cataloged, then you are SOL; the image tags, keywords and whatever else is available to search by will not show up.
The pages have to have normal links chaining them together that the engines can crawl, and valid (tasty) information, then eventually they'll get searched and indexed. That's when the "relatedness" of the keywords and content will come into play, by raising the page's score and making it appear higher in search results.
In the case of a dedicated app like PhotoMechanic,... well, if you've given a disc of images to a customer and they're searching the disc using an app that can scan the IPTC and EXIF information, then they are already in your pocket. Simple keywords in the IPTC fields will help them, but that scenario will do little to help you reach new customers, which is the goal. |
|
 
Delane B. Rouse, Photographer, Photo Editor
 |
Washington, DC & Seattle | WA | US | Posted: 2:25 AM on 11.24.05 |
->> hey Greg...thanks...I'm really interested in how to best write the keyword section when it applies to applications like Photo (Mechanic/Shelter/Shop) etc.
When you go here
http://www.photoshelter.com/usr-show?U_ID=U0000weIlTaVO1f4
and type "bobby knight"...I want to make sure the appropriate images come back in the results. when you type "bobby" or "knight" I want the correct images as the result...
Appreciate the info...definitely informative... |
|
 
Greg Ferguson, Photographer
 |
Scottsdale | Az | USA | Posted: 1:10 PM on 11.24.05 |
->> For PhotoShelter or any site, the job of creating good keywords is basically summarizing the description of what you want indexed into its most important words, without incorporating "stop words", which are common words that don't add anything to the "search-ability". Think of the most basic phrase that would describe something.
"Bobby Knight Basketball Coach"
then break that down into component words and phrases that make sense.
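
One way to sketch that breakdown in Python, assuming a short hand-picked stop-word list (real search engines use their own, usually longer, lists). `caption_keywords` and the sample caption are made up for illustration.

```python
import re

# Illustrative stop words only; not any search engine's actual list.
STOP_WORDS = {"a", "an", "and", "at", "his", "in", "is", "of", "on", "the", "to"}


def caption_keywords(caption):
    """Return the caption's words in order, minus stop words and repeats."""
    seen = set()
    kept = []
    for word in re.findall(r"[a-z0-9']+", caption.lower()):
        if word not in STOP_WORDS and word not in seen:
            seen.add(word)
            kept.append(word)
    return kept


keywords = caption_keywords("Bobby Knight, the basketball coach, is angry at the referee")
print(", ".join(keywords))
```

The result is a candidate list to review by hand, since only you know which words you actually want people to find the image by.
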
Do you want someone to search on "angry" and find a picture of him? Include "angry".
"Angry Bobby Knight Basketball Coach"
PhotoShelter's search engine is probably not doing etymology or root extraction on keywords (breaking words down in context and indexing the images under their root words), so it'd miss the image if someone searched for "anger", which is a more likely search term. You have to include "anger" too because it's the root word.
"Angry Bobby Knight Basketball Coach Anger"
I haven't talked to Grover or Jason to see how they wrote their search, but like Allen said, most likely they're using simple database AND/OR operators to include or exclude words. If so, there's no need to break things down into comma-delimited keyword phrases like we'd do on "the internet". The back-end database will find the words anyway. |
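
The point above about adding "anger" alongside "angry" can be sketched in Python as a hand-maintained lookup. The `RELATED_FORMS` table and `expand_keywords` are hypothetical. The sketch assumes the search engine does no root extraction of its own, so related forms have to be added to the keyword list explicitly.

```python
# Hand-maintained table of related word forms; purely illustrative.
RELATED_FORMS = {"angry": ["anger"]}


def expand_keywords(keywords):
    """Add related forms after each keyword, skipping case-insensitive repeats."""
    expanded = []
    seen = set()
    for keyword in keywords:
        for form in [keyword] + RELATED_FORMS.get(keyword.lower(), []):
            if form.lower() not in seen:
                seen.add(form.lower())
                expanded.append(form)
    return expanded


expanded = expand_keywords(["Angry", "Bobby Knight", "Basketball", "Coach"])
print(", ".join(expanded))
```

A lookup table is used here instead of an automatic stemmer because simple stemmers do not necessarily map "angry" and "anger" to the same root.
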
|

