Home U.S. Coin Forum

Database folks...help!

I'm considering the possibility of creating a database of coin publication articles that will be searchable, and provide it to the world at large through my site, but also make it available to my local club for checking out coin publications that have articles of interest to their specialties. I have been tossing the idea around for a while, and have decided I need help in structuring the database for this project. The idea I have is to make the articles searchable by category - if you are a Seated Dime collector, you would be able to enter "seated dime" and find all of the articles I have found about seated dimes - the issue you would need to find would be listed next to the title of the article....something like a library card catalog system.

All was well in my thinking until I ran into a bit of a snag. What if there is an area that talks mainly about the history of the Carson City mint, then discusses something specific about the seated coins created there. I would want to include the article in "mint history", and also for each of the seated categories mentioned. This would end up building quite a redundancy if I had the volume, publication, pub date, and article title in the DB that many times. So, how should I solve this conundrum?
C. D. Daughtrey, NLG
The Lincoln cent store:
http://www.lincolncent.com

My numismatic art work:
http://www.cdaughtrey.com
USAF veteran, 1986-1996 :: support our troops - the American way.
image

Comments

  • krankykranky Posts: 8,709 ✭✭✭
    What if you made each article the primary DB entry, and use a bitmask for each category that would indicate if it was relevant. It would only take one bit of storage to indicate if an article was related to a particular category. The downside would be if you wanted to continually add categories, you'd have to keep modifying your retrieval code to handle them.

    New collectors, please educate yourself before spending money on coins; there are people who believe that using numismatic knowledge to rip the naïve is what this hobby is all about.

  • coppercoinscoppercoins Posts: 6,084 ✭✭✭
    Sorry, Kranky....that's complete Chinese. What's a "bit mask"? I'm a baginner here, trying to do a service - not a pro.

    I do intend on doing one table of nothing but categories, and link them to the article list by cat. number....or at least that's what I think I will do.
    C. D. Daughtrey, NLG
    The Lincoln cent store:
    http://www.lincolncent.com

    My numismatic art work:
    http://www.cdaughtrey.com
    USAF veteran, 1986-1996 :: support our troops - the American way.
    image
  • meos1meos1 Posts: 1,135
    What web server platform are you running, IIS or Apache?
    I am just throwing cheese to the rats chewing on the chains of my sanity!

    First Place Winner of the 2005 Rampage design contest!
  • coppercoinscoppercoins Posts: 6,084 ✭✭✭
    Apache...using PHP and MySQL.
    C. D. Daughtrey, NLG
    The Lincoln cent store:
    http://www.lincolncent.com

    My numismatic art work:
    http://www.cdaughtrey.com
    USAF veteran, 1986-1996 :: support our troops - the American way.
    image
  • meos1meos1 Posts: 1,135
    Why did you select two tables? I know you typically want the lowest normalization, but I would find it easier to start off with a table for the condition, type, author, primary subject, secondary subject, publication. Many of these tables would have a one to many relationship. For example, the author table would need to be able to be related to the publication table. Others such as condition and type would have a one to one relationship. After I setup the tables and the linkages I would construct some basic SQL queries and run them for a output test. Then you can setup your web page to be a front end for the SQL query statement builder and a second page that would be a display page for the resultant data set from the SQL query. I can't be much more specific as I typically deal with MS boxes and have not kept my UNIX skills upto date.
    I am just throwing cheese to the rats chewing on the chains of my sanity!

    First Place Winner of the 2005 Rampage design contest!
  • coppercoinscoppercoins Posts: 6,084 ✭✭✭
    As for "condition", that's irrelevant. I'm only considering turning the articles into a category-based system for easier finding. as for the number of tables, I am not sure how to structure that, which is why I am asking. I'm not sure I completely understand the logic of having so many different tables for something that should be somewhat simple. My only real obstacle was the fact that one article could easily cover two or more categories. Perhaps...

    categories
    catnumber
    catname

    articles
    articleID
    catnumber1
    catnumber2
    catnumber3

    details
    articleID
    pubname
    pubdate
    author
    title

    Dunno, like I said, I'm no good at this stuff, I'm raw and untrained.
    description
    C. D. Daughtrey, NLG
    The Lincoln cent store:
    http://www.lincolncent.com

    My numismatic art work:
    http://www.cdaughtrey.com
    USAF veteran, 1986-1996 :: support our troops - the American way.
    image
  • krankykranky Posts: 8,709 ✭✭✭
    Sorry for being vague. A bitmask is when you use a single bit in a byte (or series of bytes) to set a "flag".

    Let's oversimplify and say we have eight categories of articles. We'll assign each one a value based on incrementing powers of two.
    Copper coins = 1
    Silver coins = 2
    Gold coins = 4
    Philadelphia Mint = 8
    Denver Mint = 16
    San Francisco Mint = 32
    Circulation strikes = 64
    Proof strikes = 128

    Now our database record might contain the following fields:
    Article title (character datatype)/ Publication (character)/ Date (date)/ Category flags (integer)

    The integer field takes only one byte in this example - a very small amount of storage.

    If a particular article relates to Proof (128, based on our table above) gold coins (4) from the Philadelphia Mint (8), we would add the flags together (128 + 8 + 4 = 140) and store 140 in the Category flags field. Obviously, you could do this with any number of flags, and make the flags field long enough to store 2^(number of categories).

    When you want to retrieve articles related to a particular category, you just search the database for all records where the flags field has the particular bit set that represents the desired category by doing a bitwise AND with a number that has only that bit set. For example, to find all articles that relate to the Denver Mint, you would AND the flags field with the number 16.

    I doubt my explanation is very good, but the concept isn't very hard. I'm no programmer, but I've used it in simple programs by finding examples in books and modifying them to suit my needs. It's a good way to store a lot of "yes/no"-type data efficiently.

    But I wouldn't say this is the only answer, hopefully some real DB folks will have better suggestions.

    [edit] couldn't add three numbers correctly! image [/edit]

    New collectors, please educate yourself before spending money on coins; there are people who believe that using numismatic knowledge to rip the naïve is what this hobby is all about.

  • coppercoinscoppercoins Posts: 6,084 ✭✭✭
    Actually, Kranky, that does sound logical, and I did understand it vaguely. I'll have to toy around with that idea for a while...thanks!
    C. D. Daughtrey, NLG
    The Lincoln cent store:
    http://www.lincolncent.com

    My numismatic art work:
    http://www.cdaughtrey.com
    USAF veteran, 1986-1996 :: support our troops - the American way.
    image
  • meos1meos1 Posts: 1,135
    The tables you listed look fine. Develop a relationship map between each table. Populate the database and begin testing queries.

    I like bit mask idea Kranky, maybe we can use the redbook as a topic directory. I think we will end up needing more than the one byte will represent. Rapid development, love it.



    I am just throwing cheese to the rats chewing on the chains of my sanity!

    First Place Winner of the 2005 Rampage design contest!
  • coppercoinscoppercoins Posts: 6,084 ✭✭✭
    The Red Book would be a start, but one of the major categories, as I can see, will be History....mint history, auction history, etc...stuff that the Red Book wouldn't necessarily include.

    For series and type categories, it would be fine, though.
    C. D. Daughtrey, NLG
    The Lincoln cent store:
    http://www.lincolncent.com

    My numismatic art work:
    http://www.cdaughtrey.com
    USAF veteran, 1986-1996 :: support our troops - the American way.
    image
  • critocrito Posts: 1,735
    you need a full text search engine. all the major SQL products (MS SQL Server, Oracle, Sybase) include this ability, I don't think mySQL does, but haven't used it in a while. It didn't support foreign keys (relationships) either, last time I check. A quick google search shows several open source full text search engines available for download. I only work with the three big boys, so can't help you much more than that.
  • coppercoinscoppercoins Posts: 6,084 ✭✭✭
    I'm not sure I understand why I need a full text search engine since what I am doing is simply prganizing articles by category, including only the title and one line of brief information about the article. I'm not including the actual articles. Searching through the DB one would find the title, "Designs of the Times" which would mean very little. It would, however, be listed in the categories for each of the early series and in mint history because it's an article about the "other" 1804 coins...

    Makes more sense to me to have this article fit in the categories mentioned and have them pulled by category, for which there would be no text search needed. Once again, I'm keeping this as simple and easy to do as possible. My only problem is in multiple categories for a single article.
    C. D. Daughtrey, NLG
    The Lincoln cent store:
    http://www.lincolncent.com

    My numismatic art work:
    http://www.cdaughtrey.com
    USAF veteran, 1986-1996 :: support our troops - the American way.
    image
  • critocrito Posts: 1,735
    guess you don't need a full text search engine then. good luck.
  • Meos1 has the right ideas. Each article will have a fixed number of parts (author, date, original publication source, etc...) These are linked in the real common ways.

    My suggestion would be to adopt a variable length table for each article. This table would contain a list of keywords for each article. You could use a fixed length table, but make sure to allow for plenty of keywords. You will then need to maintain a master keyword table and map all the links. By using such a list you can provide for considerable future re-organization or the addition of new keywords as time progresses. My experience is that a typical article will have about 20 keywords, but a small percentage of articles break all the rules.

    After a while (several years), you'll find that the number of articles for each subject becomes staggering and you'll spend all yout time wading through cruft to find what you want. At that time you'll start looking for ways to subgroup your 1000 Lincoln Cent articles (as an example). That's where the variable length keyword table pays off.

    I never got involved with any of the database programming work for our company projects. People with better skills were always available. I also don't have much UNIX skill, but the Microsoft based tools don't give much value for the cost.

    perfectstrike

Leave a Comment

BoldItalicStrikethroughOrdered listUnordered list
Emoji
Image
Align leftAlign centerAlign rightToggle HTML viewToggle full pageToggle lights
Drop image/file