AI Grading- Appears Much Closer Than I Thought - Implications?

Comments

  • Old_CollectorOld_Collector Posts: 329 ✭✭✭✭

    @Coins3675 said:
    A lot of ebay sellers photos are terrible.

    A lot? Try almost all. I buy off eBay on occasion and I rarely find decent photos unless there is a True View shot, and then I have to mentally color correct. Most shots are at odd angles, poorly lit, and low resolution. And that is only the actual legitimate coins. So yeah, it is an issue. When I put up a coin on eBay I use high def images and always include a video with rotation under a bright light to show the surface as best as I can, but I almost never see that when I look for things.

  • Coins3675Coins3675 Posts: 236 ✭✭✭

    @RiveraFamilyCollect said:
    I'm going to go out on a limb here. Anything graded ms70 is no longer bullion.

    I don't know about everyone else, but I consider MS-70 bullion collector coins instead of bullion.

  • DMWJRDMWJR Posts: 6,030 ✭✭✭✭✭

    Can AI reproduce the human experience of eye appeal? Technical aspects can be done. But this is the real question IMO

    Doug
  • jmlanzafjmlanzaf Posts: 36,219 ✭✭✭✭✭

    @DMWJR said:
    Can AI reproduce the human experience of eye appeal? Technical aspects can be done. But this is the real question IMO

    Even the human eye can't reproduce eye appeal. People do not agree on what is appealing. But the AI can be trained for subjectivity.

  • MarkInDavisMarkInDavis Posts: 1,720 ✭✭✭✭

    Which is more difficult with a scarier downside, AI grading coins or AI driving cars? The fact of the matter is we already have AI driving cars. You can hop in a Waymo in Phoenix, Austin or SF. If it fails miserably you could die, but people do it every day. More time and money has been put into AI driving cars because the financial rewards are much greater. If the same time and money were invested in AI grading, we would already have it. It wouldn't be perfect, but neither is human grading. AI grading will come like it or not and it is not far off. Don't fear it.. or do. It will come just the same.

    Respectfully, Mark
  • GoldbullyGoldbully Posts: 17,868 ✭✭✭✭✭

    @Mr_Spud said:


    Numismatic AI image of the year 2025.


  • semikeycollectorsemikeycollector Posts: 1,189 ✭✭✭✭✭

    @JCH22 said:
    Came across a very interesting Master's thesis: “Development of an Automated Coin Grading System: Integrating Image reprocessing, Feature Extraction, and ML Modeling,” by Jianzhu Chen, submitted to Virginia Polytechnic Institute for his Masters in Electrical and Computer Engineering in 2024.

    Apart from being a mouthful, is a long read—94 pages—

    Summary line—he was able to develop/demonstrate a promising model nicely matching NGC & PCGS grading of Franklins, including toners.

    Full Link:
    https://vtechworks.lib.vt.edu/server/api/core/bitstreams/f60f5933-c775-470c-bc46-6a18c2724d4c/content

    His model is far from prime time, but given the exponential increases in AI, the future of non-human coin grading appears something that's not too far off. Might almost be there with something like generic bulk Silver Eagles.

    I guess in the future, discussions will center around whose algorithm is tighter, which is better for which series/issues etc. Perhaps even a Grand Master Computer-- most strict Algo--- can review the others---give whatever the equivalent of an Algo sticker might be?

    Will there be a separate market for human graded holders? Would these be thought less valuable, or more, given they were graded by us subjective & fallible humans? Registry?

    Would very much appreciate any thoughts on what others might have good/bad, consequences/ramifications ... Seems just a matter of time....

    Thank you so much for posting this!

  • yosclimberyosclimber Posts: 5,026 ✭✭✭✭✭
    edited July 12, 2025 10:38AM

    I read through the paper, and it is not very close to even grading Franklins accurately.
    So if it is "closer than I expected", your standards are fairly low in my view.

    The paper does a lot of things well, such as creating standard masks which identify the fields,
    rotating photos to align with the standard masks, and a good effort at handling color toning.
    But it is very much limited by the quality of the input data (photos from DLRC auctions).
    The author is aware of this when he mentions "lighting bias", and @hedgefundtradingdesk is aware of this as well, when he describes standardized lighting that is needed.

    Last sentence in the Abstract:

    At first this 95% looks like a high number.
    But if you consider the grade distribution, an accuracy of 91% can be achieved by a very dumb model that just assigns MS65 to every coin and ignores the photos.

    2500 MS64, 65, 66 are within +/- 1 point of 65
    254 are 2 or more points from 65
    2754 total
    2500/2754 = 90.78%
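
    For anyone who wants to redo that arithmetic, here is a rough Python sketch of the "very dumb" baseline. Only the two totals (2500 and 254) come from the numbers above. The per-grade split is not given, so the grades 65 and 63 in the sample list are placeholders I made up to stand in for "within 1 point" and "2 or more points away".

```python
def constant_baseline_accuracy(actual_grades, guess=65, tolerance=1):
    """Share of coins whose slab grade is within `tolerance` of a fixed guess."""
    hits = sum(1 for g in actual_grades if abs(g - guess) <= tolerance)
    return hits / len(actual_grades)


# Stand-in data: only the totals 2500 and 254 come from the post.
# 65 and 63 are placeholder grades, not the real per-grade breakdown.
actual_grades = [65] * 2500 + [63] * 254

baseline = constant_baseline_accuracy(actual_grades)
print(f"{baseline:.2%}")  # prints 90.78%
```

    Any model's "within +/- 1" score should be read against this baseline rather than against zero.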


    (Difference = Predicted - Actual)
    I think a better accuracy metric is to look at prediction accuracy for each actual (slab) grade.
    The above graph is slightly different, but all the coins with the same actual grade are on a roughly 45 degree upward sloped line.

    So all the actual 65s are in the circles at (-1,64), (0,65), (1,66).
    Here the model is giving more 64s than 66s in error.

    The actual 66s (-2,64), (-1,65), (0,66), (1,67) get more 65 predictions than 66s, which is quite a bias.

    For the actual 67s, they get more 65s and 66s than 67s, also biased.

    And for the actual 64s, slightly more 65s are predicted than 64s - not good.

    So the model acts quite a bit like my "very dumb" model that gives 65 to everything.
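
    The per-grade breakdown I am describing could be tabulated along these lines. This is only a sketch: the (actual, predicted) pairs are made up to show the shape of the output and are not the thesis data, and the function names are my own.

```python
from collections import Counter, defaultdict


def predictions_by_actual(pairs):
    """Map each actual (slab) grade to a Counter of the grades predicted for it."""
    table = defaultdict(Counter)
    for actual, predicted in pairs:
        table[actual][predicted] += 1
    return table


def exact_hit_rate(table, grade):
    """Fraction of coins with this actual grade that were predicted exactly."""
    counts = table[grade]
    return counts[grade] / sum(counts.values())


# Made-up (actual, predicted) pairs, only to illustrate the table.
pairs = [(66, 65), (66, 65), (66, 66), (66, 67),
         (65, 64), (65, 65), (65, 65), (65, 66)]

table = predictions_by_actual(pairs)
for grade in sorted(table):
    print(grade, dict(table[grade]), f"exact: {exact_hit_rate(table, grade):.0%}")
```

    A model that leans toward 65 shows up here as a low exact rate for every actual grade other than 65, which an overall "within 1 point" number hides.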

    The predictions should also be grouped by toning in addition to human grade.
    I would expect more prediction variance for toned coins than untoned.
    And it would let you partly separate the toning part of the grade prediction.
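
    A minimal sketch of that grouping, assuming each coin carries a toning label (the thesis may not label coins this way). The records are invented and deliberately contrived, so the output proves nothing about the real model; it only shows the calculation.

```python
from collections import defaultdict
from statistics import pvariance


def error_variance_by_group(records):
    """Variance of (predicted - actual) for each toning label."""
    errors = defaultdict(list)
    for toning, actual, predicted in records:
        errors[toning].append(predicted - actual)
    return {toning: pvariance(errs) for toning, errs in errors.items()}


# Made-up records: (toning label, actual grade, predicted grade).
records = [
    ("untoned", 65, 65), ("untoned", 66, 65), ("untoned", 64, 65),
    ("toned", 67, 65), ("toned", 65, 66), ("toned", 66, 64),
]

variances = error_variance_by_group(records)
print(variances)
```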


    The model is really too simple.
    It mainly looks for brightness differences in the fields, so blast white coins with diffuse lighting get high grades.
    But the above 67 with apparent dark areas in the fields for this particular photo gets a 65.

    As the author and others here have mentioned, better input photos would have multiple lighting angles and also a photo with diffuse lighting.
    He standardized all the photos to be 1000 x 1000, which seems way too small, but it's what was easily available.
    He used arc and pie shaped regions, which map to some of the fields, but the regions should be created by hand for the actual fields.

    (Like was done in that famous 1990 CompuGrade Patent application for Peace dollars).
    Also, the areas with the date and mint mark should be ignored in the mask and differencing.
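
    To make the masking point concrete, here is a toy Python sketch. It is not the thesis's actual feature: I am using mean absolute deviation of brightness as a simple stand-in for "brightness differences in the fields", and the 3x3 grid and mask are invented. The one dark pixel plays the role of the date area that the mask should exclude.

```python
def field_brightness_spread(image, mask):
    """Mean absolute deviation of brightness over pixels where mask is True.

    `image` is a grid of grayscale values; `mask` is a same-shaped grid of
    booleans marking field pixels (date and mint mark areas set to False).
    """
    values = [
        pixel
        for image_row, mask_row in zip(image, mask)
        for pixel, keep in zip(image_row, mask_row)
        if keep
    ]
    mean = sum(values) / len(values)
    return sum(abs(v - mean) for v in values) / len(values)


# Tiny made-up 3x3 "photo"; the bottom-right pixel stands in for the date area.
image = [
    [200, 200, 200],
    [200, 180, 200],
    [200, 200, 40],
]
mask = [
    [True, True, True],
    [True, True, True],
    [True, True, False],  # excluded from the measurement
]

spread = field_brightness_spread(image, mask)
print(spread)  # prints 4.375
```

    With the date pixel left in the mask, the spread jumps sharply, which is the kind of false "field problem" a hand-drawn mask is meant to avoid.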

  • FlyingAlFlyingAl Posts: 3,826 ✭✭✭✭✭

    AI is only ever as good as the models it is trained on. The better the consistency between the model set and the experimental set, the more accurate the AI.

    Essentially, for AI to be accurate you’d need thousands of images of each coin in each grade, all taken under identical conditions. Then you can start grading, but each coin graded would need to be taken in those same identical conditions.

    Right now, I think that’s a little too much to handle. Maybe in the future.

  • pmh1nicpmh1nic Posts: 3,335 ✭✭✭✭✭

    I listened to a few podcasts over the last week on AI and the massive AI learning that has occurred over the last year. The technology is at the point where it gets perfect SAT scores every time, even on questions it hasn't seen before, and near perfect GRE scores in math, engineering, physics, linguistics and humanities. It's smarter than all graduate students in all disciplines. Folks like Elon Musk and Geoffrey Hinton, major developers of AI, expressed real concern about the unintended ramifications of AI. It has tremendous potential for finding cures for certain diseases AND developing disastrous viruses. Learning how to grade coins is a piece of cake.

    The longer I live the more convincing proofs I see of this truth, that God governs in the affairs of men. And if a sparrow cannot fall to the ground without His notice is it possible for an empire to rise without His aid? Benjamin Franklin
  • JCH22JCH22 Posts: 341 ✭✭✭✭
    edited July 11, 2025 10:31PM

    @yosclimber said:
    I read through the paper, and it is not very close to even grading Franklins accurately.
    So if it is "closer than I expected", your standards are fairly low in my view....

    Perhaps I should have fleshed out the "closer than expected" reference more, but to directly answer your point--no, the reference to "closer than expected" was not confined to the complexity or performance of only this one specific model.

    Mentioned the model was not ready for prime time. "Closer than expected" referred to the proof of concept a master's-level student on a shoestring could demonstrate. Sure a well capitalized effort would yield exponential short term progress. Don't think TPGs & VCs are unaware of the potential market.... Especially should the future market demand AI screening of legacy human graded coins.

  • yosclimberyosclimber Posts: 5,026 ✭✭✭✭✭
    edited July 11, 2025 10:32PM

    @pmh1nic said:
    I listened to a few podcasts over the last week on AI and the massive AI learning that has occurred over the last year. The technology is at the point where it gets perfect SAT scores every time, even on questions it hasn't seen before, and near perfect GRE scores in math, engineering, physics, linguistics and humanities. It's smarter than all graduate students in all disciplines.

    In my view, AI is not smarter than all grad students.
    It just happens to be good at answering GRE questions, which it has practiced very extensively.

  • MsMorrisineMsMorrisine Posts: 35,564 ✭✭✭✭✭

    one in love with ai could ask "which is correct and which is predicted between the ai and human?"

    Current maintainer of Stone's Master List of Favorite Websites // My BST transactions
  • MsMorrisineMsMorrisine Posts: 35,564 ✭✭✭✭✭

    @yosclimber said:
    2500 MS64, 65, 66 are within +/- 1 point of 65
    254 are 2 or more points from 65
    2754 total
    2500/2754 = 90.78%

    considering 63s and below are less likely to be submitted, the pop report can't be used to judge the grade distribution of coins. it is the grade distribution of the coins submitted for grading

    65 and higher is likely to represent a better distribution of grades for fhd overall, while 64 and below are probably underrepresented. so, perhaps ms64 grades would be higher than 65s if more were submitted?

    Current maintainer of Stone's Master List of Favorite Websites // My BST transactions
  • yosclimberyosclimber Posts: 5,026 ✭✭✭✭✭
    edited July 11, 2025 10:58PM

    @JCH22 said:
    Mentioned the model was not ready for prime time. "Closer than expected" referred to the proof of concept a master's-level student on a shoestring could demonstrate.

    I agree, it was an appropriate level of effort for a masters thesis.

    Sure a well capitalized effort would yield exponential short term progress.

    I don't agree.
    With some time and effort, many quality photos could be taken of each coin and improved grading accuracy should result.
    Especially for the task of assigning 69 or 70 to ASEs.
    But even for Franklins, I believe handling the toning would be quite difficult.
    But that's just my guess. I could be wrong.

    Don't think TPGs & VCs are unaware of the potential market.... Especially should the future market demand AI screening of legacy human graded coins.

    I think TPGs and VCs are aware of the potential market. And I think it's a small market. This should limit their interest.

  • MsMorrisineMsMorrisine Posts: 35,564 ✭✭✭✭✭

    Current maintainer of Stone's Master List of Favorite Websites // My BST transactions
  • yosclimberyosclimber Posts: 5,026 ✭✭✭✭✭
    edited July 12, 2025 12:37AM

    @MsMorrisine said:
    one in love with ai could ask "which is correct and which is predicted between the ai and human?"

    It's true - there is some risk in treating the human grade as absolutely correct, conditioning on it, etc.
    We see people getting more accuracy by adding the Eagle Eye and CAC stickers.

    But I think in this study the human grade has to be the standard for comparison.
    Each coin was observed in hand with multiple light angles by multiple experienced professional graders.
    Compare to a 1000 x 1000 auction photo, which is very limited data.

  • MsMorrisineMsMorrisine Posts: 35,564 ✭✭✭✭✭
    edited July 12, 2025 12:04AM

    when 67 is guessed, it is more likely to be 68 or 66 than correct; and, barring optical illusions, it undergraded 68s more than overgraded 66s

    when 65s were guessed it was more likely to overgrade a 64 than undergrade a 66
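
    this reads the results the other way around, starting from the grade the model guessed. a rough python sketch of that lookup, with made-up (actual, predicted) pairs rather than the thesis data:

```python
from collections import Counter


def actuals_for_prediction(pairs, predicted_grade):
    """Count the actual grades among coins the model called `predicted_grade`."""
    return Counter(
        actual for actual, predicted in pairs if predicted == predicted_grade
    )


# Made-up (actual, predicted) pairs, not the thesis data.
pairs = [(68, 67), (68, 67), (66, 67), (67, 67), (64, 65), (65, 65)]

called_67 = actuals_for_prediction(pairs, 67)
share_correct = called_67[67] / sum(called_67.values())
print(dict(called_67), f"correct: {share_correct:.0%}")
```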

    Current maintainer of Stone's Master List of Favorite Websites // My BST transactions
  • hedgefundtradingdesk

    @yosclimber said:
    I think TPGs and VCs are aware of the potential market. And I think it's a small market. This should limit their interest.

    I agree that coin grading is a small market. Facebook (META) is rumored to have poached a top Apple AI executive for $200 million. TPGs and grad students simply cannot compete with the level of resources required.

    The path to AI grading starts with general purpose visual AI applications: drone detection, robotaxis, sea floor mining, fruit picking, invasive species detection, etc.

    Once AI masters these applications, it becomes much easier to say "Okay, now do coins."

    Then you get the application for free, with R&D paid for by military, transportation, mining, and agricultural interests.

    You still need better photographs.

  • MsMorrisineMsMorrisine Posts: 35,564 ✭✭✭✭✭

    we need better monitors to see what the ai can see

    we can make the pics 8192x8192 but to teach would require us to be able to see the whole coin in 8192x8192

    Current maintainer of Stone's Master List of Favorite Websites // My BST transactions
  • MrEurekaMrEureka Posts: 24,403 ✭✭✭✭✭

    Assuming humans will review AI grades and sometimes override them, the machine will learn something from every grading event and modify its “standards” accordingly. Which means that whenever you see a slab with an AI generated grade, you won’t know if the coin would grade the same way if resubmitted. That doesn’t mean the system can’t work, or that it’s inferior to what we now have. But there will be some problems to solve.

    Andy Lustig

    Doggedly collecting coins of the Central American Republic.

    Visit the Society of US Pattern Collectors at USPatterns.com.
  • jmlanzafjmlanzaf Posts: 36,219 ✭✭✭✭✭

    @yosclimber said:

    @pmh1nic said:
    I listened to a few podcasts over the last week on AI and the massive AI learning that has occurred over the last year. The technology is at the point where it gets perfect SAT scores every time, even on questions it hasn't seen before, and near perfect GRE scores in math, engineering, physics, linguistics and humanities. It's smarter than all graduate students in all disciplines.

    In my view, AI is not smarter than all grad students.
    It just happens to be good at answering GRE questions, which it has practiced very extensively.

    Actually, check out futurehouse.com. The reasoning models they have developed are now more accurate than PhD chemists on chemical questions.

  • jmlanzafjmlanzaf Posts: 36,219 ✭✭✭✭✭

    @hedgefundtradingdesk said:

    @yosclimber said:
    I think TPGs and VCs are aware of the potential market. And I think it's a small market. This should limit their interest.

    I agree that coin grading is a small market. Facebook (META) is rumored to have poached a top Apple AI executive for $200 million. TPGs and grad students simply cannot compete with the level of resources required.

    The path to AI grading starts with general purpose visual AI applications: drone detection, robotaxis, sea floor mining, fruit picking, invasive species detection, etc.

    Once AI masters these applications, it becomes much easier to say "Okay, now do coins."

    Then you get the application for free, with R&D paid for by military, transportation, mining, and agricultural interests.

    You still need better photographs.

    I think medical imaging should be on that list.

  • fathomfathom Posts: 1,859 ✭✭✭✭✭
    edited July 12, 2025 6:39AM

    Again I think classic coins are a real challenge from a resource perspective and market acceptability. Not to mention subjective analysis.

    But moderns can be done, and it is a large enough marketplace.

    One challenge is the imaging. The coin could be rotated and subjected to an almost infinite number of lighting angles to be graded precisely. If it is done in this manner, it will be tremendously precise, and you could probably introduce a thousand-point scale. Whether that is market acceptable is another issue.

  • Morgan13Morgan13 Posts: 1,618 ✭✭✭✭✭

    I am against it.
    I like the human touch.
    I am one of the people who think AI is not good for everything.

    Student of numismatics and collector of Morgan dollars
    Successful BST transactions with: Namvet Justindan Mattniss RWW olah_in_MA
    Dantheman984 Toyz4geo SurfinxHI greencopper RWW bigjpst bretsan MWallace logger7

  • rooksmithrooksmith Posts: 1,040 ✭✭✭✭

    So far AI is not as good as humans. I think the real value might be in pre-screening massive amounts of coins that are fed into some sort of assembly line and analysed. This might help the guy with all the pennies in trash cans to find a few key dates!

    “When you don't know what you're talking about, it's hard to know when you're finished.” - Tommy Smothers
  • stockdude_stockdude_ Posts: 503 ✭✭✭

    "I would assume the algorithms will be able to successfully match the human graders 99.99% of the time." LOL we don't want that! We want accuracy and consistency. Human grading is all over the place. Computer grading needs to happen.

  • stockdude_stockdude_ Posts: 503 ✭✭✭

    @Morgan13 said:
    I am against it.
    I like the human touch.
    I am one of the people who think AI is not good for everything.

    I don't think anyone here said it's good for everything ;)

  • Cougar1978Cougar1978 Posts: 8,759 ✭✭✭✭✭
    edited July 12, 2025 8:10PM

    Not to worry - self grading holders $5 a dozen. Consistent grading - no more sticker game - yay!

    Coins & Currency
  • 4Redisin4Redisin Posts: 598 ✭✭✭

    @jmlanzaf said:

    Neither. I'm saying that people said that collectors would not want 3rd party graded coins.

    Where I'm from COIN DEALERS did not want TPG and a large number refused to go along until they were forced to by collectors. I never heard one average or below average collector (less knowledge than collectors who are true numismatists) ever say TPG was not needed but I was not on coin chat forums or in coin clubs back then.

  • jmlanzafjmlanzaf Posts: 36,219 ✭✭✭✭✭
    edited July 13, 2025 7:41AM

    @4Redisin said:

    @jmlanzaf said:

    Neither. I'm saying that people said that collectors would not want 3rd party graded coins.

    Where I'm from COIN DEALERS did not want TPG and a large number refused to go along until they were forced to by collectors. I never heard one average or below average collector (less knowledge than collectors who are true numismatists) ever say TPG was not needed but I was not on coin chat forums or in coin clubs back then.

    It was worse than "not needed". There are still people, to this day, who will not buy slabbed coins for a variety of reasons. I think dealers embraced it faster than most collectors because they realized they could get a premium for the same coins if they could get it in the right holder. Even as late as the mid-90s people at the local coin club would rail about TPGs and how they were ruining the hobby.

  • 4Redisin4Redisin Posts: 598 ✭✭✭

    Thanks for your post. I'm in agreement that a great many dealers and collectors (including myself) wish authentication services never graded coins, leading to PCGS and NGC being started because they did not like the strict (non-commercial) grading being done at the time. However, from my perspective as a non-dealer, back in the late 70's and 80's (beginning of TPG) I never heard a complaint about TP grading. Most loved the verification and safety it provided for those who were not informed. Since then, that's all I hear about TPGS and not one bit of praise. As a dealer, you have a different perspective and I'm glad that you share it.

  • jmlanzafjmlanzaf Posts: 36,219 ✭✭✭✭✭
    edited July 13, 2025 1:20PM

    @4Redisin said:
    Thanks for your post. I'm in agreement that a great many dealers and collectors (including myself) wish authentication services never graded coins, leading to PCGS and NGC being started because they did not like the strict (non-commercial) grading being done at the time. However, from my perspective as a non-dealer, back in the late 70's and 80's (beginning of TPG) I never heard a complaint about TP grading. Most loved the verification and safety it provided for those who were not informed. Since then, that's all I hear about TPGS and not one bit of praise. As a dealer, you have a different perspective and I'm glad that you share it.

    I don't go back that far. Lol. I'm talking about the "coffins" starting in the mid 80s.

  • 4Redisin4Redisin Posts: 598 ✭✭✭

    @jmlanzaf said:

    @4Redisin said:
    Thanks for your post. I'm in agreement that a great many dealers and collectors (including myself) wish authentication services never graded coins, leading to PCGS and NGC being started because they did not like the strict (non-commercial) grading being done at the time. However, from my perspective as a non-dealer, back in the late 70's and 80's (beginning of TPG) I never heard a complaint about TP grading. Most loved the verification and safety it provided for those who were not informed. Since then, that's all I hear about TPGS and not one bit of praise. As a dealer, you have a different perspective and I'm glad that you share it.

    I don't go back that far. Lol. I'm talking about the "coffins" starting in the mid 80s.

    We all base our opinion on the period we live in. Your opinion is no different than mine except yours is more important as it is closer to the reality of today! Thank you very much.

  • logger7logger7 Posts: 8,975 ✭✭✭✭✭

    If anything we should be working in the opposite direction, working to educate people on grading standards in person. Computers suck a lot of creativity and the human element out of life and make society increasingly sterile.
