RESEARCHERS COMPILE COIN DIGIT IMAGE DATABASES


The E-Sylum: Volume 20, Number 30, July 23, 2017, Article 29

Something I learned about this week from Ed Snible's A Gift for Polydektes blog is a new paper describing work advancing the science and technology of image-based coin recognition. -Editor

Xingyu Pan and Laure Tougne have published a paper describing their database of digits extracted from scans of modern coins.

The database of 3000+ images cannot be downloaded directly. The authors explain that you may request it from them.

Although the new dataset only contains digits from modern coins, the authors discuss the literature on recognition of ancient coin lettering. There are as yet no public databases of digits or letters extracted from images of ancient coins.

Here's an excerpt from the paper. -Editor

Since the release date struck on a coin is important information about its monetary type, recognition of extracted digits may assist in identification of monetary types. However, digit images extracted from coins are challenging for conventional optical character recognition methods because the foreground of such digits very often has the same color as their background. In addition, other noise, including the wear of coin metal, makes it more difficult to obtain a correct segmentation of the character shape. To address those challenges, this article presents the CoinNUMS database for automatic digit recognition.

Nowadays, character recognition methods have been widely implemented in various applications in our daily life, such as license plate recognition systems, mail sorting machines, and Google Instant Camera Translation. For each category of application, appropriate databases are essential for developing adapted algorithms. Even though a large number of applications are developed on small sets of private data collected by researchers, public databases play an increasingly important role in developing generalized methods and launching competitions between different approaches.

Characters having the same color in foreground as in background, like characters on coins, draw our attention. As a matter of fact, such characters are seldom studied in character recognition, but they are widely present in sculptures, industrial products, coins, artworks, etc. Let us call “hollow-type font” such characters that have the same color or texture in foreground as in background. Up to now, there is no specific database of characters in hollow-type font.
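As a rough illustration of the hollow-type problem described above (this sketch is not from the paper), the Python below builds a tiny synthetic image in which a raised square "stroke" has exactly the same intensity as its background, and only its edges differ because one side is lit and the other is shadowed. All names (`make_relief_image`, `threshold`, `horizontal_gradient`) and all intensity values are assumptions invented for this example. It compares a plain intensity threshold with a simple horizontal gradient.

```python
def make_relief_image(size=12, lo=3, hi=9, base=0.5, rim=0.3):
    """Synthetic 'hollow-type' stroke (hypothetical toy data).

    The raised region spans columns/rows lo..hi-1 and has the same
    intensity as the background. Only its left edge is brighter (lit)
    and its right edge darker (shadowed), as under angled lighting.
    """
    img = [[base] * size for _ in range(size)]
    for y in range(lo, hi):
        img[y][lo] = base + rim       # lit edge
        img[y][hi - 1] = base - rim   # shadowed edge
    return img


def threshold(img, t):
    """Binary mask: 1 where intensity is strictly above t, else 0."""
    return [[1 if v > t else 0 for v in row] for row in img]


def horizontal_gradient(img):
    """Absolute central difference along x (0.0 on the image borders)."""
    h, w = len(img), len(img[0])
    return [
        [abs(img[y][x + 1] - img[y][x - 1]) if 0 < x < w - 1 else 0.0
         for x in range(w)]
        for y in range(h)
    ]


img = make_relief_image()
mask = threshold(img, 0.5)          # intensity-based segmentation
grad = horizontal_gradient(img)     # edge-based cue

# Interior of the stroke (between the lit and shadowed edges).
interior = [(y, x) for y in range(3, 9) for x in range(4, 8)]
interior_found = sum(mask[y][x] for y, x in interior)
edge_columns = sorted({x for row in grad for x, g in enumerate(row) if g > 0.1})

print("pixels above threshold:", sum(map(sum, mask)))
print("interior pixels segmented:", interior_found)
print("columns with strong gradient:", edge_columns)
```

In this toy case the threshold picks up only the lit edge and none of the stroke interior, while the gradient responds on both sides of the stroke. That is one way to see why raw intensity is a weak cue for such characters; real coin images are of course far noisier than this sketch.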

As image-based coin recognition has become an active research topic over the past decade, various algorithms based on training global features (Fukumi et al., 1992; Huber et al., 2005; Van Der Maaten and Poon, 2006; Reisert et al., 2007) or matching local features (Kampel and Zaharieva, 2008; Arandjelovic, 2010; Pan et al., 2014) have been proposed.

At the beginning, studies were limited to the classification of modern coins, where legends were treated no differently from other patterns in relief. In our opinion, there are two reasons why researchers were not interested in reading coin legends. First, it is a real challenge to obtain good results in detecting and recognizing coin characters in hollow-type font because of their hard-to-segment foreground, not to mention their extremely high variations in terms of font, size, tilt, etc., and the possible overlap with other decorative patterns (see Figure 2). Second, most legends may stay unchanged among different coin types of one country in the same period, for example, “LIBERTE EGALITE FRATERNITE” on different French franc coins. Thus, differences in overall appearance distinguish coin classes more easily than local legends do.

The first attempts to deal with characters in hollow-type font were made by researchers trying to classify ancient coins. When classifying manually struck ancient coins, the overall appearance is sometimes confusing due to a high intra-class difference. For example, in some cases for ancient coins with an emperor portrait as the main pattern, different portraits can depict the same emperor and similar portraits can depict different emperors. Therefore, reading legends indicating the emperor's name can help greatly in classifying those ancient coins. To solve such challenges with characters extracted from ancient coins, researchers have opted for solutions rooted in object recognition rather than character recognition methods.

Recently, several databases of coin photographs have been built, mainly for image-based coin recognition. Establishing such a database at a decent scale is much more difficult than building one for face recognition, for example, due to the limited availability of rare coins and the high photograph quality required. Therefore, most online amateur photographs taken by coin dealers and collectors cannot be used as eligible data. Museums, numismatic companies, and research institutions need to make an effort to collect coins and take photographs under strictly controlled conditions. Meanwhile, how to shoot quality coin photographs through a structured and systematic approach is still an open field for professional photographers.

Interestingly, even among professional coin photographs, criteria of quality can differ between application contexts. To display an esthetic and shiny coin appearance to coin dealers and collectors, one or more angled lights could be applied; however, to analyze small details on coins, a quasi-uniform light condition is expected. In general, a quasi-uniform background, minimal shadow, and constant controlled light conditions are the main characteristics of numismatic databases used in computer vision-assisted applications.

Conclusion and Perspectives
A database consisting of digit characters having indistinct foreground and background has been described in this article. It is built from professional coin photographs provided by numismatic companies. We propose this database based on two motivations. First, those characters with “hard-to-segment” foreground in hollow-type font have been less studied in previous work on character recognition. Second, due to the lack of effective algorithms to deal with characters in hollow-type font, legends on coins are difficult to read precisely by computer vision, which makes the numismatic community still rely on expensive human investigation.

Due to the limited availability of professional coin photographs, the current version of our database does not have a huge amount of data or an equal data distribution. Our future work is to continue our effort to further enlarge the database with the help of numismatic companies: with the availability of more professional coin photographs of better quality, CoinNUMS will be greatly increased in scale in its next versions; furthermore, we aim at extending CoinNUMS little by little into a more comprehensive database, perhaps renamed CoinLEGENDS, that will include letters, foreign characters, and short words as well. We hope CoinNUMS will become an initiative that draws the attention of researchers studying character recognition to some complex cases which rarely happen in general situations but could be common in a specific domain. From a numismatic perspective, we hope automatic reading of legends on coins will become as fast and precise as today's mature OCR applications.