Cataloguing Writing Systems of Korean Language Works in HathiTrust, or Why I Wish They Would – Translation Networks

Welcome to My HathiTrust Collection!

One of my life’s great passions is the Korean language, so when I was first introduced to HathiTrust, of course I wanted to explore the Korean language sources first. I was impressed with the collection overall, but I found that many of the works were illegible to me because of the writing system used. Korean has two writing systems: Hangeul, which is phonetic and is used almost universally today, and Hanja, which employs Sino-Korean characters (Chinese characters, though many have different meanings than when used in Chinese languages). Hanja was the dominant Korean writing system (for scholarly works in particular) until well into the 20th century. As a result, many of the Korean works in the HathiTrust collection are written predominantly in Hanja. I was surprised, though, to find that several of the works I looked at were written in a combination of scripts. I was further surprised that information on the writing system used is not provided in the metadata for Korean works in HathiTrust. This seems like useful information to provide, as the scripts are vastly different and could impact one’s ability to read the text. As a monument to these discoveries, I decided to make my HathiTrust collection out of works that use multiple writing systems, in this case Hanja and Hangeul.

A good representative of my collection is this work from 1924 entitled Kaebyeok (link). It mixes Hangeul and Hanja on the same page. I found this interesting because I was not expecting to find sources that interweave the two scripts so closely.

(For those new to Korean scripts, character blocks that look like 이것은 한글 예다 are Hangeul.)

The first page featured in the screencast shows large sections of Hangeul (most notably in the boxed sections) alongside other words written in Hanja. The second page is purely written in Hanja. (link to pages)

Below is the catalogue entry, which contains the metadata about this work. You’ll notice that the language is listed as Korean, but there is no information on the script that is used (beyond the title, which is listed in its original form in Hanja).

Why Should I Care About Cataloguing the Use of Hanja vs. Hangeul?

For at least two reasons! Firstly, not everyone who speaks Korean can read Hanja, or at least not at the level necessary to fully comprehend some works (it takes a lot more dedicated study than Hangeul). If someone is looking for Korean sources on a certain topic, it would be useful to be able to filter based on the script they can read. Secondly, there is historical and cultural significance to the use of Hanja vs. Hangeul. Hangeul was created in the mid-1400s to be an accessible writing system that anyone, regardless of education level, could use. For much of Korean history there were class-related differences between what was written in Hangeul vs. Hanja. If one is looking for historical sources and is curious to see different perspectives, looking for sources written in Hanja vs. in Hangeul could prove useful. Having the ability to filter based on writing system would improve the quality and accessibility of HathiTrust’s Korean language collection.

Connecting to the Sawyer Seminar on Building Translation Networks in the Midwest Using HathiTrust

One lightning talk from the seminar that related to my collection was “Chinese translation of Bengali prose poems via English,” presented by Xiaoxi Zhang. In this talk, Xiaoxi discusses how Chinese writing evolved through history in terms of character usage (simplified vs. traditional). As with Hangeul in Korean, the motivation behind implementing the simplified Chinese writing system was to improve public literacy. The result was that many modern works and translations are written using the simplified Chinese writing system. This difference is not catalogued in the metadata of Chinese sources on HathiTrust either. As with Korean, it would be useful to include this information in the metadata for Chinese sources because of the influence it has on language comprehension (Xiaoxi mentioned names, particularly transliterated names, as a place where this matters).

If I were to add Kaebyeok (or any work in my collection) to the map of works mentioned in the seminar lightning talks, I would connect it to The Baitál Pachísí. This book features three different writing systems: Hindústání, Nágarí, and English. An important distinguishing factor between Kaebyeok and The Baitál Pachísí is that the use of three writing systems is mentioned in the metadata for The Baitál Pachísí.

Connecting to Other Blog Posts

One post that connects to this one is “Cataloguing of Works in Japanese and Japanese-Related Languages in HathiTrust” by Kristen. In this post, Kristen focuses on the lack of writing-system-related cataloguing in many Japanese language sources in HathiTrust, much like with Korean. The example discussed contains a mix of Japanese and the Ainu language, and it is fascinating to learn about the implications of listing this and other sources as simply “Japanese.”

Another related post is “Translating Chinese” by Yining Zhang. In this post, Yining discusses the translation and transliteration of proper nouns and how the evolution of Chinese writing influenced these practices. This post tied together many of the connections I found with Xiaoxi Zhang’s lightning talk from the Sawyer Seminar.
