Multi-lingual geoparsing
Not translated

Persian title of the article: تجزیه جغرافیایی چند زبانه براساس ترجمه ماشینی
English title of the article: Multi-lingual geoparsing based on machine translation
Journal/Conference: Future Generation Computer Systems
Related fields of study: Computer Engineering
Related specializations: Algorithms and Computation
Persian keywords: شناسایی موجودیت نامدار، موقعیت، تجزیه جغرافیایی، چند زبانه، ترجمه ماشینی، صف بندی کلمه
English keywords: Named entities recognition, Location, Geoparse, Multi-lingual, Machine translation, Word Alignment
Manuscript type: Research Article
Indexing: Scopus – Master Journals List – JCR
Digital Object Identifier (DOI): http://dx.doi.org/10.1016/j.future.2017.07.057
University: State Key Laboratory of Software Engineering, Computer School, Wuhan University, China
Publisher: Elsevier
Presentation type: Journal
Article classification: ISI
Publication year: 2019
Impact factor: 7.007 in 2018
H-index: 93 in 2019
SJR: 0.835 in 2018
ISSN: 0167-739X
Quartile: Q1 in 2018
Format of the English article: PDF
Number of pages of the English article: 11
Translation status: Not translated
Price of the English article: Free
Is this a base paper: No
Product code: E12090
English table of contents

Abstract


1. Introduction


2. Related work


3. Our multi-lingual geoparser, LanguageBridge


4. Data


5. Evaluation of our LanguageBridge prototype for multi-lingual geoparsing


6. Conclusion


Acknowledgments


References

Sample of the English text of the article

Abstract


Our method for multi-lingual geoparsing uses monolingual tools and resources along with machine translation and alignment to return location words in many languages. Not only does our method save the time and cost of developing geoparsers for each language separately, but it also allows a wide range of language capabilities within a single interface. We evaluated our method in our LanguageBridge prototype on location named entities using newswire, broadcast news and telephone conversation data in English, Arabic and Chinese from the Linguistic Data Consortium (LDC). Our results for geoparsing Chinese and Arabic text using our multi-lingual geoparsing method are comparable to our results for geoparsing English text with our English tools. Furthermore, in our experiments the accuracy of our tools on machine-translated text approaches their accuracy on the same data translated manually, further showing the robustness of locations to machine translation.


Introduction


Named Entity Recognition is central to many Natural Language Processing tasks, including information retrieval, question answering, data mining and text analysis. Finding named entities in different languages is often approached by developing tools for each language separately. NLP tools for English are widely developed and used, and can be downloaded easily from the Internet. However, minority languages such as Mongolian and Vietnamese have few useful NLP tools. In this paper, our method aims to reduce development time for Named Entity Recognition tools by processing text in a single language via machine translation. We assume that our method extends to person and organization named entities, although our research focus is on named entities for location.

Named entities for location. Named Entity Recognition typically encompasses named entities for person, organization and location. Our focus for experimentation is on named entities for location, which we alternately refer to as toponyms. That is because our ultimate goal is to produce not only the locations, but also the geographic coordinates for each location. Our results can be displayed on a geographic map, if desired.

Logic of method. The previous version of our English geoparser can find location named entities in high-quality English text, as well as in English text produced by machine translation from other languages. Our method builds on a finding from our previous research: locating places in Spanish tweets with a geoparser trained for Spanish was less accurate than geoparsing an English translation of the same tweets with a geoparser trained for English [1]. Similar results were found when using machine translation and English tools to find named entities in source texts in Swahili and Arabic [2]. In fact, statistical machine translation is often used for cross-language information retrieval [3].
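
The idea described above (translate into English, geoparse the English text, then use word alignment to return the location words in the source language) can be illustrated with a small sketch. This is not the LanguageBridge implementation: the function names, the tiny gazetteer standing in for an English geoparser, and the hand-written alignment are all assumptions made for illustration. In a real system the translation and the alignment would come from machine translation and word alignment tools.

```python
from typing import List, Set, Tuple

# Hypothetical representation: (source_index, english_index) word-alignment pairs.
Alignment = List[Tuple[int, int]]


def toy_english_geoparser(tokens: List[str]) -> Set[int]:
    """Stand-in for an English geoparser: lookup in a tiny, made-up gazetteer."""
    gazetteer = {"Wuhan", "China"}
    return {i for i, tok in enumerate(tokens) if tok in gazetteer}


def project_locations(
    source_tokens: List[str],
    alignment: Alignment,
    english_location_indices: Set[int],
) -> List[str]:
    """Map locations tagged in the English translation back to source-language words."""
    # Source positions aligned to any English token tagged as a location.
    source_hits = sorted(
        {src for src, eng in alignment if eng in english_location_indices}
    )
    # Merge adjacent source positions into contiguous location phrases.
    phrases: List[List[int]] = []
    for idx in source_hits:
        if phrases and idx == phrases[-1][-1] + 1:
            phrases[-1].append(idx)
        else:
            phrases.append([idx])
    return [" ".join(source_tokens[i] for i in phrase) for phrase in phrases]


# Toy example: a segmented Chinese sentence and an assumed English translation.
source_tokens = ["她", "住", "在", "武汉"]
english_tokens = ["She", "lives", "in", "Wuhan"]
alignment = [(0, 0), (1, 1), (2, 2), (3, 3)]  # assumed, written by hand

english_hits = toy_english_geoparser(english_tokens)
print(project_locations(source_tokens, alignment, english_hits))  # ['武汉']
```

Merging adjacent source positions keeps multi-word place names together, at the cost of also joining two distinct locations that happen to be neighbours in the source sentence. A real system would need a more careful rule.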
