Publication page
Towards End-to-End Transformation of Arbitrary Tables from Untagged Portable Documents (PDF) to Linked Data
Authors: Shigarov A., Cherepanov I., Cherkashin E., Dorodnykh N., Khristyuk V., Mikhailov A., Paramonov V., Rozhkow E., Yurin A.
Journal: CEUR Workshop Proceedings: Proc. of 2nd Scientific-Practical Workshop Information Technologies: Algorithms, Models, Systems (ITAMS'2019)
Volume: 2463
Issue:
Year: 2019
Reporting year: 2019
Publisher:
Publisher location:
URL:
Projects:
DOI:
Abstract: The paper is devoted to the problem of end-to-end table transformation from untagged portable documents (PDF) to linked data. It covers table extraction from documents, reconstruction of the logical table structure, conceptualization of the tables' natural-language content, and linking of the extracted data with external vocabularies. We consider some promising approaches to deep-learning-based table detection, heuristic-based table structure recognition, rule-based table analysis, and knowledge-based table interpretation. They can be used as a basis for developing a consistent solution to this problem. Our application experience confirms that such solutions are in demand for populating databases and generating ontologies with tabular data extracted from weakly structured and semi-structured documents.
Indexed in WOS: No
Indexed in Scopus: No
Indexed in УБС: No
Indexed in RSCI (РИНЦ): No
Indexed in VAK (ВАК): No
Indexed in CORE: No
Publication in press: 0