The core structure of KBpedia is derived from six main knowledge bases: OpenCyc, UMBEL, GeoNames, DBpedia, Wikipedia and Wikidata. The conceptual relationships in the KBpedia Knowledge Ontology (KKO) are largely drawn from OpenCyc and UMBEL, though any of the other four sources may contribute local knowledge graph structure. Additional reference concepts (RCs) come primarily from GeoNames and Wikipedia. Wikidata contributes the bulk of the instance data, though instance records are drawn from all six sources. DBpedia and Wikidata are also the primary sources for attribute characterizations of the instances.
Two characteristics define a core contributor to the KBpedia structure: 1) the scale and completeness of the source; and 2) its contribution of a large number of RCs to the overall KKO knowledge graph. The six core knowledge bases are described in more detail below:
|OpenCyc||Cyc is an artificial intelligence project that combines a comprehensive ontology with a knowledge base of everyday common sense. OpenCyc is the open source portion of Cyc, with more than 300,000 concepts. Cyc has been developed manually since 1984 by Cycorp of Austin, TX, and refined through more than 1,000 person-years of effort. OpenCyc provides most of the starting reference concepts and relationships used in the KBpedia Knowledge Ontology (KKO) knowledge graph, especially in the areas of entity types and the KBpedia typologies. Though OpenCyc is English only, KKO is multilingual, using the Cyc concepts as a language-neutral backbone. As of March 2017, OpenCyc has been withdrawn from public distribution, though thousands of earlier copies remain.|
|UMBEL||UMBEL (Upper Mapping and Binding Exchange Layer) is a logically organized knowledge graph used to aid data interoperability. UMBEL promotes the semantic interoperability of information in two ways. It is 1) an ontology of about 35,000 reference concepts, designed to provide common mapping points for relating different ontologies or schema to one another, and 2) a vocabulary for making such mappings, particularly to external sources. UMBEL's design, which separates relations, attributes, concepts and entity types (and typologies), is a key intellectual underpinning of the KBpedia system.|
|GeoNames||The GeoNames knowledge base contains over 10,000,000 geographical names corresponding to over 7,500,000 unique features. All features are categorized into one of nine feature classes and further subcategorized into one of 645 feature codes. Beyond names of places in various languages, the data stored include latitude, longitude, elevation, population, administrative subdivision and postal codes. All coordinates use the World Geodetic System 1984 (WGS84). The GeoNames feature codes were used to organize all of the geographical entity types (geopolitical, settlements, natural areas, and developed areas) in KBpedia.|
|DBpedia||DBpedia is a project that extracts structured content from the infoboxes on Wikipedia and makes it available using semantic technologies. DBpedia is the source of much of the structured information for Wikipedia concepts and entities contained within KBpedia. (The separate DBpedia ontology is mapped to KBpedia as an external linkage, but is not part of the KBpedia core.)|
|Wikipedia||Wikipedia is a free Internet encyclopedia written and maintained by volunteer editors. Wikipedia is the largest and most popular general reference work on the Internet and is ranked among the ten most popular websites. It has more than 200 language versions; the English version has more than 5 million articles. The English Wikipedia is the main source of reference concept content in KBpedia and the source of most text. Though local relationships may be used, KBpedia replaces the Wikipedia category structure with its own KBpedia Knowledge Ontology (KKO). Concept and entity linkages (via Wikipedia articles) provide the means for adopting multiple language versions of KBpedia.|
|Wikidata||Wikidata is a collaboratively edited knowledge base of (mostly) entity records, intended to provide a common source of data that can be used by Wikimedia projects such as Wikipedia. There are more than 20 million records in Wikidata, which are also linked to key Wikipedia articles. Many are also linked to Wikimedia images. Wikidata is the primary source of entity and attribute data and images in KBpedia. Most Wikidata records are multilingual, which also provides the means for adopting multiple language versions of KBpedia.|
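
The division of labor among the six sources described above can be summarized as a small lookup structure. The following Python sketch is illustrative only: the `Role` enum, the `PRIMARY_ROLES` mapping and the `sources_for` helper are hypothetical names introduced here, not part of KBpedia, and the mapping records only the primary roles stated in the text (any source may also contribute local graph structure or instance records).

```python
from enum import Enum


class Role(Enum):
    """Primary contributions named in the text (hypothetical labels)."""
    CONCEPT_RELATIONSHIPS = "conceptual relationships in KKO"
    REFERENCE_CONCEPTS = "additional reference concepts (RCs)"
    INSTANCE_DATA = "bulk of the instance data"
    ATTRIBUTES = "attribute characterizations of instances"


# Primary role(s) of each core knowledge base, as described in the text.
# This is a simplification: secondary contributions are not modeled.
PRIMARY_ROLES = {
    "OpenCyc": {Role.CONCEPT_RELATIONSHIPS},
    "UMBEL": {Role.CONCEPT_RELATIONSHIPS},
    "GeoNames": {Role.REFERENCE_CONCEPTS},
    "Wikipedia": {Role.REFERENCE_CONCEPTS},
    "DBpedia": {Role.ATTRIBUTES},
    "Wikidata": {Role.INSTANCE_DATA, Role.ATTRIBUTES},
}


def sources_for(role):
    """Return the names of the sources whose primary roles include `role`."""
    return sorted(name for name, roles in PRIMARY_ROLES.items() if role in roles)


if __name__ == "__main__":
    for role in Role:
        print(f"{role.value}: {', '.join(sources_for(role))}")
```

Keeping the roles as sets makes it easy to represent a source such as Wikidata, which the text names as a primary contributor of both instance data and attributes.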