
Semantics research is about how the meaning of a sentence is determined from its parts and the way the parts are put together. Semantics at Penn focuses on several new approaches to the field, including LTAG semantics and underspecification as well as the application of game theory.

Florian Schwarz and Robin Clark lead Penn's research in formal semantics, mathematical linguistics, and computational semantics. Schwarz's main research interests are in the formal semantics and pragmatics of natural language and in their relationship to each other, in particular with respect to the role of context in interpretation. He has worked, most recently, on the analysis of different types of definite articles in German, couched within a situation semantics. This work also bears on our understanding of various types of implicit content, e.g., the issue of domain restriction and implicit arguments, and on theories of quantification and covariation. Other topics he has worked on include the syntactic encoding of information structure in Kikuyu and the analysis of intensional transitive verbs (e.g., need in I need a beer). In addition to his theoretical work, he is engaged in several types of research that provide new empirical perspectives on semantics and pragmatics, and is in the process of setting up a lab for the experimental study of meaning-related phenomena. His work in this area has looked at, among other things, the processing of pragmatic content, such as presuppositions and implicatures. Furthermore, he has begun to develop, in collaboration with Chris Potts (Stanford), an approach for investigating expressive aspects of meaning and their relation to context using large online corpora.

Clark has been pursuing a line of research that began as an attempt to address the issue of language learnability. About a decade ago his work on Kolmogorov complexity led him to an automata-theoretic characterization of quantifiers. He then needed to confront the fact that, while first-order quantifiers can be simulated with finite-state machines, more complex quantifiers require more complex machines and more memory. Since it seems likely that processing different types of quantifiers might involve different parts of the brain, Clark is collaborating with Murray Grossman of the School of Medicine to test that hypothesis experimentally.
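The contrast between finite-state quantifiers and quantifiers that need more memory can be sketched in a few lines of Python. This is only an illustration in the spirit of an automata-theoretic treatment, not Clark's own formulation: the encoding (one boolean per member of the restrictor set, True when that member satisfies the predicate) and all function names are assumptions made for the example.

```python
from typing import Iterable


def every(bits: Iterable[bool]) -> bool:
    """Two-state machine: start in 'accept', fall into 'reject' on any False."""
    state = "accept"
    for satisfied in bits:
        if not satisfied:
            state = "reject"  # absorbing state: one counterexample suffices
    return state == "accept"


def some(bits: Iterable[bool]) -> bool:
    """Two-state machine: start in 'reject', move to 'accept' on any True."""
    state = "reject"
    for satisfied in bits:
        if satisfied:
            state = "accept"  # absorbing state: one witness suffices
    return state == "accept"


def most(bits: Iterable[bool]) -> bool:
    """Not finite-state: needs an unbounded counter (here, an integer)."""
    balance = 0  # number of True items minus number of False items
    for satisfied in bits:
        balance += 1 if satisfied else -1
    return balance > 0
```

The first two functions only ever remember which of two states they are in, whereas `most` has to keep a running count whose size is not bounded in advance; that difference in memory demand is the point the paragraph above makes.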

His unique perspective on linguistics has led him to pursue a new game-theoretic approach to the semantics of quantifiers and more generally a model of meaning rooted in the cooperative interaction between social agents. He is now collaborating with Prashant Parikh on the analysis of discourse anaphora using classic game theory; they have extended the game-theoretic analysis to other semantic and pragmatic problems. Clark introduces his students to important alternative approaches to semantics, including game theory and Categorial Grammar, which has been advanced by former Penn professor Mark Steedman, now at the University of Edinburgh.

Anna Papafragou brings developmental expertise to the study of natural language meaning at Penn. She asks how young children acquire the semantics of various terms in their language, including space and motion expressions, numbers, quantifiers, epistemic modals and evidentials. A large part of her work addresses the way in which semantic content combines with pragmatic inference in both young and more mature (adult) users of the language. Papafragou has worked extensively on the development of conversational (especially scalar) implicatures and other aspects of the semantics-pragmatics interface. She is especially interested in cross-linguistic work that seeks to uncover semantic and pragmatic universals, and in links between pragmatic phenomena and the cognitive capacity to think about others’ mental states.


Topics in Semantics and Pragmatics

This course will provide a comprehensive overview of the empirical patterns, analytical challenges and broader theoretical issues surrounding a particular topic, such as information structure, presupposition, scalar implicature, binding, aspectual composition, nominal reference, and so forth.

Semantics and Pragmatics


Semantics and pragmatics research at Stanford aims to develop theoretical models that appreciate and explain the complexity of meaning witnessed in language use.

The Stanford semantics and pragmatics community encompasses a broad range of interests including:

  • Lexical semantics
  • Formal semantics and pragmatics, and their interfaces with syntax
  • Psycholinguistics
  • Numerous sub-areas of psychology, philosophy, and computer science

We share the goal of grounding theories of meaning in diverse research methodologies, including:

  • Formal linguistic analysis
  • Psycholinguistic experimentation
  • Computational modeling
  • Corpus studies
  • Fieldwork on less widely studied languages

We offer courses ranging from introductory courses at the undergraduate and graduate levels to seminars on cutting-edge topics and workshops focused on developing key research skills.

Recent seminar topics have included:

  • Temporal interpretation
  • Experimental studies of quantification
  • Definiteness
  • Conditionals

Faculty and students participate in a wide range of research projects, many of them collaborative. Many also span subfields of linguistics or reach out to neighboring disciplines. There are usually informal reading or working groups that reflect the community's ever-evolving research interests, as well as occasional larger gatherings, such as Construction of Meaning Workshops and an annual SemFest.

Ongoing Labs & Projects

  • Computational Semantics in the Stanford NLP Group
  • CSLI Computational Semantics Lab
  • CSLI Language and Natural Reasoning Group
  • CSLI Pragmatic Enrichment and Contextual Inference Lab
  • CSLI Psychosemantics Lab
  • The interActive Language Processing (ALPS) Lab


Linguistics


About this page

There are specialized reference resources in all branches of linguistics. Most of these resources can be found either electronically or in the first floor reference area.

  • Last Updated: Mar 25, 2024 11:30 AM
  • URL: https://guides.nyu.edu/linguistics


Lexical Semantics

  • Dirk Geeraerts, University of Leuven
  • https://doi.org/10.1093/acrefore/9780199384655.013.29
  • Published online: 25 January 2017

Lexical semantics is the study of word meaning. Descriptively speaking, the main topics studied within lexical semantics involve either the internal semantic structure of words, or the semantic relations that occur within the vocabulary. Within the first set, major phenomena include polysemy (in contrast with vagueness), metonymy, metaphor, and prototypicality. Within the second set, dominant topics include lexical fields, lexical relations, conceptual metaphor and metonymy, and frames. Theoretically speaking, the main approaches that have succeeded each other in the history of lexical semantics are prestructuralist historical semantics, structuralist semantics, and cognitive semantics. These frameworks differ as to whether they take a system-oriented rather than a usage-oriented approach to word-meaning research but, at the same time, in the historical development of the discipline, they have each contributed significantly to the descriptive and conceptual apparatus of lexical semantics.

  • structuralism
  • cognitive semantics
  • lexical field theory
  • componential analysis
  • semasiology
  • onomasiology

Lexical semantics is the study of word meaning. The following first presents an overview of the main phenomena studied in lexical semantics and then charts the different theoretical traditions that have contributed to the development of the field. The focus lies on the lexicological study of word meaning as a phenomenon in its own right, rather than on the interaction with neighboring disciplines. This implies that morphological semantics, that is, the study of the meaning of morphemes and the way in which they combine into words, is not covered, as it is usually considered a separate field from lexical semantics proper. Similarly, the interface between lexical semantics and syntax will not be discussed extensively, as it is considered to be of primary interest for syntactic theorizing. There is no room to discuss the relationship between lexical semantics and lexicography as an applied discipline. For an entry-level text on lexical semantics, see Murphy (2010); for a more extensive and detailed overview of the main historical and contemporary trends of research in lexical semantics, see Geeraerts (2010).

1 The Descriptive Scope of Lexical Semantics

The main phenomena studied by lexical semantics are organized along two dimensions. First, it makes a difference whether we look at semantic phenomena within individual words or whether we look at meaningful structures within the vocabulary as a whole. Terminologically, this difference of perspective can be expressed by referring to a ‘semasiological’ and an ‘onomasiological’ perspective. (Semasiology looks at the relationship between words and meaning with the word as starting point: it is basically interested in the polysemy of words. Onomasiology takes the converse perspective: given a concept to be expressed or a thing to be categorized, what options does a language offer, and how are the choices made?) Second, a distinction needs to be made between an approach that focuses on elements and relations only and one that takes into account the differences of structural weight between those elements and relations. Even though the terms are not perfect, we can use the terms ‘qualitative approach’ and ‘quantitative approach’ to refer to this second distinction. If we cross-classify the two distinctions, we get four groups of topics. ‘Qualitative’ semasiology deals with word senses and the semantic links among those senses, like metaphor and metonymy at the level of individual words. ‘Qualitative’ onomasiology deals with the semantic relations among lexical items, like lexical fields and lexical relations. ‘Quantitative’ semasiology deals with prototype effects: differences of salience and structural weight within an item or a meaning. ‘Quantitative’ onomasiology deals with salience effects in the lexicon at large, like basic-level phenomena.

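Table 1. The Descriptive Scope of Lexical Semantics

  • ‘Qualitative’ semasiology: word senses and the semantic links among those senses (e.g., metaphor and metonymy at the level of individual words)
  • ‘Qualitative’ onomasiology: semantic relations among lexical items (e.g., lexical fields and lexical relations)
  • ‘Quantitative’ semasiology: prototype effects, that is, differences of salience and structural weight within an item or a meaning
  • ‘Quantitative’ onomasiology: salience effects in the lexicon at large (e.g., basic-level phenomena)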

The four groups of topics are summarized in Table 1. As will be seen later, this schematic representation is also useful to identify the contribution of the various theoretical approaches that have successively dominated the evolution of lexical semantics.

1.1 Polysemy and vagueness

Establishing which meanings a word has is arguably the basic step in lexical semantic research. Polysemy is the common term for the situation in which a lexical item has more than one meaning, such as when late can mean ‘after the usual, expected, or agreed time’ (I am late again), ‘advanced in day or night’ (a late dinner), or ‘no longer alive’ (my late aunt Polly). Terminologically speaking, polysemy needs to be contrasted with homonymy and, more importantly, vagueness. When two (or more) words have the same shape, such as bank (‘slope, elevation in sea or river bed’) and bank (‘financial institution’), they are homonyms; whereas polysemy refers to multiplicity of meaning within a single word, the multiplicity is distributed over various words in the case of homonymy. As such, making a distinction between polysemy and homonymy comes down to determining whether we are dealing with one and the same word or with two different ones. The distinction between vagueness and polysemy involves the question of whether a particular piece of semantic information is part of the underlying semantic structure of the item or is the result of a contextual (and hence pragmatic) specification. For instance, neighbor is not polysemous between the readings ‘male dweller next door’ and ‘female dweller next door,’ in the sense that the utterance my neighbor is a civil servant will not be recognized as requiring disambiguation in the way that she is smart might (Do you mean ‘bright’ or ‘stylish’?). The semantic information that is associated with the item neighbor in the lexicon does not, in other words, contain a specification regarding sex; neighbor is vague (or general, or unspecified) as to the dimension of gender.

To decide between polysemy and vagueness, a number of tests can be invoked. The three main ones are the following. First, from a truth-theoretical point of view, a lexical item is polysemous if it can simultaneously be clearly true and clearly false of the same referent. Considering the readings ‘harbor’ and ‘fortified sweet wine from Portugal’ of port, the polysemy of that item is established by sentences such as Sandeman is a port (in a bottle), but not a port (with ships). This criterion basically captures a semantic intuition: are two interpretations of a given expression intuitively sufficiently dissimilar so that one may be said to apply and the other not?

Second, linguistic tests involve syntactic rather than semantic intuitions. Specifically, they are based on acceptability judgments about sentences that contain two related occurrences of the item under consideration (one of which may be implicit). If the grammatical relationship between both occurrences requires their semantic identity, the resulting sentence may be an indication for the polysemy of the item. For instance, the so-called identity test involves ‘identity-of-sense anaphora.’ Thus, at midnight the ship passed the port, and so did the bartender is awkward if the two lexical meanings of port are at stake. Disregarding puns, it can only mean that the ship and the bartender alike passed the harbor, or conversely that both moved a particular kind of wine from one place to another. A mixed reading, in which the first occurrence of port refers to the harbor and the second to wine, is normally excluded. By contrast, the fact that the notions ‘vintage sweet wine from Portugal’ and ‘blended sweet wine from Portugal’ can be combined in Vintage Noval is a port, and so is blended Sandeman indicates that port is vague rather than polysemous with regard to the distinction between blended and vintage wines.

Third, the definitional criterion specifies that an item has more than one lexical meaning if there is no minimally specific definition covering the extension of the item as a whole, and that it has no more lexical meanings than there are maximally general definitions necessary to describe its extension. Definitions of lexical items should be maximally general in the sense that they should cover as large a subset of the extension of an item as possible. Thus, separate definitions for ‘blended sweet fortified wine from Portugal’ and ‘vintage sweet fortified wine from Portugal’ could not be considered definitions of lexical meanings, because they can be brought together under the definition ‘sweet fortified wine from Portugal.’ On the other hand, definitions should be minimally specific in the sense that they should be sufficient to distinguish the item from other nonsynonymous items. A maximally general definition covering both port ‘harbor’ and port ‘kind of wine’ under the definition ‘thing, entity’ is excluded because it does not capture the specificity of port as distinct from other words.

The distinction between polysemy and vagueness is not unproblematic, methodologically speaking. An examination of different basic criteria for distinguishing between polysemy and vagueness reveals, first, that those criteria may be in mutual conflict (in the sense that they need not lead to the same conclusion in the same circumstances) and, second, that each of them taken separately need not lead to a stable distinction between polysemy and vagueness (in the sense that what is a distinct meaning according to one of the tests in one context may be reduced to a case of vagueness according to the same test in another context). Without going into detail (for a full treatment, see Geeraerts, 1993), let us illustrate the first type of problem. In the case of autohyponymous words, for instance, the definitional approach does not reveal an ambiguity, whereas the truth-theoretical criterion does. Dog is autohyponymous between the readings ‘Canis familiaris,’ contrasting with cat or wolf, and ‘male Canis familiaris,’ contrasting with bitch. A definition of dog as ‘male Canis familiaris,’ however, does not conform to the definitional criterion of maximal coverage, because it defines a proper subset of the ‘Canis familiaris’ reading. On the other hand, the sentence Lady is a dog, but not a dog, which exemplifies the truth-theoretical criterion, cannot be ruled out as ungrammatical.

1.2 Semantic Relations

Once senses are identified (and assuming they can be identified with a reasonable degree of confidence), the type of relationship that exists between them needs to be established. The most common classification of semantic relations emerges from the tradition of historical semantics, that is, the vocabulary used to describe synchronic relations between word meanings is essentially the same as the vocabulary used to describe diachronic changes of meaning. In the simplest case, if sense a is synchronically related to sense b by metonymy, then a process of metonymy has acted diachronically to extend sense a to sense b: diachronic mechanisms of semasiological change reappear synchronically as semantic relations among word meanings.

The four basic types are specialization, generalization, metaphor, and metonymy (described here, from a diachronic perspective, as mechanisms rather than synchronic relations). In the case of semantic specialization, the new meaning is a restriction of the old meaning: the new meaning is a subcase of the old. In the case of semantic generalization, the reverse holds: the old meaning is a subcase of the new. Classical examples of specialization are corn (originally a cover-term for all kinds of grain, now specialized to ‘wheat’ in England, to ‘oats’ in Scotland, and to ‘maize’ in the United States), starve (moving from ‘to die’ to ‘to die of hunger’), and queen (originally ‘wife, woman,’ now restricted to ‘king’s wife, or female sovereign’). Examples of generalization are moon (primarily the earth’s satellite, but extended to any planet’s satellite), and French arriver (which originally meant ‘to reach the river’s shore, to embank,’ but which now signifies ‘to reach a destination’ in general). There is a lot of terminological variation in connection with specialization and generalization. ‘Restriction’ and ‘narrowing’ of meaning equal ‘specialization,’ while ‘extension,’ ‘schematization,’ and ‘broadening’ of meaning equal ‘generalization.’ Also, the meanings involved can be said to entertain relations of taxonomical subordination or superordination: in a taxonomy (a tree-like hierarchical classification) of concepts, the specialized meaning is subordinate with regard to the original one, whereas the generalized meaning is superordinate with regard to the original.
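The subcase relation that defines specialization and generalization can be made concrete by treating each meaning as a set of things it applies to. The sketch below is a simplification under that assumption; the toy extensions (in particular the satellite names Europa and Titan) and the function name `classify_change` are introduced only for illustration.

```python
def classify_change(old: frozenset, new: frozenset) -> str:
    """Classify a meaning change by comparing toy extensions as sets."""
    if new < old:   # new meaning is a proper subcase of the old one
        return "specialization"
    if old < new:   # old meaning is a proper subcase of the new one
        return "generalization"
    return "other"  # e.g. metaphor or metonymy: no subset relation


# 'corn': from all kinds of grain to 'maize' (United States usage).
corn_old = frozenset({"wheat", "oats", "maize"})
corn_new = frozenset({"maize"})

# 'moon': from the earth's satellite to any planet's satellite.
moon_old = frozenset({"Earth's moon"})
moon_new = frozenset({"Earth's moon", "Europa", "Titan"})  # illustrative members

print(classify_change(corn_old, corn_new))
print(classify_change(moon_old, moon_new))
```

The same subset test mirrors the taxonomical wording used above: a specialized meaning is subordinate to the original one, a generalized meaning superordinate to it.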

As with specialization and generalization, it is convenient and customary to introduce metaphor and metonymy together, even though the relationship is not as close as with the former pair. (More on metaphor and metonymy follows in section 1.6, “Conceptual Metaphor and Metonymy.”) Metaphor is then said to be based on a relationship of similarity between the old and the new reading, and metonymy on a relationship of contiguity. Current computer terminology yields examples of both types. The desktop of your computer screen, for instance, is not the same as the desktop of your office desk, except that in both cases, it is the space (a literal space in one case, a virtual one in the other) where you position a number of items that you regularly use or that urgently need attention. The computer desktop, in other words, is not literally a desktop in the original sense, but it has a functional similarity with the original: the computer reading is a metaphorical extension of the original office furniture reading. Functional similarities also underlie metaphorical expressions like bookmark, clipboard, file, folder, cut, and paste. Mouse, on the other hand, is also metaphorically motivated, but here, the metaphorical similarity involves shape rather than function. But now consider a statement to the effect that your desktop will keep you busy for the next two weeks, or that you ask aloud where your mouse has gone when you are trying to locate the pointer on the screen. In such cases, desktop and mouse are used metonymically. In the former case, it’s not the virtual space as such that is relevant, but the items that are stored there. In the latter case, it’s not the mouse as such (the thing that you hold in your hand) that you refer to, but the pointer on the screen that is operated by the mouse. The desktop and the stored items, or the mouse and the pointer, have a relationship of real-world connectedness that is usually captured by the notion of ‘contiguity.’ When, for instance, one drinks a whole bottle, it is not the bottle but merely its contents that are consumed: bottle can be used to refer to a certain type of container, and the (spatially contiguous) contents of that container. When lexical semanticians state that metonymical changes are based on contiguity, contiguity should not be understood in a narrow sense as referring to spatial proximity only, but more broadly as a general term for various associations in the spatial, temporal, or causal domain.

1.3 Lexical Fields and Componential Analysis

A lexical field is a set of semantically related lexical items whose meanings are mutually interdependent. The single most influential study in the history of lexical field theory is Trier’s (1931) monograph, in which he presents a theoretical formulation of the field approach and investigates how the terminology for mental properties evolves from Old High German up to the beginning of the 13th century. Theoretically, Trier emphasizes that only a mutual demarcation of the words under consideration can provide a decisive answer regarding their exact value. Words should not be considered in isolation, but in their relationship to semantically related words: demarcation is always a demarcation relative to other words.

While different conceptions of the notion ‘lexical field’ were suggested after Trier’s initial formulation, the most important development is the emergence of componential analysis as a technique for formalizing the semantic relationships between the items in a field: once a lexical field has been demarcated, the internal relations within the field will have to be described in more detail. It is not sufficient to say that the items in the field are in mutual opposition; these oppositions will have to be identified and defined. Componential analysis is a method for describing such oppositions that takes its inspiration from structuralist phonology: just as phonemes are described structurally by their position on a set of contrastive dimensions, words may be characterized on the basis of the dimensions that structure a lexical field. Componential analysis provides a descriptive model for semantic content, based on the assumption that meanings can be described on the basis of a restricted set of conceptual building blocks: the semantic ‘components’ or ‘features.’

A brief illustration of the principles of componential analysis is given by Pottier (1964), who provides an example of a componential semantic analysis in his description of a field consisting of, among others, the terms siège, pouf, tabouret, chaise, fauteuil, and canapé (a subfield of the field of furniture terms in French). The word which acts as a superordinate to the field under consideration is siège, ‘seating equipment with legs.’ If we use the dimensions s1 ‘for seating,’ s2 ‘for one person,’ s3 ‘with legs,’ s4 ‘with back,’ s5 ‘with armrests,’ s6 ‘of rigid material,’ then chaise ‘chair’ can be componentially defined as [+ s1, + s2, + s3, + s4, − s5, + s6], and canapé ‘sofa’ as [+ s1, − s2, + s3, + s4, + s5, + s6], and so on.
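The two componential definitions just given can be written down as feature bundles and compared mechanically. The following is a minimal sketch, not Pottier's own notation beyond the s1 to s6 labels: the dictionary layout and the helper names `contrast` and `render` are assumptions for the example, and only the two items whose values are stated above are encoded.

```python
# Dimension labels as glossed in the text.
DIMENSIONS = {
    "s1": "for seating",
    "s2": "for one person",
    "s3": "with legs",
    "s4": "with back",
    "s5": "with armrests",
    "s6": "of rigid material",
}

# Feature values for the two items defined in the text (True = '+', False = '−').
FIELD = {
    "chaise": {"s1": True, "s2": True, "s3": True, "s4": True, "s5": False, "s6": True},
    "canapé": {"s1": True, "s2": False, "s3": True, "s4": True, "s5": True, "s6": True},
}


def contrast(a: str, b: str) -> list:
    """Return the dimensions on which two items of the field are opposed."""
    return [d for d in DIMENSIONS if FIELD[a][d] != FIELD[b][d]]


def render(item: str) -> str:
    """Render an item's definition in the bracketed plus/minus notation."""
    parts = [("+ " if FIELD[item][d] else "− ") + d for d in DIMENSIONS]
    return "[" + ", ".join(parts) + "]"


for dim in contrast("chaise", "canapé"):
    print(dim, DIMENSIONS[dim])
```

Under this encoding, chaise and canapé are opposed on exactly two dimensions, s2 ‘for one person’ and s5 ‘with armrests’, which is the kind of explicit opposition that componential analysis is meant to bring out.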

While componential forms of description are common in formal types of semantic description (see the historical overview in section 2, “The Theoretical Evolution of Lexical Semantics,” specifically section 2.3, “Neostructuralist Semantics”), the most important theoretical development after the introduction of componential analysis is probably Wierzbicka’s (1996) attempt to identify a restricted set of some 60 universally valid, innate components. The Natural Semantic Metalanguage aims at formulating cross-linguistically transparent definitions by means of those allegedly universal building blocks.

1.4 Lexical Relations

Like componential analysis, relational semantics, as introduced by Lyons (1963), develops the idea of describing the structural relations among related words. It, however, restricts the theoretical vocabulary to be used in such a description. In a componential analysis, the features are essentially of a ‘real world’ kind: as in Pottier’s example, they name properties of the things referred to, rather than properties of the meanings as such. But if linguistics is interested in the structure of the language rather than the structure of the world, it may want to use a descriptive apparatus that is more purely linguistic. Relational semantics looks for such an apparatus in the form of sense relations like synonymy (identity of meaning) and antonymy (oppositeness of meaning): the fact that aunt and uncle refer to the same genealogical generation is a fact about the world, but the fact that black and white are opposites is a fact about words and language. Instead of deriving statements about the synonymy or antonymy of a word (and in general, statements about the meaning relations it entertains) from a separate and independent description of the word’s meaning, the meaning of the word could be defined as the total set of meaning relations in which it participates. A traditional approach to synonymy would for instance describe the meaning of both quickly and speedily as ‘in a fast way, not taking up much time,’ and then infer the synonymy of both terms from their definitional identity. Lyons by contrast deliberately eschews such content descriptions, and equates the meaning of a word like quickly with the synonymy relation it has with speedily, plus any other relations of that kind.

In the actual practice of relational semantics, ‘relations of that kind’ specifically include, next to synonymy and antonymy, relations of hyponymy (or subordination) and hyperonymy (or superordination), which are both based on taxonomical inclusion. The major research line in relational semantics involves the refinement and extension of this initial set of relations. The most prominent contribution to this endeavor after Lyons is found in Cruse (1986). Murphy (2003) is a thoroughly documented critical overview of the relational research tradition.
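The relational idea, that a word's meaning is the total set of sense relations it participates in, can be sketched with a small set of relation triples. The synonym and antonym pairs below come from the text; the tulip, flower, plant taxonomy and all names in the code are hypothetical additions for illustration.

```python
# Each triple (a, relation, b) reads: "a is a <relation> of b".
RELATIONS = {
    ("quickly", "synonym", "speedily"),
    ("black", "antonym", "white"),
    ("tulip", "hyponym", "flower"),   # illustrative taxonomy, not from the text
    ("flower", "hyponym", "plant"),   # illustrative taxonomy, not from the text
}
SYMMETRIC = {"synonym", "antonym"}


def relational_meaning(word: str) -> set:
    """Collect (relation, other_word) pairs: 'word is a <relation> of other_word'."""
    meaning = set()
    for a, rel, b in RELATIONS:
        if a == word:
            meaning.add((rel, b))
        elif b == word and rel in SYMMETRIC:
            meaning.add((rel, a))            # symmetric relations hold both ways
        elif b == word and rel == "hyponym":
            meaning.add(("hyperonym", a))    # converse of hyponymy
    return meaning


def hyperonyms(word: str) -> set:
    """All superordinates of a word, following hyponymy transitively."""
    found, frontier = set(), [word]
    while frontier:
        current = frontier.pop()
        for a, rel, b in RELATIONS:
            if rel == "hyponym" and a == current and b not in found:
                found.add(b)
                frontier.append(b)
    return found


print(relational_meaning("flower"))
print(hyperonyms("tulip"))
```

No content description of any word appears anywhere in this structure; everything that can be said about flower is that it stands in particular relations to tulip and plant, which is the design choice that separates the relational approach from a definitional one.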

1.5 Distributional Relations

Given a Saussurean distinction between paradigmatic and syntagmatic relations, lexical fields as originally conceived are based on paradigmatic relations of similarity. One extension of the field approach, then, consists of taking a syntagmatic point of view. Words may in fact have specific combinatorial features which it would be natural to include in a field analysis. A verb like to comb, for instance, selects direct objects that refer to hair, or hair-like things, or objects covered with hair. Describing that selectional preference should be part of the semantic description of to comb. For a considerable period, these syntagmatic affinities received less attention than the paradigmatic relations, but in the 1950s and 1960s, the idea surfaced under different names. Firth (1957) for instance introduced the (now widely used) term collocation.

The distributional approach can be more radical than the mere incorporation of lexical combinatorics into the description of words: if the environments in which a word occurs could be used to establish its meaning, lexical semantics could receive a firm methodological basis. The general approach of a distributionalist method is summarized by Firth’s dictum: ‘You shall know a word by the company it keeps,’ that is, words that occur in the same contexts tend to have similar meanings. In the final decades of the 20th century, major advances in the distributional approach to semantics were achieved by applying a distributional way of meaning analysis to large text corpora. Sinclair, a pioneer of the approach, developed his ideas (see Sinclair, 1991) through his work on the Collins Cobuild English Language Dictionary, for which a 20-million-word corpus of contemporary English was compiled. In Sinclair’s original conception, a collocational analysis is basically a heuristic device to support the lexicographer’s manual work. A further step in the development of the distributional approach was taken through the application of statistics as a method for establishing the relevance of a collocation and, more broadly, for analyzing the distributional co-occurrence patterns of words (see Glynn & Robinson, 2014, for a state-of-the-art overview of quantitative corpus semantics).
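The distributional idea can be illustrated with a toy computation: count which words occur near each target word, then compare the resulting context profiles. This is a minimal sketch, not the procedure of any of the works cited above; the five-sentence corpus is invented, and the window size and the use of cosine similarity are assumptions chosen for simplicity.

```python
import math
from collections import Counter, defaultdict

# Invented toy corpus; a real study would use millions of words.
CORPUS = [
    "she combs her hair",
    "he brushes his hair",
    "she combs the wool",
    "he brushes the wool",
    "they drink the wine",
]


def context_vectors(sentences, window=2):
    """Map each word to a Counter of the words found within `window` positions."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[key] for key, count in u.items() if key in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


vectors = context_vectors(CORPUS)
similar = cosine(vectors["combs"], vectors["brushes"])
dissimilar = cosine(vectors["combs"], vectors["drink"])
print(similar > dissimilar)
```

In this corpus combs and brushes share the context words hair, the, and wool, so their profiles are closer to each other than either is to drink; the counts for the direct-object position also reflect the selectional preference of to comb discussed above.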

1.6 Conceptual Metaphor and Metonymy

Metaphorical relations of the kind mentioned in section 1.2 (“Semantic Relations”) do not only exist between the readings of a given word: several words may exhibit similar metaphorical patterns. Conceptual metaphor theory, the approach introduced by Lakoff and Johnson (1980), includes two basic ideas: first, the view that metaphor is a cognitive phenomenon, rather than a purely lexical one; second, the view that metaphor should be analyzed as a mapping between two domains. To illustrate the first point, metaphor comes in patterns that transcend the individual lexical item. A typical example (Lakoff & Johnson, 1980, pp. 44–45) is the following.

love is a journey: Look how far we’ve come. We are at a crossroads. We’ll just have to go our separate ways. We cannot turn back now. We are stuck. This relationship is a dead-end street. I don’t think this relationship is going anywhere. It’s been a long, bumpy road. We have gotten off the track.

The second pillar of conceptual metaphor theory is the analysis of the mappings inherent in metaphorical patterns. Metaphors conceptualize a target domain in terms of the source domain, and such a mapping takes the form of an alignment between aspects of the source and target. For love is a journey , for instance, the following correspondences hold (compare Lakoff & Johnson, 1999 , p. 64).

Metonymies too can be systematic in the sense that they form patterns that apply to more than just an individual lexical item. Thus, the bottle example mentioned in section 1.2 (“Semantic Relations”) exhibits the name of a container (source) being used for its contents (target), a pattern that can be abbreviated as container for contents. Making use of this abbreviated notation, other common types of metonymy are the following: a spatial location for what is located there (the whole theater was in tears); a period of time for what happens in that period, for the people who live then, or for what is produced during that period (the 19th century had a nationalist approach to politics); a material for the product made from it (a cork); the origin for what originates from it (astrakhan, champagne, emmental); an activity or event for its consequences (when the blow you have received hurts, it is not the activity of your adversary that is painful, but the physical effects that it has on your body); an attribute for the entity that possesses the attribute (majesty does not only refer to ‘royal dignity or status,’ but also to the sovereign himself); and of course part for whole (a hired hand). The relations can often work in the other direction as well. To fill up the car, for instance, illustrates a type whole for part: it’s obviously only a part of the car that gets filled. For the current state of affairs in metonymy research from a cognitive semantic point of view, see Benczes, Barcelona, and Ruiz de Mendoza Ibáñez (2011).

Yet another approach to semantic structure in the lexicon focuses on the way our knowledge of the world is organized in larger ‘chunks of knowledge’ and how these interact with language. The most articulate model in this respect is Fillmore’s frame theory (Fillmore & Atkins, 1992 ; and see Ruppenhofer, Ellsworth, Petruck, Johnson, & Scheffczyk, 2006 , for the large-scale application of frame theory in the FrameNet project). Frame theory is specifically interested in the way in which language may be used to perspectivize an underlying conceptualization of the world: it’s not just that we see the world in terms of conceptual models, but those models may be verbalized in different ways. Each different way of bringing a conceptual model to expression so to speak adds another layer of meaning: the models themselves are meaningful ways of thinking about the world, but the way we express the models while talking adds perspective. This overall starting point of Fillmorean frame theory leads to a description on two levels. On the one hand, a description of the referential situation or event consists of an identification of the relevant elements and entities, and the conceptual role they play in the situation or event. On the other hand, the more purely linguistic part of the analysis indicates how certain expressions and grammatical patterns highlight aspects of that situation or event.

An illustration comes from the standard example of frame theory, the commercial transaction frame, which involves words like buy and sell. The frame can be characterized informally by a scenario in which one person gets control or possession of something from a second person, as a result of a mutual agreement through which the first person gives the second person a sum of money. Background knowledge involved in this scenario includes an understanding of ownership relations, a money economy, and commercial contracts. The basic categories needed for describing the lexical meanings of the verbs linked to the commercial transaction scene include Buyer, Seller, Goods, and Money. Verbs like buy and sell then each encode a certain perspective on the commercial transaction scene by highlighting specific elements of the scene. In the case of buy, the buyer appears in the participant role of the agent, for instance as the subject of the (active) sentence. In active sentences, the goods then appear as the direct object; the seller and the money appear in prepositional phrases: Paloma bought a book from Teresa for €30. In the case of sell, on the other hand, it is the seller that appears in the participant role of the agent: Teresa sold a book to Paloma for €30.
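The two-level description (a shared scene plus verb-specific perspectives) can be sketched as a small data structure. This is only an illustrative encoding of the buy/sell example, not the FrameNet representation: the class `VerbPerspective`, the function `realize`, and the field names are hypothetical.

```python
from dataclasses import dataclass

# Conceptual level: the roles of the commercial transaction scene.
FRAME_ROLES = ("Buyer", "Seller", "Goods", "Money")


@dataclass(frozen=True)
class VerbPerspective:
    """Linguistic level: how one verb maps frame roles onto
    grammatical functions in an active sentence (hypothetical class)."""
    past_form: str
    subject: str         # role realized as subject (the agent)
    direct_object: str   # role realized as direct object
    obliques: dict       # role -> preposition introducing it


BUY = VerbPerspective("bought", "Buyer", "Goods", {"Seller": "from", "Money": "for"})
SELL = VerbPerspective("sold", "Seller", "Goods", {"Buyer": "to", "Money": "for"})


def realize(perspective, event):
    """Verbalize one and the same event from the given verb's perspective."""
    parts = [event[perspective.subject], perspective.past_form,
             event[perspective.direct_object]]
    for role, preposition in perspective.obliques.items():
        parts.append(f"{preposition} {event[role]}")
    return " ".join(parts)


# One referential situation, described once in terms of frame roles.
event = {"Buyer": "Paloma", "Seller": "Teresa", "Goods": "a book", "Money": "€30"}

print(realize(BUY, event))   # Paloma bought a book from Teresa for €30
print(realize(SELL, event))  # Teresa sold a book to Paloma for €30
```

The `event` dictionary stays constant while the verb changes, which mirrors the point that the verbs differ in perspective rather than in the underlying scene.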

1.8 Prototype Effects and Radial Sets

The prototype-based conception of categorization originated in the mid-1970s with Rosch’s psycholinguistic research into the internal structure of categories (see, among others, Rosch, 1975 ). Rosch concluded that the tendency to define categories in a rigid way clashes with the actual psychological situation. Linguistic categories do not have sharply delimited borderlines. Instead of clear demarcations between equally important conceptual areas, one finds marginal areas between categories that are unambiguously defined only in their focal points. This observation was taken over and elaborated in linguistic lexical semantics (see Hanks, 2013 ; Taylor, 2003 ). Specifically, it was applied not just to the internal structure of a single word meaning, but also to the structure of polysemous words, that is, to the relationship between the various meanings of a word. Four characteristics, then, are frequently mentioned in the linguistic literature as typical of prototypicality.

Prototypical categories cannot be defined by means of a single set of criterial (necessary and sufficient) attributes.

Prototypical categories exhibit a family-resemblance structure, i.e., one like the similarities that exist between relatives (some have the same typical hair color, some have the same typically shaped nose, some have the same typical eyes, but none have all and only the typical family traits); the different uses of a word have several features in common with one or more other uses, but no features are common to all uses. More generally, their semantic structure takes the form of a set of clustered and overlapping meanings (which may be related by similarity or by other associative links, such as metonymy). Because this clustered set is often built up round a central meaning, the term ‘radial set’ is often used for this kind of polysemic structure.

Prototypical categories exhibit degrees of category membership; not every member is equally representative for a category.

Prototypical categories are blurred at the edges.

By way of example, consider fruit as referring to a type of food. If you ask people to list kinds of fruit, some types come to mind more easily than others. For American and European subjects (there is clear cultural variation on this point), oranges, apples, and bananas are the most typical fruits, while pineapples, watermelons, and pomegranates receive low typicality ratings. This illustrates the third characteristic mentioned above. But now, consider coconuts and olives. Is a coconut or an olive a fruit in the ordinary everyday sense of that word? For many people, the answer is not immediately obvious, which illustrates the fourth characteristic: if we zoom in on the least typical exemplars of a category, membership in the category may become fuzzy. A category like fruit should be considered not only with regard to the exemplars that belong to it, but also with regard to the features that these category members share and that together define the category. Types of fruit do not, however, share a single set of definitional features that sufficiently distinguishes fruit from, say, vegetables and other natural foodstuffs. All are edible seed-bearing parts of plants, but most other features that we think of as typical for fruit are not general: while most are sweet, some are not, like lemons; while most are juicy, some are not, like bananas; while most grow on trees and tree-like plants, some grow on bushes, like strawberries; and so on. This absence of a neat definition illustrates the first characteristic. Instead of such a single definition, what seems to hold together the category are overlapping clusters of representative features. Whereas the most typical kinds of fruit are the sweet and juicy ones that grow on trees, other kinds may lack one or even more of these features. This then illustrates the second characteristic mentioned above.
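The first three characteristics can be made tangible with a toy feature coding of the examples just discussed. The feature assignments below are a hypothetical simplification of the text (for instance, bananas are coded as growing on tree-like plants), and counting typical features is only one crude stand-in for typicality ratings, which in Rosch's research are elicited from subjects.

```python
# Hypothetical feature coding of the fruit examples discussed above.
FRUIT_FEATURES = {
    "orange":     {"edible", "seed_bearing", "sweet", "juicy", "grows_on_trees"},
    "apple":      {"edible", "seed_bearing", "sweet", "juicy", "grows_on_trees"},
    "lemon":      {"edible", "seed_bearing", "juicy", "grows_on_trees"},
    "banana":     {"edible", "seed_bearing", "sweet", "grows_on_trees"},
    "strawberry": {"edible", "seed_bearing", "sweet", "juicy"},
}

# Features the text treats as typical of fruit but not general.
TYPICAL_FEATURES = {"sweet", "juicy", "grows_on_trees"}

# First characteristic: the features shared by all members are too general
# to distinguish fruit from other seed-bearing foodstuffs.
shared_by_all = set.intersection(*FRUIT_FEATURES.values())


def typicality(name):
    """Crude typicality score: number of typical features a member has."""
    return len(FRUIT_FEATURES[name] & TYPICAL_FEATURES)


# Third characteristic: members differ in how representative they are.
ranking = sorted(FRUIT_FEATURES, key=typicality, reverse=True)

print(shared_by_all)                     # only the two general features remain
print(TYPICAL_FEATURES & shared_by_all)  # empty: no typical feature is shared by all
print(ranking)
```

Each less typical member lacks a different feature, so the members overlap pairwise without any typical feature being common to all of them, which is the family-resemblance structure named in the second characteristic.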

The four characteristics are systematically related along two dimensions. On the one hand, the third and the fourth characteristics take into account the referential, extensional structure of a category. In particular, they consider the members of a category; they observe, respectively, that not all referents of a category are equal in representativeness for that category and that the denotational boundaries of a category are not always determinate. On the other hand, these two aspects (centrality and nonrigidity) recur on the intensional level, where the definitional rather than the referential structure of a category is envisaged. For one thing, nonrigidity shows up in the fact that there is no single necessary and sufficient definition for a prototypical concept. For another, family resemblances imply overlapping of the subsets of a category; consequently, meanings exhibiting a greater degree of overlapping will have more structural weight than meanings that cover only peripheral members of the category. As such, the clustering of meanings that is typical of family resemblances implies that not every meaning is structurally equally important (and a similar observation can be made with regard to the components into which those meanings may be analyzed).

The four characteristics are not coextensive; that is, they do not necessarily occur together. In that sense, some words may exhibit more prototypicality effects than others. In the practice of linguistics, the second feature in particular has attracted the most attention, and the radial set model (which graphically represents the way in which less central meanings branch out from the prototypical, core reading) is a popular representational format in lexical semantics; see Tyler and Evans (2001) for an example.

1.9 Basic Levels and Onomasiological Salience

Possibly the major innovation of the prototype model of categorization is to give salience a place in the description of semasiological structure: next to the qualitative relations among the elements in a semasiological structure (like metaphor and metonymy), a quantifiable center-periphery relationship is introduced as part of the architecture. But the concept of salience can also be applied to the onomasiological domain.

The initial step in the introduction of onomasiological salience is the basic-level hypothesis . The hypothesis is based on the ethnolinguistic observation that folk classifications of biological domains usually conform to a general organizational principle, in the sense that they consist of five or six taxonomical levels (Berlin, 1978 ). The basic-level hypothesis embodies a notion of onomasiological salience, because it is a hypothesis about alternative categorizations of referents: if a particular referent (a particular piece of clothing) can be alternatively categorized as a garment, a skirt, or a wrap-around skirt, the choice will be preferentially made for the basic-level category ‘skirt.’ But differences of onomasiological preference also occur among categories on the same level in a taxonomical hierarchy. If a particular referent can be alternatively categorized as a wrap-around skirt or a miniskirt, there could just as well be a preferential choice: when you encounter something that is both a wrap-around skirt and a miniskirt, the most natural way of naming that referent in a neutral context would probably be ‘miniskirt.’ If, then, we have to reckon with intra-level differences of salience next to inter-level differences, the concept of onomasiological salience has to be generalized in such a way that it relates to individual categories at any level of the hierarchy.

This notion of generalized onomasiological salience was first introduced in Geeraerts, Grondelaers, and Bakema (1994). Using corpus materials, this study established that the choice for one lexical item rather than the other as the name for a given referent is determined by the semasiological salience of the referent (i.e., the degree of prototypicality of the referent with regard to the semasiological structure of the category), by the overall onomasiological salience of the category represented by the expression, and by contextual features of a classical sociolinguistic and geographical nature, involving the competition between different language varieties. By zooming in on the last type of factor, a further refinement of the notion of onomasiological salience is introduced, in the form of the distinction between conceptual and formal onomasiological variation. Whereas conceptual onomasiological variation involves the choice of different conceptual categories for a referent (like the examples presented so far), formal onomasiological variation merely involves the use of different synonymous names for the same conceptual category. The names jeans and trousers for denim leisure-wear trousers constitute an instance of conceptual variation, for they represent categories at different taxonomical levels. Jeans and denims, however, represent no more than different (but synonymous) names for the same denotational category.
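As a rough illustration of how a naming preference could be read off corpus observations, the sketch below computes the share of each name among the names used for one kind of referent. This is a simplified stand-in, not the measure defined in the study cited above; the function `name_shares` is hypothetical and the counts are invented toy data.

```python
from collections import Counter


def name_shares(names_used):
    """Return the proportion of observations in which each name was chosen
    for the same kind of referent (hypothetical helper)."""
    counts = Counter(names_used)
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


# Invented toy data: names observed for garments that are both
# wrap-around skirts and miniskirts. Not real corpus counts.
observed = ["miniskirt"] * 6 + ["skirt"] * 3 + ["wrap-around skirt"]

shares = name_shares(observed)
preferred = max(shares, key=shares.get)

print(shares)
print(preferred)
```

On such data, the name with the highest share would count as the onomasiologically preferred alternative for that referent type, whether the competing names lie on different taxonomical levels or on the same one.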

2. The Theoretical Evolution of Lexical Semantics

Four broadly defined theoretical traditions may be distinguished in the history of word-meaning research.

2.1 Prestructuralist Historical Semantics

The prestructuralist period (ranging from the middle of the 19th century up to the 1930s) was the heyday of historical semantics, in the sense that the study of meaning change reigned supreme within semantics. The main theoretical achievement of prestructuralist historical semantics consists of various classifications of types of semantic change, coupled with considerable attention to psychological processes as the explanatory background of changes: the general mechanisms of change included in the classifications were generally considered to be based on the associative patterns of thought of the human mind. Important figures (among many others) are Hermann Paul, Michel Bréal, and Gustaf Stern (see Ullmann, 1962, for an introductory overview). With the shift toward a structuralist approach that occurred around 1930, lexical semantics switched from a preference for diachronic studies to a preference for synchronic studies. However, the poststructuralist cognitive approach provides a new impetus for historical lexical semantics.

2.2 Structuralist Semantics

Inspired by the Saussurean conception of language, structural semantics originated as a reaction against prestructural historical semantics. The origins of structural semantics are customarily attributed to Trier (1931), but while Trier’s monograph may indeed be the first major descriptive work in structural semantics, the first theoretical and methodological definition of the new approach is to be found in Weisgerber (1927), a polemical article that criticized historical linguistics on three points. First and foremost, because the vocabulary of a language is not simply an unstructured set of separate items, and because the meaning of a linguistic sign is determined by its position in the linguistic structures in which it takes part, the proper subject matter of semantics is not the atomistic changes of word meanings that historical semantics had concentrated on, but the semantic structure of the language that demarcates the meanings of individual words with regard to each other. Second, because that structure is a linguistic rather than a psychological phenomenon, linguistic meanings should not be studied from a psychological perspective, but from a purely linguistic one. And third, because semantic change has to be redefined as change in semantic structures, synchronic semantics methodologically precedes diachronic semantics: the synchronic structures have to be studied before their changes can be considered. The realization of this attempt to develop a synchronic, nonpsychological, structural theory of semantics depends on the way in which the notion of semantic structure is conceived. In actual practice, there are mainly three distinct definitions of semantic structure that have been employed by structuralist semanticians. More particularly, three distinct kinds of structural relations among lexical items have been singled out as the proper methodological basis of lexical semantics. First, there is the relationship of semantic similarity that lies at the basis of semantic field analysis and componential analysis: see section 1.3, “Lexical Fields and Componential Analysis.” Second, there are unanalyzed lexical relations such as synonymy, antonymy, and hyponymy: see section 1.4, “Lexical Relations.” Third, syntagmatic lexical relations lie at the basis of a distributional approach to semantics: see section 1.5, “Distributional Relations.”

2.3 Neostructuralist Semantics

While componential analysis was developed in the second half of the 1950s and the beginning of the 1960s by European as well as American structural linguists, its major impact came from its incorporation into generative grammar: the publication of Katz and Fodor ( 1963 ) marked a theoretical migration of lexical semantics from a structuralist to a generativist framework. As a model for lexical semantics, Katzian semantics combined an essentially structuralist approach with two novel characteristics: the explicit inclusion of lexical description in a generative grammar and, accordingly (given that the grammar is a formal one), an interest in the formalization of lexical descriptions. Although Katzian semantics as such has long been abandoned, both features continue to play a role in this ‘neostructuralist’ tradition (the label is not an established one, but it will do for lack of a more conventional one). On the one hand, the integration of the lexicon into the grammar informs the continuing debate about the interface of lexicon and syntax; see Wechsler ( 2015 ) for an overview. On the other hand, a number of models for the formalization of word meaning have been developed, the most prominent of which is Pustejovsky’s ‘generative lexicon’ approach ( 1995 ).

2.4 Cognitive Semantics

Compared to prestructuralist semantics, structuralism constitutes a move toward a more purely ‘linguistic’ type of lexical semantics, focusing on the linguistic system rather than the psychological background or the contextual flexibility of meaning. With the poststructuralist emergence of cognitive semantics, the pendulum swings back to a position in which the distinction between semantics and pragmatics is not a major issue, in which language is seen in the context of cognition at large, and in which language use is as much a focus of enquiry as the language system. Cognitive lexical semantics emerged in the 1980s as part of cognitive linguistics, a loosely structured theoretical movement that opposed the autonomy of grammar and the marginal position of semantics in the generativist theory of language. Important contributions to lexical semantics include prototype theory (see section 1.8, “Prototype Effects and Radial Sets”), conceptual metaphor theory (see section 1.6, “Conceptual Metaphor and Metonymy”), frame semantics (see section 1.7), and the emergence of usage-based onomasiology (see section 1.9, “Basic Levels and Onomasiological Salience”).

From a theoretical perspective, the various traditions are to some extent at odds with each other (as may be expected). Specifically, structuralist (and to a large extent neostructuralist) theories tend to look at word meaning primarily as a property of the language, that is, of the linguistic system as an entity in its own right. Prestructuralist historical semantics and cognitive semantics, on the other hand, tend to emphasize the way in which word meanings are embedded in or interact with phenomena that lie outside language in a narrow sense, like general cognitive principles, or the cultural, social, historical experience of the language user. They then also take a more ‘pragmatic’ perspective: if the emphasis moves away from the linguistic system as a more or less stable, more or less autonomous repository of possibilities, there will be more attention to language use as the actualization of those possibilities.

Descriptively speaking, however, each of the major theoretical frameworks has contributed to the expansion of lexical semantics; that is, each has drawn attention to specific phenomena and has proposed terms, classifications, and representational formats for analyzing those phenomena. Focusing on the major topics, these contributions successively include the links between the various senses of words in prestructuralist historical semantics, the semantic relationships within the vocabulary in the structuralist era, and the importance of semasiological and onomasiological salience effects in cognitive semantics. Regardless of the theoretical oppositions, these phenomena all belong to the descriptive scope of current lexical semantics: the emergence of new points of attention has not made the older topics irrelevant.

Table 2 The Contribution of the Successive Theoretical Traditions

A summary of the contribution of the major theoretical approaches is given in Table 2 . If one keeps in mind the chronology of the various theories, it will be clear that regardless of the theoretical differences, lexical semantics has witnessed an outspoken descriptive expansion, from a semasiological starting point to various forms of onomasiological structure, and from a focus on elements and structures alone to the relevance of salience effects on the semasiological and onomasiological architecture of meaning.

Further Reading

  • Goddard, C. (1998). Semantic analysis: A practical introduction. Oxford: Oxford University Press.
  • Riemer, N. (2015). Word meanings. In J. R. Taylor (Ed.), The Oxford handbook of the word (pp. 315–319). Oxford: Oxford University Press.

References

  • Benczes, R., Barcelona, A., & Ruiz de Mendoza Ibáñez, F. (Eds.). (2011). Defining metonymy in cognitive linguistics: Towards a consensus view. Amsterdam: John Benjamins.
  • Berlin, B. (1978). Ethnobiological classification. In E. Rosch & B. B. Lloyd (Eds.), Cognition and categorization (pp. 9–26). Hillsdale, NJ: Lawrence Erlbaum.
  • Cruse, D. A. (1986). Lexical semantics. Cambridge, U.K.: Cambridge University Press.
  • Fillmore, C. J., & Atkins, B. T. S. (1992). Toward a frame-based lexicon: The semantics of ‘risk’ and its neighbors. In A. Lehrer & E. F. Kittay (Eds.), Frames, fields and contrasts: New essays in semantic and lexical organization (pp. 75–102). Hillsdale, NJ: Lawrence Erlbaum.
  • Firth, J. R. (1957). Papers in linguistics, 1934–51. Oxford: Oxford University Press.
  • Geeraerts, D. (1993). Vagueness’s puzzles, polysemy’s vagaries. Cognitive Linguistics, 4, 223–272.
  • Geeraerts, D. (2010). Theories of lexical semantics. Oxford: Oxford University Press.
  • Geeraerts, D., Grondelaers, S., & Bakema, P. (1994). The structure of lexical variation: Meaning, naming, and context. Berlin: Mouton de Gruyter.
  • Glynn, D., & Robinson, J. A. (Eds.). (2014). Corpus methods for semantics: Quantitative studies in polysemy and synonymy. Amsterdam: John Benjamins.
  • Hanks, P. W. (2013). Lexical analysis: Norms and exploitations. Cambridge, MA: MIT Press.
  • Katz, J. J., & Fodor, J. A. (1963). The structure of a semantic theory. Language, 39, 170–210.
  • Lakoff, G., & Johnson, M. (1980). Metaphors we live by. Chicago: University of Chicago Press.
  • Lakoff, G., & Johnson, M. (1999). Philosophy in the flesh: The embodied mind and its challenges to western thought. Chicago: University of Chicago Press.
  • Lyons, J. (1963). Structural semantics. Oxford: Blackwell.
  • Murphy, M. L. (2003). Semantic relations and the lexicon: Antonymy, synonymy, and other paradigms. Cambridge, U.K.: Cambridge University Press.
  • Murphy, M. L. (2010). Lexical meaning. Cambridge, U.K.: Cambridge University Press.
  • Pottier, B. (1964). Vers une sémantique moderne. Travaux de linguistique et de littérature, 2, 107–137.
  • Pustejovsky, J. (1995). The generative lexicon. Cambridge, MA: MIT Press.
  • Rosch, E. (1975). Cognitive representations of semantic categories. Journal of Experimental Psychology, 104, 192–233.
  • Ruppenhofer, J., Ellsworth, M., Petruck, M. R. L., Johnson, C. R., & Scheffczyk, J. (2006). FrameNet II: Extended theory and practice. Berkeley, CA: FrameNet.
  • Sinclair, J. M. (1991). Corpus, concordance, collocation. Oxford: Oxford University Press.
  • Taylor, J. R. (2003). Linguistic categorization. 3rd ed. Oxford: Oxford University Press.
  • Trier, J. (1931). Der deutsche Wortschatz im Sinnbezirk des Verstandes: Die Geschichte eines sprachlichen Feldes I. Von den Anfängen bis zum Beginn des 13. Jhdts. Heidelberg: Winter.
  • Tyler, A., & Evans, V. (2001). Reconsidering prepositional polysemy networks: The case of ‘over.’ Language, 77, 724–765.
  • Ullmann, S. (1962). Semantics: An introduction to the science of meaning. Oxford: Blackwell.
  • Wechsler, S. (2015). Word meaning and syntax: Approaches to the interface. Oxford: Oxford University Press.
  • Weisgerber, L. (1927). Die Bedeutungslehre: Ein Irrweg der Sprachwissenschaft? Germanisch-Romanische Monatsschrift, 15, 161–183.
  • Wierzbicka, A. (1996). Semantics: Primes and universals. Oxford: Oxford University Press.

Printed from Oxford Research Encyclopedias, Linguistics. Under the terms of the licence agreement, an individual user may print out a single article for personal use (for details see Privacy Policy and Legal Notice).

date: 03 June 2024

Semantics and Pragmatics

Semantics and Pragmatics, founded in 2007 and first published in 2008, is a Diamond Open Access journal published by the Linguistic Society of America.

Current Issue

Vol. 17 (2024)

Published: 2024-01-05

Main Articles

The semantics and probabilistic pragmatics of deadjectival intensifiers, free choice and presuppositional exhaustification, formalizing spatial-causal polysemy of agent prepositions, covert mixed quotation, squibs, remarks, and replies, limitations of a modal analysis of before and after, previous six years (2023-2018).

Vol. 16 (2023)

Published: 2023-01-05

Context Dynamics

Probabilities and logic in implicature computation.

Two puzzles with embedded disjunction

Putting oughts together

Using the anna karenina principle to explain why cause favors negative-sentiment complements, imperatives in a dynamic pragmatics, a square of necessities.

X-marking weak and strong necessity modals

A semantic universal for modality

Pair-list answers to questions with plural definites.

Vol. 15 (2022)

Published: 2022-01-19

“More is up” for domain restriction in ASL

Referential transparency as the proper treatment for quantification, logic and conversation.

The case of free choice

Keep only strong

Exceptional wide scope of bare nominals, pragmatic reasoning and semantic convention.

A case study on gradable adjectives

Attentional Pragmatics

A new pragmatic approach to exhaustivity

Two paths to habituality

The semantics of habitual mode in Tlingit

Probabilistic modeling of rational communication with conditionals

Belief or consequences, presupposition projection as a scope phenomenon, alternatives and attention in language and reasoning.

A reply to Mascarenhas & Picat 2019

Varieties of Hurford disjunctions

The role of alternatives in the interpretation of scalars and numbers.

Insights from the inference task

Scorekeeping in a chess game

Vol. 14 (2021)

Published: 2021-03-02

Fine-grained semantics for attitude reports

Formal properties of now revisited, explaining gaps in the logical lexicon of natural languages.

A decision-theoretic perspective on the square of Aristotle

A variably exhaustive and scalar focus particle and pragmatic focus concord in Burmese

The landscape of speech reporting, the most , the fewest and the least.

On the relative readings of quantity superlatives

Anyone might but everyone won’t

Presuppositional exhaustification, actualistic interpretations in french, negative free choice, npis, intervention, and collectivity, does intonation automatically strengthen scalar implicatures, on believing and hoping whether.

Vol. 13 (2020)

Published: 2020-04-04

Understanding focus

Pitch, placement and coherence

Unveiling multiple wh - free relative clauses and their functional wh -words

Numerals and the theory of number, a preference semantics for imperatives, multi-modal meaning.

An empirically-founded process algebra approach

Conjectural questions

The case of German verb-final wohl questions

Reciprocity

Anaphora, scope, and quantification

Predicting the end

Epistemic change in Romance

Weak necessity without weak possibility

The composition of modal strength distinctions in Javanese

Expressions in focus

Compositional trace conversion, similative plurality and the nature of alternatives, intervention in deontic reasoning, numeral differential constructions in arabic, non-informative assertions.

The case of non-optional wh -in-situ

Eavesdropping

What is it good for?

The logic of Quantifier Raising

Evidential meaning and (not-)at-issueness, alternative questions.

Distinguishing between negated and complementary disjuncts

The semantics and pragmatics of nouns in concealed questions

Vol. 12 (2019)

Published: 2019-11-14

Donkeys under discussion

Modal interpretation of tense in subjunctive conditionals, different again, learnability and semantic universals, distributive ignorance inferences with wonder and believe, the proprial article and the semantics of names, the discourse commitments of illocutionary reportatives, coercion by modification.

The adaptive capacities of event-sensitive adnominal modifiers

Experimental studies on it -clefts and predicate interpretation

Attitudes in discourse.

Italian polar questions and the particle mica

Pluralities across categories and plural projection

Polarity reversals under sluicing, know whether and -ever free relative clauses, free choice and homogeneity, on the grammatical source of adjective ordering preferences, where is the destructive update problem, filtering free choice, anankastic conditionals are still a mystery, polarity particles revisited, sluicing on free choice, presupposing counterfactuality, vagueness, contextualism, and ellipsis.

Vol. 11 (2018)

Published: 2018-02-09

Symmetric predicates and the semantics of reciprocal alternations

Counterfactual de se, that’s not quite it.

An experimental investigation of (non‑)exhaustivity in clefts

Disentangling two distinct notions of NEG raising

Might do better: flexible relativism and the QUD, a formal semantics for situated conversation, free choice and distribution over disjunction.

The case of free choice ability

Reconstructing the syntax of focus operators

The case of the missing ‘if’: accessibility relations in Stalnaker’s theory of conditionals, complex sentential operators refute unrestricted simplification of disjunctive antecedents.


  • Semantics and Pragmatics

Common to work in semantics and pragmatics at UCSC is a formal approach to theoretically relevant problems grounded in detailed investigation of empirical data coming from a variety of languages. A thread uniting the research of the faculty and students here is attention to both semantic and pragmatic factors with particular emphasis on understanding language in context. Research topics, theoretical tools and languages considered are quite diverse. Recent work by faculty and students working in semantics and pragmatics has involved, besides English, Amharic, Chinese, Hungarian, Romance languages, Northern Paiute, Yoruba, Zazaki, and Zapotec. Topics currently investigated by faculty and students are distributivity, number interpretation, indefinites, propositional attitude predicates, affective language, noun phrase scope, and the semantics and pragmatics of polarity particles across languages. See the faculty members' and dissertators' websites below for more details.

  • Pranav Anand
  • Adrian Brasoveanu
  • Donka Farkas (Emerita)
  • William Ladusaw (Emeritus)
  • Roumyana Pancheva
  • Maziar Toosarvandani

Dissertators

  • Lalitha Balachandran
  • Morwenna Hoeks

Recent Alumni

  • Lisa Hofmann (PhD) 2022, Anaphora and Negation
  • Kelsey Sasaki (PhD) 2021, Components of Coherence
  • Tom Roberts (PhD) 2021, How to make believe: inquisitivity, veridicality, and evidentiality in belief reports
  • Margaret Kroll (PhD) 2020, Comprehending Ellipsis
  • Deniz Rudin (PhD) 2018, Rising Above Commitment
  • Kelsey Kraus (PhD) 2018, Great Intonations
  • Karen Duek (PhD) 2017, Sorting a complex world: an experimental study of polysemy and copredication in container and committee nominals
  • Karl DeVries (PhD) 2016, Independence Friendly Dynamic Semantics: Integrating Exceptional Scope, Anaphora and their Interactions
  • Oliver Northrup (PhD) 2014, Grounds for Commitment
  • Robert Henderson (PhD) 2012, Ways of Pluralizing Events
  • Scott AnderBois (PhD) 2011, Issues and Alternatives
  • Kyle Rawlins (PhD) 2008, Concession, Conditionals and Free Choice
  • Peter Alrenga, 2007, Dimensionality in the Semantics of Comparatives
  • James Isaacs, 2007, Supposition in Discourse
  • Lynsey Wolter, 2006, That's That: The Semantics and Pragmatics of Demonstrative Noun Phrases
  • Christopher Potts, 2003, The Logic of Conventional Implicature
  • Christine Gunlogson, 2001, True to Form: Rising and Falling Declaratives as Questions in English
  • Ryan Bush, 2000, A Typology of Focal Categories
  • Chris Kennedy, 1997, Projecting the Adjective: The Syntax and Semantics of Gradability and Comparison
  • Theodore Fernald, 1994, On the Nonuniformity of the Individual- and Stage-Level Effects
  • Michael Johnston, 1994, The Semantics of Adverbial Adjuncts
  • Louise McNally, 1992, An Interpretation for the English Existential Construction
  • Chris Barker, 1991, Possessive Descriptions

Research Visitors

  • Daniel Hardt (Copenhagen Business School)

Labs and Research Groups

  • Language, Logic, and Cognition Lab (LaLoCo)
  • Semantics, Pragmatics and LAnguage Philosophy (SPLAP!)
  • Syntax and Semantics Circle (S-Circle)

Artifacts from our Work

inquisitive semantic depiction of classical disjunction


Last modified: October 25, 2023


100+ Compelling Linguistics Research Topics for University Students

Linguistics Research Topics

Not sure which linguistics research topic to write about? Use this article to pick a strong linguistics research paper topic and boost your grades.

Linguistics is the scientific study of language: its findings come from rational, systematic investigation. Linguistic theories have developed alongside teaching and learning methodology, and the history of the field reveals two perspectives on language, the prescriptive and the descriptive.

Linguistic analysis covers the following areas: phonetics, phonology, morphology, syntax, semantics, and pragmatics. Studying linguistics introduces you to every aspect of language as well as the methods used to study it.


How To Choose the Right Linguistics Research Topics?

Working under stress tends to hurt academic performance and lower grades, and a linguistics research paper is no exception. Make every effort count by applying the right strategy, starting with the choice of an interesting topic in linguistics.

You can also take advantage of research paper help and discuss your concerns with professional writers. To choose the right linguistics research topic, keep the following points in mind:

Find your interest: Linguistics covers many aspects of language and language learning. Explore the subject in depth and find the area that interests you most. It will make your academic writing more engaging.

Brainstorm ideas: Picking an interesting linguistics topic demands knowledge and expertise. Brainstorm and collect a range of ideas before you settle on one.

Do thorough research: To score high marks, you need sufficient knowledge. Research carefully and gather well-considered ideas for your linguistics research paper topic.

Interesting Topics in Linguistics

Linguistics is the foundation of language knowledge, and linguistic theory is closely tied to learning the English language. When you want to boost your grades, your choice of linguistics research paper topic makes a huge difference. Some interesting linguistics research topics are:

  • Explain the significance of music in the evolution of language.
  • Does age really impact English pronunciation?
  • What is the role of sociolinguistics education in creating discipline?
  • What is the significance of language in creating teaching methodology?
  • Analysis of verbal and written communication based on language usage.
  • Is it important to have expertise in several languages?
  • Explain the issues related to receptive language disorder and its impact on brain development.
  • How do you correlate sentence-making and word flow in linguistics?
  • Discuss the comparability between English and French languages.
  • Factors responsible for different spoken languages.
  • The impact of slang in the development of languages.
  • Is text messaging creating a revolutionary subculture in the new linguistic scenario?
  • How are linguistic patterns helpful in locating migration roadways?
  • What factors affect the ability to learn a language?
  • Explain the role of language in building a national identity for developing a multicultural society.
  • Digital Revolution: impact of computers in modern language
  • A systematic review on vowel pronunciation in the American Schools.
  • Significance of language in creating cross-cultural communities: A comprehensive review
  • Elucidate the impact of language on one’s perception.
  • Textual and Linguistic analysis for housing studies.

Stimulating Research Paper Topics In Sociolinguistics

While seeking linguistics research topics for your assignments or research paper, you may find sociolinguistics interesting to explore. Sociolinguistics demonstrates the impact of language on our society. When you are keen to explore the effect of language in different aspects of society (including cultural values and expectations), you need to do an in-depth analysis of sociolinguistics.

For building a good foundation on sociolinguistics, you can select the following linguistics paper topics:

  • How would you define linguistic practices in specific communities?
  • An elaborative approach for code-switching and code-mixing
  • Explain the impact of dialect on gender.
  • A correlational study to share the relationship between language, social class, and cognition.
  • In-depth study of interactional sociolinguistics in the 21st Century.
  • A comprehensive analysis on accountability and aptness of dialect.
  • Evaluate the education of language in the U.S.
  • The role of languages in controlling emotions.
  • Effectiveness of verbal communication in expressing one’s feelings: A competitive analysis.
  • A literature review on communication with a precise comparison of verbal and non-verbal communication
  • Difference between advanced placement (AP) English literature and language.
  • What is the relationship between language and one’s personality?
  • A critical analysis on the relation of language and ethnicity.
  • Describe the attitudes to various languages among societies.
  • A comprehensive approach on dialect variations in American English-speaking people.
  • Scrutinize linguistic variation on language loyalty.
  • Develop a good understanding of sociological variations to languages.
  • Impact of the generation gap on language usage.
  • Examine the impact of various factors (social tension, media, racism, and entertainment) on the utilization of languages.
  • Is there a difference between linguistic practices among men and women?


Interesting Research Topics in Applied Linguistics

Are you looking for linguistics research topics to advance your learning abilities? In such a case, you have to learn about “Applied Linguistics.” It is the branch of linguistics in which one can understand the practical applications of language studies such as speech therapy, language teaching, and more.

In other words, applied linguistics offers solutions to real-life language-related problems. Important academic areas where applied linguistics is used include psychology, education, sociology, communication research, and anthropology. Some applied linguistics research paper topics:

  • Discuss the expansion of learning a second language through reading.
  • Share your learning on the critical period hypothesis for the acquisition of the second language.
  • Impact of bilingualism on an individual’s personality.
  • Linguistics evaluation on the difference between written and spoken language.
  • Describe language cognition and perceptions in a learning process.
  • Impact of language barriers on healthcare delivery.
  • Detailed analysis on various methodologies to learn applied linguistics.
  • Discuss the relationship between empathy and language proficiency in adult language learners.
  • Detailed analysis on multilingualism and multiculturalism.
  • Impact of extended instructions on the use of passive voices, modals, and relative clauses: A critical analysis.
  • Explain digitally-mediated collaborative writing for ESL students.
  • How do we evaluate self-efficacy in students with low-level English proficiency?
  • Elucidate the significance of phrasal verbs in creating technical documents.
  • Expectations of American Students while taking Japanese language classes.
  • A detailed study on American deaf students in English as a Non-Native Language (ENNL) classes.
  • What do you understand by modeling music with grammars?
  • The cognitive development of expertise as an ESL teacher: An insightful analysis.
  • Sound Effects: Gender, Age, and Sound symbolism in American English.
  • Importance of applied linguistics in today’s digital world.


Interesting Research Topics in Semantics

Semantics (historically also called semasiology) is the study of meaning, reference, and truth. Its two main branches are compositional semantics and lexical semantics. Compositional semantics covers how words combine and interact to form the meanings of larger expressions such as sentences, whereas lexical semantics covers the meanings of individual words.
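To make the idea of compositional semantics concrete, here is a minimal Python sketch. It assumes a toy world with a handful of individuals and a three-word sentence pattern (determiner, noun, verb); the individuals, the word list, and the `interpret` function are all hypothetical and invented for illustration. The point is only that the truth value of the sentence is computed from the meanings of its words and the way they are combined.

```python
# Toy model: nouns and verbs denote sets of individuals (hypothetical data).
DOGS = {"rex", "fido"}
CATS = {"tom"}
SLEEPERS = {"rex", "tom"}

# Lexical semantics: each word gets a meaning.
# Determiners denote relations between two sets, written as curried functions.
LEXICON = {
    "dog": DOGS,
    "cat": CATS,
    "sleeps": SLEEPERS,
    "every": lambda noun: lambda verb: noun <= verb,       # subset test
    "some": lambda noun: lambda verb: bool(noun & verb),   # non-empty overlap
    "no": lambda noun: lambda verb: not (noun & verb),     # empty overlap
}


def interpret(sentence):
    """Compose word meanings: determiner(noun)(verb) -> truth value."""
    determiner, noun, verb = sentence.lower().split()
    return LEXICON[determiner](LEXICON[noun])(LEXICON[verb])


for s in ["every dog sleeps", "some dog sleeps", "no cat sleeps"]:
    print(s, "->", interpret(s))
```

In this toy world only one of the two dogs sleeps, so "every dog sleeps" comes out false while "some dog sleeps" comes out true. Changing a word meaning in the lexicon changes the sentence meaning without touching the composition rule.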

Some academic disciplines in linguistic semantics are conceptual semantics, cognitive semantics, formal semantics, computational semantics, and more. Linguistic research paper topics on Semantics are as follows:

  • Examine meaning work in language interpretation and scrutinization
  • A critical evaluation of language acquisition and language use.
  • Challenges in the study of semantic and pragmatic theory.
  • Discuss semantics lessons and paragraph structure in written language.
  • How do you explain the semantic richness effects in the recognition of visual words?
  • How richness of semantics affects the processing of a language.
  • Semantic generation to action-related stimuli: A neuroanatomical evaluation of embodied cognition.
  • Examine the understanding of blind children for reading phonological and tactual coding in Braille.
  • Explain a semantic typology of gradable predicates.
  • A comparison of blind and sighted children’s memory performance: the reverse-generation effect.
  • Clinical research for designing medical decision support systems.
  • Discuss word recognition processes in blind and sighted children.
  • A corpus-based study on argumentative indicators.
  • The typology of modality in modern West Iranian languages.
  • A critical analysis on changes in naming and semantic abilities in different age groups.
  • Explain the multidimensional semantics of evaluative adverbs.
  • A comprehensive analysis on procedural meaning: problems and perspectives.
  • Cross-cultural and cross-linguistic perspectives on figurative language.
  • Elucidate semantic and pragmatic problems in discourse and dialogue.

Topics For Linguistics Essays

Curiosity about the concepts of linguistics may lead you to write essays. Putting your thoughts into a linguistics essay helps you understand the structure of human languages and how they change. If you are searching for the best topics in linguistics, go through the following list of linguistics essay topics:

  • Difference between human language and artificial language.
  • Classification of writing systems based on various stages of development.
  • The laws of language development
  • Culture and language: impact on reflections.
  • Methodology of reading and writing for children by Albert James.
  • Significance of phoneme and phonological matters
  • The complexity of human language: the specific cases of the apes
  • Explain the development of languages and derivational morphology.
  • Detailed analysis on language extinction.
  • Investigate the peculiarities of English-Chinese and Chinese-English translations.
  • A comprehensive overview on the acquisition of English as a second language by Mid-Eastern students.
  • Discuss semiology in language analysis.
  • Impact of blogging on learning languages.
  • Linguistics: grammar and language teaching.
  • English Language: Explain its standard and non-standard types.
  • Discuss speech community as linguistic anthropology.
  • A systematic review on linguistic diversity in modern culture.
  • Similarities and differences between language and logic.
  • What is the impact of language on digital communication?
  • Listening comprehension: a comparative analysis of the articles.

Computational Linguistics Research Topics

Computational linguistics is the analysis and synthesis of language and speech using the techniques of computer science. This branch of linguistics studies the computational modeling of natural language and applies computational approaches to linguistic questions.

Computational linguistics draws on fields such as artificial intelligence, mathematics, computer science, cognitive science, neuroscience, and anthropology. Some interesting computational linguistics research topics are:

  • Explain the factors measuring the performance of speech recognition.
  • Discuss word sense disambiguation.
  • Detailed analysis on dependency parsing based on graphs and transitions.
  • A multidimensional analysis on linguistic dimensions
  • Analyze Medieval German poetry through supervised learning.
  • Extraction of Danish verbs.
  • Analysis of Schizophrenia text dataset.
  • An intra-lingual contrastive corpus analysis based on computational linguistics.
  • Discuss various methods to introduce, create, and conclude a text.

Still Confused? Select Compelling Linguistics Research Topics With Our Writers!

Are you still stressed about picking the right linguistics research paper topic? Without the right idea in mind, it is hard to begin your research work. But don’t worry. Our professional and Ph.D. writers will help you make the right selection for linguistics assignments. Grab our online paper help and receive customized solutions for your research papers.


By Alex Brown

I'm an ambitious, seasoned, and versatile author. I am experienced in proposing, outlining, and writing engaging assignments. Developing contagious academic work is always my top priority. I have a keen eye for detail and diligence in producing exceptional academic writing work. I work hard daily to help students with their assignments and projects. Experimenting with creative writing styles while maintaining a solid and informative voice is what I enjoy the most.


Semantic Web and Linked Data: Journals, Articles and Papers


The Semantic Web encompasses the technology that connects data from different sources across the Web as envisioned by Tim Berners-Lee and led by the World Wide Web Consortium (W3C). This Web of Data enables the linking of data sets across data silos on the Web by providing for machine-to-machine communication through the use of Linked Data. This Guide provides descriptions and links to resources used to implement this technology.

The UCLA Semantic Web LibGuide was compiled and written by Rhonda Super. It began as a data page on Ms. Super's personal resource home page. Over a twenty-year period, the Semantic Web resources listed on Rhonda's Resource Page developed into a standalone LibGuide that served as a comprehensive resource for the Semantic Web and Linked Data community, providing links to tools, best standards, instructional materials, use cases, vocabularies, and more. The Guide was updated continuously through August 2022 using the SpringShare LibGuide platform as customized by the UCLA Library. Many of its resources provide a historical look at the development of Linked Data.

Ms. Super holds a BA in English and Government and an MA in Communications from Ohio University. She earned her MLIS from San Jose State University with a concentration in archives, rare books, and academic libraries. She earned a Certificate in XML and RDF Systems from the Library Juice Academy. Ms. Super was awarded scholarships to attend the California Rare Book School where she studied Rare Books for Scholars and Archivists, Descriptive Bibliography, and History of the Book: Nineteenth and Twentieth Centuries. Ms. Super was employed by the UCLA Library from 2007 until her retirement in 2022.

The final iteration of the Guide is deposited in the University of California eScholarship Open Access repository so the Linked Data community can continue to use it as a resource.

If you cite resources from this Guide, please check the original resource for copyright and citation requirements.


About the Semantic Web

The Semantic Web provides for the ability to semantically link relationships between Web resources, real world resources, and concepts through the use of Linked Data enabled by Resource Description Framework (RDF). RDF uses a simple subject-predicate-object statement known as a triple for its basic building block. This provides a much richer exploration of Web and real world resources than the Web of Documents to which we are accustomed.
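The subject-predicate-object idea can be sketched in a few lines of Python. This is an illustration only, not an RDF library: statements are held as plain tuples, the `example.org` identifiers and the name "Jane Doe" are hypothetical, and only the Dublin Core `creator` and FOAF `name` URIs are real vocabulary terms. It shows how following links from one triple to the next answers a question that no single statement answers.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Real vocabulary terms (Dublin Core "creator", FOAF "name").
DCT_CREATOR = "http://purl.org/dc/terms/creator"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"

# Hypothetical resources under example.org, invented for illustration.
BOOK = "http://example.org/book/1"
PERSON = "http://example.org/person/42"

GRAPH = [
    Triple(BOOK, DCT_CREATOR, PERSON),      # the book was created by the person
    Triple(PERSON, FOAF_NAME, "Jane Doe"),  # the person has a name (a literal)
]


def objects_of(graph, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [t.obj for t in graph
            if t.subject == subject and t.predicate == predicate]


def creator_names(graph, book):
    """Follow book -> creator -> name across two linked triples."""
    names = []
    for person in objects_of(graph, book, DCT_CREATOR):
        names.extend(objects_of(graph, person, FOAF_NAME))
    return names


print(creator_names(GRAPH, BOOK))  # ['Jane Doe']
```

Because the person is identified by a URI rather than by a text string, a second dataset that uses the same URI could add statements about that person, and the same lookup would pick them up.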

LINKED OPEN DATA (LOD) CLOUD


About the LOD Cloud

The diagram on this page is a visualization of Linked Open Datasets published in the Linked Data format as of April 2014. The large circle in the center is DBpedia, the linked data version of Wikipedia. Click on the diagram to learn more about the diagram, licensed and open linked data, statistics about the datasets in the diagram, and the latest version of the LOD Cloud. As of June 2018, you can view Sub-Clouds by subject area.

Linking Open Data cloud diagram 2014, by Max Schmachtenberg, Christian Bizer, Anja Jentzsch and Richard Cyganiak.

5-Star Open Data Rules


5-Star Open Data

Click on the image of the mug and open the link to access more information.

  • 5-Star Open Data Click here for an explanation of the costs and benefits of the 5-Star Open Data deployment scheme, and examples.
  • Open Data Certificate Open Data Institute. Open Data Certificate is a free online tool to assess and recognize the sustainable publication of quality open data. The tool benchmarks data against standards covering legal, practical, technical and social requirements to support the trust in and use of sustainable data. A badge that can be embedded in a website is awarded to a data publisher based on answers provided by the publisher to a questionnaire. The Certificate builds on standards such as opendefinition.org, 5* of Open Data, Sunlight principles, and DCAT.

Getty Vocabularies Documentation

For the Getty Vocabularies, please see the Registries, Portals, and Authorities page under Vocabularies, Ontologies & Frameworks.

Best Practices and Standards

Trust is a major component of the Semantic Web. This requires providing accurate information when publishing a Linked Data instance. The World Wide Web Consortium (W3C), comprised of an international community, develops Web standards and best practices. Additionally, authorities in subject disciplines establish, administer, and maintain standards in their disciplines which adhere to W3C best practices.

This page provides access to information regarding best practices and standards relevant to Semantic Web technology as developed by W3C and other authoritative bodies. For controlled vocabularies, ontologies, etc., please consult the Vocabularies, Ontologies & Frameworks page.

  • ALCTS Standards Association for Library Collections & Technical Services (ALCTS). The ALCTS Standards is designed to be an aggregator providing a single place to find standards pertinent to the information industry. The guide is organized by topic.
  • Best Practice Recipes for Publishing RDF Vocabularies Berrueta, Diego and Jon Phipps. (2008, Aug. 28). W3C. This document describes best practice recipes for publishing vocabularies or ontologies on the Web in RDF Schema or OWL. Each recipe introduces general principles and an example configuration for use with an Apache HTTP server which may be adapted to other environments.
  • Best Practices for Recording Faceted Chronological Data in Bibliographic Records American Library Association Institutional Repository Subcommittee on Faceted Vocabularies; Mullin, Casey; Anderson, Karen; Contursi, Lia; McGrath, Kelley; Prager, George; Schiff, Adam. (2020, June 19). This document describes best practices for encoding the date(s) of creation of works and expressions in bibliographic descriptions. The categories of dates, currently serviced by MARC 046 and 388 fields, covered by these practices are: date(s) of creation of individual works; date(s) of creation of the aggregated works in a compilation; date(s) of creation of aggregating works (compilations, anthologies, etc.); and date(s) of creation of expressions.
  • Data on the Web Best Practices This W3C document provides best practices on a range of topics including data formats, data access, data identification and metadata by providing guidelines on how to represent, describe and make data available in a way that it will be easy to find and to understand. The document provides a series of best practices. A template is used to show the "what", "why" and "how" of each best practice.
  • Generating RDF from Tabular Data on the Web W3C. (2015, December 15). This document describes the process of converting tabular data to create RDF subject-predicate-object triples which may be serialized in a concrete RDF syntax such as N-Triples, Turtle, RDFa, JSON-LD, or TriG.
  • Guidelines for Collecting Metadata on Linked Datasets in the datahub.io Data Catalog This page explains how data publishers describe datasets they want included in the DataHub (aka LOD Cloud), a registry of open data and content packages maintained by the Open Knowledge Foundation. The page also provides access to a validator that tests whether a data set fulfills the requirements for inclusion in the LOD Cloud.
  • Library of Congress (LC) Metadata This page provides links to the LC Linked Data Service metadata structure standards including Metadata Authority Description Schema in RDF (MADS/RDF), Simple Knowledge Organization System (SKOS), Web Ontology Language (OWL), Resource Description Framework (RDF), RDF Schema (RDFS), Dublin Core Metadata Initiative Metadata Terms, and SemWeb Vocab Status ontology. There is also an explanation of the relationship between LC authorities and vocabularies and SKOS.
  • Linked Data Platform Best Practices and Guidelines This W3C document provides best practices and guidelines for implementing Linked Data Platform [LDP] servers and clients. It also provides links to associated W3C documents.
  • PCC Task Group on URIs in MARC Year One Report Bremer, Robert; Folsom, Steven; Frank, Paul; et al. (2016, October 6). This Program for Cooperative Cataloging report discusses the issues associated with setting standards for provisioning URIs in MARC in transitioning from MARC to linked data. Some of the issues include repeatability, pairing, ambiguous relationships, the significance of the ordinal sequence, and identifying a potential field and/or indicator/subfield to record an identifier representing a work.
  • Wikipedia: Authority Control Wikipedia. This page describes the editing community's consensus with regard to authority control in Wikipedia articles. It describes how authority control is used in Wikipedia articles to link to corresponding entries in library catalogs of national libraries and other authority files all over the world. The page also provides instruction for using the Wikipedia template to add authority control identifiers to articles.
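As a rough illustration of the idea behind the Generating RDF from Tabular Data on the Web entry above, the following Python sketch turns rows of a small CSV table into N-Triples lines. It is a simplified stand-in, not the W3C algorithm: the `http://example.org/` namespace, the way subject and predicate URIs are built, and the sample table are assumptions made for this example, and it assumes column names and key values contain only characters that are safe in a URI.

```python
import csv
import io

BASE = "http://example.org/"  # hypothetical namespace, for illustration only


def escape_literal(value):
    """Escape characters that may not appear raw in an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def rows_to_ntriples(csv_text, key_column):
    """Emit one triple per non-empty cell: row URI, column URI, cell value."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}row/{row[key_column]}>"
        for column, value in row.items():
            if column == key_column or value == "":
                continue  # the key names the subject; empty cells state nothing
            predicate = f"<{BASE}col/{column}>"
            lines.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return lines


SAMPLE = "id,title,year\n1,Dune,1965\n2,Emma,\n"

for line in rows_to_ntriples(SAMPLE, key_column="id"):
    print(line)
```

Each row becomes a subject, each column heading a predicate, and each cell an object. The second sample row has no year, so it yields a single triple rather than two.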

Additional Resources about Standards

  • Using the W3C Generating RDF from Tabular Data on the Web Recommendation to manage small Wikidata datasets Baskauf, Steven J. and Baskauf, Jessica K. (2021, June 6). This article discusses the W3C recommendation for generating RDF from tabular data.

Metadata Application Profiles (MAPs)

A metadata application profile (MAP) is a set of recorded decisions about a shared application or metadata service, whether it is a datastore, repository, management system, discovery indexing layer, or other, for a given community. MAPs declare what types of entities will be described and how they relate to each other (the model), what controlled vocabularies are used, what fields are required and which fields have a cap on the number of times they can be used, data types for string values, and guiding text/scope notes for consistent use of fields/properties.

A MAP may be a multipart specification, with human-readable and machine-readable aspects, sometimes in a single file, sometimes in multiple files (e.g., a human-readable file that may include input rules, a machine-readable vocabulary, and a validation schema).

The function of a MAP is to clarify the expectations of the metadata being ingested, processed, managed, and exposed by an application or service and document shared community models and standards, and note where implementations may diverge from community standards.
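The kinds of decisions a MAP records (required fields, caps on repetition, controlled vocabularies) can be pictured as a small machine-readable profile plus a validation routine. The Python sketch below is a hypothetical illustration: the field names, the three-term vocabulary, and the `validate` function are invented for this example and do not come from any of the profiles listed on this page.

```python
# Hypothetical profile: one rule per field.
# "max" of None means the field may repeat without limit.
PROFILE = {
    "title":   {"required": True,  "max": 1},
    "creator": {"required": False, "max": None},
    "type":    {"required": True,  "max": 1,
                "vocabulary": {"Text", "Image", "Sound"}},
}


def validate(record, profile):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for field, rule in profile.items():
        values = record.get(field, [])
        if rule["required"] and not values:
            problems.append(f"{field}: required field is missing")
        if rule["max"] is not None and len(values) > rule["max"]:
            problems.append(f"{field}: at most {rule['max']} value(s) allowed")
        allowed = rule.get("vocabulary")
        if allowed is not None:
            for value in values:
                if value not in allowed:
                    problems.append(
                        f"{field}: '{value}' is not in the controlled vocabulary")
    for field in record:
        if field not in profile:
            problems.append(f"{field}: field is not defined in the profile")
    return problems


record = {"title": ["A", "B"], "type": ["Map"]}
for problem in validate(record, PROFILE):
    print(problem)
```

Here the sample record breaks two rules: it repeats a field capped at one value, and it uses a type term outside the vocabulary. Keeping the rules in data rather than in code mirrors the split described above between a human-readable specification and a machine-readable validation schema.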

Cornell University Library. (2018, October 23). CUL Metadata Application Profiles. Downloaded January , 2020, from https://confluence.cornell.edu/display/mwgweb/CUL+Metadata+Application+Profiles

Library of Congress. (2019, April 30). PCC Task Group on Metadata Application Profiles. Downloaded July 19, 2022.

  • BIBCO Standard Record (BSR) RDA Metadata Application Profile Library of Congress, Program for Cooperative Cataloging (PCC). (2017, September 6). The BSR is a model for bibliographic monographic records using a single encoding level (Ldr/17=‘blank’) in a shared database environment, and it follows RDA 0.6.4 in its approach to core. The BSR establishes a baseline set of elements that emphasize access points over descriptive data, while not precluding the use of any data representing a more extensive cataloging treatment. The BSR MAP consists of a combination of RDA Core, RDA Core if, PCC Core, and PCC recommended elements applicable to archival materials, audio recordings, cartographic resources, electronic resources, graphic materials, moving images, notated music, rare materials, and textual monographs. Digital formats, digital reproductions, and authority records are also covered.
  • BIBFRAME Profiles: Introduction and Specification Library of Congress. (2014, May 5). This document describes how BIBFRAME Profiles are created, maintained and used. It describes an information model and reference serialization to support a means for identifying and describing structural constraints addressing functional requirements, domain models, guidelines on syntax and usage, and possibly data formats.
  • CONSER Standard Record (CSR) RDA Metadata Application Profile Library of Congress, Program for Cooperative Cataloging (PCC). (2020, January 21). The CSR is a model for serial descriptive records using a single encoding level (Ldr/17=‘blank’) in a shared database environment, and it follows RDA 0.6.4 in its approach to the concept of core. The CSR establishes a baseline set of elements that emphasize access points over descriptive data while not precluding the use of any data representing a more extensive cataloging treatment. The CSR consists of a combination of RDA Core, RDA Core if, PCC Core, and PCC Recommended elements applicable to textual serials in various formats. Instructions for rare serials and authority records are included.
  • CUL Metadata Application Profiles Cornell University Library Metadata Application Profiles. This page provides an overview and documentation of Cornell University Library's use of metadata application profiles (MAPs). The page offers a definition, explains the role of MAPs in an application or metadata service, and gives examples. A wealth of information regarding documentation for training, MAPs used at CUL, and the CUL metadata ecosystem is provided.
  • DLF AIG Metadata Application Profile Clearinghouse Project Digital Library Federation (DLF), Assessment Interest Group (AIG) Metadata Working Group. The mission of this project is to provide a hub and repository for collecting application profiles, mappings, and related specifications that aid or guide descriptive metadata conventions for digital repository collections to be shared with peers in the metadata community. The initial focus is on digital repository descriptive metadata documentation and specifications.
  • Digital Public Library of America (DPLA) Metadata Application Profile DPLA MAP Working Group. (2017, December 7). Version 5. This is the technical specification of the DPLA's Metadata Application Profile and provides a list of classes and properties used. Links to other useful documentation include an introduction to the profile, geographic and temporal guidelines, metadata quality guidelines, and rights statements guidelines.
  • Dublin Core Application Profiles (Guidelines for) This document provides a framework for designing a Dublin Core Application Profile (DCAP), and more generally, a good blueprint for implementing a generic model for metadata records. A DCAP can use any terms that are defined on the basis of RDF, combining terms from multiple namespaces as needed.
  • Dublin Core Collection Description Application Profile Dublin Core Collection Description Task Group. (2007, March 9). This document presents full details of the Dublin Core application profile using Dublin Core properties for describing a collection, a catalogue, or an index.
  • IFLA Library Reference Model (IFLA LRM) International Federation of Library Associations and Institutions (IFLA). (2017, December). IFLA LRM is a high-level conceptual reference model developed within an enhanced entity-relationship modelling framework for bibliographic data. The model aims to make explicit general principles governing the logical structure of bibliographic information, without making presuppositions about how that data might be stored in any particular system or application. Distinctions between data traditionally stored in bibliographic or holdings records and data traditionally stored in name or subject authority records are not made.
  • PCC Task Group on Metadata Application Profiles Library of Congress, Program for Cooperative Cataloging (PCC). April 30, 2019. This page outlines the Program for Cooperative Cataloging (PCC)’s Task Group on Metadata Application Profiles charge to help PCC understand issues and practices associated with the management of MAPs and to help develop the expertise needed within PCC to work with MAPs. The charge includes defining MAPs in the PCC context, performing an environmental scan of current work in this space, determining what shareable application profiles means in the PCC context, collaborating with LDRP2 profiles groups, monitoring ongoing LDRP2 PCC Cohort discussions, and recommending actions for a plan to create and maintain profiles that meet stated use cases for application profiles.
  • BIBFLOW BIBFLOW is a two-year project of the UC Davis University Library and Zepheira, funded by IMLS. Its official title is “Reinventing Cataloging: Models for the Future of Library Operations.” BIBFLOW’s focus is on developing a roadmap for migrating essential library technical services workflows to a BIBFRAME / LOD ecosystem. This page collects the specific library workflows that BIBFLOW will test by developing systems to allow library staff to perform this work using LOD native tools and data stores. Interested stakeholders are invited to submit comments on the workflows developed and posted on this site. Information from comments will be used to adjust testing as the project progresses.
  • CODE4LIB Wiki This is the Wiki for library computer programmers and library technologists. It provides information regarding software, conferences, topics, local & regional groups, and interest groups.
  • DBPedia Blog DBpedia is an open, free, and comprehensive global knowledge base which is continuously extended and improved by putting into effect a quality-controlled and reliable fact extraction from Wikipedia and Wikidata. This blog provides information regarding DBpedia, tools, events, dataset releases, the DBpedia ontology, and more.
  • Dublin Core Metadata Initiative Wiki This MediaWiki for the Dublin Core Metadata Initiative (DCMI) provides information on DCMI's activities regarding work on architecture and modeling, discussions and collaborative work in DCMI Communities and DCMI Task Groups, annual conferences and workshops, standards liaison, and educational efforts to promote widespread acceptance of metadata standards and practices. Access the DCMI Handbook and LD4PE Linked Data Exploratorium.
  • FRBR Open Comments This blog encourages transparency and invites comments regarding the continued development of the international library entity relationship model, the Functional Requirements for Bibliographic Records (FRBR), and the FRBR-Library Reference Model (FRBR-LRM), a consolidation of the FRBR, FRAD and FRSAD conceptual models. Access an Executive Summary, and read or contribute to the General Comments or other areas of interest such as User tasks, Entities, User population considered, Entity-Relationship Diagrams, Modeling of Aggregates, and more.
  • Hanging Together: The OCLC Research Blog Hanging Together is OCLC's research blog. It provides information about the types of projects and issues which OCLC is researching and with whom it is partnering. The blog covers a wide range of topics including Architecture and Standards, Digitization, Identifiers, Infrastructure, Linked Data, Metadata, Modeling New Services, and more.
  • Schema Bib Extend Community Group This is the main Wiki page for the Schema Bib Extend Community Group, a W3C group formed to discuss and prepare proposal(s) for extending Schema.org schemas for the improved representation of bibliographic information markup and sharing. The Wiki provides links to the following topics: Recipes and Guidelines for those looking to adopt Schema.org for bibliographic data; Areas for Discussion; Use Cases; Scope; Object Types; Vocabulary Proposals; and Example Library.
  • Schema blog This is the official schema.org blog.

Below is a list of books which provide a good introduction to the Semantic Web. Items whose titles are highlighted in blue link either to the UCLA Library record for that title if the title is held by the library, or to an online copy if available. Use the Safari Books Online link to search for additional resources.

This page provides a short list of datasets and data portals. To explore the global network of datasets connected on the Web, click on the Linked Open Data Cloud on the home page.

  • DataCite DataCite is a global non-profit organization that provides persistent identifiers (DOIs) for research data and other research outputs. Use it to locate, identify, and cite research data. DataCite provides several services including a global registry of research data repositories from a diverse range of academic disciplines and information about them (re3data.org), a citation formatter, content negotiation, an Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) service, and more.
  • Data.gov This page provides access to the datasets in the United States open government data catalog. Data are provided by hundreds of organizations and Federal agencies. It provides an online repository of policies, tools, case studies, and other resources to support data governance, management, exchange, and use throughout the federal government.
  • Data Hub - Linking Open Data Cloud This Data Hub group catalogs data sets that are available on the Web as Linked Data and contain data links pointing at other Linked Data sets. A search option for the datasets is available. The descriptions of the data sets in this group are used to generate the Linking Open Data Cloud diagram at regular intervals. The descriptions are also used to generate the statistics provided in the State of the LOD Cloud document.
  • Data Portals DataPortals.org is a comprehensive list of government and NGO open data portals across the world. It is curated by a group of leading open data experts from around the world, including representatives from local, regional and national governments, international organizations such as the World Bank, and numerous NGOs.
  • DBpedia DBpedia is a crowd-sourced community effort to extract structured information from Wikipedia and make this information available on the Web. DBpedia provides the ability for sophisticated queries against Wikipedia, and to link the different data sets on the Web to Wikipedia data.
  • EPSG Geodetic Parameter Dataset Geodesy Subcommittee of the International Association of Oil & Gas Producers (IOGP). The EPSG Geodetic Parameter Dataset is a structured dataset of Coordinate Reference Systems and Coordinate Transformations. It can be accessed through an online registry or downloaded as zip files. Geographic coverage is worldwide, but it does not record all possible geodetic parameters in use around the world. The dataset is maintained by the IOGP's Geomatics Committee.
  • Europeana Europeana provides access to European cultural heritage material from institutions across Europe. Discover artworks, books, music, and videos on art, newspapers, archaeology, fashion, science, sport, and much more.
  • GOKb GOKb (Global Open Knowledge base) is an open data repository to describe electronic journals and books, publisher packages, and platforms for use in a library environment. It includes tracking changes over time, including publisher take-overs and bibliographic changes.
  • Linked Open Data Cloud lod-cloud.net. This is the home of the LOD Cloud diagram. It is a dataset of datasets published in Linked Data format contained in the LOD Cloud. Datasets contained in the Cloud should follow the Linked Data principles listed on the site's About page. Subject areas have been broken into Subclouds for easier use.
  • List of online music databases Wikipedia. (2021, April 19). This page lists music domain datasets covering sheet music, reviews, artists, labels, a heavy metal encyclopedia, audio samples, a database of Arabic and Middle Eastern music artists, tracks, and albums, biographies and discographies, audio based music recognition and provision of song lyrics, and more.
  • Resources.data.gov This repository of Federal enterprise data resources provides links to policies, tools, case studies, and other resources to support Federal government data governance, management, exchange, and use.
  • WordNet WordNet® is a lexical database of English useful for computational linguistics and natural language processing. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets). Synsets are interlinked by means of conceptual-semantic and lexical relations. The resulting network of meaningfully related words and concepts can be navigated with the browser. The dataset is available for downloading. Unfortunately, due to staffing, updates have been suspended.

There are many resources available to help you learn about the Semantic Web and Linked Data. This page provides access to a few instructional resources on topics relating to Linked Data in a variety of formats. See the SPARQL page for SPARQL related instructional resources.

  • BIBFRAME Manual Library of Congress. (2019). This is the Library of Congress training manual for the BIBFRAME Editor and BIBFRAME Database.
  • BIBFRAME Training at the Library of Congress The Library of Congress is providing training for participants in the BibFrame Pilot which is testing bibliographic description in multiple formats and in multiple languages. This website provides access to the three training modules: 1) Introduction to the Semantic Web and Linked Data; 2) Introduction to the BibFrame Tools; and 3) Using the BibFrame Editor. There is a PowerPoint presentation and quiz for each module, and some modules have additional resources.
  • Catalogers Learning Workshop (CLW) Library of Congress. This page links to Library of Congress training materials for topics such as Library of Congress Subject Headings, RDA: Resource Description & Access; BIBFRAME training at the Library of Congress; BIBFRAME Webcasts and Presentations; and other training resources.
  • Competency Index for Linked Data (CI) LD4PE. The Competency Index for Linked Data (CI) is an initiative of Exploring Linked Data, a Linked Data for Professional Educators (LD4PE) project. The web site supports the structured discovery of learning resources for Linked Data available online by open educational resource (OER) and commercial providers. The site indexes learning resources within a framework according to specific competencies, skills, and knowledge they address. Tutorials are available for such topics as Fundamentals of Resource Description Framework (RDF), Fundamentals of Linked Data, RDF Vocabularies and Application Profiles, Creating and Transforming Linked Data, Interacting with RDF data, and Creating Linked Data applications. LD4PE is administered under the jurisdiction of the DCMI Education & Outreach Committee and is funded by the Institute of Museum and Library Services (IMLS).
  • Free Your Metadata This site, geared for libraries, archives, and museums, enables the matching of metadata with controlled vocabularies connected to the Linked Data cloud and the enriching of unstructured description fields using a named-entity extraction extension for OpenRefine. Learn how to check for errors and correct them, and publish metadata in a sustainable way. The site also provides information on relevant publications.
  • The language of languages Might, Matt. This article provides a brief explanation of grammars and common notations for grammars, such as Backus-Naur Form (BNF), Extended Backus-Naur Form (EBNF) and regular extensions to BNF. Grammars determine the structure of programming languages, protocol specifications, query languages, file formats, pattern languages, memory layouts, formal languages, config files, mark-up languages, formatting languages, and meta-languages. The Extended Backus-Naur Form notation is used to describe the essential BIBFRAME Profile syntax elements.
  • Linked Data: Evolving the Web into a Global Data Space Heath, Tom and Bizer, Christian. (2011). (1st edition). Synthesis Lectures on the Semantic Web: Theory and Technology, 1:1, 1-136. Morgan & Claypool. This overview of Linked Data principles and the Web of Data discusses patterns for publishing Linked Data and describes deployed Linked Data applications and their architecture. This book supersedes the publication, "How to Publish Linked Data on the Web," by Chris Bizer, Richard Cyganiak, and Tom Heath.
  • Linked Data Tools This site has been created by professional developers to help the web community transition into Web 3.0, or the Semantic Web. The site provides tools and tutorials for learning how to begin using the semantic web.
  • MarcEdit and OpenRefine Reese, Terry. (2016, January 16). This page describes how to export a MARC file for use in OpenRefine.
  • MARCEdit You Tube Videos This page lists over 90 videos produced by Terry Reese providing instructions for using MARCEdit. Topics include "MarcEdit 101: I have a MARC record, now what?," "Installing MarcEdit natively on a Mac operating system," "Extract and Edit Subsets of Records in MarcEdit," "MarcEdit Task Automation Tool," and "MarcEdit RDA Helper."
  • NCompass Live: Metadata Manipulations: Using MarcEdit and OpenRefine Nebraska Library Commission. (2015, June 24). This tutorial provides instruction for using OpenRefine and MARCEdit.
  • NCompass Live: Metadata Manipulations: Using Marc Edit And Open Refine To Enhance Technical Services Workflows Nebraska Library Commission. (2015, June 24). This video shows how to use MARCEdit and OpenRefine to edit your catalog records more efficiently, transform your library data from one format to another, and detect misspellings and other inaccuracies in your metadata.
  • Ontogenesis Lord, Phillip. (2012). This is an archived Knowledge Blog which provides access to descriptive, tutorial, and explanatory material about building, using, and maintaining ontologies, as well as the social processes and technology that support this. There are links to articles, many peer reviewed, and tutorials regarding a range of topics of interest for developers and users of ontologies.
  • Ontology Development 101: A Guide to Creating Your First Ontology Noy, Natalya F. and McGuinness, Deborah L. Stanford University. This guide discusses the reasons for developing an ontology and the methodology for creating an ontology based on declarative knowledge representation systems.
  • OpenRefine Wiki External Resources This page lists tutorials and resources developed outside the OpenRefine wiki covering a wide range of topics and use cases, including general instruction, data clean up, geospatial metadata, spreadsheet transformations, and much more.
  • Programming Historian Crymble, Adam, Fred Gibbs, Allison Hegel, Caleb McDaniel, Ian Milligan, Evan Taparata, and Jeri Wieringa, eds. (2016). The Programming Historian. 2nd ed. This blog provides peer-reviewed tutorials geared towards helping humanists learn a wide range of digital tools, techniques, and workflows to facilitate their research. Several of the tutorials are related to linked data. Other tutorials may be of interest to those generating or consuming data.
  • RDFa with schema.org codelab: overview Scott, Dan. (2014, Dec. 1). This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Using detailed instructions and examples, this page walks through the process of using schema.org to enhance library web pages so that they contain structured data using the schema.org vocabulary and RDFa attributes.
  • Semantic Web Data at the University of Washington Libraries Cataloging and Metadata Services, University of Washington. This webpage links to a wide range of useful resources and guidelines for working with Linked Data in a University setting. The project was developed by the Institute of Museum and Library Services.
  • What Can We Do About Our Legacy Data? Hillmann, Diane. (2015). This is Diane Hillmann's presentation given at the 2015 American Library Association Conference raising questions about moving library data onto the Semantic Web. Posted to SlideShare on June 29, 2015.
  • XPath Tutorial This W3schools page provides an introductory tutorial for XPath, a language for finding information in an XML document.
  • Bulletin of the Association for Information Science and Technology Association for Information Science and Technology. Silver Spring, Maryland. [2013-]
  • Cataloging & Classification Quarterly Haworth Press. Binghamton, NY. (1981)- ISSN: 1544-4554. The journal covers the full spectrum of creation, content, management, use, and usability of bibliographic records, including the principles, functions, and techniques of descriptive cataloging. The range of methods of subject analysis and classification, provision of access for all formats of materials, and policies, planning, and issues connected to the effective use of bibliographic data in modern society are also focuses of this journal.
  • The Code{4}Lib Journal Code{4}Lib Journal, Chapel Hill, N.C. (2007-). ISSN: 1940-5758. The focus of this journal is to provide the library community with information regarding technology tools for managing information in libraries.
  • International Journal of Web & Semantic Technology Academy & Industry Research Collaboration Center (AIRCC). (2010 - ). ISSN: 0975-9026; EISSN: 0975-9026. This journal focuses on theory, methodology, and applications of web and semantic technology.
  • Journal of Library Metadata Haworth Press. New York, NY. (2008 - ). ISSN: 1937-5034; ISSN: 1938-6389. The metadata that describes library resources is becoming more critical for digital resource management and discovery. This journal covers application profiles, best practices, controlled vocabularies, cross walking of metadata and interoperability, digital libraries and metadata, federated repositories and searching, folksonomies, individual metadata schemes, institutional repository metadata, metadata content standards, resource description framework, SKOS, topic maps, and more.
  • Journal of the Association for Information Science and Technology Association for Information Science and Technology. Wiley Blackwell. Hoboken, NJ. (2014). This journal publishes original research that focuses on the production, discovery, recording, storage, representation, retrieval, presentation, manipulation, dissemination, use, and evaluation of information and on the tools and techniques associated with these processes.
  • Library Technology Reports American Library Association, Chicago, Ill. (2009 - ). Library Technology Reports focuses on the application of technology to library services, including evaluative descriptions of specific products or product classes, and covers emerging technology. The journal is sunsetting December, 2022 and will be available for single-issue sales only.
  • Web Semantics : Science, Services and Agents on the World Wide Web Elsevier Science. Amsterdam; New York. (2004)- ISSN: 1873-7749; ISSN : 1570-8268. This journal covers all aspects of Semantic Web development including topics such as knowledge technologies, ontology, agents, databases and the semantic grid. It also focuses on disciplines such as information retrieval, language technology, human-computer interaction and knowledge discovery.

Articles and Papers

  • Addressing the Challenges with Organizational Identifiers and ISNI Smith-Yoshimura, Karen, Gatenby, Janifer, Agnew, Grace, Brown, Christopher, Byrne, Kate, Carruthers, Matt, Fletcher, Peter, Hearn, Stephen, Li, Xiaoli, Muilwijk, Marina, Naun, Chew Chiat, Riemer, John, Sadler, Roderick, Wang, Jing, Wiley, Glen, and Willey, Kayla. (2016). Dublin, Ohio: OCLC Research. This paper discusses a model for using unique identifiers that are resolvable globally over networks via a specific protocol to provide the means to find and identify an organization accurately and to define the relationships among its sub-units and with other organizations.
  • A Division of Labor: The Role of Schema.org in a Semantic Web Model of Library Resources Godby, Carol Jean. (2017). This article describes experiments with Schema.org conducted by OCLC as a foundation for a linked data model for library resources, and why Schema.org was the vocabulary considered in designing the next generation standards for library data.
  • Creating Organization Name Authority within an Electronic Resources Management System Blake, K., & Samples, J. (2009) Library Resources & Technical Services, 53(2), 94-107. To access the linked data project associated with this article, click on Organization Name Linked Data on our Use Cases Page.
  • Creating Value with Identifiers in an Open Data World Open Data Institute and Thomson Reuters. (2014) Creating Value with Identifiers in an Open Data World. Retrieved from http://thomsonreuters.com/site/data-identifiers. This joint effort between Thomson Reuters and the Open Data Institute serves as a guide for how identifiers can create value by empowering linked data for publishing and discovery.
  • The Global Open Knowledgebase (GOKb): open linked data supporting electronic resources management and scholarly communication Antelman, Kristin and Wilson, Kristen. (2015). DOI: http://doi.org/10.1629/uksg.217. CC BY 3.0 License. The Global Open Knowledgebase is an open data repository of information related to e-resources as they are acquired and managed in a library environment. This article describes how the GOKb model was developed to track this information.
  • Hello BIBFRAME2.0: Changes from 1.0 and Possible Directions for the Future Kroeger, Angela. J. (2016, October 20). Criss Library Faculty Proceedings & Presentations. 65. This presentation introduces the basics and history of the BIBFRAME model, and its relationship to RDF, FRBR, and RDA. It covers core classes, editors, mixing metadata, holdings, approaches, PREMIS, changes from BIBFRAME1.0, and more.
  • Introducing the FRBR Library Reference Model Riva, Pat, and Žumer, Maja. (2015). This paper serves as an introduction to the FRBR Library Reference Model which consolidates the FRBR, FRAD, and FRSAD models for bibliographic data, authority data, and subject authority data so that the model's definitions can be readily transferred to the IFLA FRBR namespace for use with linked open data applications.
  • Linked Data in Libraries: A Case Study of Harvesting and Sharing Bibliographic Metadata with BIBFRAME Tharani, Karim. (2015). In "Information Technology and Libraries", 34(1). This paper illustrates and evaluates the Bibliographic Framework (BIBFRAME) as a means for harvesting and sharing bibliographic metadata over the web for libraries. With BIBFRAME disparate library metadata sources such as catalogs and digital collections can be harvested and integrated over the web.
  • LTS and Linked Data: a position paper Naun, Chew Chiat, Kovari, Jason, and Folsom, Steven. (2015, Dec. 16). Prepared for Cornell University Library Technical Services (LTS), this paper explores reasons for adopting linked data techniques for describing and managing library collections, and seeks to articulate a specific role for Library Technical Services within this linked data environment.
  • Making Ontology Relationships Explicit in a Ontology Network Díaz, Alicia, Motz, Regina, and Rohrer, Edelweis. (2011). This paper formally defines the different relationships among networked ontologies and shows how they can be modeled as an ontology network in a case study of the health domain.
  • RDA vocabularies for a twenty-first-century data environment Coyle, Karen. (2010). Library technology reports, v. 46, no. 2, p.5-39. Contents include Library Data in the Web World, Metadata Models of the World Wide Web, FRBR, the Domain Model, and RDA in RDF.
  • The Relationship between BIBFRAME and OCLC’s Linked-Data Model of Bibliographic Description: A Working Paper Godby, Carol Jean. (2013, June). Dublin, Ohio: OCLC Research. This paper describes a proposed alignment between BIBFRAME and an OCLC model using Schema Bib Extend extensions to enhance Schema.org for use with the description of library resources.
  • Sharing Research Data and Intellectual Property Law: A Primer Carroll, Michael W. (2015). PLoS Biol 13(8): e1002235. doi:10.1371/journal.pbio.1002235. This article explains how to work through the general intellectual property and contractual issues for all research data.
  • Towards Identity in Linked Data McCusker, James P. and McGuinness, Deborah L. Rensselaer Polytechnic Institute. This paper poses problems with and solutions for using owl:sameAs for linking datasets when dealing with provenance, context, and imperfect representations in Linked Data. The paper uses examples of merging provenance in biomedical applications.
  • Understanding Metadata Riley, Jenn. National Information Standards Organization (NISO). This primer offers guidance on using data and covers developments in metadata, new tools, best practices, and available resources.
  • Web-Scale Querying through Linked Data Fragments Verborgh, Ruben, Vander Sande, Miel, Colpaert, Pieter, Coppens, Sam, Mannens, Erik, Van de Walle, Rik. (2014). This paper explains the core concepts behind Linked Data Fragments, a method that allows efficient linked data query execution from servers to clients through a lightweight partitioning strategy.
  • When owl:sameAs Isn’t the Same: An Analysis of Identity in Linked Data Halpin, Harry, Hayes, Patrick J., McCusker, James P., McGuinness, Deborah L., and Thompson, Henry S. (2010). Patel-Schneider, P. F. et al. (Eds.): ISWC 2010, Part I, LNCS 6496, pp. 305–320, Springer-Verlag Berlin Heidelberg. This document discusses how owl:sameAs is being used and misused on the Web of data, particularly with regards to interactions with inference. The authors describe how referentially opaque contexts that do not allow inference exist, and outline some varieties of referentially-opaque alternatives to owl:sameAs.

This page lists Semantic Web services which are of interest to information specialists, libraries, museums, and cultural organizations.

  • Library.Link Network Library.Link Network is a service which transforms data from library resources into searchable resources on the Web using Linked Data.
  • Library of Congress Linked Data Service This is the portal for all of the Library of Congress' Linked Data Vocabularies and Authorities, including LC Subject Headings, Name Authority File, MARC Relators, LC Classification, LC Children's Subject Headings, LC Genre/Form Terms, ISO Languages, Cultural Organizations, and Content Types, among others.
  • Share-VDE Share-VDE (SVDE) is a discovery interface offering an intuitive delivery service of wide-ranging and detailed search results to library patrons. Library catalogues of participating institutions are converted from MARC to Resource Description Framework (RDF) using the BIBFRAME vocabulary and other ontologies to form clusters of entities. The network of resources created is published as linked data. A common knowledge base of clusters is compiled in a Cluster Knowledge Base named Sapientia. Participating libraries handle their own data as independently as possible and receive their original records converted into linked data. The SVDE infrastructure is built on the LOD Platform.
  • VIAF: The Virtual International Authority File VIAF links and matches multiple name authority files from global resources into a single OCLC-hosted name authority service increasing the utility of library authority files and making them available on the Web.
  • WorldCat Entities OCLC. (2022). This OCLC service provides the ability to search WorldCat Entities for persons and works. Browse through different languages and explore the way each entity links to other external vocabularies and authority.

Semantic Web technology uses an array of tools. This page lists conversion tools, data management tools, glossaries, ontology & vocabulary building platforms, Semantic Web browsers, validators, XML editors, and XPath tools.

  • W3C Semantic Web Tools This Wiki lists an array of tools for developing Semantic Web applications compiled by the W3C, including development environments, editors, libraries or modules for various programming languages, specialized browsers, and more.

Assessment Tools

  • DLF AIG MWG Metadata Assessment Toolkit The Digital Library Federation (DLF) Assessment Interest Group (AIG) Metadata Working Group (MWG) aka DLF Metadata Assessment Working Group. The toolkit is a great resource for assessment information and tools and covers a review of the literature, tools, and organizations concerning metadata assessment, quality, and best practices. The site provides a list of metadata assessment tools, and a collection of application profiles, mappings, code and best practices provided by several institutions.
  • LODQuator LODQuator is a data portal built on the Luzzu Quality Assessment Framework for ranking and filtering Linked Open Data Cloud datasets. It provides the ability to search datasets based on their quality using over a dozen metrics which are listed on the site.
  • Luzzu Enterprise Information Systems (EIS) at Fraunhofer Institute for Intelligent Analysis and Information Systems (IAIS), University of Bonn. Luzzu is a quality assessment framework for Linked Data Open datasets based on the Dataset Quality Ontology (daQ). It assesses Linked Data quality using user-provided domain specific quality metrics in a scalable manner, provides query enabled quality metadata on assessed datasets, and assembles detailed quality reports on assessed datasets.

Authority Tools

  • Authority toolkit: create and modify authority records Strawn, Gary L. (2016, June 30). Northwestern University. Evanston, IL USA. This document describes how the Authority Toolkit can be used to create a new authority record from an access field in a bibliographic record. Use the tool to help you enhance the preliminary authority record, enhance an existing authority record, or extract one identity from an undifferentiated personal name authority record and then enhance the preliminary authority record for the extracted identity. The tool can be used to extract information from sources such as VIAF, Wikidata, Wikipedia, and the CERL thesaurus into authority records.

BIBFRAME Tools

  • BIBFRAME Comparison Tool This tool provides for the side-by-side conversion of MARCXML records from the Library of Congress database to BIBFRAME2 using a LCCN or record number. Records can be serialized in Turtle or RDF XML.
  • Bibliographic Framework Initiative Library of Congress. The Bibliographic Framework Initiative is the replacement for MARC developed by the Library of Congress and is investigating all aspects of bibliographic description, data creation, and data exchange. More broadly the initiative includes accommodating different content models and cataloging rules, exploring new methods of data entry, and evaluating current exchange protocols. This page provides access to the BIBFRAME 2.0 model, vocabulary, extension list view, and MARC 21 to BIBFRAME conversion tools. The BIBFRAME Implementation Register can be accessed here.
  • marc2bibframe2 This tool, available on GitHub, uses an XSLT 1.0 application to convert MARCXML to RDF/XML, using the BIBFRAME 2.0 and MADSRDF ontologies. Information regarding integration of the application with Metaproxy is also available.
  • MARC 21 to BIBFRAME 2.0 Conversion Specifications These specifications were developed to support a pilot in the use of BIBFRAME 2.0 at the Library of Congress. They specify the conversion of MARC Bibliographic records to BIBFRAME Work, Instance and Item descriptions, and MARC Authority records for titles and name/titles to BIBFRAME Work descriptions. The specifications were written from the perspective of MARC so that each element in MARC would at least be considered, even if not converted. The specifications are presented in MS Excel files with explanatory specifications in MS Word.
  • Sinopia Sinopia is an implementation of the Library of Congress BIBFRAME Editor and Profile Editor.

Conversion Tools

  • Freeformatter JSON to XML Converter This tool converts a JSON file into an XML file. The converter uses rules to make allowances for XML using different item types that do not have an equivalent JSON representation.
  • Freeformatter XML to JSON Converter This tool converts an XML file into a JSON file. The converter uses rules to make allowances for XML using different item types that do not have an equivalent JSON representation.
  • OxGarage OxGarage is a web, RESTful conversion service developed by the University of Oxford IT Services. The majority of transformations use the Text Encoding Initiative (TEI) format as a pivot format, and many other formats are supported, including TEI to Word and Word to TEI. Give the page a moment to load. Choose a format from a menu of Documents, Presentations, or Spreadsheets to convert to a format from a list provided for each menu option.
  • Pandoc Pandoc converts documents in markdown, reStructuredText, textile, HTML, DocBook, LaTeX, MediaWiki markup, TWiki markup, OPML, Emacs Org-Mode, Txt2Tags, Microsoft Word docx, LibreOffice ODT, EPUB, or Haddock markup to HTML formats, word processor formats, Ebooks, documentation formats, page layout formats, outline formats, TeX formats, PDF, lightweight markup formats, and custom formats.
  • SearchFAST OCLC Research. SearchFast is a suite of tools for working with FAST headings. The tools include a converter to convert Library of Congress Subject Headings to FAST headings, searchFast, a search interface for the FAST database, and mapFast, a Google Maps mashup to provide map based access to bibliographic records using FAST geographic and event authorities. Other tools in the suite include FAST Linked Data, authorities formatted using schema.org and SKOS (Simple Knowledge Organization System) that are linked to LCSH and other authorities such as VIAF, Wikipedia, and GeoNames, and assignFast, a web service that automates manual selection of FAST subjects.
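
The XML and JSON converters listed above mention rules for XML constructs that have no direct JSON equivalent. The following is a minimal sketch of what such rules can look like, written in Python with only the standard library. It does not reproduce the rules of Freeformatter or any other listed tool; the "@" prefix for attributes, the "#text" key, and the sample record are assumptions chosen for illustration.

```python
import json
import xml.etree.ElementTree as ET


def element_to_obj(elem):
    """Convert an Element to JSON-compatible data using a few ad hoc rules."""
    obj = {}
    # Rule 1: attributes have no JSON equivalent, so prefix their names with "@".
    for name, value in elem.attrib.items():
        obj["@" + name] = value
    # Rule 2: repeated child elements become a list; a single child stays a value.
    for child in elem:
        value = element_to_obj(child)
        if child.tag in obj:
            if not isinstance(obj[child.tag], list):
                obj[child.tag] = [obj[child.tag]]
            obj[child.tag].append(value)
        else:
            obj[child.tag] = value
    text = (elem.text or "").strip()
    # Rule 3: an element holding only text becomes a plain string;
    # text that sits beside attributes or children is stored under "#text".
    if not obj:
        return text
    if text:
        obj["#text"] = text
    return obj


# Hypothetical sample record; element names are illustrative only.
XML = (
    '<book id="b1"><title>Example</title>'
    "<author>A. Writer</author><author>B. Writer</author></book>"
)

root = ET.fromstring(XML)
converted = {root.tag: element_to_obj(root)}
print(json.dumps(converted, indent=2))
```

Real converters differ in how they treat mixed content, namespaces, and data types, so output from the listed tools may not match this sketch.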

Data Management Tools

  • CKAN CKAN (Comprehensive Knowledge Archive Network) is an open-source data portal platform aimed at data publishers such as national and regional governments (including the U.S. government), companies and organizations wanting to make their data open and available. CKAN's harvesting framework can be used to retrieve, normalize, and convert dataset metadata from multiple catalogs. It provides a catalog system, integration with third-party content management systems like Drupal and WordPress, data visualization and analytics, integrated data storage and full data API, and more. CKAN is maintained by the Open Knowledge Foundation which provides support and hosting.
  • DataHub DataHub is a free data management platform from the Open Knowledge Foundation. It can be used to publish or register datasets as well as create and manage groups and communities. It is based on the CKAN data management system.
  • The Dataverse Project A repository for research data that supports the sharing of open data and enables reproducible research.
  • eXistdb eXistdb is a NoSQL XML and non-documents database which uses the XML Query Language (XQuery) for coding and indexing. It can work alongside oXygen. Users of eXistdb include the Office of the Historian, United States Department of State and the University of Victoria Humanities Computing and Media Centre.
  • Fedora Fedora (Flexible Extensible Digital Object Repository Architecture) is a modular, open source repository platform for the management and dissemination of digital content, including curating research data throughout the research life cycle from beginning through preservation in a RDF environment. Fedora is being used for digital collections, e-research, digital libraries, archives, digital preservation, institutional repositories, open access publishing, document management, digital asset management, and more.
  • Jupyter Jupyter is an open-source web application for creating and sharing documents containing live code, equations, visualizations and narrative text. It can be used for data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more. Jupyter supports over 40 programming languages, including Python, R, Julia, and Scala.
  • KriKri KriKri is a Ruby on Rails open source engine for metadata aggregation, enhancement, and quality control developed by the Digital Public Library of America (DPLA) and released under the MIT License. It works with Heiðrún, DPLA's metadata ingestion system. Features include: harvesting metadata from OAI-PMH providers; creating RDF metadata models, with specific support for the DPLA Metadata Application Profile; enrichments for mapped metadata, including date parsing and normalization, stripping and splitting on punctuation; parsing metadata and mapping to RDF graphs using a Domain Specific Language; and more.
  • OpenRefine OpenRefine (formerly Google Refine) is a tool for working with data. Use it to clean data, transform data from one format into another, extend data with web services, and link it to databases such as Wikidata.
  • Samvera Samvera (previously, Hydra), is an open source digital asset management framework. The system uses Ruby gem building blocks allowing for customization. Samvera instances can be cloned and adapted to local needs. Bundled solutions requiring fewer local resources or cloud-based, hosted versions include Avalon, Hyrax, and Hyku.
  • Wikibase Wikibase was developed for Wikidata as an open source collection of applications and libraries for creating and sharing structured data as linked data entities and their relationships. It consists of a set of extensions to the MediaWiki software for storing and managing data (Wikibase Repository) and for embedding data on other wikis (Wikibase Client). Wikibase provides an editing interface for creating, updating, merging, and deleting item and property entities.

Discovery Interfaces

  • Blacklight Blacklight is an open source discovery interface framework for searching an Apache Solr index. Blacklight MARC provides library catalog enhancements, Spotlight enables the creation of feature rich websites for digital collections, and Geoblacklight provides for the discovery and sharing of geospatial data. Search box, facet constraints, stable document URLs, and more are customizable via Rails templating mechanisms. It accommodates heterogeneous data, allowing different information displays for different types of objects.
  • Geany Geany is an open source text editor using the GTK+ toolkit with basic features of an integrated development environment (IDE). It supports many filetypes including C, Java, JavaScript, PHP, HTML, CSS, Python, Perl, Pascal, Ruby, XML, SQL, and more. Features include syntax highlighting, code folding, symbol name auto-completion, auto-closing of XML and HTML tags, code navigation, build system to compile and execute your code, symbol lists, and a plug-in interface. Geany runs on every platform which is supported by the GTK libraries including Linux, FreeBSD, NetBSD, OpenBSD, MacOS X, AIX v5.3, Solaris Express and Windows. Only the Windows port of Geany is missing some features.
  • LIME Palmirani, Monica, Vitali, Fabio, and Cervone, Luca, et al. LIME is an open source, customizable web based editor for converting non-structured legal documents into XML. Currently, there are demo versions of LIME for three schema languages: AkomaNtoso; TEI; and LegalRuleML. LIME provides a linked outline view of the document and a contextual markup menu showing available elements. Click on the Demo tab at the top of the web site to choose a schema. LIME is under development at CIRSFID and the University of Bologna.
  • MarcEdit MarcEdit is a free MARC editing tool. Use the tool to download a MARC record and transform it into an RDF/XML serialization of the record. The tool also can be used to perform MARC database maintenance. MarcEdit includes a tool for querying registered XSLT crosswalks and downloading them for use with MarcEdit.
  • Notepad++ Notepad++ is a free source code editor that runs in the MS Windows environment.
  • oXygen oXygen is a licensed cross platform XML editor that works with all XML-based technologies including XML databases, XProc pipelines, and web services. oXygen XML Author comes with a configurable and extensible visual editing mode based on W3C CSS stylesheets with ready-to-use DITA, DocBook, TEI, XHTML, XSLT, and XQuery support.
  • pymarc Python Software Foundation. (2019). pymarc is a Python library for working with bibliographic data encoded in MARC21. It provides an API for reading, creating, and modifying MARC records.
  • RDFa Play RDFa Play is a real-time RDFa 1.1 editor, data visualizer and debugger. Paste your HTML+RDFa code into the editor to view a preview page, a data visualization, and the raw data of your code.
  • Dublin Core Generator This site provides three tools developed by Nick Steffel to generate Dublin Core code. The Simple Generator generates simple Dublin Core metadata using only the 15 main elements. Advanced Dublin Core metadata code using the more detailed qualified elements and encoding schemes can be generated using the Advanced Generator, and there is a generator for the xZINECOREx variation of Dublin Core.

Glossaries

  • Glossary of Metadata Standards This glossary lists the most common metadata standards used in the cultural heritage community. Several of them are listed on our Vocabularies page, which you can access by clicking on Vocabularies, etc. in the menu on the left. A color version of the Seeing Standards poster is also shown on that page. A poster version of the glossary is also available.
  • Glossary of Terms Relating to Thesauri and Other Forms of Structured Vocabulary Will, Leonard D. and Will, Sheena. (2013). This is an alphabetical list of terms associated with thesauri and structured vocabularies.
  • Linked Data Glossary This is the W3C's glossary of Linked Data terms.

Ontology/Vocabulary Building Platforms and Tools

  • Fluent Editor Fluent Editor is a tool for editing and manipulating complex ontologies that use Controlled Natural Language. A main feature is the usage of Controlled English as a knowledge modeling language. It prohibits one from entering any sentence that is grammatically or morphologically incorrect and actively helps the user during sentence writing. It is free for individual or academic use. Access to updates and information is given with registration.
  • Neologism Neologism is an open source vocabulary publishing platform for creating and publishing vocabularies compatible with Linked Data principles. It supports the RDFS standard enabling you to create RDF classes and properties. It also supports a part of OWL. Neologism is written in PHP and built on the Drupal platform.
  • NeOn NeOn is an open source multi-platform for the support of the ontology engineering life-cycle. The toolkit is based on the Eclipse platform and provides an extensive set of plug-ins covering a variety of ontology engineering activities, including Annotation and Documentation, Development, Human-Ontology Interaction, Knowledge Acquisition, Management, Modularization and Customization, Neon Plugins, Old Main Page, Ontology Dynamics, Ontology Evaluation, Ontology Matching, Reasoning and Inference, and Reuse. NeOn’s aim is to advance the state of the art in using ontologies for large-scale semantic applications in distributed organisations by improving the ability to handle multiple networked ontologies that exist in a particular context, are created collaboratively, and might be highly dynamic and constantly evolving.
  • OOPS! (OntOlogy Pitfall Scanner!) OOPS! is an application used to detect common pitfalls when developing ontologies. Enter the URI or the RDF code of the ontology. Once the ontology is analyzed, a results list of pitfalls appears that can be expanded to display information regarding the pitfalls.
  • Protégé Protégé is a free, open-source platform with a suite of tools to construct domain models and knowledge-based applications with ontologies. Protégé Desktop is a feature rich ontology editing environment with full support for the OWL 2 Web Ontology Language, and direct in-memory connections to description logic reasoners like HermiT and Pellet. Protégé Desktop supports creation and editing of one or more ontologies in a single workspace via a completely customizable user interface. Visualization tools allow for interactive navigation of ontology relationships. It is W3C standards compliant, offers ontology refactoring support, and is cross compatible with WebProtégé. Protégé provides an environment to create, upload, modify, and share ontologies for collaborative viewing and editing. Protégé was developed by the Stanford Center for Biomedical Informatics Research at the Stanford University School of Medicine. Download the desktop version or use the Web version from this site.
  • VOWL: Visual Notation for OWL Ontologies This page provides access to three tools for visualizing ontologies: WebVOWL; QueryVOWL; and the Protégé plug-in, ProtégéVOWL. A link to the VOWL (Visual Notation for OWL Ontologies) specification and a Language Reference for QueryVOWL (Visual Query Language) for Linked Data is also provided.

Query Tools, Search Engines & Browser Add-ons

  • Linked Data Fragments Use this tool to execute queries against live Linked Data on the Web in your browser. The tool supports federated querying.
  • OpenLink Data Explorer Extension OpenLink Software. This web browser extension provides options for viewing Data Sources associated with Web Pages to explore the raw data and entity relationships that underlie the Web resources it processes. The extension enables Hypertext and Hyperdata traversal of Web data. The browser add-on is easy to install. It was first developed for use on most browsers, but after some browser updates the add-on no longer works in all of them. Try using it with Chrome. The browser provides filters for faceted searching and visualization options.
  • OpenLink Structured Data Sniffer (OSDS) OpenLink Software. OpenLink Structured Data Sniffer is a browser extension for Google Chrome, Microsoft Edge, Mozilla Firefox, Opera, and Vivaldi that reveals structured metadata embedded in HTML pages in notations including POSH (Plain Old Semantic HTML), Microdata, JSON-LD, RDF-Turtle, and RDFa. Buttons assist in navigating the Web, and it provides the ability to save extracted metadata or new annotations to the cloud or local storage.
  • Metaproxy Index Data. Metaproxy is a proxy Z39.50/SRW/SRU front end server designed for integrating multiple back end databases into a single searchable resource. It also works in conjunction with Index Data’s library of gateways to access non-standard database servers. Index Data works with libraries, consortia, publishers, aggregators, technology vendors, and developers.
  • Ontobee He Group. University of Michigan. Ontobee is a Linked Data server designed to facilitate ontology sharing, visualization, query, integration, and analysis. It dereferences term URIs to HTML web pages for user-friendly browsing and navigation and to RDF source code for Semantic Web applications.

Triple Store Tools

  • Blazegraph Blazegraph is a scalable, high-performance graph database with support for Blueprints and RDF/SPARQL APIs. It supports up to 50 billion edges on a single machine. Blazegraph works in a Python environment. Wikimedia uses it to power the Wikidata Query Service.
  • Gruff Gruff is a free, downloadable graphical triple-store browser with a variety of tools for laying out cyclical graphs, displaying tables of properties, managing queries, and building queries as visual diagrams. Use gruff to display visual graphs of subsets of a store’s resources and their links and build a visual graph that displays a variety of the relationships in a triple-store. Gruff can also display tables of all properties of selected resources or generate tables with SPARQL queries, and resources in the tables can be added to the visual graph.

Validators

  • Freeformatter JSON Validator This tool validates a JSON string against RFC 4627 (the application/json media type for JavaScript Object Notation) and against the JavaScript language specification. Configure the validator to be lenient or strict.
  • Link Checker W3C. (2019). Use this validator to check issues with links, anchors and referenced objects in Web pages, CSS style sheets, or whole Web sites. Best results are achieved when the documents checked use Valid (X)HTML Markup and CSS.
  • RDF Validation Service Use this tool to parse RDF/XML documents. A 3-tuple (triple) representation of the corresponding data model as well as an optional graphical visualization of the data model will be displayed.
  • Structured Data Linter The Structured Data Linter was initiated by Stéphane Corlosquet and Gregg Kellogg. It is a tool to verify structured data present in HTML pages. The Linter provides snippet visualizations for schema.org and performs limited vocabulary validations for schema.org, Dublin Core Metadata Terms, Friend of a Friend (FOAF), GoodRelations, Facebook's Open Graph Protocol, Semantically-Interlinked Online Communities (SIOC), Simple Knowledge Organization System (SKOS), and Data-Vocabulary.org.
  • Toolz Online XML Validator Insert a fragment of an XML document into this tool to validate it.
  • Yandex Yandex is a structured data Microformat validator for checking semantic markup. Check all the most common microformats: microdata, schema.org, microformats, OpenGraph and RDF by cutting and pasting the source code into the validator.

Visualization Tools

  • D3 Data-Driven Documents is a JavaScript library for manipulating documents based on data using HTML, SVG and CSS. Using D3, data can be displayed in a vast array of visualization formats including, but not limited to, Box Plots, Bubble Charts, Bullet Charts, Calendar Views, Chord Diagrams, Dendrograms, Force-Directed Graphs, Circle Packings, Population Pyramids, Streamgraphs, Sunbursts, Node-link Trees, Treemaps, Voronoi Diagrams, Collision Detections, Hierarchical Edge Bundlings, Word Clouds, and more.
  • Visual Data Web The Visual Data Web provides links to visualization tools compatible with RDF and Linked Data on the Semantic Web, especially for average Web users with little to no knowledge about the underlying technologies. The site provides information regarding developments, related publications, and current activities to generate new ideas, methods, and tools to make the Data Web more accessible and visible.

XPath Tools

  • eagle-i The eagle-i Software and ontology consists of six web applications: eagle-i Central Search and iPS Cell Search — for resource discovery and exploration; Institutional search — for a single repository search UI; Ontology Browser — for viewing the eagle-i ontology without any additional applications; SWEET (Semantic Web Entry & Editing Tool) — for manually entering and managing data in an eagle-i repository; RDF repository — for storing resource and provenance metadata as RDF triples; and SPARQLer — a SPARQL query entry point and workbench to query an eagle-i repository. These applications are served by the ETL (extract, transform, and load) toolkit — for batch entry of information to an eagle-i repository in an ontology-compliant manner and the Data management toolkit — for bulk data maintenance and migration. The open source software development platform offers integrated tools for JIRA bug tracking, Confluence Wiki, Bamboo continuous builds, Nexus download repository, project mailing lists, repository monitoring, and more.
  • Freeformatter XPath Tester/Evaluator Use this tool to test XPath expressions/queries against an XML file. It supports most of the XPath functions (string(), number(), name(), string-length() etc.) and is not limited to working against nodes.
  • Toolz XPath Tester/Evaluator Use this tool to run an XPath statement against an XML fragment.
  • W3C XPath evaluation online Use this W3C tool to check an XPath expression against XML.
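
To make concrete what the XPath testers above do, here is a minimal sketch in Python that runs path expressions against an XML fragment. It uses only the standard library's xml.etree.ElementTree module, which supports a limited subset of XPath (predicates on attributes and child text, but not functions such as string-length()). The XML fragment, its element names, and the two helper functions are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML fragment; element and attribute names are illustrative only.
XML = """
<records>
  <record id="r1"><title>Linked Data Basics</title><lang>en</lang></record>
  <record id="r2"><title>RDF Grundlagen</title><lang>de</lang></record>
  <record id="r3"><title>SPARQL by Example</title><lang>en</lang></record>
</records>
"""

root = ET.fromstring(XML)


def titles_in_language(root, lang):
    """Return the titles of records whose <lang> child has the given text."""
    # [lang='en'] is a predicate: keep only <record> elements with a matching child.
    return [t.text for t in root.findall(f"./record[lang='{lang}']/title")]


def record_by_id(root, record_id):
    """Return the <record> whose id attribute matches, or None if there is none."""
    return root.find(f"./record[@id='{record_id}']")


english_titles = titles_in_language(root, "en")
second = record_by_id(root, "r2")
print(english_titles)
print(second.find("title").text)
```

The online testers listed above are the better choice for expressions that rely on the XPath function library, which this standard-library subset does not cover.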

Miscellaneous

  • Keyword Planner The Google AdWords Keyword Planner is not a semantic web tool. While geared towards advertising, it can be a useful tool to discover similar keywords for a topic. It is a free tool, but you will have to create an account.
  • prefix.cc Enter a namespace prefix in this tool to find the full namespace for the prefix. The service also provides a reverse lookup option which finds a prefix for a given namespace URI.

Instructional Resources for Semantic Tools

  • WebLearn This blog provides examples of using OpenRefine to clean MARC data. Stephen shares his experience working with MARC data while developing the Sir Louie Project, a project on behalf of the British Library to improve the searching of library catalogues and the display of availability information with a reading list.

SPARQL serves as the search engine for RDF. It is a set of specifications recommended by the W3C that provide languages and protocols to query and manipulate RDF graph content on the Web or in an RDF triple store.

SPARQL Documentation

  • SPARQL 1.1 Entailment Regimes Glimm, Birte, and Ogbuji, Chimezie, editors. (2013, March 21). This document defines entailment regimes and specifies how they can be used to redefine the evaluation of basic graph patterns from a SPARQL query making use of SPARQL's extension point for basic graph pattern matching. Entailment regimes specify conditions that limit the number of entailments that contribute solutions for a basic graph pattern.
  • SPARQL 1.1 Federated Query Seaborne, Andy, Polleres, Axel, Feigenbaum, Lee, and Williams, Gregory Todd. (2013, March 21). The SPARQL Federated Query extension is a specification which defines the syntax and semantics for using the SERVICE keyword to execute queries that merge data distributed over different SPARQL endpoints. It provides for the ability to direct a portion of a query to a particular SPARQL endpoint. Results are returned to the federated query processor and are combined with results from the rest of the query.
  • SPARQL 1.1 Graph Store HTTP Protocol Ogbuji, Chimezie, editor. (2013, March 21). This document describes the use of HTTP for managing a collection of RDF graphs as an alternative to the SPARQL 1.1 Update protocol interface. For some clients or servers, HTTP may be easier to implement or work with, and this specification serves as a non-normative suggestion for HTTP operations on RDF graphs which are managed outside of a SPARQL 1.1 graph store.
  • SPARQL 1.1 Overview W3C SPARQL Working Group. (2013, March 21). This document provides an introduction to a set of W3C specifications for querying and manipulating RDF graph content on the Web or in an RDF store. It gives a brief description of the eleven specifications that comprise SPARQL.
  • SPARQL 1.1 Protocol Feigenbaum, Lee, Williams, Gregory Todd, Clark, Kendall Grant, Torres, Elias. (2013, March 21). The SPARQL 1.1 Protocol describes a means for conveying SPARQL queries and updates to a SPARQL processing service and returning the results via HTTP to the entity that requested them. It has been designed for compatibility with the SPARQL 1.1 Query Language [SPARQL] and with the SPARQL 1.1 Update Language for RDF. This document is intended primarily for software developers implementing SPARQL query and update services and clients.
  • SPARQL 1.1 Query Results CSV and TSV Formats Seaborne, Andy. (2013, March 21). This document describes the use of Comma Separated Values (CSV) and Tab Separated Values (TSV) for expressing SPARQL query results from SELECT queries. CSV and TSV are formats for the transmission of tabular data, particularly spreadsheets.
  • SPARQL 1.1 Query Results JSON Format Clark, Kendall Grant, Feigenbaum, Lee, Torres, Elias. (2013, March 21). SPARQL is a set of standards which defines several Query Result Forms used to query and update RDF data, along with ways to access such data over the web. This document defines the representation of SELECT and ASK query results using JSON.
  • SPARQL Query Results XML Format (Second Edition) Beckett, Dave, and Broekstra, Jeen. (2013, March 21). SPARQL is a set of standards which defines several Query Result Forms used to query and update RDF data, along with ways to access such data over the web. This document defines the SPARQL Results Document that encodes variable binding query results from SELECT queries and boolean query results from ASK queries in XML.
  • SPARQL 1.1 Query Language Harris, Steve, Seaborne, Andy. (2013, March 21). This document defines the syntax and semantics of the SPARQL query language for RDF, a directed, labeled graph data format for representing information in the Web. SPARQL is used to express queries across data sources, whether the data is stored natively as RDF or viewed as RDF via middleware. SPARQL supports querying required and optional graph patterns along with their conjunctions and disjunctions, aggregation, subqueries, negation, creating values by expressions, extensible value testing, and constraining queries by source RDF graph. Results of SPARQL queries can be result sets or RDF graphs.
  • SPARQL 1.1 Service Description Williams, Gregory Todd. (2013, March 21). A SPARQL service description lists the features of a SPARQL service made available via the SPARQL Protocol for RDF. This document describes how to discover a service description from a specific SPARQL service and an RDF schema for encoding such descriptions in RDF.
  • SPARQL 1.1 Update Gearon, Paula, Passant, Alexandre, and Polleres, Axel. (2013, March 21). SPARQL 1.1 Update is a language used to update RDF graphs using a syntax derived from the SPARQL Query Language for RDF. Operations are provided to update, create, and remove RDF graphs in a Graph Store.
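
As a rough illustration of how two of the specifications above fit together, the sketch below builds a "query via GET" request as described in the SPARQL 1.1 Protocol (the query text is sent in the `query` parameter) and flattens a SELECT result written in the SPARQL 1.1 Query Results JSON Format. It is written in Python with only the standard library and sends nothing over the network. The endpoint URL, the sample response, and the helper function names are assumptions made up for this example.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint URL, used only to show the shape of a request.
ENDPOINT = "https://example.org/sparql"

QUERY = """
SELECT ?s ?label
WHERE { ?s <http://www.w3.org/2000/01/rdf-schema#label> ?label }
LIMIT 2
"""


def build_get_url(endpoint, query):
    """Build a 'query via GET' URL: the query text goes in the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query})


# Media type a client can request to receive results in the JSON format.
HEADERS = {"Accept": "application/sparql-results+json"}

# Made-up response body laid out in the SPARQL Query Results JSON Format.
SAMPLE_RESPONSE = """
{
  "head": {"vars": ["s", "label"]},
  "results": {"bindings": [
    {"s": {"type": "uri", "value": "http://example.org/book/1"},
     "label": {"type": "literal", "xml:lang": "en", "value": "Example Book"}},
    {"s": {"type": "uri", "value": "http://example.org/book/2"},
     "label": {"type": "literal", "value": "Another Book"}}
  ]}
}
"""


def rows_from_results(text):
    """Flatten SELECT results into a list of {variable: value} dicts."""
    doc = json.loads(text)
    variables = doc["head"]["vars"]
    rows = []
    for binding in doc["results"]["bindings"]:
        # A variable may be unbound in a given solution, so use .get().
        rows.append({v: binding.get(v, {}).get("value") for v in variables})
    return rows


url = build_get_url(ENDPOINT, QUERY)
rows = rows_from_results(SAMPLE_RESPONSE)
print(url)
for row in rows:
    print(row)
```

To try this against a real service, replace the hypothetical endpoint with one of the SPARQL endpoints listed in this guide and send the URL with the Accept header shown.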

GeoSpatial SPARQL

In addition to the W3C SPARQL documents, there is documentation for a Geospatial SPARQL query language.

  • OGC GeoSPARQL - A Geographic Query Language for RDF Data Perry, Matthew, and Herring, John, editors. (2012, September 10). Open Geospatial Consortium (OGC). This OGC standard defines a vocabulary for representing geospatial data in RDF. It also defines an extension to the SPARQL query language for processing geospatial data. The GeoSPARQL query language is designed to accommodate systems based on qualitative spatial reasoning and systems based on quantitative spatial computations.

SPARQL Endpoints

This box provides links to some SPARQL endpoints that are useful for researchers, and are good examples of datasets to practice using SPARQL queries. The Europeana dataset is used in the SPARQL for humanists tutorial on the left.

  • Europeana SPARQL API Use this API to explore connections between Europeana data and outside data sources, like VIAF, Iconclass, Getty Vocabularies (AAT), Geonames, Wikidata, and DBPedia.

SPARQL Tools

This box contains SPARQL tools.

  • Apache Jena Fuseki2 Apache Jena Fuseki is a SPARQL server. It has the capability to run as an operating system service, as a Java web application (WAR file), and as a standalone server. It provides SPARQL 1.1 protocols for query, update and the SPARQL Graph Store. Fuseki can be configured with TDB to provide a transactional persistent storage layer, and incorporates Jena text query and Jena spatial query.
  • Pubby Bizer, Christian, and Cyganiak, Richard. Freie Universität Berlin. Pubby adds Linked Data interfaces to SPARQL endpoints. It can turn a SPARQL endpoint into a Linked Data server, and is implemented as a Java web application. Features include providing dereferenceable URIs by rewriting URIs found in the SPARQL-exposed dataset into the Pubby server's namespace, providing an HTML interface showing the data available about each resource, handling 303 redirects and content negotiation, and supporting the addition of metadata. It is compatible with Tomcat and Jetty servlet containers.

SPARQL Instructional Resources

  • SPARQL Sample Queries Coombs, Karen. This page on the GitHub blog, Library Web Chic, provides useful examples of SPARQL queries. This is an excellent place to browse through when learning how to query with SPARQL. Examples include simple queries for finding subjects, predicates, and objects and build into more complex federated and filtered queries across datasets. This serves as a companion to Karen Coombs' Querying Linked Data webinar.
  • SPARQL for humanists Lincoln, Matthew. (2014, July 10). From The Programming Historian. This blog entry describes how to use SPARQL with the Europeana Data Model (EDM). It provides a good introduction to SPARQL. For a more advanced lesson in learning SPARQL, see Matthew Lincoln, Using SPARQL to access Linked Open Data, from The Programming Historian.
  • Using SPARQL to access Linked Open Data Lincoln, Matthew. (2015, November 24). From The Programming Historian. This blog entry provides a lesson explaining why cultural institutions are moving to graph databases. The entry also gives a detailed lesson in using SPARQL to access data in cultural institution databases.

Vocabularies, Ontologies and Frameworks


Controlled vocabularies, ontologies, schemas, thesauri, and syntaxes are building blocks used by Resource Description Framework (RDF) to structure data semantically, identify resources, and to show the relationships between resources in Linked Data. Libraries and cultural institutions belong to one of the many knowledge organization domains making use of controlled authorities. These pages focus especially on the vocabularies and computer languages that are used in the library and cultural heritage institutions data landscape.

Seeing Standards: Visualization of the Metadata Universe


About Seeing Standards

Becker, Devin and Jenn L. Riley. (2010). Seeing Standards: A Visualization of the Metadata Universe. Click on the chart to access a PDF version and a Glossary of Metadata Standards.

About Vocabularies

  • About Taxonomies & Controlled Vocabularies American Society for Indexing, Taxonomies & Controlled Vocabularies Special Interest Group. This page describes the differences between controlled vocabularies, taxonomies, thesauri, and ontologies.

Ontologies & Frameworks

International Image Interoperability Framework (IIIF)

IIIF is a framework for image delivery developed by a community of leading research libraries and image repositories. The goals are to provide an unprecedented level of uniform and rich access to image-based resources hosted around the world; to define a set of common application programming interfaces supporting interoperability between image repositories; and to develop, cultivate, and document shared technologies, such as image servers and web clients, for viewing, comparing, manipulating, and annotating images.

The two core APIs for the Framework are:

  • IIIF Image API 3.0

IIIF Consortium. (2021). Appleby, Michael, Crane, Tom, Sanderson, Robert, Stroop, Jon, and Warner, Simeon. This document describes an image delivery API specification for a web service that returns an image in response to a standard HTTP or HTTPS request. The URI can specify the region, size, rotation, quality characteristics and format of the requested image as well as be enabled to request basic technical information about the image to support client applications.

  • IIIF Presentation API 3.0

IIIF Consortium. (2021). Appleby, Michael, Crane, Tom, Sanderson, Robert, Stroop, Jon, and Warner, Simeon. The IIIF Presentation API provides information necessary to human users to allow a rich, online viewing environment for compound digital objects. It enables the display of digitized images, video, audio, and other content types associated with a particular physical or born-digital object, allows navigation between multiple views or time extents of the object, either sequentially or hierarchically, displays descriptive information about the object, view or navigation structure, and provides a shared environment in which publishers and users can annotate the object and its content with additional information.

  • Presentation Cookbook of IIIF Recipes

The Cookbook shows how to use the resource types and properties of the Presentation specification so that viewers and other software clients can render them. Examples are provided to encourage publishers to adopt common patterns in modeling classes of complex objects, to enable client software developers to support these patterns for consistency of user experience, and to demonstrate the applicability of IIIF to a broad range of use cases.
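The Image API description above says that the request URI carries the region, size, rotation, quality, and format of the image. A minimal sketch of how such a URI is assembled is shown here, following the Image API 3.0 path pattern identifier/region/size/rotation/quality.format. The base URL and identifier are hypothetical placeholders, and the helper function names are invented for this illustration.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an Image API request URL from its path segments.

    The defaults request the whole image at the largest available size,
    unrotated, in the server's default quality, as a JPEG.
    """
    ident = quote(identifier, safe="")  # percent-encode reserved characters
    return f"{base.rstrip('/')}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"


def iiif_info_url(base, identifier):
    """Build the URL of the info.json document with technical image details."""
    return f"{base.rstrip('/')}/{quote(identifier, safe='')}/info.json"


# Hypothetical server and identifier, used only for illustration.
BASE = "https://example.org/iiif"

full = iiif_image_url(BASE, "abc123")
# A 500x500 pixel crop from the top-left corner, scaled to 250 pixels wide,
# rotated 90 degrees, in gray, as PNG.
detail = iiif_image_url(BASE, "abc123", region="0,0,500,500",
                        size="250,", rotation="90", quality="gray", fmt="png")
info = iiif_info_url(BASE, "abc123")

print(full)
print(detail)
print(info)
```

Which regions, sizes, qualities, and formats a given server accepts is reported in its info.json, so a client would normally read that document before requesting anything beyond the defaults.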

Additional APIs for the Framework are:

  • IIIF Authentication API 1.0

IIIF Consortium. (2021). Appleby, Michael, Crane, Tom, Sanderson, Robert, Stroop, Jon, and Warner, Simeon. The Authentication specification describes a set of workflows for guiding the user through an existing access control system. It provides a link to a user interface for logging in, and services that provide credentials, modeled after elements of the OAuth2 workflow, acting as a bridge to the access control system in use on the server, without the client requiring knowledge of that system.

  • IIIF Content Search API 1.0

IIIF Consortium. (2021). Appleby, Michael, Crane, Tom, Sanderson, Robert, Stroop, Jon, and Warner, Simeon. The Content Search specification lays out the interoperability mechanism for performing searches among varied content types from different sources. The scope of the specification is searching annotation content within a single IIIF resource, such as a Manifest, Range or Collection.

Linked Art is a data model which provides an application profile used to describe cultural heritage resources, with a focus on artworks and museum-oriented activities. Based on real world data and use cases, it defines common patterns and terms used in its conceptual model, ontologies, and vocabulary. Linked Art follows existing standards and best practices including CIDOC-CRM, Getty Vocabularies, and JSON-LD 1.1 as the core serialization format.

Ontologies are formalized vocabularies of terms, often covering a specific domain. They specify the definitions of terms by describing their relationships with other terms in the ontology. OWL 2 is the Web Ontology Language designed to facilitate ontology development and sharing via the Web. It provides classes, properties, individuals, and data values that are stored as Semantic Web documents. As an RDF vocabulary, OWL can be used in combination with RDF schema.

VOWL : Visual Notation for OWL Ontologies

Negru, Stefan, Lohmann, Steffen, and Haag, Florian. (2014, April 7). Specification of Version 2.0. VOWL defines a visual language for user-oriented representation of ontologies. The language provides graphical depictions for elements of OWL that are combined to a force-directed graph layout visualizing the ontology. It focuses on the visualization of the classes, properties and datatypes, sometimes called TBox, while it also includes recommendations on how to depict individuals and data values, the ABox. Familiarity with OWL and other Semantic Web technologies is required to understand this specification.

  • OWL 2 Web Ontology Language Document Overview (Second Edition) This is the W3C introduction to OWL 2 and the various other OWL 2 documents. The document describes the syntaxes for OWL 2, the different kinds of semantics, the available sub-languages, and the relationship between OWL 1 and OWL 2. Read this document before reading other OWL 2 documents.
  • OWL 2 Web Ontology Language Structural Specification and Functional-Style Syntax (Second Edition) This document defines the OWL 2 language. The core part, the structural specification, describes the conceptual structure of OWL 2 ontologies and provides a normative abstract representation for all OWL 2 syntaxes. The document also defines the functional-style syntax, which follows the structural specification and allows OWL 2 ontologies to be written in a compact form. This syntax is used in the definitions of the semantics of OWL 2 ontologies, the mappings from and into the RDF/XML exchange syntax, and the different OWL 2 profiles.
  • OWL 2 Web Ontology Language Mapping to RDF Graphs (Second Edition) This document defines two mappings between the structural specification of OWL 2 and RDF graphs. The mappings can be used to transform an OWL 2 ontology into an RDF graph and an RDF graph into an OWL 2 ontology.
  • Time Ontology in OWL Cox, Simon, Little, Chris, Hobbs, Jerry R., and Pan, Feng. (2017, October 19). W3C. Time Ontology in OWL (OWL-Time) can be used to describe temporal relationships. It focuses particularly on temporal ordering relationships. Elements of a date and time are put into separately addressable resources. OWL-Time supports temporal coordinates (scaled position on a continuous temporal axis) and ordinal times (named positions or periods) and does not necessarily expect the use of the Gregorian calendar.
  • PRESSoo Le Boeuf, Patrick. (2016, January). PRESSoo is an ontology designed to represent bibliographic information relating to serials and continuing resources. PRESSoo is an extension of the Functional Requirements for Bibliographic Records – Object Oriented model (FRBRoo). PRESSoo has been developed by representatives of the ISSN International Centre, the ISSN Review Group, and the Bibliothèque nationale de France (BnF).

The Resource Description Framework (RDF) is a framework for representing information in the Web of Data. It comprises a suite of standards and specifications whose documentation is listed below.

  • Cool URIs for the Semantic Web Sauermann, Leo and Cyganiak, Richard. (2008, Dec. 3). W3C. Uniform Resource Identifiers (URIs) are at the core of RDF, providing the link between RDF and the Web. This document presents guidelines for their effective use. It discusses two strategies, called 303 URIs and hash URIs. It gives pointers to several Web sites that use these solutions, and briefly discusses why several other proposals have problems.
  • RDF 1.1 Concepts and Abstract Syntax This W3C document defines an abstract syntax (a data model) for linking RDF-based languages and specifications. The syntax has a data structure for expressing descriptions of resources as RDF graphs made of sets of subject-predicate-object triples, where the elements may be IRIs, blank nodes, or datatyped literals. The document introduces key concepts and terminology, and discusses datatyping and the handling of fragment identifiers in IRIs within RDF graphs.
  • RDF 1.1 Primer The Primer introduces basic RDF concepts and shows concrete examples of the use of RDF. It is designed to provide the basic knowledge required to effectively use RDF.
  • RDF Schema 1.1 The RDF Schema provides a data-modeling vocabulary for RDF data and is an extension of the basic RDF vocabulary. The IRIs for the namespaces for the RDF Schema and the RDF Syntax are defined in this document. The RDF Schema provides mechanisms for describing groups of related resources and the relationships between these resources which can be used to describe other RDF resources in application-specific RDF vocabularies.
  • RDF 1.1 Semantics This is one of the documents that comprise the full specification of RDF 1.1. It describes semantics for the Resource Description Framework 1.1, the RDF Schema, and RDFS vocabularies.
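The RDF documents above describe a graph as a set of subject-predicate-object triples. The idea can be sketched in plain Python without any RDF library: the graph below is simply a set of tuples, and `match` is a hypothetical helper written for this illustration. The example.org IRIs are invented, and writing literals with surrounding quotation marks is a convention of this sketch only, not part of the RDF data model.

```python
from typing import Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Property IRIs from the Dublin Core Terms and FOAF vocabularies.
DCT_TITLE = "http://purl.org/dc/terms/title"
DCT_CREATOR = "http://purl.org/dc/terms/creator"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"

# An RDF graph modeled as a set of (subject, predicate, object) tuples.
graph: Set[Triple] = {
    ("http://example.org/book/1", DCT_TITLE, '"An Example Book"'),
    ("http://example.org/book/1", DCT_CREATOR, "http://example.org/person/1"),
    ("http://example.org/person/1", FOAF_NAME, '"A. Author"'),
}


def match(g: Set[Triple], s: Optional[str] = None, p: Optional[str] = None,
          o: Optional[str] = None):
    """Return the triples that agree with every position that is not None."""
    return sorted(
        t for t in g
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Everything stated about the book, then every title in the graph.
about_book = match(graph, s="http://example.org/book/1")
titles = [t[2] for t in match(graph, p=DCT_TITLE)]

print(about_book)
print(titles)
```

Leaving a position as None plays roughly the role that a variable plays in a SPARQL triple pattern, which is why triple stores and SPARQL fit together so naturally.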

RDF 1.1 Serializations

There are a number of serialization formats for RDF. The first format was RDF/XML. Subsequent serialization formats have been developed and may be more suited to particular environments.

  • JSON-LD 1.0 Sporny, Manu, Longley, Dave, Kellogg, Gregg, Lanthaler, Markus, and Lindström, Niklas. (2014, Jan. 16). A JSON-based Serialization for Linked Data. Recommendation. W3C. This specification defines JSON-LD, a JSON-based format to serialize Linked Data. JSON-LD with RDF tools can be used as an RDF syntax.
  • RDF 1.1 Turtle Terse RDF Triple Language. Beckett, David, Berners-Lee, Tim, Prud'hommeaux, Eric, and Carothers, Gavin. (2014, Feb. 25). Recommendation. W3C. This document defines Turtle, the Terse RDF Triple Language, a concrete syntax for RDF that allows an RDF graph to be written in a compact, natural text form with abbreviations for common usage patterns and datatypes. Turtle provides levels of compatibility with the N-Triples format and SPARQL.
  • RDF 1.1 XML Syntax Gandon, Fabien and Schreiber, Guus, eds. (2014, Feb. 25). Recommendation. W3C. This W3C document defines the XML syntax for RDF graphs.
  • RDFa Core 1.1 Adida, Ben, Birbeck, Mark, McCarron, Shane, and Herman, Ivan. (2015, Mar. 17). Syntax and processing rules for embedding RDF through attributes. 3rd ed. Recommendation. W3C. RDFa Core is a specification for attributes to express structured data in any markup language. The rules for interpreting the data are generic, so that there is no need for different rules for different formats. The embedded data already available in the markup language (e.g., HTML) can often be reused by the RDFa markup.
  • RDFa 1.1 Primer Herman, Ivan, Adida, Ben, Sporny, Manu, and Birbeck, Mark. (2015, Mar. 17). Rich Structured Data Markup for Web Documents. W3C. RDFa (Resource Description Framework in Attributes) is a technique to add structured data to HTML pages directly. This Primer shows how to express data using RDFa in HTML, and in particular how to mark up existing human-readable Web page content to express machine-readable data.
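To make the difference between serializations concrete, the following sketch writes one statement in two of the forms listed above: as a line of N-Triples (the simple line-based format that Turtle is compatible with) and as a small JSON-LD document that uses a full IRI as the property key. It relies only on the Python standard library; `ntriples_line` is a hypothetical helper, the example.org IRI is invented, and the escaping covers only backslashes, quotation marks, and newlines rather than the full grammar.

```python
import json

SUBJECT = "http://example.org/book/1"
DCT_TITLE = "http://purl.org/dc/terms/title"
TITLE = "An Example Book"


def ntriples_line(s: str, p: str, o: str, literal: bool = False) -> str:
    """Serialize one triple as an N-Triples line.

    IRIs are wrapped in angle brackets; a plain literal is wrapped in
    double quotes after minimal escaping.
    """
    if literal:
        escaped = o.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        obj = f'"{escaped}"'
    else:
        obj = f"<{o}>"
    return f"<{s}> <{p}> {obj} ."


# The same statement as a JSON-LD node object; '@id' names the subject.
jsonld_doc = {"@id": SUBJECT, DCT_TITLE: TITLE}

nt = ntriples_line(SUBJECT, DCT_TITLE, TITLE, literal=True)
jsonld_text = json.dumps(jsonld_doc, indent=2)

print(nt)
print(jsonld_text)
```

Both outputs describe the same single triple; in practice an RDF library would handle parsing and serialization, including contexts, prefixes, datatypes, and language tags, which this sketch leaves out.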

SKOS (Simple Knowledge Organization System)

SKOS is a W3C data model defined as an OWL Full ontology for use with knowledge organization systems including thesauri, classification schemes, subject heading systems, and taxonomies. Many Semantic Web vocabularies incorporate the SKOS model. The Library of Congress Subject Headings and the Getty Vocabularies are examples of vocabularies published as SKOS vocabularies.

  • SKOS Simple Knowledge Organization System eXtension for Labels (SKOS-XL) Namespace Document - HTML Variant SKOS-XL defines an extension for SKOS which provides additional support for describing and linking lexical entities. This document provides a brief description of the SKOS-XL vocabulary.
  • SKOS Simple Knowledge Organization System Namespace Document - HTML Variant This W3C document provides an HTML non-normative table of the SKOS vocabulary.
  • SKOS Simple Knowledge Organization System Primer SKOS provides a model for expressing the basic structure and content of concept schemes such as thesauri, classification schemes, subject heading lists, taxonomies, folksonomies, and other similar types of controlled vocabulary. This document serves as a user guide for those who would like to represent their concept scheme using SKOS.
  • SKOS Simple Knowledge Organization System Reference This document defines the SKOS namespace and vocabulary. SKOS is a data-sharing standard which aims to provide a bridge between different communities of practice within the library and information sciences involved in the design and application of knowledge organization systems widely recognized and applied in both modern and traditional information systems.
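The hierarchical structure that SKOS gives a thesaurus or classification scheme can be sketched with two dictionaries: one for preferred labels (skos:prefLabel) and one for direct broader links (skos:broader). The concept identifiers and labels below are invented for illustration, and `ancestors` is a hypothetical helper that assumes each concept has at most one broader concept, which real schemes do not guarantee.

```python
# Namespace of the SKOS vocabulary, shown for reference.
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Invented concepts: identifier -> preferred label (skos:prefLabel).
pref_label = {
    "ex:visual-works": "visual works",
    "ex:painting": "paintings",
    "ex:oil-painting": "oil paintings",
}

# Direct hierarchy links: narrower concept -> broader concept (skos:broader).
broader = {
    "ex:oil-painting": "ex:painting",
    "ex:painting": "ex:visual-works",
}


def ancestors(concept: str) -> list:
    """Follow skos:broader links upward and return every concept reached.

    skos:broader itself only asserts a direct link; walking the chain like
    this corresponds to what skos:broaderTransitive expresses.
    """
    chain = []
    seen = set()  # guards against accidental cycles in the data
    while concept in broader and concept not in seen:
        seen.add(concept)
        concept = broader[concept]
        chain.append(concept)
    return chain


path = ancestors("ex:oil-painting")
print(" > ".join(pref_label[c] for c in reversed(path + ["ex:oil-painting"])))
```

A display such as the printed breadcrumb is a common use of this hierarchy in browsing interfaces for subject headings and thesauri.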

Ontology Development

  • Ontology Development 101: A Guide to Creating Your First Ontology Noy, Natalya F. and McGuinness, Deborah L. This guide discusses the reasons for developing an ontology and the methodology for creating an ontology based on declarative knowledge representation systems.

Linked Open Vocabularies (LOV)

Click on the LOV image to access vocabularies chosen based on quality requirements and publication best practices.

Click on a vocabulary. Look for a circled ellipsis in the upper right corner, click on it, and have fun playing with tools for vocabularies. Explore a bit more to find useful information about the vocabulary you have chosen.

  • EMBL-EBI Ontology Lookup Service EMBL-EBI. (2022). Administered by the European Bioinformatics Institute, the Ontology Lookup Service (OLS) serves as a repository for the latest versions of biomedical ontologies. The site also provides access to several services including OxO, a cross-ontology term mapping tool, Zooma, which assists in mapping data to ontologies in OLS, and Webulous, a tool for building ontologies from spreadsheets. OLS includes over 270 structured vocabularies and ontologies.
  • Library of Congress Linked Data Service: Authorities and Vocabularies This page provides access to commonly used ontologies, controlled vocabularies, and other lists for bibliographic description including Genre/Form headings, Subject Headings for Children, Thesaurus of Graphic Materials, Preservation Events, Cryptographic Hash Functions, schemes, and codelists, etc. A search function is provided. Clicking on Technical Center in the menu on the left will provide information on how to download datasets, searching and querying, and serialization formats.
  • Library of Congress Standard Identifiers Scheme The Standard Identifiers Scheme from the Library of Congress lists standard number or code systems and assigns a URI to each database or publication that defines or contains the identifiers in order to enable these standard numbers or codes in resource descriptions to be indicated by a URI. This is an extensive list which includes for example: Digital Object Identifier, EIDR: Entertainment Identifier Registry; International Article Number; International Standard Book Number; Library of Congress Control Number; Linking International Standard Serial Number; Locally defined identifier; Publisher-assigned music number; Open Researcher and Contributor Identifier; Standard Technical Report Number; U.S. National Gazetteer Feature Name Identifier; Universal Product Code; Virtual International Authority File number; and more.
  • Linked Open Vocabularies (LOV) Use this site to find a list of vetted linked open vocabularies (RDFS or OWL ontologies) used in the Linked Open Data Cloud, which conform to quality requirements including URI stability and availability on the Web, use of standard formats and publication best practices, quality metadata and documentation, an identifiable and trusted publication body, and proper versioning policy. Vocabularies are individually described by metadata and classified by the following vocabulary spaces: General and Meta; Library; City; Market; Space-Time; Media; Science; and Web. They are interlinked using the dedicated vocabulary VOAF. Search the LOV dataset at the vocabulary or element level. LOV Stats provide metric information regarding the vocabularies such as the number of vocabulary element occurrences in the LOD, the number of vocabularies in LOV that refer to a particular element, and more.
  • Open Metadata Registry The Registry provides a means to identify, declare, and publish through registration metadata schemas (element/property sets), schemes (controlled vocabularies) and Application Profiles (APs). It supports the machine mapping of relationships among terms and concepts in those schemes (semantic mappings) and schemas (crosswalks). The Registry supports metadata discovery, reuse, standardization, and interoperability locally and globally.
  • RDA Registry The RDA Registry defines vocabularies that represent the Resource Description and Access (RDA) element set, relationship designators, and controlled terminologies as RDA element sets and RDA value vocabularies in Resource Description Framework (RDF). The published vocabularies are currently available in several sets which reflect the underlying FRBR conceptual model.
  • TaxoBank Terminology Registry TaxoBank contains information about controlled vocabularies of all types and complexities. The information collected about each vocabulary follows a study conducted by the Joint Information Systems Committee (JISC) of the Higher and Further Education Funding Councils. The site offers additional resources including information on Thesauri and Vocabulary Control - Principles and Practice, and a Glossary of Terms Relating to Thesauri and Other Forms of Structured Vocabulary.
  • Virtual International Authority File (VIAF) VIAF is a utility that matches and links authority files of national libraries. Data are derived from the personal name authority and related bibliographic data of the participating libraries. VIAF is implemented and hosted by OCLC.

The Getty Vocabularies

  • Art & Architecture Thesaurus® Online The Getty Research Institute. The scope of this vocabulary includes terminology needed to catalog and retrieve information about the visual arts and architecture.
  • Cultural Objects Name Authority® Online and Iconography Authority (IA) The Getty Research Institute. The Cultural Objects Name Authority ® (CONA) compiles titles, attributions, depicted subjects, and other metadata about works of art, architecture, and other cultural heritage, both extant and historical, physical and conceptual and can be used to record works depicted in visual surrogates or other works. Metadata may be gathered and linked from photo archive collections, visual resource collections, special collections, archives, libraries, museums, scholarly research, and other sources. The Getty Iconography Authority (IA) includes proper names and other information for named events, themes and narratives from religion/mythology, legendary and fictional characters, themes from literature, works of literature and performing arts, and legendary and fictional places.
  • The Getty Vocabularies as Linked Open Data The Getty Art & Architecture Thesaurus (AAT) ®, Thesaurus of Geographic Names (TGN) ®, and the Union List of Artist Names (ULAN) ® are available as Linked Open Data. This link provides access to the vocabularies and information regarding how to use them. Examples of URIs for each vocabulary are provided.
  • Getty Vocabularies: Linked Open Data Semantic Representation Vladimir Alexiev, Joan Cobb, Gregg Garcia, Patricia Harpring. (2017, June 13). This document explains the representation of the Getty Vocabularies in semantic format, using RDF and appropriate ontologies. It covers the Art and Architecture Thesaurus (AAT)®, the Thesaurus of Geographic Names (TGN)® and the Union List of Artist Names (ULAN)®.
  • Getty Vocabularies OpenRefine Reconciliation The Getty Research Institute. This page offers information and a tutorial on how to reconcile data sets to the Getty Vocabularies using the browser add-on OpenRefine. Use data reconciliation to compare local data to values in the Getty Vocabularies in order to map to them.
  • Thesaurus of Geographic Names® Online The Getty Research Institute. The scope of this vocabulary spans a wide spectrum of geographic vocabulary in cataloging and scholarship of art and architectural history and archaeology.
  • Training Materials The Getty Research Institute. This page provides training materials for the Art & Architecture Thesaurus (AAT)®, the Getty Thesaurus of Geographic Names (TGN)®, the Union List of Artist Names (ULAN)®, the Cultural Objects Name Authority (CONA)®, the Getty Iconography Authority (IA)™, Categories for the Description of Works of Art (CDWA), Cataloging Cultural Objects (CCO), and standards in general. It also provides access to conference presentations.
  • Union List of Artist Names® Online The Getty Research Institute. The ULAN is a structured vocabulary containing names and other information about artists, patrons, firms, museums, and others related to the production and collection of art and architecture. Names in ULAN may include given names, pseudonyms, variant spellings, names in multiple languages, and names that have changed over time (e.g., married names).

A schema uses a formal language to describe a database system and refers to how the organization of data in a database is constructed. Several schemas addressing varied domain areas are listed in this box. Scroll down to the Dublin Core box to access information regarding the Dublin Core schema and tools.

  • BIBFRAME (Bibliographic Framework) Initiative This is the homepage for BIBFRAME the Library of Congress' Bibliographic Framework Initiative. BIBFRAME is a replacement for MARC and serves as a general model for expressing and connecting bibliographic data to the Web of Data. Access links to general information, the vocabulary, BIBFRAME implementation register, tools, draft specifications for Profiles, Authorities, and Relationships, a BIBFRAME testbed, webcasts and presentations, and more.
  • BIBFRAME Model & Vocabulary 2.0 This page provides access to three available vocabulary views of the BIBFRAME Vocabulary. The vocabulary is comprised of RDF properties, classes, and relationships between and among them.
  • BIBFRAME Pilot (Phase One—Sept. 8, 2015 – March 31, 2016): Report and Assessment Acquisitions & Bibliographic Access Directorate, Library of Congress. (2016, June 16). This document describes Phase One of the Library of Congress' pilot to test the efficacy of BIBFRAME. It includes descriptions of the planning process, what was being tested, the results, and lessons learned that will assist the Library of Congress as it moves to Phase Two of assessing the BIBFRAME model and vocabulary.
  • Bibliographic Framework as a Web of Data: Linked Data Model and Supporting Services Miller, Eric, Ogbuji, Uche, Mueller, Victoria, and MacDougall, Kathy. (2012, Nov. 21). Library of Congress. This document provides an introduction and overview of the Library of Congress Bibliographic Framework Initiative.
  • bibliotek-o: a BIBFRAME Ontology Extension Bibliotek-o is an ontology extension which defines additions and modifications to BIBFRAME, intended as a supplement to the core BIBFRAME ontology. It provides a set of recommended fragments from external ontologies and an application profile based on its recommended models and patterns. Bibliotek-o ontology extension is a joint product of the Mellon Foundation-funded Linked Data for Libraries Labs and Linked Data for Production projects.
  • bib.schema.org This is a bibliographic extension for schema.org. The page lists the types, properties, and enumeration values for use in describing bibliographic material using schema.org.
  • DataCite Metadata Schema DataCite. (2019, August 16). The DataCite Metadata Schema provides a list of core metadata properties chosen for accurate and consistent identification of resources for citation and retrieval purposes. Recommended use instructions are provided.
  • Direct Mapping of Relational Data to RDF Arenas, Marcelo, Bertails, Alexandre, Prud'hommeaux, Eric, Sequeda, Juan (editors). (2012 Sept.27). This document defines a direct mapping from relational data to RDF with provisions for extension points for refinements within and outside of the document.
  • FAST Linked Data FAST (Faceted Application of Subject Terminology) is an enumerative, faceted subject heading schema derived from the Library of Congress Subject Headings (LCSH). The purpose of adapting the LCSH with a simplified syntax to create FAST is to retain the vocabulary of LCSH while making the schema easier to understand, control, apply, and use. The schema maintains upward compatibility with LCSH, and any valid set of LC subject headings can be converted to FAST headings. The site provides access to searchFAST, a user interface that simplifies the process of heading selection, and to a Web interface for FAST Subject selection available at FAST.
  • JSON Schema Version 7 (Draft). (2019 March 31). This is a vocabulary that provides for the annotation and validation of JSON documents. It can be used to describe data formats, provide human and machine-readable documentation, make any JSON format a hypermedia format, allow the use of URI templates with instance data, describe client data for use with links using JSON Schema, and recognize collection and collection items.
  • Metadata Authority Description Schema (MADS) MADS is an XML schema for an authority element set used to provide metadata about agents (people, organizations), events, and terms (topics, geographics, genres, etc.). It serves to provide metadata about the authoritative entities used in MODS descriptions.
  • Metadata Object Description Schema (MODS) MODS is a bibliographic element set that may be used for a variety of purposes, and particularly for library applications. MODS is an XML schema intended to be able to carry selected data from existing MARC 21 records as well as to enable the creation of original resource description records. It includes a subset of MARC fields and uses language-based tags rather than numeric ones, in some cases regrouping elements from the MARC 21 bibliographic format. It is maintained by the Library of Congress.
  • Metadata Object Description Schema (MODS) - Conversions Access MODS mapping including MARC to MODS, MODS to MARC, RDA to MODS, Dublin Core (simple) to MODS, and MODS to Dublin Core (simple). Style sheets are also available on this page.
  • Music Encoding Initiative (MEI) MEI is an XML DTD for the representation and exchange of comprehensive music information. MEI is a schema that provides ways to encode data from all the separate domains: logical; visual; gestural (performance); and analytical, commonly associated with music. It accommodates bibliographic description required for archival uses. It also addresses relationships between elements, cooperative creation and editing of music markup, navigation within the music structure as well as to external multimedia entities, the inclusion of custom symbols, etc. MEI can record the scholarly textual apparatus frequently found in modern editions of music.
  • R2RML: RDB to RDF Mapping Language Das, Souripriya, Sundara, Seema, Cyganiak, Richard (editors). (2012, Sept. 27). This document describes R2RML, a language for expressing customized mappings from relational databases to RDF datasets. The mappings provide the ability to view existing relational data in the RDF data model, expressed in a structure and target vocabulary of the mapping author's choice.
  • R2RML: RDB to RDF Mapping Language Schema This document defines the R2RML: RDB to RDF Mapping Language schema which is used to specify a mapping of relational data to RDF.
  • Schema.org Schema.org is a vocabulary that can be used with many different encodings, including RDFa, Microdata and JSON-LD to mark up web pages and e-mail messages. Sponsored by Google, Microsoft, Yahoo and Yandex, the vocabularies are developed by an open community process which includes an extension mechanism to enhance the core vocabulary for specific knowledge domains. Its primary function is to provide web page publishers a means by which they can enhance HTML pages so they can be crawled by semantic search engines linking the pages to the web of data.
  • Text Encoding Initiative (TEI) The Text Encoding Initiative (TEI) is a global consortium which develops and maintains a set of Guidelines which specify encoding methods for machine-readable texts. TEI Guidelines have been widely used by libraries, museums, publishers, and individual scholars to present texts chiefly in the humanities, social sciences and linguistics. The site provides information on resources, projects using TEI, a bibliography of TEI-related publications, and TEI related software including Roma, a web-based application to generate P5-compatible schemas and documentation, and OxGarage, a tool for converting to and from TEI. In addition, the site links to a page of tools for use with TEI resources.
  • Thema Thema is a multilingual subject category schema designed for the commercial global book trade industry to meet the needs of publishers, retailers, trade intermediaries, and libraries. Thema aims to reduce the duplication of effort required by the many distinct national subject schema, and to eliminate the need for scheme-to-scheme mapping that inevitably degrades the accuracy of classification, by providing a single scheme for broad international use. It can be used alongside existing national schema.
  • XML 1.0 Bray, Tim, Jean Paoli, Jean, C. M. Sperberg-McQueen, Maler, Eve, Yergeau, François, eds. (2013, Feb. 7). XML 1.0 is a version of the Extensible Markup Language used to store and transport data on the Web. It is both human and machine readable.
  • XML Path Language (XPath) 2.0 Berglund, Anders, Boag, Scott, Chamberlin, Don, Fernández, Mary F., Kay, Michael, Robie, Jonathan, Siméon, Jérôme, (eds.) (2015, Sept. 7). 2nd edition. XPath is an expression language that uses a path notation for navigating through the hierarchical structure of XML documents. XPath 2.0 is a superset of XPath 1.0. It supports a richer set of data types and takes advantage of the type information that becomes available when documents are validated using XML Schema.
  • XQuery 1.0: An XML Query Language Boag, Scott, Chamberlin, Don, Fernández, Mary F., Florescu, Daniela, Robie, Jonathan, Siméon, Jérôme, (eds.). (2015, Sept. 7). 2nd edition. This is a version of XQuery, a query language that uses the structure of XML to express queries across all kinds of data, whether physically stored in XML or viewed as XML via middleware.
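
As a rough illustration of the path notation described in the XPath entry above, the following Python sketch queries a small XML document with the standard library's xml.etree.ElementTree module. Note the assumptions: ElementTree supports only a limited subset of XPath-style expressions (not XPath 2.0), and the document, its element names, and the helper function are invented for this example.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical XML document, invented for illustration only.
XML_DOC = """
<catalog>
  <book id="b1" lang="en"><title>Linked Data Basics</title></book>
  <book id="b2" lang="de"><title>Metadaten</title></book>
  <book id="b3" lang="en"><title>XML in Practice</title></book>
</catalog>
"""


def titles_in_language(xml_text: str, lang: str) -> List[str]:
    """Return the titles of <book> elements whose lang attribute matches."""
    root = ET.fromstring(xml_text)
    # Path expression: any descendant <book> with a matching lang attribute,
    # then its <title> child. This stays within ElementTree's XPath subset.
    path = f".//book[@lang='{lang}']/title"
    return [element.text for element in root.findall(path)]


if __name__ == "__main__":
    print(titles_in_language(XML_DOC, "en"))
```

A full XPath 2.0 processor would add typed values and a richer function library; the sketch only shows how a path selects nodes by position in the hierarchy and by attribute value.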

Dublin Core

  • Dublin Core Metadata Initiative This site provides specifications of all Dublin Core vocabulary metadata terms maintained by the Dublin Core Metadata Initiative, including properties, vocabulary encoding schemes, syntax encoding schemes, and classes.
  • DCMI Application Profile Vocabulary Coyle, Karen, editor (2021, April 9). This vocabulary supports the specification of Dublin Core Tabular Application Profiles (DC TAP). It is used to create a table or spreadsheet that defines the elements of an application profile. The vocabulary is also provided as a comma-separated value (CSV) template for use in a tabular form.
  • DC Tabular Application Profiles (DC TAP) - Primer Coyle, Karen, editor. (2021, April 3). This primer describes DC TAP, a vocabulary and a format for creating table-based application profiles.
  • dctap DCMI. (2021). dctap is a module and command-line utility for reading and interpreting CSV files formatted according to the DC Tabular Application Profiles (DCTAP) model. This document explains the project, installation, sub-commands, model, configuration, and provides a glossary.
  • dctap-python DCMI. dctap requires Python 3.7 or higher. This GitHub page provides information and documentation on installing dctap-python.
  • dctap/TAPtemplate.csv Coyle, Karen. (2020, December 2). Access the TAP csv template from this GitHub page.
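
To make the idea of a table-based application profile more concrete, here is a minimal sketch that reads a small DC TAP style table using only Python's built-in csv module. It does not use the dctap utility listed above. The column names follow the DC TAP vocabulary, but the profile rows, the prefixed property names, and the read_profile helper are assumptions made for this example.

```python
import csv
import io
from collections import defaultdict

# Hypothetical profile, invented for illustration. Column headers follow
# the DC TAP vocabulary; the rows themselves are made up.
TAP_CSV = """shapeID,propertyID,propertyLabel,mandatory,repeatable,valueNodeType,valueDataType
book,dct:title,Title,true,false,literal,xsd:string
book,dct:creator,Author,false,true,IRI,
,dct:issued,Date issued,false,false,literal,xsd:date
"""


def to_bool(cell):
    """Interpret a table cell as a boolean flag (assumes 'true' or '1' mean yes)."""
    return cell.strip().lower() in {"true", "1"}


def read_profile(csv_text):
    """Group the rows (statement templates) of a TAP table by shape."""
    shapes = defaultdict(list)
    current_shape = "default"
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumption of this sketch: a blank shapeID means the row belongs
        # to the most recently named shape.
        current_shape = row["shapeID"].strip() or current_shape
        shapes[current_shape].append(
            {
                "propertyID": row["propertyID"],
                "label": row["propertyLabel"],
                "mandatory": to_bool(row["mandatory"]),
                "repeatable": to_bool(row["repeatable"]),
                "valueNodeType": row["valueNodeType"],
                "valueDataType": row["valueDataType"],
            }
        )
    return dict(shapes)


if __name__ == "__main__":
    for shape, templates in read_profile(TAP_CSV).items():
        for template in templates:
            print(shape, template["propertyID"], template["mandatory"])
```

For real profiles, the dctap command-line utility is the better choice, since it implements the full model and its configuration options; the sketch only shows that a profile is an ordinary table that any CSV reader can open.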

Legal Schemas

  • Akoma Ntoso Akoma Ntoso is an initiative to develop a number of connected XML standards, languages and guidelines for parliamentary, legislative and judiciary documents, and specifically to define a common document format, a model for document interchange, data schema, metadata schema and ontology, and schema for citation and cross referencing.
  • Legislative Documents in XML at the United States House of Representatives U.S. House of Representatives. This page provides Document Type Definitions (DTD) for use in the creation of legislative documents using XML, links to DTDs, and background information regarding legislative XML. Access element descriptions and content models for bills, resolutions, amendments, and roll call votes. This initiative was conducted under the direction of the Senate Committee on Rules and Administration and the House Committee on Administration, and with the involvement of the Secretary of the Senate and the Clerk of the House, the Congressional Research Service, the Library of Congress, and the Government Publishing Office.
  • Electronic Court Filing Version 4.01 Plus Errata 01 OASIS. Angione, Adam and Cabral, James, editors. (2014, July 14). This specification defines a technical architecture and a set of components, operations and message structures for an electronic court filing system, and sets forth rules governing its implementation. It was developed by the OASIS LegalXML Electronic Court Filing Technical Committee.

RELATED RESOURCES

  • Akoma Ntoso an open document standard for Parliaments Palmirani, Monica, and Vitali, Fabio. (2014). World e-Parliament Conference. This set of slides describes an open XML standard for legal documents used in Parliamentary processes and judgments.
  • BIBFLOW: A Roadmap for Library Linked Data Transition Smith, MacKenzie, Stahmer, Carl G., Li, Xiaoli, and Gonzalez, Gloria. (2017, March 14). University of California, Davis and Zepheira, Inc. This is the report of the BIBFLOW project, which provides a roadmap for libraries to use to transition into a Linked Data environment. Recommendations for a phased transition are provided, as well as an analysis of transition tools, workflow transitions, estimated training, and work effort requirements.
  • Library of Congress BIBFRAME Manual Library of Congress. (Revised 2020, May). This is the training manual for the BIBFRAME Editor and BIBFRAME Database.
  • Artists’ Books Thesaurus This controlled vocabulary is for artists’ books. The Thesaurus is administered by the Arts Libraries Society of North America (ARLIS/NA). The platform, currently in draft form, will offer an illustrated, user-friendly guide to exploring and finding vocabulary terms.
  • DCAT (Data Catalog Vocabulary) DCAT is an RDF vocabulary designed to facilitate interoperability between data catalogs published on the Web. This document defines the schema and provides examples for its use.
  • Data Documentation Initiative (DDI) DDI Alliance. (2021). DDI is a free international standard for describing data produced by surveys and other observational methods in the social, behavioral, economic, and health sciences. It can be used to document and manage different stages in the research data lifecycle, such as conceptualization, collection, processing, distribution, discovery, and archiving.
  • DOAP (Description of a Project) DOAP is an XML/RDF vocabulary used to describe software projects, and in particular open source projects. This site hosts the DOAP wiki, and provides links to DOAP validators, generators, viewers, aggregators, and web sites using DOAP.
  • Expression of Core FRBR Concepts in RDF This vocabulary is an expression in RDF of the concepts and relations described in the IFLA report on the Functional Requirements for Bibliographic Records (FRBR). It includes RDF classes for the group 1, 2, and 3 entities described by the FRBR report and properties corresponding to the core relationships between those entities. Where possible, appropriate relationships with other vocabularies are included in order to place this vocabulary in the context of existing RDF work.
  • FOAF (Friend of a Friend) Vocabulary Specification This specification describes the FOAF language used for linking people and information. FOAF integrates three kinds of network: social networks of human collaboration, friendship and association; representational networks that describe a simplified view of a cartoon universe in factual terms, and information networks that use Web-based linking to share independently published descriptions of this inter-connected world.
  • Language of Bindings (LoB) Ligatus, University of the Arts London. Based on SKOS, LoB is a thesaurus which provides terms used to describe historical binding structures. LoB can be used as a lookup resource on the website or as a software service where terms can be retrieved through an application. It can also be used for learning about book structures and materials, the frequency of the occurrence of bookbinding components, or other aspects connected with the book trade.
  • Lexvo.org Lexvo defines global IDs (URIs) for language-related objects, and ensures that these identifiers are dereferenceable and highly interconnected as well as externally linked to a variety of resources on the Web. Data sources include the Ethnologue Language Codes database, Linguist List, Wikipedia, Wiktionary, WordNet 3.0, ISO 639-3 specification, ISO 639-5 specification, ISO 15924 specification, Unicode Common Locale Data Repository (CLDR), et al. The site provides mappings between ISO 639 standards and corresponding Lexvo.org language identifiers and downloads of Lexvo datasets. Search over 7,000 language identifiers with names in many languages, links to script URIs (Latin and Cyrillic scripts, Indian Devanagari, the Korean Hangul system, etc.), geographic region URIs, etc.
  • OLAC video game genre terms (olacvggt) Online Audiovisual Catalogers Network (OLAC). (2019). Guidelines for OLAC video game genre terms (olacvggt). This vocabulary provides a list of video game genre terms, each of which has a corresponding MARC authority record. Links to the MARC authority records are provided.
  • PeriodO Rabinowitz, Adam T., Shaw, Ryan, and Golden, Patrick. PeriodO is a period gazetteer which documents definitions of historical period names. Definitions include a period name, temporal bounds on the period, an implicit or explicit association with a geographical region, and must have been formally or informally published in a citable source. Period definitions are modeled as SKOS concepts. Temporal extent is modeled using the OWL-Time ontology.
  • Rights Statements The Rights Statements vocabulary provides rights statements in three categories: In Copyright, No Copyright, and Other. Statements are meant to be used by cultural heritage institutions to communicate the copyright and re-use status of digital objects to the public. They are not intended to be used by individuals to license their own creations. RightsStatements.org is a joint initiative of Europeana and the Digital Public Library of America (DPLA).
  • Texas Digital Library Descriptive Metadata Guidelines for Electronic Theses and Dissertations, Version 2.0 Potvin, Sarah, Thompson, Santi, Rivero, Monica, Long, Kara, Lyon, Colleen, Park, Kristi. These Guidelines, produced by the Texas Digital Library ETD Metadata Working Group, comprise two documents to guide and shape local metadata practices for describing electronic theses and dissertations: the Dictionary, which lays out the standard, and the Report, which lays out detailed explanations of rationale, process, findings, and recommendations.

Vocabulary of Interlinked Datasets (VoID)

  • Describing Linked Datasets with the VoID Vocabulary Alexander, Keith, Cyganiak, Richard, Hausenblas, Michael, and Zhao, Jun. (2011, March 3). This document describes the VoID model and how to provide general metadata about a dataset or linkset (a collection of RDF triples whose subjects and objects are described in different datasets).
  • Vocabulary of Interlinked Datasets (VoID) Digital Enterprise Research Institute, NUI Galway. (2011, March 6). This document describes the formal definition of RDF classes and properties for VoID, an RDF Schema vocabulary for expressing metadata about RDF datasets. It functions as a bridge between publishers and users of RDF data, with applications including data discovery, cataloging, and archiving of datasets.
  • WorldCat Linked Data Vocabulary OCLC's WorldCat Linked Data uses a subset of terms from Schema.org as its core vocabulary. Access the list of classes, attributes, and extensions with this link.

For the Getty Vocabularies, please see the Registries, Portals, and Authorities page.

Wikibase and Wikidata

Wikibase is the platform on which Wikidata, a Wikimedia Project, is built. It allows for multi-language instances. For Wikibase Use Cases, see the Wikibase Use Case box at the bottom of the Use Cases page.

Wikimedia Movement

Wikimedia is a global movement that seeks to bring free education to the world via websites known as Wikimedia Projects. Wikimedia Projects are hosted by the Wikimedia Foundation. Some of these Projects are listed below. Access the full family of Wikimedia Projects here.

  • WikiCite Wikimedia. WikiCite. (2019, July 20). WikiCite is an initiative to develop a database of open citations and linked bibliographic data to better manage citations across Wikimedia projects and languages. Potential applications include ease of discovering publications on a given topic, profiling of authors and institutions, and visualizing knowledge sources in new ways.
  • Wikidata Wikidata is a free linked database that acts as central storage for the structured data of Wikimedia projects including Wikipedia, Wikivoyage, Wikisource, and others. It can be read and edited by both humans and machines. The content of Wikidata is available under a free license, exported using standard formats, and can be interlinked to other open data sets on the linked data web.
  • Wikipedia Wikipedia is the open source encyclopedia within the Wikimedia universe. A page in Wikipedia is an article to which Wikidata can link.
  • Wikimedia Commons Wikimedia Commons is a repository of freely usable media files to which anyone can contribute. Media files from Wikimedia Commons can be linked to Wikidata statements.
  • Wikiquote Wikiquote is a free compendium of sourced quotations from notable people and creative works in every language and translations of non-English quotes. It links to Wikipedia for further information.
  • Wiktionary Wiktionary. (2019, June 27). Wiktionary is the English-language collaborative Wikimedia Project to produce a free-content multilingual dictionary. It aims to describe all words of all languages using definitions and descriptions in English. It includes a thesaurus, a rhyme guide, phrase books, language statistics and extensive appendices.
  • Wikibase DataModel MediaWiki. (2019, May 19). This specification describes the structure of the data that is handled in Wikibase. It specifies which kind of information can be contributed to the system. The Wikibase data model provides an ontology for describing real world entities, and these descriptions are concrete models for real world entities. For a less technical explanation of the model, see the Wikibase DataModel Primer.
  • Wikibase DataModel Primer MediaWiki. (2018, August 1). This primer gives a good introduction to the Wikibase data model. It provides an outline and an explanation of the different elements of the Wikibase knowledge base and describes the function of each.
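
As an informal aid to the two Wikibase DataModel entries above, the sketch below models an item and its statements with Python dataclasses. It is a deliberate simplification and not the Wikibase JSON format or any official API: the class and field names are assumptions made for this example, values are reduced to plain strings, and the reference URL is a placeholder. The identifiers Q42 (Douglas Adams), P31 (instance of), Q5 (human), P569 (date of birth), and P854 (reference URL) are real Wikidata identifiers.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Statement:
    """One claim about an item: a property, a value, and optional context."""
    property_id: str                       # e.g. "P31" (instance of)
    value: str                             # simplified: real values are typed
    qualifiers: Dict[str, str] = field(default_factory=dict)
    references: List[Dict[str, str]] = field(default_factory=list)


@dataclass
class Item:
    """A described entity with multilingual labels and a list of statements."""
    item_id: str
    labels: Dict[str, str] = field(default_factory=dict)        # language -> label
    descriptions: Dict[str, str] = field(default_factory=dict)  # language -> text
    statements: List[Statement] = field(default_factory=list)

    def values_for(self, property_id: str) -> List[str]:
        """Return every value recorded for the given property."""
        return [s.value for s in self.statements if s.property_id == property_id]


def build_example_item() -> Item:
    """Build a heavily simplified version of the Wikidata item Q42."""
    return Item(
        item_id="Q42",
        labels={"en": "Douglas Adams"},
        descriptions={"en": "English writer and humorist"},
        statements=[
            Statement(property_id="P31", value="Q5"),
            Statement(
                property_id="P569",
                value="1952-03-11",
                # Placeholder reference, not taken from Wikidata.
                references=[{"P854": "https://example.org/source"}],
            ),
        ],
    )


if __name__ == "__main__":
    item = build_example_item()
    print(item.labels["en"], item.values_for("P31"))
```

The point of the sketch is the shape of the model: entities carry labels per language, and each statement pairs a property with a value and can be backed by references.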

Wikibase Resources

  • Install Docker Compose In order to install a usable instance of Docker Desktop, the installation package must contain Docker Compose. If an instance is missing Docker Compose, this page provides instructions for installing it.
  • Installing a stand-alone Wikibase and most used services This GitHub page provides instructions for establishing a Wikibase instance. It was developed by a member of Wikimedia Deutschland e. V. and four other software developers.
  • Use cases for institutional Wikibase instances Mendenhall, Timothy R., Chou, Charlene, Hahn, Jim, et al. (2020, May). Developed informally by library staff at Columbia University, Harvard University, New York University, and the University of Pennsylvania, this GitHub page provides a wealth of information for institutions considering installing their own Wikibase instance. It covers a wide range of topics, including local vocabularies, authority control, organizational name changes, cross-platform discovery, multilingual discovery, pipeline to Wikidata and broader web discovery, digital humanities, database, metadata, and exploratory projects, and supplies a use case example for each topic.
  • Wikibase Consultants and Support Providers Wikimedia. (2021, Jan. 14). This page lists Wikibase service providers who may help with issues with Wikibase instances.
  • Wikibase Install Basic Tutorial Miller, Matt. (2019, September). Semantic Lab at Pratt Institute. This tutorial provides instructions for setting up Wikibase using Docker. The tutorial uses Digital Ocean, and it requires setting up an account at Digitalocean.com.
  • Wikibase Roadmap 2020 High-level Overview (Public) Wikimedia. (2021, Jan. 11). This is an interactive chart that describes Wikibase development initiatives, including Wikibase Features, Wikibase System Improvements, Partnerships & Programs, Documentation, Wikibase Strategy & Governance, and Developer Tools.

Wikidata Alert

  • Wikidata:SPARQL query service/Blazegraph failure playbook Wikidata. (2021, Dec. 13). This Wikidata article describes the steps the Wikimedia Foundation proposes to take in the event of a catastrophic failure of its SPARQL Query Service, which is powered by Blazegraph. The failure would occur when the amount of queryable data in Blazegraph exceeds Blazegraph's limits.

Wikidata is a free, collaborative, multilingual software application built from Wikibase components that can be read and edited by humans and machines. It collects structured data to provide support for Wikimedia Projects including Wikipedia, Wikimedia Commons, Wikivoyage, Wiktionary, Wikisource, and others. The content of Wikidata is available under a free license, exported using standard formats, and can be interlinked to other open data sets on the linked data web.

  • Wikidata Introduction Wikidata. (2018, June 18). This page provides a quick overview of Wikidata, its function within the Wikimedia Universe, and an introduction to Wikidata basics.
  • Wikidata List of Policies and Guidelines Wikidata. (2019, January 16). This page lists the proposed and approved policies and guidelines that govern Wikidata covering a wide range of topics including Living people, Deletion policy, Sources, Editing restrictions, Statements, Sitelinks, Verifiability, Administrators, Property creators, CheckUser, and more.
  • Wikidata: Notability Wikidata.Notability. (2019, October 10). This page describes the Wikidata policy that sets forth the criteria needed for an item to be acceptable in Wikidata. It provides a link to a list of Wikimedia pages that are not considered automatically acceptable in Wikidata, and a link to a list of items that have been considered acceptable, in accordance with the general guidelines on this page.
  • Wikidata: Property constraints portal Wikidata. (2020, June 19). Help:Property constraints portal. This page provides information on property constraints including a list of types and links to pages explaining how the constraints should be applied.
  • Wikidata Sandbox Wikidata. (2020, August 24). This page provides a link to the Wikidata Sandbox in which you can experiment and practice using Wikidata. For experimenting with creating new items and properties, use the test.wikidata link on this page.
  • Wikidata Tours (2018, April 7). This page provides access to interactive tutorials showing how Wikidata works and how to edit and add data.

Articles, Development Plans, & Reports

  • ARL White Paper on Wikidata Opportunities and Recommendations Association of Research Libraries (ARL). (April, 2019). In Wikisource. This paper discusses joint efforts between ARL and Wikidata to explore a way to interlink Wikidata to sources of library data and provide libraries and other GLAM institutions the opportunity to get involved in contributing to modeling and data efforts on a larger scale. Some possible contributions include name authorities, institutional holdings, and faculty information. Suggestions for contributing to Wikidata are also explored.
  • Creating Library Linked Data with Wikibase: Lessons Learned from Project Passage OCLC Research. (2019, August). This document describes OCLC's Project Passage, a Wikibase prototype in which librarians from 16 US institutions experimented with creating linked data to describe resources without requiring knowledge of the technical machinery of linked data. The report provides an overview of the context in which the prototype was developed, how the Wikibase platform was adapted for use by librarians, and discusses use cases where participants describe creating metadata for resources in various formats and languages using the Wikibase editing interface. The potential of linked data in library cataloging workflows, the gaps that must be addressed before machine-readable semantic data can be fully adopted and lessons learned are also addressed.
  • Differences between Wikipedia, Wikimedia, MediaWiki, and wiki MediaWiki. (2019, April 19). This article provides a brief description of components and related software of the Wikimedia movement. It also provides links to additional Wikimedia movement resources.
  • Introducing Wikidata to the Linked Data Web Erxleben, Fredo, Günther, Michael, Krötzsch, Markus, Mendez, Julian, and Vrandečić, Denny. (2014). This document explains the Wikidata model and discusses its RDF encoding. It is a good place to start if you are considering editing Wikidata.
  • Lexemes in Wikidata: 2020 Status Nielsen, Finn Årup. (2020, May). Proceedings of the 7th Workshop on Linked Data in Linguistics, pages 82–86. This article discusses the use of lexemes in different languages, senses, forms, and etymology in Wikidata.
  • Wiki Wikipedia. (2019, August 22). This Wikipedia article explains the features of a wiki knowledge base website and discusses the software, history, implementations, editing, trustworthiness, and other aspects of a wiki.
  • Wikidata:Development Plan [2020] Wikidata. (2020, June 17). This page provides an interactive roadmap to the projects on which the Wikidata Development Team is working during 2020. Clicking on projects in the Wikidata matrix will provide information on projects under the categories: Increase Data Quality & Trust; Build Out the Ecosystem; Encourage More Data Use; Enable More Diverse Data and Users; and Other. The Wikibase matrix includes categories: Wikibase Features; Wikibase System Improvements; Partnerships & Programs; Documentation; Wikibase Strategy & Governance; and Developer Tools.
  • Wikidata: Development Plan [2022] Wikidata. (2022, February 10). This page provides the roadmap for the Wikidata development team (Wikimedia Deutschland) for Wikidata and Wikibase for 2022. Highlights of the plan include empowering the community to increase data quality, strengthening underrepresented languages, increasing re-use for greater impact, empowering knowledge curators to share their data, enabling the ecosystem, and connecting data across technological & institutional barriers. Objectives include better understanding which organizations want to use Wikibase in the future and for what, ensuring Wikibases can connect more deeply with each other and with Wikidata to form an LOD web, user testing of federated properties in combination with local properties, and more.
  • Wikidata/Strategy/2019 Wikimedia. (2019, August 27). Wikidata/Strategy/2019. This page provides access to a product vision paper and three product strategy papers discussing possible future developments for Wikidata and Wikibase and a very ambitious role in shaping the future of the information commons through 2030. Topics discussed include strategies for making Wikimedia projects ready for the future; maintaining and supporting Wikimedia's growing content; ensuring the integrity of Wikimedia content; furthering knowledge equity; and enabling new ways of consuming and contributing knowledge. There is a strategy paper discussing Wikidata as a platform and another discussing the Wikibase ecosystem.
  • Wikimedia:LinkedOpenData/Strategy2021/Joint Vision Wikimedia. Linked Open Data/Strategy 2021/Joint Vision. This document sets out the Wikibase and Wikidata joint vision for working in Linked Open Data. The document describes the vision, strategy, guiding principles, and approach to building out the Wikibase ecosystem.

Wikidata Related Resources

  • Creating and editing libraries in Wikidata Scott, Dan. (2018, February 18). Dan Scott's blog provides useful linked data information. This blog entry describes how to create a Wikidata item for a particular library. Properties useful for describing libraries and their collections are also provided.
  • Linked Open Data Cloud Wikidata Page This LOD Cloud page provides information about Wikidata, including download links, contact information, SPARQL endpoint, triples count, the Wikidata namespace, and more. It also provides examples of Wikidata concepts using information about Nelson Mandela in Wikidata.
  • MediaWiki MediaWiki. (2019, June 14). This is the MediaWiki main page. MediaWiki is a multilingual, free and open, extensible, and customizable wiki software engine used for websites to collect, organize, and make available knowledge. It was developed for Wikipedia and other Wikimedia Projects. It includes an API for reading and writing data, and support for managing user permissions, page editing history, article discussions, and an index for unstructured text documents.
  • Practical Wikidata for Librarians Wikidata. (2021, Feb. 11). Wikidata:WikiProject Linked Data for Production/Practical Wikidata for Librarians. This page provides a vast array of resources for librarians and archivists interested in editing Wikidata, and provides a space to share data models and best practices. Among the resources are instructional materials, policies, project recipes, verifiability, guidelines for describing entities in particular domains, constraint reports, user scripts, gadgets, and more.
  • User:HakanIST/EntitySchemaList Wikidata. (2021, April 20). This is a list of schemas used for describing Entities in Wikidata, compiled by Wikidata user HakanIST.
  • Wikidata editing with OpenRefine Wikidata: Tools/OpenRefine/Editing. (2021, April 25). This page provides links to tutorials, videos and a reference manual demonstrating how to use OpenRefine to add and edit items in Wikidata. It also demonstrates using MarkEdit with OpenRefine and Wikidata.
  • Wikidata in Brief Wikimedia Commons. (2017, July 31). This document gives a one page overview of Wikidata.
  • Wikidata Query Service in Brief Stinson, Alex. (2018, March). This document gives a one page overview of the Wikidata query service.
  • Wikimedia Wikibooks. (2019, April 12). This open book provides information on how to use Wikis covering topics including editing, basic markup language, images, templates, categories, namespaces, administrative namespaces, user namespaces, Wikipedia, Wikimedia Commons, Wikibooks, Meta, Wikidata, Wikiversity, Wikispecies, Wikiquotes, Wikivoyage, and more.
  • Works in Progress Webinar: Introduction to Wikidata for Librarians OCLC Research. (2018, June 12). This OCLC webinar gives a brief introduction to Wikidata.

WikiProject Universities

  • WikiProject Universities Wikidata. (2019, August 21). The purpose of this WikiProject is to provide better coverage of universities and other research institutions in Wikidata. The goal is to create a comprehensive and rich catalog of institutions, with strong links to other entities in the academic ecosystem (researchers, publications, alumni, facilities, projects, libraries…). The scope of the project includes listing the recommended statements about universities and evaluating their coverage across Wikidata; building showcase items to demonstrate what a university item should ideally look like; linking between items about universities and their subunits; linking items about people to items of the universities they are/were educated at, work(ed) at or are/were otherwise affiliated with; and providing counts by type, country, etc. Subpages and participants are listed.
  • WikiProject University of California Wikipedia. (2019, February 17). This Wikipedia article describes the WikiProject to improve Wikipedia's coverage of the University of California system, which encompasses University of California campuses, professional schools, facilities and biographies of major figures. The site provides links to WikiProjects for all of the University of California campuses and the UC System.
  • WikiProject Stanford Libraries Stanford Wikidata Working Group. (2019, September 4). This is the page for a WikiProject for work done at Stanford Libraries to connect library data with Wikidata. The page provides useful links, references, and guides covering a wide range of topics including Description guidelines, Wikidata policies and guidelines, Quick reference guides, Property resources, Projects, and much more.
  • WikiProject Books WikiProject Books is used to: define a set of properties to be used by book infoboxes, Commons books templates, and Wikisource; map and import relevant data currently spread in Commons, Wikipedia, and Wikisource; and establish methods to interact with this data from different projects. Based on the Functional Requirements for Bibliographic Records (FRBR) model, Wikimedia projects use a two-level model: a "work" level, and an "edition" level that merges the "expression" and "manifestation" levels of the FRBR model. Bibliographic properties and qualifiers are listed here.
  • WikiProject Heritage institutions Wikidata. (2019, October 3). This project aims to create a comprehensive, high-quality database of archives, museums, libraries and similar institutions. While the main focus is on institutions that have curatorial care of a collection, the scope of the project includes related institutions, such as lending libraries, exhibition centers, zoos, and the like, to the extent that they are not covered by any other WikiProject. The project also serves to coordinate a range of activities including the creation of an inventory of all existing public databases that contain data about heritage institutions, the implementation and maintenance of ontologies and multilingual thesauri relating to heritage institutions, the ingestion of data about heritage institutions into Wikidata, the inclusion of the data into Wikipedia and its sister projects, through Wikidata-powered infobox templates or lists, and more.
  • WikiProject Libraries Wikidata. (2019, September 25). The aim of this WikiProject is to define a structure for libraries and to create and improve the items about libraries. The page provides item identifiers for types of libraries.
  • WikiProject Linked Data for Production/Practical Wikidata for Librarians Wikidata. (2020, August 25). This project seeks to gather and organize resources for librarians interested in editing Wikidata as well as to prevent duplicative work and provide stepping stones and guidance for librarians interested in working with Wikidata. Resources, links to gadgets and user scripts, information on data modeling, and project recipes are provided.
  • WikiProject Maps Wikidata:WikiProject Maps. (2019, April 25). This Wikidata page provides access to geographic projects in Wikidata, possible properties to use for maps entered into Wikidata, a list of map types, and a link to maps on Wikidata.
  • WikiProject Medicine/National Network of Libraries of Medicine Wikimedia. (2019, September 17). The goals of this Wikipedia project are to improve the quality of Wikipedia's medical-related articles using authoritative mental health resources, raise visibility of NLM mental health resources, and promote Wikipedia as an outreach tool for engagement and open data. The project is centered on an edit-a-thon for October and November, 2019.
  • WikiProject Museums Wikidata. (2018, May 22). This project aims to define properties for items related to museums and the rules of use for these properties (qualifiers, datatypes, ...), and to organize the creation and quality improvement of these items. The page provides suggested properties to use with museum-related entities, tools, and example queries.
  • WikiProject Source MetaData Wikidata. WikiProject Source MetaData. (2019, August 11). WikiProject Source MetaData aims to: act as a hub for work in Wikidata involving citation data and bibliographic data as part of the broader WikiCite initiative; define a set of properties that can be used by citations, infoboxes, and Wikisource; map and import all relevant metadata that currently is spread across Commons, Wikipedia, and Wikisource; establish methods to interact with this metadata from different projects; create a large open bibliographic database within Wikidata; and reveal, build, and maintain community stakeholdership for the inclusion and management of source metadata in Wikidata. This page provides information regarding ongoing imports and projects, and a very substantial list of metadata sub-pages belonging to this project.
  • WikiProject Periodicals Wikidata: WikiProject Periodicals. (2019, June 15). This project aims to: define a set of properties from w:Template:Infobox_journal and w:Template:Infobox_magazine (and other languages), especially prior names with year ranges, and standard abbreviations; define a set of properties about periodical publishers, including learned societies; map and import 'Journals cited by Wikipedia'; map and import all relevant data to the Wikipedia collection of journal articles at w:Category:Academic journal articles / w:Category:Magazine articles (and other languages), and link these items to the reason for their notability, e.g. the discovery that was made or the event it records; prepare for linking the Wikisource collection of journal/magazine articles into Wikidata; map and import all other relevant data that currently is spread across Commons, Wikipedia, and Wikisource; and establish methods to interact with this data from different projects. This page provides lists of properties relevant to periodicals.
  • Citing sources Wikidata. (2019, January 5). This is a list of properties appropriate for citing sources in Wikidata. The list includes such properties as place of publication, imported from Wikimedia project, publisher, author, stated in, chapter, described by source, quote, inferred from, archive date, etc.
  • Wikidata List of Properties Wikidata. (2019, July 22). This page provides access to Wikidata properties by broad descriptive topics. The page also lists tools for browsing properties in different languages, and a download option for all properties.
  • Wikidata property for items about people or organisations Wikidata. (2019, February 7). This is a list of properties that can be used to describe people or organizations. It encompasses a very wide range such as head of state, flag, logo, movement, league, chief executive officer, headquarters location, record label, Queensland Australian Football Hall of Fame inductee ID, field of work, award received, etc.
  • Wikidata property for items about people or organisations/human/authority control Wikidata. (2019, October 5). This is a list of Wikidata name authority control properties for writers, artists, architects, and organizations.
  • Wikidata property for items about works Wikidata. (2019, February 11). This is a list of properties to describe works such as articles, books, manuscripts, authority control for works, plays, media items, musical works, algorithms, software, structures, comics, television programs, works of fiction, and films.

Many tools have been developed for working with Wikidata, many of which appear on the Wikidata Tools page listed below. General tools that are helpful for editing and adding items to Wikidata are listed here.

  • Author Disambiguator Wikidata:Tools/Author Disambiguator. (2020, October 2). Author Disambiguator is a tool for editing authors of works recorded in Wikidata, and is partially coordinated with the Scholia project that provides visual representations of scholarly literature based on what can be found in Wikidata. By converting author strings into links to author items, a much richer analysis and tracing of relationships between researchers and their works, institutions, co-authors, etc. can be achieved. The tool's ability to integrate with Scholia provides enhanced visual analysis.
  • Cradle Manske, Magnus. Cradle is a tool for creating new Wikidata items using a form. A link to existing forms along with their descriptions is provided. It is also possible to compose an original form.
  • Docker Desktop Docker Desktop is a MacOS and Windows application for building and sharing containerized applications and microservices and delivering them from your desktop. It enables the leveraging of certified images and templates in a choice of languages and tools. Docker Desktop uses the Google-developed open source orchestration system for automating the management, placement, scaling, and routing of containers.
  • EntiTree Schibel, Martin. EntiTree generates dynamic, navigable tree diagrams of people, organizations, and events based on information drawn from several sources and linked to Wikipedia articles.
  • FindingGLAMs This tool is a modified version of Monumental used to display information and multimedia about cultural heritage institutions gathered through Wikidata, Wikipedia and Wikimedia Commons. Search by name of institution or explore by geographic region, example item, or city.
  • Miraheze Miraheze is a non-profit MediaWiki hosting service created by John Lewis and Ferran Tufan. The service offers free MediaWiki hosting, compatible with VisualEditor and Flow.
  • Monumental Marynowski, Paweł and LaPorte, Stephen. (2017). This tool displays information and multimedia about cultural heritage monuments gathered through Wikidata, Wikipedia and Wikimedia Commons. Explore by entering a name of a monument, geographical region, example monument, or city.
  • OpenRefine Wikidata:Tools/OpenRefine. (2019, July 17). OpenRefine is a free data wrangling tool used to clean tabular data and connect it with knowledge bases, including Wikidata. This page provides recipes, instructions, and links to tutorials.
  • osm-wikidata Betts, Edward. Downloaded October 22, 2019. Use this tool to match OpenStreetMap (OSM) entities with Wikidata items. It uses the Wikidata SPARQL query service and the OSM Overpass and Nominatim APIs. Installation and configuration instructions are provided.
  • Scholia Nielsen, Finn Arup, Mietchen, Daniel, et al. (2020). Scholia is a service which uses the information in Wikidata to create visual scholarly profiles for topics, people, organizations, species, chemicals, etc. It can be used with the Author Disambiguator tool to generate bar graphs, bubble charts, line graphs, scatter plots, etc.
  • Semantic MediaWiki Krötzsch, Markus. (2020, Apr. 19). Semantic MediaWiki (SMW) is an open source extension for MediaWiki, the software that powers Wikipedia. It provides the ability to store data in wiki pages, and query it elsewhere, thus turning a wiki that uses it into a semantic wiki.
  • Wikidata:SourceMD Wikidata. (2019, February 10). SourceMD, aka Source Metadata Tool, can be used to take the persistent identifier for a scholarly article or book and automatically generate Wikidata items using metadata from scholarly publications. The tool works with these identifiers: ISBN-13 (P212); DOI (P356); ORCID iD (P496); PubMed ID (P698); and PMCID (P932).
  • Wikidata Tools (2019, January 18). This page provides a list of tools to ease working with Wikidata, including a property list, query tools, lexicographical data tools, tools for editing items, data visualization tools, a Wikidata graph builder, and more.
  • Wikimedia Programs & Events Dashboard Wikimedia. (2020, January 21). The Programs & Events Dashboard is a management tool used to initiate and organize edit-a-thons, campaigns, and other wiki events. It provides instructions, registration functions, and tracking functions to measure and report the outcome of a program (number of editors, number of edits, items created, references added, number of views, etc.).

This page provides access to documents and reports associated with workshops, institutions, organizations, or other entities which relate valuable information, or describe initiatives or projects regarding the Semantic Web or Linked Data.

  • Addressing the Challenges with Organizational Identifiers and ISNI Smith-Yoshimura, Karen, Wang, Jing, Gatenby, Janifer, Hearn, Stephen, Byrne, Kate. (2016). This webinar discusses documenting the challenges, use cases, and scenarios where the International Standard Name Identifier (ISNI) can be used to disambiguate organizations by using a unique, persistent and public URI associated with the organization that is resolvable globally over networks via specific protocols, thus providing the means to find and identify an organization accurately and to define the relationships among its sub-units and with other organizations.
  • BIBCO Mapping BSR to BIBFRAME 2.0 Group: Final Report to the PCC Oversight Group BIBCO Mapping BSR to BIBFRAME 2.0 Group. (2017, July). This report summarizes the work of the group mapping the BIBCO Standard Record (BSR) to BIBFRAME 2.0 and identifies issues that require further discussion by the Program for Cooperative Cataloging (PCC).
  • BIBCO Standard Record to BIBFRAME 2.0 Mapping BIBCO Mapping BSR to BIBFRAME 2.0 Group. (2017, July). This spreadsheet maps BIBCO Standard Record elements to BIBFRAME 2.0. Amid the information included in the spreadsheet are RDA instructions & elements, MARC coding, rda-rdf properties as defined in the RDA Registry, Triple statements needed to properly map the element, and specific instructions pertaining to elements.
  • BIBFLOW BIBFLOW is a two-year project of the UC Davis University Library and Zepheira, funded by IMLS. Its official title is “Reinventing Cataloging: Models for the Future of Library Operations.” BIBFLOW’s focus is on developing a roadmap for migrating essential library technical services workflows to a BIBFRAME / Linked Open Data (LOD) ecosystem. This page collects the specific library workflows that BIBFLOW will test by developing systems to allow library staff to perform this work using LOD native tools and data stores. Interested stakeholders are invited to submit comments on the workflows developed and posted on this site. Information from comments will be used to adjust testing as the project progresses.
  • British Library Data Model This is the British Library's data model for a resource.
  • British Library Data Model - Book This is the British Library's data model for cataloging a book in a Semantic Web environment.
  • British Library Data Model - Serial This is the British Library's data model for cataloging a serial in a Semantic Web environment.
  • Common Ground: Exploring Compatibilities Between the Linked Data Models of the Library of Congress and OCLC Godby, Carol Jean and Denenberg, Ray. (2015, Jan.). Library of Congress and OCLC Research. This white paper compares and contrasts the Bibliographic Framework Initiative at the Library of Congress and OCLC’s efforts to refine the technical infrastructure and data architecture for at-scale publication of linked data for library resources in the broader Web.
  • CONSER CSR to BIBFRAME Mapping Task Group: [Final Report] of the PCC BIBFRAME Task Group CONSER CSR to BIBFRAME Mapping Task Group. (2017). This report summarizes the mapping outcomes and recommendations of the group for mapping CONSER Standard Record (CSR) elements to BIBFRAME 2.0. It also identifies several issues that will require further discussion.
  • CONSER Standard Record to BIBFRAME 2.0 Mapping CONSER CSR to BIBFRAME Mapping Task Group. (2017, July). This spreadsheet maps the CONSER Standard Record (CSR) elements to BIBFRAME 2.0. The spreadsheet's "Examples" column contains links to sample code documents containing Turtle serializations of each CSR element in BIBFRAME.
  • Europeana pro Europeana Foundation. This site provides a detailed description of the European Union's Linked Open Data initiative, including a history, the Europeana Data Model, a list of namespaces used, tools, and more.
  • Game Metadata and Citation Project (GAMECIP) This University of California Santa Cruz and Stanford University project is developing the metadata needs and citation practices surrounding computer games in institutional collections. It seeks to address the problems of cataloging and describing digital files, creating discovery metadata, and providing access tools associated with the stewardship of digital games software stored by repositories. The site provides information regarding tools and vocabularies under development.
  • IIIF Explorer OCLC ResearchWorks. (2020). The IIIF Explorer is a prototype tool that searches across an index of all of the images in the CONTENTdm digital content management systems hosted by OCLC.
  • Library of Congress Labs The Library of Congress Labs site shares experimental initiatives the Library is conducting with its digital collections. Access videos, reports, presentations, and APIs. Clicking on the LC for Robots tab provides bulk data for Congressional bills, MARC records (in UTF-8, MARC8, and XML), Chronicling America, and more. The site demonstrates how to interact with the Library's collection.
  • Linked Art Linked Art is a Community working on creating a shared model to describe art based on Linked Open Data. The site lists partner projects, consortia, and institutions.
  • Linked Data for Libraries (LD4L) LD4L is a collaborative project of Cornell University Library, the Harvard Library Innovation Lab, and the Stanford University Libraries. The project is developing a Linked Data model to capture the intellectual value added to information resources when they are described, annotated, organized, selected, and used, along with the social value evident from patterns of usage.
  • Linked Data for Production: Closing the Loop (LD4P3) LD4P3 aims to create a working model of a complete cycle for library metadata creation, sharing, and reuse. LD4P3 builds on the foundational work of LD4P2: Pathway to Implementation, LD4P Phase 1, and Linked Data for Libraries Labs (LD4L Labs). Access the statement of objectives for two domain projects, one for cartographic material and one for film/moving image resources.
  • Linked Data for Production: Pathway to Implementation (LD4P2) Futornick, Michelle. (2019, January 14). LD4P Phase 2 builds upon the work of Linked Data for Production (LD4P) Phase 1 and Linked Data for Libraries Labs (LD4L Labs). This phase marks the beginning of implementing the cataloging community’s shift to linked data for the creation and manipulation of their metadata. Access information regarding the seven goals of Phase 2 outlined by the institutions collaborating on the project: Cornell; Harvard; Stanford; the University of Iowa School of Library and Information Science; and the Library of Congress and the Program for Cooperative Cataloging (PCC).
  • Linked Data Wikibase Prototype OCLC Research. In partnership with several libraries, OCLC has developed a prototype to demonstrate the value of linked data for improving resource-description workflows in libraries. The service is built on the Wikibase platform to provide three services: a Reconciler to connect legacy bibliographic information with linked data entities; a Minter to create and edit new linked data entities; and a Relator to view, create, and edit relationships between entities.
  • Looking Inside the Library Knowledge Vault Washburn, Bruce and Mixter, Jeff. (2015, Aug. 26). This is a YouTube recording of an OCLC Research Works in Progress webinar describing how OCLC Research is evaluating the Google Knowledge Vault model to test an approach to building a Library Knowledge Vault.
  • OCLC Data strategy and linked data This page describes OCLC library bibliographic initiatives focusing on designing and implementing new approaches to re-envision, expose, and share library data as entities that are part of the Semantic Web.
  • RDA Input Form The RDA Input Form is a proof-of-concept experiment created by the Cataloging and Metadata Services of the University of Washington to demonstrate that RDA cataloging (input) can be easily output in multiple schemas using a processing pipeline and mappings. The form focuses on PCC core, and output is in RDA/RDF and BIBFRAME in RDF-XML. The experiment showed that output in these schemas can be generated in an automated fashion using a pipeline. The implication for future production cataloging systems is that input and output should not be directly tied to each other, and that cataloging systems should have sufficient flexibility to output in multiple schemas, which can be achieved in an automated way.
  • Report of the Stanford Linked Data Workshop This report includes a summary of the workshop agenda and a chart showing the use of Linked Data in cultural heritage venues for the workshop held at Stanford University June 27 - July 1, 2011.
  • rightsstatements.org This GitHub page provides access to the request for proposals issued by the International Rights Statements Working Group, a joint Digital Public Library of America (DPLA) and Europeana Foundation working group to develop and implement a technical infrastructure for a rights statements application, a content management system, and a server configuration, deployment, and maintenance implementation for rights management. Links to a PDF version of the request and a PDF version of the "Requirements for the Technical Infrastructure for Standardized International Rights Statements" are provided.
  • Schema Bib Extend Community Group This W3C group was formed to discuss and prepare proposal(s) for extending Schema.org schemas for the improved representation of bibliographic information markup and sharing. Access the group wiki, contact information, a mailing list, information regarding joining the group, information about proposals, an RSS feed, and recipes and guidelines.
  • SHARE Virtual Discovery Environment project Casalini Libri, @Cult, and participating libraries. The aim of this project (Share-VDE Project) is to design a flexible configuration that uses the paradigms of the Semantic Web to provide a way for libraries to handle their data related to information management, enrichment, entity identification, conversion, reconciliation, and publication processes of the Semantic Web as independently as possible. The project provides a prototype of a virtual discovery environment with a three BIBFRAME layer architecture (Person/Work, Instance, Item) established through the individual processes of analysis, enrichment, conversion, and publication of data from MARC21 to RDF. Records from libraries with different systems, habits, and cataloguing traditions were included in the prototype.
  • Stanford Linked Data Workshop Technical Plan This report summarizes the output of the Linked Data in cultural heritage venues workshop held at Stanford University June 27 - July 1, 2011.
  • Stanford Tracer Bullets Futornick, Michelle. (2008, August 6). This Stanford Linked Data production project focused on all the steps to transition to a linked data environment in four technical services workflows: copy cataloging through the Acquisitions Department, original cataloging, deposit of a single item into the Stanford Digital Repository, and deposit of a collection of resources into the Stanford Digital Repository.
  • Wikipedia + Libraries: Better Together Wikipedia + Libraries: Better Together was an 18-month OCLC project to strengthen the ties between US public libraries and English Wikipedia which ended in May, 2018. Information provided includes how librarians use and contribute to Wikipedia, teach information literacies using Wikipedia, and use Wikipedia for events. Training materials are provided.
  • The Europeana Linked Open Data Pilot Haslhofer, Bernhard and Isaac, Antoine. In Proc. Int’l Conf. on Dublin Core and Metadata Applications 2011. This is the model developed to make metadata available from Europeana data providers as Linked Open Data. The paper describes the model and experiences gained with the Europeana Data Model (EDM), HTTP URI design, and RDF store performance.

This page provides links to examples of Linked Data currently in use.

  • BBC Academy: Linked Data The British Broadcasting Corporation (BBC) is an early experimenter with and adopter of Linked Data. The BBC Backstage project, working with Wikipedia, developed and produced content-rich prototypes showing the potential of Linked Data. Explore this site to experience the hidden power of seamless exploitation of Linked Data.
  • Becoming Data Native: How BIBFRAME Extensibility Delivers Libraries a Path to Scalable, Revolutionary Evolution Miller, Eric. (2017). Zepheira and The Library.Link Network. This is a PowerPoint presentation by Eric Miller presented at the 2017 American Library Association conference. It describes how third party linked data library vendor Zepheira uses BIBFRAME in its iterations to connect library collections to the linked data cloud, including the Library of Congress collection.
  • BIBFRAME 2.0 Implementation Register Library of Congress. The BIBFRAME 2.0 implementation register lists existing, developing, and planned implementations of BIBFRAME 2.0, the Library of Congress' replacement for MARC.
  • The British National Bibliography The BNB is the single most comprehensive listing of UK titles, recording the publishing activity of the United Kingdom and the Republic of Ireland. It includes print publications since 1950 and electronic resources since 2003.
  • Dallas Public Library This Dallas Public Library site demonstrates a Library.Link Network instance of library resources implemented by third party linked data vendor, Zepheira.
  • Data.gov Data.gov is the open data initiative of the United States government. It provides federal, state and local data, tools, and resources to conduct research, build apps, design data visualizations, and more. Data are provided by hundreds of organizations and Federal agencies, and the code is open source. The data catalog is powered by CKAN, and the content seen is powered by WordPress.
  • data-hnm-hu - Hungarian National Museum Datasets The Hungarian National Museum has made its Linked Data datasets available on datahub. As a means of familiarizing Hungarian librarians with BIBFRAME, the datasets were published so that the BIBFRAME and MARC descriptions were cross-linked. The conversion features work and entity recognition, and named entities are linked to external datasets.
  • dblp computer science bibliography Schloss Dagstuhl - Leibniz Center for Informatics. (2020, January 4). dblp is an on-line reference database providing free access to high-quality bibliographic meta-data and links to the electronic editions of computer science publications. When an external website that hosts an electronic edition of a research paper is known, a hyperlink together with the bibliographic meta-data is provided. Some links require subscriptions and some are open access.
  • Digital Public Library of America (DPLA) The Digital Public Library of America is a portal that brings together and makes freely available digitized collections of America’s libraries, archives, and museums. More than that, DPLA is a platform that provides developers, researchers, and others the ability to create tools and applications for learning and discovery. This is a site worth exploring to see the next generation library. Click on Bookshelf to search for a book. Visit the Apps page to find ways of accessing DPLA's resources. DPLA uses Krikri, a Ruby on Rails engine for metadata aggregation, enhancement, and quality control as part of Heiðrún, its metadata ingestion system.
  • DTU Orbit - The Research Information System DTU Orbit is the official research database of the Technical University of Denmark, DTU. Available to browse in standard web browsers and in addition to providing open access to articles, it provides a linked data type graph interface to cross search publications, projects, activities, department profiles and staff profiles related to publications to which DTU employees have contributed.
  • English Language Books listed in Printed Book Auction Catalogues from 17th Century Holland Alexander, Keith. This datahub dataset lists books in the English language section of Dutch printed book auction catalogues of collections of scholars and religious ministers. For access to this data set and other auction catalogues, see the Printed Book Auction Catalogues resource.
  • Europeana Pro Europeana Foundation. This is the European Union's initiative to share its countries' rich cultural heritage resources. Information regarding APIs, tools, grants, and events are also provided.
  • Harvard LibraryCloud APIs Created by Licht, Jeffrey Louis, last modified by Wetherill, Julie M. (2019, May 6). Library Cloud is a metadata service that provides open, programmatic access to item and collection APIs that provide search access to Harvard Library collections metadata.
  • Ligatus Ligatus. (2021). Ligatus is part of an initiative of the University of the Arts London conducting research on documentation in historical libraries and archives. Some of the projects include the Language of Bindings Thesaurus, Linked Conservation Data, Artivity (a tool capturing contextual data produced during the creative process of artists and designers while working on a computer), The St. Catherine's Project (conservation support for the unique monastery library in Sinai), and Archive as Event (online archive of the artist John Latham structured using Creative Archiving principles based on Latham's ideas).
  • Linked Jazz This Pratt Institute project is built around oral histories of jazz musicians from Rutgers Institute for Jazz Studies Archives, Smithsonian Jazz Oral Histories, the Hamilton College Jazz Archive, UCLA’s Central Avenue Sounds series, and the University of Michigan’s Nathaniel C. Standifer Video Archive of Oral History. Tools developed for the project include the Linked Jazz Transcript Analyzer, a Name Mapping and Curator Tool, the crowd sourcing tool Linked Jazz 52nd Street, and the Linked Jazz Network Visualization Tool. The project also used Ecco! - a Linked Open Data application for entity resolution designed to disambiguate and reconcile named entities with URIs from authoritative sources.
  • London DataStore The London DataStore is a free and open data-sharing portal providing access to over 500 datasets about London.
  • National Széchényi Library catalogue (National Library of Hungary) The National Széchényi Library provides an example of a library Linked Data interface. Use the search box to perform a search. Click on "Semantic Web" under "Services" and click on Semantic web to learn more about this library's service and its move to Virtuoso.
  • OCLC Research This page shows OCLC's current research projects on libraries, metadata, collections, library enterprises, and more.
  • Office of the Historian Department of State, United States. The Office of the Historian publishes the Foreign Relations of the United States, a Guide to Country Recognition and Relations, and the World Wide Diplomatic Archives Index. Among other resources provided by the Office are bibliographic information about U. S. Presidents and Secretaries of State, information about travels of the President and Secretaries of State, Visits by Foreign Heads of State, and more. The office is using the TEI Processing Model and eXistdb for publishing its documents on the Web.
  • Organization Name Linked Data The Organization Name Linked Data (ONLD) is based on the North Carolina State University Organization Name Authority, a tool maintained by the Libraries' Acquisitions & Discovery department to manage the variant forms of name for journal and e-resource publishers, providers, and vendors in their local electronic resource management system (ERMS). Names chosen as the authorized form reflect an acquisitions, rather than bibliographic, orientation. Data is represented as RDF triples using properties from the SKOS, RDF Schema, FOAF and OWL vocabularies. Links to descriptions of the organizations in other linked data sources, including the Virtual International Authority File, the Library of Congress Name Authority File, DBpedia, Freebase, and International Standard Name Identifier (ISNI), are provided.
  • SHARE Catalogue @ Cult Rome Italy. Scholarly Heritage and Access to Research Catalog (SHARE Catalogue) is a portal providing a single point of access to the entirety of the integrated resources of eight Italian libraries organized according to the BIBFRAME linked data model.
  • Share-VDE (Virtual Discovery Environment) Share-VDE is a library-driven initiative which collects bibliographic records and authority files in a shared discovery environment using Linked Data. It is a collaborative endeavor between Casalini Libri, @CULT, the Program for Cooperative Cataloging, international research libraries, and the LD4P project. The Share-VDE interface provides wide-ranging and detailed search results to library patrons. Each library received the information corresponding to its own catalog in Linked Data which may be re-used according to local requirements with no restrictions.
  • Text Creation Partnership (TCP) The TCP is making available standardized, accurate XML/SGML encoded electronic text editions from Early English Books Online (EEBO-TCP), Eighteenth Century Collections Online (ECCO-TCP), Evans Early American Imprints (Evans-TCP), and EEBO-TCP Collections: Navigations. Texts are from ProQuest’s Early English Books Online, Gale Cengage’s Eighteenth Century Collections Online, and Readex’s Evans Early American Imprints and are made available through web interfaces provided by the libraries at the University of Michigan and University of Oxford.
  • University of Edinburgh Wikimedian in Residence University of Edinburgh. (2021). This page lists Wikidata Use Cases from the University of Edinburgh's collaboration with Wikimedia UK. Cases which have garnered international acclaim and served as inspiration for other research and collaborations include Scottish Witches, The Aberdeen Tower Block Archives, Documenting Biomedical Sciences: The Gene Wiki Project, Mapping the Scottish Reformation, Digitising Collections at the National Library of Wales, and others. Projects developed student skills as they surfaced data from MS Access databases to Wikidata as structured, machine-readable, linked open data.
  • University of Southampton Open Data Service The University of Southampton Open Data Service has developed several mobile apps based on datasets using linked data. The data sets cover all aspects of university life including academic sessions, campus map, buildings, disabilities information, food services, organizations, and more. This initiative won the Times Higher Award in 2012 for Outstanding ICT Initiative of the Year, and a Cost Sector Catering award in 2015 for Best Innovation in Catering.

Wikibase Use Cases

  • Enslaved Michigan State University, Matrix: Center for Digital Humanities & Social Sciences. This Wikibase instance provides for the exploration of individuals who were enslaved, owned slaves, or participated in the historical trade. Search over numerous datasets and browse interconnected data, generate visualizations, and explore short biographies of enslaved and freed peoples.
  • The EU Knowledge Graph European Commission. (2021, March 29). This Wikibase instance contains structured information about the European Union. Click on the Kohesio link to see the Project Information Portal for Regional Policy, which showcases how linked data can be used to provide local policy information regarding different topics.
  • Rhizome Artbase Rhizome provides a dataset for born-digital artworks from 1999 to the present day using the Wikibase platform. Search by date or artist name. Some entries include external links to artworks maintained by artists or others, archived copies hosted on Rhizome infrastructure, and documentation. The instance provides timeline capability and uses its own ontology data model that integrates with Wikidata and other standards.

University of Edinburgh Wikimedian in Residence Projects

  • Mapping the Scottish Reformation This project maps the Scottish Reformation by tracing clerics across early modern and modern Scotland using information from a database of the Scottish clergy generated by Wikidata. Information from this database runs parallel to another University of Edinburgh project, Scottish Witches.
  • Scottish Witches Access the data visualizations of geolocation information pulled from the Survey of Scottish Witchcraft by Geology and Physical Geography student Emma Carroll. The work transformed the Survey from a static database to an acclaimed interactive linked open data collaboration with Wikimedia, with support from Ewan McAndrew, University of Edinburgh’s Wikimedian in Residence. Information about the project is available.
  • Last Updated: Feb 26, 2024 1:06 PM
  • URL: https://guides.library.ucla.edu/semantic-web

Semantics Research Paper

Semantics is the study of meaning communicated through language, and is usually taken to be one of the three main branches of linguistics, along with phonology, the study of sound systems, and grammar, which includes the study of word structure (morphology) and of sentence structure (syntax). This entry surveys some of the main topics of current semantics research.

1. Introduction

Traditionally, the main focus of linguistic semantics has been on word meaning, or lexical semantics. Since classical times writers have commented on the fact, noticed surely by most reflecting individuals, that the meaning of words changes over time. Such observations are the seeds of etymology, the study of the history of words. Over longer stretches of time, such changes become very obvious, especially in literate societies. Words seem to shift around: some narrow in meaning such as English ‘queen,’ which earlier meant ‘woman, wife’ but now means ‘wife of a king.’ Others become more general, while still others shift to take on new meaning or disappear altogether. Words are borrowed from language to language. The study of such processes is now part of historical semantics (Fisiak 1985). Another motivation for the study of word meaning comes from dictionary writers as they try to establish meaning correspondences between words in different languages, or, in monolingual dictionaries, seek to provide definitions for all the words of a language in terms of a simple core vocabulary. In lexicology similarities and differences in word meaning are a central concern.

The principled study of the meaning of phrases and sentences has only become established in linguistics relatively recently. Thus it is still common for descriptive grammars of individual languages to contain no separate section on semantics other than providing a lexicon. Nonetheless it has always been clear that one can identify semantic relations between sentences. Speakers of English know from the semantics of negation that nominal negation has different effects than sentence negation, so that ‘No-one complained’ may aptly be used to answer ‘Who complained?,’ while ‘Someone did not complain’ may not. Two sentences may seem to say essentially the same thing, even be paraphrases of each other, yet one may be more suited to one context than another–like the pair ‘Bandits looted the train’ and ‘The train was looted by bandits.’ A single sentence may be internally inconsistent, such as ‘Today is now tomorrow,’ or seem to be repetitive or redundant in meaning, such as ‘A capital city is a capital city.’ Another feature of sentence meaning is the regularity with which listeners draw inferences from sentences, and often take these to be part of the meaning of what was said. Some inferential links are very strong, such as entailment. Thus we say that ‘Bob drank all of the beer’ entails ‘Bob drank some of the beer’ (assuming the same individual Bob, beer, etc.), because it is hard to think of a situation where acceptance of the second sentence would not follow automatically from acceptance of the first. Other inferential links are weaker and more contextually dependent: from the utterance ‘Bob drank some of the beer’ it might be reasonable to infer ‘Bob didn’t drink all of the beer,’ but it is possible to think of situations where this inference would not hold. We might say that a speaker of the first sentence is implying the second, in a certain context. 
Speakers of all languages regularly predict and use such inferential behavior to convey their meaning, such that often more meaning seems to be communicated than is explicitly stated. All these aspects of sentence meaning are under study in various semantic frameworks.

Semanticists share with philosophers an interest in key issues in the use of language, notably in reference. We use this term to describe the way in which speakers can pick out, or name, entities in the world by using words as symbols. Many scholars, especially formal semanticists, accept Frege’s distinction between reference (in German, Bedeutung) and sense (Sinn); see Frege (1980). Reference is the act of identifying an entity (the referent) while sense is the means of doing so. Two different linguistic expressions such as ‘the number after nine’ and ‘the number before eleven’ differ in sense but they both share the same referent, ‘ten.’ For semanticists it is particularly interesting to study the various mechanisms that a language offers to speakers for this act of referring. These include names such as ‘Dublin,’ nouns such as ‘cat,’ which can be used to refer to a single individual, ‘your cat,’ or a whole class, ‘Cats are carnivorous,’ quantified nominals such as ‘many cats,’ ‘some cats,’ ‘a few cats,’ etc. Linguists as well as philosophers have to account for language’s ability to allow us to refer to nonexistent and hypothetical referents such as ‘World War Three,’ ‘the still undiscovered cure for cancer,’ ‘the end of the world.’
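Frege's distinction can be illustrated with a loose programming analogy, offered here only as a sketch: two different procedures (standing in for senses) that pick out the same value (standing in for the referent). The function names are hypothetical and the analogy is not part of Frege's own account.

```python
# Two distinct "senses": different ways of arriving at a number.
def number_after_nine() -> int:
    return 9 + 1

def number_before_eleven() -> int:
    return 11 - 1

# The procedures are different objects (different senses) ...
print(number_after_nine is number_before_eleven)      # False
# ... but they determine the same value (the same referent, ten).
print(number_after_nine() == number_before_eleven())  # True
```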

Semanticists also share interests with psychologists, for if sense is the meaning of an expression, it seems natural to many semanticists to equate it with a conceptual representation. Cognitive semanticists in particular (for example Lakoff 1987, Talmy 2000), but also some generative linguists (Jackendoff 1996), seek to explore the relationship between semantic structure and conceptual structure. One axis of the debate is whether words, for example, are simply labels for concepts, or whether there is a need for an independent semantic interface that isolates just grammatically relevant elements of conceptual structure. As Jackendoff (1996) points out, many languages make grammatical distinctions corresponding to the conceptual distinctions of gender and number, but few involve distinctions of color or between different animal species. If certain aspects of concepts are more relevant to grammatical rules, as is also claimed by Pinker (1989), this may be justification for a semantic interface.

2. Approaches To Meaning

Even in these brief remarks we have had to touch on the crucial relationship between meaning and context. Language of course typically occurs in acts of communication, and linguists have to cope with the fact that utterances of the same words may communicate different meanings to different individuals in different contexts. One response to this problem is to hypothesize that linguistic units such as words, phrases, and sentences have an element of inherent meaning that does not vary across contexts. This is sometimes called inherent, or simply, sentence meaning. Language users, for example speakers and listeners, then enrich this sentence meaning with contextual information to create the particular meaning the speaker means to convey at the specific time, which can then be called speaker meaning. One common way of reflecting this view is to divide the study of meaning into semantics, which becomes the study of sentence meaning, and pragmatics, which is then the study of speaker meaning, or how speakers use language in concrete situations. This is an attempt to deal with the tension between the relative predictability of language between fellow speakers and the great variability of individual interpretations in interactive contexts. One consequence of this approach is the view that the words that a speaker utters underdetermine their intended meaning.

Semantics as a branch of linguistics is marked by the theoretical fragmentation of the field as a whole. The distinction between formal and functional approaches, for example, is as marked in semantics as elsewhere. This is a large subject to broach here but see Givon (1995) and Newmeyer (1998) for characteristic and somewhat antagonistic views. One important difference is the attitude to the autonomy of levels of analysis. Are semantics and syntax best treated as autonomous areas of study, each with its own characteristic entities and processes? A related question at a more general level is whether linguistic processes can be described independently of general psychological processes or the study of social interaction. Scholars in different theoretical frameworks will give contradictory answers to these questions of micro- and macro-autonomy. Autonomy at both levels is characteristic of semantics within generative grammar; see, for example, Chomsky (1995). Functionalists such as Halliday (1994) and Harder (1996) would on the other hand argue against micro-autonomy, suggesting that grammatical relations and structure cannot be understood without reference to semantic function. They also seek motivation for linguistic structure in the dynamics of communicative interaction. A slightly different external mapping is characteristic of cognitive semantics, for example Lakoff (1987) and Langacker (1987), where semantic structures are correlated to conceptual structures.

Another dividing issue in semantics is the value of formal representations. Scholars are divided on whether our knowledge of semantics is sufficiently mature to support attempts at mathematical or other symbolic modeling; indeed, on whether such modeling serves any use in this area. Partee (1996), for example, defends the view of formal semanticists that the application of symbolic logic to natural languages, following in particular the work of Montague (1974), represents a great advance in semantic description. Jackendoff (1990), on the other hand, acknowledges the value of formalism in semantic theory and description but argues that formal logic is too narrow adequately to describe meaning in language. Other scholars, such as Wierzbicka (1992), view the search for formalism as premature and distracting. There has been an explosive increase in the research in formal semantics since Montague’s (1974) proposal that the analysis of formal languages could serve as the basis for the description of natural languages. Montague’s original theory comprised a syntax for the natural language, say English, a syntax for the logical language into which English should be translated (intensional logic), rules for the translation, and rules for the semantic interpretation of the intensional logic. This and subsequent formal approaches are typically referential (or denotational) in that their emphasis is on the connection of language with a set of possible worlds, including the real, external world and the hypothetical worlds set up by speakers. Crucial to this correspondence is the notion of truth, defined at the sentence level. A sentence is true if it correctly describes a situation in some world. In this view, the meaning of a sentence is characterized by describing the conditions which must hold for it to be true. 
The central task for such approaches is to extend the formal language to cope with the semantic features of natural language while maintaining the rigor and precision of the methodology. See the papers in Lappin (1996) for typical research in this paradigm.
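The truth-conditional idea described above can be made concrete with a minimal Python sketch. It assumes a toy model in which a possible world is just a set of atomic facts and a sentence meaning is a function from worlds to truth values. The world names, the facts, and the helper functions are all hypothetical, and the model is far simpler than Montague's intensional logic.

```python
# Toy possible-worlds model (an illustrative assumption, not Montague's system).
# Each world is the set of atomic facts that hold in it.
WORLDS = {
    "w1": frozenset({"rain", "cold"}),
    "w2": frozenset({"rain"}),
    "w3": frozenset(),
}

def atom(fact):
    """Meaning of an atomic sentence: true in exactly the worlds containing the fact."""
    return lambda world: fact in world

def neg(p):
    """Sentence negation: true wherever p is false."""
    return lambda world: not p(world)

def conj(p, q):
    """Conjunction: true wherever both p and q are true."""
    return lambda world: p(world) and q(world)

def truth_set(p):
    """Names of the worlds in which the sentence meaning p is true."""
    return {name for name, world in WORLDS.items() if p(world)}

raining = atom("rain")
cold = atom("cold")

# 'It is raining and it is not cold' is true only in w2.
print(sorted(truth_set(conj(raining, neg(cold)))))  # ['w2']
```

On this view, giving the meaning of a complex sentence amounts to stating how its truth set is computed from the truth sets of its parts.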

Research in cognitive semantics presents an alternative strategy. Cognitive semanticists reject what they see as the mathematical, antimentalist approach of formal semantics. In their view meaning is described by relating linguistic expressions to mental entities, conventionalized conceptual structures. These semanticists have proposed a number of conceptual structures and processes, many deriving from perception and bodily experience and, in particular, conceptual models of space. Proposals for underlying conceptual structures include image schemas (Johnson 1987), mental spaces (Fauconnier 1994), and conceptual spaces (Gardenfors 1999). Another focus of interest is the processes for extending concepts, and here special attention is given to metaphor. Lakoff (1987) and Johnson (1987) have argued against the classical view of metaphor and metonymy as something outside normal language, added as a kind of stylistic ornament. For these writers metaphor is an essential element in our categorization of the world and our thinking processes. Cognitive semanticists have also investigated the conceptual processes which reveal the importance of the speaker’s perspective and construal of a scene, including viewpoint shifting, figure-ground shifting, and profiling (Langacker 1987).

3. Topics In Sentence Semantics

Many of the semantic systems of language, for example tense, aspect, mood, and negation, are marked grammatically on individual words such as verbs. However, they operate over the whole sentence. This ‘localization’ is the reason that descriptive grammars usually distribute semantic description over their analyses of grammatical forms. Such semantic systems offer the speaker a range of meaning distinctions through which to communicate a message. Theoretical semanticists attempt to characterize each system qua system, as in for example Verkuyl’s (1993) work on aspect and Hornstein’s (1990) work on tense. Typological linguists try to characterize the variation in such systems across the world’s languages, as in the studies of tense and aspect by Comrie (1976, 1985), Binnick (1991), and Bybee et al. (1994). We can sketch some basic features of some of these systems.

3.1 Situation Type And Aspect

Situation type and aspect are terms for a language’s resources that allow a speaker to describe the temporal ‘shape’ of events. The term situation type is used to describe the system encoded in the words of a language, while aspect is used for the grammatical systems which perform a similar role. To take one example, languages typically allow speakers to describe a situation either as static, as in ‘The bananas are ripe,’ or as dynamic, as in ‘The bananas are ripening.’ Here the state is the result of the process but the same situation can be viewed as more static or dynamic, as in ‘The baby is asleep’ and ‘The baby is sleeping.’ As these examples show, this distinction is lexically marked: in English, for example, adjectives are typically used for states, and verbs for dynamic situations. There are, however, a group of stative verbs, such as ‘know,’ ‘understand,’ ‘love,’ ‘hate,’ which describe static situation types. There are a number of semantic distinctions typically found amongst dynamic verbs, for example the telic/atelic (bounded/unbounded) distinction and the punctual/durative distinction. Telic verbs describe processes which are seen as having a natural completion, which atelic verbs do not. A telic example is ‘Matthew was growing up,’ and an atelic example is ‘Matthew was drinking.’ If these processes are interrupted at any point, we can automatically say ‘Matthew drank,’ but not ‘Matthew grew up.’ However, atelic verbs can form telic phrases and sentences by combining with other grammatical elements, so that ‘Matthew was drinking a pint of beer’ is telic.
Durative verbs, as the term suggests, describe processes that last for a period of time, while punctual describes those that seem so instantaneous that they have no detectable internal structure, as in the comparison between ‘The man slept’ and ‘The light flashed.’ As has often been observed, if an English punctual verb is used with a durative adverbial, the result is an iterative meaning, as in ‘The light flashed all night,’ where we understand the event to be repeated over the time mentioned.
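The distinctions just described can be summarized as a small feature system. The following Python sketch is illustrative only: the class, the mini-lexicon, and the function names are hypothetical, and feature values are filled in only where the examples above supply them.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class SituationType:
    dynamic: bool          # False for states such as 'know'
    durative: bool         # False for punctual events such as 'flash'
    telic: Optional[bool]  # has a natural completion; None if not classified above

# Hypothetical mini-lexicon based on the examples in the text.
LEXICON = {
    "know": SituationType(dynamic=False, durative=True, telic=None),
    "drink": SituationType(dynamic=True, durative=True, telic=False),
    "grow up": SituationType(dynamic=True, durative=True, telic=True),
    "sleep": SituationType(dynamic=True, durative=True, telic=None),
    "flash": SituationType(dynamic=True, durative=False, telic=None),
}

def interruption_licenses_simple_past(verb: str) -> bool:
    """If 'X was V-ing' is interrupted, can we still say 'X V-ed'?
    The text's test: yes for atelic processes, no for telic ones."""
    s = LEXICON[verb]
    return s.dynamic and s.durative and s.telic is False

def reading_with_durative_adverbial(verb: str) -> str:
    """Reading of 'V-ed all night': punctual verbs are understood as repeated."""
    return "single ongoing situation" if LEXICON[verb].durative else "iterative"

print(interruption_licenses_simple_past("drink"))    # True
print(interruption_licenses_simple_past("grow up"))  # False
print(reading_with_durative_adverbial("flash"))      # iterative
```

A fuller treatment would assign telicity to whole phrases rather than verbs, since, as noted above, ‘drink a pint of beer’ is telic although ‘drink’ alone is not.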

Situation type typically interacts with aspect. Aspect is the grammatical system that allows the speaker choices in how to portray the internal temporal nature of a situation. An event, for example, may be viewed as closed and completed, as in ‘Joan wrote a book,’ or as an ongoing process, perhaps unfinished, as in ‘Joan was writing a book.’

The latter verb form is described as being in the progressive aspect in English, but similar distinctions are very common in the languages of the world. In many languages we find described a distinction between perfective and imperfective aspects, used to describe complete versus incomplete events; see Bybee et al. (1994) for a survey. As mentioned above, aspect is intimately associated both with situation type and tense. In Classical Arabic the perfective is strongly associated with past tense (Comrie 1976, Binnick 1991). In English, for example, stative verbs are typically not used with progressive aspect, so that one may say ‘I know some French’ but not ‘I am knowing some French.’ Staying with the progressive, when it is used in the present tense in English (and in many other languages) it carries a meaning of proximate future or confident prediction as in ‘We’re driving to Los Angeles’ or ‘I’m leaving you.’ The combination of the three semantic categories of tense, situation type, and aspect produces a complex system that allows speakers to make subtle distinctions in relating an event or describing a situation.

3.2 Modality

Modality is a semantic system that allows speakers to express varying attitudes to a proposition. Semanticists have traditionally identified two types of modality. One is termed epistemic modality, which encodes a speaker’s commitment to, or belief in, a proposition, from the certainty of ‘The ozone layer is shrinking’ to the weaker commitments of ‘The ozone layer may/might/could be shrinking.’ The second is deontic modality, where the speaker signals a judgment toward social factors of obligation, responsibility, and permission, as in the various interpretations of ‘You must/can/may/ought to borrow this book.’ These examples show that similar markers, here auxiliary verbs, can be used for both types. When modality distinctions are marked by particular verbal forms, these are traditionally called moods. Thus many languages, including Classical Greek and Somali, have a verb form labeled the optative mood for expressing wishes and desires. Other markers of modality in English include verbs of propositional attitude, as in ‘I know/believe/think/doubt that the ozone layer is shrinking,’ and modal adjectives, as in ‘It is certain/probable/likely/possible that the ozone layer is shrinking.’

A related semantic system is evidentiality, where a speaker communicates the basis or source for presenting a proposition. In English and many other languages this may be done by adding expressions like ‘allegedly,’ ‘so I’ve heard,’ ‘they say,’ etc., but certain languages mark such differences morphologically, as in Makah, a Nootkan language spoken in Washington State (Jacobsen 1986, p. 10):

wiki caxaw: ‘It’s bad weather’ (seen or experienced directly);

wiki caxakpi d: ‘It looks like bad weather’ (inference from physical evidence);

wiki caxakqad/ I: ‘It sounds like bad weather’ (on the evidence of hearing); and

wiki caxakwa d: ‘I’m told there’s bad weather’ (quoting someone else).

3.3 Semantic Roles

This term describes the speaker’s semantic repertoire for relating participants in a described event. One influential proposal in the semantics literature is that each language contains a set of semantic roles, the choice of which is partly determined by the lexical semantics of the verb selected by the speaker. A characteristic list of such roles is:

agent: the initiator of some action, capable of acting with volition;

patient: the entity undergoing the effect of some action, often undergoing some change in state;

theme: the entity which is moved by an action, or whose location is described;

experiencer: the entity which is aware of the action or state described by the predicate but which is not in control of the action or state;

beneficiary: the entity for whose benefit the action was performed;

instrument: the means by which an action is performed or something comes about;

location: the place in which something is situated or takes place;

goal: the entity towards which something moves;

recipient: the entity which receives something; and

source: the entity from which something moves.

In an example like ‘Harry immobilized the tank with a broomstick,’ the entity Harry is described as the agent, the tank as the patient, and the broomstick as the instrument. These roles have also variously been called deep semantic cases, thematic relations, participant roles, and thematic roles.

One concern is to explain the matching between semantic roles and grammatical relations. In many languages, as in the last example, there is a tendency for the subject of the sentence to correspond to the agent and for the direct object to correspond to a patient or theme; an instrument often occurs as a prepositional phrase. Certain verbs allow variations from this basic mapping, for example those we find with English verbs such as ‘break’: ‘The boy broke the window with a stone’ (subject = agent); ‘The stone broke the window’ (subject = instrument); ‘The window broke’ (subject = patient). Clearly verbs can be arranged into classes depending on the variations of mappings they allow, and not all English verbs pattern like ‘break.’ We can say ‘The admiral watched the battle with a telescope,’ but ‘The telescope watched the battle’ and ‘The battle watched’ sound decidedly odd.

From this literature emerges the claim that certain mappings are more natural or universal. One proposal is that, for example, there is an implicational hierarchy governing the mapping to subject, typically such as: agent > recipient/benefactive > theme/patient > instrument > location. In such a hierarchy each left element is more preferred than its right neighbor, so that moving rightward along the string gives us fewer expected subjects. The hierarchy also makes certain typological claims: if a language allows a certain semantic role to be subject, it will allow all those to its left. Thus if we find that a language allows the role instrument to be subject, we predict that it allows the roles to the left, but we do not know if it allows location subjects.
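The typological prediction made by such a hierarchy is easy to state procedurally. The sketch below assumes the ordering given above and uses hypothetical function names. Given one role attested as a subject, it returns the roles the hierarchy predicts are also possible subjects, and the roles about which it makes no prediction.

```python
# Subject-mapping hierarchy as given in the text, most preferred first.
SUBJECT_HIERARCHY = [
    "agent",
    "recipient/benefactive",
    "theme/patient",
    "instrument",
    "location",
]

def predicted_subject_roles(attested_role: str) -> list[str]:
    """Roles predicted to be possible subjects if attested_role can be a subject:
    the attested role itself and everything to its left."""
    idx = SUBJECT_HIERARCHY.index(attested_role)
    return SUBJECT_HIERARCHY[: idx + 1]

def undetermined_roles(attested_role: str) -> list[str]:
    """Roles to the right, about which the hierarchy makes no prediction."""
    idx = SUBJECT_HIERARCHY.index(attested_role)
    return SUBJECT_HIERARCHY[idx + 1 :]

print(predicted_subject_roles("instrument"))
# ['agent', 'recipient/benefactive', 'theme/patient', 'instrument']
print(undetermined_roles("instrument"))
# ['location']
```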

One further application of semantic roles is in lexical semantics, where the notion allows verbs to be classified by their semantic argument structure. Verbs are assigned semantic role templates or grids by which they may be sorted into natural classes. Thus, English has a class of transfer, or giving, verbs, which in one type includes the verbs ‘give,’ ‘lend,’ ‘supply,’ ‘pay,’ ‘donate,’ ‘contribute.’ These verbs encode a view of the transfer from the perspective of the agent and may be assigned the pattern <agent, theme, recipient>, as in ‘The committee donated aid to the famine victims.’ A second subclass of these transfer verbs encodes the process from the perspective of the recipient. These verbs include ‘receive,’ ‘accept,’ ‘borrow,’ ‘buy,’ ‘purchase,’ ‘rent,’ ‘hire,’ and have the pattern <recipient, theme, source>, as in ‘The victims received aid from the committee.’
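Role grids of this kind lend themselves to a simple lookup-table representation. The following sketch is a hypothetical illustration: it stores a grid for a few of the verbs listed above, groups verbs that share a grid into classes, and labels the arguments of a sentence. The argument order is assumed to follow the grid, not the surface word order of any particular construction.

```python
from collections import defaultdict

# Hypothetical role grids for a few of the transfer verbs named in the text.
ROLE_GRIDS = {
    "give": ("agent", "theme", "recipient"),
    "lend": ("agent", "theme", "recipient"),
    "donate": ("agent", "theme", "recipient"),
    "receive": ("recipient", "theme", "source"),
    "borrow": ("recipient", "theme", "source"),
    "buy": ("recipient", "theme", "source"),
}

def classes_by_grid(grids):
    """Sort verbs into classes according to the role grid they share."""
    classes = defaultdict(list)
    for verb, grid in grids.items():
        classes[grid].append(verb)
    return dict(classes)

def label_arguments(verb, arguments):
    """Pair each argument with the role in the corresponding grid position."""
    grid = ROLE_GRIDS[verb]
    if len(arguments) != len(grid):
        raise ValueError(f"{verb!r} expects {len(grid)} arguments")
    return dict(zip(grid, arguments))

print(label_arguments("donate", ["the committee", "aid", "the famine victims"]))
# {'agent': 'the committee', 'theme': 'aid', 'recipient': 'the famine victims'}
print(label_arguments("receive", ["the victims", "aid", "the committee"]))
# {'recipient': 'the victims', 'theme': 'aid', 'source': 'the committee'}
```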

3.4 Entailment, Presupposition, And Implication

These terms relate to types of information a hearer gains from an utterance but which are not stated directly by the speaker. These phenomena have received a lot of attention because they seem to straddle the putative divide between semantics and pragmatics described above, and because they reveal the dynamic and interactive nature of understanding the meaning of utterances. Entailment describes a relationship between sentences such that on the basis of one sentence, a hearer will accept a second, unstated sentence purely on the basis of the meaning of the first. Thus sentence A entails sentence B, if it is not possible to accept A but reject B. In this view a sentence such as ‘I bought a dog today’ entails ‘I bought an animal today’; or ‘President Kennedy was assassinated yesterday’ entails ‘President Kennedy is now dead.’ Clearly these sentential relations depend on lexical relations: a speaker who understands the meaning of the English word ‘dog’ knows that a dog is an animal; similarly the verb ‘assassinate’ necessarily involves the death of the unfortunate object argument. Entailment then is seen as a purely automatic process, involving no reasoning or deduction, but following from the hearer’s linguistic knowledge. Entailment is amenable to characterization by truth conditions. A sentence is said to entail another if the truth of the first guarantees the truth of the second, and the falsity of the second guarantees the falsity of the first.
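The truth-conditional definition of entailment can be checked mechanically in a toy model. The sketch below assumes a handful of hypothetical worlds, each recording what the speaker bought today, and treats entailment as inclusion between the sets of worlds in which each sentence is true. The world names and predicates are illustrative assumptions.

```python
# Each hypothetical world records what kind of thing the speaker bought today.
WORLDS = {
    "w_dog": {"bought": "dog"},
    "w_cat": {"bought": "cat"},
    "w_book": {"bought": "book"},
}
ANIMALS = {"dog", "cat"}  # lexical knowledge: a dog is an animal

def bought_dog(world):
    return world["bought"] == "dog"

def bought_animal(world):
    return world["bought"] in ANIMALS

def truth_set(sentence):
    """Names of the worlds in which the sentence is true."""
    return {name for name, world in WORLDS.items() if sentence(world)}

def entails(a, b):
    """A entails B if every world where A is true is a world where B is true."""
    return truth_set(a) <= truth_set(b)

print(entails(bought_dog, bought_animal))  # True
print(entails(bought_animal, bought_dog))  # False

# Second half of the definition: wherever B is false, A is false too.
print(all(not bought_dog(w) for w in WORLDS.values() if not bought_animal(w)))  # True
```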

Presupposition, on the other hand, is a more complicated notion. In basic terms, the idea is simple enough: that a speaker communicates certain assumptions aside from the main message. A range of linguistic elements communicates these assumptions. Some, such as names, and definiteness markers such as the articles ‘the’ and ‘my,’ presuppose the existence of entities. Thus ‘James Brown is in town’ presupposes the existence of a person so called. Other elements have more specific presuppositions. A verb such as ‘stop’ presupposes a preexisting situation. So a sentence ‘Christopher has stopped smoking’ presupposes ‘Christopher smoked.’ If treated as a truth-conditional relation, presupposition is distinguished from entailment by the fact that it survives under negation: ‘Christopher has not stopped smoking’ still presupposes ‘Christopher smoked,’ but the sentence ‘I didn’t buy a dog today’ does not entail ‘I bought an animal today.’
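One way to picture survival under negation, sketched here under the assumption of a three-valued treatment (true, false, or undefined when the presupposition fails), is to store a sentence's presupposition separately from its assertion and let negation affect only the assertion. The class, the worlds, and the helper names are hypothetical, and this is only one of several possible formalizations.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass(frozen=True)
class Sentence:
    presupposition: Callable[[dict], bool]
    assertion: Callable[[dict], bool]

    def value(self, world: dict) -> Optional[bool]:
        """True or False if the presupposition holds, otherwise None (undefined)."""
        if not self.presupposition(world):
            return None
        return self.assertion(world)

def negate(s: Sentence) -> Sentence:
    """Negation flips the assertion but leaves the presupposition untouched."""
    return Sentence(s.presupposition, lambda w: not s.assertion(w))

def smoked(world: dict) -> bool:
    return world["smoked_before"]

# 'Christopher has stopped smoking'
stopped = Sentence(presupposition=smoked, assertion=lambda w: not w["smokes_now"])

WORLDS = [
    {"smoked_before": True, "smokes_now": False},
    {"smoked_before": True, "smokes_now": True},
    {"smoked_before": False, "smokes_now": False},
]

def true_only_where(s: Sentence, condition) -> bool:
    """Does condition hold in every world where s is true?"""
    return all(condition(w) for w in WORLDS if s.value(w) is True)

print(true_only_where(stopped, smoked))          # True
print(true_only_where(negate(stopped), smoked))  # True: survives negation
```

An ordinary entailment fails the same test: ‘I didn’t buy a dog today’ is true in worlds where no animal was bought.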

There are a number of other differences between entailment and presupposition that cast doubt on the adequacy of a purely semantic, truth-conditional account of the latter. Presuppositions are notoriously context sensitive, for example. They may be cancelled without causing an anomaly: a hearer can reply ‘Christopher hasn’t stopped smoking, because he never smoked’ to cancel the presupposition by what is sometimes called metalinguistic negation. This dependency on context has led some writers to propose that presupposition is a pragmatic notion, definable in terms of the set of background assumptions that the speaker assumes is shared in the conversation. See Beaver (1997) for discussion.

A third type of inference is Grice’s conversational implicature (1975, 1978). This is an extremely context-sensitive type of inference which allows participants in a conversation to maintain coherence. So, given the invented exchange below,

A: Did you give Mary the book?

B: I haven’t seen her yet.

it is reasonable for A to infer the answer ‘no’ to her question. Grice proposed that such inferences are routinely relied on by both speakers and hearers, and that this reliance is based on certain assumptions that hearers make about a speaker’s conduct. Grice classified these assumptions into several types, which he called maxims, giving rise to different types of inference, or, from the speaker’s point of view, what he termed implicatures. The four main maxims are called Quality, Quantity, Relevance, and Manner (Grice 1975, 1978). They amount to a claim that a listener will assume, unless there is evidence to the contrary, that a speaker will have calculated their utterance along a number of parameters: they will tell the truth; try to estimate what their audience knows and package their material accordingly; have some idea of the current topic; and give some thought to their audience being able to understand them. In our example above, it is A’s assumption that B’s reply is intended to be relevant that allows the inference ‘no.’

Implicature has three characteristics. First, it is implied rather than said. Second, its existence is a result of the context, i.e., the specific interaction: there is no guarantee that in other contexts ‘I haven’t seen her’ will be used to communicate ‘no.’ Third, implicature is cancelable without causing a contradiction. Thus the implicature ‘no’ in our example can be cancelled if B adds the clause ‘but I mailed it to her last week.’

These three notions—entailment, presupposition, and implicature—can all be seen as types of inference. They are all produced in conversation, and are taken by participants to be part of the meaning of what a speaker has said. They differ in a number of features and crucially in context sensitivity. The attempt to provide a unified analysis of them all is a challenge to semantic and pragmatic theories. See Sperber and Wilson (1995) for an attempt at such a unified approach.

4. Future Developments

Although semantics remains theoretically a very diverse field, it is possible to detect some shared trends which seem likely to develop further. One is a move away from a static view of sentences in isolation, detached from the speaker/writer’s act of communication, toward dynamic, discourse-based approaches. This has always been characteristic of functional approaches to meaning but has also been noticeable in formal approaches as they move away from their more philosophical origins. Among examples of this we might mention discourse representation theory (Kamp and Reyle 1993) and dynamic semantics (Groenendijk et al. 1996).

Another development which seems likely to continue is a closer integration with other disciplines in cognitive science. In particular, computational techniques seem certain to make further impact on a range of semantic inquiry, from lexicography to the modeling of questions and other forms of dialogue. A subfield of computational semantics has emerged and will continue to develop; see Rosner and Johnson (1992) for example.

Bibliography:

  • Beaver D 1997 Presuppositions. In: Van Benthem J, Ter Meulen A (eds.). Handbook of Logic and Language. Elsevier, Amsterdam, pp. 939–1008
  • Binnick R I 1991 Time and the Verb: A Guide to Tense and Aspect. Oxford University Press, Oxford, UK
  • Bybee J, Perkins R, Pagliuca W 1994 The Evolution of Grammar: Tense, Aspect, and Modality in the Languages of the World. University of Chicago Press, Chicago
  • Chomsky N 1995 The Minimalist Program. MIT Press, Cambridge, MA
  • Comrie B 1976 Aspect: An Introduction to the Study of Verbal Aspect and Related Problems. Cambridge University Press, Cambridge, UK
  • Comrie B 1985 Tense. Cambridge University Press, Cambridge, UK
  • Fauconnier G 1994 Mental Spaces: Aspects of Meaning Construction in Natural Language. Cambridge University Press, Cambridge, UK
  • Fisiak J (ed.) 1985 Historical Semantics—Historical Word Formation. Mouton de Gruyter, Berlin
  • Frege G 1980 Translations from the Philosophical Writings of Gottlob Frege [ed. Geach P, Black M]. Blackwell, Oxford, UK
  • Gardenfors P 1999 Some tenets of cognitive semantics. In: Allwood J, Gardenfors P (eds.). Cognitive Semantics: Meaning and Cognition. John Benjamins, Amsterdam, pp. 12–36
  • Givon T 1995 Functionalism and Grammar. John Benjamins, Amsterdam
  • Grice H P 1975 Logic and conversation. In: Cole P, Morgan J (eds.). Syntax and Semantics 3: Speech Acts. Academic Press, New York, pp. 43–58
  • Grice H P 1978 Further notes on logic and conversation. In: Cole P (ed.). Syntax and Semantics 9: Pragmatics. Academic Press, New York, pp. 113–28
  • Groenendijk J, Stokhof M, Veltman F 1996 Coreference and modality. In: Lappin S (ed.). The Handbook of Contemporary Semantic Theory. Blackwell, Oxford, UK, pp. 179–214
  • Halliday M A K 1994 An Introduction to Functional Grammar, 2nd edn. Edward Arnold, London
  • Harder P 1996 Functional Semantics: A Theory of Meaning, Structure and Tense in English. Mouton de Gruyter, Berlin
  • Hornstein N 1990 As Time Goes By: Tense and Universal Grammar. MIT Press, Cambridge, MA
  • Jackendoff R 1990 Semantic Structures. MIT Press, Cambridge, MA
  • Jackendoff R 1996 Semantics and cognition. In: Lappin S (ed.). The Handbook of Contemporary Semantic Theory. Blackwell, Oxford, UK, pp. 539–60
  • Jacobsen W H Jr 1986 The heterogeneity of evidentials in Makah. In: Chafe W, Nichols J (eds.). Evidentiality: The Linguistic Coding of Epistemology. Ablex, Norwood, NJ, pp. 3–28
  • Johnson M 1987 The Body in the Mind: The Bodily Basis of Meaning, Imagination, and Reason. University of Chicago Press, Chicago
  • Kamp H, Reyle U 1993 From Discourse to Logic. Kluwer, Dordrecht, The Netherlands
  • Lakoff G 1987 Women, Fire, and Dangerous Things: What Categories Reveal About the Mind. University of Chicago Press, Chicago
  • Langacker R W 1987 Foundations of Cognitive Grammar. Stanford University Press, Stanford, CA
  • Lappin S (ed.) 1996 The Handbook of Contemporary Semantic Theory. Blackwell, Oxford, UK
  • Lehmann W P 1992 Historical Linguistics, 3rd edn. Routledge, London
  • Montague R 1974 Formal Philosophy: Selected Papers of Richard Montague [ed. Thomason R H]. Yale University Press, New Haven, CT
  • Newmeyer F J 1998 Language Form and Language Function. MIT Press, Cambridge, MA
  • Partee B H 1996 The development of formal semantics in linguistic theory. In: Lappin S (ed.). The Handbook of Contemporary Semantic Theory. Blackwell, Oxford, UK, pp. 11–38
  • Pinker S 1989 Learnability and Cognition: The Acquisition of Argument Structure. MIT Press, Cambridge, MA
  • Rosner M, Johnson R (eds.) 1992 Computational Linguistics and Formal Semantics. Cambridge University Press, Cambridge, UK
  • Sperber D, Wilson D 1995 Relevance: Communication and Cognition, 2nd edn. Blackwell, Oxford, UK
  • Talmy L 2000 Toward a Cognitive Semantics. MIT Press, Cambridge, MA
  • Verkuyl H J 1993 A Theory of Aspectuality: The Interaction Between Temporal and Atemporal Structure. Cambridge University Press, Cambridge, UK
  • Wierzbicka A 1992 Semantics, Culture, and Cognition. Universal Concepts in Culture-specific Configurations. Oxford University Press, Oxford, UK


Title: SeamlessExpressiveLM: Speech Language Model for Expressive Speech-to-Speech Translation with Chain-of-Thought

Abstract: Expressive speech-to-speech translation (S2ST) is a key research topic in seamless communication, which focuses on the preservation of semantics and speaker vocal style in translated speech. Early works synthesized speaker-style-aligned speech in order to directly learn the mapping from speech to target speech spectrogram. Without reliance on style-aligned data, recent studies leverage the advances of language modeling (LM) and build cascaded LMs on semantic and acoustic tokens. This work proposes SeamlessExpressiveLM, a single speech language model for expressive S2ST. We decompose the complex source-to-target speech mapping into intermediate generation steps with chain-of-thought prompting. The model is first guided to translate target semantic content and then transfer the speaker style to multi-stream acoustic units. Evaluated on Spanish-to-English and Hungarian-to-English translations, SeamlessExpressiveLM outperforms cascaded LMs in both semantic quality and style transfer while achieving better parameter efficiency.


Research Topics

Data Spaces

Establishing a data exchange infrastructure in accordance with pertinent industry standards, with a focus on enabling data sovereignty, fostering the digital economy, promoting interoperability, and building trust.

Information Extraction and Ontology Learning

This line of work looks at the task of extracting various information from textual input. By identifying entities and the relations between them, it is possible to derive a structured ontology representation from these unstructured documents.

Knowledge Infused Information Retrieval 3

Knowledge Graphs and (Generative) Language Models

Our objective is to integrate external taxonomies, ontologies, or knowledge graphs into language models to develop hybrid architectures that exploit the strengths of both technologies, effectively addressing their limitations for complex tasks.

SWeMLS Description Framework

The research seeks to create a structured framework for documenting and visualising the essential elements of “SWeML” systems, i.e., hybrid AI systems that incorporate both Semantic Web Technologies and Machine Learning.

NLP Service Orchestration

We present our research on the development, deployment, and usage of collections of interdependent natural language processing services. We aim at the best quality of processing and robustness of the deployed services.

Knowledge Graph enhanced Named Entity Recognition

In this area of research, we are investigating approaches to address the Named Entity Recognition (NER) task as well as how Knowledge Graphs can be used to improve upon and circumvent the shortcomings of existing models.

Word Sense Disambiguation, Target Sense Verification, and Entity Linking

This line of work looks into challenges related to recognizing words in a text as specific entities, understanding their types and linking them to a knowledge graph.

Visualizing hidden communities of interest: A case-study analysis of topic-based social networks in astrobiology

  • Published: 27 May 2024

  • Christophe Malaterre (ORCID: 0000-0003-1413-6710)
  • Francis Lareau (ORCID: 0000-0002-0352-5246)

Author networks in science often rely on citation analyses. In such cases, as in others, network interpretation usually depends on supplementary data, notably about authors’ research domains when disciplinary interpretations are sought. More general social networks also face similar interpretation challenges as to the semantic content specificities of their members. In this research-in-progress, we propose to infer author networks not from citation analyses but from topic similarity analyses based on a topic-model of published documents. Such author networks reveal, as we call them, “hidden communities of interest” (HCoIs) whose semantic content can easily be interpreted by means of their associated topics in the model. We use an astrobiology corpus of full-text articles ( N  = 3,698) to illustrate the approach. Having conducted an LDA topic-model on all publications, we identify the underlying communities of authors by measuring author correlations in terms of topic distributions. Adding publication dates makes it possible to examine HCoI evolution over time. This approach to social networks supplements traditional methods in contexts where textual data are available.
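The idea of linking authors by topic similarity rather than by citations can be illustrated with a minimal sketch. It is not the authors' pipeline: the averaging of document-topic vectors into author profiles, the use of Pearson correlation, the `threshold` value, and the toy data are all assumptions made for illustration. In practice the `doc_topics` matrix would come from a fitted LDA model.

```python
import numpy as np

def author_topic_profiles(doc_topics, doc_authors, n_authors):
    """Average the document-topic vectors of each author's documents."""
    profiles = np.zeros((n_authors, doc_topics.shape[1]))
    counts = np.zeros(n_authors)
    for d, authors in enumerate(doc_authors):
        for a in authors:
            profiles[a] += doc_topics[d]
            counts[a] += 1
    # Guard against division by zero for authors without documents.
    return profiles / np.maximum(counts, 1)[:, None]

def similarity_edges(profiles, threshold=0.7):
    """Link two authors when their topic profiles correlate at or above threshold."""
    corr = np.corrcoef(profiles)  # rows are authors
    n = len(profiles)
    return [(i, j, float(corr[i, j]))
            for i in range(n) for j in range(i + 1, n)
            if corr[i, j] >= threshold]

# Toy data (hypothetical): 4 documents over 3 topics, 3 authors.
doc_topics = np.array([[0.8, 0.1, 0.1],
                       [0.7, 0.2, 0.1],
                       [0.1, 0.1, 0.8],
                       [0.1, 0.2, 0.7]])
doc_authors = [[0, 1], [1], [2], [2]]  # author indices per document

profiles = author_topic_profiles(doc_topics, doc_authors, n_authors=3)
edges = similarity_edges(profiles)
print(edges)  # authors 0 and 1 share a dominant topic and end up linked
```

Communities could then be read off the resulting graph with any standard community-detection method, and restricting `doc_topics` to a date range would give one network per period.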

Abbreviations

  • Hidden communities of interest
  • Social network analysis
  • Latent Dirichlet analysis

Angelov, D. (2020). Top2Vec: Distributed representations of topics ( arXiv:2008.09470 ). http://arxiv.org/abs/2008.09470

Arbelaitz, O., Gurrutxaga, I., Muguerza, J., Pérez, J. M., & Perona, I. (2013). An extensive comparative study of cluster validity indices. Pattern Recognition , 46(1), 243–256. https://doi.org/10.1016/j.patcog.2012.07.021

Bastian, M., Heymann, S., & Jacomy, M. (2009). Gephi: An open source software for exploring and manipulating networks. International Conference on Weblogs and Social Media . https://doi.org/10.1609/icwsm.v3i1.13937

Beyer, K., Goldstein, J., Ramakrishnan, R., & Shaft, U. (1999). When is “nearest neighbor” meaningful? International Conference on Database Theory . https://doi.org/10.1007/3-540-49257-7_15

Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3 , 993–1022.

Boyack, K. W., Klavans, R., & Börner, K. (2005). Mapping the backbone of science. Scientometrics, 64 (3), 351–374. https://doi.org/10.1007/s11192-005-0255-6

Boyd-Graber, J. L., Hu, Y., & Mimno, D. (2017). Applications of topic models. Foundations and Trends in Information Retrieval . https://doi.org/10.1561/1500000030

Carley, K. (1993). Coding choices for textual analysis: A comparison of content analysis and map analysis. Sociological Methodology, 23 , 75–126. https://doi.org/10.2307/271007

Castelblanco, G., Guevara, J., Mesa, H., & Sanchez, A. (2021). Semantic network analysis of literature on public-private partnerships. Journal of Construction Engineering and Management, 147 (5), 04021033. https://doi.org/10.1061/(ASCE)CO.1943-7862.0002041

Christensen, A. P., & Kenett, Y. N. (2023). Semantic network analysis (SemNA): A tutorial on preprocessing, estimating, and analyzing semantic networks. Psychological Methods, 28 (4), 860–879. https://doi.org/10.1037/met0000463

Crane, D. (1969). Social structure in a group of scientists: A test of the “invisible college” hypothesis. American Sociological Review, 34 (3), 335. https://doi.org/10.2307/2092499

Danowski, J. A. (1993). Network analysis of message content. In W. D. Richards & G. A. Barnett (Eds.), Progress in communication sciences. Ablex Publishing Corporation.

Danowski, J. A. (2011). Counterterrorism mining for individuals semantically-similar to watchlist members. In U. K. Wiil (Ed.), Counterterrorism and open source intelligence. Springer.

Danowski, J. A., & Cepela, N. (2010). Automatic mapping of social networks of actors from text corpora: Time series analysis. In N. Memon, J. J. Xu, D. L. Hicks, & H. Chen (Eds.), Data mining for social network data. Springer.

Danowski, J. A., Van Klyton, A., Tavera-Mesías, J. F., Duque, K., Radwan, A., & Rutabayiro-Ngoga, S. (2023). Policy semantic networks associated with ICT utilization in Africa. Social Network Analysis and Mining, 13 (1), 73. https://doi.org/10.1007/s13278-023-01068-x

de Vries, E., Schoonvelde, M., & Schumacher, G. (2018). No longer lost in translation: Evidence that Google translate works for comparative bag-of-words text applications. Political Analysis, 26 (4), 417–430. https://doi.org/10.1017/pan.2018.26

Dick, S. J., & Strick, J. E. (2004). The living universe: NASA and the development of astrobiology . Rutgers University Press.

Diesner, J., & Carley, K. M. (2004). Using network text analysis to detect the organizational structure of covert networks. In Proceedings of the North American Association for Computational Social and Organizational Science (NAACSOS) Conference (Vol. 3). Pittsburgh: NAACSOS.

DiMaggio, P., Nag, M., & Blei, D. (2013). Exploiting affinities between topic modeling and the sociological perspective on culture: Application to newspaper coverage of U.S. government arts funding. Poetics, 41 (6), 570–606. https://doi.org/10.1016/j.poetic.2013.08.004

Doerfel, M. L., & Barnett, G. A. (1999). A semantic network analysis of the international communication association. Human Communication Research, 25 (4), 589–603. https://doi.org/10.1111/j.1468-2958.1999.tb00463.x

Field, A. P. (2009). Discovering statistics using SPSS: And sex, drugs and rock “n” roll . SAGE Publications.

Firth, J. R. (1957). A synopsis of linguistic theory 1930–1955. In J. R. Firth (Ed.), Studies in linguistic analysis (pp. 1–32). Blackwell.

Fortunato, S., Bergstrom, C. T., Börner, K., Evans, J. A., Helbing, D., Milojević, S., Petersen, A. M., Radicchi, F., Sinatra, R., Uzzi, B., Vespignani, A., Waltman, L., Wang, D., & Barabási, A.-L. (2018). Science of science. Science . https://doi.org/10.1126/science.aao0185

Griffiths, T. L., & Steyvers, M. (2004). Finding scientific topics. Proceedings of the National Academy of Sciences, 101 , 5228–5235. https://doi.org/10.1073/pnas.0307752101

Harris, Z. S. (1954). Distributional structure. Word, 10 (2–3), 146–162. https://doi.org/10.1080/00437956.1954.11659520

Horneck, G., Walter, N., Westall, F., Grenfell, J. L., Martin, W. F., Gomez, F., Leuko, S., Lee, N., Onofri, S., Tsiganis, K., Saladino, R., Pilat-Lohinger, E., Palomba, E., Harrison, J., Rull, F., Muller, C., Strazzulla, G., Brucato, J. R., Rettberg, P., & Capria, M. T. (2016). AstRoMap European astrobiology roadmap. Astrobiology, 16 (3), 201–243. https://doi.org/10.1089/ast.2015.1441

Kherwa, P., & Bansal, P. (2020). Topic modeling: A comprehensive review. EAI Endorsed Transactions on Scalable Information Systems . https://doi.org/10.4108/eai.13-7-2018.159623

Malaterre, C., & Lareau, F. (2022). The early days of contemporary philosophy of science: Novel insights from machine translation and topic-modeling of non-parallel multilingual corpora. Synthese, 200 (3), 242. https://doi.org/10.1007/s11229-022-03722-x

Malaterre, C., & Lareau, F. (2023). The emergence of astrobiology: A topic-modeling perspective. Astrobiology, 23 (5), 496–512. https://doi.org/10.1089/ast.2022.0122

Des Marais, D. J., Allamandola, L. J., Benner, S. A., Boss, A. P., Deamer, D., Falkowski, P. G., Farmer, J. D., Hedges, S. B., Jakosky, B. M., Knoll, A. H., Liskowsky, D. R., Meadows, V. S., Meyer, M. A., Pilcher, C. B., Nealson, K. H., Spormann, A. M., Trent, J. D., Turner, W. W., Woolf, N. J., & Yorke, H. W. (2003). The NASA astrobiology roadmap. Astrobiology, 3 (2), 219–235. https://doi.org/10.1089/153110703769016299

McCallum, A., Wang, X., & Corrada-Emmanuel, A. (2007). Topic and role discovery in social networks with experiments on Enron and academic email. Journal of Artificial Intelligence Research, 30 , 249–272. https://doi.org/10.1613/jair.2229

Röder, M., Both, A., & Hinneburg, A. (2015). Exploring the space of topic coherence measures. Proceedings of the Eighth ACM International Conference on Web Search and Data Mining-WSDM ’15 . https://doi.org/10.1145/2684822.2685324

Schmid, H. (1994). Probabilistic part-of-speech tagging using decision trees. Proceedings of International Conference on New Methods in Language Processing , 44–49.

Segev, E. (2021). Semantic network analysis in social sciences . Routledge.

Siew, C. S. Q., Wulff, D. U., Beckage, N. M., & Kenett, Y. N. (2019). Cognitive network science: A review of research on cognition through the lens of network representations, processes, and dynamics. Complexity, 2019 , e2108423. https://doi.org/10.1155/2019/2108423

Steyvers, M., Smyth, P., Rosen-Zvi, M., & Griffiths, T. (2004). Probabilistic author-topic models for information discovery. Proceedings of the 2004 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining-KDD ’04 . https://doi.org/10.1145/1014052.1014087

Ye, F., Chen, C., & Zheng, Z. (2018). Deep autoencoder-like nonnegative matrix factorization for community detection. Proceedings of the 27th ACM International Conference on Information and Knowledge Management , 1393–1402.

Zhang, H., Qiu, B., Giles, C. L., Foley, H. C., & Yen, J. (2007). An LDA-based community structure discovery approach for large-scale social networks. 2007 IEEE Intelligence and Security Informatics , 200–207

Zhao, W., Chen, J. J., Perkins, R., Liu, Z., Ge, W., Ding, Y., & Zou, W. (2015). A heuristic approach to determine an appropriate number of topics in topic modeling. BMC Bioinformatics . https://doi.org/10.1186/1471-2105-16-S13-S8

Acknowledgements

C.M. acknowledges funding from Canada Social Sciences and Humanities Research Council (Grant 430-2018-00899) and Canada Research Chairs (CRC-950-230795). F.L. acknowledges funding from the Canada Social Sciences and Humanities Research Council (756-2024-0557) and the Canada Research Chair in Philosophy of the Life Sciences at UQAM. The authors thank the audience of ISSI 2023 for most helpful comments on an earlier version of this paper published in the conference proceedings as: Malaterre, C., & Lareau, F. (2023). Visualizing hidden communities of interest: A preliminary analysis of topic-based social networks in astrobiology. Proceedings of ISSI 2023 . The 19th Conference of the International Society for Scientometrics and Informetrics, Bloomington, IN.

Author information

Authors and affiliations

Département de philosophie, Université du Québec à Montréal (UQAM), Montréal, Canada

Christophe Malaterre

Centre interuniversitaire de recherche sur la science et la technologie, Université du Québec à Montréal (UQAM), Montréal, Canada

Département d’informatique, Université du Québec à Montréal (UQAM), Montréal, Canada

Francis Lareau

Contributions

Conceptualization: CM, FL; Data curation: FL; Formal analysis and investigation: CM, FL; Funding acquisition: CM; Investigation: CM, FL; Methodology: CM, FL; Project administration: CM; Resources: CM; Software: FL; Supervision: CM; Validation: CM, FL; Visualization: CM; Writing – original draft preparation: CM; Writing – review and editing: CM, FL. Both authors approved the final submitted manuscript.

Corresponding author

Correspondence to Christophe Malaterre .

Ethics declarations

Conflict of interest.

The authors declare no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Malaterre, C., Lareau, F. Visualizing hidden communities of interest: A case-study analysis of topic-based social networks in astrobiology. Scientometrics (2024). https://doi.org/10.1007/s11192-024-05047-7

Received : 17 November 2023

Accepted : 29 April 2024

Published : 27 May 2024

DOI : https://doi.org/10.1007/s11192-024-05047-7

  • Hidden colleges
  • Social networks
  • Semantic networks
  • Topic-modeling
  • Philosophy of science

ORIGINAL RESEARCH article

This article is part of the research topic.

Advances in Robot Learning-from-Demonstration for Smart Manufacturing Applications

Semantic Learning from Keyframe Demonstration using Object Attribute Constraints (Provisionally Accepted)

  • 1 Eindhoven University of Technology, Netherlands

The final, formatted version of the article will be published soon.

Learning from demonstration is an approach that allows users to personalize a robot's tasks. While demonstrations often focus on conveying the robot's motion or task plans, they can also communicate user intentions through object attributes in manipulation tasks. For instance, users might want to teach a robot to sort fruits and vegetables into separate boxes or to place cups next to plates of matching colors. This paper introduces a novel method that enables robots to learn the semantics of user demonstrations, with a particular emphasis on the relationships between object attributes. In our approach, users demonstrate essential task steps by manually guiding the robot through the necessary sequence of poses. We reduce the amount of data by utilizing only robot poses instead of trajectories, allowing us to focus on the task's goals, specifically the objects related to these goals. At each step, known as a keyframe, we record the end-effector pose, object poses, and object attributes. However, the number of keyframes saved in each demonstration can vary due to the user's decisions. This variability in each demonstration can lead to inconsistencies in the significance of keyframes, complicating keyframe alignment to generalize the robot's motion and the user's intention. Our method addresses this issue by focusing on teaching the higher-level goals of the task using only the required keyframes and relevant objects. It aims to teach the rationale behind object selection for a task and generalize this reasoning to environments with previously unseen objects. We validate our proposed method by conducting three manipulation tasks aiming at different object attribute constraints. In the reproduction phase, we demonstrate that even when the robot encounters previously unseen objects, it can generalize the user's intention and execute the task.
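The keyframe record described in the abstract (end-effector pose, object poses, object attributes) can be pictured with a small sketch. This is not the authors' method: the class names, the position-only poses, and the simple "attributes that match in every demonstration" rule are hypothetical choices meant only to show how an attribute constraint such as "cup and plate share a color" might be recovered from a few demonstrations.

```python
from dataclasses import dataclass

@dataclass
class SceneObject:
    name: str
    pose: tuple        # (x, y, z) position; orientation omitted for brevity
    attributes: dict   # e.g. {"color": "red", "category": "cup"}

@dataclass
class Keyframe:
    end_effector_pose: tuple
    objects: list      # SceneObject instances relevant at this step

def shared_attribute_constraints(demonstrations):
    """Return attribute names on which the first two objects of each
    demonstration's final keyframe agree in every demonstration."""
    constraints = None
    for keyframes in demonstrations:
        moved, target = keyframes[-1].objects[:2]
        agree = {k for k, v in moved.attributes.items()
                 if target.attributes.get(k) == v}
        constraints = agree if constraints is None else constraints & agree
    return constraints or set()

def make_demo(color):
    """Build a one-keyframe toy demonstration: a cup placed next to a plate."""
    cup = SceneObject("cup", (0.40, 0.10, 0.02),
                      {"color": color, "category": "cup"})
    plate = SceneObject("plate", (0.50, 0.10, 0.00),
                        {"color": color, "category": "plate"})
    return [Keyframe(end_effector_pose=(0.40, 0.10, 0.15), objects=[cup, plate])]

demos = [make_demo("red"), make_demo("blue")]
constraints = shared_attribute_constraints(demos)
print(constraints)  # {'color'}
```

Because the rule is stated over attribute names rather than specific values, it would also apply to a green cup and a green plate that never appeared in the demonstrations.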

Keywords: Learning from demonstration, Keyframe demonstrations, Object attributes, task goal learning, semantic learning

Received: 17 Nov 2023; Accepted: 03 Jun 2024.

Copyright: © 2024 Sen, Elfring, Torta and van de Molengraft. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Mx. Busra Sen, Eindhoven University of Technology, Eindhoven, Netherlands

Foundations of Computation at Sheffield (FOX)

Our main research theme concerns the mathematical foundations of computer science. The topics we are interested in include algorithms, computational complexity and combinatorics, logical methods, program semantics, hardware and software verification and interactive theorem proving.

Research topics in the FOX group range from the theoretical mathematical foundations that underpin computer science to their applications in real world contexts. FOX is one of the largest and most diverse research groups of its type in the UK.

Research areas  

Graph and approximation algorithms

Andreas Emil Feldmann, Sagnik Mukhopadhyay, Joachim Spoerhase

Computational and logical complexity

Sagnik Mukhopadhyay, Joachim Spoerhase, Navid Talebanfard, Jonni Virtema, Maksim Zhukovskii

Combinatorics and combinatorial optimisation

Andreas Emil Feldmann, Pietro Oliveto, Joachim Spoerhase, Maksim Zhukovskii

Logic and finite model theory

Mike Stannett, Jonni Virtema, Maksim Zhukovskii

Program correctness and verification

Harsh Beohar, Kirill Bogdanov, John Derrick, Rob Hierons, Andrei Popescu, Georg Struth, Jonni Virtema, Charles Grellois

Semantics and applied category theory

Harsh Beohar, Andrei Popescu, Georg Struth, Charles Grellois

Interactive theorem proving

Andrei Popescu, Mike Stannett, Georg Struth

Research: Why Companies Should Disclose Their Lack of Progress on DEI

  • Evan Apfelbaum

Stakeholders value transparency and accountability — even when you’re falling short.

Many companies have set goals to increase employee diversity, and many companies have fallen short of meeting their goals. Most leaders would likely prefer to keep this lack of progress quiet, but research shows that there may be benefits to being transparent about it. Specifically, this type of disclosure can signal that you take diversity seriously and are genuinely committed to the goals you’ve set for your organization. That said, taking too long to make progress can dampen any goodwill you might receive from disclosure.

In the aftermath of George Floyd’s murder and the national reckoning around racial injustice in 2020, many companies redoubled their commitment to increase the diversity of their workforce. New practices and policies were introduced to help reach diversity goals set by leadership, and for quite a few, this commitment was broadcast widely: centered in a CEO speech, a press release, a company town hall, on social media, or in internal messages to employees.

  • Evan Apfelbaum is a social psychologist and associate professor at BU’s Questrom School of Business. His research leverages behavioral science to reveal the challenges and potential of diversity and social change.
  • Eileen Suh is a postdoctoral scholar at the Kellogg School of Management, Northwestern University. Her research focuses on understanding why organizations struggle to achieve diversity and inclusion goals and identifying effective strategies to improve these efforts.

Nick Wigginton joins Johns Hopkins as associate vice provost for research

He will oversee the university's prestigious Bloomberg Distinguished Professorships program and lead other initiatives designed to support and elevate the research enterprise.

By Hub staff report

Johns Hopkins University has selected accomplished higher education administrator Nick Wigginton to serve as associate vice provost for research to oversee several of the university's nationally recognized research initiatives, including the Bloomberg Distinguished Professorships (BDP) program. He will assume this role on July 8.

In addition to overseeing the BDP program, Wigginton will lead the Research Development Team, internal research funding awards programs, research communications, and research data and analytics. He will also help develop future research initiatives.

"Nick's experiences and values directly align with the priorities of the research enterprise at Johns Hopkins, and we look forward to the continued evolution of our programs to best serve our research community under his leadership," says Denis Wirtz, vice provost for research.

Wigginton joins Johns Hopkins from the University of Michigan, where he most recently served as its associate vice president for research–strategic initiatives. In that role, he developed and implemented programs and policies, and built new teams to increase the competitiveness of Michigan's research enterprise. Wigginton also spent the 2019-2020 academic year at Stanford University working closely with its research office as part of the prestigious American Council on Education Fellows Program.

In discussing his decision to come to Johns Hopkins, Wigginton says he looks forward to building upon the strengths of the Johns Hopkins research community, and to being part of the BDP program, which is the largest program of its kind in the nation.

"It's incredibly inspiring to be joining the nation's first research university," Wigginton says. "Johns Hopkins has a stellar reputation for discovery and impact. This feels like a place where I can learn a lot, and I'm eager to build new initiatives and support existing ones such as the BDP program. It's a crown jewel in terms of faculty recruitment programs, and I am extremely excited to be able to help foster an environment where the world's most accomplished scholars come together to tackle society's most pressing challenges."

Wigginton graduated with a bachelor's degree in geology from Michigan State University and received a PhD in geosciences from Virginia Tech. He was a postdoctoral scholar at École Polytechnique Fédérale de Lausanne in Switzerland and a visiting researcher at Pacific Northwest National Laboratory in Richland, Washington. Before joining Michigan, Wigginton was a senior editor at Science, where he oversaw the selection and publication of manuscripts across a wide range of scientific disciplines.

The State of the American Middle Class

Who is in it and key trends from 1970 to 2023

This report examines key changes in the economic status of the American middle class from 1970 to 2023 and its demographic attributes in 2022. The historical analysis is based on U.S. Census Bureau data from the Annual Social and Economic Supplements (ASEC) of the Current Population Survey (CPS). The demographic analysis is based on data from the American Community Survey (ACS). The data is sourced from IPUMS CPS and IPUMS USA, respectively.

The CPS, a survey of about 60,000 households, is the U.S. government’s official source for monthly estimates of unemployment. The CPS ASEC, conducted in March each year, is the official source of U.S. government estimates of income and poverty. Our analysis of CPS data starts with the 1971 CPS ASEC, which records the incomes of households in 1970. It is also the first year for which data on race and ethnicity is available. The latest available CPS ASEC file is for 2023, which reports on household incomes in 2022.

The public-use version of the ACS is a 1% sample of the U.S. population, or more than 3 million people. This allows for a detailed study of the demographic characteristics of the middle class, including its status in U.S. metropolitan areas. But ACS data is available only from 2005 onward and is less suitable for long-term historical analyses. The latest available ACS data is for 2022.

Middle-income households are defined as those with an income that is two-thirds to double that of the U.S. median household income, after incomes have been adjusted for household size. Lower-income households have incomes less than two-thirds of the median, and upper-income households have incomes that are more than double the median. When using American Community Survey (ACS) data, incomes are also adjusted for cost of living in the areas in which households are located.
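The tier definition above can be sketched as a small function. This is only an illustration: the function name and arguments are hypothetical, and it assumes the income passed in has already been adjusted for household size (and, for ACS data, cost of living).

```python
def income_tier(adjusted_income: float, median_income: float) -> str:
    """Classify a size-adjusted household income against the national median.

    Middle income: two-thirds to double the median (inclusive).
    Lower income: less than two-thirds of the median.
    Upper income: more than double the median.
    """
    lower_bound = median_income * 2 / 3
    upper_bound = median_income * 2
    if adjusted_income < lower_bound:
        return "lower"
    if adjusted_income > upper_bound:
        return "upper"
    return "middle"
```

With a hypothetical median of $90,000, the middle tier would run from $60,000 to $180,000, so `income_tier(75_000, 90_000)` falls in the middle tier.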

Estimates of household income are scaled to reflect a household size of three and expressed in 2023 dollars. In the Current Population Survey (CPS), household income refers to the calendar year prior to the survey year. Thus, the income data in the report refers to the 1970-2022 period, and the share of Americans in each income tier from the CPS refers to the 1971-2023 period.
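The text does not spell out how incomes are scaled to a household size of three. One common approach is a square-root equivalence scale, sketched below. The scale itself and the function name are assumptions made for illustration, not something this section confirms; the methodology is the authority on the actual procedure.

```python
import math

REFERENCE_SIZE = 3  # incomes in the report are expressed for a three-person household


def scale_to_three_person(household_income: float, household_size: int) -> float:
    """Scale income to a three-person equivalent.

    ASSUMPTION: uses a square-root equivalence scale, i.e. income is divided by
    sqrt(household_size) and multiplied by sqrt(3). The report section does not
    state which scale is used.
    """
    if household_size < 1:
        raise ValueError("household_size must be at least 1")
    return household_income * math.sqrt(REFERENCE_SIZE / household_size)
```

Under this assumed scale, a three-person household's income is unchanged, a one-person household's income is scaled up, and a larger household's income is scaled down.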

The demographic attributes of Americans living in lower-, middle- or upper-income tiers are derived from ACS data. Except as noted, estimates pertain to the U.S. household population, excluding people living in group quarters.

The terms middle class and middle income are used interchangeably in this report.

White, Black, Asian, American Indian or Alaska Native, and Native Hawaiian or Pacific Islander include people who identified with a single major racial group and who are not Hispanic. Multiracial includes people who identified with more than one major racial group and are not Hispanic. Hispanics are of any race.

U.S. born refers to individuals who are U.S. citizens at birth, including people born in the 50 U.S. states, the District of Columbia, Puerto Rico or other U.S. territories, as well as those born elsewhere to at least one parent who is a U.S. citizen. The terms foreign born and immigrant are used interchangeably in this report. They refer to people who are not U.S. citizens at birth.

Occupations describe the broad kinds of work people do on their job. For example, health care occupations include doctors, nurses, pharmacists and others who are directly engaged in the provision of health care. Industries describe the broad type of products companies produce. Each industry encompasses a variety of occupations. For example, the health care and social assistance industry provides services that are produced by a combination of doctors, managers, technology and administrative staff, food preparation workers, and workers in other occupations.

The share of Americans who are in the middle class is smaller than it used to be. In 1971, 61% of Americans lived in middle-class households. By 2023, the share had fallen to 51%, according to a new Pew Research Center analysis of government data.

A bar chart showing that the share of Americans in the middle class has fallen since 1971

As a result, Americans are further apart financially than before. From 1971 to 2023, the share of Americans who live in lower-income households increased from 27% to 30%, and the share in upper-income households increased from 11% to 19%.

Notably, the increase in the share who are upper income was greater than the increase in the share who are lower income. In that sense, these changes are also a sign of economic progress overall.
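The comparison can be checked with the rounded shares quoted above. The snippet below is a minimal sketch with illustrative variable names. The 1971 shares sum to 99% rather than 100% because the published figures are rounded.

```python
# Shares of Americans in each income tier, in percent, as quoted in the text.
shares_1971 = {"lower": 27, "middle": 61, "upper": 11}
shares_2023 = {"lower": 30, "middle": 51, "upper": 19}

# Change in percentage points for each tier.
change = {tier: shares_2023[tier] - shares_1971[tier] for tier in shares_1971}

for tier, points in change.items():
    print(f"{tier}: {points:+d} percentage points")
```

The upper-income share grew by 8 percentage points, compared with 3 points for the lower-income share, while the middle-class share fell by 10 points.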

But the middle class has fallen behind on two key counts. The growth in income for the middle class since 1970 has not kept pace with the growth in income for the upper-income tier. And the share of total U.S. household income held by the middle class has plunged.

Moreover, many groups still lag in their presence in the middle- and upper-income tiers. For instance, American Indians or Alaska Natives, Black and Hispanic Americans, and people who are not married are more likely than average to be in the lower-income tier. Several metro areas in the U.S. Southwest also have high shares of residents who are in the lower-income tier, after adjusting for differences in cost of living across areas.


Our report focuses on the current state of the American middle class. First, we examine changes in the financial well-being of the middle class and other income tiers since 1970. This is based on data from the Annual Social and Economic Supplements (ASEC) of the Current Population Survey (CPS), conducted from 1971 to 2023.

Then, we report on the attributes of people who were more or less likely to be middle class in 2022. Our focus is on their race and ethnicity, age, gender, marital and veteran status, place of birth, ancestry, education, occupation, industry, and metropolitan area of residence. These estimates are derived from American Community Survey (ACS) data and differ slightly from the CPS-based estimates. In part, that is because incomes can be adjusted for the local area cost of living only with the ACS data. (Refer to the methodology for details on these two data sources.)

This analysis and an accompanying report on the Asian American middle class are part of a series on the status of America’s racial and ethnic groups in the U.S. middle class and other income tiers. Forthcoming analyses will focus on White, Black, Hispanic, American Indian or Alaska Native, Native Hawaiian or Pacific Islander and multiracial Americans, including subgroups within these populations. These reports are, in part, updates of previous work by the Center. But they offer much greater detail on the demographic attributes of the American middle class.

Following are some key facts about the state of the American middle class:

In our analysis, “middle-income” Americans are those living in households with an annual income that is two-thirds to double the national median household income. The income it takes to be middle income varies by household size, with smaller households requiring less to support the same lifestyle as larger households. It also varies by the local cost of living, with households in a more expensive area, such as Honolulu, needing a higher income than those in a less expensive area, such as Wichita, Kansas.

We don’t always know the area in which a household is located. In our two data sources – the Current Population Survey, Annual Social and Economic Supplement (CPS ASEC) and the American Community Survey (ACS) – only the latter provides that information, specifically the metropolitan area of a household. Thus, we aren’t able to adjust for the local cost of living when using the CPS to track changes in the status of the middle class over time. But we do adjust for the metropolitan area cost of living when using the ACS to determine the demographic attributes of the middle class in 2022.

In the 2023 CPS ASEC data, which reports income for 2022, middle-income households with three people have incomes ranging from about $61,000 to $183,000 annually. “Lower-income” households have incomes less than $61,000, and “upper-income” households have incomes greater than $183,000.
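These thresholds follow from the two-thirds-to-double rule. As a rough check, the sketch below back-calculates the median implied by the rounded upper threshold and recomputes both bounds. The implied median of about $91,500 is derived here from the rounded figures and is not a number reported in the text; the function name is illustrative.

```python
def tier_bounds(median_income: float) -> tuple[float, float]:
    """Return the (lower, upper) bounds of the middle-income tier."""
    return median_income * 2 / 3, median_income * 2


# Back out the median implied by the rounded upper threshold of $183,000.
implied_median = 183_000 / 2

low, high = tier_bounds(implied_median)
print(f"implied median: ${implied_median:,.0f}")
print(f"middle-income range: ${low:,.0f} to ${high:,.0f}")
```

Both rounded thresholds are consistent with the same implied median: two-thirds of $91,500 is $61,000 and double it is $183,000.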

In the 2022 ACS data, middle-income households with three people have incomes ranging from about $62,000 to $187,000 annually, with incomes also adjusted for the local area cost of living. (Incomes are expressed in 2023 dollars.)

The boundaries of the income tiers also vary across years as the national median income changes.

The terms “middle income” and “middle class” are used interchangeably in this report for the sake of exposition. But being middle class can refer to more than just income, be it education level, type of profession, economic security, home ownership or social and political values. Class also could simply be a matter of self-identification.

Households in all income tiers had much higher incomes in 2022 than in 1970, after adjusting for inflation. But the gains for middle- and lower-income households were less than the gains for upper-income households.

A bar chart showing that incomes of upper-income U.S. households increased the most of any income tier from 1970 to 2022

The median income of middle-class households increased from about $66,400 in 1970 to $106,100 in 2022, or 60%. Over this period, the median income of upper-income households increased 78%, from about $144,100 to $256,900. (Incomes are scaled to a three-person household and expressed in 2023 dollars.)

The median income of lower-income households grew more slowly than that of other households, increasing from about $22,800 in 1970 to $35,300 in 2022, or 55%.
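The growth rates quoted above can be reproduced from the rounded median incomes. This is a minimal sketch with illustrative names. Because the inputs are rounded to the nearest $100, the results only match the published percentages after rounding.

```python
def pct_change(start: float, end: float) -> float:
    """Percentage change from start to end."""
    return (end - start) / start * 100


# Median household incomes (1970, 2022) in 2023 dollars, as quoted in the text.
medians = {
    "lower": (22_800, 35_300),
    "middle": (66_400, 106_100),
    "upper": (144_100, 256_900),
}

growth = {tier: round(pct_change(start, end)) for tier, (start, end) in medians.items()}
print(growth)
```

The rounded results are 55% for the lower tier, 60% for the middle tier and 78% for the upper tier, in line with the figures in the text.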

Consequently, there is now a larger gap between the incomes of upper-income households and other households. In 2022, the median income of upper-income households was 7.3 times that of lower-income households, up from 6.3 in 1970. It was 2.4 times the median income of middle-income households in 2022, up from 2.2 in 1970.
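The ratios in this paragraph follow directly from the rounded medians reported for each tier. A short sketch, using illustrative variable names:

```python
# Median household incomes in 2023 dollars, as quoted in the text.
medians_1970 = {"lower": 22_800, "middle": 66_400, "upper": 144_100}
medians_2022 = {"lower": 35_300, "middle": 106_100, "upper": 256_900}


def upper_ratio(medians: dict[str, int], tier: str) -> float:
    """Ratio of the upper-income median to another tier's median, to one decimal."""
    return round(medians["upper"] / medians[tier], 1)


print("upper vs. lower:", upper_ratio(medians_1970, "lower"), "->", upper_ratio(medians_2022, "lower"))
print("upper vs. middle:", upper_ratio(medians_1970, "middle"), "->", upper_ratio(medians_2022, "middle"))
```

The upper-to-lower ratio rises from 6.3 to 7.3 and the upper-to-middle ratio from 2.2 to 2.4.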

The share of total U.S. household income held by the middle class has fallen almost without fail in each decade since 1970. In that year, middle-income households accounted for 62% of the aggregate income of all U.S. households, about the same as the share of people who lived in middle-class households.

A line chart showing that the share of total U.S. household income held by the middle class has plunged since 1970

By 2022, the middle-class share in overall household income had fallen to 43%, less than the share of the population in middle-class households (51%). Not only does a smaller share of people live in the middle class today, but the incomes of middle-class households have also not risen as quickly as the incomes of upper-income households.

Over the same period, the share of total U.S. household income held by upper-income households increased from 29% in 1970 to 48% in 2022. In part, this is because of the increase in the share of people who are in the upper-income tier.

The share of overall income held by lower-income households edged down from 10% in 1970 to 8% in 2022. This happened even though the share of people living in lower-income households increased over this period.
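One way to read these figures together is to divide each tier's share of aggregate income by its share of the population. A ratio above 1 means the tier holds more income than its head count alone would suggest. This ratio is an illustrative measure, not one the report itself computes. The population shares are the 1971 and 2023 CPS figures quoted earlier in the report, which correspond to incomes in 1970 and 2022.

```python
# Shares in percent, as quoted in the text (rounded, so they may not sum to 100).
income_share = {
    1970: {"lower": 10, "middle": 62, "upper": 29},
    2022: {"lower": 8, "middle": 43, "upper": 48},
}
population_share = {
    1970: {"lower": 27, "middle": 61, "upper": 11},
    2022: {"lower": 30, "middle": 51, "upper": 19},
}

# Income share relative to population share, per tier and year.
relative = {
    year: {
        tier: round(income_share[year][tier] / population_share[year][tier], 2)
        for tier in income_share[year]
    }
    for year in income_share
}

for year, ratios in relative.items():
    print(year, ratios)
```

For the middle class the ratio drops from about 1.02 to 0.84, and for the lower tier from about 0.37 to 0.27. For the upper tier it stays near 2.5 to 2.6, which is consistent with the text's point that part of the rise in that tier's income share reflects more people being in it.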

The share of people in the U.S. middle class varied from 46% to 55% across racial and ethnic groups in 2022. Black and Hispanic Americans, Native Hawaiians or Pacific Islanders, and American Indians or Alaska Natives were more likely than others to be in lower-income households.

A bar chart showing Black, Hispanic, Native Hawaiian/Pacific Islander and American Indian/Alaska Native people are more likely than others to live in lower-income U.S. households

In 2022, 39% to 47% of Americans in these four groups lived in lower-income households. In contrast, only 24% of White and Asian Americans and 31% of multiracial Americans were in the lower-income tier.

At the other end of the economic spectrum, 27% of Asian and 21% of White Americans lived in upper-income households in 2022, compared with about 10% or less of Black and Hispanic Americans, Native Hawaiians or Pacific Islanders, and American Indians or Alaska Natives.

Not surprisingly, lower-income status is correlated with the likelihood of living in poverty. According to the Census Bureau, the poverty rate among Black (17.1%) and Hispanic (16.9%) Americans and American Indians or Alaska Natives (25%) was greater than the rate among White and Asian Americans (8.6% for each). (The Census Bureau did not report the poverty rate for Native Hawaiians or Pacific Islanders.)

A bar chart showing that nearly 4 in 10 U.S. children lived in lower-income households in 2022 and about half were in the middle class

Children and adults 65 and older were more likely to live in lower-income households in 2022. Adults in the peak of their working years – ages 30 to 64 – were more likely to be upper income. In 2022, 38% of children (including teens) and 35% of adults 65 and older were lower income, compared with 26% of adults ages 30 to 44 and 23% of adults 45 to 64.

The share of people living in upper-income households ranged from 13% among children and young adults (up to age 29) to 24% among those 45 to 64. In each age group, about half or a little more were middle class in 2022.

Men were slightly more likely than women to live in middle-income households in 2022, 53% vs. 51%. Their share in upper-income households (18%) was also somewhat greater than the share of women (16%) in upper-income households.

A bar chart showing that men, veterans and married Americans were more likely than their counterparts to live in middle- or upper-income households in 2022

Marriage appears to boost the economic status of Americans. Among those who were married in 2022, eight-in-ten lived either in middle-income households (56%) or upper-income households (24%). In contrast, only about six-in-ten of those who were separated, divorced, widowed or never married were either middle class or upper income, while 37% lived in lower-income households.

Veterans were more likely than nonveterans to be middle income in 2022, 57% vs. 53%. Conversely, a higher share of nonveterans (29%) than veterans (24%) lived in lower-income households.

A bar chart showing that immigrants were more likely than the U.S. born to be lower income in 2022; people born in Asia, Europe or Oceania were most likely to be upper income

Immigrants – about 14% of the U.S. population in 2022 – were less likely than the U.S. born to be in the middle class and more likely to live in lower-income households. In 2022, more than a third of immigrants (36%) lived in lower-income households, compared with 29% of the U.S. born. Immigrants also trailed the U.S. born in the shares who were in the middle class, 48% vs. 53%.

There are large gaps in the economic status of American residents by their region of birth. Among people born in Asia, Europe or Oceania, 25% lived in upper-income households in 2022. People from these regions represented 7% of the U.S. population.

By comparison, only 14% of people born in Africa or South America and 6% of those born in Central America and the Caribbean were in the upper-income tier in 2022. Together they accounted for 8% of the U.S. population.

The likelihood of being in the middle class or the upper-income tier varies considerably with the ancestry of Americans. In 2022, Americans reporting South Asian ancestry were about as likely to be upper income (38%) as they were to be middle income (42%). Only 20% of Americans of South Asian origin lived in lower-income households. South Asians accounted for about 2% of the U.S. population of known origin groups in 2022.

A bar chart showing that Americans of South Asian origin are the most likely to be upper income; those of Hispanic origins are the least likely

At least in the share who were lower income, Americans with Soviet, Eastern European, other Asian or Western European origins roughly matched those of South Asian origin. These groups represented the majority (54%) of the population of Americans whose ancestry was known in 2022.

On the other hand, only 7% of Americans with Central and South American or other Hispanic ancestry were in the upper-income tier, and 44% were lower income. The economic statuses of Americans with Caribbean, sub-Saharan African or North American ancestry were not very different from this.

Education matters for moving into the middle class and beyond, and so do jobs. Among Americans ages 25 and older in 2022, 52% of those with a bachelor’s degree or higher level of education lived in middle-class households and another 35% lived in upper-income households.

A bar chart showing that the share of Americans in the middle- or upper-income tier rises sharply with education and employment

In sharp contrast, 42% of Americans who did not graduate from high school were in the middle class, and only 5% were in the upper-income tier. Further, only 12% of college graduates were lower income, compared with 54% of those who did not complete high school.

Not surprisingly, having a job is strongly linked to movement from the lower-income tier to the middle- and upper-income tiers. Among employed American workers ages 16 and older, 58% were in the middle-income tier in 2022 and 23% were in the upper-income tier. Only 19% of employed workers were lower income, compared with 49% of unemployed Americans.

A bar chart showing that more than a third of U.S. workers in technology, management, and business and finance occupations were in the upper-income tier in 2022

In some occupations, about nine-in-ten U.S. workers are either in the middle class or in the upper-income tier, but in some other occupations almost four-in-ten workers are lower income. More than a third (36% to 39%) of workers in computer, science and engineering, management, and business and finance occupations lived in upper-income households in 2022. About half or more were in the middle class.

But many workers – about one-third or more – in construction, transportation, food preparation and serving, and personal care and other services were in the lower-income tier in 2022.

About six-in-ten workers or more in education; protective and building maintenance services; office and administrative support; the armed forces; and maintenance, repair and production were in the middle class.

A bar chart showing that about a third of U.S. workers in the information, financial and professional services sectors were in the upper-income tier in 2022

Depending on the industrial sector, anywhere from half to two-thirds of U.S. workers were in the middle class, and the shares who were upper income or lower income varied greatly.

About a third of workers in the finance, insurance and real estate, information, and professional services sectors were in the upper-income tier in 2022. Nearly nine-in-ten workers (87%) in public administration – largely filling legislative functions and providing federal, state or local government services – were either in the middle class or the upper-income tier.

But nearly four-in-ten workers (38%) in accommodation and food services were lower income in 2022, along with three-in-ten workers in the retail trade and other services sectors.

The share of Americans who are in the middle class or in the upper- or lower-income tier differs across U.S. metropolitan areas. But a pattern emerges when it comes to which metro areas have the highest shares of people living in lower-, middle- or upper-income households. (We first adjust household incomes for differences in the cost of living across areas.)

A bar chart showing the 10 U.S. metropolitan areas with the largest shares of residents in the middle class in 2022

The 10 metropolitan areas with the greatest shares of middle-income residents are small to midsize in population and are located mostly in the northern half of the U.S. About six-in-ten residents in these metro areas were in the middle class.

Several of these areas are in the so-called Rust Belt, namely, Wausau and Oshkosh-Neenah, both in Wisconsin; Grand Rapids-Wyoming, Michigan; and Lancaster, Pennsylvania. Two others – Dover and Olympia-Tumwater – include state capitals (Delaware and Washington, respectively).

In four of these areas – Bismarck, North Dakota; Ogden-Clearfield, Utah; Lancaster; and Wausau – the share of residents in the upper-income tier ranged from 18% to 20%, about on par with the share nationally.

A bar chart showing the 10 U.S. metropolitan areas with the largest shares of residents in the upper-income tier in 2022

The 10 U.S. metropolitan areas with the highest shares of residents in the upper-income tier are mostly large, coastal communities. Topping the list is San Jose-Sunnyvale-Santa Clara, California, a technology-driven economy, in which 40% of the population lived in upper-income households in 2022. Other tech-focused areas on this list include San Francisco-Oakland-Hayward; Seattle-Tacoma-Bellevue; and Raleigh, North Carolina.

Bridgeport-Stamford-Norwalk, Connecticut, is a financial hub. Several areas, including Washington, D.C.-Arlington-Alexandria and Boston-Cambridge-Newton, are home to major universities, leading research facilities and the government sector.

Notably, many of these metro areas also have sizable lower-income populations. For instance, about a quarter of the populations in Bridgeport-Stamford-Norwalk; Trenton, New Jersey; Boston-Cambridge-Newton; and Santa Cruz-Watsonville, California, were in the lower-income tier in 2022.

A bar chart showing the 10 U.S. metropolitan areas with the largest shares of residents in the lower-income tier in 2022

Most of the 10 U.S. metropolitan areas with the highest shares of residents in the lower-income tier are in the Southwest, either on the southern border of Texas or in California’s Central Valley. The shares of people living in lower-income households were largely similar across these areas, ranging from about 45% to 50%.

About 40% to 50% of residents in these metro areas were in the middle class, and only about one-in-ten or fewer lived in upper-income households.

Compared with the nation overall, the lower-income metro areas in Texas and California have disproportionately large Hispanic populations. The two metro areas in Louisiana – Monroe and Shreveport-Bossier City – have disproportionately large Black populations.

Note: For details on how this analysis was conducted, refer to the methodology.


ABOUT PEW RESEARCH CENTER Pew Research Center is a nonpartisan fact tank that informs the public about the issues, attitudes and trends shaping the world. It conducts public opinion polling, demographic research, media content analysis and other empirical social science research. Pew Research Center does not take policy positions. It is a subsidiary of The Pew Charitable Trusts.

© 2024 Pew Research Center


COMMENTS

  1. Semantics

    Semantics research is about how the meaning of a sentence is determined from its parts and the way the parts are put together. Semantics at Penn focuses on several new approaches to the field, including LTAG semantics and underspecification as well as the application of game theory. Florian Schwarz and Robin Clark lead Penn's research in formal ...

  2. Topics in Semantics and Pragmatics

    Topics in Semantics and Pragmatics. Topics in Semantics and Pragmatics. This course will provide a comprehensive overview of the empirical patterns, analytical challenges and broader theoretical issues surrounding a particular topic, such as information structure, presupposition, scalar implicature, binding, aspectual composition, nominal ...

  3. Key Topics in Semantics and Pragmatics

    About Key Topics in Semantics and Pragmatics. This new series focuses on the main topics of study in semantic and pragmatic theory today. It consists of accessible yet challenging accounts of the most important issues to consider when examining the semantics and pragmatics of natural languages. Some topics have been the subject of study for ...

  4. Semantics and Pragmatics

    The Stanford semantics and pragmatics community encompasses a broad range of interests including: Lexical semantics. Formal semantics and pragmatics, and their interfaces with syntax. Psycholinguistics. Numerous sub-areas of psychology, philosophy, and computer science. We share the goal of grounding theories of meaning in diverse research ...

  5. PDF Introducing Semantics

    Introducing Semantics Semantics is the study of meaning in language. This clear and ... each subject, presenting students with an overview of the main topics encountered in their course, and fea-tures a glossary of useful terms, chapter previews and summaries, suggestions for further reading, and help-

  6. Semantics

    The Routledge Handbook of Semantics provides a broad and state-of-the-art survey of this field, covering semantic research at both word and sentence level. It presents a synoptic view of the most important areas of semantic investigation, including contemporary methodologies and debates, and indicating possible future directions in the field.

  7. Lexical Semantics

    Lexical semantics is the study of word meaning. Descriptively speaking, the main topics studied within lexical semantics involve either the internal semantic structure of words, or the semantic relations that occur within the vocabulary. Within the first set, major phenomena include polysemy (in contrast with vagueness), metonymy, metaphor, and ...

  8. Semantics and Pragmatics

    Semantics and Pragmatics, founded in 2007 and first published in 2008, is a Diamond Open Access journal published by the Linguistic Society of America. Current Issue Vol. 17 (2024) Published: 2024-01-05 Main Articles. The semantics and probabilistic pragmatics of deadjectival intensifiers Rick Nouwen 2:EA ...

  9. 425008 PDFs

    Explore the latest full-text research PDFs, articles, conference papers, preprints and more on SEMANTICS. Find methods information, sources, references or conduct a literature review on SEMANTICS

  10. Semantics and Pragmatics

    Research topics, theoretical tools and languages considered are quite diverse. Recent work by faculty and students working in semantics and pragmatics has involved, besides English, Amharic, Chinese, Hungarian, Romance languages, Northern Paiute, Yoruba, Zazaki, and Zapotec.

  11. 66253 PDFs

    Explore the latest full-text research PDFs, articles, conference papers, preprints and more on LEXICAL SEMANTICS. Find methods information, sources, references or conduct a literature review on ...

  12. PDF Topics in Semantics

    The birth of dynamic semantics: File Change Potentials. Week 4 2/16 Harvard Chierchia "Standard" Dynamic Semantics of the 90's. Indefinites as Dynamic Generalized Quantifiers, weak and strong readings of donkey pronouns, existential disclosure. Week 7 3/9 MIT Chierchia The debate on long distance indefinites: Non canonical scope ...

  13. (PDF) Contemporary Issues in Syntax and Semantics

    semantics (Pires de Oliveira et al, 2020). Contemporary Issues in Syntax. and Semantics, the theme of the 3 EISSI, was the open call for paper. answered by all the papers in this volume. For this ...

  14. PDF Topics in semantic association

    topic model and discuss connections with previous research. Approaches to semantic representation Psychological theories of semantic representation are typically based on one of two kinds of representation: semantic networks or semantic spaces. We will discuss these two approaches in turn, identifying some of their strengths and weaknesses.

  15. Semantic Scholar

    Semantic Reader is an augmented reader with the potential to revolutionize scientific reading by making it more accessible and richly contextual. Try it for select papers. Semantic Scholar uses groundbreaking AI and engineering to understand the semantics of scientific literature to help Scholars discover relevant research.

  16. PDF Topics in Semantic Representation

    topics can be learned automatically from a collection of docu-ments, as a computational analogue of how human learners might form semantic representations through their linguistic experience (Griffiths & Steyvers, 2002, 2003, 2004). The topic model provides a starting point for an investigation of new forms of semantic representation.

  17. 100+ Compelling Linguistics Research Topics for University ...

    Some academic disciplines in linguistic semantics are conceptual semantics, cognitive semantics, formal semantics, computational semantics, and more. Linguistic research paper topics on Semantics are as follows: Examine meaning work in language interpretation and scrutinization; A critical evaluation of language acquisition and language use.

  18. 55 Top-Rated Research Topics in Linguistics For an A+

    A critical evaluation of language and ethnicity. Analyzing language attrition among most English speakers. Distinct functions of language among different communities. Interesting Topics in ...

  19. Journals, Articles and Papers

    The Semantic Web encompasses the technology that connects data from different sources across the Web as envisioned by Tim Berners-Lee and led by the World Wide Web Consortium (W3C). This Web of Data enables the linking of data sets across data silos on the Web by providing for machine-to-machine communication through the use of Linked Data.

  20. Semantics Research Paper

    Semantics is the study of meaning communicated through language, and is usually taken to be one of the three main branches of linguistics, along with phonology, the study of sound systems, and grammar, which includes the study of word structure (morphology) and of sentence structure (syntax). This entry surveys some of the main topics of ...

  21. SeamlessExpressiveLM: Speech Language Model for Expressive Speech-to

    Expressive speech-to-speech translation (S2ST) is a key research topic in seamless communication, which focuses on the preservation of semantics and speaker vocal style in translated speech. Early works synthesized speaker style aligned speech in order to directly learn the mapping from speech to target speech spectrogram. Without reliance on style aligned data, recent studies leverage the ...

  22. Topics

    Research Topics. Data Spaces. Establishing a data exchange infrastructure in accordance with pertinent industry standards, with a focus on enabling data sovereignty, fostering the digital economy, promoting interoperability, and building trust. Read more. Information Extraction and Ontology Learning. This line of work looks at the task of ...

  23. Visualizing hidden communities of interest: A case-study ...

    The results of the topic-model give an overview of the main research topics in astrobiology, as present throughout the publications of the three major journals in the field. ... the top-articles in which the topics are strongly expressed makes it possible to gain a better understanding about the semantic content of the 25 topics and their ...

  24. Frontiers

    Learning from demonstration is an approach that allows users to personalize a robot's tasks. While demonstrations often focus on conveying the robot's motion or task plans, they can also communicate user intentions through object attributes in manipulation tasks. For instance, users might want to teach a robot to sort fruits and vegetables into separate boxes or to place cups next to plates of ...

  25. Foundations of Computation at Sheffield (FOX)

    Our main research theme concerns the mathematical foundations of computer science. The topics we are interested in include algorithms, computational complexity and combinatorics, logical methods, program semantics, hardware and software verification and interactive theorem proving. Research topics in the FOX group range from the theoretical ...

  26. key research topics in the past and future

    Objective: This paper discusses the past and present highlights of working hours and health research and identifies key research needs for the future. Method: We analyzed over 220 original articles and reviews on working hours and health in the Scandinavian Journal of Work, Environment & Health published during the last 50 years.

  27. 272 questions with answers in SEMANTICS

    b. John does (= John knows who_k heard what stories about himself_k). - 1a: The sentence structure implies that the 'himself' refers back to the nearest noun, which is the 'who' in the embedded ...

  28. Research: Why Companies Should Disclose Their Lack of Progress on DEI

    In the aftermath of George Floyd's murder and the national reckoning around racial injustice in 2020, many companies redoubled their commitment to increase the diversity of their workforce.

  29. Nick Wigginton joins Johns Hopkins as associate vice provost for research

    In addition to overseeing the BDP program, Wigginton will lead the Research Development Team, internal research funding awards programs, research communications, and research data and analytics. He will also help develop future research initiatives. "Nick's experiences and values directly align with the priorities of the research enterprise at Johns Hopkins, and we look forward to the continued ...

  30. Key Facts, Data and Trends Since 1970

    The median income of middle-class households increased from about $66,400 in 1970 to $106,100 in 2022, or 60%. Over this period, the median income of upper-income households increased 78%, from about $144,100 to $256,900. (Incomes are scaled to a three-person household and expressed in 2023 dollars.)