A Map of Knowledge

by Zachary A. Pardos, et al.

Knowledge representation has gained in relevance as data from the ubiquitous digitization of behaviors amass and academia and industry seek methods to understand and reason about the information they encode. Success in this pursuit has emerged with data from natural language, where skip-grams and other linear connectionist models of distributed representation have surfaced scrutable relational structures which have also served as artifacts of anthropological interest. Natural language is, however, only a fraction of the big data deluge. Here we show that latent semantic structure, composed of elements from digital records of our interactions, can be informed by behavioral data, and that domain knowledge can be extracted from this structure through visualization and a novel mapping of the literal descriptions of elements onto this behaviorally informed representation. We use the course enrollment behaviors of 124,000 students at a public university to learn vector representations of its courses. From these behaviorally informed representations, a notable 88% of course attribute information was recovered (e.g., department and division), as well as 40% of course relationships constructed from prior domain knowledge and evaluated by analogy (e.g., Math 1B is to Math H1B as Physics 7B is to Physics H7B). To aid in interpretation of the learned structure, we create a semantic interpolation, translating course vectors to a bag-of-words of their respective catalog descriptions. We find that the representations learned from enrollments resolved course vectors to a level of semantic fidelity exceeding that of their catalog descriptions, depicting a vector space of high conceptual rationality. We end with a discussion of the possible mechanisms by which this knowledge structure may be informed and its implications for data science.
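The analogy evaluation mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common vector-offset formulation (b - a + c, followed by a nearest-neighbor search under cosine similarity); the abstract does not state which formulation the authors used. The three-dimensional vectors, the `History 7A` distractor, and the helper name `solve_analogy` are hypothetical and serve only as illustration; they are not learned from enrollment data.

```python
import numpy as np

# Toy course vectors (hypothetical values, not learned from real enrollments).
# Dimension 0 ~ "math-ness", dimension 1 ~ "physics-ness", dimension 2 ~ "honors".
course_vectors = {
    "Math 1B": np.array([1.0, 0.0, 0.0]),
    "Math H1B": np.array([1.0, 0.0, 1.0]),
    "Physics 7B": np.array([0.0, 1.0, 0.0]),
    "Physics H7B": np.array([0.0, 1.0, 1.0]),
    "History 7A": np.array([-1.0, -1.0, 0.0]),
}


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def solve_analogy(a, b, c, vectors):
    """Answer 'a is to b as c is to ?' using the vector offset b - a + c.

    The three query courses are excluded from the candidate set, as is
    conventional in word-analogy evaluations.
    """
    target = vectors[b] - vectors[a] + vectors[c]
    candidates = {name: vec for name, vec in vectors.items() if name not in (a, b, c)}
    return max(candidates, key=lambda name: cosine(target, candidates[name]))


answer = solve_analogy("Math 1B", "Math H1B", "Physics 7B", course_vectors)
print(answer)
```

With these toy vectors the honors offset is shared across departments by construction, so the nearest remaining course to the target is `Physics H7B`. In learned embeddings the offset would hold only approximately, which is why such evaluations report the fraction of analogies answered correctly.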




