Abstract principles become concrete through example. The Ontology for Computational Sociology (OCS) demonstrates how the methodologies, patterns, and decisions discussed throughout this guide materialise in a substantial, deployed ontology addressing genuine research requirements.
Computational sociology presents unique ontological challenges. Unlike domains with established taxonomies (biological species, chemical compounds), sociology encompasses diverse theoretical traditions, contested concepts, and phenomena operating across scales from individual interactions to global processes. No single sociological paradigm commands universal acceptance; structuralism, functionalism, conflict theory, symbolic interactionism, and network analysis each conceptualise social reality differently.
The OCS ontology's scope deliberately bounds this complexity: general sociological concepts, social processes, institutions, stratification systems, demographic phenomena, and research methodologies. It excludes exhaustive coverage of specific cultural contexts, historical periods, or micro-theoretical variations. These pragmatic limits enabled completion whilst maintaining utility for computational sociology research, educational applications, and interdisciplinary integration.
Competency questions driving OCS development included: "Which social processes influence political mobilisation?" "How do organisational structures affect innovation capacity?" "What causal relationships connect economic inequality to social movement emergence?" "Which demographic transitions correlate with institutional changes?" These questions, grounded in sociological research priorities, specified the ontology's required knowledge representation capabilities.
The OCS ontology's classes are organised hierarchically under BFO 2020 categories. Major branches include Social_Processes (collective actions, institutional changes, demographic transitions) and Social_Groups (primary/secondary groups, organisations, communities).
254 object properties define relationships—'influences', 'belongsToOrganisation', 'causesProcessChange', 'participatesIn', 'hasRole'. Each property specifies domain/range constraints, characteristic declarations (transitive, symmetric, functional), and comprehensive annotations explaining sociological significance. Systematic inverse property definitions enable bidirectional relationship navigation.
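To make the role of inverse declarations concrete, here is a minimal Python sketch of how an inverse property lets a relationship be navigated in both directions. It is an illustration only, not the OCS encoding: 'participatesIn' is named above, but its inverse `hasParticipant`, the domain and range shown, the individuals, and the helper `with_inverses` are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectProperty:
    """A simplified object property declaration."""
    name: str
    domain_class: str
    range_class: str
    inverse: str = ""             # name of the inverse property, if declared
    characteristics: tuple = ()   # e.g. ("transitive",) or ("symmetric",)


# 'participatesIn' appears in the text; inverse, domain and range are assumed.
PARTICIPATES_IN = ObjectProperty(
    name="participatesIn",
    domain_class="Social_Group",
    range_class="Social_Process",
    inverse="hasParticipant",
)


def with_inverses(triples, properties):
    """Add the inverse triple for every assertion whose property declares an inverse."""
    inverse_of = {p.name: p.inverse for p in properties if p.inverse}
    inferred = {(o, inverse_of[p], s) for s, p, o in triples if p in inverse_of}
    return set(triples) | inferred


# Hypothetical individuals, asserted in one direction only.
asserted = {("Trade_Union_A", "participatesIn", "Strike_B")}
closed = with_inverses(asserted, [PARTICIPATES_IN])
print(sorted(closed))
```

In a real OWL workflow a reasoner derives these inverse assertions; the function above only mimics that single inference step.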
79 data properties attach measurable or descriptive attributes—population counts, temporal extents, spatial locations, classification codes. These ground abstract sociological concepts in quantifiable data, facilitating empirical research applications.
201 n-ary causal relations represent the ontology's most sophisticated element. Rather than oversimplifying multi-factor sociological causation into binary relationships, the OCS implements reification patterns where 'Collective_Causal_Event' individuals connect multiple causes to multiple effects whilst maintaining semantic precision and query tractability. This pattern, detailed in SLIDE 7, distinguishes OCS from simpler ontologies and demonstrates how careful architectural design captures domain complexity.
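The reification idea can be sketched in a few lines of Python. This is a simplified illustration rather than the OCS implementation: only 'Collective_Causal_Event' comes from the text above, while the property names `hasCause` and `hasEffect`, the individual names, and the helper `effects_of` are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical property names; the actual OCS vocabulary may differ.
HAS_CAUSE = "hasCause"
HAS_EFFECT = "hasEffect"
RDF_TYPE = "rdf:type"


@dataclass(frozen=True)
class CausalEvent:
    """A reified n-ary causal relation: one node linking many causes to many effects."""
    name: str
    causes: tuple
    effects: tuple

    def to_triples(self):
        """Flatten the n-ary relation into binary (subject, predicate, object) triples."""
        triples = [(self.name, RDF_TYPE, "Collective_Causal_Event")]
        triples += [(self.name, HAS_CAUSE, c) for c in self.causes]
        triples += [(self.name, HAS_EFFECT, e) for e in self.effects]
        return triples


def effects_of(triples, cause):
    """Return every effect that shares a causal event with the given cause."""
    events = {s for s, p, o in triples if p == HAS_CAUSE and o == cause}
    return sorted({o for s, p, o in triples if p == HAS_EFFECT and s in events})


# Hypothetical individuals: two causes jointly linked to one effect.
event = CausalEvent(
    name="CausalEvent_001",
    causes=("Economic_Inequality", "Political_Exclusion"),
    effects=("Social_Movement_Emergence",),
)
triples = event.to_triples()
print(effects_of(triples, "Economic_Inequality"))
```

Because the event is itself an individual, both causes stay attached to the same node. A pair of independent binary "causes" assertions would lose the information that the two factors act jointly.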
Every OCS class maps to appropriate BFO 2020 categories, ensuring philosophical coherence and enabling interoperability with biomedical, economic, and environmental ontologies sharing BFO foundations. 'Social_Process' aligns with BFO:process, 'Social_Group' with BFO:object_aggregate, 'Social_Role' with BFO:role.
This alignment wasn't merely taxonomic labelling—it required careful analysis of BFO's continuant/occurrent distinction, understanding of specifically dependent continuants, and resolution of tensions between sociological intuitions and BFO's metaphysical commitments. Iterative refinement, guided by reasoner validation, progressively improved alignment quality.
The payoff: The OCS ontology inherits BFO's logical rigour whilst gaining potential integration with diverse domains. Sociological phenomena can now be formally related to public health outcomes (via biomedical ontologies), economic indicators (via financial ontologies), or environmental conditions (via ecological ontologies)—all sharing BFO's foundational vocabulary.
The OCS development employed a text-driven workflow addressing practical realities of domain expert collaboration and iterative refinement. Rather than requiring sociologists to learn Protégé or developers to intuit sociological nuances, structured text files mediated knowledge capture.
Control files specified class hierarchies, class annotations (structured key-value pairs), object property definitions (characteristics, domains, ranges), data property specifications, and n-ary relation patterns (causes, effects, annotations). This format balanced human readability, version controllability, and automated processing.
The DBFOschemafy framework (developed specifically for OCS but subsequently generalised) reads these control files and generates comprehensive OWL/XML encoding. This automation ensures consistency, eliminates manual encoding errors, and dramatically accelerates ontology generation. Modifications require updating text files and regenerating—far more efficient than manual OWL editing for large-scale changes.
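The general shape of such a text-to-OWL step can be sketched as follows. This is not DBFOschemafy (which is written in Java) and does not reproduce its control-file format: the "Child | Parent" line syntax, the function names, and the example IRI are assumptions made for illustration. The element names (`Ontology`, `Declaration`, `SubClassOf`, `Class`) follow the standard OWL/XML serialisation.

```python
import xml.etree.ElementTree as ET

OWL_NS = "http://www.w3.org/2002/07/owl#"

# Hypothetical control-file content: one "Child | Parent" pair per line.
# Parent names such as "process" are placeholders; real BFO classes carry their own IRIs.
CONTROL_TEXT = """
# class hierarchy
Social_Process | process
Collective_Action | Social_Process
Social_Group | object_aggregate
"""


def parse_hierarchy(text):
    """Parse 'Child | Parent' lines, skipping blank lines and '#' comments."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        child, parent = (part.strip() for part in line.split("|"))
        pairs.append((child, parent))
    return pairs


def to_owl_xml(pairs, ontology_iri):
    """Emit class declarations and SubClassOf axioms in OWL/XML syntax."""
    ET.register_namespace("", OWL_NS)  # serialise OWL as the default namespace
    root = ET.Element(f"{{{OWL_NS}}}Ontology", {"ontologyIRI": ontology_iri})
    names = sorted({name for pair in pairs for name in pair})
    for name in names:
        decl = ET.SubElement(root, f"{{{OWL_NS}}}Declaration")
        ET.SubElement(decl, f"{{{OWL_NS}}}Class", {"IRI": f"#{name}"})
    for child, parent in pairs:
        axiom = ET.SubElement(root, f"{{{OWL_NS}}}SubClassOf")
        ET.SubElement(axiom, f"{{{OWL_NS}}}Class", {"IRI": f"#{child}"})
        ET.SubElement(axiom, f"{{{OWL_NS}}}Class", {"IRI": f"#{parent}"})
    return ET.tostring(root, encoding="unicode")


pairs = parse_hierarchy(CONTROL_TEXT)
owl_xml = to_owl_xml(pairs, "http://example.org/ocs-sketch")
print(owl_xml)
```

Changing the hierarchy then means editing one text line and regenerating, which is the property the text-driven workflow relies on.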
Eclipse IDE provided the programmatic foundation. 87 Java classes orchestrate control file parsing, OWL generation, validation, and export. This architecture separates domain knowledge (the text files and their storage in an RDB) from technical implementation (Java code), enabling either to evolve independently.
Protégé integration remained central for visualisation, reasoner execution, and refinement. The generated OWL/XML imports seamlessly into Protégé, where visual inspection, consistency checking, and manual adjustments occur before final deployment.
The OCS ontology is publicly accessible via NCBO BioPortal (https://bioportal.bioontology.org/ontologies/OCS), the premier repository for biomedical and scientific ontologies. This deployment provides web-based browsing, SPARQL querying, REST API access, and integration with BioPortal's extensive ontology ecosystem.
Applications enabled by OCS include:
Research: Computational sociology studies can query OCS for conceptual relationships, validate theoretical models against formal specifications, or integrate OCS with empirical data for hypothesis testing.
Education: Sociology students can explore conceptual relationships visually, understand theoretical frameworks through formal definitions, and grasp subdiscipline interconnections through navigation.
Data integration: Sociological datasets annotated with OCS concepts gain semantic richness, enabling cross-dataset queries, automated relationship inference, and integration with other domains.
What the OCS development revealed:
Domain expertise cannot be shortcut. Sustained collaboration with sociological experts proved essential. Early architectural decisions made without sufficient domain consultation required costly rework.
Scope discipline is paramount. Initial ambitions for comprehensive coverage proved unrealistic; pragmatic boundary-setting enabled completion.
Automation pays dividends. The DBFOschemafy framework's upfront development cost was recovered many times through efficient generation, validation, and iteration.
N-ary relations are complex but necessary. The reification pattern's implementation required substantial effort but proved indispensable for faithfully representing sociological causation.
Upper ontology alignment takes time but delivers value. BFO mapping was intellectually demanding and time-consuming but positioned OCS for interdisciplinary integration impossible otherwise.
If developing OCS again, what would change? Earlier and more systematic competency question definition would have focused development more efficiently. Greater initial investment in automated testing infrastructure would have caught errors sooner. More structured domain expert review cycles would have reduced late-stage conceptual revisions.
But the fundamental approach (text-driven control files, data storage in an RDB, programmatic generation, BFO alignment, n-ary causal relations, and generation of an HTML browser showing all the details) proved sound and would be retained in future projects.