Services for Researchers
Coalition Publica supports innovative research practices in the arts, humanities and social sciences by developing large textual datasets, curating bibliometric data, and providing open source digital scholarly publishing software.
Textual data
Access to a treasure trove of texts as big data
Text plays a central role in the arts, humanities and social sciences. Being the primary vehicle for the dissemination of new knowledge, text can also act as raw material for new research. Gathered into vast collections of full-text publications in a digital format, text becomes research data with great potential for the generation of new knowledge.
Description of the collection: the breadth of publications in the collection makes our research corpus an important resource for research in many fields: history, sociology, linguistics, economics, literature and others.
Our corpora support research that integrates digital technology in a variety of ways: discourse analysis, automatic language processing, digital intelligence and text mining.
Journals, newspapers, magazines
Érudit, 342 journals, coverage from 1905 to 2024, 568,478 files, 203 GB
Bibliothèques et Archives nationales du Québec, coverage from the 17th century, 4,627,040 files, 18 TB
Canadiana/CRKN, coverage from the 18th century to 1930, 80,085 files, 405 GB
Library and Archives Canada, coverage from 1820 to 1917, 789 files, 5 GB
Parliamentary debates
Library and Archives Canada
Canada Gazette, coverage from 1842 to 1997, 14,560 files, 206 GB
Cabinet Conclusions, coverage from 1944 to 1979, 41,249 files, 10 GB
Library of the National Assembly of Québec
Journal des débats de l'Assemblée nationale du Québec, coverage from 1908 to 2019, 33,339 files, 31 GB
Government Reports
National Centre for Truth and Reconciliation, coverage from 2002 to 2021, 15,391 files, 535 MB
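For researchers planning storage before requesting a transfer, the figures listed above can be tallied with a short script. This is only a rough sketch: the file counts and sizes are copied from the list, the labels are shortened for illustration, and it assumes decimal units (1 TB = 1000 GB), which the list does not specify.

```python
# Figures copied from the collection list: (label, file count, size, unit).
# Labels are illustrative shorthand, not official collection identifiers.
COLLECTIONS = [
    ("Érudit", 568_478, 203, "GB"),
    ("Bibliothèques et Archives nationales du Québec", 4_627_040, 18, "TB"),
    ("Canadiana/CRKN", 80_085, 405, "GB"),
    ("Library and Archives Canada (periodicals)", 789, 5, "GB"),
    ("Canada Gazette", 14_560, 206, "GB"),
    ("Cabinet Conclusions", 41_249, 10, "GB"),
    ("Journal des débats de l'Assemblée nationale du Québec", 33_339, 31, "GB"),
    ("National Centre for Truth and Reconciliation", 15_391, 535, "MB"),
]

# Assumption: decimal units (1 TB = 1000 GB, 1 GB = 1000 MB).
GB_PER_UNIT = {"MB": 0.001, "GB": 1.0, "TB": 1000.0}


def size_in_gb(size, unit):
    """Convert a size expressed in MB, GB or TB to gigabytes."""
    return size * GB_PER_UNIT[unit]


def totals(collections):
    """Return (total number of files, total size in GB) for the given collections."""
    total_files = sum(files for _, files, _, _ in collections)
    total_gb = sum(size_in_gb(size, unit) for _, _, size, unit in collections)
    return total_files, total_gb


if __name__ == "__main__":
    files, gb = totals(COLLECTIONS)
    print(f"{files:,} files, about {gb / 1000:.2f} TB")
```

With the figures as listed, the total comes to roughly 5.38 million files and a little under 19 TB, most of it from the Bibliothèques et Archives nationales du Québec collection.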
Bibliometric data
Stimulating the study of the research ecosystem
Considered as intellectual objects bearing the traces of the conceptual, social or historical evolution of research, scholarly publications are a unique resource for those who study research as a social object, questioning how it is developed, communicated and built upon, or how the research ecosystem is structured and how scholarly knowledge develops.
Metadata and semantically tagged full text (entire collection)
Archived and current issues, metadata and full text tagged in EruditArticle XML, compatible with JATS XML, updated annually
For an overview of how the corpus is structured, see the EruditArticle XML schema and our documentation.
Enriched metadata (articles published between 2015 and the most recent calendar year)
Enriched metadata of over 30,000 scholarly articles published on erudit.org: number of authors, order of appearance, raw and standardized affiliation, type of access, etc.
For the complete list of available metadata and their description, please contact us at corpus@erudit.org.
Access to the data
Textual data
Access to the corpora is reserved for research and teaching purposes and is subject to Canadian copyright law. Researchers and students who access the corpora agree not to distribute or commercialize the corpora’s publications.
The procedure is as follows:
Completion and submission of a project description form (please contact us at corpus@erudit.org to request the form)
Evaluation of the project on the basis of the following criteria: the applicant is affiliated with an educational institution and confirms that the corpus will not be used for commercial purposes or disseminated in its entirety
Signing a user agreement
Creation of an account on Compute Canada
Download (SSH key, Globus transfer tool)
Workshops
To learn more about the research possibilities offered by these corpora and how to use them, we recommend the workshops offered by Calcul Québec and the Digital Research Alliance of Canada. You can also subscribe to the Calcul Québec newsletter to be informed of upcoming events.
Enriched metadata
The enriched metadata are available for download on Dataverse. They are available under a CC0 license. For monitoring purposes, users are asked to identify themselves when downloading.
Contact
A team of experts will help you access the textual and bibliometric data in our research repository quickly and efficiently. If you have any questions or comments, you can reach us at corpus@erudit.org.
Open source scholarly publishing software
Supporting the technological development of scholarly information production and delivery systems
Free software and open standards for scholarly publishing are the result of applied research on information production and dissemination systems. They also assume a role as vehicles for additional research in the field of digital scholarly publishing: open software can be used, studied, modified or duplicated by anyone who wishes to do so, depending on the license assigned to it and the needs of the community.
Description of the software: Open Journal Systems (OJS) is highly flexible editorial management software for digital scholarly journals. OJS can be downloaded for free and installed on a local web server. It uses PHP and JavaScript, with MySQL/MariaDB as its storage database. OJS can be run on Linux/Mac server environments and is licensed under the GNU GPL v3. Plugins, available under open licenses, facilitate integration with the erudit.org platform and with the services of several organizations and infrastructures working towards interoperability of systems and content, such as Crossref, ORCID and SWORD.
Access and services: the developed software is available for free on the GitHub site. You can find OJS documentation in the PKP Documentation Hub. To contribute to the documentation, read our guidelines for contributors, contact us, or participate in a virtual documentation sprint. We encourage contributions to our software development, and provide documentation on how to write plugins and extend the software (among other things). Subscribe to our developer newsletter or contact us for more information or to participate in a sprint. PKP staff provide free support in the PKP Community Forum. PKP also provides paid advisory support, directly from PKP developers and technicians, on a case-by-case basis through PKP Publishing Services, and in some cases will undertake sponsored development of new features in OJS, if there is a demonstrated need from the community.
QUESTIONS?
If you have any questions about our services to researchers, you can contact our team at corpus@erudit.org.