Services for Researchers
Coalition Publica supports innovative research practices in the arts, humanities and social sciences by developing large textual datasets, curating bibliometric data, and providing open source digital scholarly publishing software.
Textual data
Access to a treasure trove of texts as big data
Text plays a central role in the arts, humanities and social sciences. Being the primary vehicle for the dissemination of new knowledge, text can also act as raw material for new research. Gathered into vast collections of full-text publications in a digital format, text becomes research data with great potential for the generation of new knowledge.
Description of the collection: the breadth of publications in the collection makes our research corpus an important resource for research in many fields: history, sociology, linguistics, economics, literature and others. Our corpora support research that integrates digital technology in a variety of ways: discourse analysis, automatic language processing, digital intelligence and text mining.
Journals, newspapers, magazines
Érudit, 342 journals, coverage from 1905 to 2024, 568,478 files, 203 GB
Also available for the Érudit collection: metadata and semantically tagged full text
Archival and current issues, metadata and full text tagged in EruditArticle XML (in French), compatible with JATS XML, updated annually. For an overview of how the corpus is structured, please consult our documentation.
Bibliothèques et Archives nationales du Québec, coverage from the 17th century, 4,627,040 files, 18 TB
Canadiana/CRKN, coverage from the 18th century to 1930, 80,085 files, 405 GB
Library and Archives Canada, coverage from 1820 to 1917, 789 files, 5 GB
Parliamentary debates
Library and Archives Canada
Canada Gazette, coverage from 1842 to 1997, 14,560 files, 206 GB
Cabinet Conclusions, coverage from 1944 to 1979, 41,249 files, 10 GB
Library of the National Assembly of Québec
Journal des débats de l'Assemblée nationale du Québec, coverage from 1908 to 2019, 33,339 files, 31 GB
Government Reports
National Centre for Truth and Reconciliation, coverage from 2002 to 2021, 15,391 files, 535 MB
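The Érudit full text listed above is tagged in EruditArticle XML. Before writing a dedicated parser, it can help to take a census of the element names that actually occur in a file. The following is a minimal sketch that assumes nothing about the schema: the sample document and its namespace are hypothetical and are not EruditArticle markup; please refer to the corpus documentation for the real structure.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag: str) -> str:
    """Strip a '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def tag_census(xml_text: str) -> Counter:
    """Count how often each element name occurs in an XML document."""
    root = ET.fromstring(xml_text)
    return Counter(local_name(el.tag) for el in root.iter())


# Hypothetical sample for illustration only; NOT the EruditArticle schema.
SAMPLE = """<article xmlns="http://example.org/ns">
  <meta><title>Sample</title><author>A</author><author>B</author></meta>
  <body><p>One.</p><p>Two.</p></body>
</article>"""

census = tag_census(SAMPLE)
for name, count in census.most_common():
    print(f"{name}: {count}")
```

Running the same function over a handful of real files gives a quick picture of which elements carry metadata and which carry full text.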
How to access the data
Access to the textual data corpora is reserved for research and teaching purposes. It is subject to Canadian copyright law. Researchers and students who access the corpora agree not to distribute or commercialize the corpora’s publications. The access procedure is as follows:
Completion and submission of a project description form (please contact us at corpus@erudit.org to request the form)
Evaluation of the project on the basis of the following criteria: the applicant is affiliated with an educational institution and confirms that the corpus will not be used for commercial purposes or disseminated in its entirety
Signing a user agreement
Creation of an account on Compute Canada
Download (SSH key, Globus transfer tool)
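Once a corpus has been downloaded, a first sanity check is to compare the number of files and the total size on disk with the figures published for that collection. The sketch below is one way to do this with the Python standard library; the folder layout and file names in the demo are hypothetical and do not reflect how any corpus is actually organized.

```python
import tempfile
from collections import Counter
from pathlib import Path
from typing import Tuple


def inventory(root: Path) -> Tuple[Counter, int]:
    """Return (files per extension, total size in bytes) under root."""
    counts: Counter = Counter()
    total_bytes = 0
    for path in root.rglob("*"):
        if path.is_file():
            # Lower-case the extension so '.XML' and '.xml' are counted together.
            counts[path.suffix.lower() or "(none)"] += 1
            total_bytes += path.stat().st_size
    return counts, total_bytes


# Demo on a throwaway directory with a made-up layout.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp)
    (demo_root / "journal_a").mkdir()
    (demo_root / "journal_a" / "article1.xml").write_text("<a/>", encoding="utf-8")
    (demo_root / "journal_a" / "article2.XML").write_text("<b/>", encoding="utf-8")
    (demo_root / "notes.txt").write_text("hello", encoding="utf-8")
    demo_counts, demo_bytes = inventory(demo_root)

print(demo_counts, demo_bytes)
```

To use it on a real download, call `inventory(Path("/path/to/corpus"))` with the location chosen at transfer time.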
Workshops
To learn more about the research possibilities offered by these corpora and how to use them, we recommend the workshops offered by Calcul Québec and the Digital Research Alliance of Canada. You can also subscribe to the Calcul Québec newsletter to be informed of upcoming events.
Contact: Our expert team will help you access the textual and bibliometric data of our research repository quickly and efficiently. If you have any questions or comments, you can reach us at corpus@erudit.org.
Bibliometric data
Stimulating the study of the research ecosystem
Scholarly publications are intellectual objects that bear the traces of the conceptual, social and historical evolution of research. They are a unique resource for those who study research as a social object: how it is developed, communicated and built upon, how the research ecosystem is structured, and how scholarly knowledge develops. The available data fall into two broad categories. First, there are bibliometric data for the journals disseminated on the erudit.org platform and for the articles published in them. Second, we have created datasets describing active and historical Canadian scholarly journals.
Bibliometric data collection covering journals published on the Érudit platform
Basic metadata: the entire collection
Metadata for over 250,000 scholarly and cultural articles, including title, DOI, language of publication, years of publication and dissemination, authors' names, etc.
How to access the data
These data are available in the Borealis repository: https://doi.org/10.5683/SP3/BSSMC8. They are accessible under a CC BY licence. For reporting purposes, users are asked to identify themselves when downloading.
Enriched metadata: articles published from 2015 to the most recent calendar year
Enriched metadata for over 30,000 scholarly articles containing, in addition to the basic metadata, the order of appearance of authors, raw and standardised affiliation, country of affiliation, type of access, etc.
How to access the data
These data are available on Dataverse: https://doi.org/10.7910/DVN/VUUK8Q. They are accessible under a CC0 licence. For reporting purposes, users are asked to identify themselves when downloading.
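Both datasets can also be inspected programmatically. The sketch below assumes that the two repositories expose the standard Dataverse native API, that Borealis is served from https://borealisdata.ca, and that the 10.7910/DVN DOI resolves to Harvard Dataverse at https://dataverse.harvard.edu; none of these details is stated on this page, so please verify them before relying on the code. The sample response is hypothetical and only mirrors the general shape of a Dataverse dataset record. Downloading the files themselves may still require the identification step mentioned above.

```python
import json
from urllib.parse import urlencode, urlparse
from urllib.request import urlopen

# Assumed base URLs (not given on this page).
BOREALIS = "https://borealisdata.ca"
HARVARD_DATAVERSE = "https://dataverse.harvard.edu"


def doi_from_url(doi_url: str) -> str:
    """Turn 'https://doi.org/10.x/abc' into the 'doi:10.x/abc' form."""
    return "doi:" + urlparse(doi_url).path.lstrip("/")


def dataset_api_url(base_url: str, doi_url: str) -> str:
    """Build the native API URL that returns a dataset's metadata as JSON."""
    query = urlencode({"persistentId": doi_from_url(doi_url)})
    return f"{base_url}/api/datasets/:persistentId/?{query}"


def list_files(dataset_json: dict) -> list:
    """Extract file names from a dataset record returned by the API."""
    files = dataset_json["data"]["latestVersion"]["files"]
    return [entry["dataFile"]["filename"] for entry in files]


def fetch_dataset(base_url: str, doi_url: str) -> dict:
    """Fetch the dataset record over the network (not called in this sketch)."""
    with urlopen(dataset_api_url(base_url, doi_url), timeout=30) as resp:
        return json.load(resp)


# Hypothetical response used to exercise list_files without network access.
SAMPLE_RESPONSE = {
    "status": "OK",
    "data": {"latestVersion": {"files": [
        {"dataFile": {"id": 1, "filename": "example_metadata.csv"}},
        {"dataFile": {"id": 2, "filename": "README.txt"}},
    ]}},
}

basic_url = dataset_api_url(BOREALIS, "https://doi.org/10.5683/SP3/BSSMC8")
enriched_url = dataset_api_url(HARVARD_DATAVERSE, "https://doi.org/10.7910/DVN/VUUK8Q")
sample_files = list_files(SAMPLE_RESPONSE)
print(basic_url)
print(enriched_url)
print(sample_files)
```

Calling `list_files(fetch_dataset(BOREALIS, "https://doi.org/10.5683/SP3/BSSMC8"))` would list the files of the basic metadata dataset, provided the assumptions above hold.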
Bibliometric data collection covering Canadian scholarly journals
A directory of active and historical Canadian scholarly journals has been created by the Coalition Publica team from a variety of sources. The Canadian journal data are corrected and updated thanks to a collective effort. In order to optimize this effort, a version open to comments and suggestions is available.
The journals listed are periodicals affiliated with a Canadian organization. In addition, they are peer-reviewed or reviewed by an editorial board, and are identifiable by ISSN. We have excluded journals deemed to be predatory or to have questionable practices. Each journal is described by a number of fields, including access status, use of publication fees, organization managing the publication, languages accepted and indexing in current databases.
How to access the data
The directory is published in the Borealis repository under a CC BY licence and is updated regularly: https://doi.org/10.5683/SP3/9ONCEU
We invite all those interested in contributing updates to provide comments here. Detailed instructions can be found under the ‘Instructions for comments’ tab. Please note that this version of the directory is not stable and we do not recommend using it for research purposes; use the latest update available in Borealis instead.
For any information about bibliometric data, please contact us at corpus@erudit.org.
Open source scholarly publishing software
Supporting the technological development of scholarly information production and delivery systems
Free software and open standards for scholarly publishing are the result of applied research on information production and dissemination systems. They also serve as vehicles for additional research in the field of digital scholarly publishing: open software can be used, studied, modified or duplicated by anyone who wishes to do so, depending on the licence assigned to it and the needs of the community.
Description of the software: Open Journal Systems (OJS) is a highly flexible editorial management software for digital scholarly journals. OJS can be downloaded for free and installed on a local web server. OJS is written in PHP and JavaScript and uses MySQL or MariaDB as its database. OJS can be run on Linux/Mac server environments and is licensed under the GNU GPL v3. Integration with the erudit.org platform, as well as with the services of several organizations and infrastructures working towards interoperability of systems and content (such as Crossref, ORCID and SWORD), is facilitated by the use of plugins, available under open licences.
Access and services: the software is available for free on GitHub. You can find OJS documentation in the PKP Documentation Hub. To contribute to the documentation, read our guidelines for contributors, contact us, or participate in a virtual documentation sprint. We encourage contributions to our software development, and provide documentation on how to write plugins and extend the software (among other things). Subscribe to our developer newsletter or contact us for more information or to participate in a sprint. PKP staff provide free support in the PKP Community Forum. PKP also provides paid advisory support, directly from PKP developers and technicians, on a case-by-case basis through PKP Publishing Services, and in some cases will undertake sponsored development of new features in OJS, if there is a demonstrated need from the community.
QUESTIONS?
If you have any questions about our services to researchers, you can contact our team at corpus@erudit.org.