Overton is largely language agnostic. We index policy documents regardless of their language or alphabet. For details about the processing of policy documents by language, see ‘Documents in Different Languages.’
To help users find policy documents in various languages, Overton provides a language filter in the ‘Policy Documents’ search.
How the filter works
The filter lets users narrow search results to policy documents written in one or more chosen languages, for example French and Spanish.
We use an automated system to identify the primary language of each document. It evaluates the character set, grammar, and word frequency in a text sample to determine the language. The system can occasionally misidentify a language, especially if the sample is not representative of the document or the document contains multiple languages.
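To make the idea concrete, here is a minimal Python sketch of language identification based on character set and word frequency. It is an illustration only and does not describe Overton's actual system: the `guess_language` function, the tiny `STOPWORDS` lists, and the 50% Cyrillic threshold are all assumptions made for this example, and grammar is not modelled at all.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny word lists; a real detector would rely on
# much larger statistical models covering many more languages.
STOPWORDS = {
    "English": {"the", "and", "of", "to", "in", "is", "that", "for"},
    "French": {"le", "la", "les", "et", "de", "des", "est", "pour"},
    "Spanish": {"el", "la", "los", "y", "de", "es", "para", "que"},
}


def guess_language(sample: str) -> str:
    """Guess the primary language of a text sample (toy illustration)."""
    letters = [ch for ch in sample if ch.isalpha()]
    if not letters:
        return "unknown"

    # Character-set check: treat a mostly Cyrillic sample as Russian.
    # This is a simplification, since other languages also use Cyrillic.
    cyrillic = sum(1 for ch in letters if "\u0400" <= ch <= "\u04FF")
    if cyrillic / len(letters) > 0.5:
        return "Russian"

    # Word-frequency check: count how often each language's common
    # words occur in the sample and pick the highest score.
    words = Counter(re.findall(r"[^\W\d_]+", sample.lower()))
    scores = {
        lang: sum(words[w] for w in common)
        for lang, common in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


if __name__ == "__main__":
    print(guess_language("The role of the state in the economy is that of a regulator"))
    print(guess_language("Le rôle de la commission est de protéger les droits et les libertés"))
```

A sample that mixes languages, or that happens to contain mostly names and numbers, can score highest for the wrong language in a sketch like this, which mirrors the kind of occasional error described above.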
Why filter by language
The language filter is useful for several reasons:
- It helps users find documents in a particular language.
- It can help analyse a set of policy documents, e.g. by looking at the proportion of documents available in a native language, or at how the language of a policy document could affect its accessibility.
Using the filter
You can apply the language filter at the start of your search or after obtaining search results. To apply the filter, scroll through the list on the left side of the ‘Policy Documents’ search page until you find the ‘Document language’ filter.
For more language options, click ‘Show more,’ select the languages you want, and click ‘Apply filter.’
Note that English tends to be the lingua franca, which is reflected in the dominance of English as a document language.
Searching for documents with PDFs in multiple languages
When documents have multiple PDFs and multiple language versions, these are noted on the policy document record below the PDF thumbnail image. This example is available in English, Russian, French, Arabic, Spanish and Chinese.
For example, a search for “International Conference on Small Island Developing States” with ‘Arabic’ and ‘French’ selected in the ‘Document language’ filter returns records that have an attached PDF in either of those languages. This doesn’t mean those are the only languages available for that record; it only reflects your filter selection.
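The matching behaviour described above can be sketched in Python as follows. This is a hedged illustration, not Overton's implementation: the `PolicyRecord` class, its `pdf_languages` field, the `filter_by_language` function, and the sample record titles are all hypothetical names invented for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PolicyRecord:
    """Hypothetical record: a title plus the languages of its attached PDFs."""
    title: str
    pdf_languages: set[str] = field(default_factory=set)


def filter_by_language(records, selected):
    """Keep records with at least one attached PDF in a selected language."""
    wanted = set(selected)
    # A record matches if ANY of its PDF languages is selected ("either/or"),
    # even when it also has PDFs in languages that were not selected.
    return [r for r in records if r.pdf_languages & wanted]


if __name__ == "__main__":
    records = [
        PolicyRecord("Conference report", {"English", "French", "Arabic"}),
        PolicyRecord("National strategy", {"English"}),
        PolicyRecord("Regional briefing", {"Arabic"}),
    ]
    for record in filter_by_language(records, ["Arabic", "French"]):
        print(record.title, sorted(record.pdf_languages))
```

In this sketch the first record still lists English among its languages after filtering, which illustrates why the results reflect your filter selection rather than the full set of languages available for a record.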