Quality of Online Science Information Varies by Language

The Internet can reduce inequality in science literacy.

Nevertheless, structural factors such as income inequality are root causes of disparities in science literacy and in Internet access and use.

One of these structural factors is the “language divide”: the gap between those who speak dominant languages well and those who do not. This factor partly determines how successful users are when they use the Internet in general, and when they seek scientific information online in particular.

Although this is an important aspect of science learning, little is known so far about the availability of scientific information in different languages.

Research Question

Does the quality of online scientific information vary between languages and scientific fields?


We collected online search results regarding scientific terms in English, Hebrew, and Arabic, analyzed their content, and rated their scientific and pedagogical quality.

The terms belonged to three scientific fields (domains): physics, chemistry, and biology.

Statistical methods included univariate and multivariate ANOVA and Linear Discriminant Analysis (LDA).
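
As an illustration of the univariate ANOVA step, here is a minimal sketch in Python that computes a one-way ANOVA F statistic for quality ratings grouped by language. The ratings are invented for illustration and are not data from the study; the function name `one_way_anova_f` and the 1-to-5 rating scale are likewise assumptions, and the study's actual analysis may have been set up differently.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` maps a group label (here, a language) to a list of ratings.
    """
    all_values = [x for ratings in groups.values() for x in ratings]
    grand_mean = mean(all_values)

    # Variation of the group means around the grand mean
    ss_between = sum(
        len(ratings) * (mean(ratings) - grand_mean) ** 2
        for ratings in groups.values()
    )
    # Variation of individual ratings around their own group mean
    ss_within = sum(
        (x - mean(ratings)) ** 2
        for ratings in groups.values()
        for x in ratings
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical quality ratings (NOT the study's data), one list per language
scores = {
    "English": [4, 5, 4, 5],
    "Hebrew": [3, 4, 3, 4],
    "Arabic": [2, 3, 3, 2],
}

f_stat, df_b, df_w = one_way_anova_f(scores)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F value means that the ratings differ more between languages than within each language. Deciding whether the difference is statistically significant requires comparing F against the F distribution with the reported degrees of freedom, which a statistics package would normally handle.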


Note: The Linear Discriminant Analysis was conducted by my co-author, Dr. Eyal Nitzany.

Findings and Discussion

Findings indicate that searches in English yielded higher-quality results overall than searches in Hebrew and Arabic, and that the advantage lay mostly in pedagogical rather than scientific aspects.

The differences in information quality were better explained by the language factor than by the scientific field factor (Figure 1).

Clustering the results by language yielded better separation than clustering by scientific field (Figure 2). This finding points to a “language divide” in access to online science content.

These findings should encourage scientific communities and institutions to mitigate this problem.


Figure 1. Effects of language and field on combined measures of information quality—MANOVA results. One, two, and three bullets (•, ••, and •••) denote statistical significance at the .05, .01, and .001 levels, respectively.


Figure 2. LDA of search terms (a) by language (colors) and (b) by field (shapes). (c) Differences in quality (areas of the triangles on the LDA plane) between equivalent terms in different languages. LD1 and LD2: First and second linear discriminants, respectively.

Press Releases and Media Coverage


Zoubi, K., Sharon, A. J., Nitzany, E., & Baram-Tsabari, A. (2021). Science, maddá and ‘ilm: The language divide in scientific information available to internet users. Public Understanding of Science. Advance online publication. https://doi.org/10.1177/09636625211022975


Open Access