Big data and the building of ‘true scholastic ability’


There are at least three prerequisites for enabling high school and university students to acquire the faculties of thinking, judgment and expression, which are collectively called “true scholastic ability” by the education ministry: the ability to read and comprehend Japanese and English; mathematical literacy, or the ability to think logically; and the basic scholastic ability to understand history, philosophy and natural sciences as circumstances require.

I am in full agreement with the ministry’s view that acquiring the faculties of thinking, judgment and expression represents the very minimum requirement for high school and university students to fulfill.

Indeed, regardless of what professional careers students may pursue after finishing school, being fully equipped with the aforementioned three faculties, in addition to knowledge in specialized fields, is absolutely indispensable for exercising leadership in any organization.

If I were asked what the means of thinking, judgment and expression are, I would answer that in addition to knowledge of languages (that is, Japanese and English) and mathematics, the ability to understand and process data is indispensable.

Hayato Ikeda, who rose to the position of prime minister on the strength of his plan to double national income in the decade from December 1960, prided himself on being well versed in numbers. When faced with questions from lawmakers at the Lower House Budget Committee, for example, Ikeda would often overwhelm his questioners by making full use of statistical economic data that he knew by heart.

“Mathematics is a language” is a widely known phrase. But I also think it’s true that “data is a language.”

I have cited above three necessary conditions for acquiring “true scholastic ability.” For them to become both necessary and sufficient conditions, “data literacy” — the ability to visualize various kinds of information contained in data, to think on the basis of such information, to pass judgment, and to express the judgment in one’s own words — must be added.

If one has to make a persuasive argument, the ability to handle data with ease is indispensable. Visualizing information contained in data enhances the ability to think and judge and also contributes to making the argument more comprehensible.

Knowledge of statistics and informatics is indispensable to acquiring data literacy. In Japan, education and research in informatics are offered in information engineering and mathematical engineering departments of certain universities’ schools of engineering. But no Japanese universities have a department of statistics for undergraduate students and no Japanese graduate schools offer statistics as a major subject of study. In stark contrast, universities in the United States almost without exception have a statistics department. So do 157 universities in China.

Nowadays, the phrase “the arrival of the big data age” is heard everywhere. Examples of big data include point-of-sale data from convenience stores and supermarkets, travel records accumulated through the use of prepaid electronic train fare cards, purchase records of mail order services, information contained in electronic medical files, statistical data related to sports, meteorological data and scholastic records of students taking the unified national achievement tests.

With rapid progress in information and communications technologies, a wide variety of big data has come to exist everywhere as an objective reality. As each area of big data contains huge amounts of information, statistics has started playing a significant role in visualizing this valuable information with the aid of informatics.

Thus statistics has gained its “citizenship” as a branch of science that deals with the objective entity known as big data. In other words, it can rightly be said that statistics is evolving into a data science.

With the arrival of the age of big data, there was a sharp increase around 2010 in the number of statistics majors among undergraduate students at American universities. At the same time, some universities have started providing undergraduates with opportunities to major in data science by creating new departments devoted to that discipline.

A data scientist is a person with a balanced knowledge of both statistics and informatics who is capable of communicating with people from the diverse fields where these two subjects are applied.

Knowledge of statistics is only one of the necessary conditions for becoming a data scientist, as elementary knowledge of informatics is also needed. Furthermore, a data scientist must be a person with an “inverted T” configuration. When the letter T is inverted, the horizontal line comes at the bottom and this line represents knowledge of statistics and informatics. Students must become fully knowledgeable about the content of this horizontal line during their undergraduate years.

The vertical line in the inverted T represents areas where specialized skills of big data analysis are applied. In no particular order, these areas include marketing, finance, macroeconomics, environment, climate, transportation, medical science, health care and natural disasters. It is highly desirable that any data scientist be well versed in at least one specialized field, and two or more if possible.

A person cannot be called a full-fledged data scientist unless he or she is capable of visualizing information contained in big data by using knowledge of statistics and informatics, and of creating value based on that information.

Here is one example of value creation: Point-of-sale data at a convenience store tracks the daily sales of box lunches and beverages; during summer months, sales of these items must fluctuate with changes in the temperature.

A forecast of consumption for a two-week period can therefore be made by using weather big data and by combining the temperature forecast and point-of-sale data. The result can narrow the supply and demand gap for box lunches, minimizing the number of unsold box lunches and reducing the number of customers who are dissatisfied to find that box lunches have sold out.
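For readers who want to see how such a forecast might work in practice, here is a minimal sketch in Python. It assumes the simplest possible model, a straight-line relationship between the day's maximum temperature and the number of box lunches sold, fitted by ordinary least squares. Every figure in it is invented for illustration, including the downward direction of the relationship, and the function names fit_line and forecast_sales are hypothetical. A real store would need far more data and a richer model.

```python
# Hypothetical illustration: every number below is invented, not real sales data.

def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def forecast_sales(intercept, slope, temps):
    """Predicted unit sales per forecast temperature, rounded and never negative."""
    return [max(0, round(intercept + slope * t)) for t in temps]


# Past daily maximum temperatures (deg C) and box lunches sold that day (made up).
past_temps = [24, 26, 28, 30, 32, 34]
past_sales = [130, 124, 118, 112, 106, 100]

# Two-week temperature forecast (made up).
forecast_temps = [29, 31, 33, 35, 34, 30, 27, 25, 26, 28, 30, 32, 33, 31]

# Fit the line on past point-of-sale data, then apply it to the weather forecast.
intercept, slope = fit_line(past_temps, past_sales)
orders = forecast_sales(intercept, slope, forecast_temps)

for day, (temp, units) in enumerate(zip(forecast_temps, orders), start=1):
    print(f"day {day:2d}: {temp} deg C -> order about {units} box lunches")
```

The store manager would order roughly the predicted number each day. How well this narrows the gap between supply and demand depends on how closely sales really follow temperature, which the sketch simply assumes.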

Here is another example: If the electronic medical records of all hospitals in an area are combined into a single body of big data, the medical information on each patient can be consolidated, and this can create new value by reducing health expenditures.

At Shiga University, a new department of data science will be created in fiscal 2017 to train data scientists who will not only be equipped with professional knowledge of statistics and informatics but will also be capable of communicating with businesspeople, civil servants, journalists, medical doctors and schoolteachers, and of creating new value.

This department will be the first in Japan aimed at nurturing future-oriented talent equipped with “true scholastic ability,” consisting of the faculties of thinking, judgment and expression, through the well-balanced learning of languages, mathematics and data science.

Takamitsu Sawa is the president of Shiga University.