Literature data mining

Peter Paul Rubens (1577–1640), The Fall of Icarus (1636), oil on panel, 27 x 27 cm, Royal Museums of Fine Arts of Belgium, Brussels. Wikimedia Commons. image available here

Andrew Reagan at the Computational Story Lab at the University of Vermont in Burlington and a few pals have used sentiment analysis to map the emotional arcs of over 1,700 stories and then used data-mining techniques to reveal the most common arcs (…) The idea behind sentiment analysis is that words have a positive or negative emotional impact. So words can be a measure of the emotional valence of the text and how it changes from moment to moment. So measuring the shape of the story arc is simply a question of assessing the emotional polarity of a story at each instant and how it changes (…) Reagan and co say that their techniques all point to the existence of six basic emotional arcs that form the building blocks of more complex stories:

  • A steady, ongoing rise in emotional valence
  • A steady, ongoing fall in emotional valence
  • A fall then a rise
  • A rise then a fall (Icarus)
  • Rise-fall-rise
  • Fall-rise-fall (Oedipus)
Image available here

It turns out the most popular are stories that follow the Icarus and Oedipus arcs and stories that follow more complex arcs that use the basic building blocks in sequence.

Excerpts from the article entitled “Data Mining Reveals the Six Basic Emotional Arcs of Storytelling,” available here

Full paper available here

Beautiful video of Kurt Vonnegut lecture (1995) on story arcs available here
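
The idea described in the excerpt above, scoring the emotional polarity of a text at each moment and following how it changes, can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the tiny word list `TOY_LEXICON`, its scores, the window sizes and the function names are all invented here, and the study's own lexicon and settings are not reproduced.

```python
import re

# Toy valence lexicon, invented for illustration only (an assumption,
# not the word list used in the study).
TOY_LEXICON = {
    "love": 1.0, "happy": 1.0, "joy": 1.0, "hope": 0.5, "win": 0.5,
    "sad": -1.0, "death": -1.0, "fear": -0.5, "lose": -0.5, "fall": -0.5,
}


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def emotional_arc(text, window=4, step=2, lexicon=TOY_LEXICON):
    """Slide a window over the words and return one mean valence per window.

    Words missing from the lexicon are ignored; a window containing no
    scored words gets a neutral value of 0.0.
    """
    words = tokenize(text)
    arc = []
    last_start = max(len(words) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = words[start:start + window]
        scores = [lexicon[w] for w in chunk if w in lexicon]
        arc.append(sum(scores) / len(scores) if scores else 0.0)
    return arc


if __name__ == "__main__":
    story = "hope love joy win then fear sad death lose"
    # Each value is the mean valence of one window, in reading order.
    print(emotional_arc(story))
```

A story whose values start high and end low would trace a falling arc; real texts need far longer windows and a much larger lexicon before the shapes listed above become visible.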

Blog analysis


Blogs: open space for reflection; forum for discussions; portfolio of completed assignments; opening up courses to a wider group of participants.

Blogs’ major applications: maintaining a learning journal; recording personal life; expressing emotions; communicating with others; assessment; and managing tasks.

Blogosphere: blog interconnections as (a) a social network and (b) an ecosystem.

Blog benefits in learning environments: reading other blogs; receiving feedback on one’s own blog.

Blog and Personal Learning Environment (PLE): use for personal information management; use for social interaction and collaboration; information aggregation and management.

Blog problems: fragmented discussions; a lack of coordination structures; weak support for awareness; danger of over-scripting.



Poldoja, H., Duval, E. & Leinonen, T. (2016). Design and evaluation of an online tool for open learning with blogs, in Australasian Journal of Educational Technology, Vol 32, No 2, pp. 61-81.

Image available here

Projections of future education


The breakdown by education level is especially interesting: It shows that our world will be inhabited by more and more educated people. The projection shows that the number of people with no education will decrease continuously and that by the end of this century virtually all people in the world will have received some level of education (…) By 2050, only five countries are predicted to have a rate of no education above 20%: these are Burkina Faso, Ethiopia, Guinea, Mali and Niger. There is also expected to be a large increase in the numbers of people obtaining degrees, while more people complete secondary school.


Image & Reference: 

Online Learning Analytics and the Quantified Self

Individual User Activity Plots for the Online Material. On the left, the coloured line shows the student’s overall attendance path. The chart on the right illustrates the student’s daily attendance. The straight line at the bottom illustrates their weekly attendance.

Learning analytics keep track of student engagement in online environments. They are invaluable for the information they provide about attendance, student preferences and, above all, learning habits.
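
As a rough illustration of how daily and weekly attendance figures like the ones in the plots could be derived, here is a minimal Python sketch. It assumes a hypothetical list of visit dates for one student; the function names and the sample data are invented for this example and do not describe the platform or the data behind our charts.

```python
from collections import Counter
from datetime import date


def daily_attendance(visit_dates):
    """Count logged visits per calendar day."""
    return Counter(visit_dates)


def weekly_attendance(visit_dates):
    """Count logged visits per ISO (year, week number) pair."""
    return Counter(tuple(d.isocalendar())[:2] for d in visit_dates)


if __name__ == "__main__":
    # Hypothetical visit log for one student (invented sample data).
    visits = [date(2015, 3, 2), date(2015, 3, 2),
              date(2015, 3, 4), date(2015, 3, 9)]
    for day, count in sorted(daily_attendance(visits).items()):
        print(day.isoformat(), count)
    for week, count in sorted(weekly_attendance(visits).items()):
        print(week, count)
```

Plotting the daily counts against time gives the kind of crooked or flat line discussed here, one per student.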

The first analytics I ever saw were those published by the University of Stanford in 2014, by Jennifer DeBoer, Andrew D. Ho, Glenda S. Stump and Lori Breslow, which are available here. It wasn’t just the number of students they monitored that amazed me (I think it was more than 150,000, and that was itself an achievement) but their charts: those little crooked or flat lines that, in all their simplicity, actually depicted student activity. I can still remember how much I wanted to check this out for myself, set up my own experiment and see how people learn and how different their approaches to learning are.

And then, just a year later, our own analytics for the 2015 course “Methodological Tools of Analysis” were issued, with information about how our students performed and especially how they did so in completely different ways. What is more, the fact that our course ran both online and in class gave us the opportunity to compare their performance and see, for every one of the students, how each environment worked in relation to the other.

We realized, however, that despite the invaluable information we got out of these readings, they were more important to the students. That is why we printed the charts and distributed them in class. It wasn’t so much about measuring the clicks (at that point we couldn’t do that anyway) as about realizing that, by having increased the stimuli and the ways the students could express themselves and engage in the course, we had offered them a learning environment that motivated them to be themselves. This process had allowed them to be free of educational preconceptions and shape their own learning styles.

So, those first charts and the ones that followed in 2016 have been our tacit manifestation of the emancipated learner. They may also be a manifestation of our quantified self, as Stephen Downes claims in his recent presentation about new trends in online learning, where he considers learning analytics to be one of the most important future trends. But he is right: it is not personalized learning in the sense that we adapt or customize the learning platform to suit the students. It is personal because their learning paths are theirs alone.

Creative Commons Licence
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.