I was surprised to discover that the ratio between white and blue in the first one thousand lines of Nájera’s poetry I scanned (blanca/blanco 0.26328016 vs. azul/celeste 0.06194827) was remarkably consistent with the results I got using about four thousand lines (see previous post). The first 1,000 lines were from poems written between 1882 and 1886.
I decided to convert Nájera’s corpus from plain text to a TEI format. So far I have been mining the poems as if they were one long text, not as individual poems. Adding a <date> tag to each poem allows me to group texts by year. I was hoping to discover an interesting pattern in Nájera’s use of colors. However, the number of tokens varied greatly from year to year, not only because I have scanned only one third of his poems but also because, possibly due to the demands of his journalistic duties, Nájera’s poetic output from around 1888 to 1895 was sparse. This is, of course, a common problem when working with such a small data set.
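For readers who want to try something similar, here is a minimal sketch of grouping tokens by year from a dated TEI-like file. It is not the code behind this post. The element layout (one `div` per poem containing a `date` with a `when` attribute and `l` elements for verse lines), the function name `tokens_by_year`, and the sample verses are all assumptions made up for illustration; the sample also omits the TEI namespace that a real TEI file would declare.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical TEI-like sample. The verses are invented placeholders,
# not lines from Nájera, and the TEI namespace is left out for brevity.
SAMPLE = """<text><body>
  <div type="poem"><date when="1882">1882</date>
    <l>La blanca luna sobre el mar azul</l></div>
  <div type="poem"><date when="1882">1882</date>
    <l>Blancas alas, blanco lirio</l></div>
  <div type="poem"><date when="1895">1895</date>
    <l>El cielo azul</l></div>
</body></text>"""


def tokens_by_year(xml_string):
    """Return {year: [lowercased tokens]} for every dated poem."""
    root = ET.fromstring(xml_string)
    by_year = defaultdict(list)
    for poem in root.iter("div"):
        date = poem.find("date")
        if date is None:
            continue  # undated poems cannot be assigned to a year
        year = (date.get("when") or "")[:4]
        if not year:
            continue
        for line in poem.iter("l"):
            # \w+ is Unicode-aware in Python 3, so accented letters are kept
            by_year[year].extend(re.findall(r"\w+", (line.text or "").lower()))
    return dict(by_year)


if __name__ == "__main__":
    for year, tokens in sorted(tokens_by_year(SAMPLE).items()):
        print(year, len(tokens))
```

Printing the token count per year, as the last lines do, is a quick way to see how unevenly the data is spread before reading anything into year-to-year differences.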
In the graphic below, azul (including azul, azules, azur, celeste) appears at the beginning of his career (1877) and again towards the end (1895). Blanco (including blanco, blanca, blancos, blancas, blancura), on the other hand, consistently appears throughout the years.
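The word groupings above can be counted per year along these lines. This is only a sketch under stated assumptions: the input is a dictionary mapping a year to a list of lowercased tokens, the names `COLOR_FAMILIES` and `family_rates` are hypothetical, and normalizing to occurrences per 1,000 tokens is my choice here for making sparse and dense years comparable, not necessarily how the graphic was produced.

```python
from collections import Counter

# Word families as listed in the post; matching is by exact word form.
COLOR_FAMILIES = {
    "azul": {"azul", "azules", "azur", "celeste"},
    "blanco": {"blanco", "blanca", "blancos", "blancas", "blancura"},
}


def family_rates(tokens_by_year, per=1000):
    """Return {year: {family: occurrences per `per` tokens}}."""
    rates = {}
    for year, tokens in tokens_by_year.items():
        counts = Counter(tokens)
        total = len(tokens)
        rates[year] = {
            # Guard against empty years to avoid dividing by zero
            family: (per * sum(counts[w] for w in words) / total) if total else 0.0
            for family, words in COLOR_FAMILIES.items()
        }
    return rates


if __name__ == "__main__":
    # Invented toy input, only to show the shape of the data
    example = {
        "1882": ["la", "blanca", "luna", "azul"],
        "1895": ["cielo", "azul"],
    }
    for year, by_family in sorted(family_rates(example).items()):
        print(year, by_family)
```

With so few tokens in some years, a single occurrence can swing the rate dramatically, so raw counts are worth reporting alongside any normalized figure.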
Grouping poems by date will also become useful in the future as I try to periodize Nájera’s poetry. Unlike Darío, Nájera never published a single book of poetry, and most attempts at organizing his poetry into periods seem arbitrary (see González Guerrero’s preface to Poesías completas, for example).