Tag Archives: Manuel Gutiérrez Nájera

Periodizing Modernista Poetry

I. Intro

Gutiérrez Nájera never published his poems in book form. They appeared in the numerous newspapers for which he worked and, after he passed away (1895), his friends collected them in a single volume with a preface by Justo Sierra. The book (you can find a copy of it on archive.org) included 158 poems, to which modernista scholars such as Mapes, Boyd G. Carter, González Guerrero and others have continued to add new texts over the years. At the moment, according to Angel Muñoz Fernández, there are 235 poems attributed to the Mexican poet (13).

It is usually assumed that Nájera, as a poet, had a “youthful” and a “mature” artistic period, but there is no clear consensus about when one ends and the other begins. The closest thing we have to a periodization of his poetry is the grouping of the poems introduced by González Guerrero in his 1953 edition of Poesías completas. Although he divides Nájera’s poetic work into several chronological periods, González Guerrero also groups the poems according to themes and poetic forms. The critic’s seemingly chaotic periodization goes as follows: under the general heading of “Primeras Poesías,” he adds two subdivisions, “La fe de mi infancia” (1875-1881) and “Trovas de amor” (1875-1880). The rest of the poems are placed in the following sections: “Otros poemas juveniles” (1877-1881), “Caminos del viento” (1880-1883), “Ala y abismo” (1884-1887), “Elegías” (1887-1890), “Nuevas canciones” (1888-1895), “Odas breves” (no dates given), “Poesías varias” (1876-1891), “Versiones” (1880-1884). The last group contains Nájera’s translations of French poems, some of which were at one point mistaken for original creations.

One could argue that González Guerrero divides Nájera’s poetic trajectory into a youthful period from 1875 to 1881, a transitional period from 1880 to 1883, a middle period from 1884 to 1887, and a mature period from 1888 to 1895.

My objective was to apply a stylometric analysis to Nájera’s poetry with the purpose of creating a new periodization. In the next two sections of this post, I will summarize the problems I had with preparing the data and with some of the technical aspects of the analytical process. If you prefer, you can jump to the last section of the post, in which I contrast my results with González Guerrero’s and propose a new periodization of Nájera’s poetic work.

II. The 1896 edition and its afterlife

Although a total of 235 poems are recognized as forming Nájera’s poetic corpus, that number also includes poems translated from French literature and at least one poem written entirely in French. I excluded those from my analysis, bringing the total down to 220 poems. The biggest problem in classifying the poems, however, had to do with the dates of composition and/or publication. The 1896 posthumous edition was supposed to be organized chronologically, but many of the texts do not follow that order, and many others have no date assigned to them. None of the scholars in charge of the later editions of Nájera’s poems fixed the problem; they often simply reproduced the composition/publication dates found in the 1896 edition. Angel Muñoz Fernández’s comments in his preface to the 2000 edition of Nájera’s poetry (which contains a facsimile of the 1896 edition, of course) describe the complexity of the problem: “Revisando algunos diarios de la época, encontré que el célebre ‘Francia y México’, con fecha 1882 en la edición de 1896, fue publicado en El Nacional el 5 de mayo de 1881, apareciendo junto al título la fecha 1879, que pudiera corresponder al año en que el poema fue escrito” (17).

I was unable to determine the date of a total of 34 poems, bringing down the number of poems I could use for my analysis to 186.

III. Length, etc

The technical side of the project created additional problems. Initially, I envisioned grouping Nájera’s poems by year, and treating each year as if it were a single text. I would then tokenize the poems and get the word counts and frequencies in relation to that year alone. However, Nájera had a very uneven poetic production and some periods were more productive than others. Some years he wrote so few poems that it became impossible to get an accurate author signal because there were not enough tokens per year of production. In his paper, “Does Size Matter? Authorship Attribution, Short Samples, Big Problem,” Maciej Eder argues that the current methods for doing stylometric analysis do not allow the study of very short texts: “using 2,000-word samples will hardly provide a reliable result, to say nothing of shorter texts.” The number of words needed to get an accurate authorship signal in a text varies. With regard to poetry, Eder explains that in his experiment “the results for the three poetic corpora (Greek, Latin, English) proved ambiguous, suggesting that some 3,000 words or so would be usually enough, but significant misclassification would also occur occasionally.” To analyze Nájera’s poetic corpus, I combined the texts from adjacent years in order to create two-year periods with around 4000 words. Only a few of the years surpassed the 4000 token mark and I left those by themselves. I was forced to create a multi-year period for the last years of Nájera’s life because of his extremely low production during that time.

Period        Size of the “text” (tokens)
1879          8611
1880          5896
1881          4352
1875-1876     5461
1877-1878     8594
1882-1883     3995
1884-1885     6073
1886-1887     8648
1888-1889     7335
1890-1895     9541
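The merging of adjacent years described above can be sketched roughly as follows. This is only a minimal illustration in Python, not necessarily the procedure behind the table: the function name `merge_years`, the treatment of 4000 tokens as a hard minimum (the table treats it as approximate), and the per-year counts in the example are all assumptions made for the sake of the sketch.

```python
def merge_years(tokens_per_year, min_tokens=4000):
    """Greedily merge adjacent years until each bin holds at least min_tokens.

    tokens_per_year: dict mapping year -> token count (hypothetical input).
    Returns a list of (first_year, last_year, total_tokens) tuples.
    """
    bins = []
    start, last, total = None, None, 0
    for year in sorted(tokens_per_year):
        if start is None:
            start = year
        total += tokens_per_year[year]
        last = year
        if total >= min_tokens:
            bins.append((start, year, total))
            start, total = None, 0
    # Leftover years that never reached the threshold join the previous bin.
    if start is not None:
        if bins:
            first, _, prev_total = bins.pop()
            bins.append((first, last, prev_total + total))
        else:
            bins.append((start, last, total))
    return bins


# Hypothetical per-year token counts, for illustration only.
example = {1888: 5200, 1889: 2100, 1890: 1500, 1891: 900, 1892: 700}
for first, last, total in merge_years(example):
    label = str(first) if first == last else f"{first}-{last}"
    print(label, total)
```

A greedy pass like this keeps productive years on their own and folds sparse years into a longer final period, which is the general shape of the bins in the table, although the actual bins were fixed at two years wherever possible.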

After combining the years to obtain a higher token count, I compared the style of each time period using Burrows’s Delta with z-scores as my classification method. The following images show the results, using the 150 most frequent words (MFWs)


and with 300 MFWs


I did not eliminate any pronouns or overrepresented words. I have yet to apply other methods (such as SVM and PCA) to this data.
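For readers unfamiliar with the measure, here is a minimal sketch of Burrows’s Delta in plain Python. It is not the tool I used to produce the images; the function name `burrows_delta`, the use of the population standard deviation, and the toy texts are assumptions for illustration. The idea is to take the most frequent words of the whole corpus, convert each word’s relative frequency in each text to a z-score, and then average the absolute z-score differences between two texts.

```python
from collections import Counter
from statistics import mean, pstdev


def burrows_delta(texts, n_mfw=150):
    """Return pairwise Burrows's Delta distances between tokenized texts.

    texts: dict mapping a label (e.g. "1875-1876") to a non-empty token list.
    n_mfw: number of most frequent words (MFWs) of the whole corpus to use.
    """
    labels = list(texts)
    # 1. Pick the n_mfw most frequent words across the whole corpus.
    corpus_counts = Counter()
    for tokens in texts.values():
        corpus_counts.update(tokens)
    mfw = [word for word, _ in corpus_counts.most_common(n_mfw)]

    # 2. Relative frequency of each MFW in each text.
    rel = {}
    for label, tokens in texts.items():
        counts = Counter(tokens)
        rel[label] = [counts[w] / len(tokens) for w in mfw]

    # 3. z-score each word's frequencies across the texts.
    z = {label: [] for label in labels}
    for i in range(len(mfw)):
        column = [rel[label][i] for label in labels]
        mu, sigma = mean(column), pstdev(column)
        for label in labels:
            # A word used at an identical rate everywhere carries no signal.
            z[label].append((rel[label][i] - mu) / sigma if sigma else 0.0)

    # 4. Delta = mean absolute difference between two texts' z-scores.
    return {
        (a, b): mean(abs(x - y) for x, y in zip(z[a], z[b]))
        for a in labels
        for b in labels
    }


# Toy "periods", for illustration only.
toy = {
    "A": "la luna blanca y la noche azul".split(),
    "B": "la luna blanca y la noche azul".split(),
    "C": "el alma y el dolor de la muerte".split(),
}
delta = burrows_delta(toy, n_mfw=5)
print(delta[("A", "B")], delta[("A", "C")])
```

The resulting distance table is what a clustering step would then turn into the kind of tree shown in the images; smaller values mean more similar function-word habits.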

IV. Periodization.

In spite of all the problems related to dating the poems and the length of the samples, the stylometric analysis I performed makes it possible to propose a new periodization of Nájera’s poetic work (however provisional it might be). Looking at the following visualization of the classification resulting from the Burrows’s Delta method, the first thing one notices is that the poetry from 1875 to around 1878/1879 forms a cluster (1879 often appears completely disconnected from the periods coming before and after). In González Guerrero’s view, Nájera’s youthful period lasts until 1881, but in the stylometric analysis the years from 1880 to around 1887 show strong similarities to one another and are almost always grouped together.

I was naturally concerned that creating two-year periods to obtain a higher number of tokens might itself have influenced the periodization. Addressing this problem was especially important for determining when the transition from the middle to the mature period took place. González Guerrero employed 1888 as the year marking the beginning of Nájera’s last poetic period. When I tried grouping the years as 88-89 and 90-95, these two groups tended to move closer to each other than to the other 1880s groups. I then left 1888 by itself (there were enough tokens in that year to do that: over 5000) and created two more groups, 89-90 and 91-95. On this occasion 1888 moved toward 89-90, but not as close to 91-95 as I expected. The higher the number of MFWs used, the more 91-95 distanced itself from the late 1880s. In other words, Nájera’s style definitely underwent a change toward 1888 (possibly marking the beginning of a transitional period that lasts until 1890?), but it is not clear that the last period of his poetry began as early as 1888.


To Do:

  • Use other classification methods such as SVM or PCA
  • Analyze the change of vocabulary from the 1870s to the 1890s (topic modeling needed?)
  • Add prose documents to the corpora. Establishing the publication date of those appears to be easier (should I assume that the difference between Nájera’s poetic style and his prose style is not significant?)

Topic Tree

Many of the techniques I have been applying to my corpora were specifically designed for use with “Big Data,” which I do not have, unless one considers eighty-something poems “Big” data. I believe that it is possible to apply quantitative analysis to smaller groups of texts and obtain meaningful results. Ted Underwood would probably disagree with me on this, but I think even he would be surprised at the results I am getting from applying his Topic Tree method to Nájera’s poetry.

Topic modeling is very popular at the moment, but when I started working with modernista poetry I had my doubts about that approach. I didn’t know if it was the right one for my data, as topic modeling tends to emphasize the isolation of key themes in a single text. Because I was treating each poem as an individual work, I needed a technique that could help me establish links among the poems based on the recurrence of similar words/topics. Underwood’s technique (an alternative to topic modeling, which he calls Topic Tree) does exactly that. He applied it to a huge collection of 18th-century documents and produced this dendrogram. He also has a post explaining his technique in detail. Underwood uses a vector space model to compare words across corpora, but instead of employing the tf-idf scores normally used by search engines, he has developed his own formula, which he explains in this Tech Note. In the same Tech Note, he has released his R code (as well as a very handy script to divide large trees into manageable sections).
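To give a rough idea of what clustering words in a vector space looks like, here is a small Python sketch. It does not reproduce Underwood’s weighting formula or his R code: as a stand-in, it uses raw counts per poem, cosine distance, and average-linkage agglomerative clustering, and the function names and toy poems are hypothetical.

```python
from collections import Counter
from math import sqrt


def word_vectors(poems, vocabulary):
    """Map each word to its vector of raw counts across the poems."""
    counts = [Counter(poem) for poem in poems]
    return {w: [c[w] for c in counts] for w in vocabulary}


def cosine_distance(u, v):
    """1 - cosine similarity; words that never occur are maximally distant."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0


def cluster_words(vectors):
    """Average-linkage agglomerative clustering; returns a nested-tuple tree."""
    # Each cluster is (tree, list_of_member_words).
    clusters = [(w, [w]) for w in vectors]

    def linkage(c1, c2):
        pairs = [(a, b) for a in c1[1] for b in c2[1]]
        total = sum(cosine_distance(vectors[a], vectors[b]) for a, b in pairs)
        return total / len(pairs)

    while len(clusters) > 1:
        # Find the two closest clusters and merge them into one branch.
        candidates = (
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        i, j = min(candidates, key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]


# Toy poems, for illustration only.
poems = [
    "muerte tumba sombra".split(),
    "muerte sombra tumba noche".split(),
    "ave ala vuelo".split(),
    "ala ave cielo vuelo".split(),
]
tree = cluster_words(word_vectors(poems, ["muerte", "tumba", "ave", "ala"]))
print(tree)
```

Words that tend to appear in the same poems end up on the same branch, which is the intuition behind reading each branch of the tree as a topic.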

Let me now show the results I got using the Topic Tree technique. I fed my script 250 common words, and that produced four main branches. As expected, the branches of my tree reflect Nájera’s main topics.

In branch one, the topics are poetry, faith and childhood:

[Figure: rama1 (branch one of the topic tree)]

In branch two, all the words are related to the experience of death:


In the third branch the focus is on natural elements, with strong emphasis on flying animals and love:

[Figure: rama3 (branch three of the topic tree)]

The fourth and last branch is also the most interesting one. In the top part the main topic is poetry and its representation of beauty/nature, towards the middle the dominant topic is sexual attraction, and the lower part of the branch shows the poetic representation of women in modernista poetry (notice the words that cluster around “white”).

[Figure: rama4 (branch four of the topic tree)]

The experiment, in my opinion, was very successful and, of course, I am now curious about what a topic tree of modernista poetry as a whole would look like.

More on Whiteness

I was surprised to discover that the ratio between white and blue in the first one thousand lines of Nájera’s poetry I scanned (blanca/blanco 0.26328016 vs. azul/celeste 0.06194827) was incredibly consistent with the results I got using about four thousand lines (see previous post). The first 1,000 were from poems written between 1882 and 1886.

I decided to convert Nájera’s corpus from plain text to a TEI format. So far I have been mining the poems as if they were one long text, not as individual poems. Adding a <date> tag to each poem allows me to group texts by year. I was hoping to discover an interesting pattern in Nájera’s use of colors. However, the number of tokens varied greatly from year to year, not only because I have scanned just one third of his poems but also because, possibly due to the demands of his journalistic duties, Nájera’s poetic output from around 1888 to 1895 was sparse. This is, of course, a common problem with such a small data set.

In the graphic below, azul (including azul, azules, azur, celeste) appears at the beginning of his career (1877) and again towards the end (1895). Blanco (including blanco, blanca, blancos, blancas, blancura), on the other hand, consistently appears throughout the years.

[Figure: azulesyblancos (blue and white color words by year)]

Grouping poems by date will also become useful in the future as I try to periodize Nájera’s poetry. Unlike Darío, Nájera never published a single book of poetry and most attempts at organizing his poetry in periods seem arbitrary (see González Guerrero’s preface to Poesías completas [1966], for example).
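The grouping by year and by color family described above can be sketched in a few lines of Python. The two word families are the ones listed in this post; everything else is an assumption for illustration: the function name `color_rates_by_year`, the sample verses, and the heavily simplified TEI-like markup (a real TEI file would have a namespace and a header, which this sketch ignores).

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

# Word families taken from the post; everything else here is illustrative.
COLOR_FAMILIES = {
    "azul": {"azul", "azules", "azur", "celeste"},
    "blanco": {"blanco", "blanca", "blancos", "blancas", "blancura"},
}

# Hypothetical, heavily simplified TEI-like markup (no namespace, no header).
SAMPLE = """
<text>
  <div type="poem"><date when="1877"/><l>Cielo azul y nube blanca</l></div>
  <div type="poem"><date when="1884"/><l>Blanca, blanca es la azucena</l></div>
</text>
"""


def color_rates_by_year(xml_string):
    """Return {year: {family: occurrences per token}} for the dated poems."""
    tokens_by_year = defaultdict(list)
    root = ET.fromstring(xml_string)
    for poem in root.iter("div"):
        date = poem.find("date")
        if date is None:
            continue  # undated poems cannot be placed on the timeline
        words = re.findall(r"\w+", " ".join(poem.itertext()).lower())
        tokens_by_year[date.get("when")].extend(words)

    rates = {}
    for year, tokens in tokens_by_year.items():
        if not tokens:
            continue  # skip years whose poems contain no text
        counts = Counter(tokens)
        rates[year] = {
            family: sum(counts[form] for form in forms) / len(tokens)
            for family, forms in COLOR_FAMILIES.items()
        }
    return rates


print(color_rates_by_year(SAMPLE))
```

Dividing by the number of tokens per year matters here precisely because the yearly samples are so uneven; raw counts would mostly reflect how much was scanned for each year.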


How Blue is Nájera’s Poetry?

It is fascinating to see how Nájera’s status as a modernista writer has changed since the 1960s. While he used to be considered a precursor of modernismo, critics now see him as a full-fledged member of the movement. One of the first things I wanted to do for this project was to compile a list of the most frequently used modernista words and see how Nájera compares to other members of the group. Eventually, of course, the main objective is to compare stylistic characteristics (not just words) among modernista texts.

I decided to start with something simple, like comparing Nájera’s and Darío’s use of colors. We all know that, for Darío, art is supposed to be “azure,” but is blue the most common color found in these writers’ poetic texts?

As it turns out, in Darío’s case there is no surprise. Blue is the most frequently mentioned color in his poems, followed by white.


The surprise comes when one looks at Nájera’s poems. Even though Nájera is known for being one of the founding editors of Revista Azul, white has a stronger presence in his poetry than blue. In addition, I was not able to locate the word “azur” in any of the texts.


[EDIT: I guess I forgot to add the plurals to the graphics above; adding those would tip the scale even more towards “blanco.”  5/28/14]


I should mention that in Nájera’s corpus (much more than in Darío’s) the color white is very often associated with skin color. Does it mean that Nájera’s preference for that color is related to the racial views prevalent during the Porfiriato? José María Martínez seems to think so. In his article “Un duque en la corte del Rey Burgués: positivismo y porfirismo en Manuel Gutiérrez Nájera” (BSS, LXXXIV [2007]), he says about Nájera’s views on race:

Quizá desde este punto de vista se comprenda también por qué en Nájera el color blanco tiene una presencia tan singular—recordemos su emblemático poema ‘De blanco’—, o por qué prefiere los tipos rubios para los héroes y heroínas . . . No se trata de negar la posible y probable deuda libresca—romántica, prerrafaelita, simbolista—de estos personajes ni de este color, pero sí de reivindicar ese contexto sociológico como marco de una de las notas más perceptibles en la creación najeriana y que habría llevado al Duque a configurar un México literario racialmente acorde a las preferencias suyas y de gran parte de su público.

It is definitely important to stress the ideological closeness between Nájera and the Porfirio Díaz regime, as this continues to be an understudied aspect of Nájera’s work. However, the most striking aspect of the quote above is Martínez’s aside: “recordemos su emblemático poema ‘De blanco’.” Now, Martínez’s analysis of Nájera is excellent and the critic supports his description of the poet’s racial views with brief readings of a couple of Nájera’s essays, but in asking his readers to “remember” a specific poem, Martínez is using “De blanco” as a synecdochical representation of all of Nájera’s poetry, a common practice in traditional criticism, I suppose. To say “recordemos su emblemático poema ‘De blanco’” is tantamount to saying “I do not know the exact numbers, but I know that the color white appears very often in Nájera’s poetic work.” I think that, in this case, even a simple text mining exercise such as the one I have just performed seems to support Martínez’s argument.


Digitizing Modernista Poetry

I am currently working on a research paper on Nájera and I am hoping that the results from mining his texts will confirm my main theories about him. Obviously, I cannot mine Nájera’s work in isolation, but in my posts I will mainly focus on how my findings relate to his texts.

One of the most time-consuming aspects of this project is going to be creating my corpora, which requires digitizing Nájera’s texts. Unlike Darío’s work, very little of Nájera’s is available in digital form. At the same time, because Darío’s texts are readily available, I have decided to start working with Nájera’s poetry, as this will allow me to quickly start mining texts from both poets.

At the moment, I have digitized 4310 lines of Nájera’s poetry (82 poems) and I will be contrasting those to about 8402 lines of Darío’s poetry (183 poems). You can see the corpora I am using here.