Portuguese Amorim Corpus

Maria Clara Figueiredo Amorim
University of Porto


Participants: 80
Type of Study: picture descriptions
Location: Portugal
Media type: audio
DOI: doi:10.21415/RS10-E248

Browsable transcripts

Phon data

CHAT data

Link to media folder

Citation information

Amorim, C. 2015. Padrão de aquisição de contrastes do PE: a interação entre traços, segmentos e sílabas. Dissertação de doutoramento, Universidade do Porto.

In accordance with TalkBank rules, any use of data from this corpus must be accompanied by at least one of the above references.

Project Description

This corpus was gathered in the context of Maria Clara Figueiredo Amorim’s PhD research project (Amorim 2015; https://repositorio-aberto.up.pt/bitstream/10216/78848/2/34866.pdf), supervised by João Veloso and Carmen Matzenauer. The project was developed at the Centro de Linguística da Universidade do Porto (CLUP) and was supported by a Fundação para a Ciência e Tecnologia (FCT) individual PhD research grant (SFRH/BD/69856/2010).

The corpus is experimental and cross-sectional. It contains production data from 80 typically developing children from the north of Portugal (Porto and Ponte de Lima), all monolingual speakers of European Portuguese (EP), aged 3;0 to 4;11, with no history of cognitive, hearing or language disorders (see tables below). The children are divided into two broad groups by location (GA = Porto; GB = Ponte de Lima), each subdivided into four age groups (G1 = 3;0-3;5; G2 = 3;6-3;11; G3 = 4;0-4;5; G4 = 4;6-4;11). This table summarizes the ages and groups of the 80 children.
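For users who want to regroup the children by age when working with the data, the age bands above can be expressed as a small lookup. The following is only a minimal sketch in Python: the names AGE_GROUPS, age_to_months and age_group are hypothetical and are not part of the corpus or of any TalkBank tool, and the code assumes ages are written in the years;months form used in this description (for example 3;6), without a days component.

```python
import re

# Age bands as stated in the corpus description, converted to total months
# (both bounds inclusive). The labels G1-G4 come from the description;
# the dictionary itself is an illustrative assumption.
AGE_GROUPS = {
    "G1": (36, 41),  # 3;0-3;5
    "G2": (42, 47),  # 3;6-3;11
    "G3": (48, 53),  # 4;0-4;5
    "G4": (54, 59),  # 4;6-4;11
}


def age_to_months(age: str) -> int:
    """Convert a 'years;months' age string such as '3;6' to total months."""
    match = re.fullmatch(r"(\d+);(\d{1,2})", age.strip())
    if match is None:
        raise ValueError(f"not a years;months age: {age!r}")
    years, months = int(match.group(1)), int(match.group(2))
    if months > 11:
        raise ValueError(f"months must be between 0 and 11: {age!r}")
    return years * 12 + months


def age_group(age: str) -> str:
    """Return the age-group label (G1-G4) for a 'years;months' age string."""
    total = age_to_months(age)
    for label, (low, high) in AGE_GROUPS.items():
        if low <= total <= high:
            return label
    raise ValueError(f"age {age!r} is outside the corpus range 3;0-4;11")
```

Calling age_group("3;6") returns "G2", and an age outside 3;0 to 4;11 raises a ValueError rather than being silently assigned to a group.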

After informed consent was obtained, the speech samples were collected at each child’s school using an original spontaneous picture naming task built on the criteria used in Yavas, Hernandorena & Lamprecht (1992). Each child was invited to tell a story from a book containing a sequence of five original colored thematic drawings that form a narrative. The target words were therefore elicited in connected speech.

Because of the rooms in which the recordings were made, many of them contain background noise. Some of the speech samples, especially those of younger children, were collected in two sessions.

The selection of target words took into account the children’s age as well as the following phonological variables:

Most target structures occur in at least three target words for every syllable constituent and word position. The only exceptions are /z/ in word-initial position, which occurred only twice, and /S/ in medial coda followed by a voiced consonant.

For clusters, the criterion of three target words could not be maintained because some clusters are very uncommon in certain word positions, occur in words that are not part of children’s vocabulary, or occur in words that cannot be represented graphically. The distribution of the target segments across stressed and unstressed syllables is also uneven, because it is difficult to find suitable words that are present in the vocabulary of young children.

The speech data were recorded on a Sony Minidisc MZ-NH900 digital recorder with a Lifetech unidirectional microphone (model LF 65) and later saved to an ASUS N43SL laptop. The phonetic transcriptions and phonological analysis presented in Amorim (2015) were performed by the author (Maria Clara Figueiredo Amorim) using Praat and Goldvarb X. All doubtful transcriptions were reviewed by an experienced phonetician-transcriber. Whenever the two transcriptions did not coincide, the production was excluded from the analysis. Productions considered to be motivated by assimilation, or that underwent vowel epenthesis, were also excluded.

This spreadsheet has the transcriptions from Porto.

This spreadsheet has the transcriptions from Ponte de Lima.