SUPPLEMENTARY MATERIALS

GENERAL PROCEDURE

Information letter. The information letter sent to participants during recruitment first described the study briefly and stated that participation was voluntary and could be terminated at any time. It also covered the anonymity of participants and the storage of results in a database. The PI (Rönnberg) is responsible for the security of the database. The letter also stated that participation at test Occasion 1 comprised three testing sessions: one audiological (hearing examination, i.e., Session 1), one cognitive (memory examination, i.e., Session 2), and a third session for testing speech understanding under a variety of conditions (i.e., Session 3). Each session lasted 2 to 3 hours per individual. Participants were informed that they would be called back five years later for follow-up testing (i.e., at test Occasion 2 and test Occasion 3), as well as for a regular bi-annual hearing check-up.

Sessions. At the three test sessions (Sessions 1, 2 and 3) of test Occasion 1, all testing was administered individually within a 6-week period. The stimuli in each task were balanced for relevant parameters (e.g., an equal number of true/false answers) where appropriate, and the order of trials within each task was pre-randomized and then fixed for all participants. The order of task administration was also fixed for all participants, to minimize error variance due to a participant-by-order interaction.
The tasks administered in Session 1 were (in order of administration): the Mini Mental State Examination (MMSE), pure-tone audiometry (PTA; air and bone conduction), the Threshold Equalizing Noise Hearing-Level (TEN HL) test, a Phonemically Balanced words (PB) test, the executive function (EF) tasks (the Shifting, Updating, and Inhibition tasks), the Physical Matching (PM) and Lexical Decision (LD) tasks, the Rhyme judgment and Reading Span Test (RST) tasks, and finally the Raven test and the subjective ratings on the Speech, Spatial and Qualities of Hearing Scale (SSQ).

The tests administered in Session 2 were (again in order of administration): the Text-Reception Threshold (TRT) test, the inference tasks (the Logical Inference-making Test, LIT, and the Sentence Completion Test, SCT), the Visual Spatial Working Memory (VSWM) task, the Semantic Word-Pair Span Test (SWPST), the Non-word Serial Recall task (NSR), the Rapid Automatized Naming (RAN) test, the Distortion Product Oto-acoustic Emissions (DPOAE) test, the Temporal Fine Structure test (TFS-LF), and the Spectro-Temporal Modulation (STM) test.

The behavioral speech-in-noise tests were administered in Session 3. They were, in order of administration: the Swedish Hearing in Noise Test (HINT), the Samuelsson & Rönnberg sentences test, the AIM, the Hagerman sentence test conditions, and finally the consonant and vowel gating and vowel duration tests.

Below we present the details of the HEARING TEST BATTERY, the COGNITIVE TEST BATTERY, and the OUTCOME TEST BATTERY, with the tests conceptually organized within each battery rather than in the session order listed above. All tests were conducted either in a soundproof test room (PTA, TEN HL, PB, HINT, Hagerman sentences, Samuelsson & Rönnberg sentences, consonant and vowel gating, and vowel duration discrimination) or in a quiet office (the other tests).
METHODS

HEARING TEST BATTERY

While the PTA, DPOAE and PB word lists are tests commonly used in a clinical setting, the TEN HL, STM and TFS-LF are non-standardized experimental tests. Hearing thresholds, PB words and TEN HL were tested through an audiometer (Madsen Astera) with TDH 39 earphones. DPOAE data were collected using the Interacoustics Eclipse system (Interacoustics A/S), while the STM and TFS-LF tests were computer-based, with sound stimuli presented through Sennheiser HDA200 headphones.

PTA was conducted to measure hearing acuity. Air-conduction thresholds were obtained at 0.125, 0.25, 0.5, 1, 2, 3, 4, 6 and 8 kHz, and bone-conduction thresholds at 0.25, 0.5, 1, 2, 3 and 4 kHz, in both ears. Audiometry was conducted according to ISO/IEC 8253 (2010).

TEN HL. The TEN HL test is a method for identifying dead regions of inner hair cells in the cochlea (Moore et al., 2000, 2004). A dead region of the cochlea reflects a large number of non-functioning inner hair cells and/or neurons over an area of the basilar membrane corresponding to a range of frequencies. In short, the TEN HL test is based on masked thresholds. The current version of the test was administered manually: the participant's pure-tone thresholds at 0.5, 1, 2 and 4 kHz are obtained for each ear. Subsequently, a spectrally-shaped noise, the TEN noise, with a bandwidth from 354 to 6500 Hz, is presented 10 dB above the participant's pure-tone threshold at each of the four frequencies. The signal level is varied in 2-dB steps to determine the masked pure-tone thresholds relative to the unmasked thresholds. The masked thresholds at the four frequencies per ear were recorded in the database. A masked-threshold elevation (i.e., masked minus quiet threshold) of more than 10 dB is indicative of a dead region at the tested frequency.

DPOAE test. The Distortion Product Oto-acoustic Emissions (DPOAE) test assesses the function of the outer hair cells (Neely et al., 2003).
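The TEN HL dead-region criterion described above (masked minus quiet threshold greater than 10 dB at a tested frequency) can be expressed as a short check. This is only an illustrative sketch; the function name and data layout are ours, not part of the test software:

```python
def dead_region_flags(quiet_db, masked_db, criterion_db=10):
    """Flag a possible cochlear dead region at each tested frequency.

    quiet_db and masked_db map frequency (kHz) to the unmasked and
    TEN-masked pure-tone thresholds (dB HL) for one ear. A masked-
    threshold elevation (masked minus quiet) of more than 10 dB is
    taken as indicative of a dead region at that frequency.
    """
    return {f: (masked_db[f] - quiet_db[f]) > criterion_db for f in quiet_db}

# Example (illustrative values): the elevation at 4 kHz is 14 dB,
# so only that frequency is flagged.
flags = dead_region_flags({0.5: 40, 1: 45, 2: 50, 4: 55},
                          {0.5: 48, 1: 53, 2: 58, 4: 69})
```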
The DPOAEs were obtained at three frequencies (1, 2, and 4 kHz) and at three levels (40, 50 and 60 dB SPL). For each frequency and level, two values are recorded: the DPOAE sound pressure level and the estimated signal-to-noise ratio (SNR). In the subsequent analyses, a DPOAE recording is considered valid if the SNR at that frequency is 6 dB or better.

Binaural TFS sensitivity. We used the TFS low-frequency (TFS-LF) test by Hopkins and Moore (2010) to measure interaural phase difference (IPD, here denoted dP) thresholds for pure tones. The test uses an adaptive two-alternative forced-choice procedure. Two intervals, each containing four tones, were presented binaurally in every test trial. In one interval, all four tones had the same phase in the two ears (AAAA); in the other, the second and fourth tones carried an interaural phase shift (ABAB). The task was to identify which interval contained the interaural phase shift. Thresholds corresponding to 71%-correct performance were estimated by taking the geometric mean of the dP values at the last six (of eight) reversals for each run. Two runs were administered after one practice run. A non-adaptive percent-correct procedure was invoked if the maximum dP of 180 degrees was reached twice before the second reversal, or at all after the second reversal; in this case, 40 further trials were presented to measure percent-correct performance. The test frequency and level were 250 Hz and 40 dB SL, respectively. The mean of the thresholds from the two test runs, ranging from 0 to 180 degrees for the adaptive procedure and from 0 to 100% for the non-adaptive procedure, was entered into the database. The percent-correct scores were later transformed to thresholds in degrees for the subsequent analyses.

STM. The Spectro-Temporal Modulation stimuli (Bernstein et al., 2013) consist of spectrally-rippled noise with spectral-peak frequencies that shift over time.
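The TFS-LF threshold rule described above (geometric mean of the dP values at the last six of eight reversals, averaged over the two test runs) can be sketched as follows; the function and variable names are ours, and the example reversal values are illustrative only:

```python
import math

def tfs_run_threshold(reversal_dps):
    """Threshold for one adaptive run: the geometric mean of the
    interaural phase differences (degrees) at the last six of the
    eight reversals."""
    last_six = reversal_dps[-6:]
    return math.exp(sum(math.log(d) for d in last_six) / len(last_six))

def tfs_lf_threshold(run1_reversals, run2_reversals):
    """The value entered into the database: the mean of the two
    test-run thresholds."""
    return (tfs_run_threshold(run1_reversals)
            + tfs_run_threshold(run2_reversals)) / 2

# Illustrative run: the last six reversals alternate between 2 and 8
# degrees, so the geometric mean is 4 degrees.
t = tfs_run_threshold([16, 4, 2, 8, 2, 8, 2, 8])
```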
The STM, with a 2-cycle/octave spectral-ripple density and a 4-Hz modulation rate, was applied to a 2-kHz lowpass-filtered pink-noise carrier. Stimuli were presented over headphones monaurally, for each ear independently, at 80 dB SPL (±5-dB roving). The fixed level of 80 dB SPL made the stimuli audible to all test subjects and allows comparison with previous studies using the same procedure. In every test trial, two intervals were presented: one contained unmodulated noise and the other modulated noise. The task was to determine which interval contained the modulated noise. The threshold modulation depth was estimated adaptively in a two-alternative forced-choice task, with each run starting at a modulation depth of 0 dB (i.e., full modulation). The depth was then reduced by 6 dB until the first reversal, changed by 4 dB for the next two reversals, and changed by 2 dB for the last six reversals, for a total of nine reversals per run. If the adaptive track called for a modulation depth greater than 0 dB, the next trial was presented with full modulation. Three runs were administered after one practice run. The threshold was determined by taking the mean of the modulation depths (in dB) at the last six reversal points. If a listener was unable to detect the fully modulated signal more than five times in any one run, the run was terminated and no threshold was collected (Bernstein et al., 2013). Data were recorded automatically by the computer software. Three thresholds per ear were obtained and analyzed; the results of the three runs for each ear were averaged for the purposes of this overall study.

PB test. The Phonemically Balanced words (PB) test (e.g., Magnusson, 1995) is a word recognition (in quiet) test. This type of test is a long-established tool for the clinical assessment of an individual's ability to understand speech.
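The STM staircase just described (a 6-dB step until the first reversal, 4 dB for the next two, 2 dB for the last six, with the threshold taken as the mean depth at the last six reversal points) can be sketched as below. This is a simplified illustration under our own naming, not the study's actual software:

```python
def step_size(n_reversals):
    """Step size (dB) as a function of the number of reversals so far
    in one STM run: 6 dB before the first reversal, 4 dB after
    reversals 1-2, and 2 dB thereafter (nine reversals end the run)."""
    if n_reversals < 1:
        return 6
    elif n_reversals < 3:
        return 4
    return 2

def stm_run_threshold(reversal_depths_db):
    """Threshold for one run: the mean modulation depth (dB) at the
    last six of the nine reversal points. Depths are <= 0 dB, where
    0 dB is full modulation."""
    last_six = reversal_depths_db[-6:]
    return sum(last_six) / len(last_six)
```

A usage example: for an illustrative run with reversal depths [-6, -10, -8, -12, -10, -14, -12, -16, -14], the threshold is the mean of the last six values, -13 dB.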
PB words in quiet, together with pure-tone thresholds, can also be used to estimate influences of auditory neuropathy. The test contains monosyllabic CVC words selected so that the lists reflect the statistical distribution of the phonemes in the Swedish language. The participants were instructed to listen to a man reading a sentence, and their task was to repeat (verbally) the last word of each presented phrase. For example, given "Now you hear car", the participant should repeat "car". Two recorded lists of 25 sentences each, spoken by a male reader, were presented. Scoring was based on the percentage of correct responses. All stimuli were presented binaurally at 35 dB SL (i.e., 35 dB above the average PTA3 of 500, 1000 and 2000 Hz from both ears).

COGNITIVE TEST BATTERY

All cognitive tests (except the MMSE, RAN and Raven) were computer-based. All visual stimuli were presented on a 20-inch LCD computer screen, and all auditory stimuli, if applicable, were presented binaurally via a pair of insert earphones (ER 3A).

Phonological tests

In the first three tasks (consonant gating, vowel gating and vowel duration discrimination), speech stimuli were read by a male speaker and were presented in both auditory-only (A) and audiovisual (AV) modalities.

Consonant gating. The gating paradigm is one of the best methods of assessing individual skill in the early identification of phonetic information, in particular for speech recognition in degraded listening conditions (Grosjean, 1980; Moradi et al., 2013, 2014). The participants' task is to predict and identify a stimulus, given an initial portion of a speech sample. The current test materials consisted of 5 Swedish consonants in a Vowel-Consonant-Vowel (VCV) syllable format (/ala, asa, ama, ata, afa/), with the initial /a/ as standard stimulus and a gate size of 40 ms. Consonants have critical temporal cues that can only be found using relatively narrow gates.
Moreover, the gating paradigm allows us to study errors or biases in the responses of listeners prior to identification. The gating started after the first vowel /a/, right at the onset of the consonant. Hence, the first gate included the vowel /a/ plus the initial 40 msec of the consonant, the second gate added a further 40 msec of the consonant (in total, 80 msec of the initial duration of the consonant), and so on. The minimum, average, and maximum total durations of the consonants were 85, 198, and 410 msec, respectively. The maximum number of gates required for identification varied between 15 and 17. In order to avoid random guessing, the presentation of gated stimuli continued until the target phoneme was correctly identified on three consecutive identical responses. The consonant gating task took approximately 7 minutes. Responses were scored in terms of 'accuracy' (i.e., correct identification of the consonant), 'correct consonant cluster' (i.e., the identification occurred within the consonant's own cluster), and 'correct voicing' (i.e., voiced or not, but not necessarily the correct consonant). Only the results for 'accuracy', being the strictest measure, will be presented in this overall paper. The isolation point (IP) is defined as the shortest time from the onset of a stimulus at which correct identification occurred, without any subsequent changes in decisions after listening to the remainder of the stimulus (Grosjean, 1980).

Vowel gating task. The vowel gating task (Moradi et al., 2013, 2014a,b) was presented in a consonant-vowel (CV) syllable format (/p?, ma:, m?, vi?, ma/), with the initial consonant as standard stimulus and consecutive gates of 40 msec each. The five vowels were chosen to be as unambiguous as possible, with variation in duration (/vi: vs. p?/ and /ma: vs. ma/) and mouth shape (/a: a/ vs. /i: i/ and /?/).
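The isolation point defined above can be recovered from a listener's gate-by-gate responses. A minimal sketch, assuming 40-msec gates and counting time in whole gates from stimulus onset (the function name and response strings are our own illustration; the actual timing conventions of the study are simplified here):

```python
def isolation_point(responses, target, gate_ms=40):
    """Shortest presentation time (ms) at which the target phoneme was
    correctly identified with no subsequent change of decision over
    the remaining gates. `responses` holds the listener's answer
    after each successive 40-ms gate; returns None if the target was
    never stably identified."""
    for i, r in enumerate(responses):
        if r == target and all(later == target for later in responses[i:]):
            return (i + 1) * gate_ms
    return None

# Example: /s/ is first guessed at gate 2, revised at gate 3, and then
# held from gate 4 onward, so the IP is 4 gates = 160 ms.
ip = isolation_point(["f", "s", "f", "s", "s", "s"], "s")
```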
The first gate of each token consisted of the first two frames of the video-recorded token, up until just before the articulation of the vowel begins (i.e., the first 2 frames of /p/ in /p?/, i.e., 80 ms). The following gates were presented in 40-msec increments. The participants' task was to say which vowel they perceived, or thought they perceived, with regard to both vowel quality and duration. The experimenter noted the oral responses on a scoring sheet; if the experimenter was in doubt about which vowel the participant had uttered, the participant was asked to repeat it. New gates were presented until three consecutive identical responses were given, but not before the 7th gate. This was done to ensure that the participant had perceived enough of the stimulus to have a proper chance of perceiving it correctly (and thereby confirming or discarding an early guess or hypothesis), and to avoid random guesses. The maximum number of gates varied between 10 and 15 because the stimuli were not of equal duration. Responses were scored as correct when both the vowel and its duration were correct. The duration of the test was approximately 7 min.

Vowel duration discrimination task. The vowel duration discrimination task (Lidestam, 2009) is a method of assessing individual ability to discriminate duration in a given listening situation. The materials consisted of sequences of recorded tokens (e.g., /lal/ and /mam/) pronounced with a long /a/. One /lal/ and one /mam/ recording were chosen, and each of the two video files was edited into five tokens to be used in the experiment, differing only in how many frames had been excluded. The selected range of token lengths reflected the durations of /a/ and /a:/ in everyday Swedish. For both the /lal/ and the /mam/ stimuli, 20 pairs of tokens were saved as files for the experiment: all 5 durations paired with all other tokens (5 × 4 tokens).
Thus, there were 4 pairs where the first token was longer than the second by one step (defined as 33 ms), 4 pairs where the second token was longer than the first by one step, 3 pairs where the first token was longer than the second by two steps, 3 pairs where the second token was longer than the first by two steps, and so on, down to one pair with the maximum difference of +4 steps and one pair with the maximum difference of -4 steps. The design thus focused on the smallest duration differences, by presenting many more small than large duration differences. The pairs of tokens were quasi-randomised into one presentation order: no more than three consecutive presentations had a longer first (or second) token; no more than three consecutive presentations were of /lal/ or /mam/ tokens; and step difference (-4 to +4) as well as token duration (pairs ranging from shortest–shortest to longest–longest) were distributed across the list. A MacBook Pro with Tcl/Tk 8.5.1 and QuickTime Tcl was used to present stimuli and collect responses; a 20-inch LCD monitor was used in the AV condition. The subjects' task was to judge, as accurately and quickly as possible, which of the two tokens in each pair was the longer. They were also informed that the difference was intended to be very difficult to detect in many cases, but that they should do their very best. Missing responses were scored as errors. The subjects practiced on six examples (of increasing difficulty) to become familiar with the procedure and stimuli before the experimental trials. In total, 40 test items were used, and the dependent variable was the number of errors made. The task took approximately 7 minutes to complete.

Rhyme judgment. The rhyme judgment test is a phonological processing test (Classon et al., 2013; Lyxell et al., 1996). The subjects' task was to decide whether two written words presented simultaneously on the computer screen rhymed or not (Lyxell et al., 1994).
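The vowel-duration pair design described above (all ordered pairs of the five token durations per syllable, in 33-ms steps) yields exactly the stated distribution of step differences, as this short sketch verifies; the variable names are ours:

```python
from collections import Counter
from itertools import permutations

# Five token durations per syllable, expressed in steps of 33 ms.
durations = [0, 1, 2, 3, 4]

# All ordered pairs of unequal tokens: 5 * 4 = 20 pairs per syllable.
pairs = list(permutations(durations, 2))

# Count pairs by signed step difference (second token minus first):
# +-1 step occurs 4 times each, +-2 steps 3 times, +-3 steps twice,
# and the maximum +-4 steps once each.
diff_counts = Counter(b - a for a, b in pairs)
```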
In two of the conditions, the words are orthographically dissimilar but still rhyme (e.g., moose-juice), or orthographically similar but do not rhyme. These two non-matching conditions are harder than the two matching conditions (i.e., words that rhyme and are orthographically similar, and words that do not rhyme and are dissimilar). The participants responded by pressing predefined buttons for "yes" and "no". Half of the word pairs rhymed, and half did not. The main performance measure was the response time in milliseconds, but error rates were also calculated. In this overall study, we did not subdivide the rhyme judgment test scores (total score = 32), but scores for each of the four conditions were saved for future reporting.

Semantic long-term memory access speed

PM. The Physical Matching task (Hällgren et al., 2005; Rönnberg, 1990) is a simple measure of long-term memory access speed. The material consists of a list of 16 pairs of letters, each pair presented on the computer screen for 2000 msec. The participants' task was to judge whether the two simultaneously presented letters had the same physical shape (e.g., A-A) or not (e.g., A-a), responding "yes" (same physical shape, A-A) or "no" (different physical shapes, A-a) within a maximum interval of 5000 msec after the presentation of each letter pair. The dependent variable was the response time in msec, but error rates were also calculated.

LD. The Lexical Decision materials (e.g., Rönnberg, 1990) consist of strings of letters presented on the screen. The participants' task was to judge whether a string of three letters constituted a real word or not. Forty items were used, of which half were common Swedish three-letter words. The participants responded "yes" (for a real word) or "no" (for a non-word/pseudoword) during a 5000-msec interval before the presentation of the next string.
The dependent variable was the response time, but error rates were also calculated.

Working memory

In all WM tasks, we used list lengths varying from two to five for all participants. For the RST only, two trials per list length were used; for all the other working memory tests, three trials per list length were used.

NSR. The non-word serial recall task (e.g., Majerus & van der Linden, 2010) is a measure of phonological processing based on working memory for phonological information devoid of semantic content. A sequence of nonsense words is presented on the screen (e.g., Bamm-Piv; Sok-Kas), varying from two to five non-words per trial. Each list length was presented three times, with different items. The participants' task was to repeat all the non-words after the list presentation ended. The dependent variable was the number of non-words correctly recalled (max = 42).

RST. The Reading Span Test (Baddeley et al., 1985; Daneman & Carpenter, 1980; Rönnberg et al., 1989) is a working memory test designed to tax memory storage and processing simultaneously. The test consists of a sequence of sentences presented in blocks of 2-5 sentences (such as "The dog drives a car"). The participant's task was to make a semantic verification judgment of each sentence in the list (whether the sentence made sense or not) and to recall either the first or the final words of each sentence in the presented sequence (Lunner, 2003). The sentences were presented word by word on a computer screen at a rate of one word per 800 msec. Half of the sentences were absurd (e.g., "The car drinks milk"), and half were normal (e.g., "The farmer builds a house"). The subjects responded "yes" (for a normal sentence) or "no" (for an absurd sentence) during a fixed 5000-msec interval after the presentation of each sentence (cf. Ng et al., 2013, 2015).
After a sequence of sentences (i.e., two, three, four, or five sentences), the experimenter indicated whether the participant should recall the first or the final words of the presented sentences, in their correct serial presentation order. There were two trials per list length in the RST, giving a total of 28 points. The main performance measure was the total number of words correctly recalled.

SWPST. The Semantic Word-Pair Span Test (cf. Maki, 2007, for materials) is a working memory test that, in contrast to the RST, includes no syntactic elements in the processing and storage components. The test consists of a sequence of word pairs (such as "Bun, Hippo"). The list length varied from 2 to 5, with three trials per length. The subjects' task was to indicate, by button press, which word in each pair denoted a living object. After the button presses for a list of word pairs, the subjects were asked to orally recall either the left- or the right-hand words of the pairs, in the correct order of presentation. The word pairs were presented pair by pair on a computer screen, each remaining visible until a response had been given. The main performance measure was the number of words correctly recalled (max = 42).

Visual spatial working memory test (VSWM). In this test, two ellipsoid shapes are presented in one of the 25 squares of a 5x5 grid on a computer screen. The task was to judge whether the shapes were identical. Two ellipsoid shapes were then presented in another of the 25 squares and judged in the same way. Right after the last response was given, the participants were asked to mark, in the correct order, on a sheet of paper with an empty grid, the squares in which the ellipsoids had been presented. The list length varied from 2 to 5, with three trials per length. The ellipsoids were adopted from Olsson and Poom (2005) and are designed to be hard to name in a distinguishable way.
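The maximum scores quoted for the working memory span tasks follow directly from the list lengths (2-5) and the number of trials per length (two for the RST, three for the other span tasks), as this one-line arithmetic sketch shows:

```python
# List lengths used in all WM span tasks: 2, 3, 4 and 5 items.
list_lengths = range(2, 6)

# RST: two trials per list length -> 2 * (2+3+4+5) = 28 points.
rst_max = 2 * sum(list_lengths)

# NSR, SWPST and VSWM: three trials per list length -> max = 42.
other_max = 3 * sum(list_lengths)
```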
The ellipsoids consist of a blue outer circle and a white inner circle. They can vary in two ways: the width of the inner circle (20, 40, 60, or 80% of the width of the outer circle) and the width of the outer circle (20, 40, 60, or 80% of the height of the outer circle; see Olsson and Poom, 2005, for details). The presentation of the circles was balanced so that the ellipsoids were identical 50% of the time; in the other instances, they matched on one of the two criteria half of the time. Differences in the width of one of the circles were at least 40% in all presentations, to avoid errors caused by an inability to discriminate the shapes properly. After instructions, two practice trials (using a 4x4 grid) were completed before the actual test started. In the test, a 5x5 grid was employed. The main performance measure was the number of squares (where the ellipsoids had been placed) correctly recalled (max = 42).

Executive and inference-making functions

Shifting test. The task chosen to tap the Shifting function is the number-letter task (Rogers & Monsell, 1995), in which the subjects' task is to carry out verbal and visual categorizations, respectively. The material consisted of sequences of number-letter pairs (e.g., 7 g). Numbers were either odd (e.g., 1, 3, 5, 7) or even (e.g., 2, 4, 6, 8) digits, and letters were either capital (e.g., A, M, T, K) or lower-case (e.g., a, r, y, p). These number-letter pairs were presented visually, one at a time, in one of the four quadrants of the computer screen. In the first control task, number-letter pairs were presented in the two upper quadrants of the screen, and the subjects' task was to process only the number and to specify whether it was odd or even. In the second control task, number-letter pairs were presented in the lower part of the screen, and the subjects' task was to make a capital-letter/lower-case decision on the letter.
In the shifting task, number-letter pairs were presented in all four quadrants in a clockwise rotation. The subjects' task was to press the button corresponding to the number (odd/even) when the pair was presented at the top of the screen, and to the letter (capital/lower-case) when the pair was presented at the bottom. The trials within the control tasks thus required no shifting, whereas half of the trials in the experimental task required shifting between the two types of categorization operations. Stimuli remained on the screen until a response was given or until 10 sec had elapsed. For the current paper, we calculated only the shifting cost as the dependent variable, by subtracting the reaction times in the no-shift condition from those in the shift condition (for details of the calculation, see Rogers & Monsell, 1995).

Updating test. The function of updating is to monitor and code incoming information for relevance to the task, and then to appropriately revise the items held in working memory by replacing old, no longer relevant information with newer, more relevant information (Morris & Jones, 1990). The task chosen to tap the updating function was the keep-track task (Yntema, 1963). The material consists of 36 Swedish mono-, bi-, and tri-syllabic words from six semantic categories: metals (e.g., zinc, copper, iron), colors (e.g., blue, yellow, green), animals (e.g., cat, dog, fish), relatives (e.g., brother, sister, cousin, mum), countries (e.g., Sweden, Spain, Norway), and fruits (e.g., apple, kiwi, banana). Before each test trial, the participants were instructed to attend to only four of the six categories. Words were presented one at a time on a visual display, at a rate of one word per 3 seconds with an inter-stimulus interval of 0.5 second. The participants' task was to retain only the last word presented in each of the four designated categories, for example fruits (e.g., kiwi), countries (e.g., Norway), colors (e.g., yellow), and metals (e.g., zinc), while the names of the categories remained at the top of the screen, and to recall these words at the end of the series. Four trials were used. The performance measure was the number of words correctly recalled (max = 16).

Inhibition test. The inhibition test (Miyake et al., 2000) measures a participant's ability to deliberately inhibit dominant, automatic, or prepotent responses when necessary. The task used to tap inhibition ability is the stop-signal task (Logan, 1994). The material consists of a series of digits presented on the screen at a rate of one per second (or until a response was given), with an inter-stimulus interval of 0.5 second. The subjects' task was to press the space bar on the computer keyboard every time they saw a digit other than 3, but to withhold the press when a 3 appeared. The instruction was to respond as quickly and as accurately as possible. The dependent measures were the percentage of prepotent responses (i.e., to the digit 3) correctly inhibited, as well as latency. In this study, we employed the error rate as the dependent variable.

TRT. The Text-Reception Threshold test, introduced by Zekveld et al. (2007), is based on an adaptive procedure that determines the percentage of unmasked text needed to read 50% of the words in a sentence correctly (see also Besser et al., 2012, 2013). The materials consist of two lists of 20 written sentences (e.g., "the baby drinks milk"), taken from the Swedish HINT materials (Hällgren et al., 2006; cf. Zekveld et al., 2007), that were masked with a bar pattern. Sentences were presented on a white background; the text color was red, and the mask (bar pattern) was black. For each trial, the mask consisted of bars of equal width. Between trials, the percentage of unmasked text was varied by changing the width of the bars.
At the start of each trial, the mask becomes visible and the text appears "behind" it in a word-by-word fashion. The onset of the visual presentation of each word in the sentence was timed to the start of the utterance of that word in the corresponding audio file. The preceding words remained on the screen until the whole sentence was visible. After the presentation of the last word, the complete sentence remained visible for 3500 msec. The participants responded orally, and the test leader scored a sentence as correct only when the whole sentence was correctly reproduced. One practice list was administered. Lower thresholds indicate better performance. The average percentage of unmasked text over sentences 5 to 21 was adopted as the TRT.

SCT. To assess the participants' context-bound, verbal inference-making ability, a Sentence Completion Test was administered. Incomplete sentences were presented on screen for 7 seconds. In each sentence, 2 to 4 words were missing (e.g., Can_help _?), each blank representing one missing word. The task was to create complete, grammatically and semantically correct sentences by telling the experimenter which words were missing in the blank spaces. The participants were instructed to respond once the sentence had disappeared from the screen (i.e., after 7 seconds), and were then allowed 30 seconds to answer before the next sentence was presented. The missing words were scored as correct if they were grammatically correct and semantically reasonable (max = 63; cf. Lyxell & Rönnberg, 1987, 1989; Rönnberg, 1990).

LIT and AIM. Based on the original study by Hannon & Daneman (2001), materials were created to target inference-making ability. The current task has two versions: one text-based (the Logical Inference-making Test, LIT) and one auditory (the Auditory Inference-Making test, AIM; i.e., the outcome variable, see under the Outcome Test Battery).
In the LIT, an initial question (in text) is presented in the beginning of a trial (example: Is a JAL larger than a PONY?), and then the participant is requested to answer to the question (yes or no) by means of two types of successive statements per trial (A JAL is larger than a TOC; A TOC is larger than a PONY). All three statements were shown on a computer screen at the same time. The dependent variable was the number of correct responses (max = 16).General Cognitive FunctioningMini Mental State Examination (MMSE). The MMSE is used as a screening test for dementia. (Folstein et al., 1975). The material consists of 11 questions probing orientation to time ( “Can you tell me the year, season, date, day of the week and month?”) and place ("Can you tell me the (state), (town), (county), (building), (floor) we are in/on?"); immediate recall (can you clearly and slowly repeat the following words: “key, toothbrush and lamp”); attention and counting (the subject is asked to subtract 7 from 100, and then keep subtracting 7 from the result until the participant is told to stop after 5 subtractions “93, 86, 79, 72, 65”. Also, the participant is asked to spell the word “konst” (i.e., the Swedish word for art) backwards); memory, (i.e., asking the participant to recall the 3 words that s/he repeated in the previously question); language skills (e.g., asking the subject to point at an object and name it) and spatial and constructional ability (the subject is given a piece of paper with a geometric figure on it (pentagon) and is asked to copy it exactly as it is. The performance measure was the overall percentage of correct answers. A number of different cutoffs are used but typically scores above 26 are interpreted to indicate normal cognitive functioning (Folstein et al., 2001) and, based on the original article (Folstein,et al., 1975), scores below 24 indicate impairment. RAN. 
The Rapid Automatized Naming test is a quick test of cognitive speed (Nielsen et al., 2011) and dementia (Wiig et al., 2010). The RAN test is an individually administered measure designed to estimate an individual's ability to recognize shapes or colors, and to name them accurately and rapidly. The materials consist of three tasks administered visually in a book. The first task (RANC) comprises color items (black, red, yellow, blue); the second task (RANS) comprises form items (circle, bar, triangle, square); and the third task (RANCS) is based on a combination of color and form items. Each task has 40 items. A test trial for each of the first two tasks was carried out to check that the participants understood the task and that color-blindness was not an issue. The participants' task was to name each stimulus item as quickly as possible without making any mistakes. The dependent variable was the amount of time required to name all of the stimulus items on each subtest, with RANCS being the most complex and hence the most sensitive measure. The difference measure (i.e., the RANCS time minus the sum of the RANC and RANS times) is the most sensitive when it comes to disability (Nielsen et al., 2011).

Raven's standard progressive matrices (e.g., Raven, 2008) is used to measure non-verbal reasoning ability (fluid intelligence). Only three (A, D, and E) out of five sets of the Raven test were administered. Set A was used for practice, and the experimenter was allowed to give feedback to the participant. Sets D and E, 12 items each, were administered on paper without feedback or time limit. In each item, a piece is missing from a pattern, and the participant was asked to choose the one of six alternatives that best completes the overall pattern, marking it with a pencil on a scoring sheet. The test was scored as the sum of points on Sets D and E; the maximum score for each set was 12 and the maximum score for the entire test was 24.
OUTCOME TEST BATTERY

All auditory stimuli were generated with a high-quality 24-bit external PC soundcard (ECHO Audiofire 8) at a sampling rate of 22.05 kHz, and transmitted to the microphone of an Oticon Epoq XW behind-the-ear experimental hearing aid in an anechoic chamber (Brüel & Kjær, type 4232) through a measuring amplifier (Brüel & Kjær, type 2636). All auditory tests (i.e., the Swedish HINT, the Samuelsson & Rönnberg sentences, and the AIM) used the individually prescribed linear amplification algorithm and a stationary white noise (see Figure 2). In the Hagerman test, we used additional signal processing features and background noises (see below for details).

---------------------------------------------------------------------------
Insert Figure 2 about here
---------------------------------------------------------------------------

The Swedish HINT. A Swedish Hearing In Noise Test (HINT; Hällgren et al., 2006; see Nilsson et al., 1994 for the original test) was performed to measure the speech recognition threshold in noise. The standard HINT procedure was used, and the signal-to-noise ratio (SNR) threshold for 50 percent correct sentence recognition was measured using a list of 20 HINT sentences. Both sentence stimuli and noise were first presented at 65 dB SPL, and the presentation level of the noise then varied according to the participant's responses. The SNR was calculated by averaging the presentation levels of the 5th through the 21st sentences. One practice list was administered.

The Hagerman test. Hagerman sentences consist of five words: a proper noun, verb, numeral, adjective, and noun (example: "Ann had five red boxes"). An adaptive procedure was used to estimate the SNRs at which 50% and 80% of the words could be repeated correctly by the participant (Brand, 2000).
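Both the HINT and the Hagerman test rely on adaptive SNR tracking of this kind. As a minimal sketch only: the 1-up/1-down rule and the 4-dB/2-dB step sizes below are common HINT conventions assumed here rather than details given in the text, while the averaging of the presentation levels for sentences 5 through 21 (the 21st being the level that would have been presented next) follows the procedure described above.

```python
def hint_srt(responses, start_snr=0.0):
    """Estimate the 50%-correct SRT from a 20-sentence adaptive run.

    responses: list of 20 booleans (True = whole sentence repeated
    correctly). The SNR moves down after a correct response and up
    after an incorrect one (1-up/1-down tracking toward 50% correct).
    The step sizes (4 dB for the first 4 sentences, 2 dB thereafter)
    are an assumed convention, not taken from the text.
    """
    snr = start_snr
    levels = []
    for i, correct in enumerate(responses):
        levels.append(snr)
        step = 4.0 if i < 4 else 2.0
        snr += -step if correct else step
    levels.append(snr)  # hypothetical 21st presentation level
    # Threshold = mean of the presentation levels for sentences 5-21.
    return sum(levels[4:21]) / 17.0
```

A listener who repeats every sentence correctly drives the track steadily downward (a strongly negative, i.e., better, SRT), which is why lower thresholds indicate better performance.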
Since we will (in future, more in-depth reports) study predictions of aided hearing in noise where cognitive functions are challenged, we manipulated hearing aid features and background noise (Foo et al., 2007; Lunner & Sundewall-Thorén, 2007; Ng et al., 2013b, 2015). Two hearing aid features were manipulated: (1) amplification: in addition to linear amplification, we employed a condition with fast-acting non-linear compression (see below); and (2) noise reduction scheme: with and without a binary mask noise reduction algorithm.

Linear and non-linear amplification: Amplification was prescribed to ensure audibility based on individual hearing thresholds. Two types of amplification settings, linear and non-linear, were used in the study. The linear amplification, which is based on a voice aligned compression rationale (VAC; see Ng et al., 2013b for technical details), provides a linear gain (1:1 compression ratio) for pure-tone input levels in the range from 30 to 90 dB SPL. All signals and noises used in the Hagerman test were within the region of linear gain and thus unaffected by any compression knee-point or output limiting (see Figure 2). The non-linear amplification condition employs fast-acting multichannel (four-channel) wide dynamic range compression with a 10-msec attack time and an 80-msec release time. All channels have a 2:1 compression ratio. Fifteen frequency bands were used for individual frequency shaping according to the audiogram (with the VAC rationale).

Noise reduction (NR): A binary mask NR algorithm was used in the present study. This signal processing algorithm reduces the masking effect of interfering speech noise by removing noise-dominant spectro-temporal regions in the speech-in-noise mixture (Wang, 2008; Wang et al., 2009). A 64-channel gammatone filterbank followed by time-windowing is applied to speech-in-noise mixtures to form time-frequency (TF) units.
For each TF unit, when the local SNR is below the local criterion (LC) of 0 dB (i.e., when the energy of the noise exceeds the energy of the target speech), that unit is attenuated by 10 dB; otherwise, the TF unit is retained in the binary matrix. In this way, the TF units dominated by the interfering noise in the mixture are segregated from the target signal, and hence the resulting SNR of the processed mixtures becomes more favorable for speech perception even in adverse listening conditions (Brungart et al., 2006). The LC of 0 dB was chosen in this study because it is believed to optimize the SNR gain with binary masks (Li & Wang, 2009). The processing condition in this study was non-ideal binary mask NR (Wang et al., 2008).

Background noise: Two versions of background noise were utilized: (1) unmodulated speech-weighted noise and (2) a multi-talker background (4 talkers, 2 males and 2 females). Since we want to study predictions of aided hearing in noise where cognitive functions are challenged (Rönnberg et al., 2013), we also assessed the Hagerman psychometric functions (50% and 80% SRT) under the following test conditions:

1. Linear amplification without noise reduction in an unmodulated noise background.
2. Linear amplification combined with noise reduction in an unmodulated noise background.
3. Linear amplification without noise reduction in a multi-talker background.
4. Linear amplification combined with noise reduction in a multi-talker background.
5. Fast-acting compression signal processing in an unmodulated noise background (no noise reduction).
6. Fast-acting compression signal processing in a multi-talker background (no noise reduction).

The experimental hearing aid was positioned in an anechoic box (Brüel & Kjær, type 4232), and its output was routed via an ear simulator (Brüel & Kjær, type 4157) and finally presented to the participants through a pair of ER3A insert earphones.
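The binary-mask principle described above can be illustrated with a short, schematic sketch. Note the simplifications: a plain windowed FFT stands in for the 64-channel gammatone filterbank, and the mask below is an ideal (oracle) one because clean speech and noise are passed in separately, whereas the study used a non-ideal mask estimated from the mixture. Only the LC = 0 dB criterion and the 10-dB attenuation follow the text.

```python
import numpy as np

def binary_mask_nr(speech, noise, n_fft=64, lc_db=0.0, atten_db=10.0):
    """Schematic binary-mask noise reduction on a speech + noise mixture.

    Splits the signals into frames of n_fft samples and takes an FFT
    per frame to form time-frequency (TF) units. For each TF unit,
    if the local SNR falls below the local criterion (lc_db = 0 dB),
    the unit is attenuated by atten_db (10 dB); otherwise it is kept.
    """
    n = (len(speech) // n_fft) * n_fft
    s = np.asarray(speech, float)[:n].reshape(-1, n_fft)
    v = np.asarray(noise, float)[:n].reshape(-1, n_fft)
    S = np.fft.rfft(s, axis=1)               # TF units of target speech
    V = np.fft.rfft(v, axis=1)               # TF units of noise
    M = np.fft.rfft(s + v, axis=1)           # TF units of the mixture
    # Local SNR per TF unit (small floor avoids log of zero).
    local_snr_db = 10 * np.log10((np.abs(S) ** 2 + 1e-12) /
                                 (np.abs(V) ** 2 + 1e-12))
    # Binary matrix: keep speech-dominant units, attenuate the rest.
    gain = np.where(local_snr_db < lc_db, 10 ** (-atten_db / 20.0), 1.0)
    return np.fft.irfft(M * gain, n=n_fft, axis=1).reshape(-1)
```

The `gain` array is the "binary matrix" of the text: 1 where speech dominates a TF unit, and 10^(-10/20) ≈ 0.32 (a 10-dB reduction) where noise dominates, so noise-dominant regions are suppressed before resynthesis.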
With the calibrated set-up it was possible to provide each participant with an individually prescribed frequency response (VAC linear) to ensure the audibility of speech sounds.

The Samuelsson & Rönnberg (1993) sentences target implicit and explicit use of scripted constraints. That is, well-known scripts such as "being in a restaurant" were used to aid inference-making in lip-reading by combining script headers as contextual cues (e.g., "Restaurant", "Shop", and "Train") with sentence types that optimize implicit and explicit processing, respectively. Three orthogonal dimensions characterize script use in relation to context: sentence abstraction, typicality, and temporal order (for normative data, see Samuelsson & Rönnberg, 1993). Typicality and abstraction represent the most powerful determinants of script-based lip-reading (Samuelsson & Rönnberg, 1993; examples from the restaurant script: a basic, typical, and late sentence: "Can we pay for our dinner by credit card?" vs. a low-level, atypical, and early sentence: "Can you hang my overcoat beside the dark coat?").

In the present study, the Samuelsson & Rönnberg stimuli were presented binaurally via a pair of insert earphones (ER 3A) in both the auditory and audiovisual conditions (with speech-shaped noise in both presentation modalities), and aligned with the difficulty levels of the HINT materials using an SNR that was 1 dB worse than the SNR obtained for 50% correct on the HINT. The HINT sentences, which are everyday sentences without context and with relatively high typicality, compare well with the Samuelsson & Rönnberg stimuli with no context and high typicality. The order of presentation in terms of modality (auditory and audiovisual), contextual cues, and the dimensions of typicality, temporal order, and abstraction was balanced, and the order was fixed for all participants.
Twenty-four sentences per condition (pooled over the other variables mentioned) constituted the basis for the proportion correct for each condition and participant.

AIM. In the Auditory Inference-Making test, a text-based question (e.g., Is Jerker faster than a train?) is first presented on a computer screen. Then a spoken statement (e.g., A rocket is slower than Jerker) is presented via insert earphones. After that, the participant had to reply to the initial question by pressing either the yes (green) or no (red) button. The auditory stimuli were presented in a background of speech-shaped steady-state noise at the individual's SRT (50%), determined using the HINT sentences, plus an additional 2 dB. Setting an individual's baseline SNR in this way allowed sufficient audibility while also making listening cognitively demanding. Reaction time and the number of correct responses (max = 16) were the dependent measures.

SSQ. The Speech, Spatial and Qualities of hearing scale questionnaire (Gatehouse & Noble, 2004) is a self-report assessment consisting of three subscales, relating to speech perception, spatial hearing, and quality of sound. Each subscale comprises a series of listening situations for which the participant marks a response on a visual ruler from 0 to 10, where a score of 10 indicates complete ability and a score of zero indicates minimal ability (Noble & Gatehouse, 2006). A total of 50 listening situations are rated. Scores for each subscale, as well as the overall total score for the entire test, were calculated; in keeping with the general aims of this study, only the total score for each individual was used (max = 500). The participants had the opportunity to fill out the SSQ at home.
Please note that the references listed in the text of the supplement are to be found in the reference list of the article.

----------------------------------------------------------------------------------------------------------

TABLES 1-3

In Tables 1-3, we show the results of the LEVEL 1 factor analyses of some of the HEARING, COGNITION and OUTCOME test variables.

Table 1. HEARING factors

                                                          Factor loading
PTAs
  Air-conduction PTA4 (left)                                   0.95
  Air-conduction PTA4 (right)                                  0.93
  Bone-conduction PTA4 (left)                                  0.98
  Bone-conduction PTA4 (right)                                 0.98
STM
  STM_threshold1_left                                          0.78
  STM_threshold2_left                                          0.79
  STM_threshold3_left                                          0.83
  STM_threshold1_right                                         0.80
  STM_threshold2_right                                         0.87
  STM_threshold3_right                                         0.87

Table 2. COGNITION factors

                                                          Factor loading
Phonology
  Consonant gating (auditory) (average IP, in msec)            0.36
  Vowel gating (auditory) (average IP)                         0.69
  Consonant gating (audiovisual) (average IP)                  0.36
  Vowel gating (audiovisual) (average IP)                      0.73
  Vowel duration discrimination (auditory, in msec)            0.34
  Vowel duration discrimination (audiovisual)                  0.33
  Rhyme judgment                                              -0.40
LTM Access Speed
  Rhyme judgment (response time, in msec)                      0.69
  Physical matching (PM, response time, in msec)               0.58
  Lexical decision making (LD, response time, in msec)         1.00
Working memory
  Non-word serial recall (NSR, % correct)                      0.52
  Reading span (RST, % correct)                                0.68
  Semantic word-pair span test (SWPST, % correct)              0.68
  Visuospatial working memory (VSWM, % correct)                0.57
Executive/Inference-making
  Text Reception Threshold (TRT, % unmasked text)              0.76
  Sentence completion (SCT, % correct)                        -0.70
  Rapid Automatized Naming                                     0.48
  Logical Inference-making Test (LIT, % correct)              -0.37
  Updating task (% correct)                                   -0.25
  Shifting cost (in msec)                                      0.18
  Inhibition task (in msec)                                   -0.04
General cognitive ability
  Mini Mental Test (% correct)                                 1.00
  Rapid Automatized Naming (difference, in sec)                0.12
  Raven total score (% correct)                                0.28

Table 3.
OUTCOME factors

                                                          Factor loading
Hagerman sentences (SRTs)
  Hag_SSN_NP_50                                                0.85
  Hag_SSN_NP_80                                                0.65
  Hag_4TB_NP_50                                                0.77
  Hag_4TB_NP_80                                                0.73
  Hag_SSN_NR_50                                                0.76
  Hag_SSN_NR_80                                                0.68
  Hag_4TB_NR_50                                                0.78
  Hag_4TB_NR_80                                                0.57
  Hag_SSN_fast_50                                              0.84
  Hag_SSN_fast_80                                              0.70
  Hag_4TB_fast_50                                              0.72
  Hag_4TB_fast_80                                              0.65
Samuelsson & Rönnberg sentences (% correct)
  Auditory, no contextual cues                                 0.88
  Auditory, with contextual cues                               0.89
  Audiovisual, no contextual cues                               0.57
  Audiovisual, with contextual cues                            0.56