NSI Recommendations

Guidelines for Types of Non-Speech Information (NSI)

General Guideline

If a descriptive caption or feature would in any way clarify or enhance the viewer's awareness of the audio, it should be indicated. Consumers prefer that more of such information be included than is often done in current practice.

Background Music

Background music should be indicated, especially if it contributes to the plot or mood of the video. A description of the background music should be given wherever possible.

Sound Effects

Where feasible, a combination of description and onomatopoeia should be used to indicate sound effects. If space or other limitations do not permit the two to be used together, descriptors should be used. Onomatopoeia should not be used alone. A descriptor is particularly important if the source of the sound effect is not obvious from the video.
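The rule above can be read as a small decision procedure. The following is a minimal sketch, not a prescribed format: the function name `format_sound_effect`, the square-bracket layout, and the `max_chars` limit (standing in for whatever space limitation applies) are all assumptions made for illustration.

```python
from typing import Optional


def format_sound_effect(descriptor: str,
                        onomatopoeia: Optional[str] = None,
                        max_chars: int = 32) -> str:
    """Build a sound-effect caption (hypothetical helper).

    Prefers descriptor plus onomatopoeia; falls back to the descriptor
    alone when the combined text exceeds max_chars. Onomatopoeia is
    never emitted by itself.
    """
    descriptor = descriptor.strip()
    if not descriptor:
        # The guideline says onomatopoeia should not be used alone.
        raise ValueError("a descriptor is required")
    if onomatopoeia:
        combined = f"[{descriptor}: {onomatopoeia.strip()}]"
        if len(combined) <= max_chars:
            return combined
    # Space is limited or no onomatopoeia was given: descriptor only.
    return f"[{descriptor}]"


print(format_sound_effect("door slams", "BANG"))
print(format_sound_effect("door slams", "BANG", max_chars=12))
```

The first call has room for both parts; the second does not, so only the descriptor is kept.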


Song Lyrics

Continue the practice of surrounding the caption with musical-note icons. All-caps and upper/lowercase type are equally acceptable for the caption portion.

Multiple Speakers On Screen

Where multiple speakers appear on the screen, placement should be used to distinguish among them. Explicit identification should be used in combination with placement if dialogue is fast, if faces are obscured, if characters are moving, or if other circumstances could confuse the viewer. If the character cannot be identified by name, then a descriptor should be provided. An acceptable format for explicit identification is the character's name or descriptor in upper/lowercase, surrounded by parentheses, above the caption and left justified with the caption. Other formats are probably acceptable as well.
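The identification format described above can be sketched in a few lines. This is an illustrative sketch only: the function name `format_speaker_caption` and the `indent` parameter (a stand-in for horizontal placement on screen) are assumptions, not part of the recommendations.

```python
def format_speaker_caption(caption: str,
                           name: str = "",
                           descriptor: str = "",
                           indent: int = 0) -> str:
    """Place a parenthesized speaker label above the caption (hypothetical helper).

    The label is the character's name, or a descriptor when no name is
    known. Label and caption lines share the same left edge.
    """
    label = name.strip() or descriptor.strip()
    if not label:
        raise ValueError("a name or a descriptor is required")
    pad = " " * indent  # same padding on every line keeps them left justified
    lines = [f"({label})"] + caption.splitlines()
    return "\n".join(pad + line for line in lines)


print(format_speaker_caption("Where were you?", name="Maria", indent=2))
print(format_speaker_caption("Stop right there!", descriptor="Police officer"))
```

The second call shows the fallback to a descriptor when the character cannot be identified by name.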


Off-Screen Narrators

Explicitly identify off-screen narrators rather than relying on features, such as italics or color, that require the viewer to interpret a code while reading captions.

Whispered Speech

Whispered lines should be identified as such, with the caption itself set in upper/lowercase type.


Emphasis

Use italics to indicate the emphasized word(s) within a caption.


Titles

Use quotation marks to indicate the title of a book, movie, etc.

Audience Reaction

Audience reaction should be captioned. This is particularly important where the reaction itself becomes part of the plot or comedy. Audience laughter should also be described. (It is possible that repeating the descriptor every time the audience laughs, over the length of an entire sitcom episode, would become annoying. This length of exposure was not tested. Therefore, discretion is advised; but audience laughter should be indicated much more often than is now the industry's practice.)

Conveying Emotion

Where strong emotion is conveyed, the emotion should be described with the caption. This feature should be used especially where strong emotion is not entirely obvious in the facial expression and actions of the speaker. Caption writers may be concerned that this feature could be overused. However, based on consumer feedback, caption writers should use this feature more than is current practice.


Accents

Indicate foreign or regional accents with a one-time description at the beginning of the character's lines. (Note: This issue was tested only with a fictional character, and probably should not be generalized to other speakers.)
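The "one-time" aspect of this guideline amounts to remembering which characters have already been described. A minimal sketch follows, assuming a hypothetical `AccentAnnotator` class and a parenthesized "(... accent)" label; neither the class nor the label format comes from the study.

```python
from typing import Optional


class AccentAnnotator:
    """Adds an accent description only to a character's first captioned line."""

    def __init__(self) -> None:
        # Characters whose accent has already been described.
        self._described = set()

    def annotate(self, speaker: str, caption: str,
                 accent: Optional[str] = None) -> str:
        if accent and speaker not in self._described:
            self._described.add(speaker)
            return f"({accent} accent)\n{caption}"
        # Later lines, or speakers with no accent to note, pass through unchanged.
        return caption


annotator = AccentAnnotator()
print(annotator.annotate("Hans", "Good evening.", accent="German"))
print(annotator.annotate("Hans", "Please, sit down.", accent="German"))
```

Only the first of Hans's lines carries the description; the second is returned as given.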


Puns

Puns should be explained briefly when feasible.

Guidelines for Features

General Guideline

Consumers have indicated a preference for explicit description or identification over features that assume understanding on the part of the viewer. Examples of such features, which require interpretation by the viewer, include italics for an entire caption, color, and upper/lowercase type without explanation.


Color

Color was not the preferred method of indication in this study, although it was tested in five different circumstances. Color also tested poorly against placement and speaker identification in an earlier study by King and LaSasso (1993). Color was judged unacceptable by more viewers than were many other features. Note that color in real-time captioning (where other options may be problematic) was not tested. (Color in a digital video environment is being studied further by King and LaSasso in 1994-1996.)


Flashing Captions

Flashing captions were not preferred in the two applications tested in this study, and were unacceptable to an appreciable minority of respondents. Further study of whether or how to use this feature may be warranted.


Paint-On Captions

Paint-on captions were tested in only one context, and they were not preferred. Further study of whether or how to use this feature may be warranted.


Italics

Italics were less desirable than explicit description in several contexts. Italics are widely used but should be used less often, because their intent is frequently lost on viewers.


Underlining

Underlining was the last choice of respondents in the two applications tested. Further study of whether or how to use this feature may be warranted.

Quotation Marks

Quotation marks were preferred over italics and underlining for indicating a title.

